In some cases it happens that you need to collect several data points into one single field. In this article I’ll share with you how to extract such data from a JSON string with Google Looker Studio. Key benefit is that you have maximum flexibility to access all data points contained in such a string. To achieve this, In the following how to, I’ll show how to extract data with the Google Looker Studio regular expression function called „REGEXP_EXTRACT“.
1. Problem & challenge
For one of my clients, we are collecting a huge amount of data within a single event in Google Analytics. It’s a results data set of a configurator tool. For several reasons, we decided to send a complete JSON string with all configurator results into one event data point.
As a next step, we had the challenge to unravel that data from that JSON string again. From this, we came up with the solution introduced below.
Here’s how we did it before developing this new approach. We worked with the function „CONTAINS_TEXT“ and the following looker studio resource custom field construct:
But there are some constraints coming with this solution:
- Only a few parameter values can be added the manual way. In case of above shown example of cities‘ names, one can imagine there are way more cities that users can come from.
- Use the data as dimension, not as metric. Thus, this is not so flexible.
- Couldn’t extract values with many different values. (Show example of 0.12345 value)
A great way to accomplish this is Google Looker Studios Regular Expressions function: REGEXP_EXTRACT.
You can find Google’s documentation of the function here.
The following outcomes come with that approach:
- More flexibility
- collect *all* values contained in the JSON string
- Do calculations with numbers coming from a JSON string
Two more annotations before we start:
-
- This also works with any other format in which data is packed into a single field. For example comma separated values (CSV).
- This article is not about how to connect a JSON data source to Google Looker Studio. To this topic others have already delivered solutions like Garrett Vorce: „Google Looker Studio Tutorial with JSON“ on Medium.
So, let’s now have a look into the details, how to achieve this:
3.1 Pre-Requesites: Setup a Google Looker Studio dashboard and connect your data source
You can read at google.com how to accomplish that: Click here.
3.2 Add a new data source field
- In Google Looker Studio, click in the menu on „Resource“
- Then click on „Manage added data sources“
- Find your data source
- Click on „Edit“
- Click on „(+) Add a field“
3.3 Use regexp_extract to extract the data out of the JSON string
3.3.1 Extract a text string from JSON – with a Regular Expression
- Now, in the „Add a field“ dialogue, first give your new field a name.
In our example, we want to extract the information „city“ from the JSON string. So we can call the field i.e. „City (JSON)“. - Then it’s time to enter the formula:
REGEXP_EXTRACT(Ereignisaktion,'"city":"((\"|[^"!,])*)"'
- So now, what does this formula do in detail? We’ll have a look:
- With
REGEXP_EXTRACT
the function gets called Ereignisaktion
is the (german) name of the Analytics event tracking data field, in which the JSON string is stored. (In english it would beEvent action
.)- The key part now is the regular expression (also called „regex“ or „regexp“):
'"city":"((\"|[^"!,])*)"'
. It does the following:- The two
'
signs mark beginning and end of the „regex“. "city":"
are the signs that are directly before the searched string (here the cities‘ names) that we want to extract.- With the part
((\"|[^"!,])*)"
the searched string (the city name) gets collected into our newly added field.
- The two
- With
There are different kinds of data that could be extracted:
3.3.2 Extract numbers from JSON
You want to extract numbers from JSON? This is also possible.
- You can do this with following regular expression (regex) formula. Example here is a Longitude definition named lon:
REGEXP_EXTRACT(Ereignisaktion,'"lon":(([^"!,])*)')
With this regex formula we do not search for a string in brackets, but for a string without brackets. This is like numbers are defined in JSON. - To be able to calculate with this new metric also, you will need to change the data fields type to „number“:
3.4 Add a new chart to your Looker Studio Dashboard with the new data source field
Now you can add the new data to your Google Looker Studio dashboard:
Subscribe to my news & blog updates
Your data will not be disclosed to third parties. More about this in the privacy policy. Of course, you can unsubscribe at any time.
4. Some final words
In case you have any questions or annotations, please leave a comment.
About me: My name is Thomas Gerstmann. I support organizations to understand their users & data, so they can reach their goals online better. My focus is more on data-driven strategy and analytics & research as well as Experimentation. But I also work often – like documented here in this article – on technical questions. My goal is to help reaching #ZeroHunger.
Recent posts in “Datadriven Marketing”
Vortrag “Datengetriebene Entscheidungen” – Als Gastdozent an der FH Aachen (14. November 2024)
Heute, am 14. November 2024, hatte ich die Gelegenheit, als Gastdozent an der FH Aachen...
How to get Web Analytics right – Part 2 – Strategy, Goals and KPI
A well-founded strategy for web analytics data analysis is key to an organizations...
How to get Web Analytics right – Part 1 – 3 Fatal Flaws in Your Digital Analytics Setup
During the last years I had the chance to see – and setup/maintain – many Digital...