Stape

How to export Google Universal Analytics raw data to BigQuery

Edited
Oct 24, 2022
Published
Nov 26, 2020

Before Google released Google Analytics 4, only GA 360 users had the ability to send raw events from Google Analytics to BigQuery. Because GA 360 is not free (the price starts from $70 000 per year), not all businesses used the raw data exploring feature.

If you use GA4, you can follow official documentation from Google on how to set up BigQuery Export.

For Google Universal Analytics, Google still has not provided free options to export raw data to BigQuery or any other database for that matter. But what if you don’t have plans to migrate to GA4 and want to get the benefits of BigQuery raw data? I decided to solve this question with GTM Server Side tagging and created this how-to article for you.

universal analytics stops working

How I send GA events from browser to BigQuery tableCopy link to this section

1. gtag.js sends GA event to Google Tag Manager Server container.

2. Universal Analytics client gets the request and triggers JSON HTTP Request Tag.

3. JSON HTTP Request Tag sends data to the URL of the Request To Google Cloud Storage function.

4. GCS function saves data on your GCP Storage bucket.

5. Use BigQuery GCP table type to perform data querying.

Things that you should know about this flow before you start to go thru how-toCopy link to this section

1. BigQuery can use Google Cloud Storage bucket as a table. More info about this feature can be found in this official documentation.

2. You can store on GCS 5 GB of data for free. One GA request is near 1 KB, so that’s 5 242 880 requests. More details about GCS pricing can be found on the pricing page.

3. The First 2 million requests to Cloud Functions are free, so that’s more than enough for our task. But bear in mind that you need to check usage and pricing if you already use it.

I decided to divide this how-to into two parts. First, I will tell you how to save data that came in request on GCS. And in the second part, I will tell you how to query this data with BigQuery. I hope this helps, and you will get the desired result faster.

How-to store data from Google Analytics client inside Google Cloud StorageCopy link to this section

Using this how-to, you can store data from any GTM Server Side client, not only Google Analytics.

1. We will start by creating a Google Cloud Storage bucket that will contain our raw event data. Search for Storage product in GCP search prompt. Then click on “Create Bucket”

google cloud platform storage
google cloud platform create bucket

2. Choose a name for your bucket, click “Continue” and choose region “us-central1”. Leave other options with default values.

3. Now you need to create a Google Cloud function that receives data from Google Tag Manager Server Side Container. To do this, go to Google Cloud Developer Console and open the Cloud Function section. Then, click “Create Function”.

google cloud platform create function

4. Fill in the name of the function and select the “Allow unauthenticated invocations” Authentication option. Copy “Trigger URL” because we will use it in the next steps. Click on the Next button at the bottom of the page.

google cloud platform configuration

5. Set “Entry point” field to “init”. Also, copy the content of this file and paste it into the code window. You need to change ‘your-bucket-name’ to the bucket name you’ve created in the previous steps. Also, change ‘very-secret-random-key’ to some random key. We will need it in the future. The runtime needs to be set to Node.js 10. One more thing, you need to edit the package.json file. Put this file content into it.

google cloud configuration index
google cloud configuration package

6. Now you can click on the deploy button. Google Cloud needs a few minutes to create the function for the first time so wait a little bit before deploying.

cloud functions

7. Time to create a tag inside the GTM Server container. If you don’t have Google Tag Manager Server-Side Container, you can follow this instruction for creating it. Also, you need to have a JSON HTTP request tag template installed on your GTM Container. If you don’t have it, follow this instruction for adding. Create a new JSON HTTP request tag and fill “Destination URL” with the URL you got when creating Cloud Function in step 4 and add to the end of it your secret key like this “?key=your-secret-key”. Don’t forget to add the trigger you use to determine Universal Analytics (check my screenshots for this step).

json http request
json http request
trigger configuration custom

8. Publish GTM Server container changes and generate test requests to GA Tag. You can simply open your website. If you did everything right, you will see files generated on the Google Cloud Storage bucket.

gtm-bigquery

If you set up storing raw events on GCS correctly, I salute you! The hardest part is done. Let’s follow a few more steps before you can start query data.

How to query data to Google Cloud Storage using BigQueryCopy link to this section

1. Go to BigQuery product in GCP and click on the “create dataset” button.

gtm bigquery
create dataset

2. Leave all options by default just fill the name field.

3. Now let’s create a table in this dataset. Click on “create table”. In the source selector, choose “Google Cloud Storage”. Field “Select file from GCS bucket” has to look like this: “your-bucket-name/*.json”. Table type set to “External table”. Choose any table name you want. Check the “Auto detect” option. At the end of this form, click on the create table button.

4. Now you can query your data. Click on the query table data button and set “*” as the field name. Run query. If you did everything right, you will see raw GA data in query results.

Please feel free to ask any questions about GA events export in the comments or our customer support channels. 

Got problem with exporting data?

Just tell us where you are stuck. Click Need assistance, fill up the form, and we will send you a quote.

Need assistance

Relevant posts

Edited Oct 24, 2022

3 Easy Steps to Test Server-Side Universal Analytics/GA4 and Facebook Conversion API

Google released a Server-Side version of Google Tag Manager in 2020. Server-side tracking is more complicated than the web container set up, at least for now, since the idea and technology behind server tagging are entirely different from what we used to have on the web. But server-side tagging will give your site huge benefits. Server-side tagging is getting more popular due to its ability to track people using AdBlockers, browsers with ITPs, and other tracking restrictions. The purpose of this article is not to convince you to start using server-side tagging (there is another blog post that describes the main benefits of server-side tracking). I assume that you’ve already decided to implement server-side tagging on your site. This blog post will show you how to check whether Server-Side tracking for Universal Analytics, GA4, and Facebook conversion API was set up correctly.

Edited Feb 23, 2024

Google tag manager server-side: How to set up server Tag Manager, Universal Analytics, GA4, and Facebook conversion API

Server-side tagging has been one of the main trends in web analytics for the last couple of years. Ad blockers, Intelligent Tracking Protection, 3rd party cookie restrictions, regulations like GDPR made analytics and advertising companies start worrying about how and what information they collect about site visitors. Server-side tagging allows moving third-party tags off your site and into a cloud server. In this case, third-party pixels are loaded directly from the could server rather than your site. In this article, I will explain and demonstrate the basics of setting up Google Tag Manager server container, server Universal Analytics, GA4, and Facebook Conversion API.

Edited Oct 24, 2022

How to prevent your Google Analytics data from being blocked by ad blockers

You’ve probably noticed that data about your website users and conversions collected by different analytics tools might vary. Most of the time, you’ll see that the purchase number in Google Analytics is lower than in your CRM. Where this data discrepancy comes from, why is it happening, and how to fix it?

Host your GTM server at Stape