RegEx guide for Google Analytics 4, Google Tag Manager, Search Console, Looker Studio

Regular Expressions (RegEx) may seem complicated at first, but once you get to know them, you will manage Google Analytics, Search Console, Google Tag Manager and Looker Studio like never before.

Part 1. RegEx basics

A Regular Expression (RegEx) is a string that describes a specific text pattern. So instead of having multiple “contains” conditions, you can match different data with just one Regular Expression. Think of it as a shape-fitting game for kids: you have to define the hole so that only the desired shape fits.

[Image: RegEx is all about defining a pattern that only selected items fit]

RegEx ABC

To build a pattern you need to learn the RegEx characters. Here are the characters most frequently used in Google Analytics and Google Tag Manager (GTM).

RegEx   Meaning                                       Example
|       or                                            a|b – matches a or b
.       any single character                          a.c – matches abc, acc, adc, …
?       zero or one of the previous character         goo?gle – matches gogle and google, but not gooogle
*       zero or more of the previous character        goo*gle – matches gogle, google, gooogle
+       one or more of the previous character         goo+gle – matches google, gooogle, but not gogle
^       start of the string                           ^apple – matches apple juice, but not pineapple
$       end of the string                             apple$ – matches pineapple, but not apple juice
[]      list of characters to match                   [a-z] – matches any lowercase letter from a to z
                                                      b2[cb] – matches b2c, b2b
()      group elements                                Jan(uary)? – matches Jan, January
                                                      January? – matches Januar, January
{}      character count: {x}, {x,y}                   [0-9]{2} – matches any two-digit string from 00 to 99
                                                      [0-9]{1,3} – matches any string of one to three digits, from 0 to 999
\       treat RegEx characters as normal characters   \? – matches a literal question mark, not “zero or one character”
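
The examples in the table can be checked with a few lines of code. This is only a sketch in Python, whose `re` module is not the same engine Google products use, though the basic characters listed here behave the same way. The non-matching strings such as `ggle` or `b2a` are my own additions for illustration.

```python
import re

# Each entry: (pattern, strings that should match, strings that should not).
# re.fullmatch requires the whole string to fit the pattern.
examples = [
    (r"a|b", ["a", "b"], ["c"]),
    (r"a.c", ["abc", "acc", "adc"], ["ac"]),
    (r"goo?gle", ["gogle", "google"], ["gooogle"]),
    (r"goo*gle", ["gogle", "google", "gooogle"], ["ggle"]),
    (r"goo+gle", ["google", "gooogle"], ["gogle"]),
    (r"b2[cb]", ["b2c", "b2b"], ["b2a"]),
    (r"Jan(uary)?", ["Jan", "January"], ["Januar"]),
    (r"January?", ["Januar", "January"], ["Jan"]),
    (r"[0-9]{2}", ["00", "42", "99"], ["7", "100"]),
    (r"[0-9]{1,3}", ["0", "42", "999"], ["1000"]),
    (r"\?", ["?"], ["a"]),
]

def check(pattern, should_match, should_not_match):
    """Return True when every expectation for the pattern holds."""
    ok_hits = all(re.fullmatch(pattern, s) for s in should_match)
    ok_misses = not any(re.fullmatch(pattern, s) for s in should_not_match)
    return ok_hits and ok_misses

results = {pattern: check(pattern, hits, misses) for pattern, hits, misses in examples}

# ^ and $ only make a difference with partial matching, so they use re.search.
starts_with_apple = [s for s in ["apple juice", "pineapple"] if re.search(r"^apple", s)]
ends_with_apple = [s for s in ["apple juice", "pineapple"] if re.search(r"apple$", s)]
```

Every value in `results` should come out as `True`; if one does not, the pattern and the expectation disagree.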

How do you know you got the RegEx right?

  1. Test. Apply filters and see if they work as expected.
  2. Use RegEx debuggers – tools that allow you to enter your pattern and test different self-created strings to see if there is a match or not. If you google them, you will find a dozen different tools. One of my favorites is https://www.debuggex.com/ as it has a pretty cool pattern visualization.

Part 2. RegEx in Google Analytics 4

Main difference between Universal Analytics (for those who remember it) and Google Analytics 4

Universal Analytics regular expressions used partial pattern match by default, while in Google Analytics 4 the pattern has to match the whole value.

So a matches regex filter on the source / medium dimension with the value google will match only data that is exactly google. To filter all the entries that contain google, you would need to use the regular expression .*google.*
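
The difference between full and partial matching can be sketched in Python, where `re.fullmatch` stands in for the GA4 behaviour described above and `re.search` for the old Universal Analytics default. The source / medium values are made up for illustration.

```python
import re

# Hypothetical "source / medium" values, made up for illustration.
values = ["google", "google / organic", "google / cpc", "bing / organic"]

# Full match: the whole value must fit the pattern.
full_google = [v for v in values if re.fullmatch(r"google", v)]

# Partial match: the pattern may fit anywhere inside the value.
partial_google = [v for v in values if re.search(r"google", v)]

# Wrapping the pattern in .* makes a full match behave like a partial one.
wrapped_google = [v for v in values if re.fullmatch(r".*google.*", v)]
```

Here `full_google` keeps only the bare `google` value, while `partial_google` and `wrapped_google` both keep the three values that contain it.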

Where can you use RegEx matching in GA4? 

Regular expressions can be used in:

  • Exploration custom report data filters
  • Report filters in the Reports section, when editing a report
  • Referral Exclusions (Admin > Data Streams)
  • Audience creation (Admin > Audiences)
  • Data Stream filters to define internal traffic rules based on IP address ranges (Admin > Data Streams)
  • Custom Channel Group creation (Admin > Data Settings)
  • Event Modification (Admin > Events)
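
As an illustration of the internal traffic rule from the list above, here is a rough Python sketch of a pattern for an IP range. The range 203.0.113.0 to 203.0.113.255 is a hypothetical office network (a documentation address block), and full matching is assumed.

```python
import re

# Hypothetical office range 203.0.113.0 - 203.0.113.255 (assumption for illustration).
# Dots are escaped so they match a literal dot instead of any character.
OFFICE_RANGE = r"203\.0\.113\.[0-9]{1,3}"

# The same pattern without escaping, to show what goes wrong.
UNESCAPED = r"203.0.113.[0-9]{1,3}"

candidates = ["203.0.113.7", "203.0.113.250", "203.0.114.7", "203.0.113.7000", "203x0x113x7"]

internal = [ip for ip in candidates if re.fullmatch(OFFICE_RANGE, ip)]
internal_unescaped = [ip for ip in candidates if re.fullmatch(UNESCAPED, ip)]
```

The unescaped version also accepts `203x0x113x7`, which is why the backslashes matter. Note that `[0-9]{1,3}` is loose: it also accepts values like 999 that are not valid in an IP address, which is harmless for this purpose.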

RegEx is not supported in (at least for now):

  • Interactive Table filters in Standard reports

I won’t cover all the use cases here, just a few examples and tips.

RegEx in GA4 Reporting section

Unfortunately, RegEx is not supported in standard report interactive filters for tables. I hope Google will add it someday, as it was very convenient in Universal Analytics.

RegEx won’t work here :(

However, you can edit the default reports or create your own, and use RegEx in filters through the report editing functionality. Both full and partial match options are available there.

Click on the Customise option, and there you can apply filters to the whole report or to summary cards.

RegEx in GA4 Exploration section

Explorations is the main playground for reports in Google Analytics 4, and here RegEx matching is available alongside other filtering options.

The most frequent RegEx use is probably report filtering.

You can also use RegEx in Funnel step filters, for example to create and analyse a funnel for specific products.

RegEx for Referral Exclusion in GA4

Referral exclusion is very important to ensure proper conversion attribution, especially for e-commerce, where payment gateways often “steal” conversion credit from true conversion sources.

In Google Analytics 4, Referral Exclusion is well hidden, so it won’t be that easy to find.

  • Go to Data Streams
  • Click on a Web Data Stream
  • Click on Configure tag settings
  • Click on Show more in the Settings section (almost there)
  • You should finally see the List unwanted referrals option
  • Add the domains you want to exclude, one per entry or several at once using RegEx matching

 

Part 3. RegEx in Google Tag Manager

Regular expressions in Google Tag Manager use partial match by default. In many places, for example in Trigger conditions, there are case-sensitive and case-insensitive options, as well as positive (matches) and negative (does not match) ones.

Where can you use RegEx matching in GTM? 

  • Trigger conditions
  • RegEx Table Variable
  • Custom JavaScript Variable, using RegExp() function
  • Custom HTML Tags, with JavaScript code

I will cover a few examples below, but I strongly suggest checking the article from MeasureSchool.

Trigger for all events (.*)

The dot means any character and the asterisk means zero or more of the previous character, so together they match any sequence of characters. The pattern therefore matches any event name. Just don’t forget to check the “Use regex matching” option.

Trigger for multiple event names (|)

Similarly, instead of creating many separate Triggers, you can create one for different event names.
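
A small Python sketch of both trigger patterns, using `re.search` to imitate partial matching. The event names are hypothetical, and the anchored variant is my own suggestion for cases where a partial match is too loose.

```python
import re

# Hypothetical event names, for illustration only.
events = ["add_to_cart", "purchase", "purchase_refund", "newsletter_signup"]

# .* fits any event name.
all_events = [e for e in events if re.search(r".*", e)]

# Alternation: one pattern instead of several triggers.
# With partial matching, "purchase" also fits "purchase_refund".
loose = [e for e in events if re.search(r"add_to_cart|purchase", e)]

# Anchoring with ^( ... )$ limits the match to exactly these names.
strict = [e for e in events if re.search(r"^(add_to_cart|purchase)$", e)]
```

Whether the loose or the strict version is the right one depends on how your events are named.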

Trigger for Home pageviews

The pattern for a homepage would usually be ^/$. In human language it means “starts and immediately ends with /”.

As the URL variable contains the domain name and possibly query parameters (whatever comes after ?), it is better to use Page Path (make sure you have it enabled in GTM, under Variables > Enabled Built-In Variables).

If the site is multilingual, the homepage can also be /en, /en/, /lv or /lv/. The pattern would then be ^/(lv|en)?/?$.

Looks complicated, right? :) To describe this pattern in plain English – Page Path must:

  1. ^/ – start with a slash;
  2. (lv|en)? – have either lv or en once, or neither of them;
  3. /?$ – end with / or with nothing more.

The question mark is the one that says “zero or one of the previous element”, and the () brackets are used to group elements. See the nice visualization of this RegEx below.

Multilingual homepage RegEx visualization with debuggex.com
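
To see the multilingual homepage pattern in action, here is a short Python sketch. The list of paths is made up for illustration.

```python
import re

# The multilingual homepage pattern from the text above.
HOMEPAGE = re.compile(r"^/(lv|en)?/?$")

# Hypothetical Page Path values, for illustration only.
paths = ["/", "/en", "/en/", "/lv", "/lv/", "/en/about", "/english", "/de/"]

homepages = [p for p in paths if HOMEPAGE.search(p)]
other_pages = [p for p in paths if not HOMEPAGE.search(p)]
```

The paths `/en/about`, `/english` and `/de/` end up in `other_pages`, because the pattern is anchored at both ends and lists only lv and en.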

 

Part 4. RegEx in Search Console

Regular expressions in Google Search Console use partial match by default. According to Google's documentation, matching is case-sensitive by default, and you can prefix the pattern with (?i) to make it case-insensitive. See the Google documentation for more details.

The most common use case is probably filtering Queries or URLs in Performance reports, for example matching misspellings or grouping search queries.
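
As a rough illustration of catching misspellings with one pattern, here is a Python sketch that uses `re.search` for partial matching. Both the queries and the pattern are hypothetical examples of mine, not taken from a real report.

```python
import re

# Hypothetical search queries, made up for illustration.
queries = [
    "google analytics regex",
    "google analitics filter",
    "annalytics 4 guide",
    "tag manager trigger",
]

# n+ allows a doubled "n", [iy] allows either spelling of the vowel.
ANALYTICS_SPELLINGS = r"an+al[iy]tics"

analytics_queries = [q for q in queries if re.search(ANALYTICS_SPELLINGS, q)]
```

The first three queries are kept, including the two misspelled ones.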

 

Part 5. RegEx in Looker Studio

You can use RegEx in report Filters.

Another quite popular RegEx use case is in custom dimension formulas, for example grouping Search Console queries for this blog (see this article for more details). Similarly, you can group or clean up traffic sources, campaign names, etc. I suggest checking this article by OptimizationUp for more examples.

CASE
  WHEN REGEXP_MATCH(Query, "") THEN "not set"
  WHEN REGEXP_MATCH(Query, ".*(regex|regular|match|path|url).*") THEN "RegEx"
  WHEN REGEXP_MATCH(Query, ".*(pageview|virtual|event).*") THEN "Pageview vs Events"
  WHEN REGEXP_MATCH(Query, ".*(link|click).*") THEN "Link Tracking"
  WHEN REGEXP_MATCH(Query, ".*(javascript|js|variable|lowercase).*") THEN "JS variables"
  WHEN REGEXP_MATCH(Query, ".*(debug|working).*") THEN "Debugging"
  WHEN REGEXP_MATCH(Query, ".*blog.*") THEN "Blog"
  ELSE "other"
END
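
The logic of this formula (ordered rules, first match wins) can be tried outside Looker Studio too. Below is a rough Python equivalent, not the Looker Studio implementation; it uses `re.fullmatch` because REGEXP_MATCH requires the whole value to match, and the function name `classify` is my own.

```python
import re

# Ordered (pattern, label) rules copied from the formula above; the first match wins.
RULES = [
    (r"", "not set"),
    (r".*(regex|regular|match|path|url).*", "RegEx"),
    (r".*(pageview|virtual|event).*", "Pageview vs Events"),
    (r".*(link|click).*", "Link Tracking"),
    (r".*(javascript|js|variable|lowercase).*", "JS variables"),
    (r".*(debug|working).*", "Debugging"),
    (r".*blog.*", "Blog"),
]

def classify(query: str) -> str:
    """Return the label of the first rule whose pattern fits the whole query."""
    for pattern, label in RULES:
        if re.fullmatch(pattern, query):
            return label
    return "other"

# Hypothetical queries, for illustration only.
labels = [classify(q) for q in ["", "ga4 regex filter", "gtm click tracking", "hello world"]]
```

Because the rules are checked in order, a query such as "regex debug" lands in "RegEx" and never reaches the "Debugging" rule, so put the most specific groups first.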

Part 6. Common RegEx patterns

GA4 Ecommerce events

.*(select_promotion|view_promotion|view_item_list|select_item|view_item|add_to_wishlist|add_to_cart|remove_from_cart|view_cart|begin_checkout|add_payment_info|add_shipping_info|purchase).*

Payment gateway referrals (global) for exclusion

.*(paypal|stripe|pay\.google|secure|visa|klarna|3ds|payments).*
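
Before using a broad pattern like this, it is worth testing it against a few referrer domains. A minimal Python sketch, assuming full matching and using made-up domains:

```python
import re

# The global payment gateway pattern from above.
PAYMENT_REFERRALS = r".*(paypal|stripe|pay\.google|secure|visa|klarna|3ds|payments).*"

# Hypothetical referrer domains, for illustration only.
domains = [
    "paypal.com",
    "checkout.stripe.com",
    "pay.google.com",
    "news.example.com",
    "securenews.example.com",
]

excluded = [d for d in domains if re.fullmatch(PAYMENT_REFERRALS, d)]
kept = [d for d in domains if not re.fullmatch(PAYMENT_REFERRALS, d)]
```

Note that the made-up `securenews.example.com` is excluded as well, because short words like secure or visa fit many domains. Check your own referral list before applying the pattern.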

Payment gateway referrals (local) for exclusion

.*(swedbank|seb|citadele|klix|dnb|luminor|privatbank|makecommerce|maksekeskus|lpb|paysera).*

Authorization referrals for exclusion

.*(accounts\.google).*


Note: This article was first published on Jan 24th 2015 and updated in 2023.

[cover photo by markus spiske]

12 thoughts on “RegEx guide for Google Analytics 4, Google Tag Manager, Search Console, Looker Studio”

  1. Simply awesome! This gives a great breakdown of some basic patterns and serves as a launch point. Now I’m really starting to understand.

    The table explains what each symbol does in plain English, and the examples are actually usable in my work.

  2. This is so helpful. Nice to get an explanation that’s well written and so straightforward that it’s easy to understand. Most articles on regex and GTM are not!
    Thanks Aleksandrs

  3. Hi aleksandrs,

    Thanks for the article. I have a question: we have a multilingual website, for example: http://www.example.com/nl/.

    I want to track the NL site separately in GA, using GTM, but I’m not getting the code installed correctly; it still measures all website traffic. Is it because of the GA code, or is the GTM tracking installed incorrectly? Below is a screenshot of the settings:

    PagePath > Matches RegEx > ^/(lv|nl)?/?$
    Page Hostname > Contains > http://www.example.com

    Please help me out!

    Regards,

    Reinier

    1. Hi,

      I would need more information to propose something specific, but in your case maybe it will be enough just to have “Page Path starts with /nl/”. Or if you have multiple language versions and want to track them within different Properties, you can create:
      1) a JS variable with the language version – get the language from the URL (see the code here)
      2) a Lookup Table based on the language variable just created, to return the language-specific GA property

  4. Thanks a lot, really useful article. It helped me to determine home page in particular for scroll tracking tag :)

  5. I have for example

    http://www.example.com/en
    http://www.example.com/en-AU
    http://www.example.com/en-CA
    http://www.example.com/fr-FR
    http://www.example.com/de-DE
    http://www.example.com/pt-PT

    plus

    http://www.example.com/en/about
    http://www.example.com/en-AU/help
    http://www.example.com/en-CA/help
    http://www.example.com/fr-FR/faq
    http://www.example.com/de-DE/login
    http://www.example.com/pt-PT/register and so on….

    I have a tag created to add dynamic titles to each of these pages. What I need is to make default titles and fire on all these occurrences, because there are more, one per country.

    What would be the RegEx pattern to detect:

    /en
    /en-AU
    /en/
    /fr-FR/
    /es-DE/ etc.

    Please help
    Thanks!

  6. Thank you for publishing the Homepage filter with the single slash. I just kept returning everything with a slash in it (so the whole site : ) ).

  7. Great page thanks very much.

    I was working on GTM trigger for homepage and also had to ensure to include any UTM params. I think you can also use Page Url with the following regex:
    ^.*website.com($|\/$|\?.*)

    See here for details:
    https://regex101.com/r/hcG2lq/1

    1. Hi,

      Thanks! Yes, good point if using Page URL, as with Page Path query parameters would be omitted from the value matched by RegEx.
