Regular Expressions (RegEx) may seem complicated at first, but once you get to know them, you will manage Google Analytics, Search Console, Google Tag Manager and Looker Studio like never before.
Content:
- Part 1. RegEx basics
- Part 2. RegEx in Google Analytics 4
- Part 3. RegEx in Google Tag Manager
- Part 4. RegEx in Search Console
- Part 5. RegEx in Looker Studio
- Part 6. Common RegEx patterns
- Other resources
Part 1. RegEx basics
A regular expression (RegEx) is a string that describes a specific text pattern. So instead of having multiple “contains” conditions, you can match different data with just one regular expression. Think of it as a shape-fitting game for kids: you have to define the hole so that only the desired shape fits.
RegEx ABC
To build a pattern you need to learn the RegEx characters. Here are the characters most frequently used in Google Analytics and Google Tag Manager (GTM).
RegEx | Meaning | Example |
| | or | a|b – matches a or b |
. | any single character | a.c – matches abc, acc, adc, … |
? | zero or one of the preceding character | goo?gle – matches gogle and google, but not gooogle |
* | zero or more of the preceding character | goo*gle – matches gogle, google, gooogle |
+ | one or more of the preceding character | goo+gle – matches google, gooogle, but not gogle |
^ | start of the string | ^apple – matches apple juice, but not pineapple |
$ | end of the string | apple$ – matches pineapple, but not apple juice |
[] | list of characters to match | [a-z] – matches any lowercase letter from a to z; b2[cb] – matches b2c, b2b |
() | group elements | Jan(uary)? – matches Jan, January; January? – matches Januar, January |
{} | define character count {x}, {x,y} | [0-9]{2} – matches any two-digit string from 00 to 99; [0-9]{1,3} – matches any string of one to three digits, from 0 to 999 |
\ | treat RegEx characters like normal characters | \? – matches a question mark, not zero or one character |
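A few rows of the table can be checked with a short script. This is a minimal sketch in Python, chosen here only for illustration; the Google tools use their own regex engines, so treat it as an approximation that holds for these basic characters. The helper name `matches_fully` is made up for this example.

```python
import re


def matches_fully(pattern: str, text: str) -> bool:
    """Return True only when the whole text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# ? means zero or one of the preceding character
print(matches_fully(r"goo?gle", "gogle"))    # True
print(matches_fully(r"goo?gle", "gooogle"))  # False

# + means one or more of the preceding character
print(matches_fully(r"goo+gle", "gogle"))    # False

# () groups elements, so the ? applies to the whole group
print(matches_fully(r"Jan(uary)?", "Jan"))   # True
# without the group, the ? applies only to the last letter
print(matches_fully(r"January?", "Januar"))  # True

# {} defines the character count
print(matches_fully(r"[0-9]{2}", "07"))      # True

# \ turns a RegEx character into a normal character
print(matches_fully(r"\?", "?"))             # True
```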
How do you know you got the RegEx right?
- Test. Apply filters and see if they work as expected.
- Use RegEx debuggers: tools that let you enter your pattern and test it against your own strings to see whether there is a match. If you google them, you will find a dozen different tools. One of my favorites is https://www.debuggex.com/, as it has a pretty cool pattern visualization.
Part 2. RegEx in Google Analytics 4
Main difference between Universal Analytics (for those who remember it) and Google Analytics 4
Regular expressions in Universal Analytics matched partially by default, while in Google Analytics 4 the pattern has to match the whole value.
So a matches regex filter on the source / medium dimension with the value google will match only entries that are exactly google. To include all entries that contain google, you would need the regular expression .*google.*
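The difference between full and partial matching can be sketched in Python, where `re.fullmatch` behaves like the full match described above and `re.search` like a partial match. The sample values are invented for illustration, and Python's engine is only a stand-in for the one GA4 uses.

```python
import re

# Invented sample values for a source-like dimension
values = ["google", "google / organic", "mail.google.com"]

# Full match: the pattern has to cover the whole value
full = [v for v in values if re.fullmatch(r"google", v)]

# Partial match: the pattern may appear anywhere in the value
partial = [v for v in values if re.search(r"google", v)]

# Full match again, but with .* on both sides to allow extra text
wrapped = [v for v in values if re.fullmatch(r".*google.*", v)]

print(full)     # ['google']
print(partial)  # ['google', 'google / organic', 'mail.google.com']
print(wrapped)  # ['google', 'google / organic', 'mail.google.com']
```

Wrapping the pattern in .* makes a full match behave like a partial one, which is why the patterns later in this article start and end with it.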
Where can you use RegEx matching in GA4?
Regular expressions can be used in:
- Exploration custom report data filters
- Report filters in the Reports section, when editing reports
- Referral Exclusions (Admin > Data Streams)
- Audience creation (Admin > Audiences)
- Data Stream filters to define internal traffic rules based on IP address ranges (Admin > Data Streams)
- Custom Channel Group creation (Admin > Data Settings)
- Event Modification (Admin > Events)
RegEx is not supported in (at least for now):
- Interactive Table filters in Standard reports
I won’t cover all the use cases here, just a few examples and tips.
RegEx in GA4 Reporting section
Unfortunately, RegEx is not supported in the interactive table filters of standard reports. I hope Google will improve this someday, as it was very convenient in Universal Analytics.
However, you can edit the default reports or create your own and use RegEx in filters via the report editing functionality. Both full and partial match options are available.
Click on the Customise option, and there you can apply filters for the whole report or the summary cards.
RegEx in GA4 Exploration section
Explorations are the main playground for reports in Google Analytics 4, and here RegEx matching is available alongside other filtering options.
The most frequent RegEx use is probably report filtering.
You can also use RegEx in Funnel step filters, for example to create and analyse a funnel for specific products.
RegEx for Referral Exclusion in GA4
Referral exclusion is very important to ensure proper conversion attribution, especially for e-commerce, where payment gateways often “steal” conversion credit from true conversion sources.
In Google Analytics 4, Referral Exclusion is well hidden, so it won’t be that easy to find.
- Go to Data Streams
- Click on a Web Data Stream
- Click on Configure tag settings
- Click on Show more in Settings section (almost there)
- You should finally see the List unwanted referrals option
- Add the domains you want to exclude, one per entry or using RegEx matching
Part 3. RegEx in Google Tag Manager
Regular expressions in Google Tag Manager use partial match by default. Many places, such as Trigger conditions, also offer case-sensitive and case-insensitive options, as well as positive (matches) and negative (does not match) ones.
Where can you use RegEx matching in GTM?
- Trigger conditions
- RegEx Table Variable
- Custom JavaScript Variable, using RegExp() function
- Custom HTML Tags, with JavaScript code
I will cover a few examples below, but I strongly suggest checking the article from MeasureSchool.
Trigger for all events (.*)
A dot means any character, and an asterisk means zero or more of the preceding character, so together they match any sequence of characters. The pattern therefore matches any event name. Just don’t forget to check the “Use regex matching” option.
Trigger for multiple event names (|)
Similarly, instead of creating many separate Triggers, you can create one for different event names.
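Because the match is partial, a plain list of alternatives can catch more than intended. The sketch below uses Python's `re.search` as a stand-in for partial matching; the event names, including `purchase_refund`, are hypothetical and only illustrate the effect.

```python
import re

# Hypothetical event names pushed to the data layer
events = ["add_to_cart", "purchase", "view_item", "purchase_refund"]

# Plain alternation: matches if either name appears anywhere
loose = r"add_to_cart|purchase"

# Anchored alternation: the whole event name must be one of the two
strict = r"^(add_to_cart|purchase)$"

loose_hits = [e for e in events if re.search(loose, e)]
strict_hits = [e for e in events if re.search(strict, e)]

print(loose_hits)   # ['add_to_cart', 'purchase', 'purchase_refund']
print(strict_hits)  # ['add_to_cart', 'purchase']
```

Adding ^ and $ around the group is a cheap way to make sure the trigger fires only for the exact event names.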
Trigger for Home pageviews
The pattern for a homepage would usually be ^/$. In plain language it means “starts and immediately ends with /”.
Because the URL variable contains the domain name and possibly query parameters (whatever comes after ?), it is better to use Page Path (make sure it is enabled in your GTM container under Variables > Enabled Built-In Variables).
If the site is multilingual, the homepage can also be /en, /en/, /lv or /lv/. The pattern would then be ^/(lv|en)?/?$.
Looks complicated, right? :) To describe this pattern in plain English, the Page Path must:
- ^/ – start with a slash;
- (lv|en)? – contain either lv or en, or neither;
- /?$ – end with an optional /.
The question mark means “zero or one of the preceding element”, and () brackets are used to group elements.
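The homepage pattern can be checked against a handful of paths. This is a sketch in Python with invented sample paths; `re.search` is used to mimic partial matching, and the anchors in the pattern do the rest.

```python
import re

# The multilingual homepage pattern from the text
HOMEPAGE = re.compile(r"^/(lv|en)?/?$")

# Invented Page Path values to test against
paths = ["/", "/en", "/en/", "/lv", "/lv/", "/en/blog/", "/de/", "/english"]

results = {p: bool(HOMEPAGE.search(p)) for p in paths}

for path, is_home in results.items():
    print(f"{path:10} {is_home}")
```

The first five paths are reported as homepages and the last three are not. One side effect worth knowing: since both the language code and the trailing slash are optional, the pattern also accepts a double slash (//).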
Part 4. RegEx in Search Console
Regular expressions in Google Search Console use partial match by default. According to the Google documentation, matching is case sensitive unless the pattern starts with (?i). See the Google documentation for more details.
The most common use case is filtering Queries or URLs in Performance reports, for example matching misspellings or grouping search queries.
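As an illustration of matching misspellings, here is a small Python sketch. Both the pattern and the queries are made up for this example, and `re.search` stands in for partial matching.

```python
import re

# Hypothetical pattern covering a few spellings of "analytics":
# [iy] allows either letter, s? makes the final s optional
MISSPELLINGS = r"anal[iy]t[iy]cs?"

# Invented search queries
queries = [
    "google analytics",
    "google analitics",
    "google analytic",
    "search console",
]

hits = [q for q in queries if re.search(MISSPELLINGS, q)]
print(hits)
# ['google analytics', 'google analitics', 'google analytic']
```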
Part 5. RegEx in Looker Studio
You can use RegEx in report filters.
Another popular RegEx use case is custom dimension formulas, for example grouping Search Console queries for this blog (see this article for more details). Similarly, you can group or clean up traffic sources, campaign names, etc. I suggest checking this article by OptimizationUp for more examples.
CASE
  WHEN REGEXP_MATCH(Query, "") THEN "not set"
  WHEN REGEXP_MATCH(Query, ".*(regex|regular|match|path|url).*") THEN "RegEx"
  WHEN REGEXP_MATCH(Query, ".*(pageview|virtual|event).*") THEN "Pageview vs Events"
  WHEN REGEXP_MATCH(Query, ".*(link|click).*") THEN "Link Tracking"
  WHEN REGEXP_MATCH(Query, ".*(javascript|js|variable|lowercase).*") THEN "JS variables"
  WHEN REGEXP_MATCH(Query, ".*(debug|working).*") THEN "Debugging"
  WHEN REGEXP_MATCH(Query, ".*blog.*") THEN "Blog"
  ELSE "other"
END
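To see how the formula sorts queries, the same rules can be replayed outside Looker Studio. This is a rough Python equivalent, assuming REGEXP_MATCH requires the whole value to match (hence `re.fullmatch`); the function name `classify` and the sample queries are invented for this sketch.

```python
import re

# Same patterns and labels as in the CASE formula, in the same order
RULES = [
    (r"", "not set"),
    (r".*(regex|regular|match|path|url).*", "RegEx"),
    (r".*(pageview|virtual|event).*", "Pageview vs Events"),
    (r".*(link|click).*", "Link Tracking"),
    (r".*(javascript|js|variable|lowercase).*", "JS variables"),
    (r".*(debug|working).*", "Debugging"),
    (r".*blog.*", "Blog"),
]


def classify(query: str) -> str:
    """Return the label of the first rule that matches the whole query."""
    for pattern, label in RULES:
        if re.fullmatch(pattern, query):
            return label
    return "other"


print(classify(""))                     # not set
print(classify("gtm regex table"))      # RegEx
print(classify("gtm click tracking"))   # Link Tracking
print(classify("virtual pageview url")) # RegEx
print(classify("something else"))       # other
```

As in the CASE statement, the first matching rule wins, so a query such as "virtual pageview url" lands in RegEx rather than Pageview vs Events. Order the rules from most to least important.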
Part 6. Common RegEx patterns
GA4 Ecommerce events
.*(select_promotion|view_promotion|view_item_list|select_item|view_item|add_to_wishlist|add_to_cart|remove_from_cart|view_cart|begin_checkout|add_payment_info|add_shipping_info|purchase).*
Payment gateway referrals (global) for exclusion
.*(paypal|stripe|pay\.google|secure|visa|klarna|3ds|payments).*
Payment gateway referrals (local) for exclusion
.*(swedbank|seb|citadele|klix|dnb|luminor|privatbank|makecommerce|maksekeskus|lpb|paysera).*
Authorization referrals for exclusion
.*(accounts\.google).*
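Before saving exclusion patterns like these, it helps to run them against a few referrer domains. The Python sketch below does that for the global payment gateway and authorization patterns; the domains and the helper `is_excluded` are invented for illustration, and `re.fullmatch` is an assumption standing in for full matching.

```python
import re

# Patterns copied from the lists above
GLOBAL_GATEWAYS = r".*(paypal|stripe|pay\.google|secure|visa|klarna|3ds|payments).*"
AUTHORIZATION = r".*(accounts\.google).*"


def is_excluded(domain: str) -> bool:
    """Return True if the domain fully matches any exclusion pattern."""
    patterns = (GLOBAL_GATEWAYS, AUTHORIZATION)
    return any(re.fullmatch(p, domain) for p in patterns)


# Invented referrer domains
referrers = [
    "www.paypal.com",
    "checkout.stripe.com",
    "accounts.google.com",
    "www.google.com",
    "secure.example.com",
    "news.example.com",
]

excluded = [d for d in referrers if is_excluded(d)]
print(excluded)
# ['www.paypal.com', 'checkout.stripe.com', 'accounts.google.com', 'secure.example.com']
```

Note that broad words such as secure match any domain that contains them, so check the list against your own referral report before applying it.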
Other RegEx resources
Instead of any closing wisdom, I'd rather share some more useful resources to check.
Theory:
- About regular expressions by Google
- Video – Regular Expressions Explanation
- Regular Expressions Cheat Sheet for Google Analytics by Loves Data
Theory with Examples:
- Video – Using regular expressions in GA4 by Measureschool
- How to use RegEx for Google Tag Manager by MeasureSchool
- Regular Expressions: Don’t Use Google Analytics Without Them by annielytics
- Practical Guide to Regex in Google Data Studio by optimizationup.com
Tools:
- https://www.debuggex.com/ and http://regexpal.com/ for RegEx debugging
Note: This article was first published on Jan 24th, 2015 and updated in 2023.
[cover photo by markus spiske]