Google Analytics provides a lot of data and reports. Most come in the form of tables that can contain over 100,000 rows. Sometimes a normal text search is not enough to find something in them. Regular expressions allow complex searches and can be used in many places in Analytics.
What are regular expressions?
You can define complex search queries with a regular expression, or regex for short. A number of meta characters serve as placeholders and define groups or sets. Regular expressions thus offer more possibilities than the usual matches or begins with. Regexes also enable search-and-replace rules and are therefore a powerful tool when working with data.
A few common examples:
|192\.168\.[7-9]\.||Matches all IP addresses from the network ranges 192.168.7., 192.168.8. and 192.168.9., whatever the last number is.|
|(www|blog)\.tirami\.biz||Matches the host names www.tirami.biz and blog.tirami.biz.|
|\.pdf$||Matches all files with the extension .pdf.|
|/service/.*\.pdf$||Matches all files with the extension .pdf in the /service directory or a subdirectory of /service.|
|^/../service/||Matches all URLs whose first directory is two characters long, for example /de/service/, /en/service/ or /it/service/, but not /global/service/.|
|^/[^i][^t]/service/||Matches URLs whose first directory is two characters long, where the first character is not i and the second is not t. This excludes the Italian /it/service/, but also directories such as /is/ or /pt/.|
|/blog/\d+||All pages in the /blog/ directory whose name begins with a digit.|
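To try patterns like these outside of Analytics, a minimal sketch in Python can help. It uses Python's `re` module as a stand-in, so details may differ from the engine Google Analytics uses. The sample IP addresses, host names and paths are invented for illustration, and the `[7-9]` class in the first pattern is an assumed reading of the table's description.

```python
import re

# Patterns taken from the examples above. The [7-9] class is an assumption
# based on the description (third octet 7, 8 or 9).
ip_range = re.compile(r"192\.168\.[7-9]\.")
hosts = re.compile(r"(www|blog)\.tirami\.biz")
pdf_in_service = re.compile(r"/service/.*\.pdf$")
two_char_dir = re.compile(r"^/../service/")
blog_number = re.compile(r"/blog/\d+")


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


# Hypothetical sample values, chosen only to show a hit and a miss each.
print(matches(ip_range, "192.168.8.17"))                   # True
print(matches(ip_range, "192.168.10.17"))                  # False
print(matches(hosts, "blog.tirami.biz"))                   # True
print(matches(hosts, "shop.tirami.biz"))                   # False
print(matches(pdf_in_service, "/service/docs/manual.pdf")) # True
print(matches(two_char_dir, "/de/service/contact"))        # True
print(matches(two_char_dir, "/global/service/"))           # False
print(matches(blog_number, "/blog/2019-review"))           # True
print(matches(blog_number, "/blog/about"))                 # False
```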
How do I define regular expressions in Google Analytics?
With a regex you define placeholders or groups using meta characters. Inside a regex, these characters do not stand for themselves: the dot ., for example, means any character. That’s why many places require you to state explicitly that you are entering a regex and not a simple text string.
The most important meta characters for regular expressions:
|.||Placeholder for any character.||t.rami matches the character strings tirami, tarami, torami and t6rami.|
|*||Repeats the previous character any number of times. The character can also be completely absent.||ti*rami matches tirami, tiiiiiirami and trami.|
|+||Repeats the previous character any number of times. The character must appear at least once.||ti+rami matches tirami and tiiiirami, but not trami.|
|?||The previous character may or may not appear.||ti?rami matches tirami and trami.|
||||Either the expression on the left or the one on the right must match. Corresponds to a link with “or”.||a|b matches a or b.|
|^||The following characters must be at the beginning of the character string.||^/service matches the /service page, but not /customer/service.|
|$||The preceding characters must be at the end of the character string.||products/$ matches /products/, but not /products/wines/.|
|()||Groups several strings, for example for an OR link.||(Red|White) wine matches Red wine and White wine.|
|\||Cancels the meta function of the following special character, turning it into an ordinary character.||tirami.biz matches tirami.biz, but also tirami8biz. tirami\.biz, on the other hand, matches only tirami.biz (with a dot between tirami and biz).|
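The behaviour of these meta characters can be checked with a short sketch. It again uses Python's `re` module as a stand-in for the Analytics regex engine, which is an assumption; the helper names `whole` and `found` are made up for this example. Whole-string matching is used for the quantifier cases so that, for instance, `ti?rami` is not satisfied by a partial hit.

```python
import re


def whole(pattern, text):
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


def found(pattern, text):
    """True if the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None


# (pattern, text, expected result) for whole-string matching.
quantifier_cases = [
    (r"t.rami", "t6rami", True),            # . stands for any character
    (r"ti*rami", "trami", True),            # * allows zero repetitions
    (r"ti+rami", "trami", False),           # + needs at least one i
    (r"ti?rami", "tirami", True),           # ? allows zero or one i
    (r"ti?rami", "tiirami", False),         # ... but not two
    (r"a|b", "b", True),                    # alternation
    (r"(Red|White) wine", "White wine", True),
    (r"tirami.biz", "tirami8biz", True),    # unescaped dot matches 8
    (r"tirami\.biz", "tirami8biz", False),  # escaped dot does not
]

# (pattern, text, expected result) for the anchors ^ and $.
anchor_cases = [
    (r"^/service", "/service", True),
    (r"^/service", "/customer/service", False),
    (r"products/$", "/products/", True),
    (r"products/$", "/products/wines/", False),
]

for pattern, text, expected in quantifier_cases:
    print(pattern, text, whole(pattern, text) == expected)
for pattern, text, expected in anchor_cases:
    print(pattern, text, found(pattern, text) == expected)
```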
Meta characters for lists and character classes:
|[]||List of characters that can appear at this point in the character string. Can be combined with *, + and ?.||t[iao]rami matches tirami, tarami and torami. t[iao]+rami matches tirami, tiiiirami, but also tiaoiaoirami.|
|-||Within lists, the minus sign indicates a range of consecutive characters.||[A-Z] corresponds to a list with all capital letters of the alphabet.|
|^||Within a list, negates the following characters, that is, they must not appear in this position.||t[^i]rami matches tarami and torami, but not tirami.|
|\d||Any digit from 0 to 9.||12\d matches 123, 124 and 128, but not 12A.|
|\D||Any character that is not a digit.||12\D matches 12B, but not 128.|
|\w||Letter, digit or underscore.|
|\W||Any character that is not a letter, digit or underscore.||In URLs, \W matches # or ?.|
|\s||Whitespace: space or tab.||regular\sexpression matches regular expression.|
|\S||Any character that is not whitespace.|
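Lists and character classes can be tried out the same way. The following is a sketch under the assumption that Python's `re` module treats these classes like Analytics does for plain ASCII input; the sample strings, including the path `page?id#top`, are invented.

```python
import re


def whole(pattern, text):
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text, expected result) for whole-string matching.
class_cases = [
    (r"t[iao]rami", "torami", True),
    (r"t[iao]+rami", "tiaoiaoirami", True),  # list combined with +
    (r"[A-Z]", "Q", True),                   # range of capital letters
    (r"[A-Z]", "q", False),
    (r"t[^i]rami", "tarami", True),          # negated list
    (r"t[^i]rami", "tirami", False),
    (r"12\d", "128", True),
    (r"12\d", "12A", False),
    (r"12\D", "12B", True),
    (r"12\D", "128", False),
    (r"regular\sexpression", "regular expression", True),
]

for pattern, text, expected in class_cases:
    print(pattern, text, whole(pattern, text) == expected)

# \W picks out the characters that are not letters, digits or underscores.
non_word = re.findall(r"\W", "page?id#top")
print(non_word)  # ['?', '#']
```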
Where can I use regular expressions in Google Analytics?
Google Analytics allows the use of regular expressions in some places:
Tip: The search field automatically detects when a regular expression is entered, so you don’t have to go to Advanced and change the menu. A regular expression without meta characters is processed like a “contains”.
The vertical bar means roughly: either the text on my left or the text on my right must match. With only two entries this doesn’t save much, but you can chain as many elements as you want. With many entries, this saves some clicking.
As with segments, you can add multiple filters. However, these filters are always linked with AND. One line with only Cologne and one line with only Munich are mutually exclusive. With a regex such as Cologne|Munich you can create an OR list instead.
Fun fact: Here the menu says RegEx, while it says regular expression for segments and RegExp in the table search field.
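The difference between AND-linked filter lines and a single OR regex can be sketched in a few lines of Python. The list of cities stands in for a report column and is invented; the two list comprehensions only imitate the filter logic described here, not how Analytics evaluates it internally.

```python
import re

# Hypothetical values of a "city" dimension.
cities = ["Cologne", "Munich", "Berlin", "Hamburg"]

# Two filter lines linked with AND: a city would have to be both at once.
and_linked = [c for c in cities if c == "Cologne" and c == "Munich"]

# One regex filter with a vertical bar: either alternative may match.
either_city = re.compile(r"Cologne|Munich")
or_list = [c for c in cities if either_city.search(c)]

print(and_linked)  # []
print(or_list)     # ['Cologne', 'Munich']
```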
The expression “/jobs/[a-z]” describes all pages that are within the /jobs/ folder, in this case the job offers.
By the way: If you choose regular expression for the goal, you can also use regex in the steps of the (optional) funnel.
Tip: Test your filter pattern for a goal in the search field of the page report. The results should contain all pages that you want to count as a goal.
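The same kind of dry run can be done offline before touching a goal. The sketch below applies the pattern /jobs/[a-z] to a handful of invented page paths, using Python's `re` module as an assumed stand-in for the Analytics engine. It also shows why testing is worthwhile: without a leading ^, the pattern matches a /jobs/ folder anywhere in the path.

```python
import re

goal_pattern = re.compile(r"/jobs/[a-z]")

# Hypothetical page paths, as they might appear in the page report.
pages = [
    "/jobs/",
    "/jobs/developer",
    "/jobs/2020",
    "/about/jobs/editor",
    "/company/",
]

hits = [page for page in pages if goal_pattern.search(page)]
print(hits)  # ['/jobs/developer', '/about/jobs/editor']

# Anchoring the pattern restricts it to a top-level /jobs/ folder.
anchored = re.compile(r"^/jobs/[a-z]")
anchored_hits = [page for page in pages if anchored.search(page)]
print(anchored_hits)  # ['/jobs/developer']
```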
With filters you can control the incoming data of a data view. You can
- only show selected data in the data view
- exclude certain data
- change data
Translated, this filter means:
- Take everything (.*) that is in field A (host name)
- Take everything (.*) that is in field B (URI)
- and write both, $A1 and $B1, one after the other back into the URI
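The three steps above can be imitated in Python to see what the rewritten URI looks like. This is only a sketch of the logic: the function name `rewrite_uri` and the sample values are invented, and the capture groups play the role of $A1 and $B1.

```python
import re


def rewrite_uri(hostname, uri):
    """Imitate the filter: capture field A and field B, then join them."""
    field_a = re.fullmatch(r"(.*)", hostname)  # (.*) on the host name
    field_b = re.fullmatch(r"(.*)", uri)       # (.*) on the URI
    # group(1) corresponds to $A1 and $B1 respectively.
    return field_a.group(1) + field_b.group(1)


print(rewrite_uri("www.tirami.biz", "/blog/"))  # www.tirami.biz/blog/
```

With a rewrite like this, pages from different host names stay distinguishable in one data view, because the host name becomes part of the URI.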
The regex “/blog/\d+” describes all pages that contain “/blog/” immediately followed by at least one digit (“\d+”).
Note: Regexes work slightly differently in channel definitions than in the other cases. For channels, the specified pattern must match the entire source (medium, etc.). It is not enough for only a part to match. So google matches only the source google, but not google.com.
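In Python terms, the difference resembles `re.search` (a part may match) versus `re.fullmatch` (the whole string must match). The sketch below uses that analogy with invented source names. The widened pattern `google.*` is just one possible way to cover google.com as well and is not taken from the Analytics documentation.

```python
import re

# Hypothetical traffic sources.
sources = ["google", "google.com", "images.google"]

# Partial matching, as in most other places: a part of the value suffices.
partial = [s for s in sources if re.search(r"google", s)]

# Entire-value matching, as described for channel definitions.
entire = [s for s in sources if re.fullmatch(r"google", s)]

# A widened pattern that also covers everything starting with google.
widened = [s for s in sources if re.fullmatch(r"google.*", s)]

print(partial)  # ['google', 'google.com', 'images.google']
print(entire)   # ['google']
print(widened)  # ['google', 'google.com']
```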
With regular expressions you can define patterns more precisely than with a simple contains or starts with filter. In many cases, using a regex is faster than combining multiple text filters. Some requirements can only be implemented with a regex, for example some search-and-replace filters. Regular expressions are understood not only by Google Analytics, but also by other Google tools and many other services. So it’s worth the effort to learn them 🙂