Regular Expressions Usage Guide
Regular expressions are patterns used in computer programming to search text, such as many lines of code, for a specific piece of data that would otherwise take hours of manual searching and sifting.
At Qualaroo, we use regular expressions (shortened to regex and regexes) to let you target your surveys to a specific set of pages or URLs that are complex or dynamic.
Key Benefits
- Targeting Specific Pages: Customize your surveys to display only on designated pages using regex patterns.
- Dynamic URL Matching: Seamlessly adapt your survey targeting to dynamic URLs that follow a pattern.
- Complex URL Structures: Effectively handle intricate URL structures and still achieve accurate targeting.
In this article:
1. Excluding URLs, Focusing on specific URLs - Negative and Positive Lookaheads
3. Question Mark - Not Required
6. The OR Pipe (|) - How to Target Several Specific Pages on Your Site
7. Using Parentheses, Brackets, and Sets
Excluding URLs, Focusing on specific URLs - Negative and Positive Lookaheads
Excluding a portion of your site with Negative Lookahead
Negative lookahead allows you to exclude whole sets of pages, files, subdomains, or any other part of the URL you don't want to target. In regex terms, it looks for something NOT followed by something else. You specify what you DON'T wish to include and put it inside these characters: (?!StuffYouDontWant)
Understanding the Negative Lookahead with examples
1. Excluding a Section
If you want to target your survey to all the pages in http://mysite.com/photos/, /cats/, and /documentation/ but not /users/ or any top-level page, add the following values to your regex fields:
Subdomain: (www)?
TLD: mysite.com
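Path: (?!users\/)[^\/]+\/.+
The (?!users\/) lookahead rejects the /users/ section, and [^\/]+\/.+ requires a folder name followed by a page. With this, you can target any page on your site that sits in a subfolder, EXCEPT for pages in the /users/ section. Pages that are not in a subfolder are not targeted either. For example: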
- http://mysite.com/photos/NevadaDesert.html
- http://mysite.com/photos/DeathValley.html
- http://mysite.com/photos/Carrum.html
- http://mysite.com/cats/PeggySue.html
- http://mysite.com/cats/Turbo.html
- http://mysite.com/documentation/Qualaroo.html
- http://mysite.com/documentation/personalwebsite.txt
- http://mysite.com/documentation/NextBigAndroidApp.php
- http://mysite.com/documentation/1337Resume.html
But the following pages will not display the survey:
- http://mysite.com/users/admin
- http://mysite.com/users/user1234
- http://mysite.com/users/mom
- http://mysite.com/contact
- http://mysite.com/about-us
- http://mysite.com/pricing/
- http://blog.mysite.com/
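The behavior of the negative lookahead can be checked outside Qualaroo. The following is a minimal sketch using Python's re module on the path portion only. How Qualaroo joins and anchors the three fields is not shown in this guide, so matching the whole path with fullmatch is an assumption. It compares a bare (?!users).* with a stricter variant that also requires a subfolder.

```python
import re

# Bare negative lookahead: reject any path that starts with "users".
BARE = re.compile(r"(?!users).*")

# Stricter variant: also require a folder segment, a slash,
# and at least one more character after it.
IN_SUBFOLDER = re.compile(r"(?!users\/)[^\/]+\/.+")

paths = ["photos/Carrum.html", "cats/Turbo.html", "users/admin", "contact", "pricing/"]

for path in paths:
    bare = BARE.fullmatch(path) is not None
    strict = IN_SUBFOLDER.fullmatch(path) is not None
    print(f"{path:20} bare={bare!s:5} subfolder={strict}")
```

In this sketch the bare pattern rejects only users/admin, while the stricter variant also rejects contact and pricing/, because neither has a page name after a folder.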
2. Excluding Groups of Pages
If you want to target all the item pages in snacktastic.com/products/item-### and snacktastic.com/products/seasonal-### but exclude snacktastic.com/products/promo-###, add the following to your regex fields:
Subdomain: (www)?
TLD: snacktastic.com
Path: products\/(?!promo).*-\d+\/?
The survey will be displayed on the following pages:
- http://snacktastic.com/products/item-0733
- http://snacktastic.com/products/item-561211
- http://snacktastic.com/products/seasonal-559
- http://snacktastic.com/products/seasonal-01223
But not on:
- http://snacktastic.com/products/promo-001
- http://snacktastic.com/products/promo-55776
You could also use the | character to get the same results:
Subdomain: (www)?
TLD: snacktastic.com
Path: products\/(item|seasonal)-\d+\/?
There are multiple ways to get the same results with regular expressions. You might use one set of tools more frequently than another, and that's fine.
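As a rough check that the two approaches agree on the snacktastic.com examples, here is a short Python sketch. It assumes the digit class is written \d and that the pattern must cover the whole path (fullmatch). Python's re module is only a stand-in for whatever engine Qualaroo uses.

```python
import re

# Two ways to express the same targeting rule for the path.
BY_EXCLUSION = re.compile(r"products\/(?!promo).*-\d+\/?")
BY_ALTERNATION = re.compile(r"products\/(item|seasonal)-\d+\/?")

paths = [
    "products/item-0733",
    "products/seasonal-559",
    "products/promo-001",
]

for path in paths:
    excluded = BY_EXCLUSION.fullmatch(path) is not None
    listed = BY_ALTERNATION.fullmatch(path) is not None
    print(path, excluded, listed)
```

The two patterns agree only for these URL shapes. The lookahead version would also accept a new prefix such as products/bundle-12 (a made-up example), while the alternation version accepts only the prefixes it lists.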
Including a specific portion of your site with a Positive Lookahead
A positive lookahead is the opposite of a negative lookahead: it defines a pattern that MUST appear in the URL for the page to be targeted. You can do this by wrapping whatever you want to require in (?= ).
If you want to target any page on your site with "dragonfly" in the path, add the following values to your regex fields:
Subdomain: (www)?
TLD: naturaljewelrydesigns.com
Path: .*(?=dragonfly)
This regex will match any page with "dragonfly" anywhere in the URL path:
- http://www.naturaljewelrydesigns.com/dragonfly
- http://www.naturaljewelrydesigns.com/products/rings/dragonfly
- http://www.naturaljewelrydesigns.com/new_designs/greendragonfly.php
Also, you can combine the positive lookahead with other regex characters by adding the following value to the regex fields:
Subdomain: (www)?
TLD: naturaljewelrydesigns.com
Path: .*(?=dragonf(ly|lies|ire))
A regex like this will match any page with the words dragonfly, dragonflies, or dragonfire in the URL path.
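The positive lookahead above can be tried in a few lines of Python. This is a sketch on the path portion only. The helper name targets is hypothetical, and anchoring only at the start of the path (re.match) is an assumption about how the pattern is applied.

```python
import re

# ".*" consumes the path up to a point where the lookahead succeeds.
DRAGON = re.compile(r".*(?=dragonf(ly|lies|ire))")


def targets(path: str) -> bool:
    # re.match anchors at the start only; nothing is required after the lookahead.
    return DRAGON.match(path) is not None


for path in ["products/rings/dragonfly", "new_designs/greendragonfly.php",
             "dragonfire", "products/rings/butterfly"]:
    print(path, targets(path))
```

A lookahead checks the text without consuming it, so if the engine required the pattern to cover the entire path, this form would not match. In that case .*dragonf(ly|lies|ire).* gives the same yes/no answer.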
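How to Target a Path Using Backslash
A backslash placed before a special character, such as a slash, period, or question mark, makes the regular expression match that character literally, which lets you target a specific path.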
Step 1: Navigate to WHERE in the TARGETING section.
Step 2: Select the radio button in front of the “Use an advanced URL” option.
Step 3: To target http://staging.company.com/section/cart?promo=749387493, enter the following URL components in the regex fields:
Subdomain: staging
TLD: company.com
Path: section\/cart\?promo=.*
Validate the path using the regex validator to see how the whole URL appears in Qualaroo:
Here,
- The purple arrows are where Qualaroo automatically escapes the periods and slashes in between the three fields.
- The pink arrows are where you use backslashes to escape the slashes, periods, and question marks that you want to match literally in the URL.
- The green arrow at the end of the regex shows where the period-asterisk regex pattern is left unescaped because those are special characters that are part of the regex.
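To see why the question mark needs a backslash, the sketch below compares the path with and without escaping, using Python's re module as a stand-in for Qualaroo's validator. Only the Path field is modeled, and matching the whole path with fullmatch is an assumption.

```python
import re

url_path = "section/cart?promo=749387493"

# Unescaped: "?" makes the preceding "t" optional instead of matching a literal "?".
unescaped = re.compile(r"section/cart?promo=.*")

# Escaped: "\/" and "\?" match the literal characters; ".*" stays a wildcard.
escaped = re.compile(r"section\/cart\?promo=.*")

print(unescaped.fullmatch(url_path) is not None)  # False
print(escaped.fullmatch(url_path) is not None)    # True
```

Without the backslash, the pattern would instead match paths like section/cartpromo=1, which is rarely what is intended.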
How to Use a Question Mark for Targeting a Subdomain
Some websites are set up to load both “www.site.com” and “site.com” versions. If you use the Simple URL targeting field, Qualaroo handles both versions automatically.
But if you use a regex, you need the question mark, which makes the preceding group optional, to ensure the survey appears on both the www.site.com and site.com versions.
Targeting the www Subdomain
Step 1: Navigate to WHERE in the TARGETING section.
Step 2: Select the radio button in front of the “Use an advanced URL” option.
Step 3: Enter the following URL components in the regex fields:
Subdomain: (www)?
TLD: site.com
Path:
This will target both http://www.site.com and http://site.com.
For websites that only use the www version, add the following URL components in the regex fields:
Subdomain: www
TLD: company.com
Path:
NOTE: You don't have to escape the period after the "www" because Qualaroo automatically escapes the periods and slashes between the three fields.
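The difference between an optional and a required www can be sketched in Python. This guide does not show how Qualaroo joins the Subdomain and TLD fields, so the host is written here as one hand-built pattern. That form, including the https? prefix, is an assumption for illustration.

```python
import re

# Hand-written equivalent of Subdomain "(www)?" plus TLD "site.com" (assumed form).
OPTIONAL_WWW = re.compile(r"https?:\/\/(www\.)?site\.com\/?")

# Equivalent of Subdomain "www": the prefix is now mandatory.
REQUIRED_WWW = re.compile(r"https?:\/\/www\.site\.com\/?")

for url in ["http://www.site.com", "http://site.com", "http://blog.site.com"]:
    optional = OPTIONAL_WWW.fullmatch(url) is not None
    required = REQUIRED_WWW.fullmatch(url) is not None
    print(f"{url:24} optional={optional!s:5} required={required}")
```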
How to Use Digit Characters in Regular Expressions
The \d character matches any single digit. Using product-\d\.html in the Path field matches pages such as:
- product-0.html
- product-1.html
- product-2.html
- ...
- product-9.html
If the numbers have more than one digit, add a plus sign after \d to match one or more digits. Enter the following URL components in the regex fields:
Subdomain: (www)?
TLD: website.com
Path: product-\d+\.html
This regex matches:
- product-0.html
- product-1.html
- product-2.html
- product-10.html
- product-2450.html
- ...
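A quick way to compare \d with \d+ is to run both against a handful of page names. The sketch below does this in Python, assuming the pattern has to cover the whole file name. The page list is made up for illustration.

```python
import re

ONE_DIGIT = re.compile(r"product-\d\.html")    # exactly one digit
ANY_DIGITS = re.compile(r"product-\d+\.html")  # one or more digits

pages = ["product-7.html", "product-10.html", "product-2450.html", "product-.html"]

single = [p for p in pages if ONE_DIGIT.fullmatch(p)]
multi = [p for p in pages if ANY_DIGITS.fullmatch(p)]

print(single)  # ['product-7.html']
print(multi)   # ['product-7.html', 'product-10.html', 'product-2450.html']
```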
How to Use Words in Regular Expression
Add the following URL components in the regex fields to use the “\w” word character:
Subdomain: (www)?
TLD: website.com
Path: \w+\.php
This matches pages such as:
- paris.php
- Melbourne.php
- McMurdo_Field_Work_Presentation.php
- premium_plan_2016Sept.php
- Loflo_washer_23998.php
With \w+\.php in the regular expression, you can target pages with human-readable names that don't use special characters, because \w matches any letter, digit, or underscore. These can be anything like photo album folders, documents that your users have created, or product pages that include the name and ID of the product.
NOTE: \w matches both uppercase and lowercase letters, but literal characters in the regular expression, such as php, are case-sensitive.
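The sketch below shows which names \w+\.php accepts in Python's re module, used here as a stand-in for Qualaroo's engine. The last two names are made-up counterexamples, and matching the whole file name with fullmatch is an assumption.

```python
import re

WORD_PAGE = re.compile(r"\w+\.php")

names = [
    "paris.php",
    "McMurdo_Field_Work_Presentation.php",
    "premium_plan_2016Sept.php",
    "new-york.php",  # the hyphen is not a word character
    "paris.PHP",     # the literal "php" is case-sensitive by default
]

matched = [n for n in names if WORD_PAGE.fullmatch(n)]
print(matched)
```

In this sketch the first three names match, and the two counterexamples do not.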
How to Use the Dot-Star Combination
Add “products\/.*\/help” to the following regex fields:
Subdomain: (www)?
TLD: website.com
Path: products\/.*\/help
The .* combination in products\/.*\/help matches any sequence of characters, so the regex will match the following URLs:
- website.com/products/iphone_case_blue-346610/help
- website.com/products/photoalbum-8x12/help
- website.com/products/spatulas/help
- website.com/products/any-P0ss1ble_characters/help
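The dot-star behavior can be sketched in Python on the path portion. The extra paths beyond the ones listed above are made up, and matching the whole path with fullmatch is an assumption.

```python
import re

HELP_PAGE = re.compile(r"products\/.*\/help")

paths = [
    "products/spatulas/help",
    "products/any-P0ss1ble_characters/help",
    "products/a/b/c/help",        # ".*" also matches across slashes
    "products/spatulas/reviews",  # does not end in /help
    "products/help",              # no product segment between the slashes
]

for path in paths:
    print(path, HELP_PAGE.fullmatch(path) is not None)
```

Because .* also matches slashes, nested paths such as products/a/b/c/help are accepted. If only a single segment should be allowed, [^\/]+ in place of .* is one option.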
NOTE: If you want to target every page on your website or every page in a specific section, you can also use the Simple URL Targeting field. You only need to put a star (*) in the right part of your URL, and Qualaroo will do the rest.
The OR Pipe (|) - How to Target Several Specific Pages on Your Site
How to Target Multiple Pages
Say you want to match these pages:
blog.mycats.com/peggysue.html
blog.mycats.com/turbo.html
Add the following URL components in the regex fields:
Subdomain: blog
TLD: mycats.com
Path: (peggysue|turbo)\.html
to match those pages.
How to Target Multiple Subdomains
If you want to target the gallery pages of the Brazil, Chile, and Argentina sections on largetravelcompany.com, using the OR pipe makes this very easy.
Add the following URL components in the regex fields:
Subdomain: (brazil|chile|argentina)
TLD: largetravelcompany.com
Path: gallery
to target the required gallery pages of Brazil, Chile, and Argentina.
How to Target Multiple Pages Across Multiple Subdomains
You can even use multiple OR characters in the same regex, to target several pages across multiple subdomains.
Add the following URL components in the regex fields:
Subdomain: (library|parks)
TLD: smalltown.gov
Path: (kids|special_events|holiday)-activities\/sign_up_form
to target several pages across multiple subdomains.
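Two subdomains and three sections give six URL combinations. The sketch below enumerates them in Python and checks each against a hand-assembled pattern. The way the fields are joined, including the https? prefix, is an assumption, since Qualaroo's own assembly is not shown in this guide.

```python
import re
from itertools import product

# Hand-assembled from the three fields above (assumed form).
SIGN_UP = re.compile(
    r"https?:\/\/(library|parks)\.smalltown\.gov"
    r"\/(kids|special_events|holiday)-activities\/sign_up_form"
)

subdomains = ["library", "parks"]
sections = ["kids", "special_events", "holiday"]

urls = [
    f"http://{sub}.smalltown.gov/{sec}-activities/sign_up_form"
    for sub, sec in product(subdomains, sections)
]

for url in urls:
    print(url, SIGN_UP.fullmatch(url) is not None)
```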
Using Parentheses, Brackets, and Sets
Here is how you can use Parentheses ( ), brackets [ ], and curly brackets { } in regular expression:
- (a|b) - Matches a OR b.
- [xyz] - Matches any single character in the brackets: x, y, OR z.
- [^a-z] - When inside of a character class, the ^ means NOT. Here, match anything that is NOT a lowercase letter.
- [A-Z] - Capital A through Capital Z.
- [a-z]{2} - Exactly 2 a-z letters.
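Each construct in the list above can be tried against a few sample strings. This Python sketch uses made-up samples and requires each pattern to cover the whole sample (fullmatch), which is an assumption made to keep the comparison strict.

```python
import re

# Pattern -> sample strings to try against it.
examples = {
    r"(a|b)": ["a", "b", "c"],
    r"[xyz]": ["x", "w"],
    r"[^a-z]": ["A", "7", "q"],
    r"[A-Z]": ["Q", "q"],
    r"[a-z]{2}": ["uk", "u", "ukr"],
}

# Keep only the samples that each pattern matches in full.
results = {
    pattern: [s for s in samples if re.fullmatch(pattern, s)]
    for pattern, samples in examples.items()
}

for pattern, hits in results.items():
    print(f"{pattern:10} matches {hits}")
```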
How to Use Parentheses
Using the parentheses and the OR pipe, you can tell your regex to target one word (sometimes called a "string") or another in your URL.
Add the following characters in the regex fields:
Subdomain: blog
TLD: mycats.com
Path: (peggysue|turbo)\.html
to target the URLs blog.mycats.com/peggysue.html and blog.mycats.com/turbo.html.
How to Use Brackets and Curly Brackets
- Brackets let you target a range of letters (like a-z or a-f) or numbers (0-9, 1-5).
- Curly brackets let you ask for a specific number of letters or numbers, or a range of counts you wish to allow.
If you want to use the brackets to show the letter range [a-z] and curly brackets to determine letter count by allowing {2}, add the following characters in the regex fields:
Subdomain: www
TLD: international.com
Path: [a-z]{2}\/products
to target the following sections:
- www.international.com/en/products
- www.international.com/ca/products
- www.international.com/uk/products
- www.international.com/au/products
- www.international.com/nz/products
In this way, to target a survey across several sections of your website, you can use the OR pipe, or the brackets if the sections have a similar format.
Further, if you want to target pages with a specific URL format, like six letters and eight numbers, add the following characters to the regex fields:
Subdomain: (www)?
TLD: gifts-for-everyone.org
Path: holiday\/special_deals\/[a-z]{6}-[0-9]{8}
to match the following pages:
- www.gifts-for-everyone.org/holiday/special_deals/lawnmo-45061367
- www.gifts-for-everyone.org/holiday/special_deals/hairdr-00002239
- www.gifts-for-everyone.org/holiday/special_deals/poster-08825041
and ignore the following pages:
- development.gifts-for-everyone.org/holiday/special_deals/lawnmo-45061367 (wrong subdomain)
- www.gifts-for-everyone.org/holiday/special_deals/lawnmower-45061367 (wrong number of letters)
- www.gifts-for-everyone.org/holiday/special_deals/lawnmo-451367 (wrong number of digits)
NOTE: If you'd like more flexibility in the ranges of letters and numbers in the pages you want to target, the curly brackets can also be used for this.
To target the same kinds of pages mentioned above, but with 4-8 letters and 2-8 numbers, add the following characters to the regex fields:
Subdomain: (www)?
TLD: gifts-for-everyone.org
Path: holiday\/special_deals\/[a-z]{4,8}-[0-9]{2,8}
to allow the survey to be targeted at a wider range of pages, including:
- www.gifts-for-everyone.org/holiday/special_deals/bowl-27
- www.gifts-for-everyone.org/holiday/special_deals/heatlamp-00019223
- www.gifts-for-everyone.org/holiday/special_deals/boots-4512
- www.gifts-for-everyone.org/holiday/special_deals/catmitte-123380
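The exact and flexible curly-bracket counts can be compared side by side. The Python sketch below works on the path portion only and assumes the pattern must cover the whole path (fullmatch).

```python
import re

EXACT = re.compile(r"holiday\/special_deals\/[a-z]{6}-[0-9]{8}")
FLEXIBLE = re.compile(r"holiday\/special_deals\/[a-z]{4,8}-[0-9]{2,8}")

base = "holiday/special_deals/"
slugs = [
    "lawnmo-45061367",     # 6 letters, 8 digits
    "lawnmower-45061367",  # 9 letters
    "lawnmo-451367",       # 6 letters, 6 digits
    "bowl-27",             # 4 letters, 2 digits
    "catmitte-123380",     # 8 letters, 6 digits
]

exact_hits = [s for s in slugs if EXACT.fullmatch(base + s)]
flexible_hits = [s for s in slugs if FLEXIBLE.fullmatch(base + s)]

print(exact_hits)
print(flexible_hits)
```

Only lawnmo-45061367 satisfies the exact counts, while the flexible version accepts every slug except lawnmower-45061367, which has nine letters.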
How to Use Multi-Digit Number Ranges
Regular expressions match text one character at a time, so they have no built-in way to express a range of numbers greater than 9. To set ranges in the double or triple digits, you must specify the range of each digit.
To target pages with numbers 25-50, you can combine a few sets of numbers and ranges: 25-29, then 30-49, and finally 50.
First, define each range, then put them into a single regex. Here,
- 2[5-9] will match 25-29
- (3|4)[0-9] will match 30-39 and 40-49
- 50 will match 50
Separate each number range with the OR pipe to get (2[5-9]|(3|4)[0-9]|50), then add the following characters to your regex fields:
Subdomain: www
TLD: learning_math_is_fun.com
Path: chapter_(2[5-9]|(3|4)[0-9]|50)
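One way to confirm the digit-by-digit range is to test every chapter number from 0 to 99. The Python sketch below does that, assuming the pattern must cover the whole path (fullmatch).

```python
import re

CHAPTER = re.compile(r"chapter_(2[5-9]|(3|4)[0-9]|50)")

# Collect every chapter number from 0 to 99 that the pattern accepts.
matched = [n for n in range(100) if CHAPTER.fullmatch(f"chapter_{n}")]

print(matched[0], matched[-1], len(matched))  # 25 50 26
```

If the engine does not require the pattern to reach the end of the path, a longer number such as chapter_250 would also be accepted because it begins with chapter_25. Anchoring the end, or adding what must follow the number, avoids that.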
That is all about the introduction to regular expressions for URL targeting.