Sometimes you may want to target a wide range of pages while leaving out some that would otherwise be caught up in the match (for example, targeting mysite.com/products/item-### and /products/seasonal-### but excluding /products/promo-###).
Regular expressions have a great feature called "lookahead" and "lookbehind". They work quite literally: a lookahead checks what comes after a given point in the URL, and a lookbehind checks what comes before it. Each can be used in a "positive" or "negative" sense, meaning a regex can require that a particular thing is present (positive) or require that it is absent (negative).
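As a rough illustration of the four combinations, here is a minimal sketch using Python's re module. The patterns and paths are hypothetical examples rather than values from a survey's regex fields, and the regex engine behind your targeting fields may differ from Python's in small ways (Python, for instance, only accepts fixed-width lookbehinds).

```python
import re

# The four lookaround forms, each with a hypothetical check on a URL path.
LOOKAROUNDS = {
    "positive lookahead": r"/products/(?=item-)",    # followed by "item-"
    "negative lookahead": r"/products/(?!promo-)",   # not followed by "promo-"
    "positive lookbehind": r"(?<=/products/)item-",  # preceded by "/products/"
    "negative lookbehind": r"(?<!/products/)item-",  # not preceded by "/products/"
}

def check_lookarounds(path):
    """Return which of the four patterns find a match in the given path."""
    return {name: bool(re.search(rx, path)) for name, rx in LOOKAROUNDS.items()}

for path in ("/products/item-0733", "/products/promo-001", "/archive/item-9"):
    print(path, check_lookarounds(path))
```

Note that a lookaround only checks the surrounding text. It does not become part of the matched text itself.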
Negative Lookahead - How to exclude a portion of your site
One of the most useful implementations of this is the negative lookahead. It allows you to exclude whole sets of pages, files, subdomains, or any other part of the URL you don't want to target. In regex terms, it matches a position that is NOT followed by a given pattern. You specify what you DON'T want to include and put it inside these characters: (?!StuffYouDontWant)
Example 1: Excluding a Section
I want to target my survey to all the pages on http://mysite.com/photos/, /cats/, and /documentation/ but not /users/ or any other single pages.
To use this example, you would add the following to your regex fields: Subdomain: TLD: Path: ( |
This will target any page on my site that is inside a subfolder, EXCEPT for pages in the /users/ section. Pages that are not inside a subfolder are not targeted either. For example, surveys show on these pages:
- http://mysite.com/photos/NevadaDesert.html
- http://mysite.com/photos/DeathValley.html
- http://mysite.com/photos/Carrum.html
- http://mysite.com/cats/PeggySue.html
- http://mysite.com/cats/Turbo.html
- http://mysite.com/documentation/Qualaroo.html
- http://mysite.com/documentation/personalwebsite.txt
- http://mysite.com/documentation/NextBigAndroidApp.php
- http://mysite.com/documentation/1337Resume.html
And these pages do not show surveys:
- http://mysite.com/users/admin
- http://mysite.com/users/user1234
- http://mysite.com/users/mom
- http://mysite.com/contact
- http://mysite.com/about-us
- http://mysite.com/pricing/
- http://blog.mysite.com/
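To check this behavior outside the survey tool, the sketch below tests the URLs listed above in Python. The path pattern is a hypothetical one written for this illustration and is not necessarily the pattern entered in the Path field. The host comparison is only a stand-in for the Subdomain and TLD fields.

```python
import re
from urllib.parse import urlparse

# Hypothetical path pattern (an assumption for this sketch):
#   ^/            start of the path
#   (?!users/)    the first folder must not be "users"
#   [^/]+/        a first folder name followed by a slash
#   .+            at least one more character after that folder
PATH_PATTERN = re.compile(r"^/(?!users/)[^/]+/.+")

def shows_survey(url, host="mysite.com"):
    """Return True when the URL is on the bare host and its path matches."""
    parts = urlparse(url)
    return parts.hostname == host and bool(PATH_PATTERN.search(parts.path))

for url in (
    "http://mysite.com/photos/NevadaDesert.html",
    "http://mysite.com/users/admin",
    "http://mysite.com/contact",
    "http://mysite.com/pricing/",
    "http://blog.mysite.com/",
):
    print(url, shows_survey(url))
```

In this sketch /pricing/ is rejected because nothing follows the folder name, and /contact is rejected because it has no folder at all.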
Example 2: Excluding Groups of Pages
For the example used earlier in this section, say you want to target all the pages under snacktastic.com/products/item-### and snacktastic.com/products/seasonal-###, but exclude snacktastic.com/products/promo-###.
To use this example, you would add the following to your regex fields: Subdomain: TLD: Path: |
The survey will show up on:
- http://snacktastic.com/products/item-0733
- http://snacktastic.com/products/item-561211
- http://snacktastic.com/products/seasonal-559
- http://snacktastic.com/products/seasonal-01223
But not on:
- http://snacktastic.com/products/promo-001
- http://snacktastic.com/products/promo-55776
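One way to express this exclusion is sketched below in Python. The pattern is a hypothetical one for illustration, assuming that product pages always follow the prefix-hyphen-digits shape seen in the listed URLs.

```python
import re

# Hypothetical path pattern (an assumption for this sketch):
#   ^/products/   the products folder
#   (?!promo-)    the page name must not start with "promo-"
#   [a-z]+-\d+$   a lowercase prefix, a hyphen, then digits to the end
PATH_PATTERN = re.compile(r"^/products/(?!promo-)[a-z]+-\d+$")

def is_targeted(path):
    """Return True when the path is a product page that is not a promo page."""
    return bool(PATH_PATTERN.search(path))

for path in (
    "/products/item-0733",
    "/products/seasonal-559",
    "/products/promo-001",
):
    print(path, is_targeted(path))
```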
You could also use the | character to get the same results.
To use this example, you would add the following to your regex fields: Subdomain: TLD: Path: |
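As a sketch of the | approach, the Python snippet below lists the wanted prefixes explicitly instead of excluding the unwanted one. The pattern is a hypothetical illustration, not necessarily the value used in the Path field.

```python
import re

# Hypothetical allow-list pattern (an assumption for this sketch):
# only "item-" and "seasonal-" pages are accepted, so "promo-" pages
# are left out without needing a lookahead.
ALLOW_PATTERN = re.compile(r"^/products/(item|seasonal)-\d+$")

def is_allowed(path):
    """Return True when the path is an item or seasonal product page."""
    return bool(ALLOW_PATTERN.search(path))

for path in (
    "/products/item-561211",
    "/products/seasonal-01223",
    "/products/promo-55776",
):
    print(path, is_allowed(path))
```

The two approaches differ in how they treat pages you add later. An allow-list like this one ignores any new prefix until you add it to the list, while an exclusion-based pattern picks up new prefixes automatically.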
There are many ways to get to the same answer with regular expressions. You might find yourself using one set of tools more frequently than another, and that's fine.
Positive Lookahead - Focusing on a specific portion of your site
A positive lookahead is basically the opposite of a negative lookahead: it defines a pattern that MUST appear in the URL for the page to be targeted. This is done by adding (?= ) around whatever you want to require.
If you want to target any page on your site with "dragonfly" in the path, you can do so very easily.
To use this example, you would add the following to your regex fields: Subdomain: TLD: Path: .* |
This regex will match any page with "dragonfly" anywhere in the URL path:
- http://www.naturaljewelrydesigns.com/dragonfly
- http://www.naturaljewelrydesigns.com/products/rings/dragonfly
- http://www.naturaljewelrydesigns.com/new_designs/greendragonfly.php
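The following Python sketch shows one way such a pattern could look. The leading .* mirrors the Path field above, while the lookahead part is an assumption made for this illustration.

```python
import re

# Hypothetical pattern (an assumption for this sketch):
#   .*             any run of characters in the path
#   (?=dragonfly)  ...that is immediately followed by "dragonfly"
DRAGONFLY_PATTERN = re.compile(r".*(?=dragonfly)")

def has_dragonfly(path):
    """Return True when "dragonfly" appears anywhere in the path."""
    return bool(DRAGONFLY_PATTERN.search(path))

for path in (
    "/dragonfly",
    "/products/rings/dragonfly",
    "/new_designs/greendragonfly.php",
    "/products/rings/butterfly",
):
    print(path, has_dragonfly(path))
```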
You can also combine the positive lookahead with other regex characters:
To use this example, you would add the following to your regex fields: Subdomain: TLD: Path: .* |
A regex like this will match any page with the words dragonfly, dragonflies, and dragonfire in the URL path.
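A combined pattern of this kind might look like the Python sketch below. The grouping with | inside the lookahead is an assumption made for illustration, and the test paths are hypothetical.

```python
import re

# Hypothetical pattern (an assumption for this sketch):
#   .*                          any run of characters in the path
#   (?=dragonf(?:ly|lies|ire))  ...followed by dragonfly, dragonflies,
#                               or dragonfire
DRAGON_PATTERN = re.compile(r".*(?=dragonf(?:ly|lies|ire))")

def has_dragon_word(path):
    """Return True when any of the three words appears in the path."""
    return bool(DRAGON_PATTERN.search(path))

for path in (
    "/products/rings/dragonfly",
    "/collections/dragonflies",
    "/new_designs/dragonfire.php",
    "/products/dragonfruit",
):
    print(path, has_dragon_word(path))
```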