
Oct 18, 2012


Limiting Scope Of Replacement with Regular Expressions in MINUTES!

This article will teach you how to limit the scope of a substitution with regular expressions. Sometimes we'd like to run a substitution on only a specific part of a large text. Suppose you have the following text:

<b>whatever text</b>I do NOT want to do anything here<b>another text</b>
And you'd like to replace every piece of text surrounded by <b> and </b> with the word BOLD, so that the substitution produces the following text:

<b>BOLD</b>I do NOT want to do anything here<b>BOLD</b>
How do you do it with a regular expression?

I'll use PHP to show the solution but you can adapt it to any programming language.
Solution #1
Suppose you have a variable $s that holds the following text:

<b>whatever text</b>I do NOT want to do anything here<b>another text</b>
The following line of PHP code gives you the desired result:
preg_replace_callback('/(<b>)(.+?)(<\/b>)/', function ($m) { return $m[1] . str_replace($m[2], 'BOLD', $m[2]) . $m[3]; }, $s)
Note that preg_replace_callback() replaces every match when you do not supply the limit parameter.
We use a callback so that we can run a PHP function on the matched text. The 'e' modifier of preg_replace() used to do this job, but it was deprecated in PHP 5.5 and removed in PHP 7.0.

This line of code says "Replace every block of text surrounded by <b> and </b> with BOLD using un-greedy matching".

In this case what's matched by (<b>) is stored in $m[1], what's matched by (.+?) is stored in $m[2], and what's matched by (<\/b>) is stored in $m[3]. We simply concatenate $m[1], the result of replacing $m[2] with BOLD, and $m[3] to get the final result. Here str_replace() is just a stand-in: any function can be applied to $m[2], and it will only ever see the text between the tags.
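
Here is a self-contained sketch of Solution #1 that you can paste into a file and run on a current PHP version. It uses preg_replace_callback(). The helper name replaceInsideBold, the 's' modifier, and the strtoupper() example are my own illustrative additions and are not required by the technique.

```php
<?php
// Hypothetical helper (the name is illustrative): applies $transform only to
// the text found between <b> and </b> and leaves everything else untouched.
function replaceInsideBold(string $s, callable $transform): string
{
    $result = preg_replace_callback(
        '/(<b>)(.+?)(<\/b>)/s',   // 's' lets '.' match newlines inside the tags
        function (array $m) use ($transform): string {
            // $m[1] = "<b>", $m[2] = inner text, $m[3] = "</b>"
            return $m[1] . $transform($m[2]) . $m[3];
        },
        $s
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $s;
}

$s = '<b>whatever text</b>I do NOT want to do anything here<b>another text</b>';

// Replace the inner text with the word BOLD.
echo replaceInsideBold($s, function (string $inner): string {
    return 'BOLD';
}), "\n";

// Same scope, different transformation: uppercase only the bold text.
echo replaceInsideBold($s, 'strtoupper'), "\n";
```

The second call shows why limiting the scope matters: strtoupper() only ever sees the text between the tags, so the sentence in the middle stays as it is.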

Solution #2
Suppose you have a variable $s that holds the following text:

<b>whatever text</b>I do NOT want to do anything here<b>another text</b>
The following line of PHP code gives you the desired result:
preg_replace_callback('/(?<=<b>)(.+?)(?=<\/b>)/', function ($m) { return str_replace($m[1], 'BOLD', $m[1]); }, $s)
Note that preg_replace_callback() replaces every match when you do not supply the limit parameter.
We use a callback so that we can run a PHP function on the matched text.

We use the look-behind and look-ahead assertions to achieve this effect. This line of code basically says "Replace every block of text that is immediately preceded by <b> and immediately followed by </b> with BOLD using un-greedy matching".

We use $m[1] because text matched by look-behind and look-ahead assertions is NOT captured. Therefore what's matched by (.+?) is stored in $m[1], and the tags themselves are never part of the match, so we do not have to put them back.
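
A runnable sketch of the look-around approach, assuming a current PHP version. The variable names $pattern, $constant and $upper are mine. It also shows that when the replacement is a fixed word you can pass it to preg_replace() directly, and only need a callback when the new text is computed from the match.

```php
<?php
$s = '<b>whatever text</b>I do NOT want to do anything here<b>another text</b>';

// (?<=<b>)  look-behind: the match must be preceded by "<b>"
// (?=<\/b>) look-ahead:  the match must be followed by "</b>"
// Neither assertion consumes text, so the match is only the inner text.
$pattern = '/(?<=<b>)(.+?)(?=<\/b>)/';

// Fixed replacement: a plain preg_replace() is enough.
$constant = preg_replace($pattern, 'BOLD', $s);
echo $constant, "\n";

// Computed replacement: the callback receives the inner text in $m[1].
$upper = preg_replace_callback($pattern, function (array $m): string {
    return strtoupper($m[1]);
}, $s);
echo $upper, "\n";
```

Because the tags are outside the match, nothing has to be concatenated back together, which is the main practical difference from Solution #1.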

Easy, right?

If you have any questions let me know and I will do my best to help you!
One Minute Information - by Michael Wen