
I am using a regular expression to rewrite some URLs. I need to replace every ampersand `&` in a URL with `&amp;`, but not when the ampersand already starts an `&amp;` occurrence. So I ended up with this:

search: (^[^&]*?)&((?!amp;)[^W]*?.htm)
replace: $1&amp;$2

Which transformed:

    Bob_&_Carol.htm  

into:

    Bob_&amp;_Carol.htm

But this only works for the first ampersand. When the URL contains multiple ampersands, only the first occurrence is converted, so it transforms:

    Bob_&_Carol_&_Alice.htm 

into:

    Bob_&amp;_Carol_&_Alice.htm

So I modified the match expression to find the multiple ampersands:

    ^(?:([^&]*?)&(?!amp;))*([^W]*?.htm)

But I have no idea how to write the Replace string to handle the multiple captures. How do I write the replacement string to replace all captures?

Vincent James
  • Which tool/language are you using for search/replace? – speakr Apr 23 '13 at 07:44
  • IIS URL Rewrite Module will use the final regex. I have been testing using a free tool called Rad Regex Designer. – Vincent James Apr 23 '13 at 07:45
  • Instead of a single regex that captures multiple groups, each with an ampersand and some text, you need a regex that captures one ampersand and is applied multiple times. You should use a replace function with a parameter that tells it to replace all occurrences, or a ReplaceAll function, or something similar, depending on your language (what language are you using?). – AdamL Apr 23 '13 at 07:48
  • TY I will try a redesign on the expression. I am sorry I cannot answer the "which language" question better. I am just feeding the search and replace expressions to Microsoft IIS URL Rewrite module. It has no options besides accepting the two expressions. Normally I use C# and .NET objects and these expressions seem to follow those rules. – Vincent James Apr 23 '13 at 08:01
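
As a minimal sketch of the approach AdamL describes (match one ampersand, replace every occurrence), here is how it could look in Python. This only illustrates the regex behaviour; it is not IIS URL Rewrite configuration, which according to the asker accepts nothing but the two expressions. The helper name `escape_ampersands` is hypothetical.

```python
import re

# Match a single "&" that is not already the start of "&amp;".
BARE_AMPERSAND = re.compile(r"&(?!amp;)")


def escape_ampersands(url: str) -> str:
    """Replace every bare "&" with "&amp;" (hypothetical helper name)."""
    # re.sub replaces all non-overlapping matches by default.
    return BARE_AMPERSAND.sub("&amp;", url)


print(escape_ampersands("Bob_&_Carol_&_Alice.htm"))
print(escape_ampersands("Bob_&amp;_Carol_&_Alice.htm"))
```

Both calls print `Bob_&amp;_Carol_&amp;_Alice.htm`: the negative lookahead skips the ampersand that already begins `&amp;`, so running the function twice does not double-escape anything.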

1 Answer

You need to match the following regex with capturing groups:

^(.*?)&(?!amp;)(.*)$

And use the following string for the replacement:

/$1&amp;$2

The overall rewrite rule could look like this:

RewriteRule ^(.*?)&(?!amp;)(.*)$ /$1&amp;$2 [L]
anubhava
  • TY for your time. This combination is still just replacing the first occurrence of the ampersand. – Vincent James Apr 23 '13 at 08:02
  • Yes, it replaces one occurrence, but the RewriteRule is applied by the web server **as many times as it matches**. I don't have IIS, but I have tested the above RewriteRule in Apache and it worked, replacing all occurrences of `&`. – anubhava Apr 23 '13 at 08:09
  • It seems he was trying something similar but couldn't get it to work with IIS either. TY for your help. http://stackoverflow.com/questions/1058789/iis7-url-rewriting-module-replace – Vincent James Apr 23 '13 at 08:32
  • Did you test this out on your IIS? I don't have IIS installed (please let me know if there is a free version of IIS that I can install on an XP laptop I have). – anubhava Apr 23 '13 at 10:41
  • Yes I have been trying all solutions in both the test environment and real time on IIS. I settled for the ugly solution on the link. I checked all URLs and there are never more than 4 ampersands, so I set up 4 different rules for each case. Ugly but it works. – Vincent James Apr 23 '13 at 15:35
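
The idea discussed in these comments, a rule that fixes one ampersand per pass and is applied repeatedly, can be sketched in Python as follows. It is only a simulation of the regex logic under stated assumptions, not IIS or Apache configuration: the function name `rewrite_until_stable` and the `max_passes` parameter are hypothetical, and the leading `/` from the answer's replacement string is left out so that repeated passes do not keep prepending it.

```python
import re

# The answer's pattern: lazily find the first "&" not followed by "amp;".
FIRST_BARE_AMPERSAND = re.compile(r"^(.*?)&(?!amp;)(.*)$")


def rewrite_until_stable(url: str, max_passes: int = 10) -> str:
    """Apply the single-occurrence rule repeatedly (hypothetical helper)."""
    for _ in range(max_passes):
        # Each pass escapes exactly one bare ampersand, if any remains.
        rewritten, count = FIRST_BARE_AMPERSAND.subn(r"\1&amp;\2", url, count=1)
        if count == 0:
            break  # nothing left to rewrite
        url = rewritten
    return url


print(rewrite_until_stable("Bob_&_Carol_&_Alice.htm"))
```

Setting `max_passes=4` mirrors the asker's workaround of four separate rules: a URL with at most four bare ampersands is fully rewritten, while a fifth would be left untouched.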