
If someone accidentally types a web address twice, I want to detect the duplicated address using a regex. Example:

http://stackoverflow.com/questions/ask/advice?http://stackoverflow.com/questions/ask/advice?

It should throw an error.

I tried the pattern `\b(\w+)\s+\1\b`, but it does not work for me.

Can someone help me find the right pattern?

Biffen
  • Why the `\s+`, there are no spaces in there? And `\w` won't match all characters in a typical URL. And the last `\b` makes it not work in this case 'cause there's no `\b` after a `?`. – Biffen Feb 11 '15 at 12:08
  • `/(.*)\1/` works, does it have to be more exact than that? – Biffen Feb 11 '15 at 12:10
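
The `(.*)\1` idea from the comment can be tried out quickly. This is a minimal sketch in Python (the thread does not name a regex engine, so Python's `re` module is an assumption), and the helper name `is_doubled` is hypothetical. It swaps `.*` for `.+` and matches the whole string, because an unanchored `(.*)\1` also succeeds with an empty capture on any input.

```python
import re

# Some non-empty text followed immediately by the same text again.
DOUBLED = re.compile(r"(.+)\1")

def is_doubled(text: str) -> bool:
    """Hypothetical helper: True if the whole string is one piece of text typed twice."""
    # fullmatch anchors the pattern to the entire string.
    return DOUBLED.fullmatch(text) is not None

url = "http://stackoverflow.com/questions/ask/advice?"
print(is_doubled(url + url))  # address typed twice
print(is_doubled(url))        # address typed once
```

This check knows nothing about URLs; it only asks whether the input is the same text written twice, which may or may not be exact enough for the use case.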

1 Answer


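This won't work because `(\w+)` only matches word characters (letters, digits and underscore), so it stops at the `:`, `/`, `.` and `?` in a URL. The `\s+` also requires whitespace between the two copies, and your example has none.

It looks like you want to match a run of non-whitespace characters that starts with "http://" and is immediately repeated.

You could do that like this: `\b(http://\S+)\s*\1`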
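
To see the pattern in action, here is a minimal sketch in Python. The choice of Python's `re` module and the function name `find_repeated_url` are assumptions for illustration; the question does not say which engine is used, and some engines need the slashes escaped.

```python
import re
from typing import Optional

# Pattern from the answer: text starting with "http://", optional
# whitespace, then the same text again via the backreference \1.
REPEATED_URL = re.compile(r"\b(http://\S+)\s*\1")

def find_repeated_url(text: str) -> Optional[str]:
    """Hypothetical helper: return the address that was typed twice, or None."""
    match = REPEATED_URL.search(text)
    return match.group(1) if match else None

example = ("http://stackoverflow.com/questions/ask/advice?"
           "http://stackoverflow.com/questions/ask/advice?")
duplicate = find_repeated_url(example)
if duplicate is not None:
    # The question wants an error here; printing stands in for that.
    print("Duplicate web address:", duplicate)
```

Because of the `\s*`, the same pattern also catches two copies separated by spaces. It only covers addresses that begin with `http://`, as in the example.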

Jonathan Mee
  • It works with the example *if* you remove the last `\b` (see my comment to the question) (and escape the slashes, but that could depend on the engine). – Biffen Feb 11 '15 at 12:26
  • @Biffen Thanks, I reflexively added back the `\b` but I'm still surprised it did not work. Is there a reason that trailing `\b`s are not allowed? – Jonathan Mee Feb 11 '15 at 12:28
  • In this case it's because there's no `\b` after a `?`; `\b` is for *word* boundaries, and `?` is not a word character. – Biffen Feb 11 '15 at 12:29
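
The effect of the trailing `\b` can be checked directly. This is a small sketch, again assuming Python's `re` module; the variable names are made up for the demonstration.

```python
import re

doubled = "http://stackoverflow.com/questions/ask/advice?" * 2

# Same pattern with and without a trailing word boundary.
with_trailing_b = re.search(r"\b(http://\S+)\s*\1\b", doubled)
without_trailing_b = re.search(r"\b(http://\S+)\s*\1", doubled)

print(with_trailing_b is None)         # the trailing \b blocks the match
print(without_trailing_b is not None)  # without it the repeat is found

# \b needs a word character on exactly one side of the position.
question_mark_boundary = re.search(r"\?\b", "advice?")  # "?" then end of string
letter_boundary = re.search(r"e\b", "advice")           # "e" then end of string
print(question_mark_boundary is None)
print(letter_boundary is not None)
```

So a trailing `\b` is allowed in general; it just cannot succeed when the repeated text ends in a non-word character such as `?` and nothing follows it.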