1
http://mywebsite/index.aspx?db=DAYTON#id%3D7304%3Bpage%3D1%3Bview%3Dpages

http://mywebsite/#id%3D3D7304%3Bpage%3D1%3Bview%3Dpages

The two URLs above go to exactly the same place but use different styles. I am trying to write a one-line expression that matches whichever style of URL is thrown at it. I have been focusing primarily on everything after "mywebsite/".

Any help would be greatly appreciated!

Wes F.
  • 3
    What is the language? Please tag the language or tool when you are asking a regex question! – nhahtdh Mar 05 '13 at 17:20
  • 3
    Seeing as those fragments will never arrive at a webserver, it's a pretty safe bet the language is `javascript`. – Wrikken Mar 05 '13 at 17:23
  • Do you want a regex that just matches those two URLs or others as well? – Bernhard Barker Mar 05 '13 at 17:31
  • If you are indeed looking for a javascript solution, then you may want to check out this q/a: http://stackoverflow.com/q/6644654/211627 – JDB Mar 05 '13 at 17:56
  • **This might not be a job for regexes, but for existing tools in your language of choice.** What language are you using? You probably don't want to use a regex, but rather an existing module that has already been written, tested, and debugged. If you're using PHP, you want the [`parse_url`](http://php.net/manual/en/function.parse-url.php) function. If you're using Perl, you want the [`URI`](http://search.cpan.org/dist/URI/) module. If you're using Ruby, use the [`URI`](http://www.ruby-doc.org/stdlib-1.9.3/libdoc/uri/rdoc/URI.html) module. – Andy Lester Mar 05 '13 at 17:58
  • I do apologize for not tagging the programming language being used. I was not sure if it mattered or not since I thought that the Regular Expression language was universal. I am using .NET – Wes F. Mar 06 '13 at 11:24

2 Answers

0

First of all, decode the URLs to make things easier:

http://mywebsite/index.aspx?db=DAYTON#id=7304;page=1;view=pages
http://mywebsite/#id=3D7304;page=1;view=pages
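The decoding step can be sketched as follows. This is a minimal illustration in Python (the asker is on .NET, where a URL-decoding helper would play the same role); the variable names are hypothetical. Note that the second URL in the question contains `%3D3D`, which decodes to `=3D`, so its id comes out as `3D7304`.

```python
from urllib.parse import unquote

# The two URLs exactly as given in the question (percent-encoded fragments).
urls = [
    "http://mywebsite/index.aspx?db=DAYTON#id%3D7304%3Bpage%3D1%3Bview%3Dpages",
    "http://mywebsite/#id%3D3D7304%3Bpage%3D1%3Bview%3Dpages",
]

# unquote turns %3D into "=" and %3B into ";".
decoded = [unquote(u) for u in urls]

for d in decoded:
    print(d)
```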

Now you can write two regular expressions, one for each path, and combine them with the | operator (Demo):

http://mywebsite/(index\.aspx\?db=(\w+)#id=(\d+);page=(\d+);view=(\w+)|#id=3D7304;page=1;view=pages)

You can also use named groups if your programming language supports them.
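As a rough sketch of the named-group idea, here is a variant of the pattern written for Python, where the syntax is `(?P<name>...)`; in .NET the equivalent is `(?<name>...)`. It is not the exact expression above: the `index.aspx?db=...` part is made optional rather than using alternation, and the id group uses `\w+` so that the `3D7304` value from the second URL is accepted. The name `PATTERN` is only for illustration.

```python
import re

# Optional "index.aspx?db=..." prefix, then the three fragment values in fixed order.
PATTERN = re.compile(
    r"http://mywebsite/(?:index\.aspx\?db=(?P<db>\w+))?"
    r"#id=(?P<id>\w+);page=(?P<page>\d+);view=(?P<view>\w+)"
)

first = PATTERN.fullmatch("http://mywebsite/index.aspx?db=DAYTON#id=7304;page=1;view=pages")
second = PATTERN.fullmatch("http://mywebsite/#id=3D7304;page=1;view=pages")

# groupdict() maps each group name to its captured text (None if the group did not take part).
print(first.groupdict())
print(second.groupdict())
```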


Note that the regular expression above won't match the URLs if the order of the arguments changes.

I suggest using a URL parser if you can.
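For comparison, a parser-based approach might look like the sketch below. It uses Python's standard `urllib.parse` purely as an illustration (a .NET project would use its own URL-handling classes instead), and `fragment_params` is a hypothetical helper name. Because the result is a dictionary, the order of the arguments no longer matters.

```python
from urllib.parse import urlsplit, unquote

def fragment_params(url):
    """Return the key=value pairs of the URL fragment as a dict."""
    # urlsplit leaves the fragment percent-encoded, so decode it first.
    fragment = unquote(urlsplit(url).fragment)
    params = {}
    for pair in fragment.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)  # split on the first "=" only
            params[key] = value
    return params

a = fragment_params("http://mywebsite/index.aspx?db=DAYTON#id%3D7304%3Bpage%3D1%3Bview%3Dpages")
b = fragment_params("http://mywebsite/#view%3Dpages%3Bid%3D7304%3Bpage%3D1")
print(a == b)  # same parameters, different order and URL style
```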

fardjad
  • If the web standard says the parameters can be in any order (which it does), then it would be a bad idea to code your website in such a way that a valid query breaks it. – JDB Mar 05 '13 at 18:02
0

Here's a heavy one-liner regex:

^http:\/\/mywebsite\/(index\.aspx\?db=[A-Z]+)?#((id%\w+(%3B)?)|(view%\w+(%3B)?)|(page%\w+(%3B)?))*$

It will accept your website with an optional index.aspx?db= (set to some UPPERCASE value) and any order of the 3 variables you use: id, view and page.

Colorful explained demo here: http://regex101.com/r/jB2jS3
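A quick way to sanity-check the expression is to run it against a few inputs. The sketch below does that in Python (the pattern itself uses nothing Python-specific); the name `ONE_LINER` and the extra test URLs are made up for illustration. Two things it makes visible: the pattern expects the still-encoded form of the URL, and because the parameter group is repeated with `*`, a bare `#` with no parameters also matches.

```python
import re

# The one-liner from this answer, unchanged.
ONE_LINER = re.compile(
    r"^http:\/\/mywebsite\/(index\.aspx\?db=[A-Z]+)?"
    r"#((id%\w+(%3B)?)|(view%\w+(%3B)?)|(page%\w+(%3B)?))*$"
)

def matches(url):
    return ONE_LINER.match(url) is not None

# Both URLs from the question, in their original encoded form.
print(matches("http://mywebsite/index.aspx?db=DAYTON#id%3D7304%3Bpage%3D1%3Bview%3Dpages"))
print(matches("http://mywebsite/#id%3D3D7304%3Bpage%3D1%3Bview%3Dpages"))
# Parameters in a different order.
print(matches("http://mywebsite/#view%3Dpages%3Bid%3D7304%3Bpage%3D1"))
# Decoded form ("=" and ";") is not covered by this pattern.
print(matches("http://mywebsite/#id=7304;page=1;view=pages"))
```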

CSᵠ