0

I am trying to remove the page (file name) part of a URL.

For example,

www.example.com/home/index.html 

to

www.example.com/home 

Any help is appreciated.
Thanks

Andrew Grimm
anusuya

3 Answers

10

It's probably a good idea to avoid regular expressions when you can. You may summon Cthulhu. Try the URI library from Ruby's standard library instead.

require "uri"
result = URI.parse("http://www.example.com/home/index.html")
result.host # => "www.example.com"
result.path # => "/home/index.html"
# The following line is rather unorthodox - is there a better solution?
File.dirname(result.path) # => "/home"
result.host + File.dirname(result.path) # => "www.example.com/home"
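The steps above can be wrapped in a small helper. This is only a sketch: the name strip_page is hypothetical, and it assumes that prefixing "http://" is acceptable when the input has no scheme, since URI.parse leaves host empty for a string like "www.example.com/home/index.html".

```ruby
require "uri"

# Hypothetical helper: returns the host plus the directory part of the path.
def strip_page(url)
  # Assumption: inputs without a scheme are plain web addresses, so add one
  # to let URI.parse populate host and path.
  url = "http://#{url}" unless url.include?("://")
  uri = URI.parse(url)

  dir = File.dirname(uri.path)
  # File.dirname gives "/" for a top-level page and "." for an empty path;
  # neither should be appended to the host.
  dir = "" if dir == "/" || dir == "."

  uri.host + dir
end

puts strip_page("http://www.example.com/home/index.html")
puts strip_page("www.example.com/home/index.html")
```

Both calls should print www.example.com/home.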
Andrew Grimm
    +1 URLs are not regular and cannot be parsed with a regex; use the URI lib – clyfe Sep 30 '10 at 10:52
    Addressable::URI is another good URI module for Ruby and is a bit more full-featured. Ruby's built-in URI should be sufficient for this purpose though. http://github.com/sporkmonger/addressable – the Tin Man Sep 30 '10 at 14:48
0
irb(main):001:0> url="www.example.com/home/index.html"
=> "www.example.com/home/index.html"
irb(main):002:0> url.split("/")[0..-2].join("/")
=> "www.example.com/home"
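One caveat with the split approach: a URL with no slash at all, such as "www.example.com", comes back as an empty string. A minimal sketch of a guarded variant follows; drop_last_segment is a hypothetical name, and the code assumes the input has no scheme, because the "//" in "http://" would otherwise be treated as a path separator.

```ruby
# Hypothetical helper: removes everything from the last "/" onward.
def drop_last_segment(url)
  # rpartition splits on the LAST "/" and returns [head, separator, tail];
  # the separator is "" when the string contains no "/".
  head, sep, _tail = url.rpartition("/")
  sep.empty? ? url : head
end

puts drop_last_segment("www.example.com/home/index.html")
puts drop_last_segment("www.example.com")
```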
ghostdog74
0

If your heart is set on using a regex, and you know that your URLs will be pretty straightforward, you could use (.*)/.* to capture everything before the last / in your URL.

irb(main):007:0> url = "www.example.com/home/index.html"
=> "www.example.com/home/index.html"
irb(main):008:0> regex = "(.*)/.*"
=> "(.*)/.*"
irb(main):009:0> url =~ /#{regex}/
=> 0
irb(main):010:0> $1
=> "www.example.com/home"
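The same idea can be written without string interpolation or the $1 global. In this sketch the helper name parent_of is hypothetical, and the pattern is anchored so that it only matches when the URL actually contains a slash.

```ruby
# Hypothetical helper: returns the part before the last "/", or nil if the
# URL contains no "/".
def parent_of(url)
  # String#[] with a regex and a group index returns that capture group,
  # or nil when the pattern does not match.
  url[%r{\A(.*)/[^/]*\z}, 1]
end

puts parent_of("www.example.com/home/index.html")
p parent_of("www.example.com")
```

The greedy (.*) still consumes everything up to the last slash, exactly as in the irb session above.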
Doug