0

I'm crawling a website and saving selected pages for offline browsing. Of course the links between pages are broken so I'd like to do something about that. Is there a somewhat simple way in python to relativize url paths like this without reinventing the wheel:

/folder1/folder2/somepage.html    becomes---->   folder2/somepage.html
/folder1/otherpage.html           becomes---->   ../otherpage.html

I understand the function would need two url paths to determine the relative path as the linked resource's path is relative to the page it appears on.

  • 1
    Possible duplicate of [Python: Get relative path from comparing two absolute paths](http://stackoverflow.com/questions/7287996/python-get-relative-path-from-comparing-two-absolute-paths) – Mureinik May 18 '17 at 08:07

1 Answers1

0

the newer pathlib (only briefly mentioned in the duplicate) also works with urls:

from pathlib import Path

abs_p = Path('https://docs.python.org/3/library/pathlib.html')
rel_p = abs_p.relative_to('https://docs.python.org/3/')
print(rel_p)  # library/pathlib.html
hiro protagonist
  • 44,693
  • 14
  • 86
  • 111
  • Thanks! This looks promising, however I forgot to mention that I'm using python 2.7 because that is the version installed on mac osx and I want this to be portable. If nothing else, I might have a look at the source code and pick the bits that I need assuming the source works in 2.7. – yogert909 May 18 '17 at 15:52