I am scraping a site whose HTML looks like this:

<a href="/pages/1"></a>

I also have the window.location object, where I have

origin:"http://www.example.org"

so I can build the absolute URL as origin + href = http://www.example.org/pages/1

For testing, I made a mock-up of the site on my file system.

-www.example.org
  |-2017
    |-pages
      |-1.html
      |-2.html
  |-2016
    |-pages
      |-1.html
      |-2.html

In those HTML files the links look something like this:

<!-- www.example.org/2016/pages/1.html --> <a href="../../2017/pages/1.html">2017</a>

In the test the same code doesn't work, because the origin of the window.location object is file://:

hash:""
host:""
hostname:""
href:"file:///home/me/projects/fp/src/test/fixtures/www.example.org/2016/pages/1.html"
origin:"file://"
pathname:"/home/me/projects/fp/src/test/fixtures/www.example.org/2016/pages/1.html"
port:""
protocol:"file:"

which produces origin + href = file://../../2017/pages/1.html. With some string manipulation on location.pathname I could build file:///home/me/projects/fp/src/test/fixtures/www.example.org/2017/pages/1.html when the protocol is file:. But is that the right way to handle this problem?

user3568719

Possible duplicate of [File Uri Scheme and Relative Files](https://stackoverflow.com/questions/7857416/file-uri-scheme-and-relative-files) – Vaibhav Nigam, Aug 02 '17 at 15:49

1 Answer

A file:// URL can only be used with an absolute path.

The only relative path that works is one relative to the current working directory.
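One way to avoid the string manipulation, sketched here as an assumption rather than as part of the original answer, is to resolve each href against the full URL of the current document (location.href) instead of concatenating it with location.origin. The standard URL constructor accepts a base URL as its second argument and handles both root-relative and ../ style links. The helper name toAbsolute is hypothetical, and the first base URL below is only an illustrative page on the live site.

```javascript
// Hypothetical helper: resolve a link's href against the URL of the page it came from.
// In a page you would pass window.location.href as baseHref.
function toAbsolute(href, baseHref) {
  // new URL(relative, base) applies the standard URL resolution rules,
  // so "../" segments and root-relative paths are both handled.
  return new URL(href, baseHref).href;
}

// Live site: a root-relative link resolved against an http page.
const live = toAbsolute('/pages/1', 'http://www.example.org/pages/2');

// Local fixture: a relative link resolved against a file: page.
const local = toAbsolute(
  '../../2017/pages/1.html',
  'file:///home/me/projects/fp/src/test/fixtures/www.example.org/2016/pages/1.html'
);

console.log(live);  // http://www.example.org/pages/1
console.log(local); // file:///home/me/projects/fp/src/test/fixtures/www.example.org/2017/pages/1.html
```

Note that a root-relative href such as /pages/1 resolved against a file: base becomes file:///pages/1, so the fixture files still need relative links like the ones in the question. In a browser, the href property of an anchor element (as opposed to getAttribute('href')) already returns the resolved absolute URL, which may make the helper unnecessary.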

Vaibhav Nigam