I am using requests in Python to get data from a website that spans multiple pages. The pages are accessible by modifying the URL parameter page=x. However, when I use GET like this:

r = requests.get(page, params={'sort':0, 'perPage':40, 'page':i})

the parameters are placed before the "#/" at the end of the URL, as print(r.url) shows:

link.html?sort=0&perPage=40&page=1#/

instead of

link.html#/?sort=0&perPage=40&page=1
(see the "#/" placement)

Moreover, it seems that I can't pass the full URL directly to GET to access the other pages, like this:

page_initial = "link.html#/"
for i in range(1,50):
    page = page_initial+"?sort=0&perPage=40&page={}".format(i)
    r = requests.get(page)
    ...

This always returns the content of the first page again.

Am I missing something, or am I using this wrongly?

Joshua Varghese
Welharden
  • Everything after the # is only interpreted by the browser, not the server (https://stackoverflow.com/questions/30997420/what-are-fragment-urls-and-why-to-use-them), so requests is helping you. If you see parameters after the # when you browse this page, that means that they are being handled by JavaScript and requests can't help you, and you'll need to switch to Selenium. – Alex Hall Apr 18 '20 at 13:47
  • See also https://stackoverflow.com/questions/12682952/proper-url-forming-with-query-string-and-anchor-hashtag – Alex Hall Apr 18 '20 at 13:48
  • Ok I understand, thanks a lot – Welharden Apr 18 '20 at 13:49
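
To illustrate the point made in the comments, here is a minimal sketch using only the standard library's urllib.parse. It shows that in a URL of the form link.html#/?... the whole "?sort=..." part belongs to the fragment rather than the query string, so it is never sent to the server. The host example.com and the helper name server_visible_url are assumptions made for the example; the question only shows "link.html".

```python
from urllib.parse import urlsplit, urlunsplit


def server_visible_url(url):
    """Return `url` without its fragment, i.e. the part a server can see.

    Hypothetical helper for illustration only; it is not part of requests.
    """
    parts = urlsplit(url)
    # Everything after '#' is the fragment and stays on the client side.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))


# Assumed host; the original question elides the real site.
base = "https://example.com/link.html"

wanted = base + "#/?sort=0&perPage=40&page=2"  # form seen in the browser
built = base + "?sort=0&perPage=40&page=2#/"   # form built from params=...

# In the "wanted" form the parameters are part of the fragment, not the query.
print(repr(urlsplit(wanted).query))
print(repr(urlsplit(wanted).fragment))

# What the server can see in each case.
print(server_visible_url(wanted))
print(server_visible_url(built))
```

With the first form the server only ever sees link.html, whatever page number is in the fragment, which matches the observation that the first page keeps coming back. If the site reads its parameters from the fragment in JavaScript, the data is probably loaded by a separate request that the browser makes, or the page has to be rendered by a real browser, as suggested above.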

0 Answers