I'm trying to download a bunch of PDFs from a site that sits behind a login so I don't have to download each one manually. I figured this would be easy, but I'm getting a "Missing Key-Pair-Id query parameter" error back. Here's what I have:

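import requests

payload = {'username': 'user', 'password': 'pass'}

with requests.Session() as session:
    # Log in; the session keeps any cookies the server sets
    post = session.post('https://website.com/login.do', data=payload)
    # Request a PDF from the files subdomain with the same session
    r = session.get('https://files.website.com/1.pdf')
    print(r.text)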

I'm printing r.text only because that's where the message above shows up. The post response has a status of 200, and post.text contains a redirect link along with "code: success". If I click that link (or paste it into a private browser window), I'm logged in just fine, and browsing to the PDF link works too. What am I missing here? Thanks.

  • It's possible you need to pass some kind of [CSRF token](https://stackoverflow.com/questions/5207160/what-is-a-csrf-token-what-is-its-importance-and-how-does-it-work/33829607#33829607) – Iguananaut Aug 26 '21 at 16:34
  • Thanks! That seems like it could be it? So if that's the case, I just can't use requests for this task? The text of the post variable is something like: {"redirect":"https://EXAMPLE.com/postlogin?token=REALLYLONGSTRING – itsallkosher Aug 26 '21 at 16:46
  • I'd have to see the website, but yes you probably have to download the login form first, obtain the token using scraping, and pass it along with your post request. – Iguananaut Aug 26 '21 at 17:01
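
The flow suggested in the comments (load the login form, pick up any hidden token, post it with the credentials, then visit the redirect link that comes back) can be sketched roughly as below. This is only a sketch under assumptions: that a GET to the login URL returns the form, that the token lives in a hidden input, and that the login response is JSON with a "redirect" key as in the comment above. The helper names (hidden_fields, redirect_from_login, login_and_fetch) are made up for illustration. As a side note, Key-Pair-Id is the parameter name Amazon CloudFront uses for signed URLs and signed cookies, so the files host may be waiting for cookies that only get set once the redirect link is visited, but that is a guess without seeing the site.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return the hidden form fields (e.g. a CSRF token) found in html."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def redirect_from_login(body, base_url):
    """Return the absolute redirect URL from a JSON login response, or None."""
    try:
        data = json.loads(body)
    except ValueError:
        return None
    if not isinstance(data, dict):
        return None
    target = data.get("redirect")
    return urljoin(base_url, target) if target else None


def login_and_fetch(session, login_url, file_url, credentials):
    """Log in with a requests-style session, then fetch file_url."""
    # 1. Load the login page (assumed to contain the form and any token).
    form = session.get(login_url)
    payload = {**hidden_fields(form.text), **credentials}
    # 2. Post the credentials together with the scraped hidden fields.
    post = session.post(login_url, data=payload)
    # 3. Visit the redirect link so the session collects its cookies.
    target = redirect_from_login(post.text, login_url)
    if target:
        session.get(target)
    # 4. Request the protected file with the same session.
    return session.get(file_url)


# Usage with requests (URLs taken from the question):
#   import requests
#   with requests.Session() as s:
#       r = login_and_fetch(s, 'https://website.com/login.do',
#                           'https://files.website.com/1.pdf',
#                           {'username': 'user', 'password': 'pass'})
#       print(r.status_code, r.headers.get('Content-Type'))
```

Checking r.status_code and the Content-Type header is usually more informative than printing r.text for a PDF; if the content type is application/pdf, r.content can be written to a file opened in binary mode.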

0 Answers