
I am trying to scrape some selling data using the StubHub API. An example of this data can be seen here:

https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata

You'll notice that if you try to visit that URL without logging into stubhub.com, it won't work. You will need to log in first.

Once I've signed in via my web browser, I open the URL which I want to scrape in a new tab, then use the following command to retrieve the scraped data:

r = requests.get('https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata')

However, once the browser session expires after ten minutes, I get this error:

<FormErrors>
<FormField>User Auth Check</FormField>
<ErrorMessage>
Either is not active or the session might have expired. Please login again.
</ErrorMessage>

I think I need to keep the session ID in a cookie so that my authentication stays alive between requests.

The Requests library documentation is pretty terrible for someone who has never done this sort of thing before, so I was hoping you folks might be able to help.

The example provided by Requests is:

s = requests.Session()

s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")

print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'

I honestly can't make heads or tails of that. How do I preserve cookies between POST requests?

Stevoisiak
user2238685
  • If you have some legal obligation to remove the content, please flag for moderation attention and explain the situation clearly and we will take appropriate action. Please do not just edit the body of your question out. – animuson Sep 27 '13 at 00:12
  • thats how I got to know about stubhub.com – Ciasto piekarz Apr 03 '20 at 10:02
  • Relevant: https://stackoverflow.com/questions/38122379/performing-login-with-python-requests-cookies-not-activated – Frank May 19 '20 at 20:00

1 Answer


I don't know how StubHub's API works, but generally it should look like this:

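import requests

s = requests.Session()  # the session object stores cookies between requests
data = {"login": "my_login", "password": "my_password"}
url = "http://example.net/login"
r = s.post(url, data=data)  # cookies set by the login response are saved on s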

Now your session contains the cookies set by the login response. To access the cookies of this session, simply use

s.cookies

Any further requests made through this session object, GET or POST, will send these cookies automatically.
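To see what the session would send without touching a real site, here is a minimal sketch that builds a POST request through the session and inspects it before sending. The cookie name `sessionid`, its value, and the `/sell` URL are made up for illustration; a real login response decides the actual cookie names. The cookie is set by hand here only to stand in for a successful login.

```python
import requests

s = requests.Session()

# Stand-in for a successful login: normally s.post(login_url, data=...) would
# fill the jar from the response. The name and value here are hypothetical.
s.cookies.set("sessionid", "abc123", domain="example.net", path="/")

# Build (but do not send) a POST through the session to inspect what it carries.
req = requests.Request("POST", "http://example.net/sell", data={"price": "10"})
prepared = s.prepare_request(req)

# The session merged its stored cookie into the outgoing request headers.
print(prepared.headers.get("Cookie"))
```

The same merge happens on every `s.get(...)` or `s.post(...)` call, which is why calling `requests.get(...)` directly (outside the session) loses the login.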

Fenikso
Michał
  • Can you help me ? I tried to login a website in this way but it's not working. – MD. Khairul Basar Jun 13 '17 at 18:59
  • @MD.KhairulBasar Perhaps you can ask another question providing the details and link it in a comment? – Michał Jun 14 '17 at 05:59
  • Though I already got the answer, I'm linking my [question](https://stackoverflow.com/questions/44548471/python-requests-cant-login-to-a-website) here as it might help others. – MD. Khairul Basar Jun 14 '17 at 15:20
  • Further post request does not seem to be working. Is only get request supposed to work afterwards? – Satys Aug 09 '18 at 08:37
  • @Satys anything should work fine as long as you perform requests from session object. If you have specific problem then please issue a new question and link it, I'll try to help. – Michał Aug 09 '18 at 11:14
  • @Michal, yeah I got it working actually, I had to change the format of payload. I used the same object though. Btw, thanks for the answer. :) – Satys Aug 09 '18 at 14:20
  • It doesn't work if the website redirects a few times. In that case you need to set `allow_redirects = False` – KC. Oct 05 '18 at 04:58
  • @kcorlidy actually, if you know the number of acceptable redirects, you can set a specific limit like `s.max_redirects = 3`, where `s` is a session object. – Michał Oct 05 '18 at 14:37
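The two redirect options mentioned in the comments can be tried against a throwaway local server. This is only a sketch: the `LoopHandler` class and the `/hop/<n>` paths are invented for the demonstration and redirect forever on purpose, so the effect of `s.max_redirects` and `allow_redirects=False` is visible without logging in anywhere.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class LoopHandler(BaseHTTPRequestHandler):
    """Toy server: every GET to /hop/<n> redirects to /hop/<n+1>, endlessly."""

    def do_GET(self):
        n = int(self.path.rsplit("/", 1)[-1] or 0)
        self.send_response(302)
        self.send_header("Location", f"/hop/{n + 1}")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), LoopHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

s = requests.Session()
s.trust_env = False   # ignore proxy environment variables for the local server
s.max_redirects = 3   # give up after three hops instead of the default limit

try:
    s.get(f"{base}/hop/0")
    outcome = "finished"
except requests.exceptions.TooManyRedirects:
    outcome = "too many redirects"

# With allow_redirects=False the first response is returned as-is,
# so the redirect target can be inspected and followed manually.
r = s.get(f"{base}/hop/0", allow_redirects=False)
status, location = r.status_code, r.headers["Location"]

server.shutdown()
server.server_close()
print(outcome, status, location)
```

Both requests go through the same session object, so any cookies set along the way would still be stored on `s`.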