I ran into some trouble trying to build a Python bot to check the status of an application. To explain, here is an example of the process:

1.) Visit website (https://examplewebsite.com/checkinfo.do?a=sample)

2.) Assuming the query string is correct, the website will drop a cookie. This cookie is a 'session cookie' and therefore expires immediately upon closing or navigating away from the webpage.

3.) Once the cookie has been obtained, visit https://examplewebsite.com/step2.do?a=sample

4.) Parse result of the request to https://examplewebsite.com/step2.do?a=sample

Although this may appear simple, I am having trouble sending the second request to https://examplewebsite.com/step2.do?a=sample and am routinely met with a page saying "session expired".

I have tried to reproduce this process using Python requests, Python requests.session, as well as dryscrape, but I cannot get it to work. My gut feeling is that because the cookie is a "session cookie", it is being expired immediately when the second request to https://examplewebsite.com/step2.do?a=sample is initiated, but before the page loads.

Perhaps a better way to explain the issue is with browser behavior. Using Firefox and IE, the website behaves as follows:

1.) Visit website (https://examplewebsite.com/checkinfo.do?a=sample)

  • A cookie is successfully obtained

2.) Replace the URL with https://examplewebsite.com/step2.do?a=sample

  • A "session expired" warning is presented

HOWEVER, this does work:

1.) Visit website (link same as above)

  • A cookie is successfully obtained

2.) While keeping the first tab open, create a second tab and paste https://examplewebsite.com/step2.do?a=sample into the browser

  • The information is correctly loaded and no "session expired" warning is presented.

So my question is: how do I reproduce the behavior of "creating a new tab" within Python, while keeping the session cookie shared between the requests?

Here is my attempt to do it in dryscrape:

import dryscrape

a = dryscrape.Session()
a.set_header("User-Agent", "Firefox")
a.visit('checkinfo.do URL')

b = dryscrape.Session()
b.set_header("User-Agent", "Firefox")
b.set_cookie(a.cookies())   # This is my attempt to share the session cookies with a separate dryscrape object, to simulate putting the URL in a second tab.
b.visit('step2.do URL')

Unfortunately, the above does not work, and a.cookies() does not match b.cookies(), both before and after the second request.

NOTE: The website page does have code that ends the session when the page is unloaded. So if dryscrape is doing anything equivalent to unloading the page, the session cookie would be marked invalid by the server.

  • My attempt to recreate the effect of creating a new tab in dryscrape is not working. Can someone confirm whether I am passing the cookies correctly? Is there a special way to pass "session cookies"? It appears as though the full cookies are not being passed. See edits above. – Taylor Amarel Nov 18 '17 at 13:20
  • I have tried setting the cookie information via set_header but have still not been able to get it to work. I believe that for some reason the cookie is expiring when I get it via dryscrape, perhaps because the behavior of dryscrape causes the cookie to expire on the client side or the server side. – Taylor Amarel Nov 18 '17 at 14:24
  • Still whacking away at this: I found that I can load the second page via dryscrape if I: 1.) Visit the first page in Firefox; 2.) Visit the second page in Firefox; 3.) Copy the request headers from the second Firefox page load into a dryscrape object using set_header, then use visit to load the second page. – Taylor Amarel Nov 18 '17 at 14:50
  • did you find a solution? – akhilsp Jan 21 '18 at 11:26

1 Answer

I have also run into a similar problem.

The problem with your code is that a.cookies() returns a list of cookies, while set_cookie() takes a single cookie. You need to set each cookie separately.

Try something like this:

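# `a` is the first session from the question, which has already visited checkinfo.do
b = dryscrape.Session()
b.set_header("User-Agent", "Firefox")
# a.cookies() returns a list, so copy the cookies across one at a time
for cookie in a.cookies():
    b.set_cookie(cookie)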
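
If it helps, the same idea can be wrapped in a small helper so you can also check that the cookies really made it across. This is only a rough sketch: `copy_cookies`, `missing_cookies` and `load_second_page` are names I made up for illustration, and it assumes `cookies()` returns a list of cookie strings and `set_cookie()` accepts one such string. I have not tested it against your site, and it will not help if the server has already invalidated the session.

```python
def copy_cookies(source, target):
    """Copy every cookie from one session-like object to another.

    Assumes source.cookies() returns an iterable of cookie strings and
    target.set_cookie(cookie) accepts one such string.
    Returns the number of cookies copied.
    """
    copied = 0
    for cookie in source.cookies():
        target.set_cookie(cookie)
        copied += 1
    return copied


def missing_cookies(source, target):
    """Return the cookies present in source but absent from target.

    Uses exact string comparison, so a driver that reformats cookie
    strings may report cookies as missing even when they were set.
    """
    present = set(target.cookies())
    return [cookie for cookie in source.cookies() if cookie not in present]


def load_second_page(first_url, second_url):
    """Hypothetical end-to-end use; dryscrape is imported only when called."""
    import dryscrape

    a = dryscrape.Session()
    a.set_header("User-Agent", "Firefox")
    a.visit(first_url)

    b = dryscrape.Session()
    b.set_header("User-Agent", "Firefox")
    copy_cookies(a, b)
    # Anything printed here was not carried over to the second session.
    print(missing_cookies(a, b))
    b.visit(second_url)
    return b.body()
```

Calling `load_second_page` with your checkinfo.do and step2.do URLs would then replace the two blocks in your question. If `missing_cookies` prints a non-empty list, the cookies are not being passed the way this sketch assumes.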

Hope it helps :)

akhilsp