I am trying to write a small script with the requests library that makes a request to a site (e.g. github.com) and parses the cookies in the response headers. When you make a request to github.com, the response carries three separate Set-Cookie headers:

Set-Cookie: has_recent_activity=1; path=/; expires=Thu, 27 Dec 2018 07:54:16 -0000
Set-Cookie: logged_in=no; domain=.github.com; path=/; expires=Mon, 27 Dec 2038 06:54:16 -0000; secure; HttpOnly
Set-Cookie: _gh_sess=MldFM3p...; path=/; secure; HttpOnly

However, when you make the request via the requests API and read the header with req.headers.get('Set-Cookie'), all three cookie values are clumped into a single comma-separated string:

has_recent_activity=1; path=/; expires=Thu, 27 Dec 2018 07:54:16 -0000, logged_in=no; domain=.github.com; path=/; expires=Mon, 27 Dec 2038 06:54:16 -0000; secure; HttpOnly, _gh_sess=MldFM3p...; path=/; secure; HttpOnly

So my question is: how can I obtain three separate, intact cookies exactly as they were sent by the server, along with all the cookie metadata (maybe in the form of a list)?

I am a novice to Python, so any help will be highly appreciated. Cheers!

0xInfection
  • How do you know there are three set-cookie headers if requests clumps them all into one? – robert Dec 27 '18 at 07:10
  • @robert I tried this out obviously before posting this. :) – 0xInfection Dec 27 '18 at 07:11
  • What I mean is: how else did you send a GET request where you saw three Set-Cookie headers? Regardless, I don't know of a way to tell the actual Set-Cookie headers apart once they have been clumped together. If you know the cookie names, you can get the values individually from cookies.get_dict(). (If you need to see the path/domain of the cookies, see https://stackoverflow.com/questions/25091976/python-requests-get-cookies) – robert Dec 27 '18 at 07:18
  • @robert, a look at the cookies being set shows that each one has its own `expires` attribute. A cookie can only carry a single `expires` attribute, so they must have been sent as separate cookies. You can also check with a browser, Fiddler or Burp Suite: intercept the request and you'll see them. Or you can try it out here: https://hackertarget.com/http-header-check/. As for the second part, sadly I don't know the cookie names. :( – 0xInfection Dec 27 '18 at 07:23

1 Answer

Honestly, I can't work out what you are asking for in the comments on the question, but if you want a way to solve the question quoted below, it is easy.

So my question is how can I obtain 3 distinctly separate cookies as it was sent by the server (maybe in form of a list)?

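import requests

with requests.Session() as s:
    resp = s.get("https://github.com")
    # Each Set-Cookie header becomes its own Cookie object in the jar.
    print(resp.cookies)
    # Alternatives: resp.cookies.items() gives (name, value) pairs,
    # resp.cookies.get_dict() gives a plain {name: value} dict.
    # More details: http://docs.python-requests.org/en/master/_modules/requests/cookies/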

<RequestsCookieJar[
<Cookie logged_in=no for .github.com/>,
<Cookie _gh_sess=UHd5aUZ0ZXlBVDVPMitaVVBaWFp0c1p6dFA0TWVSanJzRGgrbU1XbVkxV3VXRW9LeWgwWHpWZ2pOOHFxZmtGaTZpRExpT2NjTHRyK3hHZG5GZjlxTzllbklqK0thQytHYi9HZWsrZ1poZ1ZUakJkRU9OZmJINEh3QUR2N3h3UUh6aVdFTmFCRHlHcVpwWHo1bEM5d25adnhUemJ6Y3pFMUxTbk50Q0M0UUJrVG5hR3kxRUVoUTB2TjdUc2hWbHk3cDJDWUZ4UW85NVRuR09keFJRTlc1QT09LS1RUnZHWUpsQ3BQU0hPZGtsWDAxQXFBPT0%3D--d2bd04e94c369f425fb7e9cc57b5b5499909b140 for github.com/>,
<Cookie has_recent_activity=1 for github.com/>]>
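Printing the jar only shows name, value and domain, but the Cookie objects inside it should also hold the metadata. Here is a minimal sketch of reading it out. The helper name `cookie_metadata` and the demo cookie values are made up for illustration, and the `HttpOnly` check assumes the server spelled the flag exactly that way.

```python
from http.cookiejar import Cookie, CookieJar


def cookie_metadata(jar):
    """Return one dict per cookie in a CookieJar, attributes included."""
    return [
        {
            "name": c.name,
            "value": c.value,
            "domain": c.domain,
            "path": c.path,
            "expires": c.expires,  # epoch seconds, or None for a session cookie
            "secure": c.secure,
            # Non-standard attributes keep the spelling the server used,
            # so this check assumes the flag arrived as "HttpOnly".
            "httponly": c.has_nonstandard_attr("HttpOnly"),
        }
        for c in jar
    ]


# Offline demo with a hand-built cookie (made-up values, no network needed).
demo_jar = CookieJar()
demo_jar.set_cookie(Cookie(
    version=0, name="logged_in", value="no", port=None, port_specified=False,
    domain=".github.com", domain_specified=True, domain_initial_dot=True,
    path="/", path_specified=True, secure=True, expires=2177045656,
    discard=False, comment=None, comment_url=None, rest={"HttpOnly": None},
))
print(cookie_metadata(demo_jar))
```

Since the jar returned by requests is a CookieJar subclass, the same helper can be called as `cookie_metadata(resp.cookies)` on a real response. Note that this gives parsed attributes, not the raw header text.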

The cookies can also be extracted from the combined header with a regex (regex101_A, regex101_B):

import re
import requests

def show_cookie(header):
    # Split where ", " starts a new cookie, then capture key=value pairs (group 1) or bare flags (group 2).
    return [re.findall(r"([^,;\s]*?=.*?(?=;|$))|(\w+(?=;|$|,))", cookie) for cookie in re.findall(r"((?:^|,\s).*?)(?=,\s\S+;|$)", header)]

with requests.Session() as s:
    resp = s.get("https://github.com")
    print(show_cookie(resp.headers.get('Set-Cookie')))
[[('has_recent_activity=1', ''), ('path=/', ''), ('expires=Sat, 29 Dec 2018 14:43:45 -0000', '')], [('logged_in=no', ''), ('domain=.github.com', ''), ('path=/', ''), ('expires=Wed, 29 Dec 2038 13:43:45 -0000', ''), ('', 'secure'), ('', 'HttpOnly')], [('_gh_sess=eHBNWkZscHFMeXJ3NEJUU0VXZlBQaHg0S01rby9MK24xNnFvR3gvVTBsOUJjTWNWenJPZ0RRdk9RNE9ZV2V0MTQ1bTg2NEduY3phSWRrd3l0L252KzBJNkRYZlpjWXh5c2NBZktkWGFsdjZDbEJjTEdhVmZ0YnpldDFHTEpuQzFTcDNNS21sT3BRaHhBVUFqTHQ1cDZyQWNPU005ODY0bFh0MGxCbWI5d2kwait5RlcvVjlUc2FwTTdNRE8wOHZQb0RGak5YbG1ZSDJTM2ZpQmVUUkkrdz09LS11M0ZHem1YYjdWYkVLaWtRMkhscW5nPT0%3D--f778e2d24e96f3386a2da36e2d33d2b73418deed', ''), ('path=/', ''), ('', 'secure'), ('', 'HttpOnly')]]
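
The tuples in that output, including entries like `('', 'secure')`, can be folded into one dict per cookie. This is only a sketch: `attrs_to_dict` is a made-up helper name, and it assumes each tuple has the `(key_value, flag)` shape shown above, with bare flags mapped to `True`.

```python
def attrs_to_dict(pairs):
    """Fold [(key_value, flag), ...] tuples for one cookie into a dict."""
    cookie = {}
    for key_value, flag in pairs:
        if key_value:
            # "path=/" -> {"path": "/"}; split only at the first "=".
            key, _, value = key_value.partition("=")
            cookie[key] = value
        elif flag:
            # Bare flags such as secure or HttpOnly carry no value.
            cookie[flag] = True
    return cookie


# Sample taken from the output above (the logged_in cookie).
sample = [('logged_in=no', ''), ('domain=.github.com', ''), ('path=/', ''),
          ('expires=Wed, 29 Dec 2038 13:43:45 -0000', ''),
          ('', 'secure'), ('', 'HttpOnly')]
print(attrs_to_dict(sample))
```

The cookie's own name ends up as an ordinary key here (`logged_in`), so if you need to tell it apart from the attributes, treat the first tuple separately. For the whole header, `[attrs_to_dict(c) for c in show_cookie(header)]` should give a list of dicts.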
KC.
  • I have seen this method, but it does not return the other flags associated with each cookie, such as the expires and secure flags. I wanted the cookies just __intact as they were sent by the server__ (with all the metadata intact as well). Thank you for your answer though. – 0xInfection Dec 29 '18 at 13:26
  • @InfectedDrake Thanks for pointing that out, I have edited my answer. But I don't know how to transform the result into a dict, given entries such as `('', 'secure')`. – KC. Dec 29 '18 at 13:48
  • Thank you for your answer. You're awesome. In the meantime, I also found another workaround method. – 0xInfection Dec 30 '18 at 16:32
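
For anyone landing here later: one approach that avoids the regex entirely (not necessarily the workaround mentioned in the last comment) is to read the headers from the underlying urllib3 response, which keeps repeated headers separate. A minimal sketch, where `raw_set_cookie_headers` is a made-up helper name and `response` is assumed to be a requests response object:

```python
def raw_set_cookie_headers(response):
    """Return each Set-Cookie header of a requests response as its own string."""
    # response.raw is the underlying urllib3 response; its headers object
    # keeps repeated headers separately and exposes them through getlist().
    return response.raw.headers.getlist("Set-Cookie")
```

With a live response this would be called as `raw_set_cookie_headers(requests.get("https://github.com"))`, and each list item should be one intact Set-Cookie value with its expires, secure and HttpOnly parts untouched.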