I am just beginning to learn web application development using Python, and I keep coming across the terms 'cookies' and 'sessions'. I understand that cookies store information as key-value pairs in the browser. But I am a little confused about sessions: in a session, too, we store data in a cookie on the user's browser.

For example, I log in with username='rasmus' and password='default'. The data is posted to the server, which checks it and logs me in if I am authenticated. During this process the server also generates a session ID, which is stored in a cookie on my browser. The server also stores this session ID in its file system or datastore.

But based on just the session ID, how would it be able to know my username during my subsequent traversal through the site? Does it store the data on the server as a dict where the key would be a session ID and details like username, email etc. be the values?

I am getting quite confused here. Need help.

Rasmus
  • “Does it store the data on the server as a dict where the key would be a session id and details like username, email etc be the values?” ...yes. The ‘dict’ might be a relational database, but that's basically how it works. – bobince Sep 27 '10 at 13:56
  • In case you don't know: storing the password on the client side is not safe, even if the password is hashed (in fact, it makes no difference: a cracker can submit the hashed password directly by creating a fake cookie). There are better ways to store the login status. – cytsunny Nov 27 '14 at 06:21
  • I have written my own explanation using protocol-level details - http://www.bitspedia.com/2012/05/how-session-works-in-web-applications.html – Asif Shahzad Apr 03 '18 at 15:36

6 Answers

Because HTTP is stateless, associating one request with any other request requires a way to store user data between HTTP requests.

Cookies and URL parameters (for example, http://example.com/myPage?asd=lol&boo=no) are both suitable ways to transport data between two or more requests. However, they are not a good fit when you don't want that data to be readable or editable on the client side.

The solution is to store that data server side, give it an "id", and let the client only know (and pass back at every HTTP request) that id. There you go, sessions implemented. Alternatively, you can use the client as convenient remote storage, but then you would encrypt the data and keep the secret server-side.

Of course there are other aspects to consider: you don't want people to hijack others' sessions, you want sessions to expire rather than last forever, and so on.

In your specific example, the user id (could be username or another unique ID in your user database) is stored in the session data, server-side, after successful identification. Then for every HTTP request you get from the client, the session id (given by the client) will point you to the correct session data (stored by the server) that contains the authenticated user id - that way your code will know what user it is talking to.
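
As a rough illustration of the lookup described above, here is a minimal Python sketch. The names `USERS`, `SESSIONS`, `login` and `current_user` are hypothetical, the store is a plain in-memory dict, and the password is compared in plain text only to keep the example short; a real application would hash passwords and keep sessions in a database or cache.

```python
import secrets

# Hypothetical in-memory stores, for illustration only.
USERS = {"rasmus": "default"}   # username -> password (plain text for brevity)
SESSIONS = {}                   # session id -> session data


def login(username, password):
    """Return a new session id on success, or None if the credentials are wrong."""
    if USERS.get(username) != password:
        return None
    session_id = secrets.token_urlsafe(32)  # long, unguessable random id
    SESSIONS[session_id] = {"username": username}
    return session_id


def current_user(session_id):
    """Look up the user behind a session id sent back by the client."""
    session = SESSIONS.get(session_id)
    return session["username"] if session else None


sid = login("rasmus", "default")
print(current_user(sid))           # rasmus
print(current_user("made-up-id"))  # None
```

The client only ever holds `sid`; the username stays in `SESSIONS` on the server.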

unicorn2
Luke404
  • "you don't want that data to be maintained client side". Why not? If you employ strong cryptography you can let the client keep hold of the session data encrypted and stored in a cookie. This greatly simplifies scaling out to multiple nodes as the servers don't need to 'remember' anything. – Matt Harrison Jan 30 '15 at 20:56
  • @MattHarrison how would you decrypt the data without "remembering anything" server-side? I've tried to expand this topic in my answer anyhow. – Luke404 Feb 02 '15 at 10:00
  • It is possible to only remember 1 key. But security will drop... a lot. – Jared Teng Mar 04 '15 at 14:08
  • @MattHarrison keep in mind that storing lots of data user-side will increase your traffic. – nitsas Jun 25 '15 at 11:21
  • Wouldn't a third party be able to act as the user if they could intercept the user's session key? Assuming the site doesn't use HTTPS, it seems like a third party could masquerade as the user with a session key even if the key is encrypted. The server would just decrypt it. – user137717 Aug 01 '15 at 04:31
  • does this definition apply to session in SIP ( session initiation protocol)? – pankaj kushwaha Sep 05 '15 at 12:38
  • @user137717 yes this is a valid concern – Celeritas Apr 03 '16 at 04:28
  • @user137717 yes that is a possibility if you allow access to the session to literally "every one that presents the correct session id". There are a number of restrictions you can put in place, one of the easiest and most common is to store the client IP in the session: if a client from another ip presents the same session id you mark that as forged and delete the session. – Luke404 Aug 08 '17 at 14:03
  • "The solution is to store that data server side, give it an "id", and let the client only know (and pass back at every http request) that id. There you go, sessions implemented." Are they sessions if 1. The id won't expire; 2. The id doesn't store in cookie but in query arguments, request body or HTTP header; – Cloud Jun 09 '18 at 00:21
  • @SiminJie yes, they are non expiring sessions and you can pass the id around however you like. Also, just because that could technically qualify as _sessions_, it doesn't mean it is automatically _a good idea_. – Luke404 Jul 02 '18 at 18:40
  • For anyone reading this now: - "It is possible to only remember 1 key. But security will drop... a lot." - This is true. One can employ a method where this secret key is changed regularly. This would invalidate the existing tokens that the users have, but that can be solved by using one access token (that uses this key) and one refresh token. About detecting of token theft, you could use the concept of rotating refresh tokens as highlighted in RFC 6819. Here is an article explaining all the different protocols that can be used: https://medium.com/@supertokens.io/ee5245e6bdad – Rishabh Poddar Jun 11 '19 at 07:21

Explanation via Pictures:

[Image: Sessions explained via picture]

You can think of a session as something like a library ID card. Every time you go to the library, you show them your ID card, which was issued by that particular library. Now they can match who you are with the records they keep on file.

Let's elaborate step-by-step:

Simple Explanation by analogy

Imagine you are in a bank, trying to get some money out of your account. But it's dark; the bank is pitch black: there's no light. You are surrounded by another 20 people. They all look the same. And everybody has the same voice. And everyone is a potential bad guy. In other words, HTTP is stateless.

This bank is a funny type of bank - for the sake of argument here's how things work:

  1. you talk to your teller and make a request to withdraw money, and then
  2. you have to wait briefly on the sofa, and 20 minutes later
  3. you collect your money from the teller.

But how will the teller tell you apart from everyone else?

The teller can't see or readily recognise you, remember, because the lights are all out.

What if your teller gives your $10,000 withdrawal to someone else - the wrong person?! It's absolutely vital that the teller can recognise you as the one who made the withdrawal, so that you can get the money (or resource) that you asked for.

Solution:

When you first appear to the teller, he or she tells you something in secret:

"Whenever you are talking to me," says the teller, "you should first identify yourself as GNASHEU329 - that way I know it's you".

Nobody else knows the secret passcode.

Example of How I Withdrew Cash:

So I decide to go and chill out for 20 minutes, and then later I go to the teller and say "I'd like to collect my withdrawal"

The teller asks me: "who are you??!"

"It's me, Mr. George Banks!"

"Prove it!"

And then I tell them my passcode: GNASHEU329

"Certainly Mr. Banks!"

That basically is how a session works. It allows one to be uniquely identified in a sea of millions of people. You need to identify yourself every time you deal with the teller.

Difference between Sessions and Cookies

Cookie: You can think of a cookie simply as a plastic card on which information is printed. You can store anything on that card, like:

  • name / age / sex / marital status
  • passcodes

Sessions: Think of a session as a temporary passcode. The passcode is stored in the cookie, but that does not make it a cookie. You wouldn't want anyone to be able to easily tamper with people's passcodes, or to easily reproduce them - see below for examples:

Security Concerns with Cookies

The bank can write information onto your card - and so can you. But this can be dangerous:

name: Ben Koshy
sex: male
bank balance: $1.99 :'(

If I wanna be sneaky, I could edit my ID card:

name: Ben Koshy
sex: male
bank balance: $1 billion bucks.  <------ new line

Hooray! I could print more money than Yellen and Powell combined. This presents a security risk: it is for this reason that banks "encrypt" information on cookies, so that if you tampered with it, the bank would know. As a general rule, you should never put anything sensitive that could be tampered with into a cookie - the bank balance should be stored on the server, where nobody can directly tamper with it.
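
In practice, the tamper detection described above is often done by signing the cookie value rather than encrypting it. Here is a minimal sketch using Python's standard `hmac` module. `SECRET_KEY`, `sign` and `verify` are made-up names for illustration, and real frameworks use their own cookie formats.

```python
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # hypothetical; known only to the server


def sign(value):
    """Append an HMAC-SHA256 tag so the server can detect tampering."""
    tag = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{tag}"


def verify(cookie):
    """Return the original value if the tag matches, otherwise None."""
    value, _, tag = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    if hmac.compare_digest(tag.encode(), expected.encode()):
        return value
    return None


cookie = sign("balance=1.99")
print(verify(cookie))                                # balance=1.99
print(verify(cookie.replace("1.99", "1000000000")))  # None
```

Signing does not hide the value from the card holder; it only lets the server notice edits. That is why the balance itself still belongs on the server.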

In this case, Powell decided to tamper with bank balance in his cookie. The bank can now invalidate his session, and log him out - permanently:

name: Jerome Powell
title: "independent" bureaucrat
skill: lying to congress;.
session: tampering with the Fed's balance sheet, 
         lying to congress, criminal incompetence 
         -> session invalid. log him out the fed, permanently. 

Good bye sir!

Duplicating Sessions

Think of any web-based service, such as Facebook or Gmail: if I have your password, then I get access to your account. It's the same with sessions: if I can reproduce or recreate your session, then I can impersonate you. This might happen if you get really sloppy or careless and accidentally release your private key on the internet. Someone at GitHub quite recently (February 2023) released some private keys. If they had released GitHub's secret_key_base, which is a random secret in their Rails application, then I could use that key to create sessions. And once I can create a session, I could effectively impersonate someone else.

BenKoshy
  • Love this explanation - in your analogy, how would you prevent other ppl from eavesdropping and also hearing the secret passcode the teller tells you? In other words, if the session_id is stolen, wouldn't it be possible for someone to mimic your credentials? – wmock Apr 15 '17 at 17:08
  • lovely example!! it shall be shared with eager minds looking for learning! – Victor Jan 17 '19 at 17:06
  • in your analogy, `GNASHEU329` is the user password, which generates an auth token that expires until a certain time; Mr Banks can then use the auth token to make several successive withdrawls without having to repeatedly give the teller his password? – Daniel Lizik Mar 09 '20 at 05:11
  • In this analogy, what is the cookie and what is the session/how are they different? Also, if a session closes, how does it get the new information when it reopens? Thanks :D – BeastCoder Jul 14 '20 at 16:30
  • @BKSpurgeon That does make sense, I just have one more question. So, sessions do close, but there is some info either on your computer or in the web server that has the info for when you create a new one and it can be accessed with a session ID (is the session ID always the same?), is that right? Just want to make sure that I am understanding correctly. – BeastCoder Jul 16 '20 at 05:31
  • very nice explanation. just logged in to write comment and say thank you for your time and effort, it really helped me understand. the first answer did not. thank you – Daniel_Ranjbar Jan 15 '23 at 06:47
  • folks given the above questions, i have amended the answer. hopefully that should remove the confusion. – BenKoshy Mar 25 '23 at 03:59

"Session" is the term used to refer to a user's time browsing a web site. It's meant to represent the time between their first arrival at a page in the site until the time they stop using the site. In practice, it's impossible to know when the user is done with the site. In most servers there's a timeout that automatically ends a session unless another page is requested by the same user.

The first time a user connects some kind of session ID is created (how it's done depends on the web server software and the type of authentication/login you're using on the site). Like cookies, this usually doesn't get sent in the URL anymore because it's a security problem. Instead it's stored along with a bunch of other stuff that collectively is also referred to as the session. Session variables are like cookies - they're name-value pairs sent along with a request for a page, and returned with the page from the server - but their names are defined in a web standard.
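
To make the session ID creation and the inactivity timeout from this answer concrete, here is a small sketch. The 30-minute `SESSION_TIMEOUT`, the `sessions` dict and both function names are assumptions made for illustration; actual servers each have their own defaults and storage.

```python
import secrets
import time

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity allowed; an assumed value
sessions = {}              # session id -> time of the last request


def start_session(now=None):
    """Create a session id on a visitor's first request."""
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = time.time() if now is None else now
    return session_id


def is_active(session_id, now=None):
    """Refresh the session on activity; drop it after too long an idle gap."""
    now = time.time() if now is None else now
    last_seen = sessions.get(session_id)
    if last_seen is None:
        return False
    if now - last_seen > SESSION_TIMEOUT:
        del sessions[session_id]  # expired: the server forgets the session
        return False
    sessions[session_id] = now
    return True


sid = start_session(now=0)
print(is_active(sid, now=10 * 60))  # True: requested again within the timeout
print(is_active(sid, now=45 * 60))  # False: 35 idle minutes, session ended
```

The `now` parameter exists only so the example can simulate the passage of time without sleeping.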

Some session variables are passed as HTTP headers. They're passed back and forth behind the scenes on every page request, so they don't show up in the browser and reveal something that may be private. Among them are the USER_AGENT, or type of browser requesting the page, the REFERRER, or the page that linked to the page being requested, etc. Some web server software adds its own headers or transfers additional session data specific to that software. But the standard ones are pretty well documented.

Hope that helps.

Tim Rourke
  • I know on the IIS servers I use I can get the user name from a USER_NAME header, but that may be IIS-specific. – Tim Rourke Sep 27 '10 at 15:36
  • What does the REFERRER means here? – Gab是好人 Aug 13 '15 at 12:00
  • @Gab是好人 REFERRER usually means an arbitrary string that the client sends in the "Referer" HTTP request header. It _should_ contain the URL of the resource that, you know, referred the client to the current resource. – Luke404 Aug 08 '17 at 14:05
  • Thanks, it *should*, so not necessarily. so I think people often use this header with different semantics than suggested in the RFC, right? – Gab是好人 Aug 08 '17 at 15:02
  • First you wrote `Like cookies, this usually doesn't get sent in the URL anymore` and then `Session variables are like cookies - they're name-value pairs sent along with a request for a page`. What happens exactly? Is it sent with the next time you make any request? – asn Jan 28 '20 at 21:15
  • By the way, what information is sent between the server and browser also depends on the web server, e.g., Apache, the server-side scripting language, e.g., PHP, and any framework that may be used, e.g., Symfony. In my case, at least some, if not all, of the session's global variables are sent to the browser, but not exposed in the page. These, as far as I know, are pretty secure and can't be viewed with JavaScript or in the browser's Inspect/Debugger. When a page makes a request back to the server, these very-hidden values are also sent to the server to restore its session variables. –  Oct 25 '20 at 11:38

HTTP is a stateless protocol; that is, the server cannot, by itself, tell apart the connections of different users.

This is where cookies come in. The first time a client connects to a server, the server generates a new session ID and sends it to the client as a cookie value. From then on, the session ID identifies that client, because the server sees it among the cookies in every HTTP request.
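
The cookie exchange itself can be sketched with Python's standard `http.cookies` module. The cookie name `session_id` is an arbitrary choice for this example, and the request header is built by hand here to stand in for what a browser would send.

```python
import secrets
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header for a new session.
session_id = secrets.token_urlsafe(16)
outgoing = SimpleCookie()
outgoing["session_id"] = session_id        # "session_id" is an arbitrary name
outgoing["session_id"]["httponly"] = True  # hide the cookie from page scripts
outgoing["session_id"]["path"] = "/"
print(outgoing.output())  # a Set-Cookie header line carrying the id

# Later: the browser sends the value back in a Cookie request header.
request_header = f"session_id={session_id}"
incoming = SimpleCookie()
incoming.load(request_header)
print(incoming["session_id"].value == session_id)  # True
```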

For each session ID, the server keeps some data structure in which it can store data specific to that user. Abstractly, you can call this data structure the session.

nbro
Artem Barger
  • Can you put some more light on this - "Now for each session id, server keeps some data structure, which enables him to store data specific to user, this data structure you can abstractly call session."? What specific client information does the server stores? – realPK Feb 27 '14 at 18:00
  • Can you put some more light on this - "Now for each session id, server keeps some data structure, which enables him to store data specific to user, this data structure you can abstractly call session."? What specific client information does the server stores? – Gab是好人 Aug 13 '15 at 11:59
  • Same question as above too, it would be helpful if you response. – Suraj Jain Dec 30 '17 at 05:58

Think of HTTP as a person (A) who has SHORT TERM MEMORY LOSS and forgets every person as soon as that person goes out of sight.

Now, to remember different people, A takes a photo of each person and keeps it. Each person's picture has an ID number. When that person comes into sight again, they tell A their ID number, and A finds their picture by that number. And voila! A knows who that person is.

It is the same with HTTP. It suffers from SHORT TERM MEMORY LOSS. It uses sessions to record everything you did while using a website, and then, when you come back, it identifies you with the help of cookies (a cookie is like a token). The picture is the session here, and the ID is the cookie.

Luv33preet

Session is a broad technical term that can refer to state stored either on the server side, using an in-memory cache, or on the client side, using a cookie, local storage or session storage.

There is nothing specific in the browser or on the server that is called a session. A session is a kind of data that represents a user's session on the web, and that data can be stored on the server or the client.

How it is stored and shared is another topic. In brief, when a user logs in, the server creates the session data and generates a session ID. The session ID is sent back to the user in a custom header or in a Set-Cookie header, which takes care of storing it automatically in the user's browser. The next time the user visits, the session ID is sent along with the request, and the server checks whether there is an existing session with that ID and processes the request accordingly.
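
Putting that round trip together, here is a toy WSGI application, offered as a sketch rather than a production design. The cookie name `sid`, the `SESSIONS` dict and the `call` helper are all hypothetical, and the session simply counts visits instead of handling a real login.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session id -> per-visitor data (in memory, for illustration)


def app(environ, start_response):
    """Tiny WSGI app that counts visits per browser using a session cookie."""
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookies.get("sid")
    session_id = morsel.value if morsel else None
    headers = [("Content-Type", "text/plain")]

    if session_id not in SESSIONS:
        # Unknown or missing id: start a new session and tell the browser.
        session_id = secrets.token_urlsafe(32)
        SESSIONS[session_id] = {"visits": 0}
        headers.append(("Set-Cookie", f"sid={session_id}; HttpOnly; Path=/"))

    SESSIONS[session_id]["visits"] += 1
    start_response("200 OK", headers)
    return [f"visit {SESSIONS[session_id]['visits']}".encode()]


def call(cookie=""):
    """Invoke the app directly, playing the role of the browser."""
    captured = {}

    def start_response(status, headers):
        captured["headers"] = dict(headers)

    body = b"".join(app({"HTTP_COOKIE": cookie}, start_response))
    return body, captured["headers"]


body, headers = call()                        # first visit, no cookie yet
cookie = headers["Set-Cookie"].split(";")[0]  # what the browser would keep
print(body)             # b'visit 1'
print(call(cookie)[0])  # b'visit 2'
```

The same `app` could be served for real with `wsgiref.simple_server.make_server` from the standard library; calling it directly just keeps the example self-contained.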

You can store whatever you want in a session, but the main purpose is to remember the user (browser) who has previously visited your site, whether that involves a login, a shopping cart, or other activities.

That is also why it is important to protect the session ID from being intercepted by an attacker, who could use it to pass himself off as another user.

By reading about Cookie, you will get the idea of session: (https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies)

Excerpt from MDN:

Cookies are mainly used for three purposes:

  • Session management: Logins, shopping carts, game scores, or anything else the server should remember
  • Personalization: User preferences, themes, and other settings
  • Tracking: Recording and analyzing user behavior
Tenzin Chemi