6

I use Python to access (read) web pages automatically. Using Python I can easily access the content of the web pages (HTML code) as well as the cookies sent by the server.

Now, HTML5 introduces a new concept, "Local Storage". So I need to modify my Python scripts so that I can also read the data stored in local storage.

Is it possible to do so? Is there any Python library that makes it easy?

Roman

2 Answers

5

Yes, it is possible. Python itself, however, does not include a JavaScript interpreter, so you could execute a custom script through Selenium on a web browser instance, as thibpat has mentioned.

Another option is PhantomJS, a headless browser.

Script to iterate over localStorage

for (var i = 0; i < localStorage.length; i++) {
    var key = localStorage.key(i);  // declared with var to avoid creating an implicit global
    console.log(key + ': ' + localStorage.getItem(key));
}
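Note that console.log only writes to the browser console, so nothing comes back to Python. A minimal sketch of how the same loop could return its data instead, assuming a Selenium-style driver object that exposes `execute_script`; the names `DUMP_LOCAL_STORAGE_JS` and `read_local_storage` are made up for illustration:

```python
import json

# Same loop as above, but it collects the entries into an object and
# returns them as a JSON string instead of logging to the console.
DUMP_LOCAL_STORAGE_JS = """
var items = {};
for (var i = 0; i < localStorage.length; i++) {
    var key = localStorage.key(i);
    items[key] = localStorage.getItem(key);
}
return JSON.stringify(items);
"""


def read_local_storage(driver):
    """Return the current page's localStorage as a Python dict.

    `driver` is assumed to be a Selenium WebDriver (or any object with a
    compatible `execute_script` method) that has already loaded the page.
    """
    raw = driver.execute_script(DUMP_LOCAL_STORAGE_JS)
    return json.loads(raw)
```

With a real browser this would be called after `driver.get(url)`, since local storage is scoped to the origin of the loaded page.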

Advanced script

As mentioned here, a browser that supports HTML5 features should also implement Array.prototype.map, so the script can be written as:

Array.apply(0, new Array(localStorage.length)).map(function (o, i) 
   { return localStorage.key(i)+':'+localStorage.getItem(localStorage.key(i)); }
)

Python bindings

You might want to use the Python bindings of a desktop development framework, e.g. PyQt.

Why JavaScript to fetch local storage

From the definition:

Unlike cookies, which can be accessed by both the server and client side, web storage falls exclusively under the purview of client-side scripting. Web storage data is not automatically transmitted to the server in every HTTP request, and a web server can't directly write to Web storage. However, either of these effects can be achieved with explicit client-side scripts, allowing for fine-tuning of the desired interaction with the server.

So in my view, local storage is data stored by the web browser (e.g. Opera) somewhere on the hard drive of the machine (local or in the cloud) where the browser runs. To fetch it directly you would need to hack Opera's executable, libraries and/or data files locally, which is hard. The simplest way is to use client-side scripting, namely JavaScript.

Igor Savinkin
  • Thank you for that answer. What I do not understand is why we need a JavaScript interpreter. Yes, I know that data stored in local storage is read and used locally by JavaScript, but that does not necessarily mean the data itself is saved as JavaScript code that has to be "interpreted". So, why do we need an interpreter? – Roman Oct 13 '15 at 08:52
  • @Roman, by **the JavaScript interpreter** I meant a **JavaScript engine** (https://en.wikipedia.org/wiki/JavaScript_engine), that is, something that executes JS on the client side and gets the Local Storage content. – Igor Savinkin Feb 23 '21 at 11:59
0

I don't know which library you're using right now, but you could use Selenium and the WebDriver API. This API allows you to control a browser such as Chrome or Firefox, or a headless browser such as PhantomJS.

Thanks to this API you can navigate to the right page and then execute a JavaScript snippet to access the localStorage variable.
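A rough sketch of that navigate-then-execute flow, assuming Selenium's Python bindings and a Firefox driver are installed. The function name `fetch_local_storage_item` and the `driver_factory` parameter are hypothetical, added only so the browser can be swapped out:

```python
def fetch_local_storage_item(url, key, driver_factory=None):
    """Open `url` in a browser and return localStorage[key] (None if absent)."""
    if driver_factory is None:
        # Imported lazily; assumes Selenium and a Firefox driver are available.
        from selenium import webdriver
        driver_factory = webdriver.Firefox

    driver = driver_factory()
    try:
        # Local storage is per origin, so the page has to be loaded first.
        driver.get(url)
        # Extra arguments to execute_script show up as `arguments` in the JS.
        return driver.execute_script(
            "return window.localStorage.getItem(arguments[0]);", key
        )
    finally:
        # Always close the browser, even if the script raised an error.
        driver.quit()
```

If the page fills local storage from scripts that run some time after loading, the value may not be there yet when this returns, so an explicit wait may be needed in practice.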

thibpat