I'm working on a digital art project that involves gathering cookies from a set of websites that I visit. I'm dabbling in writing some code to help me with this but overall I'm just looking for the easiest/fastest way to gather all of the contents of the cookies dropped in a single visit into a text file for re-use later.

Right now - I'm using this script in a JavaScript bookmarklet which replaces the page I'm on with the contents of the cookies in an array (I'm later putting this array into a python script I wrote...).

The contents of the bookmarklet are below, but the problem right now is that it only returns the cookies from the single domain I'm on.

So for example - if I run this script on the NYTimes.com homepage I get approx 48 cookies dropped by the domain. But if I look in Chrome I see that all of the 3rd party tracking scripts have hundreds of cookies. How do I gather them all, not just the NYTimes.com ones?

This is the current JavaScript code I'm running via a bookmarklet right now:

function get_cookies_array() {

var cookies = { };

    if (document.cookie && document.cookie != '') {
        var split = document.cookie.split(';');
        for (var i = 0; i < split.length; i++) {
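            // Split on the first "=" only; cookie values may themselves contain "=".
            var pair = split[i].replace(/^ /, '');
            var eq = pair.indexOf('=');
            var cookie_name = eq === -1 ? pair : pair.slice(0, eq);
            var cookie_value = eq === -1 ? '' : pair.slice(eq + 1);
            cookies[decodeURIComponent(cookie_name)] = decodeURIComponent(cookie_value);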
        }
    }

    return cookies;

}

function quotationsanitize(cookie){
    if(cookie.indexOf('"') === -1)
        {
          return cookie;
        }
        else{
            alert("found a quotation!");
            return encodeURIComponent(cookie);
        }
}


function sanitize(cookie){
    if(cookie.indexOf(',') === -1)
        {
          return quotationsanitize(cookie);
        }
        else{
            alert("found a comma!");
            return quotationsanitize(encodeURIComponent(cookie));
        }
}

function appendCookies(){
    $("body").empty();
    var cookies = get_cookies_array();
    $("body").append("[");
        for(var name in cookies) {
            //$("body").append(name + " : " + cookies[name] + "<br />" );
            var cookieinfo = sanitize(cookies[name]);
            $("body").append('"' + cookieinfo + '",<br />' );

        }
    $("body").append("]");
}


var js = document.createElement('script');
js.src = "https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js";
// Run appendCookies once jQuery has actually loaded, rather than guessing with a fixed 500 ms timeout.
js.onload = appendCookies;
document.head.appendChild(js);

I'm escaping " and , in the output because I'm putting this data into an array in Python by copying and pasting it. I admit that it's a hack. If anyone has any better ideas I'm all ears!

Makoto
tomcritchlow

1 Answer

I'd write a simple little HTTP proxy, then set your browser to use it and have it record all the cookies as they pass through.

There's a question about writing a simple proxy here, "Seriously simple Python HTTP proxy?", which might get you started.

You'd need to extend it to read the headers and extract the cookies, but that's relatively easy, and if you're happy in Python, you'll find libraries that do most of what you want already. You would want to record the Referer header too, so you know which cookies came from which page request, but you could then record an entire browsing session quite simply.
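
As a rough sketch of the header-reading step (not a full proxy), assume the proxy hands you each request's and response's headers as lists of (name, value) pairs. The function names `parse_set_cookie` and `record_exchange`, the record layout, and the `tracker.example` host are all made up for illustration. Cookies arrive in `Set-Cookie` response headers, and the `Referer` request header tells you which page triggered the request. Bear in mind that a plain HTTP proxy can't read headers inside HTTPS traffic unless it also intercepts TLS.

```python
import json


def parse_set_cookie(header_value):
    """Split one Set-Cookie header value into name, value and attributes."""
    parts = [p.strip() for p in header_value.split(";")]
    # Split on the first "=" only; cookie values may contain "=" themselves.
    name, _, value = parts[0].partition("=")
    attributes = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ";"
        key, _, val = part.partition("=")
        attributes[key.strip().lower()] = val.strip()
    return {"name": name.strip(), "value": value.strip(), "attributes": attributes}


def record_exchange(request_headers, response_headers, request_host):
    """Return one record per cookie set by this response.

    request_headers / response_headers: lists of (name, value) tuples,
    as an assumed proxy would see them. Header names are case-insensitive.
    """
    referer = next(
        (v for k, v in request_headers if k.lower() == "referer"), None
    )
    records = []
    for header_name, header_value in response_headers:
        if header_name.lower() != "set-cookie":
            continue
        cookie = parse_set_cookie(header_value)
        cookie["set_by_host"] = request_host  # host that answered the request
        cookie["referer"] = referer           # page that triggered the request
        records.append(cookie)
    return records


if __name__ == "__main__":
    # Hypothetical exchange: a tracker pixel requested by a news page.
    sample_request = [
        ("Host", "tracker.example"),
        ("Referer", "http://www.nytimes.com/"),
    ]
    sample_response = [
        ("Content-Type", "image/gif"),
        ("Set-Cookie", "uid=abc123; Domain=.tracker.example; Path=/; HttpOnly"),
    ]
    found = record_exchange(sample_request, sample_response, "tracker.example")
    print(json.dumps(found, indent=2))
```

Dumping the records as JSON also means you can load them straight back into Python with `json.load` instead of copying and pasting an array by hand.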

Community
Peter Bagnall
  • Thanks! Looks like Python is going to be a better way to do this... Thanks for the link - I'm not super comfortable in Python but with the help of some copy and paste I'll figure it out! :) Really appreciate the answer – tomcritchlow May 26 '15 at 20:21