4

Is there a simple cookie manager in Jsoup that stores cookies by host? The example in this thread is quite lacking.

codekitty
  • How about Apache [HttpClient](http://hc.apache.org/)? I have been using it, and it works very well with cookies. –  Mar 18 '12 at 14:45

2 Answers

9

I didn't find a standard solution that works with Jsoup. Here's my simple cookie handling, using a HashMap keyed by host. It's probably missing a lot of functionality, but I hope it will work well enough for my basic crawler:

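import java.net.URL;
import java.util.HashMap;
import java.util.Map.Entry;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// The members below belong inside the crawler class.
// Cookies collected so far: host name -> (cookie name -> cookie value).
private static HashMap<String, HashMap<String, String>> host2cookies = new HashMap<String, HashMap<String, String>>();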

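public static String[] DownloadPage(URL url) throws Exception
{
    Connection con = Jsoup.connect(url.toString()).timeout(600000); // timeout in ms (10 minutes)
    // Send any cookies already stored for this host.
    loadCookiesByHost(url, con);

    Document doc = con.get();
    // Re-read the URL from the request in case a redirect changed it.
    url = con.request().url();

    // Remember the cookies the server set in its response.
    storeCookiesByHost(url, con);

    // Result: {final URL, page HTML}
    return new String[]{url.toString(), doc.html()};
}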

private static void loadCookiesByHost(URL url, Connection con) {
    try {
        String host = url.getHost();
        if (host2cookies.containsKey(host)) {
            HashMap<String, String> cookies = host2cookies.get(host);
            for (Entry<String, String> cookie : cookies.entrySet()) {
                con.cookie(cookie.getKey(), cookie.getValue());
            }
        }
    } catch (Throwable t) {
        // MTMT move to log
        System.err.println(t.toString()+":: Error loading cookies to: " + url);
    }
}

private static void storeCookiesByHost(URL url, Connection con) {
    try {
        String host = url.getHost();
        HashMap<String, String> cookies = host2cookies.get(host);
        if (cookies == null) {
            cookies = new HashMap<String, String>();
            host2cookies.put(host, cookies);
        }
        cookies.putAll(con.response().cookies());
    } catch (Throwable t) {
        // MTMT move to log
        System.err.println(t.toString()+":: Error saving cookies from: " + url);
    }
}
codekitty
  • Rather than iterate through the entries of `cookies` in `loadCookiesByHost(..)`, you can use [`Connection#cookies(cookies)`](http://jsoup.org/apidocs/org/jsoup/Connection.html#cookies-java.util.Map-) which adds all cookies in a map to the connection. – FThompson Dec 02 '15 at 02:49
3

The Connection.Base interface has everything you need to know about how jsoup deals with cookies.

Essentially, it will let you get and set them on each connection, but beyond that it's up to you to "manage" them.
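
As a rough illustration of what "managing" them yourself could look like, here is a minimal sketch of a per-host cookie store that uses only the JDK. The class name `HostCookieStore` and its methods `cookiesFor` and `store` are hypothetical names made up for this example; they are not part of jsoup. It also assumes that keying on the exact, lower-cased host name is good enough, so it ignores cookie domains, paths, and expiry.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of jsoup): remembers cookies per host name. */
final class HostCookieStore {
    // host name -> (cookie name -> cookie value)
    private final Map<String, Map<String, String>> cookiesByHost = new HashMap<>();

    /** Returns a copy of the cookies stored for the host, or an empty map if there are none. */
    Map<String, String> cookiesFor(String host) {
        Map<String, String> stored = cookiesByHost.get(key(host));
        return stored == null ? new HashMap<>() : new HashMap<>(stored);
    }

    /** Merges the given cookies into the entry for the host, overwriting cookies with the same name. */
    void store(String host, Map<String, String> cookies) {
        cookiesByHost.computeIfAbsent(key(host), h -> new HashMap<>()).putAll(cookies);
    }

    // Host names are case-insensitive, so normalize before using them as keys.
    private static String key(String host) {
        return host.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        HostCookieStore store = new HostCookieStore();

        // Pretend a response from example.com set one cookie.
        Map<String, String> fromResponse = new HashMap<>();
        fromResponse.put("sid", "abc123");
        store.store(URI.create("http://example.com/login").getHost(), fromResponse);

        // A later request to the same host gets the cookie back; another host gets nothing.
        System.out.println(store.cookiesFor("example.com")); // prints {sid=abc123}
        System.out.println(store.cookiesFor("other.org"));   // prints {}
    }
}
```

With jsoup, the store would be wired in around each request: pass `store.cookiesFor(url.getHost())` to `Connection#cookies(Map)` before executing, then call `store.store(url.getHost(), con.response().cookies())` afterwards.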

cdeszaq