TL;DR:

I have some very simple PHP code utilizing cURL that makes single HTTP requests (in practice, to a Diaspora* pod, though that shouldn't be relevant to the question). The code takes note of any cookies returned by the web server and then manually sets those values to libcurl's CURLOPT_COOKIE. However, in trying to hunt down a bug, I'm finding that when I use CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR, the values of the cookies in the cookie file are different than when I use CURLOPT_COOKIE. Why is this the case? (See code below.)

PRIOR RESEARCH

I have already looked at other questions, such as this one, that suggest various ways of manipulating libcurl's options to keep the same resource handle around and the cookies in memory, but this is not suitable for my application. I need to access the cookie values directly, and notably not on a filesystem (to save them into a database, but again, this should not matter with regard to the question).

CODE

For completeness, here is a test case for code I am using:

<?php
// This function simply extracts the cookie set by a webserver by looking at the full HTTP source traffic.
function readCookie ($str) {
    $m = array();
    preg_match('/Set-Cookie: (.*?);/', $str, $m);
    return (!empty($m[1])) ? $m[1] : false;
}

// This function does the same for the CSRF token required for login.
function parseAuthenticityToken ($str) {
    $m = array();
    preg_match('/content="(.*?)" name="csrf-token"/', $str, $m);
    return (!empty($m[1])) ? $m[1] : false;
}

// Get first page, to find the CSRF token.
$ch = curl_init('https://diasp.org/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$resp = curl_exec($ch);
curl_close($ch);

$csrf_token = parseAuthenticityToken($resp);

$params = array(
    'user[username]' => 'my_username',
    'user[password]' => 'my_password',
    'authenticity_token' => $csrf_token
);

// Make POST request to the log in controller.
$ch = curl_init('https://diasp.org/users/sign_in');
curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// In order to work, the COOKIEFILE/JAR options must be used. Why?
//curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/test_cookiejar');
//curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/test_cookiejar');
$resp = curl_exec($ch);
curl_close($ch); // close the handle, not the response string

$cookies = readCookie($resp);

// Even if the login is successful, this fails if and only if no COOKIEFILE/JAR is specified.
// Why?
$ch = curl_init('https://diasp.org/stream');
curl_setopt($ch, CURLOPT_COOKIE, $cookies);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// If I use COOKIEFILE here, the request works. What is this line doing that CURLOPT_COOKIE is not?
//curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/test_cookiejar');
$resp = curl_exec($ch);
curl_close($ch);

var_dump($resp);

SUMMARY

I am making very simple, step-by-step, procedural calls to a web server. These requests are made one after the other, and the resulting output of each (the entire HTTP conversation, including headers) is saved in a variable. That variable is then read, and the cookie values are parsed from the Set-Cookie HTTP header lines. However, these values are never the same as the values that libcurl writes to the COOKIEFILE if those lines are uncommented.

What am I doing wrong with CURLOPT_COOKIE or what am I not doing with it that the CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR options are doing? Is it encoded or decoded in some reversible way? Thanks in advance.

M12
  • CURLOPT_COOKIEJAR is used to store cookies and CURLOPT_COOKIEFILE is used to read them. On login or the first request, the returned cookies are stored in a file via CURLOPT_COOKIEJAR; from the second request onward, those cookies can be sent by using CURLOPT_COOKIEFILE. – rajana sekhar Feb 18 '15 at 04:57
  • Yes, I know that. The question is why using CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR "works" (sends cookies correctly transparently) whereas CURLOPT_COOKIE doesn't, and specifically why the latter seems to result in using cookies that are not the same as the ones written to the file. – M12 Feb 18 '15 at 05:01
  • If the credentials of the first request and second request are same, then we can use the cookies of the first request to the second request, otherwise it's not possible. – rajana sekhar Feb 18 '15 at 05:19
  • I don't think you're understanding the question. The question is not whether the request is successful or not; the question is why the above code seems to read different cookie values depending on whether it uses CURLOPT_COOKIEFILE/CURLOPT_COOKIEJAR or whether it instead parses out the Set-Cookie headers manually and replays the manually parsed values using CURLOPT_COOKIE. It would really be a lot less confusing for you and less frustrating for me if you just read the question before regurgitating something you read in the manual. Thanks. – M12 Feb 18 '15 at 05:45
  • Can you show us the difference between the two approaches, i.e. the resulting `Cookie:` header(s) as part of the logs when using `curl_setopt($ch, CURLOPT_VERBOSE, true);`? – Hans Z. Feb 18 '15 at 09:14
  • I wasn't aware of `CURLOPT_VERBOSE` and it's exactly what I needed! Thank you so much, Hans! It turns out that the problem wasn't *different* cookie values at all, but rather a misunderstanding on my part of how cookie *responses* were being replaced by `CURLOPT_COOKIEFILE`/`JAR`. The difference was that by using the cookie file/jar feature, every request was saving and replacing those cookies, which masked the bug in my code that assumes only the *third* (rather than the *second*) HTTP request needs a cookie value, too. So, a mishmash of misunderstanding Diaspora & libcurl was the culprit. :( – M12 Feb 18 '15 at 15:33
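The last comment pins the bug on cookies being set and replaced on every response, while the question's `readCookie()` only keeps the first `Set-Cookie` line of a single response. A minimal sketch of collecting every `Set-Cookie` pair from raw response text follows; `readAllCookies()` is a hypothetical helper name, not something from the original code, and it deliberately ignores cookie attributes such as path, domain, and expiry.

```php
<?php
// Hypothetical helper (not part of the original code): collect every
// "name=value" pair from the Set-Cookie lines of a raw HTTP response
// (headers included) and join them into a string usable with CURLOPT_COOKIE.
function readAllCookies($rawResponse) {
    $m = array();
    // One match per Set-Cookie line; capture only the text before the first ";".
    preg_match_all('/^Set-Cookie:\s*([^;\r\n]+)/mi', $rawResponse, $m);

    $pairs = array();
    foreach ($m[1] as $pair) {
        $parts = explode('=', $pair, 2);
        if (count($parts) !== 2) {
            continue; // skip malformed entries without "="
        }
        $name = trim($parts[0]);
        // A later cookie with the same name replaces the earlier value.
        $pairs[$name] = $name . '=' . $parts[1];
    }
    return implode('; ', $pairs);
}
```

The returned string could be passed to `curl_setopt($ch, CURLOPT_COOKIE, ...)` on the next request, and enabling `CURLOPT_VERBOSE` on both variants makes it possible to compare the `Cookie:` headers that are actually sent.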

1 Answer

You probably did not notice the difference between CURLOPT_COOKIE and CURLOPT_COOKIELIST/FILE/JAR. Both handle cookies, but CURLOPT_COOKIE does not store the cookies you set in memory, nor does it write them to the cookie file specified by CURLOPT_COOKIEJAR; CURLOPT_COOKIELIST does.

libcurl has a mechanism called the cookie engine. It is enabled when you set any one of CURLOPT_COOKIELIST/FILE/JAR; libcurl then takes care of sending, parsing, reading, and storing cookies for all subsequent requests in the session.

CURLOPT_COOKIE is just a quick hack to set an extra cookie for one go.
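As a rough sketch of how the cookie engine might be used without touching the filesystem: passing an empty string to `CURLOPT_COOKIEFILE` enables the engine without reading any file, and `curl_getinfo()` with `CURLINFO_COOKIELIST` returns the known cookies as Netscape-format lines that could be saved to a database. This assumes a PHP build that exposes `CURLOPT_COOKIELIST` and `CURLINFO_COOKIELIST` (PHP 5.5 or later, to my knowledge). The helper names `cookieListToMap()` and `fetchWithCookieEngine()` are hypothetical.

```php
<?php
// Hypothetical helper: turn Netscape-format cookie lines (as returned by
// CURLINFO_COOKIELIST) into a name => value map.
function cookieListToMap(array $lines) {
    $cookies = array();
    foreach ($lines as $line) {
        // Seven tab-separated fields: domain, subdomain flag, path,
        // secure flag, expiry, name, value.
        $fields = explode("\t", $line);
        if (count($fields) === 7) {
            $cookies[$fields[5]] = $fields[6];
        }
    }
    return $cookies;
}

// Hypothetical helper: make one request with the cookie engine enabled in
// memory only, optionally re-importing cookie lines saved from an earlier call.
function fetchWithCookieEngine($url, array $savedLines = array()) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Empty string: enable the cookie engine without reading a file.
    curl_setopt($ch, CURLOPT_COOKIEFILE, '');
    foreach ($savedLines as $line) {
        // Feed previously stored cookies back into the engine.
        curl_setopt($ch, CURLOPT_COOKIELIST, $line);
    }
    $body = curl_exec($ch);
    // All cookies the engine currently knows, one Netscape-format line each.
    $lines = curl_getinfo($ch, CURLINFO_COOKIELIST);
    curl_close($ch);
    return array($body, $lines);
}
```

The lines returned by one call to `fetchWithCookieEngine()` could be stored as-is and passed as `$savedLines` to the next call, so each request starts from the cookies the previous response left behind.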

Murphy Meng
  • Thank you for this. One clarifying question: I do not see `CURLOPT_COOKIELIST` in the PHP manual, although I do see [a clear and helpful explanation for it in the cURL API docs](http://curl.haxx.se/libcurl/c/CURLOPT_COOKIELIST.html), which I believe is why I "missed" it. Does PHP not have support for `CURLOPT_COOKIELIST`? That sounds like exactly what I needed, as it seems easier than using `CURLOPT_COOKIE` to manually manage cookies myself on each HTTP request I make. Again, thank you for the pointer to this. – M12 Feb 26 '15 at 00:06