UPDATE: See the new code at the end of this answer. This was actually pretty easy to do with cURL, but I went about it incorrectly the first time.
I wasn't able to get the Twitter stream API to work using cURL in conjunction with a CURLOPT_READFUNCTION, but have had success using fsockopen() and fread(). I have used the read function before with success, so I wasn't sure why it didn't work here; at first I assumed it had something to do with the response data being "streaming" and not sent using HTTP chunked encoding. Essentially, my read function never got called, so I couldn't process the data. (In hindsight, CURLOPT_READFUNCTION supplies data for cURL to send, such as an upload body; incoming response data goes to CURLOPT_WRITEFUNCTION.)
The method I used that is working now:
- Connect using fsockopen to ssl://stream.twitter.com
- Issue the basic HTTP request for stream data using fputs
- Consume the HTTP response headers and make sure there were no errors
- Read a chunk of data using fread in an infinite loop
- Each time a chunk of data is read, call an internal buffer function
- The buffer function appends the new data to a buffer
- The buffer function then tries to process all messages in the buffer (if it holds one or more complete messages)
- As it processes each message, the buffer shrinks until no complete message is left; the function then returns and more data is read
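The steps above could be sketched roughly as follows. This is not my original code, just a minimal illustration. The helper names drainBuffer() and consumeStream() are hypothetical, and the sketch assumes that messages are terminated by "\r\n" and that the request is sent as HTTP/1.0 so the body is not chunk-encoded.

```php
<?php
/**
 * Hypothetical helper: removes every complete message from $buffer and hands
 * it to $onMessage. Assumes each message ends with "\r\n".
 */
function drainBuffer(string &$buffer, callable $onMessage): void
{
    while (($pos = strpos($buffer, "\r\n")) !== false) {
        $message = substr($buffer, 0, $pos);
        $buffer  = (string) substr($buffer, $pos + 2);
        if ($message !== '') {      // skip blank keep-alive lines
            $onMessage($message);
        }
    }
}

/**
 * Hypothetical helper: checks the status line, skips the response headers,
 * then reads the body in chunks until EOF, draining the buffer after each read.
 */
function consumeStream($fp, callable $onMessage, int $chunkSize = 8192): void
{
    $status = fgets($fp);
    if ($status === false || !preg_match('#^HTTP/\d\.\d 200\b#', $status)) {
        throw new RuntimeException('Unexpected response: ' . trim((string) $status));
    }
    // Headers end at the first empty line.
    while (($line = fgets($fp)) !== false && rtrim($line, "\r\n") !== '') {
        // header lines are ignored in this sketch
    }

    $buffer = '';
    while (!feof($fp)) {
        $chunk = fread($fp, $chunkSize);
        if ($chunk === false) {
            break;                  // read error
        }
        if ($chunk === '') {
            continue;               // nothing read this time (timeout or EOF)
        }
        $buffer .= $chunk;
        drainBuffer($buffer, $onMessage);
    }
}

// Usage against a live socket (not run here; credentials are placeholders):
// $fp = fsockopen('ssl://stream.twitter.com', 443, $errno, $errstr, 30);
// fputs($fp, "GET /1/statuses/filter.json?track=nike HTTP/1.0\r\n"
//          . "Host: stream.twitter.com\r\n"
//          . "Authorization: Basic " . base64_encode("$user:$pass") . "\r\n\r\n");
// consumeStream($fp, function ($json) { var_dump(json_decode($json)); });
```

Because consumeStream() only needs a stream resource, it can be tried out on a php://memory stream holding a canned response before pointing it at a real connection. An incomplete trailing message simply stays in the buffer until more data arrives.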
I've had it running for a couple of hours now and haven't had a dropped connection and I've processed over 30,000 messages with no errors yet.
Basically, I implemented a callback system: each time a full message is read from the buffer, the user-defined callback is called with the JSON message so the application can do whatever it needs to with it (e.g. insert it into a database).
I don't have any short snippets to post here yet, but if you want, message me by going to the website listed on my profile and filling in the contact form and I'd be happy to share. Maybe we can work together if anyone is interested. I only did this for fun, I have no interest in Twitter and am not using it for financial reasons. I'll put it on GitHub eventually perhaps.
EDIT:
Here is some cURL code that will connect to the streaming API and pass the JSON messages to a callback function as they are available. This example uses gzip encoding to save bandwidth.
<?php
$USERNAME = 'youruser';
$PASSWORD = 'yourpass';
$QUERY = 'nike';
/**
 * Called every time cURL delivers a chunk of response data. This is typically
 * one JSON-encoded message, though cURL does not guarantee that chunk
 * boundaries line up with message boundaries.
 *
 * @param resource $handle The curl handle
 * @param string   $data   The data chunk (JSON message)
 * @return int Number of bytes handled; cURL aborts the transfer if this differs from strlen($data)
 */
function writeCallback($handle, $data)
{
    /*
    echo "-----------------------------------------------------------\n";
    echo $data;
    echo "-----------------------------------------------------------\n";
    */
    $json = json_decode($data);
    if (isset($json->user) && isset($json->text)) {
        echo "@{$json->user->screen_name}: {$json->text}\n\n";
    }
    return strlen($data);
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://stream.twitter.com/1/statuses/filter.json?track=' . urlencode($QUERY));
curl_setopt($ch, CURLOPT_USERPWD, "$USERNAME:$PASSWORD");
curl_setopt($ch, CURLOPT_WRITEFUNCTION, 'writeCallback');
curl_setopt($ch, CURLOPT_TIMEOUT, 20); // disconnect after 20 seconds for testing
curl_setopt($ch, CURLOPT_VERBOSE, 1); // debugging
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate'); // req'd to get gzip
curl_setopt($ch, CURLOPT_USERAGENT, 'tstreamer/1.0'); // req'd to get gzip
curl_exec($ch); // commence streaming
$info = curl_getinfo($ch);
var_dump($info);
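The write callback above decodes each chunk directly, which works as long as every chunk happens to hold exactly one message. A more defensive variant might buffer the chunks and only decode complete lines. The sketch below is one way to do that; makeBufferedWriter() is a hypothetical helper name, and it assumes messages are terminated by "\r\n".

```php
<?php
/**
 * Hypothetical helper: returns a closure usable as CURLOPT_WRITEFUNCTION that
 * buffers incoming chunks and calls $onMessage once per complete line.
 * Assumes each message ends with "\r\n".
 */
function makeBufferedWriter(callable $onMessage): Closure
{
    $buffer = '';
    return function ($handle, string $data) use (&$buffer, $onMessage): int {
        $buffer .= $data;
        while (($pos = strpos($buffer, "\r\n")) !== false) {
            $line   = substr($buffer, 0, $pos);
            $buffer = (string) substr($buffer, $pos + 2);
            if ($line !== '') {     // skip blank keep-alive lines
                $onMessage($line);
            }
        }
        return strlen($data);       // tell cURL the whole chunk was consumed
    };
}

// Usage with the handle from the example above (not run here):
// curl_setopt($ch, CURLOPT_WRITEFUNCTION, makeBufferedWriter(function ($line) {
//     $json = json_decode($line);
//     if (isset($json->user) && isset($json->text)) {
//         echo "@{$json->user->screen_name}: {$json->text}\n\n";
//     }
// }));
```

The closure keeps its buffer between calls, so a message split across two chunks is delivered to the callback only once its terminator arrives.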