
I am writing a Python application that queries social media APIs via cURL. For most of the servers I query (Google+, Reddit, Twitter, Facebook, and others), cURL complains:

additional stuff not fine transfer.c:1037: 0 0

The unusual thing is that when the application first starts, each service's response will trigger this line once or twice. After a few minutes, the line will appear several times. Obviously cURL is identifying something that it doesn't like. After about half an hour, the servers begin to time out and this line is repeated many tens of times, so it is pointing at a real problem.

How might I diagnose this? I tried using Wireshark to capture the request and response headers to search for anomalies that might cause cURL to complain, but for all Wireshark's complexity there does not seem to be a way to isolate and display only the headers.

Here is the relevant part of the code:

import cStringIO
import pycurl

def query(url):  # illustrative wrapper; in the real application this is the body of a larger function
    output = cStringIO.StringIO()

    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0')
    c.setopt(c.WRITEFUNCTION, output.write)  # collect the response body in the buffer
    c.setopt(c.CONNECTTIMEOUT, 10)           # seconds allowed to establish the connection
    c.setopt(c.TIMEOUT, 15)                  # seconds allowed for the whole transfer
    c.setopt(c.FAILONERROR, True)            # treat HTTP status codes >= 400 as errors
    c.setopt(c.NOSIGNAL, 1)                  # avoid signals so timeouts work safely in threaded code

    try:
        c.perform()
        toReturn = output.getvalue()
        output.close()
        return toReturn

    except pycurl.error, error:
        errno, errstr = error
        print 'The following cURL error occurred: ', errstr
dotancohen
  • Are you sure this is something they're actually returning in the headers, not, say, a warning that cURL is just printing to `stderr` or `syslog` or whatever in the middle of you logging the headers? (Especially since transfer.c is exactly the file I'd expect to see curl logging something like this…) You may need to show us the actual code you're using, and tell us the versions of libcurl and whichever Python wrapper you're using. – abarnert Dec 18 '12 at 23:44
  • Thanks abarnert. As the lines do begin with `*` and not `<`, I did also think that they were not part of the header itself. I updated the question. – dotancohen Dec 18 '12 at 23:57
  • I think you're already clear on this, and just didn't update the entire question, but just in case: the reason you can't isolate this message in Wireshark is that it never goes over the wire; it's just printed out locally. – abarnert Dec 19 '12 at 00:08
  • I'm not trying to isolate the message in Wireshark, but rather the entire request and response headers, to look for anomalies. – dotancohen Dec 19 '12 at 00:09
  • Oh, for that, you don't even need Wireshark—just write all of the headers to a log, from inside your app. That way, you can get things in any format you want, without having to worry about connecting up corresponding requests and responses after the fact, etc. – abarnert Dec 19 '12 at 00:16
  • How can I output the actual headers, and not what I think the headers are? Note that I don't see anything unusual when outputting with `VERBOSE`. – dotancohen Dec 19 '12 at 00:19
  • Well, you can do a quick spot check with `wireshark` to verify that what `libcurl` thinks the headers are matches what's actually on the wire, then log the headers from `libcurl`. (To give you actual code would require knowing which `curl` wrapper you're using, but I'm sure you can read the docs as well as I can anyway.) – abarnert Dec 19 '12 at 00:24
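As abarnert suggests in the comments above, with pycurl the headers libcurl actually sends and receives (and its informational messages) can be logged from inside the app via a debug callback. A minimal sketch, assuming the pycurl wrapper used in the question; the URL is a placeholder:

import pycurl

def debug_callback(debug_type, debug_msg):
    # INFOTYPE_HEADER_OUT = request headers sent, INFOTYPE_HEADER_IN = response headers received,
    # INFOTYPE_TEXT = informational messages such as "additional stuff not fine"
    if debug_type in (pycurl.INFOTYPE_HEADER_OUT, pycurl.INFOTYPE_HEADER_IN):
        print 'HEADER:', debug_msg.rstrip()
    elif debug_type == pycurl.INFOTYPE_TEXT:
        print 'INFO:', debug_msg.rstrip()

c = pycurl.Curl()
c.setopt(c.URL, 'http://example.com/')  # placeholder URL
c.setopt(c.VERBOSE, True)               # the debug callback is only invoked when VERBOSE is on
c.setopt(c.DEBUGFUNCTION, debug_callback)
c.perform()
c.close()

Writing these lines to a log file instead of stdout makes it easy to compare them against a Wireshark capture afterwards.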

3 Answers


I'm 99.99% sure this is not actually in any HTTP headers, but is rather being printed to stderr by libcurl. Possibly this happens in the middle of you logging the headers, which is why you were confused.

Anyway, a quick search for "additional stuff not fine" curl transfer.c turned up a recent change in the source where the description is:

Curl_readwrite: remove debug output

The text "additional stuff not fine" text was added for debug purposes a while ago, but it isn't really helping anyone and for some reason some Linux distributions provide their libcurls built with debug info still present and thus (far too many) users get to read this info.

So, this is basically harmless, and the only reason you're seeing it is that you got a build of libcurl (probably from your Linux distro) that had full debug logging enabled (despite the curl author thinking that's a bad idea). So you have three options (a quick way to check which libcurl build you are actually linked against is sketched after this list):

  1. Ignore it.
  2. Upgrade to a later version of libcurl.
  3. Rebuild libcurl without debug info.
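For the version check mentioned above, a minimal sketch with pycurl (the debug line was reportedly removed upstream in libcurl 7.28.1, per the comment below):

import pycurl

print pycurl.version          # e.g. 'PycURL/7.19.0 libcurl/7.27.0 ...'
print pycurl.version_info()   # tuple including the libcurl version string and feature flags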

You can look at the libcurl source for transfer.c (as linked above) to try to understand what curl is complaining about, and possibly look for threads on the mailing list for around the same time—or just email the list and ask.

However, I suspect that may not actually be relevant to the real problem at all, given that you're seeing this even right from the start.

There are three obvious things that could be going wrong here:

  1. A bug in curl, or the way you're using it.
  2. Something wrong with your network setup (e.g., your ISP cuts you off for making too many outgoing connections or using too many bytes in 30 minutes).
  3. Something you're doing is making the servers think you're a spammer/DoS attacker/whatever and they're blocking you.

The first one actually seems the least likely. If you want to rule it out, just capture all of the requests you make, and then write a trivial script that uses some other library to replay the exact same requests, and see if you get the same behavior. If so, the problem obviously can't be in the implementation of how you make your requests.
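For example, a throwaway replay using only the standard library's urllib2 (a sketch; the URL is a placeholder, and the headers should be copied from a captured request) could look like this:

import urllib2

url = 'https://example.com/api/endpoint'  # placeholder: substitute a captured request URL
request = urllib2.Request(url, headers={
    # copy the exact headers the pycurl code sent
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0',
})

try:
    response = urllib2.urlopen(request, timeout=15)
    print response.getcode(), len(response.read())
except urllib2.URLError, error:
    print 'Replay failed:', error

If the replayed requests start timing out on the same schedule, the pycurl code itself is off the hook.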

You may be able to distinguish between cases 2 and 3 based on the timing. If all of the services time out at once—especially if they all do so even when you start hitting them at different times (e.g., you start hitting Google+ 15 minutes after Facebook, and yet they both time out 30 minutes after you hit Facebook), it's definitely case 2. If not, it could be case 3.

If you rule out all three of these, then you can start looking for other things that could be wrong, but I'd start here.

Or, if you tell us more about exactly what your app does (e.g., do you try to hit the servers over and over as fast as you can? do you try to connect on behalf of a slew of different users? are you using a dev key or an end-user app key? etc.), it might be possible for someone else with more experience with those services to guess.

abarnert
  • Thank you, I updated the question in light of the fact that this is in fact a cURL message. However, when the message starts showing up, the connections start timing out. Therefore I would like to know what is throwing them, to solve the timeout issue. Note that the timeout issue occurs even if `VERBOSE` is not enabled and I don't actually see the message. – dotancohen Dec 18 '12 at 23:58
  • Thanks. Stopping and restarting the application does eliminate the problem for a few minutes, so I suspect that I'm actually sending bad request headers to begin with. I only hit each server once per minute. It looks like they all start timing out at about the same time, but in all cases the number of times that the message is printed increases from once when the application is first started to tens of times when the servers are timing out. – dotancohen Dec 19 '12 at 00:19
  • @dotancohen: Does stopping it and _immediately_ restarting it eliminate the problem for a while, or is it only, say, giving it a 60-second break that makes a difference? If it's the former, you could be leaking `curl` handles or sockets or something… – abarnert Dec 19 '12 at 00:25
  • Note that there were hundreds of posts on this debug line in the [curl mailing list](http://curl.haxx.se/mail/) and that it has been removed in release 7.28.1 (November 20 2012), as stated in the [curl changelog](http://curl.haxx.se/changes.html). Of course, not having the spurious message doesn't solve your timeout, but you (@dotancohen) should use the latest 7.29 release. – marcz Feb 13 '13 at 09:36
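If leaked handles or sockets turn out to be the issue abarnert raises above, one option (a sketch, not the code from the question) is to create a single Curl handle once, reuse it for every request, and always release the buffer:

import cStringIO
import pycurl

curl_handle = pycurl.Curl()  # created once and reused for every request

def fetch(url):
    output = cStringIO.StringIO()
    curl_handle.setopt(pycurl.URL, url)
    curl_handle.setopt(pycurl.WRITEFUNCTION, output.write)
    curl_handle.setopt(pycurl.CONNECTTIMEOUT, 10)
    curl_handle.setopt(pycurl.TIMEOUT, 15)
    curl_handle.setopt(pycurl.NOSIGNAL, 1)
    try:
        curl_handle.perform()
        return output.getvalue()
    except pycurl.error, error:
        errno, errstr = error
        print 'The following cURL error occurred: ', errstr
        return None
    finally:
        output.close()  # the buffer is released even when perform() fails

Reusing the handle also lets libcurl keep connections alive between requests, which cuts down the number of new sockets opened per polling cycle.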

I disagree with this: I get the same message when attempting to call a website via a BIG-IP LTM external VIP address.

For example:

I call the website http://115.10.10.10/index.html (the IP address is random in this case). The BIG-IP F5 should be load balancing the traffic to two internal web servers (172.20.0.10 and 172.20.0.11) via a pool associated with the virtual server.

In this case, the request coming from the external source (internal client) to the VIP address on TCP 80 should round-robin between the two web servers. What I find is that the servers receive an initial SYN packet but a SYN-ACK never comes back.

If I sit on a terminal within the local subnet where the real servers reside, I can `wget` the index.html page from 172.20.0.11 to http://172.20.0.10/index.html.

Coming from external, I get the `additional stuff not fine transfer.c:1037: 0 0` message.
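One quick way to separate a TCP-level problem from an HTTP-level one (a sketch; the addresses are the illustrative ones from this answer, and port 80 is assumed) is to test the handshake directly from the external client, bypassing curl:

import socket

for host in ('115.10.10.10', '172.20.0.10', '172.20.0.11'):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)
    try:
        s.connect((host, 80))
        print host, 'TCP handshake completed'
    except socket.error, error:
        print host, 'TCP handshake failed:', error
    finally:
        s.close()

If the connect to the VIP never completes while the real servers answer from inside the subnet, the problem lies in the load-balancing path rather than in curl.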

You are right in saying that it's a built-in debug mechanism for cURL in older revisions of the libcurl library, but I disagree with the statements below:

  1. A bug in curl, or the way you're using it.
  2. Something wrong with your network setup (e.g., your ISP cuts you off for making too many outgoing connections or using too many bytes in 30 minutes).
  3. Something you're doing is making the servers think you're a spammer/DoS attacker/whatever and they're blocking you.

Whatever is causing this is due to a networking issue within the environment, i.e. either the web servers cannot return the traffic back to the original source, and curl hence displays this error, or there is something wrong with the request headers and the response coming back from the web server.

In this case I would opt to say that the first explanation is more likely, as when I performed a curl with different URIs on the original request from a test host in the local subnet, I could retrieve the index.html page fine. This implies that the server is listening and accepting connections on both the FQDN and the short name of the server.

I believe that this error is there to suggest that curl received a response it is unsure about, and it therefore produces the message above. Without developing curl or reading the source code, I cannot comment further.

Any additional response that questions this logic would be welcome; I'm all for learning new things.

Andy

Andrew
    Hi Andrew, welcome to Stack Overflow! You should know that your message was posted as an answer to the original question, but by its content it seems to be a reply to the previous answer. You should use the `add comment` feature for replying to an existing answer. Thanks! – dotancohen Apr 21 '13 at 09:43
  • @dotancohen look at the size of this post, it's over 2000 characters long. If comments allowed 2000+ chars, he might have. But as it stood in 2014, it's a max of ~500 characters for a comment. – hanshenrik Sep 06 '15 at 00:00

Confirming:

A bug in curl, or the way you're using it.

System info: Linux alt 3.2.0-4-amd64 #1 SMP Debian 3.2.63-2+deb7u1 x86_64 GNU/Linux

I've updated the curl library, and the continuous messages (which were caught while testing against the Twitter REST API)

  • additional stuff not fine transfer.c:1037: 0 0

have disappeared.

My newly updated curl --version output:

$ curl -V
curl 7.38.0 (x86_64-pc-linux-gnu) libcurl/7.38.0 OpenSSL/1.0.1e zlib/1.2.7 libidn/1.25 libssh2/1.4.3 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API SPNEGO NTLM NTLM_WB SSL libz TLS-SRP

Arij