I'm attempting to consume the Gnip PowerTrack API which requires me to connect to an HTTPS stream of JSON with basic auth. I feel like this should be fairly trivial so I'm hoping some rubyist who is smarter than me can point out my obvious mistake.

Here are the relevant parts of my Ruby 1.9.3 code:

require 'eventmachine'
require 'em-http'
require 'json'

usage = "#{$0} <user> <password>"
abort usage unless user = ARGV.shift
abort usage unless password = ARGV.shift
GNIP_STREAMING_URL = 'https://stream.gnip.com:443/foo/bar/prod.json'

http = EM::HttpRequest.new(GNIP_STREAMING_URL)
EventMachine.run do
  s = http.get(
    :head => {
      'Authorization'   => [user, password], # em-http turns this pair into basic auth
      'accept'          => 'application/json',
      'Accept-Encoding' => 'gzip,deflate'
    },
    :keepalive          => true,
    :connect_timeout    => 0,
    :inactivity_timeout => 0
  )

  buffer = ""
  s.stream do |chunk|
    buffer << chunk
    while line = buffer.slice!(/.+\r?\n/)
      puts JSON.parse(line)
    end
  end
end

The stream connects (my Gnip dashboard reports a connection) but then just buffers and never outputs anything. In fact, it seems like it never enters the s.stream block. Note that this is a gzip-encoded stream.

Note that this works:

curl --compressed -uusername $GNIP_STREAMING_URL

EDIT: I'm sure this is kinda implicit, but I can't give out any login creds or the actual URL, so don't ask ;)

EDIT #2: yajl-ruby would probably work if I could figure out how to encode credentials for the URL (simple URL encoding doesn't seem to work as I fail authentication with Gnip).

EDIT #3: @rweald found that em-http does not support streaming gzip, I've created a GitHub issue here.

EDIT #4: I have forked and fixed this in em-http-request, you can point at my fork if you want to use em-http this way. The patch has been merged into the maintainer's repo and will be working in the next release.

EDIT #5: My fixes have been published in em-http-request 1.0.3, so this should no longer be an issue.

Eric Wendelin

4 Answers

2

The problem lies within em-http-request. If you look at https://github.com/igrigorik/em-http-request/blob/master/lib/em-http/decoders.rb you will notice that the GZIP decompressor cannot do streaming decompression :( https://github.com/igrigorik/em-http-request/blob/master/lib/em-http/decoders.rb#L100

You would need to fix the underlying streaming gzip problem if you wanted to read a compressed stream using em-http-request.
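For what it's worth, Ruby's standard Zlib can inflate a gzip stream piece by piece, so a streaming decoder is feasible. Here is a minimal sketch of the idea. It is not the em-http-request code, and the StreamingGunzip class and its feed method are names I made up for illustration:

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of em-http-request): inflates a gzip
# stream chunk by chunk instead of waiting for the whole body.
class StreamingGunzip
  def initialize
    # MAX_WBITS + 16 tells zlib to expect a gzip header and trailer.
    @inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 16)
  end

  # Feed one compressed chunk; returns whatever plain text zlib can
  # produce so far (possibly an empty string).
  def feed(chunk)
    @inflater.inflate(chunk)
  end
end

# Simulate a chunked gzip response: compress two JSON lines, then
# hand the compressed bytes to the decoder five bytes at a time.
io = StringIO.new
gz = Zlib::GzipWriter.new(io)
gz.write(%({"id":1}\n{"id":2}\n))
gz.close

decoder = StreamingGunzip.new
output = String.new
io.string.bytes.each_slice(5) do |bytes|
  output << decoder.feed(bytes.pack('C*'))
end
```

In an em-http stream callback you would call feed on every chunk and append the result to the line buffer before splitting on newlines.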

rweald
  • Nice find! Maybe we fix the em-http gem. If not, is there a way to use yajl-ruby or curb to keep the connection alive and then try reconnecting in an exponential backoff pattern? – Eric Wendelin Feb 22 '12 at 14:02
  • Yeah so I actually found a workaround yesterday that will allow streaming gzip json from GNIP. I am working on cleaning up the code now and you will be able to see it in my github project https://github.com/rweald/gnip-stream – rweald Feb 23 '12 at 16:00
  • I think I should be able to generalize the fix as well so that it could be added as a patch to em-http-request. I will have a look over the weekend. – rweald Feb 23 '12 at 16:01
  • Only problem with taking yajl-ruby's impl is that they don't want native extensions, but it looks like you're kicking ass on gnip-stream so thanks :) - Let me know if I can help any – Eric Wendelin Feb 23 '12 at 17:38
  • Bounty awarded really just due to all the great work you're doing on gnip-stream :) Will use curb in the interim. Thanks! – Eric Wendelin Feb 28 '12 at 15:13
1

I have been using some code based on this Gist to connect to the Gnip console: https://gist.github.com/1468622

Wizard of Ogz
0

It looks like using https://github.com/brianmario/yajl-ruby would solve this nicely.

  • It did look promising, but I can't figure out how to encode the username and password such that I don't get this error: "lib/ruby/1.9.1/uri/generic.rb:411:in `check_user': bad component(expected userinfo component or user component)" – Eric Wendelin Feb 22 '12 at 02:07
  • This actually won't help either. If you look at the yajl-ruby code for http_request you will notice that it only supports gzip if the response is not "Chunked", which the GNIP response is. https://github.com/brianmario/yajl-ruby/blob/master/lib/yajl/http_stream.rb#L160 – rweald Feb 23 '12 at 16:02
0

Gnip suggested I use curb and here's what I came up with from their example:

require 'rubygems'
require 'curb'

# Usage: <script> username password url
# prints data to stdout.
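usage = "#{$0} <user> <password> <url>"
# Bail out with the usage message unless all three arguments are given.
abort usage unless ARGV.size >= 3
username, password, url = ARGV.first(3)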

Curl::Easy.http_get url do |c|
  c.http_auth_types = :basic
  c.username = username
  c.password = password
  c.encoding = 'gzip'
  c.on_body do |data|
    puts data
    data.size # required by curl's api.
  end
end

Though I would like something that will reconnect when the connection is dropped and handle different types of failures gracefully.
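One way to get the reconnect behaviour would be to wrap the request in a retry loop with exponential backoff. This is only a rough sketch: with_reconnect and its options are names I invented, and it assumes the HTTP client raises an exception when the connection fails or drops.

```ruby
# Hypothetical helper: runs the block and, if it raises, waits and
# retries with a delay that doubles each time (capped at :max_delay).
# Options hash instead of keyword arguments so it also runs on 1.9.3.
def with_reconnect(options = {})
  max_attempts = options.fetch(:max_attempts, 5)
  base_delay   = options.fetch(:base_delay, 1.0)
  max_delay    = options.fetch(:max_delay, 60.0)
  # The sleeper is injectable so the delays can be tested without waiting.
  sleeper      = options.fetch(:sleeper, Kernel.method(:sleep))

  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError => e
    raise if attempt >= max_attempts # give up and re-raise the last error
    delay = [base_delay * (2**(attempt - 1)), max_delay].min
    warn "connection failed (#{e.class}: #{e.message}), retrying in #{delay}s"
    sleeper.call(delay)
    retry
  end
end

# Usage idea (not run here): put the curb request inside the block.
#
#   with_reconnect(:max_attempts => 10) do
#     Curl::Easy.http_get(url) { |c| ... }
#   end
```

This does not reset the attempt counter after a long successful connection and does not distinguish authentication failures from network failures, so treat it as a starting point only.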

Eric Wendelin