How can I QUICKLY get a string from one of the first couple lines of a long CSV at a remote URL?

Question

I'm working on an assignment where I retrieve several stock prices online using Yahoo's stock price system. Unfortunately, the Yahoo API I'm required to use returns a .csv file that apparently contains a line for every single day the stock has been traded, which is at least 5 thousand lines for the stocks I'm working with, and over 10 thousand lines for some of them (example).

I only care about the current price, though, which is in the second line.

I'm currently doing this:

require 'open-uri'
def get_ticker_price(stock)
   open("http://ichart.finance.yahoo.com/table.csv?s=#{stock}") do |io|
      io.read.split(',')[10].to_f
   end
end

…but it's really slow.

1. Is all the delay coming from getting the file, or is there some from the way I'm handling it? Is io.read reading the entire file?
2. Is there a way to download only the first couple lines from the Yahoo CSV file?
3. If the answers to questions 1 & 2 don't render this one irrelevant, is there a better way to process it that doesn't require looking at the entire file (assuming that's what io.read is doing)?

This sounds suspiciously like: http://stackoverflow.com/questions/1120350/how-to-download-via-http-only-piece-of-big-file-with-ruby – Jerdak, Jul 16 '12 at 03:27
The way `open` seems to work is that it first saves the downloaded page to a temp file and then passes that Tempfile IO object to the given block. I.e., `open("http://...") { |io| puts File.read(io.path) }` outputs the contents of the downloaded page. So the `open` method downloads the entire file before it even gets to your block. Unfortunately, I don't know how to partially download a file (I've never needed to do that before), so I can't answer 2 or 3; however, I'm pretty sure you won't be able to use `open` to do this. – David Miani, Jul 16 '12 at 03:29
You can reduce the file size by specifying the last trade date in the query string, if you use the quotes service. Example: http://finance.yahoo.com/d/quotes.csv?s=MO&f=snd1l1yr If you can use this other service, there is more info here: http://greenido.wordpress.com/2009/12/22/yahoo-finance-hidden-api/ – Tim, Jul 16 '12 at 03:38
Actually, I found a better reference for the service you are using. Here is an example of getting just today's data: http://ichart.finance.yahoo.com/table.csv?s=MO&a=06&b=13&c=2012&d=6&e=13&f=2012&g=d – Tim, Jul 16 '12 at 03:53

score 3 · Accepted Answer · answered Jul 16 '12 at 03:55

You can restrict the response to a single day by passing date-range parameters in the query string.

Example for MO on 7/13/2012 (the start and end months are zero-indexed, 00 to 11, so July is 06):

http://ichart.finance.yahoo.com/table.csv?s=MO&a=06&b=13&c=2012&d=6&e=13&f=2012&g=d
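
The URL above could be built in Ruby along these lines. This is only a minimal sketch: `single_day_url` is a hypothetical helper name, and the meaning of each parameter (a/b/c as start month/day/year, d/e/f as end month/day/year) is inferred from the example URL rather than from official documentation. It zero-pads both month values, whereas the example above pads only the first.

```ruby
require 'date'
require 'uri'

# Hypothetical helper: builds a table.csv URL whose date range covers one day.
# Parameter meanings are assumed from the example URL in this answer.
def single_day_url(symbol, date)
  month = format('%02d', date.month - 1) # months are zero-indexed (00..11)
  query = URI.encode_www_form(
    s: symbol,
    a: month, b: date.day, c: date.year, # range start: month, day, year
    d: month, e: date.day, f: date.year, # range end: month, day, year
    g: 'd'                               # value taken from the example URL
  )
  "http://ichart.finance.yahoo.com/table.csv?#{query}"
end

puts single_day_url('MO', Date.new(2012, 7, 13))
```

Passing the result to the `open` call from the question should then fetch only the header line plus the rows in that range, so the existing `split(',')[10]` indexing would presumably keep working as long as the day requested has a data row.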

API description here: http://etraderzone.com/free-scripts/47-historical-quotes-yahoo.html

answered Jul 16 '12 at 03:55

Tim

Thanks, Tim! This solved my problem, but I'm going to hold off on accepting it for a day or two to see if anyone else can come up with a Ruby way of doing it, since that was *technically* the question asked. :) – Oblivious Sage Jul 16 '12 at 04:54
The first comment to your question links to another SO question where that is answered. – Lars Haugseth Jul 16 '12 at 13:29
@LarsHaugseth: Both of the answers in that question entail low-level socket manipulation, which, as the author of one of the answers pointed out, isn't really a very Ruby way to solve the problem. Since that question is 3 years old, I figured it might be worth waiting a bit to see if there's a cleaner solution available now (several Ruby versions later). – Oblivious Sage Jul 16 '12 at 16:35
Looks like the server on which the service is running does not support the "Range" HTTP request header, so you're probably out of luck doing this with pure HTTP libraries. – Lars Haugseth Jul 16 '12 at 22:06
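
If the server does not honor Range, one alternative that stays within Ruby's standard library is to stream the response with Net::HTTP and stop reading once the second line has arrived. The following is a rough sketch, not a tested solution against the Yahoo service: `fetch_first_lines`, `complete_lines`, and `closing_price` are hypothetical helper names, and the column order (Date, Open, High, Low, Close, Volume, Adj Close) is an assumption that matches the index 10 used in the question's code.

```ruby
require 'net/http'
require 'uri'

# Returns the first `count` lines once the buffer holds that many complete
# lines, or nil if more data is still needed.
def complete_lines(buffer, count)
  return nil if buffer.count("\n") < count
  buffer.lines.first(count).map(&:chomp)
end

# Assumed column order: Date,Open,High,Low,Close,Volume,Adj Close.
def closing_price(row)
  row.split(',')[4].to_f
end

# Hypothetical helper: streams the body chunk by chunk and returns as soon as
# `count` lines have arrived. Returning from inside the block skips the rest
# of the body, and Net::HTTP.start closes the connection on the way out.
def fetch_first_lines(url, count = 2)
  uri = URI(url)
  buffer = String.new
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_get(uri.request_uri) do |response|
      response.read_body do |chunk|
        buffer << chunk
        lines = complete_lines(buffer, count)
        return lines if lines
      end
    end
  end
  buffer.lines.first(count).map(&:chomp) # body ended before `count` lines
end
```

Usage would look something like `_header, latest = fetch_first_lines("http://ichart.finance.yahoo.com/table.csv?s=MO")` followed by `closing_price(latest)`. How much time this saves depends on how quickly the server starts sending data; the client simply stops reading early rather than asking for less.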
