I don't think this little piece of code I'm working on has any practical application, but I'm trying to wrap my head around async and seem to be struggling a bit. Let's say I want to pull historical stock price data from Yahoo, save all the data to a single csv file, and then load it into SQL Server using bulk copy. I'm not too worried about loading the data into SQL Server, but I'm wondering how to write the data to a new csv file. Can/should it be done asynchronously?

As far as I know, there is no way to get the ticker as part of the stream when requesting the historical data, so I read the stream and map it to a new list with the ticker prepended to each item. Occasionally when I run a test, one record comes out without a ticker and another comes out with multiple tickers (e.g., "MSFT, YHOO").

So my question is: how can I write this data to a single csv file without the output from different tickers getting mixed up? Secondarily, when I split the data on newlines I get an empty trailing item. What's the best way to drop it?

Like I said, I don't know that this has any practical application, but I'm trying to learn, so I hope you'll pardon my ignorance. Thanks for any help, I really appreciate it. Here's what I have:

open System
open System.IO
open System.Web
open System.Net

let fromDate = new DateTime(2013, 1, 1)

let getTickers = 
    "MSFT" :: "YHOO" :: []

let getData (ticker : string) = 
    async {
        let url = System.String.Format("http://ichart.finance.yahoo.com/table.csv?s={0}&g=d&ignore=.csv&a={1}&b={2}&c={3}", ticker, fromDate.Month - 1, fromDate.Day, fromDate.Year)

        Console.WriteLine(url)

        let req = WebRequest.Create(url)
        let! rsp = req.AsyncGetResponse()
        use stream = rsp.GetResponseStream()
        use reader = new StreamReader(stream)

        let lines = 
            reader.ReadToEnd().Split('\n')
            |> Seq.skip 1 // skip header
            |> Seq.map (fun line-> (String.Format("{0}, {1}", ticker, line.ToString())))

        Seq.iter (fun x->printfn "%s" (x.ToString())) lines
        ()
    }

let z = 
    getTickers 
    |> List.map getData
    |> Async.Parallel
    |> Async.RunSynchronously
nickfinity
  • You can post messages to an agent and then have a single thread getting them from the agent and writing to the csv file – Gustavo Guerra Jan 10 '13 at 15:38
  • This probably belongs on [codereview](http://codereview.stackexchange.com/), but you might find this [answer to a related question](http://stackoverflow.com/a/11677368/162396) helpful. If you only want to load the data into SQL Server you don't need to write it to a file first. – Daniel Jan 10 '13 at 15:40
  • @ovatsus - I'll look into that. Know of any good examples? – nickfinity Jan 10 '13 at 15:51
  • @Daniel - Thanks for the help. I know I don't need to write to a file first. In fact, I may not even send the data to the database at all, but I'd like to get to the point where I have one file that was created from multiple streams. The answer you referenced was actually another of my questions. I tend to use pulling data from Yahoo as a learning tool, so the db in this case is mostly an afterthought. – nickfinity Jan 10 '13 at 15:52
  • Yes, I'd like to write to the file in parallel. So, let's say I'm pulling data for 1000 stocks, can I write MSFT, YHOO, and however many others to the file at the same time? I'm also not sure if my List.map call is in the correct spot, since I sometimes get goofy results. – nickfinity Jan 10 '13 at 16:07
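
The agent approach suggested in the comments could be sketched roughly as follows. This is a minimal illustration, not code from the thread: the `WriterMsg` type and the `startWriterAgent`, `postRows`, and `runDemo` names are made up for the example, and `postRows` posts fake rows in place of the real download. The point is that only the agent's loop ever touches the writer, so each line is written whole no matter how many workflows post at once.

```fsharp
open System
open System.IO

// Hypothetical message type for the writer agent (not from the original post).
type WriterMsg =
    | Line of string
    | Close of AsyncReplyChannel<unit>

// Start an agent that owns the writer; only its loop writes to it.
let startWriterAgent (writer: TextWriter) =
    MailboxProcessor<WriterMsg>.Start(fun inbox ->
        let rec loop () =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Line text ->
                    writer.WriteLine(text)
                    return! loop ()
                | Close reply ->
                    // Flush, acknowledge, and stop processing messages.
                    writer.Flush()
                    reply.Reply()
            }
        loop ())

// Stand-in for the download step: each ticker posts a few fake rows.
let postRows (agent: MailboxProcessor<WriterMsg>) (ticker: string) =
    async {
        for i in 1 .. 3 do
            agent.Post(Line(sprintf "%s, row%d" ticker i))
    }

let runDemo (writer: TextWriter) =
    let agent = startWriterAgent writer
    [ "MSFT"; "YHOO" ]
    |> List.map (postRows agent)
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore
    // Block until every line queued so far has been written.
    agent.PostAndReply(fun channel -> Close channel)
```

Calling `runDemo` with a `StreamWriter` opened on the target csv file (and disposed afterwards) would write six complete lines. The order in which the two tickers' rows interleave is not fixed, but no line is ever split.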

1 Answer

IMO, this is overdone, but hopefully it demonstrates what you want to know:

open System
open System.IO
open System.Net

let tickers = 
  [ 
    "MSFT"
    "YHOO"
  ]

let getData (writer: TextWriter) ticker =
  async {
    let url = sprintf "http://ichart.finance.yahoo.com/table.csv?s=%s&g=d&ignore=.csv" ticker
    let req = WebRequest.Create(url)
    let! resp = req.GetResponseAsync() |> Async.AwaitTask
    use stream = resp.GetResponseStream()
    use reader = new StreamReader(stream)
    let! lines = reader.ReadToEndAsync() |> Async.AwaitTask
    let lines = 
      lines.Split('\n')
        |> Seq.skip 1
        |> Seq.filter ((<>) "") //skip empty lines
    for line in lines do
      do! writer.WriteLineAsync(String.Format("{0}, {1}", ticker, line)).ContinueWith(ignore) |> Async.AwaitTask
  }

let writeAllToFile path =
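  // Wrap the writer so that writes from the parallel workflows are serialized;
  // a plain StreamWriter is not safe for concurrent use.
  use writer = TextWriter.Synchronized(new StreamWriter(path=path))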
  tickers
    |> Seq.map (getData writer)
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore

writeAllToFile @"C:\quotes.csv"
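
If the rows for all tickers fit in memory, another option is to have each workflow return its rows and write the file once at the end, so no writer is shared at all. The sketch below assumes exactly that. `toRows`, `collectRows`, and `fakeFetch` are hypothetical names, and `fakeFetch` returns canned text in place of the real web request. Splitting with `StringSplitOptions.RemoveEmptyEntries` also drops the empty trailing item mentioned in the question.

```fsharp
open System
open System.IO

// Turn one raw CSV response into ticker-prefixed rows,
// dropping the header line and any empty entries.
let toRows (ticker: string) (csv: string) =
    let lines = csv.Split([| '\r'; '\n' |], StringSplitOptions.RemoveEmptyEntries)
    if lines.Length <= 1 then
        [||] // header only, or an empty response
    else
        lines.[1..] |> Array.map (fun line -> sprintf "%s, %s" ticker line)

// `fetch` is a hypothetical parameter standing in for the download step.
let collectRows (fetch: string -> Async<string>) (tickers: string list) =
    tickers
    |> List.map (fun ticker ->
        async {
            let! csv = fetch ticker
            return toRows ticker csv
        })
    |> Async.Parallel          // results come back in the order of the inputs
    |> Async.RunSynchronously
    |> Array.concat

// Fake download used only for illustration.
let fakeFetch (_: string) =
    async { return "Date,Close\n2013-01-02,1.0\n2013-01-03,2.0\n" }

let rows = collectRows fakeFetch [ "MSFT"; "YHOO" ]
// rows = [| "MSFT, 2013-01-02,1.0"; "MSFT, 2013-01-03,2.0";
//           "YHOO, 2013-01-02,1.0"; "YHOO, 2013-01-03,2.0" |]
```

The collected rows can then be written in one call, for example `File.WriteAllLines(path, rows)`. The downloads still run in parallel, while the file is only ever written from a single thread.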
Daniel