
I am using MongoDB to do tick data analysis in R. Initially I used MySQL, which worked fine, but I wanted to test MongoDB for this purpose. The data set currently contains about 200 million entries. Using RODBC, I could get the query result into a data.frame very quickly with sqlQuery(conn, "select * from td where prd = 'TY' and date = '2012-01-03'").

In MongoDB I have documents like Document{{_id=5537ca647a3ad42a84374f0a, prd=TY, time=1325661600043, px=130.6875, sz=11}}
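For reference, here is a minimal sketch of how one day maps onto the stored time field. It assumes that time holds milliseconds since the Unix epoch in UTC, which the sample document suggests but does not confirm. The names day_start, day_end, start_ms, end_ms and day_filter are made up for illustration, and the filter is only built as a plain R list, not passed to rmongodb.

```r
# Assumption: `time` stores milliseconds since the Unix epoch, in UTC.
day_start <- as.POSIXct("2012-01-04", tz = "UTC")
day_end   <- day_start + 24 * 60 * 60   # start of the next day

# Convert both bounds to epoch milliseconds to match the stored values.
start_ms <- as.numeric(day_start) * 1000
end_ms   <- as.numeric(day_end) * 1000

# The timestamp from the sample document falls inside this window.
sample_ms <- 1325661600043
in_range  <- sample_ms >= start_ms && sample_ms < end_ms

# Hypothetical range filter using the field names from the sample document
# (prd, time), written as a plain R list in MongoDB's $gte/$lt shape.
day_filter <- list(prd = "TY",
                   time = list(`$gte` = start_ms, `$lt` = end_ms))
```

Under that assumption, a day's worth of ticks is a half-open range on time rather than an equality match on a date value.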

In Java I can retrieve a day's worth of tick data (roughly 100,000 entries), create Tick objects, and add them to an array, all in less than 2 seconds.

Using rmongodb, the code below takes forever. Any ideas on how to improve this?

query <- mongo.bson.from.list( list(product = "TY", date = as.POSIXct("2012-01-04")) )
res.cursor <- mongo.find(mongo, db.coll, query, limit = 100e3, options=mongo.find.exhaust)
resdf <- mongo.cursor.to.data.frame(res.cursor)

Using mongo.find.all is equally slow.

rosor