How to synchronize market data frequently and show as a historical timeseries data

Question

http://pubapi.cryptsy.com/api.php?method=marketdatav2

I would like to synchronize market data on a continuous basis (e.g. from Cryptsy and other exchanges). I would like to record the latest buy/sell prices from the respective orders on these exchanges at regular intervals and show them as a historical time series.

What backend database should I use to store the retrieved data and to render or plot any parameter from it as a historical time series?

Can you provide more info on what the underlying parameters are? You indicate that you want to (a) sync data on a continuous basis, (b) from external sources, (c) store the data and (d) output data rendered in various time-series formats. Is the question which database can hold that much data as it increases, which one will quickly update when it's coming in, which one is best suited for time series data, or what? From your question so far, it's hard for me to believe you are sure what to do with the data once it's stored. — Anthony, Jul 14 '14 at 09:35
If your data already has timestamps, why not just shove the JSON into a CouchDB database and then move on to your most likely question of "whats the best way to retrieve this data?" — Anthony, Jul 14 '14 at 09:37

score 0 · Answer 1 · edited May 23 '17 at 11:59

I'd suggest you look at a database tuned for handling time series data. The one that springs to mind is InfluxDB. This question has a more general take on time series databases.
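
As a rough illustration of what writing one quote to a time series database could look like, here is a minimal sketch that formats a buy/sell sample as InfluxDB line protocol. It assumes an InfluxDB version that accepts line protocol; the measurement name `ticker`, the tag names, and the sample prices are invented for illustration and are not taken from any exchange API.

```python
def _esc_measurement(name):
    # In a measurement name, commas and spaces must be escaped.
    return str(name).replace(",", r"\,").replace(" ", r"\ ")


def _esc_key(text):
    # In tag keys, tag values and field keys, commas, spaces and
    # equals signs must be escaped.
    return (str(text).replace(",", r"\,")
                     .replace(" ", r"\ ")
                     .replace("=", r"\="))


def to_line_protocol(measurement, tags, fields, ts_ns):
    """Format one point as: measurement,tag=value field=value timestamp.

    All field values are written as floats; ts_ns is a Unix timestamp
    in nanoseconds.
    """
    tag_part = "".join(
        f",{_esc_key(k)}={_esc_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_esc_key(k)}={float(v)!r}" for k, v in fields.items()
    )
    return f"{_esc_measurement(measurement)}{tag_part} {field_part} {int(ts_ns)}"


# Illustrative values only.
line = to_line_protocol(
    "ticker",
    {"exchange": "cryptsy", "market": "LTC/BTC"},
    {"buy": 0.0251, "sell": 0.0253},
    1404738000000000000,
)
print(line)
```

Keeping the exchange and market as tags and the prices as fields would let you filter by exchange and plot either price over time.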

answered Jun 24 '14 at 12:25 by Synchro · edited by Community

score 0 · Answer 2 · answered Jul 07 '14 at 13:27

I think the requirement needs more detail. It only says that time series data needs to be synchronized. What is the scenario? What are the data source and the destination?

Option 1.

If this is just a matter of synchronizing data between two databases, the easiest solution is the CouchDB family of NoSQL stores (CouchDB, Couchbase, Cloudant).

They are all based on CouchDB, and they provide data-center-level replication (XDCR), so you can replicate the data to another CouchDB instance in a different data center, or even to CouchDB on mobile devices.

I hope this is useful to you.

Option 2.

The other approach is data integration. You can sync data with an ETL batch job: a batch worker periodically copies data to the destination. This is the most common way to replicate data to another destination. Many tools support ETL, such as Pentaho ETL, Spring Integration, and Apache Camel.
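
To make the batch-worker idea concrete, here is a minimal sketch in Python using only the standard library, with SQLite standing in for the destination. Everything in it is an assumption for illustration: the `quotes` table layout, the function names, and in particular the `extract` callable, which you would have to write yourself for each exchange because the payload format is not described here.

```python
import json
import sqlite3
import time
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Download and decode a JSON document from an exchange endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def init_db(conn):
    # One row per (exchange, market, timestamp) sample.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS quotes (
               ts REAL NOT NULL,
               exchange TEXT NOT NULL,
               market TEXT NOT NULL,
               buy REAL,
               sell REAL,
               PRIMARY KEY (exchange, market, ts))"""
    )


def sync_once(conn, exchange, fetch, extract, now=time.time):
    """Run one batch: fetch, extract (market, buy, sell) tuples, store them."""
    ts = now()
    rows = [(ts, exchange, m, b, s) for m, b, s in extract(fetch())]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO quotes VALUES (?, ?, ?, ?, ?)", rows
        )
    return len(rows)


def history(conn, exchange, market):
    """Return the stored time series for one market, oldest first."""
    cur = conn.execute(
        "SELECT ts, buy, sell FROM quotes "
        "WHERE exchange = ? AND market = ? ORDER BY ts",
        (exchange, market),
    )
    return cur.fetchall()


def run_forever(conn, exchange, fetch, extract, interval_s=60):
    """Repeat the batch at a fixed interval, logging failures and retrying."""
    while True:
        try:
            sync_once(conn, exchange, fetch, extract)
        except Exception as exc:  # keep the worker alive on transient errors
            print(f"sync failed for {exchange}: {exc}")
        time.sleep(interval_s)
```

A worker would call `init_db` once and then `run_forever` with `fetch=lambda: fetch_json(url)`; `history` returns the rows you would hand to a plotting library.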

If you describe your scenario in more detail, I can help you further.

Enjoy -Terry

Couchbase is based on Membase *not* on CouchDB. They are significantly different technologies. http://www.couchbase.com/couchbase-vs-couchdb — Asya Kamsky, Jul 08 '14 at 21:00

score 0 · Answer 3 · answered Jul 10 '14 at 15:14

I think MongoDB is a good choice. Here is why:

You can easily scale out and thus store a tremendous amount of data. With an appropriate shard key, you might even be able to place each shard close to the exchange it follows to improve speed, should that become a concern.
Replica sets offer automatic failover, should availability become an issue.
Using the TTL feature, documents can be deleted automatically once they expire, effectively creating a round-robin database.
Both the aggregation framework and map/reduce will be helpful for summarizing the stored data.
There are some free classes at MongoDB University that will help you avoid the most common pitfalls.
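
The TTL and aggregation points could be sketched as follows. This is an illustration under assumptions, not a tested deployment: it presumes each document looks like `{"exchange", "market", "ts", "buy", "sell"}` with `ts` stored as a date, and that `collection` is a pymongo collection you have already opened. The field names, the 30-day retention, and the function names are all hypothetical.

```python
from datetime import datetime, timezone

# Assumed retention period: expire samples after 30 days.
TTL_SECONDS = 30 * 24 * 3600


def hourly_average_pipeline(exchange, market, since):
    """Build an aggregation pipeline that averages buy/sell per hour."""
    return [
        {"$match": {"exchange": exchange, "market": market,
                    "ts": {"$gte": since}}},
        {"$group": {
            "_id": {"y": {"$year": "$ts"}, "m": {"$month": "$ts"},
                    "d": {"$dayOfMonth": "$ts"}, "h": {"$hour": "$ts"}},
            "avg_buy": {"$avg": "$buy"},
            "avg_sell": {"$avg": "$sell"},
            "samples": {"$sum": 1},
        }},
        {"$sort": {"_id.y": 1, "_id.m": 1, "_id.d": 1, "_id.h": 1}},
    ]


def hourly_history(collection, exchange, market, since):
    """Ensure the TTL index exists, then return hourly averages.

    `collection` is expected to be a pymongo Collection; the TTL index
    only works if `ts` holds date values.
    """
    collection.create_index("ts", expireAfterSeconds=TTL_SECONDS)
    return list(collection.aggregate(
        hourly_average_pipeline(exchange, market, since)))


# Example pipeline for samples recorded since 1 July 2014 (UTC).
pipeline = hourly_average_pipeline(
    "cryptsy", "LTC/BTC", datetime(2014, 7, 1, tzinfo=timezone.utc))
```

Building the pipeline in a separate function keeps it easy to inspect and to reuse with a different bucket size.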
