0

I have no experience on No-SQL databases since I was mostly working on SQL ones. But I think now my design might benefit for what I try to accomplish. I want to monitor prices of some products and save them in a db. In the beginning, the number of products will be limited (500) and I will monitor their price once per day (so max of 365 per year).

Original thought was to have a table price_history with columns like id|date|price - so for a year I will have 365 days * 500 products number of rows.

Would that benefit from using No-SQL databases, where (if I have read correctly) I can use a document format (e.g. JSON style) and which will make querying for the history a single product faster?

For this amount of data, maybe SQL is ok but what if :

  • columns increase on the price_history table
  • I want to save data for years not only for one year (it will continue growing)
  • products increase

So, is it worth the effort reading about No-SQL dbs in my paradigm above?

Community
  • 1
  • 1
ghostrider
  • 5,131
  • 14
  • 72
  • 120

1 Answers1

2

It's definitely worth reading more about NoSQL to see if it suits your workload. More information is a good thing.

However, nothing about the problem that you have described so far calls out for NoSQL as the solution.

You've tagged your question with so I assume that's the SQL database you're considering. It's definitely possible to add columns to a MySQL table, even after it has been populated. The more data you have in the table, the longer that takes. But it's possible.

If you need to continue querying the table while it's being restructured in this way, a tool like pt-online-schema-change can help.

It sounds like a year's worth of data for you would be 365 * 500 or 182,500 rows. Frankly, this is a pretty modest amount of data. MySQL database administrators often deal with much larger databases.

One table in one of the databases I currently manage that is about 4.5 billion rows, and it's growing by a 2-10 million rows per day. I use a combination of indexes and partitioning to make sure queries perform as well as possible. I manage other tables that have over 100 million rows of data each.

No database, SQL or NoSQL, allows you to keep growing indefinitely. Any data scalability strategy must include some policy for archiving or summarizing old data.

Another piece of advice I give is that choosing between SQL and NoSQL is more or less the same exercise as choosing between normalized SQL and denormalized SQL. That is, you choose the DBMS based on its ability to optimize the types of queries you will run against the data, not the structure or volume of data you need to store.

I would guess that your data is basically going to be used as a data warehouse, and your queries will be doing aggregate calculations or computing trends and so on. For this, you might consider a specialized column-store database. This is still an SQL database, but it stores data in a way that optimizes for OLAP queries.

Examples of column-oriented databases include:

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828