I have the following problem: I need to store 150MM records/day for 10 years, which comes to 150MM * 365 * 10 = 547,500,000,000 records in total. Each record has a unique key {date, id}. I need to retrieve 40MM records from this database daily, and lookups will always be by the key {date, id}. The process can run in batch. I thought about using a key-value database such as HBase, sharded by date (I'm not sure whether HBase lets you choose how records are partitioned across the cluster), or simply letting HBase handle the sharding for me.
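For reference, here is a rough sketch of the HBase layout I had in mind, using the Java client: a salted row key of {salt, date, id} plus a pre-split table, so one day's writes don't all land on a single region. The table name, column family, and bucket count are placeholders I made up:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class CreateRecordsTable {

    // Row key: salt + date + id. The salt is derived from the id, so a lookup
    // by {date, id} can recompute it; the date keeps one day's rows contiguous
    // within each bucket. "records", "d", and 16 buckets are made-up values.
    static byte[] rowKey(String date, long id) {
        long salt = Math.abs(id) % 16;
        return Bytes.toBytes(String.format("%x%s%016d", salt, date, id));
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {

            // Pre-split on the salt prefix so a single day's writes spread
            // across 16 regions instead of hot-spotting one region server.
            byte[][] splits = new byte[15][];
            for (int i = 1; i <= 15; i++) {
                splits[i - 1] = Bytes.toBytes(String.format("%x", i));
            }

            admin.createTable(
                TableDescriptorBuilder.newBuilder(TableName.valueOf("records"))
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("d"))
                    .build(),
                splits);
        }
    }
}
```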
I saw a similar question that uses MySQL partitioning (Efficiently storing 7.300.000.000 rows). I don't know whether MySQL can partition across multiple machines, or whether a single machine could handle this problem.
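For comparison, this is roughly what the per-day RANGE partitioning from that question would look like on a single MySQL server, created through JDBC; the schema, connection details, and partition names are placeholders I made up:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreatePartitionedTable {

    public static void main(String[] args) throws Exception {
        // Placeholder connection details; requires MySQL Connector/J.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/warehouse", "user", "password");
             Statement st = conn.createStatement()) {

            // One RANGE partition per day. MySQL requires every unique key to
            // include the partitioning column, which the {date, id} key satisfies.
            st.execute(
                "CREATE TABLE records (" +
                "  record_date DATE NOT NULL," +
                "  id BIGINT NOT NULL," +
                "  payload VARBINARY(255)," +
                "  PRIMARY KEY (record_date, id)" +
                ") PARTITION BY RANGE (TO_DAYS(record_date)) (" +
                "  PARTITION p20240101 VALUES LESS THAN (TO_DAYS('2024-01-02'))," +
                "  PARTITION p20240102 VALUES LESS THAN (TO_DAYS('2024-01-03'))" +
                ")");

            // The daily batch would add the next day's partition up front, e.g.:
            st.execute(
                "ALTER TABLE records ADD PARTITION (" +
                "  PARTITION p20240103 VALUES LESS THAN (TO_DAYS('2024-01-04'))" +
                ")");
        }
    }
}
```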
Do you believe this architecture will work? If not, what would be another way to solve the problem? Suggestions and tips are welcome!