I want to store more than a million records (~50 GB) in the key-value format below. What are the best ways to do it with only 16 GB of RAM?

Key: abc_1234 (randomly generated for each record)
Value: {
  name: "abc"
  number: 4
  add: "ads asf"
}
assylias
Shail
  • A database? ... – assylias Feb 27 '17 at 15:48
  • A Key-Value Database (Cassandra, etc.)? :) – rvit34 Feb 27 '17 at 15:52
  • Possible hint http://stackoverflow.com/questions/28675644/java-caching-frameworks-for-maintaining-huge-data – LazerBanana Feb 27 '17 at 15:53
  • yes I want to use DB to store data. Other than the spring batch framework to store in the DB, are there any better or as good solutions? – Shail Feb 27 '17 at 15:54
  • You can start with a simple sql DB using your key as a primary key. Implementation should be trivial and if it's not good enough, you can investigate alternative options. – assylias Feb 27 '17 at 15:58
  • hashmap backed by disk : http://stackoverflow.com/questions/2654709/disk-based-hashmap/39986541 – fmgp Feb 27 '17 at 15:58
  • Have you already tried to just use a HashMap? The payload of your example data does not look that huge. – Alexander Feb 27 '17 at 16:06
  • @Alexander the question states there's 50G of data. That's not appropriate for an in memory hashmap – MeBigFatGuy Feb 27 '17 at 16:12
  • It was said there are over a million records. The record below has a payload of about 25%. That means 12.5 GB. So, why shouldn't it work? Perhaps there is some other optimization potential, like treating the number as `int` or `long`, or skipping the first part of the key because it is the same as the name. I don't know, that's why I ask. It should be done in memory, so why not try the trivial approach first? – Alexander Feb 27 '17 at 16:21
  • Why do you think there is 50GB of data? That's 50,000 bytes per record and I only see about 50 there in the example. – rghome Feb 27 '17 at 18:29

1 Answer

Because you only have 16 GB of RAM, the full 50 GB cannot be held in memory at once. You would need to store the records on disk and read them back as a stream, a few at a time. You could do this by writing them to a plain-text file or to a database.
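As a rough sketch of the plain-text option, the records could be written one JSON object per line and read back lazily. This is in Python only because it keeps the example short; the file layout and the function names `write_records`, `stream_records`, and `find` are illustrative assumptions, not anything from the question.

```python
import json
import os
import tempfile


def write_records(path, records):
    """Write one JSON object per line; records may be a generator,
    so the full data set never has to sit in memory."""
    with open(path, "w", encoding="utf-8") as f:
        for key, value in records:
            f.write(json.dumps({"key": key, "value": value}) + "\n")


def stream_records(path):
    """Yield (key, value) pairs one at a time, reading line by line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            yield row["key"], row["value"]


def find(path, wanted_key):
    """Linear scan for a key. A plain file has no index, so the
    cost of a lookup grows with the size of the file."""
    for key, value in stream_records(path):
        if key == wanted_key:
            return value
    return None


if __name__ == "__main__":
    # Hypothetical sample data shaped like the record in the question.
    demo_path = os.path.join(tempfile.mkdtemp(), "records.jsonl")
    sample = (
        ("abc_%d" % i, {"name": "abc", "number": i, "add": "ads asf"})
        for i in range(1000)
    )
    write_records(demo_path, sample)
    print(find(demo_path, "abc_42"))
```

Memory use stays flat with this layout, but every lookup by key is a full scan, which is the main reason to prefer a database once random access matters.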

One database you could use is SQLite, which keeps all of the data in a single file on disk.
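A minimal sketch of that idea, using Python's built-in `sqlite3` module and the record key as the primary key (as suggested in the comments). The table name `records`, the choice to store the value as JSON text, and the helper names are assumptions made for illustration.

```python
import json
import sqlite3


def open_store(path):
    """Open (or create) the database file and the key-value table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def _flush(conn, batch):
    # 'with conn' wraps the inserts in one transaction and commits it.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
            batch,
        )


def put_many(conn, records, batch_size=10_000):
    """Insert records in batches so only one batch is held in memory."""
    batch = []
    for key, value in records:
        batch.append((key, json.dumps(value)))
        if len(batch) >= batch_size:
            _flush(conn, batch)
            batch = []
    if batch:
        _flush(conn, batch)


def get(conn, key):
    """Look up one record through the primary-key index."""
    row = conn.execute(
        "SELECT value FROM records WHERE key = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


if __name__ == "__main__":
    # ":memory:" keeps the demo self-contained; a real run would pass
    # a file path so the data lives on disk.
    conn = open_store(":memory:")
    sample = (
        ("abc_%d" % i, {"name": "abc", "number": i, "add": "ads asf"})
        for i in range(1000)
    )
    put_many(conn, sample, batch_size=100)
    print(get(conn, "abc_42"))
```

Because the key is indexed, a lookup does not need to scan the whole file, and inserting in batches keeps memory use bounded by the batch size rather than by the total data size.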

Carl Poole