
I have the below scenario:

  1. A huge list of messages arrives from an external system (each message contains an ID and a payload).
  2. I filter those messages based on the ID, store each payload in a list, and finally put the ID and its list of payloads into a map.
  3. Later, based on the ID, I retrieve the list of payloads from the map and submit the entire list to an executor service for further processing.

Well, I do not like this approach: at run time I am holding a map containing all the data (point 2), so I might end up with memory issues. Is there a good alternative to this approach?

EDIT

I am using Java. I am getting the messages from an external system (I have no idea what volume of messages could come in) and processing them based on their ID; after processing, they are stored in a database. The problem is the step where I load the messages into the map: I have to group the messages by ID before sending them for processing, so I have to keep the entire map in memory for a certain period of time, roughly as in the sketch below.
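
For reference, this is roughly the current approach (a minimal sketch; the class and member names are made up for illustration):

    import java.util.*;
    import java.util.concurrent.*;

    public class MessageGrouper {

        // Point 2: the entire ID-to-payloads grouping lives on the heap,
        // which is the memory concern.
        private final Map<String, List<byte[]>> groups = new HashMap<>();
        private final ExecutorService executor = Executors.newFixedThreadPool(4);

        // Group an incoming message's payload under its ID.
        public void accept(String id, byte[] payload) {
            groups.computeIfAbsent(id, k -> new ArrayList<>()).add(payload);
        }

        // Point 3: retrieve the full list for one ID and submit it to the
        // executor for further processing.
        public void process(String id) {
            List<byte[]> payloads = groups.remove(id);
            if (payloads != null) {
                executor.submit(() -> store(payloads));
            }
        }

        // Stands in for the real processing and database write.
        private void store(List<byte[]> payloads) { /* ... */ }
    }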

Thanks in advance.

Somnath Musib
  • Storing the messages in a database? (pick your poison: relational or NoSQL)... Or buying an expensive product (such as Terracotta BigMemory) and a lot of RAM (1 TB is cheap nowadays :) – Augusto Aug 07 '15 at 13:46
  • How big is the data? Have you considered using caching tools such as Hazelcast or Cassandra? – dogant Aug 07 '15 at 13:47

1 Answer


I remember using MapDB myself for this. Basically it gives you a Map interface backed by off-heap memory (think memory-mapped files in Linux).

You can find an example here: https://github.com/jankotek/mapdb/blob/master/src/test/java/examples/CacheOffHeap.java

I'll copy the relevant parts here for easier reference:

        // Imports needed by this snippet (at the top of the enclosing file):
        //   import java.util.Random;
        //   import org.mapdb.DBMaker;
        //   import org.mapdb.HTreeMap;
        //   import org.mapdb.Store;
        final double cacheSizeInGB = 1.0;

        // Create cache backed by off-heap store
        // In this case store will use ByteBuffers backed by byte[].
        HTreeMap cache = DBMaker
                .memoryDirectDB()
                .transactionDisable()
                .make()
                .hashMapCreate("test")
                .expireStoreSize(cacheSizeInGB) //TODO not sure this actually works
                .make();

        //generates random key and values
        Random r = new Random();
        //used to print store statistics
        Store store = Store.forEngine(cache.getEngine());


        // insert some stuff in cycle
        for(long counter=1; counter<1e8; counter++){
            long key = r.nextLong();
            byte[] value = new byte[1000];
            r.nextBytes(value);

            cache.put(key,value);

            if(counter%1e5==0){
                System.out.printf("Map size: %,d, counter %,d, store size: %,d, store free size: %,d\n",
                        cache.sizeLong(), counter, store.getCurrSize(),  store.getFreeSize());
            }

        }

        // and release memory. Only necessary with `DBMaker.memoryDirectDB()`
        cache.close();
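
To map this onto the scenario in the question, here is a rough, untested sketch (the grouping methods and the `store` callback are hypothetical; only the builder chain comes from the example above, and I am assuming the HTreeMap builder accepts generic key/value types):

    // Hypothetical adaptation: keep the ID-to-payloads grouping off-heap
    // instead of in a heap HashMap. Untested; serializer behavior and the
    // builder API may differ between MapDB versions.
    HTreeMap<String, ArrayList<byte[]>> groups = DBMaker
            .memoryDirectDB()
            .transactionDisable()
            .make()
            .hashMapCreate("groups")
            .make();

    // Grouping (point 2 in the question): read-modify-write, because values
    // are serialized off-heap and must be re-put after mutation.
    void addMessage(String id, byte[] payload) {
        ArrayList<byte[]> payloads = groups.get(id);
        if (payloads == null) {
            payloads = new ArrayList<>();
        }
        payloads.add(payload);
        groups.put(id, payloads);
    }

    // Processing (point 3): remove the group and hand it to the executor.
    void process(String id, ExecutorService executor) {
        ArrayList<byte[]> payloads = groups.remove(id);
        if (payloads != null) {
            executor.submit(() -> store(payloads)); // e.g. write to the database
        }
    }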
Fermin Silva