
I have the following in my program:

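#include <fstream>
#include <map>
#include <string>
#include <vector>
using namespace std;

class memModel
{
public:   // main() accesses AddrMap directly, so it must be public
    struct Addrlist
    {
        vector<string> data;
        vector<int> time;      // named to match the accesses in main()
        vector<string> client;
    };

    map<int, Addrlist> AddrMap; // keyed by address; stores the list of all accesses to it
};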

In main() I read from a few files and store millions of entries into this struct:

int main()
{
    memModel newObj ;
    ifstream file1("dataStream");
    ifstream file2("timeStampSteam");
    ifstream file3("clientStream");
    ifstream file4("addrStream") ; 
    string dataSTR,clientSTR;
    int time = 0 ; 
    int addr;
    for(int i=0; i<10000000/*10mil*/ ; i++)
    {
        getline(file1,dataSTR);
        getline(file3,clientSTR);
        file2 >> time ; 
        file4 >> hex >> addr ; 

        newObj.AddrMap[addr].data.push_back(dataSTR) ; 
        newObj.AddrMap[addr].time.push_back( time) ;
        newObj.AddrMap[addr].client.push_back(clientSTR) ;
    }
}

The problem is that I am running out of memory and get a std::bad_alloc exception. This code works with smaller data sizes.

I am trying to understand where the struct and the map are stored. Is everything going on the stack? The vectors are dynamically allocated, right? Do those go on the heap?

This is my first time working with large data sets, so I would like to understand the concepts better. How can I change this to make sure I am using the heap and do not run out of memory?

trincot
akslah

1 Answer

    newObj.AddrMap[addr].data[i] = dataSTR ; 
    newObj.AddrMap[addr].time[i] = time ;
    newObj.AddrMap[addr].client[i] = clientSTR ;

This stores three items of data into three vectors.

Unfortunately, all of these vectors are empty: they contain no elements, so indexing into them with operator[] is undefined behavior.

You have to either use push_back(), or resize() these vectors in advance so that they are large enough to hold the items you're placing into them.

A std::vector's operator[] does not automatically create or resize the array. It merely accesses the existing element in the array.

Sam Varshavchik
  • Damn that was a mistake in the code example I wrote. The actual program does not have that issue, I use push_back(). Updated the example in the question. – akslah May 22 '16 at 21:42
  • Well, in that case take this as a lesson learned: post real code, if you want a real answer, instead of make-believe code. – Sam Varshavchik May 22 '16 at 21:42
  • The real code is huge. I think this is the simplest example that explains the issue I am facing. I am asking for some tips from people who have experience working with large data sets. Please do not fix other people's code; they will not learn that way. – akslah May 22 '16 at 21:46
  • The "simplest" example should accurately reproduce the issue. Before posting a "simplest example" you should actually compile and verify that the "simplest example" does accurately reproduce the issue at hand. That's what a [mcve] requirement actually requires. Which you clearly didn't, and ended up wasting everyone's time. – Sam Varshavchik May 22 '16 at 21:47
  • OK back to people who can answer the question. I do not want someone to fix my code, just use the example as a discussion base point. – akslah May 22 '16 at 21:49
  • Prediction: you will not get an answer. Without a [mcve], an authoritative answer is logically impossible. The most that can be done here is random speculation. – Sam Varshavchik May 22 '16 at 22:04
  • The code above will reproduce the issue. Obviously since this is an out of memory issue, you might need files that are bigger or smaller than 10million entries each. – akslah May 22 '16 at 22:09