I would like to read a big (3.5 GB) file as fast as possible, so I think I should load it into RAM first instead of using ifstream and getline().
My goal is to find lines of data that share the same string. Example:
textdata abc123 XD0AA
textdata abc123 XD0AB
textdata abc123 XD0AC
textdata abc123 XD0AA
So I would need to read the first line, then iterate through the whole file until I find the fourth line (in this example) with the same XD0AA string.
This is what I did so far:
string line;
ifstream f("../BIG_TEXT_FILE.txt");
stringstream buffer;
buffer << f.rdbuf();
string f_data = buffer.str();
for (int i = 0; i < f_data.length(); i++)
{
    getline(buffer, line); // is this the correct way to get the line (for iteration)?
    line = line.substr(0, line.find("abc"));
    cout << line << endl;
}
f.close();
return 0;
But it uses twice as much RAM as the file itself (7 GB).
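The doubling most likely comes from holding the data twice: once inside the stringstream and once in the f_data copy produced by buffer.str(). A minimal sketch (assuming the file fits in RAM; the path ../BIG_TEXT_FILE.txt is just the one from the question) that reads the whole file into a single std::string, so only one copy is kept in memory:

#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream f("../BIG_TEXT_FILE.txt", std::ios::binary);
    if (!f)
        return 1;

    // Determine the file size, then read everything into one string.
    f.seekg(0, std::ios::end);
    std::string data;
    data.resize(static_cast<std::size_t>(f.tellg()));
    f.seekg(0, std::ios::beg);
    f.read(&data[0], static_cast<std::streamsize>(data.size()));

    std::cout << "read " << data.size() << " bytes\n";
    return 0;
}

With the whole file in one string, lines can then be found by scanning for '\n' instead of going through a stream again.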
Here is the fixed code:
string line, token;
int a;
ifstream f("../BIG_TEXT_FILE.txt");
stringstream buffer;
buffer << f.rdbuf();
//string f_data = buffer.str();
f.close();
while (true)
{
    getline(buffer, line);
    if (line.length() == 0)
        break;
    //string delimiter = "15380022";
    if (line.find("15380022") != std::string::npos)
        cout << line << endl;
}
return 0;
But how do I make getline() start reading from the beginning all over again?
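If the intent is to re-scan the same in-memory buffer more than once (for example, once per reference line), the stream can be rewound by clearing its state flags and seeking back to the start. A minimal sketch, assuming buffer is the std::stringstream already filled from the file:

#include <iostream>
#include <sstream>
#include <string>

// Rewind a stringstream so getline() can read it from the start again.
void rewind_stream(std::stringstream& buffer)
{
    buffer.clear();                  // clear the eof/fail flags set by the previous pass
    buffer.seekg(0, std::ios::beg);  // move the read position back to the beginning
}

int main()
{
    std::stringstream buffer("textdata abc123 XD0AA\ntextdata abc123 XD0AB\n");
    std::string line;

    while (std::getline(buffer, line))   // first pass
        std::cout << line << '\n';

    rewind_stream(buffer);               // second pass over the same data

    while (std::getline(buffer, line))
        std::cout << line << '\n';
    return 0;
}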