My C++ code needs to read a 7 MB text file of space-separated float values. Parsing the string values into a float array takes about 6 seconds, which is too slow for my use case.
I've been checking online, and people say it is usually the physical IO that takes the time. To rule this out, I read the whole file into a stringstream in one shot and parse the floats from that. There is still no improvement in speed. Any ideas on how to make it run faster?
Here's my code (replaced the array entries with dummy_f for simplicity):
#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <ctime>
using namespace std;

int main()
{
    // Read the whole file into memory first so disk IO is not part of the timing.
    ifstream testfile;
    string filename = "test_file.txt";
    testfile.open(filename.c_str());
    stringstream string_stream;
    string_stream << testfile.rdbuf();
    testfile.close();

    // Time only the float parsing.
    clock_t begin = clock();
    float dummy_f;
    cout << "started stream at time " << (double)(clock() - begin) / (double)CLOCKS_PER_SEC << endl;
    for (int t = 0; t < 6375; t++)
    {
        string_stream >> dummy_f;
        for (int t1 = 0; t1 < 120; t1++)
        {
            string_stream >> dummy_f;
        }
    }
    cout << "finished stream at time " << (double)(clock() - begin) / (double)CLOCKS_PER_SEC << endl;
    string_stream.str("");
    return 0;
}
Edit:
Here's a link to the test_cases.txt file: https://drive.google.com/file/d/0BzHKbgLzf282N0NBamZ1VW5QeFE/view?usp=sharing
Please change the inner loop bound from 120 to 128 when running with this file (I made a typo).
Edit: Found a way to make it work. I declared dummy_f as a string and read each whitespace-delimited word from the stringstream into it, then used atof to convert the word to a float. The time taken is now 0.4 seconds, which is good enough for me.
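// Requires <string>, <vector> and <cstdlib> (for atof).
string dummy_f;
vector<float> my_vector;
for (int t = 0; t < 6375; t++)
{
    // Read the next token as text, then convert it with atof.
    string_stream >> dummy_f;
    my_vector.push_back(static_cast<float>(atof(dummy_f.c_str())));
    for (int t1 = 0; t1 < 128; t1++)
    {
        string_stream >> dummy_f;
        my_vector.push_back(static_cast<float>(atof(dummy_f.c_str())));
    }
}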
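A possible variation on the same idea, as a rough sketch only: since the conversion function does the real work, the stringstream token extraction could be skipped and the in-memory text walked directly with std::strtof. The helper name parse_floats is hypothetical (it is not part of the code above), the sketch assumes a C++11 compiler and the default "C" locale, and I have not timed it against the atof version.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not from the original code): returns every
// whitespace-separated float found in `text`, in order of appearance.
std::vector<float> parse_floats(const std::string& text)
{
    std::vector<float> values;
    const char* cursor = text.c_str();  // null-terminated, so strtof stops at the end
    char* end = nullptr;
    for (;;)
    {
        // strtof skips leading whitespace and sets `end` just past the number.
        float value = std::strtof(cursor, &end);
        if (end == cursor)
        {
            // Nothing was consumed: end of input or a non-numeric token.
            break;
        }
        values.push_back(value);
        cursor = end;
    }
    return values;
}
```

With the code above, something like parse_floats(string_stream.str()) would stand in for the two nested loops. Note that the sketch stops at the first token that is not a number instead of skipping it, and that it does not check the expected count of 6375 x 129 values.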