QFile::readAll() may cause a memory problem, and std::getline() is slow (as is ::fgets()).
I faced a similar problem where I needed to parse very large delimited text files in a QTableView. Using a custom model, I parsed the file to find the offsets to the start of each line. Then, when data is needed for display in the table, I read the line and parse it on demand. This results in a lot of parsing, but it is fast enough that there is no noticeable lag when scrolling or updating.
It also has the added benefit of low memory usage, because the file contents are never read into memory. With this strategy a file of nearly any size is possible.
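The heart of this approach is the model's data() method, which touches the file only for the rows the view actually asks for. A minimal sketch of such a model (the class name, the readLine() helper, and the member names are my own illustration, not the original code; for brevity it caches only a plain offset per row rather than the RowData described below):

#include <QAbstractTableModel>
#include <QByteArray>
#include <QList>
#include <QString>
#include <QVariant>
#include <QVector>
#include <cstdio>

// Read-on-demand model: stores one file offset per row and nothing else.
class BigFileModel : public QAbstractTableModel
{
public:
    explicit BigFileModel(QObject *parent = 0)
        : QAbstractTableModel(parent), m_fp(NULL), m_columns(0) {}
    ~BigFileModel() { if (m_fp != NULL) ::fclose(m_fp); }

    int rowCount(const QModelIndex &) const { return m_offsets.size(); }
    int columnCount(const QModelIndex &) const { return m_columns; }

    QVariant data(const QModelIndex &index, int role) const
    {
        if (role != Qt::DisplayRole)
            return QVariant();
        // seek to the cached offset, read one line, split it on demand
        QList<QByteArray> fields = readLine(index.row()).split(',');
        if (index.column() < fields.size())
            return QString(fields.at(index.column()));
        return QVariant();
    }

private:
    // body omitted here: it is the "optimized line reading procedure"
    // shown later in this answer
    QByteArray readLine(int row) const;

    FILE *m_fp;               // opened once with ::fopen(path, "rb")
    QVector<long> m_offsets;  // start-of-line offsets built by the scan below
    int m_columns;            // column count, determined while scanning
};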
Parsing code:
m_fp = ::fopen(path.c_str(), "rb"); // open in binary mode for faster parsing
if (m_fp != NULL)
{
    // scan the whole file once, recording the offset of the start of each row
    char buf[BUF_SIZE+1];
    long pos = 0;
    m_data.push_back(RowData(pos)); // the first row starts at offset 0
    size_t nr = 0;
    while ((nr = ::fread(buf, 1, BUF_SIZE, m_fp)) > 0)
    {
        buf[nr] = 0; // null-terminate so ::strchr() stops at the data just read
        // each newline in the buffer marks the start of the next row
        char *c = buf;
        while ((c = ::strchr(c, '\n')) != NULL)
        {
            m_data.push_back(RowData(pos + c - buf + 1));
            c++;
        }
        pos += nr;
    }
    // squeeze any extra memory not needed in the collection
    m_data.squeeze();
}
RowData and m_data are specific to my implementation; they are simply used to cache information about a row in the file (such as the file position and number of columns).
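The original definition isn't shown, but for illustration RowData can be as small as this (the columns field is my hypothetical addition for a lazily-computed column count):

#include <QVector>

// hypothetical sketch of a per-row cache entry
struct RowData
{
    explicit RowData(long off) : offset(off), columns(-1) {}
    long offset;  // byte offset of the start of this row in the file
    int  columns; // number of columns, computed on first parse (-1 = unknown)
};

QVector<RowData> m_data;  // one entry per row; squeeze() trims spare capacity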
The other performance strategy I employed was to use QByteArray to parse each line, instead of QString. Unless you need Unicode data, this will save time and memory:
// optimized line reading procedure
QByteArray str;
char buf[BUF_SIZE+1];
::fseek(m_fp, rd.offset, SEEK_SET); // rd is the cached RowData for this row
size_t nr = 0;
while ((nr = ::fread(buf, 1, BUF_SIZE, m_fp)) > 0)
{
    buf[nr] = 0; // null-terminate the string
    // if this buffer contains the end of the line, cut it there and stop
    char *c = ::strchr(buf, '\n');
    if (c != NULL)
    {
        *c = 0;
        str += buf;
        break;
    }
    str += buf;
}
return str.split(',');
If you need to split each line on more than one delimiter character, rather than a single one, use ::strtok(); note that it treats every character of its delimiter argument as a separate delimiter, not as a multi-character substring.
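For example (a standalone sketch with made-up input; ::strtok() writes NUL bytes into the buffer it scans, so it must be given a writable copy):

#include <cstdio>
#include <cstring>

int main()
{
    // writable copy: ::strtok() replaces each delimiter it finds with '\0'
    char line[] = "alpha,beta;gamma";
    for (char *tok = ::strtok(line, ",;"); tok != NULL;
         tok = ::strtok(NULL, ",;"))
    {
        ::printf("%s\n", tok); // prints alpha, beta, gamma on separate lines
    }
    return 0;
}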