2

I have a text area which I populate dynamically (to be specific, a QPlainTextEdit in Qt, but that is not important for an algorithm suggestion).

The problem is that sometimes large amounts of data come in, and as more data arrives my application becomes heavy, since all of the text is held in main memory.

So I thought of the following: store all the text in a file and display only a limited amount of it at a time, while giving the user the illusion that the full contents of the file are loaded, by generating scroll events as new lines come in.
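One way to picture the file-backed idea is a line index: scan the file once, remember the byte offset where each line starts, and later read only the lines that are currently visible. The following is a minimal sketch in standard C++ under that assumption; `LineIndex`, `lineCount` and `window` are hypothetical names, not Qt API, and a file that keeps growing would additionally need its index extended as data is appended.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: indexes a text file by line so that only the
// visible window has to be read into memory.
class LineIndex {
public:
    explicit LineIndex(const std::string &path)
        : in_(path, std::ios::binary)
    {
        // Single pass: record the byte offset at which each line starts.
        std::string line;
        std::streamoff pos = 0;
        while (std::getline(in_, line)) {
            offsets_.push_back(pos);
            // +1 for the '\n' that getline consumed (binary mode keeps any '\r').
            pos += static_cast<std::streamoff>(line.size()) + 1;
        }
        in_.clear(); // reset the EOF/fail state so later seeks work
    }

    std::size_t lineCount() const { return offsets_.size(); }

    // Reads up to `count` lines starting at line `first` (0-based).
    std::vector<std::string> window(std::size_t first, std::size_t count)
    {
        std::vector<std::string> out;
        if (first >= offsets_.size())
            return out;
        in_.clear();
        in_.seekg(offsets_[first]);
        std::string line;
        while (out.size() < count && std::getline(in_, line))
            out.push_back(line);
        return out;
    }

private:
    std::ifstream in_;
    std::vector<std::streamoff> offsets_; // one offset per line
};
```

The index costs one offset per line, which is usually far smaller than the text itself, and `lineCount()` gives the value the scroll bar range can be based on.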

Is there a standard algorithm for this problem?

Sumeet
  • Memory-mapped files let the OS load and unload your data into virtual memory for you. This will get you into the single-gigabyte range pretty safely in terms of size. Do you need more than this? – Michael Dorgan May 03 '17 at 18:58
  • Yes, can you please elaborate? – Sumeet May 03 '17 at 19:26
  • I've not used them myself, but have read enough responses about them to know of their existence. https://msdn.microsoft.com/en-us/library/ms810613.aspx has an overview. http://stackoverflow.com/questions/22047673/transfering-data-through-a-memory-mapped-file-using-win32-winapi is a related Stack Overflow question. – Michael Dorgan May 03 '17 at 20:38
  • How big is the data? 10k lines of text (about 100 kB), or more? If it is around 100k lines it should work fine. How do you update this text? What kind of API are you using? Do you use `QTextDocument`? – Marek R May 04 '17 at 12:13

3 Answers

1

Subclass QAbstractListModel (or QAbstractTableModel, as in the example below) and implement a cache there. When a cell value is read, fetch the data from the cache, and load it into the cache first if it is not already there.

Tweak QTableView by changing the delegate to get the cell visualization you need. Note that you have to use QTableView, since the other QAbstractItemView subclasses have broken item recycling and don't handle very large models well (QTableView doesn't have this issue).

Some time ago I wrote a hex viewer for large files and tested it with a 2 GB file; it worked perfectly.

Ok, I found my old code which could be a good example:

#include <QAbstractTableModel>

class LargeFileCache;

class LageFileDataModel : public QAbstractTableModel
{
    Q_OBJECT
public:
    explicit LageFileDataModel(QObject *parent);

    // QAbstractTableModel
    int rowCount(const QModelIndex &parent) const;
    int columnCount(const QModelIndex &parent) const;
    QVariant data(const QModelIndex &index, int role) const;

signals:

public slots:
    void setFileName(const QString &fileName);

private:
    LargeFileCache *cachedData;
};

// ----- cpp file -----
#include "lagefiledatamodel.h"
#include "largefilecache.h"
#include <QSize>

static const int kBytesPerRow = 16;

LageFileDataModel::LageFileDataModel(QObject *parent)
    : QAbstractTableModel(parent)
{
    cachedData = new LargeFileCache(this);
}

int LageFileDataModel::rowCount(const QModelIndex &parent) const
{
    if (parent.isValid())
        return 0;
    return (cachedData->FileSize() + kBytesPerRow - 1)/kBytesPerRow;
}

int LageFileDataModel::columnCount(const QModelIndex &parent) const
{
    if (parent.isValid())
        return 0;
    return kBytesPerRow;
}

QVariant LageFileDataModel::data(const QModelIndex &index, int role) const
{
    if (index.parent().isValid())
        return QVariant();
    if (index.isValid()) {
        if (role == Qt::DisplayRole) {
            // Widen before multiplying so the offset cannot overflow int on large files.
            qint64 pos = qint64(index.row())*kBytesPerRow + index.column();
            if (pos>=cachedData->FileSize())
                return QString();
            return QString("%1").arg((unsigned char)cachedData->geByte(pos), 2, 0x10, QChar('0'));
        } else if (role == Qt::SizeHintRole) {
            return QSize(30, 30);
        }
    }

    return QVariant();
}

void LageFileDataModel::setFileName(const QString &fileName)
{
    beginResetModel();
    cachedData->SetFileName(fileName);
    endResetModel();
}

Here is a cache implementation:

#include <QByteArray>
#include <QFile>
#include <QObject>
#include <QQueue>

class LargeFileCache : public QObject
{
    Q_OBJECT
public:
    explicit LargeFileCache(QObject *parent = 0);

    char geByte(qint64 pos);
    qint64 FileSize() const;

signals:

public slots:
    void SetFileName(const QString& filename);

private:
    static const int kPageSize;

    struct Page {
        qint64 offset;
        QByteArray data;
    };

private:
    int maxPageCount;
    qint64 fileSize;

    QFile file;
    QQueue<Page> pages;
};

// ----- cpp file -----
#include "largefilecache.h"

const int LargeFileCache::kPageSize = 1024*4;

LargeFileCache::LargeFileCache(QObject *parent)
    : QObject(parent)
    , maxPageCount(1024)
    , fileSize(0)
{

}

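char LargeFileCache::geByte(qint64 pos)
{
    if (pos < 0 || pos >= fileSize)
        return 0;

    // Pages are kept in LRU order: least recently used at the head, most recent at the tail.
    for (int i=0, n=pages.size(); i<n; ++i) {
        // Keep the difference 64-bit; narrowing it to int could wrap on very large files.
        const qint64 k = pos - pages.at(i).offset;
        if (k>=0 && k< pages.at(i).data.size()) {
            pages.enqueue(pages.takeAt(i)); // cache hit: mark as most recently used
            return pages.back().data.at(k);
        }
    }

    // Cache miss: read the page that contains pos from the file.
    Page newPage;
    newPage.offset = (pos/kPageSize)*kPageSize;
    file.seek(newPage.offset);
    newPage.data = file.read(kPageSize);
    const qint64 k = pos - newPage.offset;
    if (k >= newPage.data.size())
        return 0; // short read, e.g. the file shrank or could not be read

    // Append at the tail; push_front() would make the new page the first one to be evicted.
    pages.enqueue(newPage);

    while (pages.count()>maxPageCount)
        pages.dequeue(); // evict the least recently used page

    return newPage.data.at(k);
}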

qint64 LargeFileCache::FileSize() const
{
    return fileSize;
}

void LargeFileCache::SetFileName(const QString &filename)
{
    file.close();
    pages.clear();
    file.setFileName(filename);
    // Report an empty file if it cannot be opened, so geByte() never reads from a closed file.
    fileSize = file.open(QFile::ReadOnly) ? file.size() : 0;
}

I wrote the cache manually since I was handling raw data, but you can use QCache, which should help with the caching logic.
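For comparison, the least-recently-used logic that such a cache provides can be sketched in standard C++ without Qt. `PageCache` and its `Loader` callback are hypothetical names used only for illustration, and the sketch assumes a capacity of at least one page; QCache itself works differently in detail (it owns the inserted objects and evicts by cost).

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache of pages, keyed by page number.
class PageCache {
public:
    using Loader = std::function<std::string(std::int64_t pageNo)>;

    PageCache(std::size_t maxPages, Loader loader)
        : maxPages_(maxPages ? maxPages : 1), loader_(std::move(loader)) {}

    const std::string &page(std::int64_t pageNo)
    {
        auto it = index_.find(pageNo);
        if (it != index_.end()) {
            // Hit: move the entry to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, it->second);
            return it->second->second;
        }
        // Miss: load the page, then evict the least recently used one if over capacity.
        lru_.emplace_front(pageNo, loader_(pageNo));
        index_[pageNo] = lru_.begin();
        if (lru_.size() > maxPages_) {
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        return lru_.front().second;
    }

    std::size_t size() const { return lru_.size(); }

private:
    using Entry = std::pair<std::int64_t, std::string>;
    std::size_t maxPages_;
    Loader loader_;
    std::list<Entry> lru_; // front = most recently used
    std::unordered_map<std::int64_t, std::list<Entry>::iterator> index_;
};
```

The hash map makes the lookup constant time on average, whereas the hand-written cache above scans its queue linearly on every access.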

Marek R
0

Using mmap only addresses how you can read the file while having only pieces of it in memory. It doesn't address how the edit control would hold only pieces at a time.

I have to think that any such system would be fairly specific to the text editing widget involved. In this case, you would either need to figure out how to extend QPlainTextEdit with the desired functionality, or make a new text editing widget (possibly forking an existing one). There are numerous text editing widgets available as open source that could be used as a starting point.

I've been assuming so far that you want to edit the text in this large file. If you are only using QPlainTextEdit as a read-only viewer, then writing your own widget that only displays a large text stream may be much easier than extending an existing editor widget.
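For the read-only case, much of the widget logic comes down to tracking which lines are visible and whether the view should follow newly appended data. Here is a minimal sketch of that bookkeeping in plain C++; the `Viewport` struct and its members are hypothetical and not tied to any Qt class.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical viewport state for a read-only viewer backed by a file.
struct Viewport {
    std::int64_t totalLines = 0;  // lines stored in the backing file
    std::int64_t visibleRows = 0; // rows that fit in the widget
    std::int64_t top = 0;         // index of the first visible line

    // Largest valid value of `top`; this would also be the scroll bar maximum.
    std::int64_t maxTop() const
    {
        return std::max<std::int64_t>(0, totalLines - visibleRows);
    }

    bool atBottom() const { return top >= maxTop(); }

    // Called when `count` new lines have been appended to the file.
    void appendLines(std::int64_t count)
    {
        const bool follow = atBottom(); // only auto-scroll if the user was at the end
        totalLines += count;
        if (follow)
            top = maxTop();
    }

    // Called when the user moves the scroll bar.
    void scrollTo(std::int64_t line)
    {
        top = std::max<std::int64_t>(0, std::min(line, maxTop()));
    }
};
```

The widget would then fetch only lines `top` to `top + visibleRows - 1` from the file whenever `top` changes.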

Joshua D. Boyd
0

Here are my two cents.

When I googled a similar question, I found an answer in "Fast textfile reading in c++".

In short, memory-mapped files from the Boost library may help not only with performance but also with handling large amounts of data.

With the sample from that link, I was able to count the lines and read the data through the library.

Good Luck

Kwang-Chun Kang