
I want to read a file which contains a batch of strings like this:

TGCCACAGGTTCCACACAACGGGACTTGGTTGAAATATTGAGATCCTTGGGGGTCTGT
GTTCACGGGCCTCACGCAACGGGGCCTGGCCTAGATATTGAGGCACCCAACAGCTCT
TGCCACAGGTTCCACACAACGGGACTTGGTTGAAATATTGAGATCCTTGGGGGTCTGT
TGCCACAGGTTCCACACAACGGGACTTGGTTGAAATATTGAGATCCTTGGGGGTCTGT
TTCCACGGACTTCACGCAACGGAACTTGGTCTAGCGGCTGAGGTATCCAACAGCTCTT
......

The serial way to do this is:

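#include <fstream>
#include <string>
#include <thrust/host_vector.h>
using namespace std;

ifstream input_subset("subset.txt");
thrust::host_vector<string> h_output_subset;

// Read the file line by line, appending each line to the vector.
string s;
while (getline(input_subset, s)) {
    h_output_subset.push_back(s);
}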

Is it possible to fill the host_vector in parallel, either with some Thrust function or by implementing it myself in a CUDA project? I mean batch filling: say, each thread fills one cell of the vector at the same time.
Thanks a lot.

fanhk
  • Well, have you tried preallocating the vector and blitting the data into it segment-wise? Note that in this case, the newlines had better be perfectly uniformly spaced or absent. Even so, the real problem you have is going to be disk I/O and "not getting in the way of it". Nothing you seem to be doing above will be as slow as disk I/O. – Yakk - Adam Nevraumont Oct 27 '15 at 02:47
  • Your assumption is that adding more threads to read from a *single* disk will speed things up. I would argue that unless your storage is a RAID, adding more readers will **slow** things down. – YePhIcK Oct 27 '15 at 02:55
  • You might consider loading the file with memory-mapped I/O. – Davislor Oct 27 '15 at 03:40
  • Got it, thank you for all the replies! – fanhk Oct 27 '15 at 07:01
  • If you "got it", please add your own answer to this question (it is perfectly OK to do that). Otherwise this question will likely sit around unanswered forever. – talonmies Oct 27 '15 at 07:10
  • Using CUDA to load files in parallel is unlikely to speed things up, unless the file is stored in some special way that allows for fast reading. A similar question has been asked here: http://stackoverflow.com/questions/8970830/read-multiple-text-files-in-parallel-using-cuda/8992428#8992428 – CygnusX1 Oct 27 '15 at 18:24

0 Answers