
I would like to search for a given string in multiple files in parallel using CUDA. I plan to use the PFAC library for the string matching itself. The problem is how to access multiple files in parallel.

Example: We have a folder containing thousands of files that have to be searched.

The problem here is how I should access multiple files in the given folder. The files in the folder should be obtained dynamically, and each thread should be assigned a file in which to search for the given string.

Is it possible?
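
For reference, here is a minimal host-side sketch of the "dynamically obtain the files" part, assuming C++17's std::filesystem (the names like `Corpus` and `gather_files` are illustrative, not part of PFAC). It walks a directory and packs every file into one flat buffer plus an offset table, which is the layout a GPU kernel would need anyway:

    #include <cstddef>
    #include <filesystem>
    #include <fstream>
    #include <iterator>
    #include <vector>

    struct Corpus {
        std::vector<char>        text;    // all file contents, back to back
        std::vector<std::size_t> offsets; // offsets[i] = start of file i in text
    };

    Corpus gather_files(const std::filesystem::path& dir) {
        Corpus c;
        for (const auto& entry : std::filesystem::directory_iterator(dir)) {
            if (!entry.is_regular_file()) continue;   // skip subdirectories etc.
            std::ifstream in(entry.path(), std::ios::binary);
            c.offsets.push_back(c.text.size());       // where this file begins
            c.text.insert(c.text.end(),
                          std::istreambuf_iterator<char>(in),
                          std::istreambuf_iterator<char>());
        }
        c.offsets.push_back(c.text.size());           // sentinel: end of last file
        return c;
    }

Thread (or block) i would then search the range text[offsets[i]] .. text[offsets[i+1]] independently; only `text` and `offsets` need to be copied to the GPU.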

Edit:

In this post: very fast text file processing (C++), the author uses the Boost library to read a 3 GB text file in 16 seconds, while in my case I have to read thousands of smaller files.
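
For comparison, a sketch of the memory-mapping approach, assuming Boost.Iostreams' mapped_file_source (I have not verified that the linked post uses exactly this API):

    #include <algorithm>
    #include <string>
    #include <boost/iostreams/device/mapped_file.hpp>

    // Map the file into the address space instead of read()ing it; the OS
    // pages the bytes in on demand, so there is no explicit read loop.
    bool file_contains(const std::string& path, const std::string& needle) {
        boost::iostreams::mapped_file_source file(path);
        const char* end = file.data() + file.size();
        return std::search(file.data(), end, needle.begin(), needle.end()) != end;
    }

Note that for thousands of small files the per-file open/seek cost dominates, so mapping mainly helps with large files like the 3 GB one in the linked post.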

Thank you

– Tanmay J Shetty
  • How many files, typically? 10s, 100s, 1000s, more? – Paul R Jan 23 '12 at 11:29
  • It is possible. The problem is that reading them from the disk is inherently sequential due to the single head. If the search is a simple string search, you'll probably have a hard time beating grep at this. – Ira Baxter Jan 23 '12 at 13:01
  • So you mean parallel reading of the files could be done, but it would be slower than grep. I have to search for the given string in the contents of the files; I am not searching for filenames, in case that was the confusion. – Tanmay J Shetty Jan 23 '12 at 13:12
  • Yes, this doesn't seem like a good fit for CUDA - the cost of reading the files from disk and then copying the data to GPU memory will probably be far greater than any possible speed benefit in the string search. The only way this would make sense would be if you needed to do many searches on the same set of files and could load all the files into GPU memory simultaneously. – Paul R Jan 23 '12 at 14:16
  • I apologise if this sounds silly, since I am new to CUDA, but is transferring the files to GPU memory compulsory? I do not want to copy the files to GPU memory. Is there any alternative? – Tanmay J Shetty Jan 23 '12 at 15:05
  • You have no choice in this - even if your GPU board supports transparent access to host memory, this still requires host<->GPU bandwidth, so the cost is much the same either way. – Paul R Jan 23 '12 at 16:22
  • Is this post helpful? http://stackoverflow.com/questions/8123094/very-fast-text-file-processing-c The author uses the Boost library to read a 3 GB text file in 16 seconds. – Tanmay J Shetty Jan 23 '12 at 22:51
  • You have thousands of files, likely scattered randomly around your disk. File read times will be dominated by seek/rotational latency, measured in tens of milliseconds, times thousands of files ==> roughly 20 seconds of elapsed time *just to read* the files. So you have ~20 seconds of CPU time to spend searching the files as they arrive. It hardly matters what library you use; the physics is against you. – Ira Baxter Jan 24 '12 at 05:37

2 Answers


Doing this task in CUDA will not help much over doing the same thing on the CPU.

Assuming that your files are stored on a standard magnetic HDD, a typical single-threaded CPU program would spend:

  1. About 5 ms to seek to the sector where the file is stored and bring it under the read head.
  2. About 10 ms to load a 1 MB file into RAM (assuming a 100 MB/s read speed).
  3. Less than 0.1 ms to load 1 MB of data from RAM into the CPU cache and process it with a linear search algorithm.

That is 15.1 ms for a single file. If you have 1000 files, the work takes 15.1 s.

Now, if I gave you a super-powerful GPU with infinite memory bandwidth, zero latency, and infinite processor speed, you could perform task (3) in no time at all. However, the HDD reads would still take exactly as long: the GPU cannot parallelise the work of another, independent device. As a result, instead of spending 15.1 s, you would spend 15.0 s.

The infinite GPU would give you roughly a 0.7% speedup. A real GPU would not come anywhere near that!
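
A quick back-of-envelope check of those numbers (all inputs are the estimates above, not measurements):

    #include <cstdio>

    int main() {
        const double seek_ms = 5.0, read_ms = 10.0, search_ms = 0.1;
        const int files = 1000;
        const double cpu_s = files * (seek_ms + read_ms + search_ms) / 1000; // 15.1 s
        const double gpu_s = files * (seek_ms + read_ms) / 1000;             // 15.0 s: search time -> 0
        std::printf("speedup bound: %.2f%%\n", (cpu_s / gpu_s - 1) * 100);   // ~0.67%
    }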


In the more general case: whenever you consider using CUDA, ask yourself: is the actual computation the bottleneck of the problem?

  • If yes - continue searching for possible solutions in the CUDA world.
  • If no - CUDA cannot help you.

If you deal with thousands of tiny files and need to read them often, consider techniques that "attack" this bottleneck instead. Some options:

  • RAM buffering (pay the disk cost once, then search the in-memory copies; see the sketch below)
  • Putting your hard drives in a RAID configuration
  • Getting an SSD

There may be more options; I am not an expert in that area.
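
As an illustration of the first option, a hypothetical RAM-buffering sketch (the class and its names are mine): the first access reads from disk, every later access is served from RAM.

    #include <fstream>
    #include <iterator>
    #include <string>
    #include <unordered_map>
    #include <utility>

    class FileCache {
        std::unordered_map<std::string, std::string> contents_;
    public:
        const std::string& get(const std::string& path) {
            auto it = contents_.find(path);
            if (it == contents_.end()) {              // first access: read from disk
                std::ifstream in(path, std::ios::binary);
                std::string data((std::istreambuf_iterator<char>(in)),
                                 std::istreambuf_iterator<char>());
                it = contents_.emplace(path, std::move(data)).first;
            }
            return it->second;                        // later accesses: RAM only
        }
    };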

– CygnusX1

Yes, it is probably possible to get a speed-up with CUDA if you can reduce the impact of read latency/bandwidth. One way would be to perform multiple searches concurrently, i.e. if you can search for [needle1], .. [needle1000] in your large haystack, then each thread could search haystack pieces and store the hits. Some analysis of the required throughput per comparison is needed to determine whether your search is likely to benefit from CUDA. This paper may be useful: http://dl.acm.org/citation.cfm?id=1855600
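
To make the idea concrete, a minimal CUDA sketch (a naive matcher, not PFAC; names are illustrative) where each thread tests one starting offset of the haystack against a single needle:

    // naive_search: thread i reports whether `needle` occurs at offset i.
    __global__ void naive_search(const char* haystack, int hay_len,
                                 const char* needle, int needle_len,
                                 int* hits) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i > hay_len - needle_len) return;         // not enough bytes left
        for (int j = 0; j < needle_len; ++j)
            if (haystack[i + j] != needle[j]) return; // mismatch: no hit at i
        hits[i] = 1;                                  // match starting at offset i
    }

    // Host side (error checking omitted): cudaMemcpy the haystack, needle and
    // a zeroed `hits` array to the device, then launch e.g.
    //   naive_search<<<(hay_len + 255) / 256, 256>>>(d_hay, hay_len,
    //                                                d_needle, needle_len, d_hits);
    // and copy `hits` back. Running many needles over the same resident
    // haystack amortises the transfer cost, which is the point of the
    // analysis above.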

– axon