3

In continuation of my previous question: is CUDA suitable for quick real-time applications? The task is this: my application has to perform a lot of calculations within 0.1-0.3 ms. The CUDA kernels themselves finish these calculations well within that budget, but with all the overhead (memory copies) the total time is not acceptable.

Is CUDA just not usable for this kind of application, or are there some hacks to avoid the situations described in my previous question?

These guys provide a so-called "GPU Workbench" with a modified GPU driver built on their own Linux version. They claim that their system performs much faster than typical GPU configurations. Does anyone know anything about them?

otter
  • It's unclear what you're asking. – Robert Harvey Oct 29 '12 at 23:37
  • I just wanted to know whether anyone has used CUDA for a time-critical application where the full cycle (writing to GPU memory, running the kernel, reading back from the GPU) must complete in 0.1-0.3 ms. Maybe someone has advice, or has run into the strange overhead described in my previous question. Or maybe someone has used GPU Workbench and it really is faster than the standard CUDA driver and runtime. – otter Oct 29 '12 at 23:49

2 Answers

1

0.3ms is a very small time window for running a complete program on a GPU. Even for very small tasks, 10x that is more typical. And if your task is so small that it can run in such a small amount of time, then you probably aren't even saturating the GPU and there's really no point to even running it on a GPU.
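To make the transfer overhead concrete, here is a rough sketch of timing the full copy-kernel-copy cycle with CUDA events; the kernel, the problem size, and the buffer names are placeholders, not anything from the question:

```cuda
// Hypothetical timing sketch: measure the whole host-to-device copy,
// kernel launch, and device-to-host copy against a sub-millisecond budget.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dummyKernel(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;          // trivial stand-in computation
}

int main() {
    const int n = 1 << 16;            // small workload, placeholder size
    const size_t bytes = n * sizeof(float);

    float *h, *d;
    cudaMallocHost((void **)&h, bytes);  // pinned host memory: faster DMA copies
    cudaMalloc((void **)&d, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
    dummyKernel<<<(n + 255) / 256, 256>>>(d, n);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("full cycle: %.3f ms\n", ms);  // compare against the 0.1-0.3 ms budget

    cudaFreeHost(h);
    cudaFree(d);
    return 0;
}
```

Even with pinned memory, the two copies and the launch latency often add up to more than the kernel itself for workloads this small, which is exactly the effect described above.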

That said, I do use CUDA for a real-time distributed system with a turnaround time of roughly one second, but it sounds like our definition of "real-time" is a bit more relaxed than yours.

I don't know anything about the "GPU Workbench" you mentioned.

Brendan Wood
  • Undoubtedly, CUDA is not meant to be used the way I want to use it, but I hope there is a chance to make it run that fast. The analogous CPU code runs 20 times slower than the code I've tried on a GTX 680, but unfortunately the GTX 680 code is still too slow to use in production. If you're interested in what I'm dealing with and the problems I've encountered, you're welcome to look at my [first topic](http://stackoverflow.com/questions/13130967/why-cuda-memory-copy-speed-behaves-like-this-some-constant-driver-overhead). – otter Oct 30 '12 at 21:59
0

I'll answer the question in two parts.

  1. The program's run time depends on the amount of data and on how much parallelism you exploit, as well as on techniques such as making good use of the L1 and L2 caches and running multiple kernels. Since you mention a real-time application, you will have to touch CPU memory from time to time; if possible, transfer all the data at once rather than piece by piece.

  2. If your application involves graphics, I recommend using the graphics libraries directly (OpenGL with GLSL, or DirectX with HLSL).
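The first point above, transferring everything at once and keeping data resident on the GPU, can be sketched roughly like this; the kernel, function, and buffer names are made up for illustration:

```cuda
// Hypothetical sketch: upload the whole data set once, run many kernels
// against device-resident memory, and copy back only the final results,
// instead of paying copy overhead on every iteration.
#include <cuda_runtime.h>

__global__ void step(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 1.0f;          // placeholder per-iteration computation
}

void processAllFrames(float *hostData, int n, int frames) {
    const size_t bytes = n * sizeof(float);
    float *d;
    cudaMalloc((void **)&d, bytes);

    cudaStream_t s;
    cudaStreamCreate(&s);

    // One big upload instead of one copy per iteration. Note: hostData
    // should be pinned (cudaMallocHost / cudaHostRegister) for the async
    // copy to actually overlap; with pageable memory it degrades to sync.
    cudaMemcpyAsync(d, hostData, bytes, cudaMemcpyHostToDevice, s);

    for (int f = 0; f < frames; ++f)
        step<<<(n + 255) / 256, 256, 0, s>>>(d, n);

    // One download at the end.
    cudaMemcpyAsync(hostData, d, bytes, cudaMemcpyDeviceToHost, s);
    cudaStreamSynchronize(s);

    cudaStreamDestroy(s);
    cudaFree(d);
}
```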

Fr34K