
How does the performance of a modern FPGA compare with that of a CPU in absolute terms (GFLOPS/GIOPS), and what is the cost of one billion integer operations per second on an FPGA? For which tasks is it now beneficial to use an FPGA? I only found this: http://www.hpcwire.com/hpcwire/2010-11-22/the_expanding_floating-point_performance_gap_between_fpgas_and_microprocessors.html

And an old article: http://www.mouldy.org/fpgas-in-cryptanalysis.pdf

Alex

1 Answer


Disclaimer: I work for SRC Computers, a heterogeneous CPU/FPGA system manufacturer.

"It depends", of course, is the answer.

A microprocessor is a fixed set of functional units. These perform reasonably well across a broad range of applications.

An FPGA is programmed by a designer with a specific set of functional units designed solely to execute a specific application. As such, it (often) performs very well for a given application.

"How much is the performance of modern FPGA relative to CPU, absolutely in (GFlops/GIops)" becomes a meaningless question. It can be answered for the microprocessor, as it has a fixed set of floating-point units. For an FPGA, however, the question evolves into: 1) how large is the FPGA, 2) how many floating-point units can I pack into it and still do useful work, 3) what is the memory/support architecture around the FPGA, and 4) what are the sustained system bandwidths between the FPGA, its memory, and the rest of the system?
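To make the dependence on those four factors concrete, here is a rough back-of-the-envelope sketch. It is not a vendor formula: the function names and every number in it are hypothetical placeholders, and it assumes each floating-point unit is fully pipelined and retires one operation per clock cycle.

```python
def peak_gflops(num_units: int, clock_mhz: float, ops_per_unit_per_cycle: float = 1.0) -> float:
    """Upper bound on throughput: units x ops per cycle x clock (MHz), expressed in GFLOPS."""
    return num_units * ops_per_unit_per_cycle * clock_mhz / 1000.0


def sustained_gflops(peak: float, bandwidth_gb_s: float, flops_per_byte: float) -> float:
    """Throughput is capped by whichever is smaller: compute or data delivery."""
    return min(peak, bandwidth_gb_s * flops_per_byte)


# Hypothetical design: 500 floating-point units packed into the device at 200 MHz.
peak = peak_gflops(num_units=500, clock_mhz=200.0)

# Hypothetical system: 10 GB/s sustained to the FPGA, 0.25 FLOP per byte moved.
sustained = sustained_gflops(peak, bandwidth_gb_s=10.0, flops_per_byte=0.25)

print(f"peak: {peak:.1f} GFLOPS, sustained: {sustained:.1f} GFLOPS")
```

With these made-up inputs the design is limited by bandwidth rather than by the number of units, which is why the surrounding system architecture matters as much as the FPGA itself.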

The answer to "what is the cost of one billion integer operations per second on the FPGA" is similarly addressed by the preceding paragraph.

An interesting thing to keep in mind about performance is that in an FPGA, peak performance equals sustained performance, since the FPGA is dedicated to executing a given application. This holds as long as other system parameters do not interfere, of course.

Your question "And in which tasks now beneficial to use FPGA?" is very broad, and the list of suitable tasks grows with every release of a larger FPGA device. In extremely broad, non-exclusive terms, parallel and streaming applications benefit, although application performance is determined to a large extent by the system architecture.

David Pointer
  • Thanks for the reply. In general, CPU performance is hard to pin down as well, since peak GFLOPS/GIOPS are reached only by using SSE, all cores, and cache-friendly memory access (no latency), and real applications may not manage that. Still, the article linked in my question makes exactly this comparison. So if we do compare peak performance, how many times faster can an FPGA be than a CPU at the same price: 10, 100, 1000 times? – Alex Sep 05 '12 at 16:51
  • @Alex You're welcome. If you have a question specific to the hpcwire article, please contact hpcwire or the article author. I can only give you a general answer relative to "it depends", as I did above. – David Pointer Sep 05 '12 at 18:34
  • OK. And in general, how difficult is it to move a program written in OpenCL to an FPGA, as described at this link? http://www.hpcwire.com/hpcwire/2012-08-30/acceleware_altera_launch_training_program_for_opencl_on_fpgas.html – Alex Sep 05 '12 at 19:31
  • @Alex I've never used OpenCL, and so have no right to comment on it. I've only used C (and Verilog, of course). There's an excellent compilation of C to FPGA tools here: http://stackoverflow.com/q/5603285/1098754 – David Pointer Sep 05 '12 at 19:47
  • Can it compute an O(n²) algorithm for n=64k in several milliseconds? (Each n needs 25 FP32 operations.) (A $100 FPGA chip, for example.) – huseyin tugrul buyukisik Nov 02 '15 at 17:03
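
For a sense of scale on that last comment, the arithmetic can be sketched as below. The reading is an assumption: n = 64k is taken as 65,536, each of the n² steps is taken to cost 25 FP32 operations, and "several milliseconds" is taken as 5 ms. The function names are made up for illustration.

```python
def total_flops(n: int, ops_per_step: int) -> int:
    """Total operations for an O(n^2) algorithm with a fixed cost per step."""
    return n * n * ops_per_step


def required_gflops(n: int, ops_per_step: int, seconds: float) -> float:
    """Sustained rate (GFLOPS) needed to finish all steps within the time budget."""
    return total_flops(n, ops_per_step) / seconds / 1e9


n = 64 * 1024        # assumed meaning of "64k"
ops_per_step = 25    # FP32 operations per step, from the comment
budget_s = 0.005     # assumed meaning of "several milliseconds"

print(f"total work: {total_flops(n, ops_per_step):.3e} FP32 operations")
print(f"required:   {required_gflops(n, ops_per_step, budget_s):.0f} GFLOPS sustained")
```

Under those assumptions the job is about 1.07e11 operations and needs roughly 21,000 GFLOPS sustained. Whether a given device can reach that depends on the same factors as in the answer: how many FP32 units fit, at what clock, and whether the data can be delivered fast enough.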