FPGAs are great at running simple, fixed data flows through parallel processing, while CPUs are optimized for complex and/or dynamic data flows.
The C language is not designed for describing highly parallel systems, as it follows a clearly sequential pattern ("assign a
to b
, then add c
to d
"); while compilers introduce some parallelization as an optimization, the focus is on generating code that behaves as if the instructions were sequentialized.
In an FPGA, on the other hand, you want to break up sequences as far as possible and create parallel circuitry and pipelines, so normally the system is described in the form of interconnected blocks, where each is kept as simple as possible.
For example, where you have (a+b)*(c+d)
, a CPU based design would probably have a single adder, feed it with a
and b
first, then with c
and d
, and finally pass both results to the multiplier.
In an FPGA design, that is rather costly, as you have to create a state machine that keeps track of which of the three computation stages we are at and where the results are kept, so it may be easier to have two dedicated adders hardwired to a
and b
, and c
and d
, respectively, and to have their outputs connected to a multiplier block.
At this point, you basically have created a dedicated machine that can compute this single term and nothing else, but its speed is limited by the speed of the transistors making up the logic gates only, and compared to the state machine you get a speed increase of at least a factor of three (because we only have a single state/instruction now), probably more because we can also discard the logic for storing intermediate results.
In order to decide when to create a state machine/processor, and when to hardcode computations, the compiler would have to know more about the program flow and timing requirements than can be expressed in C/C++, so these languages are not a good choice.
The OS as such also looks vastly different. There are no resources to arbitrate dynamically, so this part is omitted, and all that is left are device drivers. As everything is parallel, these take the form of external modules that are simply linked into your design, and interfaced directly.
If you are just starting out, I'd suggest you get a development kit with a few LEDs, and start with the basic functionality:
- Make the LED blink
- Use a PLL block from the system library to derive a secondary clock, and make the LED blink with a different frequency
- Add a simple bus interface, e.g. SPI, and communicate with a simple external device, e.g. a WS2811 based LED strip
After you have a basic grasp of how the system works, try to get a working simulation environment (the equivalent of a Debug build), and begin including more complex peripherals.