I'm studying Vivado HLS. The ug871 tutorial introduces how to use HLS and optimize my C/C++ code, but I want to know how to load the result onto my board, a Zynq 7020, and run it there.

What I want to implement is: the host (the CPU on the board) calls the PL (FPGA) to do a calculation and sends it the parameters, then the PL sends the result back to the CPU.

For example, take a C function add(int* a, int* b) that adds a[i] and b[i] element by element and returns an array int* result. Through HLS, I can unroll the for loop so the calculation is faster. The CPU sends the addresses of a and b to the PL, the PL calculates, and sends the result address back to the CPU.
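For concreteness, the kind of function described here might look like the minimal sketch below. The fixed length N, the output-array signature, and the unroll factor are assumptions made for illustration. The pragma is a Vivado HLS directive; an ordinary C compiler ignores it (possibly with a warning), so the same file can be checked on the host first.

```c
#define N 8 /* hypothetical fixed length; HLS needs a compile-time loop bound */

/* Element-wise add of two arrays. The result is written to a caller-provided
   array instead of being returned as a pointer, since hardware has no heap. */
void add(const int a[N], const int b[N], int result[N])
{
    int i;
    for (i = 0; i < N; i++) {
/* Ask HLS to replicate the loop body 4 times (assumed factor). */
#pragma HLS UNROLL factor=4
        result[i] = a[i] + b[i];
    }
}
```

How the arrays reach the PL (registers, DMA, shared memory) is a separate decision from the loop optimization.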

The tutorial only covers how to use HLS. It doesn't explain how the PL and CPU communicate, or how to load the design onto the board so it can run there.

Please recommend a tutorial or tell me where to learn it, thanks a lot!!

happybunnie_wy
  • How do you connect a zynq 7020 via PCIe whereas this chip has no PCIe? – Paebbels May 11 '15 at 07:01
  • Umm...Sorry... but I don't really understand what you mean?@Paebbels – happybunnie_wy May 12 '15 at 11:46
  • XillyBus is a PCI Express IP Core for FPGAs that comes with a PCIe DMA driver. But your Zynq 7020 FPGA has no PCIe interface?!? – Paebbels May 12 '15 at 12:39
  • @Paebbels Oh...Now I understand...that means I can't use Xillybus on my board. But how can I use vivado HLS to implement coprocessing on my board? I've changed my question, please give me some suggestions, really thanks a lot!!! – happybunnie_wy May 16 '15 at 16:48
  • @happybunnie_wy did you find a tutorial or reference for what you were trying to do? I am also trying to use Vivado HLS to create an IP that inputs data from memory (in the form of arrays), operates on them, and then stores the result in memory. From the information that I have looked at so far, including an AXI Stream Interface on the IP and using an AXI DMA seems like the best option. How would you manage the communication? What would be the programming sequence in the bare-metal application and what would be the structure of the C code used to generate the IP? – user3482357 May 20 '16 at 16:58

2 Answers

It's a rather complex subject, since there are many variants of the solution. This part is covered by chapter 2 of ug871, but unfortunately it uses EDK instead of Vivado. The Vivado HLS concepts are the same though. You can also have a look at xapp890.

Basically, the Zynq uses AXI ports to connect to the PL. An AXI port is a classic address+data bus. There are two types of AXI, standard and Lite. The Lite version doesn't support bursts, focuses on using less area at the cost of performance, and is typically used for register interfaces. Standard AXI has very high performance and supports bursts; you typically use it to connect to a DDR memory.

The Zynq has several AXI ports, both master and slave. The slave ports allow your IP to read/write the memory space of the Zynq. The master ports allow the Zynq to read/write the memory space of your cores. The ports differ in performance: the GP ports should be used for low-performance AXI-Lite, and the HP ports for IP that needs more direct access to the Zynq DDR memory.

The simplest way to connect your IP is using AXI-Lite. In Vivado HLS, define register a at address 0, register b at address 4 and register c (the answer) at address 8. The function add would look something like:

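int add(int a, int b)
{
    // Base address is configured in the Vivado block design.
    // volatile keeps the compiler from caching or reordering register accesses.
    volatile int *my_ipaddr = (volatile int *)MY_IP_BASEADDR;

    *(my_ipaddr+0) = a;      // byte offset 0: register a
    *(my_ipaddr+1) = b;      // byte offset 4: register b
    return *(my_ipaddr+2);   // byte offset 8: register c (the answer)
}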
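Reading c right after writing b only works if the core produces its result immediately. A minimal sketch of a variant that starts the core and waits for it, assuming a hypothetical control register at byte offset 12 (bit 0 = start, bit 1 = done). That register, the names add_hw, reg_bus and fake_ip, and the whole register map are made up for illustration; a real core defines its own layout. The register accessors are passed in as function pointers so the same code can be exercised against a software model before touching hardware.

```c
#include <stdint.h>

/* Hypothetical register map (byte offsets); not taken from any real core. */
#define REG_A      0u
#define REG_B      4u
#define REG_C      8u
#define REG_CTRL   12u
#define CTRL_START 0x1u
#define CTRL_DONE  0x2u

/* Register access hooks. On hardware these would wrap volatile pointer
   accesses; in a test they can point at a software model. */
typedef struct {
    void     (*write)(void *ctx, uint32_t off, uint32_t val);
    uint32_t (*read)(void *ctx, uint32_t off);
    void      *ctx;
} reg_bus;

/* Write the operands, start the core, poll until done, read the result. */
int add_hw(const reg_bus *bus, int a, int b)
{
    bus->write(bus->ctx, REG_A, (uint32_t)a);
    bus->write(bus->ctx, REG_B, (uint32_t)b);
    bus->write(bus->ctx, REG_CTRL, CTRL_START);
    while ((bus->read(bus->ctx, REG_CTRL) & CTRL_DONE) == 0u) {
        /* busy-wait; an interrupt could replace this loop */
    }
    return (int)bus->read(bus->ctx, REG_C);
}

/* Software stand-in for the adder core: four 32-bit registers. */
typedef struct { uint32_t regs[4]; } fake_ip;

static void fake_write(void *ctx, uint32_t off, uint32_t val)
{
    fake_ip *ip = (fake_ip *)ctx;
    ip->regs[off / 4u] = val;
    if (off == REG_CTRL && (val & CTRL_START) != 0u) {
        /* Model the computation finishing as soon as start is written. */
        ip->regs[REG_C / 4u] = ip->regs[REG_A / 4u] + ip->regs[REG_B / 4u];
        ip->regs[REG_CTRL / 4u] = CTRL_DONE;
    }
}

static uint32_t fake_read(void *ctx, uint32_t off)
{
    return ((fake_ip *)ctx)->regs[off / 4u];
}
```

On the board, the two hooks would be replaced by functions that read and write through a volatile pointer based at MY_IP_BASEADDR.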

I don't use Vivado HLS myself, so I'm not sure exactly how to define those registers there, but from skimming through ug871, it does cover the AXI-Lite register interface.
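On the HLS side, a minimal sketch of a top-level function whose arguments and return value are exposed as AXI-Lite registers could look like this. It assumes the s_axilite mode of the Vivado HLS INTERFACE pragma; as far as I know the tool picks the register offsets itself and lists them in the generated driver files, so they will not necessarily be 0, 4 and 8.

```c
/* Top-level function to synthesize with Vivado HLS.
   A normal C compiler ignores the pragmas, so this also runs on the host. */
int add(int a, int b)
{
/* Map both arguments and the return value onto an AXI-Lite slave interface. */
#pragma HLS INTERFACE s_axilite port=a
#pragma HLS INTERFACE s_axilite port=b
#pragma HLS INTERFACE s_axilite port=return
    return a + b;
}
```

Putting the return port on AXI-Lite should also place the block's start/done control signals in the same register interface, which is what lets the CPU start the core and poll for completion.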

A third type of AXI is called AXI-Stream. It is a communication bus without addresses: only data is present, with some flags to synchronize the stream. It's usually used between cores that don't really care about addresses, or with an AXI-DMA engine. The main problem is that you can't connect AXI-Stream directly to the Zynq, AFAIK.

An example application is xapp890, although they use the Video-DMA core since it's a video application. It provides a higher-performance solution. In your example, the core would have an input slave AXI-Stream to receive a/b, and an output master AXI-Stream to return c. You would connect the core to an AXI-DMA IP core, and the pseudocode would be:

void add(int *ab, int *c, unsigned int length)
{
    // Placeholder names, not the actual driver functions
    XAxi_Dma_Start_Transfer((void *)ab, length, CHANNEL_MM2S); // MM2S = memory to stream
    XAxi_Dma_Start_Transfer((void *)c, length, CHANNEL_S2MM); // S2MM = stream to memory

    while (XAxi_Dma_Transfer_Done(CHANNEL_S2MM) == 0) {} // Wait until the result has landed in memory
}
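The pseudocode passes a and b as a single buffer ab without saying how they are laid out. A minimal sketch, assuming an interleaved layout (a[0], b[0], a[1], b[1], ...), which is only one possible choice. The names pack_ab and add_stream_model are invented here; the model function just mimics in software what the streaming core would compute, which is handy as a reference when checking the DMA results.

```c
#include <stddef.h>

/* Interleave a and b into one buffer of 2*n ints for the MM2S transfer. */
void pack_ab(const int *a, const int *b, int *ab, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        ab[2 * i]     = a[i];
        ab[2 * i + 1] = b[i];
    }
}

/* Software reference of the core: consume pairs, emit one sum per pair. */
void add_stream_model(const int *ab, int *c, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        c[i] = ab[2 * i] + ab[2 * i + 1];
    }
}
```

With this layout the ab buffer is twice the size of c, so the two transfers would need different lengths.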

This is a lot of information, but hopefully it will help you understand the application notes. To summarize, your IP has to provide AXI (Lite, Standard or Stream) interfaces to exchange data, which you connect to the Zynq AXI ports. Additionally, your IP can also have an interrupt signal.

Jonathan Drolet

As Jonathan noted, it is a rather complex subject. You can do all the communication between PL and CPU/RAM on your own (and don't forget driver development), but you can also try existing tools. For example, we have tried RSoC Framework, but more such "frameworks" probably exist.

Viktor Puš