I am writing a device driver for a Fibre Channel card that will move large amounts of data. The card can act as a PCI bus master and will DMA the data into system memory. This is on an x86_64 Linux system running kernel 3.0.35.
I first tried allocating the buffers using kmalloc(), but found that I could not allocate buffers large enough. As a learning exercise, I continued development with small buffers and got the driver working in that case: I allocated small buffers with kmalloc() (with the GFP_DMA flag), passed the address returned by kmalloc() to dma_map_single(), and wrote the DMA address it returned to the registers on the card.
I am now trying to modify the driver to use large buffers, on the order of 200 MB. I have reserved a block of memory at boot time using the mem= kernel parameter, and I map the reserved memory using ioremap(). I have mmap-ed the memory to user space and verified that the user application can write to the mmap-ed region and that the driver can read the data using the virtual address returned by ioremap(). But I have not been able to get the board to DMA data to my buffers.
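For reference, the user-space side of that check can be sketched roughly as follows. This is a minimal sketch, not my exact test program: the function name fill_and_verify() is made up for illustration, and the path passed in is assumed to be whatever device node the driver exposes for mmap (any hypothetical name). It only shows that the CPU can write and read the mapping; it says nothing about whether the card can reach the same memory.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of `path`, write a counting
 * pattern through the mapping, and read it back.
 * Returns 0 if the pattern reads back intact, -1 on any failure.
 */
int fill_and_verify(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    uint8_t *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    /* Write a simple byte pattern across the whole region. */
    for (size_t i = 0; i < len; i++)
        p[i] = (uint8_t)(i & 0xff);

    /* Read the pattern back through the same mapping. */
    int rc = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != (uint8_t)(i & 0xff)) {
            rc = -1;
            break;
        }
    }

    munmap(p, len);
    return rc;
}
```

In my case the driver then reads the same pattern through the ioremap() virtual address, which is how I know the CPU side of the mapping is working.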
I have tried to map the reserved region with dma_map_single(). I first passed the virtual address returned by ioremap(). I thought this was the correct address because the DMA-API-HOWTO file says:
"the driver can give a virtual address X to an interface like dma_map_single(), which sets up any required IOMMU mapping and returns the DMA address Z. The driver then tells the device to do DMA to Z, and the IOMMU maps it to the buffer at address Y in system RAM."
But I got no data from the card. I have also tried passing dma_map_single() the physical address of the region, but that didn't work either. For the card's registers, I have tried writing the address returned by dma_map_single(), and I have tried writing the physical address. Neither worked.
Is there a step that I'm missing? Is my procedure in this case completely wrong? Can a card even perform DMAs to a memory region like the one I have reserved on an x86_64?
There is a possibility that I could upgrade the system to a 3.12.28 kernel. I understand that that kernel supports CMA. Would CMA work in this case?
Any help is appreciated.