10

I might have some misconceptions here, so bear with me.

I wrote a program that captures images from a camera. I share memory between the camera and my application with mmap, as described in the V4L2 documentation, and this works great. My processor (TI's DM3730) also has a DSP. I want to use the DSP, but it requires physically contiguous memory, which TI provides drivers to allocate. My problem is that I currently lose a lot of time copying the mmap'ed memory into the physically contiguous memory. Is there a way to tell mmap not to allocate memory itself, but to use memory that I allocate?

To give you an idea of what I am doing (a lot of code is missing, of course, but I stuck very close to the V4L2 documentation; I hope this is enough to understand my problem):

//reserve physically contiguous memory
dsp_buffer      = Memory_alloc(buffer_length, &myParams); 

...
//map the capture buffer; this memory is not physically contiguous
buffers[n_buffers].start =
     mmap (NULL ,                    /* start anywhere */
     buf.length,
     PROT_READ | PROT_WRITE ,  /* required */                               
     MAP_SHARED ,              /* recommended */
     fd, buf.m.offset);

After that, whenever a frame is ready, I copy it from the non-contiguous memory into the contiguous memory.

...
//wait until frame is ready in memory
r = select (fd + 1, &fds, NULL, NULL, &tv); 
...
//copy the memory over to the physically contiguous memory
memcpy(dsp_buffer,buffers[buf.index].start,size); 
...

How could I get the frames into the physically contiguous memory right away?

Lucas
  • I don't know this particular CPU, does it have huge page support? If it does, you should try to `mmap` huge pages. Huge pages are guaranteed to be physically contiguous (first, within one huge page, and second, the pool of huge pages as such). – Damon Nov 28 '11 at 18:58
  • @Damon: I am not sure, let me get back to you on this. It might also be important to note that I am stuck on the 2.6.32 Kernel. – Lucas Nov 28 '11 at 19:07
  • The pool of huge pages is not contiguous - at least not on the x86 architecture. The allocation algorithms for regular and huge pages work exactly the same, allocating the desired size from PAGE_SIZE and similar macros and variables. – gnometorule Nov 28 '11 at 19:15
  • Never mind. I misread the above and grab my fool's cap. – gnometorule Nov 28 '11 at 19:21

2 Answers

3

If you cannot pass the result of Memory_alloc() as the first argument to mmap() (for example, if Memory_alloc() itself uses mmap(), which would make it impossible to map that memory again), you should probably use another streaming I/O method from the V4L2 capture example: the IO_METHOD_USERPTR variant. It uses the same ioctls as IO_METHOD_MMAP to capture frames and should be similarly efficient.

praetorian droid
  • This would fail if something was mapped at dsp_buffer already. You cannot use mmap to alias pages. – bdonlan Nov 28 '11 at 19:35
  • What is Memory_alloc()? Is dsp_buffer aligned on a page boundary? – praetorian droid Nov 28 '11 at 19:47
  • @praetorian droid: Memory_alloc() is a call to a kernel module that is provided by Texas Instruments to allocate physically contiguous memory. I don't know how it is implemented. – Lucas Nov 28 '11 at 20:03
  • "In addition to the memory translation issue, it is worth noting that DSP processes buffers and data aligned contiguously in memory while the ARM has the capability to work on fragmented buffers because of its MMU. Hence it is important to pass buffers and parameter information which are aligned contiguously in memory. DMAI API Buffer_create() or Codec Engine API Memory_alloc() [and Memory_contigalloc()] can be used to allocate contiguous memory buffers for function parameters from the CMEM module." [Texas Instrument wiki](http://processors.wiki.ti.com/index.php/C6Accel_Advanced_Users_Guide) – Lucas Nov 28 '11 at 20:09
  • If `Memory_alloc()` uses mmap (that would make it impossible to map to that memory again), you probably should use IO_METHOD_USERPTR variant from given example. – praetorian droid Nov 28 '11 at 20:29
  • * It uses the same ioctl as IO_METHOD_MMAP to capture frames and should provide the similar efficiency. – praetorian droid Nov 28 '11 at 20:46
  • Ha, thank you. This seems to work (I haven't thoroughly profiled yet). Do you know of any downsides I might have to expect by using usrptr instead of mmap? Do you mind to edit your comment about usrptr into your answer so people won't be confused by your current answer? Then I'll accept your answer. – Lucas Nov 28 '11 at 21:30
  • Sure. Updated (clean up). Ok? I don't know about any features of userptr method - never used it. – praetorian droid Nov 28 '11 at 22:34
2

You would need support from the camera driver. mmap gets the physical pages it maps from whatever driver it is mapping, which is the camera driver in this case. You cannot tell mmap to use pre-allocated pages, because the underlying driver would have to be told to use them.

bdonlan
  • Okay, so that means I have to allocate memory from the camera driver? – Lucas Nov 28 '11 at 19:45
  • [Linux Device Drivers chap 15](http://lwn.net/images/pdf/LDD3/ch15.pdf) (pdf) seems like what I am looking for ... – Lucas Nov 28 '11 at 19:57