
I have an ARM Cortex-M0+ MCU (SAML21) that doesn't have a memory management unit or external memory interface, but I would like to use an existing library (libavif + libaom) to manipulate data that doesn't fit in the internal SRAM. In this case, a single uncompressed image frame won't fit in 40kB.

If I were to implement an image compression library myself given these constraints, I would store the image data in a large external RAM chip and access its contents over SPI while compressing the image.

Instead of spending weeks rewriting libaom to access memory through some kind of spi_ram_write/spi_ram_read functions, are there any secret flags/tricks I can use with arm-none-eabi-gcc to get it to replace memory accesses in certain sections of code with function calls (address and data as arguments)? This would obviously be incredibly slow, but that's fine for my application.

This is fundamentally the same as this question but I would like to know if it's possible to do in software rather than using linker options and the non-existent MMU.

unknownperson
  • They both seem to be available under a BSD license ([libavif](https://github.com/AOMediaCodec/libavif)/[libaom](https://github.com/darkskygit/libaom/blob/master/LICENSE)) which allows modifications – so at first glance it *might be* possible. The questions remaining then are whether the compiled code fits into flash/persistent storage and how much stack it would use (-> stack overflow!). You'd be writing your own port of the library then, though, which might mean quite an effort. – Aconcagua Dec 19 '22 at 07:35
  • You might give it a try and just compile the libraries; failing code gives you quite some information, like missing headers (of which you likely need to provide your own versions, together with your own implementations of the functions used from them) or incompatible assembler code, and finally the linker would tell you which functions you failed to implement, too. – Aconcagua Dec 19 '22 at 07:42
  • 2
    Why can't you store the image in flash? That's what everyone else does... – Lundin Dec 19 '22 at 07:46
  • @Aconcagua I've successfully cross-compiled the libraries already, which wasn't too big of a deal. They should work fine with smaller images that fit in the 40kB of RAM, but will run into runtime issues with anything bigger. – unknownperson Dec 19 '22 at 07:57
  • @Lundin Storing the image (as in picture, not firmware image) in flash will cause the flash to wear out over time and the MCU will prematurely die. – unknownperson Dec 19 '22 at 07:58
  • Yes I realize that you meant a picture. Everyone stores those in flash. Why do you need to change it and store the changes in NVM? – Lundin Dec 19 '22 at 08:01
  • @Lundin The images come from a camera, which would produce raw frames, which would then need to be debayered and compressed. This would need to be repeated every time I want to capture an image. Although flash storage is larger, I don't need non-volatility for this data, and its limited life would be a problem. – unknownperson Dec 19 '22 at 08:03
  • How large are these images anyway? You might possibly spare all this effort (assume how much your wages would be for the time invested) and might swap to a MCU with more memory (e.g. STM offers some with up to 1.4M). I don't assume you're creating a mass product, so it would be a one-time investment anyway. – Aconcagua Dec 19 '22 at 08:31
  • @unknownperson A camera!? Well you very likely need some external memory then, since it sounds very unlikely that the photos will fit internally in a M0 MCU. Unless you can chew through the image in small portions, it sounds like you picked the wrong MCU for the task. – Lundin Dec 19 '22 at 08:40
  • The image sensors I'm looking at are about 1M pixel. I could fit VGA resolution into some of the larger MCUs (like OpenMV does), but given the very low image rate requirements, the M0 one I have should be capable of it given the correct software and the external RAM. I may end up simply reading the full frame into the external RAM and compressing it as smaller individual images. – unknownperson Dec 19 '22 at 08:49
  • Where do you write the target images to anyway? Maybe you might simply store them *uncompressed*? You might then compress later when fetching the images from there. – Aconcagua Dec 19 '22 at 09:14
  • I just tried compiling the most basic libavif encode example for my MCU and it uses over 3MB of flash and 300kB of RAM just for whatever static variables it needs. That wouldn't fit on all but the largest microcontrollers, and even then just barely. Encoding a 1MP image uses 62MB of heap, over 15x the space needed for a single frame. Needless to say these libraries are unbelievably complicated and not meant for embedded applications. Guess I'm going to have to use JPEG. – unknownperson Dec 19 '22 at 22:28

1 Answer


Possible to implement MMU functionality in software with GCC?

In general, to emulate an MMU, before every memory access you'd have to do something like:

    if (data_not_currently_present(fake_address)) {
        actual_address = fetch_the_data(fake_address);
    } else {
        actual_address = find_the_data(fake_address);
    }

This would destroy performance by a factor of around 1000.

I have an ARM Cortex-M0+ MCU (SAML21) that doesn't have a memory management unit or external memory interface, but I would like to use an existing library (libavif + libaom) to manipulate data that doesn't fit in the internal SRAM.

That's a little like saying you have a bicycle and want to transport 3 tons of rock on the highway. There are two choices here: you can call it the wrong CPU for the job, or you can call it the wrong job for the CPU.

Brendan
  • My design requirement is one 1MP image compressed per day, with almost nothing else for the CPU to do. Even if it is a "bad idea," I'd like to know if it's possible. Additionally, there wouldn't be a need for the address translation done by a typical MMU as long as affected pointers are initialized to valid addresses within the external RAM. – unknownperson Dec 19 '22 at 07:29
  • 1
    Hmm - in that case you might be able to do it yourself (without libraries). E.g. like `while( not finished) { fetch the next 20 KiB of the image, process/compress it, write the next 10 KiB of compressed data out }` , maybe. – Brendan Dec 19 '22 at 07:32
  • I suppose that could work... I'd have to get into the details of the way the compression is done to see if such partial compression is possible. I was originally planning on doing JPEG compression which operates on 8x8 pixel blocks, and this technique is why I found this MCU sufficient in the first place, but after seeing the improvement in performance with AVIF, I'm thinking about trying to switch. – unknownperson Dec 19 '22 at 07:36
  • @unknownperson '*seeing the improvement in performance*' – performance in which respect??? Are you targeting better compression rates or higher image quality? – Aconcagua Dec 19 '22 at 07:51
  • @Aconcagua Better compression rate (given same quality) and higher image quality (given same file size) are more or less the same thing, no? AVIF has significantly better visual image quality at the same file size as JPEG, especially at high compression ratios. – unknownperson Dec 19 '22 at 07:55
  • @unknownperson They are closely related, but not necessarily the same. They might differ in *which* information gets lost in the end, so an image might retain better *visual* quality despite having dropped more information (I recall a prof at university demonstrating wavelet compression vs. JPEG where that was the case, though I didn't dig in deeper 20 years ago). Just discovering now: [JPEG2000](https://en.wikipedia.org/wiki/JPEG_2000) actually uses wavelet compression, so it might be of interest to you as yet another alternative. – Aconcagua Dec 19 '22 at 08:04