4

I'm working on bringing Linux to a custom Cortex-M7 board with 16 Mb of SDRAM and 64 Mb of Flash. The platform has no MMU, no shared libraries, and uses FLAT executables.

I'm having problems booting a Busybox system with very simple init.d shell scripts. The system runs out of memory when executing simple shell commands like "[" or "printf". It turns out that every time one of these commands is executed, the system needs to load the FULL, one and only busybox executable (650 Kb on my system).

So the question is: if the system always has to load a huge executable into memory for each command implemented inside busybox, how is this convenient? I don't see the point of saving a few megabytes of cheap and abundant storage while running out of RAM extremely fast, but maybe I'm overlooking something.

Is my platform a use case for Busybox? If not, is there anything to conveniently build Linux system utilities, each as its own executable?

Thanks in advance!

EDIT:

Busybox, according to its own project page, "has been written with size-optimization and limited resources in mind" and so became a sort of unquestioned de facto standard in embedded systems. But how does that statement relate to the aforementioned issues on RAM-constrained (not storage-constrained) systems? I believe this deserves some clarification.

Follow up, system details:

The kernel is already compiled for XIP, executing from the 64 Mb external flash. The entire read/write ext3 root filesystem (including the busybox binary) currently resides on a micro SD card. The busybox executable uses the FLAT format ("bFLT") with the load-to-RAM bit enabled. That bit seems to cause a new load into a different memory block each time a concurrent command runs, until the blocks large enough to fit it are exhausted. The advice to put busybox (the entire /bin and /sbin) on an XIP filesystem is brilliant and will surely improve execution speed (of course, this new filesystem will need to reside on the 64 Mb external flash). I have never tried executing a "bFLT" in place from such a filesystem (nor do I know whether it works), but I'll do my research and testing on that.
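
To confirm whether a given binary really has the load-to-RAM bit set, the bFLT header can be inspected on the build host. The following is a minimal sketch, assuming the 64-byte big-endian header layout and flag values from the kernel's `flat.h`; the function name `parse_flat_header` and the returned dictionary keys are hypothetical, chosen only for illustration.

```python
import struct

# bFLT header as laid out in the kernel's flat.h: a 4-byte magic,
# ten big-endian 32-bit fields, then 20 bytes of filler (64 bytes total).
FLAT_HEADER = struct.Struct(">4s10I20x")

# Flag bits from flat.h.
FLAT_FLAG_RAM = 0x0001     # load program entirely into RAM
FLAT_FLAG_GOTPIC = 0x0002  # program is PIC with a GOT
FLAT_FLAG_GZIP = 0x0004    # everything but the header is compressed
FLAT_FLAG_GZDATA = 0x0008  # only data/relocs are compressed (for XIP)


def parse_flat_header(blob):
    """Return a dict describing the bFLT header at the start of blob."""
    if len(blob) < FLAT_HEADER.size:
        raise ValueError("file too short for a bFLT header")
    (magic, rev, entry, data_start, data_end, bss_end,
     stack_size, reloc_start, reloc_count, flags,
     build_date) = FLAT_HEADER.unpack_from(blob)
    if magic != b"bFLT":
        raise ValueError("not a bFLT executable")
    return {
        "rev": rev,
        "data_bytes": data_end - data_start,
        "bss_bytes": bss_end - data_end,
        "stack_bytes": stack_size,
        "load_to_ram": bool(flags & FLAT_FLAG_RAM),
        "gotpic": bool(flags & FLAT_FLAG_GOTPIC),
        "gzip": bool(flags & FLAT_FLAG_GZIP),
        "gzdata": bool(flags & FLAT_FLAG_GZDATA),
    }


if __name__ == "__main__":
    import sys
    # Usage: python flathdr.py path/to/busybox
    with open(sys.argv[1], "rb") as f:
        print(parse_flat_header(f.read(FLAT_HEADER.size)))
```

If `load_to_ram` or `gzip` is reported as set, the text segment cannot be executed in place from flash, whatever filesystem holds the file.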

royconejo
  • Try to load the full `[` or `printf` from bash. – KamilCuk Aug 05 '19 at 14:26
  • First question: are you trying to run Linux on a Cortex-M7? Second question: busybox/u-boot are not required to load and start Linux, it's pretty easy to do that, so what is it you want to get from a bootloader? These are quite bloated bootloaders. Do you absolutely need a bootloader, and if so, what features do you need? It seems the system engineering hasn't been done. I suspect you either need to choose a different processor/family or choose a different software solution for the processor you have, but I can't tell from your question. – old_timer Aug 06 '19 at 14:52
  • @old_timer yes, I'm running linux on a Cortex-M7, I believe it's a common MMU-less target. I'm using a custom, extremely simple bootloader similar to afboot-stm32 but with graphics subsystem initialization, but even if I was using u-boot, I'm afraid I don't get your point about the bootloader since I've plenty of flash storage. Surely I'll be happier with a MMU and plenty of memory, but I need to make this work as it is. The system is capable, It's just there is not much information available about running linux on MMU-less architectures. – royconejo Aug 08 '19 at 01:25
  • That's part of the problem: running Linux on anything non-standard/uncommon is a problem, and the bootloader adds to the fun. Did ARM port this, or some hobbyist, etc.? The M7 itself is almost irrelevant; most of the work is in the non-ARM portions of the chip. There is some ARM-specific stuff, but the majority is not. Is there a known working port of both a bootloader and a Linux to that chip? Is Linux required? What about an RTOS, of which there are definitely some ported to MCUs like that? Is an OS required at all, is the first question... – old_timer Aug 08 '19 at 02:39
  • I don't port operating systems, I run bare metal, but porting to a new platform is not a simple thing; for someone very experienced at it, it can take months. And that's for the easy ones with an MMU and basically a full-blown Linux. For a scaled-down one, or an ancient one that is still around for MMU-less platforms, it may or may not be more or less work. If someone has already ported these things to your chip and possibly board, then it's a how-to thing. It might be worth finding a board/system/Linux combination that has been ported. – old_timer Aug 08 '19 at 02:43
  • More questions than answers; possibly too big of a topic for this forum. You are stuck at the bootloader stage for a particular chip/board. What is the specific chip, and is it a commercial board you can talk about or a product in development that you can't? Then from there, where did you get busybox? Are you porting it, or is it supposedly already ported, etc.? – old_timer Aug 08 '19 at 02:44
  • @old_timer STMicroelectronics developed specific drivers for their own Cortex-M4/M7 parts, which I'm using and which are part of the mainline kernel. Choosing another OS is way outside the scope of my question. Specifically, I'm looking for best practices to build a Linux system on MMU-less, memory-constrained architectures; choosing which coreutils to use, how to compile them and where to execute them from is just one aspect I'd like to improve upon. – royconejo Aug 08 '19 at 06:02
  • Dear @conejoroy, it's been more than a year since I started learning about embedded Linux on MMU-less architectures, and I have tried very hard to run a full Linux system on Cortex-M platforms such as the stm32f429 discovery board and my own custom-designed stm32h7 board. I still have many problems and questions, and I'm struggling a lot. I was wondering what you think about collaborating to bring Linux to MMU-less platforms and starting a blog or some kind of book to make it easier for other people to work with Linux on MMU-less platforms... I have many cool ideas. Can I have your email address? – Mahyar Shokraeian Aug 15 '21 at 12:37

2 Answers

6

TL;DR: Linux has a huge infrastructure and a variety of rootfs or boot file systems available. The variety exists to accommodate different system constraints and end-user functions. Busybox is a good choice for the target system, but any software can be misused if a system engineer doesn't spend time to understand it.


Is my platform a use case for Busybox?

It is if you take time to minimize the kernel size and busybox itself. It is unlikely you need all features in your current busybox.

if not, is there anything to conveniently build linux system utilities each on their own executable?

See the klibc information below. You can also build dash against musl, or use buildroot with busybox. Many filesystem builders support shared libraries or static binaries. However, a filesystem builder may target many other goals as well, such as package management and live updates.

More Details

You can configure features out of busybox. The idea is that all of the configured features are needed, and therefore you need them all in memory. With busybox, ls, mkdir, printf, etc. are all the same binary, so when you run a shell script, one load of the code serves every command. The other way, you have many separate binaries and each will take extra memory. You need to minimize Linux to get more RAM, and you can take features out of busybox to make it smaller. Busybox is like a giant shared library; or, more accurately, a shared process. All code pages are the same.
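
The "one binary, many commands" idea can be sketched as a dispatch on the name the program was invoked under. This is only an illustration of the multi-call principle, not BusyBox's actual code; the applet table and the function names (`run_applet`, `applet_echo`, and so on) are hypothetical.

```python
import os
import sys


def applet_true(args, out):
    return 0


def applet_false(args, out):
    return 1


def applet_echo(args, out):
    out.write(" ".join(args) + "\n")
    return 0


# Every configured applet lives in the same program image.
# Removing an entry here is the analogue of configuring a feature out.
APPLETS = {
    "true": applet_true,
    "false": applet_false,
    "echo": applet_echo,
}


def run_applet(argv, out=sys.stdout):
    """Pick the applet from the name used to invoke the program (argv[0])."""
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    if applet is None:
        out.write(name + ": applet not found\n")
        return 127
    return applet(argv[1:], out)


if __name__ == "__main__":
    sys.exit(run_applet(sys.argv))
```

On a real system, `/bin/echo`, `/bin/true` and the rest are links to the single executable, so whether the code is shared between concurrent commands depends on how the kernel loads that one file.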

a custom Cortex-M7 board with 16 Mb of SDRAM and 64 Mb of Flash

...

one and only busybox executable (650 Kb on my system)

Obviously 650KB is far less than 16MB. You don't say what the rest of the RAM is used for. For another good alternative, look at the klibc tool suite. What is not clear is whether the flash is NAND or NOR and whether you have XIP enabled. Generally, busybox would be better with XIP flash, and klibc would be better (and more limited) for SDRAM only, with some filesystem in flash.

See: Memory used by relocatable code, PIC, and static linking in the Busybox FAQ. Busybox is designed to run from read-only memory, which can be a big gain depending on system structure. It probably provides a richer feature set than klibc, as the goal of that project is just to boot some other mount device (a hard drive, SSD, etc.).

Klibc does not have as much documentation as busybox. It can be either a shared library or statically linked. With static linking, each binary will only use the RAM needed for its task, but this will take more flash space. The binaries built with klibc are:

 1. dash    2. chroot     3. dd      4.  dmesg  5.  mkdir  6.  mkfifo
 7. mknod   8. pivot_root 9. umount  10. true   11. false  12. sleep
 13. ln    14. ls        15. mv      16. nuke   17. minips 18. cat
 19. uname 20. halt      21. kill    22. cpio   23. sync   24. readlink
 25. gzip  26. losetup

and that is IT! No networking, no media players, etc. You can write code to use klibc, but it is a very constrained library and may not have features that you require. Generally it would be limited to disk-only tasks. It is great for probing USB for an external device to boot from, for example.

Busybox can do a lot more. Most klibc static binaries will be under 100kB; with 10-30kB typical. Dash and gzip are larger. However, I think you need to remove configuration items from your kernel as 650KB << 16MB and busybox would be a fine choice for this system even without XIP.

It should also be noted that Linux does 'demand page loading' for code on a system with an MMU. Even if you don't have swap, code can be kicked out of RAM and reloaded later with a page fault. Your system is no-MMU, so busybox will not perform as well in this case. With an MMU and 'demand page loading' it will do much better.

For severe constraints, you can always code a completely library-free binary. This avoids the libgcc startup and support infrastructure, which you might not need. Generally, this is only good for testing a kernel vs. initrd issue and for a script or binary that must run in many different library environments.

See also:

  • AXFS - xip read-only file system.
  • CramFs - another xip file system.
  • XIP kernel - the kernel can be huge. Get it out of RAM if possible. Configure with EMBEDDED option if not.
  • nommu.org - some information on github
  • elf2flt - Mike Frysinger's updates to binutils 2.27-2.31.1
  • fdpic gcc - notes from 2016 by mickael-guene.

XIP can only work with ROM, NOR flash and possibly SPI-NOR MTDs.

artless noise
  • Excellent! Your suggestion about XIP and XIP filesystem is brilliant. Please see my edit with the follow up and system details. – royconejo Aug 05 '19 at 23:27
  • "Your system is no-MMU, so busybox will not perform as well in this case. With an mmu and 'demand page loading' it will do much better." Yes I'm aware of that. That's why I believe my question is valid since nothing known can be assumed when working on a non-MMU platform. – royconejo Aug 05 '19 at 23:31
1

The point of BusyBox is that you need to have only one (shared) executable in memory.

You wrote:

It turns out that every time one of these commands is executed, the system needs to load the FULL, one and only busybox executable (650 Kb on my system).

That's not entirely true. For the first command the executable is loaded in memory indeed. But if you're running multiple commands (i.e. multiple BusyBox instances), the executable is not loaded into memory multiple times. A large part of the binary (simply said: all read-only data, such as the executable code and constant data) will be reused. Each additional process only requires some additional memory for its own tasks.

So if a single instance of BusyBox consumes 640 kB of memory, it's possible that 10 instances together only use 1 MB of memory, while 10 unique executables of 200 kB would use 2 MB.
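
The arithmetic behind these figures can be written out as a small sketch. The split of the 640 kB instance into 600 kB of shareable read-only content and 40 kB of private data is an assumption picked so that the numbers match the example above; it is not a measurement, and the function name `total_kb` is hypothetical.

```python
def total_kb(instances, shared_kb, private_kb, text_shared):
    """Total memory for several instances of one executable.

    shared_kb:   read-only part (code, constants) that can be shared
    private_kb:  per-process part (data, bss, stack)
    text_shared: True if the read-only part exists once for all instances
    """
    if text_shared:
        return shared_kb + instances * private_kb
    return instances * (shared_kb + private_kb)


# Assumed split of one 640 kB BusyBox instance (not measured).
SHARED_KB, PRIVATE_KB = 600, 40

# Read-only part shared between 10 instances: 600 + 10 * 40 = 1000 kB.
print(total_kb(10, SHARED_KB, PRIVATE_KB, text_shared=True))

# Ten unrelated 200 kB executables share nothing: 10 * 200 = 2000 kB.
print(total_kb(10, 200, 0, text_shared=False))

# Same BusyBox instances, but every one loaded whole: 10 * 640 = 6400 kB.
print(total_kb(10, SHARED_KB, PRIVATE_KB, text_shared=False))
```

The third case is the situation described in the question, where each command gets its own full copy of the binary in RAM.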

I would recommend running some realistic tests on your system and checking the actual memory consumption. But note that tools such as ps or top can be a bit misleading or difficult to understand. A lot has been written about that already, and if you would like to know more about that, a good starting point would be here.
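
One simple way to run such a test is to capture `/proc/meminfo` on the target before and while the commands run, then compare the captures on a development machine. The sketch below assumes the usual `Name:   value kB` line format; the sample values and the function name `parse_meminfo` are hypothetical. Note that a free-memory total says nothing about how fragmented that memory is, which matters when a whole executable has to fit in one contiguous block.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


# Hypothetical captures taken on the target, before and during a script run.
before = parse_meminfo("MemTotal:       14520 kB\nMemFree:         9876 kB\n")
during = parse_meminfo("MemTotal:       14520 kB\nMemFree:         7828 kB\n")

print("consumed kB:", before["MemFree"] - during["MemFree"])
```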

wovano
  • This may be true with an MMU. It is not with no MMU. Certainly all data pages must load. Also, some options, like shared libraries, might cause the code pages to reload at different addresses. – artless noise Aug 23 '19 at 20:58
  • Some of the issues are in user space, such as the loader and libc in use. I agree that busybox can work like you say. However, as the OP points out, this is not always the case. Perhaps uClibc was not configured correctly? – artless noise Aug 24 '19 at 00:04
  • I have to admit I don't know that many details about how Linux loads the process in memory on different systems. I tried to explain the general principle of BusyBox, which explains the statement that it "has been written with size-optimization and limited resources in mind". For a system where this principle does not work, I would seriously question whether BusyBox has any advantage at all... So, without knowing all the details and purely based on this information, I would say that BusyBox is not the best choice for this platform then, at least not for the reason of memory optimization. – wovano Aug 24 '19 at 06:33
  • @artlessnoise you are correct, what is stated in the answer above (and on the BusyBox project page) is not correct when running on an MMU-less architecture. Furthermore, there is a complex method to use "shared libraries" on MMU-less kernels (shared FLAT binaries, I think) that is not supported by buildroot and, in my experience, impossible to achieve by using crosstool/busybox alone. So I believe my question still remains valid. – royconejo Aug 29 '19 at 16:12
  • @conejoroy FLAT should be supported by buildroot. However, this is complex. You need kernel support, compiler/linker/binutils support, and uClibc support for FLAT. I don't think glibc can support it, and it might not be in musl or other libraries. All three have to be right, and probably no one is actively looking at it, so support might be broken for particular version combinations. busybox will only *work as advertised* with FLAT on a non-MMU Linux. – artless noise Aug 30 '19 at 18:14
  • @artlessnoise I have a working toolchain/uClibc/kernel with FLAT support. And Buildroot is not working as advertised when FLAT is being loaded into RAM (as happens when booting from SD or without XIP support). With XIP enabled (as when executing from ROM), executable code doesn't need to be loaded, just executed, but this is true for ANY executable, even for many coreutils each in their own executable. – royconejo Aug 31 '19 at 19:17
  • There is another recommendation to use [fdpic binaries](http://www.aerifal.cx/~dalias/binfmts.html). The last kind of work I see on this is about 2016. At that point I know people had Linux running on Cortex-M and I don't believe there was an issue to share code then. It is quite possible that things are in disrepair. The only point I was trying to make is that it is possible for this to work; not that it should work out of the box for you. I understand that can be frustrating. – artless noise Sep 01 '19 at 22:48