How to start ARM Cortex programming using embedded C?

Question

I am familiar with 8051 C programming. Now I want to learn ARM Cortex-M3 programming. I have an STM32F103C8T6 development board with an ARM Cortex-M3 processor, its programmer, and the Keil compiler. I want to do small projects with it, for example blinking LEDs and SPI and I2C programming. I have little knowledge of the ARM architecture. Many people on blogs say to start programming directly instead of reading about the architecture or a hundred-page ARM datasheet. I don't understand how that is possible.

So what should be my first step?

Should I read the datasheet of the STM32F103C8T6 or the ARM Cortex-M3 user manual?
8051 and ARM programming differ a lot. In 8051 we don't need to add library/header files. In ARM we need to add many. Suppose I want to write a blinky program or learn SPI/I2C communication. In the Keil compiler or STM32CubeMX these header files are already there. But if I wish to do everything from scratch myself (writing the header file code for peripherals, I/O ports, and the SPI/I2C protocol code), is it really possible? If yes, what should I do?

I am very confused and frustrated, as I have not yet found a proper person to guide me on this.


score 11 · Answer 1 · answered Jun 29 '18 at 05:29

I have been there. After some time I realised that it's always better to start with the datasheet, even if it's your first board.

The datasheet gives a comprehensive description of how the board works, its pins and basic communication. It can be tedious, but it's worth it, and you will realise that when you start programming.

After that you can jump directly to the header files and look at the implementation of basic functionality. This will give you a lot of insight into optimisation techniques, programming style and best practices.

If possible, find some more code written for that board (I always fail here; it's hard if your board is rare).

With this you should be ready to write almost any code. Start with the blinky (the hello world for boards).

Also, from my experience, I want to tell you that it's okay if it takes time. Have patience and persistence.

score 5 · Accepted Answer · answered Jun 29 '18 at 09:31

It is possible (i.e. to start without detailed knowledge of datasheets and reference manuals) if you utilise existing library code to deal with architectural, start-up and peripheral driver issues. For ARM Cortex-M and STM32 specifically those might include (at various levels of abstraction and scope):

Often also commercial tool vendors (such as Keil, IAR, Rowley, Green Hills) provide example projects and driver libraries and middleware to get you started - often for specific development boards.

That is perhaps where your perception comes from that for ARM programming you

"need to add many library/header files."

You don't need to at all - but they are more complex parts than 8051 with extensive and complex peripheral sets that differ between parts and vendors, and you can save a great deal of time and effort by utilising such libraries.

Note that the ARM Cortex-M core itself does not include the microcontroller peripherals. Outside of the CPU, the NVIC interrupt controller and, on some higher-end parts, the FPU, everything is vendor specific and differs widely between vendors. That is why you need to either understand the vendor documentation or leverage libraries provided by the chip vendor, tool vendor or community.

If you want to fully understand the Cortex-M or the STM32 and get the most out of them, then there is no substitute for reading the reference material, but it is by no means necessary just to get started.

If you want an easier way in to Cortex-M than the ARM reference material, then Joseph Yiu's The Definitive Guide to ARM® Cortex®-M3 and Cortex®-M4 Processors is a good source, but unless you are writing low-level RTOS or bare-metal start-up code or other system-level code, you may not need that much material. The earlier M3-only edition of this book is available as a PDF in some places.

The chip vendor's reference manual, which describes the vendor-specific features such as memory interfaces, memory map, power-management features, flash memory programming, interrupt mapping and hardware peripherals, will perhaps be the more useful material.

For the STM32 specifically there is a somewhat broad guide by Trevor Martin of Hitex: The Insider's Guide To The STM32 ARM Based Microcontroller, just one of several publications by Trevor that may be useful.

@user3559780 : That is a different question and off-topic on SO (asking for resources, and purely a matter of opinion). Google and perhaps Amazon reviews are your friend. You should perhaps refer to "C on embedded systems" rather than "embedded C" - there is no special C for embedded systems, but there are programming techniques and issues specific to embedded systems. I am not the person to ask either, because I would always advise to use C++ in preference to C, albeit perhaps a subset of C++ suited to your target constraints. — Clifford, Jun 30 '18 at 19:18

score 3 · Answer 3 · answered Jun 29 '18 at 06:51

As a C programmer, you don't need to read about the CPU core architecture, although I recommend a brief read of the CPU manual. Knowing what registers there are and what resources there are, data cache, instruction cache etc, means you can write higher quality C code. This is however far more important with horribly bad cores like 8051.

As for the MCU peripheral hardware and memory, you do need to read every single line in the manual for the parts you intend to use. This includes the fundamentals like watchdog, clock setup, MMU, interrupt handling etc etc.

However, most tool chains for MCUs come with some manner of sugar-coating libraries nowadays. They'll give you a working project with most things set up. This means that you don't have to learn everything at once, but can do so bit by bit as the project goes on. To blink some LEDs, for example, you should be able to do that without reading anything but the GPIO part of the manual.

You might eventually have to replace the pre-made quick & dirty libraries with quality code. This is because the silicon manufacturers that provide these libraries are notoriously bad at writing software. In some cases they manage to give you proper MCU set-up code, but more often they give you low quality code.

This is roughly what an MCU setup should look like. You can use the list to verify whether what you've gotten from the manufacturer is at all useful, or whether it has to be rewritten by a professional embedded programmer. ARM CMSIS, for example, may or may not be good enough, depending on application requirements. It comes in different flavours too, depending on tool chain.

score 3 · Answer 4 · edited Apr 01 '21 at 08:17

This is a couple of working examples plus advice. If that is not what you are looking for then, in summary, don't bother...

I recommend you follow the various paths, and repeat that on a periodic basis. Professionally you want to be able to cover the range from just the datasheets/schematics, up to canned libraries from the vendors, and up to FreeRTOS or another solution. My personal preference is datasheets/schematics; I find this takes less time and is cleaner, faster, more reliable, etc. YMMV. First rule is the documentation is buggy, second the libraries are buggy (and scary if you look inside).

Some vendors do a relatively good job and some do not. With time you may find out for yourself which vendors you like. Professionally you will not be able to dictate one over another until well down the road, but for fun as a hobby you certainly can. So far I am not talking about arm vs 8051 vs other; they all have this datasheet vs library trade-off, buggy docs, etc. Anyway, there are times when you may have to dig into the library or various online open source examples to find the missing enable bit that is not documented anywhere, or to find out that some registers have to be programmed in a particular order, which someone else may have found through inside knowledge or just dumb luck.

So I'll just post a somewhat minimal example that should work with whatever flavor of arm-whatever-gcc/as/ld you have (arm-linux-gnueabi, arm-none-eabi, etc). And/or you can just build your gnu tools from sources as I do periodically. Best to start with a pre-built toolchain though, and worry about one issue at a time.

The board you have shown in your picture also has a particular community name: it's the "blue pill" or "stm32 blue pill". And unlike some other inexpensive boards you can find on eBay or elsewhere, this one has had community work done to fit it into the Arduino sandbox, which is yet another avenue to pursue for knowledge and experience. Taking that Arduino path you don't need to read much of anything: google stm32 blue pill, take some very generic few lines of code to their sandbox, and you have a blinking led.

You are going to want/need openocd with stlink support, so either find a binary or build from sources; it's pretty easy. My preference is to take the config files from the tcl directory, carry them with my project and modify them as needed, rather than hope they are there and have not changed from one version of openocd to another or from one machine I am on to another. YMMV, this is very much a personal preference thing. For example I use an el-cheapo jlink clone, a few bucks on eBay (purple board), and have a jlink.cfg:

interface jlink
transport select swd

The board in your picture is one of the stlink flavors, you can figure out through trial and error or lsusb or other means which one. It might be this for example:

interface hla
hla_layout stlink
hla_device_desc "ST-LINK/V2"
hla_vid_pid 0x0483 0x3748

In my case

openocd -f jlink.cfg -f target.cfg

where target.cfg is a single file that contains the various stm32f1xxx config files from openocd.

The blue pill has an led on PC13, port c pin 13 within that port.

The examples here are not just bare metal but include the bootstrap code, you don't need other code or headers, only these files, and a gnu arm compiler or cross compiler depending on if you are running this on an arm development machine (Raspberry Pi computer, etc) or something else (x86 based, etc). The code is designed for portability and other things, not so much as a library approach, a way to get you started in an "I can do this" fashion then you move on to your own personal preferences by examining more complicated solutions.

sram.s

@.cpu cortex-m0
@.cpu cortex-m3
.thumb

.thumb_func
.global _start
_start:
    ldr r0,=0x20001000
    mov sp,r0
    bl notmain
    b hang
.thumb_func
hang:   b .

.thumb_func
.globl PUT32
PUT32:
    str r1,[r0]
    bx lr

.thumb_func
.globl GET32
GET32:
    ldr r0,[r0]
    bx lr

.thumb_func
.globl dummy
dummy:
    bx lr

The stm32f103... is cortex-m3 based, which you should know by now because you got the STM32F103C8T6 documentation from st: the datasheet, with pinouts, packaging and part number info (which tells you how much flash and ram you have, among other things), and the reference manual in st terms, which has all the registers, the address space and such details (with some vendors the datasheet has all the registers and descriptions as well). Between these you find out this part contains a cortex-m3 from arm, so you go to arm's website and get the technical reference manual for the cortex-m3, in which you find it is based on the armv7-m architecture, and at arm's website you get the armv7-m architectural reference manual. This is your STARTING set of documentation for this CHIP. Then you may try to find a schematic, or some other way of figuring out that PC13 is where the led is.

So while the armv7-m supports the much wider array of thumb2 extensions to the thumb instruction set, I generally prefer to start with the traditional thumb instructions for getting started from a generic-ish framework as you will see below then if needed (for performance usually) change the build tools or code to allow in the armv7-m thumb2 extensions, YMMV.

blinker01.c

void PUT32 ( unsigned int, unsigned int );
unsigned int GET32 ( unsigned int );
void dummy ( unsigned int );

#define GPIOCBASE 0x40011000
#define RCCBASE 0x40021000

int notmain ( void )
{
    unsigned int ra;
    unsigned int rx;

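    ra=GET32(RCCBASE+0x18);     //RCC_APB2ENR
    ra|=1<<4; //enable port c
    PUT32(RCCBASE+0x18,ra);
    //config PC13 in GPIOC_CRH, 4 bits per pin
    ra=GET32(GPIOCBASE+0x04);
    ra&=~(3<<20);   //PC13 MODE bits cleared
    ra|=1<<20;      //PC13 MODE=01, output
    ra&=~(3<<22);   //PC13 CNF bits cleared
    ra|=0<<22;      //PC13 CNF=00, push-pull
    PUT32(GPIOCBASE+0x04,ra);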

    for(rx=0;;rx++)
    {
        PUT32(GPIOCBASE+0x10,1<<(13+0));
        for(ra=0;ra<400000;ra++) dummy(ra);
        PUT32(GPIOCBASE+0x10,1<<(13+16));
        for(ra=0;ra<400000;ra++) dummy(ra);
    }
    return(0);
}
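
The pin configuration in notmain() can also be read as a small pure computation on the register value. The sketch below is an illustration rather than part of the original example: the helper names crh_pin_output_pp, bsrr_set and bsrr_reset are hypothetical, and it assumes the STM32F1 GPIO layout that the code above relies on (four configuration bits per pin in CRH for pins 8 to 15, set bits in the low half of BSRR and reset bits in the high half).

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Return a new GPIOx_CRH value with `pin` (8..15) configured as a
   push-pull output: MODE = 01 (output), CNF = 00 (push-pull). */
uint32_t crh_pin_output_pp(uint32_t crh, unsigned int pin)
{
    unsigned int shift = (pin - 8u) * 4u;   /* 4 config bits per pin */

    crh &= ~(0xFu << shift);                /* clear MODE and CNF */
    crh |=  (0x1u << shift);                /* MODE = 01, CNF = 00 */
    return crh;
}

/* BSRR: writing bit n sets pin n, writing bit n+16 resets pin n. */
uint32_t bsrr_set(unsigned int pin)   { return 1u << pin; }
uint32_t bsrr_reset(unsigned int pin) { return 1u << (pin + 16u); }
```

Starting from a CRH value of 0x44444444, crh_pin_output_pp(0x44444444, 13) gives 0x44144444, the same result as the four read-modify-write lines on bits 20 to 23 in notmain().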

From years of experience with chips, boards, and simulation of chips, boards and peripherals, plus what I learned from an old timer when I was in my 20s in the aerospace industry: you want to force the instruction, 32 bit loads and stores, 16 bit, etc. So I use asm to get exactly the instruction I want, as well as to have a very clean abstraction layer (despite the performance cost, which can be solved with macros as needed down the road with the same source code) with which to attach a simulator, punch through an operating system, etc. YMMV.

I have had the volatile pointer trick fail BTW and produce the wrong instruction, so I very rarely use that trick. Never use structs across compile domains as a rule, and don't misuse unions. I say that because these days almost every solution you are going to find is going to violate one or both of those cardinal rules. You will own that risk when you use such libraries. But professionally you will at times have to own that risk, as those libraries may be dictated for one of a number of reasons: personal preference, the boss said so, maintaining legacy projects, no time/desire to re-write a tcp-ip stack or filesystem, etc.
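
For reference, the "volatile pointer trick" mentioned above usually looks something like the sketch below. This is a generic illustration and not code from this answer: the names put32_c and get32_c are hypothetical, and addresses are passed as uintptr_t so the same code can be exercised on a desktop machine against an ordinary variable standing in for a register.

```c
#include <stdint.h>

/* Hypothetical C-only equivalents of PUT32/GET32 using volatile pointers. */
void put32_c(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;   /* volatile: the store must not be dropped */
}

uint32_t get32_c(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;    /* volatile: the load must not be skipped */
}
```

The instruction the compiler picks for that access is exactly what the answer is wary of; the assembly versions in sram.s pin it to a single str or ldr.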

You can see this is doing a verbose read-modify-write to configure PC13 as a push-pull output, then using a convenient bit set/reset register to change the state of one pin in that port. Having dummy() in a separate optimization domain forces the compiler to generate that loop and burn that time, so that we can see the led turn on and off.

for(x=0;x<0x80000;x++) continue;

A plain loop like that optimizes away because it does nothing. To avoid that, don't optimize (intentionally or accidentally), or make x volatile so the compiler has to do loads and stores every time it touches x, or call a function in another compile domain that uses x, forcing the compiler to build the loop. LATER you can mess with timers.
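
The volatile option can be sketched as follows. The function name delay_volatile is hypothetical, and the delay is only as accurate as the clock and the generated code, so treat it as a placeholder until you move to a timer.

```c
/* Hypothetical delay helper: the counter is volatile, so the compiler
   has to load and store it on every iteration and cannot delete the loop. */
unsigned int delay_volatile(unsigned int count)
{
    volatile unsigned int x;

    for (x = 0; x < count; x++)
        continue;
    return x;   /* the number of iterations actually performed */
}
```

Compared with calling dummy() in another object file, this keeps everything in C, at the cost of extra loads and stores on each pass.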

Linker scripts are very specific to the toolchain, you can do without with gnu but there is a weirdism that sometimes happens, that I will leave you to find (-Ttext=0x20000000 in this case). You can make it slightly simpler than this but most folks make it significantly more complicated, YMMV.

sram.ld

MEMORY
{
    ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
    .text : { *(.text*) } > ram
    .rodata : { *(.rodata*) } > ram
    .bss : { *(.bss*) } > ram
}

I have no need to initialize .bss nor .data, so I can simply remove all the baggage that comes with those requirements. A violation of the C standard that I am happy with for this example. Also note my entry point is not called main(): at least one toolchain, so I have to assume others too, looks for that function name and piles on extra baggage that we don't want. The entry point name should be generic. Even _start, which is used by the toolchain to mark the entry point as if this were a binary to be run on an operating system, is not needed; the linker may throw a warning but will still build the binary.

build

arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 sram.s -o sram.o
arm-none-eabi-gcc -Wall -Werror -O2 -nostdlib -nostartfiles -ffreestanding -mcpu=cortex-m0 -march=armv6-m -mthumb -c blinker01.c -o blinker01.o
arm-none-eabi-ld -o blinker01.elf -T sram.ld sram.o blinker01.o
arm-none-eabi-objdump -D blinker01.elf > blinker01.list
arm-none-eabi-objcopy blinker01.elf blinker01.bin -O binary

I no longer think the -nostdlib -nostartfiles -ffreestanding is required, it depends I guess on which version and flavor of gnu tools you are using.

Always check the disassembly on a first build, and whenever you mess with the makefile/infrastructure:

20000000 <_start>:
20000000:   4805        ldr r0, [pc, #20]   ; (20000018 <dummy+0x4>)
20000002:   4685        mov sp, r0
20000004:   f000 f80a   bl  2000001c <notmain>
20000008:   e7ff        b.n 2000000a <hang>

2000000a <hang>:
2000000a:   e7fe        b.n 2000000a <hang>

2000000c <PUT32>:

This is to ensure that the entry point we wanted, the code we wanted up front, is there and built correctly, which it is in this case.

Open a couple more command line terminals. Generally where you launch openocd is the reference point for where it looks for files. So if you launch it in the directory where your binary is, you don't need to add paths; or you can copy your binary (in this case the elf version is preferred) to where you launch openocd. Or just type paths, it's your choice. Also, by taking over the openocd config files you can write your own scripts into the config to automate these steps.

In one window launch openocd

openocd -f jlink.cfg -f target.cfg

Open On-Chip Debugger 0.10.0-dev-00325-g12e4a2a (2016-07-05-23:15)
Licensed under GNU GPL v2
For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
swd
adapter speed: 1000 kHz
adapter_nsrst_delay: 100
none separate
cortex_m reset_config sysresetreq
Info : No device selected, using first device.
Info : J-Link ARM-OB STM32 compiled Jun 30 2009 11:14:15
Info : Hardware version: 7.00
Info : VTarget = 3.300 V
Info : clock speed 1000 kHz
Info : SWD DPIDR 0x1ba01477
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints

If you do not end up with the line x breakpoints, y watchpoints (for an arm) then you are not ready to move on; get your debugger wired right, etc. Note that the vcc line on the debugger is NOT really there to power the board, despite what they say for the particular dongle you have in the picture. I think I blew up the first one of those trying to do that. You want to power the target yourself; the vcc line is then actually a vcc sense line, which drives the IO voltage on the debugger as well as setting the sample point for the input. Generically/historically, so that you can use tools like this on 5.0v, 3.3v, 3.0v, 1.8v, etc, the v sense line is used to detect the target's levels, 3.3v in this case. Check your wiring, check your power, re-install, build from sources, etc until openocd gets you to this point.

The simplest way in is the telnet interface, I have no use for gdb, but that is the next level of complication, save that for after the telnet interface works.

in one of your other command line terminal windows

telnet localhost 4444

Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
>

Then

> halt
stm32f1x.cpu: target state: halted
target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x08000042 msp: 0x20001000
>

If you halt again after it is halted you don't get that output. I have an infinite loop in flash (take the flash example below and comment out the bl notmain) to demonstrate to myself that this sram example will work for you; nothing on flash other than the stack pointer is required for the sram example to work.

Once halted then

> load_image sram/blinker01.elf
144 bytes written at address 0x20000000
downloaded 144 bytes in 0.008081s (17.402 KiB/s)
> resume 0x20000000
>

And the led should start blinking slowly. One could argue that needs to be resume 0x20000001, but the tool happens to work with 0x20000000.

This is running out of sram, which is generally smaller than the flash, but you can get a lot of education this way without risking bricking the mcu. With the boot0 pin you get a get-out-of-jail-free card with these stm32 devices. For your own projects, on parts that don't have a strap pin for an alternate/safe boot path, you should try to build in an alternate, safe boot path so you don't brick the chip/board, for example by accidentally changing the I/O pin configuration of the jtag pins you rely on to get into the chip and change the firmware (been there, done that). It happens from time to time no matter how much experience you have. I generally buy at least two, if not more, of each new-to-me board, just in case.

Now doing the same thing with flash

flash.s

@.cpu cortex-m0
@.cpu cortex-m3
.thumb

.thumb_func
.global _start
_start:
stacktop: .word 0x20001000
.word reset
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang
.word hang

.thumb_func
reset:
    bl notmain
    b hang
.thumb_func
hang:   b .

.thumb_func
.globl PUT32
PUT32:
    str r1,[r0]
    bx lr

.thumb_func
.globl GET32
GET32:
    ldr r0,[r0]
    bx lr

.thumb_func
.globl dummy
dummy:
    bx lr

How a processor boots is very specific to that processor; you just have to read the docs. In this case it is a vector table, not simply an address at which you just start executing. The details for this ARE in the arm docs: look for words like reset and exception and vector and interrupt and you will find the table. Note these cortex-m mcus can have 128, 256 or even more possible vectors, depending on the core and the chip vendor. You don't HAVE to specify more than you are using. You need something to consume the stack pointer word (ideally the stack pointer init you want) and you have to have the reset vector; beyond that it is up to you. If you don't provide a hard fault or undefined instruction path and it just lands in the middle of code: your call, your debug. At the same time, setting up vectors for interrupts you will never enable (burning valuable flash space): your call, your debug.
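
As a rough model of what the core does with that table at reset, here is a small sketch. It assumes the Cortex-M behaviour described in the Arm documentation (word 0 is loaded into the stack pointer, word 1 is the reset vector). The struct and function names are hypothetical, and this is meant to be run on a desktop machine against the first two words of a .bin image; it is not target code.

```c
#include <stdint.h>

/* Hypothetical model of a Cortex-M reset, for checking an image offline. */
struct boot_info {
    uint32_t initial_sp;   /* word 0 of the vector table */
    uint32_t reset_pc;     /* word 1 with the Thumb bit stripped */
    int      thumb_ok;     /* non-zero if bit 0 of word 1 is set */
};

struct boot_info decode_vectors(const uint32_t *image)
{
    struct boot_info info;

    info.initial_sp = image[0];
    info.reset_pc   = image[1] & ~1u;
    info.thumb_ok   = (int)(image[1] & 1u);
    return info;
}
```

For a table whose first two words are 0x20001000 and 0x08000041, this reports a stack pointer of 0x20001000, a reset handler at 0x08000040, and thumb_ok set.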

flash.ld

MEMORY
{
    rom : ORIGIN = 0x08000000, LENGTH = 0x1000
    ram : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
    .text : { *(.text*) } > rom
    .rodata : { *(.rodata*) } > rom
    .bss : { *(.bss*) } > ram
}

(I recycle my generic examples/starting point so the amount of flash/ram likely does not reflect this actual part, adjust as needed).

Changing the blink rate so I can see the difference between the flash version and sram version.

for(rx=0;;rx++)
{
    PUT32(GPIOCBASE+0x10,1<<(13+0));
    for(ra=0;ra<200000;ra++) dummy(ra);
    PUT32(GPIOCBASE+0x10,1<<(13+16));
    for(ra=0;ra<200000;ra++) dummy(ra);
}

build

arm-none-eabi-as --warn --fatal-warnings -mcpu=cortex-m0 flash.s -o flash.o
arm-none-eabi-gcc -Wall -Werror -O2 -nostdlib -nostartfiles -ffreestanding -mcpu=cortex-m0 -march=armv6-m -mthumb -c blinker01.c -o blinker01.o
arm-none-eabi-ld -o blinker01.elf -T flash.ld flash.o blinker01.o
arm-none-eabi-objdump -D blinker01.elf > blinker01.list
arm-none-eabi-objcopy blinker01.elf blinker01.bin -O binary

And very much need to inspect the entry point

Disassembly of section .text:

08000000 <_start>:
 8000000:   20001000    andcs   r1, r0, r0
 8000004:   08000041    stmdaeq r0, {r0, r6}
 8000008:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 800000c:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000010:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000014:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000018:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 800001c:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000020:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000024:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000028:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 800002c:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000030:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000034:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 8000038:   08000047    stmdaeq r0, {r0, r1, r2, r6}
 800003c:   08000047    stmdaeq r0, {r0, r1, r2, r6}

08000040 <reset>:
 8000040:   f000 f808   bl  8000054 <notmain>
 8000044:   e7ff        b.n 8000046 <hang>

08000046 <hang>:
 8000046:   e7fe        b.n 8000046 <hang>

The first word is the initial value for the stack pointer, then come the vectors. (The disassembly of the vector table is bogus, just the disassembler trying to make sense of those words; the values are what we care about, ignore the disassembly.) The reset vector is the address of the reset handler ORRED with one, 0x08000040|1 = 0x08000041. Don't expect it to boot otherwise; it will try to go into a fault handler and then likely fail there. You should NOT NEED to orr the one in yourself in this code. If you are doing a bootloader hopping to some other address, sure, you will need to orr with one. I strongly discourage ADDING one: if the tool got it right and you add one, you have now messed up the address and it won't work, whereas if the tool got it right and you orr one, you didn't mess it up.
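
The difference between orring and adding is easy to check on any machine. The two helper names below are hypothetical and exist only to make the comparison concrete.

```c
#include <stdint.h>

/* Hypothetical helpers showing why OR is safe and ADD is not. */
uint32_t thumb_addr_or(uint32_t addr)  { return addr | 1u; }
uint32_t thumb_addr_add(uint32_t addr) { return addr + 1u; }
```

Given 0x08000040, both return 0x08000041. Given a value the tool has already fixed up, 0x08000041, the OR version leaves it alone while the ADD version returns 0x08000042, which is an even address one instruction too far.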

Now here is the rub...

I have had some of these blue pills come with the flash locked. There is an openocd way to get out of that but I don't know where I recorded it; I did modify my uart based loader to handle it. Note another thing to have in your back pocket: this device, as with a vast number of other mcus from this and other vendors, may come with a burned in bootloader or other non-jtaggy way into the chip in circuit. Make some attempts to build a tool for that interface, as you may have to some day professionally. In this case the write unprotect command does not actually do what the documentation implies: it doesn't return success and you have to power cycle the chip, but then it is unprotected and you can use a uart bootloader based tool or the swd (jtag like) tool to get into this part.

Same deal as before use openocd to connect to the debugger in the part.

> halt
stm32f1x.cpu: target state: halted
target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x20000014 msp: 0x20000ff0
> flash write_image erase blinker01.elf 
auto erase enabled
device id = 0x20036410
flash size = 64kbytes
wrote 1024 bytes from file blinker01.elf in 0.437904s (2.284 KiB/s)
>

If instead you get

stm32x device protected
failed erasing sectors 0 to 0

Your part is write protected and you have to figure that out. If I find it I will post it here, but because the part I have handy is not protected I can't test it, and I don't really want to go buy a handful more parts in the hope they are protected just to demonstrate what you may find when you purchase these blue pills from someone. At the same time I don't see a lot of these examples for this board dealing with this, so maybe it was just one batch and/or one manufacturer that did this, and myself and another guy online happened to get unlucky.

At this point you can

> reset

Or

Push the reset button with the jumpers both moved to the 0 side, which indicates boot0 and boot1 are strapped to 0 (you really only need boot0 to be a 0). In your picture your jumpers are set to 1; you need to move at least the one not next to the reset button. The one next to the reset button is assumed to be boot1.

And the led should blink twice as fast as the sram example. And when you power the board off then on again, with boot0=0 it will start to run this program from flash.

Pure assembly would have been an even simpler example, and the volatile pointer trick slightly simpler. But if you can't get this code working you are probably not going to get something more complicated working. This is close to as simple as it gets.

Note again these examples do not support .data nor assume .bss is zero, it will take a bit more code and knowledge for you to allow for those assumptions, personally I don't rely on those and don't have to complicate the bootstrap nor the linker script (which are toolchain specific and won't port), but that is personal preference.
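
If you do want .data and .bss support later, the extra bootstrap work amounts to a copy and a fill. A minimal sketch follows, assuming the linker script exports start and end addresses for those sections (the scripts above do not, and the function and parameter names here are hypothetical). It is written as plain functions over pointers so it can be tried on a desktop machine first.

```c
#include <stdint.h>

/* Copy initialised data from its load address (flash) to its run address (RAM). */
void copy_data(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end)
        *dst++ = *src++;
}

/* Zero the .bss region, as C expects for uninitialised statics. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0u;
}
```

On the target these would be called from the reset handler before notmain(), with the pointers taken from linker-defined symbols.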

Search for "stm32 blue pill" and find the pages on STM32duino or something like that, and try those examples. There is probably mbed support, another sandbox, this one backed by or supported by ARM. St also has at least two flavors of libraries, cmsis and a legacy one. Note that chip vendors, to be in business, often have a library set, and for marketing and other reasons continue to churn on those. So while the HAL and CMSIS and cube and whatever are present today, expect one and eventually all of those to be gone down the road, even CMSIS, in part because it is a bit of a mess with no real central owner. Granted CMSIS may evolve, but don't expect it to remain in its exact form.

At the same time, the way this industry works is that you ideally buy a part for your product that is relatively new and/or will be in production about as long as your product hopes to be in production. You write/debug/build the firmware and hopefully never have to touch it again, so you can use whatever the popular/FAD library of the day is. Ideally save a build machine from today to use in the future, but hope that you never have to touch it and that today's favorite library will work long enough to get you to production. The do it yourself approach is far easier to maintain: simple things like gpio, spi, uart, etc are IMO easier than with the library. Things like USB, Ethernet, etc might be harder, not because of the interface with the peripherals, but because the vendor library will have included a usb stack, an ethernet ip stack and filesystem support, which are likely heavily integrated into their peripheral library support and not necessarily worth separating out to avoid their peripheral library code.

At the end of the day, professionally, you should assume you own all the code you use, including the libraries. Your boss will only see that the product line failed, and not care that it was a third party library that you chose that was at fault.

Hmm if you have used 8051 then you know a fair amount of this, or should have. Vendor supplied libraries vs not. The core or at least clones are used and not identical to each other from vendor to vendor/family to family, etc. And you have to adapt to instruction set vs chip stuff... — old_timer, Jun 29 '18 at 14:43
_"You are going to want/need openocd with stlink support..."_ is a bold assumption and the basis of a huge amount of what followed. By no means the only option, and certainly not the simplest - especially if you spend money. I'd have perhaps wanted to have found out much more about the OP's development environment and tools (and perhaps budget) before putting in such an admirable amount of work on an answer. — Clifford, Jun 29 '18 at 15:42
@Clifford openocd is free, supports the device in the picture, allows for an sram load before trying a flash load in case the part is locked, and down the road helps with generic debugging if desired. there are other stlink based solutions yes, as mentioned. plus a zillion others not mentioned. If the OP does not have the hardware in the picture then will happily delete this answer. Or recommend they spend the roughly $5-$6 dollars for the hardware required (okay another few bucks for jumper wires). solution should work on windows or linux or osx... — old_timer, Jun 29 '18 at 17:04
Those boards generally come without pins soldered down so that is the bigger thing IMO, if dont have the tools or experience, not that putting pins on is hard but if you have never done it. This particular board only supports the swd interface without a soldering iron required, there is another blue board that is cortex-m0 based for about the same money that has both swd and uart exposed, for another $1.50 give or take you can get a usb to uart with which you can get at the uart interface, download or I would recommend write, a tool to program the chip via that.... — old_timer, Jun 29 '18 at 17:07
...can do that here too but need a header/pin solution and need some flavor of uart to connect to the host. — old_timer, Jun 29 '18 at 17:08
or for $10 plus shipping (which turns out to be painful as it is almost the cost of the board) you can get a low level NUCLEO board (or discovery) that has the stlink and in some cases uart, all built in. plus the pins are already soldered down. I assumed, perhaps incorrectly, that the OP had what was pictured. — old_timer, Jun 29 '18 at 17:09
Sorry, you seemed to have missed my point. I do not disagree that openocd is a low cost option, but the question was about C programming for ARM Cortex-M3, not about how to connect to the part for programming and debug, and cost was not mentioned as a constraint. As a beginner, if I were faced with all that information, I would probably give up! — Clifford, Jun 29 '18 at 17:28
@Clifford hmm, okay...My hope with these is to remove the myriad of traps that normally lead folks to give up. If I have created a new burden well I dont know what to do. — old_timer, Jun 29 '18 at 18:17
I agree, you have put an unbelievable amount of effort in. I was going to suggest it worthy of https://stackoverflow.com/documentation article, but found that project has been shut down. This is perhaps just the kind of thing that project would have been good for. Too broad I think for a really good SO question. — Clifford, Jun 29 '18 at 18:27
It wasn't that much work, just banged it out. I have a plethora of these examples out there in the world already, I just trimmed one down a tiny bit (cut the white space out mostly) and wrote a few words for it. Wasn't much work at all. I prefer not to just use links as answers nor comments so post the code here with a little text. — old_timer, Jun 29 '18 at 18:43
BTW the NUCLEO-F031K6 is a good first board if this one is too overwhelming. The built in debugger mounts like a thumb drive, you simply copy the .bin file over to that drive and the debugger programs the part and resets it. Not all nucleos pass uart through the debugger end, but this one does. Also it has an stlink interface you can use as well. No additional wires, don't need openocd nor a uart bootloader tool, etc. Just FYI, if this board is too overwhelming... — old_timer, Jun 29 '18 at 19:44
Hmm, never bothered to look at how the arduino folks solved, this but if you can get in one time via swd or uart, you can program their bootloader which supports dfu. Once done you dont need the swd nor uart dongle you can link for a different address within flash and the stm32duino looks at boot1, and you can use dfu-util to program the remaining user (to them) portions of the flash. granted not a pure/clean boot as they have mucked with the peripherals before you. — old_timer, Jun 30 '18 at 15:30
the problem with Arduino is the really unsatisfactory debug support - i.e. "printf debugging". It is also C++ based, otherwise I'd have included it in my answer. — Clifford, Jun 30 '18 at 15:37
understood. can borrow/steal the bootloader without having to adopt anything arduino. My meaning in this case is to allow to re-program the flash without an additional debugger, stock dfu-util works once you get a dfu bootloader on there (using some flavor of debugger once). Even when I buy actual arduinos I dont use the sandbox, I take advantage of their bootloader... — old_timer, Jun 30 '18 at 16:09
