For software development, I often prefer to work close to the hardware. Libraries that abstract away the hardware not only use up limited flash memory, they also add to the potential sources of bugs in your code. For a basic test of STM32 library bloat, I compiled the buttons example from my TM1638NR library in the Arduino 1.8.13 IDE using stm32duino for an STM32F030 target. The flash required was just over 8kB, or slightly more than half of the 16kB of flash specified for the STM32F030F4P6 MCU. While I wasn't ready to write my own tiny Arduino core for the STM32F, I was determined to find a more efficient way of programming small ARM Cortex-M devices.
After a bit of searching, looking at Bill Westfield's Minimalist ARM project, libopencm3, and other projects, I found most of what I was looking for in a series of STM32 bare metal programming posts by William Ransohoff. However, instead of using an ST-Link programmer, I decided to use a standard USB-TTL serial dongle to communicate with the ROM bootloader on the STM32.
To enable the bootloader, the STM32 boot0 pin must be pulled high during power-up. The bootloader will then wait for communication over the USART Tx and Rx lines. On the STM32F030F4P6, the Tx line is PA9, and the Rx line is PA10. In order to reset the chip before flashing, I also connected the DTR line from my serial module to NRST (pin 4) on the MCU as shown in the following wiring diagram:
For flashing the MCU, I decided on stm32flash. While installation on Debian Linux is as simple as "apt install stm32flash", I had some difficulty finding a recent Windows build, so I ended up building it myself. Although my build defaults to 115.2kbps, I found 230.4kbps completely reliable. At 460.8kbps and 500kbps, I encountered intermittent errors, so I stuck with 230.4kbps. After making the necessary connections, and before flashing any code to the MCU, do a test to confirm the MCU is detected.
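For anyone curious what the detection test involves on the wire, here is a rough sketch of the framing used by the ST USART bootloader protocol as I understand it from ST's application note AN3155: the host sends a sync byte so the bootloader can measure the baud rate, and each command is sent as the command byte followed by its bitwise complement. The helper names bl_command_frame and bl_is_ack are my own inventions for illustration and are not part of stm32flash.

```c
#include <stdint.h>

/* Byte values from the ST USART bootloader protocol (AN3155). */
enum {
    BL_SYNC       = 0x7F, /* first byte sent, used for baud rate detection */
    BL_ACK        = 0x79, /* bootloader accepted the byte or command */
    BL_NACK       = 0x1F, /* bootloader rejected the byte or command */
    BL_CMD_GET_ID = 0x02  /* ask for the chip's product ID */
};

/* Hypothetical helper: build the two-byte frame for a command,
 * which is the command byte followed by its bitwise complement. */
static inline void bl_command_frame(uint8_t cmd, uint8_t out[2])
{
    out[0] = cmd;
    out[1] = (uint8_t)~cmd;
}

/* Hypothetical helper: check whether a reply byte is an ACK. */
static inline int bl_is_ack(uint8_t reply)
{
    return reply == BL_ACK;
}
```

The actual serial port handling is left out, since it differs between Linux and Windows.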
One thing to note about stm32flash is that it does not detect the amount of flash and RAM on the target MCU. The numbers come from a hard-coded table based on the device ID reported. The official flash size in kB is stored in the system ROM at address 0x1FFFF7CC. On my STM32F030F4P6, the value read from that address is 0x0010, reflecting the spec of 16kB flash for the chip. My testing revealed that it actually has 32kB of usable flash.
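Reading the official flash size from firmware could look something like the sketch below. The address is the one given above; the function name flash_size_bytes and the macro FLASH_SIZE_REG are just names I made up. The function takes a pointer so that it can also be tried on a PC with a stand-in variable.

```c
#include <stdint.h>

/* Address of the 16-bit flash size value in system ROM (size in kB). */
#define FLASH_SIZE_REG ((const volatile uint16_t *)(uintptr_t)0x1FFFF7CCu)

/* Convert the kB count stored at `reg` into a size in bytes. */
static inline uint32_t flash_size_bytes(const volatile uint16_t *reg)
{
    return (uint32_t)(*reg) * 1024u;
}
```

On the MCU, flash_size_bytes(FLASH_SIZE_REG) would return 16384 for a stored value of 0x0010. Keep in mind this is only the advertised size, not a measurement of how much flash is actually usable.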
I used William's STM32F0 GPIO example as a template to create a tiny blinky example that uses less than 300 bytes of flash. Most of that is for the vector table, which on the Cortex-M0 has 48 entries of 4 bytes each. To save space, I embedded the reset handler in an unused part of the vector table. Since the blinky example doesn't use any interrupts, all but the initial stack pointer at vector 0 and the reset handler at vector 1 could technically be omitted. I plan to re-use the vector table code for other projects, so I did not prune it down to the minimum.
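To make the size arithmetic concrete, here is a sketch of the Cortex-M0 vector table layout as a C struct: 16 core slots (the initial stack pointer plus exceptions 1 to 15) followed by 32 interrupt slots, for 48 entries and 192 bytes in total. The field names are my own, and the entries are modeled as plain 32-bit words so the layout can be checked on a PC; on the target they would be handler addresses. The reserved slots shown are the ones the ARMv6-M architecture leaves unused. I am not claiming this is exactly how my blinky code lays things out.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    CORE_VECTORS = 16, /* initial SP plus exceptions 1..15 */
    IRQ_VECTORS  = 32, /* external interrupt slots */
    VECTOR_COUNT = CORE_VECTORS + IRQ_VECTORS
};

/* Hypothetical model of the table; each entry is one 32-bit word. */
typedef struct {
    uint32_t initial_sp;        /* vector 0 */
    uint32_t reset;             /* vector 1 */
    uint32_t nmi;               /* vector 2 */
    uint32_t hard_fault;        /* vector 3 */
    uint32_t reserved_4_10[7];  /* unused on ARMv6-M: 28 bytes */
    uint32_t svcall;            /* vector 11 */
    uint32_t reserved_12_13[2]; /* unused on ARMv6-M: 8 bytes */
    uint32_t pendsv;            /* vector 14 */
    uint32_t systick;           /* vector 15 */
    uint32_t irq[IRQ_VECTORS];  /* vectors 16..47 */
} vector_table_t;
```

The run of seven reserved words starting at offset 16 gives 28 contiguous bytes, which is enough room for a handful of Thumb instructions.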
The blinky example will toggle PA9 at a frequency of 1Hz. That is the UART Tx pin on the MCU, which is connected to the Rx pin on the USB-TTL dongle. This means when the example runs, the Rx LED on the USB-TTL dongle will flash on and off.
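The register work behind a blinky like this is small. The following is a simplified sketch, not my actual source: the base addresses and bit positions are the ones I know from the STM32F0 reference manual, while the struct and function names are made up for illustration. The half-period calculation assumes the chip is still running from its default 8MHz internal oscillator, and it counts CPU cycles rather than loop iterations.

```c
#include <stdint.h>

/* First six GPIO registers in STM32F0 order (offsets 0x00 to 0x14). */
typedef struct {
    volatile uint32_t MODER;
    volatile uint32_t OTYPER;
    volatile uint32_t OSPEEDR;
    volatile uint32_t PUPDR;
    volatile uint32_t IDR;
    volatile uint32_t ODR;
} gpio_regs_t;

#define GPIOA_BASE        0x48000000u /* GPIOA register block */
#define RCC_AHBENR_ADDR   0x40021014u /* AHB peripheral clock enable */
#define RCC_AHBENR_IOPAEN (1u << 17)  /* GPIOA clock enable bit */
#define LED_PIN           9u          /* PA9, also the UART Tx pin */

/* Set the pin's two mode bits to 01 (general purpose output). */
static inline void pin_make_output(gpio_regs_t *gpio, uint32_t pin)
{
    uint32_t moder = gpio->MODER;
    moder &= ~(3u << (pin * 2u));
    moder |= (1u << (pin * 2u));
    gpio->MODER = moder;
}

/* Flip the pin's bit in the output data register. */
static inline void pin_toggle(gpio_regs_t *gpio, uint32_t pin)
{
    gpio->ODR ^= (1u << pin);
}

/* CPU cycles between toggles for a given blink frequency:
 * one full on/off period needs two toggles. */
static inline uint32_t half_period_cycles(uint32_t cpu_hz, uint32_t blink_hz)
{
    return cpu_hz / (2u * blink_hz);
}
```

On the target, the GPIOA clock has to be enabled first by setting RCC_AHBENR_IOPAEN in the register at RCC_AHBENR_ADDR, after which a pointer cast from GPIOA_BASE can be passed to these functions. At 8MHz and 1Hz, that works out to 4,000,000 cycles between toggles.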
I think my next step in Cortex-M development will be to experiment with libopencm3. It appears to have a reasonably lightweight abstraction of GPIO and some peripherals, so it should be easier to write code that is portable across different ARM MCUs.