Saturday, April 3, 2021

Honey, I shrunk the Arduino core!

 


One of my gripes about the Arduino AVR core is that it is not an example of efficient embedded programming.  One of the foundations of C++ (PDF) is zero-overhead abstractions, yet the Arduino core has a very significant overhead.  The Arduino basic blink example compiles to almost 1kB, with most of that space taken up by code that is never used.  Rewriting the AVR core is a task I'm not ready to tackle, but after writing picoCore, I realized I could use many of the same optimization techniques in an Arduino library.  The result is ArduinoShrink, a library that can dramatically reduce the compiled size of Arduino projects.  In this post I'll explain some of the techniques I used to achieve the coding trifecta of faster, better, and smaller.

The Arduino core is actually a static library that is linked with the project code.  As Eli explains in this post on static linking, libraries like libc usually have only one function per .o in order to avoid linking in unnecessary code.  The Arduino core doesn't use that kind of modular approach; however, by making use of gcc's "-ffunction-sections" option, it does reduce the code bloat that the non-modular approach would otherwise cause.

With ArduinoShrink, I wrote more modular, self-contained code.  For example, the Arduino delay() function calls micros(), which relies on the 32-bit timer0 interrupt overflow counter.  I simplified the delay function so that it only needs the 8-bit timer value.  If the user code never calls micros() or millis(), the timer0 ISR code never gets linked in.  By using a more efficient algorithm and writing the code in AVR assembler, I reduced the size of the delay function to 12 instructions taking 24 bytes of flash.
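To illustrate how a delay can get by with only the 8-bit timer value, here is a minimal C++ sketch that runs on a desktop rather than on an AVR. It is not ArduinoShrink's assembler routine, just one way the idea could work. The names read_timer, delay_ms and TICKS_PER_MS are hypothetical, the timer register is simulated, and the figure of 250 ticks per millisecond assumes a 16MHz clock with a /64 prescaler.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-in for the 8-bit TCNT0 register (hypothetical simulation):
// every read advances the counter by 3 ticks, and it wraps at 256 like the hardware.
uint8_t fake_tcnt0 = 0;
uint32_t total_ticks = 0;  // bookkeeping for the simulation only

uint8_t read_timer() {
    fake_tcnt0 = static_cast<uint8_t>(fake_tcnt0 + 3);
    total_ticks += 3;
    return fake_tcnt0;
}

// Assumption: 16 MHz clock with a /64 prescaler, i.e. 4 us per tick.
constexpr uint8_t TICKS_PER_MS = 250;

// Busy-wait for `ms` milliseconds using nothing but the 8-bit timer value.
void delay_ms(uint32_t ms) {
    uint8_t start = read_timer();
    while (ms > 0) {
        // The uint8_t subtraction wraps modulo 256, so timer overflow needs no special case.
        if (static_cast<uint8_t>(read_timer() - start) >= TICKS_PER_MS) {
            start = static_cast<uint8_t>(start + TICKS_PER_MS);  // advance by exactly 1 ms
            --ms;
        }
    }
}
```

Because no 32-bit overflow counter is consulted, nothing here depends on a timer interrupt. The catch is that the timer has to be polled at least once every 6 ticks (256 - 250); a tight polling loop manages that easily, but a long-running interrupt could make it miss a millisecond boundary.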

In order to minimize code size and maximize speed, almost half of the code is in AVR assembler.  Despite improvements in compiler optimization techniques over the past decades, on architectures like the AVR I can almost always write better assembler code than what the compiler generates.  That's especially true for interrupt service routines, such as the timer0 interrupt used to maintain the counters for millis() and micros().  My assembler version of the interrupt uses only 56 bytes of flash, and is faster than the Arduino ISR written in C.

One part that is still written in C is the digitalWrite() function.  The Arduino core uses a set of tables in flash to map a given pin number to an IO port and bit, making for a lot of code to have digitalWrite(13, LOW) clear PORTB5.  Making use of Bill's discovery that these flash memory table lookups can be resolved at compile time, digitalWrite(13, LOW) compiles to a single instruction: "cbi PORTB, 5".
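The effect of resolving the pin mapping at compile time can be sketched in portable C++. This is not the mechanism described above (which lets the compiler fold the existing flash tables); it swaps in a constexpr function and a template parameter to get a similar result. The names pin_info, fast_write and sim_port are hypothetical, the port registers are simulated so the code runs on a desktop, and the mapping assumes the ATmega328P layout used on the Uno.

```cpp
#include <cassert>
#include <cstdint>

enum PortIndex : uint8_t { kPortB = 0, kPortC = 1, kPortD = 2 };

// Simulated port registers for host testing; on an AVR these would be PORTB, PORTC, PORTD.
volatile uint8_t sim_port[3] = {0xFF, 0xFF, 0xFF};

struct PinInfo {
    PortIndex port;
    uint8_t bit;
};

// Assumed Uno layout: D0-D7 -> PORTD, D8-D13 -> PORTB, D14-D19 (A0-A5) -> PORTC.
constexpr PinInfo pin_info(uint8_t pin) {
    return pin < 8  ? PinInfo{kPortD, pin}
         : pin < 14 ? PinInfo{kPortB, static_cast<uint8_t>(pin - 8)}
                    : PinInfo{kPortC, static_cast<uint8_t>(pin - 14)};
}

// The pin is a template parameter, so the port and bit mask are compile-time constants.
template <uint8_t Pin>
inline void fast_write(bool high) {
    static_assert(Pin < 20, "pin number out of range for this mapping");
    constexpr PinInfo info = pin_info(Pin);
    constexpr uint8_t mask = static_cast<uint8_t>(1u << info.bit);
    if (high) {
        sim_port[info.port] = sim_port[info.port] | mask;
    } else {
        sim_port[info.port] = sim_port[info.port] & static_cast<uint8_t>(~mask);
    }
}
```

A call such as fast_write<13>(false) leaves the compiler with a constant port and a constant single-bit mask, which is the situation in which avr-gcc can emit a single cbi or sbi instruction for a real I/O register.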

ArduinoShrink is also designed to significantly reduce interrupt latency.  The original timer0 interrupt takes around 5us to run, during which time any other interrupts are delayed.  The first instruction in my ISR is 'sei', which allows other interrupts to run, reducing the latency impact to a few cycles more than the hardware minimum.  The official Arduino core disables interrupts in several places, such as when reading the millis counter.  My solution is to detect if the millis counter has been updated and re-read it, thereby avoiding any interrupt latency impact.
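The re-read approach can be sketched as follows. This is a desktop simulation under stated assumptions, not the library's code: the counter is modelled as four separate bytes because an 8-bit CPU reads a 32-bit value one byte at a time, and reads_until_isr is a hypothetical test hook that fires a simulated timer interrupt in the middle of a read. All function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit millis counter stored as little-endian bytes, starting at 255.
uint8_t counter_bytes[4] = {0xFF, 0x00, 0x00, 0x00};

// Hypothetical test hook: fire the simulated ISR just before the byte read
// that happens after this many earlier reads (-1 means never).
int reads_until_isr = -1;

// Simulated timer ISR: increment the 32-bit counter with carry.
void simulated_isr() {
    for (int i = 0; i < 4; ++i) {
        if (++counter_bytes[i] != 0) break;
    }
}

uint8_t read_byte(int i) {
    if (reads_until_isr == 0) simulated_isr();
    if (reads_until_isr >= 0) --reads_until_isr;
    return counter_bytes[i];
}

// A single multi-byte read; it can be torn if the ISR runs partway through.
uint32_t read_once() {
    uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        value |= static_cast<uint32_t>(read_byte(i)) << (8 * i);
    }
    return value;
}

// Read without disabling interrupts: keep reading until two consecutive reads agree.
uint32_t read_millis_nolock() {
    uint32_t a = read_once();
    for (;;) {
        uint32_t b = read_once();
        if (a == b) return a;
        a = b;  // the counter changed during the read, so try again
    }
}
```

With the interrupt landing between the first and second byte, read_once() on its own returns the torn value 511 (low byte from 255, the rest from 256), while read_millis_nolock() notices the mismatch and returns 256. The retry costs a few extra cycles only when an update actually happens mid-read, and interrupts stay enabled throughout.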

The only limitation compared to the official AVR core is that the compiler must be able to resolve the pin number for the digital IO functions at compile time.  Although the pin may be hard-coded, even with LTO enabled, avr-gcc is not always able to recognize that the pin is a compile-time constant.  Since AVR is not a priority target for GCC optimizations, I can't rely on compiler improvements to resolve this limitation.  Therefore I plan to write a version of digitalWrite that is much smaller and faster, even when avr-gcc can't figure out the pin at compile time.

Although ArduinoShrink should be compatible with any Arduino sketch, given some of the compiler tricks I've used, it's quite possible I've missed a potential error.  If you do find what you think is a bug, open an issue in the GitHub repository.


9 comments:

  1. This is very welcome and I am looking forward to trying it. I was quite dismayed to find out that with the standard library, a loop consisting of alternately setting and clearing a pin would only run at a few tens of kHz.

    Replies
    1. You should get a toggle rate of just over 2MHz.

  2. Hi. This is really nice!
    It's being discussed over at Arduboy.com, for a small game system based on the Atmel 32u4.
    https://community.arduboy.com/t/arduinoshrink-library/9715

    Thanks for sharing. :-)

  3. On the first compile check of the library with the Blink sketch, I get this warning.

    WARNING: Category '' in library ArduinoShrink is not valid. Setting to 'Uncategorized'

    [code]
    /*
    Blink
    Turns on an LED on for one second, then off for one second, repeatedly.

    Most Arduinos have an on-board LED you can control. On the UNO, MEGA and ZERO
    it is attached to digital pin 13, on MKR1000 on pin 6. LED_BUILTIN is set to
    the correct LED pin independent of which board is used.
    If you want to know what pin the on-board LED is connected to on your Arduino model, check
    the Technical Specs of your board at https://www.arduino.cc/en/Main/Products

    This example code is in the public domain.

    modified 8 May 2014
    by Scott Fitzgerald

    modified 2 Sep 2016
    by Arturo Guadalupi

    modified 8 Sep 2016
    by Colby Newman
    */


    // the setup function runs once when you press reset or power the board
    void setup() {
      // initialize digital pin LED_BUILTIN as an output.
      pinMode(LED_BUILTIN, OUTPUT);
    }

    // the loop function runs over and over again forever
    void loop() {
      digitalWrite(LED_BUILTIN, HIGH); // turn the LED on (HIGH is the voltage level)
      delay(1000);                     // wait for a second
      digitalWrite(LED_BUILTIN, LOW);  // turn the LED off by making the voltage LOW
      delay(1000);                     // wait for a second
    }
    [/code]


    Replies
    1. What version of the Arduino IDE are you using? I'm not seeing this warning with 1.8.13.

    2. library.properties now has a category field. I just pushed it to the repo.

    3. Ralph,
      thanks much!
      I had to sort out my setup also:
      running Vista Home Premium on a Core Duo (circa 2006).
      Went to Arduino IDE 1.8.13 <-- a bridge too far ;-)
      Lost the Java, but got a new JRE (281) -- shwoo-
      Read several posts along the way, found out that for Vista I needed
      Arduino IDE 1.8.11, so I have the IDE that can compile!

      All is well that ends well.

      For now ... thanks again, Ralph.

  4. Very interesting and useful stuff.
