Monday, April 27, 2020

Measuring AVR interrupt latency

One thing I like about AVR MCUs is that their datasheets are relatively short and simple.  It's also one of the things I don't like, because the datasheets often lack important details.  External interrupt latency is one area where complete and clear details are lacking.  I decided to investigate the interrupt latency of the ATtiny13 and the ATtiny85.  The datasheet's description of interrupt response time and external interrupts is identical for both parts.

Interrupt Response Time

The ATtiny13 datasheet section 4.7.1, under the heading "Interrupt Response Time", says, "The interrupt execution response for all the enabled AVR interrupts is four clock cycles minimum. After four clock cycles the Program Vector address for the actual interrupt handling routine is executed. [...] The vector is normally a jump to the interrupt routine, and this jump takes three clock cycles. [...] If an interrupt occurs when the MCU is in sleep mode, the interrupt execution response time is increased by four clock cycles."

While section 4.7.1 is reasonably detailed, it has one significant error and one important omission.  The error is the sentence, "The vector is normally a jump to the interrupt routine, and this jump takes three clock cycles".  AVRs with 8KB of flash or less, like the ATtiny13 and the ATtiny85, have no "jmp" instruction.  They only have a relative jump, "rjmp", which takes two clock cycles.  This is obviously a copy/paste error from the datasheet of an AVR with more than 8KB of flash.  Anyone familiar with the AVR instruction set would likely catch this simple error.  The omission from section 4.7.1 is much harder to recognize until you carefully examine section 9.2 and figure 9-1 in the datasheet.

Figure 9-1 shows a circuit which appears to add a latency of two clock cycles to pin change interrupts.  There is no written description for the circuit, and the external interrupt details in section 9.2 of the datasheet state, "Pin change interrupts on PCINT[5:0] are detected asynchronously."  Since pin change interrupts can be used to wake the part from power-down sleep mode when all clocks are disabled, they must be detected asynchronously during power-down sleep.  Determining when they are instead detected synchronously, through the circuit in figure 9-1, requires testing.

To test the interrupt latency I wrote a program in assembler that can generate low pulses of different lengths using PWM.  I chose not to write the program in C because I wanted to be able to measure the interrupt latency down to a single cycle.  On the t13, PB1 is the pin for INT0, PCINT1, and OC0B.  By using OC0B to generate a low pulse on PB1, I'm able to trigger INT0 and PCINT1 without any external connections.  When the interrupt is triggered, it should take four cycles to reach the code at the interrupt vector.  That code is an rjmp to the interrupt function, and that rjmp takes two additional clock cycles.  For the best-case latency, the first instruction in the interrupt function will execute six cycles after the interrupt is triggered.

The first instruction of the interrupt function checks the state of the pin that triggered the interrupt (the "sbic" instruction).  If the pin is low, it skips the next instruction, then goes into an infinite loop.  If the pin is high, it toggles the LED pin.  Since the PWM is configured to generate a low pulse, if the pulse has ended before the sbic, the LED will light up to indicate the interrupt response time was too slow.  The length of the pulse is one cycle longer than the value stored in OCR0B, which is done at lines 28 and 29.  My testing consisted mainly of modifying the OCR0B value, then building and flashing the modified code to the AVR.

Results

As expected, INT0 latency is 4 clock cycles from the end of the currently executing instruction.  This means that if the interrupt occurs during the first cycle of an rcall instruction, which takes 3 cycles, the interrupt response time will be 6 cycles.  For pin change interrupts, the latency is 6 cycles, indicating the synchronizer circuit adds 2 cycles of latency.  In idle sleep mode, INT0 and PCINT latency are both 8 cycles, indicating pin change interrupts operate asynchronously when the CPU clock is not running.
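
The measured numbers can be collected into a small cycle-count model.  This is only a sketch that runs on a PC, not on the AVR; the enum and function names are made up for illustration, and the handler-entry figures for PCINT and idle sleep are derived by adding the 2-cycle rjmp to the measured response times rather than being measured separately.

```c
#include <assert.h>

/* Interrupt sources and CPU states covered by the measurements above. */
enum irq_source { IRQ_INT0, IRQ_PCINT };
enum cpu_state  { CPU_RUNNING, CPU_IDLE_SLEEP };

/* The vector table entry is an rjmp, which takes two cycles. */
enum { VECTOR_RJMP_CYCLES = 2 };

/* Cycles from the interrupt event until the vector is reached.
   cycles_left is the number of cycles the interrupted instruction
   still needs after the cycle in which the interrupt arrives. */
static unsigned response_cycles(enum irq_source src, enum cpu_state state,
                                unsigned cycles_left)
{
    /* No instruction is executing while the CPU sleeps. */
    assert(state != CPU_IDLE_SLEEP || cycles_left == 0);

    if (state == CPU_IDLE_SLEEP)
        return 8;                        /* same for INT0 and PCINT */

    /* PCINT pays two extra cycles for the synchronizer. */
    return cycles_left + (src == IRQ_INT0 ? 4u : 6u);
}

/* Cycles until the first instruction of the interrupt function runs. */
static unsigned handler_entry_cycles(enum irq_source src, enum cpu_state state,
                                     unsigned cycles_left)
{
    return response_cycles(src, state, cycles_left) + VECTOR_RJMP_CYCLES;
}
```

For example, an INT0 arriving in the first cycle of a 3-cycle rcall leaves two cycles to finish, so response_cycles(IRQ_INT0, CPU_RUNNING, 2) gives the 6 cycles mentioned above.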

Wednesday, April 8, 2020

Better asserts in C with link-time optimization

I've been a fan of link-time optimization for several years.  I've been a fan of efficient programming for even longer.  I was an early fan of C++ because features like function overloading made it easier to move decisions made at run-time in C to compile-time in C++.  As C++ has become more complex over the decades, I've become less of a C++ fan, and have come to appreciate the simplicity of C.

For small embedded systems like 8-bit AVRs and ARM M0, run-time error checking with assert() has minimal usefulness compared to UNIX, where a core dump will help pinpoint the error location and the state of the program at the time of the error.  Even if the usability problems were solved, real-time embedded systems may not be able to afford the performance costs of run-time error checking.

Both C++ and C support static assertions.  Anyone who has tried to use static_assert has likely encountered "expression in static assertion is not constant" errors for anything but the simplest of checks.  The limitations of static_assert are well documented elsewhere, so I will not go into further detail in this post.
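
As a brief illustration only, here is a minimal sketch of where that error comes from, assuming a C11 compiler.  DEFAULT_PLL_MULT is a hypothetical constant invented for this example, and the 8 limit is borrowed from the example further down.

```c
#include <assert.h>   /* provides the static_assert macro in C11 */

#define DEFAULT_PLL_MULT 4u   /* hypothetical constant, for illustration */

/* Accepted: the operand is an integer constant expression. */
static_assert(DEFAULT_PLL_MULT <= 8, "default multiplier out of range");

volatile unsigned pll_mult;

void set_pll_mult(unsigned multiplier)
{
    /* Rejected if uncommented: a function parameter is not a constant
       expression, even when every caller passes a literal.
    static_assert(multiplier <= 8, "multiplier out of range");
    */
    pll_mult = multiplier;
}
```

The check on the macro works because its value is known to the front end; the check on the parameter cannot, because static_assert is evaluated before any inlining or constant propagation takes place.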

I had long understood that LTO allowed the compiler to evaluate expressions in code at build time, but I never realized its potential for static error checking.  The idea came to me when looking at a fellow embedded developer's code for fast Arduino digital IO.  In particular, Bill's code introduced me to the gcc error function attribute.  The documentation describes the attribute as follows:

  • If the error or warning attribute is used on a function declaration and a call to such a function is not eliminated through dead code elimination or other optimizations, an error or warning (respectively) that includes message is diagnosed.  This is useful for compile-time checking ...

Although the error attribute seems to have been introduced to address some of the limitations of static asserts, it doesn't seem to be commonly used.  After some experimentation, I came up with a basic example.

pll.c:
__attribute((error("")))
void constraint_error(char * details);

volatile unsigned pll_mult;


void set_pll_mult(unsigned multiplier)
{
    if (multiplier > 8) constraint_error("multlier out of range");
    pll_mult = multiplier;
}

main.c:
extern void set_pll_mult(unsigned multiplier);

int main()
{
    set_pll_mult(9);
}

$ gcc -Os -flto -o main *.c
In function 'set_pll_mult.constprop',
    inlined from 'main' at main.c:6:5:
pll.c:9:25: error: call to 'constraint_error' declared with attribute error:
     if (multiplier > 8) constraint_error("multlier out of range");
                         ^

When set_pll_mult() is called with an argument greater than 8, a compile error occurs.  When it is compiled with a valid multiplier, the "if (multiplier > 8)" statement is eliminated by the optimizer.  One drawback to the technique is that the caller (main.c in this case) is not identified when the called function is not inlined.  Increasing the optimization level to -O3 may help to get the function inlined.
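
Since the diagnostic depends on inlining, one possible variation is to put the checked function in a header and force it inline.  This is a sketch rather than a drop-in replacement for the code above: it assumes gcc (or a compiler that accepts the same attributes), the CONSTRAINT macro is a name made up for this example, and the fallback to a run-time assert() in unoptimized builds is a design choice of the sketch, made because the check cannot be folded away without optimization.

```c
#include <assert.h>

#ifdef __OPTIMIZE__
/* Declared but never defined: every call must be optimized away,
   otherwise the build stops with this message. */
__attribute__((error("pll multiplier out of range")))
void constraint_error(void);
#define CONSTRAINT(cond) do { if (!(cond)) constraint_error(); } while (0)
#else
/* Without optimization the check cannot be folded at build time,
   so this sketch falls back to a run-time assert. */
#define CONSTRAINT(cond) assert(cond)
#endif

volatile unsigned pll_mult;

/* always_inline makes the argument visible at each call site, so a
   literal argument lets the optimizer remove the failing branch. */
static inline __attribute__((always_inline))
void set_pll_mult(unsigned multiplier)
{
    CONSTRAINT(multiplier <= 8);
    pll_mult = multiplier;
}
```

With optimization enabled, a call such as set_pll_mult(4) should compile down to a plain store, while set_pll_mult(9) should stop the build.  An argument that is not known at build time would also stop an optimized build, since the call to constraint_error() cannot be eliminated.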