Arduino doesn't use makefiles for building, and instead has a build process which allows for configuration of some of the build flags in a platform.txt file which is read on startup of the IDE. Testing build flags involves editing the file, starting the IDE, and doing a build. To speed up the process and to have complete control over the compile, archive, and link process I worked with avr-gcc on the command line, using the temporary files compiled by Arduino.
My first discovery was that when a library's .o files are included on the command line (instead of an archive file), LTO didn't work. All the code from each .o was included in the binary, even code that was never used. That's when I started researching how gcc's callgraph generation works. The gcc internals documentation is good, though it still leaves unanswered questions.
The big question is how does it pick the root of the call graph? I thought the starting point for the call graph analysis would be the C Runtime Initialization object file, which references main. From there the linker could look at all the .o's to find main, and build the graph. After lots of experiments, I discovered LTO may generate multiple call graphs for the same executable - in other words there can be many root nodes. Every .o file specified on the command line will be used for root node candidates, and if there are only .a files specified, every .o in the first .a will be used for root node candidates.
When all the .o files for an Arduino library are included on the build line, gcc's LTO includes many functions that never get called from main. Creating an archive with the .o files solves that problem. That still doesn't result in optimal performance of LTO. Since the Arduino IDE compiles all of it's core .o files into core.a, which is the first argument in the build line, all the .o files in core.a get included in the build. Specifying main.cpp.o solves this problem.
$ avr-gcc -mmcu=atmega328p -Wl,-relax -flto -Os main.cpp.o hjrp.a core.a USB_Host_Shield_2.0-master/UHS.a -o USBHIDJ.elf
$ avr-size USBHIDJ.elf
text data bss dec hex filename
12530 194 460 13184 3380 USBHIDJ.elf
Building the same files without main.cpp.o specified (listing core.a first) was almost 3KB larger:
$ avr-size USBHIDJ.elf
text data bss dec hex filename
18152 204 493 18849 49a1 USBHIDJ.elf
For some further details on LTO check out Honza Hubička's Blog:
This is quite strange behavior. GCC basically uses linked plugin feedback to figure out what symbols are needed. Either in your setup you have no linker plugin enabled (try compiling with -fuse-linker-plugin) or something is wrong with the resolution data given by the plugin. You can see the data after copmiling with --save-temps. They end up in file called *.res (often soemting like -lm.res that is anoying) and you can see what symbols are roots of GCC's unreachable code removal.
ReplyDeleteHonza
Thanks for the suggestions. I'm pretty sure the linker plugin is enabled, since I figured out that it *wasn't* enabled in the Atmel toolchain for linux: http://nerdralph.blogspot.ca/2014/06/atmel-avr-toolchain-344.html
DeleteI'll compare with -fuse-linker-plugin just to make sure though. I'm not sure I'll get much deeper than that - I do most of my AVR development with command-line tools and only occasionally contribute to Arduino development since it's a popular tool for newbies. I also tend to get a bit frustrated whenever I delve deeper into gcc and see how some parts of it just seem fucked up, and other parts are cool but so complex my head hurts when I try to understand them.
I tried adding -fuse-linker-plugin, and still get a bloated elf (~82KB) when I specify the .o's instead of an archive:
Delete$ avr-gcc -mmcu=atmega328p -fuse-linker-plugin -Wl,-relax -flto -Os main.cpp.o hjrp.a core.a USB_Host_Shield_2.0-master/*.o -o USBHIDJ.elf
I did some more reading on lto and the plugin, and am wondering if the linker needs to support the plugin interface too? When I run avr-ld -plugin I get "unrecognized option '-plugin'".
However the help shows -plugin as one of the options:
Delete$ avr-ld --help | egrep plugin
-plugin PLUGIN Load named plugin
-plugin-opt ARG Send arg to last-loaded plugin