If you’d like a more up-to-date version of ShellCheck than what Raspbian provides, you can build your own on a Raspberry Pi Zero in a little over 21 hours.
Alternatively, as of last week, you can also download RPi-compatible, statically linked armv6hf binaries of every new commit and stable release.
It’s statically linked — i.e. the executable has all its library dependencies built in — so you can expect it to be pretty big. However, I didn’t expect it to be 67MB:
    build@d1044ff3bf67:/mnt/shellcheck# ls -l shellcheck
    -rwxr-xr-x 1 build build 66658032 Jul 14 16:04 shellcheck
This is for a tool intended to run on devices with 512MiB RAM. `strip` helps shed a lot of that weight, and the post-stripped number is the one we’ll use from now on, but 36MB is still more than I expected, especially given that the x86_64 build is 23MB.
    build@d1044ff3bf67:/mnt/shellcheck# strip --strip-all shellcheck
    build@d1044ff3bf67:/mnt/shellcheck# ls -l shellcheck
    -rwxr-xr-x 1 build build 35951068 Jul 14 16:22 shellcheck
So now what? Optimize for size? Here’s `ghc -optlo-Os` to enable LLVM `opt` size optimizations, including a complete three-hour Qemu-emulated rebuild of all dependencies:
    build@31ef6588fdf1:/mnt/shellcheck# ls -l shellcheck
    -rwxr-xr-x 1 build build 32051676 Jul 14 22:38 shellcheck
Welp, that’s not nearly enough.
The real problem is that we’re linking in both C and Haskell dependencies, from the JSON formatters and Regex libraries to bignum implementations and the Haskell runtime itself. These have tons of functionality that ShellCheck doesn’t use, but which is still included as part of the package.
Fortunately, GCC and GHC allow eliminating this kind of dead code through function sections. Let’s look at how that works, and why dead code can’t just be eliminated as a matter of course:
An ELF binary contains a lot of different things, each stored in a section. It can have any number of these sections, each of which has a pile of attributes including a name:
- `.text` stores executable code
- `.data` stores global variable values
- `.symtab` stores the symbol table
- Ever wondered where compilers embed debug info? Sections.
- Exception unwinding data, compiler version or build IDs? Sections.
This is how `strip` is able to safely and efficiently drop so much data: if a section has been deemed unnecessary, it’s simple and straightforward to drop it without affecting the rest of the executable.
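To make the "everything is a named section" idea a bit more tangible, here is a rough sketch of a C helper that walks an ELF file's section header table and checks whether a section with a given name exists. `has_section` is a made-up name for illustration, not part of any toolchain. The sketch assumes Linux with glibc (for `<elf.h>` and the `ElfW` macro from `<link.h>`) and only handles files built for the same word size and byte order as the host.

```c
#include <elf.h>
#include <link.h>    /* ElfW(): expands to Elf32_* or Elf64_* for the host (glibc) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of any toolchain.
 * Returns 1 if the ELF file at `path` has a section called `name`,
 * 0 if it does not, and -1 on any error. The file is assumed to use
 * the host's own word size and byte order. */
int has_section(const char *path, const char *name)
{
    int result = -1;
    size_t names_len = 0;
    ElfW(Ehdr) eh;
    ElfW(Shdr) *sections = NULL;
    char *names = NULL;

    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    /* The ELF header says where the section header table lives. */
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32) ||
        eh.e_shentsize != sizeof(ElfW(Shdr)) ||
        eh.e_shnum == 0 || eh.e_shstrndx >= eh.e_shnum)
        goto done;

    /* Read every section header in one go. */
    sections = malloc(eh.e_shnum * sizeof *sections);
    if (!sections ||
        fseek(f, (long)eh.e_shoff, SEEK_SET) != 0 ||
        fread(sections, sizeof *sections, eh.e_shnum, f) != eh.e_shnum)
        goto done;

    /* Section names are stored in a string table that is itself a section. */
    names_len = sections[eh.e_shstrndx].sh_size;
    names = calloc(names_len + 1, 1);   /* +1 keeps the table NUL-terminated */
    if (!names ||
        fseek(f, (long)sections[eh.e_shstrndx].sh_offset, SEEK_SET) != 0 ||
        fread(names, 1, names_len, f) != names_len)
        goto done;

    /* Each header stores its name as an offset into that string table. */
    result = 0;
    for (size_t i = 0; i < eh.e_shnum; i++) {
        if (sections[i].sh_name < names_len &&
            strcmp(names + sections[i].sh_name, name) == 0) {
            result = 1;
            break;
        }
    }

done:
    free(names);
    free(sections);
    fclose(f);
    return result;
}
```

Pointed at an unstripped native binary, a call like `has_section(path, ".symtab")` should find the symbol table, and it should stop finding it once `strip --strip-all` has been run on the file.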
Let’s have a look at some real data. Here’s a simple `foo.c`:

    int foo() { return 42; }
    int bar() { return foo(); }
We can compile it with `gcc -c foo.c -o foo.o` and examine the sections:
    $ readelf -a foo.o
    ELF Header:
      Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF32
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              REL (Relocatable file)
      Machine:                           ARM
    [..]
    Section Headers:
      [Nr] Name              Type            Addr   Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            000000 000000 000000 00      0   0  0
      [ 1] .text             PROGBITS        000000 000034 000034 00  AX  0   0  4
      [ 2] .rel.text         REL             000000 000190 000008 08   I  8   1  4
      [ 3] .data             PROGBITS        000000 000068 000000 00  WA  0   0  1
      [ 4] .bss              NOBITS          000000 000068 000000 00  WA  0   0  1
    [..]
    Symbol table '.symtab' contains 11 entries:
       Num:    Value  Size Type    Bind   Vis      Ndx Name
    [..]
         9: 00000000    28 FUNC    GLOBAL DEFAULT    1 foo
        10: 0000001c    24 FUNC    GLOBAL DEFAULT    1 bar
There’s tons more info not included here, and it’s an interesting read in its own right. Anyways, both our functions live in the `.text` section. We can see this from the symbol table’s `Ndx` column which says section `1`, corresponding to `.text`. We can also see it in the disassembly:
    $ objdump -d foo.o

    foo.o:     file format elf32-littlearm

    Disassembly of section .text:

    00000000 <foo>:
       0:   e52db004    push    {fp}
       4:   e28db000    add     fp, sp, #0
       8:   e3a0302a    mov     r3, #42 ; 0x2a
       c:   e1a00003    mov     r0, r3
      10:   e28bd000    add     sp, fp, #0
      14:   e49db004    pop     {fp}
      18:   e12fff1e    bx      lr

    0000001c <bar>:
      1c:   e92d4800    push    {fp, lr}
      20:   e28db004    add     fp, sp, #4
      24:   ebfffffe    bl      0 <foo>
      28:   e1a03000    mov     r3, r0
      2c:   e1a00003    mov     r0, r3
      30:   e8bd8800    pop     {fp, pc}
Now let’s say that the only library function we use is `foo`, and we want `bar` removed from the final binary. This is tricky, because you can’t just modify a `.text` section by slicing things out of it. There are offsets, addresses and cross-dependencies compiled into the code, and any shifts would mean trying to patch that all up. If only it was as easy as when `strip` removed whole sections…
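As a rough illustration of what "offsets compiled into the code" means, here is a sketch of how a 32-bit ARM `bl` instruction encodes its target once linked. `arm_bl_encode` is a hypothetical helper written for this example. It only covers the unconditional A32 `bl` form, assumes a word-aligned target, and does no range checking.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only.
 * Encodes an unconditional A32 `bl` located at `insn_addr` that calls
 * `target`. The branch is PC-relative, and on ARM the PC reads as the
 * instruction's own address plus 8. */
uint32_t arm_bl_encode(uint32_t insn_addr, uint32_t target)
{
    /* Distance in bytes from the effective PC to the target
     * (unsigned wraparound gives the two's complement form). */
    uint32_t byte_offset = target - (insn_addr + 8);

    /* The instruction stores that distance in words, in its low 24 bits. */
    uint32_t imm24 = (byte_offset >> 2) & 0x00ffffffu;

    return 0xeb000000u | imm24;   /* 0xeb = condition "always" + bl opcode */
}
```

With `foo` at address `0` and the call sitting at `0x24`, as in the listing above, this gives `0xebfffff5`. If 16 bytes were cut out somewhere between the two, the call would sit at `0x14` and would need to become `0xebfffff9` instead. Every such instruction that spans the removed bytes would have to be found and rewritten. (The `ebfffffe` shown in the object file is presumably the not-yet-relocated placeholder that the `.rel.text` entry tells the linker to fix up.)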
This is where `gcc -ffunction-sections` and `ghc -split-sections` come in. Let’s recompile our file with `gcc -ffunction-sections foo.c -c -o foo.o`:
    $ readelf -a foo.o
    [..]
    Section Headers:
      [Nr] Name              Type            Addr  Off  Size ES Flg Lk Inf Al
      [ 0]                   NULL            00000 0000 0000 00      0   0  0
      [ 1] .text             PROGBITS        00000 0034 0000 00  AX  0   0  1
      [ 2] .data             PROGBITS        00000 0034 0000 00  WA  0   0  1
      [ 3] .bss              NOBITS          00000 0034 0000 00  WA  0   0  1
      [ 4] .text.foo         PROGBITS        00000 0034 001c 00  AX  0   0  4
      [ 5] .text.bar         PROGBITS        00000 0050 001c 00  AX  0   0  4
      [ 6] .rel.text.bar     REL             00000 01c0 0008 08   I 10   5  4
    [..]
    Symbol table '.symtab' contains 14 entries:
       Num:    Value  Size Type    Bind   Vis      Ndx Name
    [..]
        12: 00000000    28 FUNC    GLOBAL DEFAULT    4 foo
        13: 00000000    28 FUNC    GLOBAL DEFAULT    5 bar
Look at that! Each function now has its very own section.
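For a feel of what the flag automates, the same per-function placement can be requested by hand with the `section` attribute. This is only a sketch, assuming GCC or Clang targeting ELF. The section names are picked manually here to mirror the ones above.

```c
/* Assumes GCC or Clang targeting ELF. The section names are chosen by
 * hand to mirror what -ffunction-sections produced for foo.c. */
__attribute__((section(".text.foo")))
int foo(void) { return 42; }

__attribute__((section(".text.bar")))
int bar(void) { return foo(); }
```

Doing this for every function in every dependency would obviously be impractical, which is why having the compiler do it for the whole build is the useful part.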
This means that a linker can go through and find all the sections that contain symbols we need, and drop the rest. We can enable it with the aptly named `ld` flag `--gc-sections`. You can pass that flag to `ld` via `gcc` using `gcc -Wl,--gc-sections`. And you can pass that whole thing to `gcc` via `ghc` using `ghc -optc-Wl,--gc-sections`.
I enabled all of this in my builder’s `.cabal/config`:

    program-default-options
      gcc-options: -Os -Wl,--gc-sections -ffunction-sections -fdata-sections
      ghc-options: -optc-Os -optlo-Os -split-sections
With this in place, the ShellCheck binary became a mere 14.5MB:
-rw-r--r-- 1 build build 14503356 Jul 15 10:01 shellcheck
That’s less than half the size we started out with. I’ve since applied the same flags to the x86_64 build, which brought it down from 23MB to 7MB. Snappier downloads and installs for all!
For anyone interested in compiling Haskell for armv6hf on x86_64, I spent weeks trying to get cross-compilation going, but even with many hacks I was only able to cross-compile for armv7. In the end I gave up and took the same approach as with the Windows build blog post: a Docker image runs the Raspbian armv6 userland in Qemu user emulation mode.
I didn’t even have to set up Qemu. There’s tooling from Resin.io for building ARM Docker containers for IoT purposes. ShellCheck (ab)uses this to run emulated GHC and cabal. Everything Just Works, if slowly.
The Dockerfile is available on GitHub as koalaman/armv6hf-builder.