SPI peripheral library. At this speed SPI requires ~60% CPU time at -Og
optimisation level. This could be further improved by trimming down the
SPI interrupt. But...
Speed is now limited by Downstream's single-packet-per-URB restriction,
to about 460kB/s. USB Middleware does not implement TX FIFO empty
interrupt, so a bit of work is required here.