It turns out that suspend support on the STM32 USB core is buggy as
heck. Host mode cannot resume after suspend, and device mode cannot
receive resume or send wakeup signalling.
I managed to fake resume support by keeping Downstream and our connected
device running at full power, and simulating a wakeup event to the host
by disconnecting/reconnecting Upstream from the host.
...Downstream was not always changing state correctly after closely
spaced interrupts.
Also improve flash-write-lockout function to avoid dependency on
optimisation level.
Each USB transaction passed to the driver now consists of multiple
64-byte packets. 8 packets when receiving, 4 packets when transmitting.
The STM32 silicon bugs out when more than 4 packets are scheduled to
write at a time :(
Reads 1.0MB/sec, writes 967kB/sec, not CPU limited :)
- Unexpected second port-connected interrupt on cold boot with low-speed
device connected.
- Retry on failure to get device descriptor. (We still fail after three
attempts, but that is better than failing after the first one!)
Also changed eclipse project to use external builder.
So apparently the STM32F401's SPI DMA is even more buggy than the 405's.
Worked around an intermittent stall/timeout by busy-waiting Upstream's
packet length transmission and reception, instead of DMA-ing it like the
packet body. Ugh...
correctly now, either because SPI is in 16-bit mode, or because I found
all the other bugs!
Doubled SPI baudrate to 10.5Mbps. Transfer speed now limited (again) by
Downstream's lack of FIFO buffering in the USB host controller.
Also disabled DMA transaction half-complete interrupt in
stm32f4xx_hal_dma.c, as it wasn't doing anything useful.
SPI peripheral library. At this speed SPI requires ~60% CPU time at -Og
optimisation level. This could be further improved by trimming down the
SPI interrupt. But...
Speed is now limited by Downstream's single-packet-per-URB restriction,
to about 460kB/s. USB Middleware does not implement TX FIFO empty
interrupt, so a bit of work is required here.
Although I am not entirely convinced this is necessary, as the SPI data
stall issue only appeared with optimisation off (-O0). Perhaps re-visit
this if Upstream needs more free CPU time later...