So apparently the STM32F401's SPI DMA is even buggier than the STM32F405's. Worked around an intermittent stall/timeout by busy-waiting on the transmission and reception of Upstream's packet length, instead of DMA-ing it like the packet body. Ugh...
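
Rough host-side sketch of the idea, not the actual firmware code: the short length header goes out by busy-waiting, and only the body goes via DMA. Everything here is hypothetical and for illustration only (`mock_spi_t`, `spi_send_polled`, `spi_send_dma`, `send_packet`), and the 2-byte little-endian length field is an assumption about the packet format. Reception would presumably follow the same split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mock of an SPI peripheral: records what was sent and how. */
typedef struct {
    uint8_t wire[64];      /* bytes "seen" on the bus, in order */
    size_t  wire_len;
    size_t  polled_bytes;  /* bytes moved by busy-waiting */
    size_t  dma_bytes;     /* bytes moved by DMA */
} mock_spi_t;

/* Busy-wait transfer. On real hardware this loop would spin on the
 * SPI status flags (TXE/RXNE) for every byte instead of using DMA. */
static void spi_send_polled(mock_spi_t *spi, const uint8_t *data, size_t len)
{
    assert(spi->wire_len + len <= sizeof spi->wire);
    for (size_t i = 0; i < len; i++) {
        spi->wire[spi->wire_len++] = data[i];
    }
    spi->polled_bytes += len;
}

/* DMA transfer. On real hardware this would start a DMA stream and
 * return; here the mock just copies the whole buffer in one go. */
static void spi_send_dma(mock_spi_t *spi, const uint8_t *data, size_t len)
{
    assert(spi->wire_len + len <= sizeof spi->wire);
    memcpy(&spi->wire[spi->wire_len], data, len);
    spi->wire_len += len;
    spi->dma_bytes += len;
}

/* Send one packet: length header polled, body via DMA.
 * Assumed format: 16-bit little-endian length, then the body. */
static void send_packet(mock_spi_t *spi, const uint8_t *body, uint16_t body_len)
{
    uint8_t len_bytes[2] = {
        (uint8_t)(body_len & 0xFF),
        (uint8_t)(body_len >> 8)
    };
    spi_send_polled(spi, len_bytes, sizeof len_bytes); /* workaround path */
    spi_send_dma(spi, body, body_len);                 /* unchanged path */
}
```

The header is only a couple of bytes, so polling it costs very little CPU time compared with the DMA-driven body.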