LPC1768 SPI: strange behaviour

Hello, I am new to mbed and the LPC1768. I want to connect a very fast DAC (fclock 50MHz) to the SPI bus of the mbed, because I have to transfer waveform data as fast as possible (approx. 20kHz @ 128 datapoints).
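To put a number on "as fast as possible", here is a small sketch of the timing budget implied by 20kHz @ 128 datapoints. The function names are made up for illustration, and the number of bits per datapoint is an assumption, since the DAC word size is not stated here.

```cpp
#include <cassert>

// Time available per datapoint in nanoseconds, for a waveform of
// `points` samples that repeats `waveform_hz` times per second.
inline double ns_per_point(double waveform_hz, int points) {
    assert(waveform_hz > 0.0 && points > 0);
    return 1e9 / (waveform_hz * points);
}

// Minimum SPI bit clock in MHz if the datapoints are sent back to back
// with no gaps. bits_per_point is an assumption (DAC word size unknown).
inline double min_sck_mhz(double waveform_hz, int points, int bits_per_point) {
    assert(waveform_hz > 0.0 && points > 0 && bits_per_point > 0);
    return waveform_hz * points * bits_per_point / 1e6;
}
```

With these, ns_per_point(20000.0, 128) comes to about 390.6 ns per datapoint, and min_sck_mhz(20000.0, 128, 16) to 40.96 MHz if each datapoint were 16 bits wide (20.48 MHz for 8 bits). Any dead time between transfers has to fit inside that budget as well.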

At first I used the SPI function spi.frequency(), with which I can set the clock to 48MHz. That works fine, but for each byte sent there is dead time before (1.0usec) and after (0.9usec) the actual transfer. The byte itself takes 0.2usec. Please see also the scope picture.
In the scope picture: Yellow: SPICLK, Purple: MOSI, Blue: Chip select.
Sending the byte (the letter ‘A’, not buff as in the listing) costs a total of 2.1usec, while the byte itself takes only 0.2usec!
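The measured figures can be put into a short sketch to show how much of the bus time is actually used. The struct and function names are hypothetical, and the frame calculation assumes one byte per datapoint, which may not match the real DAC.

```cpp
#include <cassert>

// Timing of one byte as read from the scope, all values in microseconds.
struct ByteTiming {
    double lead_us;      // dead time before the clock starts
    double transfer_us;  // time the 8 clock pulses take
    double trail_us;     // dead time after the last clock pulse
};

// Total time one byte occupies, including dead time.
inline double total_us(const ByteTiming& t) {
    return t.lead_us + t.transfer_us + t.trail_us;
}

// Fraction of the total time spent actually clocking data.
inline double bus_utilisation(const ByteTiming& t) {
    assert(total_us(t) > 0.0);
    return t.transfer_us / total_us(t);
}

// Time to send `bytes` bytes one after another at this per-byte cost.
inline double frame_us(const ByteTiming& t, int bytes) {
    assert(bytes >= 0);
    return total_us(t) * bytes;
}
```

For ByteTiming{1.0, 0.2, 0.9} the total is 2.1usec and the utilisation is roughly 9.5%. Sending 128 bytes this way would take about 268.8usec, while one period at 20kHz is only 50usec, so the overhead rather than the clock rate is the limiting factor.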

To avoid this 1.9usec ‘overhead’ I then switched to controlling the SPI directly via its registers.
Now the transfer takes only 1usec per byte (0xD2, not 0x5A as in the listing) at a clock of 12MHz, with virtually no dead time before and after the actual transfer. See the scope picture.
I know that the maximum SPICLK is CCLK / 8, so indeed 96/8 = 12MHz, but why can I set 48MHz when using the spi.frequency() function?
And also: why does using spi.write() have such a huge overhead?
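One possible reading of the 12MHz versus 48MHz discrepancy, offered as an assumption and not confirmed by anything above: the LPC1768 has both a legacy SPI block and SSP blocks, and they use different clock dividers. The sketch below only reproduces the divider arithmetic from the LPC17xx user manual. The function names are made up, and PCLK = CCLK = 96MHz is assumed to match the 96/8 calculation.

```cpp
#include <cassert>

// Legacy SPI block: SCK = PCLK / SPCCR, where SPCCR must be an even
// value of at least 8.
inline double legacy_spi_sck_mhz(double pclk_mhz, int spccr) {
    assert(spccr >= 8 && spccr % 2 == 0);
    return pclk_mhz / spccr;
}

// SSP block: SCK = PCLK / (CPSDVSR * (SCR + 1)), where CPSDVSR is an
// even value from 2 to 254 and SCR ranges from 0 to 255.
inline double ssp_sck_mhz(double pclk_mhz, int cpsdvsr, int scr) {
    assert(cpsdvsr >= 2 && cpsdvsr <= 254 && cpsdvsr % 2 == 0);
    assert(scr >= 0 && scr <= 255);
    return pclk_mhz / (cpsdvsr * (scr + 1));
}
```

legacy_spi_sck_mhz(96.0, 8) gives 12MHz, the limit seen with direct register access, while ssp_sck_mhz(96.0, 2, 0) gives 48MHz. If the mbed SPI class drives an SSP block on the chosen pins, that would account for spi.frequency() accepting 48MHz. Which block a given pin set maps to should be checked against the mbed sources.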