I have an STM32F4 running at 144 MHz with a TLS server on it, and the TLS client is on the PC end.
For some reason the “MBEDTLS_SSL_SERVER_KEY_EXCHANGE” and “MBEDTLS_SSL_CLIENT_KEY_EXCHANGE” handshake phases take 14 seconds each.
Note that the lengths of the P and G values here are 2048 and 1024.
So PolarSSL was fast. However, using the P and G values from PolarSSL in Mbed TLS does not make it any better: the handshake still takes ~5 seconds, as shown in the following snapshot (which is close to the last snapshot of my previous post).
I just thought I would add my 2 cents.
The handshake is a very processor-intensive process.
To give you a bit of perspective, I am running a 120 MHz RISC processor with data and instruction cache enabled. Using 256-bit ECC, the entire handshake takes approximately 6 seconds. This is after I fully optimized the code, including parts of the mbedtls stack. The ARM processor you are running is in the same ballpark as mine, as it’s a Cortex-M4 (RISC instructions) with a slightly higher frequency.
Your bottlenecks in the system are:
The signature verification (ECDSA for ECC)
The key exchange protocol (ECDH), especially if you’re using an ephemeral key, since you need to generate the key instead of using a static one.
The solution to get it anywhere close to 1 second would be to add a hardware co-processor to offload the work, which is what I did. Depending on the crypto suites you plan to support, make sure you select a co-processor that can help. I used the ATECC508A.
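For reference, Mbed TLS 2.x releases from around 2.7 onward expose compile-time hooks for replacing the ECDH and ECDSA primitives with your own implementation, which is one way to route the two bottlenecks above to external hardware. The fragment below is only a sketch of the config side, assuming your version's config.h lists these macros (check before relying on them). It is not a description of how the co-processor was integrated in the post above, and the replacement functions that talk to the chip are not shown.

```c
/* Sketch for the Mbed TLS 2.x config header. Assumption: your release
 * provides these *_ALT options; each one removes the built-in function
 * so that you must supply your own version with the same prototype. */
#define MBEDTLS_ECDH_GEN_PUBLIC_ALT      /* you provide mbedtls_ecdh_gen_public() */
#define MBEDTLS_ECDH_COMPUTE_SHARED_ALT  /* you provide mbedtls_ecdh_compute_shared() */
#define MBEDTLS_ECDSA_SIGN_ALT           /* you provide mbedtls_ecdsa_sign() */
#define MBEDTLS_ECDSA_VERIFY_ALT         /* you provide mbedtls_ecdsa_verify() */
```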
In my earlier post, neither project (one with Mbed TLS, the other with PolarSSL) had optimisation turned on. I’m not a fan of optimisation of any sort, as the compiler can be notorious for optimising away variables and code.
Nonetheless, I tried high-speed optimisation on both projects (GCC compiler), and yes, there has been a substantial jump in speed. I presume the floating point hardware could be responsible, through faster multiplication and division.
There is still a difference between PolarSSL and Mbed TLS, most likely due to some configuration in Mbed TLS. In the past I have switched the role of the endpoint device from server to client to save RAM and gain speed on the micro, i.e. microcontroller with TCP server + SSL client, and PC with TCP client + SSL server. As the PC can process the handshake in a few milliseconds, this takes load off the micro.
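On the configuration point, a few compile-time options in the Mbed TLS 2.x config header affect public-key speed and may be worth comparing between the two builds. The fragment below is a sketch only: the values are illustrative assumptions to experiment with, not a known cause of the difference seen here, and the larger window settings trade RAM for speed.

```c
/* Illustrative fragment for the Mbed TLS 2.x config header (config.h or
 * the file named by MBEDTLS_CONFIG_FILE). Values are assumptions. */
#define MBEDTLS_HAVE_ASM                 /* use assembly bignum routines where available */
#define MBEDTLS_ECP_NIST_OPTIM           /* fast modular reduction for NIST curves */
#define MBEDTLS_MPI_WINDOW_SIZE       6  /* max window for modular exponentiation (RSA, DHE) */
#define MBEDTLS_ECP_WINDOW_SIZE       6  /* max window for ECC point multiplication */
#define MBEDTLS_ECP_FIXED_POINT_OPTIM 1  /* precompute base-point multiples: faster, more RAM */
```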
As John correctly mentioned, the PK operations (key exchange and signing / certificate verification) are the bottlenecks and depend on the CPU.
From your latest snapshot: looking at the old PolarSSL, it took 0.6 seconds from the Server Hello message to the Change Cipher Spec message (the first application data is sent after 5.6 seconds). In the Mbed TLS snapshot, using the same ciphersuite (TLS_DHE_RSA_WITH_AES_256_CBC_SHA), it took 0.95 seconds. There are some differences, but as the product evolved, there might have been some changes affecting this.
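To put a rough number on how strongly the CPU cost depends on key size for a DHE key exchange, here is a back-of-the-envelope model in C. It is a sketch under stated assumptions, not how Mbed TLS actually implements bignum arithmetic: plain square-and-multiply (about 1.5 modular multiplications per exponent bit) and schoolbook multiply-and-reduce (about 2·w² word multiplications for a modulus of w 32-bit words). The function name `modexp_word_mults` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate the number of 32x32-bit word multiplications for one modular
 * exponentiation. ASSUMPTIONS (a simplified model, not Mbed TLS's code):
 *  - square-and-multiply: ~1.5 modular multiplications per exponent bit
 *  - schoolbook multiply + reduce: ~2 * w * w word multiplications,
 *    where w is the modulus size in 32-bit words */
static uint64_t modexp_word_mults(uint32_t modulus_bits, uint32_t exponent_bits)
{
    assert(modulus_bits % 32u == 0u);             /* model works on whole words */
    uint64_t words   = modulus_bits / 32u;
    uint64_t modmuls = (3u * (uint64_t)exponent_bits) / 2u;
    return modmuls * 2u * words * words;
}
```

Under these assumptions, `modexp_word_mults(2048, 2048)` comes out 8 times larger than `modexp_word_mults(1024, 1024)`, i.e. the cost grows roughly with the cube of the parameter length when the exponent is as long as the modulus. Real implementations use windowing and Montgomery multiplication, so treat the ratio as an order-of-magnitude guide only.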
Looking at the other Mbed TLS 2.6.1 snapshot, that uses the ECDHE ciphersuite, it took 1.2 seconds.
Note that the first application data is sent after ~33 seconds, but the handshake is finished well before that (I would say at the Encrypted Handshake message, sent after 1.2 seconds).
I have a .NET application running an SSL client at the other end, which sends data from a textbox to the SSL server hosted on the CM4 micro. That is why the time at which the client sent application data varied in the earlier snapshots.