Slow handshake for MBed TLS

Hello,

I have an STM32F4 running at 144 MHz as the TLS server, with the TLS client on a PC.
For some reason the “MBEDTLS_SSL_SERVER_KEY_EXCHANGE” and “MBEDTLS_SSL_CLIENT_KEY_EXCHANGE” handshake phases take 14 seconds each.

It seems similar to the issue posted here:
https://tls.mbed.org/discussions/generic/long-tls-handshake-connection-time

Here is the packet sniffer capture. On probing, most of the time seems to be spent in the function “mbedtls_dhm_make_params()”.

On the same micro, the old PolarSSL library works fine (handshake done in <1 sec). Here is the packet capture using PolarSSL.

Can anyone help me understand what the issue could be?

Thanks

Update:
Just to rule out differences between algorithms, I forced both to use the same algorithm, and the results are the same again. Mbed TLS is hanging somewhere, which makes the handshake slow.

Here are the updated packet sniffer captures.

By setting precalculated DHM params using “mbedtls_ssl_conf_dh_param()”, the handshake time has come down from 30 sec to 5 sec (see the following capture), but this is not the solution.

Further update:
The difference I have found so far between PolarSSL 1.2.8 and Mbed TLS 2.6.1 is the following.

During server setup, Mbed TLS calls mbedtls_ssl_config_defaults(), which calls the following function:

mbedtls_ssl_conf_dh_param( conf, MBEDTLS_DHM_RFC5114_MODP_2048_P, MBEDTLS_DHM_RFC5114_MODP_2048_G )

PolarSSL 1.2.8, during server setup, calls ssl_init(), which effectively calls ssl_set_dh_param() to set the prime (P) and generator (G) values:

ssl_set_dh_param(ssl,POLARSSL_DHM_RFC5114_MODP_1024_P, POLARSSL_DHM_RFC5114_MODP_1024_G)

Note the lengths of the P and G values here: 2048 bits versus 1024 bits.
That explains why PolarSSL was fast. However, using the P and G values from PolarSSL in Mbed TLS still does not make it any better: the handshake still takes ~5 sec, as shown in the following capture (close to the last capture in my previous post).
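To put a rough number on the 1024-bit versus 2048-bit difference, here is a back-of-the-envelope cost model in C. It is only a sketch under stated assumptions: plain square-and-multiply, schoolbook 32-bit limb multiplication, a reduction costing as much as the multiplication, and a private exponent as long as the modulus. Mbed TLS itself uses sliding-window exponentiation with Montgomery multiplication, so absolute counts will differ; the function name `modexp_limb_mults` is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rough cost model, NOT a measurement: estimated number of 32-bit limb
 * multiplications for one modular exponentiation (assumed model, see text). */
uint64_t modexp_limb_mults(unsigned modulus_bits, unsigned exponent_bits)
{
    assert(modulus_bits > 0);

    uint64_t limbs = (modulus_bits + 31u) / 32u;   /* 32-bit limbs in the modulus */
    uint64_t per_modmul = 2u * limbs * limbs;      /* multiply + reduce, schoolbook */

    /* One squaring per exponent bit, plus a multiply for about half of them. */
    uint64_t modmuls = (uint64_t)exponent_bits + exponent_bits / 2u;

    return per_modmul * modmuls;
}
```

Under this model `modexp_limb_mults(2048, 2048)` is 8 times `modexp_limb_mults(1024, 1024)`, because the work grows roughly with the cube of the bit length when the exponent grows with the modulus. A DHE server performs two such exponentiations per handshake (its public value and the shared secret), so the group size is worth checking before anything else.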

I just thought I would add my 2 cents.
The handshake is very processor intensive.
To give you a bit of perspective, I am running a 120 MHz RISC processor with data and instruction caches enabled. Using 256-bit ECC, the entire handshake takes approximately 6 seconds. This is after I fully optimized the code, including parts of the mbedtls stack. The ARM processor you are running is in the same ballpark as mine, as it’s a Cortex-M4 (RISC instructions) with a slightly higher frequency.

Your bottlenecks in the system are:

  1. The signature verification (ECDSA for ECC)
  2. The key exchange protocol (ECDH), especially if you’re using an ephemeral key, since you need to generate the key instead of using a static one.

The way to get it anywhere close to 1 second would be to add a hardware co-processor to offload the work, which is what I did. Depending on the crypto suites you plan to support, make sure you select a co-processor that can help. I used the ATECC508A.

Hi John,

Thanks for sharing your results.

In my earlier posts, neither project (one with Mbed TLS, the other with PolarSSL) had optimisation on. I’m not a fan of optimisation of any sort, as the compiler can be notorious for optimising away variables and code.

Nonetheless, I tried high-speed optimisation on both projects (GCC compiler), and yes, there has been a substantial jump in speed. I’m presuming the floating point hardware handling multiplication and division could be responsible.

There is still a difference between PolarSSL and Mbed TLS; most likely it is some configuration on the Mbed TLS side. In the past I have switched the role of the endpoint device from server to client to save RAM and gain speed on the micro, i.e. microcontroller with TCP server + SSL client, and PC with TCP client + SSL server. As the PC can process the handshake in a few milliseconds, it takes load off the micro.
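For reference, these are the build-time options in the Mbed TLS 2.x config.h that most directly affect big-number and ECC speed. This is only a fragment to compare against the project's own config.h; the values shown are illustrative, not a recommendation, and whether each one helps on a given Cortex-M4 build is an assumption to verify by measurement.

```c
/* config.h fragment (Mbed TLS 2.x option names); values are illustrative. */

#define MBEDTLS_HAVE_ASM                  /* inline-assembly bignum multiply on supported targets */
#define MBEDTLS_MPI_WINDOW_SIZE         6 /* max sliding-window size for modular exponentiation;
                                             larger is faster but uses more RAM */
#define MBEDTLS_ECP_WINDOW_SIZE         6 /* max window size for ECC point multiplication */
#define MBEDTLS_ECP_FIXED_POINT_OPTIM   1 /* precompute multiples of the curve base point */
#define MBEDTLS_ECP_NIST_OPTIM            /* fast modular reduction for NIST curves */
```

The window sizes trade RAM for speed, so on a small micro they are worth tuning together with the memory budget.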

See the captures below for the optimised projects.

Hi @pm77
As John correctly mentioned, the PK operations (key exchange and signing / certificate verification) are the bottlenecks and depend on the CPU.

From your latest snapshot: looking at the old PolarSSL, it took 0.6 seconds from the Server Hello message to the Change Cipher Spec message (the first application data is sent after 5.6 seconds). In the Mbed TLS snapshot, using the same ciphersuite (TLS_DHE_RSA_WITH_AES_256_CBC_SHA), it took 0.95 seconds. There are some differences, but as the product evolved, there might have been some changes affecting this.
Looking at the other Mbed TLS 2.6.1 snapshot, which uses the ECDHE ciphersuite, it took 1.2 seconds.
Note that the first application data is sent after ~33 seconds, but the handshake finished long before that (I would say at the Encrypted Handshake Message sent after 1.2 seconds).
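The way of reading the capture described above can be sketched in C: measure from the Server Hello to the last handshake-finished record and ignore application data, whose timing depends on when the client application chooses to send. The record types, struct and function name here are hypothetical, made up for this illustration rather than taken from any sniffer's export format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one row in a packet capture. */
enum rec_type {
    REC_SERVER_HELLO,
    REC_KEY_EXCHANGE,
    REC_CHANGE_CIPHER_SPEC,
    REC_FINISHED,            /* shown by sniffers as "Encrypted Handshake Message" */
    REC_APPLICATION_DATA
};

struct capture_row {
    double t_sec;            /* capture timestamp in seconds */
    enum rec_type type;
};

/* Seconds from the first Server Hello to the last Finished record.
 * Application data rows are deliberately ignored.
 * Returns -1.0 if either end of the handshake is missing. */
double handshake_seconds(const struct capture_row *rows, size_t n)
{
    assert(rows != NULL || n == 0);

    double start = -1.0, end = -1.0;
    for (size_t i = 0; i < n; i++) {
        if (rows[i].type == REC_SERVER_HELLO && start < 0.0)
            start = rows[i].t_sec;
        else if (rows[i].type == REC_FINISHED)
            end = rows[i].t_sec;
    }
    if (start < 0.0 || end < start)
        return -1.0;
    return end - start;
}
```

With a capture whose Server Hello is at time zero, whose Finished record is at about 1.2 s and whose first application data arrives at about 33 s, this reports roughly 1.2 s and not 33 s.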

Have you checked the client’s log to understand why the application data is sent so late?
Have you looked at “Increasing SSL and TLS performance” in the Mbed TLS documentation and https://tls.mbed.org/kb/how-to/reduce-mbedtls-memory-and-storage-footprint ?
Regards,
Mbed TLS Support
Ron

Hi Ron,

I have a .NET application running the SSL client at the other end, which sends data from a textbox to the SSL server hosted on the CM4 micro. That is why the time at which the client sent application data varied in the earlier captures.

Here are the latest figures in a table.

So my final verdict for DHE key exchange is that I must use optimisation; on top of that, setting cipher preferences on the server side is going to help me.
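For anyone landing here later, a minimal server-side configuration fragment for the two settings discussed in this thread might look like the following. It is a sketch, not a standalone program: it assumes an Mbed TLS 2.6-series build (where mbedtls_ssl_conf_dh_param() takes hex strings), it needs the library headers to compile, and the helper name `configure_dhe_server` is hypothetical.

```c
#include "mbedtls/ssl.h"
#include "mbedtls/ssl_ciphersuites.h"
#include "mbedtls/dhm.h"

/* Preference list, most preferred first, terminated by 0.
 * The library keeps the pointer, so the array must outlive the config. */
static const int server_ciphersuites[] = {
    MBEDTLS_TLS_DHE_RSA_WITH_AES_256_CBC_SHA,
    0
};

/* Hypothetical helper: call on a server config after
 * mbedtls_ssl_config_defaults() has succeeded. Returns 0 on success. */
int configure_dhe_server(mbedtls_ssl_config *conf)
{
    /* Restrict the server to the ciphersuite used in the comparison above. */
    mbedtls_ssl_conf_ciphersuites(conf, server_ciphersuites);

    /* Set the DH group explicitly. This is the 2048-bit group the defaults
     * already select per this thread; a different P/G pair would go here. */
    return mbedtls_ssl_conf_dh_param(conf,
                                     MBEDTLS_DHM_RFC5114_MODP_2048_P,
                                     MBEDTLS_DHM_RFC5114_MODP_2048_G);
}
```

Pinning the list this way keeps the comparison between libraries like-for-like; a smaller DH group would be faster but weaker, so that trade-off needs to be decided per project.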