mbedtls_ssl_handshake() segfault after ~1000 iterations

omtayroom · September 18, 2019, 9:08am

Meanwhile, I could confirm that the segfault is triggered by the way my code stores the sockets (and the mbedtls contexts along with them).

“same key” is also ruled out.

I’ll update when I know the actual reason.

omtayroom · September 18, 2019, 3:09pm

Hello Ron,

Here is the mbedtls_ssl_context tlsCtx data passed to mbedtls_ssl_handshake(&tlsCtx). It’s from the debugger and should be from the call that triggers the segfault:

{
 conf = 0x7fb6a0095120,
 state = 2,
 renego_status = 0,
 renego_records_seen = 0,
 major_ver = 3,
 minor_ver = 1,
 badmac_seen = 0,
 f_send = 0x7fb6a9fa2870 ,
 f_recv = 0x0,
 f_recv_timeout = 0x7fb6a9fa2770 ,
 p_bio = 0x7fb6a0003400,
 session_in = 0x0,
 session_out = 0x0,
 session = 0x0,
 session_negotiate = 0x7fb6a000c830,
 handshake = 0x7fb6a000c8d0,
 transform_in = 0x0,
 transform_out = 0x0,
 transform = 0x0,
 transform_negotiate = 0x7fb6a000c6f0,
 p_timer = 0x0,
 f_set_timer = 0x0,
 f_get_timer = 0x0,
 in_buf = 0x7fb6a0004430 "",
 in_ctr = 0x7fb6a0004430 "",
 in_hdr = 0x7fb6a0004438 "\026\003\003",
 in_len = 0x7fb6a000443b "",
 in_iv = 0x7fb6a000443d "",
 in_msg = 0x7fb6a000443d "",
 in_offt = 0x0,
 in_msgtype = 0,
 in_msglen = 0,
 in_left = 0,
 in_epoch = 0,
 next_record_offset = 0,
 in_window_top = 0,
 in_window = 0,
 in_hslen = 0,
 nb_zero = 0,
 keep_current_message = 0,
 disable_datagram_packing = 0 '\000',
 out_buf = 0x7fb6a0008590 "",
 out_ctr = 0x7fb6a0008590 "",
 out_hdr = 0x7fb6a0008598 "\026\003\001\001u\001",
 out_len = 0x7fb6a000859b "\001u\001",
 out_iv = 0x7fb6a000859d "\001",
 out_msg = 0x7fb6a000859d "\001",
 out_msgtype = 22,
 out_msglen = 373,
 out_left = 0,
 cur_out_ctr = "\000\000\000\000\000\000\000\001",
 mtu = 0,
 split_done = 0 '\000',
 client_auth = 0,
 hostname = 0x0,
 alpn_chosen = 0x0,
 cli_id = 0x0,
 cli_id_len = 0,
 secure_renegotiation = 0,
 verify_data_len = 0,
 own_verify_data = '\000' ,
 peer_verify_data = '\000' 
}

Can you spot anything odd?

mbedtls_ssl_read_record() and mbedtls_ssl_fetch_input() are at the bottom of the backtrace.

The mbedtls_net_context netCtx.fd has a value of 1037 when I last see it (it is passed to mbedtls_ssl_set_bio() right before the call to mbedtls_ssl_handshake(&tlsCtx)).

roneld01 · September 19, 2019, 2:53pm

Unfortunately, I can’t see anything strange at the moment. Perhaps there is some data corruption that overwrites some of the pointers.
As for the fd value of 1037: is this reasonable on your system? How many file descriptors can be open at any one time on your system?
Regards

omtayroom · September 20, 2019, 8:26am

I think I am able to open many more:

$ ulimit -n
1048576
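
One thing that may be worth checking at this point: the `ulimit -n` value and `FD_SETSIZE` are independent limits. The first caps how many descriptors the process may hold; the second is the compile-time capacity of an `fd_set` used with select(), typically 1024 on Linux with glibc. Calling FD_SET() with a descriptor at or above `FD_SETSIZE` writes outside the `fd_set`, which could show up as exactly this kind of memory corruption. Whether that applies here rests on an assumption: that the f_recv_timeout callback in the dump is mbedtls_net_recv_timeout() and that it waits on the socket with select() in the Mbed TLS version in use. The sketch below is plain POSIX, not Mbed TLS code, and both function names are hypothetical:

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper: returns 1 if fd can safely be stored in an fd_set,
 * 0 otherwise. FD_SET() on a descriptor >= FD_SETSIZE is undefined. */
int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: prints both limits so they can be compared. */
void print_fd_limits(void)
{
    struct rlimit rl;

    /* RLIMIT_NOFILE is the value reported by `ulimit -n` (soft limit). */
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
        printf("RLIMIT_NOFILE soft limit: %llu\n",
               (unsigned long long)rl.rlim_cur);

    printf("FD_SETSIZE: %d\n", (int)FD_SETSIZE);
}
```

Calling fd_fits_in_fd_set(netCtx.fd) right before mbedtls_ssl_set_bio() would show whether the crashing iterations are the ones where the descriptor has grown past `FD_SETSIZE`.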

At any single moment even 1037 should not be in use, because I am running the test with only two threads. I think the numbers go up that far only because the [linux] kernel does not immediately re-issue them.

(I see FD numbers being reused when I run 20000 iterations with a single thread; then there is lots of time between the uses.)
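
For reference, POSIX specifies that open() returns the lowest-numbered descriptor not currently open in the process, and on Linux the same allocation rule applies to socket() and accept(). A closed number is therefore normally available again right away, without any delay. With two threads the other thread can take the freed number first, but descriptor values that keep climbing past 1000 could also hint that some descriptors are never closed. A minimal single-threaded sketch to observe the reuse, using /dev/null as a stand-in for a socket (the function name is hypothetical):

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens /dev/null, closes it, and opens it again.
 * Returns 1 if both opens yielded the same descriptor number,
 * 0 if they differed, -1 if an open() or the first close() failed. */
int fd_number_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;

    if (close(first) != 0)
        return -1;

    /* No other thread is allocating descriptors here, so the lowest
     * free number is the one that was just released. */
    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;

    close(second);
    return first == second;
}
```

If the same check done with the test's own sockets shows the numbers still growing in single-threaded runs, counting the entries in /proc/self/fd during the test would show how many descriptors are really open.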
