TLS Socket reuse connection failure after several minutes

Hi,
I’m having an issue with TLS socket ‘reuse’ connection.
I get the same issue using Wi-Fi (ESP8266) or Ethernet interface.
I’m using:
Mbed-os rev.7063 online compiler
Nucleo-F767ZI (Wi-Fi and Ethernet)
Nucleo-F446ZE (Wi-Fi)

The problem occurs after several minutes, whether or not data is being sent. I get:

HttpRequest failed (error code -3012)

The only solution is a hard or soft ‘full’ MCU reset.

The standard (non-reuse) TLS socket connection works fine, but it transfers far too much TLS handshake data and the connection process is much slower that way. I’m posting every 20 seconds.

As a workaround, I have tried to trap and handle the problem. If I detect error -3012 and do this…

socket->close();
delete socket;

Then re-initiate the TLS socket ‘reuse’ connection. I get the following HARD FAULT …

++ MbedOS Fault Handler ++

FaultType: HardFault

Context:
R0 : 2000A9E0
R1 : 20003C48
R2 : 00000000
R3 : 080091E5
R4 : 080451F0
R5 : 2000A9E0
R6 : 2000A1B8
R7 : 2000736C
R8 : 2000737C
R9 : 2000A1B8
R10 : 2000A164
R11 : 20007360
R12 : 0803C02F
SP : 20006938
LR : 08006D13
PC : 00000000
xPSR : 00000000
PSP : 200068D0
MSP : 2007FFD8
CPUID: 411FC270
HFSR : 40000000
MMFSR: 00000000
BFSR : 00000000
UFSR : 00000002
DFSR : 0000000B
AFSR : 00000000
Mode : Thread
Priv : Privileged
Stack: PSP

-- MbedOS Fault Handler --

++ MbedOS Error Info ++
Error Status: 0x80FF013D Code: 317 Module: 255
Error Message: Fault exception
Location: 0x801B5D7
Error Value: 0x0
Current Thread: main Id: 0x20009A30 Entry: 0x801B96B StackSize: 0x2000 StackMem: 0x20004970 SP: 0x2007FF7C
For more info, visit: mbedos-error
-- MbedOS Error Info --

= System will be rebooted due to a fatal error =
= Reboot count(=1) reached maximum, system will halt after rebooting =
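For readers trying to interpret a dump like the one above, here is a minimal sketch that decodes a few of the Cortex-M fault status bits. The function name `decode_fault` is made up for illustration, and only a subset of the architecturally defined bits is covered. For the values in this dump (HFSR 40000000, UFSR 00000002, xPSR 00000000, PC 00000000) it reports a forced HardFault caused by an invalid-state usage fault, which is consistent with a branch to address 0, for example through a null or stale pointer.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Decode a subset of the Cortex-M fault status bits (illustrative helper,
// not part of Mbed OS). Inputs are the raw register values from a fault dump.
std::vector<std::string> decode_fault(uint32_t hfsr, uint32_t ufsr, uint32_t xpsr)
{
    std::vector<std::string> notes;

    // HardFault Status Register
    if (hfsr & (1u << 30)) notes.push_back("HFSR.FORCED: a configurable fault escalated to HardFault");
    if (hfsr & (1u << 1))  notes.push_back("HFSR.VECTTBL: fault on a vector table read");

    // UsageFault Status Register
    if (ufsr & (1u << 0))  notes.push_back("UFSR.UNDEFINSTR: undefined instruction");
    if (ufsr & (1u << 1))  notes.push_back("UFSR.INVSTATE: invalid execution state");
    if (ufsr & (1u << 2))  notes.push_back("UFSR.INVPC: invalid PC load on exception return");
    if (ufsr & (1u << 3))  notes.push_back("UFSR.NOCP: coprocessor access not permitted");
    if (ufsr & (1u << 8))  notes.push_back("UFSR.UNALIGNED: unaligned access");
    if (ufsr & (1u << 9))  notes.push_back("UFSR.DIVBYZERO: divide by zero");

    // Bit 24 of xPSR is the Thumb state bit; Cortex-M can only execute Thumb code.
    if (!(xpsr & (1u << 24))) notes.push_back("xPSR.T clear: core was not in Thumb state");

    return notes;
}
```

Calling `decode_fault(0x40000000, 0x00000002, 0x00000000)` with the values from the dump yields the FORCED, INVSTATE and Thumb-bit notes. The LR value (08006D13) is then the usual place to start looking for the caller in the map file or debugger.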

My project code is based around this mbed example here:

Any suggestions would be appreciated and I can transfer this project to Studio if I need to get to the TLS library.

Paul

Hi Paul,
Error -3012 is NSAPI_ERROR_DEVICE_ERROR. It is returned from numerous places in the library; however, I am guessing it was returned from one of the two places in TLSSocketWrapper.cpp. I would suggest you check where this error is returned from to understand the cause of the failure. Was it a timing issue? Perhaps a short wait would help, or posting every 30 seconds, etc.
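To make log output easier to read while tracking this down, a small lookup like the one below can turn the numeric code into its name. This is only a sketch: `nsapi_error_name` is a hypothetical helper, and the values are a hand-copied subset of the `nsapi_error_t` enum, so please verify them against nsapi_types.h in the Mbed OS revision you are building with.

```cpp
#include <string>

// Map a subset of Mbed OS nsapi_error_t values to their names.
// Hypothetical helper; values copied by hand and worth checking
// against nsapi_types.h for your Mbed OS revision.
const char *nsapi_error_name(int code)
{
    switch (code) {
        case 0:     return "NSAPI_ERROR_OK";
        case -3001: return "NSAPI_ERROR_WOULD_BLOCK";
        case -3004: return "NSAPI_ERROR_NO_CONNECTION";
        case -3005: return "NSAPI_ERROR_NO_SOCKET";
        case -3012: return "NSAPI_ERROR_DEVICE_ERROR";
        case -3016: return "NSAPI_ERROR_CONNECTION_LOST";
        case -3017: return "NSAPI_ERROR_CONNECTION_TIMEOUT";
        case -3019: return "NSAPI_ERROR_TIMEOUT";
        default:    return "unknown";
    }
}
```

Printing `nsapi_error_name(rc)` next to the raw value at each failure makes it easier to tell a device error from a lost connection or a timeout.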

As for the HARD FAULT, I haven’t seen your code, but I am guessing that if you closed and deleted the socket, perhaps you misplaced the reallocation of the socket (socket = new TLSSocket();)?
If you have done so, then please check with the debugger where it crashes.
Regards,
Mbed TLS Support
Ron

Thanks Ron,
I’ll move this onto Studio so I can get at the library, and at the same time publish the code so it’s accessible. It’s nothing complex at all, and it may simply be me not using it correctly.
The non-reuse method is faultless: no dropout for over 48 hours, posting every 20 seconds.
I’ll come back in a couple of days.
Paul

Hi Ron,
The HARD FAULT was my fault: I had misplaced the reallocation (socket = new TLSSocket();).

I can now reinitialize the TLS socket with no issues.
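For anyone hitting the same hard fault, the close, delete and reallocate sequence can be sketched roughly as follows. This is not the code from the project: `FakeTlsSocket` is a hypothetical stand-in for TLSSocket so the sketch builds off-target, and `Connection`, `reconnect` and `post` are made-up names. The point is only that the pointer is never left referring to a deleted socket before the next request.

```cpp
#include <memory>

// Hypothetical stand-in for TLSSocket so this sketch builds without Mbed OS.
struct FakeTlsSocket {
    bool open = false;
    int connect()      { open = true; return 0; }
    int send_request() { return open ? 0 : -3012; }  // -3012 models the device error
    void close()       { open = false; }
};

struct Connection {
    std::unique_ptr<FakeTlsSocket> socket;

    // Close the old socket (if any), allocate a fresh one, then connect.
    // reset() deletes the old object and installs the new one in one step,
    // so there is no window where the pointer dangles.
    int reconnect() {
        if (socket) {
            socket->close();
        }
        socket.reset(new FakeTlsSocket());
        return socket->connect();
    }

    // Send one request over the reused socket; on -3012, rebuild and retry once.
    int post() {
        if (!socket) {
            int rc = reconnect();
            if (rc != 0) return rc;
        }
        int rc = socket->send_request();
        if (rc == -3012) {
            rc = reconnect();
            if (rc == 0) rc = socket->send_request();
        }
        return rc;
    }
};
```

With a raw pointer the equivalent is close, delete, then `socket = new TLSSocket();` before any further use. Holding the socket in a smart pointer just makes that ordering harder to get wrong.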
I haven’t looked any further at this stage other than to run some tests.
What I have found is that, running continuously for many hours, it hits this error exactly every 8 minutes.
There is a connection timeout at Firebase that I believe is 60 seconds, but I’m POSTING or PUTTING every 10 seconds so it should remain live (I have tried various delays with the same result).

AFAIK, provided the Firebase database is being accessed, the connection should not drop.
But I’m not absolutely sure about that.
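One way to narrow this down is to log, at each failure, both the elapsed time and the number of requests since the previous failure. If the time stays constant when the posting period changes, that would point to a time-based limit; if the request count stays constant, a count-based one. Below is a minimal sketch of such a log, assuming a seconds counter is available from the application; `FailureLog` and `FailureRecord` are hypothetical names and nothing here comes from Mbed OS or Firebase.

```cpp
#include <cstdint>
#include <vector>

// What was observed between two consecutive failures.
struct FailureRecord {
    uint32_t seconds_since_last;   // elapsed time since the previous failure (or start)
    uint32_t requests_since_last;  // requests issued in that window, including the failing one
};

// Hypothetical diagnostic helper: call on_request() after every POST/PUT.
class FailureLog {
public:
    void on_request(uint32_t now_s, bool ok) {
        ++requests_;
        if (!ok) {
            records_.push_back({now_s - last_failure_s_, requests_});
            last_failure_s_ = now_s;
            requests_ = 0;
        }
    }

    const std::vector<FailureRecord> &records() const { return records_; }

private:
    uint32_t last_failure_s_ = 0;
    uint32_t requests_ = 0;
    std::vector<FailureRecord> records_;
};
```

Running the same test at two different posting periods and comparing the two columns should show which of them tracks the failure.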

Before I dig deeper into TLSSocketWrapper.cpp, does Mbed TLSSocket have a timeout setting? And how long should I expect the connection to last without renewing?
BR,
Paul

Hi @star297

TLSSocketWrapper has a set_timeout API. By default it is set to “forever”, but this mostly serves to indicate a blocking socket. So far I haven’t found anything that points to 8 minutes, though.
Regards

I have published my library and example here:
https://os.mbed.com/users/star297/code/Firebase-Example/docs/tip/main_8cpp_source.html
Strangely, the error now occurs at 14 minutes after I tidied the code up. I’m quite sure I’m doing something wrong, although there’s no hard fault at any time. Is it possible to enable error_tracing to catch the point of failure?