How to correctly use TCPSocket

I have had a lot of problems using TCPSocket. It doesn’t seem very user-friendly when it comes to handling errors and special cases. One interesting comment about it that I could find is:

https://os.mbed.com/forum/bugs-suggestions/topic/36246/

The issues there were mostly disregarded.

Anyway, I used TCPSocket over a year ago to receive data, but if the socket received too much data it somehow froze and never worked again. I could not get it to behave, so we moved all comms to UDP and built our own “reliability” on top of that. Now I need TCP for MQTT, but TCPSocket again tries hard to make it impossible.

In happy cases I can connect to the MQTT broker and send data more or less forever, no issues. But in cases where the broker is offline or the network has some issues, I run into crashes. What I do is basically:

  • open the socket:
    mqttSocket.open(EthInterface::get_default_instance())

  • connect to the broker:
    mqttSocket.connect( SocketAddress(brokerIp, brokerPort) )

  • set it as async:

    mqttSocket.set_timeout( 0 );
    mqttSocket.set_blocking( false );
    mqttSocket.sigio(callback(incomingDataCallback));
  • send and receive data.
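
The steps above can be sketched with a return-code check around each call. This is only a minimal sketch and not a fix for the crash: FakeSocket, connect_once and connect_with_retry are hypothetical names invented for illustration, and FakeSocket merely imitates the open/connect/set_blocking/close calls so the logic can be run on a PC. The assumption is that, on a target, a real socket object exposing the same calls would be passed in instead, with a delay added between attempts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a socket so the sketch runs on a host PC.
// It mirrors only the method names used in the post; it is NOT the Mbed class.
struct FakeSocket {
    std::vector<int> connect_results;  // scripted return codes for connect()
    std::size_t next = 0;
    bool is_open = false;
    bool blocking = true;
    int open_calls = 0;
    int close_calls = 0;

    int open(void * /*net*/) { is_open = true; ++open_calls; return 0; }
    int connect(int /*addr*/) {
        // Fail with -1 once the scripted results run out.
        return next < connect_results.size() ? connect_results[next++] : -1;
    }
    void set_blocking(bool b) { blocking = b; }
    int close() { is_open = false; ++close_calls; return 0; }
};

// One open -> connect -> non-blocking pass.
// Returns 0 on success, otherwise the first non-zero error code.
// If connect() fails, the socket is closed before returning.
template <typename Socket, typename Net, typename Addr>
int connect_once(Socket &sock, Net *net, const Addr &addr) {
    int rc = sock.open(net);
    if (rc != 0) {
        return rc;
    }
    rc = sock.connect(addr);
    if (rc != 0) {
        sock.close();
        return rc;
    }
    sock.set_blocking(false);
    return 0;
}

// Tries up to max_attempts times.
// Returns the attempt number that succeeded, or -1 if every attempt failed.
// A real application would wait between attempts; that is omitted here.
template <typename Socket, typename Net, typename Addr>
int connect_with_retry(Socket &sock, Net *net, const Addr &addr, int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (connect_once(sock, net, addr) == 0) {
            return attempt;
        }
    }
    return -1;
}
```

Keeping the open/connect/close sequence in one function makes it easier to see that every failed connect() is paired with exactly one close() before the next open().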

However, if connect() fails or there’s an error later on, I close the socket with mqttSocket.close() and then try to reopen it using open(). Here it always crashes, regardless of whether I use a single socket instance as above or always allocate a new one on the heap: an instant freeze and then a watchdog reboot. The socket has no active readers or writers at this point. Looking at the docs for TCPSocket, it seems that close() deallocates some data? Reading this piece of the docs:

    /** Accepts a connection on a socket.
     *
     *  The server socket must be bound and set to listen for connections.
     *  On a new connection, returns connected network socket which user is expected to call close()
     *  and that deallocates the resources. Referencing a returned pointer after a close()
     *  call is not allowed and leads to undefined behavior.
     *
     *  By default, accept blocks until incoming connection occurs. If socket is set to
     *  non-blocking or times out, error is set to NSAPI_ERROR_WOULD_BLOCK.
     *
     *  @param error      pointer to storage of the error value or NULL
     *  @return           pointer to a socket
     */
    virtual TCPSocket *accept(nsapi_error_t *error = NULL);

It seems that close() is only for accepted incoming sockets? Or is it? Nothing in the docs for TCPSocket mentions closing a socket or reusing it. Neither do the docs for InternetSocket say anything useful. All the examples that use sockets are happy cases that connect to a server once and never hit any errors, so they are not much help.

So what would be the correct way to reconnect to a server? Or is there some other recommended API that would be better, perhaps raw lwIP or something else that is stable?

Hi Jan,

Have a look here, I had the same issue and this sorted the problem out for me.
I use Studio offline, working with the latest OS5 and OS6.

Ping function to help overcome long connect timeout

I use this snippet to collect data from various devices on my local network (Ethernet or ESP8266 WiFi), then POST data to and from Firebase periodically.
A 200 ms timeout works for me (I have tested it working at 10 ms) in case the IP drops or there’s an unsuccessful connect attempt for any reason.
I find it takes around 3 ms to connect and send/receive 64 bytes of data.
It has been running solid for two weeks (24/7) so far.

void get_remote_device(int device) {
  TCPSocket remote_device;
  SocketAddress remote_addr(clientaddress, 80);
  remote_device.set_timeout(200);  // 200 ms timeout for the blocking calls
  remote_device.open(net);
  if (remote_device.connect(remote_addr) == 0) {
    char sendbuffer[] = "GET /data\r";
    remote_device.send(sendbuffer, strlen(sendbuffer));
    char recbuffer[128];
    // Leave one byte for the terminator; recv() returns a negative
    // error code on failure, so only index the buffer on success.
    int rec_count = remote_device.recv(recbuffer, sizeof(recbuffer) - 1);
    if (rec_count > 0) {
      recbuffer[rec_count] = '\0';
      // printf("receive buffer: %s count: %d\n", recbuffer,strlen(recbuffer));
      if (device == 0) {
        sscanf(recbuffer, "%f %f %f %f %f %d %s %d", &data[0], &data[1],
               &data[2], &data[3], &data[4], &Buffer_RSSI, Buffer_SSID,
               &Buffer_rtc);
      }
      if (device == 1) {
        sscanf(recbuffer, "%f %f %d %d %d %d %d %d %d %s %d", &waterlevel,
               &tanktemp, &gardenpump, &valve[0], &valve[1], &valve[2],
               &valve[3], &valve[4], &Tank_RSSI, Tank_SSID, &Tank_rtc);
      }
    }
  }
  // Always close, whether or not the connect succeeded.
  remote_device.close();
}
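
The sscanf calls above use a bare %s, which can write past the end of the SSID buffer if a device sends a long string, and they do not check how many fields were actually parsed. A hedged sketch of a bounded version for the device 0 format follows. Device0Reading and parse_device0 are hypothetical names, the field types are assumed from the format string in the post, and the 32-character limit assumes the string is a WiFi SSID.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical container for the device 0 fields; the types are assumed
// from the "%f %f %f %f %f %d %s %d" format string in the post.
struct Device0Reading {
    float data[5];
    int rssi;
    char ssid[33];  // assumed: SSID of at most 32 characters plus terminator
    int rtc;
};

// Parses one response line. Returns true only if all eight fields were read.
// "%32s" limits the string to 32 characters, so ssid cannot overflow.
// Note that %s stops at whitespace, so an SSID containing spaces will not parse.
bool parse_device0(const char *text, Device0Reading &out) {
    assert(text != nullptr);
    int fields = std::sscanf(text, "%f %f %f %f %f %d %32s %d",
                             &out.data[0], &out.data[1], &out.data[2],
                             &out.data[3], &out.data[4],
                             &out.rssi, out.ssid, &out.rtc);
    return fields == 8;
}
```

Checking the returned field count means a truncated or garbled response is rejected instead of leaving some variables with stale values.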

Hi Paul,

Thank you for the reply. I’m not sure we have/had the same problem though? For me, doing what you do, i.e. recreating a TCPSocket, leads to a 100% certain crash. Trying to close and reopen also leads to a crash.

On the other hand, I rewrote the code to use altcp (application layered TCP), and while it’s a tiny bit more cumbersome, it’s also not crashing. I can close my connection as many times as I want and reopen it without weird bugs.