Bluepill CAN bus random problems

Hi! I have a strange problem with CAN bus, Bluepill and Mbed OS. I use PlatformIO to build the project with the added LINKFLAGS --specs=nano.specs (this only shrinks the debug build so that it fits in the flash; the problem persists without it too). The Mbed OS version is 5.15.

The problem is that at a random time can1.write starts to return errors. When I inspect the can1 registers with the debugger, everything looks fine: the state is READY and the error bits are 0. Yet all subsequent writes fail. If I reset the CAN interface it starts to work again, but then it randomly stops sending messages.

The first error appears before 100 messages have been sent, but the exact number is random.

Here is a link to the code

Any ideas how to solve this problem?

Hello Oleg,

Problem is at random time can1.write starts to return errors.

I think this could happen when the CAN controller is busy with receiving data in the onCanReceived ISR. The CAN::receive method locks the mutex (calls Mutex’s lock method). However, in Mbed OS (with RTOS) Mutex methods cannot be called from interrupt service routines (ISR). So rather than receiving data in ISR try to poll for new data in the main or in other thread as shown in the CAN hello, world example program. Then you can synchronize the write with read for example using a Condition Variable.

Best regards, Zoltan

Thanks for the idea. I tried disabling the interrupt and now it sends a few hundred (200-300) messages before the first errors appear. Any other ideas?

Is it possible that the code does not use the HSE oscillator and the errors appear because of the internal clock source?

Maybe the bus is noisy or busy with other messages of higher priority. Retry when it fails:

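...
#define MAX_ATTEMPTS 5
...
// Try to send the message up to MAX_ATTEMPTS times.
// Note: the 1ms literal needs Mbed OS 6; on Mbed OS 5.x use sleep_for(1).
int i = 0;
while (i++ < MAX_ATTEMPTS) {
    if (can1.write(data)) {
        break;  // message was queued for transmission
    }
    ThisThread::sleep_for(1ms);
}
...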

or

while (!can1.write(data)) {
    ThisThread::sleep_for(1ms);
}
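
Since the first post reports that only a reset of the CAN interface brings writes back, it may help if the retry loop also tells the caller when all attempts have failed. Below is a minimal sketch of that idea. The helper name write_with_retry and its callback parameters are hypothetical and not part of Mbed OS; on the target, try_write would presumably wrap can1.write(...) and pause would wrap ThisThread::sleep_for(...).

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper (not part of Mbed OS): calls try_write up to
// max_attempts times and calls pause between failed attempts.
// Returns the number of attempts used on success, or 0 if every attempt failed.
inline int write_with_retry(const std::function<bool()>& try_write,
                            const std::function<void()>& pause,
                            int max_attempts)
{
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_write()) {
            return attempt;      // the write was accepted
        }
        if (attempt < max_attempts) {
            pause();             // e.g. a short sleep on the target
        }
    }
    return 0;                    // caller decides what to do: log, reset, ...
}
```

A return value of 0 is the point where the application could count the failure and, for example, call can1.reset() before trying again.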

Thanks for that. The only traffic on the bus is from this code. I send messages only from one thread. So maybe any other ideas?

PS. I also found that when running without the debugger connected (powered via USB), it starts to send a lot of messages and eventually overflows something, so the CAN monitor stops displaying values.

The PA_11, PA_12 pins are used also as USB_DM and USB_DP pins (and they are connected over 22 Ohm resistors to the USB connector). Maybe that is causing the problem. Try to use the PB_8, PB_9 pins instead:

CAN can1(PB_8, PB_9);

Or supply the board over the +5V (+3V3) pin.
Also make sure the CAN bus is terminated with 120 Ohm resistors at both ends.

Thanks, after moving the CAN transceiver to pins PB_8 and PB_9, USB power works fine, but I still get a small number of TX errors.

I see you are the author of the community package of Mbed OS 6.2 for the Bluepill. Any clues how to use it with PlatformIO?

Also, are there any good hacks, like using newlib-nano for Mbed and the Bluepill?

Unfortunately, I have no experience with the PlatformIO IDE, so I’m afraid I’m not able to help you with that. However, you can try to follow the same procedure as in Mbed Studio (a free offline IDE for Mbed):

  • Make sure no mbed device (DAP Link, ST-Link) is connected to your PC.
  • In the menu select File > Import a program ...
  • Copy & Paste https://os.mbed.com/users/hudakz/code/mbed-os-bluepill/ to the URL edit box.
  • Try to update the mbed-os.lib to the latest revision (Bottom-right - Libraries tab).
  • Open the Target drop down menu and type bluepill to the edit box.
    Consequently, BLUEPILL should be listed in the MCUs and custom targets.
  • Click on the BLUEPILL item to select it as Target.

Or you can use the Mbed CLI offline tool.
If you cross-compile on Linux (Ubuntu) then you can give the QtCreator IDE a try.

You can build for the BLUEPILL with the small C libraries by creating an mbed_app.json with the following contents:

{
 "target_overrides": {
   "*": {
     "target.c_lib": "small"
   }
 }
}

An example project is also available at Bare-metal Event Queue with BLUEPILL.

Best regards, Zoltan

So after testing under different conditions I still get a lot of CAN TX errors, and the only way to recover is to reset the CAN interface and start again. It is the only device transmitting on the bus and it’s only 2 messages every 100 ms + 3 messages/sec.

Any ideas how to make it stable? Or how to find the source of this problem?

  • At least two nodes must be connected to a CAN bus to make it work (exception to the rule is one node in a loop-back mode).
  • Use twisted pair of wires for CAN bus.
  • Terminate the CAN bus with 120 Ohm resistors at both ends.
  • Try to connect all nodes to the same ground.
  • Reduce the stub length (distance between the CAN transceiver and the CAN bus lines) for all nodes as much as achievable (keep it less than 30 cm).
  • Make sure all nodes are set to use the same frequency. Please note that you cannot use an arbitrary frequency. Only selected frequencies are supported.
  • Reduce the frequency for all nodes step by step (by selecting commonly used frequencies) until the number of bus errors is zero (or acceptable).
  • To set the CAN bus frequency for STM targets in Mbed, the following table is used to adjust the sampling point. See the can_speed() function implemented in the can_api.c file:
// The following table is used to program bit_timing. It is an adjustment of the sample
// point by synchronizing on the start-bit edge and resynchronizing on the following edges.
// This table has the sampling points as close to 75% as possible (most commonly used).
// The first value is TSEG1, the second TSEG2.
static const int timing_pts[23][2] = {
    {0x0, 0x0},      // 2,  50%
    {0x1, 0x0},      // 3,  67%
    {0x2, 0x0},      // 4,  75%
    {0x3, 0x0},      // 5,  80%
    {0x3, 0x1},      // 6,  67%
    {0x4, 0x1},      // 7,  71%
    {0x5, 0x1},      // 8,  75%
    {0x6, 0x1},      // 9,  78%
    {0x6, 0x2},      // 10, 70%
    {0x7, 0x2},      // 11, 73%
    {0x8, 0x2},      // 12, 75%
    {0x9, 0x2},      // 13, 77%
    {0x9, 0x3},      // 14, 71%
    {0xA, 0x3},      // 15, 73%
    {0xB, 0x3},      // 16, 75%
    {0xC, 0x3},      // 17, 76%
    {0xD, 0x3},      // 18, 78%
    {0xD, 0x4},      // 19, 74%
    {0xE, 0x4},      // 20, 75%
    {0xF, 0x4},      // 21, 76%
    {0xF, 0x5},      // 22, 73%
    {0xF, 0x6},      // 23, 70%
    {0xF, 0x7},      // 24, 67%
};

As you can see, for some speeds the sampling point is closer to 75% than for others. So I think, some speeds are a bit more reliable than others.
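
The percentages in the table comments can be reproduced from the register values. The sketch below assumes that the stored values are "time quanta minus one" (as in the bxCAN CAN_BTR register) and that the table comments count only TSEG1 and TSEG2. The second function adds the one-quantum synchronization segment that precedes TSEG1, which gives the sample point relative to the whole bit time. The function names are made up for this illustration.

```cpp
#include <cassert>

// Sample point in percent as the table comments count it:
// only TSEG1 and TSEG2, register values are (time quanta - 1).
inline double sample_point_table(int tseg1_reg, int tseg2_reg)
{
    assert(tseg1_reg >= 0 && tseg1_reg <= 0xF);
    assert(tseg2_reg >= 0 && tseg2_reg <= 0x7);
    const int seg1 = tseg1_reg + 1;
    const int seg2 = tseg2_reg + 1;
    return 100.0 * seg1 / (seg1 + seg2);
}

// Sample point in percent of the whole bit time, including the
// 1-quantum synchronization segment in front of TSEG1.
inline double sample_point_with_sync(int tseg1_reg, int tseg2_reg)
{
    assert(tseg1_reg >= 0 && tseg1_reg <= 0xF);
    assert(tseg2_reg >= 0 && tseg2_reg <= 0x7);
    const int seg1 = tseg1_reg + 1;
    const int seg2 = tseg2_reg + 1;
    return 100.0 * (1 + seg1) / (1 + seg1 + seg2);
}
```

For the entry {0x5, 0x1} the first function gives 75%, matching the table comment, while the second gives about 77.8% once the synchronization segment is counted.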

Read this thread for a tip how to automatically recover from a bus-off mode (BOM). But please note that in Mbed OS 6 the ABOM is set by the _can_init_freq_direct function rather than by the can_init_freq.
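
To see whether the controller is in the error-passive or bus-off state, the raw value of the bxCAN error status register (CAN_ESR) can be decoded. The sketch below assumes the STM32F1 CAN_ESR bit layout from the reference manual. The struct and function names are hypothetical, and the raw value would have to be read with the debugger or from the device register on the target.

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of a raw CAN_ESR value (assumed STM32F1 bxCAN layout).
struct CanErrorStatus {
    bool     error_warning;  // EWGF, bit 0: TEC or REC >= 96
    bool     error_passive;  // EPVF, bit 1: TEC or REC > 127
    bool     bus_off;        // BOFF, bit 2
    unsigned last_error;     // LEC,  bits 6:4
    unsigned tec;            // transmit error counter, bits 23:16
    unsigned rec;            // receive error counter,  bits 31:24
};

inline CanErrorStatus decode_esr(uint32_t esr)
{
    CanErrorStatus s;
    s.error_warning = (esr & 0x1u) != 0;
    s.error_passive = (esr & 0x2u) != 0;
    s.bus_off       = (esr & 0x4u) != 0;
    s.last_error    = (esr >> 4) & 0x7u;
    s.tec           = (esr >> 16) & 0xFFu;
    s.rec           = (esr >> 24) & 0xFFu;
    return s;
}

// Human-readable name of a last error code (LEC) value.
inline const char* lec_name(unsigned lec)
{
    assert(lec <= 7);
    static const char* const names[8] = {
        "no error", "stuff error", "form error", "acknowledgment error",
        "bit recessive error", "bit dominant error", "CRC error",
        "set by software"
    };
    return names[lec];
}
```

A transmit error counter that sits at 128 together with an acknowledgment error as the last error code typically points to a node whose frames are not acknowledged by any other node.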

Thanks for the list of recommendations, but I use the same hardware and environment setup for testing. The only thing I change is the firmware, and the problems appear only with Mbed.

Is it possible that these errors are caused by the sample point position? It looks like 87.5% is common for CAN, but the code shows that it’s not 87.5%.

I think it’s about the clocks and the BTR register, but I can’t find any info about setting the clocks and BTR with Mbed. Where can I find more info about it?
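
As a starting point for experimenting with the sample point, the sketch below searches for bxCAN bit timing values closest to a requested sample point and packs them into a CAN_BTR value. It is not taken from Mbed. The struct and function names are hypothetical, the field positions assume the STM32F1 CAN_BTR layout, and the 36 MHz peripheral clock and 500 kbit/s bit rate in the usage note are assumptions (the thread states neither).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct BitTiming {
    bool     found;
    unsigned prescaler;     // 1..1024
    unsigned ts1;           // time quanta in bit segment 1, 1..16
    unsigned ts2;           // time quanta in bit segment 2, 1..8
    double   sample_point;  // percent of the bit time, sync segment included
};

// Finds the setting whose sample point is closest to target_percent.
// Only exact bit rates are accepted (pclk_hz must divide evenly).
inline BitTiming find_bit_timing(uint32_t pclk_hz, uint32_t bitrate,
                                 double target_percent)
{
    BitTiming best = {false, 0, 0, 0, 0.0};
    if (bitrate == 0 || pclk_hz % bitrate != 0) {
        return best;
    }
    const uint32_t clocks_per_bit = pclk_hz / bitrate;
    for (unsigned tq = 3; tq <= 25; ++tq) {       // 1 sync + ts1 + ts2
        if (clocks_per_bit % tq != 0) continue;
        const uint32_t prescaler = clocks_per_bit / tq;
        if (prescaler > 1024) continue;
        for (unsigned ts2 = 1; ts2 <= 8 && ts2 + 2 <= tq; ++ts2) {
            const unsigned ts1 = tq - 1 - ts2;
            if (ts1 > 16) continue;
            const double sp = 100.0 * (1 + ts1) / tq;
            if (!best.found || std::fabs(sp - target_percent) <
                               std::fabs(best.sample_point - target_percent)) {
                best = {true, static_cast<unsigned>(prescaler), ts1, ts2, sp};
            }
        }
    }
    return best;
}

// Packs the timing into the assumed CAN_BTR layout:
// BRP bits 9:0, TS1 bits 19:16, TS2 bits 22:20, SJW bits 25:24.
inline uint32_t encode_btr(const BitTiming& t, unsigned sjw)
{
    assert(t.found && sjw >= 1 && sjw <= 4);
    return ((sjw - 1) << 24) | ((t.ts2 - 1) << 20) |
           ((t.ts1 - 1) << 16) | (t.prescaler - 1);
}
```

With an assumed 36 MHz peripheral clock, 500 kbit/s and a target of 87.5%, the search yields prescaler 9, ts1 = 6 and ts2 = 1, and encode_btr with sjw = 1 gives 0x00050008. The result depends on the real peripheral clock, so it is worth checking the clock configuration (HSE or internal oscillator) first. On bxCAN the BTR register can only be written while the controller is in initialization mode.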