Problem going from development board to custom board

Hello everyone,

I started developing for embedded systems at work a few months ago.
I have found enough information to get my code working on a development board. I’m now aiming to get it working on a custom board.

The development board is a NUCLEO-F439ZI and the custom board uses an STM32F439ZIY.

My problem appeared when I flashed my code onto the custom board: the code didn’t run.
I then wrote a very small program to check whether the problem came from the code itself (it shouldn’t, since it works fine on the development board).
This small program worked on the development board but not on the custom board.

I then checked the differences between the two boards and isolated the following:

  • The custom board uses an external 8 MHz XTAL
  • The package is different between the two boards (F439ZITx for the development board and F439ZIYx for the custom board)

The external clock circuit is as simple as possible and follows this diagram:
[image: external crystal circuit]

I modified the development board to use an external 8 MHz XTAL as well, but the code still works on the modified development board and not on the custom board.
I didn’t find any information about a difference between the packages that could explain my problem.

I have been on this problem for 2 days trying things and I feel like I’m close to a solution but still not there.

Now about the problem specifically :

The minimal code is as simple as this :

#include "mbed.h"
#include "system_clock.c"   //strange to include a .c file but ok...

int main() {
    SetSysClock_PLL_HSE(1);
    int state = 0;
    state++;
    // Toggle forever so the debugger has something to step through.
    while (1) {
        state = !state;
    }
}

My custom_targets.json looks like this :

{
    "CUSTOM_BOARD": {
        "inherits": [
            "NUCLEO_F439ZI"
        ],
        "overrides":{
            "clock_source": "USE_PLL_HSE_XTAL"
        },
        "device_has_remove":{

        },
        "public": true,
        "core": "Cortex-M4F",
        "macros_add": [
            "STM32F439xx"
        ],
        "supported_toolchain": [
            "GCC_ARM",
            "ARM"
        ],
        "device_name": "STM32F439ZIYx",
        "OUTPUT_EXT": "hex",
        "bootloader_supported": true,
        "release_versions": ["5", "6"]
    }
}

I launched a debug session for both the working development board and the custom board. The outputs and arm-none-eabi-readelf results were as follows:

  • Working development board :
Selected port 50000 for debugging
0001218:INFO:board:Target type is stm32f439xi
0001231:INFO:coresight_target:Asserting reset prior to connect
0001239:INFO:dap:DP IDR = 0x2ba01477 (v1 rev2)
0001264:INFO:ap:AHB-AP#0 IDR = 0x24770011 (AHB-AP var1 rev2)
0001331:INFO:rom_table:AHB-AP#0 Class 0x1 ROM table #0 @ 0xe00ff000 (designer=020 part=411)
0001339:INFO:rom_table:[0]
0001355:INFO:rom_table:[1]
0001434:INFO:rom_table:[2]
0001514:INFO:rom_table:[3]
0001551:INFO:rom_table:[4]
0001552:INFO:rom_table:[5]
0001554:INFO:cortex_m:CPU core #0 is Cortex-M4 r0p1
0001559:INFO:dwt:4 hardware watchpoints
0001563:INFO:fpb:6 hardware breakpoints, 4 literal comparators
0001573:INFO:coresight_target:Deasserting reset post connect
0001595:INFO:server:Semihost server started on port 4444 (core 0)
0001650:INFO:gdbserver:GDB server started on port 50000 (core 0)
Reading symbols from PATH/code.elf...
warning: Loadable section "RW_IRAM1" outside of ELF segments
(no debugging symbols found)...done.
0001895:INFO:gdbserver:Client connected to port 50000!
warning: Loadable section "RW_IRAM1" outside of ELF segments
0x08001568 in HAL_PWR_EnterSLEEPMode ()
0001927:INFO:gdbserver:Attempting to load argon
0001928:INFO:gdbserver:Attempting to load freertos
0001928:INFO:gdbserver:Attempting to load rtx5
0001929:INFO:gdbserver:rtx5 loaded successfully
Resetting target with halt
Successfully halted device on reset
Attached to debugger on port 50000
0019736:INFO:loader:Erased chip, programmed 28672 bytes (7 pages), skipped 4096 bytes (1 page) at 1.80 kB/s
Resetting target with halt
Successfully halted device on reset
Image loaded: PATH/code.elf
Note: automatically using hardware breakpoints for read-only addresses.
[New Thread 536878112]
[New Thread 536877976]
[New Thread 536878044]
[Switching to Thread 536878112]
Thread 2 "main" hit Breakpoint 1, 0x08003570 in main ()
0214584:INFO:gdbserver:Client detached
0214585:INFO:gdbserver:Client disconnected from port 50000!

readelf :

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] ER_IROM1          PROGBITS        08000000 000034 006414 00  AX  0   0  8
  [ 2] RW_m_crash_data   NOBITS          200001b0 006448 000100 00  WA  0   0  4
  [ 3] RW_IRAM1          PROGBITS        200002b0 006448 000108 00  WA  0   0  4
  [ 4] RW_IRAM1          NOBITS          200003b8 006550 001ff8 00  WA  0   0  8
  [ 5] ARM_LIB_HEAP      NOBITS          200023b0 006550 02d850 00  WA  0   0  4
  [ 6] ARM_LIB_STACK     NOBITS          2002fc00 006550 000400 00  WA  0   0  4
  [ 7] .debug_frame      PROGBITS        00000000 006550 0077ec 00      0   0  1
  [ 8] .symtab           SYMTAB          00000000 00dd3c 007620 10      9 1367  4
  [ 9] .strtab           STRTAB          00000000 01535c 005b48 00      0   0  1
  [10] .note             NOTE            00000000 01aea4 000028 00      0   0  4
  [11] .comment          PROGBITS        00000000 01aecc 0019b8 00      0   0  1
  [12] .shstrtab         STRTAB          00000000 01c884 000080 00      0   0  1
  • Not working custom board:
Selected port 50000 for debugging
0001496:INFO:board:Target type is stm32f439xi
0001510:INFO:coresight_target:Asserting reset prior to connect
0001517:INFO:dap:DP IDR = 0x2ba01477 (v1 rev2)
0001549:INFO:ap:AHB-AP#0 IDR = 0x24770011 (AHB-AP var1 rev2)
0001758:INFO:rom_table:AHB-AP#0 Class 0x1 ROM table #0 @ 0xe00ff000 (designer=020 part=411)
0001818:INFO:rom_table:[0]
0001819:INFO:rom_table:[1]
0001822:INFO:rom_table:[2]
0001824:INFO:rom_table:[3]
0001825:INFO:rom_table:[4]
0001827:INFO:rom_table:[5]
0001830:INFO:cortex_m:CPU core #0 is Cortex-M4 r0p1
0001836:INFO:cortex_m:FPU present: FPv4-SP-D16-M
0001840:INFO:dwt:4 hardware watchpoints
0001844:INFO:fpb:6 hardware breakpoints, 4 literal comparators
0001856:INFO:coresight_target:Deasserting reset post connect
0001872:INFO:server:Semihost server started on port 4444 (core 0)
0001935:INFO:gdbserver:GDB server started on port 50000 (core 0)
Reading symbols from PATH/code.elf...
warning: Loadable section "RW_IRAM1" outside of ELF segments
(no debugging symbols found)...done.
0002176:INFO:gdbserver:Client connected to port 50000!
warning: Loadable section "RW_IRAM1" outside of ELF segments
0x08002eaa in _wait_us_ticks ()
0002216:INFO:gdbserver:Attempting to load argon
0002217:INFO:gdbserver:Attempting to load freertos
0002218:INFO:gdbserver:Attempting to load rtx5
0002219:INFO:gdbserver:rtx5 loaded successfully
Resetting target with halt
Successfully halted device on reset
Attached to debugger on port 50000
0020954:INFO:loader:Erased chip, programmed 28672 bytes (7 pages), skipped 4096 bytes (1 page) at 1.71 kB/s
Resetting target with halt
Successfully halted device on reset
Image loaded: PATH/code.elf
Note: automatically using hardware breakpoints for read-only addresses.
0021294:WARNING:rtx5:TransferError while reading list elements (list=0x20000338, node=0xbe8ef7fb), terminating list: Memory transfer fault (read) @ 0xbe8ef803-0xbe8ef804
[New Thread 2]
warning: while parsing threads: not well-formed (invalid token)
[New Thread 536878112]
[New Thread 536877976]
[New Thread -1097926661]
Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2]

readelf:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] ER_IROM1          PROGBITS        08000000 000034 0063b0 00  AX  0   0  8
  [ 2] RW_m_crash_data   NOBITS          200001b0 0063e4 000100 00  WA  0   0  4
  [ 3] RW_IRAM1          PROGBITS        200002b0 0063e4 000108 00  WA  0   0  4
  [ 4] RW_IRAM1          NOBITS          200003b8 0064ec 001fa8 00  WA  0   0  8
  [ 5] ARM_LIB_HEAP      NOBITS          20002360 0064ec 02d8a0 00  WA  0   0  4
  [ 6] ARM_LIB_STACK     NOBITS          2002fc00 0064ec 000400 00  WA  0   0  4
  [ 7] .debug_frame      PROGBITS        00000000 0064ec 0076b8 00      0   0  1
  [ 8] .symtab           SYMTAB          00000000 00dba4 007580 10      9 1362  4
  [ 9] .strtab           STRTAB          00000000 015124 005ae0 00      0   0  1
  [10] .note             NOTE            00000000 01ac04 000028 00      0   0  4
  [11] .comment          PROGBITS        00000000 01ac2c 0019ac 00      0   0  1
  [12] .shstrtab         STRTAB          00000000 01c5d8 000080 00      0   0  1

I don’t know what I’m doing wrong and I don’t have any idea on what to do next. Can you guys help me with this problem?

Last thing :
I’m on Windows 10, using Mbed Studio with GCC_ARM as the compiler.

Have a nice day
ABR

Hmm, you definitely should not need to call SetSysClock_PLL_HSE from your main() function. The functions in system_clock.c are automatically executed by Mbed OS before main() is called. To switch the board to use external HSE oscillator, the "clock_source": "USE_PLL_HSE_XTAL" line should be enough.
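For reference, the arithmetic behind a PLL-from-HSE setup can be checked on a host PC. This is only a sketch: the divider values used in the usage note (M=8, N=336, P=2, Q=7) are assumptions chosen for illustration and are not taken from the Mbed sources, so check system_clock.c for the values your target really uses.

```cpp
#include <cassert>
#include <cstdint>

// Divider settings of an STM32F4-style main PLL.
// Any concrete values plugged in here are illustrative assumptions.
struct PllConfig {
    uint32_t m;  // input divider (PLLM)
    uint32_t n;  // VCO multiplier (PLLN)
    uint32_t p;  // system clock divider (PLLP)
    uint32_t q;  // divider for the 48 MHz domain (PLLQ)
};

// VCO input frequency: source / PLLM.
inline uint32_t vco_input_hz(uint32_t source_hz, const PllConfig& cfg) {
    assert(cfg.m != 0);
    return source_hz / cfg.m;
}

// The reference manual asks for a VCO input between 1 and 2 MHz.
inline bool vco_input_in_range(uint32_t source_hz, const PllConfig& cfg) {
    const uint32_t f = vco_input_hz(source_hz, cfg);
    return f >= 1000000u && f <= 2000000u;
}

// VCO output frequency: VCO input * PLLN.
inline uint32_t vco_output_hz(uint32_t source_hz, const PllConfig& cfg) {
    return vco_input_hz(source_hz, cfg) * cfg.n;
}

// System clock: VCO output / PLLP.
inline uint32_t sysclk_hz(uint32_t source_hz, const PllConfig& cfg) {
    assert(cfg.p != 0);
    return vco_output_hz(source_hz, cfg) / cfg.p;
}

// Clock of the 48 MHz domain (USB, SDIO, RNG): VCO output / PLLQ.
inline uint32_t usb_clk_hz(uint32_t source_hz, const PllConfig& cfg) {
    assert(cfg.q != 0);
    return vco_output_hz(source_hz, cfg) / cfg.q;
}
```

With an 8 MHz crystal and the assumed dividers, `sysclk_hz(8000000u, PllConfig{8, 336, 2, 7})` evaluates to 168000000 and `usb_clk_hz` to 48000000. If the crystal frequency on a custom board differs from what the target configuration assumes, these numbers shift accordingly.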

Also interesting is that the debugger shows SIGSEGV which is not a failure mode commonly seen with clock config issues. If it’s a clock config problem, I’d expect to see a hang in one of the system clock functions. If you remove the include of system_clock.c and the call to SetSysClock_PLL_HSE, what does the debugger show then?

With the updated code the debugger gives the same kind of output :

Image loaded: PATH/code.elf
Note: automatically using hardware breakpoints for read-only addresses.
0022798:WARNING:rtx5:TransferError while reading list elements (list=0x20000338, node=0x08406806), terminating list: Memory transfer fault (read) @ 0x0840680e-0x08406811
[New Thread 2]
[New Thread 536878032]
[New Thread 536877896]
[New Thread -1]
[New Thread 138438662]
[New Thread 536877964]
[New Thread 1681938643]
Thread 2 "Handler mode" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2]
0xd3000040 in ?? ()

This would suggest that the problem comes from elsewhere but as I said in the first post I have no idea what could cause this.
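One way to read these logs: the thread IDs GDB prints look like the RTX list node addresses shown as signed 32-bit decimals (for example, -1097926661 is 0xbe8ef7fb, the node from the first TransferError). The sketch below converts an ID back to an address and checks whether it falls inside SRAM. The SRAM window is an assumption derived from the readelf output above (ARM_LIB_STACK ends at 0x2002fc00 + 0x400), and the helper names are made up for this example.

```cpp
#include <cstdint>

// Assumed SRAM window: from 0x20000000 up to the end of ARM_LIB_STACK
// in the readelf output (0x2002fc00 + 0x400 = 0x20030000, exclusive).
constexpr uint32_t kSramStart = 0x20000000u;
constexpr uint32_t kSramEnd   = 0x20030000u;

// GDB prints thread IDs as signed 32-bit decimals; reinterpret the
// value as the unsigned 32-bit address it most likely represents.
inline uint32_t thread_id_to_address(int32_t gdb_thread_id) {
    return static_cast<uint32_t>(gdb_thread_id);
}

// True when the address lies inside the assumed SRAM window, which is
// where valid RTX thread control blocks would be expected to live.
inline bool is_in_sram(uint32_t address) {
    return address >= kSramStart && address < kSramEnd;
}
```

Applied to the logs, 536878112 maps to 0x20001c20 (inside SRAM), while -1097926661 and 138438662 (0x08406806) do not. That suggests the thread list itself was read as garbage, so the SIGSEGV may say more about the state of the target at that moment than about the application code.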

Hello Adrien,

warning: while parsing threads: not well-formed (invalid token)
[New Thread 536878112]
[New Thread 536877976]
[New Thread -1097926661]
Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2]

Jamie Smith:

Also interesting is that the debugger shows SIGSEGV which is not a failure mode commonly seen with clock config issues.

According to the datasheet, the MCU is equipped with a memory protection unit:

“The memory protection unit (MPU) is used to manage the CPU accesses to memory to prevent one task to accidentally corrupt the memory or resources used by any other active task.
The MPU is especially helpful for applications where some critical or certified code has to be protected against the misbehavior of other tasks. It is usually managed by an RTOS (real-time operating system). If a program accesses a memory location that is prohibited by the MPU, the RTOS can detect it and take action. In an RTOS environment, the kernel can dynamically update the MPU area setting, based on the process to be executed.”

Maybe the memory protection unit is activated and is causing the segmentation fault.

You can try to use STM32CubeProgrammer to check it. The MPU is optional and can be bypassed for applications that do not need it.

This is the MPU readout from STM32CubeProgrammer:

The readout is the same for the development board, but I still tried to disable the MPU at the start of main with the function

    HAL_MPU_Disable();

Is there a way to enable/disable it in the config?

The problem is still there.
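If it helps anyone doing the same check: on Cortex-M4 the MPU state is visible in the MPU_CTRL register at 0xE000ED94, which can be read with any memory viewer. Below is a small host-side sketch that decodes a raw MPU_CTRL value; the struct and function names are invented for this example, and the raw values in the usage note are hypothetical rather than read from either board.

```cpp
#include <cstdint>

// Bit layout of the ARMv7-M MPU_CTRL register (address 0xE000ED94).
constexpr uint32_t kMpuCtrlEnable     = 1u << 0;  // ENABLE
constexpr uint32_t kMpuCtrlHfnmiena   = 1u << 1;  // HFNMIENA
constexpr uint32_t kMpuCtrlPrivdefena = 1u << 2;  // PRIVDEFENA

// Decoded view of MPU_CTRL.
struct MpuCtrl {
    bool enabled;                     // MPU is switched on
    bool active_in_fault_handlers;    // MPU also applies in HardFault/NMI
    bool default_map_for_privileged;  // privileged code falls back to the default memory map
};

// Split a raw MPU_CTRL value into its three control bits.
inline MpuCtrl decode_mpu_ctrl(uint32_t raw) {
    return MpuCtrl{
        (raw & kMpuCtrlEnable) != 0,
        (raw & kMpuCtrlHfnmiena) != 0,
        (raw & kMpuCtrlPrivdefena) != 0,
    };
}
```

For example, a hypothetical readout of 0x00000005 would decode as enabled with the privileged default map, and 0x00000000 as disabled. If both boards show the same value here, the MPU setting is unlikely to explain a difference between them.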

I’m sorry, but it was only a tip based on the datasheet info. Unfortunately, I do not have any experience with a memory protection unit. My guess was that there are some check boxes available on STM32CubeProgrammer’s OB (Option Bytes) tab to turn the memory protection unit “on/off”.

mmmmmmmmmmmmmh

I tried unchecking the “write protection” checkbox nWRP0.
When I do so, I get the message

pyocd.core.exceptions.FlashProgramFailure: ('program_page(0x8000000) error: 1', 134217728, 1)
"0002564:ERROR:gdbserver:Unhandled exception in handle_message: ('program_page(0x8000000) error: 1', 134217728, 1)"

instead of the SEGFAULT.
I checked the box again and everything is back to the SEGFAULT

Hi

I don’t understand why you are trying to change the option bytes or disable the MPU…

From your initial post,
I confirm you can define your own SetSysClock_PLL_HSE in your application, as the default function is WEAK.

There is no SW difference between F439ZITx and F439ZIYx.

Code should be simplified:

#include "mbed.h"
int main() {
    int state = 0;
    while(1){
        state = !state;
    }
}

custom_targets.json also:

    "CUSTOM_BOARD": {
        "inherits": [
            "NUCLEO_F439ZI"
        ],
        "overrides":{
            "clock_source": "USE_PLL_HSE_XTAL"
        },
        "device_name": "STM32F439ZIYx",
        "OUTPUT_EXT": "hex"
    }

About the crystal, I know this is not a trivial part. There is the AN2867 application note from ST.

Regards,

To rule out a problem with the crystal, you can test both boards with the internal RC oscillator.
mbed_app.json:

...
        "overrides": {
            "clock_source": "USE_PLL_HSI",
...

By the way, I think you were trying to change the flash write protection rather than SRAM. This is what is displayed when I open OB for a chip not equipped with MPU (STM32F411RE):

So I tried both the suggestions.

Cleaning up the code and the configuration did not solve the problem, but it did make things cleaner.

Changing the clock from HSE to HSI also didn’t solve the problem (nothing changed in the debug output)

The OB have a “PCROP protection” section that the STM32F411RE doesn’t have :


The value is unchecked for both the development board and the custom board.

So the problem certainly does not come from the clock and seems like it doesn’t come from the MPU either.

To summarize the problem :

  • The simple code compiles without any problem and flashes on both the development board and the custom board
  • The code executes well on the development board but fails to do so on the custom board. Judging by the debugger output, it looks like a memory management problem
  • The differences between the two boards are the clock (the problem does not come from here), the package (no SW difference according to @jeromecoutant, so no problem here), and the ST-LINK used (I don’t see why it would cause any problem)

So clearly there is a difference between the development board and the custom board and it is related to the configuration.

I checked the ST-LINK by using another one.

It turns out that using the new one doesn’t give this SEGFAULT, so the error is certainly coming from the ST-LINK. The code is working nicely on the custom board now.

Sorry for taking up your time; at least it helped me understand how Mbed works.

Have a nice day.
ABR

The problem was solved for about 15 minutes.

It is back, even with the new ST-LINK. I have no idea why it was solved and now it’s not.

I wrote the same small program in STM32Cube, debugged it, and it worked.

Is there anything set up differently in STM32Cube compared to Mbed? I didn’t change anything in the STM32 config.

EDIT:
I even used the embedded ST-LINK from the development board so that the test conditions for the two boards are as close as possible. The same error happens.

Quick update for anyone having the same problem in the future:

I found out that the BOOT0 pin isn’t connected on the custom board. Consequences are described in the following post: ST Community
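For readers wondering why an unconnected BOOT0 matters: on STM32F4 parts the boot pins are sampled at reset and decide which memory the core starts from. A floating BOOT0 can be sampled as either level, so the chip may sometimes start from main flash and sometimes not, which would fit the on-and-off behaviour in this thread. Here is a minimal sketch of the selection table as I understand it from the STM32F4 reference manual; the enum and function names are made up for this example.

```cpp
// Memory region the core boots from after reset.
enum class BootArea {
    MainFlash,     // user application at 0x08000000
    SystemMemory,  // ST factory bootloader
    EmbeddedSram   // code previously loaded into SRAM
};

// Boot pin levels sampled at reset -> boot area.
// BOOT0 low always selects main flash; BOOT1 (shared with PB2 on this
// family) only matters when BOOT0 is high.
inline BootArea boot_area(bool boot0_high, bool boot1_high) {
    if (!boot0_high) {
        return BootArea::MainFlash;
    }
    return boot1_high ? BootArea::EmbeddedSram : BootArea::SystemMemory;
}
```

A pull-down resistor on BOOT0 keeps the first case true on every reset, which is what a board that should always run its flashed application needs.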

Have a nice day.
ABR

Thank you for sharing the fix! I’m sure this will save others a lot of headaches and frustration.
