I am learning bare-metal development on ARM, for which I chose to simulate a Raspberry Pi 3 on QEMU. Hence, it is a virtual ARM Cortex-A53 implementing the ARMv8 architecture. I have compiled the following simple bare-metal code:

.global _start
_start:
1:  wfe
    b 1b

I launch it using:

qemu-system-aarch64 -M raspi3 -kernel kernel8.img -display none -S -s

and GDB is connected to it from another terminal using:

gdb-multiarch ./kernel8.elf -ex 'target remote localhost:1234' -ex 'break *0x80000' -ex 'continue'

So far everything works, and I can see the breakpoint being hit in GDB.

Reading symbols from ./kernel8.elf...
Remote debugging using localhost:1234
0x0000000000000000 in ?? ()
Breakpoint 1 at 0x80000: file start.S, line 5.
Continuing.

Thread 1 hit Breakpoint 1, _start () at start.S:5
5   1:  wfe
(gdb) info threads
  Id   Target Id                    Frame
* 1    Thread 1.1 (CPU#0 [running]) _start () at start.S:5
  2    Thread 1.2 (CPU#1 [running]) 0x0000000000000300 in ?? ()
  3    Thread 1.3 (CPU#2 [running]) 0x000000000000030c in ?? ()
  4    Thread 1.4 (CPU#3 [running]) 0x000000000000030c in ?? ()
(gdb) list
1   .section ".text.boot"
2
3   .global _start
4   _start:
5   1:  wfe
6       b 1b
(gdb)

As I understand it, all ARM cores execute the same code on reset, so ideally all the cores in my case should be running the same code. I want to verify that by setting breakpoints, and that is where the problem lies: the breakpoints for the other cores are never hit. If I am not mistaken, the threads GDB shows are simply the cores. I tried setting a thread-specific breakpoint, but it does not work:

(gdb) break *0x80000 thread 2
Note: breakpoint 1 (all threads) also set at pc 0x80000.
Breakpoint 2 at 0x80000: file start.S, line 5.
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0  0x0000000000000300 in ?? ()
(gdb) info threads
  Id   Target Id                    Frame
  1    Thread 1.1 (CPU#0 [running]) _start () at start.S:5
* 2    Thread 1.2 (CPU#1 [running]) 0x0000000000000300 in ?? ()
  3    Thread 1.3 (CPU#2 [running]) 0x000000000000030c in ?? ()
  4    Thread 1.4 (CPU#3 [running]) 0x000000000000030c in ?? ()
(gdb) s
Cannot find bounds of current function
(gdb) c
Continuing.
[Switching to Thread 1.1]

Thread 1 hit Breakpoint 1, _start () at start.S:5
5   1:  wfe
(gdb)

I deleted breakpoint 1 (the one set for all threads), and then `continue` on thread 2 hangs forever:

(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000080000 start.S:5
    breakpoint already hit 2 times
2       breakpoint     keep y   0x0000000000080000 start.S:5 thread 2
    stop only in thread 2
(gdb) delete br 1
(gdb) info break
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x0000000000080000 start.S:5 thread 2
    stop only in thread 2
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0  0x000000000000030c in ?? ()
(gdb) c
Continuing.

What can I do to get a breakpoint on core 2? What am I doing wrong here?

EDIT

I tried `set scheduler-locking on` (assuming this is what I need), but this does not seem to work for me either.

(gdb) break *0x80000
Breakpoint 3 at 0x80000: file start.S, line 5.
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0  0x000000000000030c in ?? ()
(gdb) set scheduler-locking on
(gdb) c
Continuing.


^C/build/gdb-OxeNvS/gdb-9.2/gdb/inline-frame.c:367: internal-error: void skip_inline_frames(thread_info*, bpstat): Assertion `find_inline_frame_state (thread) == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.

/build/gdb-OxeNvS/gdb-9.2/gdb/inline-frame.c:367: internal-error: void skip_inline_frames(thread_info*, bpstat): Assertion `find_inline_frame_state (thread) == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n)

EDIT 2

On @Frant's advice, I built the latest QEMU (6.2.0) locally and used the GDB available in the Arm toolchain.

naveen@workstation:~/.repos/src/arm64/baremetal/raspi3-tutorial/01_bareminimum$ /opt/qemu-6.2.0/build/qemu-system-aarch64 -version
QEMU emulator version 6.2.0
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

naveen@workstation:~/.repos/src/arm64/baremetal/raspi3-tutorial/01_bareminimum$ /opt/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gdb -version
GNU gdb (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 10.2.90.20210621-git

But I still have the problem. My other cores (threads 2, 3 and 4) never hit the breakpoints. It seems they are not even running my code, as the addresses they are stopped at do not look right.

(gdb) info threads
  Id   Target Id                    Frame
* 1    Thread 1.1 (CPU#0 [running]) _start () at start.S:5
  2    Thread 1.2 (CPU#1 [running]) 0x000000000000030c in ?? ()
  3    Thread 1.3 (CPU#2 [running]) 0x000000000000030c in ?? ()
  4    Thread 1.4 (CPU#3 [running]) 0x000000000000030c in ?? ()
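
A listing like the one above can also be checked mechanically. The following is a minimal sketch, not part of the original session: the function name `stray_cores` and its parameters are hypothetical, and it assumes the `info threads` row format shown above and that every core is expected either at address 0x80000 or inside `_start`.

```python
import re

# Matches rows of GDB's "info threads" output as printed for QEMU, e.g.:
#   "* 1    Thread 1.1 (CPU#0 [running]) _start () at start.S:5"
#   "  2    Thread 1.2 (CPU#1 [running]) 0x000000000000030c in ?? ()"
ROW = re.compile(
    r"^\*?\s*(\d+)\s+Thread\s+\S+\s+\(CPU#(\d+)\s+\[[^\]]*\]\)\s+(.*)$"
)


def stray_cores(listing, entry=0x80000, symbols=("_start",)):
    """Return the CPU numbers whose frame is neither at `entry` nor in `symbols`."""
    # A frame counts as "inside the image" if it starts with the expected
    # 64-bit address or with one of the known symbol names.
    expected = ("0x%016x" % entry,) + tuple(symbols)
    stray = []
    for line in listing.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header row or unrelated output
        cpu, frame = int(match.group(2)), match.group(3)
        if not frame.startswith(expected):
            stray.append(cpu)
    return stray
```

Fed the listing above, this reports CPU#1, CPU#2 and CPU#3 as being outside the loaded code, which matches the observation that only core 0 reaches `_start`.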

EDIT 3

The problem seems to be with my Makefile: when I built with the command suggested by Frant, it worked for me. Can someone please look at what is wrong with this Makefile:

CC = /opt/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf
CFLAGS = -Wall -O2 -ffreestanding -nostdinc -nostartfiles -nostdlib -g

all: clean kernel8.img

start.o: start.S
    ${CC}-gcc $(CFLAGS) -c start.S -o start.o

kernel8.img: start.o
    ${CC}-ld -g -nostdlib start.o -T link.ld -o kernel8.elf
    ${CC}-objcopy -O binary kernel8.elf kernel8.img

clean:
    rm kernel8.elf kernel8.img *.o >/dev/null 2>/dev/null || true

EDIT 4

It turns out that when I boot QEMU with kernel8.elf, everything works as expected. But when I use kernel8.img, which is a raw binary, I get the issue. After a bit of reading, I understand that the ELF file contains the "extra" information required to make the example work. But for clarification, how can I make kernel8.img work?
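
One concrete piece of that "extra" information is the entry point stored in the ELF file header; a flat image produced by `objcopy -O binary` contains only the section bytes, with no header at all. The sketch below illustrates the difference by reading `e_entry` from a little-endian ELF64 header. The helper names `elf_entry_point` and `read_entry` are hypothetical, and the sketch makes no claim about how QEMU's raspi3 machine uses this field when starting the secondary cores, which this question does not establish.

```python
import struct

# ELF64 file header, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def elf_entry_point(data):
    """Return e_entry of a little-endian ELF64 image, or None for a raw binary."""
    if len(data) < ELF64_HEADER.size or data[:4] != b"\x7fELF":
        return None  # no ELF magic: a flat image carries no entry address
    if data[4] != 2 or data[5] != 1:  # EI_CLASS = ELFCLASS64, EI_DATA = LSB
        raise ValueError("only little-endian ELF64 is handled in this sketch")
    fields = ELF64_HEADER.unpack_from(data)
    return fields[4]  # e_entry


def read_entry(path):
    """Read just the header of the file at `path` and return its entry point."""
    with open(path, "rb") as f:
        return elf_entry_point(f.read(ELF64_HEADER.size))
```

With the files from this question, `read_entry("kernel8.elf")` would be expected to return the address the linker script assigns to `_start`, while `read_entry("kernel8.img")` returns `None` because the image starts directly with the `wfe` instruction bytes.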

Naveen

1 Answer

You probably have an issue with the versions of gdb or qemu you are using, since I was not able to reproduce your problem with version 10.1 of aarch64-elf-gdb and version 6.2.0 of qemu-system-aarch64, both compiled from scratch on an Ubuntu 20.04.3 LTS system:

wfe.s:

        .global _start
_start:
1:      wfe
        b 1b

Building wfe.elf:

/opt/arm/10/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gcc -g -ffreestanding -nostdlib -nostartfiles -Wl,-Ttext=0x80000 -o wfe.elf wfe.s

Looking at generated code:

/opt/arm/10/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-objdump -d wfe.elf

wfe.elf:     file format elf64-littleaarch64


Disassembly of section .text:

0000000000080000 <_stack>:
   80000:       d503205f        wfe
   80004:       17ffffff        b       80000 <_stack>
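
As a side note on the second instruction word, `17ffffff` can be decoded by hand to confirm that it branches back to `0x80000`. The sketch below is only an illustration: `decode_b` is a hypothetical helper, and it handles nothing but the A64 unconditional `B` encoding (top six bits `000101`, followed by a signed 26-bit offset counted in instructions).

```python
def decode_b(insn, pc):
    """Return the target of an A64 unconditional B instruction located at pc."""
    if (insn >> 26) != 0b000101:
        raise ValueError("not an A64 B instruction")
    imm26 = insn & 0x03FFFFFF
    if imm26 & (1 << 25):  # sign-extend the 26-bit immediate
        imm26 -= 1 << 26
    return pc + imm26 * 4  # the offset is counted in 4-byte instructions


# imm26 is all ones, i.e. -1, so the branch goes back one instruction.
print(hex(decode_b(0x17FFFFFF, 0x80004)))  # 0x80000
```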

Starting qemu in a shell session:

/opt/qemu-6.2.0/bin/qemu-system-aarch64 -M raspi3b -kernel wfe.elf -display none -S -s

Starting gdb in another:

/opt/gdb/gdb-10.1-aarch64-elf-x86_64-linux-gnu/bin/aarch64-elf-gdb wfe.elf -ex 'target remote localhost:1234' -ex 'break *0x80000' -ex 'continue'

gdb session:

GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-linux-gnu --target=aarch64-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Remote debugging using localhost:1234
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0x0000000000080000 in ?? ()
Breakpoint 1 at 0x80000
Continuing.
[Switching to Thread 1.4]

Thread 4 hit Breakpoint 1, 0x0000000000080000 in ?? ()
(gdb) break *0x80000 thread 2
Note: breakpoint 1 (all threads) also set at pc 0x80000.
Breakpoint 2 at 0x80000
(gdb) info threads
  Id   Target Id                    Frame 
  1    Thread 1.1 (CPU#0 [running]) 0x0000000000080000 in ?? ()
  2    Thread 1.2 (CPU#1 [running]) 0x0000000000080000 in ?? ()
  3    Thread 1.3 (CPU#2 [running]) 0x0000000000080000 in ?? ()
* 4    Thread 1.4 (CPU#3 [running]) 0x0000000000080000 in ?? ()
(gdb) c
Continuing.
[Switching to Thread 1.2]

Thread 2 hit Breakpoint 1, 0x0000000000080000 in ?? ()
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000080000 
        breakpoint already hit 2 times
2       breakpoint     keep y   0x0000000000080000  thread 2
        stop only in thread 2
        breakpoint already hit 1 time
(gdb) del 1
(gdb) info b
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x0000000000080000  thread 2
        stop only in thread 2
        breakpoint already hit 1 time
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) 

The answers to your two questions are therefore:

  1. What can I do to get a breakpoint on core 2?

Exactly what you are doing.

  2. What am I doing wrong here?

Nothing, but you may be using old or buggy versions of gdb and/or qemu. My guess would be that gdb is the culprit in your case, but I may be wrong.

You can easily verify this by testing again with the version of gdb provided in the gcc toolchain available from Arm for the AArch64 ELF bare-metal target (aarch64-none-elf). I tried it, and it worked fine as well:

/opt/arm/10/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gdb wfe.elf -ex 'target remote localhost:1234' -ex 'break *0x80000' -ex 'continue'
GNU gdb (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 10.2.90.20210621-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=aarch64-none-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.linaro.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from wfe.elf...
Remote debugging using localhost:1234
_start () at wfe.s:3
3       1:      wfe
Breakpoint 1 at 0x80000: file wfe.s, line 3.
Continuing.

Thread 1 hit Breakpoint 1, _start () at wfe.s:3
3       1:      wfe
(gdb) break *0x80000 thread 2
Note: breakpoint 1 (all threads) also set at pc 0x80000.
Breakpoint 2 at 0x80000: file wfe.s, line 3.
(gdb)  info threads
  Id   Target Id                    Frame 
* 1    Thread 1.1 (CPU#0 [running]) _start () at wfe.s:3
  2    Thread 1.2 (CPU#1 [running]) _start () at wfe.s:3
  3    Thread 1.3 (CPU#2 [running]) _start () at wfe.s:3
  4    Thread 1.4 (CPU#3 [running]) _start () at wfe.s:3
(gdb) c
Continuing.
[Switching to Thread 1.2]

Thread 2 hit Breakpoint 1, _start () at wfe.s:3
3       1:      wfe
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000000000080000 wfe.s:3
        breakpoint already hit 2 times
2       breakpoint     keep y   0x0000000000080000 wfe.s:3 thread 2
        stop only in thread 2
        breakpoint already hit 1 time
(gdb) del 1
(gdb) info b
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x0000000000080000 wfe.s:3 thread 2
        stop only in thread 2
        breakpoint already hit 1 time
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3       1:      wfe
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3       1:      wfe
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3       1:      wfe
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3       1:      wfe
(gdb) c
Continuing.

Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3       1:      wfe
(gdb) 

Please note that explaining how to build the latest versions of gdb and qemu is out of the scope of the current answer.

Frant
  • Thanks @Frank , I followed your advice but I am still hitting the issue. Please check *Edit 2* as comment section is smaller to put the info. – Naveen Jan 05 '22 at 12:56
  • Hey... it seems the problem is with the Makefile that I am using. When I use your command to build it, it works fine for me. Can you please take a look at my Makefile in EDIT 3 – Naveen Jan 05 '22 at 13:06
  • Please feel free to accept my answer if you think it provides the correct responses to your questions. Could you please augment your original question with the content of your linker script, as well as with the output of the following command: `/opt/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-objdump -d kernel8.elf`? Thanks. – Frant Jan 05 '22 at 14:34
  • Yes, something is wrong, your cores should execute code at `0x0000000000080000` , not at `0x000000000000030c` - this is probably why your breakpoints at `0x80000` are not working as expected. – Frant Jan 05 '22 at 15:48
  • I found the problem. The Makefile is not an issue. I get issue only when I use `kernel.img` generated via objcopy. If I use `kernel.elf` (which is actually same as your wfe.elf) then there is no problem. Any idea why `kernel.img` is not working? – Naveen Jan 05 '22 at 17:47
  • Are there any reasons why you are not accepting the answer? Can I improve it in any way? – Frant Jan 05 '22 at 18:12
  • I cannot accept the answer because we still do not know why the original command using `kernel8.img` is not working but `kernel8.elf` works. If you can answer that , then we can close this. The objective is not just to find a working solution but also to clarify "why" it did not work earlier? – Naveen Jan 05 '22 at 18:27
  • I would disagree: I answered precisely to your two questions: 1) What can I do get a breakpoint on core 2? Exactly what you are doing. 2): What am I doing wrong here? Nothing - this is the right answer I guess since you never questioned your executable had an issue, but rather that the issue was related to the commands you were using for gdb/qemu and inside your gdb session. You would otherwise have provided your Makefile and linker script. The way I see this that you are asking a new question, I would be glad to answer to, but which is still a new question from my point of view. – Frant Jan 05 '22 at 18:50
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/240765/discussion-between-frant-and-naveen). – Frant Jan 05 '22 at 20:03
  • Ok let me post a new question then. – Naveen Jan 06 '22 at 04:23
  • Here's the follow-up question in case you want to answer : https://stackoverflow.com/questions/70603156/why-arm-cores-behaving-differently-with-an-elf-and-binary-file – Naveen Jan 06 '22 at 06:04