Issue
I'm trying to learn the basics of ARM assembly and wrote a fairly simple program to sort an array. I initially assembled it using the armv8-a option and ran the program under qemu while debugging with gdb. This worked fine and the program initialized the array and sorted it as expected.
Ultimately I would like to be able to write some assembly for my Raspberry Pi Pico, which has an ARM Cortex-M0+, which I believe uses the armv6-m option. However, when I change the directive in my code, it assembles fine but behaves strangely: the program counter increments by 4 after every instruction instead of the 2 that I expect for Thumb. This is causing my program to not work correctly. I suspect that qemu is trying to run my code as if it were built for the full ARM instruction set instead of Thumb, but I'm not sure why this is.
I am running on Ubuntu Linux 20.04 LTS, using qemu-arm version 4.2.1 (installed from the package manager). Does the qemu-arm executable only run full ARM binaries? If so, is there another qemu package I can install to run a Thumb binary?
Here is my code if it is helpful:
.arch armv6-m
.cpu cortex-m0plus
.syntax unified
.thumb
.data
arr: .skip 4 * 10
len: .word 10
.section .text
.global _start
.align 2
_start:
ldr r0, arr_adr @ load the address of the start of the array into register 0
movs r1, #0 @ clear the counter register
movs r2, #100
init_loop:
str r2, [r0,r1] @ store r2's value to the base address of the array plus the offset stored in r1
subs r2, r2, #10 @ subtract 10 from r2
adds r1, #4 @ add 4 to the offset (1 word in bytes)
cmp r1, #40 @ check if we've reached the end of the array
bne init_loop
movs r1, #0 @ clear the offset
out_loop:
mov r3, r1 @ set the index of the minimum value to the current array index
mov r4, r1 @ set the inner loop index to the outer loop index
in_loop:
ldr r5, [r0,r3] @ load the minimum index's value to r5
ldr r6, [r0,r4] @ load the inner loop's next value to r6
cmp r6, r5 @ compare the two values
bge in_loop_inc @ if r6 is greater than or equal to r5, increment and restart loop
mov r3, r4 @ set the minimum index to the current index
in_loop_inc:
adds r4, #4
cmp r4, #40 @ check if at end of array
blt in_loop
ldr r5, [r0,r3] @ load the minimum index value into r5
ldr r6, [r0,r1] @ load the current outer loop index value into r6
str r6, [r0,r3] @ swap the two values
str r5, [r0,r1]
adds r1, #4 @ increment outer loop index
cmp r1, #40 @ check if at end of array
blt out_loop
loop:
nop
b loop
arr_adr: .word arr
Thank you for your help!
Solution
There are a couple of concepts to disentangle here:
(1) Arm vs Thumb: these are two different instruction sets. Most CPUs support both, some support only one. Both are available simultaneously if the CPU supports both. To simplify a little bit, if you jump to an address with the least significant bit set that means "go to Thumb mode", and jumping to an address with that bit clear means "go to Arm mode". (Interworking is a touch more complicated than that, but that's a good initial mental model.) Note that all Arm instructions are 4 bytes long, but Thumb instructions can be either 2 or 4 bytes long.
(2) A-profile vs M-profile: these are two different families of CPU architecture. M-profile is "microcontrollers"; A-profile is "applications processors", which is "(almost) everything else". M-profile CPUs always support Thumb and only Thumb code. A-profile CPUs support both Arm and Thumb. The Raspberry Pi Pico uses a Cortex-M0+, which is M-profile.
(3) QEMU system emulation vs user-mode emulation: these are two different QEMU executables which run guest code in different ways. The system emulation binary (typically qemu-system-arm) runs "bare metal code", e.g. an entire OS. The guest code has full control and can handle exceptions, write to hardware devices, etc. The user emulation binary (typically qemu-arm) is for running Linux user-space binaries. Guest code is started in unprivileged mode and has access to the usual Linux system calls. For system emulation, which CPU is being emulated depends on what machine type you select with the -M or --machine option. For user-mode emulation, the default CPU is "A-profile with all supported features enabled" (this is --cpu max).
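Point (1) can be made concrete with a small sketch. The helper names below are hypothetical and only model the simplified picture described above: the first halfword of a Thumb instruction tells you whether it is a 2-byte or 4-byte encoding, and bit 0 of an interworking branch target selects the instruction set.

```python
def thumb_instruction_length(first_halfword):
    """Return the size in bytes of the Thumb instruction starting with this halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    # The prefixes 0b11101, 0b11110 and 0b11111 introduce a 32-bit Thumb
    # encoding; any other prefix is a complete 16-bit instruction.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def interworking_target(address):
    """Split a branch target into (instruction set, address with bit 0 cleared).

    This follows the simplified mental model only; real interworking has
    more corner cases.
    """
    if address & 1:
        return "Thumb", address & ~1
    return "Arm", address


print(thumb_instruction_length(0x2100))  # movs r1, #0 is a 16-bit encoding
print(thumb_instruction_length(0xF000))  # first half of a bl, a 32-bit encoding
print(interworking_target(0x8001))
```

The instructions in the question's program should all assemble to 16-bit encodings on armv6-m, so a PC that advances by 4 on every step is consistent with the code being executed as Arm rather than Thumb.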
You're currently using qemu-arm which means you get user-mode emulation. This should support Thumb binaries, but unless you pass it a --cpu option it will be using an A-profile CPU. I would also suggest using a newer QEMU for M-profile work, because a lot of M-profile CPU bugs have been fixed since version 4.2. I think 4.2 is also too old to have the Cortex-M0 CPU.
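One further thing worth checking, offered as an assumption rather than something stated in this answer: Linux-style ELF loading (which user-mode emulation mimics) normally chooses the starting instruction set from bit 0 of the ELF entry address, following the same least-significant-bit convention as in (1). With GNU as, _start usually only gets that bit when it is marked as a Thumb function, for example with .thumb_func. The sketch below reads the entry address from a 32-bit ELF header; the function name is hypothetical.

```python
import struct

EM_ARM = 40  # e_machine value for 32-bit Arm in the ELF specification


def elf32_entry_is_thumb(header):
    """Report whether a 32-bit Arm ELF header has bit 0 of e_entry set."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 1:  # EI_CLASS: 1 means ELFCLASS32
        raise ValueError("not a 32-bit ELF file")
    endian = "<" if header[5] == 1 else ">"  # EI_DATA: 1 means little-endian
    # e_type, e_machine, e_version and e_entry follow the 16-byte e_ident.
    _e_type, e_machine, _e_version, e_entry = struct.unpack_from(
        endian + "HHII", header, 16
    )
    if e_machine != EM_ARM:
        raise ValueError("not an Arm ELF file")
    return bool(e_entry & 1)
```

To try it, read the first 52 bytes of the linked executable in binary mode and pass them in. A result of False would suggest the program is being entered in Arm state.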
GDB should tell you in the PSR what the T bit is set to -- use that to check whether you're in Thumb mode or Arm mode, rather than looking at how much the PC is incrementing by.
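As a rough aid for reading that value, here is a minimal sketch that decodes the T bit from a raw PSR number, for instance one printed by GDB in hex. It assumes the architectural bit positions: bit 5 in the A-profile CPSR and bit 24 in the M-profile xPSR. The function name is hypothetical.

```python
CPSR_T_BIT = 1 << 5   # T bit position in the A-profile CPSR
XPSR_T_BIT = 1 << 24  # T bit position in the M-profile xPSR


def in_thumb_state(psr, m_profile=False):
    """Return True if the T bit is set in the given PSR value."""
    mask = XPSR_T_BIT if m_profile else CPSR_T_BIT
    return bool(psr & mask)


print(in_thumb_state(0x10))                        # A-profile user mode, T clear
print(in_thumb_state(0x30))                        # A-profile user mode, T set
print(in_thumb_state(0x01000000, m_profile=True))  # M-profile xPSR, T set
```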
There's currently no QEMU system emulation of the Raspberry Pi Pico (though somebody has been doing some experimental work on one). If your assembly is just basic "working with registers and a bit of memory" you can do that with the user-mode emulator. Or you can try the 'microbit' machine model, which is a Cortex-M0 board -- if you're not doing things that are specific to the Pi Pico that might be good enough.
Answered By - Peter Maydell