Embedded - Geek went Freak!

Embedded

Ubuntu: Cross-compile baremetal Cortex-M assembly program

In this post, we will cross-compile a small baremetal program for an ARM processor on an Ubuntu machine.

ARM cross-compile toolchain

The first step is to install the ARM cross-compiler toolchain. Luckily, Ubuntu already has it in its software repository. Execute the following command in the terminal to install the ARM EABI compatible toolchain:

sudo apt install gcc-arm-none-eabi

Check the version of the installed compiler using the following command:

arm-none-eabi-gcc --version

Sample baremetal program

Now, we need a sample baremetal program to compile. I have chosen a very simple assembly program.

startup.S

.global _start
_start:
  B _reset /* Reset */
  B . /* Undefined */
  B . /* SWI */
  B . /* Prefetch Abort */
  B . /* Data Abort */
  B . /* reserved */
  B . /* IRQ */
  B . /* FIQ */

_reset:
  mov r1, #10
  ldr r0, =0x20000000
  str r1, [r0]
  ldr r2, [r0]
  B .
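For readers more used to C, the store and load in _reset correspond roughly to the sketch below. The helper name store_and_read_back is hypothetical and not part of the original program. It takes the address as a parameter so that it can also be tried on a host machine with an ordinary variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the str/ldr pair in _reset:
 * write `value` to `addr`, then read the same location back. */
static uint32_t store_and_read_back(volatile uint32_t *addr, uint32_t value)
{
    assert(addr != 0);
    *addr = value;   /* str r1, [r0] */
    return *addr;    /* ldr r2, [r0] */
}
```

On the target, the call would look like store_and_read_back((volatile uint32_t *)0x20000000u, 10), assuming RAM is mapped at that address as in the assembly listing.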

Assemble

Let's assemble the assembly file using the GNU assembler.

arm-none-eabi-as -mcpu=cortex-m3 -g startup.S -o startup.out

Link

Finally, let's link the object file startup.out generated by the assembler.

arm-none-eabi-ld -Ttext=0x0 -o startup.elf startup.out

Note: Since the program is very simple, I haven’t used any linker script here.

The -Ttext=0x0 option instructs the linker to place the .text section (the instructions) starting at address 0x0.

Ubuntu: Emulate baremetal Cortex-M program

In this post, we will emulate a baremetal program for Cortex-M on an Ubuntu PC.

Installation

We will need

  1. QEMU emulator for ARM
  2. GDB

Fortunately, both of them are available through the Ubuntu software repository.

Install them using the following command:

sudo apt install qemu-system-arm
sudo apt install gdb-arm-none-eabi

Emulation

We will use QEMU for emulation. GDB is used to control and inspect QEMU.

Launch QEMU

qemu-system-arm -monitor stdio -machine lm3s811evb -cpu cortex-m3 -s -S -kernel startup.elf
  • -monitor stdio
    Access the QEMU monitor from the terminal
  • -machine lm3s811evb -cpu cortex-m3
    Select machine lm3s811evb and CPU cortex-m3
  • -s
    Start GDB server on localhost:1234
  • -S
    Don’t start execution. This is used so we can start and control execution from GDB
  • -kernel startup.elf
    The executable file to execute

Launch GDB client

arm-none-eabi-gdb startup.elf

You should now be in the GDB interactive console.

Connect to QEMU

Let's connect to the GDB server hosted by QEMU from the GDB client:

target remote localhost:1234
load

Run the program

continue

Inspect

Press <Ctrl-c> to stop execution.

Check registers

In lines 13, 14 and 16 of startup.S, we update registers r1, r0 and r2 respectively. They should hold the values 10, 0x20000000 and 10 respectively.

info reg r0 r1 r2

Should print:

r0 0x20000000 536870912
r1 0xa 10
r2 0xa 10

Check memory

We write the value 10 to memory address 0x20000000. Let's check if that worked correctly:

x/4wx 0x20000000

0x20000000: 0x0000000a 0x00000000 0x00000000 0x00000000

STM32F: Calculating APB clock frequency (PCLKx)

The clock frequency of APB is determined through a long sequence of prescaling and selecting as shown in the image below:

APB clock source flow

Note: In this post, external oscillator and PLL are used to select SYSCLK.

| Term | Explanation |
| --- | --- |
| HSE | External clock frequency |
| PLLM | PLL division factor |
| PLLN | PLL multiplication factor |
| PLLP | SYSCLK division factor |
| HPRE | AHB prescaler |
| PPREx | APBx prescaler |

PLL

$$tex fVCO = \frac{HSE}{PLLM} * PLLN tex$$

SYSCLK

$$tex SYSCLK = \frac{fVCO}{PLLP} tex$$

AHB clock

$$tex HCLK = \frac{SYSCLK}{HPRE} tex$$

APB clock

$$tex PCLKx = \frac{HCLK}{PPREx} tex$$

An example

Let's consider an external oscillator with a frequency of 16 MHz. Let's say we need a SYSCLK and HCLK of 168 MHz.

>> HPRE = 1

This leaves us with,

$$tex \frac{PLLN}{PLLM * PLLP} = \frac{SYSCLK}{HSE} tex$$
$$tex \frac{PLLN}{PLLM * PLLP} = 10.5 tex$$

We can settle with the following values:

>> PLLN = 336
>> PLLM = 16
>> PLLP = 2  

Now, for a PCLKx of 42 MHz, we can pick:

>> PPREx = 4  
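The chain of formulas above can be checked with a small C sketch. The struct and function names (clock_config_t, vco_hz and so on) are made up for illustration and are not part of any STM32 library. The sketch works only on plain numbers and does not touch any clock registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the factors named in the table above. */
typedef struct {
    uint32_t hse_hz; /* HSE: external clock frequency in Hz */
    uint32_t pllm;   /* PLLM: PLL division factor */
    uint32_t plln;   /* PLLN: PLL multiplication factor */
    uint32_t pllp;   /* PLLP: SYSCLK division factor */
    uint32_t hpre;   /* HPRE: AHB prescaler */
    uint32_t ppre;   /* PPREx: APBx prescaler */
} clock_config_t;

/* fVCO = HSE / PLLM * PLLN (64-bit intermediate avoids overflow) */
static uint32_t vco_hz(const clock_config_t *c)
{
    assert(c->pllm != 0);
    return (uint32_t)(((uint64_t)c->hse_hz * c->plln) / c->pllm);
}

/* SYSCLK = fVCO / PLLP */
static uint32_t sysclk_hz(const clock_config_t *c)
{
    assert(c->pllp != 0);
    return vco_hz(c) / c->pllp;
}

/* HCLK = SYSCLK / HPRE */
static uint32_t hclk_hz(const clock_config_t *c)
{
    assert(c->hpre != 0);
    return sysclk_hz(c) / c->hpre;
}

/* PCLKx = HCLK / PPREx */
static uint32_t pclk_hz(const clock_config_t *c)
{
    assert(c->ppre != 0);
    return hclk_hz(c) / c->ppre;
}
```

With the example values (HSE = 16 MHz, PLLM = 16, PLLN = 336, PLLP = 2, HPRE = 1, PPREx = 4), these functions give a VCO of 336 MHz, a SYSCLK and HCLK of 168 MHz, and a PCLKx of 42 MHz.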

STM32F: What is PCLK and fPCLK

A couple of peripherals, such as SPI and UART, derive their clock from fPCLK. So which clock is PCLK, and what frequency is fPCLK?

PCLKx is the clock of the corresponding APB bus x, and fPCLKx is its frequency. For example:

| Clock | Bus |
| --- | --- |
| PCLK1 | APB1 |
| PCLK2 | APB2 |

Note: Similarly, HCLKx is the clock of the corresponding AHB bus x. For example:

| Clock | Bus |
| --- | --- |
| HCLK1 | AHB1 |
| HCLK2 | AHB2 |

So, when the SPI2 documentation says it derives its clock from fPCLK, it means the clock of the APB bus that SPI2 sits on. In the STM32F407, SPI2 is on APB1, so its fPCLK is fPCLK1.

LPC810: UART baudrate configuration

This post is about UART baudrate configuration on the LPC810.

The FRG (Fractional Rate Generator) and the BRG (Baud Rate Generator) can be used together to derive the desired baudrate.

Block diagram

Setup BRG

The BRG should produce an output clock rate 16 times the desired baudrate. The input clock to the BRG is BASECLK.

$$tex BRGVAL = \frac{BASECLK}{16 * Baudrate} tex$$

//Setup BRG
LPC_USART0->BRG = MAINCLK / (16 * aBaudRate);

Setup FRG

The output clock from the FRG is common to all UART peripherals.

$$tex UARTFRGMULT = \frac{FRGINCLK*(UARTFRGDIV+1)}{16 * Baudrate * BRGVAL} - (UARTFRGDIV+1) tex$$

It is easiest to set UARTFRGDIV to 255 (0xFF), so that UARTFRGDIV + 1 becomes 256.

//Set up FRG
LPC_SYSCON->UARTFRGDIV = 0xFF;
LPC_SYSCON->UARTFRGMULT = ((MAINCLK * 256) / (16 * aBaudRate * LPC_USART0->BRG))
    - 256;

Setup clock to FRG

$$tex UARTCLKDIV = \frac{MAINCLK}{FRGINCLK} tex$$

//Setup clock to FRG
LPC_SYSCON->UARTCLKDIV = UARTCLKDIV;
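To see what numbers the BRG and FRG formulas produce, here is a host-side C sketch that evaluates them with integer arithmetic, assuming UARTFRGDIV = 255. The type uart_dividers_t and the function uart_dividers are hypothetical names used only for this illustration. The sketch does not write any registers, and how the results map onto the register encodings should be checked against the LPC81x user manual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    uint32_t brgval;      /* BRGVAL from the BRG formula */
    uint32_t uartfrgmult; /* UARTFRGMULT, assuming UARTFRGDIV = 255 */
} uart_dividers_t;

/* Evaluate both formulas for a given FRG input clock and baudrate. */
static uart_dividers_t uart_dividers(uint32_t frg_in_hz, uint32_t baud)
{
    uart_dividers_t d;
    uint64_t scaled;

    /* The input clock must be at least 16 times the baudrate. */
    assert(baud != 0 && frg_in_hz / 16u >= baud);

    /* BRGVAL = clock / (16 * baudrate), truncated to an integer */
    d.brgval = frg_in_hz / (16u * baud);

    /* clock * 256 / (16 * baudrate * BRGVAL), in 64 bits to avoid overflow */
    scaled = ((uint64_t)frg_in_hz * 256u) / ((uint64_t)16u * baud * d.brgval);

    /* UARTFRGMULT = scaled - 256; the fraction left over by the integer BRGVAL */
    d.uartfrgmult = (uint32_t)(scaled - 256u);
    return d;
}
```

For example, a 12 MHz input clock and a baudrate of 115200 give BRGVAL = 6 and UARTFRGMULT = 21 with truncating division. Using a 64-bit intermediate matters here because clock * 256 can exceed 32 bits at higher clock rates.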

ARM SWD: SWO vs SWV

The pin and the protocol

SWD (Serial Wire Debug) is a low-pin-count debug and trace port. In its minimal configuration, SWD consists of a debug port that matches JTAG's functionality with just 2 wires.

These two wires are:

  1. SWCLK
  2. SWDIO

SWCLK is the clock for SWDIO, the synchronous, bi-directional, half-duplex communication channel that runs between the host and the target. The host can access the DAP (Debug Access Port), as defined by the ADI (ARM Debug Interface specification), through the SWDIO pin.

ARM cores have advanced tracing functionality enabled by ITM, ETM, DWT, etc. These units generate asynchronous trace messages that need to be sent from the processor to the host.

In addition to SWCLK and SWDIO, another wire can optionally be added to obtain trace functionality. This wire is called SWO (Serial Wire Output).

SWO is a unidirectional asynchronous pin with trace data flowing from the target to the host. SWV data can be sent over the SWO pin using either UART or Manchester encoding.

SWD+SWO

When the trace data is sent through the SWO pin, it is called Serial Wire Viewer (SWV).

Trace data can also be sent through a parallel trace port driven by the TPIU (Trace Port Interface Unit).

Conclusion

SWO is a pin/wire in the SWD port, whereas SWV is the tracing protocol and technology whose data is sent through the SWO pin.


NVIC: Interrupt preemption

The NVIC has very good support for priorities, priority grouping and preemption.

For good preemption support, the interrupt controller must also support priorities and priority grouping.

Strictly speaking, priorities are all we need for good preemption support. The idea is to preempt the lower priority active interrupt with a higher priority pending interrupt. This is done to decrease interrupt latency for higher priority tasks.
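The basic preemption rule can be written down as a one-line check. This is only a sketch of the idea, with a hypothetical function name, and it ignores priority grouping. It assumes the NVIC convention that a lower numeric value means a higher priority.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decide whether a pending interrupt may preempt
 * the currently active one. Lower numeric value = higher priority. */
static bool should_preempt(uint8_t active_priority, uint8_t pending_priority)
{
    /* Equal priorities do not preempt each other. */
    return pending_priority < active_priority;
}
```

For example, a pending interrupt with priority 1 would preempt an active handler running at priority 3, but not one running at priority 1 or 0.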

Nesting and stacking

TODO

How does priority grouping help interrupt preemption?

TODO

CPS instruction: Difference between Cortex-M vs others

Interrupts

Cortex-A and Cortex-M families have different interrupt and exception models.

Cortex-A has traditional interrupts through IRQ and FIQ, while Cortex-M has a vector table supported by the NVIC.

In Cortex-A, IRQ and FIQ are enabled/disabled using the I and F flags in the CPSR register.

#Enable IRQ interrupt
cpsie i
#Enable FIQ interrupt
cpsie f

#Disable IRQ interrupt
cpsid i
#Disable FIQ interrupt
cpsid f

One can also directly manipulate the I and F flags in the CPSR register using the msr and mrs instructions.

I_BIT = 0x80
F_BIT = 0x40

#Disables IRQ and FIQ interrupts
mrs r0, cpsr
orr r0, r0, #I_BIT|F_BIT
msr cpsr_c, r0

#Enables IRQ and FIQ interrupts
mrs r0, cpsr
bic r0, r0, #I_BIT|F_BIT
msr cpsr_c, r0

Cortex-M has neither IRQ nor FIQ. Interrupts can be disabled and enabled using the PRIMASK and FAULTMASK registers.

msr PRIMASK, r0
msr FAULTMASK, r0

Even though Cortex-M has neither a CPSR nor I and F flags, it has the cpsie and cpsid instructions to enable and disable interrupts and fault exceptions. When these instructions are used on Cortex-M microcontrollers, they affect PRIMASK and FAULTMASK rather than the CPSR register.

#Enable interrupts and configurable fault handlers (clear PRIMASK)
cpsie i
#Enable interrupts and fault handlers (clear FAULTMASK)
cpsie f

#Disable interrupts and configurable fault handlers (set PRIMASK)
cpsid i
#Disable interrupts and all fault handlers (set FAULTMASK)
cpsid f

Execution modes

Cortex-A processors have several execution modes. The current mode can be read and changed through the 5 least significant bits of the CPSR register.

C_BIT = 0x1F
USER_BITS = 0b10000

mrs r0, cpsr
bic r0, #C_BIT
orr r0, #USER_BITS
msr CPSR_c, r0
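The bit manipulation in the Cortex-A listings can be mirrored in C on a plain 32-bit value, which makes it easy to check on a host. This is a sketch only. The helper names are hypothetical, MODE_MASK corresponds to C_BIT and USER_MODE to USER_BITS in the listing, and nothing here touches a real CPSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define I_BIT     0x80u /* set = IRQ disabled */
#define F_BIT     0x40u /* set = FIQ disabled */
#define MODE_MASK 0x1Fu /* 5 least significant bits hold the mode */
#define USER_MODE 0x10u /* 0b10000 */

/* Extract the mode field from a CPSR value. */
static uint32_t cpsr_mode(uint32_t cpsr)
{
    return cpsr & MODE_MASK;
}

/* IRQs are enabled when the I flag is clear. */
static bool cpsr_irq_enabled(uint32_t cpsr)
{
    return (cpsr & I_BIT) == 0;
}

/* FIQs are enabled when the F flag is clear. */
static bool cpsr_fiq_enabled(uint32_t cpsr)
{
    return (cpsr & F_BIT) == 0;
}

/* Same bic/orr sequence as the assembly: clear the mode bits, set the new mode. */
static uint32_t cpsr_with_mode(uint32_t cpsr, uint32_t mode)
{
    return (cpsr & ~MODE_MASK) | (mode & MODE_MASK);
}
```

For instance, the value 0x600000D3 has both I and F set and a mode field of 0x13. Switching it to user mode with cpsr_with_mode gives 0x600000D0.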

Cortex-M processors have only two execution modes: Thread and Handler. The mode itself changes only on exception entry and return. The least significant bit of the CONTROL register selects whether Thread mode runs privileged or unprivileged.

msr CONTROL, r0

Cortex-M Program Status Register

Since the M-profile has discarded the IRQ and FIQ exceptions as well as the classic execution modes, the corresponding bits in the PSR are unnecessary. So the M-profile adopts a new PSR format and new registers.

CPSR in non M-profile processors
CPSR

The Mode, I and F bits are meaningless in M-profile microcontrollers. In their place, the M-profile adds other fields such as the ISR number and the IT/ICI bits.

PSR in M-profile processors
PSR

It can be read in assembly using the following code:

mrs <rd>, PSR

These bits can also be accessed separately:

  1. APSR: Application Program Status Register
    • ALU flags
    • N, Z, C, V flags
    • mrs <rd>, APSR
  2. IPSR: Interrupt Program Status Register
    • Interrupt/Exception number
    • mrs <rd>, IPSR
  3. EPSR: Execution Program Status Register
    • IT, ICI, T bits
    • mrs <rd>, EPSR
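As an illustration of how the three views share one register, the sketch below splits a combined PSR value into a few of its fields in C. The struct and function names are hypothetical. The bit positions assumed here are the ARMv7-M ones (N, Z, C, V in bits 31 to 28, T in bit 24, exception number in bits 8 to 0), and the IT/ICI bits are left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for some fields of the combined PSR. */
typedef struct {
    bool n, z, c, v;     /* APSR: ALU flags */
    uint32_t exception;  /* IPSR: interrupt/exception number */
    bool thumb;          /* EPSR: T bit */
} psr_fields_t;

/* Split a combined PSR value into APSR, IPSR and EPSR fields. */
static psr_fields_t decode_psr(uint32_t psr)
{
    psr_fields_t f;
    f.n = ((psr >> 31) & 1u) != 0;
    f.z = ((psr >> 30) & 1u) != 0;
    f.c = ((psr >> 29) & 1u) != 0;
    f.v = ((psr >> 28) & 1u) != 0;
    f.thumb = ((psr >> 24) & 1u) != 0;
    f.exception = psr & 0x1FFu;
    return f;
}
```

As an example, the value 0x6100000F decodes to Z and C set, N and V clear, T set, and exception number 15.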