# Tutorial 10 - Privilege Level

## tl;dr

In early boot code, we transition from the `Hypervisor` privilege level (`EL2` in AArch64) to the
`Kernel` (`EL1`) privilege level.

## Table of Contents

- [Introduction](#introduction)
- [Scope of this tutorial](#scope-of-this-tutorial)
- [Checking for EL2 in the entrypoint](#checking-for-el2-in-the-entrypoint)
- [Transition preparation](#transition-preparation)
- [Returning from an exception that never happened](#returning-from-an-exception-that-never-happened)
- [Are we stackless?](#are-we-stackless)
- [Test it](#test-it)
- [Diff to previous](#diff-to-previous)

## Introduction

Application-grade CPUs have so-called `privilege levels`, which have different purposes:

| Typically used for | AArch64 | RISC-V | x86 |
| ------------- | ------------- | ------------- | ------------- |
| Userspace applications | EL0 | U/VU | Ring 3 |
| OS Kernel | EL1 | S/VS | Ring 0 |
| Hypervisor | EL2 | HS | Ring -1 |
| Low-Level Firmware | EL3 | M | |

`EL` in AArch64 stands for `Exception Level`. If you want more information regarding the other
architectures, please have a look at the following links:
- [x86 privilege rings](https://en.wikipedia.org/wiki/Protection_ring).
- [RISC-V privilege modes](https://content.riscv.org/wp-content/uploads/2017/12/Tue0942-riscv-hypervisor-waterman.pdf).

At this point, I strongly recommend that you skim `Chapter 3` of the [Programmer’s Guide for
ARMv8-A](http://infocenter.arm.com/help/topic/com.arm.doc.den0024a/DEN0024A_v8_architecture_PG.pdf)
before you continue. It gives a concise overview of the topic.

## Scope of this tutorial

If you set up your SD Card exactly as described in [tutorial 06], the Rpi will always start
executing in `EL2`. Since we are writing a traditional `Kernel`, we have to transition into the more
appropriate `EL1`.

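
To make "executing in `EL2`" a bit more tangible: on AArch64, the `CurrentEL` system register
reports the current exception level in bits `[3:2]` of its value. The following is only a minimal
host-side sketch of that decoding, not code from this tutorial or from the `cortex-a` crate. The
helper `exception_level()` and the two constants are hypothetical names chosen for illustration.

```rust
/// Bits [3:2] of the raw `CurrentEL` value hold the exception level.
/// (Hypothetical constant names, chosen for this sketch only.)
const CURRENT_EL_SHIFT: u64 = 2;
const CURRENT_EL_MASK: u64 = 0b11;

/// Hypothetical helper: extract the exception level (0 to 3) from a raw `CurrentEL` value.
fn exception_level(raw_current_el: u64) -> u8 {
    ((raw_current_el >> CURRENT_EL_SHIFT) & CURRENT_EL_MASK) as u8
}
```

With this encoding, a raw value of `0x8` decodes to `EL2` and `0x4` decodes to `EL1`, which is why
a check for `EL2` boils down to comparing the register value against `0x8`.
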

[tutorial 06]: https://github.com/rust-embedded/rust-raspi3-OS-tutorials/tree/master/06_drivers_gpio_uart#boot-it-from-sd-card

## Checking for EL2 in the entrypoint

First of all, we need to ensure that we are actually executing in `EL2` before we call the code
that transitions to `EL1`:

```rust
pub unsafe extern "C" fn _start() -> ! {
    // Expect the boot core to start in EL2.
    if (bsp::cpu::BOOT_CORE_ID == cpu::smp::core_id())
        && (CurrentEL.get() == CurrentEL::EL::EL2.value)
    {
        el2_to_el1_transition()
    } else {
        // If not core0, infinitely wait for events.
        wait_forever()
    }
}
```

If this is the case, we continue with preparing the `EL2` -> `EL1` transition in
`el2_to_el1_transition()`.

## Transition preparation

Since `EL2` is more privileged than `EL1`, it has control over various processor features and can
allow or disallow `EL1` code to use them. One such example is access to timer and counter
registers. We have been using them since [tutorial 08](../08_timestamps/), so of course we want to
keep them. Therefore, we set the respective flags in the [Counter-timer Hypervisor Control register]
and additionally set the virtual offset to zero so that we get the real physical value every time:

[Counter-timer Hypervisor Control register]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/cnthctl_el2.rs.html

```rust
// Enable timer counter registers for EL1.
CNTHCTL_EL2.write(CNTHCTL_EL2::EL1PCEN::SET + CNTHCTL_EL2::EL1PCTEN::SET);

// No offset for reading the counters.
CNTVOFF_EL2.set(0);
```

Next, we configure the [Hypervisor Configuration Register] so that `EL1` runs in `AArch64` mode
and not in `AArch32`, which would also be possible.

[Hypervisor Configuration Register]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/hcr_el2.rs.html

```rust
// Set EL1 execution state to AArch64.
HCR_EL2.write(HCR_EL2::RW::EL1IsAarch64);
```

## Returning from an exception that never happened

There is only one way to transition from a higher EL to a lower EL: executing the [ERET]
instruction.

[ERET]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/asm.rs.html#49-62

This instruction will copy the contents of the [Saved Program Status Register - EL2] to `Current
Program Status Register - EL1` and jump to the instruction address that is stored in the [Exception
Link Register - EL2].

This is basically the reverse of what happens when an exception is taken. You'll learn about it in
an upcoming tutorial.

[Saved Program Status Register - EL2]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/spsr_el2.rs.html
[Exception Link Register - EL2]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/elr_el2.rs.html

```rust
// Set up a simulated exception return.
//
// First, fake a saved program status where all interrupts were masked and SP_EL1 was used as a
// stack pointer.
SPSR_EL2.write(
    SPSR_EL2::D::Masked
        + SPSR_EL2::A::Masked
        + SPSR_EL2::I::Masked
        + SPSR_EL2::F::Masked
        + SPSR_EL2::M::EL1h,
);

// Second, let the link register point to runtime_init().
ELR_EL2.set(runtime_init::runtime_init as *const () as u64);
```

As you can see, we are populating `ELR_EL2` with the address of the [runtime_init()] function that
we previously called directly from the entrypoint. Finally, we set the stack pointer for `SP_EL1`
and call `ERET`:

[runtime_init()]: src/runtime_init.rs

```rust
// Set up SP_EL1 (stack pointer), which will be used by EL1 once we "return" to it.
SP_EL1.set(bsp::cpu::BOOT_CORE_STACK_START);

// Use `eret` to "return" to EL1. This results in execution of runtime_init() in EL1.
asm::eret()
```

## Are we stackless?

We just wrote a big inline Rust function, `el2_to_el1_transition()`, that is executed in a context
where we do not have a stack yet. We should double-check the generated machine code:

```console
$ make objdump
[...]
Disassembly of section .text:

0000000000080000 _start:
   80000:       mrs     x8, MPIDR_EL1
   80004:       tst     x8, #0x3
   80008:       b.ne    #0x10 <_start+0x18>
   8000c:       mrs     x8, CurrentEL
   80010:       cmp     w8, #0x8
   80014:       b.eq    #0xc <_start+0x20>
   80018:       wfe
   8001c:       b       #-0x4 <_start+0x18>
   80020:       mov     x8, xzr
   80024:       mov     w9, #0x3
   80028:       msr     CNTHCTL_EL2, x9
   8002c:       msr     CNTVOFF_EL2, x8
   80030:       adrp    x8, #0x0
   80034:       mov     w10, #-0x80000000
   80038:       mov     w11, #0x3c5
   8003c:       mov     w12, #0x80000
   80040:       msr     HCR_EL2, x10
   80044:       msr     SPSR_EL2, x11
   80048:       add     x8, x8, #0xda0
   8004c:       msr     ELR_EL2, x8
   80050:       msr     SP_EL1, x12
   80054:       eret
```

Looks good! No stack accesses are generated; everything stays in registers. Thanks to the
zero-overhead abstractions in the [cortex-a] crate! :heart_eyes:

[cortex-a]: https://github.com/rust-embedded/cortex-a

## Test it

In `main.rs`, we additionally inspect whether the mask bits in `SPSR_EL2` made it to `EL1` as well:

```console
$ make chainboot
[...]
Minipush 1.0

[MP] ⏳ Waiting for /dev/ttyUSB0
[MP] ✅ Connected
 __  __ _      _ _                 _
|  \/  (_)_ _ (_) |   ___  __ _ __| |
| |\/| | | ' \| | |__/ _ \/ _` / _` |
|_|  |_|_|_||_|_|____\___/\__,_\__,_|

           Raspberry Pi 3

[ML] Requesting binary
[MP] ⏩ Pushing 15 KiB =========================================🦀 100% 0 KiB/s Time: 00:00:00
[ML] Loaded! Executing the payload now

[    0.703812] Booting on: Raspberry Pi 3
[    0.704900] Current privilege level: EL1
[    0.706811] Exception handling state:
[    0.708592]       Debug:  Masked
[    0.710156]       SError: Masked
[    0.711719]       IRQ:    Masked
[    0.713283]       FIQ:    Masked
[    0.714848] Architectural timer resolution: 52 ns
[    0.717149] Drivers loaded:
[    0.718496]       1. BCM GPIO
[    0.719929]       2. BCM PL011 UART
[    0.721623] Timer test, spinning for 1 second
[    1.723753] Echoing input now
```

## Diff to previous