Submitted by MarcAdmin on Wed, 11/06/2019 - 08:03

The previous chapter of this tutorial went over the steps required to set up a RISC-V development environment and create a program that runs on a bare metal VirtIO board using QEMU. Even though the example program, which calculates the sum of two integers, was written in RISC-V assembly, no prior knowledge was required to follow along. This chapter dives into the details of the RISC-V assembly language and expands on what exactly is happening at each step of development. The topics covered include an overview of the RISC-V architecture, its assembly instructions, pseudo-instructions, and directives.

The following listing illustrates the assembly code of the add.s program from the previous chapter:

1:         .text
2:         .global _start
3: _start:
4:         li      a0, 5
5:         li      a1, 4
6:         add     a0, a0, a1
7: stop:   j       stop

The code was changed slightly to use a different set of registers. When assembled, this program results in an object file, which can then be linked to create the file that will be loaded onto the board. In this scenario only one object file is used; more complex programs may require more than one. The linker's job is to put all of these files together into a single executable program.

This add.s program is composed of one instruction (line 6), three pseudo-instructions (lines 4, 5, and 7), and two directives (lines 1 and 2). Moreover, the instructions and pseudo-instructions have operands that are either registers or immediates. Understanding each of these entities will help when creating more complex programs in RISC-V assembly.

Registers

RISC-V systems will have a base set of 32 registers x0-x31. The x0 register is read-only with a value fixed to zero. The rest will have varying content. The application binary interface (ABI) prescribes conventions for the name and usage of the various registers. These are listed in the following table:

Register(s)	ABI Name(s)	Description	Saved by
x0	zero	Hard-wired zero	N/A
x1	ra	Function return address	Caller
x2	sp	Stack pointer	Callee
x3	gp	Global pointer	N/A
x4	tp	Thread pointer	N/A
x5	t0	Temporary/alternate link register	Caller
x6-x7	t1-t2	Temporary values	Caller
x8	s0/fp	Saved register/Frame pointer	Callee
x9	s1	Saved register	Callee
x10-x11	a0-a1	Function arguments/Return values	Caller
x12-x17	a2-a7	Function arguments	Caller
x18-x27	s2-s11	Saved registers	Callee
x28-x31	t3-t6	Temporary values	Caller

Registers can be referred to by their ABI names or by their architectural names (x0-x31) in assembly programs; the two are interchangeable.
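
The mapping in the table can be captured as a small lookup. The following is a minimal sketch in Python, not part of any toolchain; the names ABI_NAMES and abi_to_x are made up for illustration.

```python
# ABI register names in architectural order x0..x31, taken from the table above.
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp"]       # x0-x4
    + ["t0", "t1", "t2"]                   # x5-x7
    + ["s0", "s1"]                         # x8-x9 (s0 doubles as fp)
    + [f"a{i}" for i in range(8)]          # x10-x17
    + [f"s{i}" for i in range(2, 12)]      # x18-x27
    + [f"t{i}" for i in range(3, 7)]       # x28-x31
)

def abi_to_x(name: str) -> int:
    """Return the register number for an ABI name such as 'a0' or 'fp'."""
    if name == "fp":                       # fp is an alias for s0
        name = "s0"
    return ABI_NAMES.index(name)

print(abi_to_x("a0"), abi_to_x("a1"), abi_to_x("sp"))  # 10 11 2
```

The numbers returned for a0 and a1 (10 and 11) are the values that end up in the register fields of the machine instructions discussed later in this chapter.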

When a function is invoked, it may modify the values of some of these registers. For this reason it is advisable to save the contents of those registers in memory in order to be able to restore them when the function completes. The ABI convention prescribes which party in a function call (the caller or the callee) is responsible for saving these values. This convention is described in the "Saved by" column of the table.

If a register is to be saved by the caller, its value should be stored in a stack frame allocated for that purpose before the function is called. This ensures that the value can be restored when the function returns. A caller only needs to save the caller-saved registers whose values it still needs after the call; if it does not know which of them the callee modifies, it should assume that all of them may be overwritten.

Registers to be saved by the callee only need to be saved to memory if the function uses those registers. With respect to these registers, functions must not leave a trace: the stack pointer and every callee-saved register must hold the same value on return as when the function was invoked. Only caller-saved registers, including the one that carries the function result, may differ.

A function implementation defined in RISC-V assembly should use the following prologue before doing any of its work:

function_label:
        addi    sp, sp, -framesize      # The stack grows downward
        sd      ra,framesize-8(sp)      # Save the return address
        # Save registers owned by the callee as needed to memory.

This will ensure that the function can return to the point where it was called, and that any register state will be saved.

Before a function returns, the saved register values must be restored. This is achieved by the following epilogue, which should end every function:

# restore registers from the stack if needed
        ld      ra, framesize-8(sp)     # Restore the return address register
        addi    sp, sp, framesize       # Pop the stack
        ret     # return to the caller

This will restore the saved registers, set the return address and de-allocate the stack frame that was used to save this information.

The add.s program can be enhanced to use the function prologue and epilogue to define a function that calculates the sum of its arguments in registers a0 and a1. This new program is illustrated in the listing that follows:

        .text
        .align 2
        .global sum
sum:
        addi    sp, sp, -32     # Stack frames must be 16-byte aligned
        sd      ra, 24(sp)      # Save the return address
        add     a0, a0, a1      # Add the function operands
        ld      ra, 24(sp)      # restore return address
        addi    sp, sp, 32      # De-allocate the stack frame
        ret
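
The frame size in the prologue has to keep the stack pointer aligned to 16 bytes. As a rough sketch, assuming 8 bytes per saved 64-bit register, a frame size could be chosen as follows; frame_size is a hypothetical helper, not something the assembler provides.

```python
def frame_size(saved_registers: int, reg_bytes: int = 8, align: int = 16) -> int:
    """Bytes to reserve for the given number of save slots, rounded up to align."""
    raw = saved_registers * reg_bytes
    return (raw + align - 1) // align * align

print(frame_size(1))  # 16
print(frame_size(4))  # 32
```

The sum function reserves 32 bytes, which satisfies the alignment rule but is more than the single slot for ra strictly requires.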

The sum function can be called by name from a different assembler program. Create a main.s program with the following content:

        .text
        .align 2
        .global _start
_start:
        li      a0, 5
        li      a1, 4
        call    sum
stop:   j       stop

This will load the values 5 and 4 into the registers used for arguments to the sum function and call it. This can be assembled and linked as follows:

$ riscv64-unknown-elf-as -o add.o add.s
$ riscv64-unknown-elf-as -o main.o main.s
$ riscv64-unknown-elf-ld -Ttext=0x80000000 -o sum.elf main.o add.o

If this program is assembled, linked, and run in QEMU, it will call the sum function to calculate the sum of the operands. This can be verified by inspecting the value of register a0 which should be 9.

NOTE: The order in which the object files are supplied to the linker is important. If the add.o file is supplied first, the program will not run correctly because the sum function, rather than _start, will be located at the reset address.

Instructions

Instructions are mnemonics that map directly to machine code. For example, the add instruction in the sum function corresponds to the op-code 0x33 (or b0110011).

When the add instruction is combined with its operands, the result is a single machine instruction. In RISC-V, all machine instructions are 32 bits long (unless you're using the compressed extension, but for now we'll just deal with 32-bit instructions).

The add instruction is what's known as an R-type instruction because its operands are all registers. This type of instruction has the following form:

BITS	31:25	24:20	19:15	14:12	11:7	6:0
R-Type	func7	rs2	rs1	func3	rd	opcode

In this table, the operation's function is a combination of func7 and func3. For the add instruction, these are b0000000 and b000. The rs2 and rs1 fields are the source registers whose values will be added. The rd field is the destination register for the result. Notice that the register fields are 5 bits wide. This allows the instruction to reference any of the 32 base registers (i.e. 2⁵ registers). The add instruction from the previous example is constructed as follows:

func7: b0000000
rs2: b01011 (for x11 which is a1)
rs1: b01010 (for x10 which is a0)
func3: b000
rd: b01010 (for x10)
opcode: b0110011

Putting it all together, we get b00000000101101010000010100110011, or 0x00B50533. We can confirm this by disassembling the object file that was created:

$ riscv64-unknown-elf-objdump -d add.o

sum.o:     file format elf64-littleriscv


Disassembly of section .text:

0000000000000000 <sum>:
   0:   fe010113                addi    sp,sp,-32
   4:   00113c23                sd      ra,24(sp)
   8:   00b50533                add     a0,a0,a1
   c:   01813083                ld      ra,24(sp)
  10:   02010113                addi    sp,sp,32
  14:   00008067                ret

The add instruction in the sum function is at offset 8 of the object file. Its machine instruction is 00b50533, which is the value that we calculated by hand.
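
The same calculation can be reproduced in a few lines of code. This is a minimal sketch that packs the six R-Type fields with shifts; encode_r_type is an illustrative name and the field names follow the table above.

```python
def encode_r_type(func7: int, rs2: int, rs1: int, func3: int, rd: int, opcode: int) -> int:
    """Pack the six R-Type fields into one 32-bit instruction word."""
    return (func7 << 25) | (rs2 << 20) | (rs1 << 15) | (func3 << 12) | (rd << 7) | opcode

# add a0, a0, a1  ->  rd = x10, rs1 = x10, rs2 = x11
word = encode_r_type(0b0000000, 11, 10, 0b000, 10, 0b0110011)
print(f"{word:08x}")  # 00b50533
```

Each shift amount is the lowest bit position of the corresponding field in the R-Type layout.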

In addition to R-Type instructions, there are also I-Type instructions that operate on immediates (literal values) and also cover loads from memory, S-Type for storing to memory, U-Type for loading a 20-bit upper immediate into a register (lui and auipc), B-Type for conditional branching, and J-Type for jumps (e.g. function calls). The following table describes the layout of each of these instruction types:

BITS	31:25	24:20	19:15	14:12	11:7	6:0
R-Type	func7	rs2	rs1	func3	rd	opcode
I-Type	imm[11:5]	imm[4:0]	rs1	func3	rd	opcode
S-Type	imm[11:5]	rs2	rs1	func3	imm[4:0]	opcode
U-Type	imm[31:25]	imm[24:20]	imm[19:15]	imm[14:12]	rd	opcode
B-Type	imm[12,10:5]	rs2	rs1	func3	imm[4:1,11]	opcode
J-Type	imm[20,10:5]	imm[4:1,11]	imm[19:15]	imm[14:12]	rd	opcode
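
To see how the immediate fields are used, the following sketch encodes the I-Type and S-Type instructions from the sum function and compares them with the disassembly shown earlier. The helper names are illustrative; the opcode and func3 values are those of addi, ld, and sd in the base ISA.

```python
def encode_i_type(imm: int, rs1: int, func3: int, rd: int, opcode: int) -> int:
    """I-Type: a 12-bit signed immediate occupies bits 31:20."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (func3 << 12) | (rd << 7) | opcode

def encode_s_type(imm: int, rs2: int, rs1: int, func3: int, opcode: int) -> int:
    """S-Type: the 12-bit immediate is split into imm[11:5] and imm[4:0]."""
    imm &= 0xFFF
    return ((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15) | (func3 << 12) | ((imm & 0x1F) << 7) | opcode

RA, SP = 1, 2  # x1 and x2
print(f"{encode_i_type(-32, SP, 0b000, SP, 0b0010011):08x}")  # addi sp,sp,-32 -> fe010113
print(f"{encode_s_type(24, RA, SP, 0b011, 0b0100011):08x}")   # sd ra,24(sp)   -> 00113c23
print(f"{encode_i_type(24, SP, 0b011, RA, 0b0000011):08x}")   # ld ra,24(sp)   -> 01813083
```

Masking the immediate with 0xFFF turns a negative Python integer such as -32 into its 12-bit two's-complement form.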

Pseudo-Instructions

Unlike instructions, pseudo-instructions do not map directly to op-codes. Typically these represent idioms to make the programmer's life a little easier.

For example, the li used in the main program is a pseudo-instruction. As explained in the previous chapter, for a value that fits in a 12-bit signed immediate it maps to an addi I-Type instruction which adds the immediate value to x0 and stores the result in the destination register. Pseudo-instructions provide convenient mnemonics for programming without adding additional op-codes.
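
As a sketch of this mapping, the function below encodes li for small values as addi with x0 as the source register. expand_li is a hypothetical name, and the sketch deliberately rejects values outside the 12-bit signed range, for which the assembler emits more than one instruction.

```python
def expand_li(rd: int, value: int) -> int:
    """Encode 'li rd, value' as 'addi rd, x0, value' (12-bit signed values only)."""
    if not -2048 <= value <= 2047:
        raise ValueError("value does not fit in a single addi")
    ADDI_OPCODE, ADDI_FUNC3, X0 = 0b0010011, 0b000, 0
    return ((value & 0xFFF) << 20) | (X0 << 15) | (ADDI_FUNC3 << 12) | (rd << 7) | ADDI_OPCODE

print(f"{expand_li(10, 5):08x}")  # li a0,5 -> 00500513
print(f"{expand_li(11, 4):08x}")  # li a1,4 -> 00400593
```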

Pseudo-instructions may also translate to more than one machine instruction. For example, the call pseudo-instruction is translated into a sequence of two instructions, auipc and jalr, which the linker may later shorten to a single jal when the target is close enough.

Directives

Directives are commands for the assembler rather than instructions that it will translate into machine code. Directives can be used to tell the assembler where to place code and data in the resulting object file, or to set up the memory of the target system. The previous example used assembler directives to export global symbols, to set the alignment for instructions, and to ensure that the code is assembled into the ".text" section of the object file.

To understand the purpose of the assembler directives, it is important to understand how assembled code is linked together. The assembler produces object files that are combined to produce an Executable and Linkable Format (ELF) file. This file will be segmented into different sections:

.text: CPU instructions (the executable code).
.rodata: Read-only data.
.data: Global, mutable, initialized data.
.bss: Global, mutable, un-initialized data.

Up to now, only the text section has been used. The location where the code is loaded was specified using the "-Ttext" option when invoking the linker. If multiple object files are passed to the linker, their text sections are merged into a single contiguous section.

Code and data have different run-time requirements. Code is generally read-only, whereas data may require read-write permissions. It is therefore advantageous not to interleave code and data. To ensure this, the locations of the text and data sections of the program should not overlap.

To avoid having to define the position of each section at the command line, the linker allows the memory layout to be defined using a linker script:

OUTPUT_ARCH( "riscv" )
SECTIONS {
	. = 0x80000000;		/* Start of the program in memory */
	.text : {
		PROVIDE(_text_start = .);
		*(.text.init)
		main.o (.text)		/* Place _start first */
		*(.text .text.*)
		PROVIDE(_text_end = .);
	}
	PROVIDE(_global_pointer = .);
	.rodata : {
		PROVIDE(_rodata_start = .);
		*(.rodata .rodata.*)
		PROVIDE(_rodata_end = .);
	}
	.data : {
		. = ALIGN(4096);
		PROVIDE(_data_start = .);
		*(.sdata .sdata.*) *(.data .data.*)
		PROVIDE(_data_end = .);
	}
	.bss : {
		PROVIDE(_bss_start = .);
		*(.sbss .sbss.*) *(.bss .bss.*)
		PROVIDE(_bss_end = .);
	}
	PROVIDE(_stack_start = _bss_end);
	PROVIDE(_stack_end = _stack_start + 0x8000);	/* 32 KiB of stack */
}

The SECTIONS keyword is used to specify how the various sections are laid out in the file. In the linker script shown previously, the .text, .rodata, .data, and .bss sections are defined.

The .text section will include all code that follows a .section .text.init or .text assembler directive. The code in sum.s and main.s will therefore be included in this section. On line 7 of the linker script, the text section of the main.o object file is included explicitly. This ensures that the main program appears in the linked program before the sum function does (which is included by the wildcard on the next line).

The PROVIDE keyword defines a symbol, but only if the program references it and does not define it itself. The section symbols are assigned the value of the location counter (.) at the point of the definition, so the start and end of each section are provided by the linker. The start and end of the stack memory area are declared in the same way, relative to the end of the .bss section. In a later chapter, these symbols will be used to set up the stack pointer.
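
To make the role of the location counter concrete, the sketch below walks through the script by hand and computes where the stack symbols would land. The section sizes are assumptions chosen for illustration (0x30 bytes of code, which is twelve 4-byte instructions, and no data at all), and the sketch ignores any padding the linker might add between sections.

```python
def align_up(addr: int, boundary: int) -> int:
    """Round addr up to the next multiple of boundary, as ALIGN does."""
    return (addr + boundary - 1) // boundary * boundary

# Assumed section sizes in bytes (hypothetical, for illustration only).
TEXT_SIZE, RODATA_SIZE, DATA_SIZE, BSS_SIZE = 0x30, 0, 0, 0

dot = 0x80000000                    # . = 0x80000000;
dot += TEXT_SIZE                    # .text
dot += RODATA_SIZE                  # .rodata
dot = align_up(dot, 4096)           # . = ALIGN(4096); inside .data
dot += DATA_SIZE                    # .data
dot += BSS_SIZE                     # .bss
stack_start = dot                   # _stack_start = _bss_end
stack_end = stack_start + 0x8000    # _stack_end

print(hex(stack_start), hex(stack_end))  # 0x80001000 0x80009000
```

Because of the 4096-byte alignment in the .data section, even a tiny program pushes the stack area to the next page boundary.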

Putting it All Together

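The program can now be assembled and linked using the following sequence of commands (the file containing the sum function, called add.s earlier, is named sum.s here, and the linker script is saved as linker.lds):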

$ riscv64-unknown-elf-as -o sum.o sum.s
$ riscv64-unknown-elf-as -o main.o main.s
$ riscv64-unknown-elf-ld -T linker.lds -o sum.elf main.o sum.o

This will produce an ELF file called sum.elf. By inspecting the sum.elf file, we can see that the _start symbol shows up before the sum function:

$ riscv64-unknown-elf-objdump -d sum.elf 

sum.elf:     file format elf64-littleriscv


Disassembly of section .text:

0000000080000000 <_start>:
    80000000:	00500513          	li	a0,5
    80000004:	00400593          	li	a1,4
    80000008:	00009117          	auipc	sp,0x9
    8000000c:	ff810113          	addi	sp,sp,-8 # 80009000 <_stack_end>
    80000010:	008000ef          	jal	ra,80000018 <sum>

0000000080000014 <stop>:
    80000014:	0000006f          	j	80000014 <stop>

0000000080000018 <sum>:
    80000018:	fe010113          	addi	sp,sp,-32
    8000001c:	00113c23          	sd	ra,24(sp)
    80000020:	00b50533          	add	a0,a0,a1
    80000024:	01813083          	ld	ra,24(sp)
    80000028:	02010113          	addi	sp,sp,32
    8000002c:	00008067          	ret

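The disassembled sum.elf also shows how the pseudo-instructions in _start were translated:

auipc   sp,0x9
addi    sp,sp,-8
jal     ra,80000018

The call pseudo-instruction became the single jal instruction: call nominally expands to an auipc and jalr pair, but the linker shortens it to one jal when the target is within range. The auipc and addi pair is not part of the call. It loads the address of _stack_end into sp, which is what a la sp, _stack_end line expands to; the assembled main.s therefore appears to have contained such a line, which the listing shown earlier omits.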
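
The operands in this part of the disassembly can be checked by hand. The following sketch recomputes the address produced by the auipc and addi pair and the encoding of the jal instruction, using the addresses shown in the objdump output; the helper names are illustrative.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def auipc_addi_target(pc: int, upper: int, lower: int) -> int:
    """Address produced by 'auipc rd, upper' at pc followed by 'addi rd, rd, lower'."""
    return (pc + (sign_extend(upper, 20) << 12) + sign_extend(lower, 12)) & 0xFFFFFFFFFFFFFFFF

def encode_jal(rd: int, offset: int) -> int:
    """J-Type: scatter a 21-bit signed byte offset across bits 31:12."""
    imm = offset & 0x1FFFFF
    return ((((imm >> 20) & 0x1) << 31) | (((imm >> 1) & 0x3FF) << 21)
            | (((imm >> 11) & 0x1) << 20) | (((imm >> 12) & 0xFF) << 12)
            | (rd << 7) | 0b1101111)

print(hex(auipc_addi_target(0x80000008, 0x9, -8)))      # 0x80009000
print(f"{encode_jal(1, 0x80000018 - 0x80000010):08x}")  # 008000ef
```

The first result matches the _stack_end address in the objdump comment, and the second matches the machine word shown for the jal at address 80000010.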

This program can be run in QEMU just as before and the result should be the same as previous runs.

Conclusion

This chapter of the Bare Metal RISC-V tutorial covered the assembly language in a little more detail. Assembly programs are made up of directives, pseudo-instructions, and instructions. Directives provide guidance to the assembler on how to organize the assembled code. Pseudo-instructions provide useful mnemonics that are mapped to one or more primitive assembler instructions. Instructions are translated into binary machine instructions which direct the execution flow of the processor. In future chapters, this information will be utilised to make RISC-V processors do more interesting things.

RISC-V Bare Metal Programming Chapter 2: OpCodes Assemble!