CPU Architecture

Building a RISC-V Emulator in Rust

Create a functional RISC-V emulator from scratch, implementing the RV32I instruction set.

kos1
January 2, 2026
20 min read

Introduction

RISC-V is an open, royalty-free instruction set architecture (ISA) that is seeing growing adoption. Building an emulator is a practical way to understand how a CPU fetches, decodes, and executes instructions. We'll implement the core of RV32I, the base 32-bit integer instruction set.

RISC-V Architecture Basics

RISC-V is elegantly simple:

  • 32 registers - x0-x31, each 32 bits wide (x0 is hardwired to zero)
  • Program counter - Holds the address of the current instruction
  • Fixed 32-bit instructions - Every base-ISA instruction is 4 bytes, which keeps decoding simple
  • Load/store architecture - Only load and store instructions access memory

CPU State Structure

struct CPU {
    regs: [u32; 32],    // 32 general-purpose registers
    pc: u32,            // Program counter
    memory: Vec<u8>,    // RAM (byte-addressed)
}

impl CPU {
    fn new() -> Self {
        let mut cpu = CPU {
            regs: [0; 32],
            pc: 0,
            memory: vec![0; 1024 * 1024], // 1MB RAM
        };
        cpu.regs[2] = 0x10000; // Stack pointer (sp)
        cpu
    }

    fn read_reg(&self, idx: usize) -> u32 {
        if idx == 0 { 0 } else { self.regs[idx] }
    }

    fn write_reg(&mut self, idx: usize, val: u32) {
        if idx != 0 { self.regs[idx] = val; }
    }
}
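The code later in the article calls `fetch` and `mem_read32`, which are not shown. The following is a minimal sketch of how such memory access could look, assuming little-endian, byte-addressed RAM (RISC-V is little-endian by default). The `Memory` type and its `read32`/`write32` methods are hypothetical stand-ins for the `memory` field, not the author's implementation; they report out-of-range accesses instead of panicking.

```rust
/// Hypothetical stand-in for the `memory: Vec<u8>` field of `CPU`.
struct Memory {
    bytes: Vec<u8>,
}

impl Memory {
    fn new(size: usize) -> Self {
        Memory { bytes: vec![0; size] }
    }

    /// Read a little-endian 32-bit word; `None` if any byte is out of range.
    fn read32(&self, addr: u32) -> Option<u32> {
        let start = addr as usize;
        let end = start.checked_add(4)?;
        let b = self.bytes.get(start..end)?;
        Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    }

    /// Write a little-endian 32-bit word; returns `false` if out of range.
    fn write32(&mut self, addr: u32, val: u32) -> bool {
        let start = addr as usize;
        if start > self.bytes.len() || self.bytes.len() - start < 4 {
            return false;
        }
        self.bytes[start..start + 4].copy_from_slice(&val.to_le_bytes());
        true
    }
}
```

With helpers like these, a `fetch` would amount to `read32(pc)`, and a program can be loaded by writing its instruction words starting at address 0.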

Instruction Formats

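RV32I has six instruction formats (R, I, S, B, U, J). Each 32-bit instruction is laid out as follows, from bit 31 down to bit 0, with field widths in bits underneath:

// R-type: register-register operations
// | funct7 | rs2 | rs1 | funct3 | rd | opcode |
//    7       5     5      3       5      7

// I-type: immediate operations and loads
// | imm[11:0] | rs1 | funct3 | rd | opcode |
//     12        5      3       5      7

// S-type: stores
// | imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode |
//      7         5     5      3        5         7

// B-type: branches
// | imm[12|10:5] | rs2 | rs1 | funct3 | imm[4:1|11] | opcode |
//       7          5     5      3          5           7

// U-type: upper immediates (LUI, AUIPC)
// | imm[31:12] | rd | opcode |
//      20        5      7

// J-type: jumps (JAL)
// | imm[20|10:1|11|19:12] | rd | opcode |
//           20              5      7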
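The B-type layout scatters the branch offset across the instruction word, so it has to be reassembled before use. As a rough sketch (the function name `imm_b` is an assumption, not part of the article's code), the bits can be gathered and sign-extended like this:

```rust
/// Reassemble the 13-bit branch offset of a B-type instruction.
/// Bit 0 of the offset is always zero, so it is not stored in the encoding.
fn imm_b(inst: u32) -> i32 {
    let imm = ((inst >> 31) & 0x1) << 12   // imm[12]   from bit 31
        | ((inst >> 7) & 0x1) << 11        // imm[11]   from bit 7
        | ((inst >> 25) & 0x3F) << 5       // imm[10:5] from bits 30:25
        | ((inst >> 8) & 0xF) << 1;        // imm[4:1]  from bits 11:8
    // Sign-extend from 13 bits: move bit 12 up to bit 31, then shift back
    // arithmetically so the sign bit is copied downwards.
    ((imm << 19) as i32) >> 19
}
```

For example, `0x00208463` encodes `beq x1, x2, 8` and yields an offset of 8, while `0xFE000EE3` encodes `beq x0, x0, -4` and yields -4.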

Instruction Decoding

fn decode(&self, inst: u32) -> DecodedInst {
    let opcode = inst & 0x7F;
    let rd = ((inst >> 7) & 0x1F) as usize;
    let funct3 = (inst >> 12) & 0x7;
    let rs1 = ((inst >> 15) & 0x1F) as usize;
    let rs2 = ((inst >> 20) & 0x1F) as usize;
    let funct7 = (inst >> 25) & 0x7F;

    // I-type immediate (sign-extended)
    let imm_i = ((inst as i32) >> 20) as u32;

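    // S-type immediate: imm[11:5] from bits 31:25, imm[4:0] from bits 11:7.
    // The upper part is taken with an arithmetic shift so the result is sign-extended.
    let imm_s = ((((inst as i32) >> 25) << 5) as u32)
              | ((inst >> 7) & 0x1F);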

    DecodedInst { opcode, rd, funct3, rs1, rs2, funct7, imm_i, imm_s }
}
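The article does not show the `DecodedInst` type. A plausible definition, inferred only from the fields that `decode` returns, is sketched below together with a free-function version of the decoder so the example runs on its own. Treat the struct layout as an assumption; the S-type immediate here is sign-extended, as store offsets can be negative.

```rust
/// Assumed shape of the decoded instruction, based on the fields used above.
#[derive(Debug, PartialEq)]
struct DecodedInst {
    opcode: u32,
    rd: usize,
    funct3: u32,
    rs1: usize,
    rs2: usize,
    funct7: u32,
    imm_i: u32, // sign-extended I-type immediate
    imm_s: u32, // sign-extended S-type immediate
}

fn decode(inst: u32) -> DecodedInst {
    DecodedInst {
        opcode: inst & 0x7F,
        rd: ((inst >> 7) & 0x1F) as usize,
        funct3: (inst >> 12) & 0x7,
        rs1: ((inst >> 15) & 0x1F) as usize,
        rs2: ((inst >> 20) & 0x1F) as usize,
        funct7: (inst >> 25) & 0x7F,
        // Arithmetic right shift on i32 copies the sign bit into the upper bits.
        imm_i: ((inst as i32) >> 20) as u32,
        imm_s: ((((inst as i32) >> 25) << 5) as u32) | ((inst >> 7) & 0x1F),
    }
}
```

As a worked example, `0x00500093` is `addi x1, x0, 5`: the opcode is `0x13`, `rd` is 1, `rs1` is 0, and `imm_i` is 5. The word `0xFE512E23` is `sw x5, -4(x2)`, whose `imm_s` comes out as -4 when viewed as an `i32`.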

Executing Instructions

fn execute(&mut self) {
    let inst = self.fetch();
    let d = self.decode(inst);

    match d.opcode {
        0x33 => { // R-type (ADD, SUB, etc.)
            let result = match (d.funct3, d.funct7) {
                (0x0, 0x00) => self.read_reg(d.rs1).wrapping_add(self.read_reg(d.rs2)), // ADD
                (0x0, 0x20) => self.read_reg(d.rs1).wrapping_sub(self.read_reg(d.rs2)), // SUB
                (0x7, 0x00) => self.read_reg(d.rs1) & self.read_reg(d.rs2), // AND
                (0x6, 0x00) => self.read_reg(d.rs1) | self.read_reg(d.rs2), // OR
                (0x4, 0x00) => self.read_reg(d.rs1) ^ self.read_reg(d.rs2), // XOR
                _ => panic!("Unknown R-type"),
            };
            self.write_reg(d.rd, result);
        }
        0x13 => { // I-type (ADDI, etc.)
            let result = match d.funct3 {
                0x0 => self.read_reg(d.rs1).wrapping_add(d.imm_i), // ADDI
                0x7 => self.read_reg(d.rs1) & d.imm_i, // ANDI
                0x6 => self.read_reg(d.rs1) | d.imm_i, // ORI
                _ => panic!("Unknown I-type"),
            };
            self.write_reg(d.rd, result);
        }
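        0x03 => { // Loads (only LW is handled here; LB, LH, LBU, LHU need their own widths)
            let addr = self.read_reg(d.rs1).wrapping_add(d.imm_i);
            let val = match d.funct3 {
                0x2 => self.mem_read32(addr), // LW
                _ => panic!("Unsupported load width"),
            };
            self.write_reg(d.rd, val);
        }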
        // ... more opcodes
        _ => panic!("Unknown opcode: {:07b}", d.opcode),
    }

    // Advance to next instruction (branches and jumps will need to set pc themselves)
    self.pc = self.pc.wrapping_add(4);
}
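The `execute` function above always advances `pc` by 4, which works for arithmetic and loads but not for branches (opcode `0x63`). One possible way to handle them, sketched here as a standalone helper, is to compute the next `pc` from the comparison result. The name `branch_next_pc` and its parameters are assumptions for illustration; `offset` is the sign-extended B-type immediate.

```rust
/// Compute the next pc for a branch instruction.
/// `a` and `b` are the values of rs1 and rs2; `offset` is the B-type immediate.
/// Returns `None` for funct3 values that RV32I does not define for branches.
fn branch_next_pc(pc: u32, funct3: u32, a: u32, b: u32, offset: i32) -> Option<u32> {
    let taken = match funct3 {
        0x0 => a == b,                   // BEQ
        0x1 => a != b,                   // BNE
        0x4 => (a as i32) < (b as i32),  // BLT  (signed)
        0x5 => (a as i32) >= (b as i32), // BGE  (signed)
        0x6 => a < b,                    // BLTU (unsigned)
        0x7 => a >= b,                   // BGEU (unsigned)
        _ => return None,
    };
    Some(if taken {
        pc.wrapping_add(offset as u32) // offset is relative to the branch itself
    } else {
        pc.wrapping_add(4)             // fall through to the next instruction
    })
}
```

The signed and unsigned variants differ only in how the same 32 bits are interpreted: with `a = 0xFFFFFFFF` and `b = 1`, BLT is taken (-1 < 1) while BLTU is not.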

Running a Program

fn run(&mut self) {
    // Stop once a full 4-byte instruction can no longer be fetched at pc
    while (self.pc as u64) + 4 <= self.memory.len() as u64 {
        self.execute();
    }
}
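A loop that only stops at the end of memory will run into zero-filled RAM after the program, and the all-zero word is not a valid RISC-V instruction. A common simplification is to treat ECALL (`0x00000073`) as a halt request and to report faults instead of panicking. The sketch below is a deliberately tiny, self-contained variant that supports only ADDI, ADD, and ECALL; the names `Mini`, `Step`, `with_program`, and `max_steps` are hypothetical and not taken from the article.

```rust
#[derive(Debug, PartialEq)]
enum Step {
    Continue,
    Halt,       // ECALL reached (treated as "stop" in this sketch)
    Fault(u32), // pc of an instruction that could not be fetched or decoded
}

struct Mini {
    regs: [u32; 32],
    pc: u32,
    memory: Vec<u8>,
}

impl Mini {
    /// Create a machine with 1 KiB of RAM and the program loaded at address 0.
    fn with_program(words: &[u32]) -> Self {
        let mut memory = vec![0u8; 1024];
        for (i, w) in words.iter().enumerate() {
            memory[i * 4..i * 4 + 4].copy_from_slice(&w.to_le_bytes());
        }
        Mini { regs: [0; 32], pc: 0, memory }
    }

    fn step(&mut self) -> Step {
        let start = self.pc as usize;
        let inst = match self.memory.get(start..start.saturating_add(4)) {
            Some(b) => u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
            None => return Step::Fault(self.pc),
        };
        if inst == 0x0000_0073 {
            return Step::Halt; // ECALL
        }
        let rd = ((inst >> 7) & 0x1F) as usize;
        let rs1 = ((inst >> 15) & 0x1F) as usize;
        let rs2 = ((inst >> 20) & 0x1F) as usize;
        let result = match (inst & 0x7F, (inst >> 12) & 0x7, inst >> 25) {
            (0x13, 0x0, _) => self.regs[rs1].wrapping_add(((inst as i32) >> 20) as u32), // ADDI
            (0x33, 0x0, 0x00) => self.regs[rs1].wrapping_add(self.regs[rs2]),            // ADD
            _ => return Step::Fault(self.pc),
        };
        if rd != 0 {
            self.regs[rd] = result; // x0 stays zero
        }
        self.pc = self.pc.wrapping_add(4);
        Step::Continue
    }

    /// Run until a halt or fault, or until the step budget is used up.
    fn run(&mut self, max_steps: usize) -> Step {
        for _ in 0..max_steps {
            match self.step() {
                Step::Continue => {}
                other => return other,
            }
        }
        Step::Continue
    }
}
```

Loading the four words `0x00500093`, `0x00700113`, `0x002081B3`, `0x00000073` (`addi x1, x0, 5`; `addi x2, x0, 7`; `add x3, x1, x2`; `ecall`) and calling `run` ends with `Step::Halt` and the sum in `x3`. The step budget guards against programs that never halt.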

Conclusion

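A basic RISC-V emulator can be built in a few hundred lines of code. From here, you can fill in the remaining RV32I instructions (stores, branches, jumps, shifts), add the M extension (multiply/divide) or the C extension (compressed instructions), or even write a simple operating system to run on it.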