Skip to content

Commit

Permalink
Match emulator to circuits: use a Harvard Architecture (#688)
Browse files Browse the repository at this point in the history
Our circuits use a [modified](Modified_Harvard_architecture) [Harvard
architecture](https://en.wikipedia.org/wiki/Harvard_architecture) where
the memory space for instructions and data are completely separated,
even though they are both addressed with 32 bits. That is a Good Thing.

That means writing to the data cell with address x does not change the
value of the instruction cell with address x.

In contrast, our emulator uses a [von Neumann
architecture](https://en.wikipedia.org/wiki/Von_Neumann_architecture) at
its core, but goes through quite a few gymnastics to hide that fact.

> In a von Neumann architecture, data memory and stored program memory
are intermixed. Modern computers have moved away from this architecture
for [security](https://en.wikipedia.org/wiki/W%5EX) and performance
reasons.

In this PR, we turn our emulator into a Harvard Architecture machine as
well to match what our circuits are doing. That means we change the
emulator to explicitly load instructions not from the data RAM, but from
the program ROM, which we are already storing separately anyway.

> Instead of thinking about separate address spaces for data and code,
you can also imagine that we have an [instruction cache and a data
cache](https://en.wikipedia.org/wiki/Modified_Harvard_architecture#Split-cache_(or_almost-von-Neumann)_architecture),
and that we load the instructions into their cache once at the start of
the program, and then never invalidate that cache.

This works towards #630
  • Loading branch information
matthiasgoergens authored Dec 5, 2024
1 parent f97ae15 commit 43a9617
Show file tree
Hide file tree
Showing 4 changed files with 34 additions and 11 deletions.
28 changes: 22 additions & 6 deletions ceno_emul/src/rv32im.rs
Original file line number Diff line number Diff line change
Expand Up @@ -59,11 +59,10 @@ pub trait EmuContext {
// Get the value of a memory word without side-effects.
fn peek_memory(&self, addr: WordAddr) -> Word;

// Load from memory, in the context of instruction fetching.
// Only called after check_insn_load returns true.
fn fetch(&mut self, pc: WordAddr) -> Result<Word> {
self.load_memory(pc)
}
/// Load from instruction cache
// TODO(Matthias): this should really return `Result<DecodedInstruction>`
// because the instruction cache should contain instructions, not just words.
fn fetch(&mut self, pc: WordAddr) -> Option<Word>;

// Check access for instruction load
fn check_insn_load(&self, _addr: ByteAddr) -> bool {
Expand Down Expand Up @@ -465,12 +464,29 @@ impl Emulator {
pub fn step<C: EmuContext>(&self, ctx: &mut C) -> Result<()> {
let pc = ctx.get_pc();

// TODO(Matthias): `check_insn_load` should be unnecessary: we can statically
// check in `fn new_from_elf` that the program only has instructions where
// our platform accepts them.
if !ctx.check_insn_load(pc) {
ctx.trap(TrapCause::InstructionAccessFault)?;
return Err(anyhow!("Fatal: could not fetch instruction at pc={:?}", pc));
}
let Some(word) = ctx.fetch(pc.waddr()) else {
ctx.trap(TrapCause::InstructionAccessFault)?;
return Err(anyhow!("Fatal: could not fetch instruction at pc={:?}", pc));
};

let word = ctx.fetch(pc.waddr())?;
// TODO(Matthias): our `Program` that we are fetching from should really store
// already decoded instructions, instead of doing this weird, partial checking
// for `0x03` here.
//
// Note how we can fail here with an IllegalInstruction, and again further down
// when we match against the decoded instruction. We should centralise that. And
// our `step` function here shouldn't need to know anything about how instruction
// are encoded as numbers.
//
// One way to centralise is to do the check once when loading the program from the
// ELF.
if word & 0x03 != 0x03 {
// Opcode must end in 0b11 in RV32IM.
ctx.trap(TrapCause::IllegalInstruction(word))?;
Expand Down
3 changes: 3 additions & 0 deletions ceno_emul/src/tracer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -327,6 +327,9 @@ impl Tracer {
self.record.pc.after = pc;
}

// TODO(Matthias): this should perhaps record `DecodedInstruction`s instead
// of raw codes, or perhaps only the pc, because we can always look up the
// instruction in the program.
pub fn fetch(&mut self, pc: WordAddr, value: Word) {
self.record.pc.before = pc.baddr();
self.record.insn_code = value;
Expand Down
12 changes: 8 additions & 4 deletions ceno_emul/src/vm_state.rs
Original file line number Diff line number Diff line change
Expand Up @@ -190,10 +190,14 @@ impl EmuContext for VMState {
*self.memory.get(&addr).unwrap_or(&0)
}

fn fetch(&mut self, pc: WordAddr) -> Result<Word> {
let value = self.peek_memory(pc);
self.tracer.fetch(pc, value);
Ok(value)
// TODO(Matthias): this should really return `Result<DecodedInstruction>`
fn fetch(&mut self, pc: WordAddr) -> Option<Word> {
let byte_pc: ByteAddr = pc.into();
let relative_pc = byte_pc.0.wrapping_sub(self.program.base_address);
let idx = (relative_pc / WORD_SIZE as u32) as usize;
let word = self.program.instructions.get(idx).copied()?;
self.tracer.fetch(pc, word);
Some(word)
}

fn check_data_load(&self, addr: ByteAddr) -> bool {
Expand Down
2 changes: 1 addition & 1 deletion ceno_emul/tests/test_vm_trace.rs
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ fn test_empty_program() -> Result<()> {
);
let mut ctx = VMState::new(CENO_PLATFORM, empty_program);
let res = run(&mut ctx);
assert!(matches!(res, Err(e) if e.to_string().contains("IllegalInstruction(0)")));
assert!(matches!(res, Err(e) if e.to_string().contains("InstructionAccessFault")),);
Ok(())
}

Expand Down

0 comments on commit 43a9617

Please sign in to comment.