Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial PVM test vectors #3

Open
wants to merge 23 commits into
base: master
Choose a base branch
from
Open

Conversation

koute
Copy link

@koute koute commented Jun 24, 2024

Initial PVM test vectors/test suite.

This is still incomplete; not every instruction is covered yet and only very simple test cases were added. I will be expanding this aggressively.

Since we will still be making some changes (e.g. 64-bit support) I'll be explicitly versioning this, with a detailed changelog so that anyone who uses these tests can easily keep up.

pvm/README.md Outdated Show resolved Hide resolved
@sourabhniyogi
Copy link

@koute Can you kindly add test cases for the host functions of Appendix B.6/B.7/B.8?

Here are 4 groups, in priority order:

  1. LOOKUP, READ, WRITE, SOLICIT, HISTORICAL_LOOKUP, IMPORT, EXPORT [7]
  2. NEW, MACHINE, PEEK, POKE, INVOKE, TRANSFER [6]
  3. QUIT, INFO, GAS, CHECKPOINT, FORGET [5]
  4. EMPOWER, DESIGNATE, ASSIGN, UPGRADE, EXPUNGE [5]

The first group is DA-centric, the second group is service+VM setup/invocations -- the first 2 groups are valuable to connect to code up connections to state merklization + erasure coding, whereas the latter 2 groups can be done later as they are bookkeeping oriented and are easy to get right once we solve the first 2 groups.

Thank you!

@ec2
Copy link

ec2 commented Jul 8, 2024

How are these programs supposed to be consumed? The program blob parse expects things like the program to start with BLOB_MAGIC.

@koute
Copy link
Author

koute commented Jul 9, 2024

Can you kindly add test cases for the host functions of Appendix B.6/B.7/B.8?

For these initial test vectors the priority is to get basic tests for the instruction set ready. I will also add some host call tests later, but comprehensive test suite for all host calls is probably out-of-scope, at least for pure PVM tests.

How are these programs supposed to be consumed?

Take a look at the schema to see to which parameters of the Ψ equation from the Gray Paper they correspond to, and use them accordingly to test your own PVM implementation.

The program blob parse expects things like the program to start with BLOB_MAGIC.

Yes, my PolkaVM uses its own container format for the program blobs which the GP doesn't use. PolkaVM is not the source of truth for how a PVM should work, the GP is.

@ec2
Copy link

ec2 commented Jul 9, 2024

@koute Good point on the GP being the source of truth.

Take a look at the schema to see to which parameters of the Ψ equation from the Gray Paper they correspond to, and use them accordingly to test your own PVM implementation.

For context I'm building FFI bindings to PolkaVM. So would you say that these tests in particular are for folks who are implementing the PVM from scratch?

@sourabhniyogi
Copy link

Since we will still be making some changes (e.g. 64-bit support) I'll be explicitly versioning this, with a detailed changelog so that anyone who uses these tests can easily keep up.

We passed all the test vectors you provided so far. What is the reason for the GP needing to support 32-bit registers while the contracts pallet should definitely aim for 64-bit? After you have 64-bit PVM engineered for contracts pallet shouldn't the GP be adjusted to be 64-bit?

@koute
Copy link
Author

koute commented Jul 13, 2024

So would you say that these tests in particular are for folks who are implementing the PVM from scratch?

Yes.

What is the reason for the GP needing to support 32-bit registers while the contracts pallet should definitely aim for 64-bit?

This is only temporary. We will be migrating GP to 64-bit too; we just need to first prototype the design in PolkaVM to make sure it's solid. (Otherwise we might end up with a design that looks good on paper but is bad in practice.) We're working on it right now.

(That said, the changes when migrating to 64-bit won't be huge - the registers will be extended and there will be a couple of new instructions, but that's about it as far as major changes go.)

As far as the instruction set and the core semantics are concerned, we aim to have both PolkaJAM and pallet-contracts in alignment and we're making effort to make sure they don't diverge. (With PolkaJAM having the priority here, but I believe we can support both with the same VM.)

@ec2
Copy link

ec2 commented Jul 17, 2024

In GP(A.1), the program is defined as follows:
image

E(|c|) is the SCALE compact integer encoding on the length of c. So in this test case, |c| should be 3 (8, 135, 9), so shouldn't E(|c|) be [12] instead of 3?

Edit: Looks like I misunderstood |k| = |c|. It seems like this is talking about the bit length of the mask being equal to the the byte length of the instructions rounded to 8. Encoding question still stands though.

@koute
Copy link
Author

koute commented Jul 18, 2024

@ec2 Where did you read that these are SCALE compact integers? These are not SCALE compact integers. From the GP:

e

If you look at this equation and crosscheck it with how parity-scale-codec encodes compact integers you can see that they're not the same.

This is a slightly different varint serialization format which:

  • only supports up to 64-bits,
  • is more efficient to decode,
  • encodes numbers which fit within 7-bits as if raw-encoded,
  • is uniform and more compact when dealing with small numbers (SCALE compact encoding uses one more byte to encode integers between 64..128 and 16384..2097152, but needs one less byte to encode integers between 268435456..1073741824).

@ec2
Copy link

ec2 commented Jul 18, 2024

@koute GP(Appendix I.3) says that E() is the SCALE encode function. I also did see the screenshot you posted from the GP.

I'm not super familiar with SCALE so I assumed that the screenshot just formally describes how SCALE does variable int encoding.

@ec2
Copy link

ec2 commented Jul 19, 2024

@koute Sorry to keep hounding you here! I think I found a discrepancy between the testcases and the GP.
The test case inst_move_reg.json tests move_reg (opcode 82).

According to the GP,
image

In the test, the supplied arg to move_reg is [121]. And so, r_A = min(12, 121%16) = 9 and r_D = min(12, 121/16) = 7. So the mutation will end up being reg[7] = reg[9].

The test case has only initial-regs[7] set to 1 and 0 elsewhere. And so the expected mutation is reg[7] = reg[9] = 0.

TLDR: I think the impl of PVM that made these test cases have the arguments for move_reg flipped.

@sourabhniyogi
Copy link

I will also add some host call tests later, but comprehensive test suite for all host calls is probably out-of-scope, at least for pure PVM tests.

Alright, we don't want to interrupt your deep work but legend has it you implemented PVM in a day =) so if its not too much to ask ... could you give us the simplest "Jam Service" byte code (for a refine+accumulate) for us to implement many of the basic host functions? Given one good example we can probably fill in the rest and provide a few more back.

My idea of the simplest "Jam Service" byte code is to compute the sum of squares for a set of integer work items, like
Work Items in a Work Package: 5, 7, 9
Refine: squares the work items, exports 25, 49, 81
Accumulate: reads the result of refine ( 25, 49, 81 ) and writes to a service's storage
We can attempt to build the byte code by hand like it is 1964 but maybe you already have something "simple" like this that you can share?

If not, do you have a better recommendation for simplest "Jam Service"? Or, a strategy that is better than hand building byte code?

This sort of baby JAM test case will help teams get baby JAM implementations blood flowing, and set up a low V (like V=6) cluster complete with QUIC, erasure coding, Patricia Merkle Trie, BMT proofs, and so on.

@xlc
Copy link
Contributor

xlc commented Jul 30, 2024

I am confused about trap vs halt vs panic in PVM. In GP, the trap instruction will exit with the black square, so does the jump to 2^32-2^16 address. To my understanding, that is exit 0. But in the pvm testvector, trap is panic and the test of the trap instruction will result trap exit status but the inst_ret_halt test results halt. There is some inconsistency.

-- (called "panic" in the Graypaper)
-- the execution ended with a trap (the `trap` instruction was executed, the execution went "out of bounds", an invalid jump was made, or an invalid instruction was executed)
trap,

Another question. jump_ind is using djump which can be used to exit the program. But how about the one using branch? e.g. jump. What happen to jump into the exit address? panic or halt?

@koute
Copy link
Author

koute commented Jul 30, 2024

Sorry to keep hounding you here! I think I found a discrepancy between the testcases and the GP.

@ec2 Yes, indeed, there is. We will fix it soon. Thanks! We highly appreciate anyone who helps crosscheck these.

Alright, we don't want to interrupt your deep work but legend has it you implemented PVM in a day =) so if its not too much to ask ... could you give us the simplest "Jam Service" byte code (for a refine+accumulate) for us to implement many of the basic host functions?

@sourabhniyogi The rumors of my exploits seem to be grossly exaggerated; it was actually two days, not one. :P

Anyway, we will most likely put up some more tests out in the future, but for now if you quickly want something to test with then your best bet would be to build one yourself.

You don't have to build a blob by hand; you could use my work-in-progress PVM assembler. For example:

$ git clone https://github.com/koute/polkavm.git
$ cd polkavm
$ cargo run -p polkatool -- assemble tools/spectool/spec/src/inst_branch_greater_or_equal_signed_ok.txt -o output.polkavm
$ cargo run -p polkatool disassemble --show-raw-bytes output.polkavm

This will output the program in a PolkaVM-specific container (which is not part of the GP), but you can extract the code blob with a simple Rust program - use polkavm_common::program::ProgramParts::from_bytes to load the blob and then the code_and_jump_table field will have the raw program bytes.

I am confused about trap vs halt vs panic in PVM. In GP, the trap instruction will exit with the black square, so does the jump to 2^32-2^16 address.

@xlc

  • "halt" is meant to be a normal termination (dynamic jump to 0xffff0000)
  • "panic" (called a "trap" here) is meant to be an abnormal termination

Hm, you're right that the trap instruction in the GP is specified to halt instead of panicking; this should have been a panic instead. I'll see about correcting this; thanks.

@clw8998
Copy link

clw8998 commented Sep 3, 2024

Hello @koute , I recently encountered some issues while using your PVM.

Here’s my code:

pub @main:
    a1 = 0

When I use the following command to compile:

cargo run -p polkatool -- assemble ./test_txt_code/test.txt -o test.pvm

The bytecode content of test.pvm is as follows:

[80, 86, 77, 0, 1, 5, 7, 1, 0, 4, 109, 97, 105, 110, 6, 6, 0, 0, 2, 4, 8, 253, 0]

My question is, how do I extract the pure program portion as defined in GP_0.36(213), because it seems the first part contains some ASCII-encoded section names.

ASCII encoded section name:

[80, 86, 77, 0, 1, 5, 7, 1, 0, 4, 109, 97, 105, 110, 6, 6]
// [80, 86, 77] "PVM" in ASCII
// [109, 97, 105, 110] "main" in ASCII

GP_0.36(213) should be:

[0, 0, 2, 4, 8, 253, 0]

@charliewinston14
Copy link

charliewinston14 commented Nov 4, 2024

Hi. How do the signed values work?

For example in "inst_div_signed"?

rA=8 (value: 2147483664)
rB=7 (value: 7)
rD=9

How is register 9 expected to be 3988183920?

@koute
Copy link
Author

koute commented Nov 4, 2024

How do the signed values work?

I recommend reading this article on Wikipedia.

How is register 9 expected to be 3988183920?

It's not. It's expected to be -2147483632 / 7 = -306783376.

@davxy davxy mentioned this pull request Nov 19, 2024
13 tasks
@sourabhniyogi
Copy link

Now that 0.5 is 64-bit (only), we really need this to be updated to support 64-bit test vectors (only)

We were able to use polkatool (64-bit) and do our services, which touched 23 opcodes and 5 host functions and would like to cover all the opcodes robustly.

Can we wrap up 2024 with 64-bit test vectors?

@koute
Copy link
Author

koute commented Jan 24, 2025

New PVM test vectors are here; see the changelog for details.

@boymaas
Copy link

boymaas commented Jan 26, 2025

Thank you @koute for the vectors. I have a question: inst_store_imm_u8_trap_read_only.json and inst_store_u8_trap_read_only.json expect a panic. The graypaper describes a page fault.

"Similarly, where ram must be mutated and yet mutable access is not possible, then machine state is unchanged, and the exit reason is a fault with the lowest address to be read which is inaccessible." link

@koute
Copy link
Author

koute commented Jan 27, 2025

I have a question: inst_store_imm_u8_trap_read_only.json and inst_store_u8_trap_read_only.json expect a panic. The graypaper describes a page fault.

The are two types of PVMs - outer PVMs and inner PVMs. The outer PVMs run the toplevel JAM services (on_transfer, accumulate, refine, etc.) while inner PVMs can be spawned by refine using the invoke hostcall. These two types of PVMs work in a slightly different way, and in this case the relevant differences are:

  • inner PVMs (currently) don't support read-only memory, and they generate page faults
  • outer PVMs support read-only memory, and don't really generate page faults (you could technically say that they do, but for outer PVMs a panic and a page fault would be indistinguishable - both are unrecoverable and equivalent to a panic, so there isn't really much point in even having the concept of "page faults" in outer PVMs nor supporting them in implementations for anything other than debugging)

So in this case those tests were meant to test the outer PVM behavior where you have read only memory and the program tries to write there. But okay, you're right this might be confusing wrt to the GP; I'll just delete those tests.

@boymaas
Copy link

boymaas commented Jan 27, 2025

Thanks for the explanation! I wasn’t aware of these distinctions between outer and inner PVMs, but it makes sense now.

@boymaas
Copy link

boymaas commented Jan 27, 2025

I'm running into another discrepancy with my implementation. The test pvm/programs/inst_store_indirect_u16_with_offset_nok.json, expects a page fault address of 0x00021000. Based on my current understanding, the formula indicates the PVM should report the first violating address, which would be 0x00021001. However, the test seems to expect the start of the page where the violation occurred instead.

@dakk
Copy link

dakk commented Jan 28, 2025

What is the memory size expected by those tests?

@koute
Copy link
Author

koute commented Jan 28, 2025

Based on my current understanding, the formula indicates the PVM should report the first violating address, which would be 0x00021001. However, the test seems to expect the start of the page where the violation occurred instead.

You're correct. The intended behavior is that the address of the page is returned; we will update the GP.

What is the memory size expected by those tests?

Each test defines the expected memory layout; see initial-page-map and initial-memory fields.

@boymaas
Copy link

boymaas commented Jan 28, 2025

You're correct. The intended behavior is that the address of the page is returned; we will update the GP.

Thanks, since the graypaper will be updated, I’ll adjust my implementation to return the address of the page.

@bloppan
Copy link

bloppan commented Jan 29, 2025

Hello @koute I have a question about the tests that use the branch function.

For example, the test inst_branch_eq_ok.json executes the instruction 170 -> branch_eq, which calls branch(12, true). The position n=12 of the program corresponds to the instruction 20 -> load_imm_64. Before executing the branch instruction, we have to check if c[n] belongs to the basic block set but the instruction 20 -> load_imm_64 doesn't belongs to the basic block set opcodes. What I missing here?

@koute
Copy link
Author

koute commented Jan 29, 2025

@bloppan Whether a basic block starts at a given position or not is not determined by the first instruction of the basic block but by the previous instruction, which is a trap, which is a basic block terminator (hence it also starts a new basic block).

@bloppan
Copy link

bloppan commented Jan 31, 2025

Hi @koute , in the test riscv_rv64um_divu.json, when pc = 110, the instruction 193 -> div_u_32 is executed and takes as arguments: ω9, ω8, ω11

ω9 = 0x1
ω8 = 0xffffffff80000000
ω11 = ω8 / ω9 = 0xffffffff80000000 / 0x1

The result is ω11 = 0xffffffff80000000

The instruction div_u_32 is 32 bits unsigned and it doesn't have sign extension, so I think the result should be ω11 = 0x80000000. In my implementation, the test passes if I add the sign extension to the division result, but it's not specified in the GP.

@koute
Copy link
Author

koute commented Jan 31, 2025

The instruction div_u_32 is 32 bits unsigned and it doesn't have sign extension, so I think the result should be ω11 = 0x80000000. In my implementation, the test passes if I add the sign extension to the division result, but it's not specified in the GP.

It's an error in the GP, and the test vector is correct. The 32-bit instruction variants always sign extend.

@emielsebastiaan
Copy link

With the current Graypaprer spec (0.6.1) the following test vector is incorrect.
https://graypaper.fluffylabs.dev/#/4bb8fd2/2a48012a7d01

Test vector

rem_s_64 (206)
Current Current incorrect calculation: -9223372036854775791 mod 7 = 2
Current incorrect output ω_9: 18446744073709551611
Correct output ω_9: 2

Z_8(ω_A) mod Z_8(ω_B)
Current incorrect calculation: -9223372036854775791 mod 7 = -5
Correct calculation: -9223372036854775791 mod 7 = 2

Analysis

The incorrect current output is explained because some programming languages (such as: RUST en C) provide a negative output for a modulo operation on a negative number.
Python on the other hand outputs a positive number for a modulo operation on a negative number.
In 'Maths', "the usual representative is the least positive residue, the smallest non-negative integer"

In mathematics, the result of the modulo operation is an equivalence class, and any member of the class may 
be chosen as representative; however, the usual representative is the least positive residue, the smallest 
non-negative integer that belongs to that class (i.e., the remainder of the Euclidean division).[2] 

However, other conventions are possible. Computers and calculators have various ways of storing and 
representing numbers; thus their definition of the modulo operation depends on the programming language 
or the underlying hardware. 

Source: https://en.wikipedia.org/wiki/Modulo

Possible solutions

  1. Test vector is incorrect with current GP-0.6.1 specification and should be corrected.
  2. Graypaper should explicitly state that modulo operations of negative numbers should be allowed to be negative.

In summary 'maths' shoud have priority over any implementation ambiguity, therefore as is solution 1 should be the way to go.

@koute
Copy link
Author

koute commented Feb 3, 2025

The incorrect current output is explained because some programming languages (such as: RUST en C) provide a negative output for a modulo operation on a negative number. [...] In summary 'maths' shoud have priority over any implementation ambiguity, therefore as is solution 1 should be the way to go.

No, it's explained because PVM is based on RISC-V, and that's how the RISC-V's (and also coincidentally how AMD64's) modulo instruction works. It has absolutely nothing to do with how the modulo operator works in any programming language.

Two of the main design principles of PVM are:

  1. we can use upstream RISC-V compilers without any modifications, so while we have some leeway because we postprocess the RISC-V code into PVM, ultimately we must keep the original RISC-V instruction semantics intact,
  2. so that it's easy to recompile PVM into native machine code, so the semantics of instructions should, in general, match how real hardware tends implements those instructions.

So the test vector here is correct and is what we want, and changing the semantics here as you suggest is not a good idea as it will provide no practical benefit while having significant practical downsides.

Anyway, thank you for bringing this ambiguity to our attention.

@emielsebastiaan
Copy link

Sure thanks, this is fine of course. But then GP should be adjusted to explicitly state that the modulo operator on a negative number yields a negative number, and not a positive number as expected by 'Maths'.

@clw8998
Copy link

clw8998 commented Feb 4, 2025

Test Case: inst_div_signed_64

When pc = 0, calling div_s_64 (204), we have:
(All numbers here are in decimal.)

  • $ω7 = 9223372036854775824$
  • $ω8 = 7$
  • $Z_8(ω7) = -9223372036854775792$
  • $Z_8(ω8) = 7$

Performing the division and applying the floor function:

$⌊ -9223372036854775792 / 7 ⌋$ = $⌊ -1317624576693539398.8571428571429 ⌋$ = -1317624576693539398

(However, it should be -1317624576693539399 after applying the floor function.)

  • $Z_8^{-1}(-1317624576693539398) = 17129119497016012218$
  • $Z_8^{-1}(-1317624576693539399) = 17129119497016012217$

Question:

  • The expected result after applying the floor function should be -1317624576693539399, but the test vector shows -1317624576693539398.
  • Since GP does not explicitly define the floor function? (not sure). I checked the floor function definition, but I am unsure whether this applies to GP.
  • A similar issue also occurs in riscv_rv64um_div.

Did I make a mistake somewhere?

@koute
Copy link
Author

koute commented Feb 5, 2025

@clw8998 I can confirm the test vector is correct here and the expected value is -1317624576693539398 (0xedb6db6db6db6dba). In this case the fractional part of the result should always be truncated because these are integer (non-floating point) division instructions.

To illustrate why let's pick some smaller numbers to make this more obvious. Let's try to divide a positive number first:

7 / 3 = 2.333 ~= 2
2 * 3 = 6

Now let's try flipping the sign:

-7 / 3 = -2.333 ~= -3
-3 * 3 = -9

vs

-7 / 3 = -2.333 ~= -2
-2 * 3 = -6

Notice that flipping the sign of one of the inputs doesn't change the numerical value of a / b * b (just its sign) if we use truncation.

You're right that mathematically floor does the (in this case) incorrect thing; we will fix the GP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.