Inconsistient immediate bit-range approaches between extensions #41

elliotb-lowrisc · 2024-04-18T14:13:06Z

I've noticed that there are at least two different approaches being used for encoding immediate values that exclude one or more LSBs in the ISA documentation. I think this multiplicity of methods could cause misunderstandings and mistakes...

The first approach in use is to shift all the bit ranges/indexes such that all LSBs of the provided value are used. I found this approach in use in RV32_I.hs. For example, LUI in "The RISC-V Instruction Set Manual - Volume I: Unprivileged ISA - 20191213" specifies imm[31:12], but lui_raw at RV32_I.hs:169 takes imm[19:0]. Both take 20 bits, but the former specifies the top 20 and the latter takes the bottom 20. This seems fine if all present and future usages of the lui instruction adhere to this convention, if more confusing for some people (and perhaps more intuitive for other people). I wonder if this approach is legacy behaviour from a time when, perhaps, the bit range behaviour of the encode function was not developed to the same extent that it is now.

However, there is another approach in use that leaves the bit ranges/indexes as they are given in the spec. I found this approach in use in RV_C.hs. For example, C.LUI in the spec document mentioned earlier specifies nzimm[17:12] (nzimm[17] and nzimm[16:12]), and also c_lui_raw at RV_C.hs:144 takes nzimm[17:12] (nzimm[17] and nzimm[16:12]). This means that the 12 LSBs of the provided immediate are dropped when c_lui is used, with only the 6 bits above them being encoded. Again, this seems fine if all present and future usages of c_lui adhere to this convention, but it seems needlessly confusing for this behaviour to differ between the compressed and uncompressed versions of the same instruction.

I suppose the first question might be, is the wider issue of multiple approaches to immediate bit-ranges of concern to anyone other than me?

If so, which approach is generally preferred, if any?

And ultimately, is standardisation of this worth the effort and chance of introducing a new issue?

The text was updated successfully, but these errors were encountered:

marnovandermaas · 2024-04-22T09:35:50Z

I think I prefer using the same bit range as the spec to avoid confusion.

PeterRugg · 2024-04-22T11:43:28Z

I agree this is an issue, and needs some thought. In general, there's a bit of a problem that bits get thrown away without any warning, but we also don't want to force the templates to perform lots of sanitisation on the bits around running the instruction. For what it's worth, I think I'm in favour of the first approach for the common case, where you specify a full 32-bit immediate and it picks the bits that are architecturally encoded (although the sign-extension behaviour is a little odd). Maybe we could add a template constructor that allows people to opt-in to strict immediate checking.

And ultimately, is standardisation of this worth the effort and chance of introducing a new issue?

I wouldn't worry too much about this: we still need to make things stable in a way that ensures we get the expected coverage, so now is the time to be making these kinds of changes. You've already seen problems with the compressed immediate encodings where we were missing cases. We should think about how we can write "tests" for QCVEngine that ensure we cover everything we think we're covering.

elliotb-lowrisc · 2024-04-22T12:54:14Z

For what it's worth, I think I'm in favour of the first approach for the common case, where you specify a full 32-bit immediate and it picks the bits that are architecturally encoded (although the sign-extension behaviour is a little odd).

To clarify, are you talking about the approach seen in RV32_I.hs or in RV_C.hs? "...The first approach..." sounds like the former, but "...full 32-bit immediate and it picks the bits that are architecturally encoded..." sounds like the latter to me.

Maybe we could add a template constructor that allows people to opt-in to strict immediate checking.

Some mechanism like that sounds good to me, assuming I understand you correctly. Something which allows additional immediate constraints, such as 'at least one of bits ... must be non-zero', that can be tailored to each instruction that requires it.

I've not been able to prevent illegal instructions from being generated in the "compressed" test due to C.LUI, CANDI16SP, and C.ADDI4SPN being given zero values for non-zero immediates. I suspect the same is happening silently for instructions with non-zero immediates that happen to have a related instruction with the same encoding except for zeroes instead of the non-zero immediate, such as C.SRLI/C.SRLI64, C.SRAI/C.SRAI64, and C.SLLI/C.SLLI64. Neither seem like a good thing to be happening.

I wasn't able to find a good solution myself. I looked into the QuickCheck NonZero thingy but that would only guarantee that the whole immediate was non-zero, and you could easily end up with a zero value but selecting all but the selected bit. Something like the QuickCheck ==> operator seems more flexible, but I couldn't figure out how to fit in into the existing code without significant refactoring.

PeterRugg · 2024-04-22T16:54:01Z

To clarify, are you talking about the approach seen in RV32_I.hs or in RV_C.hs? "...The first approach..." sounds like the former, but "...full 32-bit immediate and it picks the bits that are architecturally encoded..." sounds like the latter to me.

Apologies, I was confused. I think it's the second approach. But to be completely explicit, I'm suggesting that the upper 20 bits of the immediate passed into lui should be used, so lui 0x4000_0000 (as called from haskell) gives something sensible while lui 0x0000_0001 gives a lui of 0.

Something which allows additional immediate constraints, such as 'at least one of bits ... must be non-zero', that can be tailored to each instruction that requires it.

Exactly: instructions would be written in all files defining instruction encodings with enough information to tell if the encoding is valid (i.e. effectively no immediate bits were thrown away and things that were expected to be non-zero were non-zero). By default, that gets ignored, but you can opt-in to getting some kind of error as the template is evaluated if you want to.

I've not been able to prevent illegal instructions from being generated in the "compressed" test due to C.LUI, CANDI16SP, and C.ADDI4SPN being given zero values for non-zero immediates. I suspect the same is happening silently for instructions with non-zero immediates that happen to have a related instruction with the same encoding except for zeroes instead of the non-zero immediate, such as C.SRLI/C.SRLI64, C.SRAI/C.SRAI64, and C.SLLI/C.SLLI64. Neither seem like a good thing to be happening.

We've sort of seen this as a semi-feature up till now: it's easy enough to imagine that one implementation forgets to check for the special case and that's a hardware bug, so this seems like something you should test. I guess the problem is if one implementation has an extension that maps that bit-pattern onto another instruction, while the other implementation does not, and so you get spurious failures. It would also be a problem if your implementations handled exceptions differently, but then you're in trouble anyway for verifying interesting cases (especially around CHERI). In any case, I agree that at least as an option, you should be able to only get the instruction you asked for, validly encoded.

I wasn't able to find a good solution myself.

At the encode functions, you have the information about what the immediates are. One very hacky option is to map the zero immediate case to some other case for instructions that have non-zero immediates. Having the encode function return a Maybe is also an option, but would have wider-ranging consequences.

Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: CTSRD-CHERI#41

Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: #41

PeterRugg · 2024-08-06T13:19:01Z

@elliotb-lowrisc Was this fully fixed by #45 or do we need to do more work to bring all the extensions into line?

elliotb-lowrisc · 2024-08-06T13:24:21Z

#45 was just a minor fix to display code. I don't think anything has happened yet to address this wider issue of differing approaches to immediate bit-ranges between the extensions.

elliotb-lowrisc mentioned this issue May 3, 2024

Fix verbose print of instrs that omit low imm bits #45

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistient immediate bit-range approaches between extensions #41

Inconsistient immediate bit-range approaches between extensions #41

elliotb-lowrisc commented Apr 18, 2024

marnovandermaas commented Apr 22, 2024

PeterRugg commented Apr 22, 2024

elliotb-lowrisc commented Apr 22, 2024

PeterRugg commented Apr 22, 2024

PeterRugg commented Aug 6, 2024

elliotb-lowrisc commented Aug 6, 2024

Inconsistient immediate bit-range approaches between extensions #41

Inconsistient immediate bit-range approaches between extensions #41

Comments

elliotb-lowrisc commented Apr 18, 2024

marnovandermaas commented Apr 22, 2024

PeterRugg commented Apr 22, 2024

elliotb-lowrisc commented Apr 22, 2024

PeterRugg commented Apr 22, 2024

PeterRugg commented Aug 6, 2024

elliotb-lowrisc commented Aug 6, 2024