-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistient immediate bit-range approaches between extensions #41
Comments
I think I prefer using the same bit range as the spec to avoid confusion. |
I agree this is an issue, and needs some thought. In general, there's a bit of a problem that bits get thrown away without any warning, but we also don't want to force the templates to perform lots of sanitisation on the bits around running the instruction. For what it's worth, I think I'm in favour of the first approach for the common case, where you specify a full 32-bit immediate and it picks the bits that are architecturally encoded (although the sign-extension behaviour is a little odd). Maybe we could add a template constructor that allows people to opt-in to strict immediate checking.
I wouldn't worry too much about this: we still need to make things stable in a way that ensures we get the expected coverage, so now is the time to be making these kinds of changes. You've already seen problems with the compressed immediate encodings where we were missing cases. We should think about how we can write "tests" for QCVEngine that ensure we cover everything we think we're covering. |
To clarify, are you talking about the approach seen in RV32_I.hs or in RV_C.hs? "...The first approach..." sounds like the former, but "...full 32-bit immediate and it picks the bits that are architecturally encoded..." sounds like the latter to me.
Some mechanism like that sounds good to me, assuming I understand you correctly. Something which allows additional immediate constraints, such as 'at least one of bits ... must be non-zero', that can be tailored to each instruction that requires it. I've not been able to prevent illegal instructions from being generated in the "compressed" test due to I wasn't able to find a good solution myself. I looked into the QuickCheck |
Apologies, I was confused. I think it's the second approach. But to be completely explicit, I'm suggesting that the upper 20 bits of the immediate passed into
Exactly: instructions would be written in all files defining instruction encodings with enough information to tell if the encoding is valid (i.e. effectively no immediate bits were thrown away and things that were expected to be non-zero were non-zero). By default, that gets ignored, but you can opt-in to getting some kind of error as the template is evaluated if you want to.
We've sort of seen this as a semi-feature up till now: it's easy enough to imagine that one implementation forgets to check for the special case and that's a hardware bug, so this seems like something you should test. I guess the problem is if one implementation has an extension that maps that bit-pattern onto another instruction, while the other implementation does not, and so you get spurious failures. It would also be a problem if your implementations handled exceptions differently, but then you're in trouble anyway for verifying interesting cases (especially around CHERI). In any case, I agree that at least as an option, you should be able to only get the instruction you asked for, validly encoded.
At the encode functions, you have the information about what the immediates are. One very hacky option is to map the zero immediate case to some other case for instructions that have non-zero immediates. Having the encode function return a |
Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: CTSRD-CHERI#41
Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: CTSRD-CHERI#41
Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: CTSRD-CHERI#41
Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: CTSRD-CHERI#41
Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: CTSRD-CHERI#41
Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: CTSRD-CHERI#41
Modify `unscatter` so it works for instructions that do not encode the lowest bits of the immediate value provided to them. This issue has been hidden because many instructions have been encoded using the lower immediate bits rather than upper immediate bits (e.g. `imm[19:0]` rather than `imm[31:12]` for `lui`), and others (which do omit lower immediate bits) are not included in disassembly output (e.g. `c_lui` with empty `rv_c_disass`). See also: #41
@elliotb-lowrisc Was this fully fixed by #45 or do we need to do more work to bring all the extensions into line? |
#45 was just a minor fix to display code. I don't think anything has happened yet to address this wider issue of differing approaches to immediate bit-ranges between the extensions. |
I've noticed that there are at least two different approaches being used for encoding immediate values that exclude one or more LSBs in the ISA documentation. I think this multiplicity of methods could cause misunderstandings and mistakes...
The first approach in use is to shift all the bit ranges/indexes such that all LSBs of the provided value are used. I found this approach in use in RV32_I.hs. For example,
LUI
in "The RISC-V Instruction Set Manual - Volume I: Unprivileged ISA - 20191213" specifiesimm[31:12]
, butlui_raw
at RV32_I.hs:169 takesimm[19:0]
. Both take 20 bits, but the former specifies the top 20 and the latter takes the bottom 20. This seems fine if all present and future usages of thelui
instruction adhere to this convention, if more confusing for some people (and perhaps more intuitive for other people). I wonder if this approach is legacy behaviour from a time when, perhaps, the bit range behaviour of theencode
function was not developed to the same extent that it is now.However, there is another approach in use that leaves the bit ranges/indexes as they are given in the spec. I found this approach in use in RV_C.hs. For example,
C.LUI
in the spec document mentioned earlier specifiesnzimm[17:12]
(nzimm[17]
andnzimm[16:12]
), and alsoc_lui_raw
at RV_C.hs:144 takesnzimm[17:12]
(nzimm[17]
andnzimm[16:12]
). This means that the 12 LSBs of the provided immediate are dropped whenc_lui
is used, with only the 6 bits above them being encoded. Again, this seems fine if all present and future usages ofc_lui
adhere to this convention, but it seems needlessly confusing for this behaviour to differ between the compressed and uncompressed versions of the same instruction.I suppose the first question might be, is the wider issue of multiple approaches to immediate bit-ranges of concern to anyone other than me?
If so, which approach is generally preferred, if any?
And ultimately, is standardisation of this worth the effort and chance of introducing a new issue?
The text was updated successfully, but these errors were encountered: