Feature detection #356
Comments
Nice write-up, thanks Thomas! Nit: SIMD opcodes are prefixed with 0xfd, not 0x7d.

Also, is your intention to have feature detection be part of the existing SIMD proposal? I.e. the current proposal will only ship with feature detection implemented. Or can the SIMD proposal progress as it is, with the caveat that we prioritize feature detection for SIMD before we look at other stuff (like 128-bit SIMD v2, or fast/long SIMD)?
This will require some sort of

I think this scheme is pretty neat. If we can require engines to know how to parse these instructions, not necessarily implement them, then we can maintain forwards-compatibility without having to introduce this scheme. The advantage is that you save a single byte for each new opcode. And it is not hard for engines to do, because the "patterns" of instructions won't change much. However, it does require them to know how to "parse" each instruction.
Can you elaborate a bit more? You're saying that new SIMD instructions will be introduced with the same prefix, e.g. 0x7b 0xef, then we make a copy, 0x7a 0xef, which replaces opcodes that were previously defined? Why is there a 0x7a 0xef that was previously defined, and where did it come from?
🎉
Oops, thanks for catching that. Fixed inline above.
I was thinking that we should consider shipping it as part of MVP, but thinking about it more, I'm not sure doing that would improve the situation much. Whether or not we ship feature detection as a part of MVP, there will be a lengthy period of time where users have to do separate SIMD and MVP builds before all engines support feature detection. Whether we do feature detection as part of MVP or as the first follow-up, all other follow-up SIMD proposals will be able to be backwards compatible with any engine that has implemented feature detection. The only real difference would be in the toolchain, where shipping them separately would mean having to pass an extra feature flag (e.g.
Yes, I meant the definition of vector given in the standard, which prefixes the list of elements with their count.
Right, most of the complexity of this scheme comes from the requirement that the engine be able to figure out how to parse instructions that it doesn't know anything about, including their immediates. The only way we thought of to accomplish that was with the "immediate descriptor immediate" that makes the structure of each instruction self-describing. But we don't want to pay the cost of that extra one-byte immediate on every SIMD instruction when we're not trying to do the function multiversioning thing, which is why we kept the normal instructions in the
Under this scheme, every single opcode in the
I would say that the mechanism to detect features should be here from the very start, even if it is not used for any MVP features.
I think it's a good idea to look for a SIMD-specific solution here, since the conditional compilation proposal's main use case is SIMD, while the proposal itself is controversial. It seems to me that instead of the prefixed instruction space (we'll run out of prefixes soon...) you only need a single instruction, 0xfd k nn, where k is some representation of the maximum SIMD version required by the subsequent instructions, and nn is the number of bytes occupied by those instructions, with the semantics of unreachable if the version is not supported. Why is it desirable to prefix each instruction? Is that so that we guarantee that only SIMD instructions are covered by this mechanism? Or is there another reason? (Would still need simd.has_feature of course.)
Oh nice, we didn't quite think of that solution, although we considered a similar idea of having some sort of general

The only downside of your proposed solution is that the size in bytes of an instruction sequence is not robust to changes in uleb128 encoding. This would be a real complication in practice because tools like wasm-ld intentionally use overly large uleb128s where relocations are patched and downstream tools like wasm-opt. The alternative to specifying the number of bytes to skip is to make unknown instructions parsable by making them self-describing via a mechanism like the proposed "immediate descriptor immediate." The solution in the opening post could be changed to use just a single
That would save the additional prefix, but would increase code size by another byte per instruction. Overall, I don't think spending another prefix on this would be so bad. We always have the 0xff prefix reserved as an escape hatch to prevent us from running out of prefixes.
I'm wondering if it is more work for an engine too -- there would have to be a check to make sure the instructions in that block match the SIMD version requested. E.g. imagine your engine supports SIMD v1, v2, v3, and the block condition is only for v2; you should trap on v3 instructions. This means a streaming compiler would either have to scan the instructions twice (once to see if they're valid, next to generate code) or scan forward assuming it's OK and backpatch a trap if it fails, right? Maybe that's not too bad.

Also, would we allow these to be nested? I don't see much of a use, so maybe best to disallow it. Though that would create another mode when parsing the instructions.
Acknowledged.
In that case you could have multiple simd.compat instructions, one for each common size, and then you'd get that byte back. We'd want these compat opcodes to be in the 1-byte range of the current 0xfd encoding, clearly. We could then additionally have a general escape that is a little more expensive, if we don't want to cover all the sizes with single-byte prefixes.
I disagree with that, because the single-byte opcodes are imo quite valuable and should be used sparingly, but it's probably mostly a matter of taste.
I like this solution a lot. Will update the opening post to use this idea instead.
Is this pseudocode accurate in terms of using the suggested instructions?
What happens when there is a bug like so:
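(The snippet from this comment was not preserved here; below is a hypothetical reconstruction of the kind of bug being described, using invented text-format syntax for the proposed simd.has_features instruction and made-up names for the feature bitmasks and the FMA instruction.)

;; Bug: the feature bit that is checked does not match the feature actually used.
(func $madd (param $a v128) (param $b v128) (param $c v128) (result v128)
  (if (result v128) (simd.has_features ANOTHER_BITMASK)              ;; should check FMA_BITMASK
    (then (f64x2.fma (local.get $a) (local.get $b) (local.get $c)))  ;; hypothetical FMA instruction
    (else (f64x2.add (f64x2.mul (local.get $a) (local.get $b)) (local.get $c)))))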
The bitmask and the instruction used don't match. Should it be a validation error? I don't think there is precedent for such "conditional validation", and things will get more complicated with nesting of these

Also, this is quite interesting because the
It will be helpful to list out some examples. From what I can tell, the "opcode" of all future SIMD opcodes will continue to be a leb-encoded u32, so suppose simd.compat is

I'm starting to see this feature detection proposal as a sort of clever partition of opcode space to maintain compatibility. Another suggestion:
This has a small advantage of not changing a lot of the existing opcodes.
Yes!
No, this would still validate. If the
Yes, and I hope engines take advantage of that.
Yes, that's the intention. I will add some examples.
I don't quite understand what you're proposing here. Do you mean that for small-opcode instructions we could access them solely through the
Got it, thanks! It can be surprising: if the engine supports both ANOTHER_BITMASK and FMA, and the code is still buggy in that the wrong bitmask is checked, it will incorrectly do the right thing.
As I understand it, this feature detection issue will only apply the What I'm proposing is, let's adopt the
Contrasting with what you're suggesting (IIUC), which is to have
Yes, although it will be confusing since a vec of 8 immediate bytes != memarg; you will have to do some decoding.
I was thinking that all SIMD instructions, both current and future, would have "normal" encodings, but also be accessible via compat wrappers. That would allow engines that don't want to support any SIMD at all to gracefully fall back to MVP code, as long as they support the compat wrappers and the v128 type.

IIUC what you're saying, we could avoid a distinction between the "compat wrapper" encoding and the "normal" encoding for most instructions by having the

That would save even more code size, indeed. I'm not really sure how valuable it is to have a clear distinction between the "normal" encodings and the "compat wrapper" encodings. I'd be interested to hear what other folks think.
It would save code size for SIMD instructions without immediates, but would increase code size for SIMD instructions with immediates. That's probably still a win, though. And it is nice that there is no special
There's a spectrum here:
That's a fine summary. An additional benefit of (1) is that you must opt in to quiet convert-to-unreachable: only instructions that you've asked to be turned into unreachable can be; for everything else, there's a validation error if the instruction is not supported. Solutions (2)-(4) will turn all single-byte junk into unreachable. (This came to me because I started wondering whether any of these solutions would extend naturally to the entire instruction set. Were we to do that, we might prefer to have more stringent checks than compat-by-default gives us. And on that note, were we to push this solution for the full instruction set, we could realize further size reductions by dispensing with the 0xfd prefix for compat instructions: we could allocate a small number of single-byte opcodes for compat0, compat1, ... encodings.)
Thanks Thomas for putting this together. Having a forward-compatibility mechanism would be very valuable for in-the-wild deployment. I have several questions about this proposal:
As for the developer interface, I have a strong preference to not rely on gcc-style function multiversioning and rather specify the minimal instruction set via a linker option:
General observation: this seems scary. The complexity around variable byte encoding and multiple encoding variants seems significant. Is it possible to adopt a solution that is more closely aligned with the LLVM model, where for each function you need to know the target features that are available? The way this could work is:
This would allow pretty much direct compilation in the gcc/clang model, both from separate translation units (the linker just merges functions with different instructions) and from a single translation unit (when the target feature attributes of gcc/clang are used). This would not require any extra per-instruction compat encoding; the only change is that the function body must be enclosed in a block that carries the length (so it can be skipped) and the features; the block can be restricted to only appear at the function level. This seems trivial to implement from the perspective of existing implementations - again, no extra encoding, just need to parse the new function block and generate

This doesn't have to be SIMD specific. Unsure if this is good (yay, generality) or bad (we'll never ship this).
... maybe this is what the feature detection proposal should be, which might unblock it at the cost of setting it far back ;)
I see the limited scope of this proposal as a feature. If we tried to generalize it beyond SIMD, then there would be significant pressure to come up with a fully general mechanism that could handle arbitrary conditions and arbitrary conditional module content. Then we'd be stuck in the same place as the conditional sections proposal. If we are ok with shipping feature detection as a follow-on proposal, this is something we could look into more, but we'd have to go into it with a well-defined scope.
Yes. No changes would be required.
Making parsing depend on the contents of a new section sounds like it would be more contentious in the CG, but is also a good idea that we could pursue if we are ok shipping feature detection as a follow-on proposal.
How is this different from what you can already do today? Can you share a small example of how this would look?
How is this compatible with your previous point? The library author can't choose which implementation to include in their library unless they know the features up front and they can't include multiple versions in their library without something like function multiversioning.
It's certainly less straightforward than the rest of the SIMD proposal. That being said, the complexity is well-contained. There aren't any complex interactions between this proposal and the rest of WebAssembly.
@binji and I actually considered an idea very similar to this that also used conditional blocks that stored the number of bytes to skip if some features weren't supported. The reason we decided to go with the present idea instead is that the byte length of instruction sequences is really hard to keep track of as tools change LEB encodings, which happens in practice because lld emits overly-large LEBs wherever it resolves a relocation.
Is that set in stone though? My understanding of the existing feature detection proposal is that it couples the validity rules with runtime selection rules. A proposal like this one decouples them, providing means to specify the encoding of an unfamiliar extension in a way that a non-supporting implementation can work with, and means for the code to perform the relevant conditional logic for selecting the optimal implementation. What I'm suggesting is that this proposal can be made simpler and more general without coupling these back again. (I understand the reluctance to go back down this path, and I'm sorry to bring this up since I wasn't involved in the fd proposal so I'm not aware of the pain points, but even if we make something SIMD-specific, if it could be done in a way that can then be adopted for other instruction set extensions that'd be great, and I think there's a way to do that as sketched out above.)
As long as the linker works on a function-by-function basis, what I'm proposing still seems simple enough to implement. My understanding otherwise is that in this proposal the linker still needs to be aware of the full encoding, including compact forms, to be able to do relocations. In general I feel like the inability to decode individual functions separately is an issue. Case in point (perhaps - maybe I'm conflating) is that today you can't actually strip - as far as I can tell - a module containing SIMD instructions with any official tools; I tried doing it with wasm-strip or the -s argument of wasm-ld, and both work on MVP binaries but don't work on SIMD binaries.
Wouldn't the Wasm engine need to know how many outputs an instruction produces to properly track the stack?
Beyond the implementation complexity on the decoding side, the proposal as described introduces complexity on the user side in the way of having to reason about two types of encoding, backwards- and forwards-compatible, potential issues around this on the tooling side, code size tradeoffs (now a combined scalar+SIMD .wasm needs to inflate all SIMD instructions, so if it's 90% SIMD this would actually regress combined code size compared to two separate .wasm blobs), etc.

If we don't do that and instead use per-function attributes, the size impact could be minimal (a few extra bytes per SIMD function, and these tend to be decently sized) and we wouldn't even need control over this long-term...

It would still be the case that we need all tooling to be aware of the full set of features to be able to, for example, perform relocation (because you need to be able to parse the function body). But as far as I can tell this proposal requires this anyhow, because it introduces a way to specify a compatible encoding for future extensions but presumably future extensions would come in two forms. Or is the idea here to reserve future code space for new SIMD instructions and only provide the encoding via the compat form for any extensions of the SIMD spec (and only support the non-compat encoding for existing instructions)?
@zeux, it's a good point that we might be able to come up with a more general feature detection solution that we could still ship relatively quickly if we pursue a narrower scope than the conditional sections proposal. However, a more limited feature detection mechanism may not actually be useful beyond SIMD. There are not many Wasm features that can be used with changes only to the instructions in function bodies - most require using new sections (threads), new types in function signatures (reference types), new things in the type section (gc), new compilation schemes (exception handling), or some other change that would not be covered by something like a conditional block, forward-compat prefix, or function attribute. Given the limited value of a more general solution, it seems better to double down on a simpler SIMD-specific solution.
The linker actually doesn't know anything about instruction encodings. It just has a list of locations in the object files that need to be patched as relocations are resolved. That means that it depends on all LEB relocations having 5 bytes in both the input object files and the output binary. The problem is that Binaryen does not keep track of LEB sizes when it parses, so it loses track of instruction sequence sizes. I suppose it could just measure the size again when it emits the binary, though, so maybe that's not as bad as I thought. Definitely worth considering further. Actually, the more I think about it, the more I prefer that solution.
No, the idea is that unknown instructions have the same semantics as unreachable.
A design that provides forward compatibility is certainly powerful, but it comes with all the risks and complexities that are mentioned above. If we weaken the requirements from "forward compatibility" to "trivial to implement not-supported SIMD features on all hosts", then I think we get a lot of really nice properties. To be a bit more specific:
What's nice is that this approach would leverage the existing dead-code validation logic that engines already have to implement (and even provide some retroactive justification for why wasm even requires special dead-code validation in the first place ;-). Thus, I think this might be the lowest-effort solution (especially if we assume @conrad-watt's relaxed dead code validation). Also, this design avoids some of the "testing matrix" concerns that you get with (iiuc) all the forward-compatibility solutions: validating a wasm module in all-features-enabled mode (e.g., using

While this doesn't give us forward-compatibility, I think it's worth asking what precisely is the problem we're trying to solve. If the problem is "some hosts can't or won't ever implement certain (or all) SIMD features and we don't want to be producing multiple modules, just to use SIMD, forever", then I think the above solution is sufficient.
In my view, forward compatibility is an important problem to solve, but it's a good point that we haven't considered yet that there is still value to be had here without the complexities of forward compatibility. The two approaches are also complementary; a forward-compatibility mechanism can easily be layered on top of non-forward compatible feature detection. If we can't get consensus on a forward-compatible feature detection mechanism for this initial SIMD proposal, I would be happy if we could at least include the non-compatible version you describe.
@lukewagner, our solution for optional features should be consistent, i.e., whatever we do for SIMD should equally apply to threads, GC, etc. I don't find it convincing that the situation for SIMD is so fundamentally different that it justifies multiple mechanisms. For example, I expect there to be environments that fundamentally cannot implement SIMD nor does their application domain have any use for it, e.g., embedded space. An engine specialised for such an environment should perhaps not be forced to invest in tracking this huge and fast-growing instruction subset just to be allowed to call itself conforming. Or vice versa, if we think it's okay that they have to, then we can make the same argument for any other optional proposal, which all have a much smaller surface, so would be much less work. So we could go both ways, but should pick one.
FWIW, the optionality scheme @lukewagner suggested generalizes beyond SIMD (just replace "SIMD" in his explanation with any other proposal). That being said, I think SIMD is different enough from other types of proposals that it makes more sense for that scheme to be applied to SIMD proposals only.
I absolutely agree that it would be unfortunate to have multiple mechanisms, but part of the thesis here is that we won't need any mechanisms besides the SIMD mechanism for the reasons given above. For engines that do not want to support e.g. GC, providing a way to gracefully fall back to a non-GC implementation is simply not useful. I confidently predict that no toolchain will want to emit both a GC and non-GC implementation of the source code into the same module, so GC modules will simply not be portable to such engines without a second compilation step to non-GC WebAssembly. This is not a problem per se, but it does mean that our feature detection mechanism can completely ignore GC and any other proposal that meaningfully changes compilation schemes.
I agree, which is why I would like to solve the forward compatibility problem as well as the optionality problem for SIMD. For non-SIMD proposals that such an engine does not want to implement, I believe that the engine should simply not implement them, as outlined above.
Very well said @tlively. In that argument, "SIMD" could perhaps be generalized to the class of "locally-acting instructions meant to tap into hardware that may not be present on all hosts and, when absent, is so difficult or inefficient to implement that it's better for the instruction to simply be absent". Should any future instructions emerge outside of SIMD that fit that description, I think the same scheme used for SIMD could be extended uniformly.
Even imagining the most aggressive SIMD feature release schedule, if the only thing a non-supporting engine has to do is to add instruction decoding + immediate validation, the actual engineering work required to "keep up" seems like it would be minimal. Add to that the high degree of wasm engine reuse these days and I think the effort here is negligible and thus not a design constraint.
I feel like optionality for SIMD is a bit of a strawman; both v8 and Firefox support SIMD now; the only other major engine is JSC/Safari, and it's not clear to me if the timelines there will be helped by making it possible to do a no-op implementation. And regardless of that, realistically all consumers of SIMD will need to provide a non-SIMD fallback for the foreseeable future, which doesn't seem possible to overcome. So I feel like we should focus strictly on forward compatibility, with two simple angles:
For pt2 the likely candidates are going to be v256, v512, and mask16; v512 and mask16 are realistically Intel AVX-512-only, which means the real timeline for that to be widely usable across multiple architectures is very long (AMD Zen 3 came out this month and it doesn't have support for AVX-512), so I'm tempted to ask if it makes sense to standardize v256 with zero instructions targeting it. Then the problem simply boils down to how we specify the instruction encoding for future unknown instructions, with possibilities along the lines of a general-purpose compat instruction (proposed in the original post), blocks with explicit length (proposed by me in a comment above), and possibly one more variant of a generalized instruction prefix (similar to how some hardware vendors like Intel extended their ISA, we could have a

Then it's just a matter of drawing up pros & cons for these and picking one. The only real alternative to me seems to be not doing anything here, which isn't great.
SIMD optionality may not be that useful on the Web, but I expect it to be useful for portability in the broader WebAssembly ecosystem. I agree that on the Web, forward compatibility is much more useful. Forward compatibility also subsumes optionality, so strategically we should angle for a forward compatibility solution first, but if the CG rejects that, we should fall back to at least trying to get basic feature detection and optionality into this proposal. That way we will have solved the simpler problem and laid the building blocks necessary to pursue forward compatibility as a follow-on proposal. I will write a fresh proposal using the forward-compatibility block idea later today.
It seems most people on this thread would like to see feature detection, and I think we should explore it, but I want to try and raise some points (hopefully) in favor of not doing anything, just to get something balanced:
If we go with a block idea and require the block around SIMD-carrying functions, then it doesn't cost us much in terms of specification complexity to allocate an extra feature bit for "is SIMD functional, period", although this would require changing the implementations to do feature detection via loading and executing a module with SIMD instructions in it (and changing the code that emits instructions accordingly). But it would be great to avoid a possibility where you can produce a given function both in a way that's compatible with implementations that support SIMD optionally and with ones that don't, just to make sure that we have crisp rules about what it means to have SIMD supported. Although maybe that's in general an orthogonal question, which is "do we want to make it possible for engines to support SIMD without actually implementing SIMD, to make it easier long-term for the users of SIMD?"
Maybe we can discuss and reach consensus on a few high level points either this Friday or next time:
This thread is super fascinating, and the questions regarding types, instructions, and versioning only add to it. To add to @zeux's questions:
Yes, it's a matter of granularity. In the most fine-grained scheme, each instruction may be supported independently of any other instruction, so there is no need for versions or feature sets. On the other hand, usability suffers under such a scheme because it causes a combinatorial explosion in the number of possible configurations. Each module would have to individually check whether each instruction it wants to use is available. In practice, users of WebAssembly already have to reason about versions and feature sets due to different proposals shipping at different times, so it's not too much of a stretch to have these concepts codified in the spec.
Types generally appear in a WebAssembly module in more places than instructions do, for example in function signatures, so if we wanted to handle forward compatibility for types in those locations we would need a separate mechanism. However, I would propose that we restrict our forward compatibility mechanism to instructions and types inside the code section to keep it simpler.
Yeah, that was my fear. Even with compile-time collection, binary compression, and a set intersection, this could become cumbersome and tedious.
Excellent point for a number of reasons. If we were to look at a pie chart for WebAssembly and WebAssembly SIMD users, what would the pie chart look like with respect to browsers capable of automatic updates, browsers without such support, and "server-side" functionality like Node.js?
What does it mean to be typed only inside the code section? What restrictions does that entail?
It's hard to say what the exact breakdown of users is between different platforms, but we can make some useful generalities. All modern browsers are generally routinely updated (although on varying schedules), but there can be big gaps between when new features are shipped on some browser and when they are shipped on all browsers. Browsers that are not regularly updated are not likely to support WebAssembly SIMD at all (and possibly not MVP WebAssembly, either). My impression is that Node users are generally happy to use the latest LTS, which gets all of Chrome's features, albeit with a lag time of up to a few years for LTS releases. Other server-side environments are new enough that they are being actively developed and frequently updated, but due to resource constraints will implement new features at different times.
The most important restriction is that SIMD types wouldn't be able to be used in function signatures in a forward-compatible manner. I don't think this will be a problem in practice because any function with a non-SIMD fallback wouldn't be able to use SIMD types in its non-SIMD signature anyway.
Let's say we have an entry point function with memory pointer and length parameters that calls multiple SIMD functions if they exist. Those SIMD functions take v128s as parameters or provide v128s as return values. And for grins, let's say one SIMD function calls another. Does this work? Is this allowable and/or functional in your proposal?
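(A hypothetical sketch of the structure being asked about, using invented text-format syntax for the proposed simd.has_features instruction and made-up function names; the exported entry point uses only MVP types, while the internal SIMD helpers pass v128 values between themselves.)

;; Exported entry point uses only MVP types in its signature.
(func $process (export "process") (param $ptr i32) (param $len i32) (result f32)
  (if (result f32) (simd.has_features 0x1)            ;; baseline SIMD feature set supported?
    (then (call $process_simd (local.get $ptr) (local.get $len)))
    (else (call $process_scalar (local.get $ptr) (local.get $len)))))

;; Internal helpers are free to use v128 in their signatures; one calls another.
(func $process_simd (param $ptr i32) (param $len i32) (result f32)
  (call $reduce (v128.load (local.get $ptr))))

(func $reduce (param $v v128) (result f32)
  (f32x4.extract_lane 0 (local.get $v)))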
I've updated the opening post to describe the new block-based design, incorporating feedback from @zeux and others above. I'll also present on it briefly at our meeting tomorrow so that everyone can participate in discussion about it without necessarily having read the updated post.
This works if the baseline feature set for the compilation includes |
Some additional thoughts about this:
I agree that we should feel free to use this feature for non-SIMD proposals that introduce new instructions (but not new sections or other constructs). For that reason this could easily be split off into a separate proposal, but I also don't think it's unreasonable to include it in the SIMD proposal.
This is a good design question. WebGPU, for example, only allows extension features to be used when explicitly requested in order to avoid this problem. However, I don't think this is as important for WebAssembly because we expect all WebAssembly to be tool-generated. We can expect tools to be less buggy than humans here. |
Given the freeze on new instruction proposals we agreed on in #389, feature detection would no longer help us ship SIMD any sooner. It has been moved out to its own proposal and discussion should continue on https://github.com/WebAssembly/feature-detection, so I will close this issue. Thank you all for your input and feedback!
There is currently no runtime feature detection mechanism, which means that libraries/applications will require separate builds for every supported feature set. We previously meant to solve this problem with a general mechanism in the conditional sections proposal, but that proposal is blocked on the module linking proposal, so it is not going to be shipped any time soon. This is creating a lot of pressure to fit every potentially-useful instruction into this MVP proposal, which in turn is preventing us from making progress towards shipping the MVP. This issue proposes a new SIMD-specific feature detection mechanism that would relieve some of that pressure and make it much easier for our users to take advantage of follow-on SIMD proposals. Implementing feature detection would be quite a bit of work, so it wouldn't necessarily let us ship any sooner, but it would add significant user value to the SIMD proposal.
Feature detection: simd.has_features

The first new item is a new instruction simd.has_features (name subject to bikeshedding) that takes an immediate bitmask identifying a SIMD feature set and returns a 1 if the current engine supports that feature set and a 0 otherwise. The immediate bitmask will be a uleb128 to allow it to scale to an arbitrary number of features. This MVP proposal would correspond to bit 0 in the bitmask and future follow-on proposals will correspond with successively higher bits.

Forward compatibility: feature_block
While simd.has_features allows supported features to be detected at runtime, we still need a way for new instructions to pass validation on older engines on which they aren't supported. To do this, we introduce a new block-like construct, feature_block (name subject to bikeshedding). In its binary format syntax, feature_bitvec is a uleb128 encoding the same kind of feature bitmask used in simd.has_features and byte_len is the byte length of instr*. During decoding, if the engine supports all the features in feature_bitvec, the feature_block is decoded as a normal block, i.e. block blocktype instr* end. Otherwise, the feature_block is decoded as an unreachable, using byte_len to skip its contents entirely without decoding them.

Because it is a decoding-time feature, there is no need for feature_block to appear in WebAssembly's syntax, semantics, or validation algorithm. Similarly, simd.has_features can be specified as decoding to i32.const 0 or i32.const 1 and does not need to appear in the spec outside of the binary format.
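(The binary grammar from the original post did not survive the copy here. As a rough sketch only, one encoding consistent with the prose above could look like the following; the opcode placeholder and the exact field order are assumptions, not part of the original proposal.)

feature_block ::= <feature_block opcode> feature_bitvec:uleb128 byte_len:uleb128 blocktype instr* end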
Usage

When an object is compiled from source, any instruction available in the baseline feature set enabled for that compilation would be used normally. Any additional features could be conditionally enabled via function multiversioning, which would compile down to a dispatch function using simd.has_features to determine which lowered version of the multiversioned function to call. The various lowered versions of the function would have their bodies wrapped in a feature_block specifying the features they use.
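(A hypothetical sketch of what this lowering could look like, with invented text-format syntax for the proposed simd.has_features and feature_block constructs; the feature-bit values and function names are made up, and byte_len is assumed to be filled in by the encoder.)

;; Dispatch function emitted for a multiversioned function $f.
;; Bit 0 = this SIMD MVP, bit 1 = a hypothetical follow-on feature set.
(func $f (param $ptr i32) (param $len i32) (result i32)
  (if (result i32) (simd.has_features 0x2)            ;; follow-on feature set supported?
    (then (call $f_v2 (local.get $ptr) (local.get $len)))
    (else (call $f_baseline (local.get $ptr) (local.get $len)))))

;; Lowered version that uses follow-on instructions. Its body is wrapped in a
;; feature_block, so an older engine decodes the block as unreachable instead of
;; rejecting the whole module.
(func $f_v2 (param $ptr i32) (param $len i32) (result i32)
  (feature_block (result i32) 0x2
    ;; ... follow-on SIMD instructions ...
    (i32.const 0)))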
Instruction alias-based design [Obsolete]

The second item is the scheme for allowing unsupported future instructions to pass validation. This scheme has to be sufficiently general to allow any future SIMD instruction to pass parsing and validation no matter what opcode, type, or immediates they may have. At the same time, we don't want producers to have to pay in code size or complexity for this engine forward-compatibility scheme if they don't need to emit backward-compatible binaries.
The original design for this is posted below in the "Previous design" section. The new design here reflects @lars-t-hansen's suggestions as of this comment.
We propose introducing new simd.compat instructions in the single-byte SIMD opcode range. Each simd.compat instruction will take as its first immediate the opcode of another SIMD instruction and will adopt the semantics of that instruction if the engine knows about it. If the engine does not know of an instruction corresponding to the provided opcode, the simd.compat will take on the semantics of unreachable instead. This allows future SIMD instructions to pass validation on old engines as long as they are "wrapped" in a simd.compat instruction.

To be able to parse arbitrary unknown SIMD instructions, the engine has to know what kind of immediates the unknown instructions expect. To this end there will be a separate simd.compat instruction for each possible shape of immediates that SIMD instructions can take, as well as an additional variant that takes an uninterpreted vector of bytes for use with future SIMD instructions with uncommon immediate patterns or immediate patterns we haven't anticipated.

simd.compat <target opcode>
simd.compat.immbyte <target opcode> <ImmByte>
simd.compat.memarg <target opcode> <memarg>
simd.compat.bytes <target opcode> <vec of ImmByte>
Text format

The wrapped version of each SIMD instruction should re-use the name of the wrapped instruction in some manner for readability. I propose that the wrapped version of each instruction have the same name as the normal version of the instruction, prefixed with "compat." (subject to bikeshedding). This may be controversial because it introduces many different text format names for each of the simd.compat instructions.
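(For illustration only: f64x2.fma is an imagined future instruction, and the syntax below just shows the proposed compat.-prefixed naming.)

;; Behaves like f64x2.fma on an engine that knows the wrapped opcode,
;; and like unreachable on an engine that does not.
(compat.f64x2.fma (local.get $a) (local.get $b) (local.get $c))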
Original prefix-based design [Obsolete]

SIMD opcodes are all currently prefixed with the "SIMD" prefix byte 0xfd. We propose introducing an alias for every defined SIMD instruction with the same opcode but prefixed with a new "Forward-compatible SIMD" prefix byte 0x7a (exact prefix subject to change to avoid collisions). Every instruction in the 0x7a prefix space takes an initial byte immediate that describes the additional immediate arguments the instruction takes.

This initial immediate descriptor immediate makes every instruction in the 0x7a prefix space self-describing, so every instruction in that space can be parsed even if the engine does not know about it. To pass validation, every opcode in the 0x7a prefix space that does not correspond to a concrete SIMD instruction (that the engine knows about) has the same type and semantics as the unreachable instruction.

When we introduce a follow-on proposal, all the new instructions in the 0xfd prefix space will be mirrored into the 0x7a prefix space with prescribed immediate descriptor immediates, replacing opcodes that previously were specified to take any immediate descriptors and have unreachable semantics. Technically, this will not be a backwards-compatible change, but that's working as intended because this is a forwards-compatibility feature. This back-compat hazard should be explicitly called out in the spec.

Thanks to @binji for the helpful discussions about how best to design this.