This repository has been archived by the owner on Mar 20, 2024. It is now read-only.
allow vector element processing to effectively proceed in parallel, not strict 0 to vl, for most ops. #534
Labels
Resolve after v1.0
Does not need to be resolved for v1.0 draft
Most vector operations yield the same results if the elements are processed in any order.
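As a minimal sketch of this claim, the Python model below applies an element-wise add over the first `vl` elements twice, once in index order and once in a shuffled order. The function name `vadd`, the list-based register model, and zero-filled tail elements are illustrative assumptions, not part of any specification.

```python
import random

def vadd(a, b, order):
    """Element-wise add, visiting element indices in the given order.

    Hypothetical model: registers are Python lists, and elements not
    visited (the tail) are simply left at 0.
    """
    dest = [0] * len(a)
    for i in order:
        dest[i] = a[i] + b[i]
    return dest

a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
vl = 6

# Strict 0 to vl processing.
in_order = vadd(a, b, range(vl))

# Any other permutation of the same indices.
shuffled_idx = list(range(vl))
random.Random(0).shuffle(shuffled_idx)
any_order = vadd(a, b, shuffled_idx)

assert in_order == any_order
```

Each destination element depends only on the source elements at the same index, which is why the visiting order cannot be observed in the result.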
With an opaque vstart (#532), additional restart information can be encoded in its XLEN bits, allowing an implementation to restart from a variety of partially completed states.
Explicitly relax the constraint that elements are processed from 0 to vl, effectively allowing parallel processing.
Load operations in particular can benefit from this relaxation. Element loads can complete opportunistically, using any cache entry that is already present without waiting for any other request to complete, and requests to the cache subsystem can be issued in whatever order is optimal. See the discussions in #502 and #504.
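The opportunistic behavior for loads might be modeled roughly as follows. This is only a sketch: the hits-first policy in `load_order`, the 64-byte line size, and the dictionary-based memory are all hypothetical choices made for illustration, and a real cache subsystem would be free to pick a different order.

```python
def load_order(addrs, cached_lines, line_bytes=64):
    """Return element indices with cache hits first, then misses.

    Hypothetical scheduling policy; any other permutation would be
    equally valid under the relaxed ordering.
    """
    hits = [i for i, a in enumerate(addrs) if a // line_bytes in cached_lines]
    misses = [i for i, a in enumerate(addrs) if a // line_bytes not in cached_lines]
    return hits + misses

def vload(memory, addrs, order):
    """Load one element per address, visiting elements in the given order."""
    dest = [None] * len(addrs)
    for i in order:
        dest[i] = memory[addrs[i]]
    return dest

# Toy memory: every 8-byte-aligned address holds twice its address.
memory = {addr: addr * 2 for addr in range(0, 512, 8)}
addrs = [0, 64, 128, 192, 256]
cached_lines = {2, 4}  # the lines holding addresses 128 and 256

order = load_order(addrs, cached_lines)
opportunistic = vload(memory, addrs, order)
strict = vload(memory, addrs, range(len(addrs)))

assert opportunistic == strict
```

Elements 2 and 4 are serviced before the misses, yet the destination register ends up identical to the strictly ordered load.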
Notably, stores are affected by processing order. Equally notable is that stores come in both an ordered and an unordered variant.
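A small sketch of why store order is observable: when two elements of an indexed store target the same location, the last write wins, so the final memory contents depend on the order in which elements are processed. The `indexed_store` helper and the list-based memory are illustrative assumptions.

```python
def indexed_store(memory, base, offsets, values, order):
    """Store values[i] to memory[base + offsets[i]], visiting i in `order`.

    Hypothetical model of an indexed (scatter) store; returns a new
    memory image and leaves the input untouched.
    """
    mem = list(memory)
    for i in order:
        mem[base + offsets[i]] = values[i]
    return mem

memory = [0] * 8
offsets = [3, 5, 3, 1]      # elements 0 and 2 target the same location
values = [10, 20, 30, 40]

ascending = indexed_store(memory, 0, offsets, values, [0, 1, 2, 3])
reordered = indexed_store(memory, 0, offsets, values, [2, 1, 0, 3])

# Location 3 holds whichever of elements 0 and 2 was written last.
assert ascending[3] == 30
assert reordered[3] == 10
```

When no two elements alias, as in a unit-stride store, both orders would produce the same memory image.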
Some decisions about whether a destination register may overlap a source register are premised on 0 to vl processing order.
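To illustrate how overlap interacts with processing order, the sketch below uses a hypothetical in-place operation in which each element reads its lower neighbor from the same register that is being written. It is not meant to represent any specific instruction; the name `slide_up_one_in_place` and the list-based register are assumptions.

```python
def slide_up_one_in_place(reg, order):
    """Set r[i] = r[i - 1] for i >= 1, where source and destination overlap.

    Hypothetical operation used only to show that, with overlapping
    registers, the visiting order changes which values are read.
    """
    r = list(reg)
    for i in order:
        if i >= 1:
            r[i] = r[i - 1]
    return r

reg = [1, 2, 3, 4]

# Ascending order reads elements that were already overwritten.
ascending = slide_up_one_in_place(reg, [0, 1, 2, 3])

# Descending order reads every source element before it is overwritten.
descending = slide_up_one_in_place(reg, [3, 2, 1, 0])

assert ascending != descending
```

A rule that permits or forbids a given overlap therefore implicitly assumes something about the order in which elements are read and written.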
It remains valid to enforce these constraints: simple systems may elect to always process from 0 to vl, and they should not be penalized for a potential optimization in other implementations.