allow vector element processing to effectively proceed in parallel, not strict 0 to vl, for most ops. #534

David-Horner · 2020-07-17T14:18:09Z

Most vector operations yield the same results if the elements are processed in any order.
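As a rough illustration of the claim above, here is a minimal Python sketch of an element-wise add in which the visiting order is a parameter. The function `vadd` and its arguments are hypothetical names made up for this example; it is a toy model, not how any implementation or the spec defines the operation.

```python
import random

def vadd(a, b, vl, order):
    """Toy model of an element-wise vector add, visiting elements in `order`."""
    dest = [0] * vl
    for i in order:
        # Each destination element depends only on the same-index sources,
        # so the visiting order cannot change the result.
        dest[i] = a[i] + b[i]
    return dest

vl = 8
a = list(range(vl))
b = [10 * x for x in range(vl)]

in_order = vadd(a, b, vl, range(vl))

shuffled = list(range(vl))
random.Random(0).shuffle(shuffled)  # an arbitrary, reproducible permutation
any_order = vadd(a, b, vl, shuffled)

assert in_order == any_order
```

Because every element is computed independently, any permutation of the indices produces the same destination vector.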

With an opaque vstart (#532), additional restart information can be encoded in its XLEN bits, so that execution can resume from a variety of partially completed states.

Explicitly relax the constraint that elements be processed in order from 0 to vl, effectively allowing parallel processing.

Note that load operations in particular can benefit from this relaxation. Element loads can complete opportunistically, using any cache entry that is already present without waiting for other requests to complete, and requests to the cache subsystem can be issued in whatever order is optimal. See the discussions in #502 and #504.

Notably, stores are affected by processing order. Equally notable, stores already come in both an ordered and an unordered variant.
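To see why stores differ, consider an indexed store in which two elements target the same address. The sketch below is a toy Python model; `scatter_store` and the data values are hypothetical and only illustrate the order dependence.

```python
def scatter_store(memory, indices, values, order):
    """Toy model of an indexed store: element i writes values[i] to memory[indices[i]]."""
    mem = list(memory)
    for i in order:
        mem[indices[i]] = values[i]
    return mem

memory = [0, 0, 0, 0]
indices = [2, 0, 2, 1]      # elements 0 and 2 both target address 2
values = [10, 20, 30, 40]

ascending = scatter_store(memory, indices, values, range(4))
descending = scatter_store(memory, indices, values, reversed(range(4)))

# The last writer to address 2 wins, so the two orders disagree there.
assert ascending[2] == 30
assert descending[2] == 10
```

When the target addresses never collide, the order is unobservable in the final memory contents of this model, which is the case an unordered variant can exploit.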

Some decisions about when a destination register may overlap a source register are premised on the 0 to vl processing order.
It is valid to keep enforcing these constraints: simple systems may elect to always process from 0 to vl, and they should not be penalized for a potential optimization in other implementations.
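One way to picture the overlap concern is a slide-down by one element performed in place, so the destination and source are the same register. This is only a toy Python model with a hypothetical helper, `slide_down_in_place`; it is not meant to restate the spec's actual overlap rules or the reasoning behind them.

```python
def slide_down_in_place(reg, order):
    """Toy model: reg[i] = reg[i + 1], with source and destination overlapping."""
    r = list(reg)
    n = len(r)
    for i in order:
        if i + 1 < n:
            r[i] = r[i + 1]  # reads an element that a later step may overwrite
    return r

reg = [1, 2, 3, 4]

# Ascending order reads each source element before it is overwritten.
ascending = slide_down_in_place(reg, range(4))

# Descending order overwrites source elements before they are read.
descending = slide_down_in_place(reg, reversed(range(4)))

assert ascending == [2, 3, 4, 4]
assert descending == [4, 4, 4, 4]
```

In this model the overlap is harmless only when elements are visited from 0 upward, so an implementation that processes elements in another order would need to buffer the source or keep the overlap restriction.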

kasanovic · 2021-03-19T08:28:25Z

We require vstart to report the faulting element in base v1.0, as this is needed to simplify error handling and reporting, but have reserved other values >VLMAX for future use.
In general, restart will require more state than the number of bits in vstart, so the utility/generality of this approach is unclear.
Marking as a post-v1.0 issue.

David-Horner mentioned this issue Jul 19, 2020

Provide vlo*, vector load ordered variants with consecutive pairing #535

Open

kasanovic added the "Resolve for v1.0" (To be resolved for v1.0 draft) label Aug 7, 2020

kasanovic added the "Resolve after v1.0" (Does not need to be resolved for v1.0 draft) label and removed the "Resolve for v1.0" (To be resolved for v1.0 draft) label Mar 19, 2021
