Operator set wave 3 #805
Another "TODO" - the new ops need "constraints" tables
Note from discussion w/ @a-sully - CoreML has restrictions on the dequantize op that we'll need to think about.
Re-emphasizing that dequantizing (u)int4 in CoreML is extremely limited (input must be const). @mwyrzykowski - any thoughts about how we can handle the proposed ops efficiently?
Initial pass.
<dl dfn-type=dict-member dfn-for=MLCumulativeSumOptions>
: <dfn>exclusive</dfn>
::
Whether to include or exclude the current value in the output, meaning inclusive presum addition (see https://en.wikipedia.org/wiki/Prefix_sum) or exclusive post-sum addition. Given input *[1,2,3,4]*, inclusive addition would yield an output of *[1,3,6,10]* whereas exclusive would yield *[0,1,3,6]*. The default is inclusive.
Inconsistent between "presum" vs. "post-sum"
How about "inclusive prefix sum addition" and "exclusive prefix sum addition" (or w/o the "addition" part)?
Whether to include or exclude the current value in the output, meaning inclusive presum addition (see https://en.wikipedia.org/wiki/Prefix_sum) or exclusive post-sum addition. Given input *[1,2,3,4]*, inclusive addition would yield an output of *[1,3,6,10]* whereas exclusive would yield *[0,1,3,6]*. The default is inclusive.
Whether to include or exclude the current value in the output, meaning inclusive prefix sum addition or exclusive prefix sum addition [[Prefix-sum]]. Given input *[1,2,3,4]*, inclusive addition would yield an output of *[1,3,6,10]* whereas exclusive would yield *[0,1,3,6]*. The default is inclusive.
Here's the biblio entry to be added at the end of the <pre class="biblio"> block:
+ },
+ "Prefix-Sum": {
+ "href": "https://en.wikipedia.org/wiki/Prefix_sum",
+ "title": "Prefix Sum",
+ "authors": ["The Wikipedia community"],
+ "date": "January 2025"
}
}
</pre>
The Wikipedia article seems quite good. I believe this operation has been known for centuries(?), so hard to hunt down the canonical reference :-)
The reason why it is good to keep the spec references in their own section is to help assess their stability, licensing etc. during certain transitions. Especially important for normative references: https://www.w3.org/guide/process/tilt/normative-references.html

: <dfn>reversed</dfn>
::
Whether to reverse the summation direction along the active axis to instead start from the high coordinate to low coordinate. Given input *[1,2,3,4]*, inclusive forward addition would yield an output of *[1,3,6,10]* whereas backward summation would yield *[10,9,7,4]*. The default is exclusive.
"The default is exclusive" - should be "The default is forward." ?
Also, inconsistent phrasing "inclusive forward addition" vs. "backward summation"
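For reference, here is a minimal 1-D sketch of the semantics being discussed, written in Python rather than against the WebNN API. The function name `cumulative_sum` and the parameter name `reversed_` are hypothetical; only the `exclusive` and `reversed` option names and the three sample outputs come from the proposed spec text. How the two options combine is an assumption (reverse the traversal, then apply the inclusive or exclusive rule).

```python
from typing import List


def cumulative_sum(values: List[float],
                   exclusive: bool = False,
                   reversed_: bool = False) -> List[float]:
    """Prefix sum over a 1-D list (illustrative sketch, not the spec algorithm)."""
    # Walk from the high coordinate to the low one when reversed_ is set.
    items = values[::-1] if reversed_ else list(values)
    out = []
    running = 0
    for v in items:
        if exclusive:
            # Exclusive: emit the sum of everything before the current value.
            out.append(running)
            running += v
        else:
            # Inclusive: the current value is part of the emitted sum.
            running += v
            out.append(running)
    # Restore the original element order for the reversed case.
    return out[::-1] if reversed_ else out


print(cumulative_sum([1, 2, 3, 4]))                  # inclusive, forward
print(cumulative_sum([1, 2, 3, 4], exclusive=True))  # exclusive, forward
print(cumulative_sum([1, 2, 3, 4], reversed_=True))  # inclusive, backward
```

With the defaults in this sketch (inclusive, forward), the three calls reproduce the *[1,3,6,10]*, *[0,1,3,6]* and *[10,9,7,4]* examples quoted in the diff.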
- *greater*: Compare if the values of the first input tensor is greater, element-wise.
- *greaterOrEqual*: Compare if the values of the first input tensor is greater or equal, element-wise.
- *lesser*: Compare if the values of the first input tensor is lesser, element-wise.
- *lesserOrEqual*: Compare if the values of the first input tensor is lesser or equal, element-wise.
- *logicalNot*: Invert the values of the input tensor to values 0 or 1, element-wise. Specifically, when the input value is non-zero, invert it to 0. Conversely, for a zero input value, invert it to 1.
- *logicalAnd*: Compute the logical *and* operator, element-wise, treating any non-zero value as true and returning elements of 0 or 1.
For consistency, how about "of the two input tensors" instead of "operator" ?
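As a quick illustration of the "non-zero is true, output is 0 or 1" wording in the quoted text, here is a small Python sketch over flat lists. The helper names `logical_not` and `logical_and` are hypothetical and only mirror the op names; broadcasting and data types are deliberately left out.

```python
from typing import List


def logical_not(a: List[int]) -> List[int]:
    # Any non-zero input becomes 0; a zero input becomes 1.
    return [0 if x != 0 else 1 for x in a]


def logical_and(a: List[int], b: List[int]) -> List[int]:
    # Treat any non-zero value as true and return elements of 0 or 1.
    if len(a) != len(b):
        raise ValueError("this sketch requires equal-length inputs")
    return [1 if (x != 0 and y != 0) else 0 for x, y in zip(a, b)]


print(logical_not([0, 1, 7]))
print(logical_and([0, 2, 5], [3, 0, 9]))
```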
MLBinarySupportLimits greater;
MLBinarySupportLimits greaterOrEqual;
MLBinarySupportLimits lesser;
MLBinarySupportLimits lesserOrEqual;
MLLogicalNotSupportLimits logicalNot;
MLLogicalNotSupportLimits logicalAnd;
MLLogicalNotSupportLimits logicalOr;
MLLogicalNotSupportLimits logicalXor;
};
</script>

Need to add the new ops to the <div dfn-for... just below this line.
This ensures the arguments in the IDL get linked up appropriately.
@@ -3482,6 +3630,13 @@ partial dictionary MLOpSupportLimits {
1. Return |output|.
</div>

<div algorithm>
The <dfn method for=MLGraphBuilder>sign(|input|, |options|)</dfn> method steps are:
1. Let |output| be the result of running the [=MLGraphBuilder/element-wise-unary-op | create element-wise unary operation=] given "sign", |input|, signed types « {{MLOperandDataType/"float32"}}, {{MLOperandDataType/"float16"}}, {{MLOperandDataType/"int32"}}, {{MLOperandDataType/"int8"}} », and |options|.
Nit: We don't include flavor text like "signed types" before the list of supported types elsewhere.
index.bs
Outdated
### dequantizeLinear ### {#api-mlgraphbuilder-dequantizelinear}
Dequantizes an integer tensor to floating point space using the scale and zero-point bias, where `output = (input - zeroPoint) * scale`.

The operation will be [=broadcast=] according to [[!numpy-broadcasting-rule]]. The input tensors must be [=bidirectionally broadcastable=]. The [=MLOperand/rank=] of the output tensor is the maximum [=MLOperand/rank=] of the input tensors. For each dimension of the output tensor, its size is the maximum size along that dimension of the input tensors, and each dimension must be blockwise compatible with the output (e.g. given an input shape [12], scales of the following shapes are blockwise compatible {[1], [3], [4], [6], [12]} as they are all multiples of the input dimensions, but a shape of [5] would not be).
Maybe introduce "blockwise compatible" as a definition, to avoid duplicating it between dequantizeLinear() and quantizeLinear()?
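To make the "blockwise compatible" idea concrete, here is a rough Python sketch for the 1-D case. It reads the quoted example as "each input dimension is an integer multiple of the matching scale dimension", and it assumes scale and zeroPoint have the same rank as the input. The helper names `is_blockwise_compatible` and `dequantize_linear_1d` are hypothetical, not spec definitions.

```python
from typing import List, Sequence


def is_blockwise_compatible(input_shape: Sequence[int],
                            scale_shape: Sequence[int]) -> bool:
    """True if every input dimension is a whole multiple of the scale dimension."""
    if len(input_shape) != len(scale_shape):
        return False  # assumption: ranks must match
    return all(s > 0 and i % s == 0 for i, s in zip(input_shape, scale_shape))


def dequantize_linear_1d(values: List[int],
                         scale: List[float],
                         zero_point: List[int]) -> List[float]:
    """output = (input - zeroPoint) * scale, with one scale/zeroPoint per block."""
    if len(scale) != len(zero_point):
        raise ValueError("scale and zero_point must have the same shape")
    if not is_blockwise_compatible([len(values)], [len(scale)]):
        raise ValueError("scale shape is not blockwise compatible with input")
    block = len(values) // len(scale)  # elements sharing one scale entry
    return [(v - zero_point[i // block]) * scale[i // block]
            for i, v in enumerate(values)]


# Four quantized values split into two blocks of two elements each.
print(dequantize_linear_1d([10, 12, 14, 16], [0.5, 2.0], [10, 12]))
```

A single shared definition like `is_blockwise_compatible` is roughly what both dequantizeLinear() and quantizeLinear() could reference.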
index.bs
Outdated
1. [=list/For each=] |size| of |shapeInput|:
1. If |dimCount| is less than or equal to |axis| then [=iteration/continue=].
1. Set |shapeOutput|[|rankOutput| + |dimCount| - |axis| - 1] to |size|.
1. Increment |dimCount| by one.
This dimCount isn't used after it is calculated?
(Pre-existing issue!)
index.bs
Outdated
@@ -7878,6 +8674,45 @@ partial dictionary MLOpSupportLimits {
</details>
</div>

### tile ### {#api-mlgraphbuilder-tile}
Repeat a tensor the number of times along each dimension.
Wording "a number" rather than "the number" ?
index.bs
Outdated
<div dfn-for="MLGraphBuilder/tile(input, options)" dfn-type=argument>
**Arguments:**
- <dfn>input</dfn>: an {{MLOperand}}. The input N-D tensor.
- <dfn>repetitions</dfn>: A count per each dimension of how many times to repeat that dimension. The repetitions count must match the input rank, using 1's for any axis that should retain the same size.
Wording: "per each" seems redundant. Either "count per dimension" or "count for each dimension" ?
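A small Python sketch of the described behavior on nested lists may help when checking the wording. The names `tile`, `_tile` and `nested_rank` are hypothetical helpers; the only things taken from the quoted text are that `repetitions` has one entry per dimension and that 1 leaves an axis unchanged.

```python
from typing import Any, List, Sequence


def nested_rank(tensor: Any) -> int:
    """Rank of a nested-list tensor, e.g. [[1, 2], [3, 4]] has rank 2."""
    rank = 0
    while isinstance(tensor, list):
        rank += 1
        tensor = tensor[0] if tensor else None
    return rank


def _tile(tensor: Any, repetitions: List[int]) -> Any:
    if not repetitions:
        return tensor  # reached a scalar element
    head, rest = repetitions[0], repetitions[1:]
    # Tile each child along the remaining axes, then repeat this axis head times.
    return [_tile(child, rest) for _ in range(head) for child in tensor]


def tile(tensor: List[Any], repetitions: Sequence[int]) -> List[Any]:
    """Repeat a nested-list tensor along each dimension (illustrative sketch)."""
    if len(repetitions) != nested_rank(tensor):
        raise ValueError("repetitions count must match the input rank")
    return _tile(tensor, list(repetitions))


print(tile([[1, 2], [3, 4]], [2, 1]))  # repeat rows twice, keep columns
print(tile([[1, 2], [3, 4]], [1, 2]))  # keep rows, repeat columns twice
```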
@@ -8537,6 +9372,8 @@ Operations present in other neural network inference APIs can often be emulated

<p class="note">{{Float16Array}} is at <a href="https://tc39.es/process-document/">ECMA Stage 3</a> signaling its design is finished. Implementers wanting to enable this type ahead native implementations can emulate the type by passing raw bits via {{Uint16Array}}. <a href="https://github.com/webmachinelearning/webnn/issues/373">[Issue webnn#373]</a></p>

<p class="note">There is no Uint4Array/Int4Array class. Nybbles are stored in byte arrays of {{Uint8Array}} with the lower nybble in the lower bits, meaning tensor element 0 would be found in byte 0 bits 0-3, and tensor element 1 in byte 0 bits 4-7 (and so on, with tensor element 5 in byte 2 bits 4-7). Odd tensor element counts are rounded up to whole bytes, and last nybble is ignored, meaning a 5 element tensor uses 3 bytes.</a></p>
I think "nibble" is the preferred spelling.
Also, the examples may be a bit confusing, as the first example mentions tensor element 5 (so, a 6-or-more element tensor) which would pack that element's bits into the upper nibble, whereas the second example mentions a 5 element tensor which would pack the last element's bits into the lower nibble. Maybe intentionally pick a number other than 5 for the first or second example?
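To sanity-check the packing layout described in the note, here is a hedged Python sketch using `bytes` as a stand-in for {{Uint8Array}}. The function names `pack_uint4` and `unpack_uint4` are hypothetical; the layout (even elements in bits 0-3, odd elements in bits 4-7, odd counts rounded up to whole bytes) follows the quoted note.

```python
from typing import List


def pack_uint4(elements: List[int]) -> bytes:
    """Pack 4-bit values two per byte, lower nibble first."""
    packed = bytearray((len(elements) + 1) // 2)  # odd counts round up
    for i, v in enumerate(elements):
        if not 0 <= v <= 15:
            raise ValueError("uint4 values must be in the range 0..15")
        shift = 4 if i % 2 else 0  # odd elements go in the upper nibble
        packed[i // 2] |= v << shift
    return bytes(packed)


def unpack_uint4(packed: bytes, count: int) -> List[int]:
    """Read back `count` 4-bit values; a trailing unused nibble is ignored."""
    return [(packed[i // 2] >> (4 if i % 2 else 0)) & 0xF for i in range(count)]


data = pack_uint4([1, 2, 3, 4, 5])
print(len(data), data.hex())
print(unpack_uint4(data, 5))
```

In this sketch a 5 element tensor occupies 3 bytes, with its last element in the lower nibble of byte 2, while element index 5 of a longer tensor would land in the upper nibble of byte 2, which is the distinction the two examples in the note blur.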
The definitions of byte length and validate buffer with descriptor should get updated. Probably more places too.
Add to MLOperandDataType enum too.
Also the definition of the cast() operator and the cast algorithm - even just to call out that uint4 is not permitted.
<tr>
<td>{{scale}}</td>
<td>{{MLOperandDataType/"float32"}}, {{MLOperandDataType/"float16"}}</td>
<td>0 to {{input}}'s [=MLOperand/rank=]</td>
To reduce testing complexity somewhat, we could require that scale and zeroPoint be explicitly reshaped to the same rank as input ahead of time? That would also nicely resolve variant oddities in how the caller expresses axes ahead of time 🤔.
Reviewing other ops, requiring scale and zeroPoint to be the same rank as the input would be consistent. Easy enough for frameworks to inject a reshape().
If someone was manually coding against the API then supporting scalars would be convenient, but I don't think we should prioritize that.
<tr>
<td>{{indices}}</td>
<td>{{MLOperandDataType/"int32"}}, {{MLOperandDataType/"uint32"}}, {{MLOperandDataType/"int64"}}</td>
<td>> 1</td>
I don't see other occurrences of > 1. Should I use 2 to N instead?
"2 to N" I guess?
While "> 1" is compact and readable, making it "2 to N" is probably what we should do, since N is a defined term and we may allow implementations to define a maximum N.
Please take a peek at the definition of "allowed ranks" and reword if you think it's needed. It already mentions ranges, so might be okay.
Added data type tables.
Thanks - will address more
Adds the following operators, per #375:
Some TODO's remain: