FP8 ACLE specification #323

momchil-velikov · 2024-06-13T16:08:01Z

Thank you for submitting a pull request!

If this PR is about a bugfix:

Please use the bugfix label and make sure to go through the checklist below.

If this PR is about a proposal:

We look forward to evaluating your proposal and, if possible, making
it part of the Arm C Language Extension (ACLE) specifications.

We encourage you to read through the contribution guidelines, in
particular the section on submitting a proposal.

Please use the proposal label.

As with any pull request, please make sure to go through the checklist
below.

Checklist: (mark with X those which apply)

- If an issue reporting the bug exists, I have mentioned it in the
  PR (do not bother creating the issue if all you want to do is
  fix the bug yourself).
- I have added/updated the SPDX-FileCopyrightText lines on top
  of any file I have edited. Format is SPDX-FileCopyrightText: Copyright {year} {entity or name} <{contact information}>
  (Please update existing copyright lines if applicable. You can
  specify year ranges with a hyphen, as in 2017-2019, and use
  commas to separate gaps, as in 2018-2020, 2022).
- I have updated the Copyright section of the sources of the
  specification I have edited (this will show up in the text
  rendered in the PDF and other supported output formats). The
  format is the same as described in the previous item.
- I have run the CI scripts (if applicable, as they might be
  tricky to set up on non-*nix machines). The sequence can be
  found in the contribution guidelines. Don't worry if you cannot
  run these scripts on your machine; your patch will be
  automatically checked in the Actions of the pull request.
- I have added an item that describes the changes I have
  introduced in this PR in the section Changes for next
  release of the section Change Control/Document history
  of the document. Create Changes for next release if it does
  not exist. Note that changes that do not modify the
  content and rendering of the specifications (both HTML and PDF)
  do not need to be listed.
- When modifying content and/or its rendering, I have checked the
  correctness of the result in the PDF output (please refer to the
  instructions on how to build the PDFs locally).
- The variable draftversion is set to true in the YAML header
  of the sources of the specifications I have modified.
- Please DO NOT add my GitHub profile to the list of contributors
  in the README page of the project.

main/acle.md

andrewcarlotti · 2024-06-19T11:08:35Z

I'd prefer slightly different naming for the intrinsics and new types. Specifically:

Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.? This would be more consistent with existing type names while still preserving the "modal" distinction. It also makes the type name more easily distinguishable from FPMR values (which use fpm_t).
Can we drop the _fpm suffix from all the intrinsic names, and instead represent the modality in the type suffix (by replacing _f8 with _fm8 wherever it appears)?

Combining these, my proposal is to replace, for example,
float16x4_t vdot_lane_f16_f8_fpm(float16x4_t vd, fpm8x8_t vn, fpm8x8_t vm, __builtin_constant_p(lane), fpm_t fpm)
with
float16x4_t vdot_lane_f16_fm8(float16x4_t vd, floatm8x8_t vn, floatm8x8_t vm, __builtin_constant_p(lane), fpm_t fpm).

tools/intrinsic_db/advsimd.csv

momchil-velikov · 2024-06-21T10:20:41Z

> Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.?

> Can we drop the _fpm suffix from all the intrinsic names, and instead represent the modality in the type suffix (by replacing _f8 with _fm8 wherever it appears)?

I am in favour of both proposals.

rsandifo-arm · 2024-06-28T08:37:39Z

This might have been mentioned already, but the new vector types should also be added to svset_neonq, svget_neonq and svdup_neonq.

ARM ACLE PR#323[1] adds new modal types for 8-bit floating-point intrinsics. From PR#323:

```
ACLE defines the `__fpm8` type, which can be used for the E5M2 and E4M3 8-bit floating-point formats. It is a storage and interchange only type with no arithmetic operations other than intrinsic calls.
```

The type should be an opaque type, and its format is undefined in Clang. It is only defined in the backend by a status/format register, which for AArch64 is the FPMR. This patch is an attempt to add the fpm8_t scalar type. It has a parser and codegen for the new scalar type. The patch lowers it to an 8-bit unsigned integer, as it has no format. But maybe we should add another opaque type.

[1] ARM-software/acle#323
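To make the "modal" idea in the commit message above concrete, here is a minimal portable sketch of how the same 8 bits decode to different values depending on a format selector. It assumes the E5M2 and E4M3 encodings of the OCP 8-bit floating-point formats (E5M2 with IEEE-style infinities and NaNs, E4M3 with no infinities and a single NaN mantissa pattern). The names `fp8_bits`, `fp8_format` and `fp8_decode` are hypothetical and not part of ACLE; in ACLE the selection is made through the FPMR and the intrinsics, which this sketch does not model.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical names for illustration only; none of these come from ACLE. */
typedef enum { FP8_E5M2, FP8_E4M3 } fp8_format;

/* Storage-only wrapper: 8 bits with no arithmetic of their own. */
typedef struct { uint8_t bits; } fp8_bits;

/* Decode the raw bits under the given format (assumed OCP FP8 encodings). */
static float fp8_decode(fp8_bits v, fp8_format fmt)
{
    assert(fmt == FP8_E5M2 || fmt == FP8_E4M3);

    const int mant_bits = (fmt == FP8_E5M2) ? 2 : 3;
    const int exp_bits  = 7 - mant_bits;
    const int bias      = (1 << (exp_bits - 1)) - 1;   /* 15 or 7 */
    const int exp_max   = (1 << exp_bits) - 1;

    const int sign = v.bits >> 7;
    const int exp  = (v.bits >> mant_bits) & exp_max;
    const int mant = v.bits & ((1 << mant_bits) - 1);
    float mag;

    if (fmt == FP8_E5M2 && exp == exp_max) {
        /* IEEE-like: all-ones exponent is infinity or NaN. */
        mag = mant ? NAN : INFINITY;
    } else if (fmt == FP8_E4M3 && exp == exp_max && mant == 7) {
        /* E4M3 has no infinities; only this pattern is NaN. */
        mag = NAN;
    } else if (exp == 0) {
        /* Zero or subnormal: no implicit leading one. */
        mag = ldexpf((float)mant, 1 - bias - mant_bits);
    } else {
        /* Normal number: restore the implicit leading one. */
        mag = ldexpf((float)(mant | (1 << mant_bits)), exp - bias - mant_bits);
    }
    return sign ? -mag : mag;
}
```

For example, the byte `0x3C` decodes to 1.0 when read as E5M2 and to 1.5 when read as E4M3, which is why the type itself cannot carry arithmetic meaning without the accompanying mode.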

momchil-velikov · 2024-07-05T15:31:52Z

> Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.?

> Can we drop the _fpm suffix from all the intrinsic names, and instead represent the modality in the type suffix (by replacing _f8 with _fm8 wherever it appears)?

I am in favour of both proposals.

Coming up Soon(tm).

momchil-velikov · 2024-07-05T15:34:03Z

> This might have been mentioned already, but the new vector types should also be added to svset_neonq, svget_neonq and svdup_neonq.

My next step is to add intrinsics for the untyped SVE/SME instructions, that would include these too.

This patch adds these new vector sizes for Neon: fpm8x16_t and fpm8x8_t, according to ARM ACLE PR#323[1].

[1] ARM-software/acle#323

neon_intrinsics/advsimd.md

momchil-velikov · 2024-07-23T17:39:54Z

> I'd prefer slightly different naming for the intrinsics and new types. Specifically:
>
> Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.? This would be more consistent with existing type names while still preserving the "modal" distinction. It also makes the type name more easily distinguishable from FPMR values (which use fpm_t).

This part done.

main/acle.md

paulwalker-arm · 2024-07-26T11:02:27Z

> Can we call the new types floatm8_t, floatm8x16_t, svfloatm8_t, etc.? This would be more consistent with existing type names while still preserving the "modal" distinction. It also makes the type name more easily distinguishable from FPMR values (which use fpm_t).

As an amendment that follows the scheme used when going from float16 to bfloat16, what about mfloat8_t, mfloat8x16_t, svmfloat8_t, with "m" meaning "modal"?

momchil-velikov · 2024-07-30T11:03:13Z

Things renamed according to the above naming scheme.

rsandifo-arm

The new feature macros should also be listed in the “Summary of predefined macros” section.

main/acle.md

tools/intrinsic_db/advsimd.csv

main/acle.md

`mfloat8_t` | equivalent to `__mfp8` | According to ACLE[1] proposal [1] ARM-software/acle#323

According to the ACLE[1] [1]ARM-software/acle#323

and fpm8x16_t to mfloat8x16_t According to the ACLE[1] [1]ARM-software/acle#323

ARM ACLE PR#323[1] adds new modal types for 8-bit floating-point intrinsics. From PR#323:

```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3 8-bit floating-point formats. It is a storage and interchange only type with no arithmetic operations other than intrinsic calls.

`mfloat8_t` | equivalent to `__mfp8` |
```

The type should be an opaque type, and its format is undefined in Clang. It is only defined in the backend by a status/format register, which for AArch64 is the FPMR. This patch is an attempt to add the fpm8_t scalar type. It has a parser and codegen for the new scalar type. The patch lowers it to an 8-bit unsigned integer, as it has no format. But maybe we should add another opaque type.

According to the ACLE[1] proposal.

[1] ARM-software/acle#323

This patch adds these new vector sizes for Neon: mfloat8x16_t and mfloat8x8_t, according to ARM ACLE PR#323[1].

[1] ARM-software/acle#323

This patch adds this new vector type for SVE: svmfloat8_t, according to ARM ACLE PR#323[1].

[1] ARM-software/acle#323

main/acle.md

tools/intrinsic_db/advsimd_classification.csv

main/acle.md

tools/intrinsic_db/advsimd_classification.csv

main/acle.md

momchil-velikov · 2024-09-12T14:41:22Z

Squashed everything and rebased; this is not supposed to contain content changes.

[fixup] Add description of helper intrinsics ... and other clarifications.
[fixup] Add FP8 intrinsics for untyped NEON instructions (load/store/etc)
[fixup] Add FP8 intrinsics for untyped SVE/SME instructions
[fixup] Define the format for the FP8 state
[fixup] Mass renaming of FP8 intrinsics and types, and misc other
[fixup] Another naming scheme
[fixup] Miscellaneous fixes
[fixup] Add FP8 variants for some more "untyped" NEON intrinsics
[fixup] Add NEON reinterpret variants for FP8 and misc other
[fixup] Add FP8 feature macros to the predefined macros summary table
[fixup] Misc fixes
[fixup] Fix some indentation
[fixup] Move FP8 SVE intrinsics section alongside other streaming-compatible intrinsics
[fixup] Change FP8 NEON intrinsics classification

momchil-velikov · 2024-09-18T12:23:24Z

Last update was completely botched, now fixed.

CarolineConcatto

Thank you Momchil for the work.

tools/intrinsic_db/advsimd.csv

ARM ACLE PR#323[1] adds new modal types for 8-bit floating-point intrinsics. From PR#323:

```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3 8-bit floating-point formats. It is a storage and interchange only type with no arithmetic operations other than intrinsic calls.
```

The type should be an opaque type, and its format is undefined in Clang. It is only defined in the backend by a status/format register, which for AArch64 is the FPMR. This patch is an attempt to add the MFloat8_t scalar type. It has a parser and codegen for the new scalar type. The patch lowers it to an 8-bit unsigned integer, as it has no format. But maybe we should add another opaque type.

[1] ARM-software/acle#323

rockdreamer · 2024-09-23T09:43:33Z

main/acle.md

@@ -6515,7 +6714,7 @@ single vectors:

 | **Signed integer**   | **Unsigned integer** | **Floating-point**   |                      |
 | -------------------- | -------------------- | -------------------- | -------------------- |
-| `svint8_t`           | `svuint8_t`          |                      |                      |
+| `svint8_t`           | `svuint8_t`          |                      | `svmfloat8_t         |


Should the

svbfloat16_t is only available if the header file also provides a definition of bfloat16_t

section below be repeated or extended for svmfloat8_t?

Wouldn't that be necessary only if the underlying type definition was conditional?

It would, I was confused, thanks for clarifying :)

sallyarmneale

minor suggestions

sallyarmneale · 2024-09-25T09:57:09Z

main/acle.md

+The FP8 types are all opaque types. That is to say they can only be used
+by intrinsics.


That is to say they can only be used by intrinsics.

Change to: That is, they can only be used by intrinsics.

sallyarmneale · 2024-09-25T09:57:55Z

main/acle.md

+
+#### FCVTNT, FCVTNB
+
+Single-precision convert, narrow and interleave to 8-bit floating-point (top and bottom).


comma after 'narrow'

This patch implements these intrinsics:

FSCALE SINGLE AND MULTI

```
// Variants are also available for:
//   [_single_f32_x2], [_single_f64_x2],
//   [_single_f16_x4], [_single_f32_x4], [_single_f64_x4]
svfloat16x2_t svscale[_single_f16_x2](svfloat16x2_t zd, svfloat16_t zm) __arm_streaming;

// Variants are also available for:
//   [_f32_x2], [_f64_x2],
//   [_f16_x4], [_f32_x4], [_f64_x4]
svfloat16x2_t svscale[_f16_x2](svfloat16x2_t zd, svfloat16x2_t zm) __arm_streaming
```

(cf. ARM-software/acle#323)

Co-authored-by: Caroline Concatto <[email protected]>

This patch implements the following intrinsics:

```
float16x4_t vscale_f16(float16x4_t vn, int16x4_t vm)
float16x8_t vscaleq_f16(float16x8_t vn, int16x8_t vm)
float32x2_t vscale_f32(float32x2_t vn, int32x2_t vm)
float32x4_t vscaleq_f32(float32x4_t vn, int32x4_t vm)
float64x2_t vscaleq_f64(float64x2_t vn, int64x2_t vm)
```

as defined in ARM-software/acle#323

Co-authored-by: Hassnaa Hamdi <[email protected]>
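As a rough reading aid for the signatures above, the following portable sketch models a lane-wise scale on plain arrays. It assumes FSCALE-style semantics, where each floating-point lane of `vn` is multiplied by two raised to the signed integer in the matching lane of `vm`. The function name `model_vscale_f32` is hypothetical, and rounding-mode and exception behaviour of the real instruction are not modelled.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar model (not an ACLE intrinsic):
 *   out[i] = vn[i] * 2^vm[i]   for each lane i
 * This is the assumed FSCALE-style behaviour; it is a sketch, not the
 * specification of vscale_f32 or vscaleq_f32.
 */
static void model_vscale_f32(float *out, const float *vn,
                             const int32_t *vm, size_t lanes)
{
    assert(out != NULL && vn != NULL && vm != NULL);
    for (size_t i = 0; i < lanes; ++i) {
        /* ldexpf scales by a power of two without an explicit multiply. */
        out[i] = ldexpf(vn[i], (int)vm[i]);
    }
}
```

Under that assumption, calling the model with four lanes `{1.0, 3.0, -0.5, 8.0}` and exponents `{0, 2, 1, -3}` yields `{1.0, 12.0, -1.0, 1.0}`; a two-lane call would correspond to the shape of `vscale_f32` and a four-lane call to `vscaleq_f32`.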

rearnsha reviewed Jun 14, 2024

main/acle.md

rockdreamer reviewed Jun 14, 2024

main/acle.md

rockdreamer reviewed Jun 18, 2024

main/acle.md

andrewcarlotti reviewed Jun 19, 2024

tools/intrinsic_db/advsimd.csv

CarolineConcatto mentioned this pull request Jul 1, 2024

[CLANG][AArch64] Add the modal 8 bit floating-point scalar type llvm/llvm-project#97277

Open

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 22, 2024

[CLANG]Add Neon vectors for fpm8_t

fd4d8da

This patch adds these new vector sizes for neon: fpm8x16_t and fpm8x8_t According to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323

ktkachov reviewed Jul 23, 2024

neon_intrinsics/advsimd.md

Lukacma mentioned this pull request Jul 23, 2024

[AArch64] Implement intrinsics for SME2 FSCALE llvm/llvm-project#100128

Merged

Lukacma mentioned this pull request Jul 24, 2024

[AArch64] Implement NEON vscale intrinsics llvm/llvm-project#100347

Merged

ktkachov reviewed Jul 26, 2024

main/acle.md

rsandifo-arm reviewed Jul 30, 2024

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024

Rename 8-bit fp from fmp8 to mfloat8

c622d44

`mfloat8_t` | equivalent to `__mfp8` | According to ACLE[1] proposal [1] ARM-software/acle#323

momchil-velikov mentioned this pull request Jul 31, 2024

Intrinsics for absolute minimum and maximum, and table lookup #324

Merged

8 tasks

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024

Rename 8-bit fp from fmp8 to mfloat8

e5bcd7f

`mfloat8_t` | equivalent to `__mfp8` | According to ACLE[1] proposal [1] ARM-software/acle#323

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024

Rename NEON 8bit floating point from fpm8x8_t to mfp8x8_t

6dd7d08

According to the ACLE[1] [1]ARM-software/acle#323

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Jul 31, 2024

Rename NEON 8bit floating point from fpm8x8_t to mfloat8x8_t

a3f9937

and fpm8x16_t to mfloat8x16_t According to the ACLE[1] [1]ARM-software/acle#323

momchil-velikov force-pushed the fp8-acle branch from 5188a28 to c3532ca Compare August 1, 2024 16:19

CarolineConcatto added a commit to CarolineConcatto/llvm-project that referenced this pull request Aug 2, 2024

[CLANG]Add Scalable vectors for mfloat8_t

ceb5124

This patch adds these new vector sizes for sve: svmfloat8_t According to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323

CarolineConcatto mentioned this pull request Aug 2, 2024

[CLANG]Add Scalable vectors for mfloat8_t llvm/llvm-project#101644

Open

mgabka reviewed Aug 20, 2024

momchil-velikov force-pushed the fp8-acle branch from b5c7566 to ecadad8 Compare August 29, 2024 15:57

rsandifo-arm reviewed Sep 10, 2024

mgabka reviewed Sep 11, 2024

main/acle.md

vhscampos added the proposal label Sep 12, 2024

momchil-velikov force-pushed the fp8-acle branch from d34bcc6 to 6435cdd Compare September 12, 2024 14:39

momchil-velikov marked this pull request as draft September 18, 2024 10:05

momchil-velikov force-pushed the fp8-acle branch from c1e711f to 7bb741b Compare September 18, 2024 12:21

momchil-velikov marked this pull request as ready for review September 18, 2024 12:23

CarolineConcatto approved these changes Sep 18, 2024

tools/intrinsic_db/advsimd.csv

[fixup] Fix incorrect intrinsic name

b3c11b9

vhscampos merged commit 5525258 into ARM-software:main Sep 23, 2024
4 checks passed

rockdreamer reviewed Sep 23, 2024

sallyarmneale reviewed Sep 25, 2024
