Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FMV] Remove features without compiler support and consolidate others. #329

Conversation

labrinea
Copy link
Contributor

@labrinea labrinea commented Jul 9, 2024

It was raised in #315 that some features are of no use. In this revision I am removing FMV features that have no equivalent backend feature in the compiler therefore cannot be supported in any meaningful way [1], and I am fusing FMV features that the compiler cannot support independently [2].

[1]: dgh, rpres, sve-bf16, sve-ebf16, sve-i8mm, memtag3
[2]: {sha1, sha2}, {aes, pmull}, {sve2-aes, sve2-pmull128},
{memtag, memtag2}, {ssbs, ssbs2}, {ls64, ls64_v, ls64_accdata}


name: Pull request
about: Technical issues, document format problems, bugs in scripts or feature proposal.


Thank you for submitting a pull request!

If this PR is about a bugfix:

Please use the bugfix label and make sure to go through the checklist below.

If this PR is about a proposal:

We are looking forward to evaluate your proposal, and if possible to
make it part of the Arm C Language Extension (ACLE) specifications.

We would like to encourage you reading through the contribution
guidelines
, in particular the section on submitting
a proposal
.

Please use the proposal label.

As for any pull request, please make sure to go through the below
checklist.

Checklist: (mark with X those which apply)

  • If an issue reporting the bug exists, I have mentioned it in the
    PR (do not bother creating the issue if all you want to do is
    fixing the bug yourself).
  • I have added/updated the SPDX-FileCopyrightText lines on top
    of any file I have edited. Format is SPDX-FileCopyrightText: Copyright {year} {entity or name} <{contact informations}>
    (Please update existing copyright lines if applicable. You can
    specify year ranges with hyphen , as in 2017-2019, and use
    commas to separate gaps, as in 2018-2020, 2022).
  • I have updated the Copyright section of the sources of the
    specification I have edited (this will show up in the text
    rendered in the PDF and other output format supported). The
    format is the same described in the previous item.
  • I have run the CI scripts (if applicable, as they might be
    tricky to set up on non-*nix machines). The sequence can be
    found in the contribution
    guidelines
    . Don't
    worry if you cannot run these scripts on your machine, your
    patch will be automatically checked in the Actions of the pull
    request.
  • I have added an item that describes the changes I have
    introduced in this PR in the section Changes for next
    release
    of the section Change Control/Document history
    of the document. Create Changes for next release if it does
    not exist. Notice that changes that are not modifying the
    content and rendering of the specifications (both HTML and PDF)
    do not need to be listed.
  • When modifying content and/or its rendering, I have checked the
    correctness of the result in the PDF output (please refer to the
    instructions on how to build the PDFs
    locally
    ).
  • The variable draftversion is set to true in the YAML header
    of the sources of the specifications I have modified.
  • Please DO NOT add my GitHub profile to the list of contributors
    in the README page of the project.

@labrinea
Copy link
Contributor Author

labrinea commented Jul 9, 2024

Please review @DanielKristofKiss @rsandifo-arm

@andrewcarlotti
Copy link

Thanks for picking this up. It looks mostly good to me, but is there any reason you've retained dit and ebf16 in this PR?

@labrinea
Copy link
Contributor Author

labrinea commented Jul 9, 2024

Thanks for picking this up. It looks mostly good to me, but is there any reason you've retained dit and ebf16 in this PR?

Good catch on ebf16. I simply missed it. I'll fix in new revision. Regarding dit, LLVM seems to have an equivalent backend feature therefore I think it's worth keeping in the short/medium term. We can revisit later if it is of no use. Do you agree?

It was raised in ARM-software#315 that some features are of no use. In this revision
I am removing FMV features that have no equivalent backend feature in the
compiler therefore cannot be supported in any meaningful way [1], and I am
fusing FMV features that the compiler cannot support independently [2].

[1]: dgh, ebf16, rpres, sve-bf16, sve-ebf16, sve-i8mm, memtag3
[2]: {sha1, sha2}, {aes, pmull}, {sve2-aes, sve2-pmull128},
     {memtag, memtag2}, {ssbs, ssbs2}, {ls64, ls64_v, ls64_accdata}
@labrinea labrinea force-pushed the fmv-remove-and-consolidate-features branch from 139fffa to 4f62132 Compare July 9, 2024 18:00
@andrewcarlotti
Copy link

There's no -march flag for +dit, and my previous objections to it still apply, so I still think it should be removed. If you'd rather leave it for now, then we could have that as a separate incremental PR.

@tmatheson-arm
Copy link
Contributor

Wasn't the fundamental criticism in #315 that FMV is about detection, not codegen? And therefore just because there is no codegen effect of an FMV feature doesn't mean it is unusable.

For example, combining sha1 + sha2: is there never any context in which you want to discriminate between them for detection?

I thought mentag/memtag2 was already identified as a necessary distinction?

I think this is more likely to progress if it is split into several smaller patches, in which the rationale for combining/deleting each set of features is made clear.

I am removing FMV features that have no equivalent backend feature in the compiler

Just because the LLVM backend happens to combine features for codegen, that is not a justification for combining them at the FMV level. I would expect the rationale to be based on the Arm ARM or the feature sets of existing processors.

@DanielKristofKiss
Copy link
Contributor

Originally all features that available in user-space were added to the FMV spec.

Especially with target_version the expectation is that developers could write code for a certain feature with intrinsics or even inline assembly with .inst so no need for backend support.
Also target_version could be used for functionally control some aspect of the application instead of the codegen benefits.

Today there is code out there in the wild that use feature detection(HWCAP) to pick up a feature only when an unrelated and actually not used other feature is present as a kind a uarch version detection. This is bit of abuse of the feature detections but wanted to add here for awareness.
So with target_clones maybe we need to keep considering features that might not end up in instructions.

I'd prefer to keep as many feature in the list as many observable in userspace and remove stuff if we can be sure there is no real chance to use them in some way.

I think this is more likely to progress if it is split into several smaller patches, in which the rationale for combining/deleting each set of features is made clear.

I second this approach.

@Wilco1
Copy link

Wilco1 commented Jul 16, 2024

I disagree - #315 lists good reasons to merge/remove various features. I believe it is better to start with a small set of well defined features and expand over time as required. The ACLE currently lists 60 FMV features, but GCC and LLVM implement a subset of these - AFAIK which will work and which do not is not documented anywhere...

Note FMV is not the best nor the most efficient for feature detection. Defining a global variable that holds the HWCAP values may be a better solution if that is all that is needed.

@labrinea
Copy link
Contributor Author

labrinea commented Jul 16, 2024

Let me explain how the landscape looks like in LLVM and why I am proposing this change. Currently the vehicle for propagating architectural features from Clang to LLVM is the 'target-features' metadata. This is common between the command line and the target attribute, allowing them to interoperate. The FMV features I am suggesting to remove do not have corresponding metadata, making the interoperability impossible. Here is a motivating example why we need it: llvm/llvm-project#87939. A workaround would be to add dummy metadata (without backend correspondence) for some of those FMV features. The features I am mostly concerned about are those with different meanings in FMV than the command line or the target attribute. Those are: aes, sve2-aes, memtag, ssbs, ls64. Perhaps we could start by aligning these?

@labrinea
Copy link
Contributor Author

labrinea commented Jul 21, 2024

I think this is more likely to progress if it is split into several smaller patches, in which the rationale for combining/deleting each set of features is made clear.

Okay here is what is required in LLVM for splitting sha1 from sha2: llvm/llvm-project#99816. If we decide to move this forward then GCC would have to implement the same behavior. Unfortunately we cannot do the same for the features I listed in my previous message (aes, sve2-aes, memtag, ssbs, ls64) because we would have to change their semantics, which breaks backwards compatibility. Let me explain: According to ArmARM:

  • feat_pmull depends on feat_aes
  • feat_sve_pmull128 depends on feat_sve_aes
  • feat_mte2 depends on feat_mte
  • feat_ssbs2 depends on feat_ssbs
  • feat_ls64_accdata depends on feat_ls64_v
  • feat_ls64_v depends on feat_ls64

The compiler is using the name of the dependee to represent the aggregated feature set instead of using the name of the dependant: for example the name ls64 means {feat_ls64_accdata+feat_ls64_v+feat_ls64}. If the name ls64_accdata had been used instead, then we would be able to do what I've shown for feat_sha1. Same goes for the rest of the features I have listed. Do you think it's ok to drop backwards compatibility and spit all these features? Or do you prefer to proceed with this ACLE proposal? Neither is great but we have to choose a way forwards.

*Note that aes interacts weirdly with crypto so changing its semantics may turnout complicated.

@ktkachov
Copy link

Features like sve-bf16 have intrinsics that compilers support, does that not count as "compiler support"?
Also, sve-i8mm includes the SUDOT and USDOT instructions that have both intrinsics and can (and are) automatically code generated from normal C code

@labrinea
Copy link
Contributor Author

Features like sve-bf16 have intrinsics that compilers support, does that not count as "compiler support"?
Also, sve-i8mm includes the SUDOT and USDOT instructions that have both intrinsics and can (and are) automatically code generated from normal C code

Hey Kyril, thanks for looking at this. In LLVM both the sve-bf16 and sve-i8mm instructions are predicated on finer grained features like sve(2/p1), sme(2), bf16 and i8mm.

@labrinea
Copy link
Contributor Author

@jroelofs what is your opinion on this proposal? Are there any particular features that you would like to keep from those listed in this PR? For more details please refer to the original PR from Andrew here.

@jroelofs
Copy link

I don't find the argument that LLVM lumps features together to be a compelling reason to lump them together on the FMV side of things. We can implement them as if they were lumped together, where that makes sense (via an alias table or something). But IMO, users should be able to mention the name of the feature they're actually depending on, to reduce confusion on the part of people reading code that uses these attributes.

On the axis of whether FMV should support detection of features that do not affect codegen in the compiler, I also have a similarly un-compelled outlook: the point of FMV is not just to toggle compiler-aware features, but instead to give users a consistent and easy-to-reason-about way to write isa-feature-aware versioned functions.

Together, this is why I strongly supported @labrinea's patch to stop optimizing FMV resolvers in terms of implied features. The way LLVM lumps features together for those "implies" relationships at the moment is a bit of an ad-hoc mess which we periodically have to clean up.

@labrinea
Copy link
Contributor Author

We can implement them as if they were lumped together, where that makes sense (via an alias table or something).

Okay before I start raising separate PRs, in the case of sha1 and sha2 ArmARM says:

SHA2, bits [15:12]
0b0000 No SHA2 instructions implemented.
FEAT_SHA256 implements the functionality identified by the value 0b0001 .
If the value of ID_AA64ISAR0_EL1.SHA1 is 0b0000 , this field must have the value 0b0000 .

SHA1, bits [11:8]
0b0000 No SHA1 instructions implemented.
FEAT_SHA1 implements the functionality identified by the value 0b0001 .
If the value of ID_AA64ISAR0_EL1.SHA2 is 0b0000 , this field must have the value 0b0000 .

To my understanding this means you can't have one without the other. Would we like to fuse them in the FMV spec or introduce an alias in the compiler sha1 -> sha2 ? I think the draft should be abandoned based on the above information.

labrinea added a commit to labrinea/acle that referenced this pull request Sep 4, 2024
As originally discussed in ARM-software#315 and then in ARM-software#329 we want to unify
features that the toolchains cannot support independently. In the
case of ls64 I have attempted to split the lumped feature in the
compiler (see llvm/llvm-project#101712),
but unfortunately this would break backwards compatibility:

Mapping 'ls64' to FeatureLS64_ACCDATA would enable all three of
{FeatureLS64, FeatureLS64_V, FeatureLS64_ACCDATA} but then using
'nols64' either on the command line or the assembler directive
would only disable FeatureLS64_ACCDATA without disabling the
other two. For that we would have to map 'ls64' to FeatureLS64,
but then it would not enable the other two.

As far as I am aware there are no hardware implementations out
there which implement ls64 without ls64_v or ls64_accdata, so
unifying these features in the specification should not be a
regression for feature detection. If any of this changes in
the feature we can revisit the specification.
@labrinea
Copy link
Contributor Author

We are addressing the listed features case by case.

@labrinea labrinea closed this Sep 10, 2024
labrinea added a commit to labrinea/acle that referenced this pull request Sep 12, 2024
As originally discussed in ARM-software#315 and then in ARM-software#329 we want to unify
features that the toolchains cannot support independently. In the
case of ls64 I have attempted to split the lumped feature in the
compiler (see llvm/llvm-project#101712),
but unfortunately this would break backwards compatibility:

Mapping 'ls64' to FeatureLS64_ACCDATA would enable all three of
{FeatureLS64, FeatureLS64_V, FeatureLS64_ACCDATA} but then using
'nols64' either on the command line or the assembler directive
would only disable FeatureLS64_ACCDATA without disabling the
other two. For that we would have to map 'ls64' to FeatureLS64,
but then it would not enable the other two.

As far as I am aware there are no hardware implementations out
there which implement ls64 without ls64_v or ls64_accdata, so
unifying these features in the specification should not be a
regression for feature detection. If any of this changes in
the feature we can revisit the specification.
vhscampos pushed a commit that referenced this pull request Sep 12, 2024
As originally discussed in #315 and then in #329 we want to unify
features that the toolchains cannot support independently. In the
case of ls64 I have attempted to split the lumped feature in the
compiler (see llvm/llvm-project#101712),
but unfortunately this would break backwards compatibility:

Mapping 'ls64' to FeatureLS64_ACCDATA would enable all three of
{FeatureLS64, FeatureLS64_V, FeatureLS64_ACCDATA} but then using
'nols64' either on the command line or the assembler directive
would only disable FeatureLS64_ACCDATA without disabling the
other two. For that we would have to map 'ls64' to FeatureLS64,
but then it would not enable the other two.

As far as I am aware there are no hardware implementations out
there which implement ls64 without ls64_v or ls64_accdata, so
unifying these features in the specification should not be a
regression for feature detection. If any of this changes in
the feature we can revisit the specification.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants