Create a learning path for Function Multiversioning. #1196

Merged
merged 2 commits into from
Sep 3, 2024

labrinea
Contributor

Before submitting a pull request for a new Learning Path, please review Create a Learning Path

  • I have reviewed Create a Learning Path

Please do not include any confidential information in your contribution. This includes confidential microarchitecture details and unannounced product information. No AI tool can be used to generate either content or code when creating a learning path or install guide.

  • I have checked my contribution for confidential information

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the Creative Commons Attribution 4.0 International License.

@labrinea
Contributor Author

@pareenaverma @jroelofs can you take a look? Thanks

Can I implement versions of a function in separate translation units?
answers:
- Yes, function versions can be spread across different translation units.
- No, all of them must be in the same translation unit.

We did that for clang, but (and I don't know if you care...) did gcc get the same feature?

Contributor Author

We do care about GCC; this learning path is for both toolchains. We made this design decision part of ACLE, so GCC will have to adhere to it. Adding @andrewcarlotti for more visibility.


#### Differences between GCC 14 and LLVM 19 implementations

- The attribute `target_version` in GCC is only supported for C++, not for C.

Do you also want to mention target_clones here?

Contributor Author

No, target_clones is supported for C as well in gcc-14.


#### Static resolution with GCC

The GCC compiler optimizes calls to versioned functions when they can be statically resolved. Such calls would otherwise be routed through the resolver; instead they become direct calls, which allows them to be inlined. This may be possible whenever the calling function is compiled with a sufficiently high set of architecture features (this includes features enabled by the `target`/`target_version`/`target_clones` attributes as well as by command line options). LLVM is not yet able to perform this optimization.

> LLVM is not yet able to perform this optimization.

We did for a few cases at one point. Did that get reverted?

Contributor Author

Do you mean llvm/llvm-project#80606? That was not reverted, but it is also not quite the same as what GCC does. Also, after llvm/llvm-project#99522 I don't think the clang frontend can generate a trivial resolver anymore (which was the only case LLVM knows how to optimise).

* Fixed upper half of box in comment
* Renamed foo to sumPosEltsScaledByIndex in examples
* Added missing comma in resolver's emission section
@jasonrandrews
Contributor

I'll start the editorial review. Thanks for submitting.

@jasonrandrews merged commit e9a3476 into ArmDeveloperEcosystem:main on Sep 3, 2024

@andrewcarlotti left a comment

Just a couple of small nits that you might want to improve, but otherwise looks good to me.

## Semantics
Function Multiversioning allows compiling several implementations of a function into the same binary and then selecting the most appropriate version at runtime. The intention is to take advantage of hardware features for accelerating your application at function level granularity.

To specify a function version you can annotate its declaration with either of `__attribute__((target_version("name")))` or `__attribute__((target_clones("name",...)))`, where "name" denotes one or more architectural features separated by '+'. This annotation implies a dependency between a function version and the feature set it is specified for. The compiler generates optimized versions of the function for the specified targets. The `target_clones` attribute behaves just like `target_version` but specifies multiple versions for the same function definition. The former is perhaps suitable for functions that the compiler can optimize differently depending on the requested features, whereas the latter allows the user to manually optimize a version at the source level using intrinsics or inline assembly.

In the last sentence, it's unclear whether "former" and "latter" refer to the ordering in the previous sentence or the ordering at the start of the paragraph.

You could append 'or the string "default"' to make it clearer how the "default" version is specified.
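
For reference, the two attributes described in the Semantics paragraph can be sketched roughly as follows. This is a minimal illustration, not the learning path's own example: the body of `sumPosEltsScaledByIndex` is guessed from its name, and `sumImpl`, `sumAbsElts`, and `FMV_ENABLED` are hypothetical helpers. The guard assumes that `__HAVE_FUNCTION_MULTI_VERSIONING` is the ACLE feature-test macro, and it conservatively compiles the attributes out for GCC in C mode, where `target_version` is said above to be unsupported.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: __HAVE_FUNCTION_MULTI_VERSIONING is the ACLE feature-test macro.
   The attributes are compiled out on other targets, and for GCC in C mode. */
#if defined(__aarch64__) && defined(__HAVE_FUNCTION_MULTI_VERSIONING) && \
    (defined(__clang__) || defined(__cplusplus))
#define FMV_ENABLED 1
#else
#define FMV_ENABLED 0
#endif

/* Hypothetical shared body: sums v[i] * i over the positive elements. */
static inline int sumImpl(const int *v, unsigned n) {
  int sum = 0;
  for (unsigned i = 0; i < n; ++i)
    if (v[i] > 0)
      sum += v[i] * (int)i;
  return sum;
}

#if FMV_ENABLED
/* target_version: one separately written definition per feature set. */
__attribute__((target_version("default")))
int sumPosEltsScaledByIndex(const int *v, unsigned n) {
  assert(v != NULL);
  return sumImpl(v, n);
}

__attribute__((target_version("sve")))
int sumPosEltsScaledByIndex(const int *v, unsigned n) {
  assert(v != NULL);
  /* A hand-optimized version would use intrinsics or inline assembly here. */
  return sumImpl(v, n);
}
#else
/* Fallback: a single plain definition when multiversioning is unavailable. */
int sumPosEltsScaledByIndex(const int *v, unsigned n) {
  assert(v != NULL);
  return sumImpl(v, n);
}
#endif

/* target_clones: one definition, compiled once per listed feature set. */
#if FMV_ENABLED
__attribute__((target_clones("sve2", "default")))
#endif
int sumAbsElts(const int *v, unsigned n) {
  assert(v != NULL);
  int sum = 0;
  for (unsigned i = 0; i < n; ++i)
    sum += v[i] < 0 ? -v[i] : v[i];
  return sum;
}
```

Callers use the plain names `sumPosEltsScaledByIndex` and `sumAbsElts`; which version actually runs is decided at runtime when multiversioning is enabled. Every version in this sketch returns the same result, so the behaviour does not depend on the hardware.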

A hardware platform may support multiple architectural features from the dependency sets, or it may not support any. Therefore Function Multiversioning provides a convenient way to select the most appropriate version of a function at runtime. The selection is permanent for the lifetime of the process and works as follows:
- Select the most specific version (the one with most features), else

This description is imprecise, and we're planning to clarify and slightly modify the prioritization criteria anyway.
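
To make the first rule quoted above concrete, here is a small model of "select the version with the most features" in plain C. It is only a sketch of the wording under review, not the actual resolver and not the ACLE prioritization, which the comment above says is going to change. The feature bits, the `Version` struct, and `pickVersion` are all hypothetical, and treating a version with no required features as the default is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits; real feature names and detection differ. */
enum { FEAT_SIMD = 1 << 0, FEAT_SVE = 1 << 1, FEAT_SVE2 = 1 << 2 };

typedef struct {
  const char *name;   /* label for illustration only */
  unsigned required;  /* features this version depends on; 0 = default */
} Version;

/* Counts the set bits, i.e. the number of required features. */
static unsigned countBits(unsigned x) {
  unsigned c = 0;
  for (; x != 0; x &= x - 1)
    ++c;
  return c;
}

/* Returns the index of the usable version that requires the most features,
   or -1 if no version is usable. A version is usable when the hardware
   mask hw contains all of its required features. */
static int pickVersion(const Version *vs, size_t count, unsigned hw) {
  assert(vs != NULL);
  int best = -1;
  unsigned bestBits = 0;
  for (size_t i = 0; i < count; ++i) {
    if ((vs[i].required & ~hw) != 0)
      continue;  /* the hardware lacks a required feature */
    unsigned bits = countBits(vs[i].required);
    if (best < 0 || bits > bestBits) {
      best = (int)i;
      bestBits = bits;
    }
  }
  return best;
}
```

With versions `default`, `sve`, and `sve2+simd`, a platform that reports only SVE2 gets the default version in this model, because neither specialised version has all of its dependencies satisfied. A real resolver would also run only once, since the selection is permanent for the lifetime of the process.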


4 participants