[FMV] Runtime Resolver Function #74

Merged
merged 1 commit into from
Sep 25, 2024

Conversation

BeMg
Contributor

@BeMg BeMg commented Apr 19, 2024

This PR proposes a runtime resolver function that retrieves the environment information. Since this resolver function is expected to be available and interchangeable for both libgcc and compiler-rt, a formal specification for the resolver function interface is necessary.


When generating the resolver function for function multiversioning, a mechanism is necessary to obtain the environment information.

To achieve this goal, several steps need to be taken:

  1. Collect the required extensions for a particular function.
  2. Transform these required extensions into a platform-dependent form.
  3. Query whether the environment fulfills these requirements during runtime.

Step 1 is handled by the compiler, while step 3 must use whatever query mechanism the platform provides at runtime.

This RFC aims to propose how the compiler and runtime function can tackle step 2.

Here is an example:

__attribute__((target_clones("default", "arch=rv64gcv"))) int bar() {
    return 1;
}

In this example, there are two versions of the function bar: one for default and another for arch=rv64gcv.

If the environment meets the requirements, then bar can utilize the arch=rv64gcv version. Otherwise, it will invoke the default version.

This process is controlled by the ifunc resolver function:

ptr bar.resolver() {
   if (isFulFill(...))
      return bar.arch=rv64gcv;
   return bar.default;
}

The isFulFill function must be available at program runtime.
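
The selection logic above can be sketched in plain C with function pointers instead of a real ifunc. This is only an illustration: `bar_default`, `bar_rv64gcv`, `env_has_rv64gcv`, and `is_fulfilled` are hypothetical names, and the two versions return different values purely so that the choice is observable.

```c
#include <stdbool.h>

typedef int (*bar_fn)(void);

/* Hypothetical stand-ins for the two clones the compiler would emit. */
int bar_default(void) { return 1; }
int bar_rv64gcv(void) { return 2; }

/* Hypothetical probe result; a real resolver would query the platform. */
bool env_has_rv64gcv = false;

bool is_fulfilled(void) { return env_has_rv64gcv; }

/* Mirrors bar.resolver(): return the best version the environment supports. */
bar_fn bar_resolver(void) {
    if (is_fulfilled())
        return bar_rv64gcv;
    return bar_default;
}
```

With a real ifunc, the dynamic loader calls the resolver once and binds `bar` to the returned pointer, so later calls pay no selection cost.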

The arch=rv64gcv version requires:

i, m, a, f, d, c, v, zicsr, zifencei, zve32f, zve32x, zve64d, zve64f, zve64x, zvl128b, zvl32b, zvl64b

Step 2 is about where to maintain the relationship between extension names and platform-dependent probe forms.

Here are three possible approaches to achieve this goal.

  1. Encode all required extensions into a string format, then let the platform implement its own probe approach based on the string inside the runtime function. This approach maintains the relationship between extension names and platform-dependent probe forms inside the runtime function.
ptr bar.resolver() {
   if (isFulFill("i_m_a_f_d_c_v_zicsr_zifencei_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b"))
      return bar.arch=rv64gcv;
   return bar.default;
}

bool isFulFill(char *ReqExts) {
    if (isLinux())
       return doLinuxRISCVExtensionProbe(ReqExts);
    if (isFreeBSD())
       return doFreeBSDRISCVExtensionProbe(ReqExts);
    // Other platform
    ....
    return false;
}
  • Pros
    • Human readable
    • Relatively high portability
    • Provides a uniform interface for all platforms
  • Cons
    • Requires extra effort for string processing in the runtime function.
  2. Encode all required extensions into a compiler-defined key, then let the platform implement its own probe approach inside the runtime. This approach maintains the relationship between the compiler-defined key for extensions and the platform-dependent probe form inside the runtime function.
// Assume the compiler defines:
// i -> 1
// m -> 2
...

ptr bar.resolver() {
   if (isFulFill([1, 2, 3, 8, ...], length))
      return bar.arch=rv64gcv;
   return bar.default;
}

bool isFulFill(int *ReqExts, int length) {
    if (isLinux())
       return doLinuxRISCVExtensionProbe(ReqExts, length);
    if (isFreeBSD())
       return doFreeBSDRISCVExtensionProbe(ReqExts, length);
    // Other platform
    ....
    return false;
}
  • Pros
    • Doesn't require string processing during runtime
    • Provides a uniform interface for all platforms
  • Cons
    • Requires maintaining the relationship between the compiler-defined key for extensions and the concrete extension names inside runtime function.
  3. Define a different runtime function for each platform and construct any platform-specific information at compile time if needed. This approach maintains the relationship between extension names and platform-dependent probe forms inside the compiler.
// If the compiler targets Linux, use bar.resolver.linux
ptr bar.resolver.linux() {
   if (isFulFillLinux(LinuxProbeObject))
      return bar.arch=rv64gcv;
   return bar.default;
}

ptr bar.resolver.freebsd() {
   if (isFulFillFreeBSD(FreeBSDProbeObject))
      return bar.arch=rv64gcv;
   return bar.default;
}

// Other platform bar.resolver
...

bool isFulFillLinux(LinuxProbeObject Obj) {
   return doLinuxProbe(Obj);
}

bool isFulFillFreeBSD(FreeBSDProbeObject Obj) {
   return doFreeBSDProbe(Obj);
}

// Other platform isFulFill
...

  • Pros
    • Relatively simple implementation for the runtime function
  • Cons
    • Does not provide a uniform interface for all platforms
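
To make the trade-off between approaches 1 and 2 concrete, here is a rough C sketch of both probe styles. Everything in it is an assumption for illustration: the enabled-extension list, the key numbering, and the function names `is_fulfilled_str` and `is_fulfilled_keys` do not come from any real runtime library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe results; a real runtime would fill these per platform. */
static const char *const enabled_names[] = {"i", "m", "a", "zicsr"};
enum { KEY_I = 1, KEY_M = 2, KEY_A = 3, KEY_V = 8, KEY_LIMIT = 64 };
static const bool enabled_keys[KEY_LIMIT] = {
    [KEY_I] = true, [KEY_M] = true, [KEY_A] = true
};

/* Approach 1: every '_'-separated name in req must be enabled. */
bool is_fulfilled_str(const char *req) {
    while (*req != '\0') {
        size_t len = strcspn(req, "_");
        bool found = false;
        for (size_t i = 0; i < sizeof enabled_names / sizeof enabled_names[0]; i++) {
            if (strlen(enabled_names[i]) == len &&
                memcmp(enabled_names[i], req, len) == 0) {
                found = true;
                break;
            }
        }
        if (!found)
            return false;
        req += len;
        if (*req == '_')
            req++;  /* skip the separator */
    }
    return true;
}

/* Approach 2: every compiler-defined key in req must be enabled. */
bool is_fulfilled_keys(const int *req, size_t length) {
    for (size_t i = 0; i < length; i++) {
        if (req[i] <= 0 || req[i] >= KEY_LIMIT || !enabled_keys[req[i]])
            return false;
    }
    return true;
}
```

The string version has to scan and compare names on every query, while the key version is a bounds check and an array load per extension, which is the cost difference discussed later in this thread.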

@BeMg
Contributor Author

BeMg commented Apr 19, 2024

@BeMg
Contributor Author

BeMg commented Apr 19, 2024

cc @kito-cheng

@topperc
Contributor

topperc commented Apr 19, 2024

Is two word "FullFill" supposed to be the single word "Fulfill"?

@topperc
Contributor

topperc commented Apr 20, 2024

Do we intend to support __builtin_cpu_supports which is built on the same interface as function multiversioning on other targets like X86? That will require a reasonably fast query mechanism. String processing may be too much for that.

@BeMg
Contributor Author

BeMg commented Apr 22, 2024

Is two word "FullFill" supposed to be the single word "Fulfill"?

Oops, I think there is a typo here. Updated.

@BeMg
Contributor Author

BeMg commented Apr 22, 2024

Do we intend to support __builtin_cpu_supports which is built on the same interface as function multiversioning on other targets like X86? That will require a reasonably fast query mechanism. String processing may be too much for that.

If we only allow one extension per call, does that provide a reasonably fast query mechanism? Or must support be determined by some kind of bit operation?

For example, the compiler could generate this resolver function based on __builtin_cpu_supports, and compiler-rt/libgcc could use method 1 to implement __builtin_cpu_supports.

ptr bar.resolver() {
   if (__builtin_cpu_supports("i") && 
       __builtin_cpu_supports("m") && 
       __builtin_cpu_supports("a") && 
       __builtin_cpu_supports("f") && 
       __builtin_cpu_supports("d") && 
       __builtin_cpu_supports("c") && 
       __builtin_cpu_supports("v") && 
       __builtin_cpu_supports("zicsr") && 
...
       __builtin_cpu_supports("zvl64b"))
      return bar.arch=rv64gcv;
   return bar.default;
}

@topperc
Contributor

topperc commented Apr 22, 2024

Do we intend to support __builtin_cpu_supports which is built on the same interface as function multiversioning on other targets like X86? That will require a reasonably fast query mechanism. String processing may be too much for that.

If we only allow one extension per call, does that provide a reasonably fast query mechanism? Or must support be determined by some kind of bit operation?

For example, the compiler could generate this resolver function based on __builtin_cpu_supports, and compiler-rt/libgcc could use method 1 to implement __builtin_cpu_supports.


ptr bar.resolver() {
   if (__builtin_cpu_supports("i") &&
       __builtin_cpu_supports("m") &&
       __builtin_cpu_supports("a") &&
       __builtin_cpu_supports("f") &&
       __builtin_cpu_supports("d") &&
       __builtin_cpu_supports("c") &&
       __builtin_cpu_supports("v") &&
       __builtin_cpu_supports("zicsr") &&
...
       __builtin_cpu_supports("zvl64b"))
      return bar.arch=rv64gcv;
   return bar.default;
}

My concern is that each time you pass a string into the compiler-rt interface, it will need to execute multiple strcmps to compare the input string against every extension name the library knows about to figure out which extension is being asked for. That gets expensive if called very often.

On x86, __builtin_cpu_supports calls the library the first time to update some global variables. After that, each query is a load and a bit test.
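
The "probe once, then load and bit-test" pattern described above could look roughly like this in C. This is not the x86 implementation; `slow_probe`, `cpu_supports_bit`, and the bit positions are hypothetical, and the sketch ignores thread safety.

```c
#include <stdbool.h>

static unsigned long long cached_features;  /* one bit per extension */
static bool cache_ready;
int probe_calls;  /* counts slow-path probes, for illustration only */

/* Hypothetical slow probe; stands in for a system call or library query. */
static unsigned long long slow_probe(void) {
    probe_calls++;
    return (1ULL << 0) | (1ULL << 5);
}

/* First call fills the cache; later calls are a load and a bit test. */
bool cpu_supports_bit(unsigned bit) {
    if (!cache_ready) {
        cached_features = slow_probe();
        cache_ready = true;
    }
    return bit < 64 && ((cached_features >> bit) & 1ULL) != 0;
}
```

No string comparison happens on the hot path, which is why a bit-based interface suits frequently called queries.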

@jrtc27

jrtc27 commented May 8, 2024

If you use a sensible data structure like a trie you can do it linearly in the length of the input string
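
As a rough illustration of the trie idea, the following C sketch looks up an extension name in a single pass over its characters. The fixed-size node pool, the lowercase-plus-digits alphabet, and the names `trie_insert` and `trie_lookup` are all assumptions made for this example.

```c
#include <stddef.h>

#define ALPHABET 36    /* 'a'-'z' followed by '0'-'9' */
#define MAX_NODES 256

struct trie_node {
    int next[ALPHABET];  /* child index; 0 means "no child" (node 0 is the root) */
    int id_plus1;        /* extension id + 1, or 0 if no name ends here */
};

static struct trie_node nodes[MAX_NODES];
static int node_count = 1;  /* node 0 is the root */

static int slot(char ch) {
    if (ch >= 'a' && ch <= 'z') return ch - 'a';
    if (ch >= '0' && ch <= '9') return 26 + (ch - '0');
    return -1;
}

/* Insert name with a caller-chosen id (>= 0). Returns 0 on success, -1 on error. */
int trie_insert(const char *name, int id) {
    int cur = 0;
    for (; *name != '\0'; name++) {
        int s = slot(*name);
        if (s < 0) return -1;
        if (nodes[cur].next[s] == 0) {
            if (node_count >= MAX_NODES) return -1;
            nodes[cur].next[s] = node_count++;
        }
        cur = nodes[cur].next[s];
    }
    nodes[cur].id_plus1 = id + 1;
    return 0;
}

/* One step per input character; returns the id, or -1 if the name is unknown. */
int trie_lookup(const char *name) {
    int cur = 0;
    for (; *name != '\0'; name++) {
        int s = slot(*name);
        if (s < 0 || nodes[cur].next[s] == 0) return -1;
        cur = nodes[cur].next[s];
    }
    return nodes[cur].id_plus1 - 1;
}
```

Lookup cost depends only on the length of the queried name, not on how many extension names the library knows.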

@BeMg
Contributor Author

BeMg commented May 24, 2024

To improve both performance (compared with the string-based approach) and portability (compared with the hwprobe-based approach), I have updated the runtime interface with a new layer for each queryable extension. This approach is similar to approach 2 in the PR description. This comment explains it with a concrete example using the IFUNC resolver function and __builtin_cpu_supports.

Two structures are defined in the runtime library to store the status of hardware-enabled extensions.

Each queryable extension has a unique bit position inside these structures that records whether it is enabled. For example, the enable bit for extension m could be stored at __riscv_feature_bits.features[0] & (1 << 5).

struct {
    unsigned length;
    unsigned long long features[MAXLENGTH];
} __riscv_feature_bits;

struct {
    unsigned vendorID;
    unsigned length;
    unsigned long long features[MAXLENGTH];
} __riscv_vendor_feature_bits;

Additionally, there is a function to initialize these two structures using a system-provided mechanism:

void __init_riscv_features_bit();

In summary, this approach uses __riscv_feature_bits and __riscv_vendor_feature_bits to represent whether an extension is enabled. They are initialized by __init_riscv_features_bit. Both structures are defined in compiler-rt/libgcc.


When the compiler emits the IFUNC resolver function, it can use these structures to check whether all extension requirements are fulfilled.

Here is a simple example for a resolver:

// -target-feature +i
__attribute__((target_clones("default", "arch=rv64im"))) int foo1(void) {
  return 1;
}

func_ptr foo1.resolver() {
    __init_riscv_features_bit();
    if (MAX_QUERY_LENGTH > __riscv_feature_bits.length)
        raise_error();

    // Try arch=rv64im
    unsigned long long rv64im_require_feature_0 = constant_build_during_compilation_time();
    unsigned long long rv64im_require_feature_1 = constant_build_during_compilation_time();
    ...
    if (((rv64im_require_feature_0 & __riscv_feature_bits.features[0]) == rv64im_require_feature_0) &&
        ((rv64im_require_feature_1 & __riscv_feature_bits.features[1]) == rv64im_require_feature_1) &&
        ...)
        return foo1.rv64im;

    return foo1.default;
}
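
The mask comparison in the resolver above can be written as a small, compilable C helper. This is a sketch under assumptions: `struct feature_bits`, `FEATURE_GROUPS`, and `requirements_met` are illustrative names, the bit positions in any caller are made up, and a too-short `length` is reported as "not met" rather than raising an error.

```c
#include <stdbool.h>

#define FEATURE_GROUPS 2  /* hypothetical; stands in for MAXLENGTH */

struct feature_bits {
    unsigned length;  /* number of valid groups, assumed <= FEATURE_GROUPS */
    unsigned long long features[FEATURE_GROUPS];
};

/* True if every bit set in required[g] is also set in have->features[g]. */
bool requirements_met(const struct feature_bits *have,
                      const unsigned long long *required, unsigned groups) {
    if (groups > have->length)
        return false;  /* the runtime knows fewer groups than the query needs */
    for (unsigned g = 0; g < groups; g++) {
        if ((required[g] & have->features[g]) != required[g])
            return false;
    }
    return true;
}
```

Because the required masks are compile-time constants, the whole check reduces to a few loads, ANDs, and compares per candidate version.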

@jrtc27

jrtc27 commented May 24, 2024

Who's specifying which bit is what?

@BeMg
Contributor Author

BeMg commented May 24, 2024

My idea is that the bit assignments are only meaningful to the runtime function and to a compiler that uses __riscv_feature_bits. For function multiversioning, I will allocate non-colliding bits for extensions, and those assignments will remain unchanged. When a new extension is used by function multiversioning, it is given an available bit, or the size of __riscv_feature_bits.features is extended. Vendor extensions are guarded by vendorID, so each vendor can allocate its own bits without colliding with other vendors' extensions.

The remaining problem is how to synchronize the extension bitmask across LLVM, compiler-rt, GCC, and libgcc. I don't have a solution for this yet.

@kito-cheng Any ideas on how we can achieve this synchronization?
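
One way to picture the shared allocation is a single table that maps each extension name to a group index and a bit position, from which the compiler builds its requirement masks. The sketch below is only an illustration: the table contents, the extension name `zexample`, and the function `add_requirement` are hypothetical and do not reflect any agreed assignment.

```c
#include <stddef.h>
#include <string.h>

#define GROUPS 2

struct ext_slot {
    const char *name;
    unsigned group;  /* index into the features array */
    unsigned bit;    /* bit position inside that group, 0..63 */
};

/* Hypothetical allocation, for illustration only. */
static const struct ext_slot ext_table[] = {
    {"a", 0, 0}, {"c", 0, 2}, {"d", 0, 3}, {"f", 0, 5},
    {"i", 0, 8}, {"m", 0, 12}, {"v", 0, 21}, {"zexample", 1, 0},
};

/* Set the bit for name in masks[]. Returns 0 on success, -1 if name is unknown. */
int add_requirement(unsigned long long masks[GROUPS], const char *name) {
    for (size_t i = 0; i < sizeof ext_table / sizeof ext_table[0]; i++) {
        if (strcmp(ext_table[i].name, name) == 0) {
            masks[ext_table[i].group] |= 1ULL << ext_table[i].bit;
            return 0;
        }
    }
    return -1;
}
```

If the compilers and runtime libraries all generate their tables from one published list, the name lookup happens only at compile time and the runtime sees nothing but masks.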

@BeMg
Contributor Author

BeMg commented Jun 3, 2024

Update: add the extension groupid/bitmask definitions for synchronization across LLVM, compiler-rt, GCC, and libgcc.


cc @kito-cheng @topperc

@kito-cheng
Collaborator

This proposal got positive feedback from the RISC-V GNU community :)

BeMg added a commit to BeMg/llvm-project that referenced this pull request Jun 5, 2024
Base on riscv-non-isa/riscv-c-api-doc#74.

This patch defines the groupid/bitmask in RISCVFeatures.td and generates the corresponding table in RISCVTargetParserDef.inc.

The groupid/bitmask of extensions provides an abstraction layer between the compiler and runtime functions.
BeMg added a commit to BeMg/llvm-project that referenced this pull request Jun 11, 2024
…ature_bits/__init_riscv_features_bit

Base on riscv-non-isa/riscv-c-api-doc#74, this patch defines the __riscv_feature_bits and __riscv_vendor_feature_bits structures to store the enabled feature bits at runtime.

It also introduces the __init_riscv_features_bit function to update these structures based on the platform query mechanism.

Additionally, the groupid/bitmask definitions from riscv-non-isa/riscv-c-api-doc#74 are declared and used to update the __riscv_feature_bits and __riscv_vendor_feature_bits structures.
@BeMg BeMg requested review from lenary, topperc and kito-cheng June 18, 2024 07:02
@BeMg
Contributor Author

BeMg commented Aug 1, 2024

There are two updates:

  1. The function __init_riscv_feature_bits has been updated with an extra parameter. This new argument allows the platform to pass pre-computed results for platform feature information.
  2. A new structure has been defined for CSR-related values (mVendorID, mArchID, mImplID).
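
A rough C sketch of how an init function with an optional pre-computed argument might behave is shown below. The structure layouts, the field types, `struct platform_info`, `query_platform`, and the un-prefixed function name `init_feature_bits` are all assumptions for illustration; the specification text defines the real interface.

```c
#include <stddef.h>

/* Stand-ins for the structures discussed in this PR; layouts are illustrative. */
struct feature_bits { unsigned length; unsigned long long features[2]; };
struct cpu_model { unsigned mvendorid; unsigned long long marchid; unsigned long long mimpid; };

struct feature_bits g_feature_bits;
struct cpu_model g_cpu_model;

/* Hypothetical layout for pre-computed data that a platform could pass in. */
struct platform_info {
    unsigned long long group0;  /* feature bits for group 0 */
    unsigned mvendorid;
    unsigned long long marchid;
    unsigned long long mimpid;
};

/* Hypothetical slow path, standing in for a system call. */
static struct platform_info query_platform(void) {
    struct platform_info info = {1ULL << 8, 0, 0, 0};
    return info;
}

/* If platform_args is NULL, probe; otherwise use the pre-computed data. */
void init_feature_bits(void *platform_args) {
    struct platform_info info;
    if (platform_args != NULL)
        info = *(const struct platform_info *)platform_args;
    else
        info = query_platform();

    g_feature_bits.length = 2;
    g_feature_bits.features[0] = info.group0;
    g_feature_bits.features[1] = 0;
    g_cpu_model.mvendorid = info.mvendorid;
    g_cpu_model.marchid = info.marchid;
    g_cpu_model.mimpid = info.mimpid;
}
```

The opaque pointer keeps the function signature identical across platforms while still letting a platform skip the slow path when it already has the data.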

@BeMg
Contributor Author

BeMg commented Aug 1, 2024

TODO:

  1. Allocate bit for latest hwprobe supported extension
  2. More description/example for vendor feature bit
  3. Mechanism to determine whether __init_riscv_feature_bits executed successfully

@BeMg
Contributor Author

BeMg commented Aug 2, 2024

TODO:

  1. Allocate bit for latest hwprobe supported extension

Added. The related LLVM PR is llvm/llvm-project#101632.

BeMg added a commit to llvm/llvm-project that referenced this pull request Aug 8, 2024
The spec can be found at
riscv-non-isa/riscv-c-api-doc#74.

1. Add the new extension GroupID/Bitmask with latest hwprobe key.
2. Update the `initRISCVFeature `
3. Update `EmitRISCVCpuSupports` due to not only group0 now.
Collaborator

@kito-cheng kito-cheng left a comment


I believe @BeMg has already addressed all concerns so far, except for the performance features/data that haven't been added yet. However, we don't have the corresponding HWprobe bits available at the moment. HWprobe is still using a bitmap to represent the speed of misaligned access (RISCV_HWPROBE_MISALIGNED_*), and the cache block size (RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE) has a corresponding extension (Zic64b) to represent it. So, there's nothing that we can't model for now. I guess we MAY need that in the future, but let's hold off on defining that interface until we find a use case for it.

freebsd-git pushed a commit to freebsd/freebsd-src that referenced this pull request Aug 22, 2024
GNU/Linux has historically had the following two resolver prototypes:

  1. Elf_Addr(uint64_t, void *)
  2. Elf_Addr(uint64_t, void *, void *)

For the former, AT_HWCAP is passed in the first argument, and NULL in
the second. For the latter, AT_HWCAP is still passed, and the second
argument is a pointer to their home-grown __riscv_hwprobe function.
Should they want to use the third argument in future, they'll have to
introduce yet another prototype to allow for later expansion, and then
all users will have to check whether the second argument is NULL to know
if the third argument really exists. This is all rather silly and will
surely prove fun in the face of type-checking CFI.

Instead, be like arm64 and just define all 8 possible general purpose
register arguments up front. To naive source code that forgets non-Linux
OSes exist this will be compatible with prototype 1 above, since the
second argument will be 0 and it won't look further (though should we
start using the second argument for something that wouldn't be true any
more and it might think it's __riscv_hwprobe, but that incompatibility
is one we can defer committing to, and can choose to never adopt).

Until the standard interface for querying extension information[1] is
settled and implemented in FreeBSD there's not much you can do in a
resolver other than use HWCAP_ISA_B, but this gets the infrastructure in
place for when that day comes.

[1] riscv-non-isa/riscv-c-api-doc#74

Reviewed by:	kib, mhorne
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D46278
@BeMg BeMg requested review from topperc and jrtc27 August 24, 2024 09:52
BeMg added a commit to llvm/llvm-project that referenced this pull request Aug 29, 2024
The spec could be found here
riscv-non-isa/riscv-c-api-doc#74

This patch updates the following symbol:

```
mVendorID -> mvendorid
mArchID -> marchid
mImplID -> mimpid
```
@kito-cheng
Collaborator

I think this is ready to land, but could you convert this PR into the adoc format, which is used in the main branch now?

@BeMg
Contributor Author

BeMg commented Sep 24, 2024

I think this is ready to land, but could you convert this PR into the adoc format, which is used in the main branch now?

Rebased; it now uses the adoc format.

This PR proposes an Extension Bitmask to represent environment information. Since this Extension Bitmask is expected to be available and interchangeable for both libgcc and compiler-rt, a formal specification for the Extension Bitmask interface is necessary.
@BeMg
Contributor Author

BeMg commented Sep 25, 2024

Squashed the commits and rewrote the commit message.

@kito-cheng
Collaborator

Let's move on!

@kito-cheng kito-cheng merged commit ac7ee30 into riscv-non-isa:main Sep 25, 2024

The `__init_riscv_feature_bits` function updates `length`, `mvendorid`, `marchid`, `mimpid` and the `features` in `__riscv_feature_bits` and `__riscv_vendor_feature_bits` according to the enabled extensions in the system.

The `__init_riscv_feature_bits` function accepts an argument of type `void *`. This argument allows the platform to provide pre-computed data and access it without additional effort. For example, Linux could pass the vDSO object to avoid an extra system call.

Who's expected to call this, and so who's expected to know what magic opaque blob to pass here? Surely you don't expect the compiler to be generating OS-specific code to get data out of the vDSO and pass it to the function?


Ping...

Collaborator

@BeMg ping


The `__init_riscv_feature_bits` function accepts an argument of type `void *`. This argument allows the platform to provide pre-computed data and access it without additional effort. For example, Linux could pass the vDSO object to avoid an extra system call.

NOTE: To detect failure of the `__init_riscv_feature_bits` function, it is recommended to check the bitmask for the `I` extension. The `I` extension must be supported in all valid RISC-V implementations.

That's not true? RVE exists.

Collaborator

Maybe we could say check "__riscv_feature_bits.features[0]" is non zero?


For now that works, until such time as a base extension other than I and E exists which therefore can't fit in misa, e.g. depending on how CHERI's defined that could end up true (though probably that should be represented in misa as I or E still being set). Alternatively you could just say the length is set to 0 on failure.

Collaborator

Alternatively you could just say the length is set to 0 on failure.

That's a good idea. @BeMg, could you create a PR for that?
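
The "length is 0 on failure" suggestion could be checked by callers along these lines. This sketch only illustrates the idea raised in this thread and is not the adopted specification text; `struct feature_bits`, the `probe_ok` parameter, and `has_feature` are hypothetical.

```c
#include <stdbool.h>

struct feature_bits { unsigned length; unsigned long long features[2]; };

/* Sketch of an init routine that signals failure by leaving length at 0. */
void init_feature_bits(struct feature_bits *fb, bool probe_ok,
                       unsigned long long probed_group0) {
    fb->length = 0;
    fb->features[0] = 0;
    fb->features[1] = 0;
    if (!probe_ok)
        return;  /* failure: length stays 0 */
    fb->features[0] = probed_group0;
    fb->length = 2;
}

/* Treat every feature as absent when initialization failed. */
bool has_feature(const struct feature_bits *fb, unsigned group, unsigned bit) {
    if (fb->length == 0 || group >= fb->length || bit >= 64)
        return false;
    return ((fb->features[group] >> bit) & 1ULL) != 0;
}
```

Checking `length` avoids depending on any particular base extension bit, so it keeps working for RVE or any future base ISA.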

cyyself pushed a commit to cyyself/gcc that referenced this pull request Sep 29, 2024
…d __riscv_vendor_feature_bits

This provides a common abstraction layer to probe the available extensions at
run-time. These functions can be used to implement function multi-versioning or
to detect available extensions.

The advantages of providing this abstraction layer are:
- Easy to port to other new platforms.
- Easier to maintain in GCC for function multi-versioning.
  - For example, maintaining platform-dependent code in C code/libgcc is much
    easier than maintaining it in GCC by creating GIMPLEs...

This API is intended to provide the capability to query minimal common available extensions on the system.

Proposal in riscv-c-api-doc: riscv-non-isa/riscv-c-api-doc#74

Full function multi-versioning implementation will come later. We are posting
this first because we intend to backport it to the GCC 14 branch to unblock
LLVM 19 to use this with GCC 14.2, rather than waiting for GCC 15.

Changes since v3:
- Fix non-linux build.
- Let __init_riscv_feature_bits become constructor

Changes since v2:
- Prevent it initialize more than once.

Changes since v1:
- Fix the format.
- Prevented race conditions by introducing a local variable to avoid load/store
  operations during the computation of the feature bit.

libgcc/ChangeLog:

	* config/riscv/feature_bits.c: New.
	* config/riscv/t-elf (LIB2ADD): Add feature_bits.c.
cyyself pushed a commit to cyyself/gcc that referenced this pull request Oct 5, 2024
…d __riscv_vendor_feature_bits

This provides a common abstraction layer to probe the available extensions at
run-time. These functions can be used to implement function multi-versioning or
to detect available extensions.

The advantages of providing this abstraction layer are:
- Easy to port to other new platforms.
- Easier to maintain in GCC for function multi-versioning.
  - For example, maintaining platform-dependent code in C code/libgcc is much
    easier than maintaining it in GCC by creating GIMPLEs...

This API is intended to provide the capability to query minimal common available extensions on the system.

Proposal in riscv-c-api-doc: riscv-non-isa/riscv-c-api-doc#74

Full function multi-versioning implementation will come later. We are posting
this first because we intend to backport it to the GCC 14 branch to unblock
LLVM 19 to use this with GCC 14.2, rather than waiting for GCC 15.

Changes since v5:
- Minor fixes on indentation.

Changes since v4:
- Bump to newest riscv-c-api-doc with some new extensions like Zve*, Zc*
  Zimop, Zcmop, Zawrs.
- Rename the return variable name of hwprobe syscall.
- Minor fixes on indentation.

Changes since v3:
- Fix non-linux build.
- Let __init_riscv_feature_bits become constructor

Changes since v2:
- Prevent it initialize more than once.

Changes since v1:
- Fix the format.
- Prevented race conditions by introducing a local variable to avoid load/store
  operations during the computation of the feature bit.

libgcc/ChangeLog:

	* config/riscv/feature_bits.c: New.
	* config/riscv/t-elf (LIB2ADD): Add feature_bits.c.

Co-Developed-by: Yangyu Chen <[email protected]>
Signed-off-by: Yangyu Chen <[email protected]>
bsdjhb pushed a commit to bsdjhb/cheribsd that referenced this pull request Nov 20, 2024