[AArch64] Add soft-float ABI #74460

ostannard · 2023-12-05T12:30:02Z

This adds support for the AArch64 soft-float ABI. The specification for this ABI is currently in review at ARM-software/abi-aa#232, and I won't commit this until that PR is merged.

Because all existing AArch64 hardware has floating-point hardware, we expect this to be a niche option, only used for embedded systems on R-profile systems. We are going to document that SysV-like systems should only ever use the base (hard-float) PCS variant: ARM-software/abi-aa#233. For that reason, I've not added an option to select the ABI independently of the FPU hardware, instead the new ABI is enabled iff the target architecture does not have an FPU.

For testing, I have run this through an ABI fuzzer, but since this is the first implementation it can only test for internal consistency (callers and callees agree on the PCS), not for conformance to the ABI spec.

llvmbot · 2023-12-05T12:30:29Z

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: None (ostannard)

Changes

This adds support for the AArch64 soft-float ABI. The specification for this ABI is currently in review at ARM-software/abi-aa#232, and I won't commit this until that PR is merged.

Because all existing AArch64 hardware has floating-point hardware, we expect this to be a niche option, only used for embedded systems on R-profile systems. We are going to document that SysV-like systems should only ever use the base (hard-float) PCS variant: ARM-software/abi-aa#233. For that reason, I've not added an option to select the ABI independently of the FPU hardware, instead the new ABI is enabled iff the target architecture does not have an FPU.

For testing, I have run this through an ABI fuzzer, but since this is the first implementation it can only test for internal consistency (callers and callees agree on the PCS), not for conformance to the ABI spec.

Patch is 22.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74460.diff

6 Files Affected:

(modified) clang/lib/Basic/Targets/AArch64.cpp (+2-1)
(modified) clang/lib/CodeGen/CodeGenModule.cpp (+2)
(modified) clang/lib/CodeGen/TargetInfo.h (+1)
(modified) clang/lib/CodeGen/Targets/AArch64.cpp (+12-5)
(added) clang/test/CodeGen/aarch64-soft-float-abi.c (+53)
(modified) clang/test/CodeGen/attr-target-clones-aarch64.c (+105-21)

diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index c31f2e0bee543..23090dad83ad7 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -649,7 +649,8 @@ bool AArch64TargetInfo::hasFeature(StringRef Feature) const {
   return llvm::StringSwitch<bool>(Feature)
       .Cases("aarch64", "arm64", "arm", true)
       .Case("fmv", HasFMV)
-      .Cases("neon", "fp", "simd", FPU & NeonMode)
+      .Case("fp", FPU & FPUMode)
+      .Cases("neon", "simd", FPU & NeonMode)
       .Case("jscvt", HasJSCVT)
       .Case("fcma", HasFCMA)
       .Case("rng", HasRandGen)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index dea58a7ff4146..2e730fdb0b83f 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -143,6 +143,8 @@ createTargetCodeGenInfo(CodeGenModule &CGM) {
       Kind = AArch64ABIKind::DarwinPCS;
     else if (Triple.isOSWindows())
       return createWindowsAArch64TargetCodeGenInfo(CGM, AArch64ABIKind::Win64);
+    else if (!Target.hasFeature("fp"))
+      Kind = AArch64ABIKind::AAPCSSoft;
 
     return createAArch64TargetCodeGenInfo(CGM, Kind);
   }
diff --git a/clang/lib/CodeGen/TargetInfo.h b/clang/lib/CodeGen/TargetInfo.h
index 0c0781a2d5ab9..0b69d92b70cee 100644
--- a/clang/lib/CodeGen/TargetInfo.h
+++ b/clang/lib/CodeGen/TargetInfo.h
@@ -416,6 +416,7 @@ enum class AArch64ABIKind {
   AAPCS = 0,
   DarwinPCS,
   Win64,
+  AAPCSSoft,
 };
 
 std::unique_ptr<TargetCodeGenInfo>
diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp b/clang/lib/CodeGen/Targets/AArch64.cpp
index be5145daa00b7..a4089bb6c70f1 100644
--- a/clang/lib/CodeGen/Targets/AArch64.cpp
+++ b/clang/lib/CodeGen/Targets/AArch64.cpp
@@ -53,8 +53,8 @@ class AArch64ABIInfo : public ABIInfo {
   Address EmitDarwinVAArg(Address VAListAddr, QualType Ty,
                           CodeGenFunction &CGF) const;
 
-  Address EmitAAPCSVAArg(Address VAListAddr, QualType Ty,
-                         CodeGenFunction &CGF) const;
+  Address EmitAAPCSVAArg(Address VAListAddr, QualType Ty, CodeGenFunction &CGF,
+                         AArch64ABIKind Kind) const;
 
   Address EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
                     QualType Ty) const override {
@@ -65,7 +65,7 @@ class AArch64ABIInfo : public ABIInfo {
 
     return Kind == AArch64ABIKind::Win64 ? EmitMSVAArg(CGF, VAListAddr, Ty)
            : isDarwinPCS()               ? EmitDarwinVAArg(VAListAddr, Ty, CGF)
-                                         : EmitAAPCSVAArg(VAListAddr, Ty, CGF);
+                                         : EmitAAPCSVAArg(VAListAddr, Ty, CGF, Kind);
   }
 
   Address EmitMSVAArg(CodeGenFunction &CGF, Address VAListAddr,
@@ -478,6 +478,11 @@ bool AArch64SwiftABIInfo::isLegalVectorType(CharUnits VectorSize,
 }
 
 bool AArch64ABIInfo::isHomogeneousAggregateBaseType(QualType Ty) const {
+  // For the soft-float ABI variant, no types are considered to be homogeneous
+  // aggregates.
+  if (Kind == AArch64ABIKind::AAPCSSoft)
+    return false;
+
   // Homogeneous aggregates for AAPCS64 must have base types of a floating
   // point type or a short-vector type. This is the same as the 32-bit ABI,
   // but with the difference that any floating-point type is allowed,
@@ -509,7 +514,8 @@ bool AArch64ABIInfo::isZeroLengthBitfieldPermittedInHomogeneousAggregate()
 }
 
 Address AArch64ABIInfo::EmitAAPCSVAArg(Address VAListAddr, QualType Ty,
-                                       CodeGenFunction &CGF) const {
+                                       CodeGenFunction &CGF,
+                                       AArch64ABIKind Kind) const {
   ABIArgInfo AI = classifyArgumentType(Ty, /*IsVariadic=*/true,
                                        CGF.CurFnInfo->getCallingConvention());
   // Empty records are ignored for parameter passing purposes.
@@ -534,7 +540,8 @@ Address AArch64ABIInfo::EmitAAPCSVAArg(Address VAListAddr, QualType Ty,
     BaseTy = ArrTy->getElementType();
     NumRegs = ArrTy->getNumElements();
   }
-  bool IsFPR = BaseTy->isFloatingPointTy() || BaseTy->isVectorTy();
+  bool IsFPR = Kind == AArch64ABIKind::AAPCS &&
+               (BaseTy->isFloatingPointTy() || BaseTy->isVectorTy());
 
   // The AArch64 va_list type and handling is specified in the Procedure Call
   // Standard, section B.4:
diff --git a/clang/test/CodeGen/aarch64-soft-float-abi.c b/clang/test/CodeGen/aarch64-soft-float-abi.c
new file mode 100644
index 0000000000000..687647d650e1d
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-soft-float-abi.c
@@ -0,0 +1,53 @@
+// RUN: %clang_cc1 -triple aarch64 -target-feature +fp-armv8 -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,HARD
+// RUN: %clang_cc1 -triple aarch64 -target-feature -fp-armv8 -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,SOFT
+
+// The va_list type does not change between the ABIs
+// CHECK: %struct.__va_list = type { ptr, ptr, ptr, i32, i32 }
+
+// Floats are passed in integer registers, this will be handled by the backend.
+// CHECK: define dso_local half @test0(half noundef %a)
+// CHECK: define dso_local bfloat @test1(bfloat noundef %a)
+// CHECK: define dso_local float @test2(float noundef %a)
+// CHECK: define dso_local double @test3(double noundef %a)
+// CHECK: define dso_local fp128 @test4(fp128 noundef %a)
+__fp16 test0(__fp16 a) { return a; }
+__bf16 test1(__bf16 a) { return a; }
+float test2(float a) { return a; }
+double test3(double a) { return a; }
+long double test4(long double a) { return a; }
+
+// No types are considered to be HFAs or HVAs by the soft-float PCS, so these
+// are converted to integer types.
+struct A {
+  float x;
+};
+// SOFT: define dso_local i32 @test10(i64 %a.coerce)
+// HARD: define dso_local %struct.A @test10([1 x float] alignstack(8) %a.coerce)
+struct A test10(struct A a) { return a; }
+
+struct B {
+  double x;
+  double y;
+};
+// SOFT: define dso_local [2 x i64] @test11([2 x i64] %a.coerce)
+// HARD: define dso_local %struct.B @test11([2 x double] alignstack(8) %a.coerce)
+struct B test11(struct B a) { return a; }
+
+#include <stdarg.h>
+
+// For variadic arguments, va_arg will always retreive 
+// CHECK-LABEL: define dso_local double @test20(i32 noundef %a, ...)
+// CHECK: %vl = alloca %struct.__va_list, align 8
+// SOFT: %gr_offs_p = getelementptr inbounds %struct.__va_list, ptr %vl, i32 0, i32 3
+// SOFT: %reg_top_p = getelementptr inbounds %struct.__va_list, ptr %vl, i32 0, i32 1
+// HARD: %vr_offs_p = getelementptr inbounds %struct.__va_list, ptr %vl, i32 0, i32 4
+// HARD: %reg_top_p = getelementptr inbounds %struct.__va_list, ptr %vl, i32 0, i32 2
+double test20(int a, ...) {
+  va_list vl;
+  va_start(vl, a);
+  return va_arg(vl, double);
+}
+
+// Vector types are only available for targets with the correct hardware, and
+// their calling-convention is left undefined by the soft-float ABI, so they
+// aren't tested here.
diff --git a/clang/test/CodeGen/attr-target-clones-aarch64.c b/clang/test/CodeGen/attr-target-clones-aarch64.c
index 3f2f2fdd24e8a..4404dd0da8e5e 100644
--- a/clang/test/CodeGen/attr-target-clones-aarch64.c
+++ b/clang/test/CodeGen/attr-target-clones-aarch64.c
@@ -1,6 +1,6 @@
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-attributes --check-globals --include-generated-funcs
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -S -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature -fmv -S -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-NOFMV
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature -fp-armv8 -S -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature -fp-armv8 -target-feature -fmv -S -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-NOFMV
 
 int __attribute__((target_clones("lse+aes", "sve2"))) ftc(void) { return 0; }
 int __attribute__((target_clones("sha2", "sha2+memtag2", " default "))) ftc_def(void) { return 1; }
@@ -22,6 +22,8 @@ int __attribute__((target_clones("default"))) main() {
 inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))) ftc_inline2(void) { return 2; };
 
 
+
+//.
 // CHECK: @__aarch64_cpu_features = external dso_local global { i64 }
 // CHECK: @ftc.ifunc = weak_odr ifunc i32 (), ptr @ftc.resolver
 // CHECK: @ftc_def.ifunc = weak_odr ifunc i32 (), ptr @ftc_def.resolver
@@ -30,19 +32,25 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK: @ftc_inline1.ifunc = weak_odr ifunc i32 (), ptr @ftc_inline1.resolver
 // CHECK: @ftc_inline2.ifunc = weak_odr ifunc i32 (), ptr @ftc_inline2.resolver
 // CHECK: @ftc_inline3.ifunc = weak_odr ifunc i32 (), ptr @ftc_inline3.resolver
-
+//.
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc._MlseMaes(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 0
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc._Msve2(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 0
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 0
+//
+//
 // CHECK-LABEL: @ftc.resolver(
 // CHECK-NEXT:  resolver_entry:
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
@@ -63,18 +71,26 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    ret ptr @ftc._Msve2
 // CHECK:       resolver_else2:
 // CHECK-NEXT:    ret ptr @ftc
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_def._Msha2(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 1
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_def._Msha2Mmemtag2(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 1
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_def(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 1
+//
+//
 // CHECK-LABEL: @ftc_def.resolver(
 // CHECK-NEXT:  resolver_entry:
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
@@ -95,14 +111,20 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    ret ptr @ftc_def._Msha2
 // CHECK:       resolver_else2:
 // CHECK-NEXT:    ret ptr @ftc_def
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_dup1._Msha2(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 2
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_dup1(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 2
+//
+//
 // CHECK-LABEL: @ftc_dup1.resolver(
 // CHECK-NEXT:  resolver_entry:
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
@@ -115,18 +137,26 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    ret ptr @ftc_dup1._Msha2
 // CHECK:       resolver_else:
 // CHECK-NEXT:    ret ptr @ftc_dup1
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_dup2._Mfp(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 3
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_dup2._MdotprodMcrc(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 3
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_dup2(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 3
+//
+//
 // CHECK-LABEL: @ftc_dup2.resolver(
 // CHECK-NEXT:  resolver_entry:
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
@@ -147,6 +177,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    ret ptr @ftc_dup2._Mfp
 // CHECK:       resolver_else2:
 // CHECK-NEXT:    ret ptr @ftc_dup2
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @foo(
 // CHECK-NEXT:  entry:
@@ -158,10 +190,14 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    [[CALL4:%.*]] = call i32 @ftc_dup2.ifunc()
 // CHECK-NEXT:    [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
 // CHECK-NEXT:    ret i32 [[ADD5]]
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_direct(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 4
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @main(
 // CHECK-NEXT:  entry:
@@ -175,6 +211,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    [[CALL4:%.*]] = call i32 @ftc_direct()
 // CHECK-NEXT:    [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
 // CHECK-NEXT:    ret i32 [[ADD5]]
+//
+//
 // CHECK-LABEL: @ftc_inline1.resolver(
 // CHECK-NEXT:  resolver_entry:
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
@@ -203,6 +241,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    ret ptr @ftc_inline1._MrngMsimd
 // CHECK:       resolver_else4:
 // CHECK-NEXT:    ret ptr @ftc_inline1
+//
+//
 // CHECK-LABEL: @ftc_inline2.resolver(
 // CHECK-NEXT:  resolver_entry:
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
@@ -223,6 +263,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    ret ptr @ftc_inline2._Mfp16
 // CHECK:       resolver_else2:
 // CHECK-NEXT:    ret ptr @ftc_inline2
+//
+//
 // CHECK-LABEL: @ftc_inline3.resolver(
 // CHECK-NEXT:  resolver_entry:
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
@@ -243,62 +285,92 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NEXT:    ret ptr @ftc_inline3._Mbti
 // CHECK:       resolver_else2:
 // CHECK-NEXT:    ret ptr @ftc_inline3
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline1._MrngMsimd(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 1
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline1._MrcpcMpredres(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 1
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline1._Msve2-aesMwfxt(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 1
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline1(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 1
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline2._Mfp16(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 2
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline2._MfcmaMsve2-bitperm(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 2
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline2(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 2
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline3._Mbti(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 3
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline3._MsveMsb(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 3
+//
+//
 // CHECK: Function Attrs: noinline nounwind optnone
 // CHECK-LABEL: @ftc_inline3(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 3
+//
+//
 // CHECK-NOFMV: Function Attrs: noinline nounwind optnone
 // CHECK-NOFMV-LABEL: @ftc(
 // CHECK-NOFMV-NEXT:  entry:
 // CHECK-NOFMV-NEXT:    ret i32 0
+//
+//
 // CHECK-NOFMV: Function Attrs: noinline nounwind optnone
 // CHECK-NOFMV-LABEL: @ftc_def(
 // CHECK-NOFMV-NEXT:  entry:
 // CHECK-NOFMV-NEXT:    ret i32 1
+//
+//
 // CHECK-NOFMV: Function Attrs: noinline nounwind optnone
 // CHECK-NOFMV-LABEL: @ftc_dup1(
 // CHECK-NOFMV-NEXT:  entry:
 // CHECK-NOFMV-NEXT:    ret i32 2
+//
+//
 // CHECK-NOFMV: Function Attrs: noinline nounwind optnone
 // CHECK-NOFMV-LABEL: @ftc_dup2(
 // CHECK-NOFMV-NEXT:  entry:
 // CHECK-NOFMV-NEXT:    ret i32 3
+//
+//
 // CHECK-NOFMV: Function Attrs: noinline nounwind optnone
 // CHECK-NOFMV-LABEL: @foo(
 // CHECK-NOFMV-NEXT:  entry:
@@ -310,10 +382,14 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NOFMV-NEXT:    [[CALL4:%.*]] = call i32 @ftc_dup2()
 // CHECK-NOFMV-NEXT:    [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
 // CHECK-NOFMV-NEXT:    ret i32 [[ADD5]]
+//
+//
 // CHECK-NOFMV: Function Attrs: noinline nounwind optnone
 // CHECK-NOFMV-LABEL: @ftc_direct(
 // CHECK-NOFMV-NEXT:  entry:
 // CHECK-NOFMV-NEXT:    ret i32 4
+//
+//
 // CHECK-NOFMV: Function Attrs: noinline nounwind optnone
 // CHECK-NOFMV-LABEL: @main(
 // CHECK-NOFMV-NEXT:  entry:
@@ -327,21 +403,29 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
 // CHECK-NOFMV-NEXT:    [[CALL4:%.*]] = call i32 @ftc_direct()
 // CHECK-NOFMV-NEXT:    [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
 // CHECK-NOFMV-NEXT:    ret i32 [[ADD5]]
-
-// CHECK: attributes #0 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+lse,+neon" }
-// CHECK: attributes #1 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve,+sve2" }
-// CHECK: attributes #2 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
-// CHECK: attributes #3 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+sha2" }
-// CHECK: attributes #4 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+mte,+neon,+sha2" }
-// CHECK: attributes #5 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon" }
-// CHECK: attributes #6 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+dotprod,+fp-armv8,+neon" }
-// CHECK: attributes #7 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+rand" }
-// CHECK: attributes #8 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+predres,+rcpc" }
-// CHECK: attributes #9 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve,+sve2,+sve2-aes,+wfxt" }
-// CHECK: attributes #10 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon" }
-// CHECK: attributes #11 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+complxnum,+fp-armv8,+fullfp16,+neon,+sve,+sve2,+sve2-bitperm" }
-// CHECK: attributes #12 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti" }
-// CHECK: attributes #13 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sb,+sve" }
-
-// CHECK-NOFMV: attributes #0 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-fmv" }
-// CHECK-NOFMV: attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-fmv" }
+//
+//.
+// CHECK: attributes #[[ATTR0:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+lse,+neon,-fp-armv8" }
+// CHECK: attributes #[[ATTR1:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fullfp16,+neon,+sve,+sve2,-fp-armv8" }
+// CHECK: attributes #[[ATTR2:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-fp-armv8" }
+// CHECK: attributes #[[ATTR3:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+neon,+sha2,-fp-armv8" }
+// CHECK: attributes #[[ATTR4:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+mte,+neon,+sha2,-fp-armv8" }
+// CHECK: attributes #[[ATTR5:[0-9]+]] = { noinline nounwind op...
[truncated]

github-actions · 2023-12-05T12:32:35Z

✅ With the latest revision this PR passed the C/C++ code formatter.

DavidSpickett · 2023-12-05T12:35:55Z

(maybe this is for a later patch but...)

Does this patch include the clang side options for the ABI or does -march=...+no-fp-armv8 already work for that?

ostannard · 2023-12-05T12:42:13Z

This doesn't add any new options, and I'd like to avoid that if possible, instead it is turned on by -march=...+nofp.

DavidSpickett · 2023-12-05T13:34:27Z

That makes sense. Didn't realise we had an fp flag, I don't see it in llvm/include/llvm/TargetParser/AArch64TargetParser.h.

efriedma-quic · 2023-12-05T19:45:02Z

I'm a little surprised we don't need any LLVM backend changes, but I guess in soft-float mode we basically treat floats as ints anyway, so maybe things just work. I'd like to see appropriate regression tests, though (if they don't already exist).

davemgreen · 2023-12-05T19:48:15Z

clang/lib/CodeGen/Targets/AArch64.cpp

@@ -534,7 +540,8 @@ Address AArch64ABIInfo::EmitAAPCSVAArg(Address VAListAddr, QualType Ty,
    BaseTy = ArrTy->getElementType();
    NumRegs = ArrTy->getNumElements();
  }
-  bool IsFPR = BaseTy->isFloatingPointTy() || BaseTy->isVectorTy();
+  bool IsFPR = Kind == AArch64ABIKind::AAPCS &&


Could this be Kind != AArch64ABIKind::AAPCS` in case other ABIKinds make their way here.

ostannard · 2023-12-06T09:43:30Z

I'm a little surprised we don't need any LLVM backend changes, but I guess in soft-float mode we basically treat floats as ints anyway, so maybe things just work. I'd like to see appropriate regression tests, though (if they don't already exist).

Yes, the backend already treated floats as if they were an integer of the same width, so no changes are needed there. I've added a test for the backend.

kyrilltkachov · 2023-12-06T10:56:46Z

Changing the ABI through the architecture flags i.e. +nofp has potential to greatly confuse users, who may come from other architectures that have soft-float ABIs already (including 32-bit arm)

Does this also happen when using +nosimd?
What about users like the Linux kernel that build with -mgeneral-regs-only?
There is already an -mabi= option. How come it doesn't affect what is a pretty major ABI decision?

ostannard · 2023-12-06T14:50:58Z

I originally implemented it this way to avoid making two ABIs valid for one target, but I agree that it would be confusing to have the ABI change automatically. How about we use the -mabi= option as you suggest, with values aapcs (already accepted by clang) and aapcs-soft (new)?

-mabi=aapcs is always the default
If -mabi=aapcs (or the default) is used without an FPU, then floating-point argument and return values are rejected with an error. This matches GCC's current behaviour, and would be a change for clang, though clang previously used a non-compliant ABI. Code which is accepted with this combination is compatible with both ABIs.
If -mabi=aapcs-soft is used with an FPU, then we reject the command-line options with an error. This avoids having two incompatible ABIs for one target. We will always have the option to relax this in future without breaking existing code, but I'd like to avoid that unless there's a very good reason.

Does this also happen when using +nosimd?
No, in clang +nosimd leaves the FPU turned on. That's probably a bad thing, since that isn't a valid combination for AArch64, but that's a different issue.

What about users like the Linux kernel that build with -mgeneral-regs-only?
In clang, that has the same behaviour as +nofp. I vaguely remember something about GCC still allowing FP instructions in inline assembly with this option, but we don't do that in clang because the assembler is integrated into the compiler, and gets the same target options. For the purposes of the ABI, I think we should treat it the same way as +nofp.

AArch64TargetInfo defaults to having the FP feature enabled, but this function was ignoring that and checking for SIMD instructions instead. This won't affect most users, because the driver explicitly enables or disables fp-armv8, which gets handled by AArch64TargetInfo::handleTargetFeatures to turn FP and SIMD on or off. However, it will make testing future patches easier, and allow testing for the presense of FP registers/instructions in CC1 tests. Change-Id: I2d2b3569dca5fa1dc40c5c6d1dabf7741b8c480e

This patch just adds the ABI enum and tests existing behaviour, it doesn't make any functional changes yet. I've not added an option to turn this on, instead it is automatically selected if the target does not have FP registers. This is because the vast majority of existing AArch64 hardware does have an FPU, so we expect this to only be used when abbsoultely necessary, unlike AArch32 where both ABIs were commonly used. Change-Id: I6f3a4562ca60c31b49a926cdce1881418677c604

If we can't pass them in FP registers, then homogeneous floating-point and vector aggregates should be treated like any other composite type, and passed either in registers or on the stack. Change-Id: Icd56e122ad586462d6059069f923ffca4b32a8d2

The AArch64 back-end already avoids saving the FP registers to the va_list when the FP registers aren't present, but clang also needs to know not to load them from the FP register save area when generating code for va_arg. The layout of va_list remains the same, but the vr_top and vr_offs fields are unused. Change-Id: I5d3dee1ac4a29f189432957910662939b79d9329

If any other AArch64ABIKinds reach here in future they will use floating-point registers, which is a better default.

* The default is unchanged ("aapcs", except for targets where it already defaults to "darwinpcs"), independent of whether the target has an FPU * If -mabi=aapcs-soft is used and the target has an FPU, the driver reports an error, to prevent having two incompatible ABIs for one target. * If a hard-float ABI is requested for a target without an FPU, this is accepted by the driver, but any use of floating-point types is an error. This matches GCC's behaviour with -mgeneral-regs-only, and allows for writing code which is compatible with both ABIs.

ostannard · 2024-01-22T16:27:31Z

Ping

ostannard · 2024-02-05T12:56:30Z

Ping

ostannard · 2024-02-12T10:29:49Z

Ping, the ABI specification change (ARM-software/abi-aa#232) has now been merged.

stuij

This seems like a sensible and unobtrusive solution to me. Also the change is backed by a change in the Arm ABI, and it looks like the other review comments have been addressed. I've also built this change locally and ran the lit tests, and everything passes.

ostannard · 2024-02-15T13:00:35Z

One test caused a buildbot failure when the AArch64 backend is not built, fixed by 5b8e7ed

nathanchance · 2024-02-16T16:58:55Z

For what it's worth, this breaks building the Linux kernel for ARCH=arm64 pretty badly, with errors in several drivers:

$ make -skj"$(nrpoc)" ARCH=arm64 LLVM=1 mrproper defconfig all
...
drivers/clk/qcom/gcc-ipq6018.c:896:2: error: expression requires 'double' type support, but ABI 'aapcs' does not support it
  896 |         F(533000000, P_GPLL0, 1.5, 0, 0),
      |         ^
drivers/clk/qcom/clk-rcg.h:10:41: note: expanded from macro 'F'
   10 | #define F(f, s, h, m, n) { (f), (s), (2 * (h) - 1), (m), (n) }
      |                                         ^
...
In file included from drivers/gpu/drm/msm/adreno/a2xx_gpu.c:4:
In file included from drivers/gpu/drm/msm/adreno/a2xx_gpu.h:13:
drivers/gpu/drm/msm/adreno/a2xx.xml.h:1849:24: error: 'A2XX_PA_CL_VPORT_XSCALE' requires 'float' type support, but ABI 'aapcs' does not support it
 1849 | static inline uint32_t A2XX_PA_CL_VPORT_XSCALE(float val)
      |                        ^
drivers/gpu/drm/msm/adreno/a2xx.xml.h:1849:24: note: 'A2XX_PA_CL_VPORT_XSCALE' defined here
drivers/gpu/drm/msm/adreno/a2xx.xml.h:1857:24: error: 'A2XX_PA_CL_VPORT_XOFFSET' requires 'float' type support, but ABI 'aapcs' does not support it
 1857 | static inline uint32_t A2XX_PA_CL_VPORT_XOFFSET(float val)
      |                        ^
drivers/gpu/drm/msm/adreno/a2xx.xml.h:1857:24: note: 'A2XX_PA_CL_VPORT_XOFFSET' defined here
drivers/gpu/drm/msm/adreno/a2xx.xml.h:1865:24: error: 'A2XX_PA_CL_VPORT_YSCALE' requires 'float' type support, but ABI 'aapcs' does not support it
 1865 | static inline uint32_t A2XX_PA_CL_VPORT_YSCALE(float val)
      |                        ^
drivers/gpu/drm/msm/adreno/a2xx.xml.h:1865:24: note: 'A2XX_PA_CL_VPORT_YSCALE' defined here
...

I suspect this may be related to the kernel's use of -mgeneral-regs-only? Up until this point though, there have been no reported issues with code generation or runtime time impact with clang-built kernels.

Prabhuk · 2024-02-16T18:46:53Z

This commit breaks building Fuchsia's Kernel as well.

Sample builder log: https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8755980051536646593/+/u/build/ninja/stdout

[87049/291900](68) CXX kernel.phys_arm64/obj/src/lib/elfldltl/libelfldltl.leb128.cc.o
FAILED: kernel.phys_arm64/obj/src/lib/elfldltl/libelfldltl.leb128.cc.o
../../prebuilt/third_party/clang/custom/bin/clang++ -MD -MF kernel.phys_arm64/obj/src/lib/elfldltl/libelfldltl.leb128.cc.o.d -o kernel.phys_arm64/obj/src/lib/elfldltl/libelfldltl.leb128.cc.o -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS=1 -DZX_ASSERT_LEVEL=2 -DLK_DEBUGLEVEL=2 -DWITH_FRAME_POINTERS=1 -I../../zircon/system/public -I../../src/lib/elfldltl/incl...
In file included from ../../src/lib/elfldltl/leb128.cc:5:
In file included from ../../src/lib/elfldltl/include/lib/elfldltl/dwarf/encoding.h:8:
In file included from ../../sdk/lib/stdcompat/include/lib/stdcompat/span.h:13:
In file included from ../../sdk/lib/stdcompat/include/lib/stdcompat/iterator.h:10:
In file included from ../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/iterator:683:
In file included from ../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__iterator/common_iterator.h:31:
In file included from ../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/variant:220:
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:437:9: error: '__v' requires 'float' type support, but ABI 'aapcs' does not support it
  437 |     if (__v == 0.0f)
      |         ^
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:435:49: note: '__v' defined here
  435 |   _LIBCPP_HIDE_FROM_ABI size_t operator()(float __v) const _NOEXCEPT {
      |                                                 ^
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:439:34: error: 'operator()' requires 'float' type support, but ABI 'aapcs' does not support it
  439 |     return __scalar_hash<float>::operator()(__v);
      |                                  ^
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:249:32: note: 'operator()' defined here
  249 |   _LIBCPP_HIDE_FROM_ABI size_t operator()(_Tp __v) const _NOEXCEPT {
      |                                ^
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:435:32: error: 'operator()' requires 'float' type support, but ABI 'aapcs' does not support it
  435 |   _LIBCPP_HIDE_FROM_ABI size_t operator()(float __v) const _NOEXCEPT {
      |                                ^
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:435:32: note: 'operator()' defined here
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:447:9: error: '__v' requires 'double' type support, but ABI 'aapcs' does not support it
  447 |     if (__v == 0.0)
      |         ^
../../prebuilt/third_party/clang/custom/bin/../include/c++/v1/__functional/hash.h:445:50: note: '__v' defined here
  445 |   _LIBCPP_HIDE_FROM_ABI size_t operator()(double __v) const _NOEXCEPT {
      |             ...

Prabhuk · 2024-02-16T18:51:41Z

@ostannard -- Can you please fix or submit a revert? Thank you.

This reverts commit 9cc98e3. Issue: ClangBuiltLinux/linux#1997

This reverts commit 9cc98e3.

This reverts commit 9cc98e3. Issue: ClangBuiltLinux/linux#1997

This is re-working of #74460, which adds a soft-float ABI for AArch64. That was reverted because it causes errors when building the linux and fuchsia kernels. The problem is that GCC's implementation of the ABI compatibility checks when using the hard-float ABI on a target without FP registers does it's checks after optimisation. The previous version of this patch reported errors for all uses of floating-point types, which is stricter than what GCC does in practice. This changes two things compared to the first version: * Only check the types of function arguments and returns, not the types of other values. This is more relaxed than GCC, while still guaranteeing ABI compatibility. * Move the check from Sema to CodeGen, so that inline functions are only checked if they are actually used. There are some cases in the linux kernel which depend on this behaviour of GCC.

This is re-working of llvm#74460, which adds a soft-float ABI for AArch64. That was reverted because it causes errors when building the linux and fuchsia kernels. The problem is that GCC's implementation of the ABI compatibility checks when using the hard-float ABI on a target without FP registers does it's checks after optimisation. The previous version of this patch reported errors for all uses of floating-point types, which is stricter than what GCC does in practice. This changes two things compared to the first version: * Only check the types of function arguments and returns, not the types of other values. This is more relaxed than GCC, while still guaranteeing ABI compatibility. * Move the check from Sema to CodeGen, so that inline functions are only checked if they are actually used. There are some cases in the linux kernel which depend on this behaviour of GCC.

ostannard requested review from arsenm, stuij and smithp35 December 5, 2023 12:30

llvmbot added clang Clang issues not falling into any other category backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen labels Dec 5, 2023

ostannard mentioned this pull request Dec 5, 2023

AArch64 fails with "Don't know how to custom expand this" with illegal f16 #74402

Open

davemgreen reviewed Dec 5, 2023

View reviewed changes

ostannard added 8 commits January 11, 2024 10:00

[AArch64] Disable HFAs/HVAs for soft-float

898484f

If we can't pass them in FP registers, then homogeneous floating-point and vector aggregates should be treated like any other composite type, and passed either in registers or on the stack. Change-Id: Icd56e122ad586462d6059069f923ffca4b32a8d2

[AArch64] Add a backend test for the soft-float ABI

e48f525

[AArch64] Invert check for soft-float ABI (NFC)

a1171c8

If any other AArch64ABIKinds reach here in future they will use floating-point registers, which is a better default.

Clang-format

06f5701

ostannard force-pushed the aarch64-soft-float branch from 5ee37ae to 3017b8a Compare January 11, 2024 16:35

llvmbot added the clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' label Jan 11, 2024

stuij approved these changes Feb 14, 2024

View reviewed changes

Merge branch 'main' into aarch64-soft-float

8910baa

ostannard merged commit 9cc98e3 into llvm:main Feb 15, 2024
3 of 4 checks passed

nathanchance mentioned this pull request Feb 16, 2024

arm64 "requires 'double/float' type support, but ABI 'aapcs' does not support it" after LLVM commit 9cc98e336980 ClangBuiltLinux/linux#1997

Closed

Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request Feb 16, 2024

Revert "[AArch64] Add soft-float ABI (llvm#74460)"

159846d

This reverts commit 9cc98e3. Issue: ClangBuiltLinux/linux#1997

Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request Feb 16, 2024

Revert "[AArch64] Add soft-float ABI (llvm#74460)"

13bf442

This reverts commit 9cc98e3.

Prabhuk added a commit that referenced this pull request Feb 17, 2024

Revert "[AArch64] Add soft-float ABI (#74460)" (#82032)

ea9ec80

This reverts commit 9cc98e3. Issue: ClangBuiltLinux/linux#1997

ostannard mentioned this pull request Mar 6, 2024

[AArch64] Add soft-float ABI #84146

Merged

ostannard mentioned this pull request Oct 1, 2024

aaarch64: there is no way to use the FPU on a softfloat target without creating an ABI mess #110632

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AArch64] Add soft-float ABI #74460

[AArch64] Add soft-float ABI #74460

ostannard commented Dec 5, 2023

llvmbot commented Dec 5, 2023 •

edited

Loading

github-actions bot commented Dec 5, 2023 •

edited

Loading

DavidSpickett commented Dec 5, 2023

ostannard commented Dec 5, 2023

DavidSpickett commented Dec 5, 2023

efriedma-quic commented Dec 5, 2023

davemgreen Dec 5, 2023

ostannard Dec 6, 2023

ostannard commented Dec 6, 2023

kyrilltkachov commented Dec 6, 2023

ostannard commented Dec 6, 2023

ostannard commented Jan 22, 2024

ostannard commented Feb 5, 2024

ostannard commented Feb 12, 2024

stuij left a comment

ostannard commented Feb 15, 2024

nathanchance commented Feb 16, 2024

Prabhuk commented Feb 16, 2024

Prabhuk commented Feb 16, 2024

[AArch64] Add soft-float ABI #74460

[AArch64] Add soft-float ABI #74460

Conversation

ostannard commented Dec 5, 2023

llvmbot commented Dec 5, 2023 • edited Loading

github-actions bot commented Dec 5, 2023 • edited Loading

DavidSpickett commented Dec 5, 2023

ostannard commented Dec 5, 2023

DavidSpickett commented Dec 5, 2023

efriedma-quic commented Dec 5, 2023

davemgreen Dec 5, 2023

Choose a reason for hiding this comment

ostannard Dec 6, 2023

Choose a reason for hiding this comment

ostannard commented Dec 6, 2023

kyrilltkachov commented Dec 6, 2023

ostannard commented Dec 6, 2023

ostannard commented Jan 22, 2024

ostannard commented Feb 5, 2024

ostannard commented Feb 12, 2024

stuij left a comment

Choose a reason for hiding this comment

ostannard commented Feb 15, 2024

nathanchance commented Feb 16, 2024

Prabhuk commented Feb 16, 2024

Prabhuk commented Feb 16, 2024

llvmbot commented Dec 5, 2023 •

edited

Loading

github-actions bot commented Dec 5, 2023 •

edited

Loading