Adding specification to handle the new C2x _BitInt type #191

`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added for C23. Each has a bit width of `N`, and each value of `N` defines a distinct type. Here we define the mapping between these language types and the machine data types used throughout AAPCS64, through which our calling standard is defined.

Along with the AAPCS32 commit, this closes ARM-software#175. The rationale for the choices in this patch is presented in a separate commit with a rationale document.

Some points of note around the language in this commit:

-- This commit does not update any wording around bit-fields.
The current wording under "Bit-fields subdivision" mentions that "A member of an aggregate that is a Fundamental Data Type may be subdivided into bit-fields". I do not believe this needs updating. While a _BitInt(N>128) can be subdivided into a bit-field at the *language level*, once it has been converted to a Machine Type such a subdivision would look like some number of quad-words which have not been subdivided (each either fully used or fully unused) and either one or zero quad-words which have been subdivided. The alignment requirements of the _BitInt are also the same as those of the quad-word fundamental data type, which this paragraph uses to explain the resulting alignment requirements on the aggregate containing a bit-field.
In the C/C++ language mapping description of "Bit-fields" we mention that a bit-field may have any integral type, and since a bit-precise integer is an integral type this still holds. The explanation in this section of where a field can be placed relies on the *alignment* requirements of the field type. For _BitInt this lines up with the discussion of *fundamental data types* at the machine level, since the alignment requirements of a _BitInt are those of the "chunk" it is made up of, which is the quad-word fundamental data type. Hence I believe the current language does not need updating for bit-precise integers.

-- _BitInt(N > 128) types and Homogeneous Aggregates.
With this changeset, _BitInt(N>128) types are treated as arrays of __int128 values. Hence at the machine level they would be a Homogeneous Aggregate of quad-words. The wording in section 5.9.5 currently mentions "uniquely addressable Members". I am not sure what this refers to, but I would expect it to mean addressable members of the base type. If that is the case then I don't believe anything needs to be updated. If it refers to addressable members at the language level (which would be strange given the context) then this may need updating, since one language-level _BitInt(256) type would not equate to one Fundamental Data Type.

-- Combination of unspecified bits in _BitInt and C.16 in PCS rules.
The mapping this commit defines from a _BitInt to a Machine Type specifies that the bits of the relevant Machine Type that are unused in a _BitInt(N) have unspecified value. Rules C.12 and C.16 of our parameter passing standard specify that when a structure and/or integral Fundamental Data Type passed in registers has unused bits, those unused bits are unspecified. The combination of these two rules means that, for example, when passing a _BitInt(2) across a PCS boundary in a register, bits [2-63] inclusive are unspecified.

-- In-memory and in-register representations match.
This commit only specifies the mapping from language-level type to machine type. The machine type is then treated as it currently is in memory and in registers.
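The consequences of the points above can be sketched in plain C. This is only an illustrative model: it does not use `_BitInt` itself (so it builds on compilers without C23 support), the helper names are hypothetical and not part of AAPCS64, and the register helper assumes a signed `_BitInt(N)` with 1 <= N <= 64 arriving in a single 64-bit register, generalising the `_BitInt(2)` example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only; none of these names come from AAPCS64. */
enum { QUAD_WORD_BITS = 128, QUAD_WORD_BYTES = 16 };

/* Number of quad-word chunks backing a _BitInt(N) with N > 128,
   assuming the "array of __int128" mapping described above. */
static size_t bitint_quad_words(unsigned n_bits)
{
    assert(n_bits > QUAD_WORD_BITS);
    return (n_bits + QUAD_WORD_BITS - 1u) / QUAD_WORD_BITS;
}

/* Storage size in bytes under the same assumption. */
static size_t bitint_storage_bytes(unsigned n_bits)
{
    return bitint_quad_words(n_bits) * QUAD_WORD_BYTES;
}

/* Bits of the last chunk that the _BitInt(N) does not use.
   Under the mapping above their value is unspecified. */
static unsigned bitint_unused_bits(unsigned n_bits)
{
    return (unsigned)(bitint_quad_words(n_bits) * QUAD_WORD_BITS) - n_bits;
}

/* A callee cannot rely on bits [N, 63] of the incoming register, so it
   rebuilds the signed value from the low N bits only. The final cast
   assumes a two's complement conversion, as on all AArch64 compilers. */
static int64_t bitint_from_register(uint64_t reg, unsigned n_bits)
{
    assert(n_bits >= 1 && n_bits <= 64);
    uint64_t mask = (n_bits == 64) ? UINT64_MAX
                                   : ((UINT64_C(1) << n_bits) - 1u);
    uint64_t low = reg & mask;                    /* discard unspecified bits */
    uint64_t sign = UINT64_C(1) << (n_bits - 1u); /* sign bit of the N-bit value */
    return (int64_t)((low ^ sign) - sign);        /* sign-extend from bit N-1 */
}
```

For instance, a register holding arbitrary upper bits and the low bits `10` would be read back as the `_BitInt(2)` value -2, whatever the upper 62 bits contain.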

`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added for C23. Each has a bit width of `N`, and each value of `N` defines a distinct type. Here we define the mapping between these language types and the machine data types used throughout AAPCS32, through which our calling standard is defined.

Along with the AAPCS64 commit, this closes ARM-software#175. The rationale for the choices in this patch is presented in a separate commit with a rationale document.

Some points of note around the language in this commit:

-- This commit does not update any wording around bit-fields.
The current wording under "Bit-fields" (section 5.3.4) mentions that "A member of an aggregate that is a Fundamental Data Type may be subdivided into bit-fields". I do not believe this needs updating. While a _BitInt(N>64) can be subdivided into a bit-field at the *language level*, once it has been converted to a Machine Type such a subdivision would look like some number of double-words which have not been subdivided (each either fully used or fully unused) and either one or zero double-words which have been subdivided. The alignment requirements of the _BitInt are also the same as those of the double-word fundamental data type, which this paragraph uses to explain the resulting alignment requirements on the aggregate containing a bit-field.
In the C/C++ language mapping description of "Bit-fields" we mention that a bit-field may have any integral type, and since a bit-precise integer is an integral type this still holds. The explanation in this section of where a field can be placed relies on the *alignment* requirements of the field type. For _BitInt this lines up with the discussion of *fundamental data types* at the machine level, since the alignment requirements of a _BitInt are those of the "chunk" it is made up of, which is the double-word fundamental data type. Hence I believe the current language does not need updating for bit-precise integers.

-- _BitInt(N > 64) types and Homogeneous Aggregates.
With this changeset, _BitInt(N>64) types are treated as arrays of uint64_t values. Hence at the machine level they would be a Homogeneous Aggregate of double-words.

-- Combination of unspecified bits in _BitInt and B.2 in PCS rules.
The mapping this commit defines from a _BitInt to a Machine Type specifies that the bits of the relevant Machine Type that are unused in a _BitInt(N) have unspecified value. Rule B.2 of our parameter passing standard specifies that when an integral Fundamental Data Type passed in registers has unused bits, the value is zero- or sign-extended to a full word. The combination of this rule with the fact that _BitInt types are zero- or sign-extended to the Fundamental Data Type in which they are passed means that, for example, when passing a _BitInt(2) across a PCS boundary in a register, bits [2-31] inclusive are sign-extended.

-- In-memory and in-register representations match.
This commit only specifies the mapping from language-level type to machine type. The machine type is then treated as it currently is in memory and in registers.
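The extension behaviour described for rule B.2 can be sketched from the caller's side in plain C. This is a rough model rather than the specification's wording: the helper names are hypothetical, a word is taken to be 32 bits, and only values of 1 <= N <= 32 bits passed in a single core register are considered.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only; these helpers are not defined by AAPCS32. */

/* Mask selecting the low n_bits of a 32-bit word. */
static uint32_t low_bits_mask(unsigned n_bits)
{
    assert(n_bits >= 1 && n_bits <= 32);
    return (n_bits == 32) ? UINT32_MAX : ((UINT32_C(1) << n_bits) - 1u);
}

/* Register image for an unsigned _BitInt(N): upper bits are zero. */
static uint32_t extend_unsigned_to_word(uint32_t raw, unsigned n_bits)
{
    return raw & low_bits_mask(n_bits);
}

/* Register image for a signed _BitInt(N): bit N-1 is copied upwards. */
static uint32_t extend_signed_to_word(uint32_t raw, unsigned n_bits)
{
    uint32_t low = raw & low_bits_mask(n_bits);
    uint32_t sign = UINT32_C(1) << (n_bits - 1u);
    return (low ^ sign) - sign; /* wraps modulo 2^32, giving sign extension */
}
```

Under these assumptions, a signed `_BitInt(2)` holding -2 (low bits `10`) would be placed in the register as `0xFFFFFFFE`, while an `unsigned _BitInt(2)` holding 2 would be placed as `0x00000002`.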

This type has been added to the C2x specification. Alongside the changes that describe how it is represented at the machine level, we also add a design document describing the rationale behind the choices we made.

One of the main contentious issues has been the alignment of large _BitInt types. In this document we have chosen to specify that large _BitInts are treated as if they were an array of double-register sized chunks. Other psABIs have chosen to represent large _BitInts as an array of single-register sized chunks. This means that the differences in alignment might be surprising to developers attempting to write portable code. In contrast, we believe that developers familiar with ABIs for Arm architectures would be less surprised by this decision.

N.b. I also happened to notice a Stack Overflow question which suggested using _BitInt(128) for u128. Given that the discussion did not mention any ABI difference between this and __uint128, I would take it as evidence of a benefit to having the ABIs of the two integer types match.
https://stackoverflow.com/questions/16088282/is-there-a-128-bit-integer-in-gcc
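The portability concern can be made concrete with a small layout model in C. It is a sketch under stated assumptions: a 64-bit target where a single register is 8 bytes and a double register is 16 bytes, a large `_BitInt` whose alignment equals its chunk size, and hypothetical helper names that do not come from any psABI.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model only; not taken from any psABI document. */

/* Round value up to the next multiple of a non-zero alignment. */
static size_t round_up(size_t value, size_t multiple)
{
    assert(multiple != 0);
    return (value + multiple - 1) / multiple * multiple;
}

/* Size in bytes of a large _BitInt(N) stored as an array of chunks. */
static size_t chunked_size(unsigned n_bits, size_t chunk_bytes)
{
    return round_up((n_bits + 7u) / 8u, chunk_bytes);
}

/* Offset of a large _BitInt member that follows a single char member,
   assuming the member's alignment equals the chunk size. */
static size_t offset_after_char(size_t chunk_bytes)
{
    return round_up(sizeof(char), chunk_bytes);
}
```

In this model a `_BitInt(192)` occupies 32 bytes with 16-byte chunks but 24 bytes with 8-byte chunks, and a large `_BitInt` member placed after a `char` starts at offset 16 rather than 8, which is the kind of difference that could surprise code written against another psABI.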

Mostly changes that should make the document easier to understand for non-native speakers.

Commits on Sep 12, 2023

Commits on Sep 27, 2023

Commits on Oct 3, 2023