Skip to content

Commit

Permalink
Adding a runtime dis/enabler of DIT Capability on AArch64. (aws#1783)
Browse files Browse the repository at this point in the history
- Provide runtime functions that mask out (and back in) the DIT CPU
capability by clearing (setting) an additional bit in
`OPENSSL_armcap_P`. This mechanism was chosen for the following reasons:
  - It does not require an additional global variable.
  - It avoids extra checks on the path of setting/resetting the DIT bit.
- It avoids re-evaluating the CPU capability if we were to clear the DIT
capability bit itself. That latter bit is now left intact.
There were write locks added around changing `OPENSSL_armcap_P`.
However, Thread Sanitizer warned about data race possibilities when
trying to run a test with concurrent threads where one disables DIT at
runtime and the other tries to check for the capability. Therefore, they
are documented with a warning to use them only in initialization
contexts.

- Make the DIT functions (enable/disable and set/restore) available
regardless of whether the build flag
`DENABLE_DATA_INDEPENDENT_TIMING_AARCH64=ON` was used or not.
- If the build flag was not used, then the DIT flag is not set and reset
with every function and the instructions used for setting it and
resetting after checking the capability are omitted and don't incur
extra cost.
- The user now has the choice, regardless of the build flag, to place
`armv8_set/restore_dit` in the user's code.

Call-outs:
- The API `armv8_enable_dit` is renamed to `armv8_set_dit`. 
- `armv8_enable_dit` now means enable back the capability at runtime.
- The macro `SET_DIT_AUTO_RESET` and the functions `armv8_set_dit` and `armv8_restore_dit` are moved to be internal.
- The build flag was renamed to `ENABLE_DATA_INDEPENDENT_TIMING`.
  • Loading branch information
nebeid authored Sep 25, 2024
1 parent 2df0b55 commit fbeb5e8
Show file tree
Hide file tree
Showing 31 changed files with 513 additions and 238 deletions.
53 changes: 37 additions & 16 deletions BUILDING.md
Original file line number Diff line number Diff line change
Expand Up @@ -240,6 +240,8 @@ See "Snapshot Safety Prerequisites" here: https://lkml.org/lkml/2021/3/8/677

# Data Independent Timing on AArch64

The functions described in this section are still experimental.

The Data Independent Timing (DIT) flag on Arm64 processors, when
enabled, ensures the following as per [Arm A-profile Architecture
Registers
Expand All @@ -254,20 +256,39 @@ It is also expected to disable the Data Memory-dependent Prefetcher
(DMP) feature of Apple M-series CPUs starting at M3 as per [this
article](https://appleinsider.com/articles/24/03/21/apple-silicon-vulnerability-leaks-encryption-keys-and-cant-be-patched-easily).

Building with the option `-DENABLE_DATA_INDEPENDENT_TIMING_AARCH64=ON`
will enable the macro `SET_DIT_AUTO_DISABLE`. This macro is present at
the entry of functions that process/load/store secret data to enable
the DIT flag and then set it to its original value on entry. With
this build option, there is an effect on performance that varies by
Building with the option `-DENABLE_DATA_INDEPENDENT_TIMING=ON`
will enable the macro `SET_DIT_AUTO_RESET`. This macro is present at
the entry of functions that process/load/store secret data to set the
DIT flag and then restore it to its original value on entry. With this
build option, there is an effect on performance that varies by
function and by processor architecture. The effect is mostly due to
enabling and disabling the DIT flag. If it remains enabled over many
calls, the effect can be largely mitigated. Hence, the macro can be
inserted in the caller's application at the beginning of the code
scope that makes repeated calls to AWS-LC cryptographic
functions. Alternatively, the functions `armv8_enable_dit` and
`armv8_restore_dit` can be placed at the beginning and the end of
the code section, respectively.
An example of that usage is present in the benchmarking function
`Speed()` in `tool/speed.cc` when the `-dit` option is used

./tool/bssl speed -dit
setting and resetting the DIT flag. If it remains set over many calls,
the effect can be largely mitigated.

The macro and the functions invoked by it are internally declared,
being experimental. In the following, we tested the effect of
inserting the macro in the caller's application at the beginning of
the code scope that makes repeated calls to AWS-LC cryptographic
functions. The functions that are invoked in the macro,
`armv8_set_dit` and `armv8_restore_dit`, are placed at the beginning
and the end, respectively, of the benchmarking function `Speed()` in
`tool/speed.cc` when the `-dit` option is used.

./tool/bssl speed -dit

This resulted in benchmarks that are close to the release build
without the `-DENABLE_DATA_INDEPENDENT_TIMING=ON` flag when tested on
Apple M2.

The DIT capability, which is checked in `OPENSSL_cpuid_setup` can be
masked out at runtime by calling `armv8_disable_dit`. This would
result in having the functions `armv8_set_dit` and `armv8_restore_dit`
being of no effect. It can be made available again at runtime by calling
`armv8_enable_dit`.

**Important**: This runtime control is provided to users that would use
the build flag `ENABLE_DATA_INDEPENDENT_TIMING`, but would
then disable DIT capability at runtime. This is ideally done in
an initialization routine of AWS-LC before any threads are spawn.
Otherwise, there may be data races created because these functions write
to the global variable `OPENSSL_armcap_P`.
7 changes: 4 additions & 3 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,8 @@ option(ENABLE_DILITHIUM "Enable Dilithium signatures in the EVP API" OFF)
option(DISABLE_PERL "Disable Perl for AWS-LC" OFF)
option(DISABLE_GO "Disable Go for AWS-LC" OFF)
option(ENABLE_FIPS_ENTROPY_CPU_JITTER "Enable FIPS entropy source: CPU Jitter" OFF)
option(ENABLE_DATA_INDEPENDENT_TIMING_AARCH64 "Enable Data-Independent Timing (DIT) flag on Arm64" OFF)
option(ENABLE_DATA_INDEPENDENT_TIMING "Enable automatic setting/resetting Data-Independent Timing
(DIT) flag in cryptographic functions. Currently only applicable to Arm64 (except on Windows)" OFF)
include(cmake/go.cmake)

enable_language(C)
Expand Down Expand Up @@ -867,8 +868,8 @@ if(ARCH STREQUAL "x86" AND NOT OPENSSL_NO_SSE2_FOR_TESTING)
endif()
endif()

if(ENABLE_DATA_INDEPENDENT_TIMING_AARCH64)
add_definitions(-DMAKE_DIT_AVAILABLE)
if(ENABLE_DATA_INDEPENDENT_TIMING)
add_definitions(-DENABLE_AUTO_SET_RESET_DIT)
endif()

if(USE_CUSTOM_LIBCXX)
Expand Down
1 change: 1 addition & 0 deletions crypto/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -805,6 +805,7 @@ if(BUILD_TESTING)
fipsmodule/sha/sha_test.cc
fipsmodule/sha/sha3_test.cc
fipsmodule/cpucap/cpu_arm_linux_test.cc
fipsmodule/cpucap/cpu_aarch64_dit_test.cc
fipsmodule/hkdf/hkdf_test.cc
fipsmodule/sshkdf/sshkdf_test.cc
hpke/hpke_test.cc
Expand Down
9 changes: 5 additions & 4 deletions crypto/evp_extra/p_dh_asn1.c
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@

#include "internal.h"
#include "../internal.h"
#include "../fipsmodule/cpucap/internal.h"


static void dh_free(EVP_PKEY *pkey) {
Expand Down Expand Up @@ -89,7 +90,7 @@ const EVP_PKEY_ASN1_METHOD dh_asn1_meth = {
};

int EVP_PKEY_set1_DH(EVP_PKEY *pkey, DH *key) {
SET_DIT_AUTO_DISABLE
SET_DIT_AUTO_RESET
if (EVP_PKEY_assign_DH(pkey, key)) {
DH_up_ref(key);
return 1;
Expand All @@ -98,14 +99,14 @@ int EVP_PKEY_set1_DH(EVP_PKEY *pkey, DH *key) {
}

int EVP_PKEY_assign_DH(EVP_PKEY *pkey, DH *key) {
SET_DIT_AUTO_DISABLE
SET_DIT_AUTO_RESET
evp_pkey_set_method(pkey, &dh_asn1_meth);
pkey->pkey.dh = key;
return key != NULL;
}

DH *EVP_PKEY_get0_DH(const EVP_PKEY *pkey) {
SET_DIT_AUTO_DISABLE
SET_DIT_AUTO_RESET
if (pkey->type != EVP_PKEY_DH) {
OPENSSL_PUT_ERROR(EVP, EVP_R_EXPECTING_A_DH_KEY);
return NULL;
Expand All @@ -114,7 +115,7 @@ DH *EVP_PKEY_get0_DH(const EVP_PKEY *pkey) {
}

DH *EVP_PKEY_get1_DH(const EVP_PKEY *pkey) {
SET_DIT_AUTO_DISABLE
SET_DIT_AUTO_RESET
DH *dh = EVP_PKEY_get0_DH(pkey);
if (dh != NULL) {
DH_up_ref(dh);
Expand Down
8 changes: 4 additions & 4 deletions crypto/fipsmodule/aes/aes.c
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@
// code, above, is incompatible with the |aes_hw_*| functions.

void AES_encrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (hwaes_capable()) {
aes_hw_encrypt(in, out, key);
} else if (vpaes_capable()) {
Expand All @@ -71,7 +71,7 @@ void AES_encrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
}

void AES_decrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (hwaes_capable()) {
aes_hw_decrypt(in, out, key);
} else if (vpaes_capable()) {
Expand All @@ -82,7 +82,7 @@ void AES_decrypt(const uint8_t *in, uint8_t *out, const AES_KEY *key) {
}

int AES_set_encrypt_key(const uint8_t *key, unsigned bits, AES_KEY *aeskey) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (bits != 128 && bits != 192 && bits != 256) {
return -2;
}
Expand All @@ -96,7 +96,7 @@ int AES_set_encrypt_key(const uint8_t *key, unsigned bits, AES_KEY *aeskey) {
}

int AES_set_decrypt_key(const uint8_t *key, unsigned bits, AES_KEY *aeskey) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (bits != 128 && bits != 192 && bits != 256) {
return -2;
}
Expand Down
10 changes: 5 additions & 5 deletions crypto/fipsmodule/cipher/aead.c
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ int EVP_AEAD_CTX_init_with_direction(EVP_AEAD_CTX *ctx, const EVP_AEAD *aead,
const uint8_t *key, size_t key_len,
size_t tag_len,
enum evp_aead_direction_t dir) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (key_len != aead->key_len) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_UNSUPPORTED_KEY_SIZE);
ctx->aead = NULL;
Expand Down Expand Up @@ -125,7 +125,7 @@ int EVP_AEAD_CTX_seal(const EVP_AEAD_CTX *ctx, uint8_t *out, size_t *out_len,
size_t max_out_len, const uint8_t *nonce,
size_t nonce_len, const uint8_t *in, size_t in_len,
const uint8_t *ad, size_t ad_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (in_len + ctx->aead->overhead < in_len /* overflow */) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_TOO_LARGE);
goto error;
Expand Down Expand Up @@ -164,7 +164,7 @@ int EVP_AEAD_CTX_seal_scatter(const EVP_AEAD_CTX *ctx, uint8_t *out,
size_t in_len, const uint8_t *extra_in,
size_t extra_in_len, const uint8_t *ad,
size_t ad_len) {
SET_DIT_AUTO_DISABLE; //check that it was preserved
SET_DIT_AUTO_RESET; //check that it was preserved
// |in| and |out| may alias exactly, |out_tag| may not alias.
if (!check_alias(in, in_len, out, in_len) ||
buffers_alias(out, in_len, out_tag, max_out_tag_len) ||
Expand Down Expand Up @@ -197,7 +197,7 @@ int EVP_AEAD_CTX_open(const EVP_AEAD_CTX *ctx, uint8_t *out, size_t *out_len,
size_t max_out_len, const uint8_t *nonce,
size_t nonce_len, const uint8_t *in, size_t in_len,
const uint8_t *ad, size_t ad_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (!check_alias(in, in_len, out, max_out_len)) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_OUTPUT_ALIASES_INPUT);
goto error;
Expand Down Expand Up @@ -245,7 +245,7 @@ int EVP_AEAD_CTX_open_gather(const EVP_AEAD_CTX *ctx, uint8_t *out,
const uint8_t *in, size_t in_len,
const uint8_t *in_tag, size_t in_tag_len,
const uint8_t *ad, size_t ad_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (!check_alias(in, in_len, out, in_len)) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_OUTPUT_ALIASES_INPUT);
goto error;
Expand Down
14 changes: 7 additions & 7 deletions crypto/fipsmodule/cipher/cipher.c
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ void EVP_CIPHER_CTX_free(EVP_CIPHER_CTX *ctx) {
}

int EVP_CIPHER_CTX_copy(EVP_CIPHER_CTX *out, const EVP_CIPHER_CTX *in) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
if (in == NULL || in->cipher == NULL) {
OPENSSL_PUT_ERROR(CIPHER, CIPHER_R_INPUT_NOT_INITIALIZED);
return 0;
Expand Down Expand Up @@ -146,7 +146,7 @@ int EVP_CIPHER_CTX_reset(EVP_CIPHER_CTX *ctx) {
int EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher,
ENGINE *engine, const uint8_t *key, const uint8_t *iv,
int enc) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
GUARD_PTR(ctx);
if (enc == -1) {
enc = ctx->encrypt;
Expand Down Expand Up @@ -264,7 +264,7 @@ static int block_remainder(const EVP_CIPHER_CTX *ctx, int len) {

int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
const uint8_t *in, int in_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
GUARD_PTR(ctx);
if (ctx->poisoned) {
OPENSSL_PUT_ERROR(CIPHER, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED);
Expand Down Expand Up @@ -357,7 +357,7 @@ int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
}

int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
int n;
unsigned int i, b, bl;
GUARD_PTR(ctx);
Expand Down Expand Up @@ -412,7 +412,7 @@ int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len) {

int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
const uint8_t *in, int in_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
GUARD_PTR(ctx);
if (ctx->poisoned) {
OPENSSL_PUT_ERROR(CIPHER, ERR_R_SHOULD_NOT_HAVE_BEEN_CALLED);
Expand Down Expand Up @@ -479,7 +479,7 @@ int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, uint8_t *out, int *out_len,
}

int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *out_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
int i, n;
unsigned int b;
*out_len = 0;
Expand Down Expand Up @@ -552,7 +552,7 @@ int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *out_len) {

int EVP_Cipher(EVP_CIPHER_CTX *ctx, uint8_t *out, const uint8_t *in,
size_t in_len) {
SET_DIT_AUTO_DISABLE;
SET_DIT_AUTO_RESET;
GUARD_PTR(ctx);
GUARD_PTR(ctx->cipher);
const int ret = ctx->cipher->cipher(ctx, out, in, in_len);
Expand Down
38 changes: 30 additions & 8 deletions crypto/fipsmodule/cpucap/cpu_aarch64.c
Original file line number Diff line number Diff line change
Expand Up @@ -49,16 +49,21 @@ void handle_cpu_env(uint32_t *out, const char *in) {
}
}

#if defined(MAKE_DIT_AVAILABLE) && !defined(OPENSSL_WINDOWS)
#if defined(AARCH64_DIT_SUPPORTED)
// "DIT" is not recognised as a register name by clang-10 (at least)
// Register's encoded name is from e.g.
// https://github.com/ashwio/arm64-sysreg-lib/blob/d421e249a026f6f14653cb6f9c4edd8c5d898595/include/sysreg/dit.h#L286
#define DIT_REGISTER s3_3_c4_c2_5
DEFINE_STATIC_MUTEX(OPENSSL_armcap_P_lock)

static uint64_t armv8_get_dit(void) {
uint64_t val = 0;
__asm__ volatile("mrs %0, s3_3_c4_c2_5" : "=r" (val));
return (val >> 24) & 1;
uint64_t armv8_get_dit(void) {
if (CRYPTO_is_ARMv8_DIT_capable()) {
uint64_t val = 0;
__asm__ volatile("mrs %0, s3_3_c4_c2_5" : "=r" (val));
return (val >> 24) & 1;
} else {
return 0;
}
}

// See https://github.com/torvalds/linux/blob/53eaeb7fbe2702520125ae7d72742362c071a1f2/arch/arm64/include/asm/sysreg.h#L82
Expand All @@ -70,7 +75,7 @@ static uint64_t armv8_get_dit(void) {
// Op1 (3 for DIT) , Op2 (5 for DIT) encodes the PSTATE field modified and defines the constraints.
// CRm = Imm4 (#0 or #1 below)
// Rt = 0x1f
uint64_t armv8_enable_dit(void) {
uint64_t armv8_set_dit(void) {
if (CRYPTO_is_ARMv8_DIT_capable()) {
uint64_t original_dit = armv8_get_dit();
// Encoding of "msr dit, #1"
Expand All @@ -82,11 +87,28 @@ uint64_t armv8_enable_dit(void) {
}

void armv8_restore_dit(volatile uint64_t *original_dit) {
if (CRYPTO_is_ARMv8_DIT_capable() && *original_dit != 1) {
if (*original_dit != 1 && CRYPTO_is_ARMv8_DIT_capable()) {
// Encoding of "msr dit, #0"
__asm__ volatile(".long 0xd503405f");
}
}
#endif // MAKE_DIT_AVAILABLE && !OPENSSL_WINDOWS

void armv8_disable_dit(void) {
CRYPTO_STATIC_MUTEX_lock_write(OPENSSL_armcap_P_lock_bss_get());
OPENSSL_armcap_P &= ~ARMV8_DIT_ALLOWED;
CRYPTO_STATIC_MUTEX_unlock_write(OPENSSL_armcap_P_lock_bss_get());
}

void armv8_enable_dit(void) {
CRYPTO_STATIC_MUTEX_lock_write(OPENSSL_armcap_P_lock_bss_get());
OPENSSL_armcap_P |= ARMV8_DIT_ALLOWED;
CRYPTO_STATIC_MUTEX_unlock_write(OPENSSL_armcap_P_lock_bss_get());
}

int CRYPTO_is_ARMv8_DIT_capable_for_testing(void) {
return CRYPTO_is_ARMv8_DIT_capable();
}

#endif // AARCH64_DIT_SUPPORTED

#endif // OPENSSL_AARCH64 && !OPENSSL_STATIC_ARMCAP
4 changes: 1 addition & 3 deletions crypto/fipsmodule/cpucap/cpu_aarch64_apple.c
Original file line number Diff line number Diff line change
Expand Up @@ -99,11 +99,9 @@ void OPENSSL_cpuid_setup(void) {
OPENSSL_armcap_P |= ARMV8_APPLE_M1;
}

#if defined(MAKE_DIT_AVAILABLE)
if (has_hw_feature("hw.optional.arm.FEAT_DIT")) {
OPENSSL_armcap_P |= ARMV8_DIT;
OPENSSL_armcap_P |= (ARMV8_DIT | ARMV8_DIT_ALLOWED);
}
#endif // MAKE_DIT_AVAILABLE

// OPENSSL_armcap is a 32-bit, unsigned value which may start with "0x" to
// indicate a hex value. Prior to the 32-bit value, a '~' or '|' may be given.
Expand Down
Loading

0 comments on commit fbeb5e8

Please sign in to comment.