reduce size of memory allowed for parallel handling of histograms in BinSumsBoosting
paulbkoch committed Mar 31, 2024
1 parent 9ca58ec commit fe10ac3
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions shared/libebm/bridge/bridge.h
@@ -26,10 +26,10 @@ static_assert(sizeof(UIntSmall) < sizeof(UIntBig), "UIntBig must be able to cont
static_assert(sizeof(FloatSmall) < sizeof(FloatBig), "FloatBig must be able to contain FloatSmall");


-// This is large enough that with 1024 bins using AVX512F that the entire 16 SIMD tensors can fit into memory
-// It is larger than most L1 caches, so some of it will need to go into L2, but even with a lot of threads which
-// share the L2/L3 cache, I do not think it should exceed the full L2/L3 caches, so it shouldn't use main memory.
-#define PARALLEL_BINS_BYTES_MAX (STATIC_CAST(size_t, (131072)))
+// from benchmarking, it seems that things are faster if our data can fit into L1 data cache, but get
+// slower if we depend on L2, so keep it under typical L1 data cache sizes, which for smaller mobile
+// laptops and ARM is 16kB
+#define PARALLEL_BINS_BYTES_MAX (STATIC_CAST(size_t, (16384)))

struct ApplyUpdateBridge {
size_t m_cScores;
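
To make the effect of the new limit concrete, here is a rough sketch of the arithmetic implied by the two comments. It is not code from libebm: `MaxParallelBins` and `FitsInBudget` are hypothetical helpers, and the 8 bytes per bin is an assumption inferred from the old comment (131072 bytes = 1024 bins x 16 SIMD lanes x 8 bytes), for example one 32-bit gradient plus one 32-bit hessian. Only the two byte limits are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>

// Byte limits taken from the diff (old and new PARALLEL_BINS_BYTES_MAX).
constexpr std::size_t k_cOldBytesMax = 131072;
constexpr std::size_t k_cNewBytesMax = 16384;

// Assumptions for illustration only, not confirmed by the source:
constexpr std::size_t k_cSimdLanes = 16;  // AVX-512 with 32-bit values: 512 / 32
constexpr std::size_t k_cBytesPerBin = 8; // e.g. float gradient + float hessian

// Hypothetical helper: how many bins fit in the budget if every SIMD lane
// keeps its own private copy of the histogram.
constexpr std::size_t MaxParallelBins(
      std::size_t cBytesMax, std::size_t cSimdLanes, std::size_t cBytesPerBin) {
   return cBytesMax / (cSimdLanes * cBytesPerBin);
}

// Hypothetical helper: would a histogram with cBins bins stay within the budget?
inline bool FitsInBudget(
      std::size_t cBins, std::size_t cSimdLanes, std::size_t cBytesPerBin, std::size_t cBytesMax) {
   assert(0 != cSimdLanes && 0 != cBytesPerBin);
   return cBins <= MaxParallelBins(cBytesMax, cSimdLanes, cBytesPerBin);
}

// Under these assumptions the old limit admits 1024 bins and the new one 128.
static_assert(MaxParallelBins(k_cOldBytesMax, k_cSimdLanes, k_cBytesPerBin) == 1024,
      "old limit: 1024 bins");
static_assert(MaxParallelBins(k_cNewBytesMax, k_cSimdLanes, k_cBytesPerBin) == 128,
      "new limit: 128 bins");
```

Under these assumptions, shrinking the budget by 8x means features with more than 128 bins would no longer qualify for per-lane parallel histograms on AVX-512; narrower SIMD widths would admit proportionally more bins.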