ICU-22785 uprops.icu: coalesce scx+sc bits #3025
Merged
More uprops.icu property vectors cleanup.
The second commit moves the CodePointTrie bit setter functions from emojipropsbuilder.cpp into the toolutil library and adds a getCPTrieSize() function. This should help with future experimentation. (Copied from the experimental #2926.)
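As a rough illustration of what a trie "bit setter" does, here is a minimal sketch of a read-modify-write over a code point range. It does not use the ICU API: `ToyTrie` and `setBitsInRange` are hypothetical names, and a flat table stands in for a mutable CodePointTrie. The actual toolutil function names and signatures may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a mutable code point trie: one 32-bit value per
// code point in [0, limit). A real trie stores ranges compactly; this flat
// table is for illustration only.
struct ToyTrie {
    std::vector<uint32_t> values;
    explicit ToyTrie(int32_t limit) : values(static_cast<std::size_t>(limit), 0) {}
    uint32_t get(int32_t c) const { return values[static_cast<std::size_t>(c)]; }
    void set(int32_t c, uint32_t v) { values[static_cast<std::size_t>(c)] = v; }
};

// Hypothetical helper: for every code point in [start, end] (inclusive),
// replace only the bits selected by mask and keep all other bits as they are.
void setBitsInRange(ToyTrie &trie, int32_t start, int32_t end,
                    uint32_t value, uint32_t mask) {
    for (int32_t c = start; c <= end; ++c) {
        trie.set(c, (trie.get(c) & ~mask) | (value & mask));
    }
}
```

Because each call touches only the masked bits, several properties can be written into the same trie value independently, which is the reason such helpers are convenient to share between builder tools.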
I also experimented with moving the Age bits into yet another new trie (small, 8-bit CodePointTrie). (Very easy with that second commit.)
Good: Frees up another 8 bits in properties vector word 0.
Bad: Slightly increases the uprops.icu data size, by about 4.5% compared to the current state, or 3.9% compared to when the Block bits were still in properties vector word 0.
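To make the trade-off concrete, the following sketch shows an 8-bit field packed into a 32-bit properties word, and how removing that field leaves the bits free for other uses. The shift and mask values and the function names are assumptions chosen for illustration, not taken from the actual uprops.h layout.

```cpp
#include <cstdint>

// Assumed layout for illustration: an 8-bit Age field in the top byte of word 0.
constexpr int kAgeShift = 24;
constexpr uint32_t kAgeMask = UINT32_C(0xff) << kAgeShift;

// Store an 8-bit age value in the word without disturbing the other 24 bits.
constexpr uint32_t setAge(uint32_t word0, uint8_t age) {
    return (word0 & ~kAgeMask) | (static_cast<uint32_t>(age) << kAgeShift);
}

// Read the 8-bit age value back out of the word.
constexpr uint8_t getAge(uint32_t word0) {
    return static_cast<uint8_t>((word0 & kAgeMask) >> kAgeShift);
}

// If the age moved to a separate 8-bit trie, word 0 would keep only the
// remaining bits, and the top byte would become available.
constexpr uint32_t withoutAge(uint32_t word0) {
    return word0 & ~kAgeMask;
}
```

The cost side of the trade-off is not visible here: a separate trie brings its own index and data arrays, which is where the size increase would come from.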
With the current Unicode 16 alpha data, we would go from
to
Since we don't need the additional bits yet, there is no immediate benefit to counterbalance the slight size increase, so I just added a comment.
Checklist
ALLOW_MANY_COMMITS=true