-
I believe 7F is correct. Not sure what's going on with DUCET. But since the area in question is unassigned anyway, I don't think it should matter for your use case.
-
Hello! I'm really, REALLY sorry for going off-topic. I understand that this might not be entirely relevant, but perhaps you could help me figure it out?
I'm writing a collator, and for the Tangut language, the weights are implicit.
In this case, I'm considering the DUCET, without delving into the CLDR territory.
In the specification (TR#10), it's stated:
Siniform ideographic scripts: Tangut = Assigned code points in Block=Tangut OR Block=Tangut_Components OR Block=Tangut_Supplement
So, as I understand it, it includes U+17000..=U+18AFF and Tangut_Supplement.
If we trust the UCD Blocks.txt, Tangut_Supplement ends at U+18D7F. If we trust UCA DUCET allkeys.txt, Tangut_Supplement ends at U+18D8F.
So is it U+18D7F or U+18D8F?
Or should one calculate implicit weights according to the scheme for Tangut_Supplement for U+18D00..=U+18D08, and consider U+18D09 as Unassigned (which seems logical, but the confusion about UCD/UCA remains)?
I understand that the issue is minor, and perhaps not an issue at all, but if there's a mistake somewhere here, it would be good to correct it. If not, and it's just me being a bit stupid, I would really appreciate comments from smart and knowledgeable people in this matter.
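To make the question concrete, here is a minimal Python sketch of the implicit primary weight computation as I read it from TR#10. It covers only the Tangut and Unassigned cases; the other implicit-weight categories are left out. The names implicit_primaries, toy_is_assigned and supplement_end are hypothetical, and toy_is_assigned is a simplification that treats all of U+17000..=U+18AFF as assigned and only U+18D00..=U+18D08 in the supplement. A real collator would take assignment data from the UCD for its target Unicode version.

```python
from typing import Callable, Tuple

TANGUT_BASE = 0x17000  # offset base used for all three Tangut blocks


def toy_is_assigned(cp: int) -> bool:
    """Hypothetical stand-in for a UCD lookup (assumption, not real data)."""
    return 0x17000 <= cp <= 0x18AFF or 0x18D00 <= cp <= 0x18D08


def implicit_primaries(
    cp: int,
    is_assigned: Callable[[int], bool],
    supplement_end: int,
) -> Tuple[int, int]:
    """Return the (AAAA, BBBB) implicit primary weights for cp.

    supplement_end is the disputed end of Tangut_Supplement:
    0x18D7F (Blocks.txt) or 0x18D8F (allkeys.txt).
    """
    in_tangut_blocks = (
        0x17000 <= cp <= 0x18AFF or 0x18D00 <= cp <= supplement_end
    )
    if in_tangut_blocks and is_assigned(cp):
        # Tangut scheme: fixed lead weight, offset from the Tangut base.
        return 0xFB00, (cp - TANGUT_BASE) | 0x8000
    # Unassigned scheme: lead weight depends on the top bits of cp.
    return 0xFBC0 + (cp >> 15), (cp & 0x7FFF) | 0x8000


for end in (0x18D7F, 0x18D8F):
    for cp in (0x18D00, 0x18D08, 0x18D09, 0x18D80):
        aaaa, bbbb = implicit_primaries(cp, toy_is_assigned, end)
        print(f"end=U+{end:05X} cp=U+{cp:05X} -> {aaaa:04X} {bbbb:04X}")
```

Under these assumptions the two candidate block ends give identical weights for every code point in U+18D00..=U+18D8F, because the "Assigned code points" condition already sends U+18D09 and everything after it down the Unassigned path.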
Thank you!