x/text/encoding/charmap: support for code page 1125 aka cp866u
Add the Ukrainian government standard (RST 2018-91) for DOS. It is based
on the common alternative encoding but differs from cp866 in 0xF2-0xF9.

Fixes #69779
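
To illustrate the 0xF2-0xF9 difference named in the commit message, here is a minimal, self-contained sketch. It does not use the generated tables from this commit. The cp1125 column is read off the test vector added to charmap_test.go; the cp866 column is the standard cp866 layout for the same bytes. The names `cp866Tail`, `cp1125Tail`, and `decodeTail` are hypothetical and exist only for this example.

```go
package main

import "fmt"

// Hand-written excerpts covering only bytes 0xF2-0xF9.
// These are illustrative, not the tables generated by maketables.go.
var cp866Tail = [8]rune{'Є', 'є', 'Ї', 'ї', 'Ў', 'ў', '°', '∙'}
var cp1125Tail = [8]rune{'Ґ', 'ґ', 'Є', 'є', 'І', 'і', 'Ї', 'ї'}

// decodeTail maps a byte in 0xF2-0xF9 through the given excerpt.
// It reports false for any byte outside that range.
func decodeTail(table [8]rune, b byte) (rune, bool) {
	if b < 0xF2 || b > 0xF9 {
		return 0, false
	}
	return table[b-0xF2], true
}

func main() {
	// Print both interpretations of each byte side by side.
	for b := 0xF2; b <= 0xF9; b++ {
		r866, _ := decodeTail(cp866Tail, byte(b))
		r1125, _ := decodeTail(cp1125Tail, byte(b))
		fmt.Printf("0x%02X  cp866=%c  cp1125=%c\n", b, r866, r1125)
	}
}
```

Text containing Ґ, І, or their lowercase forms cannot be represented in cp866 at all, which is presumably why a separate table is needed rather than an alias.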
yukal committed Oct 5, 2024
1 parent 3043346 commit a58b05d
Showing 4 changed files with 200 additions and 1 deletion.
4 changes: 4 additions & 0 deletions encoding/charmap/charmap_test.go
@@ -79,6 +79,10 @@ func TestBasics(t *testing.T) {
e: CodePage1047,
encoded: "\xc8\x54\x93\x93\x9f",
utf8: "Hèll¤",
}, {
e: CodePage1125,
encoded: "Hello.\x20\x80\x81\x82\x83\xF2\x84\x85\xF4\xF0\x86\x87\x88\xF6\xF8\x89\x8A\x8B\x8C\x8D\x8E\x8F\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9A\x9B\x9C\x9D\x9E\x9F\xA0\xA1\xA2\xA3\xF3\xA4\xA5\xF5\xF1\xA6\xA7\xA8\xF7\xF9\xA9\xAA\xAB\xAC\xAD\xAE\xAF\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xEA\xEB\xEC\xED\xEE\xEF\xB0\xB1\xB2\xB3\xB4\xB5\xB6\xB7\xB8\xB9\xBA\xBB\xBC\xBD\xBE\xBF\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF\xD0\xD1\xD2\xD3\xD4\xD5\xD6\xD7\xD8\xD9\xDA\xDB\xDC\xDD\xDE\xDF\xFA\xFB\xFC\xFD\xFE",
utf8: "Hello. АБВГҐДЕЄЁЖЗИІЇЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгґдеєёжзиіїйклмнопрстуфхцчшщъыьэюя░▒▓│┤╡╢╖╕╣║╗╝╜╛┐└┴┬├─┼╞╟╚╔╩╦╠═╬╧╨╤╥╙╘╒╓╫╪┘┌█▄▌▐▀·√№¤■",
}, {
e: CodePage1140,
encoded: "\xc8\x9f\x93\x93\xcf",
8 changes: 8 additions & 0 deletions encoding/charmap/maketables.go
@@ -129,6 +129,14 @@ var encodings = []struct {
0x3f,
"https://raw.githubusercontent.com/unicode-org/icu-data/main/charset/data/ucm/glibc-IBM1047-2.1.2.ucm",
},
{
"IBM Code Page 1125 (aka cp866u)",
"IBM1125",
"Ukrainian government standard (RST 2018-91) for DOS, based on common alternative encoding, but different from cp866 in 0xF2-0xF9",
"CodePage1125",
encoding.ASCIISub,
"https://raw.githubusercontent.com/unicode-org/icu-data/main/charset/data/ucm/glibc-CP1125-2.3.3.ucm",
},
{
"IBM Code Page 1140",
"IBM01140",
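
The new maketables.go entry passes `encoding.ASCIISub` (byte 0x1A) as the replacement for runes the code page cannot represent. The sketch below shows that substitution idea in isolation, assuming a deliberately partial mapping: only ASCII and the eight letters at 0xF2-0xF9 are handled, so it is not a usable cp1125 encoder. `asciiSub`, `tailToByte`, and `encodeSketch` are hypothetical names, not part of x/text.

```go
package main

import "fmt"

// asciiSub mirrors the value of encoding.ASCIISub in x/text (0x1A).
const asciiSub = 0x1A

// tailToByte is a partial, hand-written map for 0xF2-0xF9 only,
// taken from the test vector in this commit.
var tailToByte = map[rune]byte{
	'Ґ': 0xF2, 'ґ': 0xF3, 'Є': 0xF4, 'є': 0xF5,
	'І': 0xF6, 'і': 0xF7, 'Ї': 0xF8, 'ї': 0xF9,
}

// encodeSketch encodes ASCII and the eight letters above; every
// other rune is replaced by asciiSub. Real cp1125 covers far more.
func encodeSketch(s string) []byte {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		if r < 0x80 {
			out = append(out, byte(r)) // ASCII passes through unchanged
		} else if b, ok := tailToByte[r]; ok {
			out = append(out, b)
		} else {
			out = append(out, asciiSub) // unmappable rune
		}
	}
	return out
}

func main() {
	// The euro sign is not in cp1125, so it becomes 0x1A.
	fmt.Printf("% X\n", encodeSketch("Є, ї: €"))
}
```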
180 changes: 179 additions & 1 deletion encoding/charmap/tables.go

9 changes: 9 additions & 0 deletions encoding/internal/identifier/mib.go