6 locale tests fail to properly sort ä #176

Open
nieder opened this issue Sep 7, 2024 · 6 comments

Comments

@nieder

nieder commented Sep 7, 2024

Describe the bug
Six tests fail, and all of them seem to have trouble deciding where to place ä (lowercase 'a' with umlaut). The basic assertion looks something like this in every case:

E       AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         At index 0 diff: 'b' != 'ä'
E         Full diff:
E         - ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         + ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3]

See the full log at the bottom.
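
To check whether the platform's own collation is responsible, independent of natsort, a stdlib-only sketch along these lines may help. The helper name `sort_with_locale` is hypothetical, and the code assumes the `en_US.UTF-8` locale is installed; it returns `None` when it is not.

```python
import locale


def sort_with_locale(items, locale_name="en_US.UTF-8"):
    """Sort strings with the C library's collation for locale_name.

    Returns None when the requested locale is not available.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm turns each string into a key that compares
        # according to the active LC_COLLATE rules.
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore global state


if __name__ == "__main__":
    print(sort_with_locale(["b", "Z", "ä", "Ä"]))
```

If ä and Ä land after b and Z here as well, the C library's collation, rather than natsort itself, is the more likely source of the failing assertions.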

Expected behavior
Tests should pass.

Environment (please complete the following information):

  • Python Version: 3.10.4
  • OS: macOS 10.14.6
  • If the bug involves LOCALE or humansorted:
    • Is PyICU installed? It is not installed
    • I have tried LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 and other variations, as well as unsetting them. All lead to the same problem.
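
The environment details above can be gathered in one go with a small script. This is only a sketch: `collation_report` is a hypothetical helper, not part of natsort. It relies on two facts: string ordering is governed by the `LC_COLLATE` category (with `LC_ALL` overriding it and `LANG` supplying the default), and PyICU is imported under the module name `icu`.

```python
import importlib.util
import locale
import os


def collation_report():
    """Gather the settings that influence locale-aware sorting."""
    names = ("LC_ALL", "LC_COLLATE", "LC_CTYPE", "LANG")
    env = {name: os.environ.get(name) for name in names}

    previous = locale.setlocale(locale.LC_COLLATE)  # remember current value
    try:
        # An empty string asks the C library to honor the environment.
        active = locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        active = None  # the environment names a locale that is not installed
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)

    return {
        "env": env,
        "active_collate": active,
        "pyicu_installed": importlib.util.find_spec("icu") is not None,
    }


if __name__ == "__main__":
    for key, value in collation_report().items():
        print(f"{key}: {value}")
```

Note that setting only `LC_CTYPE` does not change collation directly; in the command shown in this report, `LANG` is what would supply the collation locale.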

To Reproduce
See the full log below.

Full test output

LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 python3.10 -m pytest -p no:relaxed -vv
=========================================================== test session starts ===========================================================
platform darwin -- Python 3.10.4, pytest-7.4.4, pluggy-1.4.0 -- /sw/bin/python3.10
cachedir: .pytest_cache
benchmark: 3.4.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/sw/build.build/natsort-py310-8.4.0-1/natsort-8.4.0/.hypothesis/examples')
Using --randomly-seed=1000599648
rootdir: /sw/build.build/natsort-py310-8.4.0-1/natsort-8.4.0
plugins: benchmark-3.4.1, hypothesis-6.42.1, randomly-3.15.0, datadir-1.5.0, asyncio-0.21.1, flaky-3.8.1, mock-3.12.0, xdist-3.5.0, cov-4.1.0, requests-mock-1.12.1
asyncio: mode=strict
collected 333 items

tests/test_natsorted.py::test_natsorted_sorts_an_odd_collection_of_strings[ns.NUMAFTER-expected1] PASSED [ 0%]
tests/test_natsorted.py::test_natsorted_locale_bug_regression_test_140 SKIPPED (requires a functioning locale library to run) [ 0%]
tests/test_natsorted.py::test_natsorted_consistent_ordering_with_nan_and_friends[ns.DEFAULT-expected0] PASSED [ 0%]
tests/test_natsorted.py::test_natsorted_returns_list_in_reversed_order_with_reverse_option PASSED [ 1%]
tests/test_natsorted.py::test_natsort_sorts_consistently_with_presort PASSED [ 1%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.GROUPLETTERS-expected3] PASSED [ 1%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[520-expected6] PASSED [ 2%]
tests/test_natsorted.py::test_natsorted_can_sort_with_or_without_accounting_for_sign[ns.SIGNED-expected1] PASSED [ 2%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.UNGROUPLETTERS-expected2] PASSED [ 2%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_and_ignore_exponents[51] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.LOWERCASEFIRST-expected2] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[ns.LOWERCASEFIRST-expected2] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.DEFAULT-expected0] PASSED [ 3%]
tests/test_natsorted.py::test_natsorted_handles_numbers_and_filesystem_paths_simultaneously PASSED [ 4%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.DEFAULT-expected0] FAILED [ 4%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4104-expected5] FAILED [ 4%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.PATH-expected4] FAILED [ 5%]
tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_en FAILED [ 5%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[640-expected3] PASSED [ 5%]
tests/test_natsorted.py::test_natsorted_consistent_ordering_with_nan_and_friends[ns.NANLAST-expected1] PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_can_sort_as_version_numbers PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_ints_which_is_default[ns.DEFAULT1] PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[ns.DEFAULT-expected0] PASSED [ 6%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4608-expected3] PASSED [ 7%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_ints_which_is_default[ns.DEFAULT0] PASSED [ 7%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types[ns.DEFAULT-expected0] PASSED [ 7%]
tests/test_natsorted.py::test_natsorted_can_sorts_paths_same_as_strings PASSED [ 8%]
tests/test_natsorted.py::test_natsorted_sorts_an_odd_collection_of_strings[ns.DEFAULT-expected0] PASSED [ 8%]
tests/test_natsorted.py::test_natsorted_sorts_mixed_ascii_and_non_ascii_numbers PASSED [ 8%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types[ns.NUMAFTER-expected1] PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_locale_bug_regression_test_109 PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_raises_type_error_for_non_iterable_input PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_supports_nested_case_handling[ns.LOWERCASEFIRST-expected1] PASSED [ 9%]
tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_de FAILED [ 10%]
tests/test_natsorted.py::test_natsorted_supports_case_handling[ns.IGNORECASE-expected1] PASSED [ 10%]
tests/test_natsorted.py::test_natsorted_handles_filesystem_paths PASSED [ 10%]
tests/test_natsorted.py::test_natsorted_supports_nested_case_handling[ns.IGNORECASE-expected2] PASSED [ 11%]
tests/test_natsorted.py::test_natsorted_recurses_into_nested_lists PASSED [ 11%]
tests/test_natsorted.py::test_natsorted_supports_nested_case_handling[ns.DEFAULT-expected0] PASSED [ 11%]
tests/test_natsorted.py::test_natsorted_can_sort_with_or_without_accounting_for_sign[ns.DEFAULT-expected0] PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_path_extensions_heuristic PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_can_sort_using_locale[ns.UNGROUPLETTERS-expected1] PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_can_sort_as_unsigned_and_ignore_exponents[50] PASSED [ 12%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4616-expected7] PASSED [ 13%]
tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.NUMAFTER-expected1] FAILED [ 13%]
tests/test_natsorted.py::test_natsorted_applies_key_to_each_list_element_before_sorting_list PASSED [ 13%]
tests/test_natsorted.py::test_natsorted_with_mixed_bytes_and_str_input_raises_type_error PASSED [ 14%]
\
tests/test_natsorted_convenience.py::test_index_natsorted_can_presort PASSED [ 98%]
tests/test_natsorted_convenience.py::test_order_by_index_sorts_list_according_to_order_of_integer_list PASSED [ 98%]
tests/test_natsorted_convenience.py::test_index_realsorted_is_identical_to_index_natsorted_with_real_alg PASSED [ 99%]
tests/test_natsorted_convenience.py::test_index_natsorted_applies_key_function_before_sorting PASSED [ 99%]
tests/test_natsorted_convenience.py::test_index_natsorted_returns_integer_list_of_sort_order_for_input_list PASSED [ 99%]
tests/test_natsorted_convenience.py::test_as_ascii_converts_bytes_to_ascii PASSED [100%]

================================================================ FAILURES =================================================================
__________________________________ test_natsorted_handles_mixed_types_with_locale[ns.DEFAULT-expected0] ___________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = , expected = ['0', 1.5, '2', 3, 'ä', 'Ä', ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         + ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä']

tests/test_natsorted.py:318: AssertionError
_____________________________________ test_natsorted_handles_mixed_types_with_locale[4104-expected5] ______________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = 4104, expected = ['ä', 'Ä', 'b', 'Z', '0', 1.5, ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         At index 0 diff: 'b' != 'ä'
E         Full diff:
E         - ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         + ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3]

tests/test_natsorted.py:318: AssertionError
____________________________________ test_natsorted_handles_mixed_types_with_locale[ns.PATH-expected4] ____________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = , expected = ['0', 1.5, '2', 3, 'ä', 'Ä', ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
E         + ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä']

tests/test_natsorted.py:318: AssertionError
___________________________________________ test_natsorted_can_sort_locale_specific_numbers_en ____________________________________________

    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_can_sort_locale_specific_numbers_en() -> None:
        given = ["c", "a5,467.86", "ä", "b", "a5367.86", "a5,6", "a5,50"]
        expected = ["a5,6", "a5,50", "a5367.86", "a5,467.86", "ä", "b", "c"]
>       assert natsorted(given, alg=ns.LOCALE | ns.F) == expected
E       AssertionError: assert ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'b', 'c', 'ä'] == ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'ä', 'b', 'c']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'ä', 'b', 'c']
E         ?                                            -----
E         + ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'b', 'c', 'ä']
E         ?                                                    +++++

tests/test_natsorted.py:272: AssertionError
___________________________________________ test_natsorted_can_sort_locale_specific_numbers_de ____________________________________________

    @pytest.mark.usefixtures("with_locale_de_de")
    def test_natsorted_can_sort_locale_specific_numbers_de() -> None:
        given = ["c", "a5.467,86", "ä", "b", "a5367.86", "a5,6", "a5,50"]
        expected = ["a5,50", "a5,6", "a5367.86", "a5.467,86", "ä", "b", "c"]
>       assert natsorted(given, alg=ns.LOCALE | ns.F) == expected
E       AssertionError: assert ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'b', 'c', 'ä'] == ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'ä', 'b', 'c']
E         At index 4 diff: 'b' != 'ä'
E         Full diff:
E         - ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'ä', 'b', 'c']
E         ?                                            -----
E         + ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'b', 'c', 'ä']
E         ?                                                    +++++

tests/test_natsorted.py:279: AssertionError
__________________________________ test_natsorted_handles_mixed_types_with_locale[ns.NUMAFTER-expected1] __________________________________

mixed_list = ['Ä', '0', 'ä', 3, 'b', 1.5, ...], alg = , expected = ['ä', 'Ä', 'b', 'Z', '0', 1.5, ...]

    @pytest.mark.parametrize(
        "alg, expected",
        [
            (ns.DEFAULT, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
            # Adding PATH changes nothing.
            (ns.PATH, ["0", 1.5, "2", 3, "ä", "Ä", "b", "Z"]),
            (ns.PATH | ns.NUMAFTER, ["ä", "Ä", "b", "Z", "0", 1.5, "2", 3]),
            (ns.PATH | ns.UNGROUPLETTERS, ["0", 1.5, "2", 3, "Ä", "Z", "ä", "b"]),
            (ns.PATH | ns.UG | ns.NA, ["Ä", "Z", "ä", "b", "0", 1.5, "2", 3]),
        ],
    )
    @pytest.mark.usefixtures("with_locale_en_us")
    def test_natsorted_handles_mixed_types_with_locale(
        mixed_list: List[Union[str, int, float]],
        alg: NSType,
        expected: List[Union[str, int, float]],
    ) -> None:
>       assert natsorted(mixed_list, alg=ns.LOCALE | alg) == expected
E       AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         At index 0 diff: 'b' != 'ä'
E         Full diff:
E         - ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
E         + ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3]

tests/test_natsorted.py:318: AssertionError
========================================================= short test summary info =========================================================
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.DEFAULT-expected0] - AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[4104-expected5] - AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.PATH-expected4] - AssertionError: assert ['0', 1.5, '2', 3, 'b', 'Z', 'ä', 'Ä'] == ['0', 1.5, '2', 3, 'ä', 'Ä', 'b', 'Z']
FAILED tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_en - AssertionError: assert ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'b', 'c', 'ä'] == ['a5,6', 'a5,50', 'a5367.86', 'a5,467.86', 'ä', 'b...
FAILED tests/test_natsorted.py::test_natsorted_can_sort_locale_specific_numbers_de - AssertionError: assert ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'b', 'c', 'ä'] == ['a5,50', 'a5,6', 'a5367.86', 'a5.467,86', 'ä', 'b...
FAILED tests/test_natsorted.py::test_natsorted_handles_mixed_types_with_locale[ns.NUMAFTER-expected1] - AssertionError: assert ['b', 'Z', 'ä', 'Ä', '0', 1.5, '2', 3] == ['ä', 'Ä', 'b', 'Z', '0', 1.5, '2', 3]
================================================ 6 failed, 326 passed, 1 skipped in 13.20s ================================================

@nieder
Author

nieder commented Sep 7, 2024

It seems that installing pyicu fixed the problem:

====================================================================== 333 passed in 12.98s ======================================================================

I will try a few other variations, but this is most likely fixed. Perhaps pyicu should be made a required dependency (not just an extra) on macOS?
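
For readers wondering why installing pyicu would change the result: a locale-aware sort needs a collation key function, and that key can come either from ICU or from the C library. The sketch below illustrates the general idea only; it is not natsort's actual implementation, and `make_collation_key` is a hypothetical name. It assumes PyICU exposes `icu.Locale`, `icu.Collator.createInstance`, and `Collator.getSortKey`, and falls back to `locale.strxfrm` otherwise.

```python
import locale


def make_collation_key(locale_name="en_US"):
    """Return a sort-key function, preferring ICU over the C library."""
    try:
        import icu  # provided by the PyICU package
    except ImportError:
        # No ICU: rely on the C library's collation for the current locale.
        return locale.strxfrm
    # ICU ships its own collation data, so results do not depend
    # on the operating system's locale tables.
    collator = icu.Collator.createInstance(icu.Locale(locale_name))
    return collator.getSortKey


if __name__ == "__main__":
    key = make_collation_key()
    print(sorted(["b", "Z", "ä", "Ä"], key=key))
```

Because ICU carries its own collation rules, a setup like this would sidestep any quirks in the platform's `strxfrm`, which is consistent with the tests passing once pyicu is installed.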

@SethMMorton
Owner

I cannot reproduce the problem. I am running on macOS 14.6.1 without pyicu installed.

You say you are on macOS 10.14.6 - that sounds very old, are you sure that is the correct version?

@SethMMorton
Owner

You say you are on macOS 10.14.6 - that sounds very old, are you sure that is the correct version?

Ah, silly me, it's just that sometimes the leading 10 is omitted, so we are likely on the same version.

@nieder
Author

nieder commented Sep 8, 2024

Nope. Definitely on 10.14.6

I have access to a macOS 13 system and will try it there later, with and without pyicu.

@nieder
Author

nieder commented Sep 8, 2024

Tested on a macOS 13.6.3 machine. All tests pass there without pyicu installed.
So the native locale library on older macOS seems to behave differently from the one on more modern releases? I saw several other closed issues referencing FreeBSD, and since some of the underlying macOS components are BSD-derived, perhaps that is a common thread that has since diverged or improved?

@SethMMorton
Owner

I did some digging. The special handling that attempts to fix sorting on macOS was implemented by 2015. It looks like Mojave (10.14) was released in 2018. natsort unit tests worked fine all through that time, so I'm not confident that the OS version is responsible.
