improve ergonomics of test execution #771

Open
vbarua opened this issue Jan 9, 2025 · 6 comments

@vbarua (Member) commented Jan 9, 2025

The test format implementation is difficult to work with for a number of reasons.

Cannot Run Coverage Code Locally

Attempting to run

pytest tests/test_extensions.py::test_substrait_extension_coverage

locally results in errors like

__________________________________________________ ERROR collecting tests/test_extensions.py __________________________________________________
ImportError while importing test module '/Users/victor.barua/scratch/substrait/tests/test_extensions.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/test_extensions.py:4: in <module>
    from tests.coverage.case_file_parser import load_all_testcases
tests/coverage/case_file_parser.py:4: in <module>
    from antlr4 import CommonTokenStream, FileStream
E   ModuleNotFoundError: No module named 'antlr4'
=========================================================== 1 error in 0.02 seconds ===========================================================
ERROR: not found: /Users/victor.barua/scratch/substrait/tests/test_extensions.py::test_substrait_extension_coverage
(no name '/Users/victor.barua/scratch/substrait/tests/test_extensions.py::test_substrait_extension_coverage' in any of [<Module test_extensions.py>])

even with the antlr4 runtime available:

➜ pip3 freeze
antlr4-python3-runtime==4.13.2
...

As a result I cannot test changes locally, and because this is included as a pre-commit hook, none of my commits are able to use the hooks.

Adding Tests Requires Mucking With Coverage Asserts

When trying to fix the lints in #769, which came from adding coverage to existing tests, I had to muck with some of the numbers in:

# NOTE: this test is run as part of pre-commit hook
def test_substrait_extension_coverage():
    script_dir = os.path.dirname(os.path.abspath(__file__))
    extensions_path = os.path.join(script_dir, "../extensions")
    registry = Extension.read_substrait_extensions(extensions_path)
    assert len(registry.registry) >= 161
    num_overloads = sum([len(f) for f in registry.registry.values()])
    assert num_overloads >= 510
    assert len(registry.dependencies) >= 13
    assert len(registry.scalar_functions) >= 162
    assert len(registry.aggregate_functions) >= 29
    assert len(registry.window_functions) >= 0
    test_case_dir = os.path.join(script_dir, "./cases")
    all_test_files = load_all_testcases(test_case_dir)
    coverage = get_test_coverage(all_test_files, registry)
    assert coverage.test_count >= 1077
    assert (
        coverage.num_tests_with_no_matching_function == 0
    ), f"{coverage.num_tests_with_no_matching_function} tests with no matching function"
    assert coverage.num_covered_function_variants >= 226
    assert coverage.total_function_variants >= 513
    assert (
        coverage.total_function_variants - coverage.num_covered_function_variants
    ) <= 287, (
        f"Coverage gap too large: {coverage.total_function_variants - coverage.num_covered_function_variants} "
        f"function variants with no tests, out of {coverage.total_function_variants} total function variants."
    )
It's not clear what I need to modify based on the error, and it's hard to verify what needs to change because I can't run this locally.
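
A small script along these lines could at least print the current values so it's clearer which asserts need bumping. This is only a sketch that reuses the helpers the test already calls; the module paths for Extension and get_test_coverage are assumptions, and it assumes it is run from the repository root.

import os

from tests.coverage.case_file_parser import load_all_testcases
from tests.coverage.extensions import Extension          # assumed module path
from tests.coverage.coverage import get_test_coverage    # assumed module path

# Mirror the paths the test builds relative to tests/test_extensions.py.
script_dir = os.path.abspath("tests")
registry = Extension.read_substrait_extensions(os.path.join(script_dir, "../extensions"))
all_test_files = load_all_testcases(os.path.join(script_dir, "cases"))
coverage = get_test_coverage(all_test_files, registry)

# Print the quantities the asserts compare against.
print("functions:", len(registry.registry))
print("overloads:", sum(len(f) for f in registry.registry.values()))
print("dependencies:", len(registry.dependencies))
print("scalar functions:", len(registry.scalar_functions))
print("aggregate functions:", len(registry.aggregate_functions))
print("window functions:", len(registry.window_functions))
print("test count:", coverage.test_count)
print("covered variants:", coverage.num_covered_function_variants)
print("total variants:", coverage.total_function_variants)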

Desired State

In order of increasing opinionatedness

  1. If we're going to have Python tooling, we should have a way to configure environments reproducibly (i.e. requirements.txt/Pipfiles/etc.) so that folks can execute tests and hooks locally.
  2. We should separate out the execution of the tests from coverage (a sketch of one possible shape follows this list).
  3. CI should block on test failures, but not on coverage changes. Coverage should be a report that we can look at.
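
For item 2, one possible shape (a sketch only, not how the repo is set up today) is to mark the coverage check so that a plain pytest run skips it and CI or the pre-commit hook opts in explicitly:

import pytest

# Sketch: tag the coverage baseline check with a custom marker so that
# `pytest` runs only the functional tests by default and the coverage
# check is selected explicitly with `pytest -m coverage`. The marker name
# is an assumption and would need to be registered in pytest configuration
# (e.g. pyproject.toml or pytest.ini) to avoid unknown-marker warnings.

@pytest.mark.coverage
def test_substrait_extension_coverage():
    ...  # the existing baseline asserts would live here unchanged

Local runs could then default to -m "not coverage" via addopts, while the CI job and the pre-commit hook invoke pytest -m coverage.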
@jacques-n (Contributor)

@scgkiran, can you or someone on the team pick up the first two items? I think the third is more complicated to get right.

@jacques-n (Contributor)

For item 2, let's make it so that if the test fails against the baselines, it reports the new baselines versus the old, and the file that should be updated if the baselines are wrong.
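
One hypothetical shape for that, with the baseline numbers pulled out of the asserts into a data file; the file path and helper name are made up for illustration:

import json

BASELINE_FILE = "tests/coverage/baselines.json"  # hypothetical location

def check_baselines(observed: dict) -> None:
    # Compare observed values against the recorded baselines and, on failure,
    # report old vs. new and which file to update if the change is expected.
    with open(BASELINE_FILE) as f:
        expected = json.load(f)
    drifted = {k: (expected.get(k), v) for k, v in observed.items() if expected.get(k) != v}
    assert not drifted, (
        "Coverage baselines changed: "
        + ", ".join(f"{k}: {old} -> {new}" for k, (old, new) in drifted.items())
        + f". If the new numbers are expected, update {BASELINE_FILE}."
    )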

@jacques-n (Contributor)

For item 3, my thinking has historically been that we update the code so it prints out some kind of "coverage file" that Codecov understands, so we can use Codecov's capabilities for baselining, etc. rather than having to build that part in GitHub Actions ourselves. Would love it if someone was excited to pick that up!
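
As a rough illustration only: the coverage computation already has the per-variant numbers, so the test (or a separate script) could dump them to an intermediate file for a CI step to convert and upload. The schema below is a placeholder; the exact report formats Codecov accepts would need to be checked.

import json

def write_coverage_report(coverage, path="function-coverage.json"):
    # Dump function-variant coverage to JSON as an intermediate artifact.
    # A CI step would still have to translate this into a format Codecov
    # actually ingests; treat this schema as a placeholder.
    summary = {
        "covered_function_variants": coverage.num_covered_function_variants,
        "total_function_variants": coverage.total_function_variants,
        "test_count": coverage.test_count,
    }
    with open(path, "w") as f:
        json.dump(summary, f, indent=2)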

@scgkiran (Contributor)

For item 2, I've raised a PR.

@scgkiran (Contributor)

For item 1:

pytest is now part of the PR workflow.
Locally I installed the ANTLR runtime using pip install antlr4-python3-runtime==4.13.2. I don't remember doing anything else.

Based on the PR workflow, requirements.txt will look like:

black==22.3.0
flake8==7.0.0
pytest
antlr4-python3-runtime==4.13.2

pip3 install -r requirements.txt

@vbarua can you try this? If it works, try running the tests in the same terminal.

@vbarua (Member, Author) commented Jan 29, 2025

Using that as a base, I was able to get the tests running locally. I've added some additional Python configuration to the build to make this more repeatable for future users in #781.
