ENH: make tests not conflicting and thus runnable in parallel with pytest-xdist #194

Closed
yarikoptic opened this issue Aug 13, 2020 · 13 comments

@yarikoptic (Member)

ATM it seems there is some peculiarity to be resolved about the session-level fixture:

$> python -m pytest -n 5 -v dandi/                                                                        
================================================= test session starts ==================================================
platform linux -- Python 3.8.5, pytest-4.6.9, py-1.8.1, pluggy-0.13.0 -- /home/yoh/proj/dandi/dandi-cli/venvs/dev3/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/yoh/proj/dandi/dandi-cli/.hypothesis/examples')
rootdir: /home/yoh/proj/dandi/dandi-cli, inifile: tox.ini
plugins: mock-3.2.0, cov-2.10.0, xdist-1.32.0, timeout-1.3.3, forked-1.1.3, hypothesis-4.36.2
[gw0] linux Python 3.8.5 cwd: /home/yoh/proj/dandi/dandi-cli
[gw1] linux Python 3.8.5 cwd: /home/yoh/proj/dandi/dandi-cli
[gw2] linux Python 3.8.5 cwd: /home/yoh/proj/dandi/dandi-cli
[gw3] linux Python 3.8.5 cwd: /home/yoh/proj/dandi/dandi-cli
[gw4] linux Python 3.8.5 cwd: /home/yoh/proj/dandi/dandi-cli
[gw0] Python 3.8.5 (default, Jul 20 2020, 18:32:44)  -- [GCC 9.3.0]
[gw1] Python 3.8.5 (default, Jul 20 2020, 18:32:44)  -- [GCC 9.3.0]
[gw2] Python 3.8.5 (default, Jul 20 2020, 18:32:44)  -- [GCC 9.3.0]
[gw3] Python 3.8.5 (default, Jul 20 2020, 18:32:44)  -- [GCC 9.3.0]
[gw4] Python 3.8.5 (default, Jul 20 2020, 18:32:44)  -- [GCC 9.3.0]
gw0 [66] / gw1 [66] / gw2 [66] / gw3 [66] / gw4 [66]
scheduling tests via LoadScheduling
...
[gw4] [ 33%] PASSED dandi/tests/test_download.py::test_download_multiple_files 
dandi/tests/test_download.py::test_download_000027[https://deploy-preview-341--gui-dandiarchive-org.netlify.app/#/dandiset/000027/0.200721.2222] 
[gw3] [ 34%] ERROR dandi/tests/test_girder.py::test_lock_dandiset 
dandi/tests/test_organize.py::test_organize_nwb_test_data[symlink] 
[gw4] [ 36%] PASSED dandi/tests/test_download.py::test_download_000027[https://deploy-preview-341--gui-dandiarchive-org.netlify.app/#/dandiset/000027/0.200721.2222] 
...
[gw3] [ 53%] PASSED dandi/tests/test_organize.py::test_organize_nwb_test_data[symlink-relative] 
dandi/tests/test_upload.py::test_upload 
[gw3] [ 54%] ERROR dandi/tests/test_upload.py::test_upload 
dandi/tests/test_utils.py::test_find_files_dotfiles 
...
============================================================================================================= ERRORS ==============================================================================================================
______________________________________________________________________________________________ ERROR at setup of test_lock_dandiset _______________________________________________________________________________________________
[gw3] linux -- Python 3.8.5 /home/yoh/proj/dandi/dandi-cli/venvs/dev3/bin/python

    @pytest.fixture(scope="session")
    def local_docker_compose():
        # Check that we're running on a Unix-based system (Linux or macOS), as the
        # Docker images don't work on Windows.
        if os.name != "posix":
            pytest.skip("Docker images require Unix host")
        skipif.no_network()
        skipif.no_docker_engine()
    
        instance_id = "local-docker-tests"
        instance = known_instances[instance_id]
    
>       run(["docker-compose", "up", "-d"], cwd=str(LOCAL_DOCKER_DIR), check=True)

dandi/tests/fixtures.py:122: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

input = None, capture_output = False, timeout = None, check = True, popenargs = (['docker-compose', 'up', '-d'],), kwargs = {'cwd': '/home/yoh/proj/dandi/dandi-cli/dandi/tests/data/dandiarchive-docker'}
process = <subprocess.Popen object at 0x7efca13e0ee0>, stdout = None, stderr = None, retcode = 1

    def run(*popenargs,
            input=None, capture_output=False, timeout=None, check=False, **kwargs):
        """Run command with arguments and return a CompletedProcess instance.
    
        The returned instance will have attributes args, returncode, stdout and
        stderr. By default, stdout and stderr are not captured, and those attributes
        will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.
    
        If check is True and the exit code was non-zero, it raises a
        CalledProcessError. The CalledProcessError object will have the return code
        in the returncode attribute, and output & stderr attributes if those streams
        were captured.
    
        If timeout is given, and the process takes too long, a TimeoutExpired
        exception will be raised.
    
        There is an optional argument "input", allowing you to
        pass bytes or a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it will be used internally.
    
        By default, all communication is in bytes, and therefore any "input" should
        be bytes, and the stdout and stderr will be bytes. If in text mode, any
        "input" should be a string, and stdout and stderr will be strings decoded
        according to locale encoding, or by "encoding" if set. Text mode is
        triggered by setting any of text, encoding, errors or universal_newlines.
    
        The other arguments are the same as for the Popen constructor.
        """
        if input is not None:
            if kwargs.get('stdin') is not None:
                raise ValueError('stdin and input arguments may not both be used.')
            kwargs['stdin'] = PIPE
    
        if capture_output:
            if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
                raise ValueError('stdout and stderr arguments may not be used '
                                 'with capture_output.')
            kwargs['stdout'] = PIPE
            kwargs['stderr'] = PIPE
    
        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(input, timeout=timeout)
            except TimeoutExpired as exc:
                process.kill()
                if _mswindows:
                    # Windows accumulates the output in a single blocking
                    # read() call run on child threads, with the timeout
                    # being done in a join() on those threads.  communicate()
                    # _after_ kill() is required to collect that and add it
                    # to the exception.
                    exc.stdout, exc.stderr = process.communicate()
                else:
                    # POSIX _communicate already populated the output so
                    # far into the TimeoutExpired exception.
                    process.wait()
                raise
            except:  # Including KeyboardInterrupt, communicate handled that.
                process.kill()
                # We don't call process.wait() as .__exit__ does that for us.
                raise
            retcode = process.poll()
            if check and retcode:
>               raise CalledProcessError(retcode, process.args,
                                         output=stdout, stderr=stderr)
E               subprocess.CalledProcessError: Command '['docker-compose', 'up', '-d']' returned non-zero exit status 1.

/usr/lib/python3.8/subprocess.py:512: CalledProcessError
------------------------------------------------------------------------------------------------------ Captured stderr setup ------------------------------------------------------------------------------------------------------
WARNING: No swap limit support
No such container: ccf93257334efa51ef4d34b8e9e84f2099354ecf6b95e02b04c13f29544dab4b
__________________________________________________________________________________________________ ERROR at setup of test_upload __________________________________________________________________________________________________
[gw3] linux -- Python 3.8.5 /home/yoh/proj/dandi/dandi-cli/venvs/dev3/bin/python

tp = <class 'subprocess.CalledProcessError'>, value = None, tb = None

    def reraise(tp, value, tb=None):
        try:
            if value is None:
                value = tp()
            if value.__traceback__ is not tb:
>               raise value.with_traceback(tb)

venvs/dev3/lib/python3.8/site-packages/six.py:702: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
dandi/tests/fixtures.py:122: in local_docker_compose
    run(["docker-compose", "up", "-d"], cwd=str(LOCAL_DOCKER_DIR), check=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

(same subprocess.run frame as in the previous error, ending with)
E               subprocess.CalledProcessError: Command '['docker-compose', 'up', '-d']' returned non-zero exit status 1.

/usr/lib/python3.8/subprocess.py:512: CalledProcessError
@jwodder (Member) commented Aug 13, 2020

I seem to have gotten the local_docker_compose fixture to handle being run in parallel using a recipe from the pytest-xdist README, but then the tests fail because multiple tests are trying to lock Dandiset 000001 at once. I don't think this test suite is parallelizable. (Would you like me to push my code anyway?)
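
For reference, the pytest-xdist README recipe works roughly like this when adapted to our fixture (a sketch only, not the actual code I wrote; start_docker_archive() is a hypothetical stand-in for the real docker-compose setup, and LOCAL_DOCKER_DIR would come from dandi/tests/fixtures.py):

import json
from subprocess import run

import pytest
from filelock import FileLock  # the lock implementation the xdist docs suggest

# Hypothetical placeholder; the real value lives in dandi/tests/fixtures.py.
LOCAL_DOCKER_DIR = "dandi/tests/data/dandiarchive-docker"


def start_docker_archive():
    # Hypothetical stand-in for the real setup; returns something
    # JSON-serializable so that other workers can reuse the result.
    run(["docker-compose", "up", "-d"], cwd=str(LOCAL_DOCKER_DIR), check=True)
    return {"instance_id": "local-docker-tests"}


@pytest.fixture(scope="session")
def local_docker_compose(tmp_path_factory, worker_id):
    if worker_id == "master":
        # Not running under xdist: just perform the setup directly.
        return start_docker_archive()

    # Under xdist every worker gets its own session, so serialize the setup
    # through a lock file in a directory shared by all workers; the first
    # worker to take the lock does the work and records the result for reuse.
    root_tmp_dir = tmp_path_factory.getbasetemp().parent
    fn = root_tmp_dir / "docker_compose.json"
    with FileLock(str(fn) + ".lock"):
        if fn.is_file():
            data = json.loads(fn.read_text())
        else:
            data = start_docker_archive()
            fn.write_text(json.dumps(data))
    return data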

@yarikoptic (Member, Author)

And is there no way to, e.g., create a fixture (e.g., needs_lock) that could be used by only a single test at a time? An alternative is to use different Dandisets across tests where locking is needed, but that would probably be a suboptimal burden.

@jwodder (Member) commented Aug 13, 2020

I managed to write a needs_lock fixture that holds a lock on a lockfile while the test runs, and the tests do seem to be succeeding, but now, when running in parallel, the coverage is coming out at 10%. Basic googling doesn't offer an explanation.
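
A minimal sketch of what such a fixture could look like, assuming the filelock package (the actual implementation may differ, and the lock-file name here is illustrative):

import pytest
from filelock import FileLock


@pytest.fixture()
def needs_lock(tmp_path_factory):
    # The parent of the session temp directory is shared by all xdist
    # workers, so a lock file there serializes every test requesting this
    # fixture -- only one of them can hold the Dandiset at a time.
    lock_file = tmp_path_factory.getbasetemp().parent / "dandiset-000001.lock"
    with FileLock(str(lock_file)):
        yield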

Edit: Coverage remains at 80% if I use pytest-cov instead of plain coverage, but then there are a dozen .coverage.$HOSTNAME.#.# files left lying around after the tests are done, with or without parallelization.

@yarikoptic (Member, Author)

Great!

Re coverage files: maybe something here could come in handy? pytest-dev/pytest-cov#250 (comment)

@jwodder (Member) commented Aug 14, 2020

The page linked to there seems to have moved to https://pytest-cov.readthedocs.io/en/latest/subprocess-support.html. The only element of multiprocessing we use directly is Lock; do any of our dependencies use multiprocessing?
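
For the record, those .coverage.$HOSTNAME.#.# files are what coverage.py writes in parallel mode (one data file per process), and they are meant to be merged afterwards rather than left lying around; roughly (a sketch, not our actual config):

# .coveragerc
[run]
parallel = True
concurrency = multiprocessing

$> coverage combine   # merges the per-process .coverage.* files into .coverage
$> coverage report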

@yarikoptic (Member, Author)

joblib? (#191)

@jwodder (Member) commented Aug 14, 2020

This says joblib.Parallel uses loky by default, which in turn seems to be a mix of multiprocessing and custom code.

@yarikoptic (Member, Author)

BTW, if it becomes too "involved" to make it work smoothly, maybe it is not worth it at this point? Getting tests to run fast via parallelization is a great DX feature to have, but our tests aren't that long ATM anyway, so it is not that critical; we could get back to it later when a more pressing need arises.

@yarikoptic (Member, Author)

> This says joblib.Parallel uses loky by default, which in turn seems to be a mix of multiprocessing and custom code.

There is also use of multiprocessing inside joblib itself:

$> git grep multiprocessing | grep import | grep -v /loky/
CHANGES.rst:    Fix joblib import setting the global start_method for multiprocessing.
continuous_integration/run_tests.sh:python -c "import multiprocessing as mp; print('multiprocessing.cpu_count():', mp.cpu_count())"
joblib/_memmapping_reducer.py:from multiprocessing import util
joblib/_multiprocessing_helpers.py:"""Helper module to factorize the conditional multiprocessing import logic
joblib/_multiprocessing_helpers.py:        import multiprocessing as mp
joblib/_multiprocessing_helpers.py:        from _multiprocessing import SemLock
joblib/_multiprocessing_helpers.py:    from multiprocessing.context import assert_spawning
joblib/_parallel_backends.py:from ._multiprocessing_helpers import mp
joblib/_parallel_backends.py:    from multiprocessing.pool import ThreadPool
joblib/_parallel_backends.py:    from multiprocessing import TimeoutError
joblib/backports.py:from multiprocessing import util
joblib/disk.py:from multiprocessing import util
joblib/parallel.py:from ._multiprocessing_helpers import mp
joblib/pool.py:This module should not be imported if multiprocessing is not
joblib/pool.py:from ._multiprocessing_helpers import mp, assert_spawning
joblib/pool.py:from multiprocessing.pool import Pool
joblib/test/common.py:from joblib._multiprocessing_helpers import mp
joblib/test/test_memmapping.py:from joblib.test.common import with_multiprocessing
joblib/test/test_memory.py:from joblib.test.common import with_multiprocessing
joblib/test/test_module.py:        import multiprocessing as mp
joblib/test/test_module.py:        from multiprocessing import semaphore_tracker
joblib/test/test_module.py:        msg = "multiprocessing.semaphore_tracker has been spawned on import"
joblib/test/test_parallel.py:from multiprocessing import TimeoutError
joblib/test/test_parallel.py:from joblib.test.common import with_multiprocessing
joblib/test/test_parallel.py:    from multiprocessing import get_context
joblib/test/test_parallel.py:        from multiprocessing import get_context
joblib/test/test_parallel.py:import multiprocessing as mp
joblib/test/test_store_backends.py:from joblib.test.common import with_multiprocessing

@yarikoptic (Member, Author)

Also -- we use joblib ATM as a very rudimentary mass-parallelization helper... at some point we might just code up our own, thus having better control over our needs and avoiding joblib's shortcomings (no generator support for Parallel yet).
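
For illustration, a minimal generator-friendly helper along those lines could be as small as this (a sketch using concurrent.futures; parallel_map is a hypothetical name, not anything in the codebase):

from concurrent.futures import ThreadPoolExecutor, as_completed


def parallel_map(func, items, n_jobs=5):
    # Run func over items with a bounded worker pool, yielding results as
    # they complete instead of collecting them all into a list first --
    # i.e. the generator support that joblib.Parallel lacks.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        futures = [pool.submit(func, item) for item in items]
        for fut in as_completed(futures):
            yield fut.result()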

@satra (Member) commented Aug 14, 2020

Consider pydra as well, depending on needs.

@jwodder (Member) commented Aug 14, 2020

If I disable joblib.Parallel by setting DANDI_DEVEL=1 and adding the --devel-debug option to the command in test_organize_nwb_test_data, testing with pytest-cov still creates a bunch of (though fewer) .coverage.$HOSTNAME.#.# files, so it seems that something else is also responsible.

@yarikoptic added the "no solution identified" label Sep 14, 2020
@yarikoptic (Member, Author)

No solution was identified, so let's let it rest in peace and reopen if the need arises.
