Minimum dependency test job #2816

Merged
merged 60 commits into from
Feb 15, 2024
Changes from 59 commits
Commits
141eb6a
Start mindeps
ivirshup Jan 17, 2024
a07568e
Fix check_is_fitted import
ivirshup Jan 17, 2024
705acfe
Temporarilly bump anndata dep to access test utilities
ivirshup Jan 17, 2024
9a0dd1a
Support for numpy 1.23 where np.equal didn't work on strings
ivirshup Jan 17, 2024
e4dbcbc
Fix palette color mapping for pandas < 2.1
ivirshup Jan 17, 2024
355c904
Bump networkx
ivirshup Jan 17, 2024
a8bd01b
Exit on error for test script
ivirshup Jan 17, 2024
d36b977
Bump numba for numpy compat
ivirshup Jan 17, 2024
6b4823c
update ci
ivirshup Jan 18, 2024
efa8a39
Fix array comparison in both envs
ivirshup Jan 18, 2024
6b7a37f
Bump statsmodels version
ivirshup Jan 18, 2024
47beebe
Test returns different plot type with older dependencies
ivirshup Jan 18, 2024
1a5d701
Skip test that relies on pd.value_counts
ivirshup Jan 18, 2024
4b04a76
Try to use better naming test results in CI
ivirshup Jan 22, 2024
27fb272
Temporarily bump pandas
ivirshup Jan 22, 2024
4e61bea
Add dependency on pynndescent, bump packaging version
ivirshup Jan 22, 2024
ed71b11
skip doctest for dendrogram
ivirshup Jan 22, 2024
cb95628
install pre-commit in env
ivirshup Jan 22, 2024
f29be77
Bump networkx
ivirshup Jan 22, 2024
483129b
Merge branch 'master' into mindeps
ivirshup Jan 26, 2024
199cefc
Merge branch 'master' into mindeps
ivirshup Jan 29, 2024
098fea3
Get tests to collect with an old anndata version
ivirshup Jan 29, 2024
1cb9396
Fix most preprocessing tests (account for old anndata constructor)
ivirshup Jan 29, 2024
079ad46
Merge branch 'master' into mindeps
ivirshup Jan 29, 2024
5d7494a
Bump anndata min version to 0.7.8
ivirshup Jan 29, 2024
cadd7db
Fix pytest_itemcollected
ivirshup Jan 29, 2024
4a7784c
Bump min anndata version to 0.8
ivirshup Jan 29, 2024
f9cad08
Fix test_get.py cases
ivirshup Jan 29, 2024
3b8db82
Fix neighbor test
ivirshup Jan 29, 2024
78c7ee9
Fix dendrogram plotting cases
ivirshup Jan 29, 2024
e46911c
fix stacked violin ordering
ivirshup Jan 29, 2024
e01eb01
Bump tolerance for older versions of numba
ivirshup Jan 29, 2024
aa2dd50
Fix ordering for matrixplot
ivirshup Jan 29, 2024
e598ce7
Fix preprocessing tests
ivirshup Feb 7, 2024
debfc4b
xfail masking test for anndata 0.8
ivirshup Feb 7, 2024
0112aa5
Merge branch 'master' into mindeps
ivirshup Feb 7, 2024
70e5866
Fix order
flying-sheep Feb 8, 2024
bbdddf2
Fix min-deps.py
flying-sheep Feb 8, 2024
eef7055
Discard changes to scanpy/plotting/_utils.py
flying-sheep Feb 8, 2024
70d151d
removed TODOs from min-deps.py
ivirshup Feb 8, 2024
cb36a62
Remove dev script
ivirshup Feb 8, 2024
2bf0789
Merge branch 'master' into mindeps
ivirshup Feb 8, 2024
07f6d57
Rename test jobs to be more identifiable
ivirshup Feb 12, 2024
5090fff
Use marker for xfail
ivirshup Feb 12, 2024
907544e
Add warning for PCA order
ivirshup Feb 12, 2024
208d413
Fix usage of pytest.mark.xfail
ivirshup Feb 12, 2024
826d3dd
Remove commented out code from CI job
ivirshup Feb 12, 2024
e4ee55d
Obey signature test
ivirshup Feb 13, 2024
2b74f41
Merge branch 'master' into mindeps
ivirshup Feb 13, 2024
aa4eda0
Merge branch 'master' into mindeps
ivirshup Feb 13, 2024
0ee1ced
Merge branch 'master' into mindeps
ivirshup Feb 13, 2024
264aa9a
Don't error on warning for dask.dataframe
ivirshup Feb 13, 2024
6693773
update dask version
ivirshup Feb 15, 2024
b412bfb
fix dask version better
ivirshup Feb 15, 2024
bacb2e7
Merge branch 'master' into mindeps
ivirshup Feb 15, 2024
b6afd99
Fix view issue with anndata==0.8
ivirshup Feb 15, 2024
c7d95b6
Typo
ivirshup Feb 15, 2024
0ddc5e2
Release note
ivirshup Feb 15, 2024
b653dd6
coverage for min deps
ivirshup Feb 15, 2024
2b2ec5d
fix coverage for minimum-version install
ivirshup Feb 15, 2024
40 changes: 23 additions & 17 deletions .azure-pipelines.yml
@@ -6,10 +6,9 @@ variables:
   python.version: '3.11'
   PIP_CACHE_DIR: $(Pipeline.Workspace)/.pip
   PYTEST_ADDOPTS: '-v --color=yes --durations=0 --nunit-xml=test-data/test-results.xml'
-  ANNDATA_DEV: no
-  RUN_COVERAGE: no
   TEST_EXTRA: 'test-full'
-  PRERELEASE_DEPENDENCIES: no
+  DEPENDENCIES_VERSION: "latest" # |"pre-release" | "minimum-version"
+  TEST_TYPE: "standard" # | "coverage"

 jobs:
 - job: PyTest
@@ -20,12 +19,16 @@ jobs:
       Python3.9:
         python.version: '3.9'
       Python3.11: {}
-      minimal_tests:
+      minimal_dependencies:
         TEST_EXTRA: 'test-min'
       anndata_dev:
-        ANNDATA_DEV: yes
-        RUN_COVERAGE: yes
-        PRERELEASE_DEPENDENCIES: yes
+        DEPENDENCIES_VERSION: "pre-release"
+        TEST_TYPE: "coverage"
+      minimum_versions:
+        python.version: '3.9'
+        DEPENDENCIES_VERSION: "minimum-version"
+        TEST_TYPE: "coverage"
+

   steps:
   - task: UsePythonVersion@0
@@ -52,51 +55,54 @@ jobs:
       pip install wheel coverage
       pip install .[dev,$(TEST_EXTRA)]
     displayName: 'Install dependencies'
-    condition: eq(variables['PRERELEASE_DEPENDENCIES'], 'no')
+    condition: eq(variables['DEPENDENCIES_VERSION'], 'latest')

   - script: |
       python -m pip install --pre --upgrade pip
       pip install --pre wheel coverage
       pip install --pre .[dev,$(TEST_EXTRA)]
+      pip install -v "anndata[dev,test] @ git+https://github.com/scverse/anndata"
     displayName: 'Install dependencies release candidates'
-    condition: eq(variables['PRERELEASE_DEPENDENCIES'], 'yes')
+    condition: eq(variables['DEPENDENCIES_VERSION'], 'pre-release')

   - script: |
-      pip install -v "anndata[dev,test] @ git+https://github.com/scverse/anndata"
-    displayName: 'Install development anndata'
-    condition: eq(variables['ANNDATA_DEV'], 'yes')
+      python -m pip install pip wheel tomli packaging
+      pip install `python3 ci/scripts/min-deps.py pyproject.toml --extra dev test`
+      pip install --no-deps .
+    displayName: 'Install dependencies minimum version'
+    condition: eq(variables['DEPENDENCIES_VERSION'], 'minimum-version')

   - script: |
       pip list
     displayName: 'Display installed versions'

   - script: pytest
     displayName: 'PyTest'
-    condition: eq(variables['RUN_COVERAGE'], 'no')
+    condition: eq(variables['TEST_TYPE'], 'standard')

   - script: |
       coverage run -m pytest
       coverage xml
     displayName: 'PyTest (coverage)'
-    condition: eq(variables['RUN_COVERAGE'], 'yes')
+    condition: eq(variables['TEST_TYPE'], 'coverage')

   - task: PublishCodeCoverageResults@1
     inputs:
       codeCoverageTool: Cobertura
       summaryFileLocation: 'test-data/coverage.xml'
       failIfCoverageEmpty: true
-    condition: eq(variables['RUN_COVERAGE'], 'yes')
+    condition: eq(variables['TEST_TYPE'], 'coverage')

   - task: PublishTestResults@2
     condition: succeededOrFailed()
     inputs:
       testResultsFiles: 'test-data/test-results.xml'
       testResultsFormat: NUnit
-      testRunTitle: 'Publish test results for Python $(python.version)'
+      testRunTitle: 'Publish test results for $(Agent.JobName)'

   - script: bash <(curl -s https://codecov.io/bash)
     displayName: 'Upload to codecov.io'
-    condition: eq(variables['RUN_COVERAGE'], 'yes')
+    condition: eq(variables['TEST_TYPE'], 'coverage')

 - job: CheckBuild
   pool:
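How the two new variables pick the install and test steps for each matrix entry can be modelled in a few lines of Python. This is only an illustrative sketch and not part of the pipeline: `DEFAULTS`, `MATRIX`, `INSTALL_STEP`, `TEST_STEP`, and `steps_for` are hypothetical names, and the values are copied from the `variables:` block, the job matrix, and the `condition:` lines in the diff above.

```python
from __future__ import annotations

# Illustrative model of the job matrix; values copied from the diff above.
DEFAULTS = {
    "python.version": "3.11",
    "TEST_EXTRA": "test-full",
    "DEPENDENCIES_VERSION": "latest",
    "TEST_TYPE": "standard",
}

MATRIX = {
    "Python3.9": {"python.version": "3.9"},
    "Python3.11": {},
    "minimal_dependencies": {"TEST_EXTRA": "test-min"},
    "anndata_dev": {"DEPENDENCIES_VERSION": "pre-release", "TEST_TYPE": "coverage"},
    "minimum_versions": {
        "python.version": "3.9",
        "DEPENDENCIES_VERSION": "minimum-version",
        "TEST_TYPE": "coverage",
    },
}

# displayName of the step whose `condition:` matches each variable value.
INSTALL_STEP = {
    "latest": "Install dependencies",
    "pre-release": "Install dependencies release candidates",
    "minimum-version": "Install dependencies minimum version",
}
TEST_STEP = {"standard": "PyTest", "coverage": "PyTest (coverage)"}


def steps_for(job: str) -> tuple[str, str]:
    """Return the (install, test) step names that would run for a matrix job."""
    variables = {**DEFAULTS, **MATRIX[job]}  # matrix entries override defaults
    return (
        INSTALL_STEP[variables["DEPENDENCIES_VERSION"]],
        TEST_STEP[variables["TEST_TYPE"]],
    )


for job in MATRIX:
    print(job, steps_for(job))
```

Using one enumerated variable per concern means exactly one install step and one test step match for any job, which the previous trio of yes/no flags did not guarantee.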
99 changes: 99 additions & 0 deletions ci/scripts/min-deps.py
@@ -0,0 +1,99 @@
+#!python3
+from __future__ import annotations
+
+import argparse
+import sys
+from collections import deque
+from pathlib import Path
+from typing import TYPE_CHECKING
+
+if sys.version_info >= (3, 11):
+    import tomllib
+else:
+    import tomli as tomllib
+
+from packaging.requirements import Requirement
+from packaging.version import Version
+
+if TYPE_CHECKING:
+    from collections.abc import Generator, Iterable
+
+
+def min_dep(req: Requirement) -> Requirement:
+    """
+    Given a requirement, return the minimum version specifier.
+
+    Example
+    -------
+
+    >>> min_dep(Requirement("numpy>=1.0"))
+    <Requirement('numpy==1.0.*')>
+    """
+    req_name = req.name
+    if req.extras:
+        req_name = f"{req_name}[{','.join(req.extras)}]"
+
+    if not req.specifier:
+        return Requirement(req_name)
+
+    min_version = Version("0.0.0.a1")
+    for spec in req.specifier:
+        if spec.operator in [">", ">=", "~="]:
+            min_version = max(min_version, Version(spec.version))
+        elif spec.operator == "==":
+            min_version = Version(spec.version)
+
+    return Requirement(f"{req_name}=={min_version}.*")
+
+
+def extract_min_deps(
+    dependencies: Iterable[Requirement], *, pyproject
+) -> Generator[Requirement, None, None]:
+    dependencies = deque(dependencies)  # We'll be mutating this
+    project_name = pyproject["project"]["name"]
+
+    while len(dependencies) > 0:
+        req = dependencies.pop()
+
+        # If we are referring to other optional dependency lists, resolve them
+        if req.name == project_name:
+            assert req.extras, f"Project included itself as dependency, without specifying extras: {req}"
+            for extra in req.extras:
+                extra_deps = pyproject["project"]["optional-dependencies"][extra]
+                dependencies += map(Requirement, extra_deps)
+        else:
+            yield min_dep(req)
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        prog="min-deps",
+        description="""Parse a pyproject.toml file and output a list of minimum dependencies.
+
+        Output is directly passable to `pip install`.""",
+        usage="pip install `python min-deps.py pyproject.toml`",
+    )
+    parser.add_argument(
+        "path", type=Path, help="pyproject.toml to parse minimum dependencies from"
+    )
+    parser.add_argument(
+        "--extras", type=str, nargs="*", default=(), help="extras to install"
+    )
+
+    args = parser.parse_args()
+
+    pyproject = tomllib.loads(args.path.read_text())
+
+    project_name = pyproject["project"]["name"]
+    deps = [
+        *map(Requirement, pyproject["project"]["dependencies"]),
+        *(Requirement(f"{project_name}[{extra}]") for extra in args.extras),
+    ]
+
+    min_deps = extract_min_deps(deps, pyproject=pyproject)
+
+    print(" ".join(map(str, min_deps)))
+
+
+if __name__ == "__main__":
+    main()
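The part of the script that is easiest to misread is the loop in `extract_min_deps`, which expands self-references such as `scanpy[dask]` into the extras they name. The sketch below restates that loop using only the standard library and plain strings instead of `Requirement` objects. `expand_extras` and the toy `OPTIONAL` table are hypothetical, and the string parsing assumes a self-reference is written exactly as `name[extra1,extra2]` with no version specifier.

```python
from __future__ import annotations

from collections import deque

# Toy optional-dependencies table, loosely modelled on pyproject.toml.
OPTIONAL = {
    "dask": ["dask[array]>=2022.09.2"],
    "dask-ml": ["dask-ml", "scanpy[dask]"],
}


def expand_extras(
    project_name: str,
    dependencies: list[str],
    optional: dict[str, list[str]],
    extras: list[str],
) -> list[str]:
    """Resolve requested extras, following references back to the project itself."""
    queue = deque([*dependencies, *(f"{project_name}[{e}]" for e in extras)])
    resolved = []
    prefix = f"{project_name}["
    while queue:
        req = queue.pop()
        if req.startswith(prefix):
            # Self-reference: replace it with the contents of the named extras.
            for extra in req[len(prefix) : -1].split(","):
                queue.extend(optional[extra.strip()])
        else:
            resolved.append(req)
    return resolved


print(expand_extras("scanpy", ["numpy>=1.23"], OPTIONAL, ["dask-ml"]))
```

Requesting `dask-ml` pulls in `scanpy[dask]`, which in turn is replaced by the `dask` extra, so the final list contains only third-party requirements that can then be pinned.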
1 change: 1 addition & 0 deletions docs/release-notes/1.10.0.md
@@ -35,6 +35,7 @@
 * Fix setting `sc.settings.verbosity` in some cases {pr}`2605` {smaller}`P Angerer`
 * Fix all remaining pandas warnings {pr}`2789` {smaller}`P Angerer`
 * Fix some annoying plotting warnings around violin plots {pr}`2844` {smaller}`P Angerer`
+* Scanpy now has a test job which tests against the minimum versions of the dependencies. In the process of implementing this, many bugs associated with using older versions of `pandas`, `anndata`, `numpy`, and `matplotlib` were fixed. {pr}`2816` {smaller}`I Virshup`

 ```{rubric} Ecosystem
 ```
28 changes: 15 additions & 13 deletions pyproject.toml
@@ -46,24 +46,25 @@ classifiers = [
     "Topic :: Scientific/Engineering :: Visualization",
 ]
 dependencies = [
-    "anndata>=0.7.4",
+    "anndata>=0.8",
     # numpy needs a version due to #1320
-    "numpy>=1.17.0",
+    "numpy>=1.23",
     "matplotlib>=3.6",
-    "pandas >=2.1.3",
-    "scipy>=1.4",
-    "seaborn>=0.13.0",
-    "h5py>=3",
+    "pandas >=1.5",
+    "scipy>=1.8",
+    "seaborn>=0.13",
+    "h5py>=3.1",
     "tqdm",
     "scikit-learn>=0.24",
-    "statsmodels>=0.10.0rc2",
+    "statsmodels>=0.13",
     "patsy",
-    "networkx>=2.3",
+    "networkx>=2.7",
     "natsort",
     "joblib",
-    "numba>=0.41.0",
+    "numba>=0.56",
     "umap-learn>=0.3.10",
-    "packaging",
+    "pynndescent>=0.5",
+    "packaging>=21.3",

Review comment (Member):
This is calver, but they have never issued more than one release per month so it doesn’t matter.

     "session-info",
     "legacy-api-wrap>=1.4", # for positional API deprecations
     "get-annotations; python_version < '3.10'",
@@ -132,8 +133,8 @@ dev = [
 ]
 # Algorithms
 paga = ["igraph"]
-louvain = ["igraph", "louvain>=0.6,!=0.6.2"] # Louvain community detection
-leiden = ["igraph>=0.10", "leidenalg>=0.9"] # Leiden community detection
+louvain = ["igraph", "louvain>=0.6.0,!=0.6.2"] # Louvain community detection
+leiden = ["igraph>=0.10", "leidenalg>=0.9.0"] # Leiden community detection

Review thread (comment on lines -135 to +137):

Member:
why this change? you do e.g `numba>=0.56` too

Member Author:
Basically because the `min-deps.py` script is hacky, and is appending `.*` to the end of the version it finds here. I am aiming for testing against the latest bugfix release within a release series (at least for projects that do semantic versioning).

Member:
That doesn’t sound good, I don’t think it should break if we specify `>=1.2.4` somewhere because that patch release has some patch we rely on.

Member Author:
It won't break, it will try to install `1.2.4.*` which finds `1.2.4`

Member (@flying-sheep, Feb 8, 2024):
But it won’t be what we want, that would be `>=1.2.4, <1.3`

Member Author:
> But I don’t like losing the ability to say `foo>=3.6.4`

I mean, we can still do that with the existing code. It just means the CI tests against that bugfix release specifically and not whatever is the latest from pypi.

Member Author:
I am not sure how we can both pin a minimum bugfix release and test against the latest in a release series from pypi. If all our dependencies followed semver, then this would be easy since we could just strip the bugfix release number from the specification when making the environment.

Member:
That means that we can’t do this unless we annotate for every project if it’s semver.

Member Author:
> That means that we can’t do this unless we annotate for every project if it’s semver.

Yes

Do you have a specific change here that you would like to see?

Member:
see below, this is too fundamental for a thread.

 bbknn = ["bbknn"] # Batch balanced KNN (batch correction)
 magic = ["magic-impute>=2.0"] # MAGIC imputation method
 skmisc = ["scikit-misc>=0.1.3"] # highly_variable_genes method 'seurat_v3'
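The disagreement in the review thread above comes down to which releases each specifier admits. A small check with `packaging.specifiers.SpecifierSet` makes the difference concrete. This is a sketch that assumes `packaging` is installed; `matching` is a hypothetical helper and the candidate version numbers are made up for illustration.

```python
from __future__ import annotations

from packaging.specifiers import SpecifierSet


def matching(specifier: str, candidates: list[str]) -> list[str]:
    """Return the candidate versions that satisfy a version specifier."""
    spec = SpecifierSet(specifier)
    return [version for version in candidates if spec.contains(version)]


candidates = ["1.2.3", "1.2.4", "1.2.4.1", "1.2.5", "1.3.0"]

# What the script emits for a three-component minimum such as `>=1.2.4`.
print(matching("==1.2.4.*", candidates))

# The alternative raised in review: newest bugfix release in the series.
print(matching(">=1.2.4,<1.3", candidates))

# A two-component minimum such as `numba>=0.56` becomes `==0.56.*`.
print(matching("==0.56.*", ["0.55.2", "0.56.0", "0.56.4", "0.57.0"]))
```

With a two-component minimum the `.*` suffix spans the whole bugfix series, while with a three-component minimum it effectively pins that one bugfix release. That is consistent with the change from `louvain>=0.6` to `louvain>=0.6.0` in this hunk.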
@@ -142,7 +143,7 @@ scanorama = ["scanorama"] # Scanorama dataset integration
 scrublet = ["scikit-image"] # Doublet detection with automatic thresholds
 # Acceleration
 rapids = ["cudf>=0.9", "cuml>=0.9", "cugraph>=0.9"] # GPU accelerated calculation of neighbors
-dask = ["dask[array]!=2.17.0"] # Use the Dask parallelization engine
+dask = ["dask[array]>=2022.09.2"] # Use the Dask parallelization engine
 dask-ml = ["dask-ml", "scanpy[dask]"] # Dask-ML for sklearn-like API

 [tool.hatch.build]
@@ -166,6 +167,7 @@ nunit_attach_on = "fail"
 markers = [
     "internet: tests which rely on internet resources (enable with `--internet-tests`)",
     "gpu: tests that use a GPU (currently unused, but needs to be specified here as we import anndata.tests.helpers, which uses it)",
+    "anndata_dask_support: tests that require dask support in anndata",
 ]
 filterwarnings = [
     # legacy-api-wrap: internal use of positional API
2 changes: 1 addition & 1 deletion scanpy/get/get.py
@@ -260,7 +260,7 @@ def obs_df(
     ... )
     >>> plotdf.columns
     Index(['CD8B', 'n_genes', 'X_umap-0', 'X_umap-1'], dtype='object')
-    >>> plotdf.plot.scatter("X_umap-0", "X_umap-1", c="CD8B")
+    >>> plotdf.plot.scatter("X_umap-0", "X_umap-1", c="CD8B")  # doctest: +SKIP
     <Axes: xlabel='X_umap-0', ylabel='X_umap-1'>

     Calculating mean expression for marker genes by cluster:
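The `# doctest: +SKIP` directive added above tells the doctest runner not to execute that one example, so its printed output no longer has to match across library versions. A minimal standard-library sketch of the effect follows. `SKIPPED_DOC` and `count_failures` are hypothetical names, and `object()` merely stands in for a call whose repr would not match the expected text.

```python
import doctest

SKIPPED_DOC = """
>>> 1 + 1
2
>>> object()  # doctest: +SKIP
<Axes: xlabel='X_umap-0', ylabel='X_umap-1'>
"""


def count_failures(doc: str) -> int:
    """Run the examples in a docstring-like text and return how many failed."""
    test = doctest.DocTestParser().get_doctest(doc, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=lambda text: None)  # silence the failure report
    return results.failed


print(count_failures(SKIPPED_DOC))
print(count_failures(SKIPPED_DOC.replace("  # doctest: +SKIP", "")))
```

The skipped example stays in the documentation as an illustration, but it no longer acts as a test.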
3 changes: 2 additions & 1 deletion scanpy/neighbors/_backends/rapids.py
@@ -3,8 +3,9 @@
 from typing import TYPE_CHECKING, Any, Literal

 import numpy as np
-from sklearn.base import BaseEstimator, TransformerMixin, check_is_fitted
+from sklearn.base import BaseEstimator, TransformerMixin
 from sklearn.exceptions import NotFittedError
+from sklearn.utils.validation import check_is_fitted

 from ..._settings import settings
 from ._common import TransformerChecksMixin
2 changes: 1 addition & 1 deletion scanpy/plotting/_baseplot_class.py
@@ -347,7 +347,7 @@ def add_totals(
         >>> adata = sc.datasets.pbmc68k_reduced()
         >>> markers = {'T-cell': 'CD3D', 'B-cell': 'CD79A', 'myeloid': 'CST3'}
         >>> plot = sc.pl._baseplot_class.BasePlot(adata, markers, groupby='bulk_labels').add_totals()
-        >>> plot.plot_group_extra['counts_df']
+        >>> plot.plot_group_extra['counts_df']  # doctest: +SKIP
         bulk_labels
         CD4+/CD25 T Reg 68
         CD4+/CD45RA+/CD25- Naive T 8
10 changes: 9 additions & 1 deletion scanpy/plotting/_matrixplot.py
@@ -168,7 +168,15 @@ def __init__(

         if values_df is None:
             # compute mean value
-            values_df = self.obs_tidy.groupby(level=0, observed=True).mean()
+            values_df = (
+                self.obs_tidy.groupby(level=0, observed=True)
+                .mean()
+                .loc[
+                    self.categories_order
+                    if self.categories_order is not None
+                    else self.categories
+                ]
+            )

         if standard_scale == "group":
             values_df = values_df.sub(values_df.min(1), axis=0)
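The added `.loc[...]` makes the row order of the aggregated frame explicit rather than relying on whatever order `groupby` returns. A toy illustration with pandas follows, assuming pandas is installed. The frame, the `CD3D` values, and `categories_order` are made-up stand-ins for `self.obs_tidy` and `self.categories_order`.

```python
import pandas as pd

# Toy stand-in for `obs_tidy`: one gene, rows indexed by a categorical group.
obs_tidy = pd.DataFrame(
    {"CD3D": [1.0, 3.0, 5.0, 7.0]},
    index=pd.CategoricalIndex(
        ["B", "A", "B", "A"], categories=["A", "B"], name="group"
    ),
)
categories_order = ["B", "A"]  # hypothetical user-requested order

# Aggregate per group, then select rows in the requested order.
means = obs_tidy.groupby(level=0, observed=True).mean()
ordered = means.loc[categories_order]

print(list(means.index))
print(list(ordered.index))
```

Selecting by label after aggregating keeps the plotted order tied to the requested categories, whichever pandas version produced the grouped result.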
15 changes: 9 additions & 6 deletions scanpy/plotting/_stacked_violin.py
@@ -383,14 +383,17 @@ def _mainplot(self, ax):
         if self.var_names_idx_order is not None:
             _matrix = _matrix.iloc[:, self.var_names_idx_order]

-        if self.categories_order is not None:
-            _matrix.index = _matrix.index.reorder_categories(
-                self.categories_order, ordered=True
-            )
-
         # get mean values for color and transform to color values
         # using colormap
-        _color_df = _matrix.groupby(level=0, observed=True).median()
+        _color_df = (
+            _matrix.groupby(level=0, observed=True)
+            .median()
+            .loc[
+                self.categories_order
+                if self.categories_order is not None
+                else self.categories
+            ]
+        )
         if self.are_axes_swapped:
             _color_df = _color_df.T

7 changes: 5 additions & 2 deletions scanpy/plotting/_tools/scatterplots.py
@@ -20,6 +20,7 @@
 from matplotlib.colors import Colormap, Normalize
 from matplotlib.figure import Figure  # noqa: TCH002
 from numpy.typing import NDArray  # noqa: TCH002
+from packaging.version import Version

 from ... import logging as logg
 from ..._settings import settings
@@ -1247,8 +1248,10 @@
     }
     # If color_map does not have unique values, this can be slow as the
     # result is not categorical
-    color_vector = pd.Categorical(values.map(color_map, na_action="ignore"))
-
+    if Version(pd.__version__) < Version("2.1.0"):
+        color_vector = pd.Categorical(values.map(color_map))
+    else:
+        color_vector = pd.Categorical(values.map(color_map, na_action="ignore"))

Codecov / codecov/patch annotation: added line #L1254 in scanpy/plotting/_tools/scatterplots.py was not covered by tests.

     # Set color to 'missing color' for all missing values
     if color_vector.isna().any():
         color_vector = color_vector.add_categories([to_hex(na_color)])
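The version gate in the hunk above compares parsed `Version` objects rather than raw strings. The sketch below isolates that pattern, assuming `packaging` is installed. `map_kwargs` is a hypothetical helper that returns the keyword arguments the diff passes to `values.map` on each side of the gate.

```python
from packaging.version import Version


def map_kwargs(pandas_version: str) -> dict:
    """Keyword arguments for `values.map`, mirroring the gate in the diff."""
    if Version(pandas_version) < Version("2.1.0"):
        return {}
    return {"na_action": "ignore"}


print(map_kwargs("1.5.3"))
print(map_kwargs("2.1.0"))

# Parsed versions order numerically; plain strings order character by character.
print(Version("10.0.0") < Version("2.1.0"), "10.0.0" < "2.1.0")
```

Because `Version` follows PEP 440 ordering, a release candidate such as `2.1.0rc1` sorts before `2.1.0` and would take the older code path.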
5 changes: 5 additions & 0 deletions scanpy/preprocessing/_highly_variable_genes.py
@@ -252,6 +252,11 @@ def _highly_variable_genes_single_batch(
     `highly_variable`, `means`, `dispersions`, and `dispersions_norm`.
     """
     X = _get_obs_rep(adata, layer=layer)
+
+    if hasattr(X, "_view_args"):  # AnnData array view
+        # For compatibility with anndata<0.9
+        X = X.copy()  # Doesn't actually copy memory, just removes View class wrapper
+
     if flavor == "seurat":
         X = X.copy()
     if "log1p" in adata.uns_keys() and adata.uns["log1p"].get("base") is not None: