[python/ci] Remote-storage CI YAML, part 1 of 2 #3675

Merged 8 commits on Feb 7, 2025
8 changes: 8 additions & 0 deletions .github/workflows/daily-remote-tests-issue-template.md
@@ -0,0 +1,8 @@
---
title: Daily GitHub Actions fail for remote tests on {{ date | date('ddd, MMMM Do YYYY') }}
assignees: nguyenv, johnkerl
labels: bug
---

See run for more details:
https://github.com/{{ env.GITHUB_REPOSITORY }}/actions/runs/{{ env.GITHUB_RUN_ID }}
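The title's `date('ddd, MMMM Do YYYY')` filter appears to use Moment.js-style tokens: abbreviated weekday, full month name, day of month with an ordinal suffix, and four-digit year. As a rough illustration of the title this would render, here is a Python sketch. The function names `ordinal` and `render_issue_title` are hypothetical and are not part of this PR or of the issue-creation action.

```python
import datetime


def ordinal(day: int) -> str:
    """Return a day of month with an English ordinal suffix, e.g. 7 -> '7th'."""
    # 11, 12 and 13 take 'th' even though they end in 1, 2 and 3.
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def render_issue_title(when: datetime.date) -> str:
    """Approximate the rendered title of the remote-tests issue template."""
    # %a ~ ddd (abbreviated weekday), %B ~ MMMM (full month), %Y ~ YYYY.
    # strftime names are locale-dependent; the default C locale gives English.
    weekday = when.strftime("%a")
    month = when.strftime("%B")
    year = when.strftime("%Y")
    stamp = f"{weekday}, {month} {ordinal(when.day)} {year}"
    return f"Daily GitHub Actions fail for remote tests on {stamp}"


if __name__ == "__main__":
    print(render_issue_title(datetime.date(2025, 2, 7)))
```

This only approximates the template engine's output, but it is handy for checking what a filed issue's title should look like on a given day.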
2 changes: 1 addition & 1 deletion .github/workflows/daily-test-build-issue-template.md
@@ -1,5 +1,5 @@
---
-title: Daily GitHub Actions Build Fail on {{ date | date('ddd, MMMM Do YYYY') }}
+title: Daily GitHub Actions build fail on {{ date | date('ddd, MMMM Do YYYY') }}
assignees: nguyenv, ryan-williams, johnkerl
labels: bug
---
1 change: 1 addition & 0 deletions .github/workflows/libtiledb-ci.yml
@@ -6,6 +6,7 @@ on:
- "apis/python/**"
- "apis/r/**"
- ".pre-commit-config.yaml"
- ".github/workflows/python-remote-storage.yml"
push:
branches:
- main
1 change: 1 addition & 0 deletions .github/workflows/libtiledbsoma-asan-ci.yml
@@ -6,6 +6,7 @@ on:
- "apis/python/**"
- "apis/r/**"
- ".pre-commit-config.yaml"
- ".github/workflows/python-remote-storage.yml"
push:
branches:
- main
1 change: 1 addition & 0 deletions .github/workflows/python-ci-minimal.yml
@@ -20,6 +20,7 @@ on:
- '!.github/**'
- '.github/workflows/python-ci-minimal.yml'
- '.github/workflows/python-ci-single.yml'
- ".github/workflows/python-remote-storage.yml"
workflow_dispatch:

jobs:
2 changes: 2 additions & 0 deletions .github/workflows/python-ci-packaging.yml
@@ -7,6 +7,7 @@ on:
push:
paths:
- '.github/workflows/python-ci-packaging.yml'
- ".github/workflows/python-remote-storage.yml"
- 'apis/python/MANIFEST.in'
- 'apis/python/pyproject.toml'
- 'apis/python/setup.py'
@@ -21,6 +22,7 @@ on:
pull_request:
paths:
- '.github/workflows/python-ci-packaging.yml'
- ".github/workflows/python-remote-storage.yml"
- 'apis/python/MANIFEST.in'
- 'apis/python/pyproject.toml'
- 'apis/python/setup.py'
1 change: 1 addition & 0 deletions .github/workflows/python-dependency-variation.yml
@@ -10,6 +10,7 @@ on:
- '**.md'
- 'apis/r/**'
- 'docs/**'
- ".github/workflows/python-remote-storage.yml"
workflow_dispatch:

jobs:
124 changes: 124 additions & 0 deletions .github/workflows/python-remote-storage.yml
@@ -0,0 +1,124 @@
name: TileDB-SOMA Python CI (remote storage)

on:
# Not for regular use, but you can uncomment this when putting up PRs on this
# file. Just remember to take it back out. There's no need to run
# remote-storage tests on every single PR push to our repo.
#
# pull_request:
#
# Allows for 'as needed' manual trigger:
workflow_dispatch:
#
# Use a regular nightly build as well (time is UTC):
schedule:
- cron: "25 5 * * *"

env:
# Don't name this "TILEDB_REST_TOKEN" since that will map into a core
# env/config override, as if config key "rest.token" had been set. One of the
# purposes of this CI is to run tests where all config is passed via context
# arguments and none via environment variables, in order to flush out
# callsites within the code which aren't passing context as they should.
TILEDB_REST_UNITTEST_TOKEN: ${{ secrets.TILEDB_REST_UNITTEST_TOKEN}}

jobs:
ci:
strategy:
fail-fast: false
matrix:
include:
# This could be run on MacOS too, but, we have enough OS redundancy,
# and MacOS-runner availability is a more tightly constrained resource
# in GitHub Actions as of 2025-02-06.
- name: linux
os: ubuntu-24.04
# TO DO: also on 3.12. But 3.9 is higher-pri, until we drop support
# for it. (Note our main CI runs tests across a broader set of Python
# versions.)
python_version: 3.9
cc: gcc-13
cxx: g++-13

runs-on: ${{ matrix.os }}

steps:
- uses: actions/checkout@v4

- name: Show matrix OS
run: echo "matrix.os:" ${{ matrix.os }}

- name: Linux CPU info
if: ${{ matrix.os == 'ubuntu-24.04' }}
run: cat /proc/cpuinfo

- name: MacOS CPU info
if: ${{ matrix.os == 'macOS-latest' }}
run: sysctl -a | grep cpu

- name: Select XCode version
if: startsWith(matrix.os, 'macos')
uses: maxim-lobanov/setup-xcode@v1
with:
xcode-version: '15.4'

- name: Set up Python ${{ matrix.python_version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python_version }}
cache: pip
cache-dependency-path: ./apis/python/setup.py

- name: Show XCode version
run: clang --version

- name: Check out TileDB-SOMA
uses: actions/checkout@v4
with:
fetch-depth: 0 # ensure we get all tags to inform package version determination

- name: Log pip dependencies
run: pip list

- name: Install tiledbsoma
# If you're iterating on the testing framework itself, install from PyPI to avoid the
# time-consuming build. If you're trying to evaluate the code modified by a PR,
# install from source.
#
# From PyPI:
# run: pip install tiledbsoma==1.15.7
#
# From source:
run: pip -v install -e apis/python[all] -C "--build-option=--no-tiledb-deprecated"
#
env:
CC: ${{ matrix.cc }}
CXX: ${{ matrix.cxx }}

- name: Install dependencies
run: pip install --prefer-binary pytest typeguard tiledb.cloud

- name: Show package versions
run: python scripts/show-versions.py

- name: Show tiledb.cloud version
run: python -c 'import tiledb.cloud; print(tiledb.cloud.version.version)'

- name: Run pytests for Python
shell: bash
run: python -m pytest apis/python/remote_tests -v --durations=20 --maxfail=50

# File a bug report if anything fails, but don't file tickets for manual runs
# -- only for scheduled ones.
create_issue_on_fail:
needs: [ci]
if: (failure() || cancelled()) && github.event_name != 'workflow_dispatch'
steps:
- name: Checkout TileDB-SOMA `main`
uses: actions/checkout@v2
- name: Create Issue if Build Fails
uses: JasonEtco/create-an-issue@v2
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
filename: .github/workflows/daily-remote-tests-issue-template.md
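Read as a truth table, the `if:` guard on `create_issue_on_fail` files an issue only when the `ci` job failed or was cancelled and the run was not started manually. The Python sketch below mirrors that condition. It illustrates the logic only and is not how GitHub Actions evaluates expressions: it reduces `failure()` and `cancelled()` to a single result string for the `ci` job, and the name `should_file_issue` is hypothetical.

```python
def should_file_issue(ci_result: str, event_name: str) -> bool:
    """Mirror `(failure() || cancelled()) && github.event_name != 'workflow_dispatch'`.

    `ci_result` is a simplified stand-in for the outcome of the `ci` job.
    """
    failed_or_cancelled = ci_result in {"failure", "cancelled"}
    is_manual_run = event_name == "workflow_dispatch"
    return failed_or_cancelled and not is_manual_run


if __name__ == "__main__":
    # Print the full truth table for the two triggers this workflow enables.
    for event in ("schedule", "workflow_dispatch"):
        for result in ("success", "failure", "cancelled"):
            print(f"{event:18} {result:10} -> {should_file_issue(result, event)}")
```

Under this reading, only scheduled (nightly) runs that fail or are cancelled open a ticket. Manual runs never do, matching the comment above the job.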
48 changes: 48 additions & 0 deletions apis/python/remote_tests/README.md
@@ -0,0 +1,48 @@
# How to run these tests

These tests nominally run as a nightly cron job, but you can also run them manually.

```
export TILEDB_REST_UNITTEST_TOKEN="..." # Get the token for the SaaS `unittest` user
unset TILEDB_REST_TOKEN # If you have that set
unset TILEDB_REST_PAYER_NAMESPACE # If you have that set
```
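The reason for unsetting those two variables is that TileDB core maps them onto config keys, which would hide call sites that fail to pass a context explicitly. A pre-flight check along the following lines could catch a misconfigured shell before the tests start. This is a minimal sketch: `check_remote_test_env` and the two constants are hypothetical names, not part of the test suite.

```python
import os
from typing import List, Mapping

# Variables the instructions above say to unset before running the tests.
CONFLICTING_VARS = ("TILEDB_REST_TOKEN", "TILEDB_REST_PAYER_NAMESPACE")
# The only variable the remote tests are meant to read.
REQUIRED_VAR = "TILEDB_REST_UNITTEST_TOKEN"


def check_remote_test_env(environ: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not environ.get(REQUIRED_VAR):
        problems.append(f"{REQUIRED_VAR} is not set")
    for name in CONFLICTING_VARS:
        if name in environ:
            problems.append(f"{name} is set; unset it before running")
    return problems


if __name__ == "__main__":
    for problem in check_remote_test_env(os.environ):
        print("WARNING:", problem)
```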

As of 2025-02-07, use Python 3.9 to run UDF tests; otherwise they will be skipped.

```
python -m pytest path/to/this/directory
```
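To see ahead of time whether the UDF tests will run or be skipped under your interpreter, you can apply the same major/minor comparison the tests use in their `skipif` marker. The helper name `udf_tests_will_run` below is hypothetical and is only a sketch of that check.

```python
import sys


def udf_tests_will_run(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.9, the version UDF tests require."""
    # Compare only (major, minor); the patch level does not matter.
    return tuple(version_info[:2]) == (3, 9)


if __name__ == "__main__":
    state = "will run" if udf_tests_will_run() else "will be skipped"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: UDF tests {state}")
```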

# Test-data setup

This is what was done for initial setup of these tests, and what should be done for future releases.

```
export TILEDB_REST_TOKEN="..." # Get the token for the SaaS `unittest` user
export TILEDB_REST_PAYER_NAMESPACE=unittest
```

The source data live in the sandbox account's `unittest` space:

```
s3://tiledb-unittest/soma-prod-test-data/h5ad/pbmc3k_unprocessed.h5ad
s3://tiledb-unittest/soma-prod-test-data/h5ad/pbmc3k_processed.h5ad
```

Local copy:

```
aws s3 cp --recursive s3://tiledb-unittest/soma-prod-test-data/h5ad .
```

Then use `tiledbsoma.io.from_h5ad` with the following sources and data:

* Prefer a bare Docker image
* Repeat for all desired TileDB-SOMA versions:
* `pip install tiledbsoma==1.15.7` (or whichever version)
* Ingest to `s3://tiledb-unittest/soma-prod-test-data/1.15.7/pbmc3k_unprocessed_1.15.7`
* Register this in the cloud UI
* Note: as of 2025-02-07 the cloud UI disallows `.` in group names, so register with a name like `1_15_7`.
* Tracked at [sc-63068](https://app.shortcut.com/tiledb-inc/story/63068/allow-in-registration-paths)
* Do not ingest directly to `tiledb://unittest/s3://tiledb-unittest/soma-prod-test-data/1.15.7/pbmc3k_unprocessed_1.15.7`, since that would use today's version of core server-side. What we really want to test is data written entirely by the pip-installed versions of tiledbsoma and core.
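The naming convention in the steps above can be sketched in Python as follows. The helper name `ingest_target` and its return keys are hypothetical. The S3 layout and the dots-to-underscores substitution come from the list above, and the `tiledb://unittest/...` form is the one the read test in this PR opens.

```python
def ingest_target(version: str, dataset: str = "pbmc3k_unprocessed") -> dict:
    """Build the S3 ingest URI and the cloud registration name for one release."""
    # Data is written straight to S3 by the pip-installed tiledbsoma.
    s3_uri = f"s3://tiledb-unittest/soma-prod-test-data/{version}/{dataset}_{version}"
    # The cloud UI disallows '.' in group names, so dots become underscores.
    registered_name = f"{dataset}_{version.replace('.', '_')}"
    return {
        "s3_uri": s3_uri,
        "registered_name": registered_name,
        "cloud_uri": f"tiledb://unittest/{registered_name}",
    }


if __name__ == "__main__":
    for key, value in ingest_target("1.15.7").items():
        print(f"{key}: {value}")
```

The `s3_uri` is the ingest destination to give `tiledbsoma.io.from_h5ad`, and `registered_name` is what to enter in the cloud UI afterwards.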
3 changes: 3 additions & 0 deletions apis/python/remote_tests/__init__.py
@@ -0,0 +1,3 @@
from typeguard import install_import_hook

install_import_hook("tiledbsoma")
47 changes: 47 additions & 0 deletions apis/python/remote_tests/conftest.py
@@ -0,0 +1,47 @@
import os

import pytest

import tiledbsoma
import tiledb.cloud


@pytest.fixture
def conftest_token():
env_name = "TILEDB_REST_UNITTEST_TOKEN"
token = os.getenv(env_name)
if token is None:
raise Exception(f'Environment variable "{env_name}" is not set')
return token


@pytest.fixture
def conftest_tiledb_cloud_login(conftest_token):
print("conftest_tiledb_cloud_login")
tiledb.cloud.login(token=conftest_token)
return None


@pytest.fixture
def conftest_user_profile(conftest_tiledb_cloud_login):
return tiledb.cloud.user_profile()


@pytest.fixture
def conftest_namespace(conftest_user_profile):
return conftest_user_profile.username


@pytest.fixture
def conftest_default_s3_path(conftest_user_profile):
return conftest_user_profile.default_s3_path


@pytest.fixture
def conftest_context(conftest_token, conftest_namespace):
return tiledbsoma.SOMATileDBContext(
tiledb_config={
"rest.token": conftest_token,
"rest.payer_namespace": conftest_namespace,
}
)
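The fixtures above build the REST configuration explicitly from the token and the namespace instead of relying on core's environment-variable overrides. Stripped of pytest and TileDB, that idea might look like the sketch below. `rest_config` is a hypothetical helper, and the returned dict simply reuses the two config keys passed to `SOMATileDBContext` above.

```python
from typing import Dict, Mapping

TOKEN_ENV_NAME = "TILEDB_REST_UNITTEST_TOKEN"


def rest_config(environ: Mapping[str, str], namespace: str) -> Dict[str, str]:
    """Build an explicit config dict, failing loudly if the token is missing."""
    token = environ.get(TOKEN_ENV_NAME)
    if token is None:
        raise RuntimeError(f'Environment variable "{TOKEN_ENV_NAME}" is not set')
    # Everything is passed by value; nothing is left for core to pick up
    # implicitly from the environment.
    return {
        "rest.token": token,
        "rest.payer_namespace": namespace,
    }


if __name__ == "__main__":
    fake_env = {TOKEN_ENV_NAME: "not-a-real-token"}
    print(rest_config(fake_env, "unittest"))
```

Taking the environment as a parameter keeps the helper easy to exercise without touching the real process environment.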
83 changes: 83 additions & 0 deletions apis/python/remote_tests/test_01_setup.py
@@ -0,0 +1,83 @@
# These are tests that need to run first to check basic functionality, before we
# go on to test other, more complex things.

from __future__ import annotations

import os
import sys

import pytest
import scanpy

import tiledbsoma
import tiledbsoma.io
import tiledb.cloud

from .util import util_make_uri, util_tear_down_uri

# Nominally this is the 'unittest' SaaS user. What we require is:
#
# * The user can _read_ data in the 'unittest' namespace.
# * For data _written_, the namespace and default_s3_path are taken from the
# cloud profile.
#
# For CI, this environment variable is a GitHub Actions secret, propagated in
# the CI YAML.
if os.getenv("TILEDB_REST_UNITTEST_TOKEN") is None:
pytest.skip(
reason="$TILEDB_REST_UNITTEST_TOKEN is not set", allow_module_level=True
)


def test_skipping_correctly():
assert os.getenv("TILEDB_REST_UNITTEST_TOKEN") is not None


def test_basic_read(conftest_context):
uri = "tiledb://unittest/pbmc3k_unprocessed_1_15_7"
assert tiledbsoma.Experiment.exists(uri, context=conftest_context)
with tiledbsoma.Experiment.open(uri, context=conftest_context) as exp:
assert exp.obs.count == 2700
assert "RNA" in exp.ms
assert exp.ms["RNA"].var.count == 13714


def test_basic_write(conftest_context, conftest_namespace, conftest_default_s3_path):
(creation_uri, readback_uri) = util_make_uri(
"soma-prod-ephemeral-data",
"ephemeral_basic_write",
conftest_namespace,
conftest_default_s3_path,
)

adata = scanpy.datasets.pbmc3k()

tiledbsoma.io.from_anndata(
creation_uri,
adata,
measurement_name="RNA",
context=conftest_context,
)

with tiledbsoma.Experiment.open(readback_uri, context=conftest_context) as exp:
assert exp.obs.count == 2700
assert "RNA" in exp.ms
assert exp.ms["RNA"].var.count == 32738

util_tear_down_uri(readback_uri)


@pytest.mark.skipif(
(sys.version_info.major, sys.version_info.minor) != (3, 9),
reason="As of 2025-02-05 UDFs require Python 3.9",
)
def test_remote_version(conftest_tiledb_cloud_login):

def remote_version():
import tiledbsoma

return {"tiledbsoma": tiledbsoma.__version__}

output = tiledb.cloud.udf.exec(remote_version)
assert "tiledbsoma" in output
assert output["tiledbsoma"].startswith("1.")
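`test_basic_write` relies on `util_make_uri` returning two URIs: one used to create the experiment and one used to read it back. That helper lives in `util.py`, which is not shown in this diff, so the sketch below is a guess at its shape rather than the actual implementation. It assumes the creation URI embeds the storage path (`tiledb://<namespace>/s3://...`), that the readback URI uses only the registered name (`tiledb://<namespace>/<name>`), and that a timestamp is appended to keep ephemeral names unique. The name `sketch_make_uri`, the timestamp scheme, and the example bucket are all hypothetical.

```python
import datetime
from typing import Optional, Tuple


def sketch_make_uri(
    dirname: str,
    basename: str,
    namespace: str,
    default_s3_path: str,
    now: Optional[datetime.datetime] = None,
) -> Tuple[str, str]:
    """Hypothetical stand-in for util_make_uri: returns (creation_uri, readback_uri)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    # Assumption: a timestamp suffix keeps repeated runs from colliding.
    name = f"{basename}_{now.strftime('%Y%m%d-%H%M%S')}"
    storage = f"{default_s3_path.rstrip('/')}/{dirname}/{name}"
    creation_uri = f"tiledb://{namespace}/{storage}"
    readback_uri = f"tiledb://{namespace}/{name}"
    return creation_uri, readback_uri


if __name__ == "__main__":
    creation, readback = sketch_make_uri(
        "soma-prod-ephemeral-data",
        "ephemeral_basic_write",
        "unittest",
        "s3://example-bucket/prefix",  # hypothetical default_s3_path
    )
    print(creation)
    print(readback)
```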