diff --git a/.github/workflows/main-workflow.yaml b/.github/workflows/main-workflow.yaml index 2c3699d..3eb1482 100644 --- a/.github/workflows/main-workflow.yaml +++ b/.github/workflows/main-workflow.yaml @@ -26,6 +26,8 @@ jobs: python-version: [ '3.7', '3.8', '3.9', '3.10' ] steps: - uses: actions/checkout@v4 + - name: Install pandoc + run: sudo apt-get update && sudo apt-get install -y pandoc - uses: ./.github/actions/unit-tests env: TEST_API_TOKEN: ${{ secrets.TEST_API_TOKEN }} diff --git a/README.md b/README.md index 766e593..45f4a49 100644 --- a/README.md +++ b/README.md @@ -1,38 +1,32 @@ -# Cellarium Cell Annotation Service (CAS) Client +Cellarium Logo + +# Cellarium Cell Annotation Service (CAS) Client Library This codebase contains the Python client library for using Cellarium Cell Annotation Service (CAS). # Installation +The cellarium-cas package officially supports Python versions between 3.7 and 3.10. We recommend using Python 3.10. +You can install CAS from PyPI using `pip`. 
To install the latest version, please run the following command: ``` $ pip install cellarium-cas ``` -# Usage -To use Cellarium CAS, create a client instance with your API token: - -```python3 -from cellarium.cas import CASClient - -api_token = "a_very_long_string_with_some_symbols" -cas = CASClient( - api_token=api_token, - api_url="" -) +To install a specific version `version_number`, you can use the following command: ``` - -## Annotation -You can annotate 10x Cell Ranger h5 matrices from local disk: -```python3 - -response = cas.annotate_10x_h5_file(filepath="your_path_to_local_h5_file.h5") +$ pip install cellarium-cas==<version_number> ``` -or an anndata file from local disk: -```python3 -response = cas.annotate_anndata_file(filepath="your_path_to_local_h5_file.h5ad") +If you wish to use visualization features, you can install the package with the visualization extras: +``` +$ pip install cellarium-cas[vis] ``` -or a previously loaded (unnormalized) anndata object: -```python3 -import anndata +# Obtaining an API Token +You need an API token to use CAS. We are offering a free public beta program for a limited time to try CAS and explore ways it can enhance your cell biology research. To obtain your unique API token to join the public beta program, please navigate to the CAS webpage at [cellarium.ai](https://cellarium.ai/tool/cellarium-cell-annotation-service-cas/), scroll to the bottom of the page, and [sign up](https://cellarium.ai/cell-annotation-service-cas-access/). We will contact you with your unique API key as soon as the public beta is available. -adata = anndata.read("you_anndata_file.h5ad") -response = cas.annotate_anndata(adata) -``` +# Quickstart Tutorial +The fastest way to get started with using CAS is to follow the quickstart tutorial: +[Click here to open the quickstart tutorial on GitHub](notebooks/quickstart_tutorial.ipynb) + +It is even easier to go through the quickstart tutorial on Google Colab. 
Remember, you still need an API key to successfully run through the tutorial: +[Click here to open the quickstart tutorial on Google Colab](https://colab.research.google.com/drive/1m9zgqP5n7E4pGGCg5RjfvlCnS6uqUdSa) + +# Documentation +Please visit the project's [ReadTheDocs page](https://cellarium-cas.readthedocs.io/) for additional documentation. diff --git a/docs/source/conf.py b/docs/source/conf.py index 960425f..cb34637 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -2,6 +2,9 @@ # # For the full list of built-in configuration values, see the documentation: # https://www.sphinx-doc.org/en/master/usage/configuration.html +import os +import shutil +import glob import time from setuptools_git_versioning import get_tag @@ -10,8 +13,8 @@ # https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information project = "Cellarium CAS" -copyright = f"{time.strftime('%Y')}, Cellarium AI" -author = "Cellarium AI" +copyright = f"{time.strftime('%Y')}, Cellarium AI Lab" +author = "Cellarium AI Lab" version = get_tag() or "" release = get_tag() or "" @@ -19,6 +22,10 @@ # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration extensions = [ + 'nbsphinx', + 'sphinx.ext.mathjax', + 'sphinx.ext.autodoc', + 'sphinx.ext.viewcode', "sphinx_rtd_theme", "sphinx.ext.autodoc", "sphinx.ext.napoleon", @@ -26,8 +33,11 @@ "sphinx.ext.intersphinx", "sphinx_substitution_extensions", "sphinxcontrib.autodoc_pydantic", + "IPython.sphinxext.ipython_console_highlighting", ] +exclude_patterns = ['_build', '**.ipynb_checkpoints'] + # Provide substitutions for common values rst_prolog = f""" .. 
|project| replace:: {project} @@ -60,3 +70,9 @@ # The JSON schema is a bit much in the docs autodoc_pydantic_model_show_json = False + +if not os.path.exists("notebooks"): + os.makedirs("notebooks") + +for src_file in glob.glob("../../notebooks/*.ipynb"): + shutil.copy(src_file, "notebooks/") diff --git a/docs/source/index.rst b/docs/source/index.rst index 4a6e28c..89039b2 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -1,20 +1,49 @@ :tocdepth: 3 -Cellarium Cell Annotation Service -################################# +Cellarium Cell Annotation Service (CAS) Documentation +##################################################### + +Overview +++++++++ + +**What is the Cell Annotation Service (CAS)?** +The Cell Annotation Service (CAS) is a cloud-native platform designed for rapid and efficient querying of single-cell omics data. It utilizes compact and information-rich vector representations combined with approximate nearest neighbor search engines to enable seamless exploration of vast single-cell datasets. This documentation focuses on the CAS client library, which offers a user-friendly interface for querying the CAS backend and visualizing the results. + +**How does CAS work?** +CAS operates by building a vector search index derived from low-dimensional embeddings of a comprehensive repository of publicly available single-cell transcriptomics data. Currently, the CAS reference dataset comprises the entire `CZI CELLxGENE `_ data catalog. This reference database includes rich metadata such as cell type, disease state, tissue origin, and more. In future iterations, the reference catalog will be expanded to incorporate additional datasets. + +When a user queries CAS with their single-cell transcriptomics data, the system maps the input data into the same vector space as the reference dataset. It then performs an approximate nearest neighbor search to identify similar cells within the reference database. 
Based on these similar reference cells, CAS generates label summary statistics, providing comprehensive annotations for each query cell. The methodology employed by CAS is comparable to how reverse search engines (such as reverse image search) index and retrieve information from vast amounts of internet data, making it accessible and interpretable. The embeddings used by CAS are generated using distributed machine learning models implemented in the `Cellarium ML Library `_. + +Understanding the similarities and differences between cellular measurements in various contexts is crucial for unraveling disease mechanisms. CAS streamlines this process by leveraging its reference context to provide community-consensus annotations for the queried cells, helping researchers gain deeper insights into their data. + +Funding ++++++++ + +Cellarium CAS was co-developed by 10x Genomics and Cellarium AI Lab at the Data Sciences Platform, Broad Institute. The project was funded by 10x Genomics, NIH Grant UM1 MH130966, and additional support from Broad Institute. + +Future Plans +++++++++++++ + +At present, CAS outputs annotations exclusively for cell types. In upcoming updates, we plan to extend the service to include a broader range of informative metadata such as disease status, developmental stage, tissue origin, and other relevant biological contexts, thereby providing users with a more comprehensive annotation framework for their single-cell data. -**What's Cell Annotation Service (CAS)?** CAS is a software library for rapid querying of single-cell omics data relying on ML algorithms. .. toctree:: :maxdepth: 1 - :caption: General Info + :caption: General Usage modules/installation modules/usage modules/workflow modules/changelog +.. toctree:: + :maxdepth: 1 + :caption: Tutorials + + Quickstart Tutorial (on Google Colab) + Quickstart Tutorial (static) + .. 
toctree:: :maxdepth: 1 :caption: Codebase Documentation @@ -23,7 +52,8 @@ Cellarium Cell Annotation Service automodules/visualization -Other Tools -=========== - * `Machine Learning library Cellarium ML `_ +Related Projects +================ + * `Cellarium ML Library `_ + * `Cellarium Cloud (Backend) `_ diff --git a/docs/source/modules/installation.rst b/docs/source/modules/installation.rst index 9cb35db..2d3fe03 100644 --- a/docs/source/modules/installation.rst +++ b/docs/source/modules/installation.rst @@ -8,12 +8,27 @@ Prerequisites The cellarium-cas package officially supports Python versions between 3.7 and 3.10. We recommend using Python 3.10. -From github repository -++++++++++++++++++++++ +From PyPI repository +++++++++++++++++++++ -Install the latest alpha version using `pip` with a specified version: +Install the latest version using `pip` by running the following command: + +.. code-block:: bash + :substitutions: + + pip install cellarium-cas + +If you wish to install a specific version, you can specify the version number: .. code-block:: bash :substitutions: pip install cellarium-cas==|version| + + +Finally, if you wish to use visualization features, you can install the package with the visualization extras: + +.. code-block:: bash + :substitutions: + + pip install cellarium-cas[vis]==|version| diff --git a/docs/source/modules/usage.rst b/docs/source/modules/usage.rst index 7b0967a..743d917 100644 --- a/docs/source/modules/usage.rst +++ b/docs/source/modules/usage.rst @@ -3,11 +3,12 @@ Usage ##### -API Token and the CAS Early Access Program ------------------------------------------- -In order to use the CAS API, you will need to join the Cell Annotation Service Early Access program. To join the program, -please fill out the form at https://cellarium.ai/cell-annotation-service-cas-early-access/ and we will notify you -when your account is created, and provide you with an API token to use with the CAS API. 
+API Token and the CAS Public Beta Program +----------------------------------------- +In order to use the CAS API, you will need to join the CAS Public Beta program. To obtain your unique API token to join +the public beta program, please navigate to the CAS webpage at `cellarium.ai `_, +scroll to the bottom of the page, and `sign up `_. We will contact +you with your unique API key as soon as the public beta is available. Initialization -------------- diff --git a/requirements/docs.txt b/requirements/docs.txt index 1fc767c..7b3f440 100644 --- a/requirements/docs.txt +++ b/requirements/docs.txt @@ -4,4 +4,6 @@ sphinx_gallery~=0.14 sphinx_rtd_theme~=2.0 sphinx_substitution_extensions==2024.8.6 setuptools-git-versioning==2.0.0 -autodoc_pydantic==2.2.0 \ No newline at end of file +autodoc_pydantic==2.2.0 +nbsphinx +IPython
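
Note on the docs/source/conf.py hunk: the added lines create a local `notebooks` directory and copy every `*.ipynb` file from the repository's `notebooks/` folder into it, so that `nbsphinx` can render the notebooks as part of the docs build. The same step can be sketched as a small helper, which makes it easier to exercise outside of a Sphinx run. This is only an illustration under stated assumptions: the function name `stage_notebooks` and its two parameters are hypothetical and are not part of this change.

```python
import shutil
from pathlib import Path


def stage_notebooks(src_dir: Path, dest_dir: Path) -> list:
    """Copy every *.ipynb file from src_dir into dest_dir.

    Hypothetical helper mirroring the copy loop added to conf.py.
    Returns the paths of the copied files, sorted by source name.
    """
    # Create the destination (and any missing parents); no error if it exists.
    dest_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for src_file in sorted(src_dir.glob("*.ipynb")):
        # shutil.copy returns the path of the newly created file.
        copied.append(Path(shutil.copy(src_file, dest_dir)))
    return copied
```

With the layout in this diff, the call would take the repository's `notebooks` directory as `src_dir` and `docs/source/notebooks` as `dest_dir`. Files that are not notebooks are left alone, and repeated builds simply overwrite the staged copies.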