Merge pull request #117 from SciCatProject/v4-docs
Update and expand docs for v4
jl-wynen authored Aug 14, 2023
2 parents a567a4e + 3619f07 commit fdc3993
Showing 36 changed files with 1,321 additions and 114 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -10,6 +10,6 @@ If you need help with using Scitacean or contributing to it, have a look at the
For bug reports and other problems, please open an [issue](https://github.com/SciCatProject/scitacean/issues/new) in GitHub.

You are welcome to submit pull requests at any time.
But to avoid having to make large modifications during review or even have your PR rejected, please first open an issue first to discuss your idea!
But to avoid having to make large modifications during review or even have your PR rejected, please open an issue first to discuss your idea!

Check out the subsections of the [developer documentation](https://scicatproject.github.io/scitacean/developer/index.html) for details on how Scitacean is developed.
4 changes: 4 additions & 0 deletions docs/_static/css/custom.css
@@ -3,6 +3,8 @@ html[data-theme="light"] {
--pst-color-secondary: #459db9;
--pst-color-link-hover: var(--pst-color-link);

--bs-body-color: var(--pst-color-text-base);

--cean-color-header-text: #2c2c2c;
--cean-color-header-text-highlight: black;
/* match cean-color-header-text */
@@ -16,6 +18,8 @@ html[data-theme="dark"] {
--pst-color-border: #aaa;
--pst-color-link-hover: var(--pst-color-link);

--bs-body-color: var(--pst-color-text-base);

--cean-color-header-text: #d4d4d4;
--cean-color-header-text-highlight: #e5e5e5;
/* match cean-color-header-text */
7 changes: 4 additions & 3 deletions docs/_templates/scitacean-class-template.rst
@@ -1,10 +1,11 @@
{{ fullname | escape | underline }}

{% set constructors = {"Client": ["from_credentials", "from_token", "without_login"],
"Dataset": ["__init__", "from_models"],
"File": ["from_local", "from_scicat"],
"OrigDatablockProxy": ["__init__", "from_model"],
"Dataset": ["__init__", "from_download_models"],
"File": ["from_local", "from_download_model"],
"OrigDatablockProxy": ["__init__", "from_download_model"],
"PID": ["__init__", "parse"],
"ScicatClient": ["from_credentials", "from_token", "without_login"],
} %}
{% set regular_methods = methods | reject("in", constructors.get(name, []) + ["__init__"]) | list %}

4 changes: 4 additions & 0 deletions docs/_templates/scitacean-module-template.rst
@@ -2,6 +2,8 @@

.. automodule:: {{ fullname }}

{% if name not in ["model"] %}

{% block attributes %}
{% if attributes %}
.. rubric:: {{ _('Module Attributes') }}
@@ -64,3 +66,5 @@
{%- endfor %}
{% endif %}
{% endblock %}

{% endif %}
1 change: 1 addition & 0 deletions docs/conf.py
@@ -40,6 +40,7 @@

intersphinx_mapping = {
"fabric": ("https://docs.fabfile.org/en/stable", None),
"hypothesis": ("https://hypothesis.readthedocs.io/en/latest/", None),
"python": ("https://docs.python.org/3", None),
"paramiko": ("https://docs.paramiko.org/en/stable", None),
}
7 changes: 3 additions & 4 deletions docs/developer/getting-started.rst
@@ -16,8 +16,7 @@ Development dependencies are specified in ``requirements/dev.txt`` and can be in
Additionally, building the documentation requires `pandoc <https://pandoc.org/>`_ which is not on PyPI and needs to be installed through other means.
(E.g. with your OS package manager.)

If you want to run tests against a real backend, you also need ``docker-compose``.
And the ``scicatlive`` git submodule needs to be initialized.
If you want to run tests against a real backend or SSH server, you also need ``docker-compose``.
See `Testing <./testing.rst>`_ for what this is good for and why.

Install the package
@@ -57,12 +56,12 @@ Running tests
python -m pytest -n<number-of-threads>
Or to run tests against a real backend (see setup above)
Or to run tests against a real backend and SSH server (see setup above)


.. code-block:: sh
pytest --backend-tests
pytest --backend-tests --ssh-tests
Note that the setup and teardown of the backend takes several seconds.

4 changes: 1 addition & 3 deletions docs/developer/index.rst
@@ -4,11 +4,9 @@ Developer documentation
.. include:: ../../CONTRIBUTING.md
:parser: myst_parser.sphinx_

Table of contents
-----------------

.. toctree::
:maxdepth: 2
:hidden:

getting-started
coding-conventions
1 change: 1 addition & 0 deletions docs/reference/index.rst
@@ -38,6 +38,7 @@ Auxiliary classes
:template: scitacean-class-template.rst
:recursive:

client.ScicatClient
datablock.OrigDatablock
dataset.DatablockUploadModels
PID
126 changes: 126 additions & 0 deletions docs/user-guide/classes-and-concepts.rst
@@ -0,0 +1,126 @@
Classes and concepts
====================

Scitacean uses a number of different classes to store metadata and interact with the data catalogue.
This page gives an overview of the most important ones.
See the `API reference <../reference/index.rst>`_ for a complete list.

Encoding metadata
-----------------

Dataset
~~~~~~~

:class:`scitacean.Dataset` is the main class for encoding metadata.
Each instance represents a single SciCat dataset.
But unlike in SciCat itself, it also contains links to files and their metadata via :ref:`files` objects.

``Dataset`` contains all fields of both raw and derived datasets.
Some fields are managed automatically (e.g. ``size``) and some are read-only as they are not allowed to be set during uploads (e.g. ``created_by``).
Some fields hold sub models like :class:`scitacean.model.Relationship`; those models are always :ref:`user-models`.
Field names use Scitacean's and thereby Python's naming convention, that is snake_case as opposed to camelCase as used by SciCat.
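
As a rough illustration of the naming difference, the following sketch converts a snake_case name to camelCase.
The helper ``snake_to_camel`` is hypothetical and not part of Scitacean; the library defines its own mapping between field names, which need not be a purely mechanical conversion.

```python
def snake_to_camel(name: str) -> str:
    """Convert a snake_case identifier to camelCase (illustrative helper only)."""
    first, *rest = name.split("_")
    # Keep the first word as-is and capitalize every following word.
    return first + "".join(part.capitalize() for part in rest)


print(snake_to_camel("created_by"))  # createdBy
print(snake_to_camel("size"))  # size
```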

Datablocks
~~~~~~~~~~

SciCat separates general metadata and file-specific metadata into 'datasets' and 'datablocks', respectively.
In Scitacean, those are largely unified into :class:`scitacean.Dataset`, and for a user it is usually possible to ignore datablocks entirely.
Datablocks (for archived files) and Original Datablocks (for directly accessible files) are managed automatically by the high-level interface.
But :class:`scitacean.Dataset` and :ref:`scicat-client` support handling them manually if need be.

.. _files:

Files
~~~~~

:class:`scitacean.File` links to a single file that may be located on the remote fileserver or the local filesystem or both.
It also encodes a number of metadata fields as specified by :class:`scitacean.model.DownloadDataFile` and :class:`scitacean.model.UploadDataFile`.
See the class documentation for details on how local vs. remote files are handled.

Models
~~~~~~

Models are Python representations of the various objects in a SciCat database.
See :mod:`scitacean.model` for a list.

.. _user-models:

User models
^^^^^^^^^^^

User models are dataclasses that are exposed as part of the high-level interface of Scitacean.
They have writable fields that can be set in both uploads and downloads as well as read-only fields that may only be set in downloads.
Field names use Scitacean's and thereby Python's convention, that is snake_case as opposed to camelCase as used by SciCat.

``Dataset``, ``(Orig)Datablock``, and ``File`` don't have separate user models.
Instead they are represented by the specialized classes described above.

.. _download-models:

Download models
^^^^^^^^^^^^^^^

Download models are `Pydantic <https://docs.pydantic.dev/latest/>`_ models that encode the data received from SciCat in downloads.
They may contain fields that correspond to read-only fields in user models and cannot be set in uploads.
Field names use SciCat's convention, that is camelCase.

Download models can be converted to user models by using the appropriate user model's ``from_download_model`` class method.
In the case of Dataset, :meth:`scitacean.Dataset.from_download_models` requires models for a dataset and (orig) datablocks.

.. _upload-models:

Upload models
^^^^^^^^^^^^^

Upload models are `Pydantic <https://docs.pydantic.dev/latest/>`_ models that encode the data sent to SciCat in uploads.
Field names use SciCat's convention, that is camelCase.

Upload models can be constructed by the corresponding user models using their ``to_upload_model`` method.

For :class:`scitacean.Dataset`, there are two distinct upload models, namely :class:`scitacean.model.UploadRawDataset` and :class:`scitacean.model.UploadDerivedDataset`.
In addition, :class:`scitacean.model.UploadOrigDatablock` and :class:`scitacean.model.UploadDataFile` are needed to fully represent Scitacean's ``Dataset`` objects.
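
The relationship between the three kinds of models can be sketched with plain dataclasses.
Everything in this example is an assumption made for illustration: ``Example``, ``DownloadExample``, and ``UploadExample`` are hypothetical classes that do not exist in Scitacean, and the real download and upload models are Pydantic models rather than dataclasses.
Only the method names ``from_download_model`` and ``to_upload_model`` are taken from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DownloadExample:
    """Hypothetical download model: camelCase names, includes server-set fields."""

    ownerGroup: str
    createdBy: str  # set by the server, only available in downloads


@dataclass(frozen=True)
class UploadExample:
    """Hypothetical upload model: camelCase names, no read-only fields."""

    ownerGroup: str


@dataclass
class Example:
    """Hypothetical user model: snake_case names, read-only field via a property."""

    owner_group: str
    _created_by: Optional[str] = None

    @property
    def created_by(self) -> Optional[str]:
        # Read-only: there is no setter, so users cannot assign to it.
        return self._created_by

    @classmethod
    def from_download_model(cls, model: DownloadExample) -> Example:
        return cls(owner_group=model.ownerGroup, _created_by=model.createdBy)

    def to_upload_model(self) -> UploadExample:
        # Read-only fields are dropped because they cannot be set in uploads.
        return UploadExample(ownerGroup=self.owner_group)


downloaded = DownloadExample(ownerGroup="group-a", createdBy="some-user")
user_model = Example.from_download_model(downloaded)
print(user_model.to_upload_model())
```

The point of the sketch is the asymmetry: a download model carries more information than can be sent back, so converting to an upload model discards the read-only fields.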

Downloading & uploading (meta) data
-----------------------------------

.. _client:

Client
~~~~~~

:class:`scitacean.Client` is the high-level interface for downloading and uploading datasets from and to SciCat.
It deals directly with :class:`scitacean.Dataset` and :ref:`user-models`.
It also controls the download and upload of files as implemented by :ref:`file-transfers`.

.. _file-transfers:

File transfers
~~~~~~~~~~~~~~

SciCat itself only deals with metadata and files are stored separately.
However, for ease of use, :class:`scitacean.Dataset` and :class:`scitacean.Client` unify handling of metadata and files.
The latter requires `file transfers <../reference/index.rst#file-transfer>`_ to implement concrete download and upload methods.
File transfers should not be used directly but passed as arguments when constructing a ``Client``.

SciCat is deployed in diverse environments and each facility has its own ways of accessing files.
So it is necessary to pick a file transfer that is appropriate for the concrete SciCat instance in use.
Scitacean cannot guarantee that it can download or upload files for every instance of SciCat.
But it is possible to implement custom file transfers if the bundled ones are not enough.
Each transfer must satisfy the :class:`scitacean.typing.FileTransfer` protocol.
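
The idea of satisfying a protocol can be illustrated with a small, self-contained sketch.
Note that ``SketchTransfer``, ``CopyTransfer``, and ``fetch`` are hypothetical names invented for this example and the single ``download`` method is an assumption; the methods actually required are defined by :class:`scitacean.typing.FileTransfer` and should be looked up in the API reference.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class SketchTransfer(Protocol):
    """Hypothetical, simplified protocol; NOT scitacean.typing.FileTransfer."""

    def download(self, remote: str, local: Path) -> None: ...


class CopyTransfer:
    """Toy transfer that treats 'remote' paths as paths on the local filesystem."""

    def download(self, remote: str, local: Path) -> None:
        local.write_bytes(Path(remote).read_bytes())


def fetch(transfer: SketchTransfer, remote: str, local: Path) -> Path:
    # Works with any object that has a matching ``download`` method;
    # no inheritance from SketchTransfer is required.
    transfer.download(remote, local)
    return local


print(isinstance(CopyTransfer(), SketchTransfer))  # True
```

Because protocols use structural typing, a custom transfer only needs to provide the required methods; it does not have to subclass anything from the library.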

.. _scicat-client:

ScicatClient
~~~~~~~~~~~~

:class:`scitacean.client.ScicatClient` is the low-level interface for downloading and uploading metadata from and to SciCat.
In contrast to :ref:`client`, it deals with :ref:`download-models` and :ref:`upload-models`.
It does not handle files.

It should almost never be necessary to use ``ScicatClient`` directly.
If you find yourself reaching for it because ``Client`` is insufficient, please consider starting a `discussion <https://github.com/SciCatProject/scitacean/discussions>`_ or opening an `issue <https://github.com/SciCatProject/scitacean/issues/new>`_ on GitHub as this likely indicates a missing feature in the high-level client.

There are two notable exceptions to this.
The first is testing with a fake client as described in the `Testing user guide <./testing.ipynb#FakeClient>`_.
The second is when you need direct control over datablocks because Scitacean's automated handling doesn't work for you.
The latter is considered an advanced use case and out of scope for Scitacean's high-level interface.
1 change: 1 addition & 0 deletions docs/user-guide/index.rst
@@ -8,3 +8,4 @@ User guide
downloading
uploading
testing
classes-and-concepts