Added the Ingest Workflow Developer's Guide
iagaponenko committed Oct 17, 2024
1 parent d1e4716 commit f945429
Showing 43 changed files with 5,559 additions and 118 deletions.
Binary file added doc/_static/ingest-transaction-fsm.png
Binary file added doc/_static/ingest-transaction-fsm.pptx
Binary file added doc/_static/ingest-transactions-aborted.png
Binary file added doc/_static/ingest-transactions-aborted.pptx
Binary file added doc/_static/ingest-transactions-failed.png
Binary file added doc/_static/ingest-transactions-failed.pptx
Binary file added doc/_static/ingest-transactions-resolved.png
Binary file added doc/_static/ingest-transactions-resolved.pptx
608 changes: 608 additions & 0 deletions doc/admin/data-table-indexes.rst


6 changes: 3 additions & 3 deletions doc/admin/index.rst
@@ -1,7 +1,5 @@
-.. warning::
-
-   **Information in this guide is known to be outdated.** A documentation sprint is underway which will
-   include updates and revisions to this guide.
+.. _admin:

#####################
Administrator's Guide
@@ -11,3 +9,5 @@ Administrator's Guide
   :maxdepth: 4

   k8s
+  row-counters
+  data-table-indexes
176 changes: 176 additions & 0 deletions doc/admin/row-counters.rst
@@ -0,0 +1,176 @@

.. _admin-row-counters:

=========================
Row counters optimization
=========================

.. _admin-row-counters-intro:

Introduction
------------

Shortly after the first public instances of Qserv were made available to users, it was observed that many users
were launching the following query:

.. code-block:: sql

    SELECT COUNT(*) FROM <database>.<table>

Normally, Qserv would process this query by broadcasting it to all workers to count rows in each chunk
table and aggregating the results into a single number at Czar. This is essentially the same mechanism that is found
behind the *shared scan* (or just *scan*) queries. The performance of the *scan* queries is known to vary depending on
the following factors:

- the number of chunks in the table of interest
- the number of workers
- the presence of other competing queries (especially the slow ones)

In the best-case scenario such a scan would take seconds; in the worst case, many minutes or even hours.
This could quickly cause (and has caused) frustration among users, since this query looks like (and in reality is)
a very trivial non-scan query.

To address this situation, Qserv has a built-in optimization targeting exactly this class of queries.
Here is how it works. For each data table, Qserv Czar maintains an optional metadata table that stores the number
of rows in each chunk. The table is populated and managed by the Qserv Replication system.

Note that this optimization is presently optional, for the following reasons:

- Counter collection requires scanning all chunk tables, which takes time. Doing this during
  the catalog *publishing* phase would prolong the ingest time and increase the chances of instabilities
  for the workflows (in general, the longer an operation runs, the higher the probability of running into
  infrastructure-related failures).
- The counters are not needed for the purposes of the data ingest *per se*. These are just optimizations for the queries.
- Building the counters before the ingested data have been Q&A-ed may not be a good idea.
- The counters may need to be rebuilt if the data have changed (after fix-ups to the ingested catalogs).

The rest of this section, along with the formal description of the corresponding REST services, explains how to build
and manage the counters.

.. note::

    In the future, the per-chunk counters will be used for optimizing another class of unconditional queries
    presented below:

    .. code-block:: sql

        SELECT * FROM <database>.<table> LIMIT <N>
        SELECT `col`,`col2` FROM <database>.<table> LIMIT <N>

    For these "indiscriminate" data probes, Qserv would dispatch chunk queries to a subset of random chunks that have
    enough rows to satisfy the requirement specified in ``LIMIT <N>``.


.. _admin-row-counters-build:

Building and deploying
----------------------

.. warning::

    Depending on the scale of a catalog (the amount of data in the affected tables), it may take a while for this
    operation to complete.

.. note::

    Please be advised that the very same operation could be performed at catalog publishing time, as explained in:

    - :ref:`ingest-db-table-management-publish-db` (REST)

    The choice between doing this at catalog publishing time and doing it as the separate operation explained in this
    document is left to the Qserv administrators or the developers of the ingest workflows. The general recommendation
    is to make it a separate stage of the ingest workflow. In this case, the overall transition time of a catalog to
    its final published state would be shorter. In the end, the row counters optimization is optional, and it doesn't
    affect the overall functionality of Qserv or the query results seen by users.

To build and deploy the counters, use the following REST service:

- :ref:`ingest-row-counters-deploy` (REST)

The service needs to be invoked for every table of the ingested catalog. Here is a typical example of using this service;
it works regardless of whether the very same operation was already done before:

.. code-block:: bash

    curl http://localhost:25080/ingest/table-stats \
        -X POST -H "Content-Type: application/json" \
        -d '{"database":"test101",
             "table":"Object",
             "overlap_selector":"CHUNK_AND_OVERLAP",
             "force_rescan":1,
             "row_counters_state_update_policy":"ENABLED",
             "row_counters_deploy_at_qserv":1,
             "auth_key":""}'

This would work for tables of any type: *director*, *dependent*, *RefMatch*, or *regular* (fully replicated). If the counters
already existed in the Replication system's database, they would still be rescanned and redeployed there.

It may be a good idea to compare the performance of Qserv when executing the above-mentioned queries before and after
running this operation. Normally, if the table statistics are available to Qserv, it should take a small fraction of
a second (about 10 milliseconds) to see the result on a lightly loaded Qserv.
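
For instance, the counting query from the introduction could be timed before and after deploying the counters.
This is a minimal sketch assuming the catalog from the example above and a hypothetical ``mysql`` connection to
the Qserv front-end (adjust the host, port, and user to the actual deployment):

.. code-block:: bash

    # Time the full-table count; with the counters deployed this should return almost instantly.
    time mysql --host=qserv-czar --port=4040 --user=qsmaster \
        -e "SELECT COUNT(*) FROM test101.Object"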

.. _admin-row-counters-delete:

Deleting
--------

Sometimes, when there is a suspicion that the row counters were incorrectly scanned, or when Q&A-ing the ingested catalog,
a data administrator may want to remove the counters and let Qserv do the full scan of the table instead. This can be done
by using the following REST service:


- :ref:`ingest-row-counters-delete` (REST)

Like the previously explained service, this one should be invoked for each table needing attention. Here is
an example:

.. code-block:: bash

    curl http://localhost:25080/ingest/table-stats/test101/Object \
        -X DELETE -H "Content-Type: application/json" \
        -d '{"overlap_selector":"CHUNK_AND_OVERLAP","qserv_only":1,"auth_key":""}'

Note that with the combination of parameters shown above, the statistics will be removed from Qserv only.
So, the system would not need to rescan the tables should the statistics need to be rebuilt; the counters could simply
be redeployed at Qserv later. To remove the counters from the Replication system's persistent state as well,
the request should have ``qserv_only=0``.
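
Here is a minimal sketch of such a request, reusing the same table as in the example above:

.. code-block:: bash

    # Remove the counters from both Qserv and the Replication system's persistent state.
    curl http://localhost:25080/ingest/table-stats/test101/Object \
        -X DELETE -H "Content-Type: application/json" \
        -d '{"overlap_selector":"CHUNK_AND_OVERLAP","qserv_only":0,"auth_key":""}'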

An alternative technique explained in the next section is to tell Qserv not to use the counters for optimizing queries.


.. _admin-row-counters-disable:

Disabling the optimization at run-time
---------------------------------------

.. warning::

    This is a global setting that affects all users of Qserv. All new queries will be run without the optimization.
    It should be used with caution. Normally, it is meant to be used by the Qserv data administrator to investigate
    suspected issues with Qserv or the catalogs it serves.

To complement the previously explained methods for scanning, deploying, or deleting row counters for query optimization,
Qserv also supports a run-time switch. The switch is turned on or off by the following statements, which are submitted via
the front-ends of Qserv:

.. code-block:: sql

    SET GLOBAL QSERV_ROW_COUNTER_OPTIMIZATION = 1
    SET GLOBAL QSERV_ROW_COUNTER_OPTIMIZATION = 0

The default behavior of Qserv when the variable is not set is to enable the optimization for tables where the counters
are available.
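
For instance, the optimization could be disabled from the command line. This sketch reuses the same hypothetical
``mysql`` connection parameters as the timing example earlier in this document:

.. code-block:: bash

    # Disable the optimization globally (submit "... = 1" to re-enable it).
    mysql --host=qserv-czar --port=4040 --user=qsmaster \
        -e "SET GLOBAL QSERV_ROW_COUNTER_OPTIMIZATION = 0"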

.. _admin-row-counters-retrieve:

Inspecting
----------

It's also possible to retrieve the counters from the Replication system's state using the following REST service:

- :ref:`ingest-row-counters-inspect` (REST)

The information obtained in this way could be used for various purposes, such as investigating suspected issues with
the counters, monitoring data placement in the chunks, or making visual representations of the chunk density maps.
See the description of the REST service for further details on this subject.
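
A sketch of such a request is shown below. Using ``GET`` on the same ``table-stats`` endpoint as in the examples
above is an assumption; consult the description of the REST service for the actual interface:

.. code-block:: bash

    # Retrieve the previously collected row counters for a table.
    curl http://localhost:25080/ingest/table-stats/test101/Object -X GET
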
1 change: 1 addition & 0 deletions doc/conf.py
@@ -42,6 +42,7 @@
r"^https://rubinobs.atlassian.net/wiki/",
r"^https://rubinobs.atlassian.net/browse/",
r"^https://www.slac.stanford.edu/",
r".*/_images/",
]

html_additional_pages = {
71 changes: 71 additions & 0 deletions doc/ingest/api/advanced/index.rst
@@ -0,0 +1,71 @@

.. _ingest-api-advanced:

==================
Advanced Scenarios
==================

.. _ingest-api-advanced-multiple-transactions:

Multiple transactions
----------------------

.. _ingest-api-advanced-unpublishing-databases:

Ingesting tables into the published catalogs
--------------------------------------------

.. _ingest-api-advanced-transaction-abort:

Aborting transactions
----------------------

- When should a transaction be aborted?
- What happens when a transaction is aborted?
- How long does it take to abort a transaction?
- How to abort a transaction?
- What to do if a transaction cannot be aborted?
- What happens if a transaction is aborted while contributions are being processed?
- Will the transaction be aborted if the Master Replication Controller is restarted?
- Is it possible to have unfinished transactions in the system when publishing a catalog?

.. _ingest-api-advanced-contribution-requests:

Options for making contribution requests
----------------------------------------

.. _ingest-api-advanced-global-config:

Global configuration options
----------------------------

Certain configuration parameters of Qserv and the Replication/Ingest system may affect the ingest activities for
all databases. This section explains which parameters are available and how they can be retrieved via the REST API.

The number of workers
^^^^^^^^^^^^^^^^^^^^^

The first parameter is the number of workers which are available for processing the contributions. The number
can be obtained using the following REST service:

- :ref:`ingest-config-global-workers`

The workflow needs to select the workers which are available for processing the contributions. These have to be
in the following state (both conditions apply):

- ``is-enabled=1``
- ``is-read-only=0``

There are a few possibilities for how the workflow could use this information. For example, the workflow
could start a separate transaction (or a set of transactions) per worker. A sketch of selecting the eligible
workers is shown below.
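
The sketch assumes that the service returns a JSON object with a ``workers`` collection in which each entry
carries a ``name`` along with the ``is-enabled`` and ``is-read-only`` attributes. Both the endpoint path and
the response layout are assumptions; consult :ref:`ingest-config-global-workers` for the actual interface.

.. code-block:: bash

    # List the names of the workers that are eligible for processing contributions.
    curl -s http://localhost:25080/replication/config \
        | jq -r '.workers[]
                 | select(."is-enabled" == 1 and ."is-read-only" == 0)
                 | .name'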


Explain which configuration parameters of Qserv and the Replication/Ingest system may affect
the ingest activities:

- the number of workers
- the number of sync loader processing threads per worker
- the number of async loader processing threads per worker
- the maximum number of the automatic retries
- etc.

6 changes: 6 additions & 0 deletions doc/ingest/api/appendix/index.rst
@@ -0,0 +1,6 @@
.. _ingest-api-appendix:

========
APPENDIX
========

67 changes: 67 additions & 0 deletions doc/ingest/api/concepts/contributions.rst
@@ -0,0 +1,67 @@

.. _ingest-api-concepts-contributions:

Table contributions
===================

The API defines a *contribution* as a set of rows that is ingested into a table via
a separate request. The contribution ingest requests are considered atomic operations with
a few important caveats:

- The contributions are not committed to the table until the transaction is committed, even
  if the contribution request was successful.

- Failed contribution requests have to be evaluated by the workflow to see if the target table
  was left in a consistent state. This is indicated by the ``retry-allowed`` attribute returned
  in the response object. There are two scenarios for what the workflow would do based on the
  value of the flag:

  - ``retry-allowed=0`` - the workflow has to roll back the transaction and make another
    contribution request in the scope of a new transaction. See more on this subject in:

    - :ref:`ingest-api-advanced-transaction-abort` (ADVANCED)
    - :ref:`ingest-trans-management-end` (REST)

  - ``retry-allowed=1`` - the workflow can retry the contribution in the scope of the same
    transaction using:

    - :ref:`ingest-worker-contrib-retry` (REST)

Note that for contributions submitted by reference, there is an option to configure a request
to automatically retry failed contributions, as sketched below. The maximum number of such retries is controlled
by the ``num_retries`` attribute of the request:

- :ref:`ingest-worker-contrib-by-ref` (REST)
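
The sketch below shows such a request. The worker host and port, the endpoint path, and all attributes other
than ``num_retries`` are assumptions made for illustration; see the linked service description for the actual
interface:

.. code-block:: bash

    # Submit a contribution by reference, allowing up to 3 automatic retries.
    curl http://qserv-worker:25004/ingest/file \
        -X POST -H "Content-Type: application/json" \
        -d '{"transaction_id":123,
             "table":"Object",
             "url":"http://datastore.example.org/chunk_57.csv",
             "num_retries":3,
             "auth_key":""}'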

Contributions pushed to the service by value cannot be automatically retried. The workflow
has to decide on retrying the failed contributions explicitly.
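
A minimal sketch of that decision, assuming the response of the failed request was saved to ``response.json``
and that the flag is exposed at ``.contrib."retry-allowed"`` within it (an assumption; the actual layout is
covered by :ref:`ingest-worker-contrib-descriptor`):

.. code-block:: bash

    # Decide how to handle a failed contribution based on the ``retry-allowed`` flag.
    if [ "$(jq -r '.contrib."retry-allowed"' response.json)" = "1" ]; then
        echo "retry the contribution within the same transaction"
    else
        echo "abort the transaction and re-ingest in a new one"
    fi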

Data (rows) of the contributions are typically stored in ``CSV``-formatted files. In this
case, the files are either pushed directly to the worker Ingest server or uploaded by
the Ingest service from a location that is accessible to the worker:

- :ref:`ingest-worker-contrib-by-val` (REST)
- :ref:`ingest-worker-contrib-by-ref` (REST)

The first option (ingesting by value) also allows pushing the data of contributions
directly from the memory of the client process (workflow) without the need to store the data in files.

Contributions have other important characteristics, such as:

- Each contribution request is always made in the scope of a transaction. This association is
  important for data provenance and data tagging purposes.
- The information on contributions is preserved in the persistent state of the Ingest system.
- Contributions have unique identifiers that are assigned by the Ingest system.

The system allows pulling the information on contributions given their identifiers:

- :ref:`ingest-info-contrib-requests` (REST)

An alternative option is to query the information on contributions submitted in the scope of
a transaction:

- :ref:`ingest-trans-management-status-one` (REST)

The schema of the contribution descriptor objects is covered by:

- :ref:`ingest-worker-contrib-descriptor`
11 changes: 11 additions & 0 deletions doc/ingest/api/concepts/families.rst
@@ -0,0 +1,11 @@

.. _ingest-api-concepts-database-families:

Database families
=================

Explain the roles and implications of the database families in Qserv:

- to allow joins between the tables across the same family
- required by the Ingest workflow

35 changes: 35 additions & 0 deletions doc/ingest/api/concepts/index.rst
@@ -0,0 +1,35 @@

.. _ingest-api-concepts:

=============
Main concepts
=============

.. hint::

    This section of the document begins with a high-level overview of the Qserv ingest API.
    Please read this section carefully to learn about the main concepts of the API and the sequence
    of operations for ingesting catalogs into Qserv.

    After completing the overview, a reader has two options for what to read next:

    - Study the core concepts of the API in depth by visiting the subsections:

      - :ref:`ingest-api-concepts-table-types`
      - :ref:`ingest-api-concepts-transactions`
      - :ref:`ingest-api-concepts-publishing-data`
      - etc.

    - Go straight to the practical example of a simple workflow presented at:

      - :ref:`ingest-api-simple`

.. toctree::
   :maxdepth: 4

   overview
   table-types
   transactions
   contributions
   publishing
   families
