@@ -26,31 +26,31 @@ Running faro | |
Shared Gen3 Data Repositories | ||
----------------------------- | ||
|
||
-Information on shared Gen3 data repositories for data managed at NCSA can be found in `/repo/README.md`. See `DMTN-167 <https://dmtn-167.lsst.io/>`_ for more information on the organization of Gen3 data repositories.
+Information on shared Gen3 data repositories for data managed at NCSA can be found in `/repo/README.md`. See `DMTN-167 <https://dmtn-167.lsst.io/>`__ for more information on the organization of Gen3 data repositories.
|
||
-.. warning:: When developing metrics in ``faro``, particular care should be taken when creating a new dataset type name associated with a metric. As noted in `DMTN-167 <https://dmtn-167.lsst.io/#naming-conventions-for-dataset-types>`_, the dataset type names are *global* with no implicit name spacing. This may change in the future; see `DM-29817 <https://jira.lsstcorp.org/browse/DM-29817>`_. When developing metrics, it is recommended to run on a local data repository rather than a shared Gen3 data repository in case metrics need to be renamed or the `dimensions <https://pipelines.lsst.io/modules/lsst.daf.butler/dimensions.html>`_ associated with metric calculation need to be changed.
+.. warning:: When developing metrics in ``faro``, particular care should be taken when creating a new dataset type name associated with a metric. As noted in `DMTN-167 <https://dmtn-167.lsst.io/#naming-conventions-for-dataset-types>`__, the dataset type names are *global* with no implicit name spacing. This may change in the future; see `DM-29817 <https://jira.lsstcorp.org/browse/DM-29817>`_. When developing metrics, it is recommended to run on a local data repository rather than a shared Gen3 data repository in case metrics need to be renamed or the `dimensions <https://pipelines.lsst.io/modules/lsst.daf.butler/dimensions.html>`_ associated with metric calculation need to be changed.
|
||
Example: rc2_subset | ||
------------------- | ||
|
||
Running ``faro`` on a small local dataset. The `rc2_subset <git@github.com:lsst/rc2_subset.git>`_ is the smallest CI dataset for which all ``faro`` metrics can be run without error and produce meaningful results.
|
||
-1. Set up ``rc2_subset`` following the instructions `here <https://pipelines.lsst.io/v/daily/getting-started/data-setup.html#downloading-the-sample-hsc-data>`_.
+1. Set up ``rc2_subset`` following the instructions `here <https://pipelines.lsst.io/v/daily/getting-started/data-setup.html#downloading-the-sample-hsc-data>`__.
|
||
2. Set up ``faro`` package; see :ref:`setting up <lsst.faro.setting_up>`. | ||
|
||
3. An example command (update the command):: | ||
|
||
pipetask run -b $RC2_SUBSET_DIR/SMALL_HSC/butler.yaml -p $FARO_DIR/pipelines/metrics_pipeline_matched.yaml -i u/$USER/single_frame -o u/$USER/faro_matched_visits_r --register-dataset-types -d "instrument='HSC' AND detector=42 AND band='r'" | ||
|
||
-Documentation for using the ``pipetask run`` command and various options can be found `here <https://pipelines.lsst.io/modules/lsst.ctrl.mpexec/pipetask.html#pipetask-run>`_. Briefly, the example command above uses the ``-b`` option to specify the Butler repository, ``-p`` to specify the pipeline, ``-i`` to specify the input collection, ``-o`` to specify the output collection (this should almost always be a user collection prefixed with ``u/username/`` unless you are running in production), and ``-d`` to provide a query to select a subset of data on which to compute metrics.
+Documentation for using the ``pipetask run`` command and various options can be found `here <https://pipelines.lsst.io/modules/lsst.ctrl.mpexec/pipetask.html#pipetask-run>`__. Briefly, the example command above uses the ``-b`` option to specify the Butler repository, ``-p`` to specify the pipeline, ``-i`` to specify the input collection, ``-o`` to specify the output collection (this should almost always be a user collection prefixed with ``u/username/`` unless you are running in production), and ``-d`` to provide a query to select a subset of data on which to compute metrics.
|
||
.. warning:: The ``--register-dataset-types`` option should be used with caution as this will allow the registration of new dataset types that are global across the repository. | ||
|
||
Example: HSC RC2 dataset | ||
------------------------ | ||
|
||
-Running ``faro`` on a Gen3 repository at NCSA. The HSC RC2 data that is reprocessed monthly with the latest version of the Science Pipelines is a good example; see `DMTN-091<https://dmtn-091.lsst.io>`_. Information on the current status of HSC RC2 re-processing and latest runs can be found `here <https://jira.lsstcorp.org/browse/DM-26911>`_.
+Running ``faro`` on a Gen3 repository at NCSA. The HSC RC2 data that is reprocessed monthly with the latest version of the Science Pipelines is a good example; see `DMTN-091 <https://dmtn-091.lsst.io>`_. Information on the current status of HSC RC2 re-processing and latest runs can be found `here <https://jira.lsstcorp.org/browse/DM-26911>`__.
|
||
1. Set up ``lsst.faro`` package; see :ref:`setting_up <lsst.faro.setting_up>`. | ||
|
||
|
@@ -65,7 +65,7 @@ Example: DRP processing | |
|
||
``lsst.faro`` can be run together with other processing steps in a pipeline, e.g., as part of DRP processing. | ||
|
||
-Examples of this functionality can be found in the `rc2_subset <https://github.com/lsst/drp_pipe/blob/main/pipelines/HSC/DRP-RC2_subset.yaml>`_.
+Examples of this functionality can be found in the `rc2_subset <https://github.com/lsst/drp_pipe/blob/main/pipelines/HSC/DRP-RC2_subset.yaml>`__.
One could follow the steps `in this tutorial <https://pipelines.lsst.io/v/daily/getting-started/singleframe.html#running-single-frame-processing>`_ for more information. | ||
|
||
.. _lsst.faro.adding_a_metric: | ||
|
@@ -131,15 +131,15 @@ All development should happen on ticket branches (and should have associated JIR | |
Adding a Metric | ||
--------------- | ||
|
||
-1. Identify the analysis context. Review the associated connections, config, and task base classes for that analysis context to understand the in-memory python objects that will be passed to the ``run`` method of the metric measurement task and the configuration options. See :ref:`design concepts <lsst.faro.design_concepts>` for more information. Currently implemented analysis contexts are listed :ref:`here<lsst.faro.currently_implemented_analysis_contexts>`.
+1. Identify the analysis context. Review the associated connections, config, and task base classes for that analysis context to understand the in-memory python objects that will be passed to the ``run`` method of the metric measurement task and the configuration options. See :ref:`design concepts <lsst.faro.design_concepts>` for more information. Currently implemented analysis contexts are listed :ref:`here <lsst.faro.currently_implemented_analysis_contexts>`.
|
||
2. Implement Measurement task. This will be an instance of ``lsst.pipe.base.Task`` that performs the specific operations of a given metric. See ``NumSourcesTask`` defined in `BaseSubTasks.py <https://github.com/lsst/faro/blob/master/python/lsst/faro/base/BaseSubTasks.py>`_ for a simple example metric that returns the number of rows in an input source/object catalog. Additional examples of measurement tasks can be found in the ``python/lsst/faro/measurement`` directory of the package. | ||
|
||
3. Implement unit tests. All algorithmic code used for metric computation should have associated unit tests. Examples can be found in the package ``tests`` directory. | ||
|
||
4. Add metric to a pipeline yaml file. The pipeline yaml contains the configuration information to execute metrics. See `measurement_visit_table.yaml <https://github.com/lsst/faro/blob/master/pipelines/measurement/measurement_visit_table.yaml>`_ for an example that uses ``VisitTableMeasurementTask`` to count the number of rows in an input source/object catalog. Additional examples of pipeline files can be found in the ``pipelines/measurement`` directory of the package.
|
||
-5. Name the metric. Currently each metric is associated with a separately named dataset type that is global (more info :ref:`here<lsst.faro.shared>`). To date, metric names have followed the pattern "metricvalue_{package}_{metric}" where the "package" and "metric" are given in the yaml configuration file. Metric naming conventions are an area of active development, and it is recommended to contact the ``faro`` development team for up-to-date guidance.
+5. Name the metric. Currently each metric is associated with a separately named dataset type that is global (more info :ref:`here <lsst.faro.shared>`). To date, metric names have followed the pattern "metricvalue_{package}_{metric}" where the "package" and "metric" are given in the yaml configuration file. Metric naming conventions are an area of active development, and it is recommended to contact the ``faro`` development team for up-to-date guidance.
|
||
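
The naming pattern in step 5 can be spelled out before a dataset type is registered, which makes it easier to check a proposed name against existing ones. A small sketch, assuming only the "metricvalue_{package}_{metric}" pattern quoted above; the helper name ``metric_dataset_type_name`` and the values ``my_package`` and ``my_metric`` are placeholders, not names used by ``faro``.

.. code-block:: bash

   # Hypothetical helper: compose a dataset type name from the package and
   # metric names given in the pipeline yaml configuration.
   metric_dataset_type_name() {
       printf 'metricvalue_%s_%s\n' "$1" "$2"
   }

   # Placeholder values for illustration only.
   metric_dataset_type_name my_package my_metric
   # prints: metricvalue_my_package_my_metric

Because dataset type names are global, it may be worth comparing the composed name against those already present in the target repository before running with ``--register-dataset-types``.
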
Review | ||
------ | ||
|