Merge pull request #12 from pytorch-lumo/dev1
Weekly update 2023.03.26
sailist authored Mar 26, 2023
2 parents 08ddb05 + a209c97 commit b713ec0
Showing 41 changed files with 1,295 additions and 725 deletions.
2 changes: 2 additions & 0 deletions README.ch.md
@@ -17,6 +17,7 @@
- Data: abstraction of the dataset-building process, combining multiple DataLoaders, ...
- Distributed training: also supports multiple training-acceleration frameworks under a unified abstraction, making it easy to switch at any time
- More utilities...
+ - Supports auto-completion in modern IDEs as much as possible

![lumo-framework](./images/lumo-intro.png)

@@ -33,6 +34,7 @@
- [More](#more)
- :pencil: [Acknowledge](#pencil-acknowledge)
- :scroll: [License](#scroll-license)
+ - [Full Documentation](https://pytorch-lumo.github.io/lumo/)

# :cloud: Installation

11 changes: 7 additions & 4 deletions README.md
@@ -25,6 +25,7 @@ and focuses on enhancing the experience of deep learning practitioners.
- **Distributed Training:** Also supports multiple training acceleration frameworks, unified abstraction, and easy
switching at any time.
- More utilities...
+ - **Type Hints:** Supports modern IDEs' auto-completion as much as possible.

![lumo-framework](./images/lumo-intro.png)

@@ -38,6 +39,7 @@ and focuses on enhancing the experience of deep learning practitioners.
- :small_orange_diamond: [re-run](#small_orange_diamond-re-run)
- :small_orange_diamond: [backup](#small_orange_diamond-backup)
- :scroll: [License](#scroll-license)
+ - [Full Documentation](https://pytorch-lumo.github.io/lumo/)

# :cloud: Installation

@@ -65,9 +67,9 @@ Here are two classic scenarios:

## :small_orange_diamond: Embedding into Existing Projects

- For existing projects, you can quickly embed Lumo by following these steps:
+ For existing projects, you can quickly embed `lumo` by following these steps:

- - Import Lumo and initialize Logger and Experiment:
+ - Import `lumo` and initialize Logger and Experiment:

```python
import random
@@ -117,8 +119,9 @@ exp.end()

## :small_orange_diamond: Building from Scratch

- If you want to start a new deep learning experiment from scratch, you can use Lumo to accelerate your code development.
- Below are examples of Lumo training at different scales:
+ If you want to start a new deep learning experiment from scratch, you can use `lumo` to accelerate your code
+ development.
+ Below are examples of `lumo` training at different scales:

one-file training:

6 changes: 4 additions & 2 deletions docs/source/conf.py
@@ -31,7 +31,9 @@ def extract_version():
import os
import sys

- sys.path.insert(0, os.path.abspath('../../src/'))
+ sys.path.insert(0, Path(__file__).parent.parent.joinpath('src').as_posix())
+
+ # sys.path.insert(0, os.path.abspath('../../src/'))

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
@@ -130,7 +132,7 @@ def setup(app: Sphinx):
# '.md': 'markdown',
# }
#
- commonmark_suffixes = ['.rst']
+ # commonmark_suffixes = ['.rst']

source_parsers = {
'.md': CommonMarkParser,
2 changes: 1 addition & 1 deletion docs/source/custom_rtd_theme/versions.html
@@ -5,7 +5,7 @@
<div class="rst-versions" data-toggle="rst-versions" role="note" aria-label="{{ _('Versions') }}">
<span class="rst-current-version" data-toggle="rst-current-version">
<span class="fa fa-book"> Read the Docs</span>
- v: {{ current_version }}
+ v: {{ current_version.name }}
<span class="fa fa-caret-down"></span>
</span>
<div class="rst-other-versions">
22 changes: 17 additions & 5 deletions docs/source/index.rst
@@ -1,16 +1,28 @@
.. lumo documentation master file, created by
-    sphinx-quickstart on Sat Mar 18 14:41:26 2023.
-    You can adapt this file completely to your liking, but it should at least
-    contain the root `toctree` directive.
+   sphinx-quickstart on Sat Mar 18 14:41:26 2023.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.

Welcome to lumo's documentation!
================================

.. toctree::
:maxdepth: 1
:caption: Tutorial

tutorial/reproducibility.md
tutorial/configuration.md
tutorial/dataset_builder.md




.. toctree::
:maxdepth: 2
-    :caption: Contents
+    :caption: Development



../tutorial/getting_start.md

Indices and tables
==================
Empty file added docs/source/others/why_lumo.md
Empty file.
94 changes: 94 additions & 0 deletions docs/source/tutorial/configuration.md
@@ -0,0 +1,94 @@
# Runtime Configuration and Params

## Params

`~lumo.Params` is used to specify the configuration required for the current experiment. Besides defining parameters with autocompletion support, it also supports command-line arguments, inheritance, and reading from multiple configuration files.

The simplest usage is as follows:

```python
from lumo import Params

params = Params()
params.lr = 1e-3
params.dataset = 'cifar10'
params.from_args() # python main.py --dataset=cifar100

print(params.dataset)
>>> "cifar100"
```

Constrain a parameter to a set of allowed values:

```python
params.dataset = params.choice('cifar10', 'cifar100')
print(params.dataset)
>>> "cifar10" # by default is the first value

params.dataset = "imagenet"
>>> raise BoundCheckError: value of param 'dataset' should in values ('cifar10', 'cifar100'), but got imagenet
```
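For intuition, the bound check above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not lumo's implementation; `ChoiceParams` and its `choice` signature are invented for this sketch:

```python
# Illustrative sketch of choice-style bound checking (not lumo's code):
# the first value registered becomes the default, and later assignments
# are validated against the allowed set.

class BoundCheckError(ValueError):
    pass


class ChoiceParams(dict):
    def __init__(self):
        super().__init__()
        self._choices = {}

    def choice(self, name, *values):
        # Register the allowed values; the first one is the default.
        self._choices[name] = values
        self[name] = values[0]

    def __setitem__(self, key, value):
        allowed = self._choices.get(key)
        if allowed is not None and value not in allowed:
            raise BoundCheckError(
                f"value of param {key!r} should be in {allowed}, but got {value}")
        super().__setitem__(key, value)


params = ChoiceParams()
params.choice('dataset', 'cifar10', 'cifar100')
print(params['dataset'])  # cifar10
```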

Read from other locations:

```python
params.from_json("*.json")
params.from_yaml("*.yaml")
params.from_yaml("*.yml")
params.from_dict({})
```

`params.config` (or its alias `params.c`) is a built-in reserved parameter. When its value is a string that points to a yaml or json file (or a list of such paths), the configuration is read from that location:

`cfg.json`:

```json
{
    "dataset": "cifar100"
}
```

```python
params.from_args(['--c','cfg.json'])
print(params.dataset)
>>> "cifar100"
```
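The behavior described above can be sketched in plain Python: scan the argument list for `--c`, and merge the JSON file it points to into the parameter dict. This is a hedged illustration, not lumo's code; `from_args` here is a hypothetical stand-in with a simplified signature:

```python
# Sketch of the `--c` mechanism (illustrative only, not lumo's loader):
# if the flag points at an existing JSON file, merge its keys in.
import json
import os
import tempfile


def from_args(params, argv):
    it = iter(argv)
    for token in it:
        if token in ('--c', '--config'):
            path = next(it)
            if path.endswith('.json') and os.path.exists(path):
                with open(path) as f:
                    params.update(json.load(f))
    return params


# Usage: write a config file, then load it via the flag.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump({'dataset': 'cifar100'}, f)
    cfg_path = f.name

params = {'dataset': 'cifar10'}
from_args(params, ['--c', cfg_path])
print(params['dataset'])  # cifar100
os.unlink(cfg_path)
```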

## Configuration

`lumo` provides a multi-level configuration system with three file locations:

```text
~/.lumorc.json -> user-level
<repo>/.lumorc.json -> repo-level, private
<repo>/.lumorc.public.json -> repo-level, public
```

All configurations are loaded into `lumo.glob` at runtime for global settings:

```python
from lumo import glob

glob['xxx']
```

## Difference between Configuration and Hyperparameters

In `lumo`, configuration is mostly used for non-experiment content tied to the machine environment and `lumo`'s behavior, such as dataset locations, GitHub access tokens, etc. All optional behaviors supported by `lumo` can be controlled by modifying the configuration in `glob`. The currently supported configurable items are:

| Configuration | Description |
| --- | --- |
| github_access_token | Replaces the access_token parameter of the exp.backup() method. |
| exp_root | One of several initial paths. |
| db_root | One of several initial paths. |
| progress_root | One of several initial paths. |
| metric_root | One of several initial paths. |
| cache_dir | One of several initial paths. |
| blob_root | One of several initial paths. |
| timezone | Determines the timezone used by lumo. Default is 'Asia/Shanghai'. |
| TRAINER_LOGGER_STDIO | Controls whether the Logger outputs to the standard output stream. |
| dev_branch | The branch used for saving code snapshots during version control. Default is 'lumo_experiments'. |
| HOOK_LOCKFILE | Behavior control for loading LockFile ExpHook. |
| HOOK_RECORDABORT | Behavior control for loading RecordAbort ExpHook. |
| HOOK_GITCOMMIT | Behavior control for loading GitCommit ExpHook. |
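
Assuming the conventional precedence that repo-level files override the user-level file (an assumption of this sketch, not a documented guarantee), the multi-level loading can be illustrated like this:

```python
# Hedged sketch: merge user-level and repo-level JSON configs, with
# later (repo-level) files winning on key conflicts.  The file names
# come from the list above; the loader itself is illustrative, not
# lumo's actual code.
import json
import os


def load_configs(paths):
    merged = {}
    for path in paths:  # later files override earlier ones
        if os.path.exists(path):
            with open(path) as f:
                merged.update(json.load(f))
    return merged


paths = [
    os.path.expanduser('~/.lumorc.json'),  # user-level
    '.lumorc.json',                        # repo-level, private
    '.lumorc.public.json',                 # repo-level, public
]
config = load_configs(paths)
```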

54 changes: 54 additions & 0 deletions docs/source/tutorial/dataset_builder.md
@@ -0,0 +1,54 @@
# Build your Dataset Easily

`lumo` provides `~lumo.DatasetBuilder` as a unified interface for constructing datasets, which greatly reduces repetitive dataset code in most cases.

Taking the CIFAR10 dataset as an example: if each image should be output with two different augmentations, you must either modify the Dataset class or rewrite the transform function:

```python
class MyCIFAR(CIFAR10):
    ...

    def __getitem__(self, index):
        sample = self.data[index]
        label = self.targets[index]
        return self.transform1(sample), self.transform2(sample), label


# or

def two_transform(sample):
...
return transform1(sample), transform2(sample)
```

Rewriting datasets this way across multiple datasets is time-consuming, especially when the output format is not yet settled and may change frequently.

To solve this, `lumo` provides a universal, streaming solution through `DatasetBuilder`. You only need to prepare the raw data in a standard format, plus standard one-to-one augmentation functions:

```python
...

source = CIFAR10()
transform1 = ...
transform2 = ...
```

Then, any output format can be defined through `DatasetBuilder`:

```python
from lumo import DatasetBuilder

ds = (
    DatasetBuilder()
    # Define input stream
    .add_input('xs', source.data)
    .add_input('ys', source.targets)
    # Define output stream
    .add_output('xs', 'xs1', transform1)
    .add_output('xs', 'xs2', transform2)
    .add_output('ys', 'ys')
)

print(ds[0])
>>> {'xs1': ..., 'xs2': ..., 'ys': ...}
```
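To make the input/output-stream idea concrete, here is a tiny stand-alone re-implementation of the pattern. `MiniBuilder` is an illustration of the concept, not lumo's `DatasetBuilder`:

```python
# Minimal re-implementation of the input/output-stream pattern
# (illustrative only): inputs are named data sources, and each output
# maps an input through an optional transform under a new key.
class MiniBuilder:
    def __init__(self):
        self._inputs = {}
        self._outputs = []  # (input_name, output_key, transform)

    def add_input(self, name, data):
        self._inputs[name] = data
        return self

    def add_output(self, name, outkey, transform=None):
        self._outputs.append((name, outkey, transform))
        return self

    def __getitem__(self, index):
        sample = {}
        for name, outkey, transform in self._outputs:
            value = self._inputs[name][index]
            sample[outkey] = transform(value) if transform else value
        return sample


ds = (
    MiniBuilder()
    .add_input('xs', [1, 2, 3])
    .add_input('ys', [0, 1, 0])
    .add_output('xs', 'xs1', lambda x: x * 10)
    .add_output('xs', 'xs2', lambda x: x + 1)
    .add_output('ys', 'ys')
)
print(ds[0])  # {'xs1': 10, 'xs2': 2, 'ys': 0}
```

Because outputs only reference inputs by name, adding a third augmentation is one more `add_output` line rather than a new Dataset subclass.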
1 change: 0 additions & 1 deletion docs/source/tutorial/getting_start.md

This file was deleted.

Some of the 41 changed files are not shown because they could not be rendered in this view.

0 comments on commit b713ec0