temp cmt
zezhishao committed Sep 3, 2024
1 parent 889185c commit d1e193a
Showing 2 changed files with 23 additions and 4 deletions.
17 changes: 13 additions & 4 deletions tutorial/dataset_design.md
@@ -1,6 +1,6 @@
# 📦 Dataset Design

-## Data Download
+## Data Download

To get started with the datasets, download the `all_data.zip` file from either [Google Drive](https://drive.google.com/drive/folders/14EJVODCU48fGK0FkyeVom_9lETh80Yjp?usp=sharing) or [Baidu Netdisk](https://pan.baidu.com/s/1shA2scuMdZHlx6pj35Dl7A?pwd=s2xe). After downloading, unzip the files into the `datasets/` directory:

@@ -13,7 +13,7 @@ rmdir datasets/all_data

These datasets are preprocessed and ready for immediate use.

-## Data Format
+## 💿 Data Format

Each dataset contains at least two essential files: `data.dat` and `desc.json`:

@@ -38,7 +38,7 @@ Each dataset contains at least two essential files: `data.dat` and `desc.json`:
- Evaluation metrics
- Handling of outliers

-## Dataset Class Design
+## 🧑‍💻 Dataset Class Design

<div align="center">
<img src="figures/DatasetDesign.jpeg" height=250>
@@ -48,10 +48,19 @@ In time series forecasting, datasets are typically generated from raw time serie

BasicTS provides a built-in `Dataset` class called [`TimeSeriesForecastingDataset`](../basicts/data/simple_tsf_dataset.py), designed specifically for time series data. This class generates samples in the form of a dictionary containing two objects: `inputs` and `target`. `inputs` represents the input data, while `target` represents the target data. Detailed documentation can be found in the class's comments.

-## How to Add or Customize Datasets
+## 🧑‍🍳 How to Add or Customize Datasets

If your dataset follows the structure described above, you can preprocess your data into the `data.dat` and `desc.json` format and place it in the `datasets/` directory, e.g., `datasets/YOUR_DATA/{data.dat, desc.json}`. BasicTS will then automatically recognize and utilize your dataset.

For reference, you can review the scripts in `scripts/data_preparation/`, which are used to process datasets from `raw_data.zip` ([Google Drive](https://drive.google.com/drive/folders/14EJVODCU48fGK0FkyeVom_9lETh80Yjp?usp=sharing), [Baidu Netdisk](https://pan.baidu.com/s/1shA2scuMdZHlx6pj35Dl7A?pwd=s2xe)).

If your dataset does not conform to the standard format or has specific requirements, you can define your own dataset class by inheriting from `torch.utils.data.Dataset`. In this custom class, the `__getitem__` method should return a dictionary containing `inputs` and `target`.
+
+## Explore Further
+
+- **🧠 [Diving into the Model Convention and Creating Your Own Model](./model.md)**
+- **📉 [Examining the Metrics Convention and Developing Your Own Loss & Metrics](./metrics_design.md)**
+- **🛠️ [Navigating The Scaler Convention and Designing Your Own Scaler](./scaler_design.md)**
+- **🏃‍♂️ [Mastering The Runner Convention and Building Your Own Runner](./runner_design.md)**
+- **📜 [Interpreting the Config File Convention and Customizing Your Configuration](./config_design.md)**
+- **🔍 [Exploring a Variety of Baseline Models](../baselines/)**
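
Note on the custom dataset paragraph in `tutorial/dataset_design.md`: the text says a custom class should inherit from `torch.utils.data.Dataset` and that `__getitem__` should return a dictionary containing `inputs` and `target`. A minimal sketch of that contract might look like the following. The class name `SlidingWindowDataset`, its constructor arguments, and the sliding-window slicing are illustrative assumptions, not BasicTS code; a real implementation would most likely return tensors rather than Python lists.

```python
try:
    from torch.utils.data import Dataset  # real base class when PyTorch is installed
except ImportError:
    Dataset = object  # fallback so this sketch still runs without PyTorch


class SlidingWindowDataset(Dataset):
    """Hypothetical custom dataset: cuts a series into (inputs, target) windows."""

    def __init__(self, series, input_len, output_len):
        self.series = list(series)
        self.input_len = input_len
        self.output_len = output_len

    def __len__(self):
        # Number of complete input+target windows that fit in the series.
        return max(0, len(self.series) - self.input_len - self.output_len + 1)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        split = index + self.input_len
        # The tutorial only requires these two dictionary keys.
        return {
            "inputs": self.series[index:split],
            "target": self.series[split:split + self.output_len],
        }


dataset = SlidingWindowDataset(range(10), input_len=4, output_len=2)
sample = dataset[0]
```

Here `sample["inputs"]` holds the first four values of the series and `sample["target"]` the two values that follow them.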
10 changes: 10 additions & 0 deletions tutorial/overall_design.md
@@ -19,3 +19,13 @@ To streamline the configuration of training strategies and centralize all option
Users can easily configure models, datasets, scaling methods, evaluation metrics, optimizers, learning rates, and other hyperparameters by modifying the configuration file—**as simple as filling out a form**.

For example, setting `CFG.TRAIN.EARLY_STOPPING_PATIENCE = 10` enables early stopping with a patience level of 10.
+
+## Explore Further
+
+- **📦 [Exploring the Dataset Convention and Customizing Your Own Dataset](./dataset_design.md)**
+- **🧠 [Diving into the Model Convention and Creating Your Own Model](./model.md)**
+- **📉 [Examining the Metrics Convention and Developing Your Own Loss & Metrics](./metrics_design.md)**
+- **🛠️ [Navigating The Scaler Convention and Designing Your Own Scaler](./scaler_design.md)**
+- **🏃‍♂️ [Mastering The Runner Convention and Building Your Own Runner](./runner_design.md)**
+- **📜 [Interpreting the Config File Convention and Customizing Your Configuration](./config_design.md)**
+- **🔍 [Exploring a Variety of Baseline Models](../baselines/)**
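
Note on the `CFG.TRAIN.EARLY_STOPPING_PATIENCE` example in `tutorial/overall_design.md`: the sketch below illustrates what a patience value usually means in early stopping, namely that training halts once the validation loss has failed to improve for that many consecutive epochs. Only the option name comes from the tutorial. The `SimpleNamespace` stand-in for the config object, the helper `epochs_until_stop`, and the loss values are assumptions for illustration, and the actual runner logic in BasicTS may differ.

```python
from types import SimpleNamespace

# Stand-in for the config object; only the option name comes from the tutorial.
CFG = SimpleNamespace(TRAIN=SimpleNamespace(EARLY_STOPPING_PATIENCE=3))


def epochs_until_stop(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Made-up validation losses: the best value (0.7) is reached at epoch 2.
losses = [0.9, 0.7, 0.8, 0.75, 0.71, 0.6]
stop_epoch = epochs_until_stop(losses, CFG.TRAIN.EARLY_STOPPING_PATIENCE)
```

With a patience of 3, the run stops at epoch 5, so the later improvement at epoch 6 is never seen; a larger patience trades longer training for a lower chance of stopping too early.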
