diff --git a/README.md b/README.md
index 936223e..f656cf6 100644
--- a/README.md
+++ b/README.md
@@ -15,14 +15,14 @@ We are exploring the generation of new Bliss vocabulary using emerging AI techni

 ### Prerequisites

-* [Python 3](https://www.python.org/downloads/)
-  * Version 3.9+. On Mac, Homebrew is the easiest way to install.
+- [Python 3](https://www.python.org/downloads/)
+  - Version 3.9+. On Mac, Homebrew is the easiest way to install.

 ### Clone the Repository

-* Clone the project from GitHub. [Create a fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo)
-with your GitHub account, then run the following in your command line (make sure to replace `your-username` with
-your username):
+- Clone the project from GitHub. [Create a fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo)
+  with your GitHub account, then run the following in your command line (make sure to replace `your-username` with
+  your username):

 ```bash
 git clone https://github.com/your-username/baby-bliss-bot
@@ -30,26 +30,29 @@ cd baby-bliss-bot
 ```

 ### Create/Activate Virtual Environment
+
 Always activate and use the Python virtual environment to maintain an isolated environment for the project's dependencies.

-* [Create the virtual environment](https://docs.python.org/3/library/venv.html)
-  (one time setup):
-  - `python -m venv .venv`
+- [Create the virtual environment](https://docs.python.org/3/library/venv.html)
+  (one-time setup):
+
+  - `python -m venv .venv`

-* Activate (every command-line session):
+- Activate (every command-line session):
   - Windows: `.\.venv\Scripts\activate`
   - Mac/Linux: `source .venv/bin/activate`

 ### Install Python Dependencies

 Run in the baby-bliss-bot directory:
-* `pip install -r requirements.txt`
+
+- `pip install -r requirements.txt`

 ## Linting

 Run the following command to lint all Python scripts:

-* `flake8`
+- `flake8`

 ## Model Experiments

@@ -65,10 +68,14 @@ on how to train this model, training results and the conclusion about how useful

 ### Texture Inversion

-Concolusion: not useful
+Conclusion: not useful

 See the [Texture Inversion documentation](./notebooks/README.md) for details.

+### CLIP Training
+
+See the [CLIP Training documentation](./docs/CLIP-Training.md) for details.
+
 ## Notebooks

 [`/notebooks`](./notebooks/) directory contains all notebooks used for training or fine-tuning various models.
@@ -76,10 +83,11 @@ Each notebook usually comes with a accompanying `dockerfile.yml` to elaborate th
 running in.

 ## Jobs
+
 [`/jobs`](./jobs/) directory contains all jobs used for training or fine-tuning various models.

 ## Utility Scripts

-All utility functions are in the [`utils`](./utils) directory.
+All utility functions are in the [`utils`](./utils) directory.

 See [README.md](./utils/README.md) in the [`utils`](./utils) directory for details.
diff --git a/docs/CLIP-Training.md b/docs/CLIP-Training.md
new file mode 100644
index 0000000..e5ef7c6
--- /dev/null
+++ b/docs/CLIP-Training.md
@@ -0,0 +1,51 @@
+# Train CLIP
+
+This article documents training a CLIP model with Bliss symbol image files and an annotated JSON file containing metadata for each image.
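+
+For orientation, the annotation file follows the COCO captions layout (see the dataset section below), whose general shape is sketched here. The ids, file name, and caption are hypothetical examples rather than entries from the actual Bliss dataset, and non-essential COCO fields are omitted:
+
+```
+{
+  "images": [
+    {"id": 1, "file_name": "bliss_symbol_0001.png"}
+  ],
+  "annotations": [
+    {"id": 1, "image_id": 1, "caption": "Bliss symbol meaning house"}
+  ]
+}
+```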
+
+The following steps were followed for this training:
+
+## Set Up the Environment
+
+```
+# create new env clip_train
+conda create -n clip_train python=3.8.5
+
+# activate clip_train
+conda activate clip_train
+
+# install pytorch, torchvision
+conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.2 -c pytorch
+
+# install additional dependency
+pip install future
+
+# install other dependencies
+pip install -r requirements.txt
+```
+
+## Clone the [clip-training](https://github.com/revantteotia/clip-training) repository
+
+This repository contains code to train [CLIP](https://github.com/openai/CLIP) on [MS-COCO](https://cocodataset.org/#home) captions.
+
+## Extract the Bliss dataset into the `data` directory
+
+The Bliss annotated dataset follows the structure of the COCO dataset. The Bliss dataset, including images and annotations, can be downloaded [here](https://drive.google.com/file/d/1kSE4egEvg2g5wKZLHCFTE1ZijUf0ZC2_/view?usp=sharing).
+
+## Update `dataloader/data_config.yaml`
+
+In the cloned repository, point the data loader at the Bliss dataset:
+
+```
+train_img_dir : 'data/bliss/train'
+train_annotation_file : 'data/bliss/annotations/bliss_data_annotated_CLIP.json'
+```
+
+## Run the training
+
+The training script takes the dataset paths from `dataloader/data_config.yaml`:
+
+```
+$ python train.py
+```
+
+## Results
+
+The trained model checkpoint can be downloaded here: [checkpoint_34_3395.pt.tar.gz](https://drive.google.com/file/d/1J_U2yW9MmRa4f23044brM_Winku507ZL/view?usp=sharing).
diff --git a/jobs/README.md b/jobs/README.md
index a815d16..d18639d 100644
--- a/jobs/README.md
+++ b/jobs/README.md
@@ -3,13 +3,14 @@
 This directory contains all jobs used for training or fine-tuning various models.

 ## StyleGAN2-ADA
+
 The [stylegan2-ada](./stylegan2-ada) folder contains:

-  - `def-styleGan2AdaPytorchDataSetupBatch.sh` is the SBatch script for preparing the training dataset for StyleGAN2-ADA. The script uses the `def-whkchun` cluster.
-  - `def-styleGAN2AdaPytorchTrainBatch.sh` is the SBatch script for training. The script uses the `def-whkchun` cluster.
-  - `ctb-styleGAN2AdaPytorchGenerateBatch.sh` is the SBatch script for generating an image from the StyleGAN2-ADA model. The script uses the `ctb-whkchun` cluster.
-  - `def-styleGAN2AdaPytorchGenerateBatch.sh` is the SBatch script that also can be used to generate images from the StyleGAN2-ADA model. This version uses the `def-whkchun` cluster.
-  - `requirements.txt` shows the packages used by the PyTorch implementation of StyleGAN2-ADA. Note that this is not used to create the environment, but to document the environment after it was created.
+- `def-styleGan2AdaPytorchDataSetupBatch.sh` is the SBatch script for preparing the training dataset for StyleGAN2-ADA. The script uses the `def-whkchun` cluster.
+- `def-styleGAN2AdaPytorchTrainBatch.sh` is the SBatch script for training. The script uses the `def-whkchun` cluster.
+- `ctb-styleGAN2AdaPytorchGenerateBatch.sh` is the SBatch script for generating an image from the StyleGAN2-ADA model. The script uses the `ctb-whkchun` cluster.
+- `def-styleGAN2AdaPytorchGenerateBatch.sh` is the SBatch script that can also be used to generate images from the StyleGAN2-ADA model. This version uses the `def-whkchun` cluster.
+- `requirements.txt` shows the packages used by the PyTorch implementation of StyleGAN2-ADA. Note that this is not used to create the environment, but to document the environment after it was created.
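+
+For orientation, a minimal SBatch script has the general shape sketched below. This is a hypothetical example: the resource values, module version, and training flags are illustrative placeholders, not taken from the actual scripts in this folder.
+
+```bash
+#!/bin/bash
+#SBATCH --account=def-whkchun   # allocation account the job is billed to
+#SBATCH --gres=gpu:1            # request one GPU
+#SBATCH --mem=32G               # memory for the job
+#SBATCH --time=24:00:00         # wall-clock time limit
+
+# load the Python runtime and activate the project environment
+module load python/3.9
+source ~/venv-stylegan2-ada/bin/activate
+
+# launch StyleGAN2-ADA training (flags are illustrative)
+python train.py --outdir=results --data=bliss_dataset.zip --gpus=1
+```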
 See the [StyleGAN2-ADATraining.md](../docs/StyleGAN2-ADATraining.md) in the [documentation](../docs) folder for details
 on how to set up the environment.

@@ -17,10 +18,15 @@
 The [stylegan3](./stylegan3) directory contains:

-  - `requirements.txt` is used with other module installations to set up the environment for training
-    [the stylegan3 model](https://github.com/NVlabs/stylegan3) with the Bliss single characters.
-  - `job_stylegan3.sh` is the job script submitted in [the Cedar platform](https://docs.alliancecan.ca/wiki/Cedar)
-    to perform the training.
+- `requirements.txt` is used with other module installations to set up the environment for training
+  [the stylegan3 model](https://github.com/NVlabs/stylegan3) with the Bliss single characters.
+- `job_stylegan3.sh` is the job script submitted on [the Cedar platform](https://docs.alliancecan.ca/wiki/Cedar)
+  to perform the training.

 See the [TrainStyleGAN3Model.md](../docs/TrainStyleGAN3Model.md) in the [documentation](../docs) folder for details on
 how to train this model, the training results, and the conclusion about how useful it is.
+
+## CLIP
+
+See the [CLIP-Training.md](../docs/CLIP-Training.md) in the [documentation](../docs) folder for details on
+how to train this model, download the dataset used, and get the model resulting from the training.
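+
+As a rough sketch of how the downloaded checkpoint might be loaded for inference, assuming the OpenAI `clip` package and a `ViT-B/32` base model (the architecture name, extracted file name, and state-dict key are assumptions to verify against the training code):
+
+```python
+import clip
+import torch
+
+device = "cuda" if torch.cuda.is_available() else "cpu"
+
+# load the base CLIP architecture that the fine-tuned weights fit into
+model, preprocess = clip.load("ViT-B/32", device=device, jit=False)
+
+# restore the fine-tuned weights from the extracted checkpoint archive
+checkpoint = torch.load("checkpoint_34_3395.pt", map_location=device)
+model.load_state_dict(checkpoint["model_state_dict"])
+model.eval()
+```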