From 1c88d2c3ba5234aafb0dbcce619eba9e6383052b Mon Sep 17 00:00:00 2001
From: Narges Rezaie <57965133+nargesr@users.noreply.github.com>
Date: Thu, 15 Aug 2024 10:15:45 -0700
Subject: [PATCH 1/2] Update README.md

---
 workflow/snakemake/README.md | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/workflow/snakemake/README.md b/workflow/snakemake/README.md
index a7297b3..2ac5bb0 100644
--- a/workflow/snakemake/README.md
+++ b/workflow/snakemake/README.md
@@ -4,38 +4,38 @@ This directory contains a Snakemake pipeline for running the Topyfic automatical
 
 The snakemake will run training (Train) and building model (topModel, Analysis). 
 
-**Note**: Please make sure to install necessary packages and set up your Snakemake appropriately.
+**Note**: Please make sure to install the necessary packages and set up your Snakemake appropriately.
 
 **Note**: pipeline is tested for Snakemake >= 8.X ([more info](https://snakemake.readthedocs.io/en/stable/index.html))
 
 ## Getting started
 
-### 1. setting up environment
+### 1. Setting up the environment
 
-Build your environment and install necessary packages
+Build your environment and install the necessary packages
 - [Suggested environment](workflow/envs/Topyfic_env.yml)
 
-### 2. Setting up config file
+### 2. Setting up the config file
 
 Modify the [config file](config/config.yaml) or create a new one with the same structure.
 
 1. **names**
-   - Contains name of the input dataset(s). 
-   - Name will be used as a name of train and topModel models
-   - If there is multiple names, Topyfic will normalize the models across names using [harmony](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6884693/).
+   - Contains the name of the input dataset(s). 
+   - Each name is used as the name of the corresponding train and topModel models
+   - If there are multiple names, Topyfic will normalize the models across names using [harmony](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6884693/).
    - list of name: `[parse, 10x]`
 
 2. **count_data**
-   - Contains path of each input data
-   - Name of each path should match name in `names`
+   - Contains the path to each input dataset
+   - Name of each path should match the name in `names`
    - Recommended to use full path rather than relative path
 
 3. **n_topics**
-   - Contains list of number of initial topics you wish to train model base on them
+   - Contains a list of the numbers of initial topics you wish to train models with
    - list of int: `[5, 10, 15, 20, 25, 30, 35, 40, 45, 50]`
 
 4. **organism**
-   - Indicate spices which will be used for downstream analysis
+   - Indicate the species that will be used for downstream analysis
    - Example: human or mouse
 
 5. **workdir**
@@ -45,22 +45,23 @@ Modify the [config file](config/config.yaml) or create a new one with the same s
 
 6. **train**
    - most of the item is an input of `train_model()`
-   - n_runs: number of run to define rLDA model (default: 100)
-   - random_states: list of random state, we used to run LDA models (default: range(n_runs))
+   - n_runs: number of runs to define the rLDA model (default: 100)
+   - random_states: list of random states used to run the LDA models (default: range(n_runs))
 
 7. **top_model**
    - n_top_genes (int): Number of highly-variable genes to keep (default: 50)
    - resolution (int): A parameter value controlling the coarseness of the clustering. Higher values lead to more clusters. (default: 1)
-   - max_iter_harmony (int): Number of iteration for running harmony (default: 10)
+   - max_iter_harmony (int): Number of iterations for running harmony (default: 10)
    - min_cell_participation (float): Minimum cell participation across for each topic to keep them, when is `None`, it will keep topics with cell participation more than 1% of #cells (#cells / 100)
 
 8. **merge**
    - Indicate if you want to also get a model for all data together.
+   - Make sure you have write access.
 
 
 ### 3. Run snakemake
 
-First run it with `-n` to make sure the steps that it plans to run are reasonable. 
+First, run it with `-n` to make sure the steps that it plans to run are reasonable. 
 After it finishes, run the same command without the `-n` option.
 
 `snakemake -n`
@@ -85,10 +86,8 @@ snakemake \
 -p \
 --verbose
 ```
-highmem
-standard
 
-Development hints: If you ran to any error `-p --verbose` would give you more detail about each run and will help you to debug your code.
+Development hint: If you run into any errors, `-p --verbose` will give you more detail about each run and help you debug.
 
 
 ### 4. Further downstream analysis

From 6942cd4208857af3e6270dccd3a2fec625c3b152 Mon Sep 17 00:00:00 2001
From: Narges Rezaie <57965133+nargesr@users.noreply.github.com>
Date: Thu, 15 Aug 2024 10:16:53 -0700
Subject: [PATCH 2/2] Update README.md

---
 workflow/snakemake/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/workflow/snakemake/README.md b/workflow/snakemake/README.md
index 2ac5bb0..3a1a00d 100644
--- a/workflow/snakemake/README.md
+++ b/workflow/snakemake/README.md
@@ -93,6 +93,7 @@ Development hints: If you run into any error `-p --verbose` would give you more
 ### 4. Further downstream analysis
 
 Once you get all the three main objects (Train, TopModel, Analysis), I would recommend using [this notebook](resources/analysing.ipynb) for depth_in downstream analysis.
+**Section 4 is still under construction.**