
Uploading Data for TRE‐MPRA

samhimes92 edited this page Mar 29, 2024 · 2 revisions

Overview

To maintain this code and upload new data, you need access to the scratch/ directory on the bioinformatics computer, which is where we keep all the data.

To push new data, you will need to:

  1. Put the data in the correct directories
  2. Run parse_vis_input.py
  3. Push the data to GitHub

Sorting input data

After you run TMP_empirical_docker.py, a new directory is created with the name of your run. That directory will contain {your run name}__empirical_results.csv and {your run name}__meta_data.csv. Copy these into /Scratch/Sam/projects/github/JGEnglishLab.github.io/tre_mpra/data/emp_data and /Scratch/Sam/projects/github/JGEnglishLab.github.io/tre_mpra/data/meta_data respectively.

A note on the metadata file: if you want the file to be anonymized, you must have a value other than None in the anonymous_name column. If there is a value in the anonymous_name column, everything will be redacted for the visualization. The redacted data will all be kept in anonymous_key.csv. When picking a tag value for data that will be anonymized, use the name of the group for whom you are anonymizing the data. For instance, when you anonymize samples from our group, choose "English" as the tag name for all samples. You can look at anonymous_key.csv to see the anonymous names for each tag.

After you run TMP_comparative_docker.py, you will get a CSV file for each comparison you ran. Copy all of those files to /Scratch/Sam/projects/github/JGEnglishLab.github.io/tre_mpra/data/comp_data.
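
The copy steps above can also be scripted. The following is a minimal sketch and not part of the repository: the function name sort_run_outputs is hypothetical, and because the naming pattern of the comparison CSVs is not described here, the sketch assumes you pass those files in explicitly. The filename suffixes and the emp_data, meta_data, and comp_data subdirectories are the ones named above.

```python
import shutil
from pathlib import Path

# Destination subdirectory for each empirical output, keyed by filename suffix.
SUFFIX_TO_SUBDIR = {
    "__empirical_results.csv": "emp_data",
    "__meta_data.csv": "meta_data",
}


def sort_run_outputs(run_dir, data_root, comparison_files=()):
    """Copy run outputs into data_root's emp_data/, meta_data/ and comp_data/.

    Hypothetical helper: run_dir is the directory created by
    TMP_empirical_docker.py, data_root is the tre_mpra/data directory, and
    comparison_files are the CSVs produced by TMP_comparative_docker.py.
    Returns the list of destination paths that were written.
    """
    run_dir, data_root = Path(run_dir), Path(data_root)
    copied = []
    # Route each CSV in the run directory by its suffix; other CSVs are skipped.
    for csv_path in sorted(run_dir.glob("*.csv")):
        for suffix, subdir in SUFFIX_TO_SUBDIR.items():
            if csv_path.name.endswith(suffix):
                dest = data_root / subdir / csv_path.name
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(csv_path, dest)
                copied.append(dest)
    # Comparison CSVs are given explicitly and all go to comp_data/.
    for comp in comparison_files:
        comp = Path(comp)
        dest = data_root / "comp_data" / comp.name
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(comp, dest)
        copied.append(dest)
    return copied
```

For example, sort_run_outputs("my_run", "/Scratch/Sam/projects/github/JGEnglishLab.github.io/tre_mpra/data", ["my_comparison.csv"]) would copy the two empirical outputs of a run called my_run plus one comparison file (both names are placeholders). Files are copied rather than moved, so the originals stay in place.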

Run parse_vis_input.py

parse_vis_input.py takes all the files from comp_data/, emp_data/, and meta_data/ and produces current_runs.csv, current_runs_alpha_data.csv, and current_runs_meta_data.csv. These new CSVs are the input for the visualizer. Running parse_vis_input.py overwrites the output files already in the directory. If you don't want to lose the old files, make a copy of them before you run parse_vis_input.py.
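
Since the outputs are overwritten on every run, it can help to script the backup copy. This is a minimal sketch under stated assumptions: backup_outputs is a hypothetical helper, output_dir stands for whichever directory parse_vis_input.py writes to, and the default backups/ folder is an assumed location rather than a convention of this repository.

```python
import shutil
from datetime import datetime
from pathlib import Path

# Output files named above; parse_vis_input.py overwrites them on each run.
OUTPUT_FILES = (
    "current_runs.csv",
    "current_runs_alpha_data.csv",
    "current_runs_meta_data.csv",
)


def backup_outputs(output_dir, backup_root=None):
    """Copy any existing output CSVs into a new timestamped backup folder.

    Hypothetical helper: output_dir is wherever parse_vis_input.py writes its
    CSVs; backup_root defaults to output_dir/"backups" (an assumed location).
    Returns the backup folder, or None if there was nothing to back up.
    """
    output_dir = Path(output_dir)
    existing = [output_dir / name for name in OUTPUT_FILES
                if (output_dir / name).is_file()]
    if not existing:
        return None
    backup_root = Path(backup_root) if backup_root else output_dir / "backups"
    # One folder per backup, named by the time it was taken.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_dir = backup_root / stamp
    backup_dir.mkdir(parents=True, exist_ok=True)
    for path in existing:
        shutil.copy2(path, backup_dir / path.name)
    return backup_dir
```

Call backup_outputs(...) just before running parse_vis_input.py. If the backup folder sits inside the cloned repository, it will show up as untracked in git status; pass a backup_root outside the repository to avoid that.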

Commit and push the new data

Checking it locally

Before you actually push anything to GitHub, I recommend that you test the visualizer locally to make sure everything looks the way it should. To do this, install Visual Studio Code. Open the Extensions view and install the "Live Server" extension.

After it is installed, open the directory where you cloned the repository in Visual Studio Code. Then click the "Go Live" button, which should be on the bottom right-hand side of the screen.

That should open a tab in your web browser. Click on mpra_vis.html to open the visualization. (It may be in the JGEnglishLab.github.io/ folder.) This will pull up a local version of the visualization. Verify that everything looks right. If it does, you are ready to push your data.

Going live

To go live, open a terminal in the directory where you cloned the repository. If you type git status, you should see a list of files that have been changed.

You will need to git add the files that you want to commit. Specifically, you should git add the following files:

  1. current_runs.csv
  2. current_runs_alpha_data.csv
  3. current_runs_meta_data.csv

After you've used git add to add those files, run git status one more time to make sure that you added everything you want. Then commit the changes by typing git commit -m "..." (replace the ... with a message describing the changes you are about to make). Then type git push. Once the push has finished, your new data should be live!
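
The git steps above can be collected in one place so that nothing is forgotten. The sketch below is illustrative only: build_git_commands and publish are hypothetical names, and it assumes the three CSVs can be addressed by the bare filenames listed above relative to repo_dir (adjust the paths if they live in a subdirectory). By default it only prints the commands so you can review them before anything is run.

```python
import shlex
import subprocess

# The three visualizer inputs listed above.
FILES_TO_STAGE = [
    "current_runs.csv",
    "current_runs_alpha_data.csv",
    "current_runs_meta_data.csv",
]


def build_git_commands(message, files=FILES_TO_STAGE):
    """Return the git commands for the steps above, in order."""
    if not message.strip():
        raise ValueError("commit message must not be empty")
    return [
        ["git", "add", *files],
        ["git", "status"],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def publish(message, repo_dir=".", dry_run=True):
    """Print the commands, and run them in repo_dir only if dry_run is False."""
    for cmd in build_git_commands(message):
        # shlex.join quotes the commit message the way a shell would need it.
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling publish("Add new run data") prints the four commands without touching the repository; passing dry_run=False runs them. When run this way, git status only prints its report between the add and the commit, so typing the commands by hand remains the safer option if you want to inspect the staged files before committing.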
