update docs
nargesr committed Jan 18, 2024
1 parent 4346e87 commit 6ff0d05
Showing 20 changed files with 2,997 additions and 110 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -41,6 +41,8 @@ git cloning the [Topyfic repository](https://github.com/mortazavilab/Topyfic), g

In general, you need to make three objects (Train, TopModel and Analysis).

![Topyfic workflow](docs/Topyfic_workflow.png)

The Train object can be initialized from (a) a single-cell RNA-seq dataset, (b) a single-cell ATAC-seq dataset, or (c) a bulk RNA-seq dataset.

Training can be time-consuming depending on the size of your data; however, you can train the model for each random state in a separate job and then combine the results. See [this tutorial](tutorials/make_train_object.ipynb) for more information.
@@ -49,6 +51,8 @@ For guidance on using Topyfic to analyze your data look at our more depth-in tut

- [Analysing single cell C2C12 data only using regulatory elements](tutorials/C2C12_TFs_mirhgs_chromreg/C2C12.ipynb): Analysing single cell and single nucleus using C2C12 ENCODE datasets using regulatory elements instead of all genes.
- [Analysing single cell microglia data](tutorials/microglia_all_genes/microglia.ipynb): Analysing single cell microglia data from [Model-AD portal](https://www.model-ad.org/).
- [Analysing ENCODE time course hippocampus data](tutorials/ENCODE_Hipp_parse_10x/analysing.ipynb): Analysing Parse single-nucleus RNA-seq data and the RNA part of 10x multiome hippocampus data from ENCODE.


If you use another method to learn your topics but still want to do the downstream analysis, you can put your results in the format described [here](tutorials/topic_modeling_model.md). Once all your files are ready, you can convert them to the Topyfic format by following the instructions in the same [tutorial](tutorials/topic_modeling_model.md).

6 changes: 4 additions & 2 deletions Topyfic/analysis.py
@@ -172,7 +172,7 @@ def my_autopct(pct):

     def structure_plot(self,
                        level,
-                       category,
+                       category=None,
                        topic_order=None,
                        ascending=None,
                        metaData=None,
@@ -217,6 +217,8 @@ def structure_plot(self,
         :param file_name: name and path of the plot use for save (default: piechart_topicAvgCell)
         :type file_name: str
         """
+        if category is None:
+            category = self.cell_participation.obs[level].unique().tolist()
         if figsize is None:
             figsize = (10 * (len(category) + 1), 10)

@@ -325,7 +327,7 @@ def structure_plot(self,
 axs[0, i].set_title(category[i], fontsize=40)
 axs[0, i].set_ylim(0, 1)
 axs[0, i].set_xlim(0, a[i])
-axs[0, 0].set_ylabel("Topic proportion", fontsize=25)
+axs[0, 0].set_ylabel("Topic proportion", fontsize=30)

tissue = tissue[metaData]
tissue = tissue.reindex(tmp.index.tolist())
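
The hunks above make `category` optional and fall back to every distinct value found in the `level` column. A minimal sketch of that fallback is below, assuming per-cell metadata held in a plain list of dicts rather than the `obs` table used in the diff; `resolve_categories`, `default_figsize`, and `obs_rows` are hypothetical names introduced only for illustration.

```python
def resolve_categories(obs_rows, level, category=None):
    """Return the categories to plot, defaulting to every distinct value of `level`.

    `obs_rows` is a hypothetical stand-in for per-cell metadata (a list of dicts).
    """
    if category is None:
        # dict.fromkeys keeps first-seen order, like pandas' Series.unique()
        category = list(dict.fromkeys(row[level] for row in obs_rows))
    return category


def default_figsize(category):
    # Same rule as in the diff: 10 units of width per category, plus one extra slot.
    return (10 * (len(category) + 1), 10)


obs_rows = [
    {"cell": "c1", "tissue": "heart"},
    {"cell": "c2", "tissue": "liver"},
    {"cell": "c3", "tissue": "heart"},
]
cats = resolve_categories(obs_rows, "tissue")
print(cats, default_figsize(cats))
```

Because the fallback runs before the `figsize` default, the figure width scales with however many categories were discovered, which is why the `None` check has to come first.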
Binary file added docs/Topyfic_workflow.png
Binary file modified docs/doctrees/api.doctree
Binary file not shown.
Binary file modified docs/doctrees/environment.pickle
Binary file not shown.
10 changes: 4 additions & 6 deletions docs/html/_modules/Topyfic/topic.html
@@ -287,16 +287,14 @@ <h1>Source code for Topyfic.topic</h1><div class="highlight"><pre>
<span class="sd"> :type save: bool</span>
<span class="sd"> &quot;&quot;&quot;</span>

<span class="c1"># check require columns</span>
<span class="n">cols</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">gene_information</span><span class="o">.</span><span class="n">reset_index</span><span class="p">()</span><span class="o">.</span><span class="n">columns</span>
<span class="k">if</span> <span class="ow">not</span> <span class="p">{</span><span class="s1">&#39;gene_name&#39;</span><span class="p">,</span> <span class="s1">&#39;gene_id&#39;</span><span class="p">}</span><span class="o">.</span><span class="n">issubset</span><span class="p">(</span><span class="n">cols</span><span class="p">):</span>
<span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Gene information doesn&#39;t contain gene_name and gene_id columns!&quot;</span><span class="p">)</span>

<span class="c1"># Open the file and load the file</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">model_yaml_path</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="n">model_yaml</span> <span class="o">=</span> <span class="n">yaml</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">Loader</span><span class="o">=</span><span class="n">SafeLoader</span><span class="p">)</span>

<span class="k">if</span> <span class="n">topic_id</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">model_yaml</span><span class="p">[</span><span class="s1">&#39;Topic IDs&#39;</span><span class="p">]:</span>
<span class="k">if</span> <span class="n">topic_id</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">topic_id</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">id</span>

<span class="k">if</span> <span class="n">topic_id</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">model_yaml</span><span class="p">[</span><span class="s1">&#39;Topic file_name(s)&#39;</span><span class="p">]:</span>
<span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="s2">&quot;Topic_id is not in model YAML file!&quot;</span><span class="p">)</span>

<span class="n">topic_yaml</span> <span class="o">=</span> <span class="p">{</span><span class="s1">&#39;Topic ID&#39;</span><span class="p">:</span> <span class="n">topic_id</span><span class="p">,</span>
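
The hunk above defaults `topic_id` to the object's own `id` and then checks it against the `'Topic file_name(s)'` entry of the loaded model YAML. A minimal sketch of that lookup follows, assuming the YAML has already been parsed into a dict; `resolve_topic_id` is a hypothetical helper, and it raises `ValueError` where the code in the diff calls `sys.exit`.

```python
def resolve_topic_id(topic_id, default_id, model_yaml):
    """Pick the topic id to write and confirm the model YAML knows about it.

    Hypothetical helper for illustration; `model_yaml` is an already-parsed dict.
    """
    if topic_id is None:
        topic_id = default_id

    known = model_yaml.get("Topic file_name(s)", [])
    if topic_id not in known:
        # Swapped in for sys.exit(): an exception can be caught by the caller.
        raise ValueError(f"{topic_id!r} is not listed in the model YAML (known: {known})")
    return topic_id


model_yaml = {"Topic file_name(s)": ["topic_1", "topic_2"]}
print(resolve_topic_id(None, "topic_2", model_yaml))
```

Raising an exception instead of exiting is a design choice of this sketch: `sys.exit` raises `SystemExit`, which callers rarely expect to handle, while a `ValueError` can be caught and reported by the calling code.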
55 changes: 43 additions & 12 deletions docs/html/_modules/Topyfic/train.html
@@ -116,6 +116,7 @@ <h1>Source code for Topyfic.train</h1><div class="highlight"><pre>
<span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">repeat</span>
<span class="kn">import</span> <span class="nn">pickle</span>
<span class="kn">from</span> <span class="nn">sklearn.decomposition</span> <span class="kn">import</span> <span class="n">LatentDirichletAllocation</span>
<span class="kn">import</span> <span class="nn">h5py</span>

<span class="kn">from</span> <span class="nn">Topyfic.topModel</span> <span class="kn">import</span> <span class="n">TopModel</span>

@@ -301,23 +302,53 @@ <h1>Source code for Topyfic.train</h1><div class="highlight"><pre>

<span class="k">return</span> <span class="n">all_components</span><span class="p">,</span> <span class="n">all_exp_dirichlet_component</span><span class="p">,</span> <span class="n">all_others</span></div>

<div class="viewcode-block" id="Train.save_train"><a class="viewcode-back" href="../../api.html#Topyfic.train.Train.save_train">[docs]</a> <span class="k">def</span> <span class="nf">save_train</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">save_path</span><span class="o">=</span><span class="s2">&quot;&quot;</span><span class="p">):</span>
<div class="viewcode-block" id="Train.save_train"><a class="viewcode-back" href="../../api.html#Topyfic.train.Train.save_train">[docs]</a> <span class="k">def</span> <span class="nf">save_train</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">save_path</span><span class="o">=</span><span class="s2">&quot;&quot;</span><span class="p">,</span> <span class="n">file_format</span><span class="o">=</span><span class="s1">&#39;pickle&#39;</span><span class="p">):</span>
<span class="sd">&quot;&quot;&quot;</span>
<span class="sd"> save Train class as a pickle file</span>

<span class="sd"> :param name: name of the pickle file (default is train_Train.name)</span>
<span class="sd"> :type name: str</span>
<span class="sd"> :param save_path: directory you want to use to save pickle file (default is saving near script)</span>
<span class="sd"> :type save_path: str</span>
<span class="sd"> save Train class as a pickle file</span>

<span class="sd"> :param name: name of the pickle file (default is train_Train.name)</span>
<span class="sd"> :type name: str</span>
<span class="sd"> :param save_path: directory you want to use to save pickle file (default is saving near script)</span>
<span class="sd"> :type save_path: str</span>
<span class="sd"> :param file_format: format of the file you want to save (option: pickle (default), HDF5)</span>
<span class="sd"> :type file_format: str</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="k">if</span> <span class="n">file_format</span> <span class="ow">not</span> <span class="ow">in</span> <span class="p">[</span><span class="s1">&#39;pickle&#39;</span><span class="p">,</span> <span class="s1">&#39;HDF5&#39;</span><span class="p">]:</span>
<span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">file_format</span><span class="si">}</span><span class="s2"> is not correct! It should be &#39;pickle&#39; or &#39;HDF5&#39;.&quot;</span><span class="p">)</span>
<span class="k">if</span> <span class="n">name</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">name</span> <span class="o">=</span> <span class="sa">f</span><span class="s2">&quot;train_</span><span class="si">{</span><span class="bp">self</span><span class="o">.</span><span class="n">name</span><span class="si">}</span><span class="s2">&quot;</span>

<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Saving train class as </span><span class="si">{</span><span class="n">name</span><span class="si">}</span><span class="s2">.p&quot;</span><span class="p">)</span>

<span class="n">picklefile</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">save_path</span><span class="si">}{</span><span class="n">name</span><span class="si">}</span><span class="s2">.p&quot;</span><span class="p">,</span> <span class="s2">&quot;wb&quot;</span><span class="p">)</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">picklefile</span><span class="p">)</span>
<span class="n">picklefile</span><span class="o">.</span><span class="n">close</span><span class="p">()</span></div></div>
<span class="k">if</span> <span class="n">file_format</span> <span class="o">==</span> <span class="s2">&quot;pickle&quot;</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Saving train as </span><span class="si">{</span><span class="n">name</span><span class="si">}</span><span class="s2">.p&quot;</span><span class="p">)</span>

<span class="n">picklefile</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">save_path</span><span class="si">}{</span><span class="n">name</span><span class="si">}</span><span class="s2">.p&quot;</span><span class="p">,</span> <span class="s2">&quot;wb&quot;</span><span class="p">)</span>
<span class="n">pickle</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">picklefile</span><span class="p">)</span>
<span class="n">picklefile</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>

<span class="k">if</span> <span class="n">file_format</span> <span class="o">==</span> <span class="s2">&quot;HDF5&quot;</span><span class="p">:</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Saving train as </span><span class="si">{</span><span class="n">name</span><span class="si">}</span><span class="s2">.h5&quot;</span><span class="p">)</span>

<span class="n">f</span> <span class="o">=</span> <span class="n">h5py</span><span class="o">.</span><span class="n">File</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">name</span><span class="si">}</span><span class="s2">.h5&quot;</span><span class="p">,</span> <span class="s2">&quot;w&quot;</span><span class="p">)</span>

<span class="c1"># models</span>
<span class="n">models</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">create_group</span><span class="p">(</span><span class="s2">&quot;models&quot;</span><span class="p">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">)):</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">create_group</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;components_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">components_</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;exp_dirichlet_component_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">exp_dirichlet_component_</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;n_batch_iter_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">int_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">n_batch_iter_</span><span class="p">)</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;n_features_in_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">n_features_in_</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;n_iter_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">int_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">n_iter_</span><span class="p">)</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;bound_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">float_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">bound_</span><span class="p">)</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;doc_topic_prior_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">float_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">doc_topic_prior_</span><span class="p">)</span>
<span class="n">model</span><span class="p">[</span><span class="s1">&#39;topic_word_prior_&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">float_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">top_models</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">topic_word_prior_</span><span class="p">)</span>

<span class="n">f</span><span class="p">[</span><span class="s1">&#39;name&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">string_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">name</span><span class="p">)</span>
<span class="n">f</span><span class="p">[</span><span class="s1">&#39;k&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">int_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">k</span><span class="p">)</span>
<span class="n">f</span><span class="p">[</span><span class="s1">&#39;n_runs&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">int_</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">n_runs</span><span class="p">)</span>
<span class="n">f</span><span class="p">[</span><span class="s1">&#39;random_state_range&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">random_state_range</span><span class="p">))</span>

<span class="n">f</span><span class="o">.</span><span class="n">close</span><span class="p">()</span></div></div>
</pre></div>

</div>
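
The new `save_train` signature adds a `file_format` argument that selects between a pickle file and an HDF5 file. In the diff, the pickle branch prefixes the filename with `save_path`, while the HDF5 branch builds it from `name` alone. The sketch below shows one way such a dispatch could be written so that both branches honour the directory; that is an assumption about intent, not a description of Topyfic's behaviour. `save_object` is a hypothetical helper, and the HDF5 branch assumes the object is a dict of scalars, strings, or array-likes.

```python
import os
import pickle
import tempfile


def save_object(obj, name, save_path="", file_format="pickle"):
    """Save `obj` under `save_path` as `<name>.p` (pickle) or `<name>.h5` (HDF5).

    Hypothetical helper for illustration; not part of Topyfic.
    Returns the path that was written.
    """
    if file_format not in ("pickle", "HDF5"):
        raise ValueError(f"{file_format!r} is not supported; use 'pickle' or 'HDF5'.")

    if file_format == "pickle":
        path = os.path.join(save_path, f"{name}.p")
        with open(path, "wb") as handle:
            pickle.dump(obj, handle)
        return path

    # HDF5 branch: h5py is imported lazily so the pickle path works without it.
    import h5py

    path = os.path.join(save_path, f"{name}.h5")
    with h5py.File(path, "w") as f:
        # Assumption: `obj` is a dict of scalars, strings, or array-likes.
        for key, value in obj.items():
            f[key] = value
    return path


with tempfile.TemporaryDirectory() as tmp:
    written = save_object({"k": 5, "n_runs": 100}, "train_demo", save_path=tmp)
    with open(written, "rb") as handle:
        restored = pickle.load(handle)
    print(os.path.basename(written), restored)
```

Using `os.path.join` means `save_path` does not need a trailing separator, and the `with` blocks close the file even if writing fails partway through.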