Update manipulating_vcf.md
Netzach authored Aug 22, 2024
1 parent a416881 commit 466e2e7
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions docs/Case_studies/manipulating_vcf.md
@@ -71,14 +71,15 @@ A table with counted and sorted different genomic features in chromosome 4.
Start by splitting the task into sub-tasks. This makes it easier to see what happens and you might get interesting intermediary results.

??? "_Hint_, Example"
Let's say we want to find out all genes that contain a variant and all variants that are located within a gene. What do we want to do first? Take a look at the vcf file. That is the one that contains all the variants. Then look at the gff file, which contains the genes and other annotations. Finally, take a look at the DNA sequence. You will need to combine all three to answer the question.

```Let's say we want to find out all genes that contain a variant and all variants that are located within a gene. What do we want to do first? Take a look at the vcf file. That is the one that contains all the variants. Then look at the gff file, which contains the genes and other annotations. Finally, take a look at the DNA sequence. You will need to combine all three to answer the question.
```
```
??? "_Hint_, Example, what do we need to get?"
* Positions for SNPs and INDELs
* Positions for genes and CDSs
* Separation of variants (SNPs and INDELs) into two groups, inside and outside genes (and CDSs)
* Separation of genes/CDSs into those with and without variants (and maybe how many there are per gene)

```
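
The sub-tasks listed in the hint can be sketched in a few lines of Python. This is only an illustration, assuming tab-separated VCF and GFF3 input. The function names and the sample records are hypothetical and not part of the course material. A record counts as a SNP only when REF and every ALT allele are a single base, and only the POS column is tested against the gene interval, so a long deletion that starts just before a gene is not detected.

```python
from collections import Counter


def read_variants(vcf_lines):
    """Yield (chrom, pos, kind) per VCF record; kind is 'SNP' or 'INDEL'."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Simplification: anything that is not single-base REF and ALT is an INDEL.
        is_snp = len(ref) == 1 and all(len(a) == 1 for a in alt.split(","))
        yield chrom, int(pos), "SNP" if is_snp else "INDEL"


def read_features(gff_lines, feature_type="gene"):
    """Yield (chrom, start, end, attributes) for GFF3 rows of one feature type."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == feature_type:
            yield cols[0], int(cols[3]), int(cols[4]), cols[8]


def split_by_overlap(variants, features):
    """Split variants into inside/outside features and count hits per feature."""
    features = list(features)
    inside, outside, hits = [], [], Counter()
    for chrom, pos, kind in variants:
        # GFF3 coordinates are 1-based and inclusive, like VCF POS.
        overlapping = [f for f in features if f[0] == chrom and f[1] <= pos <= f[2]]
        (inside if overlapping else outside).append((chrom, pos, kind))
        for f in overlapping:
            hits[f[3]] += 1
    return inside, outside, hits


# Hypothetical sample records, made up for this sketch.
vcf_demo = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "4\t150\t.\tA\tG\t.\t.\t.",
    "4\t500\t.\tAT\tA\t.\t.\t.",
    "4\t250\t.\tC\tCTT\t.\t.\t.",
]
gff_demo = [
    "##gff-version 3",
    "4\t.\tgene\t100\t200\t.\t+\t.\tID=geneA",
    "4\t.\tgene\t220\t300\t.\t-\t.\tID=geneB",
    "4\t.\tgene\t600\t700\t.\t+\t.\tID=geneC",
]

genes = list(read_features(gff_demo))
inside, outside, hits = split_by_overlap(read_variants(vcf_demo), genes)
genes_without_variants = [g[3] for g in genes if g[3] not in hits]
```

Passing `feature_type="CDS"` to `read_features` repeats the same split for coding sequences. The nested scan over all features is fine for a single chromosome in an exercise, but dedicated interval tools scale better on whole genomes.
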
### The exercise
Identify the steps you need to do and what each step does. Open the hints if you get stuck.