Commit

updates: lit ref & sigs & touch-ups
 - Update the reference in README.md to the peer-reviewed publication
   in Nature Microbiology
 - Signatures in sync with what V-pipe is using in production for
   Wastewater Surveillance in Switzerland
 - Update commands in README.md
DrYak committed Mar 29, 2023
1 parent 327a5e7 commit 436b520
Showing 13 changed files with 341 additions and 226 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -243,9 +243,9 @@ Select a directory containing a collection of virus definitions YAMLs using the
# fetch the repository of standardised variant definitions
git clone https://github.com/phe-genomics/variant_definitions.git
# generate a YAML for omicron subvariant BA.2 using the corresponding standardised variant definitions
-phe2cojac --shortname 'om2' --yaml voc/omicron_ba2_mutations.yaml variant_definitions/variant_yaml/imagines-viewable.yml
+cojac phe2cojac --shortname 'om2' --yaml voc/omicron_ba2_mutations.yaml variant_definitions/variant_yaml/imagines-viewable.yml
# now have a look at the frequencies of mutations using CoV-Spectrum
-cooc-curate voc/omicron_ba2_mutations.yaml
+cojac cooc-curate voc/omicron_ba2_mutations.yaml
# adjust the content of the YAML files to your needs
```

@@ -422,7 +422,7 @@ You can install _cojac_ in its own environment and activate it:
conda create -n cojac cojac
conda activate cojac
# test it
-cojac cooc-mutbamscan --help
+cojac --help
```

And to update it to the latest version, run:
@@ -461,7 +461,7 @@ cojac should now be accessible from your PATH
```bash
# activate the environment if not already active:
conda activate cojac
-cojac cooc-mutbamscan --help
+cojac --help
```

### Remove conda environment
@@ -525,11 +525,11 @@ Long term goal:

If you use this software in your research, please cite:

-- Katharina Jahn, David Dreifuss, Ivan Topolsky, Anina Kull, Pravin Ganesanandamoorthy, Xavier Fernandez-Cassi, Carola Bänziger, Elyse Stachler, Lara Fuhrmann, Kim Philipp Jablonski, Chaoran Chen, Catharine Aquino, Tanja Stadler, Christoph Ort, Tamar Kohn, Timothy R. Julian, Niko Beerenwinkel
+- Katharina Jahn, David Dreifuss, Ivan Topolsky, Anina Kull, Pravin Ganesanandamoorthy, Xavier Fernandez-Cassi, Carola Bänziger, Alexander J. Devaux, Elyse Stachler, Lea Caduff, Federica Cariti, Alex Tuñas Corzón, Lara Fuhrmann, Chaoran Chen, Kim Philipp Jablonski, Sarah Nadeau, Mirjam Feldkamp, Christian Beisel, Catharine Aquino, Tanja Stadler, Christoph Ort, Tamar Kohn, Timothy R. Julian & Niko Beerenwinkel

-"*Detection of SARS-CoV-2 variants in Switzerland by genomic analysis of wastewater samples*."
+"*Early detection and surveillance of SARS-CoV-2 genomic variants in wastewater using COJAC*."

-medRxiv 2021.01.08.21249379; [doi:10.1101/2021.01.08.21249379](https://doi.org/10.1101/2021.01.08.21249379)
+Nature Microbiology volume 7, pages 1151–1160 (2022); [doi:10.1038/s41564-022-01185-x](https://doi.org/10.1038/s41564-022-01185-x)

## Contacts

6 changes: 3 additions & 3 deletions tests/test_integration.py
@@ -60,9 +60,9 @@ def test_workflow():
"cooc-curate",
"-a",
"amplicons.v3.yaml",
-"voc/omicron_ba2_mutations.yaml",
-"voc/omicron_ba1_mutations.yaml",
-"voc/delta_mutations.yaml",
+"voc/omicron_ba2_mutations_full.yaml",
+"voc/omicron_ba1_mutations_full.yaml",
+"voc/delta_mutations_full.yaml",
]
)
subprocess.check_call(
51 changes: 25 additions & 26 deletions voc/alpha_mutations.yaml → voc/alpha_mutations_full.yaml
@@ -8,45 +8,44 @@ source:
- https://virological.org/t/preliminary-genomic-characterisation-of-an-emergent-sars-cov-2-lineage-in-the-uk-defined-by-a-novel-set-of-spike-mutations/563
- https://www.gov.uk/government/publications/investigation-of-novel-sars-cov-2-variant-variant-of-concern-20201201#attachment_4970549
threshold: 5

mut:
# ORF1ab
241: 'C>T'
913: 'C>T'
3037: 'C>T'
3267: 'C>T' # T1001I
5388: 'C>A' # A1708D
5986: 'C>T'
6954: 'T>C' # I2230T
# S
11288: '-----' # see below:
#actually, due to repeated 'T's flanking the region, can be any of these:
#11284: '---------' # SGF3675del
#11288: '---------' # SGF3675del
14408: 'C>T'
14676: 'C>T'
15279: 'C>T'
16176: 'T>C'
21765: '------' # T1001I
21993: '-' # see below:
#actually, due to repeated 'T's flanking the region, can be any of these:
#21991: '---' # 144 -Y
#21993: '---' # 144 -Y
23063: 'A>T' # N501Y !RBD
23271: 'C>A' # A570D
23403: 'A>G'
23604: 'C>A' # P681H
23709: 'C>T' # T716I
24506: 'T>G' # S982A
24914: 'G>C' # D1118H
# ORF8
27972: 'C>T' # Q27stop
28048: 'G>T' # R52I
28111: 'A>G' # Y73C
# N
28271: '-'
28280: 'GAT>CTA' # D3L
28881: 'GGG>AAC'
28977: 'C>T' # S235F
extra:
# ORF1ab
11288: '---------' # 3675-3677 -SGF
913: 'C>T' # syn
5986: 'C>T' # syn
14676: 'C>T' # syn
15279: 'C>T' # syn
16176: 'T>C' # syn
# S
21765: '------' # 69-70 -HV
21991: '---' # 144 -Y
# M
26801: 'T>C' # syn


## Used by nextclade, but shared with...
## ... 20A / B.1
# 8782: 'C'
# 14408: 'T'
# 23403: 'G'
## ... 20B / B.1.1
# 28881: 'A'
# 28882: 'A
#revert:
## used by Nextclade's signatures.
#8782: 'C'
#26801: 'T>C' # syn
54 changes: 0 additions & 54 deletions voc/beta_mutations.yaml

This file was deleted.

50 changes: 50 additions & 0 deletions voc/beta_mutations_full.yaml
@@ -0,0 +1,50 @@
variant:
short: 'be'
who: 'beta'
pangolin: 'B.1.351'
nextstrain: '20H/501Y.V2'
voc: 'VOC-202012/02'
source:
- doi:10.1101/2020.12.21.20248640
- https://www.gov.uk/government/publications/investigation-of-novel-sars-cov-2-variant-variant-of-concern-20201201#attachment_4970549
threshold: 4
# or 5 out of 9 (*)
mut:
174: 'G>T'
241: 'C>T'
1059: 'C>T' # ORF1ab:T265I
3037: 'C>T'
5230: 'G>T' # ORF1ab:K1655N (*)
10323: 'A>G' # ORF1ab:K3353R
11288: '-----' # see below:
#actually, due to repeated 'T's flanking the region, can be any of these:
#11284: '---------' # ORF1ab:SGF3675del
#11288: '---------' # ORF1ab:SGF3675del
14408: 'C>T' # ORF1ab:P4715L
21801: 'A>C' # S:D80A (*)
22206: 'A>G' # D215G 'not fixed', but determining(*) in source 2

22287: '---' # see below:
#actually, due to repeated 'CTTTAC' motive, can be any of these:
#22281: '---------' <- left-most possibility
#22283: '---------' <- covspectrum
#22286: '---------' <- original pub 'not fixed & un-resolved'
#22287: '---------' <- right-most
# phegenomics seems wrong due to this causing a mutation 22280: A>C
#22280: '---------' <-

22813: 'G>T' # S:K417N !RBD (*), but source 2 calls it non-determining
23012: 'G>A' # S:E484K !RBD
23063: 'A>T' # S:N501Y !RBD
23403: 'A>G' # S:D614G
23664: 'C>T' # S:A701V
25563: 'G>T' # ORF3a:Q57H
25904: 'C>T' # ORF3a:S171L
26456: 'C>T' # E:P71L (*)
28253: 'C>T'
28887: 'C>T' # N:T205I (*)

#revert:
## Used by nextclade
# 8782: 'C'

28 changes: 23 additions & 5 deletions voc/delta_mutations.yaml → voc/delta_mutations_full.yaml
@@ -9,20 +9,38 @@ source:
- https://github.com/cov-lineages/pango-designation/issues/49
threshold: 4
mut:
# S
210: 'G>T'
241: 'C>T'
3037: 'C>T'
4181: 'G>T'
6402: 'C>T'
7124: 'C>T'
8986: 'C>T'
9053: 'G>T'
10029: 'C>T'
11201: 'A>G'
11332: 'A>G'
14408: 'C>T'
15451: 'G>A'
16466: 'C>T'
19220: 'C>T'
21618: 'C>G' # surface glycoprotein:T19R
21987: 'G>A'
22029: '------'
22917: 'T>G' # surface glycoprotein:L452R
22995: 'C>A' # surface glycoprotein:T478K
23403: 'A>G'
23604: 'C>G' # surface glycoprotein:P681R
24410: 'G>A' # surface glycoprotein:D950N
# ORF3a
25469: 'C>T' # ORF3a protein:S26L
# M
26767: 'T>C' # membrane glycoprotein:I82T
# ORF7a
27638: 'T>C' # ORF7a protein:V82A
27752: 'C>T' # ORF7a protein:T120I
# N
27874: 'C>T'
28248: '------'
28271: '-'
28461: 'A>G' # nucleocapsid phosphoprotein:D63G
28881: 'G>T' # nucleocapsid phosphoprotein:R203M
28916: 'G>T'
29402: 'G>T' # nucleocapsid phosphoprotein:D377Y
29742: 'G>T'
54 changes: 0 additions & 54 deletions voc/gamma_mutations.yaml

This file was deleted.

47 changes: 47 additions & 0 deletions voc/gamma_mutations_full.yaml
@@ -0,0 +1,47 @@
variant:
short: 'ga'
who: 'gamma'
pangolin: 'P.1'
nextstrain: '20J/501Y.V3'
voc: 'VOC-202101/02'
source:
- https://virological.org/t/genomic-characterisation-of-an-emergent-sars-cov-2-lineage-in-manaus-preliminary-findings/586
- doi:10.1101/2021.02.26.21252554
- https://www.gov.uk/government/publications/investigation-of-novel-sars-cov-2-variant-variant-of-concern-20201201#attachment_4970549
threshold: 5
mut:
241: 'C>T'
733: 'T>C' # syn
2749: 'C>T' # syn
3037: 'C>T'
3828: 'C>T' # ORF1ab:S1188L
5648: 'A>C' # ORF1ab:K1795Q
6319: 'A>G'
6613: 'A>G'
11288: '-----' # see below:
#actually, due to repeated 'T's flanking the region, can be any of these:
#11284: '---------' # SGF3675del
#11288: '---------' # SGF3675del
12778: 'C>T' # syn
13860: 'C>T' # syn
14408: 'C>T'
17259: 'G>T' # ORF1ab:E5665D
21614: 'C>T' # S:L18F
21621: 'C>A' # S:T20N
21638: 'C>T' # S:P26S
21974: 'G>T' # S:D138Y
22132: 'G>T' # S:R190S
22812: 'A>C' # S:K417T !RBD
23012: 'G>A' # S:E484K !RBD
23063: 'A>T' # S:N501Y !RBD
23403: 'A>G'
23525: 'C>T' # S:H655Y
24642: 'C>T' # S:T1027I
25088: 'G>T' # S/1176F
26149: 'T>C' # ORF3A/253P
28167: 'G>A' # ORF8:E92K
28263: '+AACA'
28512: 'C>G' # N:P80R
28877: 'AG>TC' # N:syn
28881: 'GGG>AAC'
29834: 'T>A'
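
As a reading aid for the `mut:` blocks in these signature files, the three notations that appear in them can be told apart mechanically. The following is a minimal sketch, not code from cojac: the helper name `classify_mutation` is hypothetical, and the interpretation of each notation (`REF>ALT` as a substitution, a run of `-` as a deletion of that many bases, `+SEQ` as an insertion) is an assumption inferred from the entries and their comments.

```python
import re

# Hypothetical helper for illustration only; it is not part of cojac.
# Assumed notations, inferred from the signature files in this commit:
#   'C>T', 'GGG>AAC'  -> substitution of one or more bases
#   '---'             -> deletion, one dash per deleted base
#   '+AACA'           -> insertion of the given bases
SUBSTITUTION = re.compile(r"^([ACGT]+)>([ACGT]+)$")
DELETION = re.compile(r"^-+$")
INSERTION = re.compile(r"^\+([ACGT]+)$")


def classify_mutation(position, spec):
    """Return (kind, position, length) for one `position: 'spec'` entry."""
    match = SUBSTITUTION.match(spec)
    if match:
        ref, alt = match.groups()
        if len(ref) != len(alt):
            raise ValueError(f"{position}: reference and variant lengths differ in {spec!r}")
        return ("substitution", position, len(ref))
    if DELETION.match(spec):
        return ("deletion", position, len(spec))
    match = INSERTION.match(spec)
    if match:
        return ("insertion", position, len(match.group(1)))
    raise ValueError(f"{position}: unrecognised mutation notation {spec!r}")


if __name__ == "__main__":
    # Entries copied from the alpha and gamma signatures shown in this commit.
    examples = {23063: "A>T", 28881: "GGG>AAC", 21765: "------", 28263: "+AACA"}
    for pos, spec in examples.items():
        print(classify_mutation(pos, spec))
```

Raising on unknown notation rather than guessing keeps a hand-edited signature from being silently misread.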