Releases: EpistasisLab/AlzKB
Version 2.0.0
What's Changed (Aug 31, 2024)
Data Updates in AlzKB:
- DrugBank: Updated to version 5.1.12 (2024-03-14)
- NCBI Gene: Updated to V2024-05-13
- Gene Ontology: Updated to V2024-04-24*
- MESH: Updated to V2023-12*
- Uberon: Updated to V2024-03-22*
- DrugCentral: Updated to V2023-11-01*
- BindingDB: Updated to V2024-05*
- MEDLINE: Updated to V2024-05-02*
*Updates based on Hetionet. Please see the alzkb-updates GitHub repository for more details.
Enhancements:
- Added TranscriptionFactor nodes and TRANSCRIPTIONFACTORINTERACTSWITHGENE relationships.
- Added chromosome number as a property to gene nodes.
- Added sourcedatabase as a property to nodes.
- Added correlation, score, p_fisher, z_score, affinity_nm, confidence, sourcedatabase, and unbiased (from Hetionet, DisGeNET, and DoRothEA) as properties to relationships.
The instructions for adding new data resources and importing data into the Memgraph graph database are available in the AlzKB GitHub repository.
Data Quality Improvements:
- Removed the mapping between Creutzfeldt-Jakob disease (CJD) and Familial Alzheimer Disease (FAD). CJD and FAD are different diseases but were merged into the same node in AlzKB because the DisGeNET “disease_mappings.tsv” file maps CJD to FAD.
- Filtered genes to keep human genes only (tax-id = 9606).
- Implemented case-insensitive matching when extracting Alzheimer’s data from DisGeNET to include disease names that are in all caps.
- Consolidated pathways with the same names but different values of pathwayid and sourcedatabase.
- Removed duplicated pathways from AOP-DB that have “Homo sapiens (human)” in their names.
- Removed 21,724 Drug nodes from AOP-DB that had only xrefmesh values and NULL as commonName and were not connected to any other nodes.
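Two of the steps above, the human-only gene filter and the case-insensitive disease match, can be illustrated with a minimal Python sketch. This is not the AlzKB pipeline code: the record layout, the field name `tax_id`, the helper names, and the sample values are all assumptions made for illustration.

```python
HUMAN_TAX_ID = 9606  # tax-id for human genes, as stated in the notes above


def keep_human_genes(genes):
    """Keep only records whose (hypothetical) 'tax_id' field equals 9606."""
    return [g for g in genes if int(g["tax_id"]) == HUMAN_TAX_ID]


def is_alzheimer_disease(name):
    """Case-insensitive substring match, so all-caps names are also caught."""
    return "alzheimer" in name.casefold()


# Toy records; real inputs would come from NCBI Gene and DisGeNET files.
genes = [
    {"symbol": "APOE", "tax_id": "9606"},
    {"symbol": "Apoe", "tax_id": "10090"},  # non-human entry, dropped
]
diseases = ["Alzheimer's Disease", "ALZHEIMER DISEASE, FAMILIAL", "Parkinson Disease"]

human_genes = keep_human_genes(genes)
alz_names = [d for d in diseases if is_alzheimer_disease(d)]
```

A case-sensitive test such as `"Alzheimer" in name` would miss the all-caps entry, which is the situation the release note describes.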
Summary of the changes in nodes and relationships
Nodes:
Label | NodeCount | NodeCount previous version | NumChanges |
---|---|---|---|
BiologicalProcess | 12322 | 11381 | 941 |
BodyPart | 652 | 402 | 250 |
CellularComponent | 1695 | 1391 | 304 |
Disease | 34 | 20 | 14 |
Drug | 16581 | 36959 | -20378 |
DrugClass | 474 | 345 | 129 |
Gene | 193279 | 193313 | -34 |
MolecularFunction | 3460 | 2884 | 576 |
Pathway | 4516 | 4570 | -54 |
Symptom | 505 | 438 | 67 |
TranscriptionFactor | 519 | | 519 |
Total | 234037 | 251703 | -17666 |
Relationships:
Type | RelCount | RelCount previous version | NumChanges |
---|---|---|---|
BODYPARTOVEREXPRESSESGENE | 97772 | 97772 | 0 |
BODYPARTUNDEREXPRESSESGENE | 102185 | 102185 | 0 |
CHEMICALBINDSGENE | 25726 | 11531 | 14195 |
CHEMICALDECREASESEXPRESSION | 21051 | 21051 | 0 |
CHEMICALINCREASESEXPRESSION | 18713 | 18713 | 0 |
DISEASELOCALIZESTOANATOMY | 33 | 29 | 4 |
DRUGCAUSESEFFECT | 2 | 2 | 0 |
DRUGINCLASS | 1945 | 1029 | 916 |
DRUGTREATSDISEASE | 9 | 9 | 0 |
GENEASSOCIATEDWITHCELLULARCOMPONENT | 88880 | 73553 | 15327 |
GENEASSOCIATESWITHDISEASE | 508 | 502 | 6 |
GENECOVARIESWITHGENE | 61606 | 61606 | 0 |
GENEHASMOLECULARFUNCTION | 104752 | 97191 | 7561 |
GENEINPATHWAY | 178991 | 179433 | -442 |
GENEINTERACTSWITHGENE | 147088 | 147001 | 87 |
GENEPARTICIPATESINBIOLOGICALPROCESS | 548285 | 559385 | -11100 |
GENEREGULATESGENE | 263978 | 265667 | -1689 |
SYMPTOMMANIFESTATIONOFDISEASE | 53 | 79 | -26 |
TRANSCRIPTIONFACTORINTERACTSWITHGENE | 6910 | | 6910 |
TOTAL | 1668487 | 1636738 | 31749 |
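The NumChanges column in both tables appears to be the current count minus the previous count. The short Python sketch below recomputes it for a few rows copied from the tables; it assumes that a type with no previous count (such as TranscriptionFactor) was absent from the previous version and so counts as zero. The helper name `num_changes` is illustrative only.

```python
# (current count, previous count) copied from the tables above.
# None marks a type assumed to be absent from the previous version.
counts = {
    "Drug": (16581, 36959),
    "Gene": (193279, 193313),
    "TranscriptionFactor": (519, None),
    "CHEMICALBINDSGENE": (25726, 11531),
}


def num_changes(current, previous):
    """Difference between versions, treating a missing previous count as 0."""
    return current - (previous or 0)


changes = {label: num_changes(cur, prev) for label, (cur, prev) in counts.items()}
```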
The full database dump can be downloaded from the following link: https://cedars.box.com/v/alzkb-v2-0-0
Instructions for installing from the CYPHERL file can be found here.
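As a rough illustration of what loading such a dump involves, the Python sketch below reads a CYPHERL source line by line and hands each statement to a caller-supplied function. It assumes the file holds one Cypher statement per line. The names `iter_cypherl_statements`, `load_cypherl`, and `execute` are hypothetical, and the sample statements are toy data rather than content from the real dump. The linked instructions remain the authoritative procedure.

```python
import io


def iter_cypherl_statements(lines):
    """Yield non-empty statements, assuming one Cypher statement per line."""
    for raw in lines:
        statement = raw.strip()
        if statement:
            yield statement


def load_cypherl(lines, execute):
    """Pass each statement to `execute` (a hypothetical callable) and count them."""
    count = 0
    for statement in iter_cypherl_statements(lines):
        execute(statement)
        count += 1
    return count


# Toy stand-in for a dump file; a real file would be opened with open(path).
sample = io.StringIO(
    'CREATE (:Drug {commonName: "ExampleDrug"});\n'
    "\n"
    'CREATE (:Disease {commonName: "ExampleDisease"});\n'
)
executed = []
loaded = load_cypherl(sample, executed.append)
```

In practice `execute` would wrap whatever database client is in use; here it only appends to a list so the sketch runs without a server.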
Version 1.2.1
What's Changed (May 2024)
- Migrated AlzKB from Neo4j to Memgraph 2.17.0.
The full database dump can be downloaded from the following link: https://cedars.box.com/v/alzkb-v1-2-1
Instructions for installing from the CYPHERL file can be found here.
Version 1.2.0
What's Changed (Jan 24, 2024)
- Modified the graph schema by adding two new relationship types between genes (imported from Hetionet): geneCovariesWithGene and geneRegulatesGene.
- Updated AlzKB with recent NCBI Gene data (2023-11-09).
Summary of the changes in nodes and relationships
Node (208,436 --> 251,704):
- Gene: 150,045 --> 193,313
Relationship (1,309,466 --> 1,636,739):
- geneCovariesWithGene: 61,606
- geneRegulatesGene: 265,667
The full database dump can be downloaded from the following link:
https://upenn.box.com/s/jmnlcc3zk9yzclky0uv8dd9q7nkxmcgm
or
https://cedars.box.com/v/alzkb-v1-2-0
Version 1.1.0
What's Changed (Aug 2, 2023)
- Updated AlzKB with the most recent DrugBank data (V5.1.10, released on 2023-01-04) and NCBI Gene data (2023-05-03).
- Upgraded AlzKB to Neo4j version 5.
Summary of the changes in nodes and relationships
Node (118,902 --> 208,436):
- Gene: 62407 --> 150045 (geneAssociatesWithDisease is unchanged)
- Drugs from DrugBank: 13339 --> 15235 (Drug: 35063 --> 36959)
Relationship (1,309,527 --> 1,309,466):
- bodyPartOverexpressesGene: 97782 --> 97772
- bodyPartUnderexpressesGene: 102194 --> 102185
- geneInPathway: 179464 --> 179433
- geneInteractsWithGene: 147008 --> 147001
- geneParticipatesInBiologicalProcess: 559389 --> 559385
The full database dump can be downloaded from the following link:
https://cedars.box.com/v/epistasislab-alzkb-v1-1-0
AlzKB first DOI release
This is a duplicate of the first release of AlzKB at https://github.com/EpistasisLab/AlzKB/releases/tag/v1.0.0.
Version 1.0.0
This is our initial release of AlzKB!
The full database dump can be downloaded from the following link:
https://upenn.box.com/s/dalcofa8i7rkkc2h2n6bfg8nvmwi83pq
or
https://cedars.box.com/v/epistasislab-alzkb-v1-0-0
Advanced users have the option of building the database from scratch, although we are still working on finishing the documentation for this procedure.