Maintenance guide
- Use data exported as described here to run ActiveDriver and save the results$fdr data frame as a tab-separated file.
- Replace the relevant file referenced in active_driver_gene_lists() in imports/protein_data.py, or change the paths to point to the updated one (a sketch follows this list).
- Remove the old gene lists:
./manage.py remove protein_related -m GeneListEntry GeneList
- Load the new lists:
./manage.py load protein_related -i active_driver_gene_lists
- Commit the changes.
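For orientation only, the entry to update might look roughly like the sketch below; the function structure, list name and file path are illustrative assumptions, not the actual contents of imports/protein_data.py:
# imports/protein_data.py (hypothetical sketch - check the real function for the actual structure)
def active_driver_gene_lists():
    return [
        {
            'name': 'ActiveDriver gene list',              # illustrative list name
            'path': 'data/active_driver_results_fdr.tsv',  # point this at the new results$fdr export
        },
    ]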
If a database schema migration is needed:
./manage.py migrate
Optional: make sure the index is created:
./manage.py shell
>>> from database import get_engine
>>> engine = get_engine('bio', app)
>>> models.bio.diseases.ClinicalDataImportIndex.create(engine)  # raises an exception if the index already exists
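If you prefer this step to be idempotent, a guarded variant can be used; this is a sketch, and the exact exception type raised for an already existing index depends on the database backend:
>>> from sqlalchemy.exc import SQLAlchemyError
>>> try:
...     models.bio.diseases.ClinicalDataImportIndex.create(engine)
... except SQLAlchemyError:
...     pass  # the index already exists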
Then reimport mutations from the updated source (e.g. ClinVar):
./manage.py remove mutations -s clinvar && ./manage.py load mutations -s clinvar
Sites and kinases are closely related entities; when reloading all sites, it is recommended to reload the kinase data too (the kinases will be imported together with the sites).
./manage.py remove protein_related --models MIMPMutation
./manage.py remove protein_related --models Site SiteType SiteMotif Kinase KinaseGroup
./manage.py load protein_related -i kinase_mappings
./manage.py load protein_related -i kinase_classification
# note: the list below expands as new site sources are added
./manage.py load protein_related -i hprd others_uniprot glycosylation_uniprot phospho_site_plus phospho_elm
./manage.py load protein_related -i sites_motifs
./manage.py load mutations -s mimp
./manage.py load protein_related -i calculate_interactors # needs to be run if kinases are updated
./manage.py load protein_related -i precompute_ptm_mutations
When migrating the CMS, also remember to copy the static files:
cp -r ActiveDriverDB-2018/website/static/uploaded ActiveDriverDB-2020/website/static/uploaded
Upgrading Python to a newer release will be required periodically. As major parts of ADDB are built with Python, this is an important step.
If the newer Python 3 version is not present in your distribution's repositories, the best way is to install it manually:
# install sqlite3 dependencies
sudo apt-get install sqlite3 libsqlite3-dev sqlite3-pcre
# install additional Python 3.7 build dependencies
sudo apt-get install libncursesw5-dev libssl-dev tk-dev libgdbm-dev libc6-dev libbz2-dev uuid uuid-dev
# this one is required for Python 3.7
sudo apt-get install libffi-dev
## if still using the old Debian Jessie, a backported version of openssl (1.0.2 instead of 1.0.1) will be needed for Python 3.7
## add the following line to /etc/apt/sources.list.d/sources.list:
## deb http://ftp.debian.org/debian jessie-backports main
## and then install the backport:
# sudo apt-get -t jessie-backports install openssl
# download
wget https://www.python.org/ftp/python/3.7.3/Python-3.7.3.tgz -O Python.tgz
tar -xvf Python.tgz
rm Python.tgz
cd Python-3.7.3
# note: it will take much more time to build with optimizations enabled
# also, --enable-optimizations requires a modern gcc (8+); otherwise the build will fail with errors
# --enable-loadable-sqlite-extensions is normally only required for tests
./configure --enable-shared --enable-optimizations --enable-loadable-sqlite-extensions LDFLAGS="-R/usr/local/lib"
make
sudo make altinstall
export LD_LIBRARY_PATH=/usr/local/lib
Create a new virtual environment (note that venv does not provide activate_this.py, hence it is copied over from the old environment):
python3.7 -m venv virtual_environment_3.7
cp virtual_environment/bin/activate_this.py virtual_environment_3.7/bin/activate_this.py
Install all requirements:
source virtual_environment_3.7/bin/activate
cd ActiveDriverDB/website/
pip install --upgrade pip
pip install -r requirements.txt
pip install -r tests/requirements.txt
Run tests:
./run_tests.sh
Replace the path to the environment (virtual_environment -> virtual_environment_3.7) in the WSGI script:
vim app.wsgi
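The fragment to update resembles the following sketch; the exact contents of app.wsgi and the path will differ in your installation:
# app.wsgi (fragment) - illustrative sketch only
activate_this = '/path/to/ActiveDriverDB/website/virtual_environment_3.7/bin/activate_this.py'
with open(activate_this) as f:
    exec(f.read(), {'__file__': activate_this})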
Alternatively, create a symbolic link named virtual_environment instead of modifying app.wsgi:
mv virtual_environment virtual_environment_3.{old_version}
ln -s virtual_environment_3.7 virtual_environment
If your installation relied on the repository-provided Apache2 mod_wsgi integration, uninstall it:
sudo apt-get remove libapache2-mod-wsgi-py3
sudo apt-get install apache2-dev
Then install (and enable) it from within the environment:
pip install mod_wsgi
sudo a2enmod wsgi
Replace the content of /etc/apache2/mods-enabled/wsgi.load with the output of:
mod_wsgi-express module-config
# e.g.:
# LoadModule wsgi_module "/path-to-your-environment/lib/python3.6/site-packages/mod_wsgi/server/mod_wsgi-py36.cpython-36m-x86_64-linux-gnu.so"
# WSGIPythonHome "/path-to-your-environment/"
Restart apache2:
sudo /etc/init.d/apache2 restart
Restart celery:
sudo /etc/init.d/celeryd restart
- When upgrading Apache2 version, see: https://httpd.apache.org/docs/trunk/upgrading.html
- For issues with mod_wsgi see: https://modwsgi.readthedocs.io/en/develop/user-guides/debugging-techniques.html
- If mod_wsgi does not work properly with upgraded BerkeleyDB, compile bsddb3 against the version provided in repositories instead
There are four places where persistent information is stored:
1. SQL database of biological entities (imported once using the import scripts, treated as immutable later)
2. SQL database of the Content Management System (CMS), storing editable pages, plots, user and website configuration data
3. Two BerkeleyDB key-value hash maps, by default stored in the databases directory, although their location can be customized with the BDB_DNA_TO_PROTEIN_PATH and BDB_GENE_TO_ISOFORM_PATH variables (in the config.py file); see the sketch after this list
4. Separate, temporary files containing dumps of user-uploaded mutations, stored in the user_mutations directory; these files are removed after one week.
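For reference, the relevant config.py entries might look roughly like this; the variable names come from the description above, while the paths shown are illustrative only:
# config.py (fragment) - illustrative paths; adjust to your installation
BDB_DNA_TO_PROTEIN_PATH = 'databases/dna_to_protein'
BDB_GENE_TO_ISOFORM_PATH = 'databases/gene_to_isoform'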
Please note that while backups of (2) and (4) are recommended on a frequent basis, backups of (1) and (3) need to be performed only once after data import and repeated after every update.
mysqldump db_cms -u cms_user -p | gzip -c > "db_cms_dump_`date '+%Y_%m_%d__%H_%M_%S'`.gz"
mysqldump db_bio -u bio_user -p | gzip -c > "db_bio_dump_`date '+%Y_%m_%d__%H_%M_%S'`.gz"
Entering the commands will invoke a password prompt; note that there are two passwords to enter (for cms_user and bio_user), which may (and ought to) be different.
Both the username (after -u) and the database name (the first argument to mysqldump) may vary, depending on your local installation. Check the SQLALCHEMY_BINDS variable to determine these.
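A sketch of what the SQLALCHEMY_BINDS entry might contain; the bind keys, credentials and database names below are placeholders matching the commands above, not actual values:
# config.py (fragment) - placeholder connection URIs
SQLALCHEMY_BINDS = {
    'bio': 'mysql://bio_user:PASSWORD@localhost/db_bio',
    'cms': 'mysql://cms_user:PASSWORD@localhost/db_cms',
}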
tar cf - databases | gzip -9 > "bdb_dump_`date '+%Y_%m_%d__%H_%M_%S'`.tar.gz"
tar cf - user_mutations | gzip -9 > "user_mutations_dump_`date '+%Y_%m_%d__%H_%M_%S'`.tar.gz"
Please note that user mutation backup files should be removed on a weekly basis.
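A minimal cleanup sketch, assuming the backups are kept in a single directory and follow the naming pattern used above; adjust the location and retention period to your setup:
# prune_user_mutation_backups.py - a sketch, not part of the repository
import glob
import os
import time

MAX_AGE = 7 * 24 * 3600  # one week, in seconds

for path in glob.glob('user_mutations_dump_*.tar.gz'):
    if time.time() - os.path.getmtime(path) > MAX_AGE:
        os.remove(path)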
To remove all pathways (e.g. before re-importing pathway data), use the interactive shell:
./manage.py shell
from tqdm import tqdm
# Pathway and db are available in the manage.py shell session
for pathway in tqdm(Pathway.query.all()):
    db.session.delete(pathway)
db.session.commit()
db.session.close()