Proxy that allows for:
- pushing Prometheus metrics signed with bittensor wallets. Operating in this manner does not require a DB or Redis.
- verifying incoming signed metrics. Operating in this manner does not require a wallet. Verification is two-fold:
  - the full payload is signed; both the signature and the hotkey are included in the headers, and the signature is verified against the payload
  - the metrics data blob is unpacked and each metric is checked for the "hotkey" label, which must match the value in the header
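The second verification step above can be sketched in plain Python. This is an illustrative stand-in, not the proxy's actual code; the function name `hotkey_matches` is made up, and the label format is assumed to follow the Prometheus text exposition format:

```python
import re

# Matches a hotkey label inside the curly braces of a Prometheus
# text-format sample line, e.g. my_metric{hotkey="abc"} 1.0
_HOTKEY_RE = re.compile(r'hotkey="([^"]*)"')

def hotkey_matches(metrics_blob: str, header_hotkey: str) -> bool:
    """Return True only if every sample line carries hotkey=<header value>."""
    for line in metrics_blob.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank/HELP/TYPE lines
            continue
        m = _HOTKEY_RE.search(line)
        if m is None or m.group(1) != header_hotkey:
            return False
    return True
```

A payload containing any sample without the expected hotkey label is rejected outright.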
- docker with compose plugin
- python 3.11
- pdm
- nox
```sh
./setup-dev.sh
docker compose up -d  # this will also start node_exporter and two prometheus instances
cd app/src
pdm run manage.py wait_for_database --timeout 10
pdm run manage.py migrate
pdm run manage.py runserver 0.0.0.0:8000
```
This setup requires a working bittensor wallet (for the on-site prometheus to read the hotkey and so that the proxy can sign requests). Requests are sent from the on-site prometheus to the proxy, then to the same proxy (through a different view) and on to the central prometheus. Starting celery and celery beat is not required for local development: instead of having a periodic task populate the validator list, records can be added to it manually using

```sh
python manage.py debug_add_validator <hotkey>
```
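Conceptually, the push path amounts to: sign the full payload, attach the signature and the hotkey as headers, forward. A minimal stand-in sketch, with HMAC-SHA256 in place of the wallet's actual sr25519 signature and made-up header names:

```python
import hashlib
import hmac

# Illustrative header names; the proxy's real header names may differ.
SIGNATURE_HEADER = "X-Signature"
HOTKEY_HEADER = "X-Hotkey"

def sign_headers(payload: bytes, hotkey: str, secret: bytes) -> dict[str, str]:
    """Build auth headers for a pushed metrics payload.

    HMAC-SHA256 stands in for the wallet's actual signature, which a
    verifier would check against the hotkey's public key instead.
    """
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return {SIGNATURE_HEADER: signature, HOTKEY_HEADER: hotkey}

def verify_headers(payload: bytes, headers: dict[str, str], secret: bytes) -> bool:
    """Recompute the signature over the full payload and compare."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, headers.get(SIGNATURE_HEADER, ""))
```

Any change to the payload after signing invalidates the signature, which is what makes the header-level check in the proxy meaningful.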
This sets up "deployment by pushing to git storage on remote", so that:

- `git push origin ...` just pushes code to GitHub / other storage without any consequences;
- `git push production master` pushes code to a remote server running the app and triggers a git hook to redeploy the application.
```
Local .git ------------> Origin .git
            \
             ------> Production .git (redeploy on push)
```
Use `ssh-keygen` to generate a key pair for the server, then add read-only access to the repository in the "deployment keys" section (`ssh -A` is easy to use, but not safe).
```sh
# remote server
mkdir -p ~/repos
cd ~/repos
git init --bare --initial-branch=master bittensor-prometheus-proxy.git

mkdir -p ~/domains/bittensor-prometheus-proxy

# locally
git remote add production root@<server>:~/repos/bittensor-prometheus-proxy.git
git push production master
```
```sh
# remote server
cd ~/repos/bittensor-prometheus-proxy.git

cat <<'EOT' > hooks/post-receive
#!/bin/bash
unset GIT_INDEX_FILE
export ROOT=/root
export REPO=bittensor-prometheus-proxy
while read oldrev newrev ref
do
    if [[ $ref =~ .*/master$ ]]; then
        export GIT_DIR="$ROOT/repos/$REPO.git/"
        export GIT_WORK_TREE="$ROOT/domains/$REPO/"
        git checkout -f master
        cd "$GIT_WORK_TREE"
        ./deploy.sh
    else
        echo "Doing nothing: only the master branch may be deployed on this server."
    fi
done
EOT

chmod +x hooks/post-receive
./hooks/post-receive
```
```sh
cd ~/domains/bittensor-prometheus-proxy
sudo bin/prepare-os.sh
./setup-prod.sh

# adjust the `.env` file

mkdir letsencrypt
./letsencrypt_setup.sh
./deploy.sh
```
Only the master branch is used to redeploy the application.
To deploy another branch, force-push the desired branch to the remote's master:

```sh
git push --force production local-branch-to-deploy:master
```
A task should be annotated with `on_failure=send_to_dead_letter_queue`.
Once the reason for the task's failure is fixed, the task can be re-processed
by moving tasks from the dead letter queue to the main one ("celery"):

```sh
manage.py move_tasks "dead_letter" "celery"
```

If the task fails again, it will be put back into the dead letter queue.

To flush all tasks in a specific queue, use

```sh
manage.py flush_tasks "dead_letter"
```
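The dead-letter flow can be illustrated with plain-Python stand-in queues. The real project routes through Celery; `run_task` and `move_tasks` here are purely illustrative:

```python
from collections import deque

# Stand-in queues named after the real Celery queues.
queues = {"celery": deque(), "dead_letter": deque()}

def run_task(task: dict) -> None:
    """Run a task; on failure, push it to the dead letter queue
    (what on_failure=send_to_dead_letter_queue arranges in Celery)."""
    try:
        task["fn"]()
    except Exception:
        queues["dead_letter"].append(task)

def move_tasks(src: str, dst: str) -> int:
    """What `manage.py move_tasks "dead_letter" "celery"` does conceptually:
    drain one queue into another, returning how many tasks were moved."""
    moved = 0
    while queues[src]:
        queues[dst].append(queues[src].popleft())
        moved += 1
    return moved
```

A task that fails again after being moved simply lands back in `dead_letter`, so the cycle can be repeated once the underlying bug is fixed.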
Running the app requires proper certificates to be put into `nginx/monitoring_certs`;
see `nginx/monitoring_certs/README.md` for more details.
Somewhere, probably in `metrics.py`:

```python
import prometheus_client

some_calculation_time = prometheus_client.Histogram(
    'some_calculation_time',
    'How long it took to calculate something',
    namespace='django',
    unit='seconds',
    labelnames=['task_type_for_example'],
    buckets=[0.5, 1, *range(2, 30, 2), *range(30, 75, 5), *range(75, 135, 15)],
)
```

Somewhere else:

```python
with some_calculation_time.labels('blabla').time():
    do_some_work()
```
Backup setup & recovery information:
Add to crontab:

```sh
# crontab -e
30 0 * * * cd ~/domains/bittensor-prometheus-proxy && ./bin/backup-db.sh > ~/backup.log 2>&1
```
Set `BACKUP_LOCAL_ROTATE_KEEP_LAST` to keep only a specific number of the most recent backups in the local `.backups` directory.

Backups are put in the `.backups` directory locally; additionally, they can then be stored offsite in the following ways:
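What `BACKUP_LOCAL_ROTATE_KEEP_LAST` asks for amounts to keeping only the N newest files. A sketch of that rotation; the real logic lives in the backup script, and `rotate_backups` here is hypothetical:

```python
from pathlib import Path

def rotate_backups(backup_dir: str, keep_last: int) -> list[str]:
    """Delete all but the `keep_last` newest files in `backup_dir`,
    returning the names of the files removed (oldest first)."""
    # Sort by modification time, oldest first.
    files = sorted(Path(backup_dir).iterdir(), key=lambda p: p.stat().st_mtime)
    # keep_last == 0 would make files[:-0] empty, so handle it explicitly.
    to_remove = files[:-keep_last] if keep_last else files
    removed = []
    for path in to_remove:
        path.unlink()
        removed.append(path.name)
    return removed
```

With `keep_last` greater than or equal to the number of backups present, nothing is deleted.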
Backblaze

Set in the `.env` file:

```
BACKUP_B2_BUCKET_NAME
BACKUP_B2_KEY_ID
BACKUP_B2_KEY_SECRET
```
Email

Set in the `.env` file:

```
EMAIL_HOST
EMAIL_PORT
EMAIL_HOST_USER
EMAIL_HOST_PASSWORD
EMAIL_TARGET
```
- Follow the instructions above to set up a new production environment
- Restore the database using `bin/restore-db.sh`
- See if everything works
- Set up backups on the new machine
- Make sure everything is filled out in `.env`: error reporting integration, email accounts, etc.
The skeleton of this project was generated using cookiecutter-rt-django.
Use `cruft update` to update the project to the latest version of the template, with all current bugfixes and features.