ComputeHorde is a specialized subnet within the Bittensor network designed to supercharge Bittensor with scalable and trusted GPU computing power.
By transforming untrusted GPUs provided by miners into trusted compute resources, ComputeHorde enables validators of other subnets to access large amounts of decentralized computing power cost-effectively, paving the way for Bittensor to scale beyond its current limitations to support potentially over 1,000 subnets.
- Decentralized Compute for Bittensor Validators
  ComputeHorde aims to become the go-to decentralized source for the hardware needed to validate other subnets. The mission is to decrease the Bittensor ecosystem's dependency on centralized services. Assuring the quality of the miners' work is essential for Bittensor's overall reliability.
- Fair and Verified Work
  ComputeHorde employs mechanisms to ensure miners provide authentic compute work, fairly verified by the validators:
  - Execute tasks from validators stake-proportionally.
  - Handle both organic (external, from other subnets) and synthetic (used to validate ComputeHorde miners) tasks.
  - Match jobs to the advertised hardware (e.g., ensuring A6000 GPUs are used for tasks requiring them).
  - Prevent malicious behaviors like "weight-copying" through innovative validation mechanisms.
- Scalable Mining with Executors
  Each miner in ComputeHorde can spawn multiple executors, each performing individual compute tasks. This removes the 256 miner (UID) limit and significantly scales the potentially available computing power.
- Hardware Classes
  ComputeHorde introduces hardware classes to create a free market for GPU resources, balancing cost-effectiveness with performance. Currently, A6000 is the only supported class, with A100 coming next. The end goal is to support all GPU types and configurations required by validators across Bittensor subnets.
Bittensor is a decentralized network designed to ensure that AI, the most critical technology of our era, remains accessible to everyone and free from the control of centralized entities. Each Bittensor subnet specializes in a digital commodity, ranging from storage and large language models to general computing.
This is achieved by distributing $TAO tokens to incentivize:
- Subnet owners to define the most useful and reliable commodities (by designing the incentive mechanism),
- Miners to deliver high-quality and efficient services innovatively,
- Validators to reward miners based on their performance.
Bittensor's end goal is to create an unstoppable, self-sustaining ecosystem free from single-point control, enabling innovation and resilience for the entire network. ComputeHorde adds GPU-powered validation to this ecosystem, helping other subnets operate effectively without relying on centralized cloud services.
The scoring mechanism in ComputeHorde is designed to incentivize miners to perform organic jobs while maintaining accountability and fairness in the network.
The goal is to eliminate the current disincentive where miners avoid organic jobs to prevent penalties for rejecting synthetic jobs.
- 1 point for each successfully completed synthetic job.
- (in development) 1 point for each successfully completed organic job.
- (in development) 1 point for each properly rejected synthetic job.
A successfully completed job is one that finishes within a specified timeout.
A synthetic job is considered properly rejected when the miner provides a receipt proving they are currently occupied with an organic job from another validator (with a minimum of 50k stake).
Miners who implement dancing (moving their executors between different UIDs) receive a 30% bonus to their scores (as of December 2024).
This encourages variance, which is essential for preventing weight-copying.
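The point rules and the dancing bonus described above can be sketched as follows. This is a minimal illustration, not the validator's actual scoring code: the `JobCounts` structure and function names are hypothetical, and applying the bonus as a multiplier on the raw points is an assumption.

```python
from dataclasses import dataclass

DANCING_BONUS = 0.30  # 30% bonus quoted in the text (as of December 2024)


@dataclass
class JobCounts:
    """Per-miner tallies for one scoring period (hypothetical structure)."""

    synthetic_completed: int = 0
    organic_completed: int = 0  # counted "in development", per the text
    synthetic_properly_rejected: int = 0  # counted "in development", per the text


def raw_points(counts: JobCounts) -> int:
    """One point for each job in each of the three categories listed above."""
    return (
        counts.synthetic_completed
        + counts.organic_completed
        + counts.synthetic_properly_rejected
    )


def score(counts: JobCounts, is_dancing: bool) -> float:
    """Raw points, multiplied by 1.3 for dancing miners (assumed multiplicative)."""
    points = float(raw_points(counts))
    return points * (1.0 + DANCING_BONUS) if is_dancing else points
```

For example, a miner with 10 completed synthetic jobs, 2 completed organic jobs, and 1 properly rejected synthetic job has 13 raw points, and roughly 16.9 if it is dancing.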
Each hardware class supported by ComputeHorde has a configurable weight parameter. These weights determine the relative contribution of a miner's work to their ultimate score.
This system allows the network to prioritize specific hardware classes based on utility and demand, creating a flexible and fair reward structure.
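As a rough sketch of how per-class weights could feed into a final score, assume the contribution of each hardware class is its points multiplied by its weight, summed over classes. The linear combination, the function name, and the example weight values are assumptions for illustration; only the class names A6000 and A100 come from this README.

```python
from typing import Mapping


def weighted_score(
    points_by_class: Mapping[str, float],
    class_weights: Mapping[str, float],
) -> float:
    """Sum of (points earned on a hardware class) x (that class's weight).

    Classes without a configured weight contribute nothing (assumed behavior).
    """
    return sum(
        class_weights.get(hardware_class, 0.0) * points
        for hardware_class, points in points_by_class.items()
    )


# Hypothetical weights: A100 work counts twice as much as A6000 work.
example_weights = {"A6000": 1.0, "A100": 2.0}
example_points = {"A6000": 10.0, "A100": 4.0}
example_total = weighted_score(example_points, example_weights)
```

Changing the weights shifts rewards toward the hardware classes that are in demand without touching the per-job point rules.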
Facilitator:
- Acts as a gateway for organic requests (from other subnets’ validators) to enter ComputeHorde.
- Sends tasks to chosen validators, who then distribute them to miners.
Validator:
- Receives organic requests via the Facilitator or generates synthetic tasks for validation.
- Distributes both kinds of tasks to miners and evaluates the results:
  - Uses a separate GPU, called a Trusted Miner, to pre-run part of the validation tasks and establish expected results.
- The Trusted Miner shares the same code as a regular miner, but is configured differently:
  - It is not registered in the metagraph.
  - It only accepts tasks from the associated validator.
- See the validator's README for more details.
Miner:
- Accepts job requests from validators.
- Manages executors to perform tasks and sends results back to validators.
- See the miner's README for more details.
Executor:
- An instance spawned by a miner to perform a single dockerized task.
- Operates in a restricted environment, with network access limited to what is necessary for:
  - communicating with miners,
  - downloading docker images,
  - handling job data.
- A miner's executors form its horde and are assigned hardware classes.
- See the executor's README for more details.
- Commit-Reveal: Validators post hidden weights and reveal them in the next epoch, making the copying of current weights impossible.
- Executor Dancing: Miners randomly move GPUs across multiple UIDs, further reducing the effectiveness of copying old weights.
- Synthetic tasks are designed to run only on specific hardware (e.g., A6000 GPUs), ensuring miners deliver the advertised compute power.
- A scoring system that incentivizes miners to complete organic tasks.
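To illustrate the commit-reveal idea from the list above, here is a generic hash-based sketch. It is not Bittensor's on-chain mechanism: the function names, the use of SHA-256, and the JSON encoding of weights are all assumptions chosen to keep the example self-contained.

```python
import hashlib
import json
import secrets
from typing import Dict, Tuple


def _digest(weights: Dict[int, float], salt: bytes) -> str:
    # Canonical encoding so the same weights always hash to the same value.
    payload = json.dumps(weights, sort_keys=True).encode()
    return hashlib.sha256(salt + payload).hexdigest()


def commit(weights: Dict[int, float]) -> Tuple[str, bytes]:
    """Return (commitment, salt). Only the commitment is published now."""
    salt = secrets.token_bytes(16)  # random salt prevents guessing the weights
    return _digest(weights, salt), salt


def verify_reveal(commitment: str, weights: Dict[int, float], salt: bytes) -> bool:
    """Check that the revealed weights and salt match the earlier commitment."""
    return _digest(weights, salt) == commitment
```

While only the commitment is public, a copier cannot read the current weights from it. By the time the weights are revealed, they are an epoch old, and executor dancing makes old weights less useful.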
- Bring organic jobs from other subnets' validators
  Allow the free market to regulate demand and prioritize cost-effective hardware, rather than solely focusing on the strongest hardware.
- Strengthen Security
  Introduce rules and safeguards to prevent malicious actors from exploiting the network, ensuring a fair and secure environment for all participants.
- Support Long-Running Jobs
  Implement accounting mechanisms for miners to be rewarded proportionally to the time spent, even for incomplete long-running tasks.
- Expand Hardware Support
  Add support for all GPU classes required by other Bittensor subnets.
- Fair Resource Sharing
  Allocate resources based on validators' stakes while allowing low-stake validators access when demand is low.
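One way to picture stake-based allocation is to split a fixed number of executors among validators in proportion to stake. The sketch below uses largest-remainder rounding. The function name and the rounding rule are assumptions, and it does not model the "access when demand is low" part of the goal.

```python
from typing import Dict, Mapping


def allocate(executors: int, stakes: Mapping[str, float]) -> Dict[str, int]:
    """Split `executors` among validators proportionally to their stake."""
    total = sum(stakes.values())
    if executors < 0 or total <= 0:
        raise ValueError("need a non-negative executor count and positive total stake")

    # Exact (fractional) shares, then round down.
    exact = {v: executors * s / total for v, s in stakes.items()}
    allocation = {v: int(share) for v, share in exact.items()}

    # Hand the executors lost to rounding to the largest fractional remainders.
    leftover = executors - sum(allocation.values())
    by_remainder = sorted(stakes, key=lambda v: exact[v] - allocation[v], reverse=True)
    for validator in by_remainder[:leftover]:
        allocation[validator] += 1
    return allocation
```

With 10 executors and stakes of 70, 20, and 10, the validators receive 7, 2, and 1 executors respectively.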
- ComputeHorde mainnet UID: 12
- ComputeHorde testnet UID: 174
- ComputeHorde channel within Bittensor discord
- Information dashboards:
This repository contains the implementations of:
- Validator: Requires a Trusted Miner for cross-checking synthetic tasks.
- Miner: Modifying the miner code on subnet 12 is discouraged, as the stock implementation manages only communications between components. The competitive edge lies in optimizing executor provisioning. Users can create custom executor managers to scale and optimize mining efficiency. The default executor manager runs a single executor and is not intended for mainnet use.
In the following sections, you can find instructions on running Validator and Miner. There are more details in each component's README and in the Troubleshooting section below.
Modifications to ComputeHorde components are generally not recommended, with the exception of the ExecutorManager class. Customizing this class allows you to implement dedicated logic for handling executors, such as running multiple executors per miner.
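To give a feel for what "dedicated logic for handling executors" might look like, here is a bookkeeping-only sketch of a manager that tracks a fixed pool of executors per hardware class. The base class below is a hypothetical stand-in, not the real ExecutorManager interface; all method names are assumptions, and nothing here actually starts a machine or container. Consult the miner's README for the real class and its methods.

```python
from abc import ABC, abstractmethod
from typing import Dict


class ExecutorManagerSketch(ABC):
    """Hypothetical stand-in; NOT the real ExecutorManager interface."""

    @abstractmethod
    def manifest(self) -> Dict[str, int]:
        """Map hardware class name -> number of executors this miner offers."""

    @abstractmethod
    def start_executor(self, hardware_class: str) -> str:
        """Reserve one executor and return an opaque handle."""

    @abstractmethod
    def stop_executor(self, handle: str) -> None:
        """Release the executor identified by `handle`."""


class FixedPoolManager(ExecutorManagerSketch):
    """Tracks capacity per hardware class; provisioning itself is left out."""

    def __init__(self, capacity: Dict[str, int]) -> None:
        self._capacity = dict(capacity)
        self._running: Dict[str, str] = {}  # handle -> hardware class
        self._next_id = 0

    def manifest(self) -> Dict[str, int]:
        return dict(self._capacity)

    def start_executor(self, hardware_class: str) -> str:
        in_use = sum(1 for c in self._running.values() if c == hardware_class)
        if in_use >= self._capacity.get(hardware_class, 0):
            raise RuntimeError(f"no free {hardware_class} executor")
        self._next_id += 1
        handle = f"{hardware_class}-{self._next_id}"
        self._running[handle] = hardware_class
        return handle

    def stop_executor(self, handle: str) -> None:
        self._running.pop(handle, None)  # ignore unknown handles
```

A real manager would replace the bookkeeping in `start_executor` and `stop_executor` with calls to whatever provisions the GPU machines.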
A ComputeHorde validator is built out of three components:
- trusted miner (requires A6000 - the only GPU supported now) for cross-validation
- two S3 buckets for sharing LLM data (lots of small text files)
- validator machine (standard, non-GPU) - for regular validating & weight-setting
The steps are performed by running installation scripts on your local machine, which holds your wallet files. For clarity, these installation scripts are not run on the machine that will become the trusted miner or the validator; they connect to those machines over SSH from your local machine:
Prepare a trusted miner and S3 buckets (find out how using the links above).
Then, set the environment variables directly in the .env file of your validator instance and restart your validator:
docker compose down --remove-orphans && docker compose up -d
Set the following environment variables in a terminal on your local machine (on the machine where you have your wallet files):
export TRUSTED_MINER_ADDRESS=...
export TRUSTED_MINER_PORT=...
export S3_BUCKET_NAME_PROMPTS=...
export S3_BUCKET_NAME_ANSWERS=...
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_DEFAULT_REGION=...
Note: The AWS_DEFAULT_REGION property is optional. Use it when your buckets are not in your default AWS region.
Export AWS_ENDPOINT_URL to use another S3-compatible cloud object storage provider. If it is not given, AWS S3 will be used.
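Before running the installer, it can help to confirm that the variables listed above are actually set in the current terminal session. The following is a small optional helper, not part of ComputeHorde; the script and function names are hypothetical, and the split into required and optional variables follows the notes above.

```python
import os
from typing import List, Mapping

# Variables the text above asks you to export.
REQUIRED_VARS = [
    "TRUSTED_MINER_ADDRESS",
    "TRUSTED_MINER_PORT",
    "S3_BUCKET_NAME_PROMPTS",
    "S3_BUCKET_NAME_ANSWERS",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
]
# Described above as optional.
OPTIONAL_VARS = ["AWS_DEFAULT_REGION", "AWS_ENDPOINT_URL"]


def missing_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
    for name in OPTIONAL_VARS:
        if name not in os.environ:
            print(f"Optional variable not set: {name}")
```

Run it from the same terminal in which you exported the variables, since exported values are only visible to processes started from that session.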
Then execute the following command from the same terminal session:
curl -sSfL https://github.com/backend-developers-ltd/ComputeHorde/raw/master/install_validator.sh | bash -s - SSH_DESTINATION HOTKEY_PATH
Replace:
- SSH_DESTINATION with your server's connection info in the usual SSH form (e.g. user@hostname)
- HOTKEY_PATH with the path of your hotkey (e.g. ~/.bittensor/wallets/my-wallet/hotkeys/my-hotkey)
This script installs the necessary tools on the server, copies the public keys, and starts the validator with the corresponding runner and the default config.
If you want to change the default config, see the Validator runner README for details.
If you want to trigger jobs from the validator, see the Validator README for details.
If anything seems wrong, check the troubleshooting section.
To quickly start a miner, create an Ubuntu Server and execute the following command from your local machine (where you have your wallet files).
curl -sSfL https://github.com/backend-developers-ltd/ComputeHorde/raw/master/install_miner.sh | bash -s - production SSH_DESTINATION HOTKEY_PATH
Replace:
- SSH_DESTINATION with your server's connection info in the usual SSH form (e.g. user@hostname)
- HOTKEY_PATH with the path of your hotkey (e.g. ~/.bittensor/wallets/my-wallet/hotkeys/my-hotkey)
This script installs the necessary tools on the server, copies the keys, and starts the miner with the corresponding runner and the default config.
If you want to change the default config, see Miner runner README for details.
- Check if your miner is reachable from a machine different from the miner:
  curl {ADDRESS}:{PORT}/admin/login/ -i
  Both PORT and ADDRESS can be obtained from the metagraph. If everything is ok, the first line should read HTTP/1.1 200 OK. By default, the address is automatically determined by the bittensor lib, but you can input your own in .env.
- Check if you're getting any jobs and what the outcomes are. An admin panel for that is coming, but for now you can achieve that with:
  docker-compose exec miner-runner docker-compose exec db psql postgres -U postgres -c 'select * from miner_acceptedjob order by id desc;'
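The reachability check above can also be scripted, for example to poll a miner from a monitoring host. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes plain HTTP, which is what curl uses when no scheme is given. Unlike curl -i, urlopen follows redirects, so it reports on the final response.

```python
import urllib.error
import urllib.request


def login_url(address: str, port: int) -> str:
    """Build the same URL the curl command above requests."""
    return f"http://{address}:{port}/admin/login/"


def is_reachable(address: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the miner's admin login page answers with HTTP 200."""
    try:
        with urllib.request.urlopen(login_url(address, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures, and HTTP errors.
        return False
```

Call `is_reachable(ADDRESS, PORT)` with the values from the metagraph, from a machine other than the miner itself.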
If you need to move your miner or validator to a new server, see the migration guide.
The ComputeHorde software starts several Docker containers, with layout differing slightly between the miner and the validator.
The most relevant logs are from the container with a name ending in app-1.
For a miner:
- SSH into the miner machine.
- Run docker ps to find the name of the appropriate container (e.g., compute_horde_miner-app-1).
- Run docker logs CONTAINER_NAME.
For a validator:
- SSH into the validator machine.
- Navigate to the directory where the docker compose yaml file is located.
- Run docker compose logs.
To perform a hard restart of all ComputeHorde Docker containers, run the following commands:
docker compose down --remove-orphans
docker compose up
Afterwards, use docker ps to verify that the containers have started successfully.
To start fresh and remove all persistent data, follow these steps:
- Stop the validator or miner (all running containers).
- Run docker volume ls to list all existing volumes and identify the ones to delete. Key volumes to consider:
  - Miner: miner_db_data, miner_redis_data
  - Validator: validator_db, validator_redis, validator_static
- Run the following command to remove all Docker volumes (this removes every volume on the machine, not only the ComputeHorde ones):
  docker volume rm $(docker volume ls -q)
- Start the validator or miner again.
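If the machine also hosts unrelated Docker volumes, you may prefer to remove only the ComputeHorde ones. The sketch below picks the key volumes named above out of a list of volume names, such as the output of docker volume ls -q. The function name is hypothetical, and matching names that end in `_<volume>` to allow for a compose project prefix is an assumption.

```python
from typing import Iterable, List

# Key volume names taken from the list above.
MINER_VOLUMES = ("miner_db_data", "miner_redis_data")
VALIDATOR_VOLUMES = ("validator_db", "validator_redis", "validator_static")


def volumes_to_remove(all_volumes: Iterable[str], role: str) -> List[str]:
    """Select the ComputeHorde volumes for `role` ("miner" or "validator")."""
    if role == "miner":
        wanted = MINER_VOLUMES
    elif role == "validator":
        wanted = VALIDATOR_VOLUMES
    else:
        raise ValueError("role must be 'miner' or 'validator'")
    # Accept an exact match or a prefixed name such as "<project>_<volume>".
    return [
        name
        for name in all_volumes
        if any(name == w or name.endswith("_" + w) for w in wanted)
    ]
```

Pass the resulting names to docker volume rm instead of the output of docker volume ls -q.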
Miner installation may occasionally fail with an error about the system being unable to install the cuda-drivers package.
This issue is often caused by mismatched drivers already installed before running the installation script.
To resolve this:
- Run the following command on the miner machine to purge any conflicting NVIDIA packages:
  sudo apt-get purge -y '^nvidia-.*'
- Re-run the install_miner.sh script from your local machine.
To verify the health of the NVIDIA setup, run the following command on the miner machine:
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
If the output indicates a problem (especially immediately after installation), a restart of the services may help.
To verify that the S3 buckets are configured correctly, you can list their contents by running the following command on a machine with the AWS CLI installed (check out the Amazon instructions). Replace the placeholders with the appropriate values:
AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=... aws s3api list-objects --bucket BUCKET_NAME
The bucket names and required AWS credentials are stored in the validator’s .env file as:
- S3_BUCKET_NAME_PROMPTS
- S3_BUCKET_NAME_ANSWERS
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
If you encounter a permissions error, such as missing the s3:ListBucket permission, you may need to use the AWS root user credentials.