<p align="center">
  <img src="https://github.com/user-attachments/assets/51e44795-5206-49d6-a12a-ecacd2799df2" alt="Prime Intellect" style="width: 100%; height: auto;"/>
</p>

---

<h3 align="center">
GENESYS: Reasoning Data Generation & Verification
</h3>
<p align="center">
| <a href=""><b>Blog</b></a> | <a href=""><b>X Thread</b></a> | <a href=""><b>SYNTHETIC-1 Dashboard</b></a> |
</p>

---

Genesys is a library for synthetic reasoning data generation and verification, used to generate [SYNTHETIC-1]().

The library has two main entrypoints:
- `src/genesys/generate.py` samples responses to tasks from a given dataset using a teacher model.
- `src/genesys/verify.py` verifies those responses and assigns rewards using task-specific verifiers.

# Usage

## Installation

**Quick Install:** Run the following commands for a quick install:
```
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
git clone git@github.com:PrimeIntellect-ai/genesys.git
cd genesys
uv sync --extra sglang
```

## Data Generation

To check that your installation succeeded, you can run one of the following commands to generate data with a small model:

```
# with a config file; see /configs for all configurations
uv run python src/genesys/generate.py @ configs/debug.toml

# otherwise, with --flags
uv run python src/genesys/generate.py \
    --name_model "Qwen/Qwen2.5-Coder-0.5B" \
    --num_gpus 1 \
    --sample_per_file 8 \
    --temperature 0.6 \
    --max_tokens 16384 \
    --data.max_samples 16 \
    --data.batch_size 8 \
    --data.path "PrimeIntellect/verifiable-math" # follow the schema of "PrimeIntellect/verifiable-math", "PrimeIntellect/verifiable-coding", etc.
```
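
The config-file form mirrors the CLI flags. As a rough illustration only, a config equivalent to the flags above could look like the sketch below; the key layout is an assumption, and the actual files in `configs/` are the reference:

```toml
# Hypothetical sketch mirroring the flags above - see configs/ for the real files
name_model = "Qwen/Qwen2.5-Coder-0.5B"
num_gpus = 1
sample_per_file = 8
temperature = 0.6
max_tokens = 16384

[data]
max_samples = 16
batch_size = 8
path = "PrimeIntellect/verifiable-math"
```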

Your file with responses will be saved to `/output`.

**Run with Docker:** You can also generate data using the Docker image:

```
sudo docker run --gpus all -it primeintellect/genesys:latest uv run python src/genesys/generate.py @ configs/debug.toml
```

## Verification

To verify model responses, you can use the `src/genesys/verify.py` script along with an output file from `src/genesys/generate.py` located in `output`:

```
uv run python src/genesys/verify.py --file <path-to-out-file> # the output file is usually at /output/out_<some_uuid>.jsonl
```

The verification loop runs asynchronously to parallelize verification and speed up processing.
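
As a rough mental model (this is not the actual Genesys internals, and all names below are hypothetical), such a loop can be sketched as bounded-concurrency `asyncio` tasks with per-response timeouts:

```python
import asyncio

MAX_PARALLEL = 30  # cap on concurrent verifications (cf. a verifier's max_parallel)
TIMEOUT = 10       # per-response timeout in seconds (cf. a verifier's timeout)


async def verify_one(response_id: str) -> float:
    await asyncio.sleep(0.1)  # stand-in for real verification work
    return 1.0


async def verify_all(response_ids: list[str]) -> list[float]:
    sem = asyncio.Semaphore(MAX_PARALLEL)

    async def bounded(rid: str) -> float:
        async with sem:  # never run more than MAX_PARALLEL verifications at once
            try:
                return await asyncio.wait_for(verify_one(rid), timeout=TIMEOUT)
            except asyncio.TimeoutError:
                return 0.0  # score timed-out verifications as failed

    return await asyncio.gather(*(bounded(r) for r in response_ids))


scores = asyncio.run(verify_all([f"resp_{i}" for i in range(100)]))
print(f"mean score: {sum(scores) / len(scores):.2f}")
```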

## Adding your own Tasks & Verifiers

Genesys is built to be easily extendable with your own tasks and verifiers. You can generate responses for your own data by using a Hugging Face dataset that follows our schema, and add your own verifier with minimal code.

### Using your own Data

To generate data for your own tasks using `src/genesys/generate.py`, pass a Hugging Face dataset with the same schema as `PrimeIntellect/verifiable-math` and the other task datasets. This is what the schema looks like:

```python
from typing import Dict, Optional

from pydantic import BaseModel


class Task(BaseModel):
    problem_id: str
    source: str  # source of the dataset
    task_type: str  # used to map the response to a verifier
    in_source_id: Optional[str]
    prompt: str
    gold_standard_solution: Optional[str]
    verification_info: Dict  # empty dict if no data is needed
    metadata: Dict  # empty dict if no data is needed
```
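
For example, a dataset following this schema can be built and uploaded with the `datasets` library. The repo id, prompt, and `verification_info` keys below are placeholders; which keys a task actually needs depends on the verifier that will consume it:

```python
from datasets import Dataset

rows = [
    {
        "problem_id": "my-task-0",
        "source": "my_dataset",
        "task_type": "verifiable_math",  # must match a key in the verifier registry (see below)
        "in_source_id": None,
        "prompt": "What is 17 * 23?",
        "gold_standard_solution": "391",
        "verification_info": {"answer": "391"},  # placeholder keys - use whatever your verifier reads
        "metadata": {},
    },
]

Dataset.from_list(rows).push_to_hub("your-org/my-tasks")  # then pass it via --data.path
```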

The output from the generation script is a `.jsonl` file in which each line contains a `Response` object:

```python
class Response(BaseModel):
    problem_id: str
    source: str
    task_type: str
    in_source_id: Optional[str]
    prompt: str
    gold_standard_solution: Optional[str]
    verification_info: Dict
    metadata: Dict
    llm_response: str  # the raw LLM response string
    response_id: str
    model_name: str
    generation_config: Dict  # sampling parameters
```
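
For instance, a few lines of Python are enough to inspect these files; the glob below assumes the default `output/out_<uuid>.jsonl` naming mentioned above:

```python
import glob
import json
from collections import Counter

task_counts = Counter()
for path in glob.glob("output/out_*.jsonl"):
    with open(path) as f:
        for line in f:
            response = json.loads(line)  # one Response object per line
            task_counts[response["task_type"]] += 1

print(task_counts)  # e.g. Counter({'verifiable_math': 16})
```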

### Adding a Verifier

To implement a verifier, you have to 1) add a verifier class implementing a `verify` function that receives a `Response` object, and 2) add the verifier to the verifier registry.

You can implement your own verifier in `src/genesys/verifiers`:

```python
from genesys.schemas import Response
from genesys.verifiers.base_verifier import BaseVerifier


class LengthVerifier(BaseVerifier):
    max_parallel = 30  # maximum number of verification tasks to run in parallel - relevant when doing LLM calls with rate limits
    timeout = 10  # timeout in seconds - when running LLM-generated code, we want to avoid getting stuck in an infinite loop

    # optional: an __init__ function for setup (e.g. an LLM API client)

    def verify(self, result: Response):
        """
        Required: this example verify function checks the length of the LLM response
        and rewards responses over a threshold specified in the dataset.

        The output should be a dict with a score from 0 to 1 and a
        'verification_result_info' dict containing metadata.
        """
        response = result["llm_response"]
        threshold_small = result["verification_info"]["length_threshold_small"]
        threshold_large = result["verification_info"]["length_threshold_large"]

        if len(response) > threshold_large:
            score = 1.0
        elif len(response) > threshold_small:
            score = 0.5
        else:
            score = 0.0

        return dict(score=score, verification_result_info={})  # add metadata to verification_result_info if needed

    # optional: a shutdown() function for termination, called after all responses
    # are verified. For instance, the code verifier uses this to shut down its
    # Docker containers.
```

Now, add your verifier to the verifier registry in `src/genesys/verifiers/registry.py`:

```python
from genesys.verifiers.code_test_verifier import CodeVerifier
from genesys.verifiers.math_verifier import MathVerifier
from genesys.verifiers.llm_judge_verifier import LlmJudgeVerifier
from genesys.verifiers.code_output_prediction_verifier import CodeUnderstandingVerifier
from genesys.verifiers.length_verifier import LengthVerifier  # your verifier

VERIFIER_REGISTRY = {
    "verifiable_code": CodeVerifier,
    "verifiable_math": MathVerifier,
    "llm_judgeable_groundtruth_similarity": LlmJudgeVerifier,
    "verifiable_code_understanding": CodeUnderstandingVerifier,
    "length_adherance": LengthVerifier,
}
```

Every task in your dataset with `"task_type": "length_adherance"` will now be verified with the implemented length verifier when running `src/genesys/verify.py`.
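
Putting the pieces together, a hypothetical dataset row routed to this verifier would carry the thresholds that `verify` reads from `verification_info`:

```python
# Hypothetical task row for the LengthVerifier sketched above
task = {
    "problem_id": "length-0",
    "source": "my_dataset",
    "task_type": "length_adherance",  # routes the response to LengthVerifier via the registry
    "in_source_id": None,
    "prompt": "Explain gradient descent in detail.",
    "gold_standard_solution": None,
    "verification_info": {
        "length_threshold_small": 500,   # responses longer than this score 0.5
        "length_threshold_large": 2000,  # responses longer than this score 1.0
    },
    "metadata": {},
}
```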