<p align="center">
<img src="https://github.com/user-attachments/assets/51e44795-5206-49d6-a12a-ecacd2799df2" alt="Prime Intellect" style="width: 100%; height: auto;"/>
</p>

---

<h3 align="center">
GENESYS: Reasoning Data Generation & Verification
</h3>
<p align="center">
| <a href=""><b>Blog</b></a> | <a href=""><b>X Thread</b></a> | <a href=""><b>SYNTHETIC-1 Dashboard</b></a> |
</p>

---


Genesys is a library for synthetic reasoning data generation and verification, used to generate [SYNTHETIC-1]().

The library has two main entrypoints:
- `src/genesys/generate.py` is used to sample responses to tasks from a given dataset using a teacher model.
- `src/genesys/verify.py` is used to verify responses and assign rewards using verifiers.


# Usage

## Installation

**Quick Install:** Run the following command:
```
curl -sSL https://raw.githubusercontent.com/PrimeIntellect-ai/genesys/main/scripts/install/install.sh | bash
```

**Manual Install:** Alternatively, install [uv](https://docs.astral.sh/uv/) and set up the repository yourself:
```
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
git clone git@github.com:PrimeIntellect-ai/genesys.git
cd genesys
uv sync --extra sglang
```

## Data Generation

To check that your installation has succeeded, you can run the following command to generate data with a small model:

```
# with a config file; see /configs for all configurations
uv run python src/genesys/generate.py @ configs/debug.toml

# or with --flags
uv run python src/genesys/generate.py \
    --name_model "Qwen/Qwen2.5-Coder-0.5B" \
    --num_gpus 1 \
    --sample_per_file 8 \
    --temperature 0.6 \
    --max_tokens 16384 \
    --data.max_samples 16 \
    --data.batch_size 8 \
    --data.path "PrimeIntellect/verifiable-math" # follow the schema of "PrimeIntellect/verifiable-math", "PrimeIntellect/verifiable-coding", etc.
```

Your file with responses will be saved to `/output`.

**Pushing Data to GCP:** To push the generated data to a GCP bucket, download a service account key file with permission to push to your bucket, encode it to base64, and set the encoded file as `GCP_CREDENTIALS_BASE64`. Then specify your bucket via the `--gcp_bucket` flag:

```
export GCP_CREDENTIALS_BASE64=$(base64 -w 0 /path/to/your/service-account-key.json)
uv run python src/genesys/generate.py @ configs/debug.toml --gcp_bucket checkpoints_pi/test_data
```

**Run with Docker:** You can also build the docker image and generate data with it:

```
sudo docker build -t primeintellect/genesys:latest .
sudo docker run --gpus all -it primeintellect/genesys:latest uv run python src/genesys/generate.py @ configs/debug.toml
```

To automatically detect the right model to run, you can use the entrypoint script:
```sh
./script/entrypoint.sh
```
## Verification

To verify model responses, you can use the `src/genesys/verify.py` script along with the output file from `src/genesys/generate.py` located in `output`.

```
uv run python src/genesys/verify.py --file <path-to-out-file> # the output file is usually at /output/out_<some_uuid>.jsonl
```

The verification loop runs asynchronously to parallelize verification and speed up processing.
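
The snippet below is a rough sketch of that pattern (bounded concurrency plus a per-response timeout), assuming a synchronous `verify` call; it is illustrative only, not the repository's actual implementation:

```python
import asyncio

async def verify_all(responses, verifier, max_parallel=30, timeout=10):
    # limit how many verification tasks run concurrently
    sem = asyncio.Semaphore(max_parallel)

    async def verify_one(response):
        async with sem:
            try:
                # run the blocking verify call in a worker thread with a timeout
                return await asyncio.wait_for(
                    asyncio.to_thread(verifier.verify, response), timeout
                )
            except asyncio.TimeoutError:
                return dict(score=0.0, verification_result_info={"error": "timeout"})

    return await asyncio.gather(*(verify_one(r) for r in responses))

# usage: results = asyncio.run(verify_all(responses, SomeVerifier()))
```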

## Adding your own Tasks & Verifiers

Genesys is built to be easily extensible with your own tasks and verifiers. You can generate responses for your own data by using a Hugging Face dataset that follows our schema, and you can add your own verifier with minimal code.

### Using your own Data

To generate data for your own tasks using `src/genesys/generate.py`, you should pass a Hugging Face dataset with the same schema as `PrimeIntellect/verifiable-math` and the other SYNTHETIC-1 datasets. The schema looks like this:

```python
from typing import Dict, Optional

from pydantic import BaseModel

class Task(BaseModel):
    problem_id: str
    source: str  # source of the dataset
    task_type: str  # used to map the response to a verifier
    in_source_id: Optional[str]
    prompt: str
    gold_standard_solution: Optional[str]
    verification_info: Dict  # empty dict if no data is needed
    metadata: Dict  # empty dict if no data is needed
```
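
As an illustration, here is one way such a dataset could be assembled and pushed to the Hub with the `datasets` library; the repository name, prompt, and `verification_info` values are placeholders, and the `task_type` must match a key in the verifier registry (the custom length verifier used here is implemented later in this README):

```python
# Illustrative only: build a dataset matching the Task schema above and
# push it to the Hugging Face Hub.
from datasets import Dataset

rows = [
    {
        "problem_id": "my_tasks_0",
        "source": "my_tasks",
        "task_type": "length_adherance",  # custom verifier, implemented below
        "in_source_id": None,
        "prompt": "Explain the Pythagorean theorem in detail.",
        "gold_standard_solution": None,
        "verification_info": {
            "length_threshold_small": 500,
            "length_threshold_large": 2000,
        },
        "metadata": {},
    }
]

Dataset.from_list(rows).push_to_hub("your-username/my-verifiable-tasks")
```

You can then point the generation script at it via `--data.path "your-username/my-verifiable-tasks"`.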

The output from the generation script is a `.jsonl` file with each line containing a `Response` object:

```python
class Response(BaseModel):
    problem_id: str
    source: str
    task_type: str
    in_source_id: Optional[str]
    prompt: str
    gold_standard_solution: Optional[str]
    verification_info: Dict
    metadata: Dict
    llm_response: str  # the llm response string
    response_id: str
    model_name: str
    generation_config: Dict  # sampling parameters
```
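
If you want to inspect these files yourself, each line parses as one JSON object; a minimal sketch (the filename is a placeholder for the actual output file):

```python
# Illustrative only: load the generated responses for inspection.
import json

with open("output/out_<some_uuid>.jsonl") as f:
    responses = [json.loads(line) for line in f]

print(responses[0]["task_type"], responses[0]["model_name"])
```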

### Adding a Verifier

To implement a verifier, you have to 1) add a verifier class implementing a `verify` function that receives a `Response` object, and 2) add the verifier to the verifier registry.

You can implement your own verifier in `src/genesys/verifiers`:
```python
from genesys.schemas import Response
from genesys.verifiers.base_verifier import BaseVerifier

class LengthVerifier(BaseVerifier):
    max_parallel = 30  # maximum number of verification tasks run in parallel - relevant when doing LLM calls with rate limits
    timeout = 10  # timeout in seconds - when running LLM-generated code, we want to avoid getting stuck in an infinite loop

    # optional: an __init__ function for setup (e.g. an LLM API client)

    def verify(self, result: Response):
        """
        Required: this example verify function checks the length of the llm response
        and rewards responses over a threshold specified in the dataset.

        The output should be a dict with a score from 0-1 and a
        'verification_result_info' dict containing metadata.
        """
        response = result["llm_response"]
        threshold_small = result["verification_info"]["length_threshold_small"]
        threshold_large = result["verification_info"]["length_threshold_large"]

        if len(response) > threshold_large:
            score = 1.0
        elif len(response) > threshold_small:
            score = 0.5
        else:
            score = 0.0

        return dict(score=score, verification_result_info={})  # add metadata to verification_result_info if needed

    # optional: a shutdown() function for termination, called after all responses are
    # verified. For instance, the code verifier uses it to shut down docker containers.
```
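
As a quick sanity check of the example (assuming `BaseVerifier` subclasses need no constructor arguments and accept the dict-style access used above; the thresholds are made up):

```python
# Illustrative only: exercise the example verifier on a dummy response.
verifier = LengthVerifier()
dummy = {
    "llm_response": "x" * 800,  # above the small threshold, below the large one
    "verification_info": {"length_threshold_small": 500, "length_threshold_large": 2000},
}
print(verifier.verify(dummy))  # {'score': 0.5, 'verification_result_info': {}}
```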

Now, add your verifier to the verifier registry in `src/genesys/verifiers/registry.py`:
```python
from genesys.verifiers.code_test_verifier import CodeVerifier
from genesys.verifiers.math_verifier import MathVerifier
from genesys.verifiers.llm_judge_verifier import LlmJudgeVerifier
from genesys.verifiers.code_output_prediction_verifier import CodeUnderstandingVerifier
from genesys.verifiers.length_verifier import LengthVerifier # your verifier

VERIFIER_REGISTRY = {
    "verifiable_code": CodeVerifier,
    "verifiable_math": MathVerifier,
    "llm_judgeable_groundtruth_similarity": LlmJudgeVerifier,
    "verifiable_code_understanding": CodeUnderstandingVerifier,
    "length_adherance": LengthVerifier,
}
```

Every task from your dataset with `"task_type": "length_adherance"` will now be verified with the implemented length verifier when running `src/genesys/verify.py`.
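
For illustration, the registry lookup conceptually amounts to the sketch below (assuming verifiers can be constructed without arguments); the actual script additionally runs verification asynchronously, as described above:

```python
# Illustrative only: dispatch a response to its verifier via the registry.
from genesys.verifiers.registry import VERIFIER_REGISTRY

def dispatch(response: dict) -> dict:
    verifier_cls = VERIFIER_REGISTRY[response["task_type"]]
    return verifier_cls().verify(response)
```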

## Development

To set up the dev tooling, install the pre-commit hooks:

```
uv run pre-commit install
```

To run the tests:

```
uv run pytest
```