🔍 Overview | 🤖 Models | 📚 Dataset | 🛠️ Get Started | 🧑💻 Experiments | 📝 Citation | 🙏 Acknowledgements
- 🚀 [Oct. 30] We have publicly released checkpoints, datasets, and code for SemCoder🔥🔥!!
- 🎉 [Sep. 25] SemCoder has been accepted to NeurIPS'24!!
- SemCoder not only generates code, but also comprehensively understands code semantics.
- We propose to learn varied semantics: from high-level functionalities to low-level details, from static properties to dynamic program states.
- SemCoder-S-6.7B outperforms GPT-3.5-turbo on code generation (HumanEval: 79.3 vs. 76.8; LiveCodeBench-Lite: 27.5 vs. 23.9) and execution reasoning (CRUXEval-I: 63.6 vs. 50.3; CRUXEval-O: 63.9 vs. 59.0; LiveCodeBench-CodeExecution: 61.2 vs. 43.6).
- Motivated by rubber-duck debugging, we propose monologue reasoning: the model learns to explain dynamic execution by reasoning about important values, properties, and constraints.
- Monologues are bi-directional: forward and backward.
- Monologue reasoning is notably more effective than both scratchpad and chain-of-thought reasoning about dynamic execution.
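To make the two monologue directions concrete, here is a toy illustration. It is not taken from the paper or from the PyX data, and the function `swap_and_reverse` and its inputs are hypothetical. Forward reasoning predicts the output for a given input (the CRUXEval-O setting), while backward reasoning finds an input that produces a given output (the CRUXEval-I setting).

```python
def swap_and_reverse(text: str) -> str:
    """Hypothetical toy function: replace every 'a' with 'b', then reverse."""
    replaced = text.replace("a", "b")
    return replaced[::-1]


# Forward direction: the input is known, reason step by step to the output.
#   "banana" -> replace 'a' with 'b' -> "bbnbnb" -> reverse -> "bnbnbb"
assert swap_and_reverse("banana") == "bnbnbb"

# Backward direction: the output is known, reason about constraints on the input.
#   Output "cb" is the reverse of "bc", so the string after replacement was "bc".
#   The 'b' may have come from an original 'b' or an original 'a',
#   so both "bc" and "ac" are valid inputs.
assert swap_and_reverse("ac") == "cb"
assert swap_and_reverse("bc") == "cb"
```

The backward case shows why the input is usually not unique: the reasoning narrows down constraints rather than inverting the function exactly.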
Model | HF Checkpoints | Size | License |
---|---|---|---|
SemCoder | 🤗 HF Link | 6.7B | DeepSeek |
SemCoder-S | 🤗 HF Link | 6.7B | DeepSeek |
- 📚 PyX: A fully executable Python dataset with comprehensive code semantics.
- 👨🏼🔧 PyX-R: A Python dataset that teaches LLMs to perform rubber-duck debugging and self-repair.
git clone https://github.com/ARiSE-Lab/SemCoder.git;
cd SemCoder;
conda env create --name semcoder --file=environment.yml;
conda activate semcoder;
export PYTHONPATH=$(pwd);
from transformers import pipeline
import torch
generator = pipeline(
model="semcoder/semcoder_s_1030",
task="text-generation",
torch_dtype=torch.float16,
device_map="auto",
)
# Generate Code
CODEGEN_REQUEST = """You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable <Code> according to <NL_Description>
<NL_Description>
{desc}
<Code>
"""
desc = """You are tasked with implementing a Python class that simulates a simple version of a "To-Do List" application. The class should have the following functionalities:
1. Add a new task to the to-do list.
2. Mark a task as completed.
3. Display all tasks in the to-do list.
4. Display only the incomplete tasks in the to-do list.
"""
prompt = CODEGEN_REQUEST.format(desc=desc)
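# Greedy decoding; do_sample=False is the explicit way to request it.
result = generator(prompt, max_length=2048, num_return_sequences=1, do_sample=False)
# The generated text includes the prompt; this assumes the model answers with a ```python fenced block.
code = result[0]["generated_text"].split("```python")[1].split("```")[0]
print(code)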
# Understand Code with Monologues
FWD_MNL_REQUEST = """Simulate the Execution: You are given a Python function and an assertion containing a function input. Complete the assertion containing the execution output corresponding to the given input in [ANSWER] and [/ANSWER] tags.
{code}
"""
tests = """
todo_list = ToDoList()
todo_list.add_task("Buy groceries")
todo_list.add_task("Complete assignment")
todo_list.mark_completed("Buy groceries")
assert todo_list.tasks == ???
"""
code += tests
prompt = FWD_MNL_REQUEST.format(code=code)
# Greedy decoding again, so the simulated execution is deterministic.
result = generator(prompt, max_length=2048, num_return_sequences=1, do_sample=False)
print(result[0]["generated_text"])
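The snippet above pulls the code out of the generation with chained `split` calls, which raises an `IndexError` if the model does not emit a fenced Python block. Below is a minimal sketch of a more defensive post-processing step. The helper names `strip_prompt`, `extract_python_block`, and `extract_answer` are hypothetical and not part of the SemCoder repository, and the sketch assumes the pipeline returns the prompt followed by the completion (the default behavior of the `text-generation` pipeline).

```python
import re
from typing import Optional

FENCE = "`" * 3  # built programmatically so this example nests safely in a README
_CODE_RE = re.compile(FENCE + r"python\s*\n(.*?)" + FENCE, re.DOTALL)
_ANSWER_RE = re.compile(r"\[ANSWER\](.*?)\[/ANSWER\]", re.DOTALL)


def strip_prompt(generated: str, prompt: str) -> str:
    """Drop the echoed prompt so that tags mentioned in the prompt are not matched."""
    return generated[len(prompt):] if generated.startswith(prompt) else generated


def extract_python_block(text: str) -> Optional[str]:
    """Return the body of the first fenced Python block, or None if there is none."""
    match = _CODE_RE.search(text)
    return match.group(1) if match else None


def extract_answer(text: str) -> Optional[str]:
    """Return the content of the last [ANSWER]...[/ANSWER] pair, or None."""
    matches = _ANSWER_RE.findall(text)
    return matches[-1].strip() if matches else None


# Usage with a hand-written stand-in for a model generation (not real model output).
sample_prompt = "Complete the assertion in [ANSWER] and [/ANSWER] tags.\n"
sample_output = sample_prompt + "Step 1: ...\n[ANSWER]\nassert f(1) == 2\n[/ANSWER]\n"
completion = strip_prompt(sample_output, sample_prompt)
print(extract_answer(completion))
```

Stripping the prompt first matters here because the forward-monologue instruction itself mentions the `[ANSWER]` and `[/ANSWER]` tags, so matching against the full generated text could pick up the instruction instead of the model's answer.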
We follow the Magicoder script to launch a Gradio server for the local demo. You can launch your local Gradio demo as follows:
CUDA_VISIBLE_DEVICES=0 python semcoder_demo.py \
--base_model "semcoder/semcoder_s_1030" \
--device "cuda:0" \
--port 8080
🧑💻 To reproduce the evaluation results reported in the paper, please see experiments.
@article{ding2024semcoder,
title={SemCoder: Training Code Language Models with Comprehensive Semantics},
author={Yangruibo Ding and Jinjun Peng and Marcus J. Min and Gail Kaiser and Junfeng Yang and Baishakhi Ray},
journal={arXiv preprint arXiv:2406.01006},
year={2024}
}
My favorite quote of 2024 from the GREAT Andrej Karpathy (No Priors Ep. 80):
The Internet data is not the data you want for your Transformers – a nearest neighbor actually gets you really far, surprisingly. What you want is the inner-thought monologue of your brain. If we had billions of that, AGI is here, roughly speaking.
We thank the following amazing projects that inspired our design choices:
- Magicoder: Synthetic Code Generation.
- EvalPlus: Test-case Generation & Augmentation.
- DeepSeek-Coder: Base model for SemCoder.
The template of this README is also borrowed from Magicoder.