Add function calling and structured outputs support #46

Open · wants to merge 1 commit into base: main
62 changes: 59 additions & 3 deletions README.md
@@ -206,17 +206,73 @@ python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
3. For mathematical problems, it is advisable to include a directive in your prompt such as: "put your final answer within \boxed{}".
4. When evaluating model performance, it is recommended to conduct multiple tests and average the results.

## 7. License
## 7. Function Calling and Structured Outputs

### Current Status
- As of now, **DeepSeek R1 does not natively support function calling or structured outputs**.
- The model is primarily optimized for **reasoning-heavy tasks** (e.g., math, code, and STEM) and follows a conversational format.

### Future Plans
- We recognize the importance of **function calling** and **structured outputs** for many use cases, such as API integrations, automation, and data extraction.
- We are actively exploring ways to add support for these features in future updates. This includes:
- Extending the model’s capabilities to handle structured data formats (e.g., JSON, XML).
- Adding support for function calling to enable seamless integration with external tools and APIs.

### Timeline
- While we don’t have a specific release date yet, we aim to roll out these features in the **next major update**.
- We will keep the community updated on our progress through GitHub announcements and release notes.

### Workarounds for Now
If you need structured outputs or function-like behavior in the meantime, here are some workarounds:
1. **Post-Processing Outputs:**
- Use a script to parse the model’s responses into structured formats (e.g., JSON).
- Example:
   ```python
   import json

   response = model.generate("Extract the following data as JSON: ...")
   try:
       structured_data = json.loads(response)
   except json.JSONDecodeError:
       structured_data = None  # model returned malformed JSON or extra text
   ```
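In practice the model often wraps the JSON in surrounding prose, so a more defensive variant helps. A minimal sketch (the `extract_json` helper is hypothetical, not part of this PR):

```python
import json
import re

def extract_json(text):
    """Best-effort: pull the first {...} block out of a model response
    that may contain extra prose around the JSON."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

print(extract_json('Sure! Here is the data: {"name": "John", "age": 30}'))
# {'name': 'John', 'age': 30}
```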

2. **Prompt Engineering:**
- Design prompts to guide the model to produce outputs in a specific format.
- Example:
```
Extract the following information and format it as JSON:
- Name: ...
- Age: ...
- Location: ...
```

3. **Custom Wrapper:**
- Build a custom wrapper around the model to simulate function calling behavior.
- Example:
```python
def call_function(model, function_name, args):
prompt = f"Call function {function_name} with args {args} and return the result."
return model.generate(prompt)
```
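Note that function calling as implemented elsewhere usually has the model emit only the *name and arguments* of a function (chosen from a supplied list), while the host code performs the actual call. A minimal sketch of that dispatch pattern, assuming a hypothetical `TOOLS` registry and a model that emits well-formed JSON:

```python
import json

# Hypothetical tool registry: the model picks a function and arguments;
# the host code looks the function up and performs the call itself.
TOOLS = {
    "add": lambda a, b: a + b,
}

def dispatch(model_output):
    # model_output is expected to look like:
    # '{"name": "add", "args": {"a": 5, "b": 3}}'
    call = json.loads(model_output)
    func = TOOLS[call["name"]]
    return func(**call["args"])

print(dispatch('{"name": "add", "args": {"a": 5, "b": 3}}'))  # 8
```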

### Community Feedback
We appreciate the enthusiasm from the community. Your feedback is invaluable in shaping the future of DeepSeek R1. If you have specific use cases or feature requests related to function calling and structured outputs, please share them in this thread.

### Next Steps
- We will prioritize this feature based on community demand and provide updates as development progresses.
- Stay tuned for announcements and feel free to contribute ideas or suggestions!

Thank you for your patience and support as we work to make DeepSeek R1 even better! Let us know if you have further questions or need additional assistance.

## 8. License
This code repository and the model weights are licensed under the [MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE).
DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:
- DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Qwen-32B are derived from [Qwen-2.5 series](https://github.com/QwenLM/Qwen2.5), which are originally licensed under [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-1.5B/blob/main/LICENSE), and now finetuned with 800k samples curated with DeepSeek-R1.
- DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under [llama3.1 license](https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/LICENSE).
- DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under [llama3.3 license](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE).

## 8. Citation
## 9. Citation
```

```

## 9. Contact
## 10. Contact
If you have any questions, please raise an issue or contact us at [[email protected]]([email protected]).
42 changes: 42 additions & 0 deletions src/model.py
@@ -0,0 +1,42 @@
import json
import xml.etree.ElementTree as ET

class Model:
    def __init__(self):
        # Initialize the model
        pass

    def generate(self, prompt):
        # Generate a response based on the prompt
        pass

    def parse_json(self, response):
        try:
            return json.loads(response)
        except json.JSONDecodeError:
            return None

    def parse_xml(self, response):
        try:
            return ET.fromstring(response)
        except ET.ParseError:
            return None

    def call_function(self, function_name, args):
        prompt = f"Call function {function_name} with args {args} and return the result."
        return self.generate(prompt)

> Is this really function calling? Function calling usually just *returns* the function name and parameters to use, chosen from the list of supplied functions; it doesn't call any function itself, right?
>
> Has anyone else taken a look at this?


    def integrate_with_api(self, api_endpoint, data):
        # Example method to integrate with an external API
        import requests
        response = requests.post(api_endpoint, json=data)
        return response.json()

    def generate_structured_output(self, prompt, format="json"):
        response = self.generate(prompt)
        if format == "json":
            return self.parse_json(response)
        elif format == "xml":
            return self.parse_xml(response)
        else:
            return response
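Since `Model.generate` is still a stub returning `None`, exercising `generate_structured_output` requires stubbing a response. A standalone sketch (the `StubModel` class is hypothetical and duplicates the parsing logic rather than importing this PR's package):

```python
import json

class StubModel:
    """Hypothetical stand-in whose generate() returns a canned JSON reply."""
    def generate(self, prompt):
        return '{"name": "John", "age": 30}'

    def generate_structured_output(self, prompt, format="json"):
        response = self.generate(prompt)
        if format == "json":
            try:
                return json.loads(response)
            except json.JSONDecodeError:
                return None
        return response

result = StubModel().generate_structured_output("Extract the data as JSON")
print(result["age"])  # 30
```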
28 changes: 28 additions & 0 deletions src/utils.py
@@ -0,0 +1,28 @@
import json
import xml.etree.ElementTree as ET

def parse_json(response):
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        return None

def parse_xml(response):
    try:
        return ET.fromstring(response)
    except ET.ParseError:
        return None

def generate_json(data):
    return json.dumps(data)

def generate_xml(data):
    root = ET.Element("root")
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding='unicode')

def call_function(model, function_name, args):
    prompt = f"Call function {function_name} with args {args} and return the result."
    return model.generate(prompt)
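For reference, the XML helpers in this file round-trip a flat dict. A quick standalone check (duplicating `generate_xml` so it runs without this PR's package layout):

```python
import xml.etree.ElementTree as ET

# Standalone copy of the generate_xml helper above, for a round-trip check.
def generate_xml(data):
    root = ET.Element("root")
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")

xml_str = generate_xml({"name": "John", "age": 30})
print(xml_str)  # <root><name>John</name><age>30</age></root>
print(ET.fromstring(xml_str).find("age").text)  # 30
```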
38 changes: 38 additions & 0 deletions tests/test_model.py
@@ -0,0 +1,38 @@
import unittest
import xml.etree.ElementTree as ET

from src.model import Model

class TestModel(unittest.TestCase):

    def setUp(self):
        self.model = Model()

    def test_generate_structured_output_json(self):
        prompt = "Extract the following data as JSON: {\"name\": \"John\", \"age\": 30}"
        result = self.model.generate_structured_output(prompt, format="json")
        self.assertIsInstance(result, dict)
        self.assertEqual(result["name"], "John")
        self.assertEqual(result["age"], 30)

    def test_generate_structured_output_xml(self):
        prompt = "Extract the following data as XML: <person><name>John</name><age>30</age></person>"
        result = self.model.generate_structured_output(prompt, format="xml")
        self.assertIsInstance(result, ET.Element)
        self.assertEqual(result.find("name").text, "John")
        self.assertEqual(result.find("age").text, "30")

    def test_call_function(self):
        function_name = "add"
        args = {"a": 5, "b": 3}
        result = self.model.call_function(function_name, args)
        self.assertIsInstance(result, str)  # Assuming the result is a string
        self.assertIn("result", result)  # Assuming the result contains the word "result"

    def test_integrate_with_api(self):
        api_endpoint = "https://api.example.com/endpoint"
        data = {"key": "value"}
        result = self.model.integrate_with_api(api_endpoint, data)
        self.assertIsInstance(result, dict)
        self.assertIn("response", result)  # Assuming the API response contains the key "response"

if __name__ == "__main__":
    unittest.main()
46 changes: 46 additions & 0 deletions tests/test_utils.py
@@ -0,0 +1,46 @@
import unittest
import xml.etree.ElementTree as ET

from src.utils import parse_json, parse_xml, generate_json, generate_xml, call_function
from src.model import Model

class TestUtils(unittest.TestCase):

    def setUp(self):
        self.model = Model()

    def test_parse_json(self):
        response = '{"name": "John", "age": 30}'
        result = parse_json(response)
        self.assertIsInstance(result, dict)
        self.assertEqual(result["name"], "John")
        self.assertEqual(result["age"], 30)

    def test_parse_xml(self):
        response = "<person><name>John</name><age>30</age></person>"
        result = parse_xml(response)
        self.assertIsInstance(result, ET.Element)
        self.assertEqual(result.find("name").text, "John")
        self.assertEqual(result.find("age").text, "30")

    def test_generate_json(self):
        data = {"name": "John", "age": 30}
        result = generate_json(data)
        self.assertIsInstance(result, str)
        self.assertIn('"name": "John"', result)
        self.assertIn('"age": 30', result)

    def test_generate_xml(self):
        data = {"name": "John", "age": 30}
        result = generate_xml(data)
        self.assertIsInstance(result, str)
        self.assertIn("<name>John</name>", result)
        self.assertIn("<age>30</age>", result)

    def test_call_function(self):
        function_name = "add"
        args = {"a": 5, "b": 3}
        result = call_function(self.model, function_name, args)
        self.assertIsInstance(result, str)  # Assuming the result is a string
        self.assertIn("result", result)  # Assuming the result contains the word "result"

if __name__ == "__main__":
    unittest.main()