Add function calling and structured outputs support #46
Open: jasonkneen wants to merge 1 commit into deepseek-ai:main from jasonkneen:add-function-calling
+213
−3
@@ -206,17 +206,73 @@ python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
3. For mathematical problems, it is advisable to include a directive in your prompt such as: "put your final answer within \boxed{}".
4. When evaluating model performance, it is recommended to conduct multiple tests and average the results.

-## 7. License
+## 7. Function Calling and Structured Outputs

### Current Status
- As of now, **DeepSeek R1 does not natively support function calling or structured outputs**.
- The model is primarily optimized for **reasoning-heavy tasks** (e.g., math, code, and STEM) and follows a conversational format.

### Future Plans
- We recognize the importance of **function calling** and **structured outputs** for many use cases, such as API integrations, automation, and data extraction.
- We are actively exploring ways to add support for these features in future updates. This includes:
  - Extending the model’s capabilities to handle structured data formats (e.g., JSON, XML).
  - Adding support for function calling to enable seamless integration with external tools and APIs.

### Timeline
- While we don’t have a specific release date yet, we aim to roll out these features in the **next major update**.
- We will keep the community updated on our progress through GitHub announcements and release notes.

### Workarounds for Now
If you need structured outputs or function-like behavior in the meantime, here are some workarounds:
1. **Post-Processing Outputs:**
   - Use a script to parse the model’s responses into structured formats (e.g., JSON).
   - Example:
     ```python
     import json

     response = model.generate("Extract the following data as JSON: ...")
     structured_data = json.loads(response)
     ```

2. **Prompt Engineering:**
   - Design prompts to guide the model to produce outputs in a specific format.
   - Example:
     ```
     Extract the following information and format it as JSON:
     - Name: ...
     - Age: ...
     - Location: ...
     ```

3. **Custom Wrapper:**
   - Build a custom wrapper around the model to simulate function calling behavior.
   - Example:
     ```python
     def call_function(model, function_name, args):
         prompt = f"Call function {function_name} with args {args} and return the result."
         return model.generate(prompt)
     ```

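The post-processing workaround above calls `json.loads` directly on the model's reply, which fails whenever the reply contains anything besides JSON. The following is a minimal sketch of a more tolerant parser. It assumes the reply is a plain string that may include a `<think>...</think>` reasoning block and surrounding prose; `extract_json` is a hypothetical helper name, not part of this PR or of any DeepSeek API.

```python
import json
import re

# Assumption: reasoning text, if present, is wrapped in <think>...</think>.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def extract_json(response):
    """Return the first JSON object found in `response`, or None."""
    if not isinstance(response, str):
        return None
    text = THINK_RE.sub("", response)
    decoder = json.JSONDecoder()
    # Try to decode a JSON value starting at each "{" until one succeeds.
    for match in re.finditer(r"\{", text):
        try:
            obj, _end = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    return None


# Illustrative reply, written by hand rather than produced by a model.
sample = (
    "<think>The user wants a name and an age.</think>\n"
    'Here is the data: {"name": "John", "age": 30} Hope that helps.'
)
print(extract_json(sample))  # {'name': 'John', 'age': 30}
```

Returning `None` instead of raising keeps the helper consistent with the `parse_json` function added in this PR; callers that need to distinguish "no JSON" from "empty object" may prefer an exception.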
### Community Feedback
We appreciate the enthusiasm from the community (x2 + 5 and counting!). Your feedback is invaluable in shaping the future of DeepSeek R1. If you have specific use cases or feature requests related to function calling and structured outputs, please share them in this thread.

### Next Steps
- We will prioritize this feature based on community demand and provide updates as development progresses.
- Stay tuned for announcements and feel free to contribute ideas or suggestions!

Thank you for your patience and support as we work to make DeepSeek R1 even better! Let us know if you have further questions or need additional assistance.

## 8. License
This code repository and the model weights are licensed under the [MIT License](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/LICENSE).
DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:
- DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Qwen-32B are derived from [Qwen-2.5 series](https://github.com/QwenLM/Qwen2.5), which are originally licensed under [Apache 2.0 License](https://huggingface.co/Qwen/Qwen2.5-1.5B/blob/main/LICENSE), and now finetuned with 800k samples curated with DeepSeek-R1.
- DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under [llama3.1 license](https://huggingface.co/meta-llama/Llama-3.1-8B/blob/main/LICENSE).
- DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under [llama3.3 license](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE).

-## 8. Citation
+## 9. Citation
```

```

-## 9. Contact
+## 10. Contact
If you have any questions, please raise an issue or contact us at [[email protected]]([email protected]).
@@ -0,0 +1,42 @@
import json
import xml.etree.ElementTree as ET

class Model:
    def __init__(self):
        # Initialize the model (placeholder: no backend is loaded here)
        pass

    def generate(self, prompt):
        # Generate a response based on the prompt (placeholder: returns None until implemented)
        pass

    def parse_json(self, response):
        try:
            return json.loads(response)
        except (json.JSONDecodeError, TypeError):  # TypeError covers a non-string response such as None
            return None

    def parse_xml(self, response):
        try:
            return ET.fromstring(response)
        except (ET.ParseError, TypeError):  # TypeError covers a non-string response such as None
            return None

    def call_function(self, function_name, args):
        prompt = f"Call function {function_name} with args {args} and return the result."
        return self.generate(prompt)

    def integrate_with_api(self, api_endpoint, data):
        # Example function to integrate with an external API
        import requests  # third-party dependency, imported lazily
        response = requests.post(api_endpoint, json=data)
        return response.json()

    def generate_structured_output(self, prompt, format="json"):
        response = self.generate(prompt)
        if format == "json":
            return self.parse_json(response)
        elif format == "xml":
            return self.parse_xml(response)
        else:
            return response
@@ -0,0 +1,28 @@
import json
import xml.etree.ElementTree as ET

def parse_json(response):
    try:
        return json.loads(response)
    except (json.JSONDecodeError, TypeError):  # TypeError covers a non-string response such as None
        return None

def parse_xml(response):
    try:
        return ET.fromstring(response)
    except (ET.ParseError, TypeError):  # TypeError covers a non-string response such as None
        return None

def generate_json(data):
    return json.dumps(data)

def generate_xml(data):
    root = ET.Element("root")
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding='unicode')

def call_function(model, function_name, args):
    prompt = f"Call function {function_name} with args {args} and return the result."
    return model.generate(prompt)
@@ -0,0 +1,38 @@
import unittest
import xml.etree.ElementTree as ET  # required by the ET.Element assertion below
from src.model import Model

class TestModel(unittest.TestCase):
    def setUp(self):
        self.model = Model()

    def test_generate_structured_output_json(self):
        prompt = "Extract the following data as JSON: {\"name\": \"John\", \"age\": 30}"
        result = self.model.generate_structured_output(prompt, format="json")
        self.assertIsInstance(result, dict)
        self.assertEqual(result["name"], "John")
        self.assertEqual(result["age"], 30)

    def test_generate_structured_output_xml(self):
        prompt = "Extract the following data as XML: <person><name>John</name><age>30</age></person>"
        result = self.model.generate_structured_output(prompt, format="xml")
        self.assertIsInstance(result, ET.Element)
        self.assertEqual(result.find("name").text, "John")
        self.assertEqual(result.find("age").text, "30")

    def test_call_function(self):
        function_name = "add"
        args = {"a": 5, "b": 3}
        result = self.model.call_function(function_name, args)
        self.assertIsInstance(result, str)  # Assuming the result is a string
        self.assertIn("result", result)  # Assuming the result contains the word "result"

    def test_integrate_with_api(self):
        api_endpoint = "https://api.example.com/endpoint"
        data = {"key": "value"}
        result = self.model.integrate_with_api(api_endpoint, data)
        self.assertIsInstance(result, dict)
        self.assertIn("response", result)  # Assuming the API response contains the key "response"

if __name__ == "__main__":
    unittest.main()
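The tests above depend on `Model.generate`, which is still a placeholder, so they cannot pass as written. One possible way to make such tests deterministic is to stub `generate` with `unittest.mock`. The sketch below is an illustration only: the `Model` class in it is a cut-down stand-in written for this example so that it runs on its own, and the canned replies are invented.

```python
import json
import unittest
from unittest import mock


class Model:
    """Stand-in for the PR's Model, reduced to the JSON path (assumption)."""

    def generate(self, prompt):
        raise NotImplementedError("no inference backend is wired in")

    def generate_structured_output(self, prompt, format="json"):
        response = self.generate(prompt)
        if format == "json":
            try:
                return json.loads(response)
            except (json.JSONDecodeError, TypeError):
                return None
        return response


class TestStructuredOutput(unittest.TestCase):
    def test_json_is_parsed(self):
        canned = '{"name": "John", "age": 30}'
        # Replace generate() so the test does not need a real model.
        with mock.patch.object(Model, "generate", return_value=canned) as fake:
            result = Model().generate_structured_output("extract ...")
        fake.assert_called_once_with("extract ...")
        self.assertEqual(result, {"name": "John", "age": 30})

    def test_malformed_json_returns_none(self):
        with mock.patch.object(Model, "generate", return_value="not json"):
            self.assertIsNone(Model().generate_structured_output("x"))


# Run the two tests programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestStructuredOutput)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The same approach could cover `integrate_with_api` by patching `requests.post`, which would avoid a live call to the example endpoint.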
@@ -0,0 +1,46 @@
import unittest
import xml.etree.ElementTree as ET  # required by the ET.Element assertion below
from src.utils import parse_json, parse_xml, generate_json, generate_xml, call_function
from src.model import Model

class TestUtils(unittest.TestCase):
    def setUp(self):
        self.model = Model()

    def test_parse_json(self):
        response = '{"name": "John", "age": 30}'
        result = parse_json(response)
        self.assertIsInstance(result, dict)
        self.assertEqual(result["name"], "John")
        self.assertEqual(result["age"], 30)

    def test_parse_xml(self):
        response = "<person><name>John</name><age>30</age></person>"
        result = parse_xml(response)
        self.assertIsInstance(result, ET.Element)
        self.assertEqual(result.find("name").text, "John")
        self.assertEqual(result.find("age").text, "30")

    def test_generate_json(self):
        data = {"name": "John", "age": 30}
        result = generate_json(data)
        self.assertIsInstance(result, str)
        self.assertIn('"name": "John"', result)
        self.assertIn('"age": 30', result)

    def test_generate_xml(self):
        data = {"name": "John", "age": 30}
        result = generate_xml(data)
        self.assertIsInstance(result, str)
        self.assertIn("<name>John</name>", result)
        self.assertIn("<age>30</age>", result)

    def test_call_function(self):
        function_name = "add"
        args = {"a": 5, "b": 3}
        result = call_function(self.model, function_name, args)
        self.assertIsInstance(result, str)  # Assuming the result is a string
        self.assertIn("result", result)  # Assuming the result contains the word "result"

if __name__ == "__main__":
    unittest.main()
Is this really function calling? Function calling usually just "returns" the function name and the parameters to use, chosen from the list of supplied functions. It doesn't call any function itself, right?
Has anyone else taken a look at this?
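To illustrate the distinction raised in the review comment above: in the usual function-calling pattern the model only names a function and its arguments, and the application validates and executes the call. The sketch below shows that split under stated assumptions. The `TOOLS` registry, `build_prompt`, `dispatch`, and the JSON shape with `name` and `arguments` keys are all hypothetical choices for this example, and the model reply is canned rather than generated.

```python
import json


def add(a, b):
    """Example tool owned by the application, not by the model."""
    return a + b


# Hypothetical registry of functions the model is allowed to request.
TOOLS = {"add": add}


def build_prompt(question):
    """Describe the available tools and ask for a JSON tool call."""
    names = ", ".join(sorted(TOOLS))
    return (
        f"You may use one of these functions: {names}.\n"
        'Reply only with JSON like {"name": "<function>", "arguments": {...}}.\n'
        f"Question: {question}"
    )


def dispatch(model_output):
    """Parse the model's requested call, validate it, and run it locally."""
    try:
        call = json.loads(model_output)
    except (json.JSONDecodeError, TypeError) as exc:
        raise ValueError("model did not return valid JSON") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        raise ValueError(f"unknown function: {name!r}")
    arguments = call.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("arguments must be a JSON object")
    return TOOLS[name](**arguments)


# Canned reply standing in for what a model might return for "What is 5 + 3?".
model_output = '{"name": "add", "arguments": {"a": 5, "b": 3}}'
print(dispatch(model_output))  # 8
```

By contrast, the `call_function` helper in this PR asks the model to "return the result", so no registered function is ever executed.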