Details about the Yi Chat model? #1
Hello from the Yi team ;)
Based on the evaluation results, it looks like you're using the default Ollama 4-bit quantized version (https://ollama.ai/library/yi:34b-chat), am I right?
Have you ever tried the version before quantization?
It seems we share some common interests in both code LLMs and Julia. Maybe we can collaborate somehow in the future.
Comments
Oh, fun! Re yi34b, you're spot on: it's only the q4_0 quantization. I picked many of the Ollama defaults because I assumed that's what most beginner users would pick. I'm not sure why they chose q4_0; in most tests, the gap to Q4_K_M is sizeable. I can't run FP16 locally, so evaluating the bigger variants is waiting on me setting up a cloud GPU pipeline.
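For reference, re-running a prompt against a different quantization is cheap to test via Ollama's local REST API. A minimal Julia sketch; note that the `yi:34b-chat-q4_K_M` tag is my assumption, so check the Ollama library page for the tags that are actually published:

```julia
using HTTP, JSON3

# Ask a locally running Ollama server (default port 11434) for a completion.
function ollama_generate(model::AbstractString, prompt::AbstractString)
    payload = JSON3.write((; model, prompt, stream = false))
    resp = HTTP.post("http://localhost:11434/api/generate";
                     headers = ["Content-Type" => "application/json"],
                     body = payload)
    return JSON3.read(resp.body).response
end

prompt = "Write a Julia function that returns the sum of squares of a vector."
# Default q4_0 tag vs. a hypothetical q4_K_M tag (verify it exists first).
for tag in ("yi:34b-chat", "yi:34b-chat-q4_K_M")
    println("=== ", tag, " ===")
    println(ollama_generate(tag, prompt))
end
```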
The three things I would call out, though, are:
1. The biggest bottleneck for OSS models' Julia performance that I see is getting the basics right.
2. That should be easily solvable with a reflexion/fixer loop, but when I tried to build a naive agent loop, it didn't work well (a rough sketch of the idea is below).
3. I have a LATS-like agent drafted but haven't had the time to finish it.
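The naive loop was essentially this shape (a minimal sketch; `llm_generate` is a hypothetical stand-in for whatever model call you use, e.g., the Ollama helper above):

```julia
# Naive reflexion/fixer loop: ask the model for code, try to run it,
# and feed any error message back as the next prompt.
function fixer_loop(llm_generate, task::AbstractString; max_attempts::Integer = 3)
    prompt = task
    for _ in 1:max_attempts
        code = llm_generate(prompt)
        try
            include_string(Main, code)  # run the generated code
            return code                 # success: nothing threw
        catch err
            # Reflexion step: show the model its own code plus the error.
            prompt = """
                Your previous solution failed.

                Task: $task

                Code:
                $code

                Error: $(sprint(showerror, err))

                Fix the code and return only valid Julia code.
                """
        end
    end
    return nothing  # gave up after max_attempts
end
```

(Obvious caveat: evaluating generated code like this is unsafe outside a sandbox.)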
That reminds me, I have 4 blog posts drafted that I never published because they are not very exciting/complete. If you'd be interested in one of them, I could prioritize it. In terms of collaboration, there could be a few things.
You can find me on Julia Slack :-)
Thanks for your detailed reply!
That makes sense. We received some feedback from developers that the Yi-34B-Chat model is not very good at math and code. I guess things are even worse with the quantized version.
Indeed. Based on my experience, it might be even easier to fix with an extra finetuning step.
Personally I'm pretty interested in these two topics. 😄
That would be much appreciated!
Yes, we can provide the computing resources to support such work. I'll discuss the details with you on Slack later.
Actually, I'm working on porting the HumanEval dataset with the EvalPlus test cases into Julia (https://github.com/oolong-dev/OolongEval.jl/tree/add_human_eval, still in progress); a sketch of what a ported task could look like is below. Hopefully it will provide a different measurement for researchers to evaluate LLMs' performance on underrepresented programming languages.
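For illustration only (a hypothetical shape, not OolongEval.jl's actual format), each ported task would bundle a prompt, a canonical solution, and EvalPlus-style extra tests:

```julia
using Test

# Hypothetical structure of one HumanEval task ported to Julia:
# the prompt shown to the model, a canonical solution, and the checks.
task_prompt = """
# Return the sum of squares of the numbers in `xs`.
function sum_of_squares(xs)
"""

canonical_solution = """
function sum_of_squares(xs)
    return sum(x -> x^2, xs)
end
"""

# Evaluate the (generated or canonical) completion, then run the tests.
include_string(Main, canonical_solution)
@testset "sum_of_squares" begin
    @test sum_of_squares([1, 2, 3]) == 14   # base test
    @test sum_of_squares([-2, 2]) == 8      # EvalPlus-style extra case
    @test sum_of_squares([5]) == 25
end
```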
Same here 👋
Exciting! When it's ready, let's make sure we put it on the awesome list: https://github.com/svilupp/awesome-generative-ai-meets-julia-language