
Gemini Fine Tuning #104

Closed

scosman opened this issue Jan 13, 2025 · 3 comments

Comments

@scosman
Collaborator

scosman commented Jan 13, 2025

Related to convo here: #29 (comment)

Looking at options: Google has an incredibly weird mix of APIs here. At least 2 deprecated APIs and 2 active APIs, and some of the active APIs have 2 names. Everything is called Vertex, except when it isn't.

They have a lovely AI assistant which will happily hallucinate services that don't exist.

Of the active ones:

Gemini API aka Generative AI API

  • https://ai.google.dev/gemini-api/docs/model-tuning/tutorial?lang=python
  • Kind of a toy API:
    • Max 4MB of training data (and that limit is only documented in the tutorial, not the API docs).
    • Limited to "input" and "output" fields. Not a chat-format model; you need to embed the prompt into the input.
  • Super nice, clean API: 2 endpoints, you don't even need a library - a standard request is fine (or a library, if it isn't huge).
  • Supports serverless serving! This is important.
  • Can consume it with Langchain or OpenAI compatible API
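The "2 endpoints, standard request is fine" point above can be sketched as follows. This is a hedged sketch, not a verified integration: the endpoint path, field names (`text_input`/`output`), and base model name are assumptions based on the linked tutorial, and no network call is actually made here.

```python
import json

# Sketch of the plain-HTTP tuning flow described above. The v1beta
# "tunedModels" endpoint and the flat text_input/output example shape
# are assumptions from the linked tutorial -- verify against the docs.
BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_tuning_payload(examples, base_model="models/gemini-1.5-flash-001-tuning"):
    """Build the JSON body for a hypothetical POST {BASE}/tunedModels.

    `examples` is a list of (input_text, output_text) pairs -- note the
    flat input/output shape, not a chat/messages format, so any system
    prompt must already be embedded in the input text.
    """
    return {
        "display_name": "kiln-tuned-model",  # hypothetical name
        "base_model": base_model,
        "tuning_task": {
            "training_data": {
                "examples": {
                    "examples": [
                        {"text_input": i, "output": o} for i, o in examples
                    ]
                }
            }
        },
    }


payload = build_tuning_payload([("2 + 2", "4"), ("3 + 3", "6")])
print(json.dumps(payload, indent=2))

# The actual calls would be roughly (auth_headers left out on purpose):
#   requests.post(f"{BASE}/tunedModels", json=payload, headers=auth_headers)
# then polling GET on the returned tuning operation until it completes.
```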

Google AI Studio

  • No API, max 500 samples. Cut.

Vertex AI

If this is serverless, it's an amazing API with lots of control that can fine-tune Gemini, Llama, and much more. If it's not, it's expensive and hard to use for rapid prototyping.

@scosman
Collaborator Author

scosman commented Jan 13, 2025

Okay: Gemini models work serverless and are pretty slick. Fireworks is still better for Llama/etc, as it also offers serverless.

@scosman
Collaborator Author

scosman commented Jan 14, 2025

Conclusion: serverless Gemini fine-tuning is possible. However, the auth and setup experience would be awful. GCP really needs to clean this up; it manages to be much worse than AWS (which is quite bad). Putting this on ice indefinitely. Would prefer a provider with more open models over an ugly config for 1 model.

@scosman scosman closed this as not planned Jan 14, 2025
@leonardmq
Contributor

leonardmq commented Jan 14, 2025

EDIT: just noticed you had already implemented what I was suggesting below, in #106 - nice one - thanks 🥇

@scosman - fine-tuning Gemini models seems to require a slightly different dataset format than OpenAI's. If adding Google / Vertex as a full provider does not seem worth the effort, perhaps we could just have a formatter to download the dataset in the right format - then the user can do the fine-tuning on their own using Google's tooling, off-Kiln. Though I'm not sure if this flow aligns well with the downstream features coming up, like eval.

I can look into implementing the formatter if it sounds good to you.

(The main reason I am interested in Gemini is that word on the street says Gemini 2.0 Flash pricing will be in the GPT-4o-mini ballpark, with GPT-4o-level quality. So it might be a good one to fine-tune.)
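The format difference behind the formatter suggestion above can be sketched like this. The OpenAI chat-JSONL shape is standard, but the target `text_input`/`output` field names and the "prepend the system prompt to the input" convention are assumptions drawn from this thread, not a spec.

```python
import json

# Sketch of a formatter converting one OpenAI-style fine-tuning record
# ({"messages": [...]}) into the flat shape the Gemini tuning API wants.
# Handles the simple one-turn case only; multi-turn conversations would
# need a flattening convention of their own.


def openai_to_gemini(record: dict) -> dict:
    """Flatten a {"messages": [...]} record to {"text_input", "output"}."""
    system, user, assistant = "", "", ""
    for msg in record["messages"]:
        if msg["role"] == "system":
            system = msg["content"]
        elif msg["role"] == "user":
            user = msg["content"]
        elif msg["role"] == "assistant":
            assistant = msg["content"]
    # No chat format on the tuning side, so embed the system prompt
    # into the input text (per the "embed prompt into input" note above).
    text_input = f"{system}\n\n{user}" if system else user
    return {"text_input": text_input, "output": assistant}


line = ('{"messages": [{"role": "system", "content": "Be terse."}, '
        '{"role": "user", "content": "2+2?"}, '
        '{"role": "assistant", "content": "4"}]}')
converted = openai_to_gemini(json.loads(line))
print(converted)
# {'text_input': 'Be terse.\n\n2+2?', 'output': '4'}
```

A real formatter would stream a whole JSONL file through this, one record per line, and write the converted records out the same way.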
