
Gemini Fine Tuning #104

Closed

scosman opened this issue Jan 13, 2025 · 3 comments

Comments

@scosman
Collaborator

scosman commented Jan 13, 2025

Related to convo here: #29 (comment)

Looking at options: Google has an incredibly weird mix of APIs here. At least 2 deprecated APIs and 2 active APIs, and some of the active APIs have 2 names. Everything is called Vertex, except when it isn't.

They have a lovely AI assistant which will happily hallucinate services that don't exist.

Of the active ones:

Gemini API aka Generative AI API

  • https://ai.google.dev/gemini-api/docs/model-tuning/tutorial?lang=python
  • Kind of a toy API:
    • Max 4MB of training data (and that limit is only documented in the tutorial, not the API docs).
    • Limited to "input" and "output" fields. Not a chat-format model; you need to embed the prompt into the input.
  • Super nice, clean API: 2 endpoints, you don't even need a library - a standard request is fine (or a library, if it isn't huge).
  • Supports serverless serving! This is important.
  • Can consume it with Langchain or OpenAI compatible API
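The "2 endpoints, standard request is fine" point above can be sketched as follows. This is a hedged sketch, not a verified integration: the endpoint path, field names (`text_input`/`output`), and base model name are assumptions based on the linked tutorial, and no network call is actually made here.

```python
import json

# Sketch of the plain-HTTP tuning flow described above. The v1beta
# "tunedModels" endpoint and the flat text_input/output example shape
# are assumptions from the linked tutorial -- verify against the docs.
BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_tuning_payload(examples, base_model="models/gemini-1.5-flash-001-tuning"):
    """Build the JSON body for a hypothetical POST {BASE}/tunedModels.

    `examples` is a list of (input_text, output_text) pairs -- note the
    flat input/output shape, not a chat/messages format, so any system
    prompt must already be embedded in the input text.
    """
    return {
        "display_name": "kiln-tuned-model",  # hypothetical name
        "base_model": base_model,
        "tuning_task": {
            "training_data": {
                "examples": {
                    "examples": [
                        {"text_input": i, "output": o} for i, o in examples
                    ]
                }
            }
        },
    }


payload = build_tuning_payload([("2 + 2", "4"), ("3 + 3", "6")])
print(json.dumps(payload, indent=2))

# The actual calls would be roughly (auth_headers left out on purpose):
#   requests.post(f"{BASE}/tunedModels", json=payload, headers=auth_headers)
# then polling GET on the returned tuning operation until it completes.
```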

Google AI Studio

  • No API, max 500 samples. Cut.

Vertex AI

If this is serverless, it's an amazing API with lots of control that can fine-tune Gemini, Llama, and much more. If it's not, it's expensive and hard to use for rapid prototyping.

@scosman
Collaborator Author

scosman commented Jan 13, 2025

Okay: Gemini models work serverless and are pretty slick. Fireworks is still better for Llama/etc, as it also offers serverless.

@scosman
Collaborator Author

scosman commented Jan 14, 2025

Conclusion: serverless Gemini fine-tuning is possible. However, the auth and setup experience would be awful. GCP really needs to clean this up; it manages to be much worse than AWS (which is quite bad). Putting this on ice indefinitely. Would prefer a provider with more open models over an ugly config for 1 model.

@scosman scosman closed this as not planned Jan 14, 2025
@leonardmq
Contributor

leonardmq commented Jan 14, 2025

EDIT: just noticed you had already implemented what I was suggesting below, in #106 - nice one - thanks 🥇

@scosman - fine-tuning Gemini models seems to require a slightly different dataset format than OpenAI's. If adding Google / Vertex as a full provider does not seem worth the effort, perhaps we could just have a formatter to download the dataset in the right format - then the user can do the fine-tuning on their own using Google's tooling, off-Kiln. Though I'm not sure if this flow aligns well with the downstream features coming up, like eval.

I can look into implementing the formatter if it sounds good to you.

(The main reason I am interested in Gemini is that word on the street says Gemini 2.0 Flash pricing will be in the GPT-4o-mini ballpark, with GPT-4o-level quality. So it might be a good one to fine-tune.)
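The format difference behind the formatter suggestion above can be sketched like this. The OpenAI chat-JSONL shape is standard, but the target `text_input`/`output` field names and the "prepend the system prompt to the input" convention are assumptions drawn from this thread, not a spec.

```python
import json

# Sketch of a formatter converting one OpenAI-style fine-tuning record
# ({"messages": [...]}) into the flat shape the Gemini tuning API wants.
# Handles the simple one-turn case only; multi-turn conversations would
# need a flattening convention of their own.


def openai_to_gemini(record: dict) -> dict:
    """Flatten a {"messages": [...]} record to {"text_input", "output"}."""
    system, user, assistant = "", "", ""
    for msg in record["messages"]:
        if msg["role"] == "system":
            system = msg["content"]
        elif msg["role"] == "user":
            user = msg["content"]
        elif msg["role"] == "assistant":
            assistant = msg["content"]
    # No chat format on the tuning side, so embed the system prompt
    # into the input text (per the "embed prompt into input" note above).
    text_input = f"{system}\n\n{user}" if system else user
    return {"text_input": text_input, "output": assistant}


line = ('{"messages": [{"role": "system", "content": "Be terse."}, '
        '{"role": "user", "content": "2+2?"}, '
        '{"role": "assistant", "content": "4"}]}')
converted = openai_to_gemini(json.loads(line))
print(converted)
# {'text_input': 'Be terse.\n\n2+2?', 'output': '4'}
```

A real formatter would stream a whole JSONL file through this, one record per line, and write the converted records out the same way.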
