Adjust LLM parameters to speed things up #361
Merged
The most recent model, `gemini-2.0-flash-001`, responds noticeably more slowly than its predecessor, the experimental `gemini-2.0-flash-exp`.

![Image](https://private-user-images.githubusercontent.com/108608/411172883-a9a9a886-8a73-4d4d-a197-9f6bec78b60e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk2MDk1MzIsIm5iZiI6MTczOTYwOTIzMiwicGF0aCI6Ii8xMDg2MDgvNDExMTcyODgzLWE5YTlhODg2LThhNzMtNGQ0ZC1hMTk3LTlmNmJlYzc4YjYwZS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE1JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxNVQwODQ3MTJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT00MDBjOWExYjE2ZDQzYjAzMTg3YTFiYTljZmI0ZDQxZTJiMmNlYjc0ZjYyYWU5MjI5MzY2MGMzNTg2YzliOGZkJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.kZJyQ-4SKfzK58DqKYDFsI2S5IT2f1jYWG0Y14IkynM)

On the other hand, I found that `gemini-1.5-pro-002` in `asia-east1` (Taiwan) and `asia-northeast1` (Tokyo) is actually as fast as `gemini-2.0-flash-exp` in `us-central1`, and its quality is also pretty good.

![Image](https://private-user-images.githubusercontent.com/108608/411172752-b7652d65-f903-41bb-8daa-d5b2822d6125.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk2MDk1MzIsIm5iZiI6MTczOTYwOTIzMiwicGF0aCI6Ii8xMDg2MDgvNDExMTcyNzUyLWI3NjUyZDY1LWY5MDMtNDFiYi04ZGFhLWQ1YjI4MjJkNjEyNS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE1JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxNVQwODQ3MTJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1kYjQ5Mjc4NzMwNDVjZDZkMWNhYjNhZTZjNThjOTZlNTNkNWQwOTFjYWU0YmUxNGMwNGVhNDg0NmQxZmI4NDM1JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.G6cROaNwy-t4PrFc4hokV6ORncC7ibgjG5KHEtNBixc)

This PR updates the model settings used for LLM transcripts to prioritize `gemini-1.5-pro-002` in Asia, then `gemini-2.0-flash-exp`. For now, we ignore the latest `gemini-2.0-flash-001` until it gets a significant speed boost.
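
For readers who want to picture the priority order described above, here is a minimal sketch in Python. It is not the code changed in this PR: `ModelTarget`, `MODEL_PRIORITY`, and `transcribe_with_fallback` are hypothetical names, and the idea of moving to the next entry when a call raises is an assumption about how the priority list might be consumed. The `us-central1` location for `gemini-2.0-flash-exp` is taken from the speed comparison above rather than from the actual settings.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class ModelTarget:
    """One (model, region) pair to try. Hypothetical type for illustration."""
    model: str
    location: str


# Priority order as described in the PR text: gemini-1.5-pro-002 in Asia
# first, then gemini-2.0-flash-exp. gemini-2.0-flash-001 is left out on
# purpose. The us-central1 location is an assumption.
MODEL_PRIORITY: Sequence[ModelTarget] = (
    ModelTarget("gemini-1.5-pro-002", "asia-east1"),       # Taiwan
    ModelTarget("gemini-1.5-pro-002", "asia-northeast1"),  # Tokyo
    ModelTarget("gemini-2.0-flash-exp", "us-central1"),
)


def transcribe_with_fallback(
    call_model: Callable[[ModelTarget], str],
    targets: Sequence[ModelTarget] = MODEL_PRIORITY,
) -> str:
    """Try each target in order and return the first successful result.

    `call_model` is supplied by the caller and wraps whatever client is
    actually used; this sketch does not call any real API.
    """
    errors: List[str] = []
    for target in targets:
        try:
            return call_model(target)
        except Exception as exc:  # assumed fallback trigger: any failure
            errors.append(f"{target.model}@{target.location}: {exc}")
    raise RuntimeError("All model targets failed: " + "; ".join(errors))
```

Keeping the list as plain data means that re-enabling `gemini-2.0-flash-001` later would only require adding one entry, with no change to the calling logic.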