Information about hypothesis on Inference #92

Answered by samuelrince
djuillard asked this question in Q&A

Hello @djuillard,

Thanks a lot for the feedback, we really appreciate it! 💚

The 4-bit hypothesis comes from the data we extracted from the LLM-Perf Leaderboard in early 2024. To cover the full range of models from ~1B to 70B parameters, we had to use the results obtained with 4-bit quantization. So it is not really a choice, but rather a constraint imposed by the data source.

We plan to integrate more data sources in the near future, including our own experiments, to better estimate the linear regression model of energy per token as a function of model size. Hopefully, we will be able to conduct more experiments with different sets of optimizations and hardware configurations, thus progressively gaining in precision/robu…
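
To make the "energy per token as a function of model size" idea concrete, here is a minimal sketch of an ordinary least-squares line fit. It is only an illustration, not our actual fitting code: the helper names (`fit_line`, `predict_energy_per_token`) are hypothetical, and the data points are synthetic placeholders in arbitrary units, not values from the LLM-Perf Leaderboard.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = alpha * x + beta."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    return alpha, beta


# ASSUMPTION: synthetic placeholder data, NOT real measurements.
# Model sizes in billions of parameters, spanning the ~1B to 70B range.
model_sizes_b = [1.0, 7.0, 13.0, 34.0, 70.0]
# Energy per token in arbitrary units, generated on an exact line
# (slope 2, intercept 5) purely so the fit is easy to check.
energy_per_token = [2.0 * size + 5.0 for size in model_sizes_b]

alpha, beta = fit_line(model_sizes_b, energy_per_token)


def predict_energy_per_token(size_b):
    """Estimate energy per token for a model of `size_b` billion parameters."""
    return alpha * size_b + beta
```

With real data, each added source (other quantization levels, optimizations, or hardware configurations) would simply contribute more `(model size, energy per token)` points to the fit, or a separate fit per configuration.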
