Problem
Working with LLMs presents challenges such as domain knowledge gaps, factuality issues, and hallucinations. Retrieval Augmented Generation (RAG) mitigates these by integrating external knowledge sources, such as databases, making it valuable for knowledge-intensive or dynamic, domain-specific applications. A key benefit is that RAG eliminates the need to retrain LLMs for specific tasks, and its use has grown, particularly in conversational agents.
In the SpeziLLM context, RAG methods are currently instantiated in the form of function calling for the OpenAI layer, enabling cloud-based models to dynamically and precisely fetch content (e.g., health data from HealthKit) from the local device in order to answer users' queries.
However, function calling is only a limited form of the RAG approach, as it relies on static function (tool) definitions and is therefore restricted to data of rather limited complexity and fixed structure.
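For reference, a function (tool) definition in this layer looks roughly like the sketch below. The `LLMFunction` protocol and `@Parameter` property wrapper are meant to mirror SpeziLLMOpenAI's function-calling API, though exact signatures may differ; the step-count example and its stubbed return value are purely illustrative.

```swift
import SpeziLLMOpenAI

// Sketch of a function (tool) definition for the OpenAI layer.
// `StepCountFunction` is a hypothetical example, not part of SpeziLLM itself.
struct StepCountFunction: LLMFunction {
    static let name: String = "get_step_count"
    static let description: String = "Returns the user's step count for a given day."

    @Parameter(description: "The day to query, in ISO 8601 format, e.g. 2024-05-01.")
    var day: String

    func execute() async throws -> String? {
        // Placeholder: a real implementation would run a HealthKit statistics query for `day`;
        // a fixed value stands in here to keep the sketch self-contained.
        "The user took 7542 steps on \(day)."
    }
}
```

Such a function is registered when configuring the LLM schema, and the schema fixed at that point is exactly what limits the flexibility described above.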
Solution
To unlock the full potential of enriching LLM responses with personal health data, SpeziLLM should enhance its support for RAG methods. This includes building infrastructure to interact with vector databases (both local and remote), retrieve relevant (embedded) information from these databases, and seamlessly inject it into the LLM's context to improve the accuracy and relevance of responses to user queries.
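As a rough illustration of the kind of retrieval infrastructure this could involve, the sketch below shows a minimal in-memory vector store with cosine-similarity retrieval and prompt augmentation. All type and function names (`EmbeddedDocument`, `InMemoryVectorStore`, `augmentedPrompt`) are hypothetical and not part of the current SpeziLLM API, and producing the embeddings themselves (via a local model or a remote service) is assumed to happen elsewhere.

```swift
import Foundation

/// A document together with its embedding vector (hypothetical type for this sketch).
struct EmbeddedDocument {
    let text: String
    let embedding: [Double]
}

/// Minimal in-memory vector store; a real setup would likely use a local or remote vector database.
struct InMemoryVectorStore {
    private var documents: [EmbeddedDocument] = []

    mutating func insert(_ document: EmbeddedDocument) {
        documents.append(document)
    }

    /// Returns the `k` documents whose embeddings are most similar to the query embedding.
    func retrieve(queryEmbedding: [Double], k: Int) -> [EmbeddedDocument] {
        documents
            .sorted {
                cosineSimilarity($0.embedding, queryEmbedding) > cosineSimilarity($1.embedding, queryEmbedding)
            }
            .prefix(k)
            .map { $0 }
    }

    private func cosineSimilarity(_ lhs: [Double], _ rhs: [Double]) -> Double {
        let dot = zip(lhs, rhs).reduce(0.0) { $0 + $1.0 * $1.1 }
        let lhsNorm = sqrt(lhs.reduce(0.0) { $0 + $1 * $1 })
        let rhsNorm = sqrt(rhs.reduce(0.0) { $0 + $1 * $1 })
        guard lhsNorm > 0, rhsNorm > 0 else { return 0 }
        return dot / (lhsNorm * rhsNorm)
    }
}

/// Injects the retrieved context into the prompt before it is handed to the LLM.
func augmentedPrompt(userQuery: String, retrieved: [EmbeddedDocument]) -> String {
    let context = retrieved.map(\.text).joined(separator: "\n")
    return """
    Use the following personal health context to answer the question.

    Context:
    \(context)

    Question: \(userQuery)
    """
}
```

The open design questions are exactly where such a store should live (on-device vs. remote), how embeddings are computed, and how the augmented prompt is injected into the existing SpeziLLM context handling.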
There is currently no clear approach for implementing RAG infrastructure in SpeziLLM, making this a rather exploratory issue that will likely require further discussion to determine the best solution. Feel free to reach out to @philippzagar or @LeonNissen if there's interest in working on RAG in SpeziLLM.
A high-level overview of RAG can be found here:
Additional context
No response
Code of Conduct
I agree to follow this project's Code of Conduct and Contributing Guidelines
@LeonNissen made some great progress on this in the last few months; we might be able to push some experimental features into SpeziLLM in the near future to support this on a local level.
It would be great to see if we can already incorporate this into remote calls, but I suspect that a local framework might not be the best way to set that up, wouldn't it? I think the focus should rather lie on local execution at this point.