Problem
Working with LLMs presents challenges such as domain knowledge gaps, factuality issues, and hallucinations. Retrieval Augmented Generation (RAG) mitigates these by integrating external knowledge sources, such as databases, making it valuable for knowledge-intensive or dynamic, domain-specific applications. A key benefit is that RAG eliminates the need to retrain LLMs for specific tasks, and its use has grown, particularly in conversational agents.
In the SpeziLLM context, RAG methods are currently instantiated in the form of function calling for the OpenAI layer, enabling cloud-based models to dynamically and precisely fetch content (e.g., health data from HealthKit) from the local device in order to answer users' queries.
However, function calling is only a limited form of the RAG approach, as it relies on static function (tool) definitions and is therefore restricted to data of rather limited complexity and fixed structure.
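For reference, a function (tool) definition in this layer looks roughly like the sketch below. The `LLMFunction` protocol and `@Parameter` property wrapper are meant to mirror SpeziLLMOpenAI's function-calling API, though exact signatures may differ; the step-count example and its stubbed return value are purely illustrative.

```swift
import SpeziLLMOpenAI

// Sketch of a function (tool) definition for the OpenAI layer.
// `StepCountFunction` is a hypothetical example, not part of SpeziLLM itself.
struct StepCountFunction: LLMFunction {
    static let name: String = "get_step_count"
    static let description: String = "Returns the user's step count for a given day."

    @Parameter(description: "The day to query, in ISO 8601 format, e.g. 2024-05-01.")
    var day: String

    func execute() async throws -> String? {
        // Placeholder: a real implementation would run a HealthKit statistics query for `day`;
        // a fixed value stands in here to keep the sketch self-contained.
        "The user took 7542 steps on \(day)."
    }
}
```

Such a function is registered when configuring the LLM schema, and the schema fixed at that point is exactly what limits the flexibility described above.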
Solution
To unlock the full potential of enriching LLM responses with personal health data, SpeziLLM should enhance its support for RAG methods. This includes building infrastructure to interact with vector databases (both local and remote), retrieve relevant (embedded) information from these databases, and seamlessly inject it into the LLM's context to improve the accuracy and relevance of responses to user queries.
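As a rough illustration of the kind of retrieval infrastructure this could involve, the sketch below shows a minimal in-memory vector store with cosine-similarity retrieval and prompt augmentation. All type and function names (`EmbeddedDocument`, `InMemoryVectorStore`, `augmentedPrompt`) are hypothetical and not part of the current SpeziLLM API, and producing the embeddings themselves (via a local model or a remote service) is assumed to happen elsewhere.

```swift
import Foundation

/// A document together with its embedding vector (hypothetical type for this sketch).
struct EmbeddedDocument {
    let text: String
    let embedding: [Double]
}

/// Minimal in-memory vector store; a real setup would likely use a local or remote vector database.
struct InMemoryVectorStore {
    private var documents: [EmbeddedDocument] = []

    mutating func insert(_ document: EmbeddedDocument) {
        documents.append(document)
    }

    /// Returns the `k` documents whose embeddings are most similar to the query embedding.
    func retrieve(queryEmbedding: [Double], k: Int) -> [EmbeddedDocument] {
        documents
            .sorted {
                cosineSimilarity($0.embedding, queryEmbedding) > cosineSimilarity($1.embedding, queryEmbedding)
            }
            .prefix(k)
            .map { $0 }
    }

    private func cosineSimilarity(_ lhs: [Double], _ rhs: [Double]) -> Double {
        let dot = zip(lhs, rhs).reduce(0.0) { $0 + $1.0 * $1.1 }
        let lhsNorm = sqrt(lhs.reduce(0.0) { $0 + $1 * $1 })
        let rhsNorm = sqrt(rhs.reduce(0.0) { $0 + $1 * $1 })
        guard lhsNorm > 0, rhsNorm > 0 else { return 0 }
        return dot / (lhsNorm * rhsNorm)
    }
}

/// Injects the retrieved context into the prompt before it is handed to the LLM.
func augmentedPrompt(userQuery: String, retrieved: [EmbeddedDocument]) -> String {
    let context = retrieved.map(\.text).joined(separator: "\n")
    return """
    Use the following personal health context to answer the question.

    Context:
    \(context)

    Question: \(userQuery)
    """
}
```

The open design questions are exactly where such a store should live (on-device vs. remote), how embeddings are computed, and how the augmented prompt is injected into the existing SpeziLLM context handling.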
There is currently no clear approach for implementing RAG infrastructure in SpeziLLM, making this a rather exploratory issue that will likely require further discussion to determine the best solution. Feel free to reach out to @philippzagar or @LeonNissen if there's interest in working on RAG in SpeziLLM.
A high-level overview of RAG can be found here:
Additional context
No response
Code of Conduct
I agree to follow this project's Code of Conduct and Contributing Guidelines
@LeonNissen made some great progress on this in the last few months; we might be able to push some experimental features into SpeziLLM in the near future to support this on a local level.
It would be great to see if we can already incorporate this into remote calls, but I suspect that a local framework might not be the best way to set that up, wouldn't it? I think the focus should rather lie on local execution at this point.