-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenFGA/OktaFGA retriever to authorize user/document access when doing RAG #6629
base: main
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
1 Skipped Deployment
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks interesting. I'll answer your implementation questions in a little after talking to the team. As for where to put examples & tests:
- ideally you'd add the documentation inside a
.ipynb
file indocs/core_docs/docs/integrations/retrievers
(use tslab for running typescript in notebooks) - for tests, if you see a test-set that is skipped, it's probably because it's an integration which has low usage/we don't have API keys on hand, and don't need to run tests every time.
Cool. I'll wait for guidelines around the implementation and then add tests + samples once there's alignment there. |
@bracesproul how's it going? bumping this thread. any insights here? |
Hey @dschenkelman, we unfortunately can't change the base signatures. I think passing async getRelevantDocuments(
query: string,
config?: Callbacks | BaseCallbackConfig
): Promise<DocumentInterface<Metadata>[]> {
const parsedConfig = ensureConfig(parseCallbackConfigArg(config));
const callbackManager_ = await CallbackManager.configure(
parsedConfig.callbacks,
this.callbacks,
parsedConfig.tags,
this.tags,
parsedConfig.metadata,
this.metadata,
{ verbose: this.verbose }
);
const runManager = await callbackManager_?.handleRetrieverStart(
this.toJSON(),
query,
parsedConfig.runId,
undefined,
undefined,
undefined,
parsedConfig.runName
);
try {
const user = config.configurable?. user;
// Your logic here
await runManager?.handleRetrieverEnd(results);
return results;
} catch (error) {
await runManager?.handleRetrieverError(error);
throw error;
}
} But would be much simpler to just rely on the constructor |
Opening this PR to start a discussion on best way to implement this.
First some context: OpenFGA/Okta FGA are systems to simplify authorization at scale. When implementing RAG, it is possible that not all users have access to all documents (chunks) to be retrieved. Systems like FGA (there are others beyond the ones I linked to), can help solve this problem.
The
FGARetriever
implementation aims to maintain the same interface for developers doing RAG with Langchain today, but adding support for authorization. This first implementation is a basic one that can be improved over time as use of it increases.Usage
Assuming tuples are already in FGA, the retrieval logic would look like this:
To discuss
As the example above shows, to make
FGARetriever
filter results based on a specific user, an instance ofFGARetriever
needs to be created per user. This is obvious from the resulting API, but can be confusing to developers.This can be addressed by adding a
context
object to bothinvoke
and_getRelevantDocuments
inBaseRetriever
. So instead ofit'd define:
I am not a Langchain expert, so I am open to other suggestions/alternatives. If the above makes sense I could send a PR to support that API first and then update the
FGARetriever
implementation based on it.Questions
Thanks!