Allow kwargs in body_func #46

Merged 3 commits on Sep 26, 2024

Conversation

gdahia (Contributor) commented on Sep 23, 2024

Closes #41

This PR enables passing auxiliary keyword arguments to body_func when forming the request body for an Elasticsearch retrieval.
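As a rough illustration of the idea, the sketch below forwards keyword arguments from a retrieval call to a user-supplied body_func. It is not the library's code: `SketchRetriever`, `build_body`, and the `category` and `size` parameters are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, Optional


def body_func(
    query: str, *, category: Optional[str] = None, size: int = 10
) -> Dict[str, Any]:
    """Hypothetical body_func: builds a query body from the search text
    plus optional keyword arguments (`category` and `size` are assumptions)."""
    must = [{"match": {"text": query}}]
    filters = [{"term": {"category": category}}] if category else []
    return {"query": {"bool": {"must": must, "filter": filters}}, "size": size}


class SketchRetriever:
    """Stand-in for the retriever; it only shows how kwargs reach body_func."""

    def __init__(self, body_func: Callable[..., Dict[str, Any]]) -> None:
        self.body_func = body_func

    def build_body(self, query: str, **kwargs: Any) -> Dict[str, Any]:
        # Forward any auxiliary keyword arguments to the user's body_func.
        return self.body_func(query, **kwargs)


retriever = SketchRetriever(body_func)
body = retriever.build_body("async support", category="docs", size=3)
default_body = retriever.build_body("async support")
```

With this shape, callers that pass no extra arguments still get the defaults defined by body_func.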

It seems to me that we need to override _aget_relevant_documents because langchain-core does not pass kwargs to _aget_relevant_documents when it is called via ainvoke. It may be better to open a PR in langchain instead, but for now I will leave it here.

I will add tests and try to fix the mypy errors if the approach is validated by the maintainers.

@@ -112,3 +129,18 @@ def _multi_field_mapper(self, hit: Mapping[str, Any]) -> Document:
        field = self.content_field[hit["_index"]]
        content = hit["_source"].pop(field)
        return Document(page_content=content, metadata=hit)

    async def _aget_relevant_documents(
Collaborator

Is this method necessary? We do not currently have support for async in any other part of this library, and we would most likely not use an executor when we add it, since the Elasticsearch client does support async natively.

Contributor Author

I didn't realize that, sorry!

Anyway, these ainvoke calls work even though the underlying implementation is fully synchronous.

Leaving the retriever without this _aget_relevant_documents override may confuse users: it lets them call the retriever asynchronously (even though it runs synchronously under the hood) and then get either a different result than they expect, if the body_func tolerates the omission of its keyword arguments, or a hard-to-debug error caused by the missing arguments.

Does that make sense to you?
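The two failure modes described in this comment can be reproduced without any library code. The sketch below is only a simulation: `call_dropping_kwargs` is a hypothetical helper that mimics a call path which does not forward keyword arguments, and both body functions and their parameters are invented for illustration.

```python
from typing import Any, Callable, Dict


def lenient_body_func(query: str, size: int = 10) -> Dict[str, Any]:
    # Tolerates missing kwargs by falling back to a default.
    return {"query": {"match": {"text": query}}, "size": size}


def strict_body_func(query: str, *, tenant: str) -> Dict[str, Any]:
    # Requires `tenant`; there is no sensible default.
    return {
        "query": {
            "bool": {
                "must": [{"match": {"text": query}}],
                "filter": [{"term": {"tenant": tenant}}],
            }
        }
    }


def call_dropping_kwargs(
    func: Callable[..., Dict[str, Any]], query: str, **kwargs: Any
) -> Dict[str, Any]:
    # Simulates a call path that accepts kwargs but never forwards them.
    return func(query)


# Case 1: the caller asked for size=3 but silently gets the default.
silent = call_dropping_kwargs(lenient_body_func, "q", size=3)

# Case 2: the required keyword argument never arrives, so Python raises
# a TypeError far away from the place where the caller supplied it.
try:
    call_dropping_kwargs(strict_body_func, "q", tenant="acme")
    error = ""
except TypeError as exc:
    error = str(exc)
```

The first case is arguably the worse one, since nothing signals that the result differs from what was requested.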

miguelgrinberg (Collaborator) commented on Sep 23, 2024

Yes, I do realize that this works, but it is not a great solution. When we add async support to this library we are going to do it properly. Adding this hack seems out of place, considering that no other function in the library does it. Of course, you can subclass our implementation and add async methods in a subclass if this solution works for you.
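For readers who want to follow the subclassing suggestion, here is a minimal sketch of the pattern using only the standard library. `SyncRetriever` is a hypothetical stand-in for the library's synchronous retriever, and the method signatures are simplified assumptions (for instance, they omit the callback manager argument that LangChain retrievers normally receive). It runs the synchronous method in the default executor, which is the workaround discussed above rather than native async support.

```python
import asyncio
from functools import partial
from typing import Any, List


class SyncRetriever:
    """Hypothetical stand-in for a synchronous retriever."""

    def _get_relevant_documents(self, query: str, **kwargs: Any) -> List[str]:
        # A real retriever would query Elasticsearch here; this one just
        # echoes its inputs so the forwarding can be observed.
        return [f"{query}:{sorted(kwargs.items())}"]


class AsyncKwargsRetriever(SyncRetriever):
    """Subclass that adds an async method which forwards kwargs."""

    async def _aget_relevant_documents(self, query: str, **kwargs: Any) -> List[str]:
        loop = asyncio.get_running_loop()
        # run_in_executor only passes positional arguments, so bind the
        # keyword arguments with functools.partial first.
        call = partial(self._get_relevant_documents, query, **kwargs)
        return await loop.run_in_executor(None, call)


docs = asyncio.run(AsyncKwargsRetriever()._aget_relevant_documents("q", size=3))
```

The call still blocks a worker thread for the duration of the request, so it gives an async interface without the benefits of an async client.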

Contributor Author

That makes sense! I will just remove the async bit, then. Thanks!

miguelgrinberg (Collaborator) left a comment

Looks good. Thanks!

miguelgrinberg merged commit 00e593f into langchain-ai:main on Sep 26, 2024
11 checks passed
Development

Successfully merging this pull request may close these issues.

Passing kwargs to body_func
2 participants