-
Notifications
You must be signed in to change notification settings - Fork 16.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add source metadata to bedrock retriever response #21349
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Ignored Deployment
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rozerarenu
Thanks for making the change. One minor comment, can you also make sure this PR is in alignment with changes here.
langchain-ai/langchain-aws#36
Co-authored-by: Piyush Jain <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good! Can you please fix the lint errors.
@baskaryan |
…ng delay I am not able to reproduce the latest CI lint failures in my local even with same python versions and the workflow has been failing one after another for tests. Removing these for now as we need this change merged in for unblocking a bedrock customer. Will revisit the tests later at some point.
This was implemented in #21349 but dropped before merge.
added back the unit tests here: #21485 |
Thank you for contributing to LangChain! - [X] **PR title**: "community: Add source metadata to bedrock retriever response" - [X] **PR message**: - **Description:** Bedrock retrieve API returns extra metadata in the response which is currently not returned in the retriever response - **Issue:** The change adds the metadata from bedrock retrieve API response to the bedrock retriever in a backward compatible way. Renamed metadata to sourceMetadata as metadata term is being used in the Document already. This is in sync with what we are doing in llama-index as well. - **Dependencies:** No - [X] **Add tests and docs**: 1. Added unit tests 2. Notebook already exists and does not need any change 3. Response from end to end testing, just to ensure backward compatibility: `[Document(page_content='Exoplanets.', metadata={'location': {'s3Location': {'uri': 's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647, 'source_metadata': {'x-amz-bedrock-kb-source-uri': 's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year': 1946.0}})]` - [X] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Piyush Jain <[email protected]>
This was implemented in langchain-ai#21349 but dropped before merge.
Thank you for contributing to LangChain! - [X] **PR title**: "community: Add source metadata to bedrock retriever response" - [X] **PR message**: - **Description:** Bedrock retrieve API returns extra metadata in the response which is currently not returned in the retriever response - **Issue:** The change adds the metadata from bedrock retrieve API response to the bedrock retriever in a backward compatible way. Renamed metadata to sourceMetadata as metadata term is being used in the Document already. This is in sync with what we are doing in llama-index as well. - **Dependencies:** No - [X] **Add tests and docs**: 1. Added unit tests 2. Notebook already exists and does not need any change 3. Response from end to end testing, just to ensure backward compatibility: `[Document(page_content='Exoplanets.', metadata={'location': {'s3Location': {'uri': 's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647, 'source_metadata': {'x-amz-bedrock-kb-source-uri': 's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year': 1946.0}})]` - [X] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Piyush Jain <[email protected]>
This was implemented in langchain-ai#21349 but dropped before merge.
Thank you for contributing to LangChain!
PR title: "community: Add source metadata to bedrock retriever response"
PR message:
Add tests and docs:
[Document(page_content='Exoplanets.', metadata={'location': {'s3Location': {'uri': 's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647, 'source_metadata': {'x-amz-bedrock-kb-source-uri': 's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year': 1946.0}})]
Lint and test: Run
make format
,make lint
andmake test
from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/Additional guidelines:
If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.