-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INTPYTHON-504 Add DriverInfo to MongoClients #73
INTPYTHON-504 Add DriverInfo to MongoClients #73
Conversation
vector_store = MongoDBAtlasVectorSearch( | ||
collection=MONGODB_COLLECTION, | ||
vector_store = MongoDBAtlasVectorSearch.from_connection_string( | ||
connection_string="mongodb+srv://user:[email protected]/?w=majority&appName=Cluster0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe, to make this easier on the eyes, to remove all of the metadata in the connection_string? So it's just
connection_string="mongodb+srv://user:[email protected]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think keeping the parts in that are part of the standard Atlas connection string makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep it to MONGODB_ATLAS_CLUSTER_URI
client = MongoClient(connection_string) | ||
client = MongoClient( | ||
connection_string, | ||
driver=DriverInfo(name="Langchain", version=version("langchain-mongodb")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use the official capitalization LangChain
, or use langchain-mongodb
if we need to differentiate from langgraph (though, can we just use LangGraph
?). We should not need to report language specifically, because that will be known based on the driver.
I believe the version is intended to be the library version number, which should equal 0.2.69
. If it's not available, it seems the empty string is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had similar questions originally. The decision to use Langchain, not LangChain was made long ago, and fees a number of BI (business intelligence) reports. langchain-mongodb
is actually our package name. (e.g. from langchain_mongodb.pipelines import vector_search_stage
). In any case, I am not starting from scratch here. I am just doing a review of our monorepo and adding the DriverInfo where it could have been added previously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, the main thing for the library naming is consistency.
Regarding the version, our handshake specification says the following:
The version of the library wrapping the driver.
I would agree with what @blink1073 is suggesting below, which is adding the package or feature, if needed, to the name, for example Langchain|chat-message
(and the versions would then be 0.2.69|
). This is what we would do in Java. Note that:
The entire client metadata BSON document MUST NOT exceed 512 bytes.
This means that we have relatively few bytes to work with, and a long name like langchain-mongodb-chat-message
increases the chance that the entire driver info is dropped from the message.
vector_store = MongoDBAtlasVectorSearch( | ||
collection=MONGODB_COLLECTION, | ||
vector_store = MongoDBAtlasVectorSearch.from_connection_string( | ||
connection_string="mongodb+srv://user:[email protected]/?w=majority&appName=Cluster0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think keeping the parts in that are part of the standard Atlas connection string makes sense
collection=MONGODB_COLLECTION, | ||
vector_store = MongoDBAtlasVectorSearch.from_connection_string( | ||
connection_string="mongodb+srv://user:[email protected]/?w=majority&appName=Cluster0" | ||
namespace=f"db_name.collection_name", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary f-string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
self.client = MongoClient( | ||
connection_string, | ||
driver=DriverInfo( | ||
name="Langchain", version=version("langchain-mongodb") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we overload the version info to also include info about the specific feature?
for instance here it would be version=version("langchain-mongodb-chat-message")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Version is only tracked at the package level. If we want to include more info, we usually add it to the name, like PyMongo|c|async
, so in this case it would be Langchain|chat-message
.
…STER_URI to MONGODB_ATLAS_CONNECTION_STRING
Consistently add DriverInfo to MongoClient calls so that we track metadata.