Hi, when using pymongoarrow.api.aggregate_arrow_all() it seems to omit columns that would contain only null values.
Field "email" with None only

    data = [
        {"name": "Charlie", "email": None},
        {"name": "Eve", "email": None},
    ]

PyMongoArrow result:

    [{'_id': ObjectId('66a36acc11ce1209ca0bfcf8'), 'name': 'Charlie'},
     {'_id': ObjectId('66a36acc11ce1209ca0bfcf9'), 'name': 'Eve'}]

PyMongo result:

    [{'_id': ObjectId('66a36acc11ce1209ca0bfcf8'), 'name': 'Charlie', 'email': None},
     {'_id': ObjectId('66a36acc11ce1209ca0bfcf9'), 'name': 'Eve', 'email': None}]

PyMongoArrow result contains field 'name' but is missing field "email".
Field "email" with None and empty string

    data = [
        {"name": "Charlie", "email": None},
        {"name": "Eve", "email": ""},
    ]

PyMongoArrow result:

    [{'_id': ObjectId('66a3689f75fbe1b2bef04931'), 'name': 'Charlie', 'email': None},
     {'_id': ObjectId('66a3689f75fbe1b2bef04932'), 'name': 'Eve', 'email': ''}]

PyMongo result:

    [{'_id': ObjectId('66a3689f75fbe1b2bef04931'), 'name': 'Charlie', 'email': None},
     {'_id': ObjectId('66a3689f75fbe1b2bef04932'), 'name': 'Eve', 'email': ''}]

PyMongoArrow result contains 'name' and 'email' fields.
Code used for this example:

    from pymongo import MongoClient
    from pymongoarrow.api import aggregate_arrow_all

    data = [
        {"name": "Charlie", "email": None},
        {"name": "Eve", "email": None},
    ]

    # Insert data
    client = MongoClient("mongodb://localhost:27017/")
    db = client["my_dummy_database"]
    collection = db["my_dummy_collection"]
    collection.insert_many(data)

    # Retrieve results
    pipeline = [{"$match": {"email": {"$exists": True}}}]
    result_arrow = aggregate_arrow_all(collection, pipeline)
    result_regular = collection.aggregate(pipeline)

    print("PyMongoArrow result:\n", result_arrow.to_pylist())
    print("PyMongo result:\n", list(result_regular))
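
For anyone who wants to check for this discrepancy without eyeballing the printed output, the two result lists can be compared by key set. This is only a rough sketch: `missing_fields` is a hypothetical helper (not part of PyMongo or PyMongoArrow), and the rows below are hand-written stand-ins for the output above with `_id` left out, so it runs without a database.

```python
def missing_fields(reference_rows, candidate_rows):
    """Return field names that appear in reference_rows but in no candidate row."""
    reference_keys = set()
    for row in reference_rows:
        reference_keys.update(row)
    candidate_keys = set()
    for row in candidate_rows:
        candidate_keys.update(row)
    return sorted(reference_keys - candidate_keys)


# Stand-ins for list(collection.aggregate(...)) and result_arrow.to_pylist()
pymongo_rows = [
    {"name": "Charlie", "email": None},
    {"name": "Eve", "email": None},
]
arrow_rows = [
    {"name": "Charlie"},
    {"name": "Eve"},
]

dropped = missing_fields(pymongo_rows, arrow_rows)
print(dropped)  # ['email']
```

With the real results, the same call would take `list(result_regular)` as the reference and `result_arrow.to_pylist()` as the candidate.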
Thanks for reporting this bug, @K-to-the-D. This has to do with the auto schema, and is hopefully straightforward to fix given Arrow's null type.
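
To illustrate the kind of behaviour being described, here is a pure-Python sketch of schema inference. It is an assumption about the general mechanism, not PyMongoArrow's actual implementation: both functions are hypothetical, and type names are plain strings rather than Arrow types. The first function only learns a field's type from non-null values, so an all-null field never enters the schema. The second records a placeholder `"null"` type and upgrades it if a concrete value shows up later.

```python
def infer_schema_skipping_nulls(docs):
    """Learn field types from non-null values only (all-null fields are lost)."""
    schema = {}
    for doc in docs:
        for key, value in doc.items():
            if value is not None:
                schema.setdefault(key, type(value).__name__)
    return schema


def infer_schema_with_null_type(docs):
    """Same idea, but keep all-null fields under a placeholder 'null' type."""
    schema = {}
    for doc in docs:
        for key, value in doc.items():
            if value is None:
                # Remember the field exists, without overwriting a known type.
                schema.setdefault(key, "null")
            elif schema.get(key, "null") == "null":
                # First concrete value seen: set or upgrade the type.
                schema[key] = type(value).__name__
    return schema


all_null = [
    {"name": "Charlie", "email": None},
    {"name": "Eve", "email": None},
]
mixed = [
    {"name": "Charlie", "email": None},
    {"name": "Eve", "email": ""},
]

print(infer_schema_skipping_nulls(all_null))  # {'name': 'str'}
print(infer_schema_with_null_type(all_null))  # {'name': 'str', 'email': 'null'}
print(infer_schema_with_null_type(mixed))     # {'name': 'str', 'email': 'str'}
```

The two example inputs mirror the two cases in the report: the all-null case loses `email` under the first strategy, while the mixed case keeps it under either.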
caseyclements