change insert_one() to insert_many() in mongo_db save function. #455
(Also, if a duplicate "_id" key already exists in MongoDB, the insert raises an error and everything stops.)
You need to add test code for this situation.
You should give the user an upsert option.
If upsert is True, a duplicate id does not raise an error; the existing document is simply overwritten.
If upsert is False, a duplicate id raises an error.
The upsert option is sometimes very convenient, so we should support it. Find a way to implement this with pymongo.
Add test code.
If I used pymongo's update_many() with upsert=True, it would update all existing passages into a "single passage". So instead we take the upsert parameter directly: if it is true, we collect the ids that already exist into a list and update only those duplicates (and even then, we use bulk_write to minimize iteration as much as possible). For ids that do not exist yet, we just proceed with insert_many() rather than updating everything. (I searched hard for a pymongo function that updates everything at once, but I haven't found one...)
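For reference, the approach described above could look roughly like the sketch below. It is not the code in this PR: the names split_by_existing and save, and the shape of the documents, are hypothetical. It assumes every document carries an "_id" and that ids are unique within one batch.

```python
def split_by_existing(docs, existing_ids):
    """Partition docs into (duplicates, new) by whether their _id is already stored."""
    existing = set(existing_ids)
    duplicates = [d for d in docs if d["_id"] in existing]
    new = [d for d in docs if d["_id"] not in existing]
    return duplicates, new


def save(collection, docs, upsert=False):
    """Hypothetical save(): overwrite duplicates only when upsert is True."""
    # Imported lazily so split_by_existing() stays usable without pymongo.
    from pymongo import ReplaceOne

    docs = list(docs)
    if not docs:
        return  # insert_many() and bulk_write() both reject empty lists
    if not upsert:
        # A duplicate _id makes insert_many() raise BulkWriteError.
        collection.insert_many(docs)
        return

    # One query to learn which of the incoming ids are already stored.
    ids = [d["_id"] for d in docs]
    existing_ids = collection.distinct("_id", {"_id": {"$in": ids}})
    duplicates, new = split_by_existing(docs, existing_ids)

    if duplicates:
        # One round trip for all overwrites instead of one call per document.
        collection.bulk_write(
            [ReplaceOne({"_id": d["_id"]}, d) for d in duplicates]
        )
    if new:
        collection.insert_many(new)
```

Keeping the partitioning step as a plain function means it can be unit-tested without a running MongoDB instance; the paths that touch the collection still need an integration test.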
…xtend will update it for you)
finally..
close #438
(Also, if a duplicate "_id" key already exists in MongoDB, the insert raises an error and everything stops.)
-> This case is already handled well by MongoDB itself (BulkWriteError).
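As an illustration of the BulkWriteError behaviour mentioned above, here is a minimal sketch of how a caller might inspect the error. The helper names are hypothetical and not part of this PR. By default insert_many() is ordered and stops at the first failing document; the sketch passes ordered=False so the remaining documents are still attempted, which is a design choice rather than what the PR does.

```python
DUPLICATE_KEY = 11000  # MongoDB server error code for a duplicate key


def duplicate_key_errors(details):
    """Return the writeErrors entries that were caused by duplicate keys."""
    return [
        err for err in details.get("writeErrors", [])
        if err.get("code") == DUPLICATE_KEY
    ]


def insert_reporting_duplicates(collection, docs):
    """Hypothetical helper: insert docs, return batch indexes rejected as duplicates."""
    # Imported lazily so duplicate_key_errors() stays usable without pymongo.
    from pymongo.errors import BulkWriteError

    try:
        collection.insert_many(docs, ordered=False)
    except BulkWriteError as exc:
        all_errors = exc.details.get("writeErrors", [])
        dup_errors = duplicate_key_errors(exc.details)
        if len(dup_errors) != len(all_errors):
            raise  # something other than a duplicate key went wrong
        return [err["index"] for err in dup_errors]
    return []
```

A test for the duplicate "_id" case could assert that the returned indexes match the documents that were inserted twice.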