Clean md store objects #8700
base: master
Conversation
Clean md store objects once the max deleted objects reach the limit.
Fixes: https://issues.redhat.com/browse/DFBUGS-1339
Signed-off-by: Vinayakswami Hariharmath <[email protected]>
@@ -58,23 +58,39 @@ async function clean_md_store(last_date_to_remove) {
            ${total_objects_count} objects - Skipping...`);
        return;
    }
    const objects_to_remove = await clean_md_store_objects(last_date_to_remove, config.DB_CLEANER_DOCS_LIMIT);
Did you rearrange the same code into smaller functions? I'm not sure it's needed.
Yes. We only want to clean the deleted md objects, and the other calls in the function are not needed for that, so I divided the function into three sub-functions.
    dbg.log2('DB_CLEANER: list objects:', objects_to_remove);
    if (objects_to_remove.length) {
        if (objects_to_remove.length > config.MD_STORE_MAX_DELETED_OBJECTS_LIMIT) {
Why are we adding this limit? Why not just delete fewer objects?
What would be a good number?
I think the number is arbitrary, but we can go with 10 or 20, for example.
@@ -42,6 +43,9 @@ class ObjectsReclaimer {
        if (has_errors) {
            return config.OBJECT_RECLAIMER_ERROR_DELAY;
        }

        await clean_md_store_objects(Date.now());
I don't think this belongs here... Why should we run db_cleaner inside object_reclaimer? db_cleaner will run on objects with no regard to the list that was reclaimed (and of course, we don't want to completely delete from the DB objects that were only just marked as deleted!). I think what we wanted here is new functionality that, for each of the objects, checks how many deleted objects we have with the same key and, if it's more than X, deletes the older copies. @dannyzaken to keep me honest here...
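A minimal sketch of the per-key pruning described in this comment, written against the plain MongoDB Node.js driver rather than noobaa's own db_client; the collection handle, field names, and the `MAX_DELETED_COPIES_PER_KEY` constant are assumptions for illustration, not actual noobaa code:

```js
// Sketch only: hard-delete older deleted copies of a single key, keeping the newest N.
// Field names and MAX_DELETED_COPIES_PER_KEY are illustrative assumptions.
const MAX_DELETED_COPIES_PER_KEY = 20;

async function prune_deleted_copies(objectmds, bucket_id, key) {
    // All soft-deleted copies of this key, newest deletion first
    const deleted_copies = await objectmds
        .find({ bucket: bucket_id, key, deleted: { $exists: true } })
        .sort({ deleted: -1 })
        .project({ _id: 1 })
        .toArray();

    // Keep the newest N copies and hard-delete the rest
    const ids_to_remove = deleted_copies
        .slice(MAX_DELETED_COPIES_PER_KEY)
        .map(doc => doc._id);
    if (ids_to_remove.length) {
        await objectmds.deleteMany({ _id: { $in: ids_to_remove } });
    }
    return ids_to_remove.length;
}
```

The caller would pass the objectmds collection handle (e.g. `db.collection('objectmds')`) together with the bucket id and key of the object that was just overwritten.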
The Jira issue (https://issues.redhat.com/browse/DFBUGS-1339) specifically mentions that we can clean the deleted objects from the md store in the object reclaimer, so I thought this was the place. @dannyzaken please correct me here.
@vh05 I didn't intend for the description in Jira to be specific instructions on how to fix it. I wanted to provide some general context.
I don't see a reason to call from object_reclaimer to the db_cleaner. These are two separate bg_workers.
@vh05 I think this PR does not do what it is supposed to do. We want to keep up to a limited number of deleted copies of the same key. In your implementation, you delete any object_md that is deleted and only keep 100 in total. This is a bit too aggressive, in my opinion.

Another thing to consider is that you probably want to ignore objectmds that are not marked as reclaimed yet.

I strongly suggest creating a similar case on a local deployment. You can easily produce a dataset with >100 overwrites of the same key, so you can test your code.
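For the local reproduction suggested here, a small script along these lines could generate the overwrite pattern; the endpoint, credentials, and bucket name are placeholders for a local noobaa deployment, not values taken from this PR:

```js
// Sketch: overwrite the same key many times against a local S3 endpoint,
// so that >100 deleted object mds accumulate for that single key.
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');

const s3 = new S3Client({
    endpoint: 'http://localhost:6001',   // placeholder local endpoint
    region: 'us-east-1',
    credentials: { accessKeyId: 'ACCESS_KEY', secretAccessKey: 'SECRET_KEY' },  // placeholders
    forcePathStyle: true,
});

async function overwrite_same_key(bucket, key, count = 150) {
    for (let i = 0; i < count; i += 1) {
        await s3.send(new PutObjectCommand({ Bucket: bucket, Key: key, Body: `version-${i}` }));
    }
}

overwrite_same_key('first.bucket', 'overwrite-test.txt').catch(console.error);
```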
You will probably need to implement new functions in md_store (see noobaa-core/src/server/object_services/md_store.js, lines 805 to 826 at f32827b).
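One possible shape for such a helper, sketched with the MongoDB aggregation API; the pipeline, field names, and the idea of returning the _ids beyond the newest `keep` copies per (bucket, key) are assumptions about what a new md_store function could look like, not the code at the referenced lines:

```js
// Sketch: for every (bucket, key) group, collect the _ids of deleted object mds
// beyond the newest `keep` copies. Pipeline shape and field names are assumptions.
async function find_excess_deleted_ids(objectmds, keep, groups_limit) {
    const groups = await objectmds.aggregate([
        { $match: { deleted: { $exists: true } } },
        { $sort: { deleted: -1 } },                      // newest deletions first
        { $group: {
            _id: { bucket: '$bucket', key: '$key' },
            ids: { $push: '$_id' },                      // ids arrive in sorted order
            count: { $sum: 1 },
        } },
        { $match: { count: { $gt: keep } } },            // only keys with excess copies
        { $limit: groups_limit },
    ]).toArray();

    // Everything after the newest `keep` entries in each group is a removal candidate
    return groups.flatMap(g => g.ids.slice(keep));
}
```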
It is probably not sufficient, and you will need to aggregate more data than the existing md_store helpers return (for example, you will need to keep track of the _ids of the objects to delete/keep). For the Postgres backend, the corresponding query translation lives in noobaa-core/src/util/postgres_client.js (lines 1180 to 1200 at f32827b).
This translation is far from perfect, and we should verify that the generated queries work as expected and perform reasonably well.
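As a rough illustration of what the Postgres side might look like (assuming one table per collection with an `_id` column and a `data` JSONB column; the table name, field names, the caller-provided node-postgres client, and the ordering on the deletion timestamp are all assumptions):

```js
// Sketch: select the _ids of deleted object mds beyond the newest `keep` copies
// per (bucket, key), using a window function. Schema assumptions as noted above.
async function find_excess_deleted_ids_pg(pg_client, keep, limit) {
    const sql = `
        SELECT _id FROM (
            SELECT _id,
                   ROW_NUMBER() OVER (
                       PARTITION BY data->>'bucket', data->>'key'
                       ORDER BY (data->>'deleted') DESC   -- may need a cast depending on how the date is stored
                   ) AS rn
            FROM objectmds
            WHERE data->>'deleted' IS NOT NULL
        ) ranked
        WHERE rn > $1
        LIMIT $2`;
    const res = await pg_client.query(sql, [keep, limit]);
    return res.rows.map(row => row._id);
}
```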
We perform soft delete (mark the object as deleted) ...
I'm not sure I understand your point. We don't want to remove deleted rows arbitrarily after reaching some limit. We specifically want to handle this overwrite use case, where a single object is frequently overwritten.
Of course we need to profile it once we have the code ready. We can run EXPLAIN queries once we have a working query, and analyze the performance.
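For that profiling step, one way to run EXPLAIN from Node is simply to prefix the candidate query; the connection-string environment variable here is an assumption:

```js
// Sketch: run EXPLAIN (ANALYZE, BUFFERS) on a candidate query via node-postgres.
const { Client } = require('pg');

async function explain_query(sql, params) {
    const client = new Client({ connectionString: process.env.POSTGRES_URL });  // assumed env var
    await client.connect();
    try {
        const res = await client.query(`EXPLAIN (ANALYZE, BUFFERS) ${sql}`, params);
        res.rows.forEach(row => console.log(row['QUERY PLAN']));
    } finally {
        await client.end();
    }
}
```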
Sure, Danny. That clarifies my queries.