-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upsert dataset #12
Upsert dataset #12
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you might want to just support the single generic call that supports Raw or Derived, or the two separate calls for Raw and Derived. But perhaps I'm missing something.
I see that the build is failing on the pre-commit step. I like the idea of the pre-commit step, but it has been really buggy. This particular error was about a dependency of black that needs to be pinned. I have removed the pre-commit step in gh actions in #15 . If you merge this into your branch, you can probably get the build running on it. |
Thank you for your comments, I will implement changes based upon them. Regarding the generic versus raw/derived upsert functions, I included both because I wasn't sure which way would be preferred. Is there a preference in the existing Scicat code for one method over another? |
Funny, I just noticed now that the upload follows the same pattern of three functions. I don't have a strong opinion. I think it's a little unfortunate that the backend API supports this same pattern as well. I think I personally prefer two separate methods. |
Okay, I'll go ahead with the two separate methods then. That seems the best approach to me as well. I've made the modifications suggested r.e. allowing the upsert method to insert if no record exists already (which is what it should do, you're correct) and the two separate functions. It's likely these will fail on pre-commit again however, as I have yet to merge the removal of pre-commit, apologies. |
With the pre-commit removal merged, it looks like it passes all the checks now. Hopefully I've implemented all the changes you suggested correctly, many thanks for your help. |
An interesting note...there is an upcoming new version of the scicat backend (https://github.com/SciCatProject/scicat-backend-next) which will no longer have separate URLs for Raw vs. Derived datasets. Pyscicat still needs to for now, but in the future, we might remove separate upload functions for Raw and Derived to match that pattern. |
I'll merge this now. Thanks for the PR and sorry it took so long! As a side note, I'm thinking about consolidating upsert and upload methods to match the new (but not yet released) version of the backend. See #18 |
We often upsert payloads into Scicat, especially when the datasets have been modified or appended. We are in the process of refactoring code to use Pyscicat and would make good use of an upsert functionality.
Here is our attempt, adding in upsert_dataset, upsert_raw_dataset and upsert_derived_dataset functions along with a test case. They utilise the already existing get_dataset function.