Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major updates #94

Open
jmuhlich opened this issue Jan 2, 2025 · 7 comments
Open

Major updates #94

jmuhlich opened this issue Jan 2, 2025 · 7 comments

Comments

@jmuhlich
Copy link

jmuhlich commented Jan 2, 2025

Hi Erick, I've been working on adding some missing features I need for a migration of my own, as well as streamlining the code (especially around ome_types usage since I have a lot of insider knowledge there 😅). You can see what I'm up to at https://github.com/jmuhlich/omero-cli-transfer/tree/local-changes . I have some questions for you about certain design choices in the code that will help guide my work, and also about how to split up these changes into individual PRs. What's the best place to discuss this?

@erickmartins
Copy link
Collaborator

Hi Jeremy! Before anything else, I'm about to go on vacation so any answer might be a bit late. I think maybe a good place for discussion would be a topic on the image.sc zulip - that allows for other people to see it as well and chime in. Thanks in advance for being willing to contribute code!

@michaelmell
Copy link

Hi @jmuhlich ,

we are preparing the migration of data (40TB) between two OMERO instance using omero-cli-transfer. Currently, I am using omero-cli-transfer 1.1.0 for this.

I came across your fork and after looking through your improvements / commits, I am considering switching your version / fork. I would therefore be interested to hear, how stable you consider your current version. Did you already test this for your own migration?
Any input would be greatly appreciated!

By the way: I made some modifications to the code (though they are not yet cleaned up). You can find them here, if you are interested.

Thank you and kind regards,
Michael

@jmuhlich
Copy link
Author

Hi Michael, glad to hear my changes could be of use to you as well. I think my version is quite stable -- I've migrated about 15 TB (~130 very large whole-slide tissue images) so far, which is 10% of the total I need to move. I'm pretty happy with the results.

One key feature I added (which you might also be interested in, based on your own modifications) is "--binaries all-except-images". I wanted this since I have local copies of the images on the source OMERO instance and I wanted to avoid copying that data over the internet. My workflow has been to create packs without images, then fill in the missing image files as symlinks to the proper files. I also reworked how the paths are normalized before being passed to omero import so the import metadata captures the final resolved path. Be aware that there's a minor bug with this path resolution change: If you link more than one file in a single pack folder to the same underlying file, the unpack process will crash. I know why this happens and I have a plan to address it, but for now just make sure to avoid doing that.

Most of my other changes add pack/unpack support for 1) new entities (channel names and Rendering Definitions, XMLAnnotations, TimestampAnnotations), 2) a more complete set of properties on existing entities (ROI Shape visual properties and affine transform, Annotation description and namespace, OriginalFile mimetype and name/path, Figure name), and 3) Annotations in previously unhandled places (Shapes, as distinct from annotations on the parent ROI itself). Be aware that I haven't tested the Screen/Plate/Well support after all my changes, as my current migration doesn't include any screening data. If you have screening data, please run some small-scale tests first.

One last note: There is a critical but obscure bug in the original code when using "unpack --figure --folder". If the unpack process crashes (which it did a lot during my development process), the figure .json files will be corrupted and future unpack runs will upload the corrupted version. Until this is fixed you should make backups of the figure .json files before running unpack.

This turned out to be pretty long but I hope it's useful. We can take this conversation to the Zulip chat if Erick prefers.

@joshmoore
Copy link
Member

Regardless of where, 👍 from my side.

@michaelmell
Copy link

Hi @jmuhlich ,

I created a topic and posted my reply on Zulip here. See you there...

@erickmartins
Copy link
Collaborator

hi all,

I didn't comb through the fork(s) extensively but, from your description, there is plenty here that I would love to have upstream as well (especially bugfixes!). If you can find the time to submit piecemeal PRs, that would be fantastic - we always need to keep the maintenance burden in mind when adding features (especially given that ultimately comes to me at the moment, and I find myself with less and less time to write code), but plenty of what has been described seems to be worth it. I think using GH for discussion on specific features and PRs is the right thing - discussions about bigger-picture design decisions and more "general development" kinds of things should probably live on Zulip to make sure everyone has a chance to chime in. Thanks again for extending and improving this!

@jmuhlich
Copy link
Author

@michaelmell and anyone else using my branch: I fixed the Figure json file corruption issue just now, in case that was of concern!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants