Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Support for binary point file (.bin) reading and writing #943

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ChristophKirst
Copy link

Here we provide support for a binary point file reader and writer to elastix to enable transformations of a large number of points (millions or more) for which the text format is not feasible.

This is based on the discussion in issue #665

The pull request provides the following:

  • Binary file reading and writing is detected if the -def file ends with extension 'bin'.

  • The binary serial format is assumed to be:
    isindex(uint64), numberofpoints(uint64), points(nx3, double)

  • Only the transformed points are written to get maximal performance.

We would be very happy to discuss any changes to this strategy that might fit your overall strategy better.
A binary read/writing option is essential for us. If it was integrated to your repository it would allow us elastix in our project rather then shipping our own patched version with our software.

@N-Dekker
Copy link
Member

N-Dekker commented Aug 10, 2023

Thank you very much @ChristophKirst I appreciate how you carefully implemented this feature! Will discuss with Stefan (@stefanklein) and Marius (@mstaring) how to follow-up on your PR!

In the mean time some small comments already. (But the main question remains: do we agree about the feature itself. To be discussed later.)

Comment on lines +41 to +42
/cmake-build-debug/
/build/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see this as a separate commit. (Following the principle of "One Commit. One Change.") Why do those directories appear in your source tree? Are they specific to a particular tool or platform? I'm fine with the change itself, but I would like some clarification!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, sorry for the late reply due to other obligations.
This can be safely ignored, i did not realize this change in .gitignore

/** Read the first entry */
std::string indexOrPoint;
this->m_Reader >> indexOrPoint;
if( this->m_Binary ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use automatic code formatting by clang-format. It looks like this code still needs to be formatted that way, as I would expect it to place the { on the next line. (We use a specific (old) version of clang-format: 8.0.1) No problem, of course, we can fix that later.

this->m_Reader.open(this->m_FileName, std::ios::binary);

/** Read index flag */
unsigned long isindex;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry to say the use of (signed or unsigned) long is problematic, because its size differs between currently supported platforms and compilers. On Linux/GCC it's typically 64 bits, whereas on Windows/MSVC, it's 32 bits. If you want to write unsigned integers on one platform and read them back on another, better use either uint32_t or uint64_t. (In general use cases, unsigned int is usually fine, as it is the "default" unsigned type, and it's 32-bits on all supported platforms.)


/** Read index flag */
unsigned long isindex;
this->m_Reader.read((char*) &isindex, sizeof(isindex));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of C-style (char*) cast, please use reinterpret_cast<const char *>. C-style casts are generally considered unsafe.

@N-Dekker
Copy link
Member

Some tests should be still added. I'm not sure if they can be written as small GoogleTest unit tests (testing the extra TransformixInputPointFileReader functionality at C++ class level), or only as elastix executable tests. But I would like to see tests for various number of points (N = 0, N = 1, N = 2), as well as various coordinate values. Still to be discussed, maybe.

@N-Dekker
Copy link
Member

@ChristophKirst Is the binary file format already in use outside of your (patched) elastix version? Otherwise we would prefer to support a more commonly used file format. Would the ".vtk" file format (from vtk.org) also be of interest to you?

@ChristophKirst
Copy link
Author

@N-Dekker sorry for the late reply due to many other obligations.
We currently use the patched version with our software ClearMap to align cleared tissues, and it would be much easier to link to your repository rather than ship ClearMap with our patched elastix version.
That said, I have not looked into vtk files, is there support for fast writing and reading of millions of points via vtk files in elastix / transformix ?

@ChristophKirst
Copy link
Author

ChristophKirst commented Oct 15, 2024

@N-Dekker I have checked, the transformix seems to read vtk binary files, but will generate a text based vtk output file, which is not practical in our case.
I could try to adapt elastix so one can choose the output file as text or binary ?
On the sort run, is it possible to adapt the solution above with the fixes you suggested?

@N-Dekker
Copy link
Member

N-Dekker commented Oct 15, 2024

@ChristophKirst Thanks for your continuing interest 😃

the transformix seems to read vtk binary files, but will generate a text based vtk output file, which is not practical in our case.

I think I can fix that, by a few code adjustments 😃 What if transformix would generate a text based vtk output file, if its input vtk file would be binary? Would that solve your problem?

A draft implementation is at https://github.com/SuperElastix/elastix/tree/Write-VTK-binary

And by the way, do you prefer 32-bit float or 64-bit double precision for the coordinates of the points?

N-Dekker added a commit that referenced this pull request Oct 16, 2024
If the VTK input file is a binary file, the output file should also be a binary file.

Otherwise, the output file is still ASCII, as before.

Inspired by pull request #943 "ENH: Support for binary point file (.bin) reading and writing", ChristophKirst, Aug 2, 2023.
N-Dekker added a commit that referenced this pull request Oct 23, 2024
If the VTK input file is a binary file, the output file should also be a binary file.

Otherwise, the output file is still ASCII, as before.

Inspired by pull request #943 "ENH: Support for binary point file (.bin) reading and writing", ChristophKirst, Aug 2, 2023.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants