Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial transliteration of RawImage to an Eigen backend. #364

Merged
merged 37 commits into from
Nov 2, 2023

Conversation

DinoBektesevic
Copy link
Member

Draft PR for discussion purposes.

  • adds Eigen as a submodule
  • adds Eigen to search path during builds
  • copies RawImage to RawImageEigen
  • replaces the vector<float> image array with Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>
  • reimplements the methods in terms of Matrix objects to showcase the interface and enable perf measurements.
  • Attempts to bind a more Python native interface but it's still a bit messy.

The challenge in interpreting what are the cases processing methods are handling is probably the hardest part of the transition to Eigen. Many implementations either accrued some hidden deep knowledge about non-obvious edge-cases or are vestigial remnants of a debugging expedition that were never cleared out - it's hard to say.

run Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@jeremykubica jeremykubica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more comments.

src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image_eigen.h Outdated Show resolved Hide resolved
@DinoBektesevic
Copy link
Member Author

Ok, I've redone the Index and Point in a way that I hope makes a lot more sense. I didn't convert all of the methods to that yet, f.e. grow_mask comes to mind as a simple example, but I did do interpolate for comparison sake.

New:

inline auto RawImageEigen::get_interp_neighbors_and_weights(const Point& p) const {

Old

std::array<float, 12> RawImageEigen::get_interp_neighbors_and_weights(float y, float x) const {

This and interpolate turned out a whole heck of a lot shorter than before. Unpacking of optional is a bit odd, maybe it'd be more readable if we did if (neighbor.value_or(false)) do_something rather than a loop, but I was told by strangers on the internet that this is a reasonable thing to do.

I'd also point out the get_pixel, set_pixel and add are all now working via contains and Point or Index so that the rounding down is consistently being (int)floord and the checks are all in 1 place (even the contains could have been boiled down to a single method, but it's getting late) and all these have been made inlined functions. I have tested simplified variants of the code and noticed that the inline hint is actually working on -O3, but I can't guarantee that it would inline them every time.

I also think it's rather interesting how context kind of already sets what a method should take - an Index or an Point and the two almost never overlap. I'd even like to remove the add that takes in a Point - because it really doesn't make sense to add a value at a point, only an index. Only interpolated_add makes sense for a Point.

Does this look better to you?

@jeremykubica
Copy link
Contributor

This approach looks good to me. I agree with your thoughts on Point vs Index. The new formulation of them looks good.

src/kbmod/search/raw_image.cpp Outdated Show resolved Hide resolved
src/kbmod/fake_data_creator.py Outdated Show resolved Hide resolved
src/kbmod/search/raw_image.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image.cpp Outdated Show resolved Hide resolved
src/kbmod/search/raw_image.cpp Outdated Show resolved Hide resolved
@DinoBektesevic DinoBektesevic marked this pull request as ready for review October 25, 2023 17:39
src/kbmod/search/common.h Show resolved Hide resolved
float y;
float x;

const Index to_index() const { return {(int)(floor(y - 0.5f) + 0.5f), (int)(floor(x - 0.5f) + 0.5f)}; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is assuming that the center of the pixel is (0.5, 0.5), correct? You had mentioned changing this assumption to be consistent with other approaches (although I can't remember which... maybe astropy?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the line in question is the one in stamps. This makes us consistent with both the Rubin stack and AstroPy I believe.

auto [corner, anchor, h, w] = indexing::anchored_block({(int)p.y, (int)p.x}, radius, width, height);

https://github.com/dirac-institute/kbmod/blob/eigen/raw_image/src/kbmod/search/raw_image.cpp#L117C61-L117C66

src/kbmod/search/geom.h Outdated Show resolved Hide resolved
src/kbmod/search/geom.h Outdated Show resolved Hide resolved
// range is inclusive of the first element, and can not be longer than start
// minus max range
int length = end - start + 1;
length = std::min(length, max_range - start);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this actually happen? It looks like it is covered with the mins and maxes above.

Copy link
Member Author

@DinoBektesevic DinoBektesevic Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit confusing because when start is 0 - we count that as an element, but when start is something like 7 and end is 10, then the length is just end-start.

Removing it will break a test in test_geom.py - but because that's the new functionality we've just added we can just remove this and fix the tests. It's a matter of what we decide the method should behave as. I'm seeing something is off (or confusing at least) regarding the range in the tests but left it as is to make progress. Have to revisit, would appreciate a sanity check.

@@ -248,19 +249,32 @@ void StackSearch::fill_psi_phi(const std::vector<RawImage>& psi_imgs, const std:

psi_vect->clear();
psi_vect->reserve(num_images * num_pixels);
phi_vect->clear();
phi_vect->reserve(num_images * num_pixels);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

psi_vect and phi_vect are both initially passed as size 0. The motivation of the reserve was to avoid a bunch of small copies during array doubling. Since we know the final size, we can preallocate the full vector at the start.

Either way, we should say consistent with psi_vect and phi_vect: preallocate both or neither.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes no sense to remove these lines, my mistake - not sure what happened. Restored

@@ -280,7 +294,8 @@ std::vector<float> StackSearch::create_curves(Trajectory t, const std::vector<Ra
/* Do not use get_pixel_interp(), because results from create_curves must
* be able to recover the same likelihoods as the ones reported by the
* gpu search.*/
float pix_val = imgs[i].get_pixel(t.x + int(times[i] * t.vx + 0.5), t.y + int(times[i] * t.vy + 0.5));
Point p({t.y + times[i] * t.vy + 0.5f, t.x + times[i] * t.vx + 0.5f});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is orthogonal to this PR, but I think we should clean up the inconsistencies sometimes shifting the trajectory positions by 0.5 and sometimes not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Best guess here is that this used to be a part of a for loop or something and the someone noticed rounding is sometimes funky, added 0.5 in. This 0.5 is now "extra" because the context was that origin was in each pixel or something like that. We should either shortcut this directly make sure Trajectory returns Point and Point.to_index does the correct rounding.

src/kbmod/search/stamp_creator.cpp Outdated Show resolved Hide resolved
src/kbmod/search/stamp_creator.cpp Outdated Show resolved Hide resolved
src/kbmod/work_unit.py Outdated Show resolved Hide resolved
src/kbmod/fake_data_creator.py Outdated Show resolved Hide resolved
src/kbmod/search/common.h Show resolved Hide resolved
src/kbmod/search/geom.h Show resolved Hide resolved
src/kbmod/search/geom.h Show resolved Hide resolved
src/kbmod/search/geom.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image.h Outdated Show resolved Hide resolved
src/kbmod/search/raw_image.cpp Show resolved Hide resolved
// replaced by
// Eigen::Index, i, j;
// array.maxCoeff(&i, &j)
// Not sure I understand the condition of !furthest_from_center
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for getting rid of furthest_from_center

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaned up the comments, leaving to remember to open an issue.

tests/test_raw_image.py Show resolved Hide resolved
tests/test_work_unit.py Show resolved Hide resolved
Consistently handle all the various Cartesian vs indexing notation.
Add tests for structs and functions in geom.h.
Add bindings for missing functions.
Improve bindings for Index and Point.
  - add comparison operators for Index==Index and Index==tuple
  - add comparion operators for Point==Point and Point==tuple
  - add casting to numpy structured arrays.
  - add constructors for Point
Improve AnchoredRectangle:
  - rename to Rectangle
  - add constructors to enable use as Rectangle
  - anchor is a consequence of how we treat stamps (2r+1) and
    can be generalized, but that's a different PR. Treat that
    as implementation detail for now.
Add comments, improve code readability.
Clang and flakify.
@DinoBektesevic
Copy link
Member Author

I'm a little bit confused by this error. On Baldur it returns

(Pdb) trj
Trajectory(lh: 0.000000 flux: 0.000000 x: -3 y: 12 vx: 25.000000 vy: 10.000000 obs_count: 0)
(Pdb) best
Trajectory(lh: 66.763184 flux: 232.403976 x: -3 y: 12 vx: 25.012035 vy: 9.990126 obs_count: 18)

On bendis though:

(Pdb) trj
Trajectory(lh: 0.000000 flux: 0.000000 x: -3 y: 12 vx: 25.000000 vy: 10.000000 obs_count: 0)
(Pdb) best
Trajectory(lh: 66.763184 flux: 232.403976 x: -3 y: 12 vx: 25.206821 vy: 9.487919 obs_count: 18)
(Pdb) self.velocity_error
0.05

Not sure where the difference comes from. I'll investigate more, but do you have an idea?

@jeremykubica
Copy link
Contributor

I haven't seen a different like that before. What point of the code do those come from?

@DinoBektesevic
Copy link
Member Author

DinoBektesevic commented Oct 31, 2023

https://github.com/dirac-institute/kbmod/actions/runs/6701639278/job/18209372552?pr=364#step:8:36

Test search, pixel_off_chip. Pretty consistently tests pass on baldur (by which I mean I haven't had them fail yet), but they also pretty consistently (which means: every time) fail on bendis. Cuda 12, gcc and g++ is some version of 8.5 on both machines. Bendis uses 1080, baldur two 2080 Ti's so drivers and nvcc subversions are different.

Search produces 3 nearly identical results to within a pixel.
Different gcc/g++/nvcc versions sorting algorithm produce different
sorting of the results. This is version dependent as they are all
correctly sorted, as per the likelihood which is equal for the results.
@DinoBektesevic
Copy link
Member Author

I'm not super happy with the RawImage Python bindings and how best to handle, if at all, the non-const reference to the matrix. I do think there's been enough work done here to make addressing that into another PR. Since this is somewhat blocking, I'd prefer we just get it in, downsides included, and then work at it again.

Copy link
Contributor

@jeremykubica jeremykubica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. A few small comments.

src/kbmod/search/raw_image.cpp Show resolved Hide resolved
// it makes no sense to return RawImage here because there is no
// obstime by definition of operation, but I guess it's out of
// scope for this PR because it requires updating layered_image
// and image stack
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. In a later PR we can change this to return the Eigen matrix.

src/kbmod/search/geom.h Outdated Show resolved Hide resolved
src/kbmod/search/geom.h Outdated Show resolved Hide resolved
src/kbmod/search/pydocs/geom_docs.h Outdated Show resolved Hide resolved
src/kbmod/search/pydocs/geom_docs.h Outdated Show resolved Hide resolved
DinoBektesevic and others added 2 commits November 2, 2023 16:16
Fix spelling mistake in geom_docs

Co-authored-by: Jeremy Kubica <[email protected]>
@DinoBektesevic DinoBektesevic merged commit fc69675 into main Nov 2, 2023
1 check passed
@jeremykubica jeremykubica deleted the eigen/raw_image branch February 23, 2024 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants