Why is the world coord system rotated in colmap_utils.py? #1504

Open
jkulhanek opened this issue Feb 25, 2023 · 23 comments

Comments

@jkulhanek
Contributor

What is the reason the world coord system is rotated here?

c2w = c2w[np.array([1, 0, 2, 3]), :]
c2w[2, :] *= -1

Could we keep the coordinates as they are in COLMAP?

@pwais
Contributor

pwais commented Feb 25, 2023

COLMAP uses OpenCV camera conventions, but nerfstudio (and many other nerf impls, maybe all of them?) use OpenGL coordinate system. The code here is a bit obfuscated but it provides a precise translation.

FMI see discussions:

@tancik
Contributor

tancik commented Feb 25, 2023

@jkulhanek
Contributor Author

The two lines don’t change the camera coordinate system, though; there is a line above them which flips y and z to convert from OpenCV to OpenGL. I thought that in COLMAP the orientation of the world coordinate system is more or less arbitrary. Even when you run something like model_aligner, the orientation is not defined. The orientation is overridden in the dataparser anyway, so it doesn’t really matter. I just thought it would be good to have a way to preserve the poses from COLMAP (when disabling the transformation code in the dataparser).

To the best of my knowledge, multinerf (first referenced link) doesn’t do this. It just switches from OpenCV to OpenGL here: https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/datasets.py#L109

@pwais
Contributor

pwais commented Feb 28, 2023

You're right, the cited code in nerfstudio is not equivalent to the cited multinerf code, at least not the precise cited lines. The cited conversion appears to do a handedness change but then also flips z:

array([[ 0, 1,  0,  0],
       [ 1, 0,  0,  0],
       [ 0, 0, -1,  0],
       [ 0, 0,  0,  1]])

(note I'm going to ignore re-scale / re-center)
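
For reference, a minimal NumPy sketch of why the two quoted lines correspond to the matrix above: reordering rows and negating one row of `c2w` is the same as left-multiplying `c2w` by that matrix. The pose values are arbitrary test data, and the helper name `apply_two_lines` is made up for illustration.

```python
import numpy as np

# The 4x4 matrix quoted above.
P = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, 1],
], dtype=float)


def apply_two_lines(c2w: np.ndarray) -> np.ndarray:
    """Apply the two lines from the issue to a 4x4 camera-to-world matrix."""
    out = c2w[np.array([1, 0, 2, 3]), :]  # fancy indexing: a copy with rows 0 and 1 swapped
    out[2, :] *= -1                       # negate the third row (world z)
    return out


rng = np.random.default_rng(0)
c2w = np.eye(4)
c2w[:3, :] = rng.normal(size=(3, 4))  # arbitrary values, not a real pose

same = np.allclose(apply_two_lines(c2w), P @ c2w)
print(same)
```

Because the matrix multiplies from the left, it acts on the world axes of the pose rather than on the camera's own axes.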

So, COLMAP is right-handed, the OpenGL camera model is right-handed, but OpenGL NDC is left-handed, no? np.diag([1, -1, -1, 1]) should account for the change in camera frames, but the nerfstudio code is doing a handedness change too? This is very confusing because the viewer world frame indicator is definitely right-handed, at least assuming the RGB axes all indicate positive xyz.

I believe the original LLFF code used in original nerf does a handedness change as well:

Note that the original nerf code also had a special "NDC" mode for forward-facing scenes, that's a little different than what we're discussing here I think:

And instant-ngp has it too:

SDFStudio in contrast flips the camera frame back to OpenCV:

One way to break all this down is into three frames: world, camera (logical), and image plane ("sensor frame" or "screen frame"). If expected ray termination is supposed to be positive, and the screen / "sensor" frame has +x right and +y up, then that's a left-handed system, while the world and camera frames are right-handed.

So I think what we're seeing is that the Nerf model code is "leaking the abstraction of the screen space" into the data encoding? In multinerf it seems the screen space stuff is absorbed into the ray casting step: https://github.com/google-research/multinerf/blob/5d4c82831a9b94a87efada2eee6a993d530c4226/internal/camera_utils.py#L646

Now I'm curious if the camera paths that can be specified in the viewer for rendering... are those poses in a right-handed or left-handed system? ha

@jkulhanek
Contributor Author

jkulhanek commented Feb 28, 2023

The cited code does not change the handedness. It only rotates the world coordinate system, right?

@pwais
Contributor

pwais commented Feb 28, 2023 via email

@jkulhanek
Contributor Author

If I delete the code, it trains fine. From what I understand, the only problem would be with forward-facing scenes?

@ShkarupaDC

ShkarupaDC commented Mar 26, 2023

@pwais, hi! I am sorry for the stupid question, but could you explain again:

  1. Why do we need this conversion after the OpenCV -> OpenGL coordinate system conversion?
  2. What problem with forward-facing scenes does it solve, and why?

@qq456cvb

I don't think there is a handedness change in these two lines: the first line changes the handedness and the next line changes it back. You can also verify this by calculating the determinant of the matrix

array([[ 0, 1,  0,  0],
       [ 1, 0,  0,  0],
       [ 0, 0, -1,  0],
       [ 0, 0,  0,  1]])

(det = 1 for sure). So it is merely a rotation of the world frame, which I think has no influence on training.
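
The determinant argument can be checked numerically. The sketch below uses made-up poses; it confirms that the matrix is a proper rotation and that left-multiplying every pose by it leaves the relative pose between any two cameras unchanged, which is one way to see why it should not affect training.

```python
import numpy as np

P = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, 1],
], dtype=float)

R = P[:3, :3]
det = np.linalg.det(R)                           # +1 for a rotation, -1 for a reflection
is_orthogonal = np.allclose(R @ R.T, np.eye(3))


def random_pose(rng: np.random.Generator) -> np.ndarray:
    """Build an arbitrary rigid 4x4 pose (test data only)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1  # turn a reflection into a proper rotation
    pose = np.eye(4)
    pose[:3, :3] = q
    pose[:3, 3] = rng.normal(size=3)
    return pose


rng = np.random.default_rng(0)
cam_a, cam_b = random_pose(rng), random_pose(rng)

# Pose of camera b expressed in camera a's frame, before and after the world rotation.
rel_before = np.linalg.inv(cam_a) @ cam_b
rel_after = np.linalg.inv(P @ cam_a) @ (P @ cam_b)
relative_pose_unchanged = np.allclose(rel_before, rel_after)
```

Geometrically, this matrix is a 180-degree rotation about the (1, 1, 0) direction of the world frame.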

@panchagil

panchagil commented Oct 11, 2023

I've faced the same problem since I wanted to visualize the camera positions along with colmap point-cloud.

The difference between the COLMAP and Nerfstudio camera conventions is just a sign flip on the Y and Z axes.

Nerfstudio: +X is right, +Y is up, and +Z is pointing back and away from the camera. -Z is the look-at direction.

Colmap: The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image. (ref: colmap-doc)

I'm using this code instead

c2w = np.linalg.inv(w2c)
c2w[0:3, 1:3] *= -1
# that is it!
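
As a sketch of what the in-place negation does: negating columns 1 and 2 of the rotation block is the same as right-multiplying `c2w` by `np.diag([1, -1, -1, 1])`, which flips the camera's own y and z axes and leaves the camera position and the world frame untouched. The `w2c` below is made-up test data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary rigid world-to-camera matrix (test data only).
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1
w2c = np.eye(4)
w2c[:3, :3] = q
w2c[:3, 3] = rng.normal(size=3)

c2w_cv = np.linalg.inv(w2c)      # camera-to-world, OpenCV-style camera axes

c2w_gl = c2w_cv.copy()
c2w_gl[0:3, 1:3] *= -1           # the conversion from the comment above

flip = np.diag([1.0, -1.0, -1.0, 1.0])
same_as_right_multiply = np.allclose(c2w_gl, c2w_cv @ flip)
position_unchanged = np.allclose(c2w_gl[:3, 3], c2w_cv[:3, 3])

# OpenCV looks along the camera's +z, OpenGL along its -z.
# Expressed in world coordinates, both give the same viewing direction.
look_cv = c2w_cv[:3, 2]
look_gl = -c2w_gl[:3, 2]
same_look_direction = np.allclose(look_cv, look_gl)
```

Right-multiplication changes only how the camera's local axes are labeled, which is why the world coordinates from COLMAP are preserved by this variant.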

@jb-ye
Collaborator

jb-ye commented Jan 19, 2024

@jkulhanek If you want to keep the world coordinate, this is the option

--assume_colmap_world_coordinate_convention=False \
--orientation_method=none \
--center_method=none \
--auto-scale-poses=False \

which expects this behavior in colmap parser.

@pwais
Contributor

pwais commented Mar 11, 2024

as i originally instigated this issue, i vote that #2793 effectively closes this one. @jkulhanek any thoughts?

@pwais
Contributor

pwais commented Mar 11, 2024

Actually I think @jb-ye #2793 might have a bug? when assume_colmap_world_coordinate_convention is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163

BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388

That function above will only get the auto-orient-and-scale transform I think, eh?

@jb-ye
Collaborator

jb-ye commented Mar 11, 2024

> Actually I think @jb-ye #2793 might have a bug? when assume_colmap_world_coordinate_convention is off, the camera poses read from colmap will change: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L163
>
> BUT the world cloud will not change as well: https://github.com/jb-ye/nerfstudio/blob/8b4179fad405f1203fcfc973cad69ee051d40132/nerfstudio/data/dataparsers/colmap_dataparser.py#L388
>
> That function above will only get the auto-orient-and-scale transform I think, eh?

This is the intended behavior. When the flag is turned on, an additional transform will be applied to world coordinates. Camera poses will be converted to OpenGL convention regardless of this flag.

@pwais
Contributor

pwais commented Mar 11, 2024

@jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what assume_colmap_world_coordinate_convention=False is for despite the comment in the code. Even re-reading the summary of #2793 I don't understand why / when to use assume_colmap_world_coordinate_convention=False and orientation_method=none and some others were confused too.

@jb-ye
Collaborator

jb-ye commented Mar 12, 2024

> @jb-ye but for methods like gaussian splatting, the world frame of the 3d points needs to be the same world frame referenced by the camera transforms, no? fwiw somebody was having problems with splatting hence we were trying to debug. I guess I don't understand what assume_colmap_world_coordinate_convention=False is for despite the comment in the code. Even re-reading the summary of #2793 I don't understand why / when to use assume_colmap_world_coordinate_convention=False and orientation_method=none and some others were confused too.

Just to clarify, the additional transform enabled by assume_colmap_world_coordinate_convention has been there from the very beginning. My PR just wraps it in a flag so we can optionally turn it off.

Those options are reserved for the case when you want to keep the orientation of the colmap world when exporting a ply file. By default, nerfstudio will first apply a fixed transform to the colmap world before re-orienting it using the orientation method. The only way to disable this behavior is to set both options to false/none.

The 3d point clouds are transformed in the same way as the cameras, regardless of whether those options are on or off. If you believe they are inconsistent, could you give a bit more detail about why you think so?
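
A small sketch of the consistency requirement under discussion, using made-up data: if the same world transform `T` is applied to the camera poses and to the 3D points, each point's coordinates in the camera frame are unchanged, while applying `T` to the cameras only breaks the correspondence. `T` is the fixed rotation quoted earlier in the thread; the helper names and all other values are illustrative assumptions, not nerfstudio code.

```python
import numpy as np

T = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, 1],
], dtype=float)


def transform_points(matrix: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Apply a 4x4 transform to an Nx3 array of points."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (matrix @ homog.T).T[:, :3]


def to_camera_frame(c2w: np.ndarray, points_world: np.ndarray) -> np.ndarray:
    """Express Nx3 world points in the frame of a camera-to-world pose."""
    return transform_points(np.linalg.inv(c2w), points_world)


rng = np.random.default_rng(0)
c2w = np.eye(4)
c2w[:3, 3] = rng.normal(size=3)        # identity rotation, arbitrary position
points = rng.normal(size=(5, 3))

reference = to_camera_frame(c2w, points)
both_moved = to_camera_frame(T @ c2w, transform_points(T, points))
cameras_only = to_camera_frame(T @ c2w, points)

consistent = np.allclose(reference, both_moved)
mismatch = not np.allclose(reference, cameras_only)
```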

@pwais
Contributor

pwais commented Mar 14, 2024

> Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file

Maybe the flag and functionality should just be moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not in the data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name assume_colmap_world_coordinate_convention on input doesn't suggest anything about exporting ...

@jb-ye
Collaborator

jb-ye commented Mar 15, 2024

> Those options are reserved for the case when you want to keep the orientation of colmap world when exporting ply file
>
> Maybe the flag and functionality should just be moved to the viewer, since that was what the original "bug" was? Options related to export should probably be in export.py and not in the data parsing / input stage, and today some things like gsplat don't even actually respect the colmap frame. At least, the name assume_colmap_world_coordinate_convention on input doesn't suggest anything about exporting ...

Unfortunately, the splatfacto model cannot be exported to a different world frame because one cannot rotate the spherical harmonics parameters. See #2951 for a similar discussion.

In other words, if one wants to recover the model in the original world coordinates, one has to choose between (1) disabling any nerfstudio-specific transforms while loading the data, or (2) exporting a metadata file alongside the ply export that records those transforms.
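
Option (2) could look roughly like the sketch below: write the transform applied at load time to a small JSON file next to the export, then invert it later to bring positions back to the original world frame. The file name, the JSON keys, both helper functions, and the order of operations (rigid transform first, then uniform scale) are all assumptions for illustration, not an existing nerfstudio format. It handles point positions only, not orientation-dependent quantities such as spherical harmonics.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def save_transform(path: Path, transform: np.ndarray, scale: float) -> None:
    """Record the 4x4 load-time transform and scale factor (hypothetical schema)."""
    path.write_text(json.dumps({"transform": transform.tolist(), "scale": scale}))


def restore_points(path: Path, points: np.ndarray) -> np.ndarray:
    """Undo the scale, then the rigid transform, for Nx3 exported points."""
    meta = json.loads(path.read_text())
    transform = np.array(meta["transform"])
    unscaled = points / meta["scale"]
    homog = np.hstack([unscaled, np.ones((len(unscaled), 1))])
    return (np.linalg.inv(transform) @ homog.T).T[:, :3]


# Made-up load-time transform: the fixed rotation from this thread plus an arbitrary shift.
transform = np.array([
    [0.0, 1.0, 0.0, 0.3],
    [1.0, 0.0, 0.0, -1.2],
    [0.0, 0.0, -1.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])
scale = 0.25

rng = np.random.default_rng(0)
original = rng.normal(size=(4, 3))

# Forward direction assumed here: rigid transform first, then uniform scale.
homog = np.hstack([original, np.ones((len(original), 1))])
exported = scale * (transform @ homog.T).T[:, :3]

with tempfile.TemporaryDirectory() as tmp:
    meta_path = Path(tmp) / "export_transform.json"  # hypothetical file name
    save_transform(meta_path, transform, scale)
    recovered = restore_points(meta_path, exported)
```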

@pwais
Contributor

pwais commented Mar 15, 2024

@jb-ye The current issue with frames etc. (and especially the really poor naming, e.g. "transformation_matrix", which hides the actual frame name) has led to a lot of confusion on Discord, and from what I've seen it is a top reason why people just use the original Inria impl. Based on this discussion it seems assume_colmap_world_coordinate_convention is indeed just for the viewer; really it should be the viewer that has an option to find the "logical gravity vector", and then nobody has to worry about e.g. spherical harmonics being "in the wrong frame."

@jkulhanek
Contributor Author

Spherical harmonics can easily be rotated using Wigner matrices. I have NumPy code somewhere if you want it.
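
To illustrate the idea for the lowest non-trivial band only: for degree-1 real spherical harmonics, the Wigner matrix reduces to the 3x3 rotation itself, re-expressed in the SH basis ordering. The sketch below assumes a degree-1 basis proportional to (-y, z, -x); that ordering and sign convention is an assumption and should be checked against the actual model. Higher bands need larger matrices that are not shown here, and this is not the code offered in the comment above.

```python
import numpy as np

# Maps a direction (x, y, z) to the assumed degree-1 basis (-y, z, -x),
# up to a constant factor that cancels out in the rotation.
M = np.array([
    [0.0, -1.0, 0.0],
    [0.0, 0.0, 1.0],
    [-1.0, 0.0, 0.0],
])


def eval_sh1(coeffs: np.ndarray, direction: np.ndarray) -> float:
    """Evaluate the degree-1 band of one channel along a unit direction."""
    return float(coeffs @ (M @ direction))


def rotate_sh1(coeffs: np.ndarray, rotation: np.ndarray) -> np.ndarray:
    """Coefficients of the same function after rotating the world by `rotation`."""
    return M @ rotation @ M.T @ coeffs


# The fixed world rotation discussed in this thread (3x3 block).
rotation = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
])

rng = np.random.default_rng(0)
coeffs = rng.normal(size=3)            # arbitrary degree-1 coefficients
direction = rng.normal(size=3)
direction /= np.linalg.norm(direction)

# The rotated function, queried along the rotated direction, should give the original value.
before = eval_sh1(coeffs, direction)
after = eval_sh1(rotate_sh1(coeffs, rotation), rotation @ direction)
```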

@pfxuan

pfxuan commented Mar 15, 2024

PlayCanvas's implementation seems to work great:

@jb-ye
Collaborator

jb-ye commented Mar 15, 2024

@pwais Thanks for the explanation. Just want to summarize your proposal and confirm my understanding:

(1) You want to remove the assume_colmap_world_coordinate_convention flag and somehow put this transform in the viewer. That way, when a user specifies orientation_method=none, the exported GS is truly in the original input world coordinates. The current implementation requires the user to additionally specify assume_colmap_world_coordinate_convention=False when parsing colmap data.
(2) The default setting of nerfstudio is to orient, center, and scale the scene (the current splatfacto requires the input scene to be scaled to 1 in order to work properly). So, when exporting a GS ply in the original world coordinates, one has to figure out a way to rotate the spherical harmonics (e.g. the one suggested by @jkulhanek) to respect the original world coordinates.

In my opinion, if we implemented feature (2), (1) becomes irrelevant anyway. We may prioritize #2951 .

@jb-ye
Collaborator

jb-ye commented Mar 15, 2024

> Based on this discussion it seems assume_colmap_world_coordinate_convention is indeed just for the viewer

Not entirely: the nerfstudio docs suggest that the world up direction is +Z, which is often not the case for a colmap project (where the up direction is -Y). But I agree this is really a convenience hack for ns-viewer. We can completely remove assume_colmap_world_coordinate_convention and have options in ns-viewer to select the gravity direction.

ArpegorPSGH pushed a commit to ArpegorPSGH/nerfstudio that referenced this issue Jun 22, 2024