Question on OPENCV and OPENGL data convention #1286

Closed
yimingzhou1 opened this issue Jan 25, 2023 · 8 comments

@yimingzhou1
Contributor

You mentioned that this codebase uses the OpenGL data convention. However, I saw the following code in colmap_to_json() in nerfstudio/process_data/colmap_utils.py. Assuming the output of COLMAP follows the OpenCV data convention, can you please explain the following transformation code? Why are the rows switched, and why is the third row multiplied by -1? Thank you!

# Convert from COLMAP's camera coordinate system to ours
c2w[0:3, 1:3] *= -1
c2w = c2w[np.array([1, 0, 2, 3]), :]
c2w[2, :] *= -1
@rockywind

I am also confused by that.

@tancik
Contributor

tancik commented Feb 2, 2023

Here is some info on the coordinate system - https://docs.nerf.studio/en/latest/quickstart/data_conventions.html#camera-view-space

@rockywind

Thank you!

@duonglt19

I understand the line c2w[0:3, 1:3] *= -1: it changes the orientation of the camera coordinate system from [x, y, z] in OpenCV to [x, -y, -z] in OpenGL. But can you please explain why you swap rows and flip z in the following code? Thank you.

c2w = c2w[np.array([1, 0, 2, 3]), :]
c2w[2, :] *= -1

@Kai-46

Kai-46 commented Feb 13, 2023

@duonglt19 @tancik I also got confused by these row-swapping and z-flipping operations and did a quick investigation. It seems that these two operations have the effect of swapping the x and y axes and flipping the z axis in world space.

I'm not sure why this is needed, though. (If it's not important, I could submit a PR removing these two lines to avoid future confusion.)

Here's the brief proof. (Please feel free to point out any errors)
First, note that these two lines can be summarized into matrix form:

A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, -1, 0],
              [0, 0, 0, 1]]).astype(float)
C2W = A @ C2W

The line c2w[0:3, 1:3] *= -1 can also be written in matrix form:

B = np.diag([1, -1, -1, 1]).astype(float)
C2W = C2W @ B

Putting these together, we have the following matrix formulation of the function colmap2nerfstudio:

C2W = A @ C2W @ B

(Btw, the above A, B happen to satisfy A^{-1}=A, B^{-1}=B.)

Suppose we have a 3D point p = [x, y, z, 1]^T in camera space and denote C2W @ B @ p as [X, Y, Z, 1]^T. The additional A changes the world-space coordinates of this 3D point from [X, Y, Z, 1]^T to [Y, X, -Z, 1]^T. In other words, it swaps the x and y axes and flips the z axis of the world coordinate frame.
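
For anyone who wants to check this numerically, here is a minimal sketch. It assumes only NumPy; the helper name colmap_lines and the random test pose are made up for illustration and are not part of nerfstudio. It applies the three quoted lines to an arbitrary pose and compares the result with A @ C2W @ B.

```python
import numpy as np

def colmap_lines(c2w):
    """Apply the three operations quoted from colmap_to_json() to a copy of c2w.

    The function name is hypothetical; only the three lines come from the thread.
    """
    c2w = c2w.copy()
    c2w[0:3, 1:3] *= -1                    # negate the camera y and z columns
    c2w = c2w[np.array([1, 0, 2, 3]), :]   # swap rows 0 and 1
    c2w[2, :] *= -1                        # negate row 2
    return c2w

A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, -1, 0],
              [0, 0, 0, 1]], dtype=float)
B = np.diag([1.0, -1.0, -1.0, 1.0])

# Arbitrary test pose: orthonormal 3x3 block from a QR factorization plus a
# random translation. The sign of its determinant does not matter here.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
c2w = np.eye(4)
c2w[:3, :3] = Q
c2w[:3, 3] = rng.normal(size=3)

matches = np.allclose(colmap_lines(c2w), A @ c2w @ B)
involutions = np.allclose(A @ A, np.eye(4)) and np.allclose(B @ B, np.eye(4))
print(matches, involutions)
```

The comparison relies on the last row of the pose being [0, 0, 0, 1], which is why negating only rows 0 to 2 of columns 1 and 2 agrees with the full right-multiplication by B.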

@wuzirui
Contributor

wuzirui commented May 29, 2023

It seems like the code first converts the OPENCV camera coordinates (right down forward, RDF) to OPENGL camera coordinates (right up backward, RUB), and then changes the world coordinates from RUB to (down right back, DRB), which can also be seen in the original nerf documentation.

But why exactly do we need to convert RUB to DRB?
This issue is also mentioned in #1504.

@nnop

nnop commented Feb 28, 2024

It seems this logic originates from instant-ngp (code location).

Could @Tom94 @mmalex explain this a bit?

@maybeLx

maybeLx commented Nov 28, 2024

It seems like the code first converts the OPENCV camera coordinates (right down forward, RDF) to OPENGL camera coordinates (right up backward, RUB), and then changes the world coordinates from RUB to (down right back, DRB), which can also be seen in the original nerf documentation.

But why exactly do we need to convert RUB to DRB? This issue is also mentioned in #1504.

Maybe Nerfstudio not only wants to use the OpenGL camera coordinate convention but also wants to rotate the whole world coordinate system. Multiplying from the right changes the camera coordinate frame; multiplying from the left changes the whole world coordinate system (it is just like rotating the whole point cloud). #2793
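
A small sketch of that left versus right distinction, assuming only NumPy and reusing the A and B matrices defined earlier in this thread; the example pose (identity rotation, camera at (1, 2, 3)) is made up for illustration.

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, -1, 0],
              [0, 0, 0, 1]], dtype=float)
B = np.diag([1.0, -1.0, -1.0, 1.0])

# Hypothetical pose: identity rotation, camera centre at world position (1, 2, 3).
c2w = np.eye(4)
c2w[:3, 3] = [1.0, 2.0, 3.0]

right = c2w @ B   # right-multiplication: changes the camera's own axes
left = A @ c2w    # left-multiplication: changes the world frame

# The camera centre stays put under right-multiplication ...
print(right[:3, 3])   # [1. 2. 3.]
# ... but moves with the world under left-multiplication.
print(left[:3, 3])    # [ 2.  1. -3.]

# The 3x3 block of A has determinant +1, so it is a proper rotation
# of the world rather than a mirror flip.
det_A = np.linalg.det(A[:3, :3])
print(det_A)
```

Since the determinant is +1, the left-multiplication is consistent with the "rotate the whole point cloud" reading: the scene is rotated as a rigid body rather than mirrored.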
