Predicted point of regard ~10x bigger on demo #20
Comments
Did you calibrate your camera to obtain its intrinsic parameters and, more importantly, the extrinsic parameters (rotation, translation) between your camera and your monitor, as described in step 2b of https://github.com/NVlabs/few_shot_gaze/blob/master/demo/README.md?
Thanks for replying!
How should I tweak monitor.py based on this? Can you give me an example? It's a lot of numbers and the documentation is not clear. FYI, I'm using a MacBook Pro webcam, if that makes things simpler.
I am not 100% sure, but I imagine that you need to update https://github.com/NVlabs/few_shot_gaze/blob/master/demo/monitor.py#L28 (and its inverse) based on these values that you determined:
I imagine that this is the translation from the screen to camera coordinate system in millimeters. So for example, you could probably define (yet again, I'm not 100% sure):
and a corresponding |
Not exactly. The code at few_shot_gaze/demo/frame_processor.py lines 178 and 218, and at lines 28 and 38 (all in commit 2b0ea42), assumes that the z axis of the camera and the z axis of the monitor are parallel and that there is no translation in the z direction, i.e. z=0. However, the R and T given by @rogeriochaves show that neither of these two assumptions holds. In order to correctly apply the calibration results, you need to
BTW, the R and T given by the calibration process actually describe the relationship between the chessboard pattern displayed on the monitor and the camera. This may not be equal to the relationship between the monitor and the camera. You need to find the relationship between the chessboard pattern and the monitor as well.
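The chaining of the two relationships mentioned here can be sketched with NumPy. This is only an illustration under stated assumptions, not code from the repository: the function name `compose_monitor_to_camera`, the argument names, and all numbers are hypothetical, and each transform is assumed to act as `X_target = R @ X_source + T` with T in millimeters.

```python
import numpy as np

def compose_monitor_to_camera(R_cb2cam, T_cb2cam, R_cb2mon, T_cb2mon):
    """Return (R, T) such that X_cam = R @ X_mon + T.

    Assumed inputs (hypothetical names):
      X_cam = R_cb2cam @ X_cb + T_cb2cam   (from the camera calibration)
      X_mon = R_cb2mon @ X_cb + T_cb2mon   (pattern placement on the monitor)
    """
    R_cb2cam = np.asarray(R_cb2cam, dtype=float)
    R_cb2mon = np.asarray(R_cb2mon, dtype=float)
    T_cb2cam = np.asarray(T_cb2cam, dtype=float)
    T_cb2mon = np.asarray(T_cb2mon, dtype=float)
    # Invert pattern->monitor (the inverse of a rotation is its transpose),
    # then apply pattern->camera.
    R = R_cb2cam @ R_cb2mon.T
    T = T_cb2cam - R @ T_cb2mon
    return R, T

# Made-up example: the pattern is axis-aligned with the monitor and its
# origin sits 50 mm right of and 30 mm below the monitor origin.
R_cb2mon = np.eye(3)
T_cb2mon = np.array([50.0, 30.0, 0.0])
# Made-up calibration result: a 10 degree rotation about the x axis.
a = np.deg2rad(10.0)
R_cb2cam = np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a), np.cos(a)]])
T_cb2cam = np.array([-100.0, 40.0, 5.0])

R, T = compose_monitor_to_camera(R_cb2cam, T_cb2cam, R_cb2mon, T_cb2mon)
```

A quick sanity check is that the pattern origin, expressed in monitor coordinates as `T_cb2mon`, maps to `T_cb2cam` in camera coordinates.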
So is the tnm monitor calibration needed for a default laptop webcam, where the assumptions of z=0 and Δy = 10 mm fit? I've got the model to run, but I'm wondering if there's some way to improve accuracy further through calibration.
Every laptop hardware configuration is different, but the assumption of z=0 should be OK to use. You do need to at least measure Δy and Δx with a ruler if you really don't want to do the calibration. (Δy = the distance between the camera and the upper edge of the monitor; Δx = the distance between the camera and the left edge of the monitor, usually equal to monitor width / 2.) However, a good calibration won't help improve accuracy in this case. I believe the accuracy is limited by the image resolution. I did an experiment, and it turned out that you can hardly recognize the eye movement between images taken for two target points that are less than 2 cm apart on the screen. Increasing the image resolution might be a solution, but this would also increase the complexity of the neural network, and you would need to build a high-resolution training dataset as well. So I think this still remains an open problem.
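As a rough illustration of how ruler-measured Δx and Δy could be used with z=0, here is a minimal sketch. It is not the code in monitor.py: the function name, parameter names, and screen dimensions are hypothetical, and it assumes the camera's x axis points toward the left edge of the screen as seen by the user, with y pointing down in both frames.

```python
def camera_mm_to_screen_px(x_cam_mm, y_cam_mm, dx_mm, dy_mm,
                           w_mm, h_mm, w_px, h_px):
    """Map a point on the z=0 plane from camera mm to screen pixels.

    Assumptions (not taken from the repository):
      - camera plane and screen plane coincide (z = 0), axes are parallel
      - camera x points toward the screen's left edge (user's view)
      - y points down in both frames
      - dx_mm: camera to left screen edge, dy_mm: camera to top screen edge
    """
    # Shift the origin from the camera to the screen's top-left corner.
    x_mon_mm = dx_mm - x_cam_mm
    y_mon_mm = y_cam_mm - dy_mm
    # Convert millimeters to pixels using the physical screen size.
    x_px = x_mon_mm / w_mm * w_px
    y_px = y_mon_mm / h_mm * h_px
    return x_px, y_px

# Made-up screen: 300 mm x 190 mm at 1440 x 900 pixels, camera centered
# horizontally (dx = 150 mm) and 10 mm above the top edge (dy = 10 mm).
top_center = camera_mm_to_screen_px(0.0, 10.0, 150.0, 10.0,
                                    300.0, 190.0, 1440, 900)
```

With these made-up numbers, a point directly below the camera at the top edge of the screen lands at the horizontal center of the first pixel row. If your camera's axes are oriented differently, the signs in the two shift lines would need to change.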
Hello there!
For some reason the predicted PoR is way off screen. To try to debug it, on an already trained network I ran the person calibration again, then saved the gaze_n_vector variable used during training and the g_cnn variable used during prediction in frame_processor.py. If I plot them separately I get this:
Leaving the clear error aside, if I plot them together I get this:
Now if I fit a linear regression I get a coefficient of almost exactly 0.1 for both.
By applying those I get a prediction that makes more sense.
Why is that? Is some part of the calculation missing during prediction in frame_processor.py? Why is the PoR always 10x bigger?
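The scale check described in this post can be reproduced along the following lines. This is a sketch on synthetic data, not the author's script or data: the arrays `truth` and `pred` merely stand in for the saved training-time and prediction-time values, and the 10x factor is built in by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: two columns, one per component of the saved values.
truth = rng.uniform(-0.3, 0.3, size=(200, 2))
# Predictions that are 10x too large by construction, plus a little noise.
pred = 10.0 * truth + rng.normal(0.0, 0.01, size=truth.shape)

def fit_scale(pred_col, truth_col):
    """Least-squares line truth = slope * pred + intercept."""
    slope, intercept = np.polyfit(pred_col, truth_col, deg=1)
    return slope, intercept

# One fit per component; both slopes come out close to 0.1 here.
fits = [fit_scale(pred[:, i], truth[:, i]) for i in range(truth.shape[1])]
slopes = [s for s, _ in fits]

# Applying the fitted line brings the predictions back to the truth's scale.
corrected = np.column_stack(
    [s * pred[:, i] + b for i, (s, b) in enumerate(fits)]
)
```

A fitted slope near 0.1 with a near-zero intercept indicates a pure scale mismatch between the two quantities. It does not by itself say where in the pipeline the factor is introduced.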