
How to generate 20-dimensional landmark through Obama video frames? #7

Open
kkkmax opened this issue Sep 3, 2018 · 5 comments
kkkmax commented Sep 3, 2018

Hello, I have recently been studying the code of your Synthesizing Obama network. After processing a 29.97 fps video, the sequence ends up split into 2, 3, or 4 dump files. I have two questions about this:
(1) For a single Obama video, if the parts at the beginning and end that do not contain the speaker are removed, why is the remainder still divided into multiple dumps in the middle? What is the reason for doing this?
(2) For the mouth feature, the paper detects mouth landmarks: 18 points along the outer and inner contours of the lips. Each 18-point shape is reshaped into a 36-D vector, PCA is applied over all frames, and each mouth shape is represented by its first 20 PCA coefficients. However, this part is missing from the released code — would it be possible to publish it? Right now it is a black box to us.

I look forward to your answers; thank you very much!
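For reference, the mouth-PCA representation described in question (2) can be sketched in a few lines. This is an illustration assuming scikit-learn and placeholder landmark data, not the repository's actual code:

```python
import numpy as np
from sklearn.decomposition import PCA

# Suppose `mouth_landmarks` holds one 18-point mouth shape per frame,
# i.e. an array of shape (num_frames, 18, 2) with (x, y) coordinates.
# Real data would come from a landmark detector; random values stand in here.
rng = np.random.default_rng(0)
mouth_landmarks = rng.random((1000, 18, 2))

# Reshape each 18-point shape into a 36-D vector.
X = mouth_landmarks.reshape(len(mouth_landmarks), 36)

# Fit PCA over all frames and keep the first 20 coefficients per frame.
pca = PCA(n_components=20)
coeffs = pca.fit_transform(X)  # shape: (num_frames, 20)

# A mouth shape can be approximately recovered from its 20 coefficients.
reconstructed = pca.inverse_transform(coeffs).reshape(-1, 18, 2)
```

Each frame's mouth is thus compressed from 36 numbers to 20 PCA coefficients, which is the 20-dimensional representation the paper refers to.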

@supasorn
Owner

1) I'm not sure what you mean by dumps and "removed video", but a single video is sometimes split into multiple sections because the view changes (e.g., a camera zoom).
2) I just recently moved, and my code is sitting on a hard drive without a desktop. I'll try to publish it, but it could take a while. (The entire pipeline could be up to 10K lines of code, and I might not have the bandwidth.)
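Splitting a video at view changes such as camera zooms is commonly done with a simple shot-boundary detector. A minimal sketch of one such approach (an assumed technique, not the owner's confirmed method) thresholds the histogram difference between consecutive frames:

```python
import numpy as np

def shot_boundaries(frames, bins=32, thresh=0.4):
    """frames: iterable of grayscale images as 2-D uint8 arrays.
    Returns frame indices where a new shot is assumed to begin."""
    boundaries, prev_hist = [], None
    for i, frame in enumerate(frames):
        # Normalized intensity histogram of this frame.
        hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
        hist = hist / hist.sum()
        if prev_hist is not None:
            # L1 distance between consecutive histograms lies in [0, 2];
            # a large jump suggests a cut or an abrupt zoom.
            if np.abs(hist - prev_hist).sum() > thresh:
                boundaries.append(i)
        prev_hist = hist
    return boundaries
```

Splitting the video at these boundaries would naturally produce several separate sections (dumps) per video.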

@kkkmax
Author

kkkmax commented Sep 13, 2018

Thank you very much for your reply. Your answer roughly addresses my first question. What I am concerned with is how a video is cut into multiple intervals: the starting point and the length of each interval. Besides camera zoom, does the selection criterion also include whether a face is detected (for example, the beginning and end of a video where no face appears)? If my guess is correct, what techniques do you use here — face alignment, or some of the detection methods in OpenFace? My main difficulty right now is choosing reasonable starting points and interval lengths. Do you have any suggestions?

Also, if you are able to upload your code, I will be very much looking forward to it! Thank you again for your reply!
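One plausible recipe for the interval-selection question above (a guess, not confirmed by the owner): run a per-frame face detector (e.g., dlib's frontal face detector or OpenCV's Haar cascade) to get a boolean "face present" flag per frame, then group consecutive detections into intervals with a minimum length:

```python
def runs_to_intervals(has_face, min_len=30):
    """Group a per-frame boolean 'face present' sequence into
    (start, end) frame intervals at least min_len frames long."""
    intervals, start = [], None
    for i, present in enumerate(has_face):
        if present and start is None:
            start = i                         # a run of face frames begins
        elif not present and start is not None:
            if i - start >= min_len:          # keep only long-enough runs
                intervals.append((start, i))
            start = None
    # Close a run that extends to the end of the video.
    if start is not None and len(has_face) - start >= min_len:
        intervals.append((start, len(has_face)))
    return intervals
```

Starting a new interval at each detected view change as well would explain both the trimmed face-less ends and the multiple dumps per video.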

@tonmoyborah

@supasorn Will you be able to upload the full pipeline code in the near future? It'd be of immense help. Thank you for this awesome project anyway!

@wanshun123

Definitely would be appreciated if the complete code could be uploaded.

@primejava

I think he will upload it after a thousand years.
