Question not an issue #2
Hi, this looks amazing. I have a quick question: given that we have the encoder transformer and the predictor transformer, how would you go about the k-NN test? Say we have a 224x224 image; if you use patches, the context encoder (no masking) would output many, many embeddings for one image. Thank you!
Comments
Hello, thanks! I (mean) pool all patch embeddings into a single embedding. This is pretty standard for linear probes, but is somewhat atypical for k-NN tests (which usually use a [CLS] token, e.g. DINO and iBOT); I did not use a [CLS] token because the JEPA paper didn't use one.
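As a concrete illustration of this pooling approach (a minimal sketch, not the repository's actual evaluation code): with the common 16x16 patch size, a 224x224 image yields (224/16)^2 = 196 patch embeddings, which are averaged into one feature vector per image before the k-NN lookup. The `encoder`, `embed`, and `knn_predict` names below are illustrative.

```python
# Minimal sketch of k-NN evaluation over mean-pooled patch embeddings.
# Assumes `encoder` is the unmasked context encoder and returns a
# (batch, num_patches, dim) tensor; all names here are illustrative.
import torch
import torch.nn.functional as F

@torch.no_grad()
def embed(encoder, images):
    patches = encoder(images)               # (B, 196, D) for 224x224 images, 16x16 patches
    pooled = patches.mean(dim=1)            # (B, D): one embedding per image
    return F.normalize(pooled, dim=-1)      # unit-norm, so dot product = cosine similarity

@torch.no_grad()
def knn_predict(train_feats, train_labels, test_feats, k=20):
    sims = test_feats @ train_feats.T       # (B_test, N_train) cosine similarities
    _, idx = sims.topk(k, dim=1)            # indices of the k nearest training images
    neighbour_labels = train_labels[idx]    # (B_test, k)
    preds, _ = neighbour_labels.mode(dim=1) # majority vote among the k neighbours
    return preds
```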
Hi, thank you. Yes, pooling is the clear path, but I was wondering whether I'd missed something and wanted to check. Just a thought: what if the predictor also output a CLS token, and it's compared to the CLS token of the target (or context) encoder, as an additional loss term on top of the patch-level one? Also, I wanted to ask what your affiliation is with the main Yann LeCun paper; are you one of the authors?
That's an interesting idea. I can look into implementing a CLS token, but it might be tough given how I have structured everything so far. I am not one of the original authors; I just implemented the paper out of curiosity and interest! If you are curious about the official implementation, they actually open-sourced their model and code the other day.
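For what it's worth, here is a hypothetical sketch of the auxiliary CLS loss suggested above. It assumes both the encoder and predictor were extended to carry a [CLS] token, which neither the paper nor this implementation does; the function name, the smooth-L1 choice, and the 0.1 weighting are all assumptions.

```python
# Hypothetical sketch of the suggested auxiliary CLS loss; assumes the
# encoder and predictor both prepend a [CLS] token, which neither the
# paper nor this implementation actually does.
import torch.nn.functional as F

def jepa_loss_with_cls(pred_patches, target_patches, pred_cls, target_cls,
                       cls_weight=0.1):
    # Original patch-level term: predictor output vs. target-encoder patches.
    patch_loss = F.smooth_l1_loss(pred_patches, target_patches.detach())
    # Extra term: predicted [CLS] vs. the target encoder's [CLS] token.
    cls_loss = F.smooth_l1_loss(pred_cls, target_cls.detach())
    return patch_loss + cls_weight * cls_loss
```

The `.detach()` calls reflect the usual stop-gradient on the EMA target branch, so only the context encoder and predictor receive gradients.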