the problem of download training data #2

feibin95 · 2019-06-26T08:59:47Z

First, download the training data from the website (http://www.msceleb.org/download/lowshot)

The dataset has been deleted from the official website. Could you please provide a download link? Thank you very much!

wuyuebupt · 2019-06-27T02:01:11Z

@feifei9099
I am afraid that I cannot provide a download link for such a large dataset. You may find some alternative download links here:
https://github.com/deepinsight/insightface/wiki/Dataset-Zoo
https://ibug.doc.ic.ac.uk/resources/lightweight-face-recognition-challenge-workshop/

I also would like to share some news about the dataset:
https://www.vice.com/en_us/article/a3x4mp/microsoft-deleted-a-facial-recognition-database-but-its-not-dead
https://megapixels.cc/datasets/msceleb/

Hope this helps.

feibin95 · 2019-06-27T03:48:55Z

Thank you very much for the news you provided. I found a download link here:
https://academictorrents.com/details/9e67eb7cc23c9417f39778a8e06cca5e26196a97/tech

But I still have a question about studying the low-shot part of MS-Celeb-1M, with reference to your paper “Low-shot Face Recognition with Hybrid Classifiers, ICCV Workshop, 2017”. Should I download the full dataset, and does the full dataset contain the low-shot part?
In your paper, you describe the low-shot part as follows: “Base Set consists of 20,000 people, with an average of 58 training samples per person. Novel Set has the rest 1,000 people, of which each comes with 1, 2 or 5 training images.”
Thank you very much!

wuyuebupt · 2019-06-28T23:32:33Z

@feifei9099

The data split is provided by the challenge organizer. The first paper that states the setting should be "One-shot Face Recognition by Promoting Underrepresented Classes" (https://arxiv.org/pdf/1707.05574.pdf).

The training data are a part of the full dataset. I am not sure if you can split it out directly from the whole data as the low shot data is a cleaned version.

I found a list of the training ids. 0-19999 is the base set, 20000-20999 is the novel set. The link is at:
https://drive.google.com/file/d/14n5f6ZfmxP20j3iDGRiDSKGVybs2Sy8y/view?usp=sharing
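For readers who want to apply the id ranges above, here is a minimal sketch in Python. It assumes you already have `(image_name, class_id)` pairs in memory; the format of the linked id list is not described in this thread, so parsing it is left out. The names `split_of` and `partition` are hypothetical helpers, not part of any released code.

```python
# Id ranges as stated in the comment above.
BASE_IDS = range(0, 20000)       # 0-19999: base set
NOVEL_IDS = range(20000, 21000)  # 20000-20999: novel set


def split_of(class_id: int) -> str:
    """Return "base" or "novel" for a training class id."""
    if class_id in BASE_IDS:
        return "base"
    if class_id in NOVEL_IDS:
        return "novel"
    raise ValueError(f"class id {class_id} is outside 0-20999")


def partition(samples):
    """Group image names by split.

    `samples` is an iterable of (image_name, class_id) pairs
    (an assumed in-memory layout, not the original file format).
    """
    groups = {"base": [], "novel": []}
    for image_name, class_id in samples:
        groups[split_of(class_id)].append(image_name)
    return groups
```

For example, `partition([("a.jpg", 5), ("b.jpg", 20500)])` places `a.jpg` in the base set and `b.jpg` in the novel set.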

I cannot find the training data in its original TSV format. I do have a copy of the image data, which is about 24 GB. Even if I found the original training file, it would still be too large for me to upload.

I did find two parts of the data:

Training data for the novel set: https://drive.google.com/file/d/18g4Cn7uSWxLM1IHxVMHbC-eI60juuXDn/view?usp=sharing
Validation set, which contains 25,000 images: https://drive.google.com/file/d/1R0yky3CT6Uuvu6z2KQxggxsRV9XrrLRs/view?usp=sharing

A list of all training images for the base set: https://drive.google.com/file/d/1by9zWY2xcocYdne8_sGKILnCARXLJsB7/view?usp=sharing

If the number of training images in your copy matches the list, they should be the same data.
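The count check suggested above could be sketched roughly as follows. This assumes each list holds one image name per line, which is a guess about the file layout rather than something stated in this thread; `load_names` and `compare_lists` are hypothetical helper names.

```python
from typing import Dict, Iterable, Set


def load_names(lines: Iterable[str]) -> Set[str]:
    """Collect non-empty, whitespace-stripped image names."""
    return {line.strip() for line in lines if line.strip()}


def compare_lists(reference: Iterable[str], candidate: Iterable[str]) -> Dict[str, object]:
    """Compare a reference image list against a locally extracted one.

    Assumes one image name per line (an assumption, not a documented format).
    """
    ref = load_names(reference)
    cand = load_names(candidate)
    return {
        "reference_count": len(ref),
        "candidate_count": len(cand),
        "missing": sorted(ref - cand),  # in the reference list but not local
        "extra": sorted(cand - ref),    # local but not in the reference list
    }
```

Both arguments can be open file objects, for example `compare_lists(open("base_list.txt"), open("my_images.txt"))`, where the two file names are placeholders. Matching counts with empty `missing` and `extra` lists would suggest the two sets are the same.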

Hope these help.

feibin95 · 2019-06-29T01:28:03Z

Thank you very much for your help. It means a lot to me.

wtongping · 2020-02-29T04:09:20Z

@wuyuebupt
Thank you for providing these links. I have been able to sort out the training data, but after searching online for a long time, I still can't find the test data. Could you please share the list (image names) of the test data for the base and novel sets?

wuyuebupt · 2020-03-02T19:34:33Z

@wtongping

I did not find the test data. Even if I had found it, the labels would not have been available, because the challenge evaluation was run by the organizers, if I remember correctly.

wtongping · 2020-03-03T01:53:03Z

@wuyuebupt OK, thank you!
