How can I download only some needed categories? #68

WuXinyang2012 · 2018-04-17T08:12:21Z

Hi, I only want to use the images of the fruit and vegetable categories. I don't need the huge full dataset. Can you please give me some instructions?

arya-coding · 2018-04-19T14:57:22Z

Hi,

I'm also interested in this request. There is this issue, but I don't really understand it:
#61

If you understand how to proceed, can you explain it?

Thanks !

shahabkam · 2018-05-03T16:57:43Z

We currently provide annotations for the whole dataset. You can parse our CSV files and obtain the image keys only for labels of interest for your application. Then you can download images only for those keys.
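The approach described above could be sketched roughly as follows. This is only an illustration, assuming the published Open Images layout: class-descriptions-boxable.csv has no header and holds `label_id,display_name` rows, and the bounding-box annotation CSV has `ImageID` and `LabelName` columns. The function name `image_ids_for_classes` and the sample rows (including the label IDs) are made up for the example.

```python
import csv
import io


def image_ids_for_classes(class_csv, bbox_csv, wanted_names):
    """Return the set of ImageIDs that have at least one box of a wanted class."""
    wanted = set(wanted_names)
    # Map display names (e.g. "Apple") to their label IDs.
    label_ids = {
        row[0]
        for row in csv.reader(class_csv)
        if len(row) >= 2 and row[1] in wanted
    }
    # Collect the image keys whose LabelName is one of the wanted label IDs.
    return {
        row["ImageID"]
        for row in csv.DictReader(bbox_csv)
        if row["LabelName"] in label_ids
    }


# Tiny in-memory stand-ins for the real files (all IDs here are invented).
classes = io.StringIO(
    "/m/fake_apple,Apple\n/m/fake_car,Car\n/m/fake_carrot,Carrot\n"
)
boxes = io.StringIO(
    "ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax\n"
    "img001,xclick,/m/fake_apple,1,0.1,0.5,0.2,0.6\n"
    "img002,xclick,/m/fake_car,1,0.0,0.9,0.1,0.8\n"
    "img003,xclick,/m/fake_carrot,1,0.3,0.4,0.3,0.7\n"
)
fruit_and_veg_ids = image_ids_for_classes(classes, boxes, ["Apple", "Carrot"])
print(sorted(fruit_and_veg_ids))
```

With the real files, you would pass open file handles instead of the `io.StringIO` objects and then download only the images whose IDs are in the returned set.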

YaroslavSchubert · 2018-05-04T09:06:25Z

@aryus96 I composed a file with the classes I need, similar to classes.txt, and used the code from here to download the images: blog.algorithmia.com/deep-dive-into-object-detection-with-open-images-using-tensorflow.

keldrom · 2018-07-28T10:46:43Z

https://github.com/EscVM/OIDv4_ToolKit

harshilpatel312 · 2018-08-02T02:14:17Z

To download specific classes, use this

jillelajitta · 2018-09-13T22:12:33Z

@harshilpatel312 your script is not downloading all the images. Could you please help me solve this?

Thanks

keldrom · 2018-09-13T22:27:06Z

@jillelajitta use our OIDv4 Toolkit:
https://github.com/EscVM/OIDv4_ToolKit
We have also been mentioned by OIDv4 for our work. If you have problems, open an issue :)

jillelajitta · 2018-09-14T02:06:52Z

Thanks @keldrom, your script is working well, but I don't want to download labels. Could you please tell me where I need to modify the script?

keldrom · 2018-09-14T06:04:24Z

@jillelajitta the labels are generated after the images are downloaded, which takes a few seconds. But this is great feedback, so I will add this option in the next update.

keldrom · 2018-09-14T12:57:08Z

@jillelajitta now you can skip the creation of the labels with the --noLabels option :)
OIDv4Toolkit

jillelajitta · 2018-09-17T19:02:10Z

@keldrom Awesome, it's working. Thank you so much :) :)

jillelajitta · 2018-09-20T18:20:59Z

Hi @keldrom, I downloaded the Open Images train-annotations-bbox.csv and parsed it for each class, and I found that it doesn't have annotations for all the images. But when I download labels with your script, I get annotations for all the images. Do the CSV files have annotations for all the images? Does your script have an option to download only labels?
If I download them, can I get the annotations in this format: ['filename','width','height','class','xmin','ymin','xmax','ymax']?
Thanks

keldrom · 2018-09-21T07:45:41Z

@jillelajitta when you need help, please create an issue on our repository.
By the way:

> Do the CSV files have annotations for all the images?

The labels for all the images are in the associated CSV file. It is a gift from Google, so I think it has everything we need.

> Does your script have an option to download only labels?

Think about this scenario: you have already downloaded some images. If you reuse the same command, the script checks for the presence of the images and then creates the labels. If you want a script that creates only the labels, you could use a modified version of the get_label function in downloader.py.

> If I download them, can I get the annotations in this format: ['filename','width','height','class','xmin','ymin','xmax','ymax']?

Our scripts write .txt files with a specific structure, because each application needs its own sequence of information and we cannot produce an output for everyone. This option is on the list for a future update, but I cannot promise when that release will be out, I'm sorry.
If you need it, you can write a simple script starting from our get_label that produces the labels as you wish.
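For the requested ['filename','width','height','class','xmin','ymin','xmax','ymax'] layout, note that the Open Images box CSVs store XMin, XMax, YMin and YMax normalized to the range [0, 1], so the image size is needed to get pixel values. A minimal sketch of that conversion, assuming the width and height have already been read from the image file; `to_pixel_row` is a hypothetical helper, not part of the toolkit:

```python
def to_pixel_row(filename, width, height, class_name, xmin, xmax, ymin, ymax):
    """Convert one normalized box into
    [filename, width, height, class, xmin, ymin, xmax, ymax] in pixels."""
    return [
        filename,
        width,
        height,
        class_name,
        round(xmin * width),   # left edge in pixels
        round(ymin * height),  # top edge in pixels
        round(xmax * width),   # right edge in pixels
        round(ymax * height),  # bottom edge in pixels
    ]


# Sample values are invented; note the CSV column order is XMin, XMax, YMin, YMax.
row = to_pixel_row("img001.jpg", 1000, 500, "Axe", 0.1, 0.5, 0.2, 0.6)
print(row)
```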

jillelajitta · 2018-09-21T16:44:50Z

@keldrom
Below is the script I used to parse the bounding boxes for specific objects. It is not parsing all the bounding boxes.
For example, for the object Axe, I was able to download 115 annotations for 115 images with your script. With my script, I was able to parse annotations for only a few images. Please correct me if I'm going wrong with my script.

import csv

import pandas as pd

# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0
df2 = pd.read_csv('./train-annotations-bbox.csv', quoting=csv.QUOTE_NONE, on_bad_lines='skip')
df1 = pd.read_csv('class-descriptions-boxable.csv')

objlis=['Toy', 'Beer', 'Chopsticks', 'Towel', 'Glove', 'Sunglasses', 'Ball', 'Backpack', 'Headphones', 'Fast food', 'Screwdriver', 'Laptop', 'Person', 'Wrench', 'Flashlight', 'Scissors', 'Suitcase', 'Snack', 'Medical equipment', 'Cat', 'Computer mouse', 'Coin', 'Calculator', 'Box', 'Stapler', 'Drink', 'Ratchet', 'Hat', 'Eraser', 'Tin can', 'Mug', 'Can opener', 'Goggles', 'Coffee cup', 'Paper towel', 'Flying disc', 'Face powder', 'Fruit', 'Pillow', 'Hammer', 'Drinking straw', 'Hair dryer', 'Alarm clock', 'Knife', 'Bottle', 'Bottle opener', 'Dumbbell', 'Bowl', 'Musical instrument', 'Ring binder', 'Plate', 'Mobile phone', 'Crutch', 'Pencil case', 'Briefcase', 'Plastic bag', 'Sports equipment', 'Lipstick', 'High heels', 'Shotgun', 'Picture frame', 'Tripod', 'Picnic basket', 'Handbag', 'Toilet paper', 'Footwear', 'Tablet computer', 'Dog', 'Book', 'Axe', 'Flower', 'Spoon', 'Fork', 'Camera', 'Vegetable', 'Diaper', 'Envelope', 'Watch', 'Handgun', 'Facial tissue holder', 'Ruler', 'Luggage and bags', 'Umbrella', 'Glasses', 'Pen', 'Binoculars', 'Perfume', 'Remote control', 'Helmet']

# Columns 'Class_id' and 'class' were created in 'class-descriptions-boxable.csv'
df3 = pd.merge(df1, df2, on='Class_id')
print(df3)
for name in objlis:
    # Keep only the rows for this class and write them to <name>.xlsx
    mod = df3[df3['class'] == name]
    mod.to_excel(name + '.xlsx')

Please try this script on your computer once.

Thanks.

keldrom · 2018-09-21T16:49:33Z

@jillelajitta open an issue on our repository

jillelajitta · 2018-09-21T16:54:13Z

@keldrom Sure Thanks.

fracarfer5 · 2018-11-12T11:37:32Z

Hi, I've got this error when I try to download the images:

python3 main.py downloader --classes Car --type_csv train --limit 100

[INFO] Downloading Car.
----------Car----------
[INFO] Downloading train images.
[INFO] Found 89465 online images for train.
[INFO] Limiting to 100 images.
[INFO] Download of 100 images in train.
sh: 1: sh: 1: sh: 1: aws: not foundaws: not found

aws: not found
sh: 1: aws: not found
sh: 1: aws: not found
sh: 1: aws: not found
sh: 1: aws: not found
sh: 1: aws: not found
sh: 1: aws: not found
0%| | 0/100 [00:00<?, ?it/s]sh: 1: aws: not found
sh: 1: aws: not found
100%|████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 2328.20it/s]
[INFO] Done!
[INFO] Creating labels for Car of train.
[INFO] Labels creation completed.

narayangour · 2018-12-27T05:29:46Z

Install the "awscli" package and you will not see this issue.
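The repeated `sh: 1: aws: not found` lines in the log above mean the shell could not find an `aws` executable. A small check like the following can confirm that before running the downloader; `aws_cli_available` is a hypothetical helper written for this example, and installing via `pip install awscli` is one common option rather than the only one:

```python
import shutil


def aws_cli_available():
    """Return True if an `aws` executable can be found on the PATH."""
    return shutil.which("aws") is not None


if not aws_cli_available():
    print("aws CLI not found; install it first, e.g. with: pip install awscli")
```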

ibrhmyzc · 2019-03-31T14:07:58Z

python main.py downloader --classes Knife --type_csv train
When I run it with the command above, it only downloads just over 100 images of knives. On this website I see there are a lot more than that. How can I increase it? Thanks

narayangour · 2019-04-01T06:24:51Z

Does it download only 100 images every time, or does the behavior differ?
If it downloads 100 images every time, note that there is a flag called "args.limit". When you run your command, add the "limit" flag and see what happens.
For example, this command downloads 500 images:
python main.py downloader --classes Knife --type_csv train --limit 500

Try this and share the result if it resolves your issue.

ManuelSR · 2019-07-18T15:50:38Z

Hello,

I have a problem with the toolKit:
OS Windows 10
Python 3.7

[INFO] | Downloading Orange.
Traceback (most recent call last):
File "..\TestToolKit\OIDv4_ToolKit\modules\downloader.py", line 25, in download
columns, rows = os.get_terminal_size(0)
OSError: [WinError 6] Controlador no válido (The handle is invalid)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 37, in <module>
bounding_boxes_images(args, DEFAULT_OID_DIR)
File "..\OIDv4_ToolKit\modules\bounding_boxes.py", line 62, in bounding_boxes_images
download(args, df_val, folder[0], dataset_dir, class_name, class_code)
File "..\OIDv4_ToolKit\modules\downloader.py", line 27, in download
columns, rows = os.get_terminal_size(1)
OSError: [WinError 6] Controlador no válido (The handle is invalid)

WANGSHUAISWU · 2019-07-20T02:38:11Z

Hello,

I have this problem when running the Toolkit on Windows 10 with Python 3.7. Can anyone give some advice?
[INFO] | Downloading Person.
Traceback (most recent call last):
File "D:\Dataset\OIDv4_ToolKit-master\modules\downloader.py", line 25, in download
columns, rows = os.get_terminal_size(0)
OSError: [WinError 6] The handle is invalid。

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:/Dataset/OIDv4_ToolKit-master/main.py", line 37, in <module>
bounding_boxes_images(args, DEFAULT_OID_DIR)
File "D:\Dataset\OIDv4_ToolKit-master\modules\bounding_boxes.py", line 70, in bounding_boxes_images
download(args, df_val, folder[1], dataset_dir, class_name, class_code)
File "D:\Dataset\OIDv4_ToolKit-master\modules\downloader.py", line 27, in download
columns, rows = os.get_terminal_size(1)
OSError: [WinError 6] The handle is invalid。

Process finished with exit code 1

keldrom · 2019-07-23T08:04:55Z

@WANGSHUAISWU next time, please open an issue on our repo.
Anyway, I've just used the toolkit on Windows 10 without getting that error. That function is used to compute the terminal width; it is a standard function of the os module.
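One possible explanation, not confirmed in this thread: `os.get_terminal_size` raises `OSError` when the file descriptor is not attached to a real console, for example when the script runs inside an IDE's run window or with redirected output. A minimal sketch of a width lookup that tolerates this case, using `shutil.get_terminal_size` with a fallback; the helper name `terminal_columns` and the default of 80 are assumptions for illustration, and this is not the toolkit's own code:

```python
import shutil


def terminal_columns(default=80):
    """Return the terminal width, or `default` when no terminal is attached."""
    # shutil.get_terminal_size returns the fallback instead of raising
    # when the terminal size cannot be queried.
    return shutil.get_terminal_size(fallback=(default, 24)).columns


print(terminal_columns())
```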

dan-r95 · 2019-08-07T15:23:16Z

Is it possible to download images from the extended open images dataset via this method as well?

WANGSHUAISWU · 2019-08-18T13:08:13Z

@keldrom , I have downloaded the images I need from the dataset. Many thanks!

sliawatimena · 2019-11-18T14:36:36Z

Example of how to download human mouth, human hand, etc.:

$ python3 main.py downloader --classes Human_mouth --type_csv all
$ python3 main.py downloader --classes Human_hand --type_csv all

DeepleMass · 2020-06-10T12:08:13Z

I have written an article on Medium about this, concerning the class "doors". Maybe it helps. At least I downloaded around 13k annotated images to train an SSD model on doors.

thurussian · 2020-09-21T21:15:55Z

The script worked a few times with no problems and is now suddenly giving an error at line 56 in bounding_boxes.py. I'm using Visual Studio Code on macOS with Python 3.6.10.

It looks like a few others had this issue, but I don't see a resolution.

class_code = df_classes.loc[df_classes[1] == class_name].values[0][0]

[INFO] | Downloading skunk.

Traceback (most recent call last):
File "main.py", line 37, in <module>
bounding_boxes_images(args, DEFAULT_OID_DIR)
File "/Users/MyLap/CLONED/OIDv4_ToolKit/modules/bounding_boxes.py", line 56, in bounding_boxes_images
class_code = df_classes.loc[df_classes[1] == class_name].values[0][0]
IndexError: index 0 is out of bounds for axis 0 with size 0
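The `IndexError ... with size 0` in the traceback above indicates that the filter `df_classes[1] == class_name` matched no rows, so there was nothing to index. One likely cause, stated here as an assumption, is that the name passed on the command line does not exactly match an entry in the class list (for example `skunk` versus a capitalized `Skunk`). The sketch below shows a lookup that ignores case and fails with a clearer message; `find_class_code` and the sample label IDs are invented for illustration and are not the toolkit's code:

```python
def find_class_code(class_rows, class_name):
    """Look up a label ID by display name, ignoring case.

    Raises ValueError with a readable message when the name is absent.
    """
    matches = [
        code for code, name in class_rows if name.lower() == class_name.lower()
    ]
    if not matches:
        raise ValueError(f"class {class_name!r} not found in the class list")
    return matches[0]


# Invented sample rows in (label_id, display_name) form.
rows = [("/m/fake_skunk", "Skunk"), ("/m/fake_car", "Car")]
print(find_class_code(rows, "skunk"))
```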

RutviNathvani · 2021-05-12T01:49:04Z

How can I rename a label? For example, I want to rename Motorcycle to Twowheeler.
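One way to approach this is to rewrite the class name in the label files after they are created. The sketch below assumes each line of a label .txt file has the class name followed by four numeric coordinates; that layout is an assumption and should be checked against your own files first. `rename_label_line` is a hypothetical helper, not a toolkit function:

```python
def rename_label_line(line, old_name, new_name):
    """Replace the class name at the start of one label line.

    Assumes the last four whitespace-separated tokens are coordinates
    and everything before them is the class name.
    """
    parts = line.split()
    if len(parts) < 5:
        return line  # not a label line in the assumed layout; leave it alone
    name = " ".join(parts[:-4])
    if name != old_name:
        return line
    return " ".join([new_name] + parts[-4:])


renamed = rename_label_line(
    "Motorcycle 10.0 20.0 110.0 220.0", "Motorcycle", "Twowheeler"
)
print(renamed)
```

Applying it to a whole file would mean reading the lines, mapping each one through the function, and writing the result back.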

WuXinyang2012 closed this as completed Jun 8, 2018
