Autolabelling: Set up a data collection pipeline (e.g. what happens when new data comes in?) #64
Comments
Potential autolabelling pipeline:
Could use the pipeline above with multiple variants of CLIP-style models for redundancy.
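One way the redundancy idea could look, as a minimal sketch: each CLIP-style variant proposes a label for an image, and the label is only accepted when enough variants agree. The function name `consensus_label`, the `min_agreement` threshold, and the model names in the usage note are all hypothetical and not from this issue.

```python
from collections import Counter
from typing import Dict, Optional


def consensus_label(
    predictions: Dict[str, str], min_agreement: float = 0.66
) -> Optional[str]:
    """Return the majority label if enough models agree, otherwise None.

    `predictions` maps a (hypothetical) model name to the label it predicted.
    `min_agreement` is an assumed threshold: the fraction of models that must
    vote for the winning label before it is accepted automatically.
    """
    if not predictions:
        return None
    # Count votes per label and take the most common one.
    label, votes = Counter(predictions.values()).most_common(1)[0]
    if votes / len(predictions) >= min_agreement:
        return label
    # Not enough agreement: leave the image for manual review.
    return None
```

For example, `consensus_label({"clip_a": "pizza", "clip_b": "pizza", "clip_c": "steak"})` returns `"pizza"`, while a 1-1 split between two models returns `None`, so the image could be routed to manual labelling instead.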
Can download a large number of images from web links using img2dataset: https://github.com/rom1504/img2dataset
Much better to compute image embeddings + class embeddings up front, then reuse them over time where necessary. This could be set up via:
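A rough sketch of the "compute once, reuse later" idea, under stated assumptions: embeddings are cached by a hash of the image bytes, and labels come from cosine similarity against precomputed class embeddings. `EmbeddingCache`, `toy_embed`, and `best_class` are hypothetical names; `toy_embed` is only a stand-in for a real CLIP-style image encoder.

```python
import hashlib
import math
from typing import Callable, Dict, List

Vector = List[float]


class EmbeddingCache:
    """Compute an embedding once per unique input and reuse it afterwards."""

    def __init__(self, embed_fn: Callable[[bytes], Vector]) -> None:
        self._embed_fn = embed_fn  # assumption: wraps a CLIP-style encoder
        self._store: Dict[str, Vector] = {}
        self.misses = 0  # how many times the encoder actually ran

    def get(self, payload: bytes) -> Vector:
        # Key on content so re-uploaded duplicates hit the cache.
        key = hashlib.sha256(payload).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(payload)
        return self._store[key]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_class(image_vec: Vector, class_vecs: Dict[str, Vector]) -> str:
    """Pick the class whose precomputed embedding is closest to the image."""
    return max(class_vecs, key=lambda name: cosine(image_vec, class_vecs[name]))


def toy_embed(payload: bytes) -> Vector:
    """Stand-in encoder for illustration only: counts of b'a' and b'b'."""
    return [float(payload.count(b"a")), float(payload.count(b"b"))]
```

In a real setup the in-memory dictionary would likely be replaced by persistent storage, so that new class embeddings can be compared against old image embeddings without re-encoding any images.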
See this resource for autolabelling object detection: https://github.com/facebookresearch/CutLER
Data collection pipeline should be reactive to data coming into a bucket, for example:
Images get added to bucket -> autolabelling pipeline runs for unlabelled images -> label cleaning happens -> model training pipeline runs once all images are labelled -> evaluation pipeline runs -> deployment happens
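The flow above could be sketched as a single handler that fires whenever images land in the bucket. This is only an illustration under assumptions: the bucket is modelled as an in-memory dict, `on_images_added` and `ImageRecord` are hypothetical names, label cleaning is reduced to whitespace and case normalisation, and the train/evaluate/deploy stages are recorded by name rather than executed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ImageRecord:
    name: str
    label: Optional[str] = None


def on_images_added(
    bucket: Dict[str, ImageRecord],
    names: List[str],
    autolabel_fn: Callable[[str], Optional[str]],
) -> List[str]:
    """React to new images and return the names of the stages that ran."""
    stages: List[str] = []

    # 1. Images get added to the bucket (existing records are kept as-is).
    for name in names:
        bucket.setdefault(name, ImageRecord(name))

    # 2. Autolabelling runs only for images that are still unlabelled.
    unlabelled = [r for r in bucket.values() if r.label is None]
    if unlabelled:
        stages.append("autolabel")
        for record in unlabelled:
            record.label = autolabel_fn(record.name)  # may return None

    # 3. Label cleaning (placeholder: normalise whitespace and case).
    stages.append("clean_labels")
    for record in bucket.values():
        if record.label is not None:
            record.label = record.label.strip().lower() or None

    # 4. Training, evaluation and deployment only once every image is labelled.
    if bucket and all(r.label is not None for r in bucket.values()):
        stages += ["train", "evaluate", "deploy"]
    return stages
```

If `autolabel_fn` returns `None` for any image, the handler stops after label cleaning, and the remaining stages only run on a later call once every image has a label.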
See: