Batching on .extract_faces to improve performance and utilize GPU in full #1435
base: master
Conversation
Do you have a branch in your fork that currently combines all the optimizations you've submitted? I'd like to start using them while the approval process is ongoing. What's been the total speedup you've been able to see?
I do. You can check it out; it combines these two PRs with some other small modifications:
Not all of the detectors currently implement batching (both in this PR and in the fork). In particular, YOLO does, and I've found it to be optimal in terms of performance and inference speed. The only problem is installing both torch and tensorflow with GPU support, but I've managed to do that. @serengil FYI, I would be happy to contribute the aforementioned modifications if we make progress on the PRs.
I will review this PR this week, I hope.
Seems this breaks the unit tests. Must be sorted.
Should be good now.
Nope, still failing.
```python
resp.append(facial_area)

return resp
if isinstance(img, np.ndarray):
```
What if img is a (2, 224, 224, 3) sized numpy array? I mean, many images in a single numpy array.
You implemented OpenCv, Ssd, Yolo, MtCnn and RetinaFace to accept list inputs. What if I send a list to YuNet, MediaPipe, FastMtCnn, Dlib or CenterFace? I assume an exception will be thrown, but users should see a meaningful message.
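The reviewer's point about a meaningful message could be sketched like this. This is a hypothetical stand-in class, not the real YuNet client, and the `supports_batch` flag is an assumption, not part of the actual codebase:

```python
# Sketch: a detector without batch support could reject list inputs with an
# explicit message instead of failing deep inside the detection code.
from typing import Any, List, Union

import numpy as np


class YuNetClientSketch:
    """Illustrative stand-in for a detector that lacks batch support."""

    supports_batch = False  # assumed flag, not in the real codebase

    def detect_faces(self, img: Union[np.ndarray, List[np.ndarray]]) -> List[Any]:
        if isinstance(img, list) and not self.supports_batch:
            raise ValueError(
                "YuNet does not support batched (list) inputs; pass a single "
                "image or choose a batch-capable backend such as yolo, "
                "retinaface, mtcnn, ssd or opencv."
            )
        return []  # real detection logic would go here
```

This keeps the failure at the API boundary, so the user sees which backend to switch to rather than a shape error from inside the model.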
@galthran-wq you are pushing too many things; would you please inform me when it is ready.
```python
)


def test_batch_extract_faces_single_image():
```
FYI, this might not be the expected behaviour.
What do you mean?
```python
def test_batch_extract_faces(detector_backend):
    detector_backend_to_rtol = {
        "opencv": 0.1,
        "mtcnn": 0.2,
```
It turns out batching has a different impact on the result for different backends. It is mostly the same (for almost all the models, the relative error is <1%).
mtcnn and opencv suffer a bit more; I'm not quite sure why (i.e., which features differ exactly).
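The per-backend tolerance check being discussed can be sketched as follows. The backend-to-rtol mapping mirrors the test excerpt above; the `areas_match` helper, the `"default"` tolerance, and the coordinate values are illustrative assumptions:

```python
# Sketch of the tolerance check: compare a facial-area coordinate from the
# batched path against the single-image path, allowing a per-backend
# relative error.
import math

detector_backend_to_rtol = {
    "opencv": 0.1,    # batching shifts opencv results the most
    "mtcnn": 0.2,
    "default": 0.01,  # most backends stay within 1% relative error
}


def areas_match(single_val: float, batched_val: float, backend: str) -> bool:
    """True if the batched coordinate is within the backend's tolerance."""
    rtol = detector_backend_to_rtol.get(backend, detector_backend_to_rtol["default"])
    return math.isclose(single_val, batched_val, rel_tol=rtol)
```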
We may consider throwing an exception for those detectors, as they are perhaps not suitable for batching.
It is ready now; I usually push when it's done. I've just fixed what you've noted:
```python
    len(imgs_objs_batch) == 4 and
    all(isinstance(obj, list) for obj in imgs_objs_batch)
)
assert all(
```
Please add a comment: the last one has many faces, the others have just one face.
```
@@ -79,6 +79,144 @@ def test_different_detectors():
    logger.info(f"✅ extract_faces for {detector} backend test is done")


@pytest.mark.parametrize("detector_backend", [
```
Please add a unit test case for the following scenario: what if the image is a batch in numpy format?

```python
img1 = cv2.imread(img1_path)
img2 = cv2.imread(img2_path)
img = np.stack([img1, img2])
assert len(img.shape) == 4  # Check dimension.
assert img.shape[0] == 2  # Check batch size.
```
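A fuller sketch of the requested test might look like this. `np.zeros` stands in for `cv2.imread` so the snippet is self-contained, and the `DeepFace.extract_faces` call (with an assumed backend name) is left commented out so the shape checks stand on their own:

```python
# Sketch of a test for a 4-D numpy batch input, assuming two same-sized images.
import numpy as np

img1 = np.zeros((224, 224, 3), dtype=np.uint8)  # stand-in for cv2.imread(img1_path)
img2 = np.zeros((224, 224, 3), dtype=np.uint8)  # stand-in for cv2.imread(img2_path)
img_batch = np.stack([img1, img2])

assert img_batch.ndim == 4      # (batch, height, width, channels)
assert img_batch.shape[0] == 2  # batch size

# The call under test (backend name is an illustrative assumption):
# resp = DeepFace.extract_faces(img_batch, detector_backend="retinaface")
# assert len(resp) == 2  # one result list per image in the batch
```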
```python
resp = []

detected_face = None
if isinstance(img, np.ndarray):
```
I recommend calling detect_faces itself recursively if the input is a list; this can also be done in Detector.py or detection.py.
That way, we will not need to change all detectors.
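The suggestion above could be sketched as a shared wrapper that handles list inputs once, so individual detectors only ever see a single image. The function name and layout are illustrative, not the actual Detector.py code:

```python
# Sketch: dispatch list inputs in one shared place by looping over the
# single-image path, so backends without true batching need no changes.
from typing import Any, Callable, List, Union

import numpy as np


def detect_faces(
    single_image_detector: Callable[[np.ndarray], List[Any]],
    img: Union[np.ndarray, List[np.ndarray]],
) -> Union[List[Any], List[List[Any]]]:
    if isinstance(img, list):
        # Fall back to one call per image; batch-capable backends could
        # override this with a real batched forward pass.
        return [single_image_detector(single) for single in img]
    return single_image_detector(img)
```

A batch-capable backend would bypass this loop with a real batched inference call, while everything else gets list support for free.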
Is this ready? I can still see my comments not resolved.
Almost. Tell me if you think the batched numpy array test is okay, and let's also settle on whether refactoring of the current approach is wanted. I plan to:
Tickets
#1433
#1101
#1434
What has been done
With this PR, .extract_faces is able to accept a list of images.
How to test
Benchmarking on detecting 50 faces:

For yolov11n, batch size 20 is 59.27% faster than batch size 1.
For yolov11s, batch size 20 is 29.00% faster than batch size 1.
For yolov11m, batch size 20 is 31.73% faster than batch size 1.
For yolov8, batch size 20 is 12.68% faster than batch size 1.
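For reference, the speedup percentages above correspond to a simple relative-improvement computation over wall-clock timings. The helper below, and any timing values, are illustrative assumptions, not the PR's benchmark code:

```python
# Sketch: percentage speedup of batch-size-20 over batch-size-1,
# computed from total wall-clock detection times.
def speedup_pct(t_batch1: float, t_batch20: float) -> float:
    """Relative improvement of the batched run over the unbatched run."""
    return (t_batch1 - t_batch20) / t_batch1 * 100.0
```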