Determine whether a small image exists inside a larger image.
The small image may have been rotated, moved, or changed in brightness.
This repo presents several methods for doing this.
The final goal is to recognize the image incrementally and in real time.
I record my various attempts in this repo.
Check the Basic_CV repo for the detailed theory.
- Matching Template - opencv
- Feature-point Detection & Matching - opencv
- Homography (A part of Feature Matching)
Use OpenCV's matchTemplate, get the match location (loc), normalize the result, and so on.
Check template_matching.py
In Smart_Camera(Navigation) project [ Easy case ]
In this case, performance is very good.
However, template matching does not handle changes in size or rotation (hard cases) well.
So I can't use it for this project.
Use SIFT, SURF, ORB, FAST, BRISK, AKAZE ..
Comparative analysis paper
If the feature points are simple, use FAST, BRISK, etc.
Otherwise (complex), use SIFT, SURF, AKAZE, etc.
This project's cases are complex, so I use SIFT.
- Scale-space extrema detection: The first stage of computation searches over all scales and image locations. It is implemented efficiently by using a difference-of-Gaussian function to identify potential interest points that are invariant to scale and orientation.
- Keypoint localization: At each candidate location, a detailed model is fit to determine location and scale. Keypoints are selected based on measures of their stability.
- Orientation assignment: One or more orientations are assigned to each keypoint location based on local image gradient directions. All future operations are performed on image data that has been transformed relative to the assigned orientation, scale, and location for each feature, thereby providing invariance to these transformations.
- Keypoint descriptor: The local image gradients are measured at the selected scale in the region around each keypoint. These are transformed into a representation that allows for significant levels of local shape distortion and change in illumination.
For more detailed information, check the docs.
In code,
sift = cv2.SIFT_create()  # on OpenCV < 4.4: cv2.xfeatures2d.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
kp: list of keypoints, < cv2.KeyPoint 0x10d4137b0 >
des: descriptors, ndarray [[n, n, n, .. n], [n, n, n, .. n], [n, n, n, .. n] ..]
bf = cv2.BFMatcher()
matches = bf.match(des1, des2)
matches: list of matches, < cv2.DMatch 0x10d4137b0 >
Check Feature_DetectMatch.py
In Smart_Camera(Navigation) project [ Easy case ]
In Smart_Camera(Navigation) project [ Hard case ]
Performance in this case is not good yet.
This project has to work robustly, whether the small image is rotated, moved, or changed in brightness.
So I add an intermediate processing step that turns the result of feature detection & matching into a boolean.
I think my app will work robustly if I use this.
As far as I know, homography works for planar objects.
So, beforehand, I detect planar objects in the small image.
Maybe I can use CLIPSeg, which is shown below.
So if I have the detected shop_sign image, I can get good performance by using homography.
Ratio = 0.6, Good matches: 122/53093
Ratio = 0.5, Good matches: 20/53093
Ratio = 0.6, Good matches: 67/53093
Ratio = 0.5, Good matches: 20/53093
Ratio = 0.6, Good matches: 13/53093
Ratio = 0.5, Good matches: 0/53093
Check Homography.py
Based on the above figures, the matching results were good when ratio = 0.5 and good matches > 5.
BFmatching is brute-force matching.
I get the boolean result by using BFmatching:
if the number of matches >= threshold, the result is True;
otherwise (number of matches < threshold), it is False.
I used BFmatching even though I could use FLANN, because in our case accuracy is more important than speed. (I am using SIFT rather than ORB for a similar reason.)
Check BFmatching.py
Based on the above, I would like to write this as a program.
Preprocessed
Only used two points temporarily.
Which kind of interpolation best for resizing image
In my test above, the Point image is not a wide (panorama, 360 ..) image.
I'll check performance in an additional test where the Point image is a wide image.
When there are many points, this app may not work well enough in real time.
I want to test with various processing techniques as well as CLIPSeg. (I want to see how performance changes with preprocessing.)
So, I completed the program.
Check Optional_processing.md
Check One_shot_learning.md
https://en.wikipedia.org/wiki/Homography
https://ieeexplore.ieee.org/document/8346440
https://arxiv.org/pdf/2103.00020.pdf
https://paperswithcode.com/paper/prototypical-networks-for-few-shot-learning
https://proceedings.neurips.cc/paper_files/paper/2017/filecb8da6767461f2812ae4290eac7cbc42-Paper.pdf
https://keras.io/examples/vision/siamese_contrastive