This is an ML project, done in collaboration with my friends, that finds the products of a chemical reaction online, taking images of the reactants as input.
- model.ipynb
- find_reaction.ipynb
- reactants.zip
Loads the data into the `dataset` variable and splits it so that 80% of the data (stored in `train_gen`) is used for training, while the remaining 20% (stored in `val_gen`) is used for validation of the model. Each image was converted to grayscale, normalized by a factor of 255 (compressing pixel values from 0-255 to 0-1), and resized to 28 by 28.
Find Training Image Dataset from here
The model was compiled with `SparseCategoricalCrossentropy()` as the loss and Adam as the optimizer, then trained for 7 epochs with the training (`train_gen`) and validation (`val_gen`) datasets passed to it. The loss vs. epochs and accuracy vs. epochs plots were drawn using the matplotlib library.
The model was saved using `model.save('/content/drive/MyDrive/ML Project/model', overwrite=True)`.
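A compile/train/save loop of this shape could look like the following (a minimal Keras sketch; the layer sizes and the number of classes are placeholders, not the notebook's actual architecture):

```python
import tensorflow as tf

# Placeholder CNN; the real notebook's architecture may differ.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(36, activation='softmax'),  # e.g. 26 letters + 10 digits
])

model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              optimizer='adam',
              metrics=['accuracy'])

# history = model.fit(train_gen, validation_data=val_gen, epochs=7)
# model.save('/content/drive/MyDrive/ML Project/model', overwrite=True)
```

The `history.history['loss']` and `history.history['accuracy']` lists returned by `fit` are what the loss-vs-epochs and accuracy-vs-epochs plots are drawn from.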
This file does three things:
- Text detection: Identifies the location of, and crops, each character (letters, symbols, numbers) present in the image.
- Recognition: Predicts each character present in the image (each character image comes from the detection step) using a convolutional network, forming a string in the same order as in the original image.
- Searching Products: Using AutoScraper, the reactants (as a text string) are used to search for the products of the reaction online, and the solution is presented to the user.
Short explanation of the functions defined:
- add_border: Adds a white border of width `bordersize` around the image given as input, i.e. `img`.
- predict_char: It takes x1, x2, y1, y2 as input to locate the character of interest within `img`. Using the pretrained model, it predicts the character class the cropped image belongs to.
- find_edges: It creates partitions between adjacent characters present in the image. The logic is to compress the image to a 1D array whose ith index holds the sum of the elements (pixel values) of the ith column of the image. For blank regions (all pixel values 0 in the column), this sum is 0, while for regions where a character is present (nonzero pixel values), the sum is nonzero. By taking the XOR of adjacent entries of the binarized 1D array (which is 1 wherever two adjacent columns differ, i.e. one is blank and the other is not), we can identify the starting and ending columns of each character and thus create the partitions.
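The column-sum-plus-XOR logic above could be sketched in NumPy like this (a hedged sketch assuming a black background with nonzero character pixels, as the description implies; the notebook's real function may differ in details):

```python
import numpy as np

def find_edges(img):
    """Return column indices where characters start and end.

    Assumes background pixels are 0 and character pixels are nonzero,
    so a column's sum is 0 exactly when it is blank.
    """
    occupied = img.sum(axis=0) > 0       # True where a column touches a character
    padded = np.concatenate(([False], occupied, [False]))  # catch edge characters
    transitions = np.logical_xor(padded[:-1], padded[1:])  # 1 at blank/character boundaries
    return np.flatnonzero(transitions)   # alternating start, end, start, end, ...
```

The returned indices pair up as slice boundaries: for a result `[s1, e1, s2, e2, ...]`, `img[:, s1:e1]` isolates the first character, `img[:, s2:e2]` the second, and so on.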
- find_character: Using find_edges, it isolates and crops a region of the image, then passes this cropped region of the given image to the predict_char function, which predicts the character present in that region using the CNN model.
- find_result: It searches for the reaction on a website named chemequations.com using the AutoScraper library.
- find_reaction: Using OpenCV, it reads the image from the given path. It then detects the location of the text and draws bounding boxes around each character, passing them on to the find_character function; multiple iterations are performed to identify all the characters present in the image. A loop collects the predicted characters into a string, which is then used for the online search via AutoScraper.
Find the test images from this zip file. You may find more from here
- With more data, the accuracy of the model can be improved.
- A major part of the data consisted of computer-generated fonts; adding handwritten text images would make the model more generalised.
- The image preprocessing before prediction has considerable scope for improvement.