omm-prakash/Find_the_Chemical_Reaction

Find_the_Chemical_Reaction

This is an ML project, done in collaboration with my friends, that takes images of reactants as input and finds the corresponding reaction on the internet.

Content


  1. model.ipynb
  2. find_reaction.ipynb
  3. reactants.zip

model.ipynb


Data Loading

Loads the data into the 'dataset' variable and splits it so that 80% of the data, stored in train_gen, is used for training, while the remaining 20%, stored in val_gen, is used for validating the model. Each image was converted to grayscale, normalized by dividing by 255 (so that pixel values are compressed from the 0-255 range to 0-1), and resized to 28 by 28.
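The preprocessing and split described above can be sketched in plain NumPy. This is only an illustration under stated assumptions: the function names preprocess_sketch and split_80_20_sketch are hypothetical, the nearest-neighbour resize is a stand-in for whatever resizing the notebook actually uses, and the notebook's train_gen and val_gen are likely built with a framework data loader rather than by hand.

```python
import numpy as np

def preprocess_sketch(image, size=28):
    """Grayscale an image, resize it to size x size, and scale pixels to [0, 1].

    Hypothetical helper: nearest-neighbour resizing is an assumption made here
    to keep the sketch dependency-free.
    """
    img = np.asarray(image, dtype=np.float32)
    if img.ndim == 3:
        # Colour image: average the channels to get one gray channel.
        img = img.mean(axis=2)
    # Pick evenly spaced source rows and columns (nearest-neighbour resize).
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    img = img[rows][:, cols]
    # Compress 0-255 pixel values to 0-1.
    return img / 255.0

def split_80_20_sketch(images, labels, seed=0):
    """Shuffle the data and return (train, validation) with an 80/20 split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    cut = int(0.8 * len(images))
    train_idx, val_idx = order[:cut], order[cut:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])
```

With 10 input images, for example, the split yields 8 training samples and 2 validation samples.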

CNN Model

Find the training image dataset here

Model Structure

[Image: model]

The model was compiled with loss=SparseCategoricalCrossentropy() and the Adam optimizer, then trained for 7 epochs with the training (train_gen) and validation (val_gen) datasets passed in. The loss vs. epochs and accuracy vs. epochs curves were plotted using the matplotlib library.
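For readers unfamiliar with the loss named above, here is a minimal NumPy sketch of what sparse categorical cross-entropy computes: the mean negative log-probability assigned to the correct class, with labels given as integer class indices rather than one-hot vectors. The function name sparse_cce_sketch is hypothetical, and the notebook itself presumably relies on the framework's built-in loss rather than anything like this.

```python
import numpy as np

def sparse_cce_sketch(probs, labels):
    """Mean negative log-likelihood of integer labels under predicted probabilities.

    probs:  array of shape (n_samples, n_classes), each row summing to 1
    labels: integer class index for each sample
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    # Select the probability the model assigned to the true class of each sample.
    picked = probs[np.arange(len(labels)), labels]
    # Clip to avoid log(0); the 1e-7 floor is an assumption of this sketch.
    return float(-np.log(np.clip(picked, 1e-7, 1.0)).mean())
```

A confident correct prediction contributes a loss near 0, while a 50/50 guess between two classes contributes ln 2.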

Saving the model

The model was saved using model.save('/content/drive/MyDrive/ML Project/model',overwrite = True).

find_reaction.ipynb


This file does three things:

  • Text detection: Locates and crops each character (letters, symbols, numbers) present in the image.
  • Recognition: Predicts each character in the image (each character is obtained from the cropped text image) using the convolutional network, forming a string in the same order as in the image.
  • Searching products: Using AutoScraper, the reactants, in text string form, are used to search online for the products of the reaction, and the result is presented to the user.

Short explanation of the functions defined:

  1. Recognition

  • add_border: Adds a white border of width bordersize around the input image img.
  • predict_char: Takes x1, x2, y1, y2 as input to locate the character of interest in img. Using the pretrained model, it predicts the character class that the cropped image belongs to.
  • find_edges: Creates partitions between adjacent characters in the image. The logic is to compress the image to a 1D array whose ith entry is the sum of the pixel values in the ith column of the image. For blank regions (all pixel values in the column are 0) this sum is 0, while for regions where a character is present (nonzero pixel values) the sum is nonzero. Taking the xor of adjacent entries of the 1D array (treating each entry as blank or non-blank) gives 1 wherever neighbouring columns differ, which marks the starting and ending points of each character and thus the partitions.
  • find_character: Uses find_edges to isolate each character region, then passes the cropped region of img to predict_char, which predicts the character in that region using the CNN model.
  2. Searching products

  • find_result: Searches for the reaction on the website chemequations.com using the AutoScraper library.
  3. Detection

  • find_reaction: Reads the image from the given path using OpenCV. It then detects the location of the text and draws bounding boxes around each character, passing each on to find_character; this is repeated until all characters in the image have been identified. In a loop, the predicted characters are collected into a string, which is then used to search online with AutoScraper.
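The column-sum and xor logic described for find_edges can be sketched as follows. This is a rough illustration and not the notebook's code: the name find_edges_sketch is hypothetical, and it assumes a background of 0 with nonzero character pixels, as in the description above.

```python
import numpy as np

def find_edges_sketch(img):
    """Return (start, end) column ranges of characters, with end exclusive.

    Assumes blank columns sum to 0 and character columns sum to a nonzero value.
    """
    # Compress the image to 1D: True where a column contains any ink.
    col_has_ink = np.asarray(img).sum(axis=0) > 0
    # Pad with blank columns so characters touching the border are still closed.
    padded = np.concatenate(([False], col_has_ink, [False]))
    # xor of neighbours is True exactly where blank/non-blank status changes.
    changes = np.flatnonzero(padded[1:] ^ padded[:-1])
    # Changes alternate between a character's start and its end.
    return [(int(s), int(e)) for s, e in zip(changes[0::2], changes[1::2])]

# Usage: two "characters", one in columns 1-2 and one in column 5.
demo = np.zeros((3, 8), dtype=np.uint8)
demo[:, 1:3] = 255
demo[1, 5] = 255
print(find_edges_sketch(demo))  # → [(1, 3), (5, 6)]
```

Each returned pair can then be used to crop a column slice such as img[:, start:end] before passing it to a classifier.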

[Image: detect]

reactants.zip


The test images are in this zip file. More can be found here


Scope of Further Development

  1. With more data, the accuracy of the model can be improved.
  2. Most of the data consisted of computer-generated fonts; adding images of handwritten text would make the model generalise better.
  3. The image preprocessing done before prediction has considerable scope for improvement.
