The aim is to accurately recognize the speaker based on the information available in the audio signal.
One obvious challenge is that audio signals are not stationary, so the algorithm has to be robust to noise, variation over time, and different speaking rates. The problem is tackled in two stages:
- Extracting audio features using Mel-Frequency Cepstral Coefficients (MFCC) — see the first sketch after this list
- Pattern recognition using the LBG (Linde, Buzo and Gray) vector-quantization algorithm — see the second sketch after this list
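
A minimal sketch of the feature-extraction stage, assuming Python with `librosa` (not necessarily what this project uses); the function name `extract_mfcc` and parameters such as `n_mfcc=13` are illustrative choices, not the project's actual settings.

```python
import librosa

def extract_mfcc(wav_path, n_mfcc=13):
    """Load an utterance and return its MFCC matrix (one row per frame)."""
    # sr=None keeps the file's native sampling rate instead of resampling.
    signal, sr = librosa.load(wav_path, sr=None)
    # librosa frames the signal, applies a mel filterbank to the power
    # spectrum, takes logs, and applies the DCT to get cepstral coefficients.
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T  # shape: (frames, n_mfcc)
```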
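
A minimal NumPy sketch of the LBG splitting procedure and a matching rule based on average distortion; names like `lbg_codebook`, the codebook size of 16, and the convergence thresholds are illustrative assumptions rather than the project's actual parameters.

```python
import numpy as np

def lbg_codebook(features, codebook_size=16, epsilon=0.01, tol=1e-4):
    """Train a vector-quantization codebook with the Linde-Buzo-Gray splitting
    procedure. `features` is an (N, D) array of MFCC vectors from one speaker.
    `codebook_size` should be a power of two, since each split doubles the size."""
    # Start with a single centroid: the mean of all training vectors.
    codebook = features.mean(axis=0, keepdims=True)

    while codebook.shape[0] < codebook_size:
        # Split every centroid into two slightly perturbed copies.
        codebook = np.vstack([codebook * (1 + epsilon),
                              codebook * (1 - epsilon)])

        # Refine the doubled codebook with k-means-style iterations.
        prev_distortion = np.inf
        while True:
            # Assign each vector to its nearest centroid (Euclidean distance).
            dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
            nearest = dists.argmin(axis=1)
            distortion = dists[np.arange(len(features)), nearest].mean()

            # Move each centroid to the mean of its assigned vectors.
            for k in range(codebook.shape[0]):
                members = features[nearest == k]
                if len(members) > 0:
                    codebook[k] = members.mean(axis=0)

            # Stop when the relative drop in distortion becomes small.
            if abs(prev_distortion - distortion) / (distortion + 1e-12) < tol:
                break
            prev_distortion = distortion

    return codebook


def average_distortion(features, codebook):
    """Average distance from each test vector to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()
```

At identification time, the MFCC matrix of an unknown utterance would be scored against every enrolled speaker's codebook with `average_distortion`, and the speaker whose codebook gives the lowest value is reported as the match.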
You can view our final presentation at: https://prezi.com/view/AHBdA2f46x4e0gn0zLI1/