movie-chatbot

Jupyter Notebook:

This notebook can be run entirely on Google Colab, try it!

Useful Links:

Vocabulary list

Create the vocabulary list from all the word stems found in the training set.
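As a rough illustration of this step, the sketch below collects the unique stems of a toy training set into a sorted vocabulary list. The names `naive_stem`, `build_vocabulary` and `training_texts` are hypothetical and not taken from the notebook, and the suffix stripper is only a crude stand-in for a real stemmer such as the ones shipped with NLTK.

```python
def naive_stem(word):
    """Crude suffix stripper (assumption: stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_vocabulary(training_texts):
    """Return the sorted list of unique stems seen in the training texts."""
    stems = set()
    for text in training_texts:
        for token in text.lower().split():
            stems.add(naive_stem(token))
    return sorted(stems)


# Toy training set, made up for illustration.
training_texts = [
    "discounts on movies",
    "discounted movie tickets",
    "discounting a ticket",
]

vocabulary = build_vocabulary(training_texts)
# Map each stem to its position in the feature vector.
word_index = {stem: i for i, stem in enumerate(vocabulary)}
print(vocabulary)
```

Sorting the stems keeps the position of each word stable from one run to the next, which matters because the feature vectors are indexed by position.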

Algorithm

Text processing

Can be done with:

spaCy
NLTK

Lowercasing
Standardizing numbers (e.g. '12' -> 'number')
Transforming question marks ('?' -> 'questionmark')
Word stemming (e.g. 'discount', 'discounts', 'discounted', 'discounting' -> 'discount')
Removal of non-useful characters/words (e.g. stop words, punctuation)
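The steps above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's code: `preprocess`, `naive_stem` and the tiny `STOP_WORDS` set are hypothetical, and in practice spaCy or NLTK would supply the tokenizer, the stop-word list and the stemmer.

```python
import re

# Tiny illustrative stop-word list (assumption: a real list is much longer).
STOP_WORDS = {"a", "an", "the", "for", "is", "of", "to"}


def naive_stem(word):
    """Crude suffix stripper (assumption: stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Apply the listed steps and return a list of stems."""
    text = text.lower()                          # lowercasing
    text = re.sub(r"\d+", " number ", text)      # '12' -> 'number'
    text = text.replace("?", " questionmark ")   # '?' -> 'questionmark'
    text = re.sub(r"[^a-z\s]", " ", text)        # drop remaining punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]       # stemming


tokens = preprocess("Is the movie discounted for 12 days?")
print(tokens)
```

The question mark is replaced before punctuation is stripped, otherwise the 'questionmark' token would never be produced.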

Features

Word to vectors:

For the input text, fill a list the size of the vocabulary list with the score of each word. The following scoring methods can be used for each n-gram:
- Binary, i.e. whether or not the word is present in the text
- Count, i.e. the number of times the word appears in the text
- Frequency, i.e. Count / total number of words in the text
- TF-IDF (Term Frequency – Inverse Document Frequency), i.e. the score increases with the word's frequency in the text, but is penalized if the word is widely used across the training set (like 'for', 'a', 'the'). This highlights words that are distinctive (carry useful information) in a given text.
If the dataset is large and the sentences are short, word embeddings can be used instead.
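The four scoring methods can be compared on a toy example. The sketch below is an assumption-laden illustration: all function names and the data are hypothetical, and the TF-IDF variant used here (frequency times log(N / document frequency)) is only one common textbook form; libraries usually apply smoothed variants that give different numbers.

```python
import math
from collections import Counter


def binary_vector(tokens, vocabulary):
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def count_vector(tokens, vocabulary):
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def frequency_vector(tokens, vocabulary):
    total = len(tokens)
    return [c / total if total else 0.0 for c in count_vector(tokens, vocabulary)]


def idf_weights(training_docs, vocabulary):
    """log(N / df) per word; 0.0 when the word never occurs in training."""
    n_docs = len(training_docs)
    doc_sets = [set(doc) for doc in training_docs]
    weights = []
    for word in vocabulary:
        df = sum(word in doc for doc in doc_sets)
        weights.append(math.log(n_docs / df) if df else 0.0)
    return weights


def tfidf_vector(tokens, vocabulary, idf):
    return [tf * w for tf, w in zip(frequency_vector(tokens, vocabulary), idf)]


# Toy, already tokenized training set (made up for illustration).
training_docs = [
    ["the", "movie", "is", "good"],
    ["the", "movie", "is", "bad"],
    ["the", "actor", "is", "good"],
]
vocabulary = sorted({word for doc in training_docs for word in doc})
idf = idf_weights(training_docs, vocabulary)

tokens = ["the", "bad", "movie", "the"]
binary = binary_vector(tokens, vocabulary)
counts = count_vector(tokens, vocabulary)
freqs = frequency_vector(tokens, vocabulary)
tfidf = tfidf_vector(tokens, vocabulary, idf)
print(vocabulary, binary, counts, freqs, tfidf, sep="\n")
```

In this example 'the' is the most frequent word of the input, yet its TF-IDF score is zero because it appears in every training document, while the rarer 'bad' gets the highest score.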

Run time on Colab: 25 min; run time locally: 22 min

Classification

SVM (support vector machine) is a machine learning algorithm; MLP (multilayer perceptron) is a deep learning algorithm.
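To make the two options concrete, here is a minimal sketch that fits both kinds of classifier on the same toy feature vectors. It assumes scikit-learn is installed; the choice of `LinearSVC` and `MLPClassifier`, the hyperparameters, and the data and labels are all illustrative assumptions, not the configuration used in the notebook.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.svm import LinearSVC

# Toy count vectors over a 4-word vocabulary (made up for illustration).
X_train = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
y_train = ["comedy", "comedy", "horror", "horror"]

# Linear support vector machine.
svm = LinearSVC(random_state=0)
svm.fit(X_train, y_train)

# Small multilayer perceptron with one hidden layer of 8 units.
mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)

X_new = [[1, 0, 1, 0], [0, 1, 0, 1]]
svm_pred = svm.predict(X_new)
mlp_pred = mlp.predict(X_new)
print(svm_pred, mlp_pred)
```

Both classifiers consume the same fixed-length vectors, so the feature extraction step does not need to change when switching from one to the other.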

Notes

Every text must produce the same number of features, whatever its length.
The structure and order of the words are lost.
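Both notes can be checked on a toy example. In the sketch below (the vocabulary, sentences and the `count_vector` helper are hypothetical), a short and a long text yield vectors of the same length, and two sentences with opposite meanings yield identical vectors.

```python
from collections import Counter

# Fixed toy vocabulary (made up for illustration).
vocabulary = ["bites", "dog", "man", "the"]


def count_vector(text, vocabulary):
    """Count how often each vocabulary word occurs; other words are ignored."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Texts of different lengths map to vectors of the same size.
short_vec = count_vector("dog bites", vocabulary)
long_vec = count_vector("the dog bites the man and the man bites the dog", vocabulary)

# Word order is lost: opposite meanings, same vector.
vec_a = count_vector("the dog bites the man", vocabulary)
vec_b = count_vector("the man bites the dog", vocabulary)

print(short_vec, long_vec)
print(vec_a == vec_b)
```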

maximecharriere/movie-chatbot
