Data processing and visualization of the OpenFoodFacts dataset using natural language processing methods

zhaojunGUO/Nlp_OpenFood

Nlp_OpenFood

This is my natural language processing assignment from the second year of my Master's degree. OpenFoodFacts can be considered a Wikipedia for food! Its goal is to share as much information as possible about food products with everyone. It contains more than 800,000 products, although not all of them are perfectly described. For a given product, we can mainly find the list of ingredients, the nutrition facts, and the food categories.

  1. Define and clean the vocabulary of ingredients. Do you find any mistakes? How do you handle them? Propose solutions to identify and manage errors.
  2. Based on nutrition facts and/or food categories, propose clustering approaches and a visualization of some product categories. Find outliers (products that are very different from others in the same group). Are there products that are very similar in terms of nutrition facts but very different in terms of categories or ingredients?
  3. Based on your expertise with this dataset, propose and describe a model (no code required) that could enhance the OpenFoodFacts project.
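As an illustration of the first task, here is a minimal sketch of one possible way to build an ingredient vocabulary and flag likely misspellings. It is not the method used in this repository. The function names (`normalize`, `split_ingredients`, `build_vocabulary`, `suggest_corrections`), the thresholds, and the sample strings are all assumptions made for the example. The idea is that rare tokens that closely resemble a frequent token are probably typos.

```python
import re
import unicodedata
from collections import Counter
from difflib import get_close_matches


def normalize(token):
    """Lowercase, strip accents, and keep only letters, spaces and hyphens."""
    token = unicodedata.normalize("NFKD", token.lower())
    token = "".join(c for c in token if not unicodedata.combining(c))
    token = re.sub(r"[^a-z\s-]", " ", token)  # drops digits, %, punctuation
    return re.sub(r"\s+", " ", token).strip()


def split_ingredients(text):
    """Split a raw ingredient string on common separators and normalize each part."""
    parts = re.split(r"[,;()\[\]]", text)
    return [n for n in (normalize(p) for p in parts) if n]


def build_vocabulary(ingredient_lists):
    """Count how often each normalized ingredient appears across products."""
    counts = Counter()
    for text in ingredient_lists:
        counts.update(split_ingredients(text))
    return counts


def suggest_corrections(counts, min_count=2, cutoff=0.75):
    """Map each rare ingredient to a similar frequent one, if any.

    min_count and cutoff are illustrative values, not tuned on real data.
    """
    frequent = [w for w, c in counts.items() if c >= min_count]
    suggestions = {}
    for word, c in counts.items():
        if c >= min_count:
            continue
        match = get_close_matches(word, frequent, n=1, cutoff=cutoff)
        if match:
            suggestions[word] = match[0]
    return suggestions


if __name__ == "__main__":
    # Hypothetical ingredient strings, not taken from the real dataset.
    sample = [
        "Sugar, wheat flour, salt",
        "sugar, Wheat Flour (gluten), salt",
        "suger, wheat flour, 2% salt",
    ]
    vocabulary = build_vocabulary(sample)
    print(vocabulary.most_common())
    print(suggest_corrections(vocabulary))
```

On the real dataset, comparing every rare token against every frequent one would be slow, so candidates would likely need to be restricted first (for example by first letter or length). Suggested corrections should also be reviewed by hand, since a rare ingredient is not necessarily a mistake.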

The dataset is available at: https://fr.openfoodfacts.org/data
