Simple classification engine for government/municipality documents built with TensorFlow

jharting/praguehacks2016-categorizer

Latest commit: 29bcc89 · Oct 29, 2016

Categorizer (a PragueHacks 2016 project)

A simple classification engine for government/municipality documents, built with TensorFlow. Documents are tagged based on the occurrence of certain words and on other characteristics of the document.
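
To make the idea of tagging by word occurrence concrete, here is a minimal sketch of turning a document into a vector of word counts. The vocabulary, the tokenizer, and the function name `word_count_features` are all hypothetical; the project's real feature extraction is not described in this README and may differ.

```python
import re
from collections import Counter

# Hypothetical vocabulary; the project's real feature words are not listed here.
VOCABULARY = ["budget", "tender", "permit", "zoning"]


def word_count_features(text, vocabulary=VOCABULARY):
    """Return how often each vocabulary word occurs in text (case-insensitive)."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for words that never occur, so no special casing is needed.
    return [counts[word] for word in vocabulary]


if __name__ == "__main__":
    sample = "The zoning permit was approved; a second permit is pending."
    print(word_count_features(sample))  # [0, 0, 2, 1]
```

A vector like this, possibly extended with other document characteristics, is the kind of input a classifier could be trained on.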

This project is a prototype for

Built during Prague Hacks 2016

Setup

Requirements (the wheel below is TensorFlow 0.10.0, CPU-only, for Python 2.7 on 64-bit Linux):

sudo pip install numpy
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.10.0-cp27-none-linux_x86_64.whl
sudo pip install --upgrade $TF_BINARY_URL
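
After installing, it can be useful to confirm that both packages import. The following is a small sketch, not part of the project; the helper name `check_modules` is hypothetical. It uses `__import__` so that it runs on Python 2.7 (which the cp27 wheel targets) as well as on Python 3.

```python
def check_modules(names):
    """Map each module name to True if it can be imported, else False."""
    status = {}
    for name in names:
        try:
            __import__(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    for module, ok in sorted(check_modules(["numpy", "tensorflow"]).items()):
        print("%s: %s" % (module, "ok" if ok else "missing"))
```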

Run

Prepare data:

  1. copy tagged content files to ./input
  2. copy feature vector to features.csv
  3. export CATS=`cat cats.txt`
  4. bash generate-all.sh features.csv $CATS
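
Because `$CATS` is used unquoted, the shell splits the contents of cats.txt on whitespace and passes each category as a separate argument. The sketch below mirrors that behaviour in Python, assuming cats.txt holds whitespace-separated category names and ignoring shell globbing. The helper names and the sample categories are hypothetical.

```python
import os
import tempfile


def load_categories(path):
    """Split the file on whitespace, as the shell does with an unquoted $CATS."""
    with open(path) as handle:
        return handle.read().split()


def build_generate_command(features_path, categories):
    """Argument list equivalent to: bash generate-all.sh features.csv $CATS"""
    return ["bash", "generate-all.sh", features_path] + list(categories)


if __name__ == "__main__":
    # Hypothetical category names, written to a temporary stand-in for cats.txt.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("transport\nbudget\nenvironment\n")
    try:
        cats = load_categories(tmp.name)
        print(build_generate_command("features.csv", cats))
    finally:
        os.remove(tmp.name)
```

One consequence of this splitting is that a category name containing a space would arrive as two separate arguments.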

Train DNN:

  1. python train.py $CATS

Run classification on new data:

  1. python predict.py features.csv $CATS output.csv
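
As written, the command line places the feature file first, the output file last, and the categories in between. How predict.py actually parses its arguments is not shown in this README; the sketch below only illustrates that layout, and the function name `split_predict_args` and the sample categories are hypothetical.

```python
def split_predict_args(argv):
    """Split [features, cat_1, ..., cat_n, output] into its three parts."""
    if len(argv) < 3:
        raise ValueError("expected: FEATURES CATEGORY [CATEGORY ...] OUTPUT")
    return argv[0], argv[1:-1], argv[-1]


if __name__ == "__main__":
    # Hypothetical category names standing in for the contents of $CATS.
    features, categories, output = split_predict_args(
        ["features.csv", "transport", "budget", "output.csv"]
    )
    print(features, categories, output)
```

With this layout, an empty `$CATS` leaves only two arguments, which the sketch rejects rather than guessing which file is which.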
