data-profiler is a Go project that transforms a set of datasets based on a set of characteristics (distribution similarity, correlation, etc.) in order to model, using Machine Learning techniques, the behavior of an operator applied on top of them.
There are two ways to install data-profiler:
- Through Go:
# GOPATH must be set
~> go get github.com/giagiannis/data-profiler
- Using Docker:
~> docker pull ggian/data-profiler
data-profiler can be used both through a CLI and a Web interface.
- CLI
You can access the CLI client through the data-profiler-utils binary.
~> $GOPATH/bin/data-profiler-utils
This command prints an overview of the available actions.
Note: use this client only if you know how data-profiler works.
- Web UI
First, start the Docker container, mounting a directory that contains the dataset files.
~> docker run -v /src/datasets:/datasets -p 8080:8080 -d ggian/data-profiler
This command mounts the host's /src/datasets directory at /datasets inside the container and maps the host's port 8080 to port 8080 of the container. Once the container has started successfully, open http://dockerhost:8080 (where dockerhost is the address of the machine running Docker) and submit the first set of datasets for analysis.
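Because the container starts in the background (-d), the Web UI may take a moment to begin answering. The following is a minimal sketch, not part of data-profiler, that polls a URL until it returns any HTTP response. The names waitForUI and waitui, the 30-second timeout and the one-second polling interval are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

// requestTimeout bounds a single HTTP attempt (illustrative value).
const requestTimeout = 2 * time.Second

// waitForUI polls url until the server returns any HTTP response,
// or until timeout has elapsed. It is a hypothetical helper and
// not part of the data-profiler code base.
func waitForUI(url string, timeout, interval time.Duration) error {
	client := &http.Client{Timeout: requestTimeout}
	deadline := time.Now().Add(timeout)
	for {
		resp, err := client.Get(url)
		if err == nil {
			// Any status code means the server is up and listening.
			resp.Body.Close()
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("no response from %s after %s: %w", url, timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: waitui <url>, e.g. waitui http://localhost:8080")
		return
	}
	if err := waitForUI(os.Args[1], 30*time.Second, time.Second); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("web UI is reachable at", os.Args[1])
}
```

Saved as waitui.go, it could be run right after the docker run command, replacing localhost with the address of the Docker host if it differs:
~> go run waitui.go http://localhost:8080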
Apache License v2.0 (see LICENSE file for more)
Giannis Giannakopoulos [email protected]