- database profiling
- implicit primary key detection
- implicit foreign key detection
- database metamodel visualization
- deep database search (an Elasticsearch index over each distinct value of each column of each table)
- sequence mining
- general utilities (csv2sql, htmltable2csv, mysql2csv, etc.)
- SQL to REST API to dynamically expose database objects via HTTP
- sudo apt-get install python-pip
- sudo pip install virtualenv
- mkdir ~/.virtualenvs
- sudo pip install virtualenvwrapper
- echo 'export WORKON_HOME=~/.virtualenvs' >> ~/.bashrc
- echo '. /usr/local/bin/virtualenvwrapper.sh' >> ~/.bashrc
- mkvirtualenv dev2.7 --python=`which python2.7`
- mkvirtualenv dev3.4 --python=`which python3.4`
- pip install tornado
- pip install flask
- pip install flask_sqlalchemy
- pip install flask_restless
- pip install pymssql
- pip install sqlalchemy
- pip install elasticsearch
- pip install requests
The showcase details how to profile a database, visualize the metamodel, index an Elasticsearch instance for deep database search, and expose the database via HTTP.
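As a rough illustration of the deep database search idea, the sketch below walks every table and column and emits one document per distinct value. It is not taken from the project: the function name `distinct_value_docs`, the index name `deepsearch`, and the document fields are hypothetical, and it uses an in-memory SQLite database so that it runs without any server.

```python
import sqlite3


def distinct_value_docs(conn, index_name="deepsearch"):
    """Yield one bulk-style action per distinct non-NULL value of every column.

    The index name and the document layout are assumptions for illustration.
    """
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        cols = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for col in cols:
            query = (f'SELECT DISTINCT "{col}" FROM "{table}" '
                     f'WHERE "{col}" IS NOT NULL ORDER BY 1')
            for (value,) in conn.execute(query):
                yield {
                    "_index": index_name,
                    "_source": {"table": table, "column": col, "value": str(value)},
                }


# Small demo data set (made up for this example).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (name TEXT, country TEXT);
    INSERT INTO city VALUES ('Utrecht', 'NL'), ('Delft', 'NL'), ('Bonn', 'DE');
""")
docs = list(distinct_value_docs(conn))
for doc in docs:
    print(doc["_source"])
```

Documents in this shape could then be sent to an Elasticsearch instance, for example with the `bulk` helper from the `elasticsearch` Python client; that step is left out here because it needs a running server.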
Let's get some test data from Launchpad so that we have a predictable data set to work with.
TBD...
First, create a config.ini file by copying the template_config.ini file and filling in the empty fields in each section:
- [subjectdb]: the database you will be profiling
- [metadb]: the database where profiling results will be stored
- [es]: Elasticsearch options
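Before starting the profiler it can help to check that the config file is complete. The following is a minimal sketch using Python's standard `configparser`. Only the three section names come from the list above; the `host` keys in the sample and the helper names are hypothetical, since the real keys are defined in template_config.ini.

```python
import configparser

# Hypothetical sample; the real keys come from template_config.ini.
SAMPLE = """
[subjectdb]
host = localhost

[metadb]
host = localhost

[es]
host = localhost
"""

REQUIRED_SECTIONS = ("subjectdb", "metadb", "es")


def missing_sections(parser):
    """Return the required section names that the parsed config lacks."""
    return [name for name in REQUIRED_SECTIONS if not parser.has_section(name)]


def empty_fields(parser):
    """Return (section, key) pairs whose value was left blank."""
    return [
        (section, key)
        for section in parser.sections()
        for key, value in parser.items(section)
        if not value.strip()
    ]


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)  # for the real file: parser.read("config.ini")
print("missing sections:", missing_sections(parser))
print("empty fields:", empty_fields(parser))
```

Both lists should be empty for a fully filled-in file; anything reported points at a section or field that still needs a value.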
Start the profiler from the dev3.4 virtual environment created earlier:
workon dev3.4
python profiler/profiler.py
python profiler/primarykey_detection.py -i <db_host> -u <db_user> -p <db_passwd> -c <db_catalog>
python profiler/foreignkey_detection.py -i <db_host> -u <db_user> -p <db_passwd> -c <db_catalog>
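To give a feel for what implicit key detection involves, here is a minimal sketch of two common heuristics. It does not reproduce what primarykey_detection.py or foreignkey_detection.py actually do. A column is treated as a primary key candidate when it has no NULLs and all its values are distinct, and as a foreign key candidate when every non-NULL value also occurs in the referenced column. The function names, tables, and data are made up, and SQLite stands in for the real database so the example is self-contained.

```python
import sqlite3


def columns(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def primary_key_candidates(conn, table):
    """Columns with no NULLs and as many distinct values as the table has rows."""
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    found = []
    for col in columns(conn, table):
        non_null, distinct = conn.execute(
            f'SELECT COUNT("{col}"), COUNT(DISTINCT "{col}") FROM "{table}"'
        ).fetchone()
        if total > 0 and non_null == total and distinct == total:
            found.append(col)
    return found


def is_foreign_key_candidate(conn, child, child_col, parent, parent_col):
    """True when every non-NULL child value also appears in the parent column."""
    orphans = conn.execute(
        f'SELECT COUNT(*) FROM "{child}" c '
        f'WHERE c."{child_col}" IS NOT NULL AND NOT EXISTS '
        f'(SELECT 1 FROM "{parent}" p WHERE p."{parent_col}" = c."{child_col}")'
    ).fetchone()[0]
    return orphans == 0


# Tables declared without any key constraints, so keys are only implicit.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER, country TEXT);
    CREATE TABLE invoice (number INTEGER, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'NL'), (2, 'NL'), (3, 'DE');
    INSERT INTO invoice VALUES (10, 1), (11, 1), (12, 3);
""")
print(primary_key_candidates(conn, "customer"))
print(is_foreign_key_candidate(conn, "invoice", "customer_id", "customer", "id"))
```

Both checks only produce candidates: a column can be unique by coincidence on a small data set, and an inclusion match between two integer columns does not prove a real relationship, so results like these usually need review.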