Say hello

This repository lets you download data for demonstrating DYLAN's capabilities in practice.

The sample aims to be representative by sourcing data from (primarily) Dutch data providers across various sectors: Healthcare, Finance, Science, Education, Traffic, Legislative and Infrastructure.

Providers:

Usage

You can download the datasets by running:

python data_sources.py

Installation

Installation for Python 3 requires two steps:

  1. Install the required Python packages.

    • Option 1: create a conda environment: conda env create -f env.yml
    • Option 2: install the requirements listed in env.yml via pip: pip install -U [requirement]
  2. Install Chrome/Chromium and ChromeDriver.
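Before running the download script, it can help to confirm that step 2 succeeded. The following is a minimal sketch, not part of this repository, that looks for a browser and ChromeDriver on the PATH using only the standard library. The executable names are assumptions and differ between platforms and install methods, so adjust them as needed.

```python
import shutil

# Assumed executable names; these vary by platform and install method.
BROWSER_CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")
DRIVER_CANDIDATES = ("chromedriver",)


def find_first(candidates):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def check_prerequisites():
    """Map each prerequisite to its resolved path (None if not found)."""
    return {
        "browser": find_first(BROWSER_CANDIDATES),
        "chromedriver": find_first(DRIVER_CANDIDATES),
    }


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

A "not found" result only means the executable is not on the PATH under one of the assumed names; the browser may still be installed elsewhere.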

Example usage

The type_analysis directory contains Jupyter notebooks that analyse the data types of each dataset. They also serve as a reference implementation for loading the datasets.
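To give a rough idea of what a data type analysis involves, here is a small standard-library sketch that guesses a coarse type for each column of a CSV file. It is an illustration only and does not reproduce the notebooks' method; the function names and the in-memory sample data are hypothetical.

```python
import csv
import io


def infer_type(values):
    """Classify a column as 'integer', 'float', 'string', or 'empty'."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "empty"
    # Try the strictest type first and fall back to looser ones.
    for label, cast in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return label
        except ValueError:
            continue
    return "string"


def column_types(text_stream):
    """Read CSV text with a header row and infer a type per column."""
    reader = csv.DictReader(text_stream)
    columns = {name: [] for name in reader.fieldnames or []}
    for row in reader:
        for name in columns:
            # Short rows yield None for missing fields; treat them as empty.
            columns[name].append(row.get(name) or "")
    return {name: infer_type(vals) for name, vals in columns.items()}


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any of the datasets.
    sample = io.StringIO("id,price,city\n1,2.5,Utrecht\n2,3,Delft\n")
    print(column_types(sample))
```

For a file on disk, pass an open file object instead of the StringIO buffer. Real datasets usually need richer types (dates, categories, identifiers), which is where the notebooks are the better reference.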

The profile.py script generates exploratory data analysis reports for the datasets.