ETL - Extract / Transform / Load

Provides a basic directory structure and template files for setting up a DataLoader using the ETL methodology.

Installation

pip install git+https://gitlab.com/jayemar/etl.git

Basic Usage

from etl.dataloader import DataLoader

dl = DataLoader()

# <ml_cfg> is a placeholder; replace it with your configuration (see "Config File")
train_gen = dl.retrieve_data(<ml_cfg>)
test_gen = dl.get_test_data()
valid_gen = dl.get_validation_data()
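
The variable names above suggest that each call returns a generator of batches, although this README does not document the return types. As a rough sketch under that assumption, the snippet below uses a hypothetical stand-in, `fake_batches`, which is not part of this package, to show how such a generator could be consumed lazily:

```python
from itertools import islice


def fake_batches(n_records, batch_size):
    """Hypothetical stand-in for a generator such as train_gen.

    Yields lists of at most batch_size records; this is an assumption
    about the batch shape, not the package's documented behavior.
    """
    records = list(range(n_records))
    for start in range(0, n_records, batch_size):
        yield records[start:start + batch_size]


# Peek at the first two batches without exhausting the generator.
train_gen = fake_batches(n_records=10, batch_size=4)
first_two = list(islice(train_gen, 2))

# The generator resumes where it left off; the last batch may be short.
remaining = list(train_gen)
```

If the real generators behave this way, the same `islice` pattern could be used to inspect a few batches before starting a full training run.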

Config File

The config file can be in either JSON or YAML format. Fields are optional unless otherwise stated.

Fields

  • data_dir: directory containing the data; the path can be absolute or relative to the directory containing task.py
  • batch_size: number of records per batch
  • epochs: number of epochs to run during training
  • train_size: fraction of the data used for training, as a decimal (for example, 0.8)
  • test_size: fraction of the data used for testing, as a decimal
  • valid_size: fraction of the data used for validation, as a decimal
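
To make the fields above concrete, here is a minimal sketch that builds a config with those field names, checks the split ratios, and serializes it as JSON. The values are made up for illustration, and `check_split_sizes` is a hypothetical helper, not part of this package. The check also assumes the three ratios are meant to sum to 1, which this README does not state.

```python
import json
import math

# Hypothetical example values; only the field names come from the list above.
example_cfg = {
    "data_dir": "data",
    "batch_size": 32,
    "epochs": 10,
    "train_size": 0.7,
    "test_size": 0.2,
    "valid_size": 0.1,
}


def check_split_sizes(cfg):
    """Return True if the three split ratios lie in [0, 1] and sum to 1.

    Assumption: the ratios partition the data set. The README does not
    say whether the loader enforces this.
    """
    sizes = [cfg[key] for key in ("train_size", "test_size", "valid_size")]
    in_range = all(0.0 <= size <= 1.0 for size in sizes)
    # isclose tolerates floating-point rounding in the sum.
    return in_range and math.isclose(sum(sizes), 1.0)


# Serialize to JSON, one of the two formats the config file may use.
config_text = json.dumps(example_cfg, indent=2)
```

Writing `config_text` to a file would give a JSON config with these fields; an equivalent YAML file would carry the same keys.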
