MultiData.jl provides a machine-learning-oriented data layer on top of DataFrames.jl for:
- Instantiating and manipulating multimodal datasets for (un)supervised machine learning;
- Describing datasets via basic statistical measures;
- Saving to and loading from NumPy's npy/npz formats, as well as a custom CSV-based format that supports features such as lazy loading of datasets;
- Performing basic data processing operations (e.g., windowing, moving average).
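
To make the last item concrete, here is a minimal sketch of what windowing and a moving average compute, written in plain Julia with only the `Statistics` standard library. The helpers `sliding_windows` and `moving_average` are hypothetical names chosen for illustration; they are not MultiData.jl's API, which may expose these operations differently.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical helper (not part of MultiData.jl): split `x` into windows
# of length `width`, advancing by `step` samples between windows.
function sliding_windows(x::AbstractVector, width::Integer, step::Integer=1)
    width >= 1 || throw(ArgumentError("width must be positive"))
    step >= 1 || throw(ArgumentError("step must be positive"))
    # Views avoid copying the underlying data for each window.
    return [view(x, i:i+width-1) for i in 1:step:(length(x) - width + 1)]
end

# Hypothetical helper: a moving average is the mean of each window.
moving_average(x::AbstractVector, width::Integer, step::Integer=1) =
    [mean(w) for w in sliding_windows(x, width, step)]

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
smoothed = moving_average(series, 3)        # overlapping windows of 3 samples
downsampled = moving_average(series, 2, 2)  # non-overlapping windows of 2 samples
```

With this input, `smoothed` is `[2.0, 3.0, 4.0, 5.0]` and `downsampled` is `[1.5, 3.5, 5.5]`. Setting `step` equal to `width` gives non-overlapping windows, which also reduces the length of the series.
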
The package is developed by the ACLAI Lab at the University of Ferrara.
MultiData.jl was originally built for representing multimodal datasets in Sole.jl, an open-source framework for symbolic machine learning.