Read the wiki page for development instructions.
Sparrow is an implementation of boosting that is optimized for training on very large datasets, as well as for training in limited-memory settings. The optimizations rely on two techniques: early stopping and selective sampling. Please read our paper for more details.
The latest implementation is described at https://github.com/arapat/sparrow-writeup
Documentation: https://arapat.github.io/sparrow/sparrow/
- Install Rust by following the instructions on the official Rust website.
- Compile Sparrow:
cargo build --release
The Sparrow binary is generated at target/release/sparrow.
Sparrow is written as a Rust library and can also be run as a standalone binary. The sparrow binary reads its settings from a configuration file that you specify. Example configuration files can be found in the examples/ directory.
To run the Sparrow binary, please provide the path to the configuration file.
For training,
./sparrow train <path to the config file>
For testing (or prediction),
./sparrow test <path to the config file>
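The two commands above can be wrapped in a small shell helper so that training and testing always use the same binary path. This is only a sketch: the function name `run_sparrow` and the variables `SPARROW_BIN` and `DRY_RUN` are invented for illustration and are not part of Sparrow. The default binary path assumes you run it from the repository root after `cargo build --release`.

```shell
#!/bin/sh
# Hypothetical wrapper around the Sparrow binary (not shipped with Sparrow).
# SPARROW_BIN and DRY_RUN are names made up for this sketch.
SPARROW_BIN="${SPARROW_BIN:-target/release/sparrow}"

run_sparrow() {
    mode="$1"      # "train" or "test", the two modes described above
    config="$2"    # path to the configuration file

    # Reject anything other than the two documented modes.
    case "$mode" in
        train|test) ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac

    if [ -z "$config" ]; then
        echo "usage: run_sparrow <train|test> <path to the config file>" >&2
        return 2
    fi

    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of executing it.
        echo "$SPARROW_BIN $mode $config"
    else
        "$SPARROW_BIN" "$mode" "$config"
    fi
}
```

After sourcing the file, `run_sparrow train <path to the config file> && run_sparrow test <path to the config file>` runs testing only if training succeeds. Setting `DRY_RUN=1` prints the commands without running them, which is handy for checking paths first.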
To use Sparrow as a Rust library, please refer to its documentation (generated by rustdoc).
Read configuration.md for how to set up the config.yaml file.
Sample config.yaml files can be found in the sparrow-expriments repository.