Showing 1 changed file with 44 additions and 11 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,23 +1,56 @@ | ||
# clojask | ||
# Clojask | ||
Clojure data frame with parallel computing on larger-than-memory datasets | ||
|
||
#### Run the main function in `core`: | ||
### Features | ||
|
||
``` | ||
lein run | ||
``` | ||
- **Unlimited size** | ||
|
||
In theory, it supports datasets larger than memory, with no upper bound on size! | ||
|
||
- **Fast** | ||
|
||
Faster than Dask in most operations; the larger the dataframe, the bigger the advantage. | ||
|
||
- **All native types** | ||
|
||
All the datatypes used to store data are native Clojure (or Java) types! | ||
|
||
- **From file to file** | ||
|
||
IO is integrated into the dataframe. No need to write your own read-in and output functions! | ||
|
||
- **Parallel** | ||
|
||
#### Run the tests in `test`: | ||
Most operations can be executed across multiple threads or even machines. See the underlying principle in [Onyx](http://www.onyxplatform.org/). | ||
|
||
- **Lazy operations** | ||
|
||
Most operations are not executed immediately. The dataframe pipelines them together when the computation runs. | ||
|
||
### Installation | ||
|
||
Available on [Clojars](https://clojars.org/com.github.clojure-finance/clojask). | ||
|
||
Add this line to the `:dependencies` vector of your `project.clj` if using Leiningen. | ||
|
||
``` | ||
lein test | ||
[com.github.clojure-finance/clojask "1.0.0"] | ||
``` | ||
|
||
Add this entry to the `:deps` map of your `deps.edn` if using the Clojure CLI. | ||
|
||
To run a particular test defined in the namespace: | ||
``` | ||
lein test :only core-test/df-api-test | ||
com.github.clojure-finance/clojask {:mvn/version "1.0.0"} | ||
``` | ||
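For orientation, here is a minimal sketch of where each coordinate from the added Installation section would sit in a complete file. The project name `my-app`, its version, and the Clojure version shown are placeholders assumed for illustration; only the Clojask coordinates come from the README text.

```clojure
;; project.clj (Leiningen) -- `my-app` and the Clojure version are assumed placeholders
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.3"]
                 ;; coordinate taken from the README
                 [com.github.clojure-finance/clojask "1.0.0"]])
```

```clojure
;; deps.edn (Clojure CLI) -- the Clojure version is an assumed placeholder
{:deps {org.clojure/clojure {:mvn/version "1.10.3"}
        ;; coordinate taken from the README
        com.github.clojure-finance/clojask {:mvn/version "1.0.0"}}}
```

Only one of the two files is needed, depending on the build tool in use.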
|
||
#### Requirements for the input file: | ||
- the first row should contain the column names | ||
### Documentation | ||
|
||
Detailed documentation for every API function can be found [here](https://clojure-finance.github.io/clojask-website/posts-output/API/). | ||
|
||
### Examples | ||
|
||
A separate repository with typical usage examples of Clojask can be found [here](https://github.com/clojure-finance/clojask-examples). | ||
|
||
### Problem Feedback | ||
|
||
If your question is not answered in the existing [issues](https://github.com/clojure-finance/clojask/issues), feel free to create a new one. |