Releases · KxSystems/ml

10 Aug 11:27

cmccarthy1

1.0.0-rc.4

3405543

Modification to tab2df to handle single character columns

Pre-release

This release covers two changes to the interface:

Fixed a minor bug in .ml.tab2df that converted character ('c') columns incorrectly: the full column string was duplicated on every row instead of one character per row.

// Define a table which will highlight the incorrect behaviour
q)tab:([]s:`a`b`c;j:1 2 3;c:"ABC")
// Old behaviour (duplicating 'ABC')
q)print .ml.tab2df tab
   s  j    c
0  a  1  ABC
1  b  2  ABC
2  c  3  ABC
// New behaviour
q)print .ml.tab2df tab  
   s  j    c
0  a  1  A
1  b  2  B
2  c  3  C

Minor change to the test scripts used for continuous integration, required by an update to default behaviour on the Python side.

08 Jul 12:28

cmccarthy1

1.0.0-rc.3

d69657c

Change to Kolmogorov Smirnov behaviour to account for scipy version

Pre-release

The Kolmogorov-Smirnov update in release 1.0.0-rc.2 changed the behaviour of the feature significance tests to account for an update in scipy, but it broke on older versions of scipy. This has been fixed with a scipy version check.
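
A version check of this kind can be sketched in Python as below. This is only an illustration, not the toolkit's implementation: the helper names `parse_version`, `at_least` and `ks_variant` are hypothetical, and the threshold `1.5.0` is an assumption, since the release note does not say which scipy version introduced the change.

```python
import re


def parse_version(version):
    """Return the first three numeric components of a version string as ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that "1.5" compares equal to "1.5.0".
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def at_least(installed, minimum):
    """True when `installed` is the same as or newer than `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# Hypothetical threshold: the release note does not name the scipy version.
ASSUMED_THRESHOLD = "1.5.0"


def ks_variant(installed_scipy_version, threshold=ASSUMED_THRESHOLD):
    """Choose which code path to run for the installed scipy version."""
    return "new" if at_least(installed_scipy_version, threshold) else "legacy"


print(ks_variant("1.4.1"))
print(ks_variant("1.10.0"))
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.10.0" sorts before "1.5.0", which would send a newer scipy down the legacy path.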

06 Jun 18:55

cmccarthy1

1.0.0-rc.1

7cf1ae7

Release candidate update, df2tab conversion handling of NaT

Pre-release

The previous version of df2tab did not account for null temporal types as introduced in pandas v1.0.0; the updated functionality addresses this.
Support for pandas has migrated to >1.0.0, as this is the production version of pandas and thus presently the stable version of the API.

12 May 10:40

awilson-kx

1.0.0-rc

7c0c8f0

Initial release candidate for version 1.0.0

Pre-release

Added clustering
Kx Clustering brings unsupervised machine learning techniques directly to kdb+ data, enabling users to discover patterns and infer hidden relationships within their datasets.
Features include:

K-means clustering
Hierarchical clustering
DBSCAN clustering
Affinity Propagation clustering
CURE clustering
KD Tree implementation (for optimized nearest neighbor calculations)
Range of distance metrics and linkage algorithms
Clustering scoring metrics

06 Jan 13:17

awilson-kx

v0.3.4

fab4c32

Updated mproc to manage multiple loads (e.g. fresh and xval)
Minor changes to match scipy/numpy
Date and timezone management in pandas functions
Fixed tests

19 Sep 09:14

awilson-kx

v0.3.3

9f653e8

MODIFICATIONS:

Example notebooks (and associated data/images) moved to mlnotebooks repo

17 Jul 15:57

cmccarthy1

v0.3.2

b63cf7f

MODIFICATIONS:

Updated the pandas requirement. This is needed because of modifications to .ml.df2tab and .ml.tab2df that handle date and time types in conversions.
-> Pandas>=0.21

05 Jul 16:26

awilson-kx

v0.3.1

5cc5af2

NEW
Multiprocessing library (mproc) for transparently distributing jobs
Serialization/deserialization (pickle) library for Python objects
Cross validation functions

.ml.xv.kfshuff (K-Fold cross-validation with randomized indices)
.ml.xv.kfsplit (K-Fold cross-validation with sequential indices)
.ml.xv.kfstrat (K-Fold cross-validation with stratified indices)
.ml.xv.mcsplit (Monte-Carlo cross-validation with random split indices)
.ml.xv.pcsplit (Percentage split cross-validation)
.ml.xv.tschain (Chain-forward cross-validation)
.ml.xv.tsrolls (Roll-forward cross-validation)
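
The index-splitting schemes named above can be sketched in plain Python. This is a rough illustration under stated assumptions, not the toolkit's q implementation: the function names are hypothetical, and the reading of chain-forward (train on every fold so far, test on the next) and roll-forward (train on one fold, test on the next) follows the common definitions of those terms.

```python
import random


def fold_chunks(n, k, shuffle=False, seed=None):
    """Split the indices 0..n-1 into k nearly equal folds."""
    idx = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(idx)
    size, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        stop = start + size + (1 if i < extra else 0)
        folds.append(idx[start:stop])
        start = stop
    return folds


def kfold(n, k, shuffle=False, seed=None):
    """K-fold: each fold is the test set once, the rest form the training set."""
    folds = fold_chunks(n, k, shuffle, seed)
    return [([j for m, f in enumerate(folds) if m != i for j in f], folds[i])
            for i in range(k)]


def chain_forward(n, k):
    """Train on all folds up to and including fold i, test on fold i+1."""
    folds = fold_chunks(n, k)
    return [([j for f in folds[:i + 1] for j in f], folds[i + 1])
            for i in range(k - 1)]


def roll_forward(n, k):
    """Train on fold i only, test on fold i+1."""
    folds = fold_chunks(n, k)
    return [(folds[i], folds[i + 1]) for i in range(k - 1)]


for train, test in chain_forward(6, 3):
    print(train, test)
```

Sequential and randomized K-fold differ only in whether the indices are shuffled before chunking; the two time-series schemes never shuffle, so that test indices always come after the training indices.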

Grid search functions

.ml.gs.kfshuff (K-Fold cross-validation with randomized indices)
.ml.gs.kfsplit (K-Fold cross-validation with sequential indices)
.ml.gs.kfstrat (K-Fold cross-validation with stratified indices)
.ml.gs.mcsplit (Monte-Carlo cross-validation with random split indices)
.ml.gs.pcsplit (Percentage split cross-validation)
.ml.gs.tschain (Chain-forward cross-validation)
.ml.gs.tsrolls (Roll-forward cross-validation)
Cross validation and gridsearch automatically support multiprocessing jobs
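
As a rough picture of what a grid search does on top of a cross-validation scheme, the Python sketch below scores every hyperparameter combination on every split and keeps the best mean score. The names `parameter_grid`, `grid_search` and `toy_score` are hypothetical and the scoring function is a stand-in; none of this reflects the .ml.gs function signatures.

```python
from itertools import product


def parameter_grid(space):
    """Yield one dict per combination of the values listed for each name."""
    names = sorted(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(score_fn, space, splits):
    """Return (best_params, best_mean_score) over every combination."""
    best_params, best_score = None, float("-inf")
    for params in parameter_grid(space):
        scores = [score_fn(params, train, test) for train, test in splits]
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best_params, best_score = params, mean
    return best_params, best_score


# Three hand-written (train, test) index splits, as a K-fold scheme might produce.
splits = [([2, 3, 4, 5], [0, 1]), ([0, 1, 4, 5], [2, 3]), ([0, 1, 2, 3], [4, 5])]


def toy_score(params, train, test):
    """Stand-in for fitting a model on `train` and scoring it on `test`."""
    return -abs(params["depth"] - 3) - abs(params["rate"] - 0.1)


best, score = grid_search(toy_score, {"depth": [1, 3, 5], "rate": [0.01, 0.1]}, splits)
print(best)
```

Because each (parameter set, split) evaluation is independent of the others, this loop is the kind of work that can be distributed across processes.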

UPDATES
FRESH automatically supports multiprocessing jobs
Pandas conversion functions (.ml.df2tab and .ml.tab2df) support temporal conversions

12 Apr 16:03

awilson-kx

v0.2.1

d765f9a

NEW

Ten new statistical metrics (fbscore, r2score, Matthews correlation coefficient etc.).
Two categorical encoding schemes (lexicographical and frequency).
Time/Date encoding.
Multiple hyper-parameter inputs now supported in FRESH.
Two new significant feature selection options (k-best & percentile).
MODIFICATIONS
Input structure of .ml.fresh.createfeatures modified; a full explanation is at
(code.kx.com/ml/toolkit/fresh).
Input structure of .ml.fresh.significantfeatures modified to account for
additional significant feature selection methods.
Removal of the .ml.util namespace, compressed into .ml. This tidies implementations and
removes ambiguity over whether functions were true utils.
NOTE: functions below here may have previously been in the .ml.util namespace.
Underlying file structure changed to tidy code locations within the toolkit:
statistical functions -> util/metrics.q,
true utils -> util/util.q,
preprocessing functions -> util/preproc.q.
.ml.onehot no longer supports lists; input is expected as a table. Encoding can be set to
operate on a column-by-column basis.
.ml.comb returns combinations in ascending order; the previous implementation
had a non-obvious return pattern.
.ml.filltab has a modified expected dictionary input. The previous behaviour was
`linear`mean`median!`x`x1`x2; this has been changed to a more 'q like'
mapping of columns to desired behaviours, `x`x1`x2!`linear`mean`median.
.ml.filltab no longer defaults to forward+backward fill on entry of ()!(); entry of an
empty dictionary now returns the original table. The default forward+backward fill is
achieved through entry of :: in place of the dictionary.
.ml.dropconstant now supports removal of constant keys of a dictionary.
FIXES
.ml.infreplace only worked correctly when both positive and
negative infinities existed within the vector. The function now operates whether positive,
negative or no infinities are present in the vector.
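
The three cases the fix covers can be illustrated with a small Python sketch. It assumes that infinities are replaced by the largest and smallest finite values in the vector, which is an assumption about .ml.infreplace rather than something this note states; `inf_replace` is a hypothetical name.

```python
import math


def inf_replace(values):
    """Replace +inf with the largest finite value and -inf with the smallest."""
    finite = [v for v in values if math.isfinite(v)]
    if not finite:
        # Nothing finite to substitute, so return the input unchanged.
        return list(values)
    hi, lo = max(finite), min(finite)
    return [hi if v == math.inf else lo if v == -math.inf else v
            for v in values]


print(inf_replace([1.0, math.inf, 3.0]))            # only a positive infinity
print(inf_replace([-math.inf, 2.0, 5.0]))           # only a negative infinity
print(inf_replace([1.0, 2.0]))                      # no infinities
print(inf_replace([-math.inf, 4.0, math.inf, 7.0])) # both
```

Handling each sign independently is what keeps the function from depending on both infinities being present.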
REMOVED
.ml.util.traintestsplitseed; the same behaviour can be obtained by setting the seed via q)\S x prior
to application of .ml.traintestsplit.

17 Dec 12:49

cmccarthy1

v0.1.2

48ab13a

v0.1.2

Pre-release

Fix to the significant features function and addition of an Appveyor test for Windows install.

Changes to the feature significance function. In the previous release this performed incorrectly because of how .ml.fresh.benjhochfind and .ml.fresh.featuresignificance interacted.
Tests of the Benjamini-Hochberg procedure have been made more rigorous to ensure the function performs correctly.
Appveyor tests are now explicitly called on upload of new changes.
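
For reference, the standard Benjamini-Hochberg step-up procedure can be sketched in Python as follows. This is the textbook procedure, not a transcription of .ml.fresh.benjhochfind; the function name and the default false discovery rate of 0.05 are assumptions made for the example.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose p-value satisfies p <= k / m * fdr.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p))
```

The procedure is step-up: a p-value that misses its own threshold is still rejected if a larger p-value further down the ranking meets its threshold, which is an easy detail to get wrong when testing.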

6034eba: Fix to feature significance function
48ab13a: Change to path function to allow it to load the library on Windows
