Hello. Thanks for this package, but I am running into several problems with it.
First, mi.py uses the entropy implementation by Gael Varoquaux, which gives negative MI values. I replaced it with sklearn's MI estimator, which got rid of that problem, but the features that end up being selected still don't make sense.
I used the iris dataset from sklearn and duplicated one feature. As you can see, the method ends up selecting the same feature twice, which shouldn't happen. Here is the MWE:
import numpy as np
import pandas as pd
from sklearn import datasets
import mifs

iris = datasets.load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
X = iris_df[['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)']].values
print(X[:5, :])
# prepend a copy of column 2 (petal length) so the same feature appears twice
X = np.hstack((X[:, 2].reshape((-1, 1)), X))
print(X[:5, :])
# use petal width as the (continuous) target
y = iris_df['petal width (cm)'].values
# define MI_FS feature selection method
feat_selector = mifs.MutualInformationFeatureSelector(categorical=False, n_features=2)
# find all relevant features
feat_selector.fit(X, y)
# check selected features
print(feat_selector._support_mask)
# check ranking of features
print(feat_selector.ranking_)
# call transform() on X to filter it down to selected features
X_filtered = feat_selector.transform(X)
You can comment or uncomment the line that appends the duplicate (the np.hstack call) to compare the results with and without it.
Also, feat_selector has no attribute called support, so I had to replace it with _support_mask in your example. The only code I changed was the function _get_first_mi, which now reads:
def _get_first_mi(i, k, MI_FS):
    n, p = MI_FS.X.shape
    if MI_FS.categorical:
        x = MI_FS.X[:, i].reshape((n, 1))
        MI = _mi_dc(x, MI_FS.y, k)
    else:
        vars = (MI_FS.X[:, i].reshape((n, 1)), MI_FS.y)
        MI = _mi_cc(vars, k)
        from sklearn.feature_selection import mutual_info_regression
        MI_2 = mutual_info_regression(vars[0], vars[1], n_neighbors=k)
        MI = MI_2[0]
    # MI must be non-negative
    if MI > 0:
        return MI
    else:
        return np.nan
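For reference, here is a minimal sketch that leaves mifs out entirely and only checks what sklearn's estimator reports on the same data. It computes the MI of each column (including the duplicated one) with the target. The variable names, n_neighbors=5, and random_state=0 are my own choices for illustration, not anything taken from mifs.

```python
import numpy as np
from sklearn import datasets
from sklearn.feature_selection import mutual_info_regression

iris = datasets.load_iris()
X = iris.data[:, :3]   # sepal length, sepal width, petal length
y = iris.data[:, 3]    # petal width as the continuous target

# Prepend a copy of petal length, so columns 0 and 3 hold identical values.
X_dup = np.hstack((X[:, [2]], X))

# Per-feature MI with the target, using sklearn's k-nearest-neighbour estimator.
# n_neighbors and random_state are illustrative assumptions.
mi = mutual_info_regression(X_dup, y, n_neighbors=5, random_state=0)

# Confirm that columns 0 and 3 really are the same feature.
is_duplicate = np.array_equal(X_dup[:, 0], X_dup[:, 3])

print("MI per column:", mi)
print("columns 0 and 3 identical:", is_duplicate)
```

Since columns 0 and 3 are identical, I would expect a selector that penalises redundancy to pick at most one of them, even if both score highly on their own here.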