Step Forward, Step Backward and Exhaustive Feature Selection | Wrapper Method | KGP Talkie
Wrapper method
Characteristics of the wrapper method
- It evaluates combinations of variables to determine their predictive power.
- It searches for the best-performing combination of variables.
- It is more computationally expensive than the filter method.
- It usually performs better than the filter method.
- It is not recommended when the number of features is high.

Forward Step Selection
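In this wrapper method, the algorithm starts with no features and adds the single best feature at each step, judged by model performance. It stops when the required number of features has been selected, and the selected features together are used for the best accuracy.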
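The greedy idea described above can be sketched in a few lines of plain Python. This is only an illustration: `forward_select`, `toy_weights` and `toy_score` are hypothetical names, and the toy score simply adds up made-up weights, whereas a real wrapper would fit and validate a model for every candidate subset.

```python
def forward_select(features, score, k):
    """Greedily grow a feature subset until it holds k features."""
    selected = []
    remaining = list(features)
    while len(selected) < k and remaining:
        # Try adding each remaining feature and keep the one that scores best.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-feature usefulness (stands in for a model's accuracy).
toy_weights = {"a": 0.1, "b": 0.5, "c": 0.3, "d": 0.05}


def toy_score(subset):
    return sum(toy_weights[f] for f in subset)


chosen = forward_select(toy_weights, toy_score, k=2)
print(chosen)
```

With this toy score, the first step picks "b" and the second step adds "c".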

Backward Step Selection
It is the reverse of the Forward Step Selection method: initially it takes all the features and removes one feature at each step. It finishes with the required number of features, chosen for the best accuracy.
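A matching sketch for the backward direction, under the same assumptions as before: `backward_select`, `toy_weights` and `toy_score` are hypothetical, and the additive toy score stands in for a real model evaluation.

```python
def backward_select(features, score, k):
    """Start from all features and drop one per step until k remain."""
    selected = list(features)
    while len(selected) > k:
        # Drop the feature whose removal leaves the best-scoring subset.
        to_drop = max(
            selected,
            key=lambda f: score([g for g in selected if g != f]),
        )
        selected.remove(to_drop)
    return selected


# Hypothetical per-feature usefulness (stands in for a model's accuracy).
toy_weights = {"a": 0.1, "b": 0.5, "c": 0.3, "d": 0.05}


def toy_score(subset):
    return sum(toy_weights[f] for f in subset)


kept = backward_select(toy_weights, toy_score, k=2)
print(kept)
```

Here "d" is dropped first and "a" second, which leaves "b" and "c".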

Exhaustive Feature Selection
- It is also called the subset selection method.
- It fits the model with each possible combination of the N features, for example y = B0, y = B0 + B1.X1, y = C0 + C1.X2, and so on.
- It requires massive computational power.
- It uses test error to evaluate model performance.
Drawback
It is slower than the step forward and step backward methods.
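As a rough sketch of exhaustive search, the snippet below scores every subset within a size range using `itertools.combinations`. The function name, the weights and the bonus for the pair ("a", "d") are all made up for illustration; the bonus is there to show a case where the best pair is not built from the individually best feature.

```python
from itertools import combinations


def exhaustive_select(features, score, min_size, max_size):
    """Score every subset with min_size..max_size features; return the best."""
    best_subset, best_score = None, float("-inf")
    n_evaluated = 0
    for size in range(min_size, max_size + 1):
        for subset in combinations(features, size):
            n_evaluated += 1
            current = score(subset)
            if current > best_score:
                best_subset, best_score = subset, current
    return best_subset, best_score, n_evaluated


# Hypothetical weights, plus a made-up interaction between "a" and "d".
toy_weights = {"a": 0.1, "b": 0.5, "c": 0.3, "d": 0.05}


def toy_score(subset):
    base = sum(toy_weights[f] for f in subset)
    if "a" in subset and "d" in subset:
        base += 1.0  # the two features are only useful together
    return base


best_subset, best_score, n_evaluated = exhaustive_select(
    toy_weights, toy_score, min_size=2, max_size=2
)
print(best_subset, n_evaluated)
```

A greedy forward search on the same toy score would pick "b" first and end with ("b", "c"), so it would never reach ("a", "d"). The price of the exhaustive search is that every subset has to be evaluated.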
Use of mlxtend in Wrapper Method
!pip install mlxtend
More Information Available at http://rasbt.github.io/mlxtend/
How it works
Sequential feature selection algorithms are a family of greedy search algorithms that are used to reduce an initial d-dimensional feature space to a k-dimensional feature subspace where k < d.
In a nutshell, SFAs remove or add one feature at a time based on the classifier performance until a feature subset of the desired size k is reached. There are 4 different flavors of SFAs available via the SequentialFeatureSelector:
- Sequential Forward Selection (SFS)
- Sequential Backward Selection (SBS)
- Sequential Forward Floating Selection (SFFS)
- Sequential Backward Floating Selection (SBFS)
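To get a feel for why the sequential approach is cheaper than trying every subset, the sketch below counts candidate subsets for plain (non-floating) forward selection and for a full exhaustive search. The helper names are hypothetical, and the counts are for candidate subsets only; each candidate is typically cross-validated, so the number of model fits is larger, and floating variants may evaluate additional candidates.

```python
def forward_evaluations(d, k):
    """Candidate subsets scored when growing from 0 to k of d features."""
    # Step i has d - i features left to try.
    return sum(d - i for i in range(k))


def exhaustive_evaluations(d):
    """All non-empty subsets of d features."""
    return 2 ** d - 1


n_forward = forward_evaluations(13, 7)    # 13 + 12 + ... + 7
n_exhaustive = exhaustive_evaluations(13)
print(n_forward, n_exhaustive)
```

For 13 features and k = 7 this gives 70 candidates for forward selection against 8191 for the full exhaustive search.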
Step Forward Selection (SFS)
Importing required libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from mlxtend.feature_selection import SequentialFeatureSelector as SFS
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
We are going to use the wine dataset. We can load this dataset from sklearn.
data = load_wine()
Let’s get the keys of this dataset.
data.keys()
dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names'])
Let’s get the description of the wine dataset.
print(data.DESCR)
.. _wine_dataset:
Wine recognition dataset
------------------------
**Data Set Characteristics:**
:Number of Instances: 178 (50 in each of three classes)
:Number of Attributes: 13 numeric, predictive attributes and the class
:Attribute Information:
- Alcohol
- Malic acid
- Ash
- Alcalinity of ash
- Magnesium
- Total phenols
- Flavanoids
- Nonflavanoid phenols
- Proanthocyanins
- Color intensity
- Hue
- OD280/OD315 of diluted wines
- Proline
- class:
- class_0
- class_1
- class_2
:Summary Statistics:
============================= ==== ===== ======= =====
Min Max Mean SD
============================= ==== ===== ======= =====
Alcohol: 11.0 14.8 13.0 0.8
Malic Acid: 0.74 5.80 2.34 1.12
Ash: 1.36 3.23 2.36 0.27
Alcalinity of Ash: 10.6 30.0 19.5 3.3
Magnesium: 70.0 162.0 99.7 14.3
Total Phenols: 0.98 3.88 2.29 0.63
Flavanoids: 0.34 5.08 2.03 1.00
Nonflavanoid Phenols: 0.13 0.66 0.36 0.12
Proanthocyanins: 0.41 3.58 1.59 0.57
Colour Intensity: 1.3 13.0 5.1 2.3
Hue: 0.48 1.71 0.96 0.23
OD280/OD315 of diluted wines: 1.27 4.00 2.61 0.71
Proline: 278 1680 746 315
============================= ==== ===== ======= =====
:Missing Attribute Values: None
:Class Distribution: class_0 (59), class_1 (71), class_2 (48)
:Creator: R.A. Fisher
:Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
:Date: July, 1988
This is a copy of UCI ML Wine recognition datasets.
https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
The data is the results of a chemical analysis of wines grown in the same region in Italy by three different cultivators. There are thirteen different measurements taken for different constituents found in the three types of wine.
Let’s go ahead and get the data in x and y vectors.
X = pd.DataFrame(data.data)
y = data.target
X.columns = data.feature_names
X.head()
| | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127.0 | 2.80 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065.0 |
| 1 | 13.20 | 1.78 | 2.14 | 11.2 | 100.0 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.40 | 1050.0 |
| 2 | 13.16 | 2.36 | 2.67 | 18.6 | 101.0 | 2.80 | 3.24 | 0.30 | 2.81 | 5.68 | 1.03 | 3.17 | 1185.0 |
| 3 | 14.37 | 1.95 | 2.50 | 16.8 | 113.0 | 3.85 | 3.49 | 0.24 | 2.18 | 7.80 | 0.86 | 3.45 | 1480.0 |
| 4 | 13.24 | 2.59 | 2.87 | 21.0 | 118.0 | 2.80 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735.0 |
Now we will check whether any null values are present in the dataset by using isnull().sum().
X.isnull().sum()
alcohol                         0
malic_acid                      0
ash                             0
alcalinity_of_ash               0
magnesium                       0
total_phenols                   0
flavanoids                      0
nonflavanoid_phenols            0
proanthocyanins                 0
color_intensity                 0
hue                             0
od280/od315_of_diluted_wines    0
proline                         0
dtype: int64
Let’s go ahead and split this dataset into training and test sets. Have a look at the following code.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
X_train.shape, X_test.shape
((142, 13), (36, 13))
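The shapes above can be reproduced with a little arithmetic. The sketch assumes that a fractional test size is rounded up to a whole number of samples, which is consistent with the 36 test rows shown; the variable names are only for illustration.

```python
import math

n_samples = 178   # rows in the wine dataset
test_size = 0.2   # fraction passed to train_test_split above

# Assumption: the test share is rounded up, the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test
print(n_train, n_test)
```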
Let’s go ahead and start working on Step Forward Feature Selection (SFS).
Step Forward Feature Selection (SFS)
Here we use SequentialFeatureSelector() and pass it a Random Forest classifier configured with the number of estimators, random_state and the number of jobs.
k_features is the required number of features to select.
Since this is the step forward method, forward is set to True. floating is set to False for plain (non-floating) selection.
verbose controls the amount of logging; here we use 2. cv is the number of cross-validation folds; here we choose 4. n_jobs is the number of CPU cores to use; -1 means use all the cores available on the system.
sfs = SFS(RandomForestClassifier(n_estimators=100, random_state=0, n_jobs = -1),
k_features = 7,
forward= True,
floating = False,
verbose= 2,
scoring= 'accuracy',
cv = 4,
n_jobs= -1
).fit(X_train, y_train)
[2020-08-06 12:56:29] Features: 1/7 -- score: 0.7674603174603174
[2020-08-06 12:56:33] Features: 2/7 -- score: 0.9718253968253968
[2020-08-06 12:56:37] Features: 3/7 -- score: 0.9859126984126985
[2020-08-06 12:56:42] Features: 4/7 -- score: 0.9789682539682539
[2020-08-06 12:56:46] Features: 5/7 -- score: 0.9720238095238095
[2020-08-06 12:56:49] Features: 6/7 -- score: 0.9789682539682539
[2020-08-06 12:56:52] Features: 7/7 -- score: 0.9791666666666666
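With cv = 4, every candidate subset is scored once per fold and the logged score is the average over the folds. The sketch below shows how 142 training rows would be divided into four folds if the folds are made as even as possible. That even split is an assumption (a stratified splitter may distribute rows slightly differently), and `fold_sizes` is a hypothetical helper.

```python
def fold_sizes(n_samples, n_folds):
    """Sizes of n_folds validation folds that are as even as possible."""
    base, extra = divmod(n_samples, n_folds)
    # The first `extra` folds receive one additional sample.
    return [base + 1 if i < extra else base for i in range(n_folds)]


sizes = fold_sizes(142, 4)
print(sizes)
```

Folds of 36 and 35 rows fit the per-fold scores reported for this run: a score of 0.7222222222222222 is consistent with 26 correct predictions out of 36.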
sfs.k_feature_names_
('alcohol',
'ash',
'magnesium',
'flavanoids',
'proanthocyanins',
'color_intensity',
'proline')
sfs.k_feature_idx_
(0, 2, 4, 6, 8, 9, 12)
sfs.k_score_
0.9791666666666666
pd.DataFrame.from_dict(sfs.get_metric_dict()).T
| | feature_idx | cv_scores | avg_score | feature_names | ci_bound | std_dev | std_err |
|---|---|---|---|---|---|---|---|
| 1 | (6,) | [0.7222222222222222, 0.8333333333333334, 0.742… | 0.76746 | (flavanoids,) | 0.0670901 | 0.0418533 | 0.024164 |
| 2 | (6, 9) | [0.9444444444444444, 1.0, 0.9714285714285714, … | 0.971825 | (flavanoids, color_intensity) | 0.031492 | 0.0196459 | 0.0113425 |
| 3 | (4, 6, 9) | [0.9722222222222222, 1.0, 0.9714285714285714, … | 0.985913 | (magnesium, flavanoids, color_intensity) | 0.0225862 | 0.0140901 | 0.00813492 |
| 4 | (4, 6, 9, 12) | [0.9722222222222222, 0.9722222222222222, 0.971… | 0.978968 | (magnesium, flavanoids, color_intensity, proline) | 0.0194714 | 0.012147 | 0.00701308 |
| 5 | (2, 4, 6, 9, 12) | [0.9444444444444444, 0.9722222222222222, 0.971… | 0.972024 | (ash, magnesium, flavanoids, color_intensity, … | 0.0314903 | 0.0196449 | 0.011342 |
| 6 | (2, 4, 6, 8, 9, 12) | [0.9722222222222222, 0.9722222222222222, 0.971… | 0.978968 | (ash, magnesium, flavanoids, proanthocyanins, … | 0.0194714 | 0.012147 | 0.00701308 |
| 7 | (0, 2, 4, 6, 8, 9, 12) | [0.9444444444444444, 0.9722222222222222, 1.0, … | 0.979167 | (alcohol, ash, magnesium, flavanoids, proantho… | 0.0369201 | 0.0230321 | 0.0132976 |
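The avg_score, std_dev and std_err columns summarise the per-fold scores. In the first row of the table, std_dev / std_err is 0.0418533 / 0.024164, which is about 1.732, the square root of 3. With four folds this is consistent with a population standard deviation and a standard error based on the sample standard deviation. The sketch below assumes that reading and uses made-up fold scores, because the table truncates the real ones. ci_bound is not reproduced here.

```python
import math
import statistics

# Hypothetical fold scores; the real cv_scores are truncated in the table.
cv_scores = [0.90, 1.00, 0.95, 0.95]
n_folds = len(cv_scores)

avg_score = statistics.fmean(cv_scores)
std_dev = statistics.pstdev(cv_scores)                       # population std
std_err = statistics.stdev(cv_scores) / math.sqrt(n_folds)   # standard error

# Under these definitions the ratio is sqrt(n_folds - 1).
ratio = std_dev / std_err
print(avg_score, std_dev, std_err, ratio)
```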
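k_features can also be given as a tuple (min, max). The selector then returns the best-scoring subset whose size lies within that range.
sfs = SFS(RandomForestClassifier(n_estimators=100, random_state=0, n_jobs = -1),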
k_features = (1, 8),
forward= True,
floating = False,
verbose= 2,
scoring= 'accuracy',
cv = 4,
n_jobs= -1
).fit(X_train, y_train)
[2020-08-06 12:57:09] Features: 1/8 -- score: 0.7674603174603174
[2020-08-06 12:57:13] Features: 2/8 -- score: 0.9718253968253968
[2020-08-06 12:57:17] Features: 3/8 -- score: 0.9859126984126985
[2020-08-06 12:57:22] Features: 4/8 -- score: 0.9789682539682539
[2020-08-06 12:57:26] Features: 5/8 -- score: 0.9720238095238095
[2020-08-06 12:57:29] Features: 6/8 -- score: 0.9789682539682539
[2020-08-06 12:57:31] Features: 7/8 -- score: 0.9791666666666666
[2020-08-06 12:57:33] Features: 8/8 -- score: 0.9791666666666666
Let’s go ahead and see the best cross-validated accuracy found among the subsets of 1 to 8 features.
sfs.k_score_
0.9859126984126985
Now we can see the features selected by the algorithm. The best score was reached with 3 features.
sfs.k_feature_names_
('magnesium', 'flavanoids', 'color_intensity')
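The choice of three features follows directly from the scores logged for this run: among subset sizes 1 to 8, size 3 has the highest score. A minimal sketch of that lookup, with the scores copied from the log above and `scores_by_k` as an illustrative name:

```python
# Scores logged for the forward run with k_features = (1, 8).
scores_by_k = {
    1: 0.7674603174603174,
    2: 0.9718253968253968,
    3: 0.9859126984126985,
    4: 0.9789682539682539,
    5: 0.9720238095238095,
    6: 0.9789682539682539,
    7: 0.9791666666666666,
    8: 0.9791666666666666,
}

# Pick the subset size with the highest score.
best_k = max(scores_by_k, key=scores_by_k.get)
print(best_k, scores_by_k[best_k])
```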
Step Backward Selection (SBS)
Let’s go ahead and work with Step Backward Selection. Have a look at the following script.
The only change compared to Step Forward Selection is that forward is set to False.
sfs = SFS(RandomForestClassifier(n_estimators=100, random_state=0, n_jobs = -1),
k_features = (1, 8),
forward= False,
floating = False,
verbose= 2,
scoring= 'accuracy',
cv = 4,
n_jobs= -1
).fit(X_train, y_train)
[2020-08-06 12:57:54] Features: 12/1 -- score: 0.9861111111111112
[2020-08-06 12:57:58] Features: 11/1 -- score: 0.9861111111111112
[2020-08-06 12:58:03] Features: 10/1 -- score: 0.9791666666666666
[2020-08-06 12:58:08] Features: 9/1 -- score: 0.9861111111111112
[2020-08-06 12:58:12] Features: 8/1 -- score: 0.9859126984126985
[2020-08-06 12:58:15] Features: 7/1 -- score: 0.978968253968254
[2020-08-06 12:58:17] Features: 6/1 -- score: 0.9859126984126985
[2020-08-06 12:58:19] Features: 5/1 -- score: 0.9789682539682539
[2020-08-06 12:58:21] Features: 4/1 -- score: 0.9718253968253968
[2020-08-06 12:58:24] Features: 3/1 -- score: 0.9718253968253968
[2020-08-06 12:58:26] Features: 2/1 -- score: 0.9718253968253968
[2020-08-06 12:58:28] Features: 1/1 -- score: 0.7674603174603174
sbs = sfs
sbs.k_score_
0.9859126984126985
Let’s get the selected features.
sbs.k_feature_names_
('alcohol',
'malic_acid',
'ash',
'alcalinity_of_ash',
'magnesium',
'flavanoids',
'nonflavanoid_phenols',
'color_intensity')
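In the backward log, the subsets with 8 and 6 features print the same score, yet 8 features are returned. One reading that reproduces this, and it is only an assumption about how ties are broken, is that the first subset evaluated with the top score inside the (1, 8) range is kept. The sketch below applies that rule to the logged scores; `backward_scores` is an illustrative name.

```python
# Logged scores for subset sizes inside the (1, 8) range, in the order
# the backward run evaluated them (largest subset first).
backward_scores = [
    (8, 0.9859126984126985),
    (7, 0.978968253968254),
    (6, 0.9859126984126985),
    (5, 0.9789682539682539),
    (4, 0.9718253968253968),
    (3, 0.9718253968253968),
    (2, 0.9718253968253968),
    (1, 0.7674603174603174),
]

best_k, best_score = None, float("-inf")
for k, score in backward_scores:
    # Assumed rule: a strict comparison, so an equal later score does not win.
    if score > best_score:
        best_k, best_score = k, score

print(best_k, best_score)
```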
Exhaustive Feature Selection (EFS)
Let’s go ahead and learn about Exhaustive Feature Selection (EFS).
from mlxtend.feature_selection import ExhaustiveFeatureSelector as EFS
It evaluates every feature subset, starting with the subsets of min_features features and going up to the subsets of max_features features.
efs = EFS(RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=-1),
min_features= 4,
max_features= 5,
scoring='accuracy',
cv = None,
n_jobs=-1
).fit(X_train, y_train)
Features: 2002/2002
So, with a minimum subset size of 4 and a maximum of 5, exhaustive feature selection has evaluated 2002 subsets.
C(13, 4) + C(13, 5) = 715 + 1287
715 + 1287
2002
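The same count can be checked in Python, both with the binomial formula and by enumerating the subsets. This is a small sketch using only the standard library (`math.comb` needs Python 3.8 or newer); the variable names are illustrative.

```python
from itertools import combinations
from math import comb

n_features = 13

# Number of 4-feature and 5-feature subsets, by formula.
by_formula = comb(n_features, 4) + comb(n_features, 5)

# The same count obtained by listing every subset of size 4 or 5.
by_enumeration = sum(
    1
    for size in (4, 5)
    for _ in combinations(range(n_features), size)
)

print(by_formula, by_enumeration)
```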
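Let’s find the best accuracy for the EFS algorithm with the following code. Note that because cv = None was passed above, no cross-validation is performed and the score is computed on the training data itself, so it is an optimistic estimate.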
efs.best_score_
1.0
Now get the selected features for the best score.
efs.best_feature_names_
('alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash')
Let’s get indices of selected features.
efs.best_idx_
(0, 1, 2, 3)
from mlxtend.plotting import plot_sequential_feature_selection as plot_sfs
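Now, try to plot the performance of the evaluated subsets. Note that the metric dictionary returned by EFS is keyed by the index of each evaluated subset rather than by the number of features, so the x-axis runs over the subsets.
plot_sfs(efs.get_metric_dict(), kind='std_dev')
plt.title('Performance of the EFS algorithm across the evaluated feature subsets')
plt.show()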
