Feature selection Archives

Feature Selection, Machine Learning, Numpy, Pandas, Python

Feature Engineering Tutorial Series 6: Variable magnitude

Does the magnitude of the variable matter? In Linear Regression models, the scale of variables used to estimate the output matters. Linear models are of the type y = w x + b, where the regression coefficient w represents the expected change in y for a one unit change in x Read more…

By georgiannacambel, 4 October 2020
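To make the point about variable magnitude concrete, here is a minimal sketch in plain Python. The data, the variable names, and the `fit_line` helper are hypothetical and are not taken from the tutorial. It fits y = w x + b twice, once with x in kilometres and once with the same x in metres, so the two coefficients can be compared.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w * x + b with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x          # expected change in y per one-unit change in x
    b = mean_y - w * mean_x     # intercept
    return w, b


# Hypothetical data: distance travelled vs. delivery cost.
distance_km = [1.0, 2.0, 3.0, 4.0, 5.0]
cost = [12.0, 14.5, 16.0, 18.5, 20.0]

w_km, b_km = fit_line(distance_km, cost)

# Same information, expressed in metres instead of kilometres.
distance_m = [d * 1000 for d in distance_km]
w_m, b_m = fit_line(distance_m, cost)

print("coefficient per km:", w_km)
print("coefficient per m: ", w_m)
```

The coefficient for metres is 1000 times smaller than the coefficient for kilometres while the intercept is unchanged, which is why coefficients can only be compared across variables once the variables are on a common scale.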

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, Python

Feature Engineering Tutorial Series 5: Outliers

An outlier is a data point that is significantly different from the remaining data. “An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism.” [D. Hawkins. Identification of Outliers, Chapman and Hall, 1980.] Should Read more…

By georgiannacambel, 3 October 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, Python

Feature Engineering Tutorial Series 4: Linear Model Assumptions

Linear models make the following assumptions about the independent variables X used to predict Y: there is a linear relationship between X and the outcome Y; the independent variables X are normally distributed; there is little or no collinearity among the independent variables; and homoscedasticity (homogeneity of variance) holds. Examples of linear Read more…

By georgiannacambel, 2 October 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, Python

Feature Engineering Series Tutorial 3: Rare Labels

Labels that occur rarely. Categorical variables are those whose values are selected from a group of categories, also called labels. Different labels appear in the dataset with different frequencies. Some categories appear frequently in the dataset, whereas others appear in only a few observations. For Read more…

By georgiannacambel, 1 October 2020
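As a rough illustration of what "rare" can mean in practice, the sketch below computes the relative frequency of each label and flags those that fall under a cut-off. The 5% threshold, the city data, and the helper names are assumptions made for this example only; the tutorial may use a different rule.

```python
from collections import Counter


def label_frequencies(values):
    """Return the share of observations carrying each label."""
    counts = Counter(values)
    total = len(values)
    return {label: count / total for label, count in counts.items()}


def find_rare_labels(values, threshold=0.05):
    """Labels whose relative frequency is below the (assumed) threshold."""
    freqs = label_frequencies(values)
    return sorted(label for label, freq in freqs.items() if freq < threshold)


# Hypothetical categorical variable with 100 observations.
cities = ["London"] * 60 + ["Paris"] * 36 + ["Oslo"] * 3 + ["Lima"] * 1

rare = find_rare_labels(cities)
print("rare labels:", rare)
```

With these made-up counts, Oslo (3%) and Lima (1%) fall below the threshold while London and Paris do not.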

Feature Selection, Machine Learning, Numpy, Pandas, Python

Feature Engineering Series Tutorial 2: Cardinality in Machine Learning

Cardinality refers to the number of possible values that a feature can assume. For example, the variable “US State” has 50 possible values. Binary features, of course, can assume only one of two values (0 or 1). The values of a categorical variable are selected from Read more…

By georgiannacambel, 29 September 2020
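Cardinality as described above can be estimated from data by counting the distinct values observed in a column. The following is a small sketch with hypothetical rows and a hypothetical `cardinality` helper. Note that it counts values actually present in the sample, which may be fewer than the number of possible values.

```python
def cardinality(rows, column):
    """Number of distinct values observed in one column of a list of records."""
    return len({row[column] for row in rows})


# Hypothetical records: one categorical feature and one binary feature.
rows = [
    {"state": "Texas", "is_member": 1},
    {"state": "Ohio", "is_member": 0},
    {"state": "Texas", "is_member": 0},
    {"state": "Utah", "is_member": 1},
]

state_cardinality = cardinality(rows, "state")
member_cardinality = cardinality(rows, "is_member")
print("state:", state_cardinality, "is_member:", member_cardinality)
```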

Deep Learning, Feature Selection, Keras, Machine Learning, Matplotlib, Numpy, Pandas, Python, Tensorflow 2

Bank Customer Satisfaction Prediction Using CNN and Feature Selection

Feature Selection and CNN. In this project we are going to build a neural network to predict whether a particular bank customer is satisfied or not. To do this we are going to use Convolutional Neural Networks. The dataset we are going to use contains 370 features. We are going Read more…

By georgiannacambel, 29 August 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, Python, Seaborn and Plotly

Feature Selection Based on Univariate ROC_AUC for Classification and MSE for Regression | Machine Learning | KGP talkie

Watch Full Playlist: https://www.youtube.com/playlist?list=PLc2rvfiptPSQYzmDIFuq2PqN2n28ZjxDH What is ROC_AUC? The Receiver Operating Characteristic (ROC) curve is well known for evaluating classification performance. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a Read more…

By KGP Talkie, 11 August 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, Python, Seaborn and Plotly

Feature Selection using Fisher Score and Chi2 (χ2) Test | Titanic Dataset | Machine Learning | KGP Talkie

Watch Full Playlist: https://www.youtube.com/playlist?list=PLc2rvfiptPSQYzmDIFuq2PqN2n28ZjxDH What are the Fisher Score and the Chi2 (χ2) Test? Fisher score is one of the most widely used supervised feature selection methods. However, it selects each feature independently according to their scores under the Fisher criterion, which Read more…

By KGP Talkie, 11 August 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, PCA, Python, Seaborn and Plotly

Feature Selection Based on Univariate (ANOVA) Test for Classification | Machine Learning | KGP Talkie

Watch Full Playlist: https://www.youtube.com/playlist?list=PLc2rvfiptPSQYzmDIFuq2PqN2n28ZjxDH What is the Univariate (ANOVA) Test? The elimination process aims to reduce the size of the input feature set while retaining the class-discriminatory information for classification problems. An F-test is any statistical Read more…

By KGP Talkie, 11 August 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, Python, Seaborn and Plotly

Use of Linear and Logistic Regression Coefficients with Lasso (L1) and Ridge (L2) Regularization for Feature Selection in Machine Learning

Watch Full Playlist: https://www.youtube.com/playlist?list=PLc2rvfiptPSQYzmDIFuq2PqN2n28ZjxDH Linear Regression. Let’s first understand what exactly linear regression is. It is a straightforward approach to predicting the response y on the basis of different predictor variables such as x and ε. There is a linear relation between x and y. 𝑦𝑖 = 𝛽0 + Read more…

By KGP Talkie, 10 August 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Pandas, Python, Seaborn and Plotly

Recursive Feature Elimination (RFE) by Using Tree Based and Gradient Based Estimators | Machine Learning | KGP Talkie

Recursive Feature Elimination (RFE) Playlist: https://www.youtube.com/playlist?list=PLc2rvfiptPSQYzmDIFuq2PqN2n28ZjxDH As its name suggests, it eliminates features recursively, builds a model using the remaining attributes, and then recalculates the model's accuracy. To do this, it trains the model on the whole dataset and tries to remove the least performing Read more…

By KGP Talkie, 10 August 2020

Feature Selection, Machine Learning, Matplotlib, Numpy, Python, Seaborn and Plotly

Step Forward, Step Backward and Exhaustive Feature Selection | Wrapper Method | KGP Talkie

Wrapper method. Uses of the wrapper method: it uses combinations of variables to determine predictive power; it finds the best combination of variables; it is computationally more expensive than the filter method; it is intended to perform better than the filter method; it is not recommended for a high number of features. Forward Step Selection: this wrapper method selects one best Read more…

By KGP Talkie, 9 August 2020

Feature Selection, Machine Learning, Numpy, Pandas, Python

Lasso and Ridge Regularisation for Feature Selection in Classification | Embedded Method | KGP Talkie

What is Regularisation? Regularisation adds a penalty on the different parameters of the model to reduce the freedom of the model. Hence, the model will be less likely to fit the noise in the training data, and its ability to generalise will improve. There are basically 3 types of Read more…

By KGP Talkie, 8 August 2020
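The penalty idea named in the post title (Lasso and Ridge) can be sketched in a few lines of plain Python. The weights, the `alpha` strength, and the function names below are illustrative assumptions, not code from the post. The sketch uses the standard definitions: the Lasso (L1) term sums absolute weights and the Ridge (L2) term sums squared weights, each scaled by `alpha` and added to the model's loss.

```python
def l1_penalty(weights, alpha):
    """Lasso-style penalty: alpha times the sum of absolute weights."""
    return alpha * sum(abs(w) for w in weights)


def l2_penalty(weights, alpha):
    """Ridge-style penalty: alpha times the sum of squared weights."""
    return alpha * sum(w * w for w in weights)


# Hypothetical model weights and regularisation strength.
weights = [0.5, -2.0, 0.0, 1.5]
alpha = 0.1

lasso_term = l1_penalty(weights, alpha)
ridge_term = l2_penalty(weights, alpha)
print("L1 penalty:", lasso_term)
print("L2 penalty:", ridge_term)
```

A weight that is exactly zero adds nothing to either penalty. This is what makes the L1 variant useful for feature selection: it tends to push some weights to exactly zero, and the corresponding features can then be dropped.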