Machine Learning Hacks
1. To count the null values in each column of the dataframe data (including the dependent variable).
data.isnull().sum()
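To look at one column only, such as the dependent variable, the same call can be applied to that column alone. A minimal sketch with a toy dataframe: the column name Loan_Status is borrowed from the crosstab tip, and the values are invented for illustration.

```python
import pandas as pd

# Toy data, invented purely for illustration
data = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male"],
    "Loan_Status": ["Y", "N", None, "Y"],
})

# Number of nulls in every column
null_counts = data.isnull().sum()

# Number of nulls in the dependent variable only
target_nulls = data["Loan_Status"].isnull().sum()

# Share of missing values per column, as a percentage
null_percent = data.isnull().mean() * 100

print(null_counts)
print(target_nulls)
print(null_percent)
```

The percentage view is often easier to act on than raw counts when deciding whether to fill or drop a column.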
2. To compare a particular feature with the dependent variable.
pd.crosstab(data.Gender,data.Loan_Status)
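Raw counts can be hard to compare when the groups differ in size. A minimal sketch, assuming a small made-up dataframe (the values are illustrative only), showing how normalize="index" turns each row of the crosstab into proportions:

```python
import pandas as pd

# Toy data, invented purely for illustration
data = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "Loan_Status": ["Y", "N", "Y", "Y", "Y", "N"],
})

# Raw counts of each Gender / Loan_Status combination
counts = pd.crosstab(data["Gender"], data["Loan_Status"])

# Row-wise proportions: every row sums to 1
proportions = pd.crosstab(data["Gender"], data["Loan_Status"], normalize="index")

print(counts)
print(proportions)
```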
3. To check how many times each distinct value occurs in a categorical column (data is the dataframe, Gender is the column)
print(data.Gender.value_counts())
4. To fill the missing values (NaN) in a categorical column with its most frequent value (the mode)
data['Gender'] = data['Gender'].fillna(data['Gender'].mode()[0])
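The same idea can be applied to every categorical column in one loop. This is a sketch under stated assumptions: the columns Married and LoanAmount are hypothetical names added for illustration, and any non-numeric column is treated as categorical.

```python
import pandas as pd

# Toy data; "Married" and "LoanAmount" are hypothetical columns
data = pd.DataFrame({
    "Gender": ["Male", "Male", None, "Female"],
    "Married": ["Yes", None, "Yes", "No"],
    "LoanAmount": [120.0, None, 95.0, 150.0],
})

# Assumption: every non-numeric column is categorical
categorical_columns = data.select_dtypes(exclude="number").columns

for column in categorical_columns:
    most_frequent = data[column].mode()[0]  # mode() skips NaN by default
    data[column] = data[column].fillna(most_frequent)

print(data)
```

Numeric columns such as LoanAmount are left untouched here; they usually call for a different strategy, for example the median.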
5. To get the list of numerical feature columns
numerical_feature_columns = list(data.select_dtypes(include='number').columns)
numerical_feature_columns
6. To get the list of categorical (non-numeric) columns
categorical_feature_columns = list(data.select_dtypes(exclude='number').columns)
categorical_feature_columns
7. To drop a particular column from the dataframe
data = data.drop(columns=['Loan_ID'])
8. To convert the categorical features into dummy variables in one shot
X_features = list( data.columns )
X_features
data = pd.get_dummies(data[X_features], drop_first = True )
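To see what drop_first=True does, here is a minimal sketch on a toy dataframe. The column Property_Area and all values are assumptions made for illustration. Numeric columns pass through unchanged, and each categorical column loses its first category to avoid redundant dummy columns.

```python
import pandas as pd

# Toy data; "Property_Area" is a hypothetical column
data = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Property_Area": ["Urban", "Rural", "Semiurban"],
    "LoanAmount": [120, 95, 150],
})

# One dummy column per category, minus the first category of each feature
encoded = pd.get_dummies(data, drop_first=True)

print(list(encoded.columns))
```

Gender has two categories, so it yields one dummy column; Property_Area has three, so it yields two. If the dependent variable is still in the dataframe it gets encoded as well, so it may be worth separating it out first.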
9. To keep only the features selected by a fitted RFECV object
x_train_rfecv = rfecv.transform(x_train)
x_train_rfecv.shape
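The snippet above assumes rfecv has already been fitted. A minimal sketch of how that might look, using synthetic data in place of x_train and y_train; the choice of LogisticRegression, cv=5 and accuracy scoring are assumptions for illustration, not settings from the original workflow.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the real training data
x_train, y_train = make_classification(
    n_samples=200, n_features=10, n_informative=3, n_redundant=0, random_state=0
)

# Recursive feature elimination with cross-validation
estimator = LogisticRegression(max_iter=1000)
rfecv = RFECV(estimator=estimator, step=1, cv=5, scoring="accuracy")
rfecv.fit(x_train, y_train)

# Keep only the columns RFECV selected
x_train_rfecv = rfecv.transform(x_train)

print(rfecv.n_features_)   # number of features kept
print(rfecv.support_)      # boolean mask over the original columns
print(x_train_rfecv.shape)
```

The same fitted rfecv should be used to transform the test set, so that train and test end up with identical columns.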
10. To apply min-max scaling
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0,1))
rescaled = scaler.fit_transform(x)
print(rescaled)
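Min-max scaling maps each column with (x - min) / (max - min). When there is a separate test set, one common approach is to fit the scaler on the training data only and reuse it for the test data. A minimal sketch with invented numbers; the names x_train and x_test are assumptions:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data, invented purely for illustration
x_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
x_test = np.array([[2.5, 40.0]])

scaler = MinMaxScaler(feature_range=(0, 1))

# Learn min and max from the training data, then scale it
x_train_scaled = scaler.fit_transform(x_train)

# Reuse the training min and max for the test data
x_test_scaled = scaler.transform(x_test)

print(x_train_scaled)
print(x_test_scaled)
```

Test values outside the training range can fall outside 0 to 1, as the second test column does here; that is expected behaviour rather than an error.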