Data Science

Showing posts from October, 2020

Machine learning Hack

October 14, 2020

1. To count the null values in every column of the DataFrame data (including the dependent variable):
    data.isnull().sum()
2. To compare a particular feature with the dependent variable:
    pd.crosstab(data.Gender, data.Loan_Status)
3. To check how many times each distinct value occurs in a categorical column (data is the DataFrame, Gender is the column):
    print(data.Gender.value_counts())
4. To fill the NA values in a categorical column with its most frequent value:
    data['Gender'] = data['Gender'].fillna(data['Gender'].mode()[0])
5. To get the list of numerical feature columns:
    numerical_feature_columns = list(data._get_numeric_data().columns)
    numerical_feature_columns
6. To get the list of categorical columns:
    categorical_feature_columns = list(set(data.columns) - set(data._get_numeric_data().columns))
    categorical_feature_columns
7. To drop a particular column from the DataFrame:
    data = data.drop(['Loan_ID'], axis=1)
8. To convert features into dummy variables in one shot:
    X_features = list(data.columns)
    X_features
    data = pd.get...
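The snippets above can be tried end to end on a tiny table. The following is only a minimal sketch: the DataFrame values are invented for illustration, the column ApplicantIncome is a hypothetical name, and select_dtypes is swapped in as a public pandas way to pick numeric and non-numeric columns. The last step assumes the dummy-variable conversion is done with pd.get_dummies; the call in the post is cut off, so that is a guess and not the author's exact line.

```python
import pandas as pd

# Hypothetical toy data: column names follow the post, values are invented.
data = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003", "LP004", "LP005"],
    "Gender": ["Male", "Female", None, "Male", "Male"],
    "ApplicantIncome": [5849, 4583, 3000, 2583, 6000],  # hypothetical column
    "Loan_Status": ["Y", "N", "Y", "Y", "N"],
})

# Count missing values in every column.
null_counts = data.isnull().sum()

# Cross-tabulate a feature against the dependent variable
# (rows with a missing Gender are left out of the table).
gender_vs_status = pd.crosstab(data["Gender"], data["Loan_Status"])

# Fill missing categories with the most frequent value, by assignment.
data["Gender"] = data["Gender"].fillna(data["Gender"].mode()[0])

# Numeric and categorical column names via the public select_dtypes API.
numerical_feature_columns = list(data.select_dtypes(include="number").columns)
categorical_feature_columns = list(data.select_dtypes(exclude="number").columns)

# Drop the identifier column.
data = data.drop(columns=["Loan_ID"])

# Assumed final step: one-hot encode a categorical feature with pd.get_dummies.
encoded = pd.get_dummies(data, columns=["Gender"])
```

Assigning the result of fillna back to the column, instead of calling it with inplace=True on a selected column, avoids relying on chained in-place modification, which recent pandas versions warn about.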

Competitive aspect of Machine learning

October 02, 2020

* Splitting the data is essential; it makes our model more robust.
* There are two ways to split the data: the traditional way is train_test_split, and the other is cross-validation. Cross-validation is the more sophisticated of the two.
* Remember that when we use k-fold or cross-validation, we do not partition the data, fit the model, or predict by hand.
* There are several feature selection techniques: the first is SelectKBest, the second is RFE (recursive feature elimination).
* For dimensionality reduction we have PCA (principal component analysis).
* Suppose something lives in 3 dimensions and we are having trouble telling a, b, and c apart. We put it into two dimensions instead, and a, b, and c become clear. So by reducing the dimension you achieve the separation; that is PCA.
* SelectKBest selects features individually, RFE gives multiple features at a time, and PCA is kind of telling you that with fewer features you can a...
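The techniques named in these notes can be sketched together with scikit-learn. This is only an illustration under stated assumptions: the iris dataset, the LogisticRegression estimator, the f_classif scoring function, and every parameter value (test_size, cv, k, n_features_to_select, n_components) are choices made here, not taken from the post.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Assumed example data: 150 samples, 4 features.
X, y = load_iris(return_X_y=True)

# Traditional way: one explicit split, then fit and score ourselves.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
holdout_accuracy = model.score(X_test, y_test)

# Cross-validation: cross_val_score partitions, fits, and scores on each fold,
# so none of those steps is written by hand.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

# SelectKBest: scores each feature on its own and keeps the top k.
X_kbest = SelectKBest(score_func=f_classif, k=2).fit_transform(X, y)

# RFE: repeatedly fits the estimator and drops the weakest feature.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=2)
X_rfe = rfe.fit(X, y).transform(X)

# PCA: projects all features onto 2 new axes instead of picking existing ones.
X_pca = PCA(n_components=2).fit_transform(X)
```

SelectKBest and RFE both return a subset of the original columns, while PCA returns new columns that are combinations of all of them, which is why the three outputs have the same shape here but mean different things.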