Below you will find pages that utilize the taxonomy term “Machine Learning”
Ds
Project 7: Bank Customer Churn Prediction
Applied the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance in the customer dataset by synthesizing new minority-class examples from existing ones. Built an XGBoost model that achieved an AUC of over 93% in predicting customer churn. Found that the most important feature affecting customer churn was the total transaction count over the past 12 months. AutoEDA of the Customer Dataset using Pandas Profiling. Link to Google Colaboratory Notebook with Explanation.
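To make the oversampling step concrete, here is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the project's code and not a library implementation; the function name `smote_sketch`, its parameters, and the toy points are all assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy, made-up minority-class points (two features each), not project data.
minority_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority_points, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new examples stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic rows.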
Ds
Project 4: Penguins Species Classification
Followed a tutorial by Data Professor and modified the program so that predictions from an uploaded file work correctly. Built a random forest model for predicting penguin species and saved it to a pickle file. Read in the pickle file and predicted penguin species based on user input (sliders or file upload). Provided an example of the expected upload file format and displayed the predictions with their predicted probabilities for the user input or for each row of the uploaded file.
Ds
Project 3: IMDB Movie Reviews Sentiment Analysis
Preprocessed the review texts by removing special characters and stopwords and by applying stemming. Converted the text data into numerical features using TF-IDF. Built Multinomial Naive Bayes and Logistic Regression models to predict positive and negative sentiment and achieved an F1 score of around 0.85. Plotted two word clouds to show the common words used in positive and negative reviews respectively.
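The preprocessing and TF-IDF steps can be sketched in plain Python as follows. This is an illustrative approximation rather than the project's pipeline: stemming is left out, the stopword list is a tiny assumed one, and the names `preprocess` and `tfidf` and the sample reviews are made up for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "was", "and", "is", "it", "this"}


def preprocess(text):
    """Lowercase, replace non-letter characters with spaces, drop stopwords."""
    tokens = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Uses raw term counts and a smoothed IDF, ln((1 + n) / (1 + df)) + 1,
    without the length normalisation that library implementations often add.
    """
    tokenised = [preprocess(d) for d in docs]
    n = len(tokenised)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in counts.items()}
        )
    return vectors


# Made-up sample reviews for illustration.
reviews = [
    "This movie was great, great fun!",
    "The plot was dull and the acting was dull.",
    "Great acting, dull plot.",
]
vectors = tfidf(reviews)
```

Terms that appear in fewer reviews, such as "movie" here, receive a higher IDF than terms shared across several reviews, such as "great". The resulting weights are the kind of numerical features a Naive Bayes or Logistic Regression classifier can be trained on.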
Ds
Project 1: Life Expectancy Predictor
Created a web app that predicts life expectancy based on lifestyle and demographic factors using multiple linear regression. Performed feature selection and discarded the factors that were not statistically significant predictors of life expectancy at the 0.05 level. Found that the number of years of schooling was the feature most correlated with life expectancy. Built the web app using R and its Shiny package. Try the app here
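As a rough illustration of how features can be ranked by their correlation with the target, the sketch below computes Pearson correlation coefficients by hand. The project itself was built in R; Python is used here only for illustration, and the feature names and all values are made-up toy numbers, not the project's dataset or results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up toy values for illustration only.
life_expectancy = [62.0, 68.0, 71.0, 75.0, 80.0]
features = {
    "schooling_years": [6.0, 9.0, 11.0, 13.0, 16.0],
    "alcohol": [3.0, 1.0, 4.0, 2.0, 5.0],
}

# Rank features by the absolute strength of their correlation with the target.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], life_expectancy)),
    reverse=True,
)
```

Correlation ranking looks at one feature at a time, whereas the significance test in a multiple regression judges each feature with the others held fixed, so the two can disagree about which features matter.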