Tip Prediction NYC Taxi Data
Project type
Machine Learning in R
This project built a predictive model to estimate tip amounts for New York City taxi drivers. Using data from February 2017, the model was developed on Week 2 data and evaluated on Week 4 data, so that performance reflects generalization to unseen data, measured by Mean Squared Prediction Error (MSPE). Data cleaning removed irrelevant columns, handled missing or invalid entries, and removed outliers. Feature engineering added variables such as pickup hour, day of the week, and trip duration, along with categorized versions of trip distance and fare amount. Exploratory Data Analysis (EDA) was performed on a subsample to understand data distributions and relationships.
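The feature engineering and the MSPE metric described above can be sketched in base R. This is a minimal illustration, not the project's actual code: the column names follow the public TLC yellow-cab schema (an assumption, since the write-up does not list them), the distance cut points are hypothetical, and the two rows of data are made up for demonstration.

```r
# Toy records standing in for the taxi data. Column names are ASSUMED
# (borrowed from the public TLC yellow-cab schema); values are invented.
trips <- data.frame(
  tpep_pickup_datetime  = as.POSIXct(c("2017-02-08 08:15:00",
                                       "2017-02-11 23:40:00"), tz = "UTC"),
  tpep_dropoff_datetime = as.POSIXct(c("2017-02-08 08:32:00",
                                       "2017-02-12 00:05:00"), tz = "UTC"),
  trip_distance = c(2.4, 7.9),
  fare_amount   = c(12.5, 27.0),
  tip_amount    = c(2.0, 5.5)
)

# Hypothetical helper: derive the kinds of features named in the description.
add_features <- function(df) {
  # Hour of day of the pickup (0-23)
  df$pickup_hour <- as.integer(format(df$tpep_pickup_datetime, "%H"))
  # Day of week as an integer, 0 = Sunday ... 6 = Saturday
  df$pickup_wday <- as.POSIXlt(df$tpep_pickup_datetime)$wday
  # Trip duration in minutes
  df$trip_minutes <- as.numeric(difftime(df$tpep_dropoff_datetime,
                                         df$tpep_pickup_datetime,
                                         units = "mins"))
  # Categorized trip distance; the break points are illustrative only
  df$distance_band <- cut(df$trip_distance,
                          breaks = c(0, 1, 3, 10, Inf),
                          labels = c("short", "medium", "long", "very_long"),
                          include.lowest = TRUE)
  df
}

# Mean Squared Prediction Error: average squared gap between the observed
# tips and the model's predictions on held-out data.
mspe <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

features <- add_features(trips)
```

In a setup like the one described, `add_features()` would be applied to both the Week 2 and Week 4 data, and `mspe()` would be computed on Week 4 tips against predictions from a model fitted on Week 2 only.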