Explore real-world case studies that demonstrate how tailored data solutions delivered measurable results across industries.
01 | Taiwan Credit Card Default Analysis
This project analyzes credit default behavior using a dataset of 30,000 credit card clients from a Taiwanese financial institution. The data, sourced from the UCI Machine Learning Repository, includes demographic information, historical bill statements, and monthly payment records.
With a background in finance and risk modeling, I selected this dataset to explore the problem of predicting credit default—a key challenge in credit risk assessment. The aim of this analysis is to evaluate how effectively machine learning models can estimate the probability that a customer will default on their obligations.
Using Python, I trained and compared a variety of classification models, including logistic regression, decision trees, random forests, support vector machines, gradient boosting, and neural networks. A central focus of the project was on probability calibration—how closely a model’s predicted default probabilities align with actual default rates—rather than simply maximizing classification accuracy.
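To make the calibration idea concrete, here is a minimal, dependency-free sketch of two common checks: the Brier score and a binned reliability table. The function names and the toy numbers are illustrative assumptions, not the project's actual code or data.

```python
from typing import List, Tuple


def brier_score(y_true: List[int], y_prob: List[float]) -> float:
    """Mean squared gap between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def reliability_table(
    y_true: List[int], y_prob: List[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed default rate, count)
    tuple per non-empty bin, ordered from the lowest bin to the highest.
    """
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    rows = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            rows.append((mean_pred, observed, len(members)))
    return rows


# Toy data, made up for illustration: 1 = default, 0 = no default.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
for mean_pred, observed, count in reliability_table(y_true, y_prob, n_bins=2):
    print(f"predicted {mean_pred:.2f} vs observed {observed:.2f} (n={count})")
```

A well-calibrated model shows predicted and observed rates that match closely in every bin; a lower Brier score indicates probabilities closer to the actual outcomes.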
Key insights from the analysis include:
Models like logistic regression and gradient boosting offered strong performance in both discrimination (AUC) and calibration.
Calibration plots and Brier scores were used to assess how well predicted probabilities reflected real-world risk.
Inspired by academic research, the study incorporated sorting and smoothing techniques to better evaluate probability estimates.
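One way to read "sorting and smoothing" is the approach described in the credit-scoring literature: sort cases by predicted probability, then average the actual outcomes over a window of neighbouring cases to estimate an observed default rate for each one. The sketch below assumes that interpretation. The function name, window size, and handling of the list ends are illustrative choices, and the data is made up.

```python
from typing import List, Tuple


def sorting_smoothing(
    y_true: List[int], y_prob: List[float], n: int = 2
) -> List[Tuple[float, float]]:
    """Pair each predicted probability with a smoothed observed default rate.

    Cases are sorted by predicted probability. Each case's smoothed rate is
    the mean outcome of itself and up to n neighbours on either side; the
    window is truncated at the ends of the list (an assumption of this sketch).
    """
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    sorted_y = [y_true[i] for i in order]
    sorted_p = [y_prob[i] for i in order]
    pairs = []
    for pos in range(len(sorted_y)):
        lo = max(0, pos - n)
        hi = min(len(sorted_y), pos + n + 1)
        window = sorted_y[lo:hi]
        pairs.append((sorted_p[pos], sum(window) / len(window)))
    return pairs


# Toy data, made up for illustration: 1 = default, 0 = no default.
outcomes = [1, 0, 0, 1, 0, 1]
predicted = [0.9, 0.1, 0.3, 0.6, 0.4, 0.8]

for p, smoothed in sorting_smoothing(outcomes, predicted, n=1):
    print(f"predicted {p:.2f} -> smoothed observed rate {smoothed:.2f}")
```

Plotting the smoothed rate against the predicted probability gives a calibration view without fixed bins: points near the diagonal indicate probabilities that track real default rates.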
This project highlights the importance of well-calibrated probability outputs in financial applications, where decision thresholds often depend on a lender’s risk tolerance or regulatory constraints.
Left: Monthly Occurrences of “Vol de vehicule a moteur” (motor vehicle theft), 2018 Onwards.
Right: Correlation Heatmap
02 | Montreal Vehicle-Related Crime Analysis
This project examines patterns and trends in vehicle-related criminal activity across Montreal using a dataset of over 300,000 police-reported incidents from 2015 to 2022. The goal was to uncover insights about the temporal distribution of crimes, the impact of public policy interventions (e.g., curfews), and the variation in incident types over time.
Through data wrangling and visualization in Python, the analysis revealed:
Clear temporal patterns in crime, with specific time periods (day, evening, night) correlating with different types of offenses.
A significant drop in vehicle thefts during COVID-19 lockdowns and curfew periods, indicating the influence of mobility restrictions on criminal behavior.
The most frequent types of vehicle-related crimes, and how these varied by date and time.
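The kind of aggregation behind these findings can be sketched with the standard library alone. The record layout, the hour cutoffs for day, evening, and night, and the "Other offence" label are all assumptions made for illustration; the real dataset's columns and period definitions may differ.

```python
from collections import Counter
from datetime import datetime

THEFT = "Vol de vehicule a moteur"  # category name as it appears in the figures

# Hypothetical records: (timestamp, category). Values are made up.
incidents = [
    ("2019-03-02 14:10", THEFT),
    ("2019-03-15 23:30", THEFT),
    ("2019-03-20 02:05", "Other offence"),  # placeholder category
    ("2020-04-11 03:45", THEFT),
    ("2020-04-18 19:20", "Other offence"),
]


def time_period(hour: int) -> str:
    """Map an hour (0-23) to a period; the cutoffs are assumed, not sourced."""
    if hour < 8:
        return "night"
    if hour < 16:
        return "day"
    return "evening"


# Monthly theft counts, keyed by "YYYY-MM".
monthly_thefts = Counter(ts[:7] for ts, cat in incidents if cat == THEFT)

# Incident counts per (category, time period).
by_period = Counter(
    (cat, time_period(datetime.strptime(ts, "%Y-%m-%d %H:%M").hour))
    for ts, cat in incidents
)

for month, count in sorted(monthly_thefts.items()):
    print(month, count)
for (cat, period), count in sorted(by_period.items()):
    print(f"{cat} / {period}: {count}")
```

Comparing the monthly counts for months inside and outside a lockdown or curfew window is then a matter of filtering the keys of `monthly_thefts` by date range.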
This work demonstrates how public datasets can be leveraged to support evidence-based urban safety planning and policy evaluation.
Left: Monthly Occurrences of “Vol de vehicule a moteur” (motor vehicle theft), 2018 Onwards.
Right: Agglomeration Boundaries and Car Theft Incidents in Montreal.