Not All Skew Is Suspicious - How to Avoid Mistaking Signal for Outliers
You’re exploring your data. You see a long tail or a cluster of extreme values. Your instinct? Flag them as outliers. Trim the noise. Clean the dataset. But here’s the thing: not all skewed data is dirty. Skew can carry meaningful structure about the real-world process you’re modeling.
Optimize Like a Pro With LSD
Start-ups often need to move faster than traditional A/B testing best practices allow. Typically, A/B tests need a couple of weeks to gather enough data, sometimes more. When multiple improvements are ready to ship, waiting to test them one at a time can mean lost momentum or missed opportunities. Enter the Latin Square Design (LSD), a brilliant example of working smarter instead of harder. As a result of using LSD, your estimate of the treatment effect has significant sources of noise removed, which means:
To A/B or not to A/B
This is the story of a project that began as a straightforward A/B test but quickly revealed more than expected, offering fresh insights and expanding the scope of analysis. It’s been a while since I started working as an independent data and analytics consultant. I went freelance after many years in data systems, BI, and MIS at a large multinational education company. During that time, I led projects using applied statistics, data mining, and algorithmic forecasting, and discovered a real passion for data science. But the chance to deepen those skills long-term wasn’t there, so I made the leap into freelancing, motivated by clarity about my goals and a desire for more hands-on, impactful work.
Tallinn Ride Hailing App
I recently acquired a fun little data set for a well-known ride-hailing app. At the request of my source, I performed a fairly detailed analysis, including some clustering of the ride start locations. The idea was to help drivers plan ahead and get into position before times of peak demand. There’s no NDA and the data is no longer very fresh, so I thought it would be nice to show the results in an interactive Tableau viz.
Analysing SaaS Trial to Subscriber Conversions - Going Beyond the Binary Outcomes with Survival Analytics - Part 1
This is part one of a series on using Survival Analysis techniques for Product Management. Survival models excel at analyzing time-dependent events where timing matters as much as the outcome itself. Unlike time series analysis, which tracks metrics as they evolve over time (and requires data collected at regular intervals), survival analysis is event-based and asks a different question: when will something happen, and what influences that timing?
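To make the "when, not just whether" idea concrete, here is a minimal sketch of a Kaplan-Meier estimate in plain Python. It is not taken from the series itself: the function name `kaplan_meier`, the variable names, and the trial data are all hypothetical, chosen only for illustration. "Survival" here means a trial user has not yet converted, and users whose observation ended without a conversion are treated as censored.

```python
from collections import Counter


def kaplan_meier(durations, converted):
    """Return [(t, S(t))] at each time t where at least one conversion occurred.

    durations: days until conversion or until observation ended
    converted: True if the user converted, False if censored
    """
    events = Counter(t for t, e in zip(durations, converted) if e)
    exits = Counter(durations)   # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the share of at-risk users who did NOT convert at t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]      # converted and censored users both drop out
    return curve


# Hypothetical trial data (illustrative only, not from a real product)
days = [3, 5, 5, 8, 12, 14, 14, 14]
converted = [True, True, False, True, False, True, False, False]

curve = kaplan_meier(days, converted)
for t, s in curve:
    print(f"day {t:>2}: {s:.3f} of trial users not yet converted")
```

A plain conversion rate would report only that 4 of 8 users converted. The curve also shows when those conversions happened, and it keeps censored users in the denominator up to the day they were last observed.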