Hâtvalues

Not All Skew Is Suspicious - How to Avoid Mistaking Signal for Outliers

You’re exploring your data. You see a long tail or a cluster of extreme values. Your instinct? Flag them as outliers. Trim the noise. Clean the dataset. But here’s the thing: not all skewed data is dirty. Skew can carry meaningful structure about the real-world process you’re modeling.

Optimize Like a Pro With LSD

Start-ups often need to move faster than traditional A/B testing best practices allow. Typically, A/B tests need a couple of weeks to gather enough data, sometimes more. When multiple improvements are ready to ship, waiting to test them one at a time can mean lost momentum or missed opportunities. Enter the Latin Square Design (LSD), a brilliant example of working smarter instead of harder. Using LSD removes significant sources of noise from your estimate of the treatment effect, which means:

To A/B or not to A/B

This is the story of a project that began as a straightforward A/B test but quickly revealed more than expected, offering fresh insights and expanding the scope of analysis. It’s been a while since I worked as an independent data and analytics consultant. I went freelance after many years in data systems, BI, and MIS at a large multinational education company. During that time, I led projects using applied statistics, data mining, and algorithmic forecasting, and discovered a real passion for data science. But the chance to deepen those skills long-term wasn’t there, so I made the leap into freelancing, motivated by clarity about my goals and a desire for more hands-on, impactful work.

Tallinn Ride Hailing App

I recently acquired a fun little data set for a well-known ride-hailing app. At the request of my source, I performed a fairly detailed analysis, including some clustering of the ride start locations. The idea was to help drivers plan ahead and get into position before times of peak demand. There’s no NDA and the data is no longer very fresh, so I thought it would be nice to show the results in an interactive Tableau viz.

Educational Attainment in England - A Deeper Dive

On 25 July 2023, the UK Office for National Statistics published a wonderful piece of data journalism with open access to its dataset measuring educational attainment across England. The article’s title, “Why do children and young people in smaller towns do better academically than those in larger towns?”, hides a bold claim in the form of a question. I like to assume that their research question(s) did not start from such a knowledge claim, but rather that the title emerged from the themes they discovered during their investigation.