Blog

Mastering Hadoop, Part 1: Installation, Configuration, and Modern Big Data Strategies

Companies now collect large volumes of data online and face the challenge of storing, processing, and analyzing it efficiently. Hadoop, an open-source framework from the Apache Software Foundation, has become one of the leading Big Data management technologies in recent years.

Experiments Illustrated: Can $1 Change Behavior More Than $100?

I currently lead a small data team at a small tech company. Because everything is small, we have a lot of autonomy over what, when, and how we run experiments. In this series, I’m opening the vault from our years of experimenting, with each story highlighting a key concept related to experimentation.

Platform-Mesh, Hub and Spoke, and Centralised | 3 Types of data team

Introduction: In the “ever rapidly changing landscape of Data and AI” (!), understanding data and AI architecture has never been more critical. However, many leaders overlook the importance of data team structure. While many of you reading this probably identify as the data team, most don’t realise how limiting that mindset can be.

Linear Regression in Time Series: Sources of Spurious Regression

1. Introduction: It’s pretty clear that most of our work will be automated by AI in the future. This will be possible because many researchers and professionals are working hard to make their work available online. These contributions not only help us understand fundamental concepts but also refine AI models, ultimately freeing up time to

From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI’s Recognition Capabilities

Introduction: Can AI really distinguish dog breeds like human experts? One day while taking a walk, I saw a fluffy white puppy and wondered, “Is that a Bichon Frise or a Maltese?” No matter how closely I looked, they seemed almost identical. Huskies and Alaskan Malamutes, Shiba Inus and Akitas: I always found myself second-guessing.

Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend

Running cool experiments is easily one of my favorite parts of working in data science. Most experiments don’t deliver big wins, so the winners make for fun stories. We’ve had a few of these at IntelyCare, and I’m sharing each story in a way that highlights a concept related to experimentation.

Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board

Running experiments is a task that often falls to data scientists. If that’s you, congrats! It can be a rewarding and high-impact area of work, but it also requires tools found outside the typical ML-heavy data science curriculum. Even with the best tools, only a small share of experiments deliver meaningful business value. I’ve been lucky