Data Unchained

Unleash the full potential of your data.

Latest Posts

We Lost the Thread on the Data Lake

In 2014, my last startup was acquired. We joined a fast-growing organization with a top-notch data team. They had invested heavily in data infrastructure. Data was strategic. They had "the hub," a Hadoop cluster built on HDFS. I thought: here's a company doing things right.

From Data Hoarding to AI-Ready: Making Your Data Actually Useful

Live Webinar · January 8 · 11 AM PT / 2 PM ET
You’re storing terabytes of data “just in case.” But when AI initiatives launch, that data is inaccessible, poorly formatted, or locked behind a six-month pipeline project. Sound familiar? You’re paying thousands in storage costs for data

How to Make AI Training Data Reproducible and Debuggable

The Challenge: 70% of AI projects fail on data quality and integration, not models. But even teams with clean data struggle when they can’t reproduce their training runs. Your model’s accuracy dropped from 87% to 64%. But why? Was it bad training data? A schema change upstream? Preprocessing

You’re Not Bad at Data. Your Infrastructure Just Makes You Think You Are.

I wrote a post about thinking past medallion architectures. That one went a little deeper into the architectural characteristics that make thinking in “medallions” unnecessary. But the truth is, you don’t need to internalize all that. I’m guessing you sense that data just doesn’t work, even with

Your Teams Are Making Shadow Copies of Everything

Let’s talk about something nobody wants to admit. Your marketing team has their own copy of customer data. Sales has a different version. Product is maintaining yet another extract. Finance built their own dashboard using data they pulled last month. Each team has created their own shadow copy of

How to Feed Multiple AI Models from One Data Stream

The Challenge: Your team is testing OpenAI embeddings, Anthropic’s Claude, and a custom fine-tuned model. Each needs customer data in a slightly different format. The traditional approach: build three separate pipelines, each with its own failure modes and maintenance overhead. Every AI workload expects data its own way. Your

Building Modern Data Systems with 1980s Thinking

Picture this: You’re in an executive meeting. The company just acquired another business, and the CEO wants to change how you calculate monthly active users to include the new customer base. Simple request, right? “That’ll be six months,” comes the response from the data team. Six months?! To

How to Get AI-Ready Data in Hours Instead of Months

The Challenge: Your AI team has transformative ideas. Leadership approved the budget. Then reality: preparing data for AI means months of cleaning and formatting. Data scientists become data wranglers. Engineers build pipelines instead of AI features. By the time data is ready, your competitor already shipped. The problem isn’t

The Data Helplessness Epidemic

Why AI is exposing decades of accepted dysfunction
You can’t move at AI velocity when your data team still says “that’ll take six months.” Here’s how an entire industry normalized broken patterns, and why AI is forcing us to finally confront them. I was talking to a

Popular Tags