Michael Kowalchik

Data Pipelines Are Prisons

A data pipeline is a set of choices. It's almost always a bet that you know how this data will be used, how it needs to be modeled, how it needs to be queried. I've watched data engineers spend weeks building ETL pipelines. Then the

That's No Warehouse... It's a Space Station

“Fear leads to anger. Anger leads to hate. Hate leads to suffering. And centralized data architectures lead to a table called customer_revenue_final_v9_USE_THIS_ONE_SERIOUSLY.” — Yoda, leading a postmortem

In your company there is likely a data warehouse or data lake. Or a plan for one.

Why Matterbeam

Data can be magical. I've built companies on data. I've seen truly counter-intuitive results that changed everything with people empowered by evidence instead of gut feel or force of personality. I've seen what's considered "possible" change when you have the

Data Pipelines Are Built for a World That No Longer Exists

“Reverse ETL.” An entire category of tooling to acknowledge that data flowing in one direction is right, and natural. That moving it the other way requires a special designation. These patterns are so deeply embedded in our data architectures that we can no longer see them. The pipeline is the

An Honest Architecture

For decades, the language around data has barely changed. Every few years a new architecture or philosophy rises. We hear about data lakes, warehouses, meshes, fabrics, and observability platforms. Each is a promise to finally tame the chaos of data management. Billions have been invested across multiple generations of tooling

We Lost the Thread on the Data Lake

In 2014, my last startup was acquired. We joined a fast-growing organization with a top-notch data team. They had invested heavily in data infrastructure. Data was strategic. They had "the hub," a Hadoop cluster built on HDFS. I thought: here's a company doing things right.

You’re Not Bad at Data. Your Infrastructure Just Makes You Think You Are.

I wrote a post about thinking past medallion architectures. That one went a little deeper about the architectural characteristics that make thinking in “medallions” unnecessary. You don’t need to internalize all that. I’m guessing you sense that data just doesn’t work, even with the fancy medallion architecture.

Your Teams Are Making Shadow Copies of Everything

Let’s talk about something nobody wants to admit. Your marketing team has their own copy of customer data. Sales has a different version. Product is maintaining yet another extract. Finance built their own dashboard using data they pulled last month. Each team has created their own shadow copy of

Building Modern Data Systems with 1980s Thinking

Picture this: You’re in an executive meeting. The company just acquired another business, and the CEO wants to change how you calculate monthly active users to include the new customer base. Simple request, right? “That’ll be six months,” comes the response from the data team. Six months?! To