Stop chasing tools and focus on business value. You've heard it a thousand times. So has every data team. We nod, agree, and then... buy another tool. Why? Because the tool obsession is a symptom of something deeper. We don't need better discipline. We need a better way.
Data doesn't work in companies. I think everyone feels this on some level. One reason I've heard repeated is that it's a people problem, a lack of data culture and data literacy. Companies spend millions on training programs, hire Chief Data Officers, bring in
The medallion architecture emerged as the data industry's answer to data lake chaos. Organizations had dumped vast amounts of raw data into cheap storage, creating impenetrable data swamps. The medallion's promise was elegant: three progressive layers—Bronze for raw data, Silver for cleaned data, Gold for
I talk to data professionals and they're frequently frustrated. For example, a team spends three months migrating everything to Parquet files in their data lake. Clean, columnar, compressed. Beautiful. But now their real-time service team needs that same data, and it's painfully slow because, well, scanning columns
You know what you're supposed to do. We've heard the same refrains for a decade or more. Conference keynotes. Blog posts. LinkedIn thought leadership. Build a data culture. Invest in data literacy. Improve data quality at the source. Get executive buy-in. Implement strong governance. Focus on
There's a version of a stat that gets thrown around a lot. Data teams spend 80% of their time on data preparation or cleaning. Eighty percent. We've just... accepted this? Like it's some law of nature? As if the universe decreed that for every
You know that feeling when you've been doing something the same way for so long that you can't imagine any other approach? That's where Josh Pendergrass was when his company first started using Matterbeam. "At first I was like, that seems great. I
Data problems in companies stem not from people's lack of skills but from a fundamentally wrong approach, one that doesn't align with how businesses actually operate and evolve.
Let's trace the evolution of the "data lake" concept. The term "data lake" was coined by James Dixon, then CTO of Pentaho, in 2010. Dixon used the metaphor of a lake to contrast with the more structured "data mart" (which he compared