A conversation with Galen Schreck, fractional CTO and enterprise architect, about the real challenges of working with Salesforce data and what needs to change.
There’s a conversation that happens in almost every company using Salesforce:
Sales team: “We need all our customer engagement data in Salesforce so we can see everything in one place.”
Analytics team: “We need to pull Salesforce data out so we can analyze it properly.”
Engineering team: “We need to connect Salesforce to 17 other systems and somehow keep it all in sync.”
Finance team: “Why is our Salesforce bill so high?”
If you’ve worked with Salesforce data at any scale, you know this isn’t just a coordination problem. It’s a fundamental tension between what Salesforce is great at and what modern data infrastructure needs to be.
We sat down with Galen Schreck, a fractional CTO and former VP of Enterprise Architecture at Forrester Research, who’s built analytics platforms, implemented streaming data systems, and wrestled extensively with Salesforce data integration.
His perspective? We’re doing it wrong. And it’s costing us way more than we think.
Salesforce sits at the center of most companies’ customer engagement strategy. The platform promises elegance: all your customer data, accessible to everyone who needs it, in one place.
The reality is messier.
Getting data INTO Salesforce requires reshaping it to work the Salesforce way, abandoning normalized database principles. Getting data OUT means transforming it again for analytics platforms. The fundamental problem? Salesforce wants data structured for sales teams using a UI. Analytics platforms want data structured for algorithms and aggregations.
Take drop-down lists. In a traditional database, you’d encode options as enumerated values: 1, 2, 3, 4, 5. In Salesforce? You just use the text labels directly. Things get even messier when someone asks to change the labels because of business process changes.
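To make the mismatch concrete, here is a hypothetical Python sketch of what happens when analytics code keys off picklist labels instead of stable codes. The stage names and mapping are illustrative, not from any real org.

```python
# Hypothetical mapping of Salesforce picklist labels to stable numeric
# codes for analytics. Labels and codes are invented for illustration.

STAGE_CODES = {
    "Prospecting": 1,
    "Qualification": 2,
    "Proposal": 3,
    "Closed Won": 4,
    "Closed Lost": 5,
}

def normalize_stage(label: str) -> int:
    """Translate a picklist label into a stable numeric code.

    Raises KeyError for an unknown label -- which is exactly what
    happens downstream when someone renames a picklist value in
    Salesforce because of a business process change.
    """
    return STAGE_CODES[label]

print(normalize_stage("Closed Won"))  # 4
# After a rename ("Proposal" -> "Proposal/Quote"), records carrying
# the new label break the mapping:
# normalize_stage("Proposal/Quote")  # KeyError
```

Because Salesforce stores the label itself rather than an enumerated value, every label rename silently invalidates mappings like this one in every downstream consumer.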
“Where you get into trouble is when you start bringing in normalized data from external systems,” Galen explained. “It makes reporting complicated in Salesforce. So you have to transform it and clean it up in a way that’s more simplistic.”
This creates a dilemma: Go all-in on Salesforce’s approach and you’ve locked yourself in. Try to maintain architectural purity and you’re fighting the platform constantly.
“Early on, I made the mistake of trying to mirror the way other systems worked in Salesforce, and it made things harder,” Galen admitted. “It’s easier if you do things the Salesforce way. But it feels very wrong from a computer science or architecture standpoint.”
When companies evaluate Salesforce, they look at license costs. But there’s a whole category of costs that only emerge once you’re deep in integration work.
Salesforce has been pushing customers toward change data capture (CDC) for synchronizing data. It sounds modern: streaming-first architecture, real-time updates.
Then you discover CDC burns through event quotas fast. Every record update counts against your quota. External systems making changes can trigger cascading flows and automated processes.
“You can end up with a storm of events as a result of updates from external systems that were beyond your control,” Galen said.
The solution? Buy more event entitlements. But predicting how many you need is nearly impossible until you’re already over quota.
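A back-of-envelope sketch shows why the quota math is so hard to predict. The update volume and fan-out factor below are made-up numbers; the real multiplier depends on your org's flows, triggers, and external integrations.

```python
# Back-of-envelope estimate of daily CDC event consumption.
# All figures are hypothetical.

def estimated_daily_events(record_updates: int, fanout: float) -> int:
    """Each record update emits at least one change event; flows and
    triggers that re-save records multiply that by a fan-out factor."""
    return int(record_updates * fanout)

direct = estimated_daily_events(50_000, fanout=1.0)
# An "event storm": automation re-saving records after external updates.
with_automation = estimated_daily_events(50_000, fanout=3.5)

print(direct, with_automation)  # 50000 175000
```

The fan-out factor is the part nobody can forecast up front: it changes every time someone adds a flow, which is why entitlement sizing tends to happen only after the quota is already blown.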
Most modern serverless SQL environments don’t have native Salesforce CDC connectors. Your options come down to writing and owning the connector code yourself, or relying on a commercial integration platform.
“I feel a lot better about using a platform like Matterbeam, especially for something that needs to be super reliable rather than owning that connector code,” Galen explained.
But even with commercial connectors, CDC comes with a host of edge cases:
Salesforce doesn’t include formula fields in the CDC stream at all. You can get them through the REST API, but that means maintaining two separate data feeds for the same object.
“You’ve got essentially two feeds for fields on the same object that can get out of sync,” Galen noted.
Formula fields are calculated dynamically: they show the current value when someone views a record, a value that may never have synced to downstream systems. Worse, for non-formula fields, CDC can round decimals to fewer places than the REST API returns. This leads to the nightmare scenario: two numbers that should match but don’t.
“Why is this two cents off? People tend to lose their minds.”
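One defensive move is to distinguish a precision-only difference from a real data mismatch when comparing the two feeds. The sketch below is an assumption about how that reconciliation might work, not documented Salesforce behavior; the field values and the two-decimal CDC precision are illustrative.

```python
# Sketch of a reconciliation check for the rounding mismatch described
# above: the CDC feed may deliver a decimal rounded to fewer places
# than the REST API returns. Values and precision are illustrative.

from decimal import Decimal

def amounts_reconcile(cdc_value: str, rest_value: str,
                      cdc_places: int = 2) -> bool:
    """True when the REST value, rounded to the CDC feed's precision,
    matches the CDC value -- i.e. the difference is rounding, not data."""
    quantum = Decimal(1).scaleb(-cdc_places)  # e.g. Decimal("0.01")
    return Decimal(rest_value).quantize(quantum) == Decimal(cdc_value)

print(amounts_reconcile("1234.57", "1234.5678"))  # True: rounding only
print(amounts_reconcile("1234.55", "1234.5678"))  # False: real mismatch
```

A check like this at least tells you whether the two-cents-off question is a rounding artifact or a genuine sync failure worth paging someone about.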
All this complexity leads to shadow data systems. Someone needs Salesforce data, the official pipeline is too slow or expensive, so they spin up their own connection through a SaaS tool or script.
You end up with multiple copies of the same data, each adding overhead to your Salesforce environment and consuming governor limits on CPU time, database operations, and callouts.
After years of wrestling with these challenges, Galen has a clear vision: a proper data plane.
Not a data lake. Not another point-to-point integration. A genuine data plane that sits between Salesforce and everything else.
“I like the idea of an intermediate platform, and I don’t mean a data lake, I mean a data layer, a data plane that would solve a multitude of issues,” Galen said.
Why? Because developers who aren’t Salesforce experts need to work with customer data naturally. Because analytics teams shouldn’t handle gap events and CDC quirks manually. Because these are infrastructure concerns that should be abstracted away.
“As a developer, I’m seeing the system internals that I shouldn’t have to know about. That was part of my issue with how CDC data comes across.”
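As a sketch of what hiding those internals could mean, here is a hypothetical consumer-facing interface for such a plane. The class and method names are invented for illustration and imply nothing about how Matterbeam or any real product is built.

```python
# Hypothetical data plane facade: downstream code sees complete,
# merged records -- not CDC payloads, gap events, or separate
# formula/non-formula feeds. All names are illustrative.

from dataclasses import dataclass
from typing import Callable, Dict, Iterator

@dataclass
class Record:
    object_name: str
    record_id: str
    fields: Dict[str, object]  # already merged: formula + stored fields

class DataPlane:
    """Facade over whatever mix of CDC, REST polling, and replay the
    plane uses internally -- consumers never see those details."""

    def __init__(self, source: Callable[[], Iterator[Record]]):
        self._source = source

    def stream(self, object_name: str) -> Iterator[Record]:
        """Yield complete records for one object type."""
        return (r for r in self._source() if r.object_name == object_name)

# Usage: analytics code just iterates complete records.
def fake_source() -> Iterator[Record]:
    yield Record("Account", "001xx", {"Name": "Acme", "Score__c": 87})

plane = DataPlane(fake_source)
for rec in plane.stream("Account"):
    print(rec.record_id, rec.fields["Name"])
```

The point of the facade is that a developer asks for Accounts and gets Accounts; how the plane stitched them together from two out-of-sync feeds is nobody else's problem.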
A proper data plane should:

- Present customer data in a shape developers can use without Salesforce expertise
- Absorb gap events, replays, and the other CDC quirks so analytics teams never handle them manually
- Reconcile formula and non-formula fields into one consistent feed
- Take on the transformation work between Salesforce’s UI-oriented shape and analytics-friendly structures
This isn’t really about Salesforce. It’s about what happens when we force tools to do things they weren’t designed for.
Salesforce excels at providing a UI for sales teams and managing customer relationships. But when we try to make it our data warehouse, analytics platform, and system of record for 17 different processes, we end up fighting it.
The data infrastructure we need for AI, real-time analytics, and multi-system orchestration is fundamentally different from what Salesforce was built for.
“Building a reliable data plane in itself would be a huge engineering undertaking, especially since it’s far beyond just building some connectors and copying things into a table,” Galen said.
That’s the real cost of Salesforce data: not the license fees, but the engineering time, the opportunity cost, and the countless hours spent wrestling with a problem that shouldn’t be this hard.
What You Can Do About It
If you’re dealing with Salesforce data today: audit the shadow pipelines pulling the same data through separate connections, track your event entitlement consumption before you blow past quota, and reconcile your CDC and REST feeds before the two-cents-off questions start.

The question is: are we going to keep fighting this reality with increasingly complex workarounds, or build the data plane layer we actually need?
Want to see how Matterbeam’s data plane approach handles Salesforce data differently? We’re working with teams who are tired of fighting their data infrastructure. Let’s talk.