A conversation with Galen Schreck, fractional CTO and enterprise architect, about the real challenges of working with Salesforce data and what needs to change.
There’s a conversation that happens in almost every company using Salesforce:
Sales team: “We need all our customer engagement data in Salesforce so we can see everything in one place.”
Analytics team: “We need to pull Salesforce data out so we can analyze it properly.”
Engineering team: “We need to connect Salesforce to 17 other systems and somehow keep it all in sync.”
Finance team: “Why is our Salesforce bill so high?”
If you’ve worked with Salesforce data at any scale, you know this isn’t just a coordination problem. It’s a fundamental tension between what Salesforce is great at and what modern data infrastructure needs to be.
We sat down with Galen Schreck, a fractional CTO and former VP of Enterprise Architecture at Forrester Research, who’s built analytics platforms, implemented streaming data systems, and wrestled extensively with Salesforce data integration.
His perspective? We’re doing it wrong. And it’s costing us way more than we think.
Salesforce sits at the center of most companies’ customer engagement strategy. The platform promises elegance: all your customer data, accessible to everyone who needs it, in one place.
The reality is messier.
Getting data INTO Salesforce requires reshaping it to work the Salesforce way, abandoning normalized database principles. Getting data OUT means transforming it again for analytics platforms. The fundamental problem? Salesforce wants data structured for sales teams using a UI. Analytics platforms want data structured for algorithms and aggregations.
Take drop-down lists. In a traditional database, you’d encode options as enumerated values: 1, 2, 3, 4, 5. In Salesforce? You just use the text labels directly. Things get even messier when someone asks to change the labels because of business process changes.
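To make the mismatch concrete, here is a hypothetical Python sketch of what happens when analytics code keys off picklist labels instead of stable codes. The stage names and mapping are illustrative, not from any real org.

```python
# Hypothetical mapping of Salesforce picklist labels to stable numeric
# codes for analytics. Labels and codes are invented for illustration.

STAGE_CODES = {
    "Prospecting": 1,
    "Qualification": 2,
    "Proposal": 3,
    "Closed Won": 4,
    "Closed Lost": 5,
}

def normalize_stage(label: str) -> int:
    """Translate a picklist label into a stable numeric code.

    Raises KeyError for an unknown label -- which is exactly what
    happens downstream when someone renames a picklist value in
    Salesforce because of a business process change.
    """
    return STAGE_CODES[label]

print(normalize_stage("Closed Won"))  # 4
# After a rename ("Proposal" -> "Proposal/Quote"), records carrying
# the new label break the mapping:
# normalize_stage("Proposal/Quote")  # KeyError
```

Because Salesforce stores the label itself rather than an enumerated value, every label rename silently invalidates mappings like this one in every downstream consumer.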
“Where you get into trouble is when you start bringing in normalized data from external systems,” Galen explained. “It makes reporting complicated in Salesforce. So you have to transform it and clean it up in a way that’s more simplistic.”
This creates a dilemma: Go all-in on Salesforce’s approach and you’ve locked yourself in. Try to maintain architectural purity and you’re fighting the platform constantly.
“Early on, I made the mistake of trying to mirror the way other systems worked in Salesforce, and it made things harder,” Galen admitted. “It’s easier if you do things the Salesforce way. But it feels very wrong from a computer science or architecture standpoint.”
When companies evaluate Salesforce, they look at license costs. But there’s a whole category of costs that only emerge once you’re deep in integration work.
Salesforce has been pushing customers toward change data capture (CDC) for synchronizing data. It sounds modern: streaming-first architecture, real-time updates.
Then you discover CDC burns through event quotas fast. Every record update counts against your quota. External systems making changes can trigger cascading flows and automated processes.
“You can end up with a storm of events as a result of updates from external systems that were beyond your control,” Galen said.
The solution? Buy more event entitlements. But predicting how many you need is nearly impossible until you’re already over quota.
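A back-of-envelope sketch shows why the quota math is so hard to predict. The update volume and fan-out factor below are made-up numbers; the real multiplier depends on your org's flows, triggers, and external integrations.

```python
# Back-of-envelope estimate of daily CDC event consumption.
# All figures are hypothetical.

def estimated_daily_events(record_updates: int, fanout: float) -> int:
    """Each record update emits at least one change event; flows and
    triggers that re-save records multiply that by a fan-out factor."""
    return int(record_updates * fanout)

direct = estimated_daily_events(50_000, fanout=1.0)
# An "event storm": automation re-saving records after external updates.
with_automation = estimated_daily_events(50_000, fanout=3.5)

print(direct, with_automation)  # 50000 175000
```

The fan-out factor is the part nobody can forecast up front: it changes every time someone adds a flow, which is why entitlement sizing tends to happen only after the quota is already blown.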
Most modern serverless SQL environments don’t have native Salesforce CDC connectors. Your options come down to writing and owning the connector code yourself, or relying on a commercial integration platform.
“I feel a lot better about using a platform like Matterbeam, especially for something that needs to be super reliable rather than owning that connector code,” Galen explained.
But even with commercial connectors, CDC comes with a host of edge cases:
Salesforce doesn’t include formula fields in the CDC stream at all. You can get them through the REST API, but that means maintaining two separate data feeds for the same object.
“You’ve got essentially two feeds for fields on the same object that can get out of sync,” Galen noted.
Formula fields are calculated dynamically: they show the current value when someone views a record, a value that may never have synced to downstream systems. Worse, for non-formula fields, CDC can round decimals to fewer places than the REST API returns. This leads to the nightmare scenario: two numbers that should match but don’t.
“Why is this two cents off? People tend to lose their minds.”
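One defensive move is to distinguish a precision-only difference from a real data mismatch when comparing the two feeds. The sketch below is an assumption about how that reconciliation might work, not documented Salesforce behavior; the field values and the two-decimal CDC precision are illustrative.

```python
# Sketch of a reconciliation check for the rounding mismatch described
# above: the CDC feed may deliver a decimal rounded to fewer places
# than the REST API returns. Values and precision are illustrative.

from decimal import Decimal

def amounts_reconcile(cdc_value: str, rest_value: str,
                      cdc_places: int = 2) -> bool:
    """True when the REST value, rounded to the CDC feed's precision,
    matches the CDC value -- i.e. the difference is rounding, not data."""
    quantum = Decimal(1).scaleb(-cdc_places)  # e.g. Decimal("0.01")
    return Decimal(rest_value).quantize(quantum) == Decimal(cdc_value)

print(amounts_reconcile("1234.57", "1234.5678"))  # True: rounding only
print(amounts_reconcile("1234.55", "1234.5678"))  # False: real mismatch
```

A check like this at least tells you whether the two-cents-off question is a rounding artifact or a genuine sync failure worth paging someone about.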
All this complexity leads to shadow data systems. Someone needs Salesforce data, the official pipeline is too slow or expensive, so they spin up their own connection through a SaaS tool or script.
You end up with multiple copies of the same data, each adding overhead to your Salesforce environment and consuming governor limits on CPU time, database operations, and callouts.
After years of wrestling with these challenges, Galen has a clear vision: a proper data plane.
Not a data lake. Not another point-to-point integration. A genuine data plane that sits between Salesforce and everything else.
“I like the idea of an intermediate platform, and I don’t mean a data lake, I mean a data layer, a data plane that would solve a multitude of issues,” Galen said.
Why? Because developers who aren’t Salesforce experts need to work with customer data naturally. Because analytics teams shouldn’t handle gap events and CDC quirks manually. Because these are infrastructure concerns that should be abstracted away.
“As a developer, I’m seeing the system internals that I shouldn’t have to know about. That was part of my issue with how CDC data comes across.”
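As a sketch of what hiding those internals could mean, here is a hypothetical consumer-facing interface for such a plane. The class and method names are invented for illustration and imply nothing about how Matterbeam or any real product is built.

```python
# Hypothetical data plane facade: downstream code sees complete,
# merged records -- not CDC payloads, gap events, or separate
# formula/non-formula feeds. All names are illustrative.

from dataclasses import dataclass
from typing import Callable, Dict, Iterator

@dataclass
class Record:
    object_name: str
    record_id: str
    fields: Dict[str, object]  # already merged: formula + stored fields

class DataPlane:
    """Facade over whatever mix of CDC, REST polling, and replay the
    plane uses internally -- consumers never see those details."""

    def __init__(self, source: Callable[[], Iterator[Record]]):
        self._source = source

    def stream(self, object_name: str) -> Iterator[Record]:
        """Yield complete records for one object type."""
        return (r for r in self._source() if r.object_name == object_name)

# Usage: analytics code just iterates complete records.
def fake_source() -> Iterator[Record]:
    yield Record("Account", "001xx", {"Name": "Acme", "Score__c": 87})

plane = DataPlane(fake_source)
for rec in plane.stream("Account"):
    print(rec.record_id, rec.fields["Name"])
```

The point of the facade is that a developer asks for Accounts and gets Accounts; how the plane stitched them together from two out-of-sync feeds is nobody else's problem.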
A proper data plane should:

- Present customer data in a shape developers can use without Salesforce expertise
- Absorb gap events, replays, and the other CDC quirks so analytics teams never handle them manually
- Reconcile formula and non-formula fields into one consistent feed
- Take on the transformation work between Salesforce’s UI-oriented shape and analytics-friendly structures
This isn’t really about Salesforce. It’s about what happens when we force tools to do things they weren’t designed for.
Salesforce excels at providing a UI for sales teams and managing customer relationships. But when we try to make it our data warehouse, analytics platform, and system of record for 17 different processes, we end up fighting it.
The data infrastructure we need for AI, real-time analytics, and multi-system orchestration is fundamentally different from what Salesforce was built for.
“Building a reliable data plane in itself would be a huge engineering undertaking, especially since it’s far beyond just building some connectors and copying things into a table,” Galen said.
That’s the real cost of Salesforce data: not the license fees, but the engineering time, the opportunity cost, and the countless hours spent wrestling with a problem that shouldn’t be this hard.
What You Can Do About It
If you’re dealing with Salesforce data today: audit the shadow pipelines pulling the same data through separate connections, track your event entitlement consumption before you blow past quota, and reconcile your CDC and REST feeds before the two-cents-off questions start.

The question is: are we going to keep fighting this reality with increasingly complex workarounds, or build the data plane layer we actually need?
Want to see how Matterbeam’s data plane approach handles Salesforce data differently? We’re working with teams who are tired of fighting their data infrastructure. Let’s talk.