As regulatory frameworks such as the General Data Protection Regulation (GDPR), Digital Operational Resilience Act (DORA), and Network and Information Security Directive 2 (NIS2) converge with the US Clarifying Lawful Overseas Use of Data Act (CLOUD Act), contractual assurances are no longer a sufficient defense. For senior leadership, digital sovereignty has evolved from a compliance checkbox into a core architectural requirement. In a real-time enterprise, the streaming layer is the point of control with the highest leverage; a lack of sovereign architecture here instantly propagates noncompliance across the entire ecosystem.
Our recently published white paper, “Streaming Sovereignty,” walks through the spectrum of sovereignty architectures in detail. The following three ideas from the paper should reshape how you approach your next architecture decisions.
1. "We will not" and "We cannot" are different.
In cloud procurement today, the most underrated distinction is between a policy assurance ("We will not access your data") and an architectural guarantee ("We cannot access your data").
Under regional and global regulations, those two statements carry materially different weight. A policy can be overridden by a court order, a subpoena, an insider mistake, or a sub-processor change you didn't approve. An architecture—where the provider holds no identity and access management (IAM) permissions into your data plane, no network path to your storage, and no cryptographic key to your data—has nothing for a legal instrument to compel. The data simply isn't there to produce.
This is what's quietly changing how regulated enterprises evaluate vendors. The interesting question is no longer "What does the vendor commit to?" It's "What does the vendor's architecture make impossible?" In the streaming world, the cleanest expression of "We cannot" is the Bring Your Own Cloud (BYOC) pattern: Stateless agents run inside your virtual private cloud, messages land directly in your own object storage, and the vendor holds no permissions into either. Combine that with customer-managed encryption keys (Bring Your Own Key, or BYOK) in your own key management system (KMS)—AWS KMS, Azure Key Vault, or Google Cloud KMS—and you control a revocation switch that no contract clause can match. Pull the key, and the data goes dark. That's the architectural standard.
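The "pull the key, and the data goes dark" property is mechanical, not contractual, and a toy model makes that concrete. The sketch below is illustrative only (the HMAC-based stream cipher is NOT production cryptography, and `CustomerKMS` is a hypothetical stand-in for a real KMS): a customer-held master key wraps per-record data keys, records are stored only as ciphertext, and revoking the master key makes every wrapped key, and therefore every record, unrecoverable.

```python
# Toy illustration of the BYOK "revocation switch" (NOT production crypto).
# Revoking the customer-held master key leaves the vendor holding only
# ciphertext and wrapped keys it can never open.
import os, hmac, hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudorandom keystream from key + nonce (toy construction)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def xor(data: bytes, ks: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, ks))

class CustomerKMS:
    """Hypothetical stand-in for AWS KMS / Azure Key Vault / Google Cloud KMS."""
    def __init__(self):
        self._master = os.urandom(32)  # never leaves the customer's KMS
        self.revoked = False
    def wrap(self, data_key: bytes, nonce: bytes) -> bytes:
        return xor(data_key, keystream(self._master, nonce, len(data_key)))
    def unwrap(self, wrapped: bytes, nonce: bytes) -> bytes:
        if self.revoked:
            raise PermissionError("master key revoked: data is dark")
        return xor(wrapped, keystream(self._master, nonce, len(wrapped)))

kms = CustomerKMS()
nonce, data_key = os.urandom(16), os.urandom(32)
msg = b"customer tax record"
record = xor(msg, keystream(data_key, nonce, len(msg)))  # stored ciphertext
wrapped = kms.wrap(data_key, nonce)                      # stored wrapped key

# Normal path: unwrap the data key, then decrypt the record.
plain = xor(record, keystream(kms.unwrap(wrapped, nonce), nonce, len(record)))
assert plain == msg

# Pull the key: the ciphertext still exists, but nothing can read it.
kms.revoked = True
try:
    kms.unwrap(wrapped, nonce)
except PermissionError as e:
    print(e)  # master key revoked: data is dark
```

A subpoena served on the vendor can compel the ciphertext and the wrapped key, but neither is readable without the master key the customer alone controls.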
This may not be the right answer for every workload, but it's the reference point against which every other answer should be measured.
2. The schema is the new sovereignty boundary.
In the database era, the unit of sovereignty control was the network perimeter. You drew a boundary around the system, controlled who crossed it, and audited the crossings. That worked because data was largely at rest—one system, one boundary, one set of locks.
In the streaming era, the unit of sovereignty control is the schema. The data contract is the only thing in a streaming architecture that every producer and every consumer must conform to. That makes it the highest-leverage place in your stack to encode sovereignty intent—and the modern schema registry turns it into an enforcement point rather than a documentation artifact.
Once you treat the schema as the sovereignty boundary, four things you used to handle in separate tools start living in the same place:
i. The data contract itself—field names, types, required versus optional, evolution rules. Producers that violate the contract are rejected at produce time. The contract is enforced as code, not as a Confluence page.
ii. Sensitivity and jurisdiction tags. Annotate fields and topics with classifications—personally identifiable information (PII), protected health information (PHI), financial, EU-only, regulated market. Those tags are then queryable across your entire data estate, which makes lineage real: You can answer "Where does customer tax data flow?" with a query, not a spreadsheet.
iii. Encryption directives. Apply client-side field-level encryption (CSFLE) to sensitive fields or client-side payload encryption (CSPE) to the entire payload. The encryption rule lives with the field definition, the master key stays in your KMS, and the streaming platform never sees plaintext.
iv. Quality and policy rules. Constraints that go beyond schema validation (value ranges, allowlists, cross-field invariants) attach to the same contract and get enforced at produce time, before nonconformant data reaches a single consumer.
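The four concerns above can be sketched as a single contract enforced at produce time. Everything in this example is hypothetical for illustration: the field names, tag vocabulary, and rule format are loosely modeled on schema-registry rule sets, not an exact API, and real platforms enforce these checks server-side rather than in application code.

```python
# Illustrative sketch of a schema-as-sovereignty-boundary contract.
# All names, tags, and rule syntax are hypothetical examples.
CONTRACT = {
    "fields": {
        "customer_id": {"type": str, "required": True,  "tags": []},
        "tax_id":      {"type": str, "required": True,  "tags": ["PII", "EU-only"]},
        "amount":      {"type": int, "required": True,  "tags": ["financial"]},
        "note":        {"type": str, "required": False, "tags": []},
    },
    "encrypt_tags": {"PII"},  # (iii) fields tagged PII are encrypted client-side
    "rules": [("amount_non_negative", lambda r: r["amount"] >= 0)],  # (iv)
}

def produce(record: dict, producer_region: str) -> dict:
    """Reject nonconformant records before any consumer sees them."""
    for name, spec in CONTRACT["fields"].items():
        # (i) the data contract itself: presence and types
        if spec["required"] and name not in record:
            raise ValueError(f"missing required field: {name}")
        if name in record and not isinstance(record[name], spec["type"]):
            raise ValueError(f"wrong type for field: {name}")
        # (ii) jurisdiction tags: EU-only fields may not leave EU producers
        if "EU-only" in spec["tags"] and name in record and producer_region != "eu":
            raise ValueError(f"field {name} is EU-only; producer is in {producer_region}")
    # (iv) quality rules beyond schema validation
    for rule_name, check in CONTRACT["rules"]:
        if not check(record):
            raise ValueError(f"rule violated: {rule_name}")
    # (iii) encrypt tagged fields (placeholder ciphertext for illustration)
    out = dict(record)
    for name, spec in CONTRACT["fields"].items():
        if name in out and set(spec["tags"]) & CONTRACT["encrypt_tags"]:
            out[name] = "<ciphertext>"
    return out

ok = produce({"customer_id": "c1", "tax_id": "DE123", "amount": 10}, "eu")
assert ok["tax_id"] == "<ciphertext>"
```

The point of the sketch is the inheritance property: a new producer in any region gets the contract, the tags, and the rules for free, because they travel with the schema rather than with team-by-team documentation.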
The shift this enables is significant. In the perimeter model, every new producer or consumer is a new sovereignty surface to defend. In the schema-as-boundary model, every new producer or consumer inherits the contract automatically. A team building a new service in São Paulo doesn't need to be briefed on which fields are EU-restricted; the schema enforces it for them. Artificial intelligence (AI) sovereignty becomes tractable for the same reason: When the streams feeding your models carry their classifications with them, EU AI Act lineage questions become queries instead of investigations.
This isn't a replacement for residency, BYOK, or private networking, which still matter at the cluster and storage layers. It's an additional layer of control that exists because streaming exists. That’s the reason DORA, NIS2, and GDPR enforcement is increasingly written around continuously enforced controls rather than periodic ones. The schema, when properly designed, is the most continuous control you have.
3. Portability is a property of the protocol, not the contract.
DORA introduces something genuinely useful: the requirement that EU financial institutions maintain—and regularly test—credible exit plans for their information and communications technology providers. It's a healthy discipline, and it rewards organizations that build on open foundations.
When the underlying protocol is open, an exit plan is an executable workflow—no format conversion, no application rewrites, no protocol translation. Apache Kafka® and Apache Flink®, governed by the Apache Software Foundation, are protocols, not products. Moving from a Kafka-based managed service to self-managed Kafka or another Kafka-compatible service is a documented path you can demonstrate to a supervisor in a controlled test. No proprietary wire format. No proprietary on-disk format. No bespoke client library to rewrite.
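What "the exit is a config change, not a rewrite" looks like can be sketched in a few lines. The endpoints and settings below are placeholder examples, not real hostnames or a complete client configuration: application behavior stays identical across providers, and only connection and auth settings differ.

```python
# Sketch of protocol-level portability: the same Kafka client code runs
# against any Kafka-compatible endpoint. Hostnames are placeholders.
APP_CONFIG = {"acks": "all", "enable.idempotence": True}  # application behavior

managed_service = {
    "bootstrap.servers": "pkc-xxxxx.europe-west1.example-cloud.com:9092",
    "security.protocol": "SASL_SSL",
    **APP_CONFIG,
}
self_managed = {
    "bootstrap.servers": "kafka-1.internal.example.com:9092",
    "security.protocol": "SSL",
    **APP_CONFIG,
}

# The exit plan touches only the connection layer; the wire protocol,
# client library, and application logic are unchanged.
delta = {k for k in managed_service if managed_service[k] != self_managed.get(k)}
print(sorted(delta))  # ['bootstrap.servers', 'security.protocol']
```

That the delta is confined to endpoint and auth settings is precisely what makes an exit plan testable rather than theoretical: the same producers and consumers can be pointed at the target cluster in a controlled exercise.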
The broader lesson generalizes well beyond streaming: Sovereign portability is a property of the protocol, not the contract. Here are two questions you should be able to answer "yes" to before signing any infrastructure deal in a regulated environment:
i. Is the wire format open and independently implemented?
ii. Is the on-disk format open and readable without a vendor's tooling?
If both answers are “yes,” your DORA exit plan can be tested. If either answer is “no,” your exit plan is a legal artifact, not a technical one—and supervisors are starting to draw that distinction.
A blog post that promised sovereignty without trade-offs would be a sales pitch, not a useful read, so let's be honest about the cost. Architectural sovereignty buys you control—and control is operational work you used to outsource.
BYOC means you operate the data plane. Stateless agents are simple, but the storage account, the IAM policies, and the network are yours to monitor.
Self-managed (Confluent Platform/air-gap) means you take on upgrades, patching, and capacity planning. You trade managed-service velocity for full ownership.
CSFLE/CSPE mean you take on key life-cycle management and schema discipline. The encryption is only as good as the KMS hygiene behind it.
Cluster Linking across deployment tiers means an additional governance surface to keep coherent. It’s usually worth it but not free.
Honest framing: For most workloads in regulated enterprises, the right answer is a managed service with strong residency, BYOK, private networking, and client-side encryption for the fields that need it. It's the lowest-overhead path to a defensible posture. Zero-access BYOC and self-managed/air-gap exist for the slice of workloads—payment processing, clinical data, classified pipelines—where even policy-level vendor access is unacceptable. Treat the spectrum as a portfolio decision, not a single architecture choice.
Pull this into your next vendor evaluation, audit committee prep, or DORA risk register:
Architecture
In the event of a court order, what specific data could the vendor produce? (Document the answer per data class.)
Where do the encryption keys live, and who can revoke them? (BYOK in customer-managed KMS is the cleanest answer.)
What is the vendor's IAM and network path into the system that stores my data? (Map it. If the path is empty, document why.)
Stream-level governance
Is the streaming layer in scope of our DORA/NIS2 third-party risk register, with the same rigor as our core database?
At what point in the pipeline is noncompliant data detected—at produce time, at consume time, or in audit? (Earlier is better.)
Portability
Is the wire protocol open and independently implemented?
When did we last execute an exit plan in a test environment? (Documented isn't tested.)
A "no" or "we don't know" answer to any of these isn't disqualifying; it's a signal about which assurances in your vendor relationship are contractual and which are structural. That distinction is what auditors are increasingly asking you to make explicit.
We've spent the last decade building real-time infrastructure for regulated enterprises across financial services, healthcare, public sector, and critical infrastructure. The “Streaming Sovereignty” paper captures what we've learned in one place and goes deeper than this post on the parts that matter most:
A four-architecture sovereignty spectrum—Confluent Cloud, WarpStream BYOC, Confluent Private Cloud, and Confluent Platform—and a decision guide for which workload belongs where. There are multiple right answers; the paper tells you how to choose.
A board-level risk register mapping GDPR, DORA, NIS2, and CLOUD Act exposure to specific architectural mitigations.
An industry guide for financial services, healthcare, government, critical infrastructure, and telecommunications—with concrete deployment patterns we've validated with customers in each.
The audit artifacts that supervisors actually request: Service Organization Control (SOC) 1/2/3, International Organization for Standardization (ISO) 27001/27017/27018/27701, Payment Card Industry Data Security Standard (PCI DSS) Level 1, Health Insurance Portability and Accountability Act (HIPAA), Health Information Trust Alliance (HITRUST), Trusted Information Security Assessment eXchange (TISAX), the DORA mapping, the Data Transfer Impact Assessment, the Transparency Report, and the Law Enforcement Guidelines.
Equip your team with the insights they need to succeed. Get the full white paper here.
Apache®, Apache Kafka®, Apache Flink®, Flink®, and the respective logos are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by the Apache Software Foundation is implied by using these marks. All other trademarks are the property of their respective owners.