Data architecture: stop worrying about names, start worrying about functions
Why medallion architecture is a cloud-native rebrand of ODS and Data Warehousing — and what that means for your cloud bill.
There is nothing new under the sun of data engineering. We have simply changed the labels — and convinced ourselves to pay cloud compute costs to rediscover what Bill Inmon figured out in 1992.
If you have spent any time in the “Modern Data Stack” lately, you have encountered the Medallion Architecture. Databricks, Snowflake, Microsoft, dbt Labs — they all promote Bronze, Silver, and Gold as the definitive framework for the cloud era. But strip away the branding, and you will find the same logical blueprints we have been using for three decades. As a senior architect, my job is not to care about the color of the layer. My job is to care about its function — because in an era where compute costs compound daily, confusing the two is the fastest way to build an expensive data swamp.
The universal mapping: from legacy to medallion

Regardless of the vendor or the tool — dbt, Airbyte, Fivetran, or a custom ingestion framework — the physics of data is immutable. Every resilient pipeline follows the same three-step logical progression.
1. The ingestion layer: Bronze is the new landing zone
In the medallion world, Bronze is where data lands in its rawest form. In classical architecture, this was the staging area — but with one critical distinction that the cloud era has made explicit, and that most architects gloss over: the classical staging area was temporary. Bronze is not.
In the old-school ETL world, storage was expensive. A staging area on Oracle or SQL Server was designed as a transit zone: truncate, load, transform, repeat. By the end of the night job, the staging tables were empty. You kept only what you needed for the current run.
The cloud changes this contract entirely. On S3, GCS, or ADLS, storing terabytes of raw data costs a few dollars a month. So the Bronze layer operates on a fundamentally different principle: we never overwrite, we only append. Raw data lands exactly as the source system produced it — no type casting, no renaming, no cleaning — and it stays there indefinitely.
This is not a cosmetic difference. It means that if a transformation fails downstream six months from now, or if a business requirement changes and you need to reprocess historical data, you can replay your entire pipeline from the original source without touching the operational database again. Bronze is your architectural insurance policy. It also decouples your ingestion schedule from your transformation schedule, which becomes critical at scale.
The rule is simple: Bronze is a write-once, read-many archive. Nothing is ever deleted.
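The append-only contract can be sketched in a few lines. This is an illustrative toy, assuming a local filesystem stands in for S3/GCS/ADLS; the function and path names are mine, not a real ingestion framework's:

```python
import json
import time
from pathlib import Path

def land_raw(records: list[dict], source: str, bronze_root: Path) -> Path:
    """Append a batch of raw records to the Bronze layer.

    Each batch lands in its own timestamped file: no casting, no
    renaming, no cleaning — and existing files are never overwritten.
    """
    batch_dir = bronze_root / source
    batch_dir.mkdir(parents=True, exist_ok=True)
    # A unique, monotonically increasing file name enforces append-only
    target = batch_dir / f"batch_{time.time_ns()}.jsonl"
    with target.open("x") as f:  # mode "x" fails loudly rather than overwrite
        for record in records:
            f.write(json.dumps(record) + "\n")
    return target
```

Note the `"x"` open mode: if a collision ever did occur, the pipeline would fail noisily instead of silently destroying history — exactly the guarantee the write-once rule demands.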
2. The integration layer: Silver is the modern ODS
The Silver layer is where the real engineering happens. This is the operational data store — the governed, integrated middle tier that sits between raw ingestion and business consumption. If you are working with dbt, the majority of your models will live here.
This is where you handle deduplication, column renaming for consistency, and version-controlled history — whether through SCD Type 2 patterns, dbt snapshots, or Delta Lake’s native time-travel capabilities.
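The SCD Type 2 pattern in particular is worth seeing in miniature. Here is a deliberately tiny sketch in pure Python standing in for a dbt snapshot or a MERGE statement — the field names (`key`, `attrs`, `valid_from`, `valid_to`) are illustrative, not a standard:

```python
from datetime import date

def scd2_apply(history: list[dict], incoming: dict, today: date) -> list[dict]:
    """Apply one incoming record to an SCD Type 2 history.

    If the tracked attributes changed, close the current row and open a
    new one; if nothing changed, leave history untouched, so re-runs
    are safe.
    """
    current = next((r for r in history if r["valid_to"] is None), None)
    if current and current["attrs"] == incoming["attrs"]:
        return history  # no change: re-running produces no duplicates
    if current:
        current["valid_to"] = today  # close out the superseded version
    history.append({
        "key": incoming["key"],
        "attrs": incoming["attrs"],
        "valid_from": today,
        "valid_to": None,  # the open-ended row marks the current version
    })
    return history
```

The open-ended `valid_to` row is the “current” record; every closed row is a version you can still query as-of any past date.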
A point worth clarifying, because it is frequently misunderstood: Silver does not enforce ACID transactional guarantees in the traditional database sense. What it enforces is data quality constraints, idempotency — your pipeline can re-run safely without producing duplicates or side effects — and referential integrity between entities. If you are running on Delta Lake or Apache Iceberg, you gain ACID guarantees at the storage layer, but that is a property of the engine, not of the architectural tier.
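Idempotency in this sense is simply a keyed upsert. A minimal sketch, with SQLite standing in for the warehouse engine and an invented `silver_customers` table:

```python
import sqlite3

def load_silver(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Upsert rows keyed on customer_id: re-running the same batch
    leaves the table unchanged instead of duplicating rows."""
    conn.executemany(
        """
        INSERT INTO silver_customers (customer_id, email)
        VALUES (?, ?)
        ON CONFLICT(customer_id) DO UPDATE SET email = excluded.email
        """,
        rows,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE silver_customers (customer_id INTEGER PRIMARY KEY, email TEXT)"
)
load_silver(conn, [(1, "a@example.com"), (2, "b@example.com")])
load_silver(conn, [(1, "a@example.com"), (2, "b@example.com")])  # safe re-run
```

Run the batch once or ten times: the table ends up in the same state. That property, not the storage engine underneath, is what makes a Silver layer trustworthy.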
A second common confusion: CDC (Change Data Capture) — implemented via tools like Debezium or Fivetran log-based replication — is an ingestion mechanism, not a Silver-layer property. You can build a fully historized Silver layer with or without CDC. What matters is that your integration layer captures change over time, regardless of the mechanism used to detect those changes upstream.
At the end of the Silver layer, your data is technically clean, standardized, and historically tracked. But it is not yet business-oriented. It is a governed mirror of the source — the technical truth, not the business answer. That distinction matters enormously when debugging a data discrepancy at 11pm before a board meeting.
3. The presentation layer: Gold is the data warehouse
The Gold layer is the destination. This is where data marts and the traditional data warehouse live — the layer that your analysts, BI tools, and machine learning models actually consume.
Here, data is no longer a technical record. It is a business answer. We apply complex business logic, join disparate domains — merging CRM customers with ERP orders, for instance — and pre-aggregate for query performance.
This is the return of Kimball’s dimensional modeling. We shape data into star schemas and snowflake schemas so that end users — via Power BI, Tableau, Looker, or ML pipelines — can consume it without understanding the underlying complexity. Every design decision here should be driven by how the data will be read, not by how it was written.
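In miniature, the Gold contract looks like this: a fact table joined to a conformed dimension and pre-aggregated for the reader. SQLite again stands in for the warehouse, and the star schema is an invented toy:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: one row per customer, business-friendly attributes
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, region TEXT);
    -- Fact: one row per order, foreign key into the dimension
    CREATE TABLE fct_orders (customer_key INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                 [(1, "EMEA"), (2, "EMEA"), (3, "APAC")])
conn.executemany("INSERT INTO fct_orders VALUES (?, ?)",
                 [(1, 100.0), (2, 50.0), (3, 75.0)])

# The Gold "mart": revenue by region, pre-shaped for BI consumption
revenue_by_region = conn.execute("""
    SELECT d.region, SUM(f.amount) AS revenue
    FROM fct_orders f
    JOIN dim_customer d USING (customer_key)
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()
```

The analyst never sees source-system column names or join gymnastics — they see `region` and `revenue`. That is the entire point of the layer.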
ETL vs. ELT: a change in logistics, not logic
The real revolution of the last decade is not the architecture — it is the direction of transformation.
In the ETL era, storage was expensive and compute was scarce. We transformed data before loading it into the warehouse, because we could not afford to store intermediary states. Today’s ELT flips this contract: we load raw data first, then transform it inside the platform using elastic cloud compute.
But elastic compute is not free. Just because you can transform everything in a single pass does not mean you should collapse the logical layers. Jumping straight from Bronze to Gold in a single 800-line SQL script saves you setup time on day one. By month six, it has cost you far more in debugging hours, lineage failures, and reprocessing costs when a single upstream schema change cascades through your entire pipeline with no isolation layer to absorb it.
The architectural cost of skipping Silver is not theoretical. A single schema-breaking change from a source system — a renamed column, a new NULL constraint — will propagate silently through a monolithic pipeline until it surfaces as a wrong number in a board-level dashboard. The remediation cost — engineering time, data restatement, stakeholder trust rebuilt one meeting at a time — reliably exceeds whatever setup time the shortcut was meant to save.
Architect the function, not the name
The modern data stack is a genuine evolution in speed, scalability, and developer experience. The tools are better. The compute is cheaper. The observability is richer. None of that changes the underlying physics of data movement.
Do not be distracted by the rebranding of layers. Whether you call it Silver or ODS, the requirement is identical: you need a governed, integrated, technically sound middle layer before you can claim to deliver business value. A system that lacks it is not a modern data stack. It is a data swamp with better branding.
The names will change again in five years. The logic will not.
I’m Camille Lebrun, a Data Consultant. I write about SQL optimization, data modeling, and analytics engineering.
Found this useful? Clap on Medium to help other analytics engineers find it. What’s the most misleading aggregation you’ve seen shipped to a stakeholder? Drop it in the comments.