One Data Model, Many Representations
Summary: Modern web systems often create unnecessary separation between websites and APIs. In reality, both are simply different representations of the same underlying data. When designed correctly, a website can function as an API, and an API can render a complete website — without semantic loss, duplication, or inconsistency. This article explores that principle and the machine‑readable HTML technologies that make it viable.
The Fundamental Principle
At the heart of a well‑designed web system is a simple idea:
There should be one canonical data model.
HTML, JSON, XML, and other formats are not competing truths — they are **projections** of that model, tailored for different consumers.
Problems arise when:
- HTML says one thing
- APIs say another
- Metadata exists in isolation
- Meaning is duplicated instead of shared
When this happens, systems drift, documentation lies, and maintenance cost grows quietly but relentlessly.
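As a minimal sketch of the principle, the snippet below defines one record and derives two representations from it. The `Article` class, its field names, and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, asdict
from html import escape
import json


@dataclass(frozen=True)
class Article:
    """Hypothetical canonical model; field names are illustrative."""
    id: str
    title: str
    author: str


def to_json(article: Article) -> str:
    # Projection for machine consumers.
    return json.dumps(asdict(article), sort_keys=True)


def to_html(article: Article) -> str:
    # Projection for human readers, derived from the same fields.
    return (
        f'<article id="{escape(article.id)}">'
        f"<h1>{escape(article.title)}</h1>"
        f"<p>By {escape(article.author)}</p>"
        "</article>"
    )


article = Article(id="post-1", title="One Model", author="Ada")
print(to_json(article))
print(to_html(article))
```

Because both functions read the same object, a change to the title cannot reach one representation without reaching the other.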
The Website as an API
A properly structured HTML document is already machine‑readable.
Browsers, crawlers, screen readers, and other assistive technologies all parse:
- Document structure
- Element relationships
- Headings and landmarks
- Links and identifiers
When semantic meaning is embedded directly into HTML — using Microformats or RDFa — the document becomes a self‑describing data source.
In this model:
- The content is the data
- The markup expresses meaning
- Machines consume the same source as humans
This approach is resilient by design:
- It works without JavaScript
- It survives partial rendering
- It degrades gracefully
- It remains readable decades later
The document does not pretend to be an API — it simply is one.
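To illustrate how much a plain document already exposes, here is a rough sketch that reads headings, link targets, and identifiers using only Python's standard `html.parser` module. The `OutlineParser` class and the sample page are assumptions made for this example, not part of any particular system.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collects headings, link targets, and element ids from a document."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []    # (tag, text) pairs in document order
        self.links = []       # href values
        self.ids = []         # id attribute values
        self._heading = None  # heading tag currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "id" in attrs:
            self.ids.append(attrs["id"])
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        if tag in self.HEADINGS:
            self._heading = tag
            self._text = []

    def handle_data(self, data):
        if self._heading:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._heading:
            self.headings.append((tag, "".join(self._text).strip()))
            self._heading = None


# Hypothetical page used only to exercise the parser.
page = """
<main id="content">
  <h1>Orders</h1>
  <h2 id="open">Open orders</h2>
  <a href="/orders/42">Order 42</a>
</main>
"""

parser = OutlineParser()
parser.feed(page)
print(parser.headings)
print(parser.links)
print(parser.ids)
```

No script execution or rendering is involved; the structure is available from the markup alone.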
The API as a Website
The inverse approach is equally valid.
When an API exposes:
- Stable identifiers
- Explicit relationships
- Meaningful field names
- A coherent domain model
…then rendering HTML from it becomes a presentation concern, not a data problem.
The same endpoint can legitimately serve:
- JSON to machines
- HTML to humans
- XML to legacy systems
- Other formats as required
Nothing new is invented — only rendered.
This is not duplication. It is **representation**.
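One way to picture this is a small content-negotiation function that picks a renderer from the request's `Accept` header. The sketch below is deliberately simplified: the `ORDER` record and the renderer names are hypothetical, and q-values and wildcards in the header are ignored.

```python
import json
from html import escape
from xml.etree import ElementTree as ET

ORDER = {"id": "42", "status": "shipped"}  # hypothetical resource


def as_json(data):
    return json.dumps(data, sort_keys=True)


def as_html(data):
    rows = "".join(
        f"<dt>{escape(k)}</dt><dd>{escape(v)}</dd>" for k, v in data.items()
    )
    return f"<dl>{rows}</dl>"


def as_xml(data):
    root = ET.Element("order")
    for key, value in data.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


RENDERERS = {
    "application/json": as_json,
    "text/html": as_html,
    "application/xml": as_xml,
}


def represent(data, accept):
    """Return (media_type, body) for the first supported type in Accept.

    Simplification: q-values and wildcards are ignored; JSON is the fallback.
    """
    for part in accept.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](data)
    return "application/json", as_json(data)


print(represent(ORDER, "text/html,application/xhtml+xml"))
print(represent(ORDER, "application/xml"))
```

Adding a format means adding a renderer to the table; the resource itself does not change.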
Machine‑Readable HTML Technologies in Context
Different technologies support this model in different ways.
Microformats
Microformats embed meaning using existing HTML elements and class names.
Their strengths are simplicity and longevity:
- No parallel data structures
- No parser needed beyond an ordinary HTML parser and agreed class names
- No loss of human readability
If the machine disappears, the document remains correct.
This makes Microformats ideal for:
- Human‑centric documents
- Long‑lived content
- Systems that value resilience
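As a rough illustration, the sketch below reads a flat h-card using the microformats2 class prefixes `p-` (plain text) and `u-` (URL). It is not a conformant microformats2 parser: nested microformats, implied properties, and the other prefixes are left out, and the `HCardParser` class and sample markup are assumptions for this example.

```python
from html.parser import HTMLParser

VOID = {"br", "img", "hr", "input", "meta", "link"}  # elements with no end tag


class HCardParser(HTMLParser):
    """Reads p-* (text) and u-* (URL) properties from a flat h-card."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._stack = []  # per open element: its p-* property name or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        text_prop = None
        for name in (attrs.get("class") or "").split():
            if name.startswith("u-") and attrs.get("href"):
                self.card[name[2:]] = attrs["href"]
            elif name.startswith("p-"):
                text_prop = name[2:]
                self.card.setdefault(text_prop, "")
        if tag not in VOID:
            self._stack.append(text_prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing element that declared a p-* property.
        for prop in self._stack:
            if prop:
                self.card[prop] += data


# Hypothetical contact markup; the visible text is the data.
html_doc = """
<p class="h-card">
  <a class="p-name u-url" href="https://example.com/ada">Ada Lovelace</a>,
  <span class="p-org">Analytical Engines Ltd</span>
</p>
"""

parser = HCardParser()
parser.feed(html_doc)
print(parser.card)
```

Stripping the class names would leave the paragraph reading exactly the same to a person, which is the resilience property described above.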
RDFa
RDFa extends this idea by allowing richer expression of relationships.
Crucially, it still:
- Annotates existing content
- Avoids data duplication
- Keeps the document authoritative
Edits to content are edits to data — a powerful alignment that reduces drift over time.
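The following sketch shows that alignment on a small, well-formed fragment annotated with RDFa Lite attributes. The `rdfa_properties` helper is hypothetical and reads only a simplified subset of RDFa (`content` first, then `href` or `src`, then element text); prefixes, nesting, and `typeof` scoping are ignored, and the sample headline and date are invented.

```python
from xml.etree import ElementTree as ET


def rdfa_properties(fragment: str) -> dict:
    """Map each `property` attribute to its value in a well-formed fragment.

    Simplified reading of RDFa Lite: `content` wins, then `href`/`src`,
    then the element's text.
    """
    root = ET.fromstring(fragment)
    data = {}
    for el in root.iter():
        prop = el.get("property")
        if prop is None:
            continue
        value = el.get("content") or el.get("href") or el.get("src")
        data[prop] = value if value else "".join(el.itertext()).strip()
    return data


before = """
<article vocab="https://schema.org/" typeof="Article">
  <h1 property="headline">One Data Model</h1>
  <time property="datePublished" content="2024-01-15">15 January 2024</time>
</article>
"""

# An editor changes the visible heading; no separate metadata is touched.
after = before.replace("One Data Model", "One Model, Many Views")

print(rdfa_properties(before))
print(rdfa_properties(after))
```

The extracted headline follows the edited text because the annotation sits on the content itself.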
JSON‑LD
JSON‑LD serves a different purpose.
It exists primarily for automated consumers that:
- Do not want to parse HTML
- Prefer fast, predictable extraction
- Operate at web scale
JSON‑LD works best when treated as:
- An optimisation layer
- A reflection of existing truth
- A convenience for external systems
Problems arise only when JSON‑LD becomes the *primary* source of meaning rather than a projection of it.
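A sketch of JSON‑LD used as a projection might look like the following, where the script element is generated from a canonical record instead of being written by hand. The `ARTICLE` record, its keys, and the function names are hypothetical; the schema.org terms used are `Article`, `Person`, `headline`, `author`, and `name`.

```python
import json

# Hypothetical canonical record; keys and values are illustrative.
ARTICLE = {
    "url": "https://example.com/articles/one-model",
    "title": "One Data Model, Many Representations",
    "author": "Ada Lovelace",
}


def json_ld_payload(article: dict) -> dict:
    """Project the canonical record onto schema.org Article terms."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": article["url"],
        "headline": article["title"],
        "author": {"@type": "Person", "name": article["author"]},
    }


def json_ld_script(article: dict) -> str:
    """Wrap the payload in a script element for embedding in the page."""
    body = json.dumps(json_ld_payload(article), indent=2)
    # Escape "</" so text in the data cannot close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(json_ld_script(ARTICLE))
```

Since the block is regenerated from the record on every render, it has no independent state that could fall out of step with the page.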
Microdata
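Microdata arrived alongside HTML5 with its own set of attributes (itemscope, itemtype, itemprop), covering ground that RDFa's attribute‑based annotations already addressed.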
In practice, it:
- Adds verbosity without clarity
- Introduces new concepts without solving new problems
- Competes with simpler, more mature approaches
Parsers and search engines still support it, but it is rarely the preferred choice in real‑world systems.
Avoiding Parallel Realities
The most common architectural failure is semantic duplication.
Examples include:
- Content updated but metadata forgotten
- API fields drifting from UI labels
- SEO data diverging from visible truth
- Accessibility annotations bolted on late
The cure is not tooling — it is alignment.
When:
- HTML and API share identifiers
- Meaning is expressed once
- Representations are derived, not rewritten
…the system becomes calm and legible.
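Where representations are still maintained separately, drift can at least be made visible. The sketch below compares the visible `<h1>` of a page with the `headline` declared in its embedded JSON‑LD. The `DriftProbe` class, the `headline_drift` function, and the sample page are hypothetical, and a check like this only detects divergence; deriving both from one model is what prevents it.

```python
import json
from html.parser import HTMLParser


class DriftProbe(HTMLParser):
    """Captures the visible <h1> text and any embedded JSON-LD payload."""

    def __init__(self):
        super().__init__()
        self.h1 = ""
        self.json_ld = ""
        self._target = None  # name of the attribute currently receiving text

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._target = "h1"
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._target = "json_ld"

    def handle_endtag(self, tag):
        if tag in {"h1", "script"}:
            self._target = None

    def handle_data(self, data):
        if self._target:
            setattr(self, self._target, getattr(self, self._target) + data)


def headline_drift(page: str):
    """Return (visible, declared) when they differ, else None.

    A page with no JSON-LD headline reports `declared` as None.
    """
    probe = DriftProbe()
    probe.feed(page)
    declared = json.loads(probe.json_ld).get("headline") if probe.json_ld else None
    visible = probe.h1.strip()
    return None if declared == visible else (visible, declared)


# Hypothetical page where the content was updated but the metadata was not.
page = """
<h1>Spring Sale</h1>
<script type="application/ld+json">{"@type": "Article", "headline": "Winter Sale"}</script>
"""

print(headline_drift(page))
```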
Progressive Enhancement as Architecture
This approach naturally supports progressive enhancement.
A document‑first system:
- Works without scripts
- Improves with them
- Never depends on them
An API‑first system:
- Scales cleanly
- Supports automation
- Remains renderer‑agnostic
Both are valid — and both can coexist — as long as they project the same underlying model.
Design Guidance
A pragmatic strategy looks like this:
- **Human‑first content** → Microformats or RDFa
- **Crawler‑first metadata** → JSON‑LD
- **Single source of truth** → Canonical identifiers and models
- **Long‑term systems** → Embedded meaning over external declarations
There is no universal winner — only informed trade‑offs.
Final Thought
The web does not suffer from a lack of standards. It suffers from a lack of honesty.
Systems last when:
- Meaning is not duplicated
- Data is not reinvented
- Documents say exactly what they mean
When the website and the API tell the same story, the web works as it always should have — as a shared space for humans and machines alike.