Glean just published "How Do You Build a Context Graph?" — a detailed look at their four-layer architecture for powering enterprise AI agents. Deep connectors, a unified knowledge graph, personal activity graphs, and an aggregated view of how work flows across the organization. It's serious infrastructure.
I've been writing about context graphs since Foundation Capital kicked off this conversation in December — on the operational context layer, temporal facts and the event clock, and what the ontology debate gets wrong. So I read the Glean piece with genuine interest. Their approach is thoughtful, and their framing of the problem is right: enterprise AI agents can't automate work without understanding how work actually happens.
But here's what I keep coming back to: most teams aren't Glean's customer.
Most teams aren't enterprises with $50K+ annual budgets for wall-to-wall search and a sales cycle before they can get started. They're 20-person CS teams, 5-person sales orgs, founding teams doing their own account management, VCs tracking portfolio companies. Their bottleneck isn't "search across the enterprise." It's simpler and more specific:
"What do we know about this account — and what changed since we last checked?"
That question demands a different architecture than enterprise search. And the answer, I think, is what we've been building at Dossium.
Two Ways to Build a Context Graph
Glean's approach is horizontal. Index everything in the enterprise — every document, ticket, email, chat message, calendar event — and build a unified knowledge graph across all of it. Capture how work flows through the organization. Learn patterns from how people use tools and move through workflows. The context graph is enterprise-wide, and the output is search plus agent automation.
This makes sense for large organizations where the question is: "How does work happen here?" Workflow patterns, organizational behavior, who does what and in what order. That's genuinely valuable infrastructure.
But there's a different question that a different audience asks every day: "What's happening with Acme?"
Not "search for documents mentioning Acme." Not "show me how workflows involving Acme typically play out." Just: what's true about this account right now? Who are the key people? What commitments are open? What changed since the last QBR? What should I know before this call?
That question doesn't need enterprise-wide indexing. It needs relationship-scoped intelligence — everything your team knows about an account, unified, identity-resolved, and temporally aware.
The architectures diverge because the organizing principles diverge: Glean organizes around the enterprise, we organize around the account. Neither is wrong. They're solving different problems for different teams.
The Three Layers
If your organizing principle is accounts — not the enterprise org chart, not process flows — then the data model looks different too.
We've spent three years building context infrastructure at Graphlit. The lesson that emerged, and that we've written about in this series, is that relationship intelligence requires three distinct layers of data. Not four (Glean's model). Not two (a knowledge graph plus search). Three.
Content: The Evidence
Emails, meeting transcripts, Slack threads, documents, call recordings. The raw material — immutable, timestamped, sourced. The system extracts knowledge from content, but the content itself is never modified. It's the canonical record of what was captured — the evidence trail you can always go back to.
Most systems stop here. They index content, embed it, and call it RAG. The problem: content is noisy. An email thread about Acme also mentions six other topics. A meeting transcript is 45 minutes long and the relevant three sentences are buried in minute 32. Search over content returns results. It doesn't return understanding.
Content is the foundation. But it's not the answer.
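To make "immutable, timestamped, sourced" concrete, here's a minimal sketch of what a content record might look like. The field names are hypothetical, not Graphlit's actual schema; the point is the frozen record:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Illustrative content record. frozen=True enforces the invariant described
# above: knowledge is extracted FROM content, but content is never modified.
@dataclass(frozen=True)
class ContentRecord:
    id: str
    source: str            # e.g. "gmail", "slack", "zoom-transcript"
    captured_at: datetime  # when the evidence entered the system
    text: str              # the raw material, kept verbatim

email = ContentRecord(
    id="c-001",
    source="gmail",
    captured_at=datetime(2026, 1, 12, tzinfo=timezone.utc),
    text="Thread: Acme renewal...",
)
# Any attempt to mutate email.text raises FrozenInstanceError --
# the evidence trail stays canonical.
```

Everything downstream (entities, facts, citations) points back into records like this.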
Entities: The Actors
People, organizations, relationships, roles. This is where identity resolution happens — and where most approaches fall apart.
Sarah Chen appears in your data as sarah.chen@acme.com in email, @sarah in Slack, "Sarah Chen" in the CRM, "Sarah from Acme" in a meeting transcript, and "S. Chen" on a calendar invite. To a system that indexes content, these are five unrelated strings. To a system that resolves identities, they're one person — with a title, an employer, a history of interactions, and a role in the account relationship.
Glean acknowledges this — their unified knowledge graph resolves entities using "activity signals to resolve ambiguities like 'ACME Inc' vs 'ACME'." We agree this matters. Where we differ is scope. Glean resolves entities across the entire enterprise. We resolve entities within the context of accounts — the people, organizations, and relationships that matter to your business. The CRM provides the backbone: accounts are the namespace, contacts are the starting points, and everything else — emails, meetings, Slack mentions — resolves to that structure.
Identity resolution isn't a feature. It's the foundation that makes everything else meaningful. Without it, you're searching strings, not understanding relationships.
Facts: The Truth Layer
This is the layer that most systems lack entirely — and where the real differentiation lives.
Content tells you what was said. Entities tell you who was involved. Facts tell you what's true — and what kind of true.
Every fact carries a category. Not just "something happened" but what kind of thing happened:
- Commitment: "We'll have the API migration done by Q2" — a promise someone made
- Decision: "Acme chose the enterprise tier" — a choice that was finalized
- Escalation: "Sarah escalated the latency issue to her VP" — something moved up the chain
- Change: "Sarah Chen promoted to SVP Engineering" — a state transition
- Goal: "Acme wants to consolidate to a single vendor by year-end" — a stated objective
Each fact is also temporal — it carries a timestamp for when it became true and, if applicable, when it stopped being true:
- "Sarah Chen is SVP Engineering" (since December 2025)
- "Sarah Chen was VP Engineering" (March 2024 – December 2025)
- "Acme renewed for 2 years" (January 2026, sourced from contract)
Facts aren't keywords. They're categorized, timestamped, sourced, and linked to the entities they reference. They can be canonical, superseded, corroborated, or synthesized from multiple sources. When three different sources assert something similar at different times, the system resolves them into a coherent timeline.
This is what I called the "event clock" in a previous post. We've built infrastructure for what's true now — CRM fields, database records, dashboards. Almost nothing for how truth evolved over time. Facts with temporal validity are the missing layer.
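The event clock can be sketched in a few lines. The `Fact` shape below mirrors the description above (category, validity window, source) but is illustrative, not a production schema. The key move is that queries take a date, so the same store answers both "what's true now" and "what was true then":

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Illustrative temporal fact: categorized, timestamped, sourced, entity-linked.
@dataclass
class Fact:
    statement: str
    category: str              # commitment | decision | escalation | change | goal
    subject: str               # canonical entity id
    valid_from: date
    valid_to: Optional[date] = None  # None means still true
    source: str = ""

facts = [
    Fact("Sarah Chen is VP Engineering", "change", "person:sarah-chen",
         date(2024, 3, 1), date(2025, 12, 1), source="crm"),
    Fact("Sarah Chen is SVP Engineering", "change", "person:sarah-chen",
         date(2025, 12, 1), source="meeting-transcript"),
]

def true_at(facts, subject, when):
    """Return the facts about a subject that were valid on a given date."""
    return [f for f in facts
            if f.subject == subject
            and f.valid_from <= when
            and (f.valid_to is None or when < f.valid_to)]

# The event clock in action: the answer depends on *when* you ask.
assert [f.statement for f in true_at(facts, "person:sarah-chen", date(2025, 6, 1))] \
    == ["Sarah Chen is VP Engineering"]
```

Supersession falls out of the window: promoting Sarah closes the old fact's `valid_to` and opens a new fact, rather than overwriting a CRM field and losing the history.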
Glean's model captures how work flows — who did what, in what order, across which tools. That's valuable for understanding organizational patterns. Facts capture something different: what your team believes to be true about an account, and how that belief evolved. One is about workflows. The other is about what's known.
From Layers to Intelligence
Three layers of data are necessary but not sufficient. The hard part is synthesis — turning content, entities, and facts into intelligence you can act on.
This is where most "AI-powered" tools fall short. They retrieve relevant documents, stuff them into a prompt, and hope the language model figures it out. The output is a summary of search results dressed up as insight. It's not grounded in resolved entities. It's not aware of temporal validity. It doesn't distinguish between what's current and what's superseded.
Here's what we've learned building Dossium: the synthesis has to operate across all three layers simultaneously. When you ask for a briefing — before a call, after a meeting, about an account — the system doesn't just search. It resolves every person mentioned against canonical entities in the knowledge graph. It gathers temporal facts — commitments, decisions, escalations — not just documents that mention the account. It retrieves the evidence trail, ranked by relevance and recency. And before it writes a word, it evaluates what it knows and what's missing — running targeted follow-up queries to fill gaps in coverage.
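The control flow above can be sketched schematically. Every helper here is a stub standing in for a real subsystem (graph queries, the fact store, retrieval); only the shape of the pipeline is meant literally:

```python
def resolve_entities(account_id):          # 1. canonical people and orgs
    return [{"id": "person:sarah-chen", "name": "Sarah Chen"}]

def gather_facts(account_id):              # 2. temporal facts, not raw documents
    return [{"category": "commitment", "statement": "API migration done by Q2"}]

def retrieve_evidence(account_id):         # 3. sources ranked by relevance/recency
    return [{"id": "c-001", "source": "gmail"}]

def find_gaps(entities, facts, evidence):  # 4. evaluate coverage before writing
    have = {e["source"] for e in evidence}
    return ["incident-history"] if "zendesk" not in have else []

def targeted_query(account_id, gap):       #    follow-up retrieval to fill a gap
    return [{"id": "c-002", "source": "zendesk", "topic": gap}]

def build_briefing(account_id):
    entities = resolve_entities(account_id)
    facts = gather_facts(account_id)
    evidence = retrieve_evidence(account_id)
    for gap in find_gaps(entities, facts, evidence):
        evidence += targeted_query(account_id, gap)
    # Every claim in the final output traces to an item in `evidence`.
    return {"entities": entities, "facts": facts, "evidence": evidence}

briefing = build_briefing("acct:acme")
assert len(briefing["evidence"]) == 2  # gap-filling pulled in the missing source
```

The synthesis step consumes all three layers at once; retrieval alone, as the paragraph above argues, would only produce a summary of search results.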
The output is grounded. Every claim traces to source material. Every citation is structural, not decorative. When the system doesn't know something, it says so. An honest gap is more useful than a confident hallucination.
The result isn't a search summary. It's a dossier — structured intelligence about an account, with temporal awareness and identity resolution baked in. I'll walk through this process in detail in the next post.
The API Layer
Guillermo Rauch recently wrote:
"Focus on problems where a simple API can hide enormous amounts of real-world and business complexity... Fast forward to the age of AI, agentic engineering, and the SaaS public market bloodbath. Software is now free to build. But when you sit down to vibe code your next app, that app will sit on the shoulders of giants. Your agent will read Markdown. Then, it will run CLIs and call MCP tools. Your software will make API calls to other services. And software will become more and more invisible. Computers talking to other computers to get you an answer, even before you ask. If you're starting now (or starting over), focus on the API. Do it for the agents."
This captures something important about how Dossium is built.
Dossium is a product — the interface humans use to prep for calls, browse account intelligence, and chat with their organizational context. But underneath, it's built on Graphlit's API: the infrastructure layer that handles multimodal ingestion, identity resolution, entity extraction, temporal fact modeling, and knowledge graph construction. Three years of context infrastructure, available as an API.
This matters because the intelligence isn't trapped in the product.
The same context that powers Dossium's briefings is available through MCP — the Model Context Protocol that lets any AI agent query your account context directly. Connect Dossium to Claude, Cursor, your internal agents, whatever comes next. They get the same identity-resolved, temporally aware, fact-grounded context that the product surfaces.
Agents are compute. Context is data. The context layer is the durable asset. The interfaces change.
Enterprise search platforms tend to be monolithic — you use their search, their agent, their UI. The context lives inside the product. We've built it the other way: the API is the product. Dossium is one interface. MCP is another. Your agents are a third. Same substrate, different access patterns.
What's Next: Personas
Today, a briefing synthesizes what the system knows about an account. The output is the same regardless of who's reading it — the AE gets the same briefing as the CSM, the same as the support lead.
But different roles need different lenses on the same account.
A CS manager preparing for a QBR needs relationship health, feature adoption, and open commitments. A sales exec needs deal progression, competitive signals, and stakeholder dynamics. A support lead needs incident history, escalation patterns, and SLA status. A VC partner needs portfolio performance, burn rate trends, and founder communication patterns.
Same account. Same underlying context. Different intelligence.
We call this layer personas — and it's what we're building next. Not prescribed role templates, but learned patterns for how different people use account context. The system observes which facts matter to which roles, which sections get read, which follow-up questions get asked — and adapts the intelligence to the reader.
Personas don't change the data model. Content, entities, and facts remain the substrate. What changes is the synthesis — which facts surface first, which relationships matter most, what the briefing emphasizes. The same three layers, rendered through a different lens.
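One way to picture persona-as-lens: the fact store is fixed, and only the ranking changes per reader. The weights below are hand-set for illustration; the post describes them as learned from reading and query behavior:

```python
# Hypothetical per-role weights over fact categories. In the system described
# above these would be learned from behavior, not hard-coded.
PERSONA_WEIGHTS = {
    "cs_manager": {"commitment": 3.0, "change": 2.0, "escalation": 1.5,
                   "decision": 1.0, "goal": 1.0},
    "sales_exec": {"decision": 3.0, "goal": 2.5, "change": 2.0,
                   "commitment": 1.0, "escalation": 0.5},
}

facts = [
    {"category": "commitment", "statement": "API migration done by Q2"},
    {"category": "decision", "statement": "Acme chose the enterprise tier"},
    {"category": "escalation", "statement": "Latency issue escalated to VP"},
]

def rank_for(persona: str, facts: list[dict]) -> list[dict]:
    """Same facts, different order: the lens changes, the data model doesn't."""
    weights = PERSONA_WEIGHTS[persona]
    return sorted(facts, key=lambda f: weights.get(f["category"], 0.0), reverse=True)

# The CSM's briefing leads with open commitments; the AE's with deal decisions.
assert rank_for("cs_manager", facts)[0]["category"] == "commitment"
assert rank_for("sales_exec", facts)[0]["category"] == "decision"
```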
Where We Stand
The context graph conversation has been productive. Foundation Capital named the opportunity. Animesh Koratana deepened the technical framing. The ontology debate clarified what's solved and what isn't. Glean just showed how they're building it for the enterprise.
We're building it for the team with a call in an hour.
The architecture is three layers — content, entities, facts — built on Graphlit's API, scoped to the accounts that matter to your business. The output is synthesized intelligence — briefings and dossiers — grounded in evidence, with every fact traceable to its source.
Not enterprise search. Relationship intelligence.
If your work is relationships, and your bottleneck is context, that's the problem we're solving.
This is the fifth in a series on context graphs. Previous posts: "The Context Layer AI Agents Actually Need", "Building the Event Clock", "Context Graphs: What the Ontology Debate Gets Wrong", and "Introducing Dossium".
