Understanding Search: From Keywords to Knowledge Graphs
Search systems were once the precise, deterministic gateways to structured human knowledge. Today, they often behave more like probabilistic guessing engines—sometimes brilliant, sometimes incoherent, and frequently confident but wrong.
This shift wasn’t an accident; it was an architectural choice. We traded exactness for semantic flexibility, and in doing so, we broke the structural scaffolding that makes information reliable.
This article explains why search broke—from the perspective of information retrieval (IR) engineering—and why Knowledge Graphs are emerging not just as an enhancement, but as the necessary foundation for the next generation of AI systems.
The Age of the Catalog: Early Search Foundations
In the 1990s, the web was effectively a chaotic, ever-expanding public library. To manage it, early search engines like AltaVista, Yahoo, and Excite relied on inverted indexes.
An inverted index is a straightforward data structure: a map of Token → List[Documents]. If you searched for “Apple,” the system looked up the token and returned every document containing it. This was literal matching. It had no concept of semantics, intent, or disambiguation. It couldn’t distinguish between Apple the fruit, Apple the computer company, or Apple Records.
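A minimal sketch of this structure in Python makes the limitation obvious: the index maps lowercased tokens to document IDs, so every sense of “apple” collapses into one posting list.

```python
# Minimal inverted index: token -> set of document IDs.
# Toy documents; a real engine would also tokenize and normalize properly.
from collections import defaultdict

def build_index(docs):
    """docs: dict of doc_id -> text. Returns token -> set of doc_ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

docs = {
    1: "Apple released a new computer",   # Apple the company
    2: "The apple fell from the tree",    # apple the fruit
}
index = build_index(docs)
# Both documents match the token "apple": literal matching, no disambiguation.
print(sorted(index["apple"]))  # [1, 2]
```

Both senses land in the same posting list; nothing in the data structure can tell them apart.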
The PageRank Revolution
The first major paradigm shift came with PageRank.
Larry Page and Sergey Brin realized that the web wasn’t just a collection of documents; it was a graph. They viewed hyperlinks as directed edges and treated every link as a vote of authority. Mathematically, PageRank treats the web as a massive Markov chain. The rank of a page is effectively the stationary distribution of a random walk across the graph.
This moved search from simple term frequency (how often a word appears) to link-based authority estimation (how important the page is). This structural approach dominated IR systems for two decades because it successfully proxied “quality” through “connectivity.”
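The stationary distribution can be computed by power iteration. The sketch below uses a toy three-page link graph (the structure and damping factor follow the standard formulation; the graph itself is invented for illustration):

```python
# Power iteration for PageRank on a toy link graph.
# links: node -> list of outbound-link targets (directed edges).
def pagerank(links, damping=0.85, iterations=50):
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1 - damping) / n for node in nodes}
        for node, targets in links.items():
            if targets:
                # Each page splits its rank evenly among its outbound links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: distribute its rank evenly over all pages.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
# C receives "votes" from both A and B, so it ends up with the highest rank.
```

Each iteration redistributes rank along the edges; after enough iterations the values converge to the random walk’s stationary distribution.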
But users evolved faster than the infrastructure.
The Era of Vectors (and Their Limitations)
By the 2010s, user behavior shifted. Queries evolved from keyword fragments like weather paris to complex, intent-driven questions:
“Is it safe to travel to Paris this weekend given the protests?”
Keyword search and link analysis failed here. A keyword search for “safe” and “protests” might return articles from 2019 because the tokens matched, even if the context was obsolete.
Enter Semantic Search
To solve this, the industry pivoted to embedding-based semantic search.
We began using models (like BERT and later SentenceTransformers) to compress sentences and documents into high-dimensional vectors. Retrieval became a geometry problem: we measured the cosine similarity between the query vector and the document vectors in that space. We scaled this with Approximate Nearest Neighbor (ANN) search, using algorithms like HNSW and libraries like FAISS and ScaNN.
This felt genuinely transformative—it allowed systems to retrieve semantically related passages even if they didn’t share a single keyword.
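Stripped of the ANN machinery, the core retrieval step is just a similarity ranking. The sketch below uses tiny hand-made vectors as stand-ins for real embeddings (a production system would get these from an embedding model and search them with an ANN index):

```python
# Brute-force semantic retrieval by cosine similarity.
# The embeddings here are illustrative toy vectors, not model outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

doc_vectors = {
    "climate": [0.9, 0.1, 0.2],
    "agriculture": [0.8, 0.2, 0.3],
    "football": [0.1, 0.9, 0.1],
}
query = [0.9, 0.1, 0.2]  # a query "about climate"

# Rank documents by similarity to the query; no keyword overlap required.
best = max(doc_vectors, key=lambda d: cosine(query, doc_vectors[d]))
```

Note what the ranking gives us: “climate” and “agriculture” both score high because they are *near* each other in the space, but nothing in these numbers says *why* they are related.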
The Critical Weakness
However, embeddings introduced a subtle but critical structural weakness:
Embeddings encode proximity, not relationships.
Vectors operate in a metric space, not a relational graph. They are excellent at determining that “climate change” and “agriculture” are related concepts (they are close in vector space). But they are blind to the logic of that relationship. They cannot explicitly encode:
- Causality (Does X cause Y?)
- Hierarchy (Is X a type of Y?)
- Temporal dependency (Did X happen before Y?)
We built systems that could find things that sounded similar, but we lost the ability to reason about how they were connected.
The Great Shredder: RAG and Context Fragmentation
When Large Language Models (LLMs) arrived, we attempted to bridge this gap with Retrieval-Augmented Generation (RAG). The idea was simple: retrieve relevant data and feed it to the LLM as context.
However, the engineering implementation of RAG introduced a fundamental flaw: Context Fragmentation.
The Chunking Problem
To build vector indexes, we typically split long documents into fixed-size fragments (e.g., 300–1200 tokens). We call this “chunking,” but from an information theory perspective, it is a lossy compression strategy.
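The mechanics are trivial, which is part of the problem. A naive fixed-size chunker looks like this (sizes are illustrative; real pipelines often add overlap or split on sentence boundaries, but the boundary-severing effect remains):

```python
# Naive fixed-size chunking: split a token list into fragments.
# Any reference, definition, or causal link that spans a chunk
# boundary is silently severed.
def chunk(tokens, size=300, overlap=0):
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]

doc_tokens = ["tok"] * 1000  # stand-in for a 1000-token document
chunks = chunk(doc_tokens, size=300)
# 1000 tokens -> chunks of 300, 300, 300, and 100 tokens.
```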
```mermaid
graph LR
    Doc[Structured Document] -->|Chunking| C1[Chunk 1]
    Doc -->|Chunking| C2[Chunk 2]
    Doc -->|Chunking| C3[Chunk 3]
    C1 -.->|Loss of Context| C2
    C2 -.->|Loss of Logic| C3
    style Doc fill:#f9f,stroke:#333,stroke-width:2px
    style C1 fill:#eee,stroke:#333
    style C2 fill:#eee,stroke:#333
    style C3 fill:#eee,stroke:#333
```
Chunking severs:
- Cross-references (Section A referring to Section F)
- Definitions (A term defined in the intro is used undefined in the conclusion)
- Causal chains (The cause is in Chunk 1, the effect is in Chunk 3)
A RAG pipeline retrieves a handful of these disconnected chunks and passes them to an LLM, expecting the model to reconstruct the global context. The result is often hallucinated reasoning or incoherent synthesis.
The failure wasn’t in the LLM; it was in the knowledge representation pipeline. We destroyed the map and expected the model to navigate the territory.
The Return to Structure: Knowledge Graphs
Where embeddings excel at capturing similarity, Knowledge Graphs excel at preserving structure.
A Knowledge Graph models reality strictly as:
- Nodes: Entities, events, and concepts (e.g., “Argentina”, “Drought”)
- Edges: Explicit relationships (e.g., “CAUSES”, “LOCATED_IN”)
- Properties: Metadata attached to nodes/edges (e.g., timestamps, confidence scores)
- Constraints: Domain rules that govern logic
This structure supports multi-hop retrieval—the ability to traverse a chain of facts to answer a complex question.
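In code, the model is just typed edges with metadata attached. A minimal property-graph sketch (entities and relations taken from the example below; the property values are invented for illustration):

```python
# Minimal property-graph representation: typed edges with metadata.
# Nodes are strings; edges carry a relation type and properties.
edges = [
    {"src": "Drought", "rel": "LOCATED_IN", "dst": "Argentina",
     "props": {"confidence": 0.9}},
    {"src": "Drought", "rel": "CAUSES_DECREASE_IN", "dst": "Soybean Supply",
     "props": {"confidence": 0.8}},
]

def neighbors(node):
    """All outgoing (relation, target) pairs for a node."""
    return [(e["rel"], e["dst"]) for e in edges if e["src"] == node]

# Unlike a vector, each edge states *how* two entities are connected.
print(neighbors("Drought"))
```

A graph database (Neo4j, for instance) adds indexing, constraints, and a query language on top of exactly this shape of data.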
Example: Relational Reasoning
Consider the query:
“How do weather patterns in South America affect US markets?”
A vector search might fail because “weather in Argentina” and “US inflation” rarely appear in the same paragraph (chunk). A graph system, however, can traverse the explicit path:
Drought in Argentina
→ CAUSES_DECREASE_IN → Soybean Supply
→ IMPACTS → Global Export Prices
→ AFFECTS → US Inflation
This traversal isn’t a guess; it’s a logical deduction derived from the data topology.
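The traversal above is an ordinary path search. A breadth-first sketch over the same toy relations (a production system would express this in a graph query language such as Cypher rather than hand-rolled BFS):

```python
# Multi-hop retrieval as breadth-first path search over explicit relations.
from collections import deque

graph = {
    "Drought in Argentina": [("CAUSES_DECREASE_IN", "Soybean Supply")],
    "Soybean Supply": [("IMPACTS", "Global Export Prices")],
    "Global Export Prices": [("AFFECTS", "US Inflation")],
}

def find_path(start, goal):
    """Return the first relation-labeled path from start to goal, or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"-{rel}->", nxt]))
    return None

path = find_path("Drought in Argentina", "US Inflation")
# Every hop in the returned path is an explicit, labeled fact.
```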
Graph algorithms—such as PageRank (for importance), Betweenness Centrality (for influence), and Shortest Path (for connection)—allow us to reason over the data, rather than just retrieving it.
The Next Chapter: GraphRAG
The emerging architecture for next-generation search is GraphRAG. It is not a replacement for vector search, but a unification of methods.
It combines:
- Vectors for broad, fuzzy semantic recall (“Find concepts related to agriculture”)
- Graphs for precise, relational traversal (“Follow the supply chain dependencies”)
- LLMs for synthesis and natural language explanation
This hybrid approach solves the “chunking artifact” problem. By linking chunks back to their parent entities in a graph, we preserve the global context even when retrieval is local.
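The two-stage flow can be sketched in a few lines: vector similarity picks the entry point, then graph traversal expands it into connected context. Everything here (the toy embeddings, the adjacency lists, the dot-product recall) is an illustrative assumption, not a reference implementation:

```python
# Hybrid GraphRAG-style retrieval sketch:
#   1. fuzzy vector recall selects a seed entity,
#   2. graph expansion pulls in explicitly linked entities.
def hybrid_retrieve(query_vec, embeddings, graph, hops=1):
    # Step 1: vector recall (toy dot product instead of a real ANN index).
    seed = max(
        embeddings,
        key=lambda e: sum(q * v for q, v in zip(query_vec, embeddings[e])),
    )
    # Step 2: graph expansion up to `hops` edges away from the seed.
    context = {seed}
    frontier = {seed}
    for _ in range(hops):
        frontier = {dst for node in frontier for dst in graph.get(node, [])}
        context |= frontier
    return context

embeddings = {"Drought": [1.0, 0.0], "Football": [0.0, 1.0]}
graph = {"Drought": ["Soybean Supply"], "Soybean Supply": ["Export Prices"]}
context = hybrid_retrieve([0.9, 0.1], embeddings, graph, hops=2)
# The fuzzy match lands on "Drought"; the graph supplies the supply chain.
```

The seed comes from geometry; everything else in the context set comes from structure. That division of labor is the whole point of the hybrid.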
The Evolution of Search
Search is evolving through distinct engineering eras:
| Era | Technique | Core Limitation |
|---|---|---|
| 1990s | Inverted Indexes | Literal matching; no semantics |
| 2000s | Link Analysis | Authority-focused; blindly trusts links |
| 2010s | Embeddings | Encodes proximity, but ignores structure |
| 2020s | Naive RAG | Context fragmentation via chunking |
| 2025+ | GraphRAG | Restores structure and reasoning |
Search broke when we optimized purely for matching (finding words) instead of mapping (understanding worlds). It will be fixed by rebuilding the structural foundation underlying our data.
References
- Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab.
- Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS. (The foundational RAG paper)
- Edge, D., et al. (2024). From Local to Global: A Graph RAG Approach to Query-Focused Summarization. Microsoft Research. (The paper defining modern GraphRAG)
- Vaswani, A., et al. (2017). Attention Is All You Need. NeurIPS. (For context on embedding limitations)
- Liu, N., et al. (2023). Lost in the Middle: How Language Models Use Long Contexts. (Explaining why larger context windows don’t fix structure)