the allure and failure of knowledge graphs

knowledge graphs are one of the sexiest-sounding methods in theory. scientist and engineer types are drawn to codifying knowledge in an abstract, syntactically perfect way. the history is full of big projects trying to bridge the gap between the perfectly syntactic knowledge-graph world and the mushy semantic world. they have all failed so far.
the newest revival of knowledge graphism combines knowledge graphs with llms. go through the comments: people are again attracted to the idea of abstracted, perfected knowledge. i believe the concept resonates because interacting with llms is inherently mushy.
however, modern llms should be seen as more efficient knowledge graphs (efficient in the economic sense of the word¹). llms with agentic capabilities are even more efficient! llm results are not as satisfying as syntactically and deterministically querying a knowledge graph, though. the biggest knowledge graph user, google, is pivoting to enhancing its results with AI overviews. funnily enough, google itself is sneak dissing the usefulness of knowledge graphs in the linked pr statement:
We’ve meticulously honed our core information quality systems to help you find the best of what’s on the web. And we’ve built a knowledge base of billions of facts about people, places and things — all so you can get information you can trust in the blink of an eye.
Now, with generative AI, Search can do more than you ever imagined. So you can ask whatever’s on your mind or whatever you need to get done — from researching to planning to brainstorming — and Google will take care of the legwork.
people love to post examples of bad AI overviews, but whenever i look over the shoulders of my friends, most of their google queries are answered by the AI overviews section.
i believe knowledge graphs work best when you can codify your knowledge as honest-to-god facts: facts with the least possible amount of interpretation potential, the least amount of semantics. llms just capture semantics better, or, worded differently, capture the context of the query better. as dissatisfying as the realization is, the real world is mushy and big, and thus can currently only be captured by mushy and big models.
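what "syntactically and deterministically querying a knowledge graph" means can be made concrete with a toy triple store. this is a minimal sketch, not any real system; the facts and the `query` helper are illustrative assumptions:

```python
# toy triple store: a set of (subject, predicate, object) facts.
# the facts below are illustrative placeholders.
triples = {
    ("paris", "capital_of", "france"),
    ("berlin", "capital_of", "germany"),
    ("france", "member_of", "eu"),
}

def query(subject=None, predicate=None, obj=None):
    """return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )

# deterministic: the same query always yields the same facts, no interpretation
print(query(predicate="capital_of"))
```

the appeal is obvious: an exact-match query over hand-codified facts never hallucinates. the limitation is equally obvious: everything mushy has to be squeezed into triples first.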
footnotes:
1. claude opus 4.5's take:
Software Efficiency from a Buyer's Perspective
Efficiency in software services boils down to value per unit of cost. Here's how to evaluate it:
1. Total Cost of Ownership (TCO)
Not just the subscription price, but also: implementation time, training, integration costs, and ongoing maintenance. A cheaper tool that needs constant workarounds is less efficient.
2. Time to Value
How quickly does the service deliver results? A service that saves 10 hours/week but takes 6 months to deploy may be less efficient than one saving 5 hours/week but operational in 2 days.
3. Resource Utilization
Does Service A require 3 employees to manage while Service B runs autonomously? The labor overhead directly impacts efficiency.
4. Output per Dollar
Compare: transactions processed, users served, or tasks completed relative to cost. Service A at $500/month handling 10,000 requests is more efficient than Service B at $300/month handling 5,000.
5. Opportunity Cost
Choosing a less capable service means lost productivity, slower scaling, or missed features that competitors leverage.
The Core Equation:
Efficiency = (Business Value Delivered) / (Total Resources Invested)
A buyer should pick the service where marginal cost yields the highest marginal return, factoring in both direct expenses and indirect costs like time, complexity, and scalability limits.
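the output-per-dollar comparison in point 4 can be sketched in a few lines. this is a hedged illustration using the example numbers above; `requests_per_dollar` is a made-up helper, not an established metric:

```python
# output per dollar: monthly requests handled divided by monthly cost.
def requests_per_dollar(monthly_cost, monthly_requests):
    return monthly_requests / monthly_cost

service_a = requests_per_dollar(500, 10_000)  # 20.0 requests per dollar
service_b = requests_per_dollar(300, 5_000)   # ~16.7 requests per dollar

# service A is more efficient despite the higher sticker price
print(service_a > service_b)
```

the point of the sketch: the cheaper subscription loses once you normalize by output, which is the whole argument of the core equation.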