What I learned building graph systems in Elixir -- Vegard Krogh

I’ve spent the past year building graph infrastructure in Elixir. Not a graph database — a set of tools for representing relationships between things I care about: notes, bookmarks, photos, conversations. The kind of personal knowledge graph that sounds simple until you try to query it.

Here’s what I learned.

ETS is your friend (until it isn’t)

The first instinct was GenServer state. Keep the graph in a process, send it messages, get responses. This works beautifully for small graphs: a few hundred nodes, maybe a thousand edges. The API is clean, the semantics are clear, and because a process handles one message at a time, you get serialized access for free.

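defmodule Graph.Store do
  use GenServer

  # Client API. The graph can be any structure that do_traverse/4 understands.
  def start_link(graph), do: GenServer.start_link(__MODULE__, graph)

  def traverse(pid, start_node, depth) do
    GenServer.call(pid, {:traverse, start_node, depth})
  end

  @impl true
  def init(graph), do: {:ok, graph}

  @impl true
  def handle_call({:traverse, start, depth}, _from, graph) do
    # The whole traversal runs inside the server process, so every other
    # caller waits until it finishes. do_traverse/4 appears later in this post.
    result = do_traverse(graph, start, depth, MapSet.new())
    {:reply, result, graph}
  end
end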

The problem arrives around 10,000 nodes. GenServer calls serialize access. One slow traversal blocks everything. The solution is ETS — Erlang Term Storage. Concurrent reads, no process bottleneck.

But ETS has its own traps. The data is flat key-value pairs. Representing a graph in key-value storage requires careful schema design. I ended up with three tables: nodes, edges (indexed by source), and reverse edges (indexed by target). Every write is three operations that need to stay consistent, and ETS gives you no transaction spanning several tables to help with that.
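
As a rough sketch of what such a three-table layout could look like (the module name, table names, and tuple shapes here are my illustration, not the exact code from the project):

```elixir
defmodule EtsGraphSketch do
  # Hypothetical layout: one :set table for nodes, two :bag tables for edges.
  # A :bag table allows several objects under the same key, which is what an
  # adjacency index needs.
  def new do
    %{
      nodes: :ets.new(:nodes, [:set, :public, read_concurrency: true]),
      out_edges: :ets.new(:out_edges, [:bag, :public, read_concurrency: true]),
      in_edges: :ets.new(:in_edges, [:bag, :public, read_concurrency: true])
    }
  end

  def put_node(g, id, attrs) do
    :ets.insert(g.nodes, {id, attrs})
  end

  # Each edge is written once per index. The two inserts are separate
  # operations, so a crash between them leaves the indexes out of sync.
  def put_edge(g, from, to, label) do
    :ets.insert(g.out_edges, {from, to, label})
    :ets.insert(g.in_edges, {to, from, label})
    :ok
  end

  # Outgoing neighbors: a single lookup in the source-indexed table.
  def neighbors(g, id) do
    for {_from, to, _label} <- :ets.lookup(g.out_edges, id), do: to
  end

  # Incoming neighbors: the same lookup against the reverse index.
  def incoming(g, id) do
    for {_to, from, _label} <- :ets.lookup(g.in_edges, id), do: from
  end
end
```

The reverse table exists only so that "who points at this node?" is a key lookup instead of a full scan of the edge table.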

The traversal problem

Graph traversal in a functional language feels natural. Pattern matching on node types, recursive descent through edges, accumulating results. But the naive approach has a flaw: the recursion isn’t tail-recursive.

defp do_traverse(_graph, _node, 0, visited), do: visited

defp do_traverse(graph, node, depth, visited) do
  if MapSet.member?(visited, node) do
    visited
  else
    visited = MapSet.put(visited, node)
    neighbors = get_neighbors(graph, node)

    Enum.reduce(neighbors, visited, fn neighbor, acc ->
      do_traverse(graph, neighbor, depth - 1, acc)
    end)
  end
end

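This eats memory on deep graphs. Every level of depth adds a stack frame, and since the BEAM grows a process’s stack rather than enforcing a fixed limit, the practical failure is running out of memory, not a classic stack overflow. The BEAM does optimize tail calls, but the recursive call here isn’t in tail position: it sits inside the function passed to Enum.reduce, which still has work to do after each call returns. The fix is to use an explicit stack (a list you push to and pop from at the head) and iterate.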
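
A minimal sketch of that explicit-stack version, assuming the graph is a plain map from node to a list of neighbors (the module name and this get_neighbors/2 are stand-ins I made up for the example):

```elixir
defmodule TraverseSketch do
  # Assumption: the graph is an adjacency map, node => [neighbor, ...].
  def get_neighbors(graph, node), do: Map.get(graph, node, [])

  # Depth-limited traversal with the same semantics as the recursive
  # version: an entry whose remaining depth is 0 is not visited.
  def traverse(graph, start, depth) do
    loop(graph, [{start, depth}], MapSet.new())
  end

  # The worklist is a list of {node, remaining_depth} pairs used as a
  # LIFO stack. Every call to loop/3 below is in tail position, so the
  # process stack stays flat however deep the graph is.
  defp loop(_graph, [], visited), do: visited

  # Depth budget exhausted: drop the entry without visiting it.
  defp loop(graph, [{_node, 0} | rest], visited), do: loop(graph, rest, visited)

  defp loop(graph, [{node, depth} | rest], visited) do
    if MapSet.member?(visited, node) do
      loop(graph, rest, visited)
    else
      # Push the neighbors on top of the stack, each with one less hop left.
      pushed = for n <- get_neighbors(graph, node), do: {n, depth - 1}
      loop(graph, pushed ++ rest, MapSet.put(visited, node))
    end
  end
end
```

The pending work now lives in the worklist on the heap instead of in stack frames, and nodes are visited in the same order as in the recursive version.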

What I’d do differently

Start with PostgreSQL. Not because ETS is bad, but because the query patterns for a knowledge graph are fundamentally relational. “Find all notes connected to photos taken in Oslo in March” is a SQL query. In ETS, it’s a custom traversal function you have to write and maintain.
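
To make that concrete, here is roughly what the Oslo query could look like. The schema is hypothetical: I’m assuming a nodes table with kind, city, and taken_at columns, an edges table with source_id and target_id, and edges that point from notes to photos. The SQL is wrapped in an Elixir function only to keep the examples in one language.

```elixir
defmodule QuerySketch do
  # Hypothetical schema, not the project's real one:
  #   nodes(id, kind, city, taken_at)
  #   edges(source_id, target_id)
  def notes_linked_to_oslo_march_photos do
    """
    SELECT DISTINCT note.id
    FROM nodes AS note
    JOIN edges AS e ON e.source_id = note.id
    JOIN nodes AS photo ON photo.id = e.target_id
    WHERE note.kind = 'note'
      AND photo.kind = 'photo'
      AND photo.city = 'Oslo'
      AND EXTRACT(MONTH FROM photo.taken_at) = 3
    """
  end
end
```

Two joins and a WHERE clause, and the query planner decides how to execute it. The ETS equivalent is a hand-written loop over the edge table plus filtering code for every new question you want to ask.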

Elixir is excellent for the pipeline that feeds the graph — processing, enriching, connecting. But for storage and querying, use a database. I learned this the slow way.

The OTP lesson

The real value of building this in Elixir wasn’t the graph itself. It was learning where OTP patterns apply and where they mislead. A GenServer is not a database. A Supervisor tree is not a transaction manager. These abstractions are powerful when used for what they’re designed for: managing processes, handling failures, coordinating work.

The graph system now uses PostgreSQL for storage, ETS for caching hot paths, and GenServer for coordination. Each tool doing what it’s good at. It took a year of wrong turns to arrive at the obvious architecture.
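
A simplified sketch of how the ETS and GenServer parts of that split can fit together, assuming a read-through cache. The module name is invented, and the loader function stands in for whatever PostgreSQL query fills the cache; this is not the production code.

```elixir
defmodule HotPathCache do
  use GenServer

  # Hypothetical read-through cache. `loader` is a one-argument function
  # standing in for a database query.
  @table :hot_path_cache

  def start_link(loader) when is_function(loader, 1) do
    GenServer.start_link(__MODULE__, loader, name: __MODULE__)
  end

  # Cache hits are served from ETS in the caller's own process.
  # Only misses go through the GenServer.
  def fetch(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> value
      [] -> GenServer.call(__MODULE__, {:load, key})
    end
  end

  @impl true
  def init(loader) do
    # :protected means any process can read, but only the owner writes.
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    {:ok, loader}
  end

  @impl true
  def handle_call({:load, key}, _from, loader) do
    # Re-check first: another caller may have filled this entry while
    # the current request was waiting in the mailbox.
    value =
      case :ets.lookup(@table, key) do
        [{^key, cached}] ->
          cached

        [] ->
          loaded = loader.(key)
          :ets.insert(@table, {key, loaded})
          loaded
      end

    {:reply, value, loader}
  end
end
```

The GenServer coordinates who gets to fill the cache, ETS serves the reads, and the database stays the source of truth. A real version would also need invalidation, which this sketch leaves out.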

That’s usually how it works.