Post • #elixir
Someone filed a bug. The kind where the error message is just the wrong thing happening and nobody knows why.
I went looking for the code path that was blowing up.
The entry point was four lines long. It called a private function with eight clause heads. That function called another private function with five more.
I traced the happy path through four different functions across three different modules. None of them longer than four lines.
There were other problems too:
- Some clauses matched on error states like
{:error, :database_timeout}that could never occur. - Some had guard clauses like
when is_binary(id)in a system where IDs are always integers.
By the time I found the actual logic I had mentally eliminated more dead code than live code. The bug was simple. Finding it was not.
People say LLMs write great Elixir.
I disagree. LLMs write organized Elixir. A lot of this is LLM-shaped… but people write this code too.
Some people believe doing as much pattern matching as possible is good taste; alas…
José Valim himself lists “complex extractions in clauses” and “unrelated multi-clause functions” as anti-patterns in the official Elixir docs. This isn’t a hot take. The language designer says it’s bad.
Matching, As Intended
Pattern matching is good. I’m not here to tell you to stop using it.
Multi-clause functions are the right tool when the clauses are the specification.
GenServer Callbacks
A GenServer’s callbacks are a good example. Each clause is a message this process accepts.
def handle_call(:get_thing, _from, state), do: {:reply, state.thing, state}
def handle_call({:update, attrs}, _from, state) do
thing = do_update(state.thing, attrs)
{:reply, thing, %{state | thing: thing}}
end
Two clauses. Two messages. The function tells you exactly what this process does.
They’re easy enough to skim, they’re easy enough to test, and they’re easy enough to reason about. The clauses are the specification.
Reducer Patterns
Reducers are another good use case.
Each clause handles one case you actually care about, inline, no jumping around.
Enum.reduce(items, %{complete: 0, pending: 0}, fn
%{status: :complete}, acc -> %{acc | complete: acc.complete + 1}
%{status: :pending}, acc -> %{acc | pending: acc.pending + 1}
_item, acc -> acc
end)
Three clauses. Two statuses that matter. One catch-all. Inline and readable.
As an aside; unless there’s truly a “reusable, first-class” piece of logic, I prefer keeping these reducers inline.
The navigation tax of jumping to a private function is not worth it for a three-line reducer only being used once or twice.
Polymorphic Dispatch
Polymorphic dispatch where each variant is a real concept in your domain is fine too. That’s the tool doing its job.
defmodule Shape do
def area(%Circle{radius: r}) do
:math.pi() * r * r
end
def area(%Rectangle{width: w, height: h}) do
w * h
end
def area(%Triangle{base: b, height: h}) do
0.5 * b * h
end
end
The common thread here is that each clause means something.
It’s not enumerating return values from a function above it. It’s defining behavior for a specific named case.
Protocols are cool too. I don’t actually have to
defimpltoo frequently… but they’re cool?
Generally Speaking
Pattern matching is good when:
- The clauses are the specification. They tell you what the function does, and the implementation might differ for each clause.
- The clauses are the domain. Each clause is a real concept in your system.
- The clauses are inline. You can read them without jumping around.
- The clauses are relevant. They only match on states that can actually occur.
The problem is that people flinch at standard control flow. I’ve had these thoughts in the past:
- An
iffeels un-idiomatic. - A
condsmells like an anti-pattern. - The Erlang way of doing things is to pattern match.
Pattern matching feels correct. So they reach for it instead. A private function with six clause heads where one
casewould have been clearer.
Private Function, Bro
Here’s the part where everyone goes wrong :)
Someone writes a case block: It’s six branches. Each branch handles a different outcome. The branches relate to each other. The logic is coherent.
Then someone reviews it, or credo complains about cyclomatic complexity or some crap like “This function is getting long. Extract the case into a private function.”
So they do…
Now the caller calls handle_result(result) and handle_result has six clause heads matching on different tuple shapes. The logic is in a different part of the module. Maybe a different module entirely.
Extraction didn’t simplify the code. It moved the complexity somewhere else and added a seam.
Navigation Tax
Every extraction forces the reader to jump.
You see handle_result(result) and you have to find the definition. Then you read six clause heads. Then you trace which ones are reachable from this particular call site.
Then you jump back.
An eye-tracking study found that even for simple tasks, participants spent more time navigating between the call site and the definition than they would have spent just reading inline.
Leaky Abstraction
Sandi Metz (of 99 Bottles of OOP fame) said it best: “duplication is far cheaper than the wrong abstraction.”
When you extract a case into a private function, you’re creating an abstraction.
The function has a name. It has a contract. Callers depend on it.
If a new requirement comes along and the abstraction doesn’t quite fit, you start adding parameters and conditionals. Then more requirements arrive. Now the abstraction is a mess and you can’t undo it without touching every caller.
If you find yourself adding parameters and conditionals to make an abstraction fit new cases, in my humble opinion: inline it back and start over.
The wrong abstraction is worse than no abstraction at all.
Dispatch or Destructure
The official docs make a specific distinction. Pattern matching in function heads should be for dispatch: which clause runs. Not for convenient destructuring.
If you’re matching %{status: :active, name: name, age: age} in a function head when only status determines which clause fires, you’re mixing dispatch with extraction. The reader can’t tell which parts of the pattern actually matter for control flow.
# bad: matches fields that don't affect dispatch
defp handle(%{status: :active, name: name, age: age}), do: ...
# better: only match what matters for dispatch
defp handle(%{status: :active} = thing) do
%{name: name, age: age} = thing
...
end
The first version looks cleaner. But the second version tells the truth. Only :status matters for picking this clause. The rest is just data access. Put it in the body.
Rambling about Heuristics
A lot of this comes from following rules that were never meant to be universal.
“This function is too long.” Says who?
A function should be as long as it needs to be to express one coherent operation. If that’s forty lines of a case statement where every branch relates to every other branch, splitting it up destroys the coherence.
You haven’t simplified anything.
“This is used in two places, so it should be extracted.” Clean Code says extract on second use. But Clean Code also gave us a generation of over-abstracted codebases where every two-line helper has three parameters and a conditional.
Duplication really is cheaper than the wrong abstraction.
You’re skimming function heads to understand state transitions and see:
defp handle_result({:ok, %{status: :active, name: name, age: age}})
The function signature is lying to you about what’s important.
Obviously, thinking “ah, the name and age matter for dispatch.” but they don’t. They’re just there because someone optimized for “lines of code” and decided a variable allocation or two was worse than pattern-matching, muddying the waters.
When you extract:
- The function signature becomes the only thing the reader sees at the call site.
- If that signature is full of “secondary” concerns (fields matched for convenience, types guarded against that never occur, error states that can’t happen) the reader builds the wrong mental model.
Please, only extract your “primary” concerns. Actual “concepts”.
Sometimes, your “primary” concern might be a meta-concern, like “I wanna do a database query!”…
(As they’re about to yap about next…)
Primary Concerns
Let me tell you about a codebase.
Setting the Scene
We had a Utils module that was a graveyard of private functions. Stuff like:
- A
deleted/2function that’d would handle checks for nil, timestamp, or boolean deleted flags. - A
between/4would support various date range filters across different fields. - Whatever else you could imagine. It was a dumping ground for “common” query helpers.
Each context module had its own set of query helpers too.
This looked nice a nice API from the outside. Functions named with_ and where_ would do things to a query. Perfectly pipe-able:
deleted_things
|> Invoice.with_line_items
|> LineItems.with_dispense_fees
|> LineItems.where_status(:paid)
|> Accounting.where_realized(...)
Under the hood, none of them worked the same way.
Some delegated to more private functions. Some didn’t. Some did joins. Some did subqueries. Some had guards on types you didn’t know about. Some were O(n) and some were O(n²) and there was no way to tell.
There were different with_client(...) functions in different modules that did completely different things under the same name. Some joined. Some subqueried. All of them were called with_client/3.
The consistent API surface wasn’t real.
It just happened that people liked the function name and it did vaguely the same thing.
Fixing It
We realized at some point, “huh, this is a bit of a mess” and wanted to “clean it up”; whatever that meant. (Who knew?)
First we thought standardize the naming and what each thing was allowed to do. “Everything called with_ means a join. Everything called where_ means a filter.” but that didn’t really help.
Then we tried colocating the helpers with the schema modules they operated on.
Invoice query helpers lived in Invoice. That was better, no more guessing between Accounting and Billing anymore. But that just added a lot of noise to the schema, and it was still hard to force consistency.
Eventually, though, we realized that this “query” thing was actually a primary concept.
It was a specific “thing” we were doing, and that we should start modelling it as such.
OTP provides something called a behaviour for this. We define an interface, provide some sane defaults we extracted from the existing code, and encapsulated it.
Suddenly, all our modules started looking like:
defmodule MyApp.Billing.Invoice do
use MyApp.Queryable
schema "invoices" do
field :status, :string
field :amount, :integer
...
end
# This is optional, by the way. We provide a default super/2 implementation that
# handles arbitrary field queries, in clauses, like, associations, etc.
#
# Schema's only have to extend the API for their own very specific needs, and that's
# visible at a glance.
@impl MyApp.Queryable
def query(base_query, filters \\ []) do
Enum.reduce(filters, base_query, fn
{:invoiced, status}, query when is_boolean(status) ->
from invoice in query, where: invoice.status == ^status
{:invoiced, _status}, query ->
query
filter, query ->
super(query, filter)
...
end)
end
end
iex> Repo.all(Invoice.query(invoiced: true, amount: {:gt, 1000}, items: [product_id: 123]))
...
The big difference here is “co-location”.
All schemas shared a same “language” for query composition. If schemas had specific query needs, they can extend the API to do whatever needed to be done, but that’d be co-located with the schema itself.
The returned queries can be run with any Ecto.Repo callback, and can be used for things like reporting, ETLs, as well as normal business logic.
Extraction feels like organization until suddenly you have three modules and fourteen private functions and nobody can trace what a single pipeline actually does.
This is a specific story about a specific codebase. But the arc is universal.
A Better Heuristic
The Queryable story isn’t really about queries. It’s about what happens when you stop extracting things just because they’re “getting long” and start asking what belongs where.
The heuristics I talked about earlier:
- “This function is too long, …”
- “This is used twice, extract it …”
- etc.
Those are reflexes. They’re not wrong in every case, but they’re not right in every case either. The Queryable story worked because we stopped treating the heuristic as the goal and started asking what the code was actually trying to do.
It’s not about “shorter” or “prettier”. It’s about “easier to trace”. It’s about “co-location”. It’s about “primary concerns”.
It still used pattern matching extensively. For dispatch. For specification. For domain.
If it were up to be, these are the heuristics I’d refer to instead.
Inline first
If a case block handles the outcomes of a specific operation, keep it with that operation.
Don’t extract it into a private function. The relationship between the branches is the important part, and proximity preserves that relationship.
def record_payment(changeset) do
result = do_record_payment(changeset)
case result do
{:ok, %{status: :active} = thing} ->
sync_thing(thing)
{:ok, thing}
{:ok, %{status: :pending} = thing} ->
notify_pending(thing)
{:ok, thing}
{:error, reason} ->
{:error, reason}
end
end
The dispatch is right there. You can see every outcome without leaving the function.
Is it longer than extracting to handle_result/1? Yes. Is it easier to debug? Also yes.
A slightly longer function you can read top to bottom beats three tiny functions scattered across the module. The navigation tax is real and you pay it every time someone has to debug.
Impossible clauses
Stop matching on states that can’t happen.
If upstream logic already checked that a record exists, don’t match on {:error, :not_found}.
If the system doesn’t use string IDs, don’t guard on when is_binary(id).
Two outcomes if that’s all it takes. That’s it. Bubble up the nil case using with, then handle only possible invariants after that point. Handling the nil case is not your job.
def update_thing(id, attrs) do
with %Thing{} = thing <- Repo.get(Thing, id) do
thing
|> Thing.changeset(attrs)
|> Repo.update()
|> case do
{:ok, thing} -> {:ok, thing}
{:error, changeset} -> {:error, changeset}
end
end
end
LLMs are egregiously bad at this.
They’re too defensive. They match every possible outcome even if the right thing to do in that case is let it crash.
“Elixir is the best language for LLMs to generate.” Sure bro. Sureeee…
Public over Private
There are still good reasons to extract.
- A GenServer callback module by convention.
- A shared validation used in three places (though try to keep it public, yeah?)
- A complex computation with a clear name that makes the caller more readable and can be tested. A real “concern”.
The bar should be high. Higher than you initially think.
Public functions have to justify their contract. Private functions should carry the same burden. If you can’t make the case for def, don’t write defp either.
The Problem with Pattern Matching
Pattern matching is one of the best things about Elixir. Use it for dispatch. Use it for specifications. Use it where the clauses mean something.
Put it down where they don’t.
▼
▶
Directory
1
- 01.
The Problem with Pattern Matching
I don't like pattern matching.