Target audience: Practitioners interested in programming language design
and familiar with representations of errors in at least a few different languages
such as error codes, checked/unchecked exceptions, tagged unions,
polymorphic variants etc.
Estimated reading time: 60 to 90 mins.
In research papers on programming languages,
there is often a focus on sophisticated
type system features to rule out certain classes of errors,
whereas error handling itself receives
relatively little attention, despite its importance. (This doesn’t seem too dissimilar to problems in the database community/literature, where less “cool” topics like better representation for strings receive relatively little attention compared to their importance. See also: What are important data systems problems, ignored by research? and DuckDB creator Hannes Mühleisen’s CIDR 2023 keynote on Developing Systems in Academia for related discussion.)
For example, in
Simple testing can prevent most critical failures [PDF] (readers in a rush can skim sections 1 and 2 and read the ‘Findings’ in bold in section 4), Yuan et al.
found that in the context of distributed data-intensive systems:
“Almost all catastrophic failures (92%) are the result of
incorrect handling of non-fatal errors explicitly signaled in software”
(Finding 10, p256)
Out of these “35% [..] are caused by trivial mistakes in error
handling logic — ones that simply violate best programming practices;
and that can be detected without system specific knowledge.”
(Finding 11, p256)
In contrast, there is some excellent long-form writing on error models out
there by practitioners.
Joe Duffy’s blog post on the error model in Midori:
This describes a variant of C# used to write a microkernel, drivers,
and a large amount of user code. Duffy describes a two-pronged error
model – “abandonment” (fail-fast error handling)
and a variation on checked exceptions (utilizing class hierarchies).
The design doc for Swift’s error model: This discusses the pros and cons of error models in several
different languages.
The TigerBeetle docs on “Tiger Style”:
This describes how the TigerBeetle database tries to ensure various
forms of safety. The discussion on assertions in the Safety section
is particularly worth reading.
Rust RFC 236 – Error conventions:
This describes conventions for how Rust libraries
should use different error handling mechanisms
in different situations.
Joe Armstrong’s talk The Do’s and Don’ts of Error Handling:
Armstrong covers the key requirements for handling and recovering
from errors in distributed systems, based on his
PhD thesis from 2003 (PDF). (By this point in time, Armstrong was about 52 years old, and had 10+ years of experience working on Erlang at Ericsson.)
Out of the above, Armstrong’s thesis is probably the most holistic,
but its grounding in Erlang means that it also does not take into
account one of the most widespread forms of static analysis we have today
– type systems. Here’s an excerpt from page 209:
We can impose a type system on our programming language,
or we can impose a contract checking mechanism between any
two components in a message passing system. Of these two methods I
prefer the use of a contract checker
In this post, I want to approach error models in a way that is both
holistic and geared towards ecosystems of systems PLs.
By holistic, I mean that I will approach error models from several different perspectives:
From a “product” perspective: What are the core assumptions,
what are all the requirements and how are they informed by practical use cases.
From a type systems perspective: How can the type system assist
the programmer to accomplish what they want.
From a language design perspective: How API design guidelines,
conventions and tooling can be combined with type system features
to enable people to write more robust applications.
By geared towards ecosystems of systems PLs (a term which inevitably seems to trigger people into rehashing the same “arguments” in comment sections – many people seem to like using it as a term for excluding other languages because they use reference counting or tracing GC as the default memory management strategy; I’m picking a definition here for the sake of this post, and if you dislike it, you can either mentally replace all usages of “systems PL” with “language with XYZ memory management strategy” or stop reading the post), I mean that the error
model must take into account the following needs:
Graceful degradation in the presence of errors,
because the underlying platform and/or acceptance criteria
may offer limited recovery capabilities.
Optimizability – either by the compiler, the programmer or both
– as performance bottlenecks arise within the system
rather than from external sources.
This may involve trading off flexibility in error handling
for performance.
Facilitation of interoperability between and co-evolution of
libraries and applications.
Some examples of such software include databases,
high-performance web servers, and interpreters.
I will be ignoring use cases
such as small-scale scripting for one-off tasks,
interactive data exploration etc.
I will also be ignoring ABI considerations, because
those can largely be resolved
by varying levels of indirection
in the run-time representation (e.g. see Swift).
The structure of this blog post is as follows:
Section 1 discusses definitions for important terms such as ‘error’,
‘error model’ etc.
Section 2 goes over the primary theses about errors and how people reason
about them, along with examples.
Section 3 covers the key requirements for an error model,
and how these are justified based on the theses in Section 2.
Section 4 describes a hypothetical programming language Everr
and its error model. (The reason for introducing a fake language
for the sake of discussion, instead of, say, proposing extensions
to an existing language, is that it offers a clean slate for combining
ideas across different existing languages without having to
worry about things like compatibility with existing
language features and/or idioms.)
Section 5 describes the error models of various existing
and upcoming programming languages.
Section 6 compares Everr with other languages based on the
requirements laid out in Section 3.
Section 7 concludes with some questions for you, an unorthodox choice
to introduce two new terms, a potpourri of metaphors, and some fluffy exhortations.
For the die-hard readers, there is also an Appendix with (EDIT: Mar 9, 2025) 8 sections.
As a result, this post is fairly large, clocking in at 15K+ words
at the time of writing. You have been warned.
Still here? Let’s get started.
Section 1: Key definitions
For the sake of this post, I’m going to use the following definitions:
Error: I’ll use this term in two ways:
A program state that is undesirable or non-ideal in some way.
A value reifying an undesirable program state into something
that can be operated upon by the language’s constructs.
For example, errors may be represented as error codes (in C),
exceptions (in Java), error values (in Go), a Result::Err case
(in Rust) and so on. This value may or may not have some
metadata attached, such as a stack trace or some indicator
of progress made before hitting the error.
Error propagation: The act of passing an error received from
a function that was called to one’s own caller. During error propagation
one might want to attach more metadata, release some resources etc.
Error handling: The act of inspecting an error value
and/or its metadata, and making some decision based on that.
For example, one might decide to log the error, propagate it,
discard it, or convert it to a different form.
Error model: The overall system for representing, creating,
propagating, and handling errors in a programming language,
including best practices and common idioms.
In common parlance, this is just called “error handling”,
but it’s always felt a bit weird to me that seemingly very
different actions like declaring, propagating and inspecting
errors would get lumped together under “handling”. So I’m
borrowing this term from Duffy’s blog post instead.
Exhaustive: An error (type) is said to be exhaustive when all
the possible data for that error is known up-front.
For example, conversion from an Int64 value to a UInt32
value will only fail in a couple of ways: either the
value is negative, or it exceeds 2^32 – 1.
Thus, such a conversion operation supports recording
an exhaustive error in case of failure.
In contrast, error types from external systems,
such as third-party APIs, are typically non-exhaustive.
Exhaustiveness has two axes: fields and cases. (I’m deliberately avoiding a discussion on records vs variants (or structs vs enums/tagged unions, or classes vs case classes) here; more on that in later sections.)
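To make this concrete, here’s a minimal Rust sketch of that conversion with an exhaustive error type – both the cases and the fields are fully known up front, so no catch-all handling is ever needed:

#[derive(Debug, PartialEq)]
enum U32FromI64Error {
    Negative,
    TooLarge, // value exceeds u32::MAX (2^32 - 1)
}

fn u32_from_i64(x: i64) -> Result<u32, U32FromI64Error> {
    if x < 0 {
        Err(U32FromI64Error::Negative)
    } else {
        u32::try_from(x).map_err(|_| U32FromI64Error::TooLarge)
    }
}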
Hopefully, none of the above are too controversial.
Let’s keep moving.
Section 2: Key theses
The rest of this post is going to be based on a few key theses:
Whether a program state is an error or not is contextual.
Whether an error can be recovered from or not is sometimes contextual.
Errors need to be able to carry metadata.
Robust error handling sometimes requires detailed understanding of possible error cases.
Errors and metadata in unstructured form are difficult to handle reliably.
Programs typically need to handle both exhaustive and non-exhaustive errors.
Programmers mostly operate with incomplete knowledge about errors.
Let’s dig into each of these one by one.
Thesis 1: Whether a program state is an error or not is contextual
Let’s consider the classic example of opening a file.
You have some code trying to open a file and that file is not found.
Is that an error? Well, it depends!
For example, say you are writing a CLI application
that is doing a recursive lookup for a configuration file in ancestor
directories from the working directory.
Instead of first checking if the file exists
(e.g. using stat(2) on Linux)
and then opening the file,
you write your code to open the file directly. (You do this because the existence check + open strategy can still fail at the open stage with a ‘File not found’ error if some other process deleted the file in the middle of your operations.)
However, in this case, the file opening operation failing with
‘File not found’ is not undesirable or incorrect in any way,
i.e., it is not an error; it is part of normal operation.
Now consider the situation where the path to the
configuration file is obtained via a command-line argument.
In this case, if you get a ‘File not found’ error when opening the file,
it’s likely that the user made a mistake when providing
the command-line argument, so it would make sense to surface
the ‘File not found’ to the user.
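Here’s a minimal Rust sketch of the first scenario (the .myconfig file name is made up for illustration):

use std::{fs, io, path::Path};

// Recursive config lookup: a missing file is *not* an error here;
// it just means "keep walking up the directory tree".
fn config_in(dir: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(dir.join(".myconfig")) {
        Ok(contents) => Ok(Some(contents)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None), // normal operation
        Err(e) => Err(e), // other I/O failures are still errors
    }
}

In the second scenario, the exact same io::ErrorKind::NotFound value would instead be surfaced to the user as an error.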
Thesis 2: Whether an error can be recovered from or not is sometimes contextual
Some common examples of programming bugs that are often considered
as non-recoverable in the context of application languages
are out-of-bounds array accesses, unwrapping optionals
(or nil/null dereference) and out-of-memory – these are all
listed in the Swift docs and Joe Duffy’s blog post as examples.
Rust RFC 236 categorizes errors into three distinct types:
catastrophic errors, contract violations and obstructions.
Out of these, out-of-memory is considered as a catastrophic error,
whereas index out of bounds is considered a contract violation. (Yes, I understand that this is a conventions RFC and that individual libraries may deviate from the conventions if they have a different set of needs. However, conventions strongly affect the design of language features and standard library APIs, so I think it’s worth discussing this here as well.)
For both of these, the RFC states that
“The basic principle of the conventions is that:
Catastrophic errors and programming errors (bugs) can and should only
be recovered at a coarse grain, i.e. a task boundary.”
I think it’s important to recognize that even for these errors,
the recoverability is contextual.
For example, in video game code, if there is an off-by-one error
in some rare cases in collision detection or lighting,
it’s possible that the game still works mostly fine,
and that’s good enough.
For out-of-memory in a web server, you may want to limit
the amount of memory a single task can consume,
to prevent the whole server from being terminated
when the server process runs out of memory.
Even if you don’t have per-request limits,
it’s possible that you breach the process-wide memory limit
outside of the context of a particular
task handling some request or background work.
If the server has the opportunity to recover from this,
it might be OK to shed load by
terminating the tasks for a few requests
(e.g. if they don’t use global state,
or only use it in a limited way that allows cleanup)
instead of terminating the entire process.
To be clear, I’m not saying that such examples represent
the majority of cases.
However, my point is that even for cases which seem
relatively cut-and-dry, there are situations where
the classification of an error into recoverable/non-recoverable
is not clear cut.
Thesis 3: Errors need to be able to carry metadata
Once code grows beyond a certain scale, understanding errors requires
collecting metadata about where/what/when/how/why. For example,
in a web server, you might care about tracking the following:
A stack trace for the place where the error was originally created.
Severity of the error.
For data validation errors such as in validating JSON,
some ‘path’ within the larger structure
(at the time of the error),
as well as the value of the unexpected portion.
Retryability of operations, such as DB transactions or
RPC calls in a framework like gRPC.
Additionally, one needs to be able to have some logic
that can make use of this metadata (e.g. as methods on an error type,
if the language supports methods in some form),
such as something for generating key-value pairs for observability,
computing equality/hashes for checking sameness etc.
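Here’s a rough Rust sketch of such an error type; all of the names and fields are illustrative:

use std::backtrace::Backtrace;

#[derive(Debug)]
struct ValidationError {
    // 'path' within the larger structure, e.g. ["users", "3", "email"]
    path: Vec<String>,
    unexpected: String, // the value of the unexpected portion
    backtrace: Backtrace,
}

impl ValidationError {
    fn new(path: Vec<String>, unexpected: String) -> Self {
        ValidationError {
            path,
            unexpected,
            // Records where the error was originally created.
            backtrace: Backtrace::capture(),
        }
    }

    // Key-value pairs for structured logging / observability.
    fn log_fields(&self) -> Vec<(&'static str, String)> {
        vec![
            ("path", self.path.join(".")),
            ("unexpected", self.unexpected.clone()),
        ]
    }
}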
Corollary: Error codes as the primary language-supported way
of error handling are inadequate for many use cases
(see also: Zig issue #2647 – Allow returning a value with an error).
Thesis 4: Robust error handling sometimes requires detailed understanding of possible error cases
In Designing Data-Intensive Applications, Martin Kleppmann gives an excellent example of database transactions and retryability:
popular object-relational-mapping (ORM) frameworks [..] don’t retry aborted transactions [..]. This is a shame, because the whole point of aborts is to enable safe retries.
Although retrying an aborted transaction is a simple and effective error handling mechanism, it isn’t perfect:
If the transaction actually succeeded, but the network failed while the server tried to acknowledge the successful commit to the client [..], then retrying the transaction [is unsound without extra app-level de-duplication]
If the error is due to overload, retrying the transaction will make the problem worse [..]
It is only worth retrying after transient errors ([..] deadlock, isolation violation, temporary network interruptions, and failover); after a permanent error (e.g. constraint violation) a retry would be pointless
For a library or framework which provides APIs for interacting with a SQL database,
it is necessary to be able to distinguish the various cases of network errors and database errors
if it wants to support automatic retries for aborted transactions.
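Here is a sketch of what that distinction buys you, in Rust; TxError and its case split are hypothetical:

#[derive(Debug)]
enum TxError { Deadlock, IsolationViolation, NetworkTimeout, ConstraintViolation }

// The retry decision needs per-case knowledge, not a blanket policy.
fn is_transient(e: &TxError) -> bool {
    matches!(
        e,
        TxError::Deadlock | TxError::IsolationViolation | TxError::NetworkTimeout
    )
}

fn run_with_retries<T>(
    mut tx: impl FnMut() -> Result<T, TxError>,
    max_attempts: u32,
) -> Result<T, TxError> {
    let mut attempts = 0;
    loop {
        match tx() {
            Ok(v) => return Ok(v),
            Err(e) if is_transient(&e) && attempts + 1 < max_attempts => attempts += 1,
            Err(e) => return Err(e), // permanent error: retrying is pointless
        }
    }
}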
Of course, not all error handling needs this level of rigor in analyzing the various cases.
Depending on the context,
it might be OK to just log an error and keep going
if one is confident that
it won’t negatively impact the overall system.
Corollary: An API specification language should probably
discourage hiding error information from return values
(see also: GraphQL’s default machinery for error handling).
Thesis 5: Errors and metadata in unstructured form are difficult to handle reliably
I imagine this point is probably the least controversial.
Stuffing error case information and metadata into strings
makes an API harder to use for a client which cares about
error handling.
If you expose an API where the only way a conscientious user
can extract more data for an error is by parsing a string,
they’re going to write that error parser,
and all the options when you want to
change that error message are going to suck.
(see also: Hyrum’s Law).
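To make the contrast concrete, here’s a small Rust sketch of both styles; the message format and ParseError type are made up:

use std::fmt;

// Unstructured: the only way a caller can recover the line number is to
// parse the message, and that parser ossifies the exact wording.
fn parse_v1() -> Result<(), String> {
    Err(format!("parse failed at line {}", 42))
}

// Structured: keep the data as a field and derive the message from it.
#[derive(Debug)]
struct ParseError {
    line: u32,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse failed at line {}", self.line)
    }
}

fn parse_v2() -> Result<(), ParseError> {
    Err(ParseError { line: 42 })
}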
Corollary: A standard library should probably not encourage
people to easily attach arbitrary data to errors in a way that
cannot be recovered later (see also: Go’s fmt.Errorf
function, which implicitly encourages users to create unstructured errors).
Thesis 6: Programs typically need to handle both exhaustive and non-exhaustive errors
When a part of a program does not interact with any external systems,
but behaves like a pure function with a
well-understood set of possible behaviors,
using exhaustive error types allows one to model this certainty.
On the other hand, when interacting with external systems which
may change in the future, such as third-party APIs across a network,
OS APIs, databases etc., the program needs a way to model
and handle errors where all the cases and fields are not known up front.
In such situations, ideally, the addition of new error cases and/or fields
should not break existing third-party clients at a source level,
and lead to some reasonable dynamic behavior.
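Rust’s #[non_exhaustive] attribute is one existing mechanism for the latter half of this; a minimal sketch (ApiError and its cases are made up):

#[non_exhaustive]
#[derive(Debug)]
pub enum ApiError {
    NotFound,
    Unauthorized,
}

#[allow(unreachable_patterns)] // the catch-all is redundant only inside the defining crate
pub fn describe(e: &ApiError) -> &'static str {
    match e {
        ApiError::NotFound => "not found",
        ApiError::Unauthorized => "unauthorized",
        // Mandatory in downstream crates: future cases (e.g. a RateLimited
        // case added later) land here without breaking source compatibility.
        _ => "unknown API error",
    }
}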
Thesis 7: Programmers mostly operate with incomplete knowledge about errors
Outside of safety critical contexts,
I think it’s fair to say that for most production systems,
most people working on them (myself included) have a fairly limited
picture of all the different things that could go wrong.
I suspect this is probably true even if one limits the scope
to just the situations where the underlying APIs are written
by oneself and just involve pure (but complex) computation,
as well as situations where all the inputs and system parameters
are within “acceptable” ranges.
To be clear, I mean this as an observation and not as a value judgement
– I think there are various contributing reasons for this;
competing priorities, high system complexity, poor docs,
minimal language/tooling support and perhaps even optimism. (For more discussion on optimism specifically, see Appendix A7.)
Discussing this more would probably take a full blog post or more by itself,
so I’m going to stop there.
Section 3: Key criteria for an error model
Based on the above theses and my personal experience, I believe
that an error model should be judged based on how well it
satisfies the following key criteria. (If these sound a bit too abstract, I’ll be discussing them in more detail shortly.)
Error declarations must support:
Exhaustiveness annotation: The ability to mark a declaration
as being exhaustive (or not). This must support both axes: fields and cases.
Case extension: The ability to extend errors with new cases
(at the declaration site itself) in a backward-compatible way,
if the error was declared to be non-exhaustive in terms of cases.
Field extension: Analogous to the above but for fields.
Case refinement: The ability to refine previously defined cases
into more fine-grained cases over time
(at the declaration site itself) in a backward-compatible way.
Error propagation must support:
Explicit marking: The ability to force code to be written
in a way such that: (The choice of the word “ability” is intentional. Using an explicit marking discipline may or may not be the default, and may or may not be conventional, but following it must be possible.)
Possible errors from primitive operations are indicated with explicit marks.
Propagating errors from an invoked operation to one’s own caller requires
an explicit mark.
The absence of explicit marks must cause a localized static error.
Structured metadata attachment: The ability to attach structured metadata
to an error. Attaching metadata must preserve the fact that the error
is still an error.
Error combination: The ability to combine errors into a larger error.
Erasure: The ability to abstract fine-grained errors into more
coarse-grained errors.
Error handling must support:
Exhaustiveness checking: For errors declared to be exhaustive,
the ability to match exhaustively against all cases.
(This needs to correctly account for access control rules. For example, in Rust, the #[non_exhaustive] attribute has no effect within the same crate.)
Non-exhaustive matches for exhaustive errors must be diagnosed statically.
Additionally, for non-exhaustive errors, attempting an exhaustive match
against the known cases must be diagnosed statically.
Structured metadata extraction: The ability to extract structured metadata
attached to an error (the dual of Structured metadata attachment).
Error projection: The ability to inspect individual sub-errors out of a combined error (the dual of Error combination).
Unerasure: The ability to unerase fine-grained information out of coarse-grained errors (the dual of Erasure).
Criteria for error conventions:
Error category definitions: Different categories of errors
must be documented, so that the ecosystem can rely on centralized definitions.
These must be accompanied by examples of
when an error should be put in a certain category,
and when it may be considered as part of another category.
Guidelines on error definitions: These should cover what ought to be
documented, which annotations should be considered and/or avoided,
as well as considerations for downstream users of libraries.
Guidelines on error propagation: These should cover when it is appropriate
to return errors that are erased vs unerased.
Guidelines on error handling: These should cover when it is and
is not appropriate to handle errors within and across library boundaries.
Guidelines should generally be accompanied by rationale, as well as curated lists
of potential reasons to deviate from the guidelines.
Criteria for tooling:
Lint support: Certain classes of errors are likely best avoided through
lints/style checkers, rather than type system features.
(Important) A lint that prevents error values from being discarded using
standard shorthands (e.g. _ = ), without an explicit annotation,
such as a comment or a call to an earmarked function
(to allow for ‘Find references’) etc.
(Nice-to-have) If exhaustiveness of cases/fields is the default at the
language level, then a lint to require manual exhaustiveness annotations
on every type.
(Nice-to-have) A lint for enforcing that fields and cases of error types
must be documented.
Editing support:
(Important) A structured flow for adding new error cases and fields.
The editing environment should identify locations which potentially
need changes – these will generally be locations which materialize
or handle errors of the same type.
For each of these criteria, the following sub-sections
describe why they are useful.
Criteria for error declarations
Exhaustiveness annotation is necessary because of Thesis 6 –
typical programs need to work with both exhaustive and non-exhaustive errors.
This should probably take the form of an annotation or similar, rather than two
entirely different type system features (e.g. subclasses for the non-exhaustive case
vs standard sum types for the exhaustive case), because:
It is possible for a non-exhaustive error to become exhaustive (if the underlying API stops evolving)
The small “conceptual diff” would probably be best reflected as a small syntax diff in code, rather than an entirely different way of organizing the program.
Once one accepts that non-exhaustive errors are permitted,
for such errors to be usable across project boundaries,
it naturally follows that the language must support
adding new cases and fields to a non-exhaustive error type
without breaking source-level backward compatibility.
Lastly, when an API developer gains more knowledge about a previously
broad error case (recall Thesis 7 – programmers mostly operate with incomplete knowledge about errors), they need a way of communicating that to new clients without breaking
existing clients. This requires some way of expressing “this one case is now actually N other cases, where N >= 2”,
which is exactly how I described case refinement.
Criteria for error propagation
The ability to force code to be written using explicit marks for error
handling is valuable because it enables modular reasoning about control
flow and error handling, one function at a time.
Even if a programmer is able to magically keep track of which functions
can silently return which errors in their head,
they may stop working on the codebase,
and the next programmer working on the code
will need assistance in gradually assembling
a new theory of the codebase
they’ve inherited.
The ability to attach structured metadata is valuable because not all
errors can be described by simple primitive values,
and often, it is necessary to have contextual information
in order to debug why an error occurred
(e.g. where in the JSON file is there a missing ‘,’ again, goddammit).
The ability to combine errors is valuable in the same way
that collections like arrays, sets and maps are valuable.
The ability to erase errors is valuable since not all code cares about
the details of an error. For example, code inside a supervisor task/process
might potentially only care about how to log an error.
Criteria for error handling
Exhaustiveness checking is valuable because it provides clarity
that all cases have been handled. However, in some cases,
it is impossible to have that clarity, since:
One might not have sufficient time to understand and analyze
all the cases.
If the error comes from a dependency:
One might be unable to change the code in the dependency
One might be unable to replace the dependency
One might not want to (or potentially, cannot)
lock to a certain version of the dependency.
This means that non-exhaustive errors must also get proper
treatment during exhaustiveness checking.
Features like structured metadata extraction,
error projection, and unerasure are needed because
their corresponding duals only make sense when used
in concert with them.
Now that we’ve covered a fair bit of ground related to error models themselves,
let’s switch gears and talk about what a language design
grounded in these observations could look like.
Section 4: An immodest proposal
I’m going to describe an error model in terms of a hypothetical
systems language Everr (“Evolving errors”)
and its ecosystem.
I’ll demonstrate Everr using examples. (Normally, I’d hope that this is understood, but since this is the internet, I’m going to spell this out explicitly: the concrete syntax being used here is not the point. This is a language I literally just made up to illustrate some concepts.)
After that, we’ll see how well Everr’s error model compares
to existing mainstream languages with respect to the criteria
outlined in the previous section.
Here’s a rough summary of Everr’s core language constructs:
Everr supports semantic attributes on declarations, similar to C++, Rust, Swift etc.
Everr supports structs (a.k.a. records / product types) and enums (a.k.a. variants / sum types).
These can be exhaustive or non-exhaustive.
Enums are second-class, like Cap’n Proto’s (tagged) union types.
However, they support dedicated sugar making them feel essentially first-class.
Everr supports pattern matching using a minor variation of
Cheng and Parreaux’s “Ultimate Conditional Syntax”.
Everr has simple namespaces for grouping things,
similar to namespaces in C++ and modules in Rust.
Types themselves are not namespaces. (This is largely for simplifying the description here. If types were allowed to function as namespaces, or alternately, if namespaces were allowed to have type parameters, that would necessitate different syntax for referring to “outer” generic parameters vs introducing fresh generic parameters, and we’d end up going into the weeds.)
Everr supports union types, but only with a mandatory upper bound. (When I say “union types”, I mean it in the type-theoretic sense, not in the sense of untagged unions as in C, C++ etc.)
For brevity, I’ll refer to these as “bounded union types”.
Bounded union types come in two flavors – exhaustive and non-exhaustive.
Everr supports a Rust-like trait mechanism as well as a trait deriving mechanism. (The exact mechanism powering trait derivations – whether they be hard-coded in the compiler, implemented as macros, compiler plugins etc. – is not terribly important for this post, so I will ignore it.)
Everr supports a delegation mechanism for field accesses
and method calls from one type to another.
First, let’s do a tour of these core language features.
After that, I’ll describe Everr’s error model.
A tour of Everr
Everr has an Optional type for representing optional values:
@exhaustive
enum Optional[A] {
| None {}
| Some { value: A }
}
This is syntax sugar for the following code:
namespace Optional {
// ↓ None will never have new fields
@exhaustive(fields)
struct None {}
// ↓ Some will never have new fields
@exhaustive(fields)
struct Some[A] { value: A }
}
// ↓ Optional will never have new fields
@exhaustive(fields)
struct Optional[A] {
// ↓ Actual enum syntax, without the sugar
case: @exhaustive(cases) enum {
| type Optional.None
// ↑ refers to struct None
| type Optional.Some[A]
// ↑ refers to struct Some[A]
}
}
So Some and None represent first-class types, not just cases of Optional. (For comparisons with Scala’s case classes and Rust’s proposed enum variant types, see Appendix A1. This kind of design also does not preclude niche optimizations; see Appendix A2.)
Optional values support pattern matching:
fn demo0(x: Optional[Str]) -> Str {
if x.case is {
Optional.None {} -> return "Got None"
// _ represents a value being discarded
Optional.Some { value: _ } -> return "Got Some"
} // Compiler checks exhaustiveness for:
// 1. Cases of Optional
// 2. Fields of None
// 3. Fields of Some
}
Even though pattern matching looks like it happens against types directly,
internally it works like tagged union types in other languages,
so exhaustiveness checking is supported,
unlike pattern matching with open class hierarchies.
To reduce verbosity, Everr has some syntax sugar for pattern matching.
Everr supports implicit projection of the field
named case to provide a more familiar syntax. (This is the only special case (ahem) where Everr inserts a field projection operation implicitly.)
To allow the user to omit the type name in common cases,
Everr supports looking up namespaces
with the same name as the type, so one can use “leading dot syntax”,
similar to Swift etc.
Rewriting the code using the above sugar:
fn demo1(x: Optional[Str]) -> Str {
if x is {
.None {} -> return "Got None"
.Some { value: _ } -> return "Got Some"
}
}
Neat! 😃 Things get more interesting
when Everr’s non-exhaustive enums come into the mix.
@non-exhaustive
enum Dessert {
| Cake { icing : Str }
| IceCream {}
}
The above type declaration desugars to:
namespace Dessert {
@non-exhaustive(fields)
struct Cake { icing : Str }
@non-exhaustive(fields)
struct IceCream {}
}
@non-exhaustive(fields)
struct Dessert {
case: @non-exhaustive(cases) enum {
| type Dessert.Cake
| type Dessert.IceCream
}
}
Here’s how one might write some string conversion functions
for these types in a different project. (Everr’s rules around mandatory handling of future cases and fields are similar to the behavior of Rust’s #[non_exhaustive] – they only apply across access control boundaries.)
fn cake_to_str(cake: Dessert.Cake) -> Str {
if cake is Dessert.Cake { icing: i, !__ } {
i.append(" cake");
return i
}
}
There are two sigils here: ! and __.
__ means “ignore any label-value pairs, if present”.
Since Cake has a @non-exhaustive(fields) annotation,
the __ is mandatory, similar to mandatory catch-all
clauses for cases of an enum.
! means “if the next identifier matches one or more known label-value pairs,
or one or more cases, please issue a warning”. (This is a generalized version of @unknown in Swift; see the docs for @unknown.)
Since Cake has a @non-exhaustive(fields) annotation, if a new field is
added to it in the future, then a warning will be issued inside
cake_to_str.
This design allows the author of the type to add new
fields without breaking source-level backward compatibility,
while still allowing the user of the type to opt-in to a “notification”
when the type definition changes.
Using the above function, one can write a function to convert
a Dessert to a string.
fn dessert_to_str(d: Dessert) -> Optional[Str] {
if d is Dessert { case: dcase, !__ }
and dcase is { // (1)
.Cake c ->
return .Some { value: cake_to_str(c) }
.IceCream { __ } ->
return .Some{value: "ice cream"} // (2)
!_ -> return .None{} // (3)
}
}
I know there’s a lot going on in the above code example,
so let’s take it step-by-step.
if d is Dessert { case: dcase, !__ }
and dcase is { // (1)
There are a few things going on here:
dcase is a new binding for the case field of d.
It is immediately matched on again after and,
using the aforementioned Ultimate Conditional Syntax.
The __ matches the rest of the fields not covered by the pattern,
and their values are ignored.
Omitting __ would trigger a compiler error,
since Dessert is a non-exhaustive struct.
Alternately, if one didn’t care about a new field being
added to Dessert, then one could directly use if d.case is.
Due to the !, the compiler will issue a warning
if the __ matches one or more fields,
similar to the logic in cake_to_str.
Let’s look at the branch (2):
.IceCream { __ } ->
return .Some{value: "ice cream"} // (2)
Here again, __ means that future fields are being ignored.
The absence of ! means that this code will not issue any
warnings if the IceCream type has new fields in the future.
Lastly, since d.case has a non-exhaustive enum type,
a catch-all pattern is mandatory:
!_ -> return .None{} // (3)
_ means “ignore a single value” similar to other languages.
Due to the !, this line will trigger a compiler warning
if one or more new cases are added to Dessert.
Whew, that was a lot! Here’s an XKCD for a short break.
Let’s continue. Suppose we have two people Alice and Bob
and we want to write two functions to describe which desserts
they like, in the same project as Dessert. (If it were in a different project, the initialization syntax would not be available for non-exhaustive types.)
Here are their preferences:
Alice likes cake with ganache on top, and also ice cream, but Alice is not open
to trying new desserts.
Bob likes cake with buttercream on top, but not ice cream. However,
Bob is open to trying new desserts if there are more options in the future.
These preferences can be modeled using Everr’s bounded union types.
First, let’s model Alice’s preference.
pub fn alice_likes() -> Array[Dessert:union[.Cake | .IceCream]] {
return Array.new(
.Cake{icing: "ganache"},
.IceCream{},
)
}
Here, Dessert:union[.Cake | .IceCream] represents an exhaustive (bounded) union type. It implies that values of types Cake and IceCream are allowed, and in the future, even if Dessert gains new cases, those will not be returned. Similar to exhaustive enums, if a caller tries to pattern match on the union value, they can match exhaustively, without needing a catch-all pattern.
Now let’s model Bob’s preference.
pub fn bob_likes() -> Array[Dessert:union+[.Cake]] {
return Array.new(.Cake{icing: "buttercream"})
}
Here Dessert:union+[.Cake] represents a non-exhaustive (bounded) union type. It implies that only values of type Cake are allowed, but in the future, other cases of the top type – in this case, Dessert – may also appear. Similar to non-exhaustive enums, if a caller tries to pattern match on the union value, they must account for new cases being added (including IceCream if Bob changes his mind).
The exhaustiveness of a union type only applies to its cases. When pattern matching, one still needs to be explicit about handling unknown fields (using __ or !__) since both Cake and IceCream have @non-exhaustive(fields).
Since enum cases have first-class struct types, different enums can share cases.
enum BakeryItem {
| Bread {}
| type Dessert.Cake // OK
}
This allows reuse of case types without needing to copy the case definition.
Everr supports traits and trait derivations.
Everr has a built-in UpCast trait meant to represent O(1) conversions
from enum cases to a type containing the enum.
trait UpCast[From, To] {
fn up_cast(_: From) -> To
}
This trait is automatically implemented for enum cases.
So you’d have:
impl UpCast[Dessert.Cake, Dessert] { … }
impl UpCast[Dessert.IceCream, Dessert] { … }
impl UpCast[BakeryItem.Bread, BakeryItem] { … }
impl UpCast[Dessert.Cake, BakeryItem] { … }
Similar to interfaces in other languages, these can also be implemented by hand.
Lastly, Everr supports a delegation mechanism across types.
For example, if you have code like:
struct BakeryProduct {
@delegate
base: BakeryItem
ingredients: Array[Ingredient]
price: Money
}
At usages of the . operator for field access and method calls
on a BakeryProduct value, the compiler first checks if
BakeryProduct has the field or method, and if not,
checks if BakeryItem has the field or method.
At most one field can have a @delegate annotation to maintain
predictability in the face of field re-ordering without needing
ad-hoc tie-breaking rules.
Because cases are represented through a special .case field,
pattern matching directly on a BakeryProduct value will work,
because BakeryItem has a .case field which allows pattern-matching.
Okay, that concludes the tour of Everr’s core language features!
Now let’s talk about Everr’s error model.
Everr’s error model – Overview
Everr’s error model is based on the core observation that the
handling and recovery from different errors is generally
context-dependent, so it provides flexibility and broad mechanisms
for different error handling strategies in different contexts.
There are two modes for error propagation and handling.
Fail-slow error handling: This is done using the standard Result type,
which is an exhaustive enum equivalent to that in Rust and Swift.
Fail-fast error handling: This comes in two flavors:
Recoverable: This is done using panic and catch panic
primitives, similar to panic/recover in Go and
panic!/catch_unwind in Rust. Panicking and panic catching themselves
use memory pre-allocated at program startup to avoid out-of-memory
in the middle of a panic,
while accepting the risk of potentially needing to discard
some relevant data.
Non-recoverable, i.e. program termination. This is done using an
exit primitive.
OS signals are handled using callbacks and potentially
manifested using one of the above depending on the exact signal
and configuration.
Notably, there is no support for asynchronous termination of non-cooperative tasks
via asynchronous exceptions, such as that in Haskell, OCaml or Erlang.
For primitive operations, the defaults are as follows:
Numeric operations on bounded integer types panic on failure,
prioritizing safety over performance.
Assertion failures trigger a panic.
Out-of-memory for the heap aborts the program.
Stack overflow aborts the program.
However, these can be customized; more details on each of them soon.
One important aspect of how Everr code negotiates what is and is not allowed
is the designation of certain core language features as capabilities.
Examples of capabilities include heap allocation, panicking,
program termination and foreign function interface (FFI) usage.
Capabilities manifest at the boundaries of packages,
which are Everr’s units of distribution.
Each package has a manifest file, which supports specifying:
The capabilities used by the given package, in addition to the defaults.
Each capability has three levels (other than Unavailable):
Implicit: All code is granted access to the capability, without needing
any annotation.
Explicit: Code which uses the capability must have an explicit annotation.
Binding: Code which uses the capability must have an explicit annotation,
and this annotation is considered part of the API contract;
weakening the annotation is considered a breaking change.
One can optionally configure the standard linter to enforce that
functions which do not use the capability also have an explicit annotation,
indicating either that the function is guaranteed to not use the capability
in the future, or that it reserves the right to use the capability later on.
The capabilities permitted for different dependencies.
If that sounds a bit too abstract, don’t worry,
I’ll explain the capability system later with examples.
First, let’s discuss fail-slow and fail-fast error handling.
Fail-slow error handling
For very basic operations where only a single failure is possible,
(e.g. a map lookup), the failure is exposed using the standard Optional type.
In most other cases, either the Result or Hurdle types are used. (Result and Hurdle are recognized specially by the standard Everr linter, which issues warnings if _ is used to ignore the Fail case or the problem field.)
// Result represents computations where errors block progress.
//
// Most commonly used for short-circuiting logic.
@exhaustive
enum Result[A, E] {
| Pass { value: A }
| Fail { error: E }
}
// Hurdle represents computations where problems do not
// block progress, but they still need to be bubbled up,
// such as in the form of warnings.
@exhaustive
struct Hurdle[A, E] {
data: A
problem: E
}
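Rust has no direct Hurdle equivalent in its standard library, but the shape is easy to sketch; parse_ints and its warning strings are hypothetical:

struct Hurdle<A, E> {
    data: A,
    problem: E,
}

// Lines that fail to parse become warnings rather than blocking progress.
fn parse_ints(src: &str) -> Hurdle<Vec<i64>, Vec<String>> {
    let mut data = Vec::new();
    let mut problem = Vec::new();
    for line in src.lines() {
        match line.trim().parse::<i64>() {
            Ok(n) => data.push(n),
            Err(e) => problem.push(format!("skipping {line:?}: {e}")),
        }
    }
    Hurdle { data, problem }
}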
For the E type parameter, Everr programmers are recommended
to use domain-specific struct and enum types which define errors.
Since both enums and structs support exhaustiveness annotations,
it is easy to mark error types for future evolution.
Since enums are desugared to structs:
It is possible to add a field common to all cases of an enum,
without breaking source-level backward compatibility.
It is possible to refine a coarse case into multiple sub-cases
by adding a case: enum { … } field to the corresponding case.
For example, say one has a float parsing function which returns
an error with the following type:
@exhaustive(cases)
enum FloatParseError {
| UnexpectedChar { byte_offset: UInt64 }
| NotRepresentable {}
}
Since this is marked specifically as being @exhaustive(cases),
it means that adding new fields is allowed without it being
considered a breaking change.
One could add a field to the NotRepresentable case, and refine it,
without breaking backward compatibility:
@exhaustive(cases)
enum FloatParseError {
| UnexpectedChar { byte_offset: UInt64 }
| NotRepresentable {
// Represents the lower bound on the number of bits
// that would be needed for a floating point type
// to be able to represent the given value.
min_bits_needed: UInt64
case: @exhaustive enum {
| TooSmall {}
| TooLarge {}
}
}
}
Once the type is refined, both of these forms of pattern-matching work.
if err is {
.UnexpectedChar { .. } -> ..
.NotRepresentable { __ } -> ..
}
// or more fine-grained
if err is {
.UnexpectedChar { .. } -> ..
.NotRepresentable nr and nr is {
.TooSmall {} -> ..
.TooLarge {} -> ..
}
}
Case refinement in Everr provides optionality without a-priori
factoring out enum definitions into separate enum and case struct
definitions, unlike typical languages in the ML family.
The cost of additional verbosity is paid when refinements are introduced.
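For comparison, here’s roughly how one might approximate this in today’s Rust – not an exact equivalent, but #[non_exhaustive] on an individual variant similarly lets its author add fields (such as a refined kind) later without breaking downstream matches:

#[derive(Debug)]
pub enum FloatParseError {
    UnexpectedChar { byte_offset: u64 },
    // Downstream crates must match this variant with `..`, so adding
    // fields to it later is not a source-breaking change.
    #[non_exhaustive]
    NotRepresentable { min_bits_needed: u64, kind: NotRepresentableKind },
}

#[derive(Debug)]
#[non_exhaustive]
pub enum NotRepresentableKind { TooSmall, TooLarge }

pub fn describe(e: &FloatParseError) -> &'static str {
    match e {
        FloatParseError::UnexpectedChar { .. } => "unexpected char",
        FloatParseError::NotRepresentable { kind: NotRepresentableKind::TooSmall, .. } => "too small",
        FloatParseError::NotRepresentable { .. } => "not representable",
    }
}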
Errors can be propagated using a try operator,
which can be used at different granularities
– for one or more statements,
for a specific (tree of) expressions,
or for individual call expressions.
enum ABCError {
| type SubError1
| type SubError2
| type SubError3
| type SubError4
}
fn abc() -> Result {
try {
sub_op1(…)
sub_op2(…)
}
let a = try sub_op3(…).some_method(…)
let b = sub_op4(a).@try
return ok(b.other_method())
}
try allows for a single level of automatic conversion of errors,
via the previously mentioned UpCast trait.
This is similar to ? and Try in Rust.
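As a rough sketch of that analogy in Rust itself, with stand-in types loosely modeled on the example above:

#[derive(Debug)]
pub struct SubError1;

#[derive(Debug)]
pub enum ABCError {
    Sub1(SubError1),
}

// The single level of automatic conversion: `?` goes through `From`,
// much like Everr's try goes through UpCast.
impl From<SubError1> for ABCError {
    fn from(e: SubError1) -> Self {
        ABCError::Sub1(e)
    }
}

fn sub_op1() -> Result<u32, SubError1> {
    Ok(1)
}

fn abc() -> Result<u32, ABCError> {
    let a = sub_op1()?; // SubError1 is upcast to ABCError here
    Ok(a + 1)
}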
Everr programmers are encouraged to define and use structured error types,
and attach metadata to errors by defining new struct fields.
Structs can be easily serialized
using the same mechanism as for trait derivation,
with minimal boilerplate.
This integrates well with observability libraries such as those
providing APIs for structured logging and tracing.
The Everr language server has a code action for defining error types for a given
function based on the error types of the functions that are called.
It also can intelligently suggest modifications to error cases
as the function body evolves over time, taking contextual rules such
as access control into consideration.
The Everr standard library exposes a standard type for call traces. (A call trace covers different kinds of traces such as a stack trace, a Zig-style error return trace, as well as async stack traces. While these are all recorded differently, they fall under the same concept: a sequence of source code locations which describe “how did we get here”.)
Even though the use of structured errors is encouraged,
the Everr standard library provides APIs for working
with unstructured errors:
An AnyErrorContext type which exposes convenient APIs for:
Attaching key-value pairs.
Capturing and recording call traces.
An AnyError type which pairs an error value along with an AnyErrorContext and zero-or-more child AnyError values (similar to error in Go). This exposes convenient APIs for:
Initialization from a specific struct/enum error type.
Merging several sub-errors into a larger error.
Tree traversal and inspection.
Fail-fast error handling
Recoverable fail-fast error handling is done via panics,
which are similar to unchecked exceptions in languages like Java and C++.
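For a concrete point of reference, here’s what coarse-grained recovery via panics looks like in Rust (a minimal sketch; a real server would do this once per task or request):

use std::panic;

fn main() {
    let result = panic::catch_unwind(|| {
        let xs: Vec<i32> = vec![];
        xs[0] // out-of-bounds access panics
    });
    // The payload is an opaque Box<dyn Any + Send>, not a structured error.
    assert!(result.is_err());
}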
Function declarations and types can optionally be annotated with a
@panics attribute, which records whether the function might panic,
might panic only when compiled in development mode
(but never in release mode), or can never panic at all.
Panicking is a capability.
The default capability level for panicking is Implicit, so most
packages do not use @panics annotations.
For packages compiled with Explicit or Binding level for panic marking,
the Everr compiler checks if a function’s @panics annotation
(or lack thereof) matches with the annotations on other
functions that are called by it.
The standard linter recognizes functions with @panics annotations
and recommends adding a section to the function’s API docs
describing when the function might panic.
For the Binding level in particular, the standard linter
has an optional lint that requires explicit @panics(maybe)
or @panics(never) annotations on all functions,
to avoid accidental API additions without @panics
annotations (which would prevent the implementation
from using assertions).
The Everr core libraries, including the standard library, use Explicit
panic marking.
A small portion of the Everr ecosystem, chiefly some minimal alternatives
to the standard library and related packages meant to be used in the
context of embedded systems, default to using the Binding level for panic marking.
Similar to panicking, program termination using the exit primitive
is treated as a capability; it has a matching @exits attribute.
However, unlike panicking, the default level for the termination capability
is Explicit, as it is fairly rare to require it in library code.
Primitives – Bounded integer operations
In Everr, numeric operations on bounded integer types panic on overflow.
One can opt-in to alternate behavior on overflow (commonly wrapping)
at the granularity of a package, one or more statements,
or an expression. Similar to try, this operates at a syntactic level.
let sum_plus_one = @math.wrap { 1 + vec.sum() }
// Addition semantics in the sum() call itself are unaffected.
When overflow behavior is overridden at the package level,
the Everr language server supports showing inlay hints
for the overflow behavior at function granularity.
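For comparison, Rust also scopes overflow behavior syntactically, through explicit methods (or the Wrapping<T> type) on exactly the chosen operations; a minimal sketch:

fn sum_plus_one(v: &[u32]) -> u32 {
    // Only these two additions wrap; any arithmetic inside callees
    // (e.g. an iterator sum) keeps its own semantics.
    let sum = v.iter().fold(0u32, |acc, &x| acc.wrapping_add(x));
    1u32.wrapping_add(sum)
}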
Primitives – Assertions
Everr’s assertions are customizable. (I’m deliberately not describing the exact APIs for assertions here, as there is some room for building interesting APIs, such as sometimes assertions.)
By default, an assertion failure triggers a panic.
However, a binary package can customize the semantics of assertions
by specifying an “assertion overlay” in its package manifest.
(This customizability introduces some more complexity when considering the interaction with default capability levels and capability checking. I’ve not fully thought through the ramifications, but my gut feeling is that this is a solvable problem.)
This acts as an override for the default assertion related APIs,
so every package in the dependency graph ends
up using one’s custom implementation of assertions.
The two most commonly used assertion overlays are:
Profiling overlay: This helps collect metrics related to individual
assertions in production.
Toggling overlay: This helps disable assertions in dependencies,
either statically or dynamically.
Out-of-memory handling
By default, out-of-memory for the heap results in program termination.
Heap usage is a capability, with the default level of Implicit.
Primitives which utilize heap allocation have fallible alternatives,
making it possible to build APIs on top without having to rely on
heap allocation always being successful.
(The design of ergonomic and performant APIs for using custom allocators while maintaining memory safety – or, at least, trying to reduce the number of footguns – is an open problem with several competing approaches. For more details, see Section 6 and Appendix A4.)
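Rust’s Vec::try_reserve family is an existing example of such a fallible alternative; a minimal sketch:

use std::collections::TryReserveError;

fn make_buffer(n: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve_exact(n)?; // surfaces allocation failure as an error
    buf.resize(n, 0);
    Ok(buf)
}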
Stack overflow handling
By default, stack overflow results in program termination.
Recursion and usage of indirect calls are both considered capabilities,
with the default level of Implicit.
Code which cannot afford to have a stack overflow,
such as code following
NASA’s rules for safety-critical code,
can easily disable both of these features,
allowing the potential call graph of the program to be statically computed.
This allows computing the total stack usage of the program at the time
of compilation.
Stack usage is not guaranteed to be stable across minor or patch versions
of the Everr compiler, but since qualifying the compiler for safety-critical code
tends to be a time-consuming affair that is less frequent than compiler
releases, this is considered an acceptable trade-off.
For an alternative design which allows for more expressivity at the cost
of more implementation complexity, see Appendix A5.
Section 5: Error models in the wild
The Swift Error Handling Rationale and Proposal
doc I linked earlier covers the error model followed in
C, C++, Objective-C, Java, C#, Go, Rust and Haskell,
so I will not elaborate on those much more here.
Some older statically typed languages missing from that list include
D, OCaml and Ada. Some newer languages understandably missing include
Pony, Nim, Zig, Odin, Roc, Jai and Hare.
Scala does not market itself as a systems programming language,
but its type system offers interesting ways of expressing errors,
so it is included below. Other languages such as Dart, Kotlin,
Gleam and Unison are excluded because
I did not see any unusual or interesting ideas related to error
handling in their docs, and they seem to be more targeted towards
applications with less stringent performance requirements.
(EDIT: Mar 09, 2025) Common Lisp is not statically typed, but supports
an interesting system for resumable exceptions, so I’ve included it as well. (Thanks to feedback from Manuel Simoni (@manuel) and Tony Finch (@fanf) on Lobste.rs and Gavin Howard (@GavinHoward) on Hacker News.)
I’m going to try to summarize the error models very quickly here,
based on the language’s own docs. My summaries should not be considered
authoritative, especially for pre-1.0 languages where conventions
are probably still in flux, and docs are perhaps more likely to be
out-of-date.
D
As of Feb 2025, the official D docs
recommend using unchecked exceptions as the primary way of communicating errors.
However, at DConf 2020,
Walter Bright, the creator of D, stated that he thinks that
exceptions are obsolete.
Other noteworthy aspects:
Functions can be marked nothrow.
The type Exception
is the base class for all “errors that are safe to catch and handle”.
Throwing an Exception is not allowed in nothrow functions.
The type Error
is the base class of “all unrecoverable runtime errors”.
Throwing an Error is allowed in nothrow functions.
The default assert function/built-in throws an Error.
OCaml
Error propagation in OCaml comes in several flavors:
Unchecked exceptions – These are widely used by the standard library.
Notably, assertion failures get converted to exceptions.
option and result types – Increasingly more library APIs have alternate
variants which return option or result in the failure case
rather than an exception.
Async exceptions are used for out-of-memory, stack overflow and
for OS signals like SIGINT.
Additionally, OCaml supports syntax sugar
for short-circuiting evaluation/chaining of operations using option and result.
The official OCaml docs are probably
best described as being somewhat neutral in terms of prescribing a particular
error propagation strategy, but they do state:
It tends to be considered good practice nowadays when a function can fail
in cases that are not bugs (i.e., not assert false, but network failures,
keys not present, etc.) to return type such as ’a option or (’a, ’b)
result (see next section) rather than throwing an exception.
The book Real World OCaml
recommends using the Base library by Jane Street
in addition to the standard library. This library includes additional
helper types such as Error.t – a lazy string with various helper functions
for ease of construction – and helper functions such as Or_Error.try_with
to convert an exception throwing computation to a result.
Real World OCaml concludes:
If you’re writing a rough-and-ready program where getting it done quickly is key and failure is not that expensive, then using exceptions extensively may be the way to go. If, on the other hand, you’re writing production software whose failure is costly, then you should probably lean in the direction of using error-aware return types.
[..] If an error occurs sufficiently rarely, then throwing an exception is often the right behavior.
Also, for errors that are omnipresent, error-aware return types may be overkill. A good example is out-of-memory errors, which can occur anywhere, and so you’d need to use error-aware return types everywhere to capture those.
Ada
Ada supports unchecked exceptions using values of a specific
type exception that can optionally carry a string payload.
If I understand correctly, exceptions have a notion of identity that is separate from the string payload.
The pre-defined exceptions include:
Constraint_Error for out-of-bounds accesses, overflow, null dereference etc.
Storage_Error for allocation failure and stack exhaustion
In one Stack Overflow discussion, I found two different practitioners stating:
“The reason for [not having a way to chain exceptions] is that the language has been designed for exceptions to be used scarcely”
“Ada exceptions are expected to be used for truly exceptional problems.”
However, the Ada style guide
on Wikibooks has a guideline “Use Exceptions to Help Define an Abstraction”.
One of the code examples in this section is a stack where the Pop operation
raises an exception when the stack is empty.
I/O operations in the Ada standard library pervasively use exceptions.
I was not able to verify if there are other widely used standard library alternatives.
GNAT (Ada compiler toolchain in GCC) supports several settings for
restricting exception usage such as
No_Exception_Handlers,
No_Exception_Propagation,
and No_Exceptions.
For example, No_Exception_Propagation requires exceptions to be handled within the subprogram that raises them, rather than propagated to callers.
Scala
Scala 3 supports using unchecked exceptions
as well as a tagged union type Try
which enables easily catching exceptions and converting them to a sum type representation.
Some popular libraries such as Li Haoyi’s libraries for IO use exceptions
for errors such as a file not being found.
I’m guessing the more FP-oriented parts of the ecosystem probably use
case classes and/or Result to a greater extent, but I was not able to validate
this within 5~10 minutes of searching.
Nim
Based on a quick skim of the Nim docs, I surmised the following:
Nim supports exception handling,
and this is used by standard library APIs such as for opening files.
Nim supports raises annotations on functions which indicate the types of
exceptions a function may throw (e.g. {.raises: [ParseError]}).
Nim supports a way to mandate explicit raises annotations across a module,
by putting {.push raises: [].} at the start of a module.
This setting ignores exceptions which inherit from system.Defect,
which is Nim’s way of signaling errors such as division by zero
and assertion failure.
Nim also has optional and result types.
The Nim tutorial section on exceptions states that
A convention is that exceptions should be raised in exceptional cases,
they should not be used as an alternative method of control flow.
The rest of the tutorial is descriptive –
it covers how to use different language features related to exceptions.
This unofficial Nim style guide
has more detailed recommendations on modeling and handling errors.
I was unable to find similar prescriptive language in
the official Nim docs.
The languages below are pre-1.0 as of March 2025. (The exception is Odin, which currently uses date-based versioning; I am treating that as equivalent to pre-1.0 in terms of stability guarantees, given that Odin is less than 10 years old.)
Pony
The Pony tutorial has a page on the error expression,
which aborts execution and transfers control to the enclosing try block.
All partial functions must be annotated with ?.
As far as I can tell, there is no way to attach any data
when using the error expression.
If that is correct,
it means that the error primitive is similar
to an optional type which must be explicitly unwrapped.
Looking at unofficial sources –
this blog post
and a Stack Overflow discussion on distinguishing between different types of errors –
it seems like the most common way to do
error handling in Pony is using union types (e.g. Int64 | OverflowError).
As of Feb 2025, I could not find any official docs
further explaining how errors ought to be modeled and
how error handling should be done, outside of the
performance cheat sheet,
which recommends avoiding both error
and union types in performance-sensitive code,
for different reasons.
Based on the docs, it looks like union types are compiled
using a type ID pointer as the tag
plus implicit boxing for the value. This makes sense, as it allows implementing the natural subtyping relationship T <: T | U without changing a value’s run-time representation.
Appendix A2: Everr union type representation
In Everr, converting a value of a case type such as Optional.None to a union type such as Optional[A]:union[.None] requires a representation change, so the compiler could generate conversion functions along the following lines:
fn _none_to_union[A](v: Optional.None) -> Optional[A]:union[.None] {
// implementation elided
}
fn _some_to_union[A](v: Optional.Some[A]) -> Optional[A]:union[.Some[A]] {
// implementation elided
}
Since the generation of these functions is entirely under the compiler’s
control, it is possible for the compiler to directly generate specialized
versions of these functions for different types with different niches
during monomorphization, instead of attempting to generate one version.
This means that when the compiler sees an implicit conversion from
None to Optional[A]:union[.None], it can specialize the conversion
based on what A is instantiated to. (This is a syntax-directed implicit conversion rather than a subtyping rule, because it requires a change to the run-time representation. If it were a subtyping rule, and Everr were to support covariant and contravariant type parameters, that would require potentially-arbitrarily-expensive bridging conversions, as in Swift.)
Appendix A3: Hare type system discussion
The Hare docs state that tagged union types are “commutative, associative, and reductive”
and that “the order of types, inclusion of the same type more than once,
or use of nested tagged unions has no effect on the final type.”
So Hare’s tagged unions implement union types in the type-theoretic sense,
not sum types, unlike most other languages with tagged unions.
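As a rough analogy (my sketch, not from the Hare docs): OCaml’s polymorphic variants also ignore tag order and flatten nesting, although they are row-typed sets of tags rather than true union types:

type io_err = [ `NotFound | `PermissionDenied ]
type parse_err = [ `UnexpectedChar of char ]

(* Nesting has no effect on the final type: any_err is the same type
   as writing all three tags out directly. *)
type any_err = [ io_err | parse_err ]

(* Order has no effect either: any_err' equals any_err. *)
type any_err' = [ parse_err | io_err ]

let describe (e : any_err) : string =
  match e with
  | `NotFound -> "not found"
  | `PermissionDenied -> "permission denied"
  | `UnexpectedChar c -> Printf.sprintf "unexpected character %c" c

(* Typechecks because any_err' and any_err are equal types. *)
let describe' (e : any_err') : string = describe e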
Hare does not support parametric polymorphism,
so the choice to implement union type semantics via tagged unions:
Offers a high degree of flexibility, like Scala, Pony and TypeScript,
by not requiring upper bounds (unlike Everr)
and allowing the union of arbitrary types (unlike Everr).
Does not require a uniform representation,
like Everr but unlike Scala, Pony and TypeScript.
Does not increase type system complexity significantly, because
there is no need for subtyping rules involving type inference
and polymorphism.
Appendix A4: Safe and modular memory management
The research work on Verona has surfaced one potential direction,
Reference Capabilities for Flexible Memory Management:
Verona is a concurrent object-oriented programming language
that organises all the objects in a program
into a forest of isolated regions.
Memory is managed locally for each region,
so programmers can control a program’s memory use by
adjusting objects’ partition into regions,
and by setting each region’s memory management strategy.
The overall type system is more flexible than Rust’s in some ways,
but if you look closely, the paper points out that
field accesses may throw an exception if the region
corresponding to the field is already open. If I understand
correctly, this is similar to Rust’s RefCell type, which
does dynamic borrow-checking.
Entering a region borrows and/or buries the variable
or field referencing the bridge object.
In the case of a stack variable,
the variable is buried to prevent the region from being multiply opened.
In the case of a field, we instead resort to a dynamic check of the region’s state.
I have not been able to absorb the paper deeply,
but such a design seems potentially concerning.
Technically, this could be replaced with a more local form of
returning an error, but it’s unclear how widespread the need
for this would be in typical programs.
Appendix A5: Modular static analysis for stack usage
I believe it should be possible to support modular static analysis
for controlling stack usage without requiring the elimination of indirect calls,
which are useful for basic operations like map, filter, etc.
The problem with an indirect call is that the stack usage for it will
be unknown. So the most direct “fix” is to equip calls with stack usage information.
Specifically, function types could be equipped with two kinds of stack usage budgets:
Self budget: The maximum memory the function is allowed to use for temporaries.
Calls budget: The maximum stack usage for calls inside the function.
When a function body is lowered to an IR
which makes temporaries explicit,
after some set of relatively stable baseline optimizations
(e.g. sparse conditional constant propagation and copy propagation, but no inlining),
a compiler can check the following:
That the sum of the stack usage of all temporaries fits within the self budget,
assuming no lifetime contraction or further optimization.
That the total stack usage budget of each called function fits within the calls budget
(these calls may be indirect).
Finally, after inlining and other optimizations but before generating assembly,
one could perform a validation check only for the total budget
(but not for the self and calls budgets, because of inlining).
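To make the two modular checks concrete, here is a toy sketch in OCaml of a checker over per-function summaries (the summary structure and field names are entirely hypothetical):

(* Hypothetical per-function summary produced after baseline optimizations. *)
type func_summary = {
  name : string;
  self_budget : int;     (* bytes allowed for temporaries *)
  calls_budget : int;    (* bytes allowed for calls made by this function *)
  temp_sizes : int list; (* stack sizes of temporaries *)
  callees : string list; (* called functions, possibly called indirectly *)
}

(* Total stack a call to f may consume: its own temporaries plus its calls. *)
let total_budget f = f.self_budget + f.calls_budget

let check (lookup : string -> func_summary) (f : func_summary)
    : (unit, string) result =
  let temps = List.fold_left ( + ) 0 f.temp_sizes in
  if temps > f.self_budget then
    Error (Printf.sprintf "%s: temporaries need %d bytes, self budget is %d"
             f.name temps f.self_budget)
  else
    (* Calls from f are sequential at this frame, so the worst single
       callee (not the sum) must fit within the calls budget. *)
    let worst =
      List.fold_left (fun acc c -> max acc (total_budget (lookup c)))
        0 f.callees
    in
    if worst > f.calls_budget then
      Error (Printf.sprintf "%s: a callee may need %d bytes, calls budget is %d"
               f.name worst f.calls_budget)
    else Ok ()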
In such a system, if you annotate main with a stack budget,
then you’d essentially trigger errors for each function call inside main
and so on until you’ve added stack budgets
for every function in the call graph.
Yes, this would necessitate writing your own
stack-usage-aware minimal standard library.
I believe such a system should be “workable” in practice in the limited
sense that compiler optimizations typically do not increase stack usage
of call trees, and the number of temporaries generally goes down
as the optimization crank is turned further. So I suspect that the final
validation check should not fail too often.
Depending on the desired properties about where errors should be handled
(e.g. is it OK to emit errors after monomorphization?),
and which language features need to be supported in concert with stack budgets
(e.g. is it OK to only allow setting budgets on monomorphic functions?), one could:
Potentially have the budget checks run on a polymorphic IR instead of
the post-monomorphization IR
Allow the budget to not just be an integer, but a more general expression,
allowing references to the stack budgets of parameters, basic numeric operations
such as +, etc.
I suspect that it’s not really possible to have a much simpler solution than
what I’ve described above, unless one is willing to (1) give up on modularity
or (2) make the check dynamic.
Of course, one might ask: is this much complexity worth it “just” for statically
preventing stack overflows in a modular fashion?
For that, the answer is I don’t know.
If existing languages are any guide,
the answer seems to be largely No.
Appendix A6: Defining new types using diffs
Say Everr supported type diffs. This could help reduce the boilerplate
involved in extending a type defined upstream from:
@exhaustive
struct DetailedStoringError {
@delegate
base: ImageProcessingError.StoringError,
dbName: Str,
dbURL: URL,
}
@exhaustive
enum DetailedImageProcessingError {
| type ImageProcessingError.DownloadingError
| type ImageProcessingError.ProcessingError
| type DetailedStoringError
}
to something like:
@exhaustive
enum DetailedImageProcessingError diff ImageProcessingError {
| …
* | base: StoringError -> DetailedStoringError {
…,
+ dbName: Str,
+ dbURL: URL,
}
}
Specifics of concrete syntax aside, it would technically be possible
to parse the second form and desugar it into
the first. This may be particularly useful if there are lots of cases
in the upstream type being extended.
The Everr language server could show the desugared version as a large multi-line
“inlay hint” inside the editor. (However, this runs into the further issue of how to represent the upstream type if the package containing the definition of the upstream type is not pinned to a specific version.)
However, because this implicitly adds a @delegate attribute to preserve
the ability to do field projections and method calls,
chaining such type definitions makes it easy
to have multiple levels of delegation,
which can be confusing to debug (similar to deep inheritance hierarchies).
The interaction with the implicit projection for .case also needs
to be thought through, and may be confusing.
On the other hand, forbidding multiple levels of definitions of
type diffs may be too restrictive if the target use case also needs
to cover code which needs to support 3+ versions of a data type over time.
This is only considering additive diffs. Subtractive diffs are likely
simpler – because @delegate cannot be involved without breaking abstraction
– but also less useful, since it’s more common to want
to create extended versions of third-party types in practice.
Appendix A7: Optimism
Recently at work, I discovered that my assumption
that our DBs were configured so that queries which
were running fast would continue running fast
was proven wrong when the DBs went through a Postgres major version upgrade,
and the resulting loss of statistics
– despite autovacuum being turned on –
contributed to an incident.
The incident was resolved late on a Thursday night;
I was due to go on vacation the subsequent week.
While on vacation, when I had some time to kill,
I spent it digging through mailing list threads,
Postgres source code, and blog posts, and asking Claude,
attempting to answer the question
“What are all the different possible situations
under which Postgres can have statistics that are
woefully out-of-date despite autovacuum being turned on,
and how can those be detected?”
When I came back, I noticed that nobody had really asked the same question,
at least in public. At first, I was puzzled,
“Surely, people are not just hoping that
this doesn’t happen again, right?”
Then I realized I was operating with a different mental model
altogether. My trust in Postgres’s ability to maintain
statistics had gone from confident to low (and the scarce
documentation did nothing to allay that fear),
whereas my colleagues believed that
this was likely a one-off issue specific to major version upgrades.
Both of these beliefs make some sense in different ways.
The more optimistic point of view assumes that,
given that we did not have any such statistics-related
issues earlier when running Postgres 12,
and that Postgres upgrades generally bring improvements,
autovacuum in Postgres 16 is likely
at least as good as, and less buggy than, in Postgres 12.
The more pessimistic point of view assumes that, given the presence
of one undesirable behavior in the autovacuum daemon that
hadn’t yet been fixed despite Postgres being one of the most
mature and widely used DB systems in existence,
it was possible that more undesirable
behaviors in the same area were still lurking,
just waiting to be hit.
When I was thinking about this, I was reminded of
The Mythical Man Month, where in the titular chapter,
Brooks devotes the very first section to discussing
programmers’ optimism.
All programmers are optimists. [..]
In a single task, the assumption all will go well has
a probabilistic effect on the schedule. It might indeed go
as planned, for there is a probability distribution for
the delay that will be encountered, and “no delay” has a
fine probability. A large programming effort, however,
consists of many tasks, some chained end-to-end.
The probability that each will go well becomes vanishingly small.
Here, Brooks is discussing optimism in the context of planning.
Overall, the section is largely anecdotal and speculative about causes.
This point about optimism in the context of planning has been
repeated by Kent Beck and Jeff Atwood.
However, the planning fallacy is common across professions.
I’m curious: is there research showing that programmers tend to be more optimistic than people in other professions? (🧠: “Am I the weird one?”)
And does this optimism apply when reasoning about error cases? How do programmers adjust their trust in particular bits of code as they discover bugs? I searched for a bit but didn’t find much, so if you’re reading this and know of some research in this area, please let me know. 😃
Appendix A8: Resumable exceptions, continuations and algebraic effects
(EDIT: Mar 9, 2025): This section was added based on discussion in Lobste.rs and Hacker News threads.
A condition system like that of Common Lisp may be thought of as being similar
to passing a structure mapping prompts to handlers implicitly
to various functions, and at the point of a resumable exception,
the appropriate handler must be identified based on the provided
prompt and which existing handlers are available.
Since Common Lisp is dynamically typed, the choice for which handler
is invoked must happen at run-time.
My understanding (potentially wrong) is that this is not too dissimilar
from algebraic effects, except that in the case of algebraic effects,
one can optimize this further, such as not needing to search through
the evaluation context (e.g. as in Generalized Evidence Passing
for Effect Handlers).
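For a concrete flavor of resumption, OCaml 5’s effect handlers implement this pattern directly; below is a minimal sketch (assuming OCaml ≥ 5.0; the Ask_retry effect and the retry policy are made up for illustration):

open Effect
open Effect.Deep

type _ Effect.t += Ask_retry : string -> bool Effect.t

let fetch (path : string) : string =
  (* Signal a resumable condition; execution pauses here until the
     nearest handler answers, then resumes with the handler's answer. *)
  if perform (Ask_retry path) then "retried " ^ path
  else failwith ("giving up on " ^ path)

let () =
  let result =
    match_with fetch "/missing"
      { retc = (fun x -> x);
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Ask_retry p ->
              Some (fun (k : (a, _) continuation) ->
                (* The handler resumes the suspended computation,
                   passing a boolean decision back to the perform site. *)
                continue k (String.length p < 16))
          | _ -> None) }
  in
  print_endline result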
Based on the Rust PR that removed the condition system from Rust,
it seems like the original condition system
did not track “which handlers are needed in this context”.
Technically, this particular problem might have been solvable using
the recent research on algebraic effects.
In the absence of passing type information to the point of resumption,
some form of dynamic type checking is necessary.
This presents two broad options in terms of language design:
Allow data types containing borrowed references to be passed
to a condition handler (i.e. non-‘static values in Rust).
This may require reifying lifetime/ownership/region information
at runtime for run-time type-checking, depending on the exact
rules of run-time type-checking.
Require that resumption functions only be able to deal with
owned values or references guaranteed to live arbitrarily long.
Depending on threading/task structure, further restrictions may be
needed. For example, can the resumption function capture “owned” values?
If so, what if it destroys them after/during the first invocation?
Can the resumption function be invoked concurrently?
Rust has several traits for expressing various kinds of functions
– Fn, FnMut, FnOnce. Function invocation under concurrency
further involves the Send and Sync traits.
In essence, if one desires type safety and memory safety,
and if we consider the structure containing the “handlers”
as described at the start of this section,
then the types of the functions involved
would need to be checked at the point of resumption, either dynamically
(à la Common Lisp or pre-1.0 Rust)
or statically (à la algebraic effects).
I believe this point applies even if the exact implementation is not based on literally passing an extra hidden parameter.
However, given the complexities of function types in both Rust and Swift
(as two examples of languages which ensure data race freedom while
targeting low-level use cases),
it seems that implementing either of these options is unlikely
to be palatable to programmers.
As far as I know, there are no widely deployed programming languages
integrating algebraic effects with lifetimes/ownership/borrowing.
The closest work I know related to this is that on
Granule,
but I do not know enough about it to comment on its limitations.
As of Mar 2025, the project homepage mentions that the type-checker
uses the Z3 SMT solver, which is not a common practice outside
of theorem provers.