Origin's Blog
Source of Error and How Modern Error Handling Deals With It
The greatest and most hated thing when writing any application is when something unexpected happens. Be it uninitialized
memory, an undefined reference in the current context (for JavaScript,
the difference between a normal function and an arrow function),
something throwing somewhere without any knowledge of where and
why, or invalid state. It is especially a nightmare to
debug in a dynamic language, where I like to fall back on print
statements, as they are clearer and easier to use than any debugger
I have come across. Catching the potential source of an error
is not always possible, because it might be raised from a
vendor package, which is not always editable unless you have
statically included it in your app. But I am getting carried away; let's take a small
detour through how we got here:
The dinosaur way of exception handling
For the first sentence I will quote from Wikipedia (Link):
The first hardware exception handling was found in the UNIVAC I from 1951. Arithmetic overflow executed two instructions at address 0 which could transfer control or fix up the result.
Whilst not truly scientific, this became the set-in-stone meaning of what an exception is and how we deal with one: either we continue normal execution by using a default value,
or we start an exception routine that uses another method of obtaining a result or notifies the user about the error. Back in those days everything was done directly on
the machine with interrupts. Some years later, with computers being used mainly by scientific and business users, IBM came up with a way to serve both groups and slowly replace
FORTRAN and COBOL, in hopes of having a one-size-fits-all programming language (it didn't work). Then 1964 came, and a formal definition of PL/I (Programming Language I) appeared:
the first language which by design included a way to handle exceptions at run time. Now, it wasn't great; it was even worse than what we have today, because its exception
routines were declared with the keyword ON, and they could modify variables asynchronously during exception handling, unseen by the caller, resulting in unpredictable run-time behavior.
This made it difficult to predict a variable's value, because it may have been modified during an exception. Lisp 1.5
had the concept of the ERROR pseudo-function with its matching ERRORSET to catch the exception, which will either contain the routine's evaluated value or start the diagnostic
routine if the argument m is set to true. In case of error, the ERRORSET value will be NIL.
Later on, in 1972, MacLisp
(the name is derived from Project MAC, "Project on Mathematics and Computation"; it has nothing to do with the company Apple) improved on this concept by introducing our beloved
keywords THROW and CATCH to the language. It behaved exactly how one would expect: in case of an exception, the error is thrown and cascades upwards until a catch is found. The general cleanup
routine is called finally, which at that time was more of a concept than a keyword. But NIL
(New Implementation of LISP) made it part of the language by introducing UNWIND-PROTECT, which
was later integrated into Common Lisp.
Let’s move forward and take a look at the post-dinosaur era
After the initial release of C
in 1972 by Dennis Ritchie, it did not include a direct way of handling errors. The mentality at that time was that a programmer is expected to prevent
errors from occurring in the first place by testing the return values of functions. Everything that can go wrong should be clearly indicated in a function's documentation, i.e. that NULL
or some other value is returned in case of error. Additionally, an external variable called errno
can be set to indicate the error code, which can be converted into its const char*
representation by calling strerror(errno). This is one of the jankiest solutions anyone has ever come up with: a global variable to indicate the error code of the last
operation! But in the case of C it makes sense. No exception routine is needed, while everyone has access to the global error variable to read and write. As long as no bad actor is
randomly overwriting the variable, this way of error handling is quite cheap. Furthermore, at least on POSIX-compatible systems (mainly UNIX), signals may be used to
indicate errors at the operating-system level. Even though our program will most likely not recover from a raised signal, we can at least try to clean up properly to prevent memory leaks.
Of course, more modern languages such as C++, Java, C#, PHP, Prolog, Python
etc. have built-in syntactic support for exceptions and exception handling. They all provide a mechanism for
throwing and catching exception objects / values inside your program at run time. We can also chain multiple catch statements, each catching a specific type of error to be
handled differently, i.e. we can differentiate between an EmptyLineException
and an InvalidArgumentException. This style of error handling is also called terminating, because, left unhandled,
an exception terminates your program.
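A small sketch of such chained handlers in Python (both exception classes and the parse_line helper are made up for illustration):

```python
class EmptyLineException(Exception):
    pass

class InvalidArgumentException(Exception):
    pass

def parse_line(line: str) -> int:
    """Raise a different exception type per failure mode."""
    if not line.strip():
        raise EmptyLineException("line is empty")
    if not line.strip().isdigit():
        raise InvalidArgumentException(f"not a number: {line!r}")
    return int(line)

# Each except clause handles one specific error type differently.
try:
    parse_line("")
except EmptyLineException:
    print("skipping empty line")
except InvalidArgumentException as e:
    print("bad input:", e)
```

If neither clause matches, the exception keeps cascading upwards; with no handler at all, the program terminates.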
Another concept of error handling is resumption, which resumes execution at exactly the point where the error occurred. It retries the same subroutine calls a number of times until
one returns an expected value. While sounding great in theory, in practice it results in slowdowns at every point where something unexpected happens (read
Butler Lampson's A Description of the Cedar Language, which includes a RESUME keyword for signal handling).
Now let’s take a look at modern languages
Even though we like to stick with whatever it is that we feel
comfortable with, we have to admit that the future will
always bring answers to the problems we have today. Something that, in the
context of error handling, made its way into the mainstream is
Go. It was designed to circumvent the pain points of already established tools,
with a simple design, slick syntax and fast build times (read Go at
Google: Language Design in the Service of Software
Engineering). Its syntax
enabled developers to think differently when it came to errors in
the language.
First and foremost, there is no exception facility in
Go, meaning no error handling control structure associated with
one. Although it includes a way to crash, with the function panic
and a matching recover for recovering from a panic, these functions are
rarely used and are not recommended in most cases. Instead, Go
provides an error interface through which a (most likely)
human-readable error can be returned for comprehension. These errors
are mostly returned as a second value from a function, and once
assigned to a variable they must be used (i.e. checked) explicitly. If
we want to ignore the error, we have to assign it to an underscore;
otherwise the application will not compile. This is an instance where
we are forced, by language design, to handle each and every error we
get. This way, we always know where an error occurred, and what our
recovery strategy is. This becomes pretty useful when we abstract
things into functions: we can compress the number of checks we have
to do for unexpected behavior. Additionally, when converting, let’s
say, an interface{} to a struct or primitive, we always get a boolean
value that indicates whether the conversion succeeded. This way we
can either fall back to a default value or return a custom error
message for that case.
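Both patterns side by side, sketched minimally (parsePositive is a made-up helper for illustration):

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePositive returns its result and an error as the second
// value, following the usual Go convention.
func parsePositive(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("not a number: %w", err)
	}
	if n < 0 {
		return 0, errors.New("value must not be negative")
	}
	return n, nil
}

func main() {
	// The caller decides on a recovery strategy at the call site.
	if n, err := parsePositive("42"); err == nil {
		fmt.Println("got", n)
	}

	// A comma-ok type assertion: ok reports whether the
	// conversion from interface{} succeeded.
	var i interface{} = "hello"
	if s, ok := i.(string); ok {
		fmt.Println("string value:", s)
	}
}
```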
For all of that, many people still consider this way too verbose:
why do we always have to check err != nil for any returned
value? That may seem tedious, even though Go provides a short-hand
syntax for exactly that case. But still, can we do better, or
at least save on operator usage?
The pinnacle of error handling (Mostly)
After briefly looking at how Go
went about solving the problem of exceptions and undefined behavior,
we now take a look at the peak of the peak. The hottest, greatest and
latest language that all developers seem to talk about is Rust. The
case of Rust in terms of error handling is quite interesting, as it
probably took its inspiration from functional languages, but don’t
quote me on that one!
For anything and everything that potentially can or will go wrong,
the types are wrapped in a Result<T, E> enum, containing
the destination type T and the error type E. We cannot use the
Result directly; we have to specify how we want to retrieve the
data. That means if any error occurs, we can choose exactly what to
do. We can crash the program with .unwrap(), return a default value
of type T with .unwrap_or(default: T), return a default value of
type T inferred from its Default trait implementation with
.unwrap_or_default(), or even compute something of value T via a
passed closure F with .unwrap_or_else(op: F) where F: FnOnce(E) -> T.
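A quick sketch of these variants, using str::parse as the fallible operation:

```rust
fn main() {
    // "7" parses fine; "seven" does not.
    let ok: Result<i32, _> = "7".parse::<i32>();
    let bad: Result<i32, _> = "seven".parse::<i32>();

    // Crash on error (fine here, because we know it succeeded).
    assert_eq!(ok.unwrap(), 7);

    // Fall back to an explicit default...
    assert_eq!(bad.clone().unwrap_or(0), 0);

    // ...or to the type's Default implementation (0 for i32)...
    assert_eq!(bad.clone().unwrap_or_default(), 0);

    // ...or compute a fallback from the error value itself.
    let computed = bad.unwrap_or_else(|e| e.to_string().len() as i32);
    println!("computed fallback: {computed}");
}
```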
All of these check that our value is not of the Err variant and
return the underlying type T. But Rust is more flexible than
that. We can actually propagate the error, stopping execution at any
given point and returning the Err Result. This way, a calling
function must handle the error, until we find someone who is willing
to deal with it.
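Propagation is spelled with the ? operator; a sketch with a made-up helper:

```rust
use std::num::ParseIntError;

// ? returns early with the Err value if parsing fails,
// handing the error up to whoever called us.
fn double_parsed(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?;
    Ok(n * 2)
}

fn main() {
    // The caller is the one who must deal with the propagated error.
    match double_parsed("21") {
        Ok(n) => println!("ok: {n}"),
        Err(e) => println!("failed: {e}"),
    }
    assert!(double_parsed("oops").is_err());
}
```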
I could go on and talk more about how Rust tackles errors, how
flexible its Result enum is in usage, and what it can do. This
monadic approach to error handling is pretty unique and powerful; it
is something I have only ever dealt with in Rust, although such types
also exist in Haskell as well as OCaml (read the Enum
std::result::Result documentation page).
Summary
We have taken a brief look at the history of how exception handling has evolved over time, which concepts we have moved on from, and what we have today. I hope you could learn a thing or two from this post and that it makes you think more about exceptions in your program: they are a feature, not a bug.
Sources
- Wikipedia Link to Exception Handling (Programming)
- Wikipedia Link to Exception Handling
- Wikipedia Link to PL/I
- Wikibooks Link to C Programming/Error Handling
- Uni Stuttgart Link to Lisp 1.5 Manual