Source of Error and How Modern Error Handling Deals With It

The greatest and most hated thing when writing any application is when something unexpected happens. Be it uninitialized memory, an undefined reference in the current context (for JavaScript, think of the difference between a normal function and an arrow function), or something throwing somewhere without any hint of where and why, or whether it is due to invalid state. It is especially a nightmare to debug in a dynamic language, where I like to fall back to print statements, as they are clearer and easier to use than any debugger I have come across. Catching the potential source of an error is not always possible, because it might be raised from a vendor package, which is not editable unless you have statically included it into your app. But I am getting carried away; let's take a small detour through how we got here:

The dinosaur way of exception handling

For the first sentence I will quote from Wikipedia (Link):

The first hardware exception handling was found in the UNIVAC I from 1951. Arithmetic overflow executed two instructions at address 0 which could transfer control or fix up the result.

Whilst not truly scientific, this became the set-in-stone meaning of what an exception is and how we deal with one: either we continue normal execution by substituting a default value, or we start an exception routine that obtains a result by another method or notifies the user about the error. Back in those days everything was done directly on the machine with interrupts. Some years later, with computers mainly serving scientific and business users, IBM came up with a way to combine both groups' needs, hoping for a one-size-fits-all programming language that would slowly replace FORTRAN and COBOL (it didn't work). Then 1964 came, and a formal definition of PL/I (Programming Language I) appeared, the first language that by design included a way to handle exceptions at run time. It wasn't great, though; it was even worse than what we have today. Its ON keyword let an exception routine modify variables asynchronously during exception handling, unseen by the caller, which resulted in unpredictable run-time behavior: it was difficult to predict a variable's value, because it might have been modified during an exception. Lisp 1.5 had the concept of the ERROR pseudo-function with its matching ERRORSET to catch the exception, which will either contain the routine's evaluated value or start the diagnostic routine if the argument m is set to true; in case of error, the ERRORSET value will be NIL.

Later on, in 1972, MacLisp (the name derives from Project MAC, "Project on Mathematics and Computation"; it has nothing to do with the company Apple) improved on this concept by introducing our beloved keywords THROW and CATCH to the language. It behaved exactly how one would expect: in case of an exception, the error is thrown and cascades upward until a CATCH is found. The general cleanup routine we now call finally was at that time more of a concept than a keyword, but NIL (New Implementation of LISP) made it concrete by introducing UNWIND-PROTECT, which was later integrated into Common Lisp.

Let’s move forward and take a look at the post-dinosaur era

After the initial release of C in 1972 by Dennis Ritchie, it did not include a direct way of handling errors. The mentality at that time was that a programmer is expected to prevent errors from occurring in the first place by testing the return values of functions. Everything that can go wrong should be clearly indicated in the documentation, i.e. that NULL or some other value is returned in case of error. Additionally, an external variable called errno is set to indicate the error code, which can be converted into its const char* representation by calling strerror(errno). This is the most janky solution that anyone has ever come up with: a global variable to indicate the error code of the last operation! But in the case of C it makes sense. No exception routine is needed, and everyone has access to the global error variable to read from and write to. As long as no bad actor randomly overwrites the variable, this way of error handling is quite cheap. Furthermore, at least on POSIX-compatible systems (mainly UNIX), signals may be used to indicate errors at the operating-system level. Even though our program will most likely not recover from a raised signal, we can at least try to clean up properly to prevent memory leaks.
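
To make this concrete, here is a minimal sketch of the classic errno pattern; the file path is made up, but fopen, errno and strerror behave exactly as described above:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    /* Hypothetical path; fopen returns NULL on failure and sets errno. */
    FILE *f = fopen("/does/not/exist.txt", "r");
    if (f == NULL) {
        /* strerror converts the error code into a human-readable string. */
        fprintf(stderr, "fopen failed: %s\n", strerror(errno));
        return 1;
    }
    fclose(f);
    return 0;
}
```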

Of course, more modern languages such as C++, Java, C#, PHP, Prolog, Python etc. have built-in syntactic support for exceptions and exception handling. They all provide a mechanism for throwing and catching exception objects / values inside your program at run time. We can also chain multiple catch clauses that each catch a specific type of error and handle it differently, i.e. we can differentiate between an EmptyLineException and an InvalidArgumentException, as the sketch below shows. This type of error handling is also called termination semantics because, left unhandled, an exception terminates your program.
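
As a minimal sketch of that chaining, here is what it could look like in Java; the exception names are the hypothetical ones from above, and parseLine is a function invented for this example:

```java
class EmptyLineException extends Exception {}
class InvalidArgumentException extends Exception {}

public class Demo {
    // Hypothetical parser that throws a different exception per failure mode.
    static void parseLine(String line) throws EmptyLineException, InvalidArgumentException {
        if (line.isEmpty()) throw new EmptyLineException();
        if (!line.startsWith("--")) throw new InvalidArgumentException();
    }

    public static void main(String[] args) {
        try {
            parseLine("");
        } catch (EmptyLineException e) {
            System.err.println("skipping empty line");
        } catch (InvalidArgumentException e) {
            System.err.println("bad argument");
        } finally {
            System.err.println("cleanup runs either way");
        }
    }
}
```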

Another concept of error handling is resumption, which resumes execution at exactly the point where the error occurred: the handler patches things up and the same subroutine call is retried until it returns an expected value. While this sounds great in theory, in practice it results in slowdowns at every point where something unexpected happens (read Butler Lampson's A Description of the Cedar Language; it includes a RESUME keyword for signal handling).

Now take a look at modern languages

Even though we like to stick with whatever we feel comfortable with, we have to admit that the future will always hold answers to the problems we have today. Something that, in the context of error handling, stepped into the mainstream world is Go. It was designed to circumvent the pain points of already established tools, with a simple design, slick syntax and fast build times (read Go at Google: Language Design in the Service of Software Engineering), and its syntax enabled developers to think differently when it came to errors.

First and foremost, there is no exception facility in Go, meaning no error-handling control structure associated with one. Although it includes a way to crash, with a function panic and a matching recover for recovering from a panic, these functions are rarely used and are not recommended in most cases. Instead, Go provides an error interface through which a (most likely) human-readable error can be returned for comprehension. These errors are mostly returned as the second value from a function, and once assigned they must be used (i.e. checked) explicitly; if we want to ignore the error, we have to assign it to the blank identifier _, otherwise the application will not compile. This is an instance where a language design choice forces us to handle each and every error we get. This way, we always know where an error has occurred and what our recovery strategy is. It becomes pretty useful when we abstract things into functions: we can compress the number of checks we have to perform for unexpected behavior. Additionally, when converting, let's say, an interface{} to a struct or a primitive, the comma-ok form of a type assertion gives us a boolean that indicates whether the conversion succeeded. This way we can either fall back to a default value or return a custom-made error message for that case, as sketched below.
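
A small sketch of both conventions; parsePort is a function made up for this example, but the error-as-second-value and comma-ok patterns are exactly the standard ones:

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort returns the error as its second value, per Go convention.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	return n, nil
}

func main() {
	port, err := parsePort("8080")
	if err != nil {
		fmt.Println("handled:", err) // decide on a recovery strategy here
		return
	}
	fmt.Println("port:", port)

	// The comma-ok form of a type assertion yields a boolean instead of
	// panicking, so we can fall back to a default value.
	var v interface{} = "hello"
	if s, ok := v.(string); ok {
		fmt.Println("got a string:", s)
	}
}
```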

For all of that, many people still consider this way too verbose: why do we have to check err != nil for every returned value? That may seem tedious, even though Go provides a short-hand syntax for exactly that case, shown below. But still, can we do better, or at least save on operator usage?
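
The short-hand in question is presumably the if statement with an init clause, which scopes err to the check itself; os.Chdir is just a convenient standard-library function that returns only an error:

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// err lives only inside this if statement.
	if err := os.Chdir("/hopefully/nonexistent"); err != nil {
		fmt.Println("could not change directory:", err)
	}
}
```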

The pinnacle of error handling (Mostly)

After this short look at how Go went about solving exceptions and undefined behavior, we now take a look at the peak of the peak. The hottest, greatest and latest language that all developers seem to talk about is Rust. The case of Rust in terms of error handling is quite interesting, as it probably took its inspiration from functional languages, but don't quote me on that one! For anything and everything that can potentially go wrong, values are wrapped into a Result<T, E> enum, containing the destination type T and the error type E. We cannot use the Result directly; we have to specify how we want to retrieve the data, which means that if any error occurs, we can choose exactly what to do. We can crash the program with .unwrap(), return a default value of type T with .unwrap_or(default: T), return a default value of type T inferred from its Default trait implementation with .unwrap_or_default(), or even compute something of type T via a passed callback function F with .unwrap_or_else(op: F) where F: FnOnce(E) -> T. All of these check that our value is not the Err variant and return the underlying T. But Rust is more flexible than that: with the ? operator we can propagate the error, stopping execution at any given point and returning the Err Result. This way, a calling function must handle the error, until we find someone willing to deal with it.
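
Here is a compact sketch of those options, using str::parse since it returns a Result; parse_and_double is a function invented for this example:

```rust
use std::num::ParseIntError;

// The ? operator propagates the Err variant to the caller, so only the
// happy path is spelled out here.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?; // returns early with Err on failure
    Ok(n * 2)
}

fn main() {
    let a = "21".parse::<i32>().unwrap();              // crash on error
    let b = "oops".parse::<i32>().unwrap_or(0);        // fixed fallback
    let c = "oops".parse::<i32>().unwrap_or_default(); // Default for i32 is 0
    let d = "oops".parse::<i32>().unwrap_or_else(|e| {
        eprintln!("parse failed: {e}");                // compute a fallback from E
        -1
    });

    println!("{a} {b} {c} {d}");             // 21 0 0 -1
    println!("{:?}", parse_and_double("4")); // Ok(8)
    println!("{:?}", parse_and_double("x")); // Err(..)
}
```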

I could go on and talk more about how Rust tackles errors and how flexible its Result enum is in usage. This monadic approach to error handling is pretty unique and powerful; it is something I have only ever dealt with in Rust, although such types also exist in Haskell as well as OCaml (read the Enum std::result::Result documentation page).
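
For a taste of that flexibility, here is a small sketch of chaining combinators on a Result; the validation rule is made up, but map, map_err and and_then are all real methods on std::result::Result:

```rust
fn main() {
    let result: Result<i32, String> = "20"
        .parse::<i32>()
        .map_err(|e| e.to_string()) // normalize the error type
        .map(|n| n * 2)             // transforms the Ok value only
        .and_then(|n| {
            // chain another fallible step
            if n <= 100 { Ok(n) } else { Err("too large".to_string()) }
        });

    println!("{result:?}"); // Ok(40)
}
```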

Summary

We have taken a brief look at the history of how exception handling has evolved over time, which concepts we have moved on from, and what we have today. I hope you could learn a thing or two from this post and that it makes you think more about the errors in your program: they are a feature, not a bug.

Sources