Errors: Exceptions: General Principles

From Matt Morris Wiki


Here are some general guidelines that you should follow when using exceptions.

At The Point Of The Error

Don't Throw An Exception If You Don't Need To

If an operation is expected to fail sometimes, provide an API that reports the failure through a return value, so that callers can test for it and take appropriate action. Don't throw an exception for something that happens in the normal course of execution and can be handled straight away.

As an example, imagine a class called Cache, with a method "getEntry" that will throw an exception of type CacheKeyException if the relevant entry is not in the cache:

class Cache
{
public:
   const CacheEntry& getEntry(const CacheKey& key) const;
};

A programmer needs to write code that adds an entry to the cache if it isn't already present. One way to enable this would be to add a method "hasEntry" to the cache, allowing users to test for a key:

class Cache
{
public:
   const CacheEntry& getEntry(const CacheKey& key) const;
   bool hasEntry(const CacheKey& key) const;
};

The programmer, however, realises that they don't have to write this extra method - they can just test for an exception in response to "getEntry()":

Cache c;
CacheKey testKey;

// some code here ...

bool addEntry = false;
try
{
   c.getEntry(testKey);
}
catch( CacheKeyException& ex )
{
   addEntry = true;
}
if( addEntry )
{
   // code to add entry to cache goes here
}

What's wrong with this code?

Some would say that the main problem is the needless generation of an exception - this being inefficient. While possibly true, that's not why the code is bad.

The code is bad because catching the exception is an imprecise way of stating what's going on. Does that exception mean what you think it means? Maybe not. Think of a master cache that fetches and stores across a collection of network servers, the names of which are stored in a server cache. If a server is removed from the server cache unexpectedly, an attempt to fetch a key from the master cache may well trigger a CacheKeyException in the server cache. The code above will treat this as a simple missing key, hiding the underlying network problem.

If you instead test the cache directly, the code is clear and unambiguous:

Cache c;
CacheKey testKey;

// some code here ...

bool addEntry = ! c.hasEntry(testKey);
if( addEntry )
{
   // code to add entry to cache goes here
}

An exception is for conveying error information up a call stack. It should not be used as a substitute for missing interface methods. If you can't test for some condition without forcing and catching an exception, add a new explicit method that tests for the condition instead.
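As a minimal sketch of what such an explicit interface might look like, the following fills in the Cache class with a std::map. The type definitions for CacheKey and CacheEntry, the findEntry and addEntry methods, and the ensureEntry helper are all assumptions made for illustration; only getEntry and hasEntry appear in the example above.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for the article's CacheKey and CacheEntry types.
using CacheKey = std::string;
struct CacheEntry { int value; };

class Cache
{
public:
   // Explicit test for presence: no exception involved.
   bool hasEntry(const CacheKey& key) const
   {
      return entries_.find(key) != entries_.end();
   }

   // Hypothetical lookup that reports "not found" through its return
   // value: a null pointer means the key is absent.
   const CacheEntry* findEntry(const CacheKey& key) const
   {
      auto it = entries_.find(key);
      return it == entries_.end() ? nullptr : &it->second;
   }

   // Hypothetical method for storing an entry.
   void addEntry(const CacheKey& key, const CacheEntry& entry)
   {
      entries_[key] = entry;
   }

private:
   std::map<CacheKey, CacheEntry> entries_;
};

// Adds a default entry only when the key is missing.
// Returns true if an entry was added.
bool ensureEntry(Cache& c, const CacheKey& key)
{
   if (c.hasEntry(key))
   {
      return false;
   }
   c.addEntry(key, CacheEntry{0});
   return true;
}
```

With this interface, a missing key is an ordinary result that the caller tests for, and any exception that does escape from the cache can be taken to mean a genuine problem.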

Describe The Problem As Fully As You Can

If you are generating an exception, describe what went wrong as fully as you can. To take an example, assume we are trying to get a cached data curve called "MYCURVE", and it is not available for the date 1 December 2008. A very general (and nearly useless) error might be:

"Error getting resource" 

This does not even tell the eventual handler which resource was being looked for. This is better, although still very uninformative:

"Error getting cached data curve"

Adding what was being looked for:

"Error getting cached data curve 'MYCURVE'" 

And finally, with all relevant information:

"Error getting cached data curve 'MYCURVE' for date 1Dec2008" 

You should also add the location of the error: the file, line and function in which the problem arose.
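One possible way to put this together in C++ is sketched below. The helper describeCurveError, the macro THROW_CURVE_ERROR and the function loadCurve are hypothetical names invented for this illustration; the location comes from the predefined __FILE__, __LINE__ and __func__.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds a message that names the resource, the
// date, and the source location where the problem was detected.
std::string describeCurveError(const std::string& curveName,
                               const std::string& date,
                               const char* file, int line,
                               const char* function)
{
   std::ostringstream msg;
   msg << "Error getting cached data curve '" << curveName
       << "' for date " << date
       << " [" << file << ":" << line << " in " << function << "]";
   return msg.str();
}

// A macro is used so that the location recorded is that of the throw site.
#define THROW_CURVE_ERROR(curveName, date) \
   throw std::runtime_error(describeCurveError( \
      (curveName), (date), __FILE__, __LINE__, __func__))

// Hypothetical lookup that always fails, to show the resulting message.
void loadCurve(const std::string& curveName, const std::string& date)
{
   THROW_CURVE_ERROR(curveName, date);
}
```

Calling loadCurve("MYCURVE", "1Dec2008") then throws a std::runtime_error whose message starts with the fully described error and ends with the file, line and function of the throw.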

If you're returning a code instead, don't pick a generic "failure" code by default. Spend a little time looking for the one that best reflects the true problem.

Think Before You Create New Exception Classes

Many exception class hierarchies are too large. In fact, even large subsystems will often need (at most) a few distinct exception classes. When deciding whether to create a new exception class, bear in mind that every new class comes at a cost - programmers dealing with the system must bear this exception class in mind, and try to deal with it properly. This becomes very difficult if there are more than a handful of classes to consider. So any new exception class needs to have some corresponding benefit that outweighs this cost of extra complexity for the programmer.

A good example of the kind of benefit that makes a new class worth creating is adding relevant information. As an example, consider a new type of exception that is created to contain the ID of a low-level field on which validation has failed. This 'FieldException' is then caught by a higher-level class, which looks the field ID up in its own naming scheme. The higher-level class can then add the higher-level symbolic name of the field to the original exception message. Here, the new 'FieldException' class has served to convey information about the error (the field ID) to an area with more context (the enclosing class), which can use that information to provide a better report on the problem.

As The Exception Propagates

Add Relevant Information As Exceptions Propagate

As an exception propagates back up the call stack, more becomes known about the external meaning of the failed operation. If the error reaches a level where extra information can be provided to help identify the root cause, this information should be added.

Sometimes the information available at the source of the error, while not enough to provide a useful hint to handlers by itself, can be combined with information available higher up the call stack. For instance, you may have a complex input structure consisting of a hundred fields. Assume a validation fails on one of the fields. The error at this point might read:

'' is not a valid date 

As it stands, this is not very useful. The input structure might have dozens of blank fields, and this message gives no clue as to which of them might have caused the error. To help identify the field, its address/id could be written into the error information. A higher level routine with access to the entire complex input structure can look up the field's address/id in that structure to get a more informative symbolic name, and augment the error to the following:

Input field 'RegistrationDate' in panel 'LicenceDetails'
'' is not a valid date 
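
The field example might be sketched in C++ roughly as follows. The class names FieldException and Panel, the function validateDate, and the use of an integer field ID are assumptions made for illustration, and the "date parsing" is reduced to an empty-string check.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical exception type carrying the ID of the field that failed.
class FieldException : public std::runtime_error
{
public:
   FieldException(int fieldId, const std::string& message)
      : std::runtime_error(message), fieldId_(fieldId) {}

   int fieldId() const { return fieldId_; }

private:
   int fieldId_;
};

// Low-level validation: knows only the field ID and the raw text.
void validateDate(int fieldId, const std::string& text)
{
   if (text.empty())   // stand-in for real date parsing
   {
      throw FieldException(fieldId, "'" + text + "' is not a valid date");
   }
}

// Higher-level panel: knows the symbolic names of its fields.
class Panel
{
public:
   Panel(const std::string& name, const std::map<int, std::string>& fieldNames)
      : name_(name), fieldNames_(fieldNames) {}

   void validate(int fieldId, const std::string& text) const
   {
      try
      {
         validateDate(fieldId, text);
      }
      catch (const FieldException& ex)
      {
         auto it = fieldNames_.find(ex.fieldId());
         if (it == fieldNames_.end())
         {
            throw;   // nothing useful to add: let it propagate unchanged
         }
         // Keep the field ID and original text, and add the symbolic name.
         throw FieldException(ex.fieldId(),
            "Input field '" + it->second + "' in panel '" + name_ + "'\n"
            + ex.what());
      }
   }

private:
   std::string name_;
   std::map<int, std::string> fieldNames_;
};
```

Note that the handler only catches the one exception type it can improve, and re-throws untouched when it has no name to add.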

Don't Add Irrelevant Information As Exceptions Propagate

You should only add information to propagating exceptions if it will be of use to the eventual handler. Sometimes, people are tempted to add something "just in case it's useful" in practically every function the exception propagates through. This is counter-productive, since the actual problem becomes hidden behind pages of irrelevancy.

Instead, start with only the information available at the error point itself, and then observe what extra information is, in fact, required to identify problems - and then add precisely that extra information. In particular, when supporting a system, if an error could not be resolved as quickly or as cleanly as it should have been, always identify and add the information that would have improved its resolution. Dedicated application of this way of working will quickly result in a sparse, but informative, set of error handlers.

Don't Remove Information As Exceptions Propagate

We already know how important it is to make the information in an exception as full and as relevant as possible. One implication of this is that you should assume that any information already there is relevant, and retain it when propagating. Wrap old exceptions inside new exceptions.
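
In C++11 and later, the standard library offers std::throw_with_nested and std::rethrow_if_nested for this kind of wrapping. The sketch below is one way they might be used; the functions readConfigFile, startService and describe, and the file name in the message, are hypothetical.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical low-level operation that fails.
void readConfigFile()
{
   throw std::runtime_error("cannot open 'settings.cfg'");
}

void startService()
{
   try
   {
      readConfigFile();
   }
   catch (const std::exception&)
   {
      // Wrap rather than replace: the original exception travels
      // inside the new one.
      std::throw_with_nested(std::runtime_error("service start-up failed"));
   }
}

// Flattens a chain of nested exceptions into one message, outermost first.
std::string describe(const std::exception& ex)
{
   std::string text = ex.what();
   try
   {
      std::rethrow_if_nested(ex);   // does nothing if ex has no nested exception
   }
   catch (const std::exception& inner)
   {
      text += ": " + describe(inner);
   }
   return text;
}
```

The eventual handler can then report both the high-level failure and its underlying cause.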

You may have to lose some information when propagating across type-system or language boundaries - see later on for guidelines on what to do in that situation.

Don't Handle Exceptions That You Can't Deal With Properly

An exception may come from code of which you know nothing, and be intended to be dealt with by handlers of which you also know nothing. A major function of exceptions is to pass information about problems without code in-between having to deal with the information itself. So your default assumption should be that errors are to be handled non-locally, and your default action should be to let them propagate outwards.

Adding context and re-throwing is fine, as long as you have useful context to add. But trying to handle a problem that you can't do anything about leads to code like this C++ example:

bool wasAnError = false;
try
{
   some_operation_that_might_throw();
}
catch (...)
{
   // We are throwing away all information about any problems
   // that have arisen at this point. Let's hope it wasn't
   // anything that we weren't expecting!
   wasAnError = true;
}

You should never write a "catch-all" clause in the body of an application. The only valid use for such a thing is when control is about to leave the language you are working in altogether - either because a thread is about to exit, or because control is about to pass to a different language.
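
As an illustration of the language-boundary case, here is a sketch of a C++ function exposed to C callers. The ErrorCode values and the functions halve and api_halve are hypothetical; the point is only that the catch-all sits at the very edge, where an escaping exception would otherwise cross into code that cannot handle it.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical error codes agreed with the C caller.
enum ErrorCode
{
   EC_OK = 0,
   EC_INVALID_ARGUMENT = 1,
   EC_FAILURE = 2,
   EC_UNKNOWN = 3
};

// Ordinary C++ code: reports problems by throwing.
int halve(int value)
{
   if (value < 0)
   {
      throw std::invalid_argument("value must not be negative");
   }
   if (value % 2 != 0)
   {
      throw std::runtime_error("value is not even");
   }
   return value / 2;
}

// Boundary function callable from C: no exception may escape from here.
extern "C" int api_halve(int value, int* result)
{
   try
   {
      *result = halve(value);
      return EC_OK;
   }
   catch (const std::invalid_argument&)
   {
      return EC_INVALID_ARGUMENT;
   }
   catch (const std::exception&)
   {
      return EC_FAILURE;
   }
   catch (...)
   {
      // Last resort, acceptable only because control is leaving C++.
      return EC_UNKNOWN;
   }
}
```

Even here, the more specific handlers come first, so that as much information as possible survives the translation into an error code.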

In The Surrounding Code

Provide Robust Resource Handling

You need to ensure that resources will not leak in the presence of exceptions. This applies even if you are using a garbage-collected language: such a language might be taking care of memory for you, but "resources" are not just memory - other examples include synchronization objects (mutexes, semaphores, spin locks), database resources (transactions, cursors) and file handles. Complex programs typically have many kinds of resource to handle, for which no language will provide direct support.

Different languages provide different ways of ensuring resources get released when exceptions are thrown. There is a major difference in this respect between C++, and garbage-collected languages such as Java, C# and Python.

C++ is not garbage-collected: an object with automatic storage (a local variable, for instance) has its destructor called as soon as it goes out of scope, including when the scope is left because of an exception. So the easiest (and the best) way to ensure resources are freed in C++ is for objects to free their resources in their destructors:

void databaseOperation(Database& d)
{
   // The destructor of DatabaseQuery releases resources
   DatabaseQuery q = d.getQueryObject();

   // do something here...
}

However, this doesn't work for garbage-collected languages, since they don't reclaim objects at any definite point. Instead, they typically provide a "finally" keyword that indicates cleanup code to be run under all circumstances, even if an exception is thrown:

void databaseOperation(Database d)
{
   // Java-style sketch: acquire before the try block, so that q is
   // still in scope inside the finally block
   DatabaseQuery q = d.getQueryObject();
   try
   {
      // do something here...
   }
   finally
   {
      // release resource explicitly here
      q.releaseResources();
   }
}

Which method is "better"? Both have their advantages. The C++ way completely removes cleanup code from the flow of control, arguably making code easier to read. The "finally" way makes the disposal of resources more explicit, and so arguably easier to understand. But ultimately, saying that one method is "better" is not very meaningful, since the different methods arise largely from whether the languages are garbage-collected or not! The most important thing is to be aware of the right way to do things in the language you are using.

To summarise:

  • In C++, release resources in destructors
  • In Java, use the finally keyword
  • In Python, use the finally keyword
  • In C#, use the finally or using keywords
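
To make the first point of the summary concrete, here is a small C++ sketch in which the "resource" is simply a counter of open handles. The class Handle and the function riskyOperation are hypothetical; a real resource class would close a file, unlock a mutex or roll back a transaction in its destructor instead.

```cpp
#include <stdexcept>

// Hypothetical resource: increments a count when acquired and
// decrements it again when released by the destructor.
class Handle
{
public:
   explicit Handle(int& openCount) : openCount_(openCount) { ++openCount_; }
   ~Handle() { --openCount_; }

   // Non-copyable, so that each acquisition is released exactly once.
   Handle(const Handle&) = delete;
   Handle& operator=(const Handle&) = delete;

private:
   int& openCount_;
};

void riskyOperation(int& openCount, bool fail)
{
   Handle h(openCount);   // resource acquired here
   if (fail)
   {
      throw std::runtime_error("operation failed");
   }
   // normal work goes here...
}   // h is destroyed here on both the normal and the exceptional path
```

Whether riskyOperation returns normally or throws, the count of open handles is back to its previous value afterwards, with no cleanup code in the function body.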

Preserve Class Invariants

Class invariants should not be violated in the presence of exceptions. Don't leave an object in an invalid state if an exception is thrown - make sure that things are at least self-consistent. In the terminology of the C++ standard library, this is called the "basic" or "weak" guarantee.

An example of code failing to meet this guideline would be a container class that became unusable, or even crashed, if an attempt to add objects to it ever failed.
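
Conversely, a class that does meet the guideline might look like the following sketch. The Ledger class is hypothetical; its invariant is that the stored total always equals the sum of the stored amounts.

```cpp
#include <numeric>
#include <stdexcept>
#include <vector>

// Invariant: total_ always equals the sum of amounts_.
class Ledger
{
public:
   void add(int amount)
   {
      if (amount < 0)
      {
         throw std::invalid_argument("amount must not be negative");
      }
      amounts_.push_back(amount);   // if this throws, nothing has changed yet
      total_ += amount;             // cannot throw
   }

   // Basic guarantee only: if one amount is rejected, the earlier ones
   // stay added, but the invariant still holds.
   void addAll(const std::vector<int>& amounts)
   {
      for (int amount : amounts)
      {
         add(amount);
      }
   }

   int total() const { return total_; }

   bool invariantHolds() const
   {
      return total_ == std::accumulate(amounts_.begin(), amounts_.end(), 0);
   }

private:
   std::vector<int> amounts_;
   int total_ = 0;
};
```

After a failed addAll the ledger may contain some of the new amounts, so the operation is not transactional, but the object remains consistent and usable.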

Provide Transactions Where Needed

Sometimes, if an operation fails (e.g. one of a series of linked database updates), then a whole series of changes must be undone. So, where applicable (e.g. for databases), use a commit-or-rollback model in the presence of errors. In the terminology of the C++ standard library, this is called the "strong" guarantee.

One way to do this is to work on a copy of the relevant data, and to swap it for the "real data" only once all the modifications have been carried out successfully.
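
The copy-then-swap approach can be sketched in C++ as below. The Portfolio class and its applyAll method are hypothetical; the commit step relies on std::vector::swap, which does not throw for the default allocator.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

class Portfolio
{
public:
   // Strong guarantee: either every amount is applied, or none is.
   void applyAll(const std::vector<int>& amounts)
   {
      std::vector<int> working = holdings_;   // work on a copy
      for (int amount : amounts)
      {
         if (amount < 0)
         {
            throw std::invalid_argument("amount must not be negative");
         }
         working.push_back(amount);
      }
      holdings_.swap(working);   // commit: the real data changes only here
   }

   std::size_t size() const { return holdings_.size(); }

private:
   std::vector<int> holdings_;
};
```

If anything throws before the swap, only the working copy is discarded and the real data is exactly as it was. The price is the extra copy, which is the kind of cost to weigh before providing such semantics.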

Don't Provide Transactions Where They Aren't Needed

"Transactional semantics", the idea that an operation either fully succeeds, or fully rolls back, is vital where the code will have a permanent side-effect, such as updating a database. But a lot of business application code does reporting, or otherwise processes pre-existing information, rather than doing anything that has a permanent side-effect. For such code, transactions are not required. Instead, in the event of a failure, the resources used in the operation can be released, the relevant error logged and/or shown to the user, and the next job started.

It is not a good idea to provide transactions where they are not needed. Transactions require extra work, because the information necessary to reverse out partially-completed operations must be preserved until the whole transaction has been committed. This extra work is wasted if transactions are not actually required. So if a body of operations does not create persistent side-effects during its execution, don't provide transactional semantics.

And Finally

Follow The Debates On Current Best Practice

Don't be swayed by what the person sitting next to you is doing for exception handling in a given language. How best to handle errors, and exceptions in particular, is still actively debated, and "good practice" changes as people's understanding improves. For instance, people only started getting a reasonable understanding of the interaction between C++ templates and exceptions from 1998 or so. So find out what people are saying now, not what they said 10 years ago.