12.2 Errors and crashes
Once our web applications go live, it’s likely that there will be some unforeseen errors. A few example of common errors that may occur in the course of your application’s daily operations, are listed below:
Database Errors: errors related to accessing the database server or stored data. The following are some database errors which you may encounter:
Connection Errors: indicates that a connection to the network database server could not be established, a supplied user name or password is incorrect, or that the database does not exist.
- Query Errors: the illegal or incorrect use of an SQL query can raise an error such as this. These types of errors can be avoided through rigorous testing.
- Data Errors: database constraint violation such as attempting to insert a field with a duplicate primary key. These types of errors can also be avoided through rigorous testing before deploying your application into a production environment.
Application Runtime Errors: These types of errors vary greatly, covering almost all error codes which may appear during runtime. Possible application errors are as follows:
File system and permission errors: when the application attempts to read a file which does not exist or does not have permission to read, or when it attempts to write to a file which it is not allowed to write to, errors of this category will occur. A file system error will also occur if an application reads a file with an unexpected format, for instance a configuration file that should be in the INI format but is instead structured as JSON.
Third-party application errors: These errors occur in applications which interface with other third-party applications or services. For instance, if an application publishes tweets after making calls to Twitter’s API, it’s obvious that Twitter’s services must be up and running in order for our application to complete its task. We must also ensure that we supply these third-party interfaces with the appropriate parameters in our calls, or else they will also return errors.
HTTP errors: These errors vary greatly, and are based on user requests. The most common is the 404 Not Found error, which arises when users attempt to access non-existent resources in your application. Another common HTTP error is the 401 Unauthorized error (authentication is required to access the requested resource), 403 Forbidden error (users are altogether refused access to this resource) and 503 Service Unavailable errors (indicative of an internal program error).
- Operating system errors: These sorts of errors occur at the operating system layer and can happen when operating system resources are over-allocated, leading to crashes and system instability. Another common occurrence at this level is when the operating system disk gets filled to capacity, making it impossible to write to. This naturally produces many errors.
- Network errors: network errors typically come in two flavors: one is when users issue requests to the application and the network disconnects, thus disrupting its processing and response phase. These errors do not cause the application to crash, but can affect user access to the website; the other is when applications attempt to read data from disconnected networks, causing read failures. Judicious testing is particularly important when making network calls to avoid such problems, which can cause your application to crash.
Error handling goals
Before implementing error handling, we must be clear about what goals we are trying to achieve. In general, error handling systems should accomplish the following:
- User error notifications: when system or user errors occur, causing current user requests to fail to complete, affected users should be notified of the problem. For example, for errors cause by user requests, we show a unified error page (404.html). When a system error occurs, we use a custom error page to provide feedback for users as to what happened - for instance, that the system is temporarily unavailable (error.html).
- Log errors: when system errors occur (in general, when functions return non-nil error variables), a logging system such as the one described earlier should be used to record the event into a log file. If it is a fatal error, the system administrator should also be notified via e-mail. In general however, most 404 errors do not warrant the sending of email notifications; recording the event into a log for later scrutiny is often adequate.
- Roll back the current request operation: If a user request causes a server error, then we need to be able to roll back the current operation. Let’s look at an example: a system saves a user-submitted form to its database, then submits this data to a third-party server. However, the third-party server disconnects and we are unable to establish a connection with it, which results in an error. In this case, the previously stored form data should be deleted from the database (void should be informed), and the application should inform the user of the system error.
- Ensure that the application can recover from errors: we know that it’s difficult for any program to guarantee 100% uptime, so we need to make provision for scenarios where our programs fail. For instance if our program crashes, we first need to log the error, notify the relevant parties involved, then immediately get the program up and running again. This way, our application can continue to provide services while a system administrator investigates and fixes the cause of the problem.
How to handle errors
In chapter 11, we addressed the process of error handling and design using some examples. Let’s go into these examples in a bit more detail, and see some other error handling scenarios:
- Notify the user of errors:
When an error occurs, we can present the user accessing the page with two kinds of errors pages: 404.html and error.html. Here is an example of what the source code of an error page might look like:
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Page Not Found
</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>
<div class="container">
<div class="row">
<div class="span10">
<div class="hero-unit">
<h1> 404! </h1>
<p>{{.ErrorInfo}}</p>
</div>
</div>
<!--/span-->
</div>
</div>
</body>
</html>
Another example:
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>system error page
</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>
<div class="container">
<div class="row">
<div class="span10">
<div class="hero-unit">
<h1> system is temporarily unavailable ! </h1>
<p>{{.ErrorInfo}}</p>
</div>
</div>
<!--/span-->
</div>
</div>
</body>
</html>
404 error-handling logic, in the occurrence of a system error:
func (p *MyMux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/" {
sayhelloName(w, r)
return
}
NotFound404(w, r)
return
}
func NotFound404(w http.ResponseWriter, r *http.Request) {
log.Error(" page not found") //error logging
t, _ = t.ParseFiles("tmpl/404.html", nil) //parse the template file
ErrorInfo := " File not found " //Get the current user information
t.Execute(w, ErrorInfo) //execute the template merger operation
}
func SystemError(w http.ResponseWriter, r *http.Request) {
log.Critical(" System Error") //system error triggered Critical, then logging will not only send a message
t, _ = t.ParseFiles("tmpl/error.html", nil) //parse the template file
ErrorInfo := " system is temporarily unavailable " //Get the current user information
t.Execute(w, ErrorInfo) //execute the template merger operation
}
How to handle exceptions
We know that many other languages have try... catch
keywords used to capture the unusual circumstances, but in fact, many errors can be expected to occur without the need for exception handling, and can be instead treated as an errors. It’s for this reason that Go functions return errors by design. For example, if a file is not found or if os.Open
returns an error, these functions will not panic; as another example, if a network connection gets disconnected during a data write operation, the net.Conn
family of Write
functions will return errors instead of panicking. These error states are to be expected in most applications and Go particularly makes it explicit when operations might fail by returning error variables. Looking at the example above, we can clearly see the errors that can be expected to occur.
There are, however, cases where panic
should be used. For instance in operations where failure is almost impossible, or in certain situations where there is no way to return an error and the operation cannot continue, panic
should be used. Take for example a program that tries to obtain the value of an array at x[j], but the index j is out of bounds. This part of the code will cause the program to panic, as will other critical, unexpected errors of this nature. By default, panicking will kill off the offending process (goroutine), allowing the code which dispatched the goroutine an opportunity to recover from the error. This way, the function in which the error occurred as well as all subsequent code after it will not continue to execute. Go’s panic
was deliberately designed with this behavior in mind, which is different than typical error handling; panic
is really just exception handling. In the example below, we expect that User[UID]
will return a username from the User
array, but the UID that we use is out of bounds and throws an exception. If we do not have a recovery mechanism to deal with this immediately, the process will be killed, and the panic will propagate up the stack until our program finally crashes. In order for our application to be robust and resilient to these kinds of runtime errors, we need to implement recovery mechanisms in certain places.
func GetUser(uid int) (username string) {
defer func() {
if x := recover(); x != nil {
username = ""
}
}()
username = User[uid]
return
}
The above describes the differences between errors and exceptions. So, when it comes down to developing our Go applications, when do we use one or the other? The rules are simple: if you define a function that you anticipate might fail, then return an error variable. When calling another package’s function, if it is implemented well, there should be no need to worry that it will panic unless a true exception has occurred (whether recovery logic has been implemented or not). Panic and recover should only be used internally inside packages to deal with special cases where the state of the program cannot be guaranteed, or when a programmer’s error has occurred. Externally facing APIs should explicitly return error values.
Summary
This section summarizes how web applications should handle various errors such as network, database and operating system errors, among others. We’ve outline several techniques to effectively deal with runtime errors such as: displaying user-friendly error notifications, rolling back actions, logging, and alerting system administrators. Finally, we explained how to correctly handle errors and exceptions. The concept of an error is often confused with that of an exception, however in Go, there is a clear distinction between the two. For this reason, we’ve discussed the principles of processing both errors and exceptions in web applications.
Links
- Previous section: Logs
- Next section: Deployment