Create error-handling.md #14
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
# Error Handling | ||
|
||
## What is error handling? | ||
|
||
Applications have a expected flow, and due to external circumstances sometimes will deviate from this flow. This deviation is considered an error and has to be handled accordingly. We dive into this handling later in this spec. | ||
|
||
## What types of errors are there? | ||
|
||
We can encounter errors for a variety of reasons, for example: | ||
|
||
1. Unexpected input - From a user typing invalid information into an input to an unexpected file being uploaded, there are countless ways a system can be used other than as intended. | |
2. I/O or access failures - failure to send/receive data to/from network, device, etc. (Errors: out of disk space, memory, internet connection issues). Your system probably expects to access some sort of external system, and any unexpected responses should be handled. | ||
3. Inconsitent state - Your system may also make assumptions about the "state" of things. Maybe the internal machine state such as a directory structure has been changed. Maybe a user's session is missing some expected data. | |
Review comment: s/inconsitent/inconsistent/ |
||
|
||
These errors might be classified under the following categories: | |
|
||
- Expected Errors - These are errors you explicitly check for and throw. For example, maybe you want to error and stop a flow if a user enters an invalid email. You'd explicitly validate the email, and if it was invalid you'd consider it an error. | |
- Unexpected Errors - These may come from anywhere, but won't happen if systems are working correctly. Examples of these could be the inconsitent state or I/O failures mentioned previously. | ||
Review comment: s/inconsitent/inconsistent/ |
||
|
||
## How do you handle/respond to errors? | ||
### 1. Before Errors Happen | ||
Errors are inevitable. However, we strive to preempt errors with the following practices: | ||
|
||
- [testing](https://github.com/Clever/dev-handbook/blob/testing/testing.md) - You should ideally test all parts of your system, so by definition you will test your error handling code. This includes writing tests for external I/O failures. This allows you to know how your system handles different error types. | ||
- [code review](https://github.com/Clever/dev-handbook/blob/master/git-workflow.md) - During code review is another time to reflect on how an application will handle and react to different errors and conditions. | ||
Review comment: in the spirit of preemptive error handling, might want to preemptively link to the file that will resolve #9 |
||
|
||
### 2. General Approaches to Error Handling | ||
Review comment: the items in this section might benefit from examples |
||
Error handling is different on an language by language and application by application basis. In general there are several things to keep in mind when designing error handling. | ||
Review comment: 'an' should be 'a', but this sentence might read better as "Error handling strategies differ from language to language and application to application" |
||
|
||
- Error Bubbling | ||
Often systems will depend on other services or modules, and in turn be depended on. Systems should expose an interface and make it clear when errors occur. This allows systems with enough context to handle the error to handle it correctly. | |
- DRY error handling | ||
Review comment: assumes the reader knows what DRY means. is that an assumption we want to make? |
||
On that note you should strive to error errors in as much of a central location as possible. This not only reduces code duplication, but it also makes it very clear in any context how to handle or throw errors, since they all bubble up to one place. | ||
Review comment: what does it mean to "error errors"? |
||
- When to fail hard vs soft | ||
Some errors can be swallowed and the execution of the program can continue. For example when loading data into a database the program can probably report invalid records and continue running. | ||
|
||
However, some errors justify halting the program flow. For example, if you lose a connection to the DB while loading data into it, that most likely justifies a hard failure. | |
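
The bubbling and fail hard vs soft points above can be sketched as follows. This is only an illustration in Python, not code from the handbook: the exception classes, `validate`, `load_records`, and the "email" rule are all hypothetical names chosen for the example. Invalid records are reported and skipped (soft failure), while a lost database connection is left uncaught so it bubbles up to a caller with enough context to handle it (hard failure).

```python
class InvalidRecordError(Exception):
    """Expected error: a single record failed validation."""


class DatabaseConnectionError(Exception):
    """Unexpected error: the database can no longer be reached."""


def validate(record):
    # Hypothetical validation rule: every record needs a non-empty "email".
    if not record.get("email"):
        raise InvalidRecordError(f"record {record.get('id')} has no email")
    return record


def load_records(records, insert):
    """Insert valid records, report invalid ones, let hard failures bubble up."""
    loaded, rejected = 0, []
    for record in records:
        try:
            insert(validate(record))
            loaded += 1
        except InvalidRecordError as err:
            # Fail soft: note the bad record and keep going.
            rejected.append(str(err))
        # DatabaseConnectionError is deliberately not caught here. It bubbles
        # up to a caller that can decide whether to retry or abort.
    return loaded, rejected
```

Here `insert` is any callable that writes one record; a caller would typically wrap `load_records` in a single handler that logs the `DatabaseConnectionError` and stops the job.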
|
||
### 3. Reporting Errors | ||
Your program creates errors elegantly - so what? You'd better have a way to view those errors and respond to them. Outlets for reviewing errors include: | ||
|
||
1. Logs - TBD | ||
2. Notifications - Often you want to be notified when unpexted errors occur. We have email and Hipchat hooks that allow us to get notifications on system failures. Clever uses [Sentry](https://getsentry.com) to submit system error details. This allows you to keep tabs on which errors are occurring and which need to be responded to. | |
Review comment: s/unpexted/unexpected/ does your editor support spell check? |
Review comment: Could this section benefit from a discussion of how to balance signal to noise when using aggressive error notifications? |
||
3. Rate monitoring - Some errors are permissible at a certain rate. For example, reading from external web services will sometimes be faulty. In other cases, an error may have been happening but not require repair (thus, the only signal is when a previously stable system transitions to a failing state, whereas constant old errors are noise). By monitoring the rate of occurences of these errors over time you can detect sudden changes which indicate a bigger problem. | |
Review comment: s/occurences/occurrences/ |
||
4. Public Awareness - Sometimes errors can immediately affect Clever's users, so we have a [Status Page](https://status.clever.com) that pulls from internal error metrics to share downtime and stability stats. | |
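
As a rough illustration of the rate monitoring item above, the sketch below counts errors in a sliding time window and flags when the count exceeds a threshold. It is an assumption-laden example, not how Clever's monitoring works: `ErrorRateMonitor`, `window_seconds`, and `threshold` are hypothetical names, and a real setup would more likely rely on an existing metrics system.

```python
from collections import deque


class ErrorRateMonitor:
    """Count errors in a sliding time window and flag sudden increases."""

    def __init__(self, window_seconds=60.0, threshold=5):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps = deque()

    def record(self, now):
        """Record one error at time `now` (in seconds).

        Returns True when more than `threshold` errors fall inside the window.
        """
        self._timestamps.append(now)
        # Drop errors that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold
```

Passing the timestamp in explicitly (instead of calling a clock inside `record`) keeps the sketch easy to test; a caller could pass `time.monotonic()`.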
|
||
## Language Specifics | ||
|
||
Below we've collected a few patterns for error handling Clever uses across our codebases. | ||
Review comment: what about error codes (numeric and textual)? that seems like a common pattern we use |
||
|
||
Across different programming languages, there are a variety of approaches to handle errors. I focus on the paradigms we've adopted at Clever based on our infrastructure (AWS, Heroku) and coding languages (CoffeeScript, Python, Go). | ||
Review comment: this line leaks first person singular |
||
|
||
1. CoffeeScript/Javascript | ||
- `Error` You can subclass the builtin Error class and throw different types of errors depending on the type of failure. You can then detect where a failure came from in a central error handler. Careful doing this though, there are [gotchas](https://github.com/Clever/clever-api/pull/155). | ||
Review comment: this is a private link, so we should probably replace this with a link to a public version of this comment. maybe a blog post? |
||
- Async Methods | ||
We follow node callback style in most places, including the frontend or other places where this style has traditionally not been as prevalent. | ||
|
||
Async functions are called with arguemtns as normal, with the last argument being a callback. | ||
Review comment: s/arguemtns/arguments/ (this misspelling has a Levenshtein distance of 3, which is actually somewhat impressive) |
||
|
||
The callback has the signature `(err, data)` where data is equivlent to the return value of the function, and err is either an error or null. | ||
Review comment: s/equivlent/equivalent/ |
||
- Sync Methods | ||
Non async functions should throw errors that can be caught by the caller. | ||
Review comment: errors should be instances of Error. should functions include their name in the error message? or should that be covered by the stack trace? |
||
2. Python | ||
TBD | ||
- use `with` to explicitly declare resources being accessed (Redis reservations, Selenium driver instances) | |
- `try/except/finally`. use `finally` to clean up on failed execution | |
3. Go | ||
TBD? | ||
- `panic` to output the stacktrace and stop | ||
||
- http://blog.golang.org/error-handling-and-go | ||
- Use error types | ||
- log.fatal to output the error and stop | ||
Review comment: log.fatal is basically panic, so I'd remove this |
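
For the Python notes in the diff above (`with` to declare resources, `finally` to clean up), a minimal sketch might look like this. It is not the Redis reservation or Selenium code the diff alludes to: `reservation` and `pool` are hypothetical stand-ins, with a plain `set` playing the role of the shared resource.

```python
from contextlib import contextmanager


@contextmanager
def reservation(pool, name):
    """Hold `name` in `pool` for the duration of the block, releasing it on exit."""
    pool.add(name)
    try:
        yield name
    finally:
        # Runs whether the block finished normally or raised.
        pool.discard(name)
```

A caller writes `with reservation(pool, "worker-1"):` around the work, and the reservation is released even if the body raises.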
Review comment: s/a expected/an expected/
Review comment: this isn't really a spec, maybe "guide"?
Review comment: This introduction makes it seem like errors are out of our control. I might consider removing this section and leading off with the definition of "error" i.e. the "what types of errors are there" section.