StrongLoop | Managing Node.js Callback Hell with Promises, Generators and Other Approaches
We know it most endearingly as “callback hell” or the “pyramid of doom”:
Callback hell is subjective, as heavily nested code can be perfectly fine sometimes. Asynchronous code is hellish when it becomes overly complex to manage the flow. A good question to see how much “hell” you are in is: how much refactoring pain would I endure if doAsync2 happened before doAsync1? The goal isn’t about removing levels of indentation but rather writing modular (and testable!) code that is easy to reason about and resilient.
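For reference, the shape in question looks roughly like the sketch below. The doAsync1 and doAsync2 names come from the question above; doAsync3, the makeStep helper and the timer-based bodies are hypothetical stand-ins so the snippet can run on its own.

```javascript
// Hypothetical async step: appends its label to `trail` and calls back on a later tick.
function makeStep(label) {
  return function (trail, cb) {
    setImmediate(function () {
      cb(null, trail.concat(label));
    });
  };
}

const doAsync1 = makeStep('one');
const doAsync2 = makeStep('two');
const doAsync3 = makeStep('three');

// The "pyramid": each step lives inside the callback of the step before it.
function runAll(cb) {
  doAsync1([], function (err, a) {
    if (err) return cb(err);
    doAsync2(a, function (err, b) {
      if (err) return cb(err);
      doAsync3(b, function (err, c) {
        if (err) return cb(err);
        cb(null, c);
      });
    });
  });
}
```

Swapping doAsync1 and doAsync2 here means moving code between nesting levels and re-threading a and b by hand, which is the refactoring pain the question is getting at.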
In this article, we will write a module using a number of tools and libraries to show how control flow can work. We’ll even look at an up-and-coming solution made possible by the next version of Node.
The problem
Let’s say we want to write a module that finds the largest file within a directory.
Let’s break down the steps to accomplish this:
- Read the files in the provided directory
- Get the stats on each file in the directory
- Determine which is largest (pick one if multiple have the same size)
- Callback with the name of the largest file
If an error occurs at any point, callback with that error instead. We also should never call the callback more than once.
A nested approach
The first approach is nested, not horribly, but the logic reads inward.
- Read all the files inside the directory
- Get the stats on each file. This is done in parallel, so we use a counter to track when all the I/O has finished. We also use an errored boolean to prevent the provided callback (cb) from being called more than once if an error occurs.
- Collect the stats for each file. Notice we are setting up a parallel array here (files to stats).
- Check to see if all parallel operations have completed
- Only grab regular files (not links or directories, etc)
- Reduce the list to the largest file
- Pull the filename associated with the stat and callback
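The steps above can be sketched as follows. This is a reconstruction from the description, not necessarily the original listing; the findLargest name and the choice to call back with undefined when there are no regular files are assumptions.

```javascript
const fs = require('fs');
const path = require('path');

// Nested version: calls back with the name of the largest regular file in `dir`.
function findLargest(dir, cb) {
  fs.readdir(dir, function (err, files) {
    if (err) return cb(err);
    // Assumption: an empty directory yields no name rather than an error.
    if (files.length === 0) return cb(null, undefined);

    let counter = files.length; // outstanding fs.stat calls
    let errored = false;        // guards against calling cb twice
    const stats = [];           // parallel array: stats[i] belongs to files[i]

    files.forEach(function (file, index) {
      fs.stat(path.join(dir, file), function (err, stat) {
        if (errored) return;
        if (err) {
          errored = true;
          return cb(err);
        }
        stats[index] = stat;

        if (--counter === 0) {
          // All stats are in: keep regular files only, then reduce to the largest.
          const regular = stats.filter(function (s) { return s.isFile(); });
          if (regular.length === 0) return cb(null, undefined); // assumption, as above
          const largest = regular.reduce(function (prev, next) {
            return prev.size > next.size ? prev : next;
          });
          cb(null, files[stats.indexOf(largest)]);
        }
      });
    });
  });
}
```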
This may be a perfectly fine approach to solving this problem. However, it’s tricky to manage the parallel operation and ensure we only call back once. We’ll look at managing that a little later, but let’s first look at breaking this into smaller modular chunks.
A modular approach
Our nested approach can be broken out into three modular units:
- Grabbing the files from a directory
- Grabbing the stats for those files
- Processing the stats and files to determine the largest
Since the first task is essentially just fs.readdir(), we won’t write a function for that. However, let’s write a function that, given a set of paths, will return all the stats for those paths while maintaining the ordering:
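One way such a function might look is sketched here. The getStats name and the empty-input behavior are assumptions; the counter and errored flag are the same bookkeeping used in the nested approach.

```javascript
const fs = require('fs');

// Calls back with an array of fs.Stats objects in the same order as `paths`.
function getStats(paths, cb) {
  // Assumption: no paths means an empty result, delivered asynchronously.
  if (paths.length === 0) return process.nextTick(cb, null, []);

  let counter = paths.length;
  let errored = false;
  const stats = [];

  paths.forEach(function (p, index) {
    fs.stat(p, function (err, stat) {
      if (errored) return;
      if (err) {
        errored = true;
        return cb(err);
      }
      stats[index] = stat; // index, not push, so ordering is preserved
      if (--counter === 0) cb(null, stats);
    });
  });
}
```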
Now, we need a processing function that compares the stats and files and returns the largest filename:
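A minimal sketch of that processing step follows. The getLargestFile name is an assumption, as is returning undefined when no regular file is present. Because it is synchronous and only needs objects with size and isFile(), it is easy to test with fakes.

```javascript
// Returns the name of the largest regular file, or undefined if there is none.
// `files` and `stats` are parallel arrays: stats[i] describes files[i].
function getLargestFile(files, stats) {
  let best = -1;
  stats.forEach(function (stat, index) {
    if (!stat.isFile()) return; // skip directories, links, etc.
    // Strict ">" keeps the first file when sizes tie.
    if (best === -1 || stat.size > stats[best].size) best = index;
  });
  return best === -1 ? undefined : files[best];
}
```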
Let’s tie the whole thing together:
- Generate a list of paths from the files and directory
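Put together, the module might look something like this sketch. The two helpers are repeated in compact form only so the snippet runs on its own; all three function names are assumptions.

```javascript
const fs = require('fs');
const path = require('path');

// Stats for each path, order preserved (compact version of the helper described above).
function getStats(paths, cb) {
  if (paths.length === 0) return process.nextTick(cb, null, []);
  let counter = paths.length;
  let errored = false;
  const stats = [];
  paths.forEach(function (p, index) {
    fs.stat(p, function (err, stat) {
      if (errored) return;
      if (err) { errored = true; return cb(err); }
      stats[index] = stat;
      if (--counter === 0) cb(null, stats);
    });
  });
}

// Name of the largest regular file, or undefined if there is none (assumption).
function getLargestFile(files, stats) {
  let best = -1;
  stats.forEach(function (stat, index) {
    if (stat.isFile() && (best === -1 || stat.size > stats[best].size)) best = index;
  });
  return best === -1 ? undefined : files[best];
}

// The main entry point now reads top to bottom.
function findLargest(dir, cb) {
  fs.readdir(dir, function (err, files) {
    if (err) return cb(err);
    // Generate a list of paths from the files and directory.
    const paths = files.map(function (file) { return path.join(dir, file); });
    getStats(paths, function (err, stats) {
      if (err) return cb(err);
      cb(null, getLargestFile(files, stats));
    });
  });
}
```

In a real module, findLargest would be the function assigned to module.exports, with the helpers exported separately for testing.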
A modular approach makes reusing and testing methods easier. The main export is easier to reason about as well. However, we are still manually managing the parallel stat task. Let’s switch over to some control flow modules and see what we can do.
An async approach
The async module is widely popular and stays close to the Node core way of doing things. Let’s take a look at how we could write this using async:
- async.waterfall runs a list of functions in series, where the results of one function are passed to the next function in the list through the next callback.
- async.map lets us run fs.stat over a set of paths in parallel and calls back with an array (with order maintained) of the results.
- The cb function will be called either after the last step has completed or if any error has occurred along the way. It will only be called once.
The async module guarantees only one callback will be fired. It also propagates errors and manages parallelism for us.
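To make those semantics concrete without depending on the library, here is a sketch that uses tiny hand-rolled stand-ins. The local waterfall and map functions are simplified assumptions, not the async module's implementation; with the real library you would call async.waterfall and async.map with the same argument shapes.

```javascript
const fs = require('fs');
const path = require('path');

// Simplified stand-in for async.waterfall: series execution, results passed along.
function waterfall(tasks, done) {
  let index = 0;
  function next(err, ...results) {
    if (err || index === tasks.length) return done(err, ...results);
    tasks[index++](...results, next);
  }
  next(null);
}

// Simplified stand-in for async.map: parallel execution, ordered results, one callback.
function map(items, iterator, done) {
  if (items.length === 0) return process.nextTick(done, null, []);
  const results = [];
  let pending = items.length;
  let finished = false;
  items.forEach(function (item, i) {
    iterator(item, function (err, result) {
      if (finished) return;
      if (err) { finished = true; return done(err); }
      results[i] = result;
      if (--pending === 0) { finished = true; done(null, results); }
    });
  });
}

function findLargest(dir, cb) {
  waterfall([
    function (next) {
      fs.readdir(dir, next);
    },
    function (files, next) {
      const paths = files.map(function (file) { return path.join(dir, file); });
      map(paths, fs.stat, function (err, stats) {
        next(err, files, stats);
      });
    },
    function (files, stats, next) {
      let best = -1;
      stats.forEach(function (stat, i) {
        if (stat.isFile() && (best === -1 || stat.size > stats[best].size)) best = i;
      });
      next(null, files[best]); // undefined when there are no regular files (assumption)
    }
  ], cb);
}
```

Note that the counter and errored flag have moved out of our module and into the map helper, which is the point of reaching for a control flow library.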
A promises approach
Promises provide error handling and functional programming perks. How would we approach this problem using promises? For that, let’s utilize the Q module (although other promise libraries could be employed):
- Since Node core functionality isn’t promise-aware, we make it so.
- Q.all will run all the stat calls in parallel and the result array order is maintained.
- Since we want to pass files and stats to the next then function, it’s the last thing returned.
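Here is a sketch of the same flow. To keep it runnable with no dependencies, it swaps in native Promise, Promise.all and util.promisify where the article's version uses Q; the structure of the chain is the part that carries over. The findLargest name is an assumption.

```javascript
const fs = require('fs');
const path = require('path');
const { promisify } = require('util');

// Node core callbacks aren't promise-aware, so we make them so.
const readdir = promisify(fs.readdir);
const stat = promisify(fs.stat);

// Returns a promise for the name of the largest regular file in `dir`.
function findLargest(dir) {
  return readdir(dir)
    .then(function (files) {
      const promises = files.map(function (file) {
        return stat(path.join(dir, file));
      });
      // Promise.all runs the stat calls in parallel and keeps result order.
      return Promise.all(promises).then(function (stats) {
        return [files, stats]; // pass both on to the next step
      });
    })
    .then(function (pair) {
      const files = pair[0];
      const stats = pair[1];
      let best = -1;
      stats.forEach(function (s, i) {
        if (s.isFile() && (best === -1 || s.size > stats[best].size)) best = i;
      });
      return files[best]; // undefined when there are no regular files (assumption)
    });
}
```

A caller would consume it as findLargest(dir).then(onName, onError), and anything thrown inside either then step ends up in onError.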
Unlike the previous examples, any exceptions thrown inside the promise chain (i.e. then) are caught and handled. The client API changes as well to be promise centric:
Although designed this way above, you don’t have to expose a promise interface. Many promise libraries have a way to expose a nodeback style as well. With Q, we could do this using the nodeify function.
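The idea can be sketched by hand with native promises. The nodeify helper below is a hypothetical stand-in written for illustration, not Q's implementation, and largestName is a made-up promise-returning function standing in for a largest-file lookup.

```javascript
// Hypothetical helper: if a callback is given, route the promise's outcome to it
// nodeback-style; otherwise hand the promise back untouched.
function nodeify(promise, cb) {
  if (typeof cb !== 'function') return promise;
  promise.then(
    // setImmediate keeps errors thrown by cb out of the promise chain.
    function (value) { setImmediate(cb, null, value); },
    function (err) { setImmediate(cb, err); }
  );
}

// Made-up promise-returning function: resolves with the key that has the largest size.
function largestName(sizes) {
  return new Promise(function (resolve, reject) {
    const names = Object.keys(sizes);
    if (names.length === 0) return reject(new Error('no files'));
    resolve(names.reduce(function (a, b) {
      return sizes[a] >= sizes[b] ? a : b;
    }));
  });
}

// Dual interface: promise when no callback is passed, nodeback otherwise.
function largestNameDual(sizes, cb) {
  return nodeify(largestName(sizes), cb);
}
```

Callers can then choose either style: largestNameDual(sizes).then(...) or largestNameDual(sizes, function (err, name) { ... }).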
Promises are a much larger topic than this article covers. I would recommend reading more about them here.
A generators approach
As promised at the beginning of the article, there is a new kid on the block that is available to play with in Node >=0.11.2: generators!
Generators are lightweight co-routines for JavaScript. They allow a function to be suspended and resumed via the yield keyword. Generator functions have a special syntax: function* (). With this superpower, we can also suspend and resume asynchronous operations using constructs such as promises or “thunks” leading to “synchronous-looking” asynchronous code.
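In this context, a “thunk” is a function whose only argument is a callback. Rather than taking the callback directly, a thunk-aware function returns such a thunk for the caller to invoke later. The callback has the same signature as your typical nodeback function (i.e. error is the first argument). Read more here.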
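As a rough illustration, a nodeback-style function can be turned into a thunk-returning one with a few lines. The thunkify helper below is a minimal hand-written sketch for this article, not any particular library's implementation.

```javascript
const fs = require('fs');

// Wraps fn(arg1, ..., cb) so that calling it with the arguments returns a thunk:
// a function that only needs the callback.
function thunkify(fn) {
  return function (...args) {
    return function (cb) {
      fn(...args, cb);
    };
  };
}

const readdir = thunkify(fs.readdir);

// Usage: readdir(dir) starts no I/O on its own. The work begins only when the
// returned thunk is invoked with a nodeback: readdir(dir)(function (err, files) { ... })
```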
Let’s look at one example that enables generators for asynchronous control flow: the co module from TJ Holowaychuk. Here’s how to write our largest file program:
- Since Node core functionality isn’t “thunk”-aware, we make it so.
- co takes a generator function, which can be suspended at any time using the yield keyword.
- The generator function will suspend until readdir returns. The resulting value is assigned to the files variable.
- co can also handle arrays as a set of parallel operations to perform. A result array with order maintained is assigned to stats.
- The final result is returned.
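The steps above can be sketched without the library by using a small runner in its place. The run, parallel and thunkify helpers below are simplified, hypothetical stand-ins for what co and a thunkify module provide, and the real co handles many more cases.

```javascript
const fs = require('fs');
const path = require('path');

// Minimal thunkify: fn(args..., cb) becomes fn(args...)(cb).
function thunkify(fn) {
  return function (...args) {
    return function (cb) { fn(...args, cb); };
  };
}

// Turns an array of thunks into one thunk that runs them in parallel, order kept.
function parallel(thunks) {
  return function (cb) {
    if (thunks.length === 0) return cb(null, []);
    const results = [];
    let pending = thunks.length;
    let finished = false;
    thunks.forEach(function (thunk, i) {
      thunk(function (err, value) {
        if (finished) return;
        if (err) { finished = true; return cb(err); }
        results[i] = value;
        if (--pending === 0) { finished = true; cb(null, results); }
      });
    });
  };
}

// Simplified stand-in for co: resumes the generator each time a yielded thunk calls back.
function run(genFn, done) {
  const gen = genFn();
  function step(err, value) {
    let result;
    try {
      result = err ? gen.throw(err) : gen.next(value);
    } catch (e) {
      return done(e); // uncaught error inside the generator
    }
    if (result.done) return done(null, result.value);
    const thunk = Array.isArray(result.value) ? parallel(result.value) : result.value;
    if (typeof thunk !== 'function') return step(new TypeError('yield a thunk or an array of thunks'));
    thunk(step);
  }
  step();
}

// Node core isn't thunk-aware, so we make it so.
const readdir = thunkify(fs.readdir);
const stat = thunkify(fs.stat);

function findLargest(dir, cb) {
  run(function* () {
    const files = yield readdir(dir); // suspends until readdir calls back
    const stats = yield files.map(function (file) { return stat(path.join(dir, file)); });
    let best = -1;
    stats.forEach(function (s, i) {
      if (s.isFile() && (best === -1 || s.size > stats[best].size)) best = i;
    });
    return files[best]; // undefined when there are no regular files (assumption)
  }, cb);
}
```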
We can consume this generator function with the same callback API we specified at the beginning of this article. Co has some nice error handling as any errors (including exceptions raised) will be passed to the callback function. Generators also enable the use of try/catch blocks around yield statements which co takes advantage of:
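To show the try/catch mechanics in isolation, here is a sketch with a deliberately failing thunk. The run function is a stripped-down, hypothetical stand-in for co, and failingThunk and readWithFallback are made-up names for illustration.

```javascript
// Stripped-down runner: a failed thunk is thrown back into the generator at the yield.
function run(genFn, done) {
  const gen = genFn();
  function step(err, value) {
    let result;
    try {
      result = err ? gen.throw(err) : gen.next(value);
    } catch (e) {
      return done(e); // the generator did not handle the error
    }
    if (result.done) return done(null, result.value);
    result.value(step);
  }
  step();
}

// Hypothetical thunk that always fails on a later tick.
function failingThunk(cb) {
  setImmediate(cb, new Error('boom'));
}

function readWithFallback(cb) {
  run(function* () {
    try {
      return yield failingThunk;
    } catch (err) {
      // The asynchronous failure surfaces here like a synchronous exception.
      return 'fallback after: ' + err.message;
    }
  }, cb);
}
```

Without the try/catch, the same error would travel out of the generator and reach the callback as its first argument.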
Co has a lot of neat support for arrays, objects, nested generators, promises and more.
There are other generator modules rising up as well. The Q module has a neat Q.async method that behaves similarly to co using generators.
Wrapping up
In this article, we investigated a variety of different approaches to mitigating “callback hell”, that is, getting control over the flow of your application. I am personally most intrigued by the generator idea. I am curious how that will play out with new frameworks like koa.
Although we didn’t employ it while looking at the 3rd party modules, a modular approach can be applied to any flow libraries (async, promises, generators). Can you think of ways to make the examples more modular? Have a library or technique that has worked well for you? Share it in the comments!
Want to check out and play with all the code samples used in this article as well as another generator example? There is a GitHub repo set up for that!