Updated: Apr 12
The main reason I like this approach is that it’s more declarative and explicit: once you get used to the syntax and become acquainted with the various iteration methods, just seeing which method is being used clearly indicates what type of operation is being performed. For example, when I see filter I immediately know that the purpose is to retain only the items that match specific criteria.
Likewise, when I see map I know that the purpose is to apply a transformation to the values of items. The end result is that the code becomes both more succinct and more readable, especially when used with the new ECMAScript arrow syntax, as shown above.
Last but not least, these methods can be chained together, so that complex operations on a collection can be constructed from a sequence of simple ones, each one expressed as an operation on a single item:
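As a rough illustration of such a chain, here is a minimal sketch. The numbers array, the even-number test, and the squaring step are assumptions chosen for illustration:

```javascript
// Hypothetical input data, chosen for illustration only.
const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

// Each link in the chain is a simple operation on a single item.
const firstThreeEvenSquares = numbers
  .filter(n => n % 2 === 0) // keep only the even values
  .map(n => n * n)          // square each remaining value
  .slice(0, 3);             // keep the first three results

console.log(firstThreeEvenSquares); // [4, 16, 36]
```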
The built-in iteration methods are only available on arrays. This means that if you want to apply them to any other type of collection, you must first transform it into an array.
They are eager rather than lazy. In the example above, the entire array is filtered into a temporary array, and after that the entire resulting array is transformed into yet another temporary array, even though only the first three items are required. This can result in excessive CPU and memory use.
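Both limitations are easy to observe. The sketch below uses hypothetical data and counters (none of these names come from the built-in API) to show that a Set has to be converted to an array first, and that every item is visited even when only three results are kept:

```javascript
// Limitation 1: a Set has no filter or map, so it must be copied into an array.
const uniqueValues = new Set([1, 2, 2, 3]);
const doubled = [...uniqueValues].map(n => n * 2); // [2, 4, 6]

// Limitation 2: the built-in methods are eager. The counters are illustrative.
const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let filterCalls = 0;
let mapCalls = 0;

const result = numbers
  .filter(n => { filterCalls++; return n % 2 === 0; })
  .map(n => { mapCalls++; return n * n; })
  .slice(0, 3);

// filter tested all 10 items and map transformed all 5 even values,
// although only 3 results were kept.
console.log(result, filterCalls, mapCalls); // [4, 16, 36] 10 5
```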
Lodash To The Rescue?
Fortunately, there are third-party libraries out there that can alleviate these limitations. A great example is the highly popular lodash library, which provides iteration methods that can operate on more generalized collections, and not just on arrays. In addition, lodash has a chaining feature, which is lazy rather than eager, meaning it evaluates and allocates space only for items that it actually needs.
In the following example, since slice only takes the first three items provided by map, the map method will only actually apply the transformation to up to three items, regardless of the total size and content of the original array:
Also, no arrays are generated by any of the iteration methods in the chain. Instead, the terminating call to value instructs lodash to actually trigger the lazy computation and provide the resulting array.
But, unfortunately, lodash has its own limitations and downsides. In particular, lodash only supports collections which are either arrays or JSON property bags. If you implement your own custom collection, for example, it won’t work with lodash. Even the built-in Map and Set collections aren’t supported by lodash. Instead, they must be converted into arrays before lodash methods can be applied to them.
Another limitation is that tree-shaking isn’t really compatible with lodash chains. This is because lodash uses the dot operator to construct the chains, and so each link in the chain emits an object which references all the chainable iteration methods. These references prevent bundlers, like webpack, from identifying which methods are actually being used and excluding the rest. (There are ways to overcome this, but they are convoluted and cumbersome to use, which is why most developers don’t use them.)
Are we stuck then, with a limited solution? Fortunately not!
Better Iteration With Iterators
All map does is take an input argument, which is either an iterator or a collection that implements iterators, and a second argument, which is the operator to apply to each item. Since it’s a generator, when map is invoked it doesn’t actually compute anything, or return an intermediate collection. Instead it creates an output iterator, which directly provides the result of each transformation only when required, as specified by the argument to yield.
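A generator-based map matching this description could be sketched as follows. The parameter names and the call counter are assumptions made for illustration:

```javascript
// A lazy map: nothing runs until a consumer asks for the next value.
function* map(iterable, operator) {
  for (const item of iterable) {
    yield operator(item); // transform one item, only when it is requested
  }
}

// Illustrative usage with a counter that shows when the operator runs.
let calls = 0;
const doubled = map([1, 2, 3], n => { calls++; return n * 2; });

const callsBeforeIteration = calls; // still 0: invoking map computed nothing
const values = Array.from(doubled); // pulling the values triggers the work

console.log(callsBeforeIteration, values, calls); // 0 [2, 4, 6] 3
```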
Similarly, here is the implementation of filter:
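A minimal sketch of such a filter, written in the same generator style (the parameter names are assumptions):

```javascript
// A lazy filter: yields only the items for which the predicate is truthy.
function* filter(iterable, predicate) {
  for (const item of iterable) {
    if (predicate(item)) {
      yield item;
    }
  }
}

// Because it only relies on iteration, it also accepts a Set directly.
const evens = Array.from(filter(new Set([1, 2, 3, 4]), n => n % 2 === 0));

console.log(evens); // [2, 4]
```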
Now we can finally use these functions together to reproduce the lodash example from before:
Here filter is applied directly to the numbers array, and provides an iterator, which is passed to map. Similarly, map provides an iterator which is passed to slice. And slice also provides an iterator, which is passed as an argument to the built-in Array.from function. It turns out that Array.from can use the iterator to construct an array. As a result, no intermediate collections are generated, and only the values needed are actually computed. (You can also use the spread operator instead of Array.from, but I wanted to construct this expression from a sequence of function calls.)
Yet there are still a couple of problems with this expression. First and foremost, we haven’t yet defined the slice function. Can you come up with an implementation for it yourself? Try giving yourself a couple of minutes to do so. Are you done? Hopefully the implementation you came up with is similar to the following:
Please be aware that this is a simplistic implementation, intended to show how such a function can be defined. Note the use of break to exit the loop immediately when index is greater than or equal to finish. This ensures that values that aren’t needed aren’t pulled from the input iterator. As a result, the preceding functions in the sequence, filter and map in this example, won’t compute unneeded values. This is how we get lazy evaluation in this implementation.
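One way to write a slice with this behavior, together with the nested expression discussed earlier, is sketched below. The names index and finish come from the description above. The start parameter, the sample data, and the counters are assumptions:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}

function* map(iterable, operator) {
  for (const item of iterable) yield operator(item);
}

// Simplistic slice: yields the items at positions start .. finish - 1.
function* slice(iterable, start, finish) {
  if (finish <= start) return; // empty range: pull nothing from the input
  let index = 0;
  for (const item of iterable) {
    if (index >= start) yield item;
    index++;
    if (index >= finish) break; // stop before pulling an unneeded value
  }
}

// Hypothetical data and counters that make the laziness visible.
const numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
let filterCalls = 0;
let mapCalls = 0;

const result = Array.from(
  slice(
    map(
      filter(numbers, n => { filterCalls++; return n % 2 === 0; }),
      n => { mapCalls++; return n * n; }
    ),
    0,
    3
  )
);

// Only the items up to the third even number were tested.
console.log(result, filterCalls, mapCalls); // [4, 16, 36] 6 3
```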
And because each iteration function is wholly independent of the others, there won’t be any problem tree-shaking this code, unlike lodash chaining.
Laying The Pipeline
Another big problem with this implementation is that it’s difficult to read and follow, unlike the lodash example. This is because when reading from left to right, the functions appear in the reverse of the order in which they are applied. Moreover, all the parentheses and the placement of the arguments make this code confusing and verbose.
Fortunately, a much better syntax for this type of expression is being introduced by ECMAScript, known as the pipeline operator. Although the syntax for this proposed operator hasn’t fully settled yet, and it’s still at an early stage of the acceptance process, you can already try it out using Babel. With the pipeline operator, the result of each function is passed as an argument to the next function in the expression. Here is the same example above written using the pipeline operator:
The # symbol is a placeholder, used to indicate where the result of the previous expression in the pipeline sequence should be used when the next function takes more than one argument. In this case, # is either the original array (passed to filter), or the iterator returned from the previous step.
While this syntax certainly requires some getting used to, it definitely makes the code much easier to read and follow, and the operations are provided in the order in which they are used. And so we have the solution we wanted all along.
Note that Object.entries will actually generate an intermediate array. It’s possible to create an entries function which provides an iterator instead. I leave this as an exercise, or you can cheat and look at the CodePen that I have prepared with this and additional functions.
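For readers who want a starting point, here is one possible sketch of such an entries function. It is an assumption of what it could look like, not the version from the CodePen:

```javascript
// A lazy alternative to Object.entries: yields [key, value] pairs one at a
// time instead of building an array of all of them up front.
function* entries(obj) {
  for (const key in obj) {
    // for...in also visits inherited enumerable keys, so keep own keys only.
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      yield [key, obj[key]];
    }
  }
}

// Illustrative usage with a hypothetical object.
const pairs = Array.from(entries({ a: 1, b: 2 }));

console.log(pairs); // [['a', 1], ['b', 2]]
```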
Needle In A Haystack
One obvious way to implement find functionality using the functions that we already have is using filter, for example:
Normally, using filter to implement find is a bad idea because it tests every item in the array, even if a matching item has already been found. However, because this library uses lazy evaluation to only compute needed values, the use of slice prevents this from happening. In this code segment, slice indicates that only the value of the first matching item is required, and so, once an item is found, no additional items are tested by filter.
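The behavior described here can be checked with a small sketch. The filter and slice functions follow the generator style used throughout this post, and the data and the counter are assumptions:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}

function* slice(iterable, start, finish) {
  if (finish <= start) return;
  let index = 0;
  for (const item of iterable) {
    if (index >= start) yield item;
    index++;
    if (index >= finish) break;
  }
}

// Hypothetical data, with a counter for how many items the predicate tests.
const numbers = [1, 3, 4, 6, 8];
let tested = 0;
const isEven = n => { tested++; return n % 2 === 0; };

// "Find" the first even number: take a one-item slice of the filtered values.
const firstEven = Array.from(slice(filter(numbers, isEven), 0, 1))[0];

// 6 and 8 are never tested, because slice stops after the first match.
console.log(firstEven, tested); // 4 3
```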
But this code is overly verbose, and creates an extra one-item array that contains the value of the matching item. Let’s improve on this by implementing a simple head function:
This function returns the value of the first item in a collection, or undefined if the collection is empty. Now we can achieve the same result as the previous example using the much simpler:
In this case, it’s the head function that prevents filter from processing any item after the first matching item is found. And, since it directly returns the value of the first item, no intermediate array is required.
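A head function with this behavior could be sketched like this. The implementation is an assumption, as are the sample data and the counter:

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}

// Returns the first item of any iterable, or undefined if it is empty.
function head(iterable) {
  for (const item of iterable) {
    return item; // returning from for...of also closes the source iterator
  }
  return undefined;
}

// Illustrative usage.
const numbers = [1, 3, 4, 6, 8];
let tested = 0;
const firstEven = head(filter(numbers, n => { tested++; return n % 2 === 0; }));

console.log(firstEven, tested); // 4 3
console.log(head([]));          // undefined
```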
We can also package this code into a find function, and use it as follows:
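One way to package it, assuming the generator-style filter and a head function like the ones described in this post (the sample data is hypothetical):

```javascript
function* filter(iterable, predicate) {
  for (const item of iterable) if (predicate(item)) yield item;
}

function head(iterable) {
  for (const item of iterable) return item;
  return undefined;
}

// find is just the composition of the two existing building blocks.
function find(iterable, predicate) {
  return head(filter(iterable, predicate));
}

// Illustrative usage: works on arrays and on any other iterable, such as a Set.
const firstLarge = find([5, 12, 8, 130, 44], n => n > 10);
const firstEvenInSet = find(new Set([1, 3, 4, 6]), n => n % 2 === 0);
const missing = find([1, 3, 5], n => n % 2 === 0);

console.log(firstLarge, firstEvenInSet, missing); // 12 4 undefined
```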
This is a great example of how easily extensible such a library is.
In the interim, you can use this CodePen to try out this functionality already. It contains a more complete implementation of the library I discussed in this post, with additional functions such as reduce and flat, and is still very small, at around 100 lines of unminified code! Feel free to play with this library, and use it as you see fit. And if it inspires you to create a library of your own then more power to you.
This post was written by Dan Shappir