The Neophyte's Guide to Scala Part 4: Pattern Matching Anonymous Functions

In the previous part of this series, I gave an overview of the various ways in which patterns can be used in Scala, concluding with a brief mention of anonymous functions as another place in which patterns can be put to use. In this post, we are going to take a detailed look at the possibilities opened up by being able to define anonymous functions in this way.

If you have participated in the Scala course at Coursera, or have coded in Scala for a while, you will likely have written anonymous functions on a regular basis. For example, given a list of song titles which you want to transform to lower case for your search index, you might want to define an anonymous function that you pass to the map method, like this:

    val songTitles = List("The White Hare", "Childe the Hunter", "Take no Rogues")
    songTitles.map(t => t.toLowerCase)

Or, if you like it even shorter, you will probably write the anonymous function like this, making use of Scala’s placeholder syntax:

    songTitles.map(_.toLowerCase)

So far so good. However, let’s see how this syntax performs for a slightly different example: We have a sequence of pairs, each representing a word and its frequency in some text. Our goal is to filter out those pairs whose frequency is below a lower threshold or above an upper one, and then return only the remaining words, without their respective frequencies. We need to write a function wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String].

Our initial solution makes use of the filter and map methods, passing anonymous functions to them using our familiar syntax:

    val wordFrequencies = ("habitual", 6) :: ("and", 56) :: ("consuetudinary", 2) ::
      ("additionally", 27) :: ("homely", 5) :: ("society", 13) :: Nil
    def wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String] =
      wordFrequencies.filter(wf => wf._2 > 3 && wf._2 < 25).map(_._1)
    wordsWithoutOutliers(wordFrequencies) // List("habitual", "homely", "society")

This solution has several problems. The first one is only an aesthetic one – accessing the fields of the tuple looks pretty ugly to me. If only we could destructure the pair, we could make this code a little more pleasant and probably also more readable.

Thankfully, Scala provides an alternative way of writing anonymous functions: A pattern matching anonymous function is an anonymous function that is defined as a block consisting of a sequence of cases, surrounded as usual by curly braces, but without a match keyword before the block. Let’s rewrite our function, making use of this notation:

    def wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String] =
      wordFrequencies.filter { case (_, f) => f > 3 && f < 25 } map { case (w, _) => w }

In this example, we have only used a single case in each of our anonymous functions, because we know that this case always matches – we are simply decomposing a data structure whose type we already know at compile time, so nothing can go wrong here. This is a very common way of using pattern matching anonymous functions.

If you try to assign these anonymous functions to values, you will see that they have the expected type:

    val predicate: ((String, Int)) => Boolean = { case (_, f) => f > 3 && f < 25 }
    val transformFn: ((String, Int)) => String = { case (w, _) => w }

Please note that you have to specify the type of the value here, because the Scala compiler cannot infer it for pattern matching anonymous functions.
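
As a quick check, these two values can be passed directly to filter and map. The snippet below is a minimal sketch that repeats the sample data from above so that it runs on its own; the name remainingWords is made up for illustration.

```scala
// Sample data repeated from the earlier example.
val wordFrequencies = List(
  ("habitual", 6), ("and", 56), ("consuetudinary", 2),
  ("additionally", 27), ("homely", 5), ("society", 13))

// The expected function types have to be spelled out for the compiler.
val predicate: ((String, Int)) => Boolean = { case (_, f) => f > 3 && f < 25 }
val transformFn: ((String, Int)) => String = { case (w, _) => w }

// `remainingWords` is a hypothetical name used only in this sketch.
val remainingWords = wordFrequencies.filter(predicate).map(transformFn)
```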

Nothing prevents you from defining a more complex sequence of cases, of course. However, if you define an anonymous function this way and want to pass it to some other function, such as the ones in our example, you have to make sure that for all possible inputs, one of your cases matches so that your anonymous function always returns a value. Otherwise, you will risk a MatchError at runtime.
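
To make that risk concrete, here is a small sketch with a deliberately incomplete function. The names sign, positiveResult and negativeResult are made up for illustration, and Try is only used so that the failure can be inspected instead of aborting the program.

```scala
import scala.util.Try

// Deliberately incomplete: there is no case for negative numbers.
val sign: Int => String = {
  case 0 => "zero"
  case n if n > 0 => "positive"
}

val positiveResult = sign(5)        // matched by the second case
val negativeResult = Try(sign(-1))  // no case matches, so a MatchError is thrown
```

Since the compiler accepts this function without complaint, the missing case only shows up once a negative number is actually passed in.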

Partial functions

Sometimes, however, a function that is only defined for specific input values is exactly what you want. In fact, such a function can help us get rid of another problem that we haven’t solved yet with our current implementation of the wordsWithoutOutliers function: We first filter the given sequence and then map the remaining elements. If we can boil this down to a solution that only has to iterate over the given sequence once, this would not only need fewer CPU cycles but would also make our code shorter and, ultimately, more readable.

If you browse through Scala’s collections API, you will notice a method called collect, which, for a Seq[A], has the following signature:

    def collect[B](pf: PartialFunction[A, B]): Seq[B]

This method returns a new sequence by applying the given partial function to all of its elements – the partial function both filters and maps the sequence.

So what is a partial function? In short, it’s a unary function that is known to be defined only for certain input values and that allows clients to check whether it is defined for a specific input value.

To this end, the PartialFunction trait provides an isDefinedAt method. As a matter of fact, the PartialFunction[-A, +B] type extends the type (A) => B (which can also be written as Function1[A, B]). A pattern matching anonymous function can be typed as either of the two: the compiler creates a PartialFunction when one is expected, and a plain Function1 otherwise.

Due to this inheritance hierarchy, passing a partial function defined this way to a method that expects a Function1, like map or filter, is perfectly fine, as long as that function is defined for all input values, i.e. there is always a matching case.

The collect method, however, specifically expects a PartialFunction[A, B] that may not be defined for all input values and knows exactly how to deal with that case. For each element in the sequence, it first checks if the partial function is defined for it by calling isDefinedAt on the partial function. If this returns false, the element is ignored. Otherwise, the result of applying the partial function to the element is added to the result sequence.
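
The behaviour just described can be sketched in a few lines. This is not the standard library's actual implementation, only an illustration of the idea; collectSketch and evenTimesTen are hypothetical names.

```scala
// A simplified stand-in for collect; NOT how the standard library implements it.
def collectSketch[A, B](xs: Seq[A])(pf: PartialFunction[A, B]): Seq[B] = {
  val builder = Seq.newBuilder[B]
  for (x <- xs) {
    // Skip elements the partial function is not defined for.
    if (pf.isDefinedAt(x)) builder += pf(x)
  }
  builder.result()
}

// Keeps only the even numbers and multiplies each of them by ten.
val evenTimesTen = collectSketch(Seq(1, 2, 3, 4)) { case n if n % 2 == 0 => n * 10 }
```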

Let’s first define a partial function that we want to use for refactoring our wordsWithoutOutliers function to make use of collect:

    val pf: PartialFunction[(String, Int), String] = {
      case (word, freq) if freq > 3 && freq < 25 => word
    }

We added a guard clause to our case, so that this function will not be defined for word/frequency pairs whose frequency is not within the required range.
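
You can observe the effect of the guard by asking the partial function directly. The following sketch repeats the definition so that it is self-contained; the value names are made up for illustration. It also uses lift, a standard PartialFunction method that turns the partial function into a total one returning an Option.

```scala
val pf: PartialFunction[(String, Int), String] = {
  case (word, freq) if freq > 3 && freq < 25 => word
}

// The guard decides where the function is defined.
val definedForHabitual = pf.isDefinedAt(("habitual", 6)) // frequency within range
val definedForAnd = pf.isDefinedAt(("and", 56))          // frequency too high

// lift wraps the result in an Option instead of throwing for undefined inputs.
val liftedHabitual = pf.lift(("habitual", 6))
val liftedAnd = pf.lift(("and", 56))
```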

Instead of using the syntax for pattern matching anonymous functions, we could have defined this partial function by explicitly extending the PartialFunction trait:

    val pf = new PartialFunction[(String, Int), String] {
      def apply(wordFrequency: (String, Int)) = wordFrequency match {
        case (word, freq) if freq > 3 && freq < 25 => word
      }
      def isDefinedAt(wordFrequency: (String, Int)) = wordFrequency match {
        case (word, freq) if freq > 3 && freq < 25 => true
        case _ => false
      }
    }

Usually, however, you will want to use the much more concise anonymous function syntax.

Now, if we passed our partial function to the map method, this would compile just fine, but result in a MatchError at runtime, because our partial function is not defined for all possible input values, thanks to the added guard clause:

    wordFrequencies.map(pf) // will throw a MatchError
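
If you want to see the failure without crashing your program, you can wrap the call in Try. This is only a sketch with a shortened sample list; attempt is a made-up name.

```scala
import scala.util.Try

// A shortened version of the sample data, enough to trigger the failure.
val wordFrequencies = List(("habitual", 6), ("and", 56))

val pf: PartialFunction[(String, Int), String] = {
  case (word, freq) if freq > 3 && freq < 25 => word
}

// map applies pf to every element; ("and", 56) fails the guard.
val attempt = Try(wordFrequencies.map(pf))
```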

However, we can pass this partial function to the collect method, and it will filter and map the sequence as expected:

    wordFrequencies.collect(pf) // List("habitual", "homely", "society")

The result of this is the same as that of our current implementation of wordsWithoutOutliers when passing our dummy wordFrequencies sequence to it. So let’s rewrite that function:

    def wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String] =
      wordFrequencies.collect { case (word, freq) if freq > 3 && freq < 25 => word }

Partial functions have some other very useful properties. For example, they can be chained, allowing for a neat functional alternative to the chain of responsibility pattern known from object-oriented programming. This, however, will have to be the subject of a future post in this series, when I am going to address the issue of functional composability.
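
Just to give a first impression, here is a minimal sketch of chaining with orElse, which is part of the standard PartialFunction trait. The partial functions rare, common and outlier, as well as the messages they produce, are hypothetical and were invented for this example.

```scala
// Defined only for very infrequent words.
val rare: PartialFunction[(String, Int), String] = {
  case (word, freq) if freq <= 3 => s"$word is rare"
}

// Defined only for very frequent words.
val common: PartialFunction[(String, Int), String] = {
  case (word, freq) if freq >= 25 => s"$word is common"
}

// orElse tries `rare` first and falls back to `common`.
val outlier = rare.orElse(common)

val describedAnd = outlier(("and", 56))
val definedForSociety = outlier.isDefinedAt(("society", 13))
```

The combined function is still partial: it is not defined for words whose frequency lies between the two bounds.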

Partial functions are also a crucial element of many Scala libraries and APIs. For example, the way an Akka actor processes messages sent to it is defined in terms of a partial function. Hence, it’s quite important to know and understand this concept.
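
To illustrate the idea without pulling in Akka itself, the sketch below mimics the shape of such a message handler using only the standard library. The alias mirrors Akka's Receive type, which is a PartialFunction[Any, Unit]; the names log, receive and deliver are assumptions made up for this example and are not part of any Akka API.

```scala
import scala.collection.mutable.ListBuffer

// Same shape as Akka's Receive alias; no Akka dependency is used here.
type Receive = PartialFunction[Any, Unit]

// Records what the handler has processed, for demonstration purposes only.
val log = ListBuffer.empty[String]

// A hypothetical handler, defined only for String and Int messages.
val receive: Receive = {
  case text: String => log += s"text: $text"
  case number: Int => log += s"number: $number"
}

// Hands a message to the handler only if the handler is defined for it.
def deliver(message: Any): Boolean =
  if (receive.isDefinedAt(message)) { receive(message); true } else false
```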

Summary

In this part, we examined an alternative way of defining anonymous functions, namely as a sequence of cases, which opens up some nice destructuring possibilities in a rather concise way. Moreover, we delved into the topic of partial functions, demonstrating their usefulness by means of a simple use case.

In the next article, I am going to dig deeper into the ever-present Option type, explaining the reasoning behind its existence and how best to make use of it.

Please let me know if you have any questions or feedback. Is there any particular topic you would like to see covered in an article?

Posted by Daniel Westheide