The Neophyte's Guide to Scala Part 4: Pattern Matching Anonymous Functions
In the previous part of this series, I gave an overview of the various ways in which patterns can be used in Scala, concluding with a brief mention of anonymous functions as another place in which patterns can be put to use. In this post, we are going to take a detailed look at the possibilities opened up by being able to define anonymous functions in this way.
If you have participated in the Scala course at Coursera, or have coded in Scala for a while, you will likely have written anonymous functions on a regular basis. For example, given a list of song titles which you want to transform to lower case for your search index, you might want to define an anonymous function that you pass to the map method, like this:
val songTitles = List("The White Hare", "Childe the Hunter", "Take no Rogues")
songTitles.map(t => t.toLowerCase)
Or, if you like it even shorter, you will probably write the anonymous function like this, making use of Scala’s placeholder syntax:
songTitles.map(_.toLowerCase)
So far so good. However, let’s see how this syntax performs for a slightly different example: We have a sequence of pairs, each representing a word and its frequency in some text. Our goal is to filter out those pairs whose frequency falls below a lower threshold or above an upper threshold, and then only return the remaining words, without their respective frequencies. We need to write a function wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String].
Our initial solution makes use of the filter and map methods, passing anonymous functions to them using our familiar syntax:
val wordFrequencies = ("habitual", 6) :: ("and", 56) :: ("consuetudinary", 2) ::
  ("additionally", 27) :: ("homely", 5) :: ("society", 13) :: Nil
def wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String] =
  wordFrequencies.filter(wf => wf._2 > 3 && wf._2 < 25).map(_._1)
wordsWithoutOutliers(wordFrequencies) // List("habitual", "homely", "society")
This solution has several problems. The first one is only an aesthetic one – accessing the fields of the tuple looks pretty ugly to me. If only we could destructure the pair, we could make this code a little more pleasant and probably also more readable.
Thankfully, Scala provides an alternative way of writing anonymous functions: A pattern matching anonymous function is an anonymous function that is defined as a block consisting of a sequence of cases, surrounded as usual by curly braces, but without a match keyword before the block. Let’s rewrite our function, making use of this notation:
def wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String] =
  wordFrequencies.filter { case (_, f) => f > 3 && f < 25 } map { case (w, _) => w }
In this example, we have only used a single case in each of our anonymous functions, because we know that this case always matches – we are simply decomposing a data structure whose type we already know at compile time, so nothing can go wrong here. This is a very common way of using pattern matching anonymous functions.
If you try to assign these anonymous functions to values, you will see that they have the expected type:
val predicate: ((String, Int)) => Boolean = { case (_, f) => f > 3 && f < 25 }
val transformFn: ((String, Int)) => String = { case (w, _) => w }
Please note that you have to specify the type of the value here, because the Scala compiler cannot infer it for pattern matching anonymous functions.
Nothing prevents you from defining a more complex sequence of cases, of course. However, if you define an anonymous function this way and want to pass it to some other function, such as the ones in our example, you have to make sure that for all possible inputs, one of your cases matches so that your anonymous function always returns a value. Otherwise, you will risk a MatchError at runtime.
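To make that risk concrete, here is a minimal sketch. The object name ExhaustiveCasesDemo, the labels and the shortened word list are made up for illustration. The first anonymous function covers every possible pair, while the second one only handles rare words and therefore fails as soon as map hands it anything else.

```scala
import scala.util.Try

// Hypothetical demo object; names and labels are illustrative only.
object ExhaustiveCasesDemo {
  val wordFrequencies: Seq[(String, Int)] =
    Seq(("habitual", 6), ("and", 56), ("consuetudinary", 2))

  // Every possible pair matches one of the three cases, so this is safe to pass to map.
  val labelled: Seq[String] = wordFrequencies.map {
    case (w, f) if f < 4  => s"$w: rare"
    case (w, f) if f > 24 => s"$w: common"
    case (w, _)           => s"$w: regular"
  }

  // No case covers a frequency of 4 or more, so map throws a MatchError
  // on the first such pair. Try captures the exception for inspection.
  val failing: Try[Seq[String]] = Try {
    wordFrequencies.map { case (w, f) if f < 4 => s"$w: rare" }
  }

  def main(args: Array[String]): Unit = {
    println(labelled)          // List(habitual: regular, and: common, consuetudinary: rare)
    println(failing.isFailure) // true
  }
}
```

Wrapping the call in Try is only there so that the failure can be inspected. In real code you would rather add the missing cases.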
Partial functions
Sometimes, however, a function that is only defined for specific input values is exactly what you want. In fact, such a function can help us get rid of another problem that we haven’t solved yet with our current implementation of the wordsWithoutOutliers function: We first filter the given sequence and then map the remaining elements. If we can boil this down to a solution that only has to iterate over the given sequence once, this would not only need fewer CPU cycles but would also make our code shorter and, ultimately, more readable.
If you browse through Scala’s collections API, you will notice a method called collect, which, for a Seq[A], has the following signature:
def collect[B](pf: PartialFunction[A, B])
This method returns a new sequence by applying the given partial function to all of its elements – the partial function both filters and maps the sequence.
So what is a partial function? In short, it’s a unary function that is known to be defined only for certain input values and that allows clients to check whether it is defined for a specific input value.
To this end, the PartialFunction trait provides an isDefinedAt method. As a matter of fact, the PartialFunction[-A, +B] type extends the type (A) => B (which can also be written as Function1[A, B]), and a pattern matching anonymous function is compiled to a PartialFunction whenever that is the type expected where it is defined. Where a plain function type is expected instead, it becomes an ordinary Function1.
Due to this inheritance hierarchy, passing a pattern matching anonymous function to a method that expects a Function1, like map or filter, is perfectly fine, as long as that function is defined for all input values, i.e. there is always a matching case.
The collect method, however, specifically expects a PartialFunction[A, B] that may not be defined for all input values and knows exactly how to deal with that case. For each element in the sequence, it first checks if the partial function is defined for it by calling isDefinedAt on the partial function. If this returns false, the element is ignored. Otherwise, the result of applying the partial function to the element is added to the result sequence.
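The steps just described can be sketched as a small helper. Note that collectSketch is a hypothetical function written for this post and not the actual implementation in the standard library, which may well be organised differently. It only mirrors the behaviour described above.

```scala
// Hypothetical demo object; collectSketch is NOT the library's collect.
object CollectSketch {
  def collectSketch[A, B](xs: Seq[A])(pf: PartialFunction[A, B]): Seq[B] = {
    val result = Seq.newBuilder[B]
    for (x <- xs) {
      // Keep and transform the element only if pf is defined for it;
      // otherwise the element is skipped.
      if (pf.isDefinedAt(x)) result += pf(x)
    }
    result.result()
  }

  def main(args: Array[String]): Unit = {
    val evensHalved = collectSketch(Seq(1, 2, 3, 4)) { case n if n % 2 == 0 => n / 2 }
    println(evensHalved) // List(1, 2)
  }
}
```

For this input, Seq(1, 2, 3, 4).collect { case n if n % 2 == 0 => n / 2 } gives the same result.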
Let’s first define a partial function that we want to use for refactoring our wordsWithoutOutliers function to make use of collect:
val pf: PartialFunction[(String, Int), String] = {
  case (word, freq) if freq > 3 && freq < 25 => word
}
We added a guard clause to our case, so that this function will not be defined for word/frequency pairs whose frequency is not within the required range.
Instead of using the syntax for pattern matching anonymous functions, we could have defined this partial function by explicitly extending the PartialFunction trait:
val pf = new PartialFunction[(String, Int), String] {
  def apply(wordFrequency: (String, Int)) = wordFrequency match {
    case (word, freq) if freq > 3 && freq < 25 => word
  }
  def isDefinedAt(wordFrequency: (String, Int)) = wordFrequency match {
    case (word, freq) if freq > 3 && freq < 25 => true
    case _ => false
  }
}
Usually, however, you will want to use the much more concise anonymous function syntax.
Now, if we passed our partial function to the map method, this would compile just fine, but result in a MatchError at runtime, because our partial function is not defined for all possible input values, thanks to the added guard clause:
wordFrequencies.map(pf) // will throw a MatchError
However, we can pass this partial function to the collect method, and it will filter and map the sequence as expected:
wordFrequencies.collect(pf) // List("habitual", "homely", "society")
The result of this is the same as that of our current implementation of wordsWithoutOutliers when passing our dummy wordFrequencies sequence to it. So let’s rewrite that function:
def wordsWithoutOutliers(wordFrequencies: Seq[(String, Int)]): Seq[String] =
  wordFrequencies.collect { case (word, freq) if freq > 3 && freq < 25 => word }
Partial functions have some other very useful properties. For example, they can be chained, allowing for a neat functional alternative to the chain of responsibility pattern known from object-oriented programming. This, however, will have to be the subject of a future post in this series, when I address the issue of functional composability.
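Just to give a first impression of such chaining, here is a minimal sketch using the orElse method of PartialFunction. The names rare, common and outliers, as well as the message texts, are made up for illustration, and this only scratches the surface of what that future post will cover.

```scala
// Hypothetical demo object; all names and messages are illustrative.
object ChainingSketch {
  val rare: PartialFunction[(String, Int), String] = {
    case (word, freq) if freq <= 3 => s"$word is rare"
  }

  val common: PartialFunction[(String, Int), String] = {
    case (word, freq) if freq >= 25 => s"$word is common"
  }

  // orElse tries rare first and falls back to common where rare is undefined.
  val outliers: PartialFunction[(String, Int), String] = rare.orElse(common)

  def main(args: Array[String]): Unit = {
    val wordFrequencies = Seq(("habitual", 6), ("and", 56), ("consuetudinary", 2))
    // List(and is common, consuetudinary is rare)
    println(wordFrequencies.collect(outliers))
  }
}
```

The combined function is still undefined for pairs that neither rare nor common handles, which is why ("habitual", 6) is dropped by collect.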
Partial functions are also a crucial element of many Scala libraries and APIs. For example, the way an Akka actor processes messages sent to it is defined in terms of a partial function. Hence, it’s quite important to know and understand this concept.
Summary
In this part, we examined an alternative way of defining anonymous functions, namely as a sequence of cases, which opens up some nice destructuring possibilities in a rather concise way. Moreover, we delved into the topic of partial functions, demonstrating their usefulness by means of a simple use case.
In the next article, I am going to dig deeper into the ever-present Option type, explaining the reasoning behind its existence and how best to make use of it.
Please let me know if you have any questions or feedback. Is there any particular topic you would like to see covered in an article?
Posted by Daniel Westheide