The Neophyte's Guide to Scala Part 2: Extracting Sequences
In the first part of this series, we learned how to implement our own extractors and how these extractors can be used for pattern matching. However, we only discussed extractors that allow you to destructure a given object into a fixed number of parameters. Yet, for certain kinds of data structures, Scala allows you to do pattern matching expecting an arbitrary number of extracted parameters.
For example, you can use a pattern that only matches a list of exactly two elements, or a list of exactly three elements:
val xs = 3 :: 6 :: 12 :: Nil
xs match {
  case List(a, b) => a * b
  case List(a, b, c) => a + b + c
  case _ => 0
}
What’s more, if you don’t care about the exact length of the list you are matching, you can use the sequence wildcard _*:
val xs = 3 :: 6 :: 12 :: 24 :: Nil
xs match {
  case List(a, b, _*) => a * b
  case _ => 0
}
Here, the first pattern matches, binding the first two elements to the variables a and b, while simply ignoring the rest of the list, regardless of how many elements remain.
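If you want to keep the remaining elements instead of discarding them, you can bind them to a name. The following is a minimal sketch, not part of the original example: the names RestBindingSketch and describe are made up for illustration, and it assumes the rest @ _* binding syntax of Scala 2, which Scala 3 also still accepts.

```scala
object RestBindingSketch {
  // Hypothetical helper: multiplies the first two elements and reports
  // how many elements were captured by the sequence wildcard.
  def describe(xs: List[Int]): String = xs match {
    case List(a, b, rest @ _*) => s"product=${a * b}, remaining=${rest.size}"
    case _ => "fewer than two elements"
  }
}
```

Here, rest is bound to a sequence holding whatever follows the first two elements, which may be empty.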
Clearly, extractors for these kinds of patterns cannot be implemented with the means I introduced in the first article. We need a way to specify that an extractor takes an object of a certain type and destructures it into a sequence of extracted values, where the length of that sequence is unknown at compile time.
Enter unapplySeq, an extractor method that allows for doing exactly that. Let’s take a look at one of its possible method signatures:
def unapplySeq(obj: S): Option[Seq[T]]
It expects an object of type S and returns either None, if the object does not match at all, or a sequence of extracted values of type T, wrapped in a Some.
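To make the signature concrete, here is a small sketch with S = Int and T = Int. The Digits extractor and the describe method are hypothetical and exist only for illustration: the extractor splits a non-negative integer into its decimal digits and rejects negative numbers.

```scala
object DigitsSketch {
  // Hypothetical extractor: destructures a non-negative Int into its digits.
  object Digits {
    def unapplySeq(n: Int): Option[Seq[Int]] =
      if (n < 0) None // negative numbers do not match at all
      else Some(n.toString.toList.map(_.asDigit))
  }

  def describe(n: Int): String = n match {
    case Digits(d) => s"single digit $d"             // exactly one digit
    case Digits(first, _*) => s"starts with $first"  // two or more digits
    case _ => "negative"
  }
}
```

The number of digits is only known at runtime, which is exactly the situation unapplySeq is meant for.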
Example: Extracting given names
Let’s make use of this kind of extractor method in an admittedly contrived example. Let’s say that in some piece of our application, we are receiving a person’s given name as a String. This string can contain the person’s second or third name, if that person has more than one given name. Hence, possible values could be "Daniel", or "Catherina Johanna", or "Matthew John Michael". We want to be able to match against these names, extracting and binding the individual given names.
Here is a very simple extractor implementation by means of the unapplySeq method that will allow us to do that:
object GivenNames {
  def unapplySeq(name: String): Option[Seq[String]] = {
    // split returns an Array; convert it explicitly to match the declared Seq type
    val names = name.trim.split(" ").toSeq
    if (names.forall(_.isEmpty)) None else Some(names)
  }
}
Given a String containing one or more given names, it will extract those as a sequence. If the input name does not contain at least one given name, this extractor will return None, and thus, a pattern in which this extractor is used will not match such a string.
We can now put our new extractor to test:
def greetWithFirstName(name: String) = name match {
  case GivenNames(firstName, _*) => "Good morning, " + firstName + "!"
  case _ => "Welcome! Please make sure to fill in your name!"
}
This nifty little method returns a greeting for a given name, ignoring everything but the first name. greetWithFirstName("Daniel") will return "Good morning, Daniel!", while greetWithFirstName("Catherina Johanna") will return "Good morning, Catherina!".
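The same extractor also works with patterns that expect an exact number of names, just like the List patterns at the beginning of this article. The following sketch repeats the GivenNames extractor so that it is self-contained; the wrapper object GivenNamesSketch and the method countGivenNames are made up for illustration.

```scala
object GivenNamesSketch {
  // Same extractor as in the article, with an explicit conversion to Seq.
  object GivenNames {
    def unapplySeq(name: String): Option[Seq[String]] = {
      val names = name.trim.split(" ").toSeq
      if (names.forall(_.isEmpty)) None else Some(names)
    }
  }

  // Hypothetical helper: distinguishes inputs by their number of given names.
  def countGivenNames(name: String): String = name match {
    case GivenNames(only) => s"one given name: $only"
    case GivenNames(first, second) => s"two given names: $first and $second"
    case GivenNames(first, _*) => s"three or more, starting with $first"
    case _ => "no given name"
  }
}
```

The cases are tried from top to bottom, so the wildcard pattern is only reached for inputs with three or more names.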
Combining fixed and variable parameter extraction
Sometimes, you have certain fixed values to be extracted that you know about at compile time, plus an additional optional sequence of values.
Let’s assume that in our example, the input name contains the person’s complete name, not only the given name. Possible values might be "John Doe" or "Catherina Johanna Peterson". We want to be able to match against such strings using a pattern that always binds the person’s last name to the first variable in the pattern and the first name to the second variable, followed by an arbitrary number of additional given names.
This can be achieved by means of a slight modification of our unapplySeq method, using a different method signature:
def unapplySeq(obj: S): Option[(T1, .., Tn-1, Seq[T])]
As you can see, unapplySeq can also return an Option of a TupleN, where the last element of the tuple must be the sequence containing the variable parts of the extracted values. This method signature should be somewhat familiar, as it is similar to one of the possible signatures of the unapply method that I introduced last week.
Here is an extractor making use of this:
object Names {
  def unapplySeq(name: String): Option[(String, String, Seq[String])] = {
    // split returns an Array; convert it explicitly so the tuple's last element is a Seq
    val names = name.trim.split(" ").toSeq
    if (names.size < 2) None
    else Some((names.last, names.head, names.drop(1).dropRight(1)))
  }
}
Have a close look at the return type and the construction of the Some. Our method returns an Option of Tuple3. That tuple is created with Scala’s syntax for tuple literals by just putting the three elements (the last name, the first name, and the sequence of additional given names) in a pair of parentheses.
If this extractor is used in a pattern, the pattern will only match if the given input string contains at least a first and a last name. The sequence of additional given names is created by dropping the first and the last element from the sequence of names.
We can use this extractor to implement an alternative greeting method:
def greet(fullName: String) = fullName match {
  case Names(lastName, firstName, _*) => "Good morning, " + firstName + " " + lastName + "!"
  case _ => "Welcome! Please make sure to fill in your name!"
}
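With this method, greet("John Doe") returns "Good morning, John Doe!", and a single name such as "Daniel" falls through to the welcome message. If you also need the additional given names, you can bind the variable part instead of ignoring it. The following is a rough sketch: it repeats the Names extractor so that it is self-contained, while the wrapper object NamesSketch and the method formalName are hypothetical.

```scala
object NamesSketch {
  // Same extractor as in the article, with an explicit conversion to Seq.
  object Names {
    def unapplySeq(name: String): Option[(String, String, Seq[String])] = {
      val names = name.trim.split(" ").toSeq
      if (names.size < 2) None
      else Some((names.last, names.head, names.drop(1).dropRight(1)))
    }
  }

  // Hypothetical helper: formats a name as "Last, First (additional names)".
  def formalName(fullName: String): String = fullName match {
    case Names(last, first, middle @ _*) =>
      if (middle.isEmpty) s"$last, $first"
      else s"$last, $first (${middle.mkString(" ")})"
    case _ => "unknown"
  }
}
```

The fixed parts of the tuple are bound positionally, and middle receives the trailing sequence, which is empty for a name like "John Doe".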
Feel free to play around with this in the REPL or a worksheet.
Summary
In this article, we learned how to implement and use extractors that return variable-length sequences of extracted values. Extractors are a pretty powerful mechanism: they can often be re-used in flexible ways, and they let you extend the kinds of patterns you can match against.
We will revisit extractors in a case study towards the end of this series. In the next part, however, I will give an overview of the different ways in which patterns can be applied in Scala code – there is more to it than just the pattern matching you have seen in the examples so far.
Update, 24.01.2013: I updated the code example implementing the GivenNames extractor. Thanks to Christophe Bliard for pointing out a mistake in there.
Posted by Daniel Westheide