title | date | author |
---|---|---|
On functional-style comparators in Java 8 | 2015-04-25 | Marco Torchiano |
Marco Torchiano - Version 1.1 - 25 April 2015
Given a fairly typical Student
class:
public class Student {
//...
public int getId() { /*...*/ }
public int getSurname() { /*...*/ }
public int getName() { /*...*/ }
}
A common task is to order a collection of such object, e.g. by their id.
For this purpose we can used the Collections.sort()
method that accepts a List
and sorts it. There are two version of such a method:
one using the natural ordering of the elements, i.e. the one defined
by implementig the Comparable
interface and providing the compareTo()
method.
the other requiring and additional argument that implements
the Comparator
interface providing the code to compare two elements.
Let us suppose we have a list of students declared as:
List<Student> students;
Sorting a students' list using a comparator can be done in several different ways.
A new class must be created to host the code required compare two Student objects:
public class StudentIdComparator implements Comparator<Student>{
@Override
public int compare(Student a, Student b){
return a.getId() - b.getId();
}
}
Such class will be used in the client code to sort the elements of the list as follows:
Collections.sort( students, new StudentIdComparator() );
Observation the sort()
method arguments are profoundly different and
play two very different roles:
students
) is a regular object, i.e. it encapsulates
some data (the list of students) and provides methods to access such data.
Its role is to provide the data the method will operate on.The usage of functional objects is very common, especially when adopting a functional programming style, and it is encoded in the Strategy pattern.
This solution presents a few drawbacks:
To address the limitations of the above solution, in Java it is common to use an anonymous inner class. In this case, no additional standalon class is required and the client code can be written as:
Such class will be used in the client code to sort the elements of the list as follows:
Collections.sort( students, new Comparator<Student>(){
@Override
public int compare(Student a, Student b){
return a.getId() - b.getId();
}
});
This solution solves the limitations though is requires still a significant amount of boilerplate code that is required by the language syntax though it does not provide any significan information.
A much more compact form can he achieved by means of lambda expressions, that were introduced in Java 8. Lambda expression allow us to define and instantiate an anonynous inner class without the boilerplate code and leveraging the compiler type inference capabilities.
A perfectly equivalent code to the previous can be written as:
Collections.sort( students, (a,b) -> a.getId() - b.getId() );
Two mechanisms are in place here:
type inference:
it starts from the signature of the sort
method, that looks like:
static <T> void sort(List<T>, Comparator<T>)
from the first argument ( List<Student>
) it infers the type T
therefore the type of the second argument will be Comparator<Student>
this means that the method to be implemented inside the anonymous inner class will be:
int compare(Student o1, Student o2)
therefore the type of the two arguments of the lambda expression
will be both of type Student
and the return type will be int
code generation:
based on the type inference information
the compiler is able to generate an anonymous inner class similar to the
one in the previous solution
This solution is much more compact and readable than the previous one.
The drawback is that it is still non immediately undestandable: a potential maintainer of the code would need to figure out that the difference of the two ids means sorting in ascending id order.
A further step consists in using a comparator factory method, which can build a comparator based on some key feature of the objects. The key can be obtained by means of a key extractor object.
The code using the factory method would then be:
Collections.sort(students, comparing( s -> s.getId()) );
The static
comparing()
method inside Comparator
can be accessed
without the class qualification provided the following import statement
is present in the class
import static java.util.Comparator.*
The comparing()
method is implemented rougly as:
static <T,U extends Comparable<U>>
Comparator<T> comparing(Function<T,U> keyExtractor){
return (o1,o2) -> keyExtractor.apply(o1)
.compareTo( keyExtractor.apply(o2) );
}
The comparator uses the apply()
method defined inside the
Function
interface, whose simplified code is:
public interface Function<T,U> {
U apply(T t);
}
So when we instantiate it with a lambda (s) -> s.getId()
, the compiler generates an anonymous inner class that looks like:
new Function<Student,Integer> {
Integer apply(Student s) { return new Integer(s.getId()); }
}
The first type parameter (i.e. T
replaced by Student
) is inferred from the context (the type of the List elements)
The second type parameter (i.e. U
replaced by Integer
) is inferred from the return type of the lambda expression (i.e. the return type of getId()
) that is int
, though, being a primitive type, int
cannot be used as a type parameter so the compiler replaces it with the corresponding wrapper class (Integer
) and provides conversion with auto-boxing (that is the new Integer( .. )
in the code above).
This solution is easier to understand, a potential maintainer can read it as "sort the students by comparing their key feature obtained through the getId() method".
Though is has two minor limitations:
s -> s.
that
convey no useful informationIn order to remove the useless boilerplate code, instead of the lambda expression used in the last solution, we can improve the code by using a method refence, another new feature introduced in Java 8.
The code using the factory method and the method reference would be:
Collections.sort(students, comparing( Student::getId ));
The method reference used the operator ::
which allows combining
a contained and a method name
Container :: method
Where the container can be a type name or an object reference.
For instance System.out::println
is translated by the compiler into
the lambda o -> System.out.println(o)
.
While the method reference Student::getId
is translated into
o -> o.getId()
.
This solution removes the boilerplate code still presente in the previous solution and it is easy to understand, a potential maintainer can read it as "sort the students by comparing their key feature obtained through the getId() method of class Student".
A further improvement, in terms of performance, can be achieved
using a factory method that avoids the auto-boxing of the
int
returned by the getId
method.
The code can be written as:
Collections.sort(students, comparingInt( Student::getId ));
The comparingInt()
method is very similar to the general version but
it accepts an argument of type ToIntFunction<T>
.
That interface is declared as:
public interface ToIntFunction<T> {
int applyAsInt(T t);
}
So when we instantiate it with the method refence Student::getId
, the compiler generates an anonymous inner class that looks like:
new Function<Student> {
int apply(Student s) { return s.getId(); }
}
By comparing this generated implementation with the one obtained using the comparing()
method we can observe the absence of the auto-boxing.
On functional-style comparators in Java 8 by Marco Torchiano is licensed under a Creative Commons Attribution 4.0 International License.