titledateauthor
On functional-style comparators in Java 82015-04-25Marco Torchiano

On functional-style comparators in Java 8

Marco Torchiano - Version 1.1 - 25 April 2015

Given a fairly typical Student class:

public class Student {
    //...
    public int getId() { /*...*/ }
    public int getSurname() { /*...*/ }
    public int getName() { /*...*/ }
}

A common task is to order a collection of such object, e.g. by their id. For this purpose we can used the Collections.sort() method that accepts a List and sorts it. There are two version of such a method:

Let us suppose we have a list of students declared as:

List<Student> students;

Sorting a students' list using a comparator can be done in several different ways.

Regular comparator class

A new class must be created to host the code required compare two Student objects:

public class StudentIdComparator implements Comparator<Student>{
   @Override
    public int compare(Student a, Student b){
        return a.getId() - b.getId();
    }
}

Such class will be used in the client code to sort the elements of the list as follows:

Collections.sort( students, new StudentIdComparator() );

Observation the sort() method arguments are profoundly different and play two very different roles:

The usage of functional objects is very common, especially when adopting a functional programming style, and it is encoded in the Strategy pattern.

This solution presents a few drawbacks:

Anonymous inner class

To address the limitations of the above solution, in Java it is common to use an anonymous inner class. In this case, no additional standalon class is required and the client code can be written as:

Such class will be used in the client code to sort the elements of the list as follows:

Collections.sort( students, new Comparator<Student>(){
                    @Override
                    public int compare(Student a, Student b){
                        return a.getId() - b.getId();
                    }
                });

This solution solves the limitations though is requires still a significant amount of boilerplate code that is required by the language syntax though it does not provide any significan information.

Lambda expression

A much more compact form can he achieved by means of lambda expressions, that were introduced in Java 8. Lambda expression allow us to define and instantiate an anonynous inner class without the boilerplate code and leveraging the compiler type inference capabilities.

A perfectly equivalent code to the previous can be written as:

Collections.sort( students, (a,b) -> a.getId() - b.getId() );

Two mechanisms are in place here:

This solution is much more compact and readable than the previous one.

The drawback is that it is still non immediately undestandable: a potential maintainer of the code would need to figure out that the difference of the two ids means sorting in ascending id order.

Comparator factory method

A further step consists in using a comparator factory method, which can build a comparator based on some key feature of the objects. The key can be obtained by means of a key extractor object.

The code using the factory method would then be:

Collections.sort(students, comparing( s -> s.getId()) );

The static comparing() method inside Comparator can be accessed without the class qualification provided the following import statement is present in the class

import static java.util.Comparator.*

The comparing() method is implemented rougly as:

static <T,U extends Comparable<U>> 
  Comparator<T> comparing(Function<T,U> keyExtractor){
     return (o1,o2) -> keyExtractor.apply(o1)
                       .compareTo( keyExtractor.apply(o2) );
}

The comparator uses the apply() method defined inside the Function interface, whose simplified code is:

public interface Function<T,U> {
  U apply(T t);
}

So when we instantiate it with a lambda (s) -> s.getId(), the compiler generates an anonymous inner class that looks like:

new Function<Student,Integer> {
  Integer apply(Student s) { return new Integer(s.getId()); }
}

The first type parameter (i.e. T replaced by Student) is inferred from the context (the type of the List elements) The second type parameter (i.e. U replaced by Integer) is inferred from the return type of the lambda expression (i.e. the return type of getId() ) that is int, though, being a primitive type, int cannot be used as a type parameter so the compiler replaces it with the corresponding wrapper class (Integer) and provides conversion with auto-boxing (that is the new Integer( .. ) in the code above).

This solution is easier to understand, a potential maintainer can read it as "sort the students by comparing their key feature obtained through the getId() method".

Though is has two minor limitations:

Factory with a method reference

In order to remove the useless boilerplate code, instead of the lambda expression used in the last solution, we can improve the code by using a method refence, another new feature introduced in Java 8.

The code using the factory method and the method reference would be:

Collections.sort(students, comparing( Student::getId ));

The method reference used the operator :: which allows combining a contained and a method name

Container :: method

Where the container can be a type name or an object reference.

For instance System.out::println is translated by the compiler into the lambda o -> System.out.println(o). While the method reference Student::getId is translated into o -> o.getId().

This solution removes the boilerplate code still presente in the previous solution and it is easy to understand, a potential maintainer can read it as "sort the students by comparing their key feature obtained through the getId() method of class Student".

Primitive type factory with a method reference

A further improvement, in terms of performance, can be achieved using a factory method that avoids the auto-boxing of the int returned by the getId method.

The code can be written as:

Collections.sort(students, comparingInt( Student::getId ));

The comparingInt() method is very similar to the general version but it accepts an argument of type ToIntFunction<T>. That interface is declared as:

public interface ToIntFunction<T> {
  int applyAsInt(T t);
}

So when we instantiate it with the method refence Student::getId, the compiler generates an anonymous inner class that looks like:

new Function<Student> {
  int apply(Student s) { return s.getId(); }
}

By comparing this generated implementation with the one obtained using the comparing() method we can observe the absence of the auto-boxing.


Tweet

Creative Commons License
On functional-style comparators in Java 8 by Marco Torchiano is licensed under a Creative Commons Attribution 4.0 International License.