Google Collections

Google Collections is a natural evolution of the standard collections API in Java. It brings a much broader range of functionality to the existing collections and also provides several new collection types. The library plays nicely with the standard collections and reuses existing functionality where possible.

Here is a list of useful facts before we throw ourselves into the code.

Immutable Collections

A big part of the collections library is focused on immutability. The reason is the increased demand for concurrent code that can utilize the ever-increasing number of cores in modern computers.

Immutable data is thread-safe because it can never be changed, so it can be shared between multiple threads without worry. This is a huge improvement given the complexity involved in writing thread-safe code that uses mutable data. Furthermore, immutable data can be handed to untrusted code without worrying that the untrusted code will change it, which really helps sustain the security boundaries of your system.

Another benefit of using immutable collections is that they are in most cases more memory efficient and perform better than the regular collections.

To some extent, it was already possible to create immutable collections with the standard Java API. As an example, a set could be made immutable the following way:
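The original listing is not available here, so the following is only a minimal sketch of the standard Java approach. The class name `UnmodifiableSetExample` and the values 1, 2, 3 are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class; the name and the values are illustrative only.
class UnmodifiableSetExample {

    // Wraps a fresh HashSet in a read-only view. Nobody else holds a
    // reference to the HashSet, so the result is effectively immutable.
    static Set<Integer> numbers() {
        return Collections.unmodifiableSet(
                new HashSet<Integer>(Arrays.asList(1, 2, 3)));
    }

    public static void main(String[] args) {
        Set<Integer> numbers = numbers();
        System.out.println(numbers.contains(2)); // prints true
        try {
            numbers.add(4); // mutating methods are rejected by the wrapper
        } catch (UnsupportedOperationException e) {
            System.out.println("add is not supported");
        }
    }
}
```

Note that Set, Collections, HashSet and Arrays all have to be combined to build one small set.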

With Google Collections it would look the following way:
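As a hedged sketch, assuming Google Collections (or its successor, Guava) is on the classpath, the same set could be built like this. The class name `ImmutableSetExample` and the values are assumptions made for illustration.

```java
import com.google.common.collect.ImmutableSet;
import java.util.Set;

// Hypothetical example class; requires Google Collections or Guava.
class ImmutableSetExample {

    // The static factory method infers the element type from its arguments,
    // and the return type itself documents that the set is immutable.
    static ImmutableSet<Integer> numbers() {
        return ImmutableSet.of(1, 2, 3);
    }

    public static void main(String[] args) {
        // ImmutableSet implements Set, so it fits existing code.
        Set<Integer> numbers = numbers();
        System.out.println(numbers.contains(2)); // prints true
        try {
            numbers.add(4); // mutating methods throw instead of modifying
        } catch (UnsupportedOperationException e) {
            System.out.println("add is not supported");
        }
    }
}
```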

The version using Google Collections is better in several ways. First, you only need to know about the ImmutableSet class, whereas the standard Java version combines four classes to achieve the same effect, which is just confusing. Second, when you use the new keyword to construct a generic class, the generic type can’t be inferred from the constructor arguments (this was the case before Java 7 introduced the diamond operator); this limitation does not apply to methods. Google Collections uses static factory methods to create its collections, so the generic type is inferred and the code noise is turned down. Third, the immutable collections are guaranteed to be immutable, whereas an unmodifiable collection is just a read-only view of the underlying collection and can still be changed by code holding a reference to that collection. Finally, the information about immutability is captured in the type, so clients of the type can count on the collection being immutable, which makes defensive copying unnecessary.

Be aware that only the collection itself is promised to be immutable; if the objects it contains are not immutable, they can still be changed. Notice that ImmutableSet implements the Set interface, so it can be used seamlessly with existing code.

Those of you who are familiar with functional programming languages may expect mutating methods like add to return a modified copy of the collection. That is not possible while still implementing the standard Java collections interfaces, so instead these methods just throw an exception. You can of course still combine existing immutable collections into new ones. If, for example, you want to create the union of two sets, you can do it the following way:
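A minimal sketch of such a union, assuming Google Collections or Guava is available. The class name `UnionExample` and the two sample sets are illustrative assumptions, not the original listing.

```java
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;
import java.util.Set;

// Hypothetical example class; requires Google Collections or Guava.
class UnionExample {

    // Sets.union returns an unmodifiable view over both sets;
    // no elements are copied when the view is created.
    static Sets.SetView<Integer> union() {
        Set<Integer> low = ImmutableSet.of(1, 2, 3);
        Set<Integer> high = ImmutableSet.of(3, 4, 5);
        return Sets.union(low, high);
    }

    public static void main(String[] args) {
        Set<Integer> all = union();
        System.out.println(all.size());      // prints 5 (3 appears only once)
        System.out.println(all.contains(4)); // prints true
    }
}
```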

Notice how we use the static union method on the Sets class to create a new SetView based on the two given sets. The SetView is an unmodifiable implementation of the standard Set interface, but it can change if the underlying collections change. Google Collections provides similar convenience classes for working with all the popular collections interfaces from the standard Java API, improving your productivity a lot.

Numerous immutable collections exist in the library; for more information, see the API documentation.

New collections types

Google Collections also provides implementations of a few collections that are not included in the standard Java library. These collections can be very useful in special cases.

Multisets

Multiset is a very useful collection for gathering statistical data. The collection is basically a set that maintains a count of how many instances of each element have been added. So if, as an example, we want to count the occurrences of each number in a list of numbers, it can be done the following way:
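The counting could be sketched roughly as follows, assuming Google Collections or Guava on the classpath. The input list is not part of the surviving text; the one used here is an assumption, constructed only so that it matches the counts reported in the text. The class name `MultisetExample` is hypothetical as well.

```java
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; requires Google Collections or Guava.
class MultisetExample {

    // Assumed input: one 0, six 1s, three 2s and three 3s.
    static List<Integer> sampleNumbers() {
        return Arrays.asList(0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3);
    }

    // Adding every number to the multiset counts the occurrences for us.
    static Multiset<Integer> countOccurrences(List<Integer> numbers) {
        Multiset<Integer> counts = HashMultiset.create();
        counts.addAll(numbers);
        return counts;
    }

    public static void main(String[] args) {
        Multiset<Integer> counts = countOccurrences(sampleNumbers());
        // elementSet() holds each distinct element once.
        for (Integer number : counts.elementSet()) {
            System.out.println(number + " occurs " + counts.count(number) + " time(s)");
        }
    }
}
```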

The resulting multiset after adding the list of numbers contains the following information:

  • 0 has 1 occurrence
  • 1 has 6 occurrences
  • 2 has 3 occurrences
  • 3 has 3 occurrences

Numerous implementations of Multiset exist, so it is a good idea to match your requirements against the capabilities of the different implementations before choosing one.

Multimaps

Have you ever implemented a Map where each key could be associated with a list of values? That’s exactly what a Multimap is. This is useful in many cases, for example indexing. To show the functionality I’ll index fruits by color. The Fruit class is just a class with a name and a color, and the fruits collection contains banana, strawberry, cucumber, kiwifruit, tomato and lemon. First we create a new multimap for the color index, then we go through all the fruits and add them to the color index, using the color of the fruit as the key. After we have constructed the index, we get a collection of red fruits using the get method. Finally, we assert that a tomato is a red fruit.
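The steps described above might look roughly like this. It is a sketch that assumes Google Collections or Guava is available; the `Color` enum, the shape of the `Fruit` class, the color assigned to each fruit and the choice of `ArrayListMultimap` are assumptions made for illustration.

```java
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Multimap;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical example class; requires Google Collections or Guava.
class MultimapExample {

    enum Color { GREEN, RED, YELLOW }

    // Minimal stand-in for the Fruit class described in the text.
    static final class Fruit {
        final String name;
        final Color color;

        Fruit(String name, Color color) {
            this.name = name;
            this.color = color;
        }

        @Override
        public String toString() {
            return color + " " + name;
        }
    }

    static final Fruit TOMATO = new Fruit("Tomato", Color.RED);

    // Assumed colors for the six fruits named in the text.
    static List<Fruit> fruits() {
        return Arrays.asList(
                new Fruit("Banana", Color.YELLOW),
                new Fruit("Strawberry", Color.RED),
                new Fruit("Cucumber", Color.GREEN),
                new Fruit("Kiwifruit", Color.GREEN),
                TOMATO,
                new Fruit("Lemon", Color.YELLOW));
    }

    // Builds the index: each color maps to all fruits of that color.
    static Multimap<Color, Fruit> indexByColor(Iterable<Fruit> fruits) {
        Multimap<Color, Fruit> colorIndex = ArrayListMultimap.create();
        for (Fruit fruit : fruits) {
            colorIndex.put(fruit.color, fruit);
        }
        return colorIndex;
    }

    public static void main(String[] args) {
        Collection<Fruit> redFruits = indexByColor(fruits()).get(Color.RED);
        if (!redFruits.contains(TOMATO)) {
            throw new AssertionError("expected the tomato to be a red fruit");
        }
        System.out.println(redFruits); // prints [RED Strawberry, RED Tomato]
    }
}
```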

BiMap

A BiMap is an implementation of a bidirectional map, that is, a map where a one-to-one relation exists between each key and value. The BiMap is special because it is capable of producing the inverse mapping using the inverse method. Bidirectional maps are also useful for indexing in cases where a one-to-one relationship exists and you don’t want the relationship encoded in the objects. As an example we will look at mapping numbers to their names and back again.

We can easily create a mapping between the numbers and their names using the of method on the ImmutableBiMap implementation. As with Multisets and Multimaps, numerous implementations exist to fit your exact needs.

For a BiMap we can get the inverse BiMap the following way:
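Both steps, creating the mapping and inverting it, can be sketched as follows, assuming Google Collections or Guava is on the classpath. The class name `BiMapExample` and the three sample entries are illustrative assumptions.

```java
import com.google.common.collect.BiMap;
import com.google.common.collect.ImmutableBiMap;

// Hypothetical example class; requires Google Collections or Guava.
class BiMapExample {

    // Maps each number to its name; keys and values must both be unique.
    static BiMap<Integer, String> numberNames() {
        return ImmutableBiMap.of(1, "one", 2, "two", 3, "three");
    }

    public static void main(String[] args) {
        BiMap<Integer, String> names = numberNames();
        System.out.println(names.get(2)); // prints two

        // inverse() gives the mapping from name back to number.
        BiMap<String, Integer> numbers = names.inverse();
        System.out.println(numbers.get("two")); // prints 2
    }
}
```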

Ordering

If you have ever struggled with implementing Java’s Comparator interface, the Ordering class from Google Collections is your friend. The Ordering class provides you with everything you need for handling the ordering of collections. Furthermore, the Ordering class implements the Comparator interface, so it remains compatible with existing code. If you, for example, want to sort the fruits mentioned earlier first by color and then by name, it can be done the following way.

Prints out:

First I create the ordering of fruits by color. The Color enumeration implements Comparable, so we just want to promote the ordering of fruits to use the natural order of the Color class. To do this we create a natural order on the result of the function that retrieves the color from a fruit. To verify that the ordering works, I create a new ImmutableSortedSet with the ordering we just created and print the result. Notice that it is necessary to use the builder pattern in order to create the ImmutableSortedSet with a specified order.

The ordering on fruit names is similar to the way we order by colors except we use a function that retrieves the name of the fruit.

When we have the two orderings, we can make a compound ordering by using the compound method on the ordering that should be applied first. A compound ordering applies each ordering in turn to achieve a hierarchical ordering. In this case the color ordering is applied first; if the objects are equal according to that ordering, the name ordering is applied.
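The steps above can be sketched as follows, assuming Google Collections or Guava is on the classpath. The `Color` enum and its declaration order (GREEN, RED, YELLOW), the `Fruit` class and the color of each fruit are assumptions for illustration, so the resulting order holds only for this assumed setup.

```java
import com.google.common.base.Function;
import com.google.common.collect.ImmutableSortedSet;
import com.google.common.collect.Ordering;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; requires Google Collections or Guava.
class OrderingExample {

    // Assumed enum; its declaration order defines the natural order.
    enum Color { GREEN, RED, YELLOW }

    static final class Fruit {
        final String name;
        final Color color;

        Fruit(String name, Color color) {
            this.name = name;
            this.color = color;
        }
    }

    static Ordering<Fruit> byColorThenName() {
        // Natural order of Color, applied to the color of each fruit.
        Ordering<Fruit> byColor = Ordering.<Color>natural().onResultOf(
                new Function<Fruit, Color>() {
                    public Color apply(Fruit fruit) {
                        return fruit.color;
                    }
                });
        // Natural order of String, applied to the name of each fruit.
        Ordering<Fruit> byName = Ordering.<String>natural().onResultOf(
                new Function<Fruit, String>() {
                    public String apply(Fruit fruit) {
                        return fruit.name;
                    }
                });
        // Color decides first; the name only breaks ties.
        return byColor.compound(byName);
    }

    static List<String> sortedNames() {
        List<Fruit> fruits = Arrays.asList(
                new Fruit("Banana", Color.YELLOW),
                new Fruit("Strawberry", Color.RED),
                new Fruit("Cucumber", Color.GREEN),
                new Fruit("Kiwifruit", Color.GREEN),
                new Fruit("Tomato", Color.RED),
                new Fruit("Lemon", Color.YELLOW));
        // A sorted set with a custom order has to be created via the builder.
        ImmutableSortedSet<Fruit> sorted =
                ImmutableSortedSet.orderedBy(byColorThenName()).addAll(fruits).build();
        List<String> names = new ArrayList<String>();
        for (Fruit fruit : sorted) {
            names.add(fruit.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [Cucumber, Kiwifruit, Strawberry, Tomato, Banana, Lemon]
        System.out.println(sortedNames());
    }
}
```

One design note: a sorted set treats elements that compare as equal as duplicates, which is why this sketch builds the set with the compound ordering rather than with the color ordering alone.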

Another use case of the Ordering class is sorting iterables whose elements are not guaranteed to be unique. Normally you would have to first convert the iterable to a list and then use the sort method on the Collections class to sort the elements. Consider the case of sorting an iterable of numbers in reverse order. This can be done the following way in standard Java:
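A minimal sketch of the standard Java approach, using only the JDK. The class and method names are hypothetical, and the original listing may have copied the elements differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; uses only the standard library.
class StandardReverseSortExample {

    static List<Integer> sortedDescending(Iterable<Integer> numbers) {
        // An Iterable cannot be sorted directly, so copy it into a list first.
        List<Integer> copy = new ArrayList<Integer>();
        for (Integer number : numbers) {
            copy.add(number);
        }
        // reverseOrder() reverses the natural ordering of the elements.
        Collections.sort(copy, Collections.<Integer>reverseOrder());
        return copy;
    }

    public static void main(String[] args) {
        // Duplicates are kept: prints [3, 3, 2, 1]
        System.out.println(sortedDescending(Arrays.asList(3, 1, 2, 3)));
    }
}
```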

The same result can be achieved with Google Collections the following way:
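A hedged sketch of the Google Collections version, assuming the library (or Guava) is on the classpath. The class and method names are hypothetical.

```java
import com.google.common.collect.Ordering;
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; requires Google Collections or Guava.
class OrderingReverseSortExample {

    static List<Integer> sortedDescending(Iterable<Integer> numbers) {
        // sortedCopy accepts any Iterable and returns a new sorted list.
        return Ordering.<Integer>natural().reverse().sortedCopy(numbers);
    }

    public static void main(String[] args) {
        // Duplicates are kept: prints [3, 3, 2, 1]
        System.out.println(sortedDescending(Arrays.asList(3, 1, 2, 3)));
    }
}
```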

This doesn’t seem like a big difference, but the standard Java example involves three different classes, and the generic type is specified twice.

Iterables and Iterators

Iterables and iterators are a powerful abstraction that lets you handle streams of data in a uniform way. But the standard Java library doesn’t provide you with many tools to handle these types. Google Collections includes two classes, Iterables and Iterators, with several convenience methods for working with these types. The two classes contain the same methods, so I’ll only focus on iterables.

Imagine that you are given an iterable of numbers and should find all the even numbers, square them and return a new iterable of the result. Of course this should be done without reading all the numbers in the given iterable up front. In standard Java you would need to implement two iterators and two iterables in order to achieve this. In Google Collections it can be done the following way:
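A sketch of how this could look, assuming Google Collections or Guava is on the classpath. The class name, method name and sample input are assumptions for illustration.

```java
import com.google.common.base.Function;
import com.google.common.base.Predicate;
import com.google.common.collect.Iterables;
import java.util.Arrays;

// Hypothetical example class; requires Google Collections or Guava.
class LazyIterablesExample {

    static Iterable<Integer> squaresOfEvens(Iterable<Integer> numbers) {
        // Predicate that only accepts even numbers.
        Predicate<Integer> isEven = new Predicate<Integer>() {
            public boolean apply(Integer number) {
                return number % 2 == 0;
            }
        };
        // Function that squares a number.
        Function<Integer, Integer> square = new Function<Integer, Integer>() {
            public Integer apply(Integer number) {
                return number * number;
            }
        };
        // Both calls return lazy views; nothing is read from numbers yet.
        Iterable<Integer> evens = Iterables.filter(numbers, isEven);
        return Iterables.transform(evens, square);
    }

    public static void main(String[] args) {
        // The predicate and function run only while this loop iterates.
        for (Integer n : squaresOfEvens(Arrays.asList(1, 2, 3, 4, 5, 6))) {
            System.out.println(n); // prints 4, 16 and 36 on separate lines
        }
    }
}
```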

First we create a predicate function that only accepts even numbers and filter the given numbers by this predicate function. Notice that this is done lazily, so the original iterable is not touched at this point, and at no time is a collection containing all the results constructed and stored in memory. All the methods on the Iterables and Iterators classes are as lazy as possible.

Then we create a square function and apply it to the iterable of even numbers giving us a new iterable with the squared even numbers.

It is important to understand that the predicate and square functions are not applied until the resulting iterable is actually used to iterate through the squared even numbers.

Conclusion

In my opinion Google Collections will definitely improve your code, making it more readable and saving you from creating your own error-prone implementations of functionality already provided and tested by Google. I think it is especially good that the library extends the standard library in a non-intrusive way instead of creating a parallel collections library. This also means that you can choose to use only the convenience classes and avoid spreading Google-specific types all over your system, if that worries you.

21 Comments

  1. This is an excellent post.

    Thanks,
    –Gautam

  2. I will not use Java standard collections API.
    From this moment only Google Collections.
    Thank you for nice introduction article.

  3. Very nice introduction. It seems to be inspired by Scala collections a lot. It’s a pity that functions definitions need so much boilerplate in Java. Thanks a lot.

  4. Thanks for the post. I was planning to explore it since the first time I saw it presented by Bob Lee and Josh Bloch. Good work !

    Nice introduction.

  5. sachin Misurkar

    Thanks for sharing such new and useful info.

  6. phil

    Excellent post!
    I have a question. Do you know a way to sort a multiset based on the number of occurrence of each elements?
    For instance if your multimap is
    apple -> 1
    tomato ->5
    you would like to see tomato first.
    I can’t find an easy way to do that operation that seems to me very usefull.
    In the meantime I used a comparator as follow

    Comparator<Multiset.Entry> occurence_comparator = new Comparator<Multiset.Entry>() {
    public int compare(Multiset.Entry e1, Multiset.Entry e2) {
    return e2.getCount() - e1.getCount();
    }
    };

    Is there a more generic and elegant way to do it?
    Thanks!
    -Phil.

  7. phil

    As you guessed, I meant “multiset” and not multimap in my example.
    Thanks.

  8. Thanks to all of you, I’m really happy you liked the article.

    To you Phil, I can’t come up with a better way than to sort by occurrences as you mentioned:

    Function<Multiset.Entry<Fruit>, Integer> getCountFunction =
        new Function<Multiset.Entry<Fruit>, Integer>() {
          public Integer apply(Multiset.Entry<Fruit> from) {
            return from.getCount();
          }
        };

    Ordering<Multiset.Entry<Fruit>> popular =
        Ordering.natural().reverse().onResultOf(getCountFunction);

    List<Multiset.Entry<Fruit>> sorted =
        popular.sortedCopy(fruits.entrySet());

    for (Multiset.Entry<Fruit> entry : sorted) {
      System.out.println(String.format("%s: %s", entry.getCount(),
        entry.getElement()));
    }

    4: Yellow Banana
    3: Red Strawberry
    2: Green Kiwifruit
    2: Red Tomato
    1: Yellow Lemon

  9. Hiram

    Thanks you very much for your post.
    It is a very clear introduction to the API.
    It helped me to understand the basic concepts.

  10. I like this concept. I visited your blog for the first time and just been your fan. Keep posting as I am gonna come to read it everyday

  11. Thanks you very much for your post.
    It is a very clear introduction to the API.
    It helped me to understand the basic concepts.

  12. I am happy that I learn something new.
    Awesome way of expressing your ideas. You are my favourite.

  13. Rama

    Thank you very much for a nice article..

  14. Buvanes

    Thank you very much…
    Very very useful..
