Computing Map on Google Collections


Google always makes interesting projects. My toy nowadays is Google Collections. I don’t think I need to reintroduce it, as it has already been nicely covered in several blog posts.

Of course, two videos from GTUG are also nice.

Now I want to discuss a piece of Google Collections functionality that those articles do not really cover: the Computing Map. No, you won’t find a class with that name in the Javadoc.

It is basically a map whose keys are the parameters of a calculation and whose values are the results of that calculation. You have probably faced a scenario where you need to do a lot of computation with a complex algorithm. Fortunately, let’s say that many of those calculations use the same parameters. So instead of doing the same operation over and over again, wouldn’t it be better to cache the result and reuse it later?

That is basically the idea and you can even implement it without using Google Collections. You can code something like this:

private final Map<Parameter, Result> cache = new HashMap<Parameter, Result>(100000);

public Result getResult(Parameter p) {
    if (!cache.containsKey(p)) {
        prepareCache(p); // the complex calculation; expected to put its result into the cache
    }
    return cache.get(p);
}

Easy, yes?

But wait… there is a problem with this code. First, what if two calculations for the same parameter are started at almost the same time? You won’t get a wrong result, but the code will still do the calculation twice, because the check and the calculation are not atomic across threads. There is no easy solution for this case; double-checked locking simply fails here.
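To give an idea of how much machinery a hand-rolled fix needs, here is a minimal sketch using only the JDK, modelled on the well-known FutureTask memoizer pattern. The class name Memoizer and its methods are my own illustration and are not part of the code above or of Google Collections.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

// Hypothetical hand-rolled cache; all names here are illustrative only.
class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> computation;

    Memoizer(Function<K, V> computation) {
        this.computation = computation;
    }

    V get(K key) {
        Future<V> future = cache.get(key);
        if (future == null) {
            FutureTask<V> task = new FutureTask<>(() -> computation.apply(key));
            // Only one thread wins the race to register its task for this key.
            future = cache.putIfAbsent(key, task);
            if (future == null) {
                future = task;
                task.run(); // the winner performs the calculation
            }
        }
        try {
            return future.get(); // the other threads block here until the winner is done
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            cache.remove(key, future); // do not keep failed calculations in the cache
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Caching a Future instead of the value itself is what lets a second thread wait for a running calculation rather than start its own. It works, but it is already a lot more code than the naive version, and it still does nothing about the memory problem.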

More problems may arise. Over time, more and more distinct parameters may be used, and your Map will grow without limit. This is a standard problem for any cache implementation; using soft references or a third-party cache implementation may solve it.
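As a rough sketch of the soft-reference idea, again with the JDK only: the values are wrapped in SoftReference so that the GC may reclaim them under memory pressure. The class name SoftValueCache is hypothetical, and the coarse synchronized method is a simplification that serializes all callers.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; the names are illustrative only.
class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> cache = new HashMap<>();
    private final Function<K, V> computation;

    SoftValueCache(Function<K, V> computation) {
        this.computation = computation;
    }

    // One lock for the whole cache: simple, but every caller waits for every other.
    synchronized V get(K key) {
        SoftReference<V> ref = cache.get(key);
        V value = (ref == null) ? null : ref.get(); // null if never computed or already collected
        if (value == null) {
            value = computation.apply(key);
            cache.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

Note that this sketch never removes keys whose values have been collected, so the stale entries themselves still pile up. A complete implementation would also need a ReferenceQueue to clean them out.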

In the end, our solution is not so simple anymore.

Here Google Collections can help us. MapMaker is a very powerful factory class that lets you combine the Map features you need. Need a Map with soft-reference keys and weak-reference values? Need a Map with strong keys and soft-reference values? MapMaker lets you do that… the easy way.

It also provides us with a Computing Map. A computing map is created with MapMaker by calling the method ‘makeComputingMap’ and passing a function that transforms a Parameter into a Result.

Our earlier example becomes something like this:

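import com.google.common.base.Function;
import com.google.common.collect.MapMaker;

private final Map<Parameter, Result> cache;

public Cache() {
    // The function is invoked on demand, the first time a key is requested.
    cache = new MapMaker().makeComputingMap(new Function<Parameter, Result>() {
        public Result apply(Parameter from) {
            return prepareCache(from); // the complex calculation, now returning its result
        }
    });
}

public Result getResult(Parameter p) {
    return cache.get(p);
}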

That is basically all. The documentation of the method is like this:

Builds a map that supports atomic, on-demand computation of values.
Map#get either returns an already-computed value for the given key,
atomically computes it using the supplied function, or, if another thread
is currently computing the value for this key, simply waits for that thread
to finish and returns its computed value. Note that the function may be
executed concurrently by multiple threads, but only for distinct keys.

If an entry’s value has not finished computing yet, query methods
besides get return immediately as if an entry doesn’t exist. In
other words, an entry isn’t externally visible until the value’s
computation completes.

Map#get on the returned map will never return null. It
may throw:

  • NullPointerException if the key is null or the computing
    function returns null

  • ComputationException if an exception was thrown by the
    computing function. If that exception is already of type
    ComputationException, it is propagated directly; otherwise it is
    wrapped.
Note: Callers of get must ensure that the key
argument is of type K. The get method accepts
Object, so the key type is not checked at compile time. Passing an object
of a type other than K can result in that object being unsafely
passed to the computing function as type K, and unsafely stored in
the map.

If put is called before a computation completes, other
threads waiting on the computation will wake up and return the stored
value. When the computation completes, its new result will overwrite the
value that was put in the map manually.

This method does not alter the state of this MapMaker instance,
so it can be invoked again to create multiple independent maps.

So you get synchronization for free. And best of all, the synchronization doesn’t lock the whole Map; only threads that access the same key wait for each other.
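If you want to play with these semantics without Google Collections on the classpath, the closest JDK analogue I know of is ConcurrentHashMap.computeIfAbsent (Java 8 and later). The sketch below is not the MapMaker API, and the class name ComputingCache is made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// JDK-only stand-in for a computing map; the names are illustrative only.
class ComputingCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> computation;

    ComputingCache(Function<K, V> computation) {
        this.computation = computation;
    }

    V get(K key) {
        // Atomic per key: the function is applied at most once for an absent key,
        // and concurrent callers asking for the same key get the stored result.
        return cache.computeIfAbsent(key, computation);
    }
}
```

There are differences from the Javadoc quoted above. If the function returns null, computeIfAbsent records no mapping and returns null instead of throwing a NullPointerException. Also, while a computation is in progress, ConcurrentHashMap may block some update operations by other threads, so the function should stay short and must not touch the same map.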

But there is still a problem with that code… It still uses strong references for both keys and values. That’s the default if you don’t specify anything in the MapMaker. Your map will still grow without limit and you will eventually get an OutOfMemoryError.

Well, it’s easy… just add a call to softValues() when creating the map.

private final Map<Parameter, Result> cache;

public Cache() {
    // softValues() lets the GC reclaim cached results when memory runs low.
    cache = new MapMaker().softValues().makeComputingMap(new Function<Parameter, Result>() {
        public Result apply(Parameter from) {
            return prepareCache(from); // the complex calculation
        }
    });
}

public Result getResult(Parameter p) {
    return cache.get(p);
}

Now you have a proper implementation of a computing map. The keys and values are held as long as you have enough memory, but once the JVM needs more memory, the GC will remove entries from the Map. Your application will then have to redo the complex calculation, but I think that is the best we can get. Of course, you can always increase the JVM memory.

Note that you don’t want to use softKeys. Look at the Javadoc of softKeys.

Note: the map will use identity (==) comparison
to determine equality of soft keys, which may not behave as you expect.
For example, storing a key in the map and then attempting a lookup
using a different but equals-equivalent
key will always fail.

Hmm… that means two keys are considered equal only if they are the same object. If you recreate a Parameter with the same value, then even if you override equals and hashCode correctly, you will not get the pre-computed value. On the other hand, using just softValues is enough, because once a value is GC-ed, its key is removed as well. See the related bug entry or the discussion in the groups for more information.
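The identity-comparison pitfall is easy to reproduce with the JDK alone. In the sketch below, IdentityHashMap stands in for a soft-keys map only because it also compares keys with ==. It has nothing to do with soft references, and the class name IdentityLookupDemo is made up.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Illustration only: IdentityHashMap is a stand-in for the == comparison of soft keys.
class IdentityLookupDemo {
    /** Stores a value under one key, then looks it up with an equal but distinct key. */
    static boolean foundWithEqualKey(Map<String, String> map) {
        String stored = new String("param");
        String lookup = new String("param"); // equals(stored), but a different object
        map.put(stored, "result");
        return map.containsKey(lookup);
    }
}
```

Calling foundWithEqualKey(new HashMap<>()) returns true, while foundWithEqualKey(new IdentityHashMap<>()) returns false. The second case is exactly the cache miss you would get on every request built from a freshly created Parameter.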

And if you still think that you have a use case for soft references compared by equality, I have a patch to the MapMaker you can use. It’s not nice and pretty hacky, but it works as far as I can tell. I personally don’t use it anymore, but maybe I will in the future (if I find a strong use case for it, which I doubt).
