Java Tips: Optimizing your Map loop

Quite often, a program needs to go through all the elements of a Map. Unfortunately, like a Set, a Map is not indexed, so you can’t simply ask for the key or the value at a given position.

The most common way to iterate over all the elements of a Map is to get the key set and then retrieve each value by its key. The template for this looks something like this:

for (String k : m.keySet()) {
    Integer v = m.get(k);
    // do something with the key and value
}

This works, of course, but at first glance it should not be that efficient, because every step requires a separate lookup of the value by its key. It would be better to get the key and the value as a pair from the start. Unfortunately, this approach is less obvious and less widely used. The template looks something like this (changing the generic types as appropriate):

for (Map.Entry<String, Integer> e : m.entrySet()) {
    String k = e.getKey();
    Integer v = e.getValue();
    // do something with the key and value
}
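
To make the second template concrete, here is a minimal, self-contained sketch that uses both the key and the value of each entry. The class name EntrySetDemo, the method describe, and the sample data are made up for illustration; a LinkedHashMap is used only so that the iteration order is predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; the name and the sample data are illustrative only.
public class EntrySetDemo {

    // Builds a "key=value, key=value" string by walking the entry set once.
    // No m.get(key) call is needed, because each entry already carries its value.
    public static String describe(Map<String, Integer> m) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : m.entrySet()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap iterates in insertion order, so the output is stable.
        Map<String, Integer> m = new LinkedHashMap<String, Integer>();
        m.put("one", 1);
        m.put("two", 2);
        System.out.println(describe(m)); // prints "one=1, two=2"
    }
}
```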

Now let’s run some tests. I used the template below, varying the backing object, the number of iterations, and the method used to get the key and value from the map. (Of course, it is not very realistic, because we only use the value and ignore the key, in which case values() would perform better.)

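import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// The class name is arbitrary; swap TreeMap for HashMap or Hashtable to compare.
public class MapLoopTest {
	public static void main(String[] args) {
		Map<String, Integer> m = new TreeMap<String, Integer>();
		for (int i = 0; i < 500000; i++) {
			m.put(i + "", i);
		}

		List<Integer> l = new ArrayList<Integer>();

		// Time only the loop, not the setup above.
		long st = System.currentTimeMillis();

		// First template: iterate over the keys and look up each value.
		for (String key : m.keySet()) {
			l.add(m.get(key));
		}

		// Elapsed time in milliseconds.
		System.out.println(System.currentTimeMillis() - st);
	}
}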
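
As noted before the test template, when only the values are needed, the key lookup can be skipped entirely by iterating over values(). A minimal sketch follows; the class name ValuesDemo, the method collectValues, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the name and the sample data are illustrative only.
public class ValuesDemo {

    // Copies every value of the map into a list without touching the keys.
    public static List<Integer> collectValues(Map<String, Integer> m) {
        List<Integer> out = new ArrayList<Integer>();
        for (Integer v : m.values()) {
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        // A TreeMap iterates in key order, so values come out as a, b, c.
        Map<String, Integer> m = new TreeMap<String, Integer>();
        m.put("b", 2);
        m.put("a", 1);
        m.put("c", 3);
        System.out.println(collectValues(m)); // prints "[1, 2, 3]"
    }
}
```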

From my tests, the performance of the first template depends on the backing object. With a HashMap, it performs more or less the same as the second template. With a Hashtable, the first template is about 1.5 times slower than the second, and with a TreeMap it is more than 5 times slower. This is consistent with the cost of get(): a HashMap lookup takes expected constant time, while a TreeMap lookup has to walk the tree, which takes O(log n) time per key. The results support my original hypothesis, and we as developers should make the second template our default whenever we need both the keys and the values.

UPDATE: I reran the test using System.nanoTime(), and it confirmed my original observation.
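
For anyone who wants to repeat the comparison, here is a rough sketch of a nanoTime-based run over the three map types. It is not the exact code used for the numbers above: the class name MapLoopTiming and its helper methods are made up for illustration, and a single-shot timing like this is sensitive to JIT warm-up and garbage collection, so treat the output as indicative only. A dedicated harness such as JMH would give more trustworthy figures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical timing sketch; names are illustrative, results vary by machine and JVM.
public class MapLoopTiming {

    // First template: iterate over the keys and look up each value.
    public static long sumViaKeySet(Map<String, Integer> m) {
        long sum = 0;
        for (String k : m.keySet()) {
            sum += m.get(k);
        }
        return sum;
    }

    // Second template: iterate over the entries, no extra lookup.
    public static long sumViaEntrySet(Map<String, Integer> m) {
        long sum = 0;
        for (Map.Entry<String, Integer> e : m.entrySet()) {
            sum += e.getValue();
        }
        return sum;
    }

    // Fills the map with the keys "0".."n-1" mapped to 0..n-1.
    public static void fill(Map<String, Integer> m, int n) {
        for (int i = 0; i < n; i++) {
            m.put(Integer.toString(i), i);
        }
    }

    public static void main(String[] args) {
        int n = 500000;
        List<Map<String, Integer>> maps = new ArrayList<Map<String, Integer>>();
        maps.add(new HashMap<String, Integer>());
        maps.add(new Hashtable<String, Integer>());
        maps.add(new TreeMap<String, Integer>());

        for (Map<String, Integer> m : maps) {
            fill(m, n);
            long t0 = System.nanoTime();
            long a = sumViaKeySet(m);
            long t1 = System.nanoTime();
            long b = sumViaEntrySet(m);
            long t2 = System.nanoTime();
            // Printing the sums keeps the JIT from discarding the loops as dead code.
            System.out.println(m.getClass().getSimpleName()
                    + ": keySet " + (t1 - t0) / 1000000 + " ms, entrySet "
                    + (t2 - t1) / 1000000 + " ms (sums " + a + ", " + b + ")");
        }
    }
}
```

Summing the values instead of adding them to a list keeps the measured work focused on the iteration itself rather than on ArrayList growth.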
