Category Archives: java

Maven Plugin: Java Code Formatter

Enforcing a code format across an organization takes discipline, and the best way to enforce it is through automation. A Maven plugin is an easy choice if your organization already uses Maven to manage its builds.

This plugin does exactly that. Its backbone is the Eclipse Java formatter, which I still consider the best in the industry.

Compared to Jalopy, which is no longer free, the Eclipse formatter has the flexibility we usually need. It is not bug-free, but for most cases it is more than enough.

However, I am concerned that the official Maven repository does not carry the latest Eclipse JDT jar; it still only has version 3.3. Hopefully that will change soon, especially once Helios is released, which brings a lot of improvements to the Java formatter.

op4j: Bending the Java spoon

The tagline of op4j is very interesting: ‘Bending the Java spoon’, which implies that the library offers some magic for Java programming. And indeed it does.

The basic idea of the library is to push the fluent-interface style much further. To do this, the developers try to provide as many general-purpose functions as possible; the current version of op4j reportedly ships with more than 1000 of them.

If you read the examples on the website and on the blog, you will find several genuinely original ideas about how programming in Java can be enjoyable. One example:

Calendar date = Op.onListFor(1492, 10, 12).exec(FnCalendar.fieldIntegerListToCalendar()).get();

which, written without op4j, would be something like:

Calendar date = Calendar.getInstance();
date.clear();
date.set(Calendar.DAY_OF_MONTH, 12);
date.set(Calendar.MONTH, Calendar.OCTOBER);
date.set(Calendar.YEAR, 1492);

Although in this particular case some people will argue that the first version is unclear, because the order of the integers can confuse the reader, the fact that it saves so much code is absolutely beautiful.

I love the fact that lately there are many Java libraries whose goal is to make programming more enjoyable.

Little Java Generic Quiz

Let’s say I have the following code:

import java.util.List;

interface TestGeneric {

   public Integer test(List<Integer> a);
   public Boolean test(List<Boolean> a);

}

Can you guess the answers to the following questions?

  • TestGeneric will be reported as an error by Java 6 (true/false)
  • TestGeneric will be reported as an error by Java 7 (true/false)
  • TestGeneric will be reported as an error by Eclipse 3.5 using Java 6 (true/false)
  • TestGeneric will be reported as an error by Eclipse 3.5 using Java 7 (true/false)
  • TestGeneric will be reported as an error by the (upcoming) Eclipse 3.6 using Java 6 (true/false)
  • TestGeneric will be reported as an error by the (upcoming) Eclipse 3.6 using Java 7 (true/false)

What’s your guess?

The correct answers are (highlight to see them): false, true, false, (can Eclipse 3.5 even run Java 7?), true, true.

The fact that statement 1 is false is related to this bug entry: http://bugs.sun.com/view_bug.do?bug_id=6182950

Eclipse (as of 3.6M2) offers more consistent behavior by treating all of these cases as compile errors, as discussed here: https://bugs.eclipse.org/bugs/show_bug.cgi?id=289247

And op4j uses exactly this feature, so it will not compile under Java 7 or with Eclipse 3.6.
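For what it’s worth, here is a small caller sketch of my own (not from the original post) showing why such overloads are attractive in the first place: each call site resolves unambiguously from the generic argument type, even though both methods share the erasure test(List). It assumes the interface above compiled, e.g. under a Java 6 compiler.

import java.util.Arrays;

public class TestGenericCaller {

    // Overload resolution picks the right method from the generic type,
    // so the caller never notices the erasure clash.
    static void useIt(TestGeneric t) {
        Integer size = t.test(Arrays.asList(1, 2, 3));      // calls test(List<Integer>)
        Boolean empty = t.test(Arrays.asList(true, false)); // calls test(List<Boolean>)
        System.out.println(size + " " + empty);
    }
}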

Computing Map on Google Collections


Google always produces interesting projects. My toy nowadays is Google Collections. I don’t think I need to reintroduce it, as it has already been nicely covered in several blog posts.

Of course, two videos from GTUG are also nice.

Now I want to discuss one piece of functionality from Google Collections that is not really covered by those articles: the computing map. No, you won’t find a class with that name in the Javadoc.

It is basically a map where the keys are the parameters of a calculation and the values are its results. You have probably faced a scenario where you need to run a complex algorithm many times, and, fortunately, many of those runs use the same parameters. So instead of doing the same work over and over again, wouldn’t it be better to cache the result and reuse it later?

That is basically the idea, and you can even implement it without Google Collections. You could write something like this:

private final Map<Parameter, Result> cache = new HashMap<Parameter, Result>(100000);

public Result getResult(Parameter p) {
        if (!cache.containsKey(p)) {
            prepareCache(p); // the complex calculation
        }

        return cache.get(p);
}

Easy, yes?

But wait… there is a problem with this code. What if two threads ask for the same calculation at almost the same time? You may not get a wrong result, but the code will still do the calculation twice because of the race. There is no easy fix for this; double-checked locking simply fails here.

And more problems may arise. Over time, so many different parameters may be used that your map grows without limit. This is a standard problem for any cache implementation; using soft references or a third-party cache implementation may solve it.

In the end, our solution is not so simple anymore.

Here Google Collections can help us. MapMaker is a very powerful factory class that lets you combine almost any map features you can think of. Need a map with soft-referenced keys and weakly referenced values? Need a map with strong keys and soft-referenced values? MapMaker lets you build it… the easy way.

And it gives us a computing map. A computing map is created with MapMaker by calling ‘makeComputingMap’ and supplying a function that transforms a Parameter into a Result.

Our earlier example then becomes something like this:

private final Map<Parameter, Result> cache;

public Cache() {
    cache = new MapMaker().makeComputingMap(new Function<Parameter, Result>() {

            @Override
            public Result apply(Parameter from) {
                return prepareCache(from);
            }
        });
}

public Result getResult(Parameter p) {
        return cache.get(p);
}

That is basically all. The documentation of the method is like this:

Builds a map that supports atomic, on-demand computation of values.
Map#get either returns an already-computed value for the given key,
atomically computes it using the supplied function, or, if another thread
is currently computing the value for this key, simply waits for that thread
to finish and returns its computed value. Note that the function may be
executed concurrently by multiple threads, but only for distinct keys.

If an entry’s value has not finished computing yet, query methods
besides get return immediately as if an entry doesn’t exist. In
other words, an entry isn’t externally visible until the value’s
computation completes.

Map#get on the returned map will never return null. It
may throw:

  • NullPointerException if the key is null or the computing
    function returns null

  • ComputationException if an exception was thrown by the
    computing function. If that exception is already of type
    ComputationException, it is propagated directly; otherwise it is
    wrapped.

Note: Callers of get must ensure that the key
argument is of type K. The get method accepts
Object, so the key type is not checked at compile time. Passing an object
of a type other than K can result in that object being unsafely
passed to the computing function as type K, and unsafely stored in
the map.

If put is called before a computation completes, other
threads waiting on the computation will wake up and return the stored
value. When the computation completes, its new result will overwrite the
value that was put in the map manually.

This method does not alter the state of this MapMaker instance,
so it can be invoked again to create multiple independent maps.

So you get synchronization for free. And best of all, the synchronization doesn’t lock the whole map; only threads that access the same key wait for each other.
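To make the compute-once behaviour concrete, here is a minimal, self-contained sketch of my own (not the Cache class above), assuming Google Collections is on the classpath:

import java.util.concurrent.ConcurrentMap;

import com.google.common.base.Function;
import com.google.common.collect.MapMaker;

public class SquareCacheDemo {

    public static void main(String[] args) {
        ConcurrentMap<Integer, Integer> squares = new MapMaker()
                .makeComputingMap(new Function<Integer, Integer>() {
                    @Override
                    public Integer apply(Integer from) {
                        // This line is printed only once per distinct key.
                        System.out.println("computing " + from);
                        return from * from;
                    }
                });

        System.out.println(squares.get(32)); // computes and prints 1024
        System.out.println(squares.get(32)); // returns the cached 1024, no recomputation
    }
}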

But there is still a problem with the Cache example above… it still uses strong references for both keys and values, which is the default if you don’t specify anything on the MapMaker. The map will still grow without limit and you will eventually get an OutOfMemoryError.

Well, the fix is easy… just add a call to softValues() when creating the map.

private final Map<Parameter, Result> cache;

public Cache() {
    cache = new MapMaker().softValues().makeComputingMap(new Function<Parameter, Result>() {

            @Override
            public Result apply(Parameter from) {
                return prepareCache(from);
            }
        });
}

public Result getResult(Parameter p) {
        return cache.get(p);
}

Now you have a proper computing-map implementation. The keys and values will be held as long as there is enough memory, but once the JVM needs more memory, the garbage collector will remove entries from the map. Your application will then have to redo the expensive calculation for those entries, but I think that is the best trade-off we can get. And of course you can always increase the JVM heap size if you need to.

Note that you don’t want to use softKeys(). Look at its Javadoc:

Note: the map will use identity ({@code ==}) comparison
to determine equality of soft keys, which may not behave as you expect.
For example, storing a key in the map and then attempting a lookup
using a different but {@link Object#equals(Object) equals}-equivalent
key will always fail.

Hmm… that means your keys are only considered equal if they are the same object. If you recreate a Parameter with the same values, even with equals and hashCode overridden correctly, you will not hit the pre-computed value. On the other hand, softValues alone is enough, because once a value is garbage collected its key is removed as well. See this bug entry for more information: http://code.google.com/p/google-collections/issues/detail?id=250 or this discussion in the group: http://groups.google.com/group/google-collections-users/browse_frm/thread/8e4bd19f5cfa9adb/24e9d9de34fadb6f?lnk=gst&q=soft+reference+identity#24e9d9de34fadb6f.

And if you still think you have a use case for soft keys with equality semantics, I have a patch for MapMaker you can use. It’s not pretty and rather hacky, but as far as I can tell it works. I personally don’t use it anymore, but maybe I will in the future (if I ever find a strong use case for it, which I doubt).

MapMaker.java

Lambda4JDT: lambda expression on your Java code

This is just more proof of how many people are starting to want some functional syntax in Java. Some lambda expressions can already be emulated in current Java, but the result carries too much noise.

Lambda4JDT is a new Eclipse plugin that can collapse the noise in such implementations without changing Java’s syntax. I must say the approach it takes is quite unique and original.


Java Tips: Optimizing your Map loop

Quite often, a program needs to go through all elements of a Map. Unfortunately, like a Set, a Map has no index, so you can’t simply ask for the key or the value at a certain position.

The most common way to iterate over all elements of a Map is to get the key set and then retrieve each value by its key. The template for that looks like this:

for (String k : m.keySet()) {
    Integer v = m.get(k);
    // do something with the key and value
}

This works, of course, but on quick inspection it should not be that efficient, because we have to look the value up by key on every step. It would be better to get the key and value together as a pair. This approach is unfortunately less obvious and less used. The template is something like this (adjusting the generic types as appropriate):

for (Entry<String, Integer> e : m.entrySet()) {
    String k = e.getKey();
    Integer v = e.getValue();
    // do something with the key and value
}

Now let’s run some tests. I use the template below, varying the backing implementation, the number of iterations, and the method used to get the keys and values from the map (admittedly the test is not entirely realistic, because it only uses the values and ignores the keys, in which case values() would be the better choice).

public static void main(String[] args) {
	java.util.Map<String, Integer> m = new TreeMap<String, Integer>();
	for (int i = 0; i < 500000; i++) {
		m.put(i + "", i);
	}

	List<Integer> l = new ArrayList<Integer>();

	long st = System.currentTimeMillis();

	for (String string : m.keySet()) {
		l.add(m.get(string));
	}

	System.out.println(System.currentTimeMillis() - st);
}
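The entrySet() counterpart of the timing loop is a drop-in replacement with the same map setup (this is my reconstruction, since only the keySet() version of the benchmark is shown above):

	long st = System.currentTimeMillis();

	// second template: get key and value together as an entry
	for (Map.Entry<String, Integer> e : m.entrySet()) {
		l.add(e.getValue());
	}

	System.out.println(System.currentTimeMillis() - st);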

From my tests, I observe that the performance of the first template depends on the backing implementation. With a HashMap, it is more or less the same as the second template. With a Hashtable, the first template is about 1.5 times slower than the second, and with a TreeMap it is more than 5 times slower. This confirms my original hypothesis, and we as developers should train ourselves to reach for the second template whenever we face this problem.

UPDATE: I redid the test using System.nanoTime() and it confirms my original observation.

Java Tips: Thread Safety Documentation

Joshua Bloch, in his book “Effective Java”, summarizes the levels of thread safety:

  • immutable—Instances of this class appear constant. No external synchronization is necessary. Examples include String, Long, and BigInteger (Item 15).
  • unconditionally thread-safe—Instances of this class are mutable, but the class has sufficient internal synchronization that its instances can be used concurrently without the need for any external synchronization. Examples include Random and ConcurrentHashMap.
  • conditionally thread-safe—Like unconditionally thread-safe, except that some methods require external synchronization for safe concurrent use. Examples include the collections returned by the Collections.synchronized wrappers, whose iterators require external synchronization.
  • not thread-safe—Instances of this class are mutable. To use them concurrently, clients must surround each method invocation (or invocation sequence) with external synchronization of the clients’ choosing. Examples include the general-purpose collection implementations, such as ArrayList and HashMap.
  • thread-hostile—This class is not safe for concurrent use even if all method invocations are surrounded by external synchronization. Thread hostility usually results from modifying static data without synchronization. No one writes a thread-hostile class on purpose; such classes result from the failure to consider concurrency. Luckily, there are very few thread-hostile classes or methods in the Java libraries. The System.runFinalizersOnExit method is thread-hostile and has been deprecated.

Let’s look at a minimal example class for each of these levels.

Immutable

public final class ImmutableClass {

    // final field and no setters: instances never change after construction
    private final String a;

    public ImmutableClass(String a) {
        this.a = a;
    }

    public String getA() {
        return a;
    }

}

Unconditionally Thread Safe

public class UnconditionallyThreadSafeClass {

    private String a;

    private final Object syncObject = new Object();

    public UnconditionallyThreadSafeClass(String a) {
        this.a = a;
    }

    public void setA(String a) {
        synchronized (syncObject) {
            this.a = a;
        }
    }

    public String getA() {
        synchronized (syncObject) {
            return a;
        }
    }

}

Conditionally Thread Safe

import java.util.ArrayList;

public class ConditionallyThreadSafeClass {

    private ArrayList<String> a;

    private final Object syncObject = new Object();

    public ConditionallyThreadSafeClass(ArrayList<String> a) {
        this.a = a;
    }

    public void add(String s) {
        synchronized (syncObject) {
            a.add(s);
        }
    }

    public ArrayList<String> getA() {
        // callers must synchronize externally when iterating or modifying the returned list
        return a;
    }

}
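The canonical example from Bloch’s list, the Collections.synchronized wrappers, shows the same idea: individual calls are safe on their own, but compound actions such as iteration need external synchronization on the wrapper itself. A minimal usage sketch:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListUsage {

    public static void main(String[] args) {
        List<String> list = Collections.synchronizedList(new ArrayList<String>());
        list.add("a"); // individual calls are safe without extra locking
        list.add("b");

        // Iteration is a compound action: callers must hold the list's lock themselves.
        synchronized (list) {
            for (String s : list) {
                System.out.println(s);
            }
        }
    }
}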

Not Thread Safe

public class NotThreadSafeClass {

    private String a;

    public NotThreadSafeClass(String a) {
        this.a = a;
    }

    public String getA() {
        return a;
    }

    public void setA(String a) {
        this.a = a;
    }
}

Thread Hostile

public class ThreadHostileClass {

    private static String a;

    public ThreadHostileClass(String a) {
        ThreadHostileClass.a = a;
    }

    public String getA() {
        return a;
    }

    public void doSomethingToA(final String a) {
        new Thread() {
            public void run() {
                ThreadHostileClass.a = a;
            }
        }.start();
    }
}
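A tiny sketch of my own showing why the class above is hostile: the field is static and mutated without synchronization, so supposedly independent instances clobber each other’s state.

public class ThreadHostileDemo {

    public static void main(String[] args) {
        ThreadHostileClass x = new ThreadHostileClass("x");
        ThreadHostileClass y = new ThreadHostileClass("y");

        // Both constructors wrote to the same static field,
        // so x has silently lost its value.
        System.out.println(x.getA()); // prints "y"
        System.out.println(y.getA()); // prints "y"
    }
}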

Java Tips: Memory Optimization for String

String is a unique object in Java; the Java Language Specification describes several special properties of String, and we probably already know some of them. First, a String can be created without the new keyword, as in the example below.

String s = "new String";

I should mention that you can still create a String object using the new keyword, like this:

String s = new String("new String");

Are the two statements exactly equivalent? Most of you know that they are not. The first form reuses the same object whenever possible (which is safe because String is immutable), while the second forces the creation of a new String object. Consider this example:

System.out.println("b" == "b");
System.out.println(new String("b") == new String("b"));

The first line prints “true” while the second prints “false”.

I am almost certain that an experienced programmer will never create a String with new in normal use. But sometimes we are forced to. One case I can think of is parsing an XML file with a SAX parser.

import java.util.ArrayList;
import java.util.List;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class Reader extends DefaultHandler {

    private List<String> listString = new ArrayList<String>();

    public void characters(char[] ch, int start, int length) throws SAXException {

        String content = new String(ch, start, length);
        listString.add(content);

    }
}

This example works correctly but is not memory-efficient. Suppose you have a document like this:

<test>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
</test>

Profile the application, force a garbage collection, and you will still see ten equal String objects left in memory.

Fortunately, Java provides a method to avoid this. You can use String.intern() to make the application reuse the same canonical String object whenever possible. For the example above, you can change the code to something like this:

public class Reader extends DefaultHandler {

    private List<String> listString = new ArrayList<String>();

    public void characters(char[] ch, int start, int length) throws SAXException {

        String content = new String(ch, start, length).intern();
        listString.add(content);

    }
}

Now re-profile the application, force a garbage collection, and you will have only one such String left in memory. You can save a lot of memory if you can ensure there is only one String instance per distinct value in your JVM.
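A quick sanity check (a minimal sketch of my own): once interned, two distinct String objects with the same content resolve to the same canonical instance.

String a = new String("b").intern();
String b = new String("b").intern();

System.out.println(a == b);   // true: both refer to the canonical instance
System.out.println(a == "b"); // also true: string literals are interned as well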

This method also has a nice side effect: if your application does a lot of String equality comparisons, comparing the same String object is faster. To see why, we can read the source code of String.equals():

...
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}
...

If both references point to the same object, the method returns immediately at the line if (this == anObject). This is very fast and will save a lot of processing time if your application performs this comparison frequently.

Spring Tips: Initializing bean using EasyMock for unit test

Unit testing is a very good practice for building robust applications. With Spring, we can usually avoid the ApplicationContext and wire all dependencies programmatically in the test. Once in a while, however, we still need to test against an actual ApplicationContext, and that can be awkward when the test uses mocks created with EasyMock.

Here is how you would create a mock with EasyMock programmatically:

AccountDAO accountDAOMock = EasyMock.createMock(AccountDAO.class);

To move it into the Spring ApplicationContext, declare it like this:

<bean id="accountDAOMock" class="org.easymock.EasyMock"
		factory-method="createMock">
      <constructor-arg index="0"
            value="com.argus.camc.datamanagement.interfaces.AccountDAO" />
</bean>
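Once the context is loaded, the test can record and verify expectations on the bean as usual. A hedged sketch (the getBean lookup and the findAccount method on AccountDAO are illustrative assumptions, not part of the original configuration):

// Illustrative only: assumes the test has the ApplicationContext available
// and that AccountDAO declares something like findAccount(String).
AccountDAO accountDAOMock = (AccountDAO) applicationContext.getBean("accountDAOMock");

EasyMock.expect(accountDAOMock.findAccount("12345")).andReturn(expectedAccount);
EasyMock.replay(accountDAOMock);

// ... exercise the code under test, which receives accountDAOMock via Spring ...

EasyMock.verify(accountDAOMock);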

Happy unit testing!