Java Tips: Memory Optimization for String

String is a unique object in Java. The Java Specification explains several unique properties of String in Java. We might already know some of them. First, String is unique because it can be created without new keyword, like example below.

String s = "new String";

I have to mention that you can still create String object using new keyword, like this:

String s = new String("new String");

Does both statement “exactly equals”? Well, most of you also know that this is not true. The first example will try to reuse the same object whenever possible (and is correct because String is immutable) while the second will force the creation of new String object. Consider this example:

System.out.println("b" == "b");
System.out.println(new String("b") == new String("b"));

The result of first example is “true” while the second one will give “false”.

I almost certain that experienced programmer will never create String using new in normal use. But sometime, we are forced to use that. One case that I can think of is when you parse an XML file using SAX parser.

public class Reader extends DefaultHandler {

    private List<String> listString = new ArrayList<String>();

    public void characters(char[] ch, int start, int length) throws SAXException {

        String content = new String(ch, start, length);
        listString.add(content);

    }
}

This example works correctly but is not efficient. Once you have a document like this:

<test>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
    <string>String</string>
</test>

Try to profile your application, force garbage collection and you will still have ten String objects left in the memory.

Fortunately, Java has provided a method to avoid such case. You can use String.intern() to force the application to use the same String object whenever possible. For above example, you can change the code to something like this:

public class Reader extends DefaultHandler {

    private List<String> listString = new ArrayList<String>();

    public void characters(char[] ch, int start, int length) throws SAXException {

        String content = new String(ch, start, length).intern();
        listString.add(content);

    }
}

Now, re-profile the application, force garbage collection, and you will only have one String left in the memory. You can save a lot of memory if you can make sure that there is only one instance of String with certain value in your JVM.

This method also has nice side effect. If you do a lot of String equality comparison in the application, a same String object run faster. To explain this, we can read the source code of String:

...
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}
...

If the object is same, then the method will be immediately after this line if (this == anObject). This is very fast and will save a lot of process time if your application do this operations a lot of time.

One thought on “Java Tips: Memory Optimization for String”

  1. except that intern() doesn’t use the head but some other memory (perm space). So if you don’t have a lot a perm space you can have not enough memory.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>