Wednesday, October 10, 2007

Manipulating Strings Intelligently

Performance and optimization are key concerns in Java programming. Here are some examples of how to improve your String manipulation.



String is one of the most frequently encountered classes in Java programming. As of Tiger (J2SE 5.0), we have three classes for manipulating strings: String, StringBuffer, and StringBuilder.

String is immutable, whereas StringBuffer and StringBuilder are mutable: their contents can be changed in place.
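To make the distinction concrete, here is a minimal sketch. The class and method names (ImmutabilityDemo, originalAfterConcat, builderAfterAppend) are made up for illustration; only String.concat and StringBuilder.append are standard library calls.

```java
class ImmutabilityDemo {
    // concat() does not modify the receiver; it returns a new String.
    static String originalAfterConcat() {
        String original = "Java";
        original.concat(" SE");   // result discarded on purpose
        return original;          // still "Java"
    }

    // append() modifies the builder in place.
    static String builderAfterAppend() {
        StringBuilder builder = new StringBuilder("Java");
        builder.append(" SE");
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(originalAfterConcat()); // prints "Java"
        System.out.println(builderAfterAppend());  // prints "Java SE"
    }
}
```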

StringBuilder was introduced in J2SE 5.0 (Tiger). The main difference between StringBuffer and StringBuilder is that StringBuffer's methods are synchronized whereas StringBuilder's are not. So when the object is only ever used by a single thread, it is better to use StringBuilder, which is generally more efficient because it avoids the synchronization overhead.
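The following sketch shows what the synchronization buys you. Several threads append to one shared StringBuffer, and because each append() call is synchronized, no characters are lost. The class SharedBufferDemo and its method are hypothetical names for this illustration. Swapping in a StringBuilder here would give no such guarantee.

```java
class SharedBufferDemo {
    // Starts threadCount threads that each append appendsPerThread
    // characters to the same StringBuffer, then returns its final length.
    static int appendFromThreads(int threadCount, final int appendsPerThread) {
        final StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < appendsPerThread; j++) {
                        shared.append('x'); // synchronized call
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 10000)); // prints 40000
    }
}
```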

Here are some examples from Mr. Heinz Kabutz on how to perform String manipulation intelligently.

We start with a basic concatenation based on +=:

public static String concat1(String s1, String s2, String s3,
                             String s4, String s5, String s6) {
    String result = "";
    result += s1;
    result += s2;
    result += s3;
    result += s4;
    result += s5;
    result += s6;
    return result;
}

String is immutable, so the compiled code will create many intermediate String objects, which can strain the garbage collector. A common remedy is to introduce StringBuffer, causing it to look like this:

public static String concat2(String s1, String s2, String s3,
                             String s4, String s5, String s6) {
    StringBuffer result = new StringBuffer();
    result.append(s1);
    result.append(s2);
    result.append(s3);
    result.append(s4);
    result.append(s5);
    result.append(s6);
    return result.toString();
}

But the code is becoming less legible, which is undesirable.

Using JDK 6.0_02 and the server HotSpot compiler, I can execute concat1() a million times in 2013 milliseconds, but concat2() in 734 milliseconds. At this point, I might congratulate myself for making the code almost three times faster. However, the user won't notice it if 0.1 percent of the program becomes three times faster.
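For readers who want to try a comparison like this themselves, here is a rough timing sketch. It is not the harness used for the numbers above, and it is not a rigorous benchmark: the class ConcatTiming, the cut-down three-argument methods plusEquals and withBuffer, and the timeMillis helper are all assumptions made for illustration, and results will vary with the JVM, flags, and hardware.

```java
class ConcatTiming {
    static int sink; // accumulates results so the work is not trivially dead code

    // Cut-down += variant (three arguments instead of six).
    static String plusEquals(String s1, String s2, String s3) {
        String result = "";
        result += s1;
        result += s2;
        result += s3;
        return result;
    }

    // Cut-down StringBuffer variant.
    static String withBuffer(String s1, String s2, String s3) {
        StringBuffer result = new StringBuffer();
        result.append(s1).append(s2).append(s3);
        return result.toString();
    }

    // Runs the task 'iterations' times as warm-up, then times a second run.
    static long timeMillis(Runnable task, int iterations) {
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) {
        Runnable first = new Runnable() {
            public void run() { sink += plusEquals("alpha", "beta", "gamma").length(); }
        };
        Runnable second = new Runnable() {
            public void run() { sink += withBuffer("alpha", "beta", "gamma").length(); }
        };
        System.out.println("+= version:           " + timeMillis(first, 1000000) + " ms");
        System.out.println("StringBuffer version: " + timeMillis(second, 1000000) + " ms");
    }
}
```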

Here's a third approach that I used to make my code run faster, back in the days of JDK 1.3. Instead of creating an empty StringBuffer, I sized it to the number of required characters, like so:

public static String concat3(String s1, String s2, String s3,
                             String s4, String s5, String s6) {
    // Pre-size the buffer so it never has to grow while appending.
    return new StringBuffer(
            s1.length() + s2.length() + s3.length() + s4.length() +
            s5.length() + s6.length())
        .append(s1).append(s2).append(s3)
        .append(s4).append(s5).append(s6)
        .toString();
}

I managed to call that a million times in 604 milliseconds, even faster than concat2(). But is this the best way to add the strings? And what is the simplest way?
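Why pre-sizing can help: a StringBuffer created with the no-argument constructor starts with a capacity of 16 characters and has to reallocate its internal array whenever the content outgrows it, whereas a buffer sized to the exact total never needs to grow. A small sketch using the standard capacity() method shows this (the class PresizedBufferDemo and its methods are hypothetical names):

```java
class PresizedBufferDemo {
    // Capacity of a StringBuffer created with the no-argument constructor.
    static int defaultCapacity() {
        return new StringBuffer().capacity();
    }

    // Sizes the buffer to the exact total length, appends everything,
    // and reports the capacity afterwards.
    static int capacityAfterPresizedAppend(String... parts) {
        int total = 0;
        for (String part : parts) {
            total += part.length();
        }
        StringBuffer buffer = new StringBuffer(total);
        for (String part : parts) {
            buffer.append(part);
        }
        return buffer.capacity(); // unchanged, because no growth was needed
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity());                          // prints 16
        System.out.println(capacityAfterPresizedAppend("ab", "cde", "f")); // prints 6
    }
}
```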

The approach in concat4() illustrates another way:

public static String concat4(String s1, String s2, String s3,
                             String s4, String s5, String s6) {
    return s1 + s2 + s3 + s4 + s5 + s6;
}

You can hardly make it simpler than that. Interestingly, in Java SE 6, I can call the code a million times in 578 milliseconds, which is even better than the far more complicated concat3(). The method is cleaner, easier to understand, and quicker than our previous best effort.

Sun introduced the StringBuilder class in J2SE 5.0, which is almost the same as StringBuffer, except it's not thread-safe. Thread safety is usually not necessary with StringBuffer, since it is seldom shared between threads. When Strings are added using the + operator, the compiler in J2SE 5.0 and Java SE 6 will automatically use StringBuilder. If StringBuffer is hard-coded, this optimization will not occur.
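As a rough illustration of that compiler behavior, the two methods below should produce the same result. The second is approximately what a J2SE 5.0 or Java SE 6 compiler generates for the first; the exact bytecode is up to the compiler, and other compiler versions may use a different strategy. The class and method names are made up for this sketch.

```java
class PlusDesugared {
    // What you write.
    static String withPlus(String a, String b, String c) {
        return a + b + c;
    }

    // Roughly what the compiler turns it into (an approximation,
    // not a guaranteed translation).
    static String roughlyWhatCompilerEmits(String a, String b, String c) {
        return new StringBuilder().append(a).append(b).append(c).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("x", "y", "z"));                 // prints "xyz"
        System.out.println(roughlyWhatCompilerEmits("x", "y", "z")); // prints "xyz"
    }
}
```

Both forms also treat a null argument the same way, appending the text "null".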

When a time-critical method causes a significant bottleneck in your application, it's possible to speed up string concatenation by doing this:

public static String concat5(String s1, String s2, String s3,
                             String s4, String s5, String s6) {
    // Same pre-sizing trick as concat3(), but with the unsynchronized StringBuilder.
    return new StringBuilder(
            s1.length() + s2.length() + s3.length() + s4.length() +
            s5.length() + s6.length())
        .append(s1).append(s2).append(s3)
        .append(s4).append(s5).append(s6)
        .toString();
}

However, doing this prevents future versions of the Java platform from automatically speeding up the system, and again, it makes the code more difficult to read.

Source: Mr. Heinz Kabutz (Java Champion), creator of the free Java Specialists' Newsletter