Wednesday, March 5, 2014

Specifying the size for StringBuilder (jmh)

Several years ago I started doing some microbenchmarks. I don't think that I have really progressed since then, but at least some errors have become obvious. So I decided to take one of my old benchmarks and reimplement it using the JMH library. Here are the results that I got:

Benchmark                                    Mode   Samples         Mean   Mean error    Units
c.s.m.j.StringBuilderSize.expandingSize      avgt        15     7566.477      374.611    ns/op
c.s.m.j.StringBuilderSize.predefinedSize     avgt        15     5640.386      133.672    ns/op


And here is the code for it. (I have also posted some other jmh-based benchmarks in that repo).
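The two benchmark names suggest the comparison being made. As a minimal sketch (not the benchmark code itself, and without the JMH harness), the two variants might look like this; the class name, the method bodies and the workload of appending a fixed piece a fixed number of times are assumptions made for illustration only.

```java
final class StringBuilderSizeSketch {

    // Assumed workload: append `count` copies of `piece`.
    // The builder starts with its default capacity and grows on demand.
    public static String expandingSize(String piece, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    // Same workload, but the final length is passed to the constructor,
    // so the internal buffer never has to be reallocated.
    public static String predefinedSize(String piece, int count) {
        StringBuilder sb = new StringBuilder(piece.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String a = expandingSize("abc", 1000);
        String b = predefinedSize("abc", 1000);
        // Both variants must produce the same string; only the allocation pattern differs.
        System.out.println(a.equals(b));
    }
}
```

Both methods return the same result, so any timing difference comes from the buffer reallocations and copies in the first variant.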

Sunday, February 23, 2014

Freemarker loading taglibs from classpath

It is a pretty common problem to try using various JSP taglibs from Freemarker templates[1][2][3][4]. There is some support for this in Freemarker, but it is a bit ugly: it requires dancing with jar files, like placing them in the WEB-INF/lib folder, and there is no easy way to use taglibs straight from the classpath. At least, there was no such easy way. After a day of debugging and investigation I figured out a solution that works pretty well for me. The key is to override just two methods of the ServletContext that is used by Freemarker's TagLibFactory. To do this I used a standard dynamic proxy, but other solutions are possible too. After this, taglibs can be referenced by their paths in the classpath, like:

<#assign security=JspTaglibs["/META-INF/security.tld"] />

for the spring-security taglib instead of the usual

<#assign security=JspTaglibs["http://www.springframework.org/security/tags"] />

And the InvocationHandler that can be used to create the proxy for the ServletContext is here:

package com.sopovs.moradanen;

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;

import javax.servlet.ServletContext;


// InvocationHandler that falls back to the classpath when the wrapped
// ServletContext cannot find a resource.
public class ServletContextResourceHandler implements InvocationHandler {
    private final ServletContext target;

    // Public so that callers can pass the handler to Proxy.newProxyInstance.
    public ServletContextResourceHandler(ServletContext target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ("getResourceAsStream".equals(method.getName())) {
            // Ask the real ServletContext first; on a miss, try the classpath.
            Object result = method.invoke(target, args);
            if (result == null) {
                result = ServletContextResourceHandler.class.getResourceAsStream((String) args[0]);
            }
            return result;
        } else if ("getResource".equals(method.getName())) {
            // Same fallback for the URL-returning variant.
            Object result = method.invoke(target, args);
            if (result == null) {
                result = ServletContextResourceHandler.class.getResource((String) args[0]);
            }
            return result;
        }

        // Every other method is delegated unchanged.
        return method.invoke(target, args);
    }
}

And the complete solution can be found in my pet-project here.
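To show how a handler like this gets wired in, here is a minimal, self-contained sketch of the same dynamic-proxy fallback. So that it runs without a servlet container, it uses a hypothetical one-method interface, ResourceSource, as a stand-in for ServletContext; the interface, the class name and the helper method are assumptions for illustration and not part of the pet-project. With the real servlet API, the interface array passed to Proxy.newProxyInstance would contain ServletContext.class instead.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.net.URL;

final class ClasspathFallbackSketch {

    /** Hypothetical stand-in for the ServletContext methods that matter here. */
    interface ResourceSource {
        URL getResource(String path);
    }

    /** Wraps target in a proxy that tries the classpath when target returns null. */
    static ResourceSource withClasspathFallback(final ResourceSource target) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                // Always ask the wrapped object first.
                Object result = method.invoke(target, args);
                // Only getResource gets the classpath fallback; other calls pass through.
                if (result == null && "getResource".equals(method.getName())) {
                    result = ClasspathFallbackSketch.class.getResource((String) args[0]);
                }
                return result;
            }
        };
        return (ResourceSource) Proxy.newProxyInstance(
                ResourceSource.class.getClassLoader(),
                new Class<?>[] {ResourceSource.class},
                handler);
    }

    public static void main(String[] args) {
        // A source that never finds anything, so every lookup hits the fallback.
        ResourceSource empty = path -> null;
        ResourceSource proxied = withClasspathFallback(empty);
        System.out.println(proxied.getResource("/java/lang/Object.class"));
    }
}
```

The proxy keeps the wrapped object's answer whenever there is one, so existing lookups such as WEB-INF/lib resources behave as before and only the misses are redirected to the classpath.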

Sunday, February 16, 2014

System.arraycopy versus simple copy in a loop

Programming in Java daily, I rarely need to use arrays. But sometimes it happens, and even more often I end up reading library code that uses arrays heavily. So working with them properly is a real question for me. Recently I stumbled upon some code that used plain loops for copying values from one array to another. So I decided to check whether this is a viable option when we have the System.arraycopy method. I wrote a simple benchmark using the excellent JMH library.
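For reference, the two approaches being compared can be sketched as follows. This is only an illustration of the two copy styles, not the benchmark itself: the JMH harness is left out, and the class name, the int element type and the array size are assumptions.

```java
import java.util.Arrays;

final class ArrayCopySketch {

    // Hand-written element-by-element copy.
    public static int[] loopCopy(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Same copy delegated to the built-in native method.
    public static int[] systemCopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] src = new int[1_000_000]; // assumed size, chosen only for the example
        for (int i = 0; i < src.length; i++) {
            src[i] = i;
        }
        // Both variants must produce identical arrays.
        System.out.println(Arrays.equals(loopCopy(src), systemCopy(src)));
    }
}
```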

And here is what I got as a result:


Benchmark                        Mode   Samples         Mean   Mean error    Units
c.s.m.j.ArrayCopy.loopcopy       avgt        15        1.463        0.160    ms/op
c.s.m.j.ArrayCopy.systemcopy     avgt        15        1.457        0.112    ms/op
c.s.m.j.ArrayCopy.loopcopy     sample     10322        1.450        0.018    ms/op
c.s.m.j.ArrayCopy.systemcopy   sample     11071        1.351        0.013    ms/op
c.s.m.j.ArrayCopy.loopcopy         ss        15        1.653        0.925       ms
c.s.m.j.ArrayCopy.systemcopy       ss        15        1.657        0.788       ms


So there is no real difference between a hand-written loop and the native method. But I really doubt that even with these results we should ever prefer a hand-written loop to the native System.arraycopy method. Here is a short list of ad hoc parameters that may come into play:

  • Different CPU. I used the most powerful current desktop CPU for running my benchmark. Your code may run on a very different CPU with very different results, maybe even on a different architecture. Who knows, maybe in just several years the major part of Java application servers will run on ARM servers.
  • Different hardware cache usage. There is no doubt that System.arraycopy is optimized for the proper usage of CPU caches and for not interfering with other code running on the same CPU concurrently. For the loop, you cannot be absolutely sure.
  • Different JVM implementation. Even with only three major JVM implementations currently, OpenJDK (but again, I doubt that the ARM version can be called exactly the same JVM in this specific context), IBM J9 and Dalvik (not a JVM actually, but your library that uses arrays can easily end up running on it), you already cannot be sure about the JIT. And there are also other JVMs, like Azul Zing, Azul Zulu, Excelsior JET and many others.

So, even though I was not able to observe any difference between the hand-written loop and the native built-in arraycopy method, I prefer the latter.

Friday, December 27, 2013

Learning Vaadin 7 review

Some time ago I was asked to review the second edition of the "Learning Vaadin 7" (On Amazon) book. Shame on me for holding it for so long. Previously I used Vaadin 6 and was really impressed by its features and by how simple it made building real desktop-like web applications. But at the same time, every now and then I was frustrated by some of the methods, mostly by the ones returning plain Objects. Fortunately, these type-safety problems are now solved in Vaadin 7.
But what about the book? It seems like a really good introduction to Vaadin for junior developers:

  • It has a really extensive and balanced introduction that defines Vaadin's place in the Java and Web ecosystems. Actually, I think that it is worth reading on its own.
  • It proceeds with detailed instructions on the development environment setup with a dive into production setup.
  • It is really detailed with all the basic and not so basic concepts of building Vaadin applications clearly explained.
The drawbacks of the book that I can name are:
  • Examples use ant+ivy as the build system, which seems less widespread to me, but it may have its benefits since it leaves much less room for "build magic" and gives more control over and understanding of the build. Also, information about Maven is given in a separate chapter.
  • The book seems like a one-time read to me. It cannot be used as a reference, but again, it is clearly stated that this is not a goal. It will also motivate readers to search for information on their own, as they should after the book is read.
As an overall conclusion, I would not buy such a book for myself, but I will probably recommend it to newbie developers.

Monday, November 25, 2013

Java8 HashMap is not compatible with Java7... in a way...

One of the things that shine in Java (not the only one!) is its performance. And it is constantly improving. One of the performance-improving features of Java 8 is an improvement to HashMap and many related standard hashing collections. Actually, from the title of this JEP (JDK Enhancement Proposal), "Handle Frequent HashMap Collisions with Balanced Trees", it is clear that this enhancement uses Comparable and Comparator functionality, since a balanced tree needs an ordering of its keys. If the key class does not implement the Comparable interface, or if keys of classes that are not comparable with each other are stored in the map, a comparison based on the hash code is used. But if the class implements Comparable, that implementation is used.

So, based on all this, here is a very simple question. What does this code output on JVM version 7 and on JVM version 8?
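The snippet behind that question is not shown here. As a separate, hypothetical illustration of the mechanism (not the code the question refers to), the sketch below makes the new use of Comparable observable: it puts keys with a deliberately constant hash code into a HashMap and counts how often compareTo is called. The class names, the counter and the constant hash code are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class HashMapCompareSketch {

    // Counts compareTo calls; a plain static counter is enough for this single-threaded sketch.
    static int compareCalls = 0;

    static final class Key implements Comparable<Key> {
        final int id;

        Key(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // deliberately constant: every key lands in the same bucket
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int compareTo(Key other) {
            compareCalls++;
            return Integer.compare(id, other.id);
        }
    }

    // Inserts `keys` colliding keys and reports how many times compareTo was invoked.
    public static int countCompareCalls(int keys) {
        compareCalls = 0;
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < keys; i++) {
            map.put(new Key(i), i);
        }
        return compareCalls;
    }

    public static void main(String[] args) {
        // Java 7's HashMap never calls compareTo, so the count stays at zero there.
        // On Java 8 and later the count becomes positive once the crowded bucket
        // is converted into a tree.
        System.out.println(countCompareCalls(100));
    }
}
```

A compareTo that is inconsistent with equals is therefore harmless to HashMap on Java 7 but can start to matter on Java 8.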

Thursday, November 21, 2013

Benchmarking Guava's ImmutableMap

So I decided to benchmark (with the help of the excellent JMH suite) Guava's ImmutableMap. My first try revealed some unexpected results.

Source of the benchmark: Results: What I did not expect is that Guava's ImmutableMap with a single entry is much faster to create than the standard Collections.singletonMap(). Another surprise was that there is no performance benefit from using ImmutableMap with more than one entry compared to HashMap and to HashMap inside the standard Collections.unmodifiableMap().

But this was a test for the creation of the map only. So here is the second version, with some simple work done by the "contains" method: Results: The result is almost the same.

What differs in Collections.singletonMap compared to Guava's ImmutableMap is the cached keySet, entrySet and values fields. I decided to make a little test to find out whether uninitialized (and thus pointing to null) and unused fields impact performance. Results: As can be seen, the difference is significant. So the difference between Collections.singletonMap() and Guava's ImmutableMap is probably caused by these three fields, even if they are not used.

In order to compare performance when these caching fields (or at least one of them) are used, I made the third version: Results: Guava's ImmutableMap is still a bit better than Collections.singletonMap(), but the difference is much smaller than in the first version. I assume that if all the fields cached in the standard version were used, it would come out better.

Another surprise is that there is no performance benefit from using Guava's Maps.newHashMapWithExpectedSize(). 128 entries are definitely enough to produce at least one rehash of the HashMap, but probably rehashes do not damage performance in a significant way, or there were really no rehashes because the expected size of the HashMap was inferred from the subsequent code by the JVM.
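To make the rehash argument concrete, here is a stdlib-only sketch of pre-sizing a HashMap so that a known number of entries fits without resizing. It is an approximation of the idea behind newHashMapWithExpectedSize, not Guava's actual code; the class and method names are made up for the example, and 0.75 is HashMap's documented default load factor.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizedMapSketch {

    // Smallest initial capacity that keeps expectedSize entries under the
    // default 0.75 load factor, so no resize is triggered while filling the map.
    public static int capacityFor(int expectedSize) {
        return (int) (expectedSize / 0.75f) + 1;
    }

    // Builds a map of expectedSize entries without intermediate rehashes.
    public static Map<Integer, Integer> presized(int expectedSize) {
        Map<Integer, Integer> map = new HashMap<>(capacityFor(expectedSize));
        for (int i = 0; i < expectedSize; i++) {
            map.put(i, i);
        }
        return map;
    }

    public static void main(String[] args) {
        // For 128 entries the requested capacity is 171; HashMap rounds it up
        // to the next power of two internally.
        System.out.println(capacityFor(128));
        System.out.println(presized(128).size());
    }
}
```

A map created with the default constructor starts at capacity 16 and has to grow several times before it can hold 128 entries, which is the cost the pre-sized variant avoids.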

And the last surprise was that HashMap inside Collections.unmodifiableMap() may have better performance not only compared to Guava's ImmutableMap, but also compared to exactly the same HashMap without any wrapper.

Saturday, June 29, 2013

Synchronized vs threadlocal SimpleDateFormat based on JMH

Micro-benchmarks are really hard. If you have seen some in this blog and you think that at least some of them are correct, you are wrong! ;-) Actually, I don't think that I will ever really need to do micro-benchmarks. So far, all the performance optimizations that I have done were about SQL (mostly), or about not loading all the data into memory but loading and processing it in chunks instead, or introducing some caching, or introducing multi-threading, or just rewriting the algorithm from scratch (introducing multi-threading is usually a complete rewrite too).

But to have some fun (and to have at least something to write about in the blog) I sometimes do such stupid benchmarks. Of course, it is better for this fun activity to have at least some similarity to the real world. Recently (maybe not really recently) the Oracle Java Performance Team open-sourced their framework for micro-benchmarks. I decided to try it and rewrote my benchmark for patterns of multi-threaded usage of the standard SimpleDateFormat (why actually use it, since we have joda-time for Java <= 7 and the new shiny standard date-time API for Java >= 8?). So here is the result of this benchmark on the Core i7-3632QM (laptop) processor.
The code for this can be found here.
I have tried the same benchmark on the Core i7-3770K (desktop) processor also. And here is the graph:
What we can see is the expected result, except that for the synchronized access to the single SimpleDateFormat the results are not very smooth and stable. So here is the same graph with the synchronized access to the SimpleDateFormat measured in 30 iterations (5s each) instead of 10.

This measurement has not cleared anything up, actually... What I think is that JIT compilation of the version with a single synchronized SimpleDateFormat is not stable, so the result of the measurement is stable across different iterations of the same run, but is not stable across different runs and may be counterintuitive for different numbers of threads in the same run (different methods => different acts of compilation => different code).
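For readers who have not seen them, the two usage patterns named in the title can be sketched like this. It is only an illustration of synchronized versus thread-local access, not the benchmark code; the class name, the method names and the date pattern are assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

final class DateFormatSketch {

    private static final String PATTERN = "yyyy-MM-dd"; // assumed pattern, for the example only

    // Variant 1: a single shared instance guarded by a lock,
    // because SimpleDateFormat is not thread-safe.
    private static final SimpleDateFormat SHARED = new SimpleDateFormat(PATTERN);

    // Variant 2: one instance per thread, so no locking is needed.
    private static final ThreadLocal<SimpleDateFormat> PER_THREAD =
            new ThreadLocal<SimpleDateFormat>() {
                @Override
                protected SimpleDateFormat initialValue() {
                    return new SimpleDateFormat(PATTERN);
                }
            };

    public static String formatSynchronized(Date date) {
        synchronized (SHARED) {
            return SHARED.format(date);
        }
    }

    public static String formatThreadLocal(Date date) {
        return PER_THREAD.get().format(date);
    }

    public static void main(String[] args) {
        Date now = new Date();
        // Both variants must agree on the formatted value.
        System.out.println(formatSynchronized(now).equals(formatThreadLocal(now)));
    }
}
```

Under contention the synchronized variant makes all threads wait on one lock, while the thread-local variant trades that for one formatter instance per thread.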