Pages

Friday, December 31, 2010

KeySet vs EntrySet iteration

There is bug pattern in Findbugs tool that states rather obvious thing that needing both the key and value iterating over the map we should use entry set but not the key set. So in similar to the previous post way I want to measure the effect of this inefficiency.

But looking only at state of the art implementations of the equals() and hashCode() methods which are under the hood of working with HashMap (and therefore almost with every Map in practice) is just not very interesting. From the other hand if the hashCode() is well distinctive for various objects but just too long to compute the effect of its extra usage is obvious. So I'll try to bench the using in this case of the most ugly possible but still valid hashCode() method. I.e. the constant number (let it be prime though ещ not to cause the heart attack of someone accidentally looking at this code).

public class EntriesVsKeysIteratingBench {
  public static void main(String[] args) {
    Map<String, String> map = new HashMap<String, String>();
    for (int i = 0; i < 10000; i++) {
      map.put("srt" + i + "key", "str" + i + "value");
    }
    long nanoTime = System.nanoTime();
    for (String str : map.keySet()) {
      String value = map.get(str);
    }
    System.out.println((System.nanoTime() - nanoTime)
        + " with keyset (bare strings)");
    nanoTime = System.nanoTime();
    for (Entry<String, String> entry : map.entrySet()) {
      String key = entry.getKey();
      String value = entry.getValue();
    }
    System.out.println((System.nanoTime() - nanoTime)
        + " with entryset (bare strings)");

    Map<StringContainer, String> map2 = new HashMap<StringContainer, String>();
    for (int i = 0; i < 10000; i++) {
      map2.put(new StringContainer("srt" + i + "key"), "str" + i
          + "value");
    }
    nanoTime = System.nanoTime();
    for (StringContainer str : map2.keySet()) {
      String value = map2.get(str);
    }
    System.out.println((System.nanoTime() - nanoTime)
        + " with keyset (ugly string container)");
    nanoTime = System.nanoTime();
    for (Entry<StringContainer, String> entry : map2.entrySet()) {
      StringContainer key = entry.getKey();
      String value = entry.getValue();
    }
    System.out.println((System.nanoTime() - nanoTime)
        + " with entryset (ugly string container)");

  }

  private static class StringContainer {
    private String value;

    public StringContainer(String value) {
      this.value = value;
    }

    @Override
    public int hashCode() {
      // do not ever do this
      return 31;
    }

    @Override
    public boolean equals(Object obj) {
      // generated by Eclipse
      if (this == obj) {
        return true;
      }
      if (obj == null) {
        return false;
      }
      if (getClass() != obj.getClass()) {
        return false;
      }
      StringContainer other = (StringContainer) obj;
      if (value == null) {
        if (other.value != null) {
          return false;
        }
      } else if (!value.equals(other.value)) {
        return false;
      }
      return true;
    }
  }
}


* This source code was highlighted with Source Code Highlighter.

and the output of this code is:

40545520 with keyset (bare strings)
14589848 with entryset (bare strings)
653681133 with keyset (ugly string container)
1732427 with entryset (ugly string container)

So with less than three-times inefficiency on the really good hashCode() the inefficiency of most ugly possible hashCode() in this inefficient use case reaches ten thousand percent and more.

But there is also some unexpected result. Usage of entryset on the map of ugly hashCoded objects is more efficient then using it on good objects. However this effect is explained by warming up the VM. Here are the results for interchanged benchs:

632880923 with keyset (ugly string container)
6900615 with entryset (ugly string container)
4327640 with keyset (bare strings)
3176151 with entryset (bare strings)

P.S. And the last one - changing the order of iterations over key set and entry set will not change the results ;-)

Happy New Year, by the way.

No comments:

Post a Comment