Pages

Friday, March 25, 2011

Load factor for HashSet with add operation

Update: This post was written when I was young and stupid. It is a try of investigation of the impact of the load factor on the speed of creation of HashSet without taking into account memory usage or any other characteristics of the resulting set. If you really need some theory behind all this - refer to Wikipedia or any book on the algorithms. And if you need this for some practise I'd suggest to just use excellent Guava library and methods for creating Sets and Maps in particular.

Load factor, initial size... only two parameters that controls the HashSet, but it is not very intuitive how to use them. So here is the small benchmark on adding the elements to HashSet. (The same may be applied to the HashMap I think, since the data structure that is used internally in the HashSet is: private transient HashMap map;).


private static final int MAX_ITERATIONS = 100000;

  public static void main(String[] args) {
    //Warming the JVM. Just a guess - please tell me if it is silly.
    for (int i = 0; i < MAX_ITERATIONS; i++) {
      String.valueOf(i);
    }
    doWithLoadFactorOne();
    doWithExactSize();
    doWithBiggerSize();

    long loadFactorOneTime = 0, exactSizeTime = 0, biggerSizeTime = 0;
    for (int i = 0; i < 1000; i++) {
      long nanoTime;

      nanoTime = System.nanoTime();
      doWithExactSize();
      exactSizeTime += System.nanoTime() - nanoTime;

      nanoTime = System.nanoTime();
      doWithBiggerSize();
      biggerSizeTime += System.nanoTime() - nanoTime;

      nanoTime = System.nanoTime();
      doWithLoadFactorOne();
      loadFactorOneTime += System.nanoTime() - nanoTime;
    }

    System.out.println("With load factor 1 : " + loadFactorOneTime);
    System.out.println("With exact size  : " + exactSizeTime);
    System.out.println("With bigger size  : " + biggerSizeTime);

  }

  private static void doWithLoadFactorOne() {
    Set<String> set = new HashSet<String>(MAX_ITERATIONS, 1f);
    doWithSet(set);
  }

  private static void doWithExactSize() {
    Set<String> set = new HashSet<String>(MAX_ITERATIONS);
    doWithSet(set);
  }

  private static void doWithBiggerSize() {
    Set<String> set = new HashSet<String>(Math.round(MAX_ITERATIONS * 1.4f));
    doWithSet(set);
  }

  private static void doWithSet(Set<String> set) {
    for (int i = 0; i < MAX_ITERATIONS; i++) {
      set.add("String number ".concat(String.valueOf(i)));
    }
  }


* This source code was highlighted with Source Code Highlighter.

Output:

With load factor 1 : 17751349088
With exact size    : 24279598679
With bigger size   : 16574028885

So I don't actually see how and when the load factor of the HashSet should be every used...

No comments:

Post a Comment