Pages

Thursday, July 21, 2011

Multithreaded factorial with ThreadPoolExecutor

In the previous post I have written a multithreaded factorial. It was not really good, because last one of the worker threads worked significantly longer than the first one. Fortunately we are leaving in 2011 and Java 1.5 was released so long time ago. The solution is simple - do not create 4 threads of the 4 virtual cores processors. Create really many threads and let the standard concurrent library execute them all. So ok, here is the code for this:

    private static BigInteger factMtExecutor(int input, int numThreads)
            throws InterruptedException, ExecutionException {
        FactCallable[] workers = new FactCallable[100];
        for (int i = 1; i <= 100; i++) {
            int start = i == 1 ? 1 : (input / 100 * (i - 1)) + 1;
            int end = i == 100 ? input : input / 100 * i;
            workers[i - 1] = new FactCallable(start, end);
        }

        ThreadPoolExecutor executor = new ThreadPoolExecutor(numThreads,
                numThreads, 0, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());

        List<Future<BigInteger>> futures = executor.invokeAll(Arrays
                .asList(workers));
        BigInteger result = BigInteger.valueOf(1L);
        for (Future<BigInteger> future : futures) {
            result = result.multiply(future.get());
        }

        return result;
    }

    private static class FactCallable implements Callable<BigInteger> {
        private final int from;
        private final int to;

        public FactCallable(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public BigInteger call() throws Exception {
            BigInteger result = BigInteger.valueOf(from);
            for (int i = from + 1; i <= to; i++) {
                result = result.multiply(BigInteger.valueOf(i));
            }
            return result;
        }
    }
And the result comparing to the previous version:
35 seconds for simple start/join
25 seconds for ThreadPoolExecutor

And the second time, swapping the order of execution:
29 seconds for ThreadPoolExecutor
38 seconds for simple start/join
Full code for comparison may be found on github.

Sunday, July 3, 2011

Multithreaded factorial

Update: the follow up post gives another version, that lacks the drawback of unequal load on processor cores.

There is a very good blog post about how to write the factorial method in Java (really it is not about righting factorial). However, even in this master peace;-) there is no multi-threaded factorial. So here it is.
There are some problems with it. The most significant is that multiplying all the BigIntegers from 1 to 50000 is much less heavy job for the computer, than multiplying all BigIntegers from 50001 to 100000. So the the amount of work that is done by each worker-thread is not equal. But investigation on how this may be resolved in a common way is much harder problem, than just writing what I am publishing here. Maybe I'll do it in the other post.

package com.sopovs.moradanen;

import java.math.BigDecimal;
import java.util.concurrent.TimeUnit;

import junit.framework.Assert;

import org.junit.Test;

public class FactorialTest {

    private static final int INPUT = 203457;

    @Test
    public void test() throws InterruptedException {
        BigDecimal fact1, fact2;
        {
            long start = System.nanoTime();
            fact1 = fact(INPUT);
            long end = System.nanoTime();
            System.out.println(TimeUnit.SECONDS.convert(end - start,
                    TimeUnit.NANOSECONDS) + " seconds for 1 thread");
        }
        {
            long start = System.nanoTime();
            fact2 = factMt(INPUT, 4);
            long end = System.nanoTime();
            System.out.println(TimeUnit.SECONDS.convert(end - start,
                    TimeUnit.NANOSECONDS) + " seconds for 4 threads");
        }
        Assert.assertEquals(fact1, fact2);

    }

    private static BigDecimal fact(int input) {
        BigDecimal result = BigDecimal.valueOf(1);
        for (int i = 1; i <= input; i++) {
            result = result.multiply(BigDecimal.valueOf(i));
        }
        return result;
    }

    private static BigDecimal factMt(int input, int numThreads)
            throws InterruptedException {
        BigDecimal result = BigDecimal.valueOf(1);
        Thread[] threads = new Thread[numThreads];
        FactComputer[] workers = new FactComputer[numThreads];
        for (int i = 1; i <= numThreads; i++) {
            int start = i == 1 ? 1 : (input / numThreads * (i - 1)) + 1;
            int end = i == numThreads ? input : input / numThreads * i;
            workers[i - 1] = new FactComputer(start, end);
            threads[i - 1] = new Thread(workers[i - 1]);
        }

        for (int i = 0; i < numThreads; i++) {
            threads[i].start();
        }

        for (int i = 0; i < numThreads; i++) {
            threads[i].join();
        }
        for (int i = 0; i < numThreads; i++) {
            result = result.multiply(workers[i].result);
        }
        return result;
    }

    private static class FactComputer implements Runnable {
        BigDecimal result;
        private final int from;
        private final int to;

        public FactComputer(int from, int to) {
            this.from = from;
            this.to = to;
        }

        public void run() {
            result = BigDecimal.valueOf(from);
            for (int i = from + 1; i <= to; i++) {
                result = result.multiply(BigDecimal.valueOf(i));
            }

        }
    }
}
Here is the result in System.out:
56 seconds for 1 thread
26 seconds for 4 threads
I have written it as the unit test to be sure that at least this simplistic distribution of work between threads has no bugs (I tested it with different values of INPUT). I think it is obvious from this code that I run it on the 4-core processor. Actually on the 2-core but with hyper-threading enabled.