Tech Notes of Yi Wang: Compare GNU GCJ with Sun's JVM

This Wikipedia page links to Alex Ramos's experiment, which compares the performance of a native binary that GNU's GCJ generates from a Java program with that of the bytecode that Sun's JDK generates from the same program and runs on the JIT-enabled JVM. Alex ran his comparison on an AMD CPU, so I ran a few more on other machines. Here are the results.

System	Java version	Sum Mflops	Sqrt Mflops	Exp Mflops
2x AMD 64 5000+, Ubuntu	JIT 1.6.0_14	99	43	10
	GCJ 4.3.2	64	65	13

2x Intel Core2 2.4GHz, Ubuntu	JIT 1.6.0_0	87.4	36.9	16.6
	GCJ 4.2.4	150.6	39.3	30

Intel T2600 2.16GHz, Cygwin	JIT 1.6.0_17	45.4	34.8	10.4
	GCJ 3.4.4	84.1	23.7	12.1

The first comparison was done by Alex; I just copy-n-pasted his results. The second was done on my workstation, and the third on my IBM T60p notebook. I also tried to run the comparison on my MacBook Pro, but MacPorts could not build and install GCJ correctly.

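Generally, GCJ beats the JIT on numerical computing: it is faster in seven of the nine measurements above, the exceptions being Sum on the AMD machine and Sqrt on the T2600 notebook. However, I have to mention that the binary generated by GCJ takes a lot more time to start. (I do not know why...)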
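To make the comparison easier to read, the table can be reduced to GCJ/JIT ratios. The following is only a small sketch: the class name SpeedupTable and the helper ratio are my own names for illustration, and the numbers are copied from the table above, not measured again.

```java
public class SpeedupTable {

  // GCJ throughput divided by JIT throughput; a value above 1 means GCJ was faster.
  public static double ratio(double gcjMflops, double jitMflops) {
    return gcjMflops / jitMflops;
  }

  public static void main(String[] args) {
    String[] systems = {"AMD 64 5000+", "Core2 2.4GHz", "T2600 2.16GHz"};
    String[] tests = {"Sum", "Sqrt", "Exp"};
    // One row per system, one column per test (Mflops), copied from the table.
    double[][] jit = {{99, 43, 10}, {87.4, 36.9, 16.6}, {45.4, 34.8, 10.4}};
    double[][] gcj = {{64, 65, 13}, {150.6, 39.3, 30}, {84.1, 23.7, 12.1}};

    for (int s = 0; s < systems.length; ++s) {
      for (int t = 0; t < tests.length; ++t) {
        System.out.printf("%-14s %-4s GCJ/JIT = %.2f%n",
                          systems[s], tests[t], ratio(gcj[s][t], jit[s][t]));
      }
    }
  }
}
```

A ratio below 1 marks a case where the JIT was faster; in this table that happens only for Sum on the AMD machine and Sqrt on the T2600.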

Attached here is the Java source code (VectorMultiplication.java), which is almost identical to Alex's but uses much shorter vectors (1M vs. 20M elements), so that more computers can run it.


import java.util.Random;

public class VectorMultiplication {

  // Element-wise product c[i] = a[i] * b[i]; returns the sum of all products.
  public static double vector_mul(double a[], double b[], int n, double c[]) {
    double s = 0;
    for (int i = 0; i < n; ++i)
      s += c[i] = a[i] * b[i];
    return s;
  }

  // Element-wise square root: b[i] = sqrt(a[i]).
  public static void vector_sqrt(double a[], double b[], int n) {
    for (int i = 0; i < n; ++i)
      b[i] = Math.sqrt(a[i]);
  }

  // Element-wise exponential: b[i] = exp(a[i]).
  public static void vector_exp(double a[], double b[], int n) {
    for (int i = 0; i < n; ++i)
      b[i] = Math.exp(a[i]);
  }

  public static void main(String[] args) {
    final int MEGA = 1000 * 1000;
    Random r = new Random(0);
    double a[], b[], c[];
    int n = 1 * MEGA;
    a = new double[n];
    b = new double[n];
    c = new double[n];

    for (int i = 0; i < n; ++i) {
      a[i] = r.nextDouble();
      b[i] = r.nextDouble();
      c[i] = r.nextDouble();
    }

    long start = System.currentTimeMillis();
    vector_mul(a, b, n, c);
    System.out.println("MULT MFLOPS: " +
                       n/((System.currentTimeMillis() - start)/1000.0)/MEGA);

    start = System.currentTimeMillis();
    vector_sqrt(c, a, n);
    System.out.println("SQRT MFLOPS: " +
                       n/((System.currentTimeMillis() - start)/1000.0)/MEGA);

    start = System.currentTimeMillis();
    vector_exp(c, a, n);
    System.out.println("EXP MFLOPS: " +
                       n/((System.currentTimeMillis() - start)/1000.0)/MEGA);
  }
}
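One caveat about the timing in this listing: at roughly 100 Mflops, a pass over 1M elements takes only about 10 ms, so the 1 ms granularity of System.currentTimeMillis() is a noticeable fraction of each measurement. Below is a minimal sketch of a finer-grained variant, assuming it is acceptable to repeat the kernel and report the fastest run. The class name MflopsTimer, the helper mflops and the repeat count runs are my own choices, not part of Alex's experiment, and its numbers are not directly comparable with the table above.

```java
import java.util.Random;

public class MflopsTimer {

  // Same kernel as vector_mul in the listing above.
  public static double vectorMul(double[] a, double[] b, int n, double[] c) {
    double s = 0;
    for (int i = 0; i < n; ++i)
      s += c[i] = a[i] * b[i];
    return s;
  }

  // Converts "n operations in elapsedNanos nanoseconds" to millions of operations per second.
  public static double mflops(long n, long elapsedNanos) {
    return n / (elapsedNanos / 1e9) / 1e6;
  }

  public static void main(String[] args) {
    int n = 1000 * 1000;
    Random r = new Random(0);
    double[] a = new double[n], b = new double[n], c = new double[n];
    for (int i = 0; i < n; ++i) {
      a[i] = r.nextDouble();
      b[i] = r.nextDouble();
    }

    int runs = 20;              // hypothetical repeat count, chosen arbitrarily
    long best = Long.MAX_VALUE; // shortest elapsed time seen so far, in nanoseconds
    double sink = 0;            // accumulates results so the calls cannot be discarded
    for (int k = 0; k < runs; ++k) {
      long start = System.nanoTime();
      sink += vectorMul(a, b, n, c);
      best = Math.min(best, System.nanoTime() - start);
    }

    System.out.println("MULT MFLOPS (best of " + runs + "): " + mflops(n, best));
    System.out.println("checksum: " + sink);
  }
}
```

Taking the best of several runs also gives the JIT a chance to compile the loop before the measurement that counts, which a single timed call does not.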

On my Core2 workstation, the way I invoked GCJ is identical to that used in Alex's experiment:

gcj -O3 -fno-bounds-check -mfpmath=sse -ffast-math -march=native \
 --main=VectorMultiplication -o vec-mult VectorMultiplication.java

On my notebook, I used the same command without -mfpmath=sse and -march=native:

gcj -O3 -fno-bounds-check -ffast-math \
 --main=VectorMultiplication -o vec-mult VectorMultiplication.java


Jan 2, 2010
