Sunday, April 17, 2011

Why Jython behaves inconsistently when tested with PyStone?

I've been playing recently with Jython and decided to do some quick and dirty benchmarking with pystone. In order to have a reference, I first tested cPython 2.6, with an increasing numbers of loops (I thought this may be relevant as Jython should start to profit from the JIT only after some time).

(richard garibaldi):/usr/local/src/pybench% python ~/tmp/pystone.py 
Pystone(1.1) time for 50000 passes = 1.04
This machine benchmarks at 48076.9 pystones/second

(richard garibaldi):/usr/local/src/pybench% python ~/tmp/pystone.py 500000 
Pystone(1.1) time for 500000 passes = 10.33
This machine benchmarks at 48402.7 pystones/second

(richard garibaldi):/usr/local/src/pybench% python ~/tmp/pystone.py 1000000 
Pystone(1.1) time for 1000000 passes = 19.6
This machine benchmarks at 51020.4 pystones/second

As you can see, cPython behaves consistently: the time it takes to complete the test increases linearly to the number of loops. Knowing this, I started testing Jython.

(richard garibaldi):/usr/local/src/pybench% jython ~/tmp/pystone.py 
Pystone(1.1) time for 50000 passes = 2.29807
This machine benchmarks at 21757.4 pystones/second

(richard garibaldi):/usr/local/src/pybench% jython ~/tmp/pystone.py 500000 
Pystone(1.1) time for 500000 passes = 10.931
This machine benchmarks at 45741.4 pystones/second


(richard garibaldi):/usr/local/src/pybench% jython ~/tmp/pystone.py 1000000
Pystone(1.1) time for 1000000 passes = 107.183
This machine benchmarks at 9329.86 pystones/second

During the first run Jython runs rather lousily in comparison to its C brother. When increased the number of loops it started feeling better, coming close to cPython, like my initial hypothesis predicted. Note that the number of loops increased 10 times, but it took Jython only about 5 times longer to complete them. So, as you imagine, I was expecting that Jython would really rock in the final test. To my great disappointment, however, it did really bad: more than twice slower than in the initial run.

What are your hypotheses: why does Jython behave such an inconsistent manner? Could it be that GC is kicking in at some moment, and taking a lot of time? I've looked at PyStone's code and garbage collection doesn't seem to be turned off, but I would expect Java's GC to be at least as good as Python's... Do you think this slowing down is permanent, or it will go away at some point after increasing the number of loops? How shall Jython behave in a really long running processes?

EDIT: unfortunately, I get java.lang.OutOfMemoryError if I increase the number of loops to 2 million...

(Of course, Jython is still beta, so it should get better in the final release.)

I am using Jython 2.5b1 (trunk:5903:5905, Jan 9 2009, 16:01:29), Java(TM) SE Runtime Environment (build 1.6.0_07-b06-153) and Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_07-b06-57, mixed mode) on MacOS X 10.5.

Thanks for your answers.

From stackoverflow
  • I'm pretty sure that the results can be improved by tweaking the JVM configuration (JRuby is using quite a few interesting flags for doing it) and I'm also pretty sure that the garbage collection can be tuned. If you are very interested in this benchmark here is a good resource for configuring your VM: Tuning Garbage Collection. I'd also take a look at JRuby configuration.

    ./alex

  • This might be a bug in jython 2.5b1. You should consider reporting it back to the jython team. I have just run the pystone benchmark on my MacBook with the current stable release of jython (2.2.1) and I get slow but consistent results:

    mo$ ~/Coding/Jython/jython2.2.1/jython pystone.py 50000
    Pystone(1.1) time for 50000 passes = 2.365
    This machine benchmarks at 21141.6 pystones/second
    
    mo$ ~/Coding/Jython/jython2.2.1/jython pystone.py 500000
    Pystone(1.1) time for 500000 passes = 22.246
    This machine benchmarks at 22476 pystones/second
    
    mo$ ~/Coding/Jython/jython2.2.1/jython pystone.py 1000000
    Pystone(1.1) time for 1000000 passes = 43.94
    This machine benchmarks at 22758.3 pystones/second
    
    mo$ java -version
    java version "1.5.0_16"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_16-b06-275)
    Java HotSpot(TM) Client VM (build 1.5.0_16-132, mixed mode, sharing)
    

    The cPython results for me are more or less the same. I reran eacht test three times and got very similar results all the time.

    I also tried giving java a bigger initial and maximum Heap (-Xms256m -Xmx512m) without a noteworthy result

    However, setting the JVM to -server (slower startup, better long running performance, not so good for "interactive" work) turned the picture a bit:

    mo$ ~/Coding/Jython/jython2.2.1/jython pystone.py 50000
    Pystone(1.1) time for 50000 passes = 1.848
    This machine benchmarks at 27056.3 pystones/second
    
    mo$ ~/Coding/Jython/jython2.2.1/jython pystone.py 500000
    Pystone(1.1) time for 500000 passes = 9.998
    This machine benchmarks at 50010 pystones/second
    
    mo$ ~/Coding/Jython/jython2.2.1/jython pystone.py 1000000
    Pystone(1.1) time for 1000000 passes = 19.9
    This machine benchmarks at 50251.3 pystones/second
    

    I made one final run with (-server -Xms256m -Xmx512m):

    mo$ ~/Coding/Jython/jython2.2.1/jython pystone.py 5000000
    Pystone(1.1) time for 5000000 passes = 108.664
    This machine benchmarks at 46013.4 pystones/second
    

    My guess would be, that the slow first run is due to VM startup/JIT not yet having really kicked in. The results of the longer runs are more or less consitent and show the effects of hotspot/JIT

    Maybe you could rerun your last test with a bigger heap? To change the JVM switches, just edit the jython file in your Jython installation.

  • Hi, my bench on a XP_Win32_PC :

    C:\jython\jython2.5b1>bench "50000"
    
    C:\jython\jython2.5b1>jython Lib\test\pystone.py "50000"
    Pystone(1.1) time for 50000 passes = 1.73489
    This machine benchmarks at 28820.2 pystones/second
    
    C:\jython\jython2.5b1>bench "100000"
    
    C:\jython\jython2.5b1>jython Lib\test\pystone.py "100000"
    Pystone(1.1) time for 100000 passes = 3.36223
    This machine benchmarks at 29742.2 pystones/second
    
    C:\jython\jython2.5b1>bench "500000"
    
    C:\jython\jython2.5b1>jython Lib\test\pystone.py "500000"
    Pystone(1.1) time for 500000 passes = 15.8116
    This machine benchmarks at 31622.3 pystones/second
    
    C:\jython\jython2.5b1>bench "1000000"
    
    C:\jython\jython2.5b1>jython Lib\test\pystone.py "1000000"
    Pystone(1.1) time for 1000000 passes = 30.9763
    This machine benchmarks at 32282.8 pystones/second
    
    C:\jython\jython2.5b1>jython
    Jython 2.5b1 (trunk:5903:5905, Jan 9 2009, 16:01:29)
    [Java HotSpot(TM) Client VM (Sun Microsystems Inc.)] on java1.5.0_17
    

    It is not so fast, but ...

    no "special effects"

    Is it a java-vm 'problem' ?

    Add a comment if you want further infos to my benchmarking on this old Win32-PC

  • The same results from my laptop running Ubuntu Jaunty, with JRE 1.6.0_12-b04:

    nathell@breeze:/usr/lib/python2.5/test$ python pystone.py 500000
    Pystone(1.1) time for 500000 passes = 12.98
    This machine benchmarks at 38520.8 pystones/second
    
    nathell@breeze:/usr/lib/python2.5/test$ python pystone.py 1000000
    Pystone(1.1) time for 1000000 passes = 26.05
    This machine benchmarks at 38387.7 pystones/second
    
    nathell@breeze:/usr/lib/python2.5/test$ ~/jython/jython pystone.py
    Pystone(1.1) time for 50000 passes = 2.47788
    This machine benchmarks at 20178.6 pystones/second
    
    nathell@breeze:/usr/lib/python2.5/test$ ~/jython/jython pystone.py 500000
    Pystone(1.1) time for 500000 passes = 19.7294
    This machine benchmarks at 25342.9 pystones/second
    
    nathell@breeze:/usr/lib/python2.5/test$ ~/jython/jython pystone.py 1000000
    Pystone(1.1) time for 1000000 passes = 38.9272
    This machine benchmarks at 25689 pystones/second
    

    So perhaps this is related to the JRE rather than Jython version, after all. The problems the Armed Bear Common Lisp project has had with early versions of JRE 1.6 (see http://tiny.pl/b9r5) might also hint at this.

  • Benchmarking a runtime environment as complex as the JVM is hard. Even excluding the JIT and GC, you've got a big heap, memory layout and cache variation between runs.

    One thing that helps with Jython is simply running the benchmark more than once in a single VM session: once to warm up the JIT and one or more times you measure individually. I've done a lot of Jython benchmarking, and unfortunately it often takes 10-50 attempts to achieve a reasonable time

    You can use some JVM flags to observe GC and JIT behavior to get some idea how long the warmup period should be, though obviously you shouldn't benchmark with the debugging flags turned on. For example:

    % ./jython -J-XX:+PrintCompilation -J-verbose:gc
      1       java.lang.String::hashCode (60 bytes)
      2       java.lang.String::charAt (33 bytes)
      3       java.lang.String::lastIndexOf (156 bytes)
      4       java.lang.String::indexOf (151 bytes)
    [GC 1984K->286K(7616K), 0.0031513 secs]
    

    If you do all this, and use the HotSpot Server VM, you'll find Jython slightly faster than CPython on pystone, but this is in no way representative of Jython performance in general. The Jython developers are paying much more attention to correctness than performance for the 2.5 release; over the next year or so with a 2.6/2.7/3.0 release performance will be more emphasized. You can see a few of the pain points by looking at some microbenchmarks (originally derived from PyPy) I run.

0 comments:

Post a Comment