Notes from the Azul Systems talk, presented by Cliff Click.

Java Performance Myths.

[In response to the comment about the original notes being a little too scant, I've decided to edit this entry. Sun will provide audio & slides, which I will link to once available.]

. I heard (or googled) that making fields or methods final (or private) will help performance (or it will allow more inlining).
…Wrong. With or without ‘final’, every inlinable method is inlined by the runtime compilers.

. I heard (or googled) that try/catch blocks are free (or very expensive).
…The reality is that it depends. In general, try to avoid try/catch blocks in tight loops. Don’t use exceptions to end loops, or for null checking on, say, list traversal. You’d be defeating JIT optimizations and duplicating the automatic range check.
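
To make the "don’t use exceptions to end loops" advice concrete, here is a minimal sketch that contrasts the two styles. The class and method names are made up for illustration and are not from the talk. Both methods return the same sum; the second simply states its bound so the loop has an ordinary exit condition.

```java
// Hypothetical names, for illustration only.
class LoopTermination {
    // Anti-pattern: relies on ArrayIndexOutOfBoundsException to leave the loop.
    static long sumWithException(int[] data) {
        long sum = 0;
        try {
            for (int i = 0; ; i++) {
                sum += data[i]; // throws once i == data.length
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // exception used as control flow; nothing to do here
        }
        return sum;
    }

    // Preferred: an explicit bound checked on every iteration.
    static long sumWithBound(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumWithException(data)); // prints 10
        System.out.println(sumWithBound(data));     // prints 10
    }
}
```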

. I heard (or googled) that using RTTI is better than instanceof (and/or better than a v-call (virtual call)).
…RTTI (Runtime Type Information; google for samples) is an ugly hack from C++. Don’t do it unless you need to squeeze out the last 10% of performance. RTTI wins, but it’s too ugly (in the OO sense). Use a v-call if you can.
…v-call is more expensive in Hotspot than in other VMs; Hotspot implements better subtype checking (efficient switch).
…The bottom line is that runtime compilers are optimized for the common coding patterns out there, so stick to clean design and coding, and use OO principles.

v-call:      v_call(); // dynamic dispatch here
instanceof:  if (this instanceof Child1)
                 ((Child1) this).non_v_call();
RTTI:        switch (_rtti) { case 1: // hand-inline Child1 specific ...
. Should I avoid synchronization at all costs?
…The average system today spends 55-110 ns doing uncontended lock/unlock operations, so it’s not free but not terribly expensive either.
…Hotspot’s synchronization operations on Xeons are apparently slow (~275 cycles); IBM/BEA/Azul are much better at it.
…For light-contention situations, BEA outperforms the other VMs; Sun performs poorly.
…In general, synchronization is better than bugs (especially now that we’re in the multi-core era)! So be aware of the costs (and profile if you can), and try to use the new concurrency API in 1.5, but don’t avoid synchronization: threading bugs such as race conditions are notoriously hard to fix. Think more about your algorithm.
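
As a rough illustration of "synchronize rather than ship a race", the sketch below keeps one counter under a plain `synchronized` lock and one in a `java.util.concurrent.atomic.AtomicInteger` (part of the 1.5 concurrency API). The `Counters` class and `runDemo` helper are hypothetical names of mine, and the lambda syntax assumes Java 8 or later. It shows correctness only and is not a benchmark.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not from the talk.
class Counters {
    private int locked = 0;                                   // guarded by "this"
    private final AtomicInteger atomic = new AtomicInteger(); // lock-free counter

    synchronized void incrementLocked() { locked++; }
    synchronized int getLocked() { return locked; }

    void incrementAtomic() { atomic.incrementAndGet(); }
    int getAtomic() { return atomic.get(); }

    // Starts `threads` workers that each bump both counters `perThread` times.
    static Counters runDemo(int threads, int perThread) {
        Counters c = new Counters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.incrementLocked();
                    c.incrementAtomic();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c;
    }

    public static void main(String[] args) {
        Counters c = runDemo(4, 10_000);
        System.out.println(c.getLocked()); // prints 40000
        System.out.println(c.getAtomic()); // prints 40000
    }
}
```

Dropping the `synchronized` keyword from `incrementLocked` would make the first total unreliable, which is exactly the kind of bug that is hard to track down later.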

. I heard (or googled) that I should use object pools, say, to help the garbage collector. Should I reuse objects, or create new ones and assume that the GC is smart enough to efficiently take care of the cleanup?
…It depends on the cost of initializing the object and on the turnover rate. Don’t do it for small objects, but it may be a win for large ones or those with a heavy initialization cost (JPanel?). As always, profile: use JConsole or VisualGC.
…Don’t pool objects like Hashtables.
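
If profiling does point at an object that is genuinely expensive to build, a pool can be as small as the sketch below. `SimplePool` is a hypothetical class of mine, not something shown in the talk. It is deliberately single-threaded, never resets the objects it hands back, and assumes Java 8 or later for `Supplier`.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical, single-threaded pool for objects that are costly to create.
class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    SimplePool(Supplier<T> factory) { this.factory = factory; }

    // Reuse a released instance if one exists, otherwise build a new one.
    T acquire() {
        T obj = free.pollFirst();
        if (obj == null) {
            obj = factory.get();
            created++;
        }
        return obj;
    }

    // Hand an instance back; the caller is responsible for resetting its state.
    void release(T obj) { free.addFirst(obj); }

    int createdCount() { return created; }
}
```

A possible usage, with a large `StringBuilder` standing in for an expensive object:

```java
SimplePool<StringBuilder> pool = new SimplePool<>(() -> new StringBuilder(1 << 20));
StringBuilder sb = pool.acquire();
sb.setLength(0);   // reset before use
pool.release(sb);
```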

. How much of a performance impact do the 5.0 features have on Java code?
…The foreach construct and autoboxing are FREE! (no additional cost incurred). They’re syntactic sugar.
…Note that enums on Xeon with Sun’s Hotspot have issues (more on this in a later post, but it’s when iterating over enums).
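
To show what "syntactic sugar" means here, the sketch below writes the same loop twice: once with foreach and auto-unboxing, and once spelled out roughly the way the compiler expands it. The `Desugar` class and its method names are hypothetical. It only demonstrates source-level equivalence and makes no claim about runtime cost.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical names; a rough picture of the compiler's expansion.
class Desugar {
    // Java 5 style: foreach plus auto-unboxing of each Integer.
    static int sumForEach(List<Integer> xs) {
        int sum = 0;
        for (int x : xs) {
            sum += x;
        }
        return sum;
    }

    // Approximately what the loop above expands to.
    static int sumExplicit(List<Integer> xs) {
        int sum = 0;
        for (Iterator<Integer> it = xs.iterator(); it.hasNext(); ) {
            Integer boxed = it.next();
            sum += boxed.intValue();   // explicit unboxing
        }
        return sum;
    }

    // Autoboxing in the other direction: add(v) becomes add(Integer.valueOf(v)).
    static List<Integer> box(int... values) {
        List<Integer> out = new ArrayList<>();
        for (int v : values) {
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> xs = box(1, 2, 3);
        System.out.println(sumForEach(xs));  // prints 6
        System.out.println(sumExplicit(xs)); // prints 6
    }
}
```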

. Other predictions/advice
…Pause times are going down, and GCs are getting more efficient. Concurrent GC will be the default in most VMs in the coming years, so if possible use it now!
…Escape analysis (an optimization that can improve storage allocation and reclamation of objects) is being integrated into VMs: one less reason to pool objects.
…Locking is getting cheaper, but multi-cores will make it more expensive; a CAS instruction on x86 takes ~200+ cycles.
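
For readers who haven’t met CAS (compare-and-swap) at the Java level, here is a minimal sketch of the usual retry loop, using `AtomicInteger.compareAndSet` from the 1.5 concurrency API. The `CasMax` class is a hypothetical example of mine: it tracks the largest value seen so far without taking a lock.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: a lock-free "maximum seen so far" tracker.
class CasMax {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    void offer(int candidate) {
        while (true) {
            int current = max.get();
            if (candidate <= current) {
                return;                      // nothing to update
            }
            if (max.compareAndSet(current, candidate)) {
                return;                      // our CAS won
            }
            // Another thread changed the value in between; re-read and retry.
        }
    }

    int get() { return max.get(); }

    public static void main(String[] args) {
        CasMax m = new CasMax();
        m.offer(3);
        m.offer(7);
        m.offer(5);
        System.out.println(m.get()); // prints 7
    }
}
```

Each failed `compareAndSet` costs another round trip, which is why heavy contention across many cores makes this more expensive.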
