June 28, 2005

JavaOne Notes: Java on Linux

I was hoping there would be a bit more about Linux tuning, problems, etc. in this presentation, but it had some useful tips that might come in handy, given that I've never actually deployed a production-scale Java app under Linux (soon!).

Lots of Threads
• Old systems have a thread limit of 1024
• Don't set LD_ASSUME_KERNEL
• Use small stack sizes (e.g. -Xss48k)
• Use ulimit -s to change the stack size for native threads
• Pay attention to /proc/sys/kernel/threads-max (set automatically by the kernel based on available memory; different kernel versions compute it differently)

cat /proc/self/maps
• determine max heap size on a given system

2.4 -> 2.6 kernel changes
• Overall transition is smooth
• Some caveats:
o HZ changed from 100 to 1000
▪ Thread.sleep(1) will wake up after 1ms vs 10ms!
▪ May cause excessive context switches if in a loop
o Thread.yield() is more expensive
▪ -XX:+DontYieldALot


To run Java inside gdb and valgrind, set LD_LIBRARY_PATH first:

setenv LD_LIBRARY_PATH `java PrintLibraryPath`

// Prints the JVM's native library search path so the shell can export it
public class PrintLibraryPath {
    public static void main(String[] args) {
        System.out.println(System.getProperty("java.library.path"));
    }
}

JDK Tools
• jstack: print stack trace
• jmap: print heap usage, object histogram
• jinfo: show command line flags & system properties
• jstat: performance stats about various JVM properties

JavaOne Notes: 1.5 Features Tips and Techniques

After a less-than-thrilling session on Jini, I attended an excellent quick overview of Java 1.5 features presented by BEA engineers.

Foreach
• Avoids needing to cache the collection size before the loop (scopes the size variable appropriately)
• No access to the index or iterator (e.g. you can't remove the current element)
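
A minimal sketch of both bullets, using class and method names I made up for illustration: the enhanced for loop covers plain traversal, while removing the current element still needs an explicit Iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical example class; not from the presentation.
class ForeachExample {
    /** Plain traversal: no index, no cached size variable. */
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    /** Removing during iteration needs the Iterator that foreach hides. */
    static void removeEmpty(List<String> words) {
        for (Iterator<String> it = words.iterator(); it.hasNext();) {
            if (it.next().isEmpty()) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("java", "", "one"));
        removeEmpty(words);
        System.out.println(words + " has " + totalLength(words) + " characters");
    }
}
```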

Annotations
• @SuppressWarnings: suppresses compiler warnings for a specific block (e.g. @SuppressWarnings("deprecation") )
• JDK 1.5 currently ignores @SuppressWarnings (supported in Eclipse 3.1 and JDK 1.6)
• @SuppressWarnings values: all, deprecation, unchecked, fallthrough, etc.
• @Deprecated: can't indicate why and isn't Javadoc; if you use it, also add the Javadoc tag to give callers more info
• @Override: indicates that a method should override a superclass method; if the superclass method doesn't exist (i.e. it changed), an error is emitted
• Annotations are useful in frameworks
• Not a pre-processor: you can't modify the code that is generated; this is by design from Sun
• Not a silver bullet; don't use them for things they aren't intended for (e.g. @Property to create get/set methods: this essentially becomes a macro and makes it hard to synchronize, since you can't edit the code)
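
Here is a small sketch of the three built-in annotations from the notes. The Shape and Circle classes are hypothetical; only the annotations themselves come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example classes; not from the presentation.
class AnnotationExample {
    static class Shape {
        /** @deprecated use {@link #describe()} instead; the Javadoc tag carries the "why". */
        @Deprecated
        String name() { return "shape"; }

        String describe() { return "a generic shape"; }
    }

    static class Circle extends Shape {
        // If Shape.describe() is renamed or removed, this stops compiling.
        @Override
        String describe() { return "a circle"; }
    }

    // Silences the unchecked warnings for this one method only.
    @SuppressWarnings("unchecked")
    static List<String> legacyNames() {
        List raw = new ArrayList(); // raw type, as pre-1.5 code would return
        raw.add("circle");
        return raw;
    }

    public static void main(String[] args) {
        Shape shape = new Circle();
        System.out.println(shape.describe() + " / " + legacyNames());
    }
}
```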

Enumerations
• Type-safe (much better than public static final int!!! Failures now occur before runtime)
• Should be able to drop in 1.5 enums wherever ints are used today
• EnumMap and EnumSet are optimized specifically for using enums in maps and sets
• Can statically import enums (useful when you are using a private enum; you can import the inner class statically to avoid having to prepend the inner class name)
• Hidden static methods not in java.lang.Enum: values() returns all enum values (finally!); valueOf(String) converts from the string representation
• Enums can have methods! Very useful.
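
A quick sketch of an enum with a field and a method, plus values(), valueOf(String) and EnumMap. The Priority enum and its weights are invented for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical example; the Priority enum is not from the presentation.
class EnumExample {
    enum Priority {
        LOW(1), MEDIUM(5), HIGH(10);

        private final int weight;

        Priority(int weight) { this.weight = weight; }

        // Enums can carry behavior, not just names.
        boolean outranks(Priority other) { return weight > other.weight; }
    }

    /** Counts occurrences per constant using the enum-optimized map. */
    static Map<Priority, Integer> countByPriority(Priority... items) {
        Map<Priority, Integer> counts = new EnumMap<>(Priority.class);
        for (Priority p : items) {
            counts.merge(p, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        for (Priority p : Priority.values()) {
            System.out.println(p + " outranks LOW? " + p.outranks(Priority.LOW));
        }
        System.out.println(countByPriority(Priority.valueOf("HIGH"), Priority.LOW, Priority.LOW));
    }
}
```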

Varargs
• Use sparingly!
• Cleans up code where you have multiple methods that take varying numbers of arguments
• NOTE: Backward compatibility issues for callers that import your binary jar; changing FORCES you to recompile all clients!!!
• There is a backward compatibility map from array[] argument methods to varargs methods
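
For reference, a minimal varargs sketch (the sum method is my own example). One method replaces a family of overloads, and callers that already hold an array can still pass it directly.

```java
// Hypothetical example; not from the presentation.
class VarargsExample {
    /** Accepts zero or more ints; inside the method, values is a plain int[]. */
    static int sum(int... values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum());                   // no arguments
        System.out.println(sum(1, 2, 3));            // inline arguments
        System.out.println(sum(new int[] {4, 5}));   // existing array callers still work
    }
}
```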

Covariant Returns
• Design pattern
• Hide implementation details from API users
• Avoids bad behaviors like downcasts to internal types
• Impl can return a specific type (rather than the interface) and clients can avoid casts
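
A sketch of the last bullet, with made-up Shape and Circle types: the implementation narrows the return type of an interface method, so callers that hold the concrete type need no cast.

```java
// Hypothetical example types; not from the presentation.
class CovariantExample {
    interface Shape {
        Shape copy();
    }

    static class Circle implements Shape {
        final double radius;

        Circle(double radius) { this.radius = radius; }

        // Covariant return: narrower than the Shape declared by the interface.
        @Override
        public Circle copy() { return new Circle(radius); }
    }

    public static void main(String[] args) {
        Circle original = new Circle(2.0);
        Circle duplicate = original.copy(); // no (Circle) cast needed
        System.out.println(duplicate.radius);
    }
}
```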

Generics
• Type inference on invocation
• Type parameters of generic methods called without explicit type arguments are determined by the compiler from where the method is invoked (if the value is returned and assigned, type == assigned type)
• Type inference is not possible where no type info is available from the calling context
• Can't infer the type in certain situations where the call is returned from a method without assignment (e.g. return x ? genericObj.foo() : null )
• Can't infer the type from a cast (e.g. genericObj.foo() )
• The type CAN be passed explicitly in this case: return genericObj.<T>foo()
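
A small sketch of passing the type argument explicitly, using Collections.emptyList() as the generic method (my choice of example, not the presenters'). On a 1.5 compiler the conditional expression gives the call no assignment context to infer from, so the explicit form is needed; Java 8 and later infer more, but the explicit form still compiles everywhere.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical example; not from the presentation.
class InferenceExample {
    static List<String> names(boolean none) {
        // Explicit type argument: Collections.<String>emptyList()
        return none ? Collections.<String>emptyList()
                    : Collections.singletonList("duke");
    }

    public static void main(String[] args) {
        System.out.println(names(true) + " " + names(false));
    }
}
```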

Generics: Beyond collections
• Class<T>
public T newInstance()
• Comparable<T>
public int compareTo(T o)
• Enum<E extends Enum<E>>
public Class<E> getDeclaringClass() // returns the class that implements the enum

Using upper bound wildcards
• Array typing is too loose; generic typing is too strict
• List<Number>: list whose element type is known precisely, and that is Number
• List<? extends Number>: exact element type isn't known, but you do know it is a *subtype* of Number
• Why use <? extends T>: great for hiding implementation details
• private Set<? extends Customer> getCustomers(); // hides the fact that the class returns an internal CustomerImpl class
• Very important for interfaces to avoid exposing impl details; e.g. void removeNegatives(List<? extends Number> list)

Lower bound wildcards
• List<? super Number>
• NOTE: type safety can make code obscenely bloated
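
To make the two bounded forms concrete, here is a sketch. removeNegatives is named in the notes but its body is my guess; addZeros is purely hypothetical. A `? extends Number` list can be read (and pruned) as Numbers, while a `? super Integer` list can safely accept Integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical example; method bodies are assumptions, not from the presentation.
class WildcardExample {
    /** Works for List<Integer>, List<Double>, ...: elements are read as Number. */
    static void removeNegatives(List<? extends Number> list) {
        for (Iterator<? extends Number> it = list.iterator(); it.hasNext();) {
            if (it.next().doubleValue() < 0) {
                it.remove();
            }
        }
    }

    /** Works for List<Integer>, List<Number>, List<Object>: Integers can be added. */
    static void addZeros(List<? super Integer> sink, int count) {
        for (int i = 0; i < count; i++) {
            sink.add(0);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(Arrays.asList(1, -2, 3));
        removeNegatives(ints);
        List<Number> numbers = new ArrayList<>();
        addZeros(numbers, 2);
        System.out.println(ints + " " + numbers);
    }
}
```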

Unbounded wildcard
• List<?>: have no idea what the heck is passed in, other than it's an Object; equivalent to List<? extends Object>

If read-only access is needed, use Collections.unmodifiableCollection() for immutable collections (a wrapper from java.util.Collections)

Constructing Generics
• Example: Pair<A, B> to enable returning multiple objects from a method
• Before 1.5 generics you had to explicitly create wrapper classes (e.g. return new FileInfo(name, size))
• Under 1.5: return new Pair<File, Long>(file, size)
• Can be used for additional compile-time type safety
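
The JDK does not ship a general-purpose Pair class, so the sketch below defines a minimal one. The field names and the fileInfo helper are my own, and the element types are an assumption.

```java
// Hypothetical example; Pair is hand-rolled here, not a JDK class.
class PairExample {
    static final class Pair<A, B> {
        final A first;
        final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    /** Returns two values at once without a dedicated FileInfo wrapper class. */
    static Pair<String, Long> fileInfo(String name, long size) {
        return new Pair<String, Long>(name, size);
    }

    public static void main(String[] args) {
        Pair<String, Long> info = fileInfo("notes.txt", 2048L);
        String name = info.first;   // typed access, no casts
        long size = info.second;
        System.out.println(name + " is " + size + " bytes");
    }
}
```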

Generic Methods
• "Necessary" when parameterizing the return type (e.g. <T> List<T> reverse(List<T> list) )
• Type info for generics is lost at compile time (erasure)
• Type tokens:

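import java.lang.reflect.Array;

public class ArrayExample<T> {
    // Note from the talk: when a type token is used with clazz.newInstance(),
    // T must have a no-arg constructor!!! There is no way to enforce this, so
    // failures occur at runtime. (Array.newInstance below only allocates the
    // array and calls no constructor.)
    private final Class<T> clazz;

    public ArrayExample(Class<T> clazz) {
        this.clazz = clazz;
    }

    // The cast is unchecked because T is erased; the Class token keeps it safe.
    @SuppressWarnings("unchecked")
    public T[] getArray(int size) {
        return (T[]) Array.newInstance(clazz, size);
    }
}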

Migration 1.4 -> 1.5
• Safe to add generics gradually (binary compatible, but the code can't compile on pre-1.5; it is even safe for subclasses NOT to be recoded to use generics)
• Raw collections should be avoided as of 1.5
• Temporary use of the Collections checked wrappers can help (e.g. checkedCollection() )

JavaOne Notes: Grizzly HTTP Listener via NIO

Intro to NIO
• Selector: basic abstraction to enable multiplexed I/O
o Register one or more "selectable" channels
o Relationship between channel and selector is represented by a selection key
o Selection key remembers the events you are interested in
o Selector's select() method updates the keys which are "ready"
o Service each channel by iterating over the keys that are "ready"
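
The steps above can be sketched as a tiny single-threaded loopback demo. This is my own illustration, not Grizzly code: the class and method names are hypothetical, and the client lives in the same thread only to keep the example self-contained.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical example; not from the presentation.
class SelectorLoopExample {
    /** Sends message over a loopback connection and returns what the selector loop read. */
    static String roundTrip(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer received = ByteBuffer.allocate(payload.length);
        SocketChannel peer = null;
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT); // register a selectable channel

            // Demo client: a blocking connect completes via the listen backlog.
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(payload));
                while (received.hasRemaining()) {
                    selector.select(); // updates the set of ready keys
                    Iterator<SelectionKey> ready = selector.selectedKeys().iterator();
                    while (ready.hasNext()) {
                        SelectionKey key = ready.next();
                        ready.remove(); // ready keys must be cleared by hand
                        if (key.isAcceptable()) {
                            peer = server.accept();
                            if (peer != null) {
                                peer.configureBlocking(false);
                                peer.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            if (((SocketChannel) key.channel()).read(received) < 0) {
                                throw new EOFException("peer closed early");
                            }
                        }
                    }
                }
            }
        } finally {
            if (peer != null) {
                peer.close();
            }
        }
        received.flip();
        return StandardCharsets.US_ASCII.decode(received).toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("ping"));
    }
}
```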

Blocking/Non-Blocking
• Traditional java.net.Sockets are blocking
• Channels can be blocking or non-blocking
o Non-blocking channels
▪ Never put the invoking thread to sleep
▪ Operation completes right away and returns a result indicating what (if anything) was done
• Non-blocking makes it easy to manage many channels simultaneously
• But you have to tell the channel to be blocking or non-blocking
• Additional bookkeeping required; you may get partial results back

Buffer and ByteBuffer
• Containers for handling data
• Work very well together with Channels
o High performing if done right!

• To use Buffers you must understand
o Capacity: max number of elements
o Limit: count of "live" elements; don't read/write beyond it
o Position: index of the next element to read/write
o Mark: a remembered position
o 0 <= mark <= position <= limit <= capacity
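
A short sketch of how position and limit move during a typical write-then-read cycle. The state helper is my own; the ByteBuffer calls are standard.

```java
import java.nio.ByteBuffer;

// Hypothetical example; not from the presentation.
class BufferStateExample {
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit() + " capacity=" + b.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        System.out.println(state(buf)); // position=0 limit=8 capacity=8

        buf.put((byte) 1).put((byte) 2).put((byte) 3); // writing advances position
        System.out.println(state(buf)); // position=3 limit=8 capacity=8

        buf.flip(); // switch to reading: limit = old position, position = 0
        System.out.println(state(buf)); // position=0 limit=3 capacity=8

        buf.get(); // reading advances position too
        System.out.println(state(buf)); // position=1 limit=3 capacity=8

        buf.clear(); // ready for writing again; the data is not erased
        System.out.println(state(buf)); // position=0 limit=8 capacity=8
    }
}
```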

• ByteBuffers: two flavors
o HeapByteBuffer
▪ Underlying storage maintained in a Java language byte[]
o DirectByteBuffer
▪ Underlying storage maintained in native code, not in the Java language heap

• Tradeoffs with both of these approaches

• What's true and what's not true?
o Using NIO SocketChannels and ByteBuffers is easy == true
o Building a high performing and highly scalable app with NIO is easy == false
o Non-blocking is for server-side apps only == false

Grizzly is part of App Server 9.0 (GlassFish); open sourced

Grizzly integrates w/ current Apache Tomcat HTTP Connector architecture (Coyote)

Problems w/ NIO Non-Blocking
• No guarantee that data is processed
count = socketChannel.read(byteBuffer)
• May not read the entire stream contents, requiring an extra read
o Occurs frequently when reading HTTP requests
• Same issue exists with writing data
A task-based architecture has been used, with each task representing an operation on the byte stream
• AcceptTask for managing OP_ACCEPT
• ReadTask for managing OP_READ
• ProcessorTask for parsing

Tasks can execute on their own thread, on a shared thread or using the Selector thread

• Better off reading and processing in the same thread
• Single thread pool for read/process
• Thread doing accepts, pool doing reads

Non-Blocking NIO is very similar to libc select()/poll()

The Grizzly GW supports 2 strategies:
• Mode Blocking:
• Mode Non-Blocking: w/ algorithm to predict byte stream completion
• These are the modes that have produced the best performance

Lessons Learned implementing Grizzly
• Buffer management is crucial
• DirectByteBuffer (DBB) is a big win
• DBB expensive to allocate
• Best Practices
o Use DBB w/ SocketChannels to avoid unnecessary copying of data
o Avoid copying data out of DBB
o Create view buffers to minimize DBB allocation costs
• What's a view buffer?
o A buffer that manages data in another buffer

• Why useful?
o Allocate one very large DBB and create smaller "views" or view buffers
o Reduce number of costly DBB allocations
o Can re-use views
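
A sketch of the view-buffer idea using ByteBuffer.slice(). The carve helper and its sizes are hypothetical: one direct allocation is split into equally sized, independent views.

```java
import java.nio.ByteBuffer;

// Hypothetical example; not Grizzly code.
class ViewBufferExample {
    /** Carves count views of viewSize bytes each out of a single direct buffer. */
    static ByteBuffer[] carve(int count, int viewSize) {
        ByteBuffer big = ByteBuffer.allocateDirect(count * viewSize); // one costly allocation
        ByteBuffer[] views = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            big.limit((i + 1) * viewSize);
            big.position(i * viewSize);
            views[i] = big.slice(); // shares storage, has its own position/limit
        }
        return views;
    }

    public static void main(String[] args) {
        ByteBuffer[] views = carve(4, 16);
        views[0].put((byte) 42); // only view 0's position moves
        System.out.println(views[0].position() + " " + views[1].position()
                + " direct=" + views[1].isDirect());
    }
}
```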

Gotchas
• Register the "key" in the same thread doing the select
o Unpredictable behavior in 1.4.x JVMs
• Enable/disable a key's operation of interest in the same thread as the select
• Selector.select() and Selector.wakeup() are expensive
• Buffers: too many leads to GC issues
• Keep-Alive connections are well suited to using NIO (connection create/teardowns aren't any more efficient under NIO)

JavaOne Notes: HotSpot Platform Performance

Presented by David Dagastine and Brian Doherty (Sun), including materials on benchmarks for the current JVMs out there. Not directly useful to day-to-day implementation, but useful to know how they've really optimized using dynamic tuning so that hand tuning of the JVM is less necessary. They claim that on all the benchmarks (ok, benchmarks != real world) they can only eke out at most a 5% improvement by hand tuning the JVM options!!!

Java performance analysis is harder than for C/C++: there is a lot more variability under the covers (the dynamic JIT compiler, garbage collection, thread scheduling, etc.), so you must approach it statistically.

We do analysis by using Student's t-test
• Determines the probability that two data sets are from the same population (p-value)
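
As a rough sketch of the statistic involved, the code below computes the pooled-variance (Student's) two-sample t value. The sample arrays are made up, and turning t into a p-value needs the t-distribution CDF, which the JDK does not provide, so that step is left to a statistics library or a lookup table.

```java
// Hypothetical example; the timing data is invented for illustration.
class TTestExample {
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample variance (divides by n - 1). */
    static double variance(double[] xs) {
        double m = mean(xs);
        double sumSquares = 0;
        for (double x : xs) {
            sumSquares += (x - m) * (x - m);
        }
        return sumSquares / (xs.length - 1);
    }

    /** Two-sample t statistic with pooled variance; degrees of freedom = n1 + n2 - 2. */
    static double tStatistic(double[] a, double[] b) {
        double pooled = ((a.length - 1) * variance(a) + (b.length - 1) * variance(b))
                / (a.length + b.length - 2);
        return (mean(a) - mean(b)) / Math.sqrt(pooled * (1.0 / a.length + 1.0 / b.length));
    }

    public static void main(String[] args) {
        double[] baseline = {1, 2, 3, 4, 5};
        double[] tuned = {2, 3, 4, 5, 6};
        System.out.println("t = " + tStatistic(baseline, tuned));
    }
}
```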

Beware the Micro-Benchmark
• Make sure you understand what you are measuring (e.g. for testing writes: disk cache settings, processor speed, number of processors, etc.)
• Interestingly, runtime compilation generally occurs in a separate thread (the -Xbatch option forces compilation to occur sequentially in the thread requesting compilation)
• Always ensure you warm up your benchmark
• Use the -XX:+PrintCompilation flag
o Verify that no compilation is occurring during the measurement interval

Watch out for platform differences
• Heap size
• Garbage collection policy
• -client/-server ergonomics
o -client is default on 32-bit Windows
o -server is default on Solaris/Linux when there are 2 or more processors AND 2GB or more of RAM
• Compilation policy
o -Xbatch is default on 32-bit Windows
• If results look too good to be true, they probably are.

Different JVMs have different RAS performance (Reliability, availability, and serviceability)

J2SE 5.0 foci
• Language features
• Performance
o Out-of-the-box server performance (-server ergonomics)
o Client startup and footprint
o We want our JVM to tune things dynamically, so you worry less about performance
• Quality
o 8000 bug fixes in Tiger!!!
o Far more test cases (compatibility, performance, etc.)

Client performance
• Sun JVM (on Windows) has the smallest footprint and fastest startup
o Note: this compares Sun's client JVM to IBM's and BEA's server JVMs, which is a bit misleading; all out of the box, no options.
o JVM footprint (small -> large): Sun J2SE 1.4.2; Sun J2SE 5.0; JSE 6; (huge step up) IBM 1.4.2; BEA 1.4.2; (really high) BEA 5.0
o JVM startup (small -> large): Sun J2SE 5.0; Sun JSE 6; Sun J2SE 1.4.2; IBM 1.4.2; BEA 1.4.2; BEA 5.0

• BEA 5.0 was worst in all benchmarks.
• Sun JVM startup is fastest (particularly on J2SE 5.0 and JSE 6)
• BEA does compilation at startup vs HotSpot, where compilation occurs when the code is first used

• Class Data Sharing is not enabled for "-server"

Server performance
• Out-of-box; we are spending our time focusing on dynamic tuning so in many cases you don't have to
• Under our latest HotSpot JVM, the best improvement you can get from hand tuning (vs default JVM startup options) is 5% across all the benchmarks!!!
• Competitive performance options based on tuning can be found at: http://www.spec.org
• Under 64-bit, JRockit 5.0 is beaten only by IBM 1.4.2 and Sun JSE 6 (J2SE is about 20% slower)
• For J2SE 5.0 their extensive tuning options were: -Xbatch -Xms1600m -Xmx1600m -XX:+AggressiveHeap -Xss128k

HotSpot Server Tuning Recommendations
• Use large pages
o Before J2SE 5.0
▪ No good options on Linux and Windows
o J2SE 5.0 and beyond
▪ Use -XX:+UseLargePages on Linux and Windows
• Limit ParallelGC threads when running multiple JVMs on a large box
o Make sure the total number of GC threads <= the number of cores/CPUs
• Don't use AggressiveHeap with J2SE 5.0 or later
o Just increase heap size (if needed); default heap size is up to 1GB on server class machines
• Do use the "-server" flag

JavaOne Notes: Achieving Great File I/O Performance

Basic mom-and-apple-pie presentation on I/O that spent a long time warming up the basic concepts and the Java APIs. A good presenter from BEA shared a few learnings, but nothing shocking. Spend about an hour with the file I/O Javadocs and a really good book that evaluates the actual performance internals, and you would likely have captured the same.

What to think about when doing file I/O
• Data access pattern (random, sequential)
• Data size read/written at once
• How much of the file will be read/written
• How durable must the disk writes be (OS crash, power failure)

Writing Safely
• OS normally caches all disk writes (data may be lost in the event of an OS crash)
• Most disks also cache all disk writes (data lost on power failure)
• APIs are available that let you "force" the I/O
• They make sure data is not cached by the O/S or disk
• These APIs do not always work!
o Depends on the O/S, filesystem, disk
o For IDE and FireWire it depends on hardware, device driver, and config settings for the driver
o Enterprise (SCSI, etc.) hardware usually avoids these problems

General I/O concept: Bigger is better
• A few large I/Os are faster than many small I/Os
• There is overhead in traversing all the layers between app and disk
• Java -> JNI -> system call -> OS -> file system -> cache -> driver -> device

General I/O concept: Extending files is slow
• When you write data to an existing file, you just write to the disk blocks where your data goes
• When you write past the EOF, the OS must keep track of the new file size
• When you must extend, try doing it in larger chunks

FileOutputStream
• Easy to use: everyone knows the paradigm
• Only supports sequential IO
• File IO may be easily buffered using BufferedOutputStream
• Limited force options!
• Suggestion: use BufferedOutputStream for buffering
o But don't forget to close or flush when you're done!
o Application exit doesn't close/flush!!!
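
A minimal sketch of buffered sequential writes. The 64 KB buffer size and the record contents are arbitrary choices of mine, and try-with-resources (Java 7 and later) stands in for the explicit close the notes warn about.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example; not from the presentation.
class BufferedWriteExample {
    /** Writes many small records; the buffer turns them into a few large writes. */
    static void writeLines(Path file, int lines) throws IOException {
        byte[] line = "log entry\n".getBytes(StandardCharsets.US_ASCII);
        try (OutputStream out =
                 new BufferedOutputStream(new FileOutputStream(file.toFile()), 64 * 1024)) {
            for (int i = 0; i < lines; i++) {
                out.write(line);
            }
        } // close() flushes the buffer; skipping it can silently lose the tail of the data
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffered", ".log");
        writeLines(file, 1000);
        System.out.println(Files.size(file) + " bytes written");
    }
}
```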

RandomAccessFile
• Totally random IO
• Supports all force options
• Suggestion: buffer your IO if you can to increase throughput; however, you have to do this in your own code!

FileChannel
• You get a FileChannel from RandomAccessFile
• Supports all force modes
• Implements the Channel interface(s)
o Takes NIO byte buffers as parameters
• Supports "transfer": transfer a file to a socket; at the OS kernel level on Linux, transfer data from a socket to a file and vice versa (abstracted in Java); very fast on those platforms!
• Suggestions:
o Performance issues for all small writes in some JVMs
o Solution: allocate a single direct ByteBuffer and copy to and from it
• ByteBuffer is a simple wrapper object around a byte array
• DirectByteBuffer doesn't allow you to access the byte array directly; useful for JNI native code integration; the downside is that access time to a DirectByteBuffer is slow, and allocation and garbage collection are also much more expensive
• If you are using the regular ByteBuffer, the performance on Sun JVMs doesn't seem to be as good as RandomAccessFile (currently); something is happening in the JVM

MappedByteBuffer
• Created from a FileChannel
• Very convenient to map a file into memory; can be extremely fast IF you can use it
• Creates a big ByteBuffer containing your file
• You may explicitly force updates to disk
• Fatal flaw in Java: there is no way to quickly unmap a file; mapping large files into a 32-bit JVM address space may start causing out-of-memory errors for other allocations in your app
• Handy class to use IF the amount of data is sufficiently small OR if you are on a 64-bit OS
• Suggestions:
o Use to randomly read and write small files if data integrity is unimportant
o Be very careful on 32-bit JVMs
o Don't use too many; there's no "close" call!!!!
o I believe the file is unmapped during the finalizer, but NO promises here!

Reliable Disk IO
• IO writes may be forced explicitly
• RandomAccessFile: getFD().sync() (the "flush" call)
o Forces contents AND meta-data
o Equivalent of Unix fsync()
• FileChannel.force(boolean metaData) (the "flush" call)
o Forces contents and OPTIONALLY meta-data
o Equivalent of fsync() or fdatasync()
• Meta-data is the last mod time, etc.; modifying it means disk writes have to occur in 2+ places on disk
• Ignoring meta-data sync is slightly faster
• Or implicitly:
• RandomAccessFile "rws" mode
o Forces contents and meta-data
o Equivalent of open(O_SYNC)
• RandomAccessFile "rwd" mode
o Forces contents only
o Equivalent of open(O_DSYNC)
• If you can use it, "rwd" mode does the write in one system call vs multiple trips to write the file and the file meta-data.
• Implicit vs explicit puts control in your hands
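
A sketch contrasting the implicit and explicit styles. The class and helper names are mine; the calls used are RandomAccessFile's "rwd" open mode and FileChannel.force(boolean), where passing false skips the meta-data sync. As the notes warn, whether the data really reaches the platter still depends on the OS, filesystem and disk.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example; not from the presentation.
class DurableWriteExample {
    /** Implicit: in "rwd" mode every write is synchronous (file contents only). */
    static void appendSync(Path file, byte[] record) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rwd")) {
            raf.seek(raf.length());
            raf.write(record);
        }
    }

    /** Explicit: write normally, then force the contents (not the meta-data) to disk. */
    static void appendAndForce(Path file, byte[] record) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel channel = raf.getChannel()) {
            channel.position(channel.size());
            ByteBuffer buffer = ByteBuffer.wrap(record);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(false);
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("txlog", ".bin");
        appendSync(log, new byte[] {1, 2, 3});
        appendAndForce(log, new byte[] {4, 5});
        System.out.println(Files.size(log) + " bytes in log");
    }
}
```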

Reliable Disk Writes
• Example: transaction log, reliable messaging log
• Possible solutions:
o Call "flush" (an explicit sync/force) after each write
o Open the file with a "sync" flag like "rwd" or "rws"
o Use a memory mapped file and call "force"
• Results:
o Using the "rwd" mode will be fastest
o Explicit flush was 2x map force for write sizes from 50-4096 bytes; rwd was 50%-300% faster than explicit flush (smaller writes are obviously much faster when buffered internally)
o Explicit flush and map force mean many JVM -> OS system calls

Random Reads
• Program: big file/database of several 100MB
• Randomly looking at bits and pieces
• Possible solutions:
o Use standard file IO APIs
o Use memory mapping
• Results
o Memory mapping (particularly for smaller reads) is tremendously faster
o If you can deal w/ the address space issues w/ memory mapping you can get phenomenal read speed
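
A sketch of random reads through a mapped file. The helper name and the tiny test file are my own; a real workload would map a file of several hundred MB once and keep the mapping around.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example; not from the presentation.
class MappedReadExample {
    /** Maps the whole file once, then reads single bytes at arbitrary offsets. */
    static long sumBytesAt(Path file, int[] offsets) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            for (int offset : offsets) {
                sum += map.get(offset) & 0xFF; // absolute get: no seek, no system call
            }
            return sum;
        } // the mapping itself stays valid until the buffer is garbage collected
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped", ".bin");
        byte[] data = new byte[256];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(file, data);
        System.out.println(sumBytesAt(file, new int[] {0, 10, 255})); // prints 265
    }
}
```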

Direct I/O: Things you can't do in Java
• Every O/S today has a feature called "direct I/O"
• Bypasses the O/S buffer cache, generally reducing CPU usage for large I/O
• Designed for use by databases
• There is no Java API for this, so we must use native code
• In his example app he got 2x the I/O throughput (MB/sec); however, he stated that his tests were a bit old

Q&A
• Guidelines for choosing buffer sizes on BufferedOutputStreams? No great suggestion other than to make them as big as possible; you'd actually have to do some performance tests
• How do these techniques apply to network I/O? I don't know; certainly buffering data applies to network I/O
• Will this also apply to SAN and solid-state devices? To some degree; remember that minimizing trips to the kernel and file system still applies! I don't have benchmarks.

June 27, 2005

43 Places, cool site for addicted travellers!

Those innovative guys over at the Robot Co-Op just launched something very similar to an idea I've been rolling around for a few years now; they call it 43places.com.

Nifty way to track, re-live, and share experiences on places you've visited as well as a launching pad for tracking and discovering new places you want to visit in this world.

See their blog announcement of 43 Places for a few more details.