
October 26, 2006

Carbonado

As Werner Vogels mentioned, congrats to Brian and his fellow engineers responsible for building and releasing Carbonado!

March 15, 2006

Amazon S3

Amazon.com just launched S3 (Simple Storage Service), a serious game changer. Should be extremely interesting to follow what innovators integrate S3 into over the coming months.

Definitely see interesting backup solutions and potentially digital content/software distribution via their BitTorrent interface. Very nice touch! Naysayers may balk at the cost per GB, but I think they are really missing the boat here. To have your data securely stored across multiple physical locations (to avoid complete loss in a fire, etc.) is quite costly as a home user, particularly when you only need to store a small amount (e.g. 10GB) of data. Think about it. If you bought the smallest drive to fit that data (30GB) it costs you ~$25 (per pricewatch w/ S&H). Now buy 2 for local RAID, then 2 more for storing offsite. You are already up to $100. And that assumes the offsite disks are manually brought back to transfer data, and then manually carried (e.g. walked over in sneakers) to the offsite location. Want online access? Pony up for a CPU, case, motherboard, network connection, etc. Easily another $100+.

With S3, in comparison, backing up 10GB costs only $1.55/month + $2 transfer cost. The peace of mind knowing all my digital photos are safely archived offsite (and that I can access them from anywhere) is worth MUCH more than $1.50/month!

Also interesting: the Amazon Web Services team published the principles of distributed system design used to meet their requirements for S3.

  • Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.
  • Asynchrony: The system makes progress under all circumstances.
  • Autonomy: The system is designed such that individual components can make decisions based on local information.
  • Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.
  • Controlled concurrency: Operations are designed such that no or limited concurrency control is required.
  • Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.
  • Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.
  • Decomposed components: Do not try to provide a single service that does everything for everyone, but instead build small components that can be used as building blocks for other services.
  • Symmetry: Nodes in the system are identical in terms of functionality, and require no or minimal node-specific configuration to function.
  • Simplicity: The system should be made as simple as possible (but no simpler).

June 28, 2005

JavaOne Notes: Java on Linux

I was hoping there would be a bit more about Linux tuning, problems, etc. in this presentation, though it had some useful tips that might come in handy given I've never actually deployed a production-scale Java app under Linux (soon!).

Lots of Threads
  • Old systems have a thread limit of 1024
  • Don't set LD_ASSUME_KERNEL
  • Use small stack sizes (e.g. -Xss48k)
  • Use ulimit -s to change stack size for native threads
  • Pay attention to /proc/sys/kernel/threads-max (automatically set by kernel based on memory available; different kernel versions compute differently)
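
Besides the global -Xss flag, a stack size can also be requested per thread from Java code. A minimal sketch, assuming the four-argument Thread constructor is acceptable for your app; the class and method names here are made up for illustration, and the JVM treats the size purely as a hint:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class SmallStackThreads {
    // Starts `count` threads that each request `stackBytes` of stack and
    // returns how many of them ran to completion.
    static int runWorkers(int count, long stackBytes) {
        AtomicInteger finished = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // The last constructor argument is only a hint; the JVM may round
            // it up to a platform minimum or ignore it entirely.
            Thread worker = new Thread(null, finished::incrementAndGet, "worker-" + i, stackBytes);
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return finished.get();
    }
}
```

Calling SmallStackThreads.runWorkers(20, 48 * 1024) asks for roughly the 48k stacks mentioned above; whether that is actually honored depends on the JVM and OS.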

cat /proc/self/maps
  • determine max heap size on a given system

2.4 -> 2.6 kernel changes
  • Overall transition is smooth
  • Some caveats:
    o HZ changed from 100 to 1000
      - Thread.sleep(1) will wake up after 1ms vs 10ms!
      - May cause excessive context switches if in a loop
    o Thread.yield() is more expensive
      - -XX:+DontYieldALot


To run java inside gdb and valgrind:

setenv LD_LIBRARY_PATH `java PrintLibraryPath`

public class PrintLibraryPath {
    public static void main(String[] args) {
        // Print the JVM's native library search path so the shell can capture it.
        System.out.println(System.getProperty("java.library.path"));
    }
}

JDK Tools
  • jstack: print stack trace
  • jmap: print heap usage, object histogram
  • jinfo: show command line flags & system properties
  • jstat: performance stats about various JVM properties

JavaOne Notes: 1.5 Features Tips and Techniques

After a less-than-thrilling session on Jini, I attended an excellent quick overview presentation of Java 1.5 features by BEA engineers.

Foreach
  • avoids needing to cache collection size pre-loop (scopes size var appropriately)
  • no access to index or iterator (e.g. remove current element)
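
To make the trade-off concrete, here is a small sketch (class and method names are mine, not from the talk): foreach is fine for reading, but removing the current element still needs an explicit Iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ForeachDemo {
    // Read-only traversal: no index or size variable to manage.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Removing while iterating needs the iterator, which foreach hides.
    static List<Integer> withoutNegatives(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        for (Iterator<Integer> it = copy.iterator(); it.hasNext(); ) {
            if (it.next() < 0) {
                it.remove();
            }
        }
        return copy;
    }
}
```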

Annotations
  • @SuppressWarnings: suppresses compiler warnings for a specific block (e.g. @SuppressWarnings("deprecation") )
  • 1.5 currently ignores SuppressWarnings (supported in Eclipse 3.1 and JDK 1.6)
  • SuppressWarnings (all, deprecation, unchecked, fallthrough, etc)
  • @Deprecated: can't indicate why, and isn't a javadoc tag; if you use it, also put in the javadoc tag to give more info to callers
  • @Override: indicates that a method should be overriding a superclass method; if the superclass method doesn't exist (i.e. it changed) an error is emitted
  • Annotations are useful in frameworks
  • Not a pre-processor: you can't modify the code that is generated; by design from Sun
  • Not a silver bullet; don't use them for things they aren't intended for (e.g. @Property to create get/set methods; this becomes essentially a macro and makes it hard to synchronize since you can't edit the code)
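
A quick sketch of the three built-in annotations in one place. Shape, Circle and AnnotationDemo are hypothetical names picked for illustration only:

```java
import java.util.ArrayList;
import java.util.List;

class Shape {
    /** @deprecated use {@link #describe()} instead; the javadoc tag carries the "why". */
    @Deprecated
    String name() {
        return describe();
    }

    String describe() {
        return "shape";
    }
}

class Circle extends Shape {
    // If Shape.describe() were renamed, this would become a compile error
    // instead of silently turning into a brand-new method.
    @Override
    String describe() {
        return "circle";
    }
}

final class AnnotationDemo {
    // Silences the deprecation and raw-type warnings for this one method only.
    @SuppressWarnings({"deprecation", "unchecked", "rawtypes"})
    static String legacyCall() {
        List raw = new ArrayList();
        raw.add(new Circle().name());
        return (String) raw.get(0);
    }
}
```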

Enumerations
  • Type-safe (much better than public static final int!!! failures occur pre-runtime now)
  • Should be able to drop in 1.5 enums wherever ints are used today
  • EnumMap and EnumSet are optimized specifically for when you are using enums in maps and sets
  • Can statically import enums (useful in the case where you are using a private enum: you can import the inner class statically to avoid having to prepend the inner class name)
  • Hidden static methods not in java.lang.Enum: values() returns all enum values (finally!); valueOf(String) to convert string rep
  • Enums can have methods! Very useful.
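
The points above fit in a few lines. A sketch with a made-up Level enum (the names and weights are illustrative assumptions):

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

enum Level {
    LOW(1), MEDIUM(5), HIGH(10);

    private final int weight;

    Level(int weight) {
        this.weight = weight;
    }

    // Enums can carry behavior, not just names.
    boolean isAtLeast(Level other) {
        return weight >= other.weight;
    }
}

final class EnumDemo {
    // EnumMap is keyed by the enum's ordinal internally, so it stays compact.
    static Map<Level, String> labels() {
        Map<Level, String> labels = new EnumMap<>(Level.class);
        for (Level level : Level.values()) { // compiler-generated values()
            labels.put(level, level.name().toLowerCase());
        }
        return labels;
    }

    // EnumSet covers the "set of flags" use case that int bit masks used to fill.
    static EnumSet<Level> urgent() {
        return EnumSet.range(Level.MEDIUM, Level.HIGH);
    }
}
```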

Varargs
  • Use sparingly!
  • Cleans up code where you have multiple methods that take varying numbers of arguments
  • NOTE: Backward compatibility issues for callers that import your binary jar; changing FORCES you to recompile all clients!!!
  • There is a backward compatibility map from array[] argument methods to var arg methods
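
A minimal sketch of the idea (VarargsDemo is a made-up name): one varargs method replaces a family of overloads, and callers that already pass an array keep working.

```java
final class VarargsDemo {
    // Replaces sum(int), sum(int, int), sum(int, int, int), ...
    static int sum(int... values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }
}
```

VarargsDemo.sum(), VarargsDemo.sum(1, 2, 3) and VarargsDemo.sum(new int[] {4, 5}) all compile against the same method.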

Covariant Returns
  • Design pattern
  • Hide implementation details from API users
  • Avoids bad behaviors like downcasts to internal types
  • Impl can return specific type (rather than interface) and clients can avoid casts
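
Roughly what this looks like in code; Report and SalesReport are hypothetical classes used only to show the narrowed return type:

```java
class Report {
    Report copy() {
        return new Report();
    }
}

class SalesReport extends Report {
    // Covariant return: the override narrows Report to SalesReport.
    @Override
    SalesReport copy() {
        return new SalesReport();
    }

    int regionCount() {
        return 3;
    }
}

final class CovariantDemo {
    // No downcast needed to reach the subclass-only method.
    static int regionsOfCopy() {
        return new SalesReport().copy().regionCount();
    }
}
```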

Generics
  • Type inferencing on invocation
  • Generics w/ no parameters are determined by compiler by looking at where invoked (if value is returned and assigned, type == assigned type)
  • Type inferencing is not possible where no type info is available from context of being called
  • Can't determine inference where called as a return from a method w/o assignment under certain situations (e.g. return x ? genericObj.foo() : null )
  • Can't determine inference by being cast (e.g. (Type) genericObj.foo() )
  • Type CAN be passed explicitly in this case: return genericObj.<Type>foo()
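
A sketch of the two cases using Collections.emptyList(), which is a real generic method with no arguments to infer from. Note that newer compilers (Java 8 and later) infer more than the 1.5 compiler discussed in the talk, so the explicit form is shown as the always-safe option rather than a strict requirement:

```java
import java.util.Collections;
import java.util.List;

final class InferenceDemo {
    // T is inferred from the assignment target.
    static List<String> inferred() {
        List<String> names = Collections.emptyList();
        return names;
    }

    // Explicit type argument: works even where the compiler has no context.
    static List<String> explicit(boolean empty) {
        return empty ? Collections.<String>emptyList() : null;
    }
}
```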

Generics: Beyond collections
  • Class<T>
      public T newInstance()
  • Comparable<T>
      public int compareTo(T o)
  • Enum<E extends Enum<E>>
      public Class<E> getDeclaringClass() // return class that impls enum

Using upper bound wildcards
  • Array typing is too loose; generic typing is too strict
  • List<Number>: list whose element type is known precisely, and that is Number
  • List<? extends Number>: exact element type isn't known, but you do know it is a *subtype* of Number
  • Why use <? extends T>: great for hiding implementation details
  • private Set<? extends Customer> getCustomers(); // hides the fact that the class returns an internal CustomerImpl class
  • Very important for interfaces to avoid exposing impl details; e.g. void removeNegatives(List<? extends Number> list)

Lower bound wildcards
  • List<? super T>
  • NOTE: type safety can make code obscenely bloated

Unbounded wildcard
  • List<?>: have no idea what the heck is passed in, other than it's an object; equivalent to List<? extends Object>
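
The three wildcard flavors side by side, as a sketch. WildcardDemo and its method bodies are my own illustration; removeNegatives borrows the name from the notes, but its body is an assumption and uses removeIf, which needs Java 8 or later:

```java
import java.util.List;

final class WildcardDemo {
    // ? extends Number: elements can be read as Number, whatever the exact type.
    static double sum(List<? extends Number> values) {
        double total = 0;
        for (Number value : values) {
            total += value.doubleValue();
        }
        return total;
    }

    // Removal needs no knowledge of the exact element type either.
    static void removeNegatives(List<? extends Number> list) {
        list.removeIf(n -> n.doubleValue() < 0);
    }

    // ? super Integer: any list that can accept Integers
    // (List<Integer>, List<Number>, List<Object>).
    static void fillWithOnes(List<? super Integer> sink, int count) {
        for (int i = 0; i < count; i++) {
            sink.add(1);
        }
    }

    // Unbounded: only Object-level operations are available.
    static int count(List<?> anything) {
        return anything.size();
    }
}
```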

If read-only access is needed, use Collections.unmodifiableCollection() for immutable collections (wrapper from java.util.Collections)

Constructing Generics
  • Example: Pair<A, B> to enable returning multiple objects from a method
  • Before 1.5 Generics you had to explicitly create wrapper classes (e.g. return new FileInfo(name, size))
  • Under 1.5: return new Pair(file, size)
  • Can be used for additional compile time type safety
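
Java has no built-in Pair, so here is a minimal sketch of what such a class might look like. The field and accessor names are my assumptions, and the diamond operator needs Java 7 or later:

```java
final class Pair<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A first() {
        return first;
    }

    B second() {
        return second;
    }
}

final class PairDemo {
    // Returns two values without a one-off FileInfo-style wrapper class.
    static Pair<String, Long> describe(String name, long size) {
        return new Pair<>(name, size);
    }
}
```

The caller gets both values back with their static types intact, so no casts are needed.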

Generic Methods
  • "Necessary" when parameterizing the return (e.g. <T> List<T> reverse(List<T> list) )
  • Type info for generics is lost at compile time (erasure)
  • Type Tokens:

import java.lang.reflect.Array;

public class ArrayExample<T> {
    // T must have a no arg constructor!!! No way to enforce, failures
    // will occur at runtime if someone
    private final Class<T> clazz;

    // The Class<T> token carries the element type past erasure.
    public ArrayExample(Class<T> clazz) {
        this.clazz = clazz;
    }

    @SuppressWarnings("unchecked") // Array.newInstance returns Object
    public T[] getArray(int size) {
        return (T[]) Array.newInstance(clazz, size);
    }
}

Migration 1.4 -> 1.5
  • Safe to add generics gradually (binary compatible, but can't compile on pre-1.5; even safe for subclasses to NOT be recoded to use generics)
  • Raw collections should be avoided as of 1.5
  • Temporary use of Collections' checked wrappers can help (e.g. checkedCollection() )
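
A sketch of how a checked wrapper catches legacy raw-type code at the point of insertion rather than much later at a cast. CheckedDemo is a made-up name; Collections.checkedList is the List flavor of the wrapper mentioned above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CheckedDemo {
    // Returns true if the wrapper rejected an element of the wrong type.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean rejectsWrongType() {
        List<String> names = Collections.checkedList(new ArrayList<String>(), String.class);
        List raw = names; // what un-migrated, pre-generics code sees
        try {
            raw.add(Integer.valueOf(42)); // compiles against the raw type
            return false;
        } catch (ClassCastException expected) {
            return true; // fails fast, at the offending add()
        }
    }
}
```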

JavaOne Notes: Grizzly HTTP Listener via NIO

Intro to NIO
  • Selector: basic abstraction to enable multiplexed I/O
    o Register one or more "selectable" channels
    o Relationship between channel and selector represented by a selection key
    o Selection key remembers the events you are interested in
    o Selector's select() method updates the keys which are "ready"
    o Service each channel by iterating over the keys that are "ready"
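
The register/select/iterate cycle can be sketched without any sockets by using an in-process java.nio.channels.Pipe. This is only an illustration of the Selector API, not how Grizzly does it; SelectorDemo and roundTrip are made-up names:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class SelectorDemo {
    // Writes `message` into a pipe and reads it back through a Selector.
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try {
            Pipe pipe = Pipe.open();
            try (Selector selector = Selector.open();
                 Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);                 // selectable channels must be non-blocking
                source.register(selector, SelectionKey.OP_READ); // the key remembers our interest

                ByteBuffer outgoing = ByteBuffer.wrap(payload);
                while (outgoing.hasRemaining()) {
                    sink.write(outgoing);
                }

                ByteArrayOutputStream received = new ByteArrayOutputStream();
                ByteBuffer buffer = ByteBuffer.allocate(64);
                while (received.size() < payload.length) {
                    selector.select(); // blocks until at least one key is ready
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove(); // the selected set is never cleared for us
                        if (key.isReadable()) {
                            buffer.clear();
                            int n = source.read(buffer); // may be a partial read
                            if (n > 0) {
                                received.write(buffer.array(), 0, n);
                            }
                        }
                    }
                }
                return new String(received.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The outer loop keeps selecting until every byte has arrived, since a single read may return only part of the data.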

Block/Non-Blocking
  • Traditional java.net.Sockets are blocking
  • Channels can be blocking or non-blocking
    o Non-blocking channels
      - Never put the invoking thread to sleep
      - Operation completes right away and returns a result indicating what (if anything) was done
  • Non-blocking makes it easy to manage many channels simultaneously
  • But, you have to tell the channel to be blocking or non-blocking
  • Additional bookkeeping required; you may get partial results back

Buffer and ByteBuffer
  • Containers for handling data
  • Work very well together with Channels
    o High performing if done right!

  • To use Buffers you must understand
    o Capacity: max num of elements
    o Limit: count of "live" elements, don't read/write beyond
    o Position: index of next element to read/write
    o Mark: a remembered position
    o 0 <= mark <= position <= limit <= capacity
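
The four properties are easiest to follow by watching them change. A small sketch (BufferStateDemo is a made-up name) that records position/limit/capacity after each step:

```java
import java.nio.ByteBuffer;

final class BufferStateDemo {
    static String state(ByteBuffer b) {
        return "pos=" + b.position() + " lim=" + b.limit() + " cap=" + b.capacity();
    }

    static String[] trace() {
        ByteBuffer b = ByteBuffer.allocate(8);
        String fresh = state(b);    // pos=0 lim=8 cap=8

        b.put((byte) 1).put((byte) 2).put((byte) 3);
        String written = state(b);  // pos=3 lim=8 cap=8

        b.flip();                   // switch from writing to reading
        String flipped = state(b);  // pos=0 lim=3 cap=8

        b.get();                    // pos -> 1
        b.mark();                   // remember position 1
        b.get();                    // pos -> 2
        b.reset();                  // back to the mark
        String reset = state(b);    // pos=1 lim=3 cap=8

        return new String[] {fresh, written, flipped, reset};
    }
}
```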

  • ByteBuffers: two flavors
    o HeapByteBuffer
      - Underlying storage maintained in a Java language byte[]
    o DirectByteBuffer
      - Underlying storage maintained in native code, not in the Java language heap
  • Tradeoffs with both of these approaches

  • What's true and what's not true?
    o Using NIO SocketChannels and ByteBuffers is easy == true
    o Building a high performing and highly scalable app with NIO is easy == false
    o Non-blocking is for server-side apps only == false

Grizzly is part of App Server 9.0 (GlassFish); Open Sourced

Grizzly integrates w/ current Apache Tomcat HTTP Connector architecture (Coyote)

Problems w/ NIO Non-Blocking
  • No guarantee that data is processed
      count = socketChannel.read(byteBuffer)
  • May not read the entire stream contents, requiring an extra read
    o Occurs frequently when reading HTTP requests
  • Same issue exists with writing data
A task-based architecture has been used, each task representing an operation when manipulating the byte stream
  • AcceptTask for managing OP_ACCEPT
  • ReadTask for managing OP_READ
  • ProcessorTask for parsing

Tasks can execute on their own thread, on a shared thread or using the Selector thread

  • Better off reading and processing in same thread
  • Single thread pool for read/process
  • Thread doing accepts, pool doing reads

Non-Blocking NIO is very similar to libc select()/poll()

The Grizzly GW supports 2 strategies:
  • Mode Blocking:
  • Mode Non-Blocking: w/ algorithm to predict byte stream completion
  • These are the modes that have produced best performance

Lessons Learned implementing Grizzly
  • Buffer management is crucial
  • DirectByteBuffer (DBB) is a big win
  • DBB expensive to allocate
  • Best Practices
    o Use DBB w/ SocketChannels to avoid unnecessary copying of data
    o Avoid copying data out of DBB
    o Create view buffers to minimize DBB allocation costs
  • What's a view buffer?
    o A buffer that manages data in another buffer

  • Why useful?
    o Allocate one very large DBB and create smaller "views" or view buffers
    o Reduce number of costly DBB allocations
    o Can re-use views
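
One way to carve views out of a single direct buffer, as a sketch. ViewBufferPool is a hypothetical class of mine, not Grizzly's actual buffer management; it relies only on ByteBuffer.duplicate() and slice():

```java
import java.nio.ByteBuffer;

final class ViewBufferPool {
    private final ByteBuffer backing; // the one expensive direct allocation
    private final int viewSize;

    ViewBufferPool(int views, int viewSize) {
        this.backing = ByteBuffer.allocateDirect(views * viewSize);
        this.viewSize = viewSize;
    }

    // Returns a view over the index-th window; it shares memory with the
    // backing buffer but has its own position, limit and mark.
    ByteBuffer view(int index) {
        ByteBuffer window = backing.duplicate();
        window.position(index * viewSize);
        window.limit((index + 1) * viewSize);
        return window.slice();
    }
}
```

Creating a view allocates only a small wrapper object, so views can be handed out per connection while the native memory is allocated once.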

Gotchas
  • Register "key" in same thread doing the select
    o Unpredictable behavior in 1.4.x JVMs
  • Enable/disable a key's operation of interest in same thread as select
  • Selector.select() and Selector.wakeup() are expensive
  • Buffers: too many leads to GC issues
  • Keep-Alive connections are well suited to using NIO (connection create/teardowns aren't any more efficient under NIO)

JavaOne Notes: HotSpot Platform Performance

Presented by David Dagastine and Brian Doherty (Sun), including materials on benchmarks for the current JVMs out there. Not directly useful to day-to-day implementation, but useful to know how they've really optimized using dynamic tuning so that hand tuning of the JVM is less necessary. They claim that on all the benchmarks (ok, benchmarks != real world) they can only eke out at most a 5% improvement by hand tuning the JVM options!!!

Java performance analysis is harder, with a lot more variability under the covers compared to C/C++ (analyzing the JVM layer); you must approach it statistically because of the dynamic JIT compiler, garbage collection, thread scheduling, etc.

We do analysis by using Student's t-test
  • Determine the probability that two data sets are from the same population (p-value)
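
The talk didn't say which variant of the test they use, so as an assumption this sketch computes Welch's form of the t statistic (unequal variances) for two sets of benchmark timings. Turning t into a p-value needs the t-distribution CDF, which is left to a stats library:

```java
final class TTest {
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Unbiased sample variance (divides by n - 1).
    static double variance(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return squares / (xs.length - 1);
    }

    // Welch's t statistic: difference of means over its standard error.
    static double welchT(double[] a, double[] b) {
        double standardError = Math.sqrt(variance(a) / a.length + variance(b) / b.length);
        return (mean(a) - mean(b)) / standardError;
    }
}
```

A t value near zero means the two runs are hard to tell apart; the larger its magnitude, the less likely the difference is noise.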

Beware the Micro-Benchmark
  • Make sure you understand what you are measuring (e.g. for testing writes: disk cache settings, processor speed, num processors, etc)
  • Interesting: runtime compilation generally occurs in a separate thread (the -Xbatch option forces compilation to occur sequentially in the thread requesting compilation)
  • Always ensure you warm up your benchmark
  • Use the -XX:+PrintCompilation flag
    o Verify that no compilation is occurring during measurement interval
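
A bare-bones sketch of the warm-up pattern. The workload and the names are made up, and a real harness would still need the statistical treatment described above; this only shows keeping the warm-up runs out of the timed interval:

```java
final class WarmupBenchmark {
    // Stand-in workload; replace with the code under test.
    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (long) i * i % 7;
        }
        return acc;
    }

    // Returns average nanoseconds per measured run, after untimed warm-up runs.
    static long measureNanos(int warmupRuns, int measuredRuns, int n) {
        long sink = 0;
        for (int i = 0; i < warmupRuns; i++) {
            sink += work(n); // gives the JIT a chance to compile work()
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink += work(n);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.println(sink); // keeps the results observably "used"
        }
        return elapsed / measuredRuns;
    }
}
```

Running it with -XX:+PrintCompilation shows whether compilation messages still appear after the warm-up loop has finished.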

Watch out for platform differences
  • Heap size
  • Garbage collection policy
  • -client/-server ergonomics
    o -client is default on 32-bit Windows
    o -server is default on Solaris/Linux when there are 2 or more processors AND 2GB or more of RAM
  • Compilation policy
    o -Xbatch is default on 32-bit Windows
  • If results look too good to be true, they probably are.

Different JVMs have different RAS performance (Reliability, availability, and serviceability)

J2SE 5.0 foci
  • Language features
  • Performance
    o Out of the box server performance (-server ergonomics)
    o Client startup and footprint
    o We want our JVM system to tune things dynamically, so you worry less about performance
  • Quality
    o 8000 bug fixes in Tiger!!!
    o Far more test cases (compatibility, performance, etc)

Client performance
  • Sun JVM (on Windows) has smallest footprint and fastest startup
    o Note: this compares Sun's client JVM to IBM/BEA's server JVMs, which is a bit misleading; all out of the box, no options.
    o JVM footprint (small -> large): Sun J2SE 1.4.2; Sun J2SE 5.0; JSE 6; (huge step up) IBM 1.4.2; BEA 1.4.2; (really high) BEA 5.0
    o JVM startup (small -> large): Sun J2SE 5.0; Sun JSE 6; Sun J2SE 1.4.2; IBM 1.4.2; BEA 1.4.2; BEA 5.0

  • BEA 5.0 was worst in all benchmarks.
  • Sun JVM startup (particularly on J2SE 5.0 and JSE 7)
  • BEA does compilation at startup vs HotSpot where compilation occurs when the code is first used

  • Class Data Sharing not enabled for "-server"

Server performance
  • Out-of-box; we are spending our time focusing on dynamic tuning so in many cases you don't have to
  • Under our latest HotSpot JVM the best hand tuning (vs default JVM startup options) you can get is 5% under all the benchmarks!!!
  • Competitive performance options based on tuning found at: http://www.spec.org
  • Under 64-bit, JRockit 5.0 is only beaten by IBM 1.4.2 and Sun JSE 6 (J2SE is about 20% slower)
  • For J2SE 5.0 their extensive tuning options were: -Xbatch -Xms1600m -Xmx1600m -XX:+AggressiveHeap -Xss128k

HotSpot Server Tuning Recommendations
  • Use large pages
    o Before J2SE 5.0
      - No good options on Linux and Windows
    o J2SE 5.0 and beyond
      - Use -XX:+UseLargePages on Linux and Windows
  • Limit ParallelGC threads when running multiple JVMs on a large box
    o Make sure total number of GC threads <= the number of cores/CPUs
  • Don't use AggressiveHeap with J2SE 5.0 or later
    o Just increase heap size (if needed); default heap size is up to 1GB on server class machines
  • Do use the "-server" flag

JavaOne Notes: Achieving Great File I/O Performance

Basic mom-and-apple-pie presentation on I/O, spent a long time warming up the basic concepts and the Java APIs. A good presenter from BEA with a few learnings shared, but nothing shocking. About an hour w/ the file I/O javadocs and a really good book that evaluates the actual performance internals and you likely would have captured the same.

What to think about when doing file I/O
  • Data access pattern (rand, seq)
  • Data size read/written at once
  • How much of the file will be read/written
  • How durable must the disk writes be (OS crash, power failure)

Writing Safely
  • OS normally caches all disk writes (data may be lost in the event of OS crash)
  • Most disks also cache all disk writes (data lost on power failure)
  • APIs are available that let you "force" the I/O
  • They make sure data is not cached by the O/S or disk
  • These APIs do not always work!
    o Depends on the O/S, filesystem, disk
    o For IDE and Firewire it depends on hardware, device driver, and config settings for the driver
    o Enterprise (SCSI, etc) hardware usually avoids these problems

General I/O concept: Bigger is better
  • A few large I/Os are faster than many small I/Os
  • There is overhead in traversing all the layers between app and disk
  • Java -> JNI -> system call -> OS -> file system -> cache -> driver -> device

General I/O concept: Extending files is slow
  • When you write data to an existing file, you just write to the disk blocks where your data goes
  • When you write past the EOF, the OS must keep track of the new file size
  • When you must extend, try doing it in larger chunks

FileOutputStream
  • Easy to use: everyone knows the paradigm
  • Only supports sequential IO
  • File IO may be easily buffered using BufferedOutputStream
  • Limited force options!
  • Suggestion: use BufferedOutputStream for buffering
    o But don't forget to close or flush when you're done!
    o Application exit doesn't close/flush!!!
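
A sketch of the buffered-write suggestion. It uses try-with-resources, which arrived in Java 7 (well after this talk), to guarantee the close; the 64KB buffer size and the names are my assumptions, not a recommendation from the presenter:

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BufferedWriteDemo {
    // Writes `lines` short lines to a temp file and returns the bytes on disk.
    static long writeLines(int lines) {
        try {
            File file = File.createTempFile("buffered", ".txt");
            try {
                try (OutputStream out =
                         new BufferedOutputStream(new FileOutputStream(file), 64 * 1024)) {
                    for (int i = 0; i < lines; i++) {
                        // Many small writes collapse into a few large ones.
                        out.write(("line " + i + "\n").getBytes(StandardCharsets.US_ASCII));
                    }
                } // close() flushes; nothing reaches the file if this is skipped
                return file.length();
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```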

RandomAccessFile
  • Totally random IO
  • Supports all force options
  • Suggestion: buffer your IO if you can to increase throughput; however, you have to do this in your own code!

FileChannel
  • You get a FileChannel from RandomAccessFile
  • Supports all force modes
  • Implements the Channel interface(s)
    o Takes NIO byte buffers as parameters
  • Supports "transfer": transfer a file to a socket, or data from a socket to a file (abstracted in Java); done at the OS kernel level on Linux, so very fast on those platforms!
  • Suggestions:
    o Performance issues for all small writes in some JVMs
    o Solution: allocate a single direct ByteBuffer and copy to and from here
  • ByteBuffer is a simple wrapper object around a byte array
  • DirectByteBuffer doesn't allow you to access the byte array directly...useful for JNI native code integration; downside is access time to DirectByteBuffer is slow; allocation and garbage collection are also much more expensive
  • If you are using the regular ByteBuffer the performance on Sun JVMs doesn't seem to be as good as RandomAccessFile (currently); something is happening in the JVM
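
Both suggestions in one sketch: small writes funneled through a single reusable direct buffer, then a file-to-file copy via transferTo. File-to-file is used here only so the example is self-contained; the talk's case was file-to-socket. The names and the 8KB buffer size are assumptions:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

final class FileChannelDemo {
    static String roundTrip(String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        try {
            File src = File.createTempFile("src", ".bin");
            File dst = File.createTempFile("dst", ".bin");
            try {
                try (RandomAccessFile in = new RandomAccessFile(src, "rw");
                     RandomAccessFile out = new RandomAccessFile(dst, "rw")) {
                    FileChannel inChannel = in.getChannel();
                    FileChannel outChannel = out.getChannel();

                    // One direct buffer, allocated once and reused for every write.
                    ByteBuffer direct = ByteBuffer.allocateDirect(8192);
                    int offset = 0;
                    while (offset < payload.length) {
                        direct.clear();
                        int length = Math.min(direct.remaining(), payload.length - offset);
                        direct.put(payload, offset, length);
                        direct.flip();
                        while (direct.hasRemaining()) {
                            inChannel.write(direct);
                        }
                        offset += length;
                    }

                    // transferTo may move fewer bytes than requested, so loop.
                    long position = 0;
                    long size = inChannel.size();
                    while (position < size) {
                        position += inChannel.transferTo(position, size - position, outChannel);
                    }
                }
                return new String(Files.readAllBytes(dst.toPath()), StandardCharsets.UTF_8);
            } finally {
                src.delete();
                dst.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```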

MappedByteBuffer
  • Created from a FileChannel
  • Very convenient to map a file into memory; can be extremely fast IF you can use it
  • Creates a big ByteBuffer containing your file
  • You may explicitly force updates to disk
  • Fatal flaw in Java: no way to quickly unmap a file; mapping large files into a 32-bit address space JVM may start giving out of memory errors to other allocations in your app
  • Handy class to use IF the amount of data is sufficiently small OR if you are on a 64-bit OS
  • Suggestions:
    o Use to randomly read and write small files if data integrity is unimportant
    o Be very careful on 32-bit JVMs
    o Don't use too many; there's no "close" call!!!!
    o Believe that the file is unmapped during finalizer, but NO promises here!
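
A minimal mapping sketch (names and the 4KB size are mine). Note there is still no unmap call in the snippet, for exactly the reason above; the mapping goes away whenever the buffer is garbage collected:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

final class MappedFileDemo {
    // Maps a small temp file, writes an int at a random offset, reads it back.
    static int mapAndUpdate() {
        try {
            File file = File.createTempFile("mapped", ".bin");
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
                 FileChannel channel = raf.getChannel()) {
                // Mapping READ_WRITE past the current end grows the file to fit.
                MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                map.putInt(128, 0xCAFE); // random access, no seek or system call
                map.force();             // explicitly push the change to disk
                return map.getInt(128);
            } finally {
                file.delete(); // may fail on some platforms while the mapping is live
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```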

Reliable Disk IO
  • IO writes may be forced explicitly
  • RandomAccessFile: getFD().sync()
    o Forces contents AND meta-data
    o Equivalent of Unix fsync()
  • FileChannel: force(boolean metaData)
    o Forces contents and OPTIONALLY meta-data
    o Equivalent of fsync() or fdatasync()
  • Meta-data is the last mod time, etc.; modifying means disk writes have to occur in 2+ places on disk
  • Ignoring meta-data sync is slightly faster
  • Or implicitly:
  • RandomAccessFile "rws" mode
    o Forces contents and meta-data
    o Equivalent of open(O_SYNC)
  • RandomAccessFile "rwd" mode
    o Forces contents only
    o Equivalent of open(O_DSYNC)
  • If you can use it, "rwd" mode does the write in one system call vs multiple trips to write the file and the file meta-data.
  • Implicit vs explicit puts control in your hands
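
The explicit and implicit styles next to each other, as a sketch. The class name and record contents are made up, and how each call maps onto fsync()/fdatasync()/O_DSYNC is platform dependent, so treat the comments as the usual case rather than a guarantee:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SyncedWriteDemo {
    // Writes one record with explicit forcing and one with "rwd"; returns file size.
    static long writeTwice() {
        byte[] record = "commit\n".getBytes(StandardCharsets.US_ASCII);
        try {
            File file = File.createTempFile("txlog", ".log");
            try {
                // Explicit: write normally, then force when durability is needed.
                try (RandomAccessFile log = new RandomAccessFile(file, "rw")) {
                    log.write(record);
                    log.getChannel().force(false); // contents only
                    log.getFD().sync();            // contents and meta-data
                }
                // Implicit: every write is synchronous; "rwd" skips meta-data.
                try (RandomAccessFile log = new RandomAccessFile(file, "rwd")) {
                    log.seek(log.length()); // append after the first record
                    log.write(record);
                }
                return file.length();
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```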

Reliable Disk Writes
  • Example: transaction log, reliable messaging log
  • Possible solutions:
    o Call "flush" (an explicit sync/force) after each write
    o Open file with a "sync" flag like "rwd" or "rws"
    o Use a memory mapped file and call "force"
  • Results:
    o Using the "rwd" mode will be fastest
    o flush() was 2x map force for write sizes from 50-4096 bytes; rwd was 50%-300% faster (smaller writes much faster obviously when buffered internally) than flush()
    o flush() and map mean many JVM -> OS system calls

Random Reads
  • Program: big file/database of several 100MB
  • Randomly looking at bits and pieces
  • Possible solutions:
    o Use standard file IO APIs
    o Use memory mapping
  • Results
    o Memory mapping (particularly for smaller reads) is tremendously faster
    o If you can deal w/ the address space issues w/ memory mapping you can get phenomenal read speed

Direct I/O: Things you can't do in Java
  • Every O/S today has a feature called "direct I/O"
  • Bypasses O/S buffer cache, generally reducing CPU usage for large I/O
  • Designed for use by databases
  • There is no Java API for this, so we must use native code
  • In his example app he got 2x the I/O output rate (MB/sec); however, he stated his tests were a bit old

Q&A
  • Guidelines for choosing buffer sizes on BufferedOutputStreams? No great suggestion other than make them as big as possible; you'd actually have to do some performance tests
  • How do these techniques apply to network I/O? I don't know; certainly buffering data applies to network I/O
  • Will this also apply to SAN and solid-state devices? To some degree; remember, minimizing trips to the kernel and file system still applies! I don't have benchmarks.

May 17, 2005

SeaCode...an answer to H1B problems

Innovative solution to the H1B problem...Offshore Outsourcing! Plans are to have cruise ship(s) hanging out a few miles off the US coast and outside of US jurisdiction. Getting developers to stay on a ship might be interesting, though being located in the nice sunny weather of California and the ability to take "tourist trips" help. Plus they'll offer all the amenities of a cruise including full-on feasts, activities, etc.

Weird, but true. Now, why the US government can't make reasonable H1B policies is beyond me.

March 03, 2005

Strunk and White


I was looking for a recent paper on new techniques with SLP service discovery co-authored by Henning Schulzrinne and happened upon his guide for Writing Technical Papers. Besides the writeup being extremely useful on its own, he had a link to Strunk and White Elements of Style. Who knew this super useful style reference was available online in all of its full glory!

Heck, I'm not even sure where my print copy is these days. I'm sure long lost during one of my many office moves. Perhaps a quick refresh of topics is in order, I'm sure I misused all sorts of punctuation during this single blog entry. ;)

February 17, 2005

High rolling office in a 747...

My 1.1TB RAID has been down for over a month since I've been swapping from my old case where drives were jammed in w/o any real airflow. As I mentioned previously, my new case is a 3U RMC3E-XP w/ 16 IDE hot-swap drive bays (+ a few internal bays). The 650W power supply finally arrived today (a $120 cheap-o model) and today marked the first time I booted up the new system.

The noise level was ridiculous! Basically my office now sounds like I'm sitting in a commercial airliner. Except louder. Commercial rack mount disk servers are clearly not designed for residential use out of the box (I expected it to be loud, just not bordering on ridiculous). At least my neighbors haven't started complaining.

Everything is hot-swappable, including the internal fans, so I played a bit w/ removing some of the fans since I won't need as much cooling as what is provided. I'm currently only using 8 drives and access patterns aren't constant, so I figured I might be able to get away with a few fewer. Didn't really help much, as unfortunately it's the internal fan of the 650W power supply that's the worst. I'm not willing to take the risk of doing the replacement myself. Especially since it's unclear if I can even get the system reasonably quiet regardless of whether I replace all the fans or not.

I decided to whip out my trusty sound meter used for audio calibration for some quick measurements. Where I'm typing the noise level is currently 62.5dB. Within 1 foot of the RAID case (opened) it peaks out at 79dB! Putting the cover on really only dropped the noise level a couple of decibels. Doh. Normal noise with the rest of my current machines running is not even measurable by my sound meter (well under 50dB).

A single small Sunon (there are three) tops the system out at almost 90dB from 8 inches vs 68dB when that one is removed (running two the peak was 94dB). Those measurements were taken from a point inside the case.

Summary: The current setup is not an option. I can either start replacing fans, or sell the entire thing and start over w/ just 4-5 400GB drives stuffed into a small case with better cooling. Given that I've had the 1.1TB up and running since mid-2002 w/ 160GB drives it might be time to think about replacing. Just hate to replace something that is still working well and solves what I need (mass storage).

[Update] It doesn't help that the Delta EFB0812SHF 80mm fans shipped with the case run at 4700RPM @ 45.5dBA! The two rear Sunon KDE1206PTV1 60mm fans are 34dBA w/ 23.5 CFM. The power supply's Delta AFB0612EH 60mm runs at 6800RPM @ 46.5 dBA (seems louder than that) for a fairly high 37.61 CFM.

November 06, 2004

Remote NAS

I've wanted to have off-site backups for some time. My first inclination months ago was to burn DVDs and keep them in a safety deposit box. Good idea, but I never overcame the time hurdle to actually organize and burn the DVDs, especially since I'd have to repeat the burning every month or so. I only back up sporadically.

Yesterday I realized I have a perfect offsite solution. My family's Colorado house has a cable modem w/ unlimited bandwidth. I could easily take one of my old PCs that I'm no longer using and put it into service there as a secure remote NAS.

First I looked into a few NAS Linux distributions since it would be nice to have something out of the box rather than configuring an existing distribution for my needs:

Desired features:

  • SSH command line access
  • Web administration
  • Support software RAID 1 or 5
  • Support some journaling filesystem
  • ACL/login support
  • Free

Features nice to have, but not required:

  • rsync support
  • Ability to install other apps/services (desired)
  • SSL encrypted client-server communications (desired, could also use hardware VPN as my home router supports VPN)
  • Logical Volume Manager (nice to have)
  • Dynamic filesystem expansion (nice to have)

Others have built the same w/ RedHat 9 w/ XFS 1.3, LVM, etc., but had to customize the RH install.

Also ran across some other interesting tools:

  • NasBackup - NAS rsync backup solution, looks like requires server side component
  • Sync2NAS - Schedulable rsync backups for WinXP

October 12, 2004

Why Your Code Sucks

Interesting debate came from reading Why Your Code Sucks. One useful quote was "Code should be easy to read. Steve McConnell made the statement at his SD West '04 keynote that code should be convenient to read, not convenient to write." Even Joel Spolsky wrote in his rant on refactoring that the hardest thing for software developers to do is read and grok other developers' code...they often find it easier to just throw it away and re-write. Optimizing for reading makes sense when you start thinking about it.

Great comments on frameworks, again reinforcing concepts about why our approach to BSF may have been fatally flawed in some cases, and good in others.

----

A few great criticisms from Steve Yegge:

But I'm not inclined to agree with him about removing getters and setters and opening up field access for "transparently readable and writeable" variables.
Doing this is playing with fire, and suggesting it reflects poorly on the author of the article.

If you simply create a publicly writeable field, you have no way of knowing when the field changes, and who's changing it. Just *think* how awful your debugging would be. With an explicit setter, you know exactly when the field changes, and using (e.g.) Thread.getStackTrace(), or Thread.dumpStack() pre-1.5, you can figure out exactly who's changing it.

You can use this info for debugging, or for profiling (e.g. put in a counter that increments every time you call the setter), or for logging, or even as a way to provide hooks -- e.g. you could add a Listener mixin interface that's called back every time the setter is called.

With getters and setters, you can change the behavior in a subclass. Rather than accessing a private data field, for instance, a subclass could call out to some other data store to get the value. And then it could keep the value cached.

Java doesn't provide a way to make a public field directly readable but only writeable through a method. So if you want to force write-access through a setter, you need to make the variable non-public and add a getter as well. Jython and presumably Groovy actually do the "right thing" (at least the article guy would think so), since this:

x = foo.a

actually invokes a getA() method via beany reflection, if there are appropriate getA and setA methods. So it has the advantage of being clearer and more readable, and is also trackable.
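
To make Steve's argument concrete, here is a rough Java sketch of a setter that counts writes, records who called it, and notifies listeners. TrackedField and its Listener interface are my own invention for illustration, not code from his comments:

```java
import java.util.ArrayList;
import java.util.List;

final class TrackedField {
    // Hook called on every write, with the caller's class and method name.
    interface Listener {
        void changed(int oldValue, int newValue, String caller);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private int a;
    private int writeCount;

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    int getA() {
        return a;
    }

    int getWriteCount() {
        return writeCount;
    }

    void setA(int value) {
        // Frame 0 is setA itself; frame 1 is whoever called it.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        String caller = stack.length > 1
                ? stack[1].getClassName() + "." + stack[1].getMethodName()
                : "unknown";
        int old = a;
        a = value;
        writeCount++; // cheap profiling counter
        for (Listener listener : listeners) {
            listener.changed(old, value, caller);
        }
    }
}
```

None of this is possible with a bare public field, which is the point of the criticism above.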

October 11, 2004

Experiments...

"If you don't run experiments before you start designing a new system, your entire system will be an experiment!" -- Mike Williams

July 08, 2004

Bloom Filters

Now that Amazon's software platform team has a regular "University", essentially a bi-weekly paper review, we've covered some interesting topics unrelated to our core products.

Next week we are covering Bloom filters, as well as optimizations to compress the space required. Another Bloom filter survey paper.

June 25, 2004

21 Rules of Thumb – How Microsoft Develops its Software

Ran across a good writeup by David Gristwood that covers management fundamentals of software projects.

April 14, 2004

BEEP beep!

Ben Black pointed out another interesting paper, On the Design of Application Protocols, which describes the reasoning behind BEEP.

April 12, 2004

Adaptive distributed systems messaging

Cornell research into Spinglass, a set of "Adaptive Probabilistic Tools for Advanced Networks" looks very interesting. Particularly appealing since I've also thought about applying probability to solve problems that typically have required someone (engineer, admin, etc) to tune a system (one idea I never completed researching was applying the probability of a service request happening for some event to predictively pre-fetch data).

February 26, 2004

Virtualization of Hardware

Ben Black forwarded an interesting paper on Constructing Services with Interposable Virtual Hardware, which I haven't gotten a chance to read but which looks extremely interesting. I'm sure there are other interesting papers as well describing the real meaty technical issues underlying VMware-like solutions.

July 09, 2003

Dijkstra's Manuscripts

Dijkstra's manuscripts cover many interesting CS topics.

July 07, 2003

Reverse Engineering

Very interesting writeup on reverse engineering software. Worth reading just to improve how you might debug various problems.

"This book is an attempt to provide an introduction to reverse engineering software under both Linux and Windows. Since reverse engineering is under legal fire, the authors figure the best response is to make the knowledge widespread. The idea is that since discussing specific reverse engineering feats is now illegal in many cases, we should then discuss general approaches, so that it is within every motivated user's ability to obtain information locked inside the black box. Furthermore, interoperability issues with closed-source proprietary systems are just plain annoying, and something needs to be done to educate more open source developers as to how to implement this functionality in their software. "

July 03, 2003

Read-copy update locking mechanism

Val Gough pointed out an interesting article describing a Read-Copy Update mechanism for dealing w/ concurrency.

Update:
Also see updated writeup in Linux Journal, Using RCU in the Linux 2.5 Kernel.
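
As a loose illustration of the read-copy-update idea (not the kernel implementation described in the articles), here is a user-space sketch. The class name is hypothetical. The assumption is that readers only ever follow a single published reference, writers copy the data, modify the copy, and publish it atomically, and the garbage collector stands in for the kernel's grace period by reclaiming old versions once no reader holds them.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class RcuStyleTable {
    // The currently published, never-mutated version of the data.
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Read side: no locks, just dereference the published version.
    String lookup(String key) {
        return current.get().get(key);
    }

    // A reader may hold on to a version; later updates never change it.
    Map<String, String> snapshot() {
        return current.get();
    }

    // Update side: read the old version, copy it, update the copy, publish.
    void put(String key, String value) {
        while (true) {
            Map<String, String> old = current.get();
            Map<String, String> copy = new HashMap<>(old);
            copy.put(key, value);
            // Retry if another writer published a new version in the meantime.
            if (current.compareAndSet(old, Collections.unmodifiableMap(copy))) {
                return;
            }
        }
    }
}
```

Reads stay cheap and never block, at the cost of a full copy on every write, so this shape only pays off for read-mostly data.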

June 19, 2003

Deciding when to forget in the Elephant file system

Interesting paper on an approach to version-controlled filesystems, mentioned during an interview with Hans Reiser:

"Modern file systems associate the deletion of a file with the immediate release of storage, and file writes with the irrevocable change of file contents. We argue that this behavior is a relic of the past, when disk storage was a scarce resource. Today, large cheap disks make it possible for the file system to protect valuable data from accidental delete or overwrite.

This paper describes the design, implementation, and performance of the Elephant file system, which automatically retains all important versions of user files. Users name previous file versions by combining a traditional pathname with a time when the desired version of a file or directory existed. Storage in Elephant is managed by the system using file-grain user-specified retention policies."
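
The "pathname plus a time" naming scheme from the abstract can be pictured with a toy in-memory sketch. This is only an assumption about the lookup semantics, not Elephant's design: the class and method names are hypothetical, and retention policies are left out entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class VersionedStore {
    // For each path, every retained version keyed by the time it was written.
    private final Map<String, TreeMap<Long, String>> versions = new HashMap<>();

    // A write adds a new version instead of overwriting the previous one.
    void write(String path, long time, String contents) {
        versions.computeIfAbsent(path, p -> new TreeMap<>()).put(time, contents);
    }

    // Returns the contents as they existed at the given time,
    // or null if the file had not been written yet.
    String readAsOf(String path, long time) {
        TreeMap<Long, String> history = versions.get(path);
        if (history == null) {
            return null;
        }
        Map.Entry<Long, String> entry = history.floorEntry(time);
        return entry == null ? null : entry.getValue();
    }
}
```

Reading "/a" as of any time between two writes returns the earlier version, which is what makes an accidental overwrite recoverable.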

---
Comment from Slashdot interview with Hans Reiser:
"I'm going back to school this fall, and in a year I hope to be admitted into a Masters of Computer Science program. I'd like my main research focus to be on filesystems.

I'm preparing by reading everything I can find: I'm working on Tanenbaum & Woodhull's "OS Design & Implementation"; I've read "Design and Implementation of the Second Extended Filesystem"; Steve Pate's "UNIX Filesystems" is waiting on my shelf; and of course, there's the FAQ and ReiserFS v.3 Whitepaper at www.namesys.com [namesys.com]. Specific questions: what branches of math are useful in this line of research? Any books, articles, etc., that I haven't listed that are a 'must read' or 'should read'? Those who have succeeded in building a better filesystem: what have they done that I should also do? Any mistakes I should avoid? Anything that no one told you about filesystems that you wish you had known up front? And are there any special tricks (above and beyond mastering your subject) to getting hired in this field once a degree is in hand?

Hans:

I was never able to get hired in this field, so I am probably not the one to ask about how to get hired.;-) Hmmm. Oh I know one! Don't tell your potential employer that you are working on your own file system nights and weekends, and you will retain all rights to it, and you won't stop work on it once they hire you.;-)

You should probably read about Plan 9, and about namespaces generally. The literature on namespaces seems to be just about hierarchical namespaces, but the notion present in that literature that they should be unified is a good one. I rather liked Gerard Salton's book on automatic text processing. Ted Nelson's Xanadu project was interesting reading, and you'll want to read Codd and Date about databases. Mikhail Gilula's book about set theoretic databases is a good one.

In regards to math, study the design of new mathematical models. Study closure, and its importance to various models ranging from algebra to relational algebra. Understand why mathematical models were designed to have the structure they have rather than learning what those structures are, so that you can learn to construct your own models. I don't know of any courses that teach that, but it is what is important to learn.

Are you sure that it wouldn't be better to hang out in cafes and bookstores for 4 years, and at the end of it write some piece of a filesystem? Cafes, bookstores, and attending random seminars will educate you better, and writing some piece of a filesystem will employ you better."

June 16, 2003

Format string attacks

Who knew an incorrect printf statement could actually allow someone to spawn a root shell (using %n)? Cyrus Durgin pointed out several docs on format string attacks:
http://www.lava.net/~newsham/format-string-attacks.pdf
http://www.team-teso.net/releases/formatstring-1.2.tar.gz

The first is a little light on content, but very well written. The second is full of content; however, many of the code examples are poorly written (also note that many were written with SPARC Solaris specifics).