May 01, 2007

Good Developer Habit #3: Leverage Standard Documentation Conventions

Make sure you learn the simple documentation formats for your language of choice (Javadoc, Doxygen for C++, Perldoc). Why? Besides generating HTML, PDF, etc. documentation that users can read without loading source files, many modern IDEs (Eclipse, etc.) parse documentation and can present it when you mouse over code statements.

Java/C++

Javadoc and Doxygen have very similar styles; the examples below comply in both systems. At the very least you should have the following docs for any source file:

  /**
   * @class MyClass
   *
   * Describe what the heck your class does, threading expectations,
   * how it should be used, etc.
   *
   * @version $Id: $
   */
  public class MyClass {

And for all public methods that you expect others to consume:

  /**
   * @return a Collection of Mojo, or an empty Collection if there is no Mojo
   */
  Collection getMyMojo();

Bonus points for including threading and concurrency expectations, exception behavior, and performance hints.
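As a rough sketch of what those bonus points can look like, here is a fuller set of doc comments. MojoRegistry, addMojo, and the use of String to stand in for a Mojo are all hypothetical, made up for illustration only.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical registry, invented for this example.
 *
 * Threading: safe for concurrent use; all access to the internal
 * list is synchronized on this object.
 *
 * @version $Id: $
 */
class MojoRegistry {
  private final List<String> mojo = new ArrayList<String>();

  /**
   * Registers a new Mojo.
   *
   * @param name name of the Mojo, must not be null
   * @throws IllegalArgumentException if name is null
   */
  synchronized void addMojo(String name) {
    if (name == null) {
      throw new IllegalArgumentException("name must not be null");
    }
    mojo.add(name);
  }

  /**
   * Threading: may be called concurrently with addMojo.
   * Performance: copies the list, so cost is O(n) per call;
   * avoid calling it in tight loops.
   *
   * @return an unmodifiable snapshot Collection of Mojo, or an empty
   *         Collection (never null) if there is no Mojo
   */
  synchronized Collection<String> getMyMojo() {
    return Collections.unmodifiableList(new ArrayList<String>(mojo));
  }
}
```

A caller reading only the generated docs now knows they never need a null check, what gets thrown on bad input, and that the call is not free.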

April 30, 2007

Good Developer Habit #2: Be a Better Logger than Paul Bunyan

Like many things, logging is an art, not a science. Many rules of thumb apply for effective logging, starting with: use a logger (e.g. log4j, rlog, etc.) rather than cerr or System.out.println.

Never emit a log statement w/o at least one argument!

Logging is expensive. It likely requires argument processing, disk writes, and storage. Make every log event really count by providing as much relevant context as possible for someone who views the log messages 5 minutes (or 5 days) later to understand what might have occurred. The more '''relevant''' context provided in the message the higher the probability someone can interpret and deduce the root cause.

Bad:

  log.warn("Missing item");

Good:

  log.warn("Missing item in order " + orderID + " for customer "
                 + customerID);

It only took 2 seconds longer to type a much more useful error message. Don't be lazy! It's extremely frustrating when you are under pressure to debug a production system and realize someone forgot to log critical information that would help you understand the root cause.

Use appropriate logging levels

In production applications you should expect to drive log files to zero fatals and errors. You should also consider driving warnings towards zero, as unnecessary entries in your logs make surfacing real problems more difficult.

Think about the computation

Reminder: '''even if the logging level is disabled, all arguments passed into the logger get evaluated'''. The following is a good example using log4j where the CPU cost is unpredictable:

  log.debug("Event " + event + " occurred for customer " + customerID);

While this looks innocuous, if event's toString() method actually does much computation this can eat up valuable CPU. For instance, event.toString() could be doing a database lookup to get event meta-data. A better approach is to add a conditional check:

  if( log.isDebugEnabled() )
    log.debug("Event " + event + " occurred for customer " + customerID);

For fatal/error/warning messages this is unnecessary since it is generally expected all log messages of those levels should be emitted into logs. However, for any more verbose levels (inform/debug/notify/verbose) you should wrap w/ a conditional check. '''If you are in a tight loop or critical core function, be extra careful!'''
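To make the cost concrete, here is a minimal sketch. It assumes java.util.logging as a stand-in for log4j so the example needs no extra jars (FINE plays the role of debug, and isLoggable the role of isDebugEnabled). GuardedLoggingDemo and Event are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedLoggingDemo {
  // Counts how often the "expensive" toString() runs.
  static int toStringCalls = 0;

  // Hypothetical event whose toString() stands in for costly work.
  static class Event {
    @Override
    public String toString() {
      toStringCalls++;
      return "Event#42";
    }
  }

  /** Logs once unguarded and once guarded; returns the toString() call count. */
  static int run() {
    Logger log = Logger.getLogger("GuardedLoggingDemo");
    log.setLevel(Level.INFO); // FINE (debug-like) output is disabled
    toStringCalls = 0;
    Event event = new Event();

    // Unguarded: the message string is built first, so toString() runs,
    // even though the logger then throws the message away.
    log.fine("Event " + event + " occurred");

    // Guarded: the concatenation is skipped entirely while FINE is disabled.
    if (log.isLoggable(Level.FINE)) {
      log.fine("Event " + event + " occurred");
    }
    return toStringCalls;
  }

  public static void main(String[] args) {
    System.out.println("toString() calls: " + run());
  }
}
```

Nothing is written to the log in either case, yet toString() still runs once: only the unguarded call pays for it.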

Another rule of thumb: DON'T MODIFY STATE OR ADD BUSINESS LOGIC IN LOG STATEMENTS. This is considered a BIG NO-NO:

  log.warn("Request count=" + (++requestCount));

If someone changes the logging level there is a possibility that those statements never execute (depending on whether the logging library has conditional check macros/byte code insertion)! In fact, even though it works one day there are no guarantees that deploying a new version of your logging library won't change the behavior.
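Here is one way this can bite, sketched with java.util.logging rather than log4j: its Supplier-based overloads (Java 8 and later) only build the message when the level is enabled, so a side effect buried in the message never runs. SideEffectDemo and its methods are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class SideEffectDemo {
  static int requestCount = 0;

  /** Handles three "requests" with the increment buried in a lazy log message. */
  static int countInsideLogStatement() {
    Logger log = Logger.getLogger("SideEffectDemo");
    log.setLevel(Level.INFO); // FINE is disabled
    requestCount = 0;
    for (int i = 0; i < 3; i++) {
      // Bad: the supplier is only invoked when FINE is enabled,
      // so the increment silently never happens here.
      log.fine(() -> "Request count=" + (++requestCount));
    }
    return requestCount;
  }

  /** Same loop with the state change kept out of the log statement. */
  static int countOutsideLogStatement() {
    Logger log = Logger.getLogger("SideEffectDemo");
    log.setLevel(Level.INFO); // FINE is disabled
    requestCount = 0;
    for (int i = 0; i < 3; i++) {
      requestCount++; // happens regardless of the log level
      log.fine(() -> "Request count=" + requestCount);
    }
    return requestCount;
  }
}
```

With FINE disabled the first method ends up with a count of 0 and the second with 3. Flip the level and both return 3, which is exactly the kind of behavior change you don't want tied to a logging config.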

Java

Perhaps one day Java's log4j or commons logging will finally add byte code injection to insert these statements automatically. However until that day, follow the conditional check pattern.

And if using Java's log4j for logging exceptions, then leverage its native ability to include Exceptions w/ configurable stack traces:

  log.debug("Unexpected exception for accountID=" + accountID, e);

April 29, 2007

Good Developer Habit #1: Embed RCS ID into EVERY file

Ever run across a file and wondered where the heck it came from? I hated being paged at 3am only to find a bug in a script or class file I didn't write, and then not being able to track down what branch to make a patch from. IMO, no file should ever be checked into a source repository without an RCS ID tag!

Almost every source control system supports the $Id: $ tag; I know for a fact that Perforce, Subversion, and CVS all support it. Use it or lose it (the file's origin, that is)!

Scripting Languages

Embedding RCS ID is particularly important for Perl, Ruby, and other scripts as they often get copied around such that no tools can be used to find their origin. Just add a simple comment at the top:

  #!/usr/bin/ruby
  # $Id: $

Config

Again, embedding RCS ID tags enables someone to quickly find exactly where to change the config file in its permanent repository. There is nothing worse than wasting 5 minutes of time trying to track down exactly how the file got deployed to a box, where it was imported from, and what branch it was released on. Save everyone time by adding tags. Simple:

  # $Id: $
  
  *.*.SelfImplode = F;

This includes Java properties files, Spring, XML config, JSON, or whatever you might be using to "configure" any software component.

Java

An interesting pattern I've seen lately is to emit software versions into the application log based on log level:

  public class YourObject {
    private static final Logger log = Logger.getLogger(YourObject.class);
    // Runs once, when the class is initialized
    static { log.debug("$Id: $"); }
  }

Note: depending on how many classes your application uses, this could increase JVM/application startup time. I suspect in 99.9% of cases these few extra microseconds won't matter.

For a long time I've wanted a mechanism where, via [[JMX]], I could connect remotely to a running process and inspect which versions of software were loaded.
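One possible shape for such a mechanism, as a sketch only: classes record their $Id: $ string in a small registry, and the registry is published through the platform MBean server using the standard javax.management API. VersionRegistry, VersionView, record, register, readViaJmx, and the object name are all hypothetical, invented for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

/** Hypothetical registry that exposes recorded $Id: $ strings over JMX. */
class VersionRegistry {
  /** Management interface; the getter becomes the JMX attribute "Versions". */
  public interface VersionView {
    String[] getVersions();
  }

  // Assumed object name, chosen only for this sketch.
  static final String OBJECT_NAME = "example.versions:type=VersionRegistry";

  private static final Map<String, String> VERSIONS =
      new ConcurrentSkipListMap<String, String>();

  /** Call from a static initializer, e.g. record(YourObject.class, "$Id: $"). */
  static void record(Class<?> owner, String rcsId) {
    VERSIONS.put(owner.getName(), rcsId);
  }

  /** Registers the view with the platform MBean server; safe to call twice. */
  static synchronized void register() {
    try {
      MBeanServer server = ManagementFactory.getPlatformMBeanServer();
      ObjectName name = new ObjectName(OBJECT_NAME);
      if (server.isRegistered(name)) {
        return;
      }
      VersionView view = new VersionView() {
        public String[] getVersions() {
          List<String> out = new ArrayList<String>();
          for (Map.Entry<String, String> e : VERSIONS.entrySet()) {
            out.add(e.getKey() + " " + e.getValue());
          }
          return out.toArray(new String[0]);
        }
      };
      server.registerMBean(new StandardMBean(view, VersionView.class), name);
    } catch (JMException e) {
      throw new IllegalStateException("Could not register version MBean", e);
    }
  }

  /** Reads the attribute back through JMX, the way a management client would. */
  static String[] readViaJmx() {
    try {
      MBeanServer server = ManagementFactory.getPlatformMBeanServer();
      return (String[]) server.getAttribute(new ObjectName(OBJECT_NAME), "Versions");
    } catch (JMException e) {
      throw new IllegalStateException("Could not read version MBean", e);
    }
  }
}
```

Assuming the JVM is started with remote JMX access enabled, a JMX client should be able to browse the same Versions attribute on a running process. Whether this is worth the extra moving parts compared to the log-based approach is a judgment call.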

C++

Embedding RCS ID as a string variable into binaries/shared objects enables executing:

  % strings my-app-binary-file | grep "\$Id"

to find exactly what versions of code are linked in. This is extremely useful for reducing MTTR; we've seen it in practice. This simple trick enables whoever is on call to quickly track back and patch.

Create a simple #define macro for embedding these quickly. It's easy to roll your own; just make sure the compiler doesn't optimize out the strings.

Documentation (Javadoc/Doxygen/etc)

All of this applies equally to documentation. Ever read a Javadoc provided by another team and wanted to know where to update it to be clearer? Simply adding @version $Id: $ makes this trivial.

The following example ensures [[Javadoc]] generated docs include exactly what file is being referenced and where to change it!

  /**
   * @class Kryptonite
   *
   * Rock solid implementation, use sparingly.
   *
   * @version $Id: $
   */
  public class Kryptonite {

There are numerous other cases where embedded RCS IDs make sense; it's just another good habit to leverage them where possible.

Simple Developer Habits and Rules of Thumb

I've been doing more code reviews lately and realized many common rules of thumb weren't percolating through the general developer community. There are tons of great books out there on design patterns, dependency injection, test driven development, ORM, etc. but few that cover simple basics and tricks useful for writing and maintaining production software.

I'll be posting a series of simple suggestions on how to improve the code you write. These rules of thumb are based on reviewing millions of lines of code mixed with dealing w/ the less often discussed aspect of software development: the code maintenance lifecycle. Many are extremely basic and you likely already put them into practice. However, some seemingly trivial changes can pay off over the long haul, particularly as software evolves across multiple developers. YMMV.