Sunday 26 January 2014

Java: exceptions, error stories, and plot holes

Software should tell a good story about its errors. Failure to do this in complex applications can lead to hours or days of trying to puzzle out the cause of some seemingly random exception in the production logs.

Here are some tips for writing Java code that generates informative stack traces.

  1. Stack traces should include the point of failure
  2. Stack traces should avoid ambiguity
  3. Consider stack traces when using lambda expressions

Note: adhering to the first tip will significantly simplify your life for managing code of any complexity; the last two are icing on the cake.

Stack traces should include the point of failure

Consider this class for representing an e-mail address:

package demo.bad;

public final class Email {
  private final String address;

  public Email(String address) {
    this.address = address;
  }

  public String getLocalPart() {
    return address.substring(0, address.lastIndexOf('@'));
  }

  public String getDomain() {
    return address.substring(address.lastIndexOf('@') + 1);
  }

  @Override
  public String toString() {
    return address;
  }
}

This type has poor error checking which will have consequences for fault diagnosis.

Here is a class for registering an administrator email address using a system property:

package demo.bad;

import javax.servlet.*;
import javax.servlet.annotation.WebListener;

@WebListener
public class SettingsLoader implements ServletContextListener {
  @Override
  public void contextInitialized(ServletContextEvent evt) {
    String admin = System.getProperty("config.admin.email");
    Email email = new Email(admin);
    evt.getServletContext().setAttribute("adminEmail", email);
  }

  @Override
  public void contextDestroyed(ServletContextEvent evt) {}
}

The intent is that the server administrator should configure this property during deployment. This type is faulty because it does not check the property has been set.

One of the places this variable is used is in a page to display the address to users:

<%@page contentType="text/html" pageEncoding="UTF-8"%>
<!DOCTYPE html>
<html>
  <body>
    ...loads of view code for home page...
    
    <hr>
    Contact: ${adminEmail.localPart} at ${adminEmail.domain}
  </body>
</html>

Here is an example sequence of events:

  1. You develop, test and build an application and submit it for deployment
    • Assume the Email type is used all over the application
  2. The system administrator misinterprets the instruction to set the email property
  3. The sysadmin reports that the application is faulty
  4. The only access you have to the server is to view the logs

Here is the stack trace from the logs:

java.lang.NullPointerException
  at demo.bad.Email.getLocalPart(Email.java:11)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:601)
  at javax.el.BeanELResolver.getValue(BeanELResolver.java:363)
  at javax.el.CompositeELResolver.getValue(CompositeELResolver.java:188)
  at com.sun.el.parser.AstValue.getValue(AstValue.java:140)
  at com.sun.el.parser.AstValue.getValue(AstValue.java:204)
  at com.sun.el.ValueExpressionImpl.getValue(ValueExpressionImpl.java:226)
  at org.jboss.weld.el.WeldValueExpression.getValue(WeldValueExpression.java:50)
  at org.apache.jasper.runtime.PageContextImpl.evaluateExpression(PageContextImpl.java:1016)
  at org.apache.jsp.home_jsp._jspService(home_jsp.java:50)
  (remainder omitted)

Assuming the home page uses the Email type for multiple purposes, there is no clear indication of where the error originates.

Adding some cheap validation to the Email constructor changes the story completely:

  public Email(String address) {
    if(address == null || !address.contains("@")) {
      throw new IllegalArgumentException(address);
    }
    this.address = address;
  }

The stack trace has become:

org.apache.catalina.LifecycleException: java.lang.IllegalArgumentException
  at org.apache.catalina.core.StandardContext.start(StandardContext.java:5920)
  (remainder omitted)
Caused by: java.lang.IllegalArgumentException
  at demo.good.Email.<init>(Email.java:8)
  at demo.good.SettingsLoader.contextInitialized(SettingsLoader.java:12)
  at org.apache.catalina.core.StandardContext.contextListenerStart(StandardContext.java:5362)
  at com.sun.enterprise.web.WebModule.contextListenerStart(WebModule.java:743)
  at org.apache.catalina.core.StandardContext.start(StandardContext.java:5898)
  ... 44 more

In this stack trace SettingsLoader is present. From this chain of calls we can infer that the application is not configured correctly and that we need to add some better error logging to SettingsLoader so that the sysadmin can understand the problem and correct the configuration.

The change to Email has also allowed the application to fail fast PDF. See also crash early: "A dead program normally does a lot less damage than a crippled one."

Stack traces should avoid ambiguity

Anonymous classes are like anonymous Internet users - they spread bile from an ambiguous location.

Consider this code for performing a logical AND or OR on a boolean array asynchronously:

package demo.bad;

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class Logic {
  private static Callable<Boolean> and(final boolean... bools) {
    return new Callable<Boolean>() {
      public Boolean call() {
        boolean result = bools[0];
        for (int i = 1; i < bools.length; i++) {
          result &= bools[i];
        }
        return result;
      }
    };
  }

  private static Callable<Boolean> or(final boolean... bools) {
    return new Callable<Boolean>() {
      public Boolean call() {
        boolean result = false;
        for (int i = 0; i < bools.length; i++) {
          result |= bools[i];
        }
        return result;
      }
    };
  }

  public static Future<Boolean> eval(ExecutorService executor, Op op,
      boolean... bools) {
    switch (op) {
    case AND:
      return executor.submit(and(bools));
    case OR:
      return executor.submit(or(bools));
    default:
      throw new IllegalArgumentException();
    }
  }

  public static enum Op {
    AND, OR
  }
}

There is a bug in the code in that it does not have a default result for the AND operation which causes an exception when the array is empty. That is, when it is this call is made: Logic.eval(executor, Op.AND).get()

Here is the stack trace:

java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 0
  at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
  at java.util.concurrent.FutureTask.get(FutureTask.java:111)
  (remainder omitted)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  at demo.bad.Logic$1.call(Logic.java:11)
  at demo.bad.Logic$1.call(Logic.java:9)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:722)

The anonymous type name Logic$1.call tells us that the problem is in Logic.java but the code can be rewritten to provide more meaningful names.

One approach is to use a named local type:

  private static Callable<Boolean> and(final boolean... bools) {
    class And implements Callable<Boolean> {

      @Override
      public Boolean call() {
        boolean result = bools[0];
        for (int i = 1; i < bools.length; i++) {
          result &= bools[i];
        }
        return result;
      }
    };
    return new And();
  }

The top of the cause trace now includes a meaningful name:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  at demo.ok.Logic$1And.call(Logic.java:14)
  at demo.ok.Logic$1And.call(Logic.java:10)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)

I always favour using named types over anonymous classes. It may add one or two extra lines but is invariably worth it.

Another approach is to defer the evaluation of the switch statement:

  private static boolean and(boolean... bools) {
    boolean result = bools[0];
    for (int i = 1; i < bools.length; i++) {
      result &= bools[i];
    }
    return result;
  }

  private static boolean or(boolean... bools) {
    boolean result = false;
    for (int i = 0; i < bools.length; i++) {
      result |= bools[i];
    }
    return result;
  }

  private static boolean eval(Logic.Op op,
          boolean... bools) {
    switch (op) {
      case AND:
        return and(bools);
      case OR:
        return or(bools);
      default:
        throw new IllegalArgumentException();
    }
  }

  public static Future<Boolean> eval(ExecutorService executor, final Logic.Op op,
          final boolean... bools) {
    return executor.submit(new Callable<Boolean>() {
      @Override
      public Boolean call() throws Exception {
        return eval(op, bools);
      }
    });
  }

The top of the cause trace is now a method name:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  at demo.good.Logic.and(Logic.java:10)
  at demo.good.Logic.eval(Logic.java:29)
  at demo.good.Logic.access$000(Logic.java:7)
  at demo.good.Logic$1.call(Logic.java:42)
  at demo.good.Logic$1.call(Logic.java:39)

This last approach also reduces the number of classes loaded by the runtime.

Consider stack traces when using lambda expressions

Java 8's lambdas allow you to in-line code in less verbose expressions than anonymous classes allow. Behind the scenes the compiler has a number of options as to how lambdas are translated to bytecode; this can affect the stack trace output.

Logic.and(boolean[]) translated to a lambda:

  private static Callable<Boolean> and(final boolean... bools) {
    return () -> {
      boolean result = bools[0];
      for (int i = 1; i < bools.length; i++) {
        result &= bools[i];
      }
      return result;
    };
  }

The resultant stack trace includes the method name in this implementation so this is OK:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  at demo.ok.Logic.lambda$and$0(Logic.java:10)
  at demo.ok.Logic$$Lambda$2/849460928.call(Unknown Source)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)

Still, a better approach would be to move the lambdas to Logic.eval:

  private static boolean and(final boolean... bools) {
    boolean result = bools[0];
    for (int i = 1; i < bools.length; i++) {
      result &= bools[i];
    }
    return result;
  }

  private static boolean or(final boolean... bools) {
    boolean result = false;
    for (int i = 0; i < bools.length; i++) {
      result |= bools[i];
    }
    return result;
  }

  public static Future<Boolean> eval(ExecutorService executor, Op op,
                                     boolean... bools) {
    switch (op) {
      case AND:
        return executor.submit(() -> {return and(bools);});
      case OR:
        return executor.submit(() -> {return or(bools);});
      default:
        throw new IllegalArgumentException();
    }
  }

The relevant part of the stack trace becomes:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  at demo.good.Logic.and(Logic.java:9)
  at demo.good.Logic.lambda$eval$0(Logic.java:28)
  at demo.good.Logic$$Lambda$2/849460928.call(Unknown Source)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)

Note: when classes and methods are automatically generated naming is largely an implementation detail; the examples in this post were generated using the Oracle JDK.

No comments:

Post a Comment

All comments are moderated