
Dave Orme muses about data-first development.

My current work emphasizes data engineering and analysis using Kubernetes, Clojure, Scala, Eclipse, and Google Cloud Platform or AWS.



Disclaimer:

Everything I say here is my own opinion and not necessarily that of my employer.


Improvements on Null Safety for Java and Eclipse

In their introduction to their implementation of the Optional<T> type, the Google Guava developers observe:

  • In the Google code base, approximately 95% of collections weren't supposed to have any null values in them.
  • It's rarely obvious what a null return value is supposed to mean.
    • For example, Map.get(key) can return null either because the value in the map is null, or the value is not in the map.
    • Null can mean failure, can mean success, can mean almost anything. Worst, a null value by itself gives no indication of what it means.
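The Map.get(key) ambiguity is easy to reproduce. Here is a minimal sketch using a plain java.util.HashMap; the class name and the sample keys are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class MapNullAmbiguity {
    /** Builds a map in which one key is explicitly mapped to null. */
    public static Map<String, String> nicknames() {
        Map<String, String> map = new HashMap<>();
        map.put("alice", "Al");
        map.put("bob", null);   // key is present, but its value is null
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = nicknames();

        // Both lookups return null, for entirely different reasons.
        System.out.println(map.get("bob"));    // key present, value is null
        System.out.println(map.get("carol"));  // key absent

        // Only a second call tells the two cases apart.
        System.out.println(map.containsKey("bob"));
        System.out.println(map.containsKey("carol"));
    }
}
```

The caller has to know to ask containsKey as well; nothing in the return value of get says which case occurred.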

Further, I have observed that:

  • Failing to check for the presence of null sometimes results in the null value propagating through collections, eventually to an unrelated API call, which then causes a NullPointerException.
    • These bugs can be painful to fix because by the time the NullPointerException occurs, the original place where the null check was supposed to happen may no longer even be on the stack.
  • Using a more concrete type, rather than null, makes one's intentions clearer.
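To make the propagation problem concrete, here is a small sketch. All of the names (findName, collectNames, totalLength) are hypothetical; the point is only that the exception surfaces in a method that has nothing to do with the missing check:

```java
import java.util.ArrayList;
import java.util.List;

public class NullPropagation {
    /** Hypothetical lookup that returns null when the id is unknown. */
    public static String findName(int id) {
        return id == 1 ? "Ada" : null;
    }

    /** Collects names without checking for null: the bug originates here. */
    public static List<String> collectNames(int... ids) {
        List<String> names = new ArrayList<>();
        for (int id : ids) {
            names.add(findName(id));   // ArrayList accepts null silently
        }
        return names;
    }

    /** Unrelated code that runs later and trips over the stored null. */
    public static int totalLength(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();    // NullPointerException is thrown here
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = collectNames(1, 2);
        try {
            totalLength(names);
        } catch (NullPointerException e) {
            // By now collectNames and findName have returned and are off the stack.
            e.printStackTrace();
        }
    }
}
```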

All of this (and more) could be summarized as:

“I call [null] my billion-dollar mistake.” - Sir C. A. R. Hoare, inventor of the null reference

Alternatives to null

In Java, there appear to be three main alternatives to null references:

  1. Throw an exception; never return an “empty” object
  2. The Null Object Pattern
  3. Some variation on the Option pattern, sometimes also known as “Maybe” or “Optional”

Here is why I think the Java community should consider adopting some form of Option:

1) The Java community has been moving away from throwing checked exceptions, and to this blogger, unchecked exceptions do not seem appropriate as a replacement for something that may not represent an actual failure.

2) I expect you've probably heard of the Null Object Pattern before, and it's been well-described elsewhere. It's a good idea when you can use it, but there isn't always a suitable null object for a given type.

3) A lot of Java programmers who don't have a background in functional programming haven't heard of the Option pattern or the Maybe pattern. I think this is a pity because by itself, Option has nothing to do with functional programming. It's just a good object-oriented design pattern that happens to have been discovered by the functional programming people.
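For readers who have not met the Null Object Pattern from point 2, here is a minimal sketch. The Discount type and the coupon code are invented for illustration; they were picked because a sensible "do nothing" instance exists for them:

```java
public class NullObjectSketch {
    /** A type for which a sensible "do nothing" instance exists. */
    public interface Discount {
        double apply(double price);
    }

    /** The null object: a real Discount that leaves the price unchanged. */
    public static final Discount NO_DISCOUNT = new Discount() {
        @Override
        public double apply(double price) {
            return price;
        }
    };

    /** Hypothetical lookup: never returns null, falls back to the null object. */
    public static Discount discountFor(String couponCode) {
        if ("HALF".equals(couponCode)) {
            return new Discount() {
                @Override
                public double apply(double price) {
                    return price / 2;
                }
            };
        }
        return NO_DISCOUNT;
    }

    public static void main(String[] args) {
        // Callers need no null check in either case.
        System.out.println(discountFor("HALF").apply(10.0));
        System.out.println(discountFor("unknown").apply(10.0));
    }
}
```

This works because "no discount" is itself a meaningful Discount. For a type like Person it is much harder to say what the do-nothing instance should be.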

So for the balance of this blog, I'd like to describe what Option is and how I've adapted Option to Java's idioms and coding patterns. I think that with a little consideration, you might grow to really like using this style.

Monad, Schmonad...

The first thing I'd like to get out of the way is this: Some of you have heard that Option is a Monad. Maybe you didn't listen past that or maybe that scared you if you don't already understand monads.

If this was true of you, no problem, because we're going to entirely ignore monads for the moment and demonstrate that our version of Option, which we will call Possible<T>, is plenty useful without them. (If you're interested, I'll blog about monads at some later date.)

A motivating example...

Back to the problem at hand: Most of the time when a null object is not possible, the purpose for returning or passing null is to indicate the “lack of an object”.

For the sake of our discussion, let's pretend that we are designing a data access object (DAO) layer and need to create a

findPerson(int id) 

API. The question is this:

  • If we are not going to return null or throw an exception, and if we feel that there is no reasonable “null Person object”, what do we return?

Return a collection?

Rather than return a Person object directly, one option might be to return a collection of Person objects instead, where the collection always has either 0 or 1 element(s). Something like the following could work:

public List<Person> findPerson(int id) { ... }
 
 
List<Person> foundPerson = findPerson(42);
if (!foundPerson.isEmpty()) {
   Person person = foundPerson.get(0);
} else {
   // handle the case where we didn't find the person if needed
}

Further, since the result is a Collection, if we do not care about the case where we didn't find a Person, we can 'foreach' over the results:

public List<Person> findPerson(int id) { ... }
 
 
List<Person> foundPerson = findPerson(42);
for (Person person : foundPerson) {
   // ...
}
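For anyone who wants to run the List-based version, here is one way it might look. The JDK already ships immutable lists of exactly 0 and 1 element (Collections.emptyList() and Collections.singletonList()), so the sketch uses those. The Person class and the rule that only id 42 exists are stand-ins invented for the example:

```java
import java.util.Collections;
import java.util.List;

public class ListAsOptional {
    /** Minimal stand-in for the Person type used in the text. */
    public static class Person {
        public final int id;
        public final String name;

        public Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Returns a list of exactly 0 or 1 Person; id 42 is the only known person here. */
    public static List<Person> findPerson(int id) {
        if (id == 42) {
            return Collections.singletonList(new Person(42, "Ada"));
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Found: the loop body runs once.
        for (Person person : findPerson(42)) {
            System.out.println("Found " + person.name);
        }
        // Not found: the loop body is skipped, with no null check anywhere.
        for (Person person : findPerson(7)) {
            System.out.println("Found " + person.name);
        }
    }
}
```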

Benefits and drawbacks of using List<T>

The benefits of using List<T> to contain 0 or 1 results include:

  • It's clear from the method signature that the API might return a List of 0 length.
  • Testing for the presence of a value is simple, straightforward, and familiar. After all, we're just returning and evaluating a collection.

The drawbacks include:

  • It's not clear from the method signature that the API will never return more than one element (that is, that foundPerson.size() > 1 never happens).

How might we avoid this drawback?

Easy: Let's build a Java collection type that can only contain 0 or 1 elements. Additionally, we could add a few special-purpose methods for handling the kinds of cases for which we normally would use null.

Implementing Option<T> for Eclipse

Traditionally, Option or Maybe is implemented using pattern matching (and in Scala, using case classes). In Java, this translates to something like the following:

public interface Option<T> {
   boolean isEmpty();
   T get();
}
 
public class Some<T> implements Option<T> { // ...
   public static <T> Some<T> some(T value) { return new Some<T>(value); } 
   // ...
}
 
public class None<T> implements Option<T> { // ... 
   public static <T> None<T> none() { return new None<T>(); }
   // ...
}
 
 
public Option<Person> findPerson(int id) {
   boolean foundPerson = false;
   // look up person by ID
   if (foundPerson) {
      Person person = new Person(parameters);
      return some(person);
   } else {
      return none();
   }
}
 
 
public void whenWeNeedToFetchThePerson() {
   Option<Person> optionalPerson = findPerson(42);
   if (!optionalPerson.isEmpty()) {
      Person person = optionalPerson.get();
      // process the case where we found the person
   } else {
      // process the case where we did not find the person
   }
}
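The snippet above elides the bodies of Some and None. Here is one way they might be filled in so the example compiles and runs. Two things are assumptions rather than anything stated above: None.get() throws NoSuchElementException, and the lookup returns a String name in place of a Person to keep the sketch short:

```java
import java.util.NoSuchElementException;

public class OptionSketch {
    public interface Option<T> {
        boolean isEmpty();
        T get();
    }

    public static final class Some<T> implements Option<T> {
        private final T value;

        private Some(T value) {
            this.value = value;
        }

        public static <T> Some<T> some(T value) {
            return new Some<T>(value);
        }

        @Override
        public boolean isEmpty() {
            return false;
        }

        @Override
        public T get() {
            return value;
        }
    }

    public static final class None<T> implements Option<T> {
        private None() {
        }

        public static <T> None<T> none() {
            return new None<T>();
        }

        @Override
        public boolean isEmpty() {
            return true;
        }

        @Override
        public T get() {
            // Assumption: asking an empty Option for its value is a programming error.
            throw new NoSuchElementException("None has no value");
        }
    }

    /** Hypothetical lookup: id 42 is the only known person. */
    public static Option<String> findPersonName(int id) {
        if (id == 42) {
            return Some.some("Ada");
        } else {
            return None.none();
        }
    }

    public static void main(String[] args) {
        Option<String> name = findPersonName(42);
        if (!name.isEmpty()) {
            System.out.println("Found " + name.get());
        } else {
            System.out.println("Not found");
        }
    }
}
```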

I have used an Option implementation similar to this for approximately the last year-and-a-half. My observation in team usage is that programmers who are familiar with pattern matching and case classes in other languages are comfortable enough with it, but that programmers who are only familiar with Java find the static #some and #none factories counter-intuitive and problematic.

In case you like this style, some of the E4 code that was never adopted, and also some of the P2 Director integration tests, have Option implementations in this style.

Improving our Option implementation

After time with these implementations, the question returned to my mind, “Can I find a way to implement Option<T> that is more idiomatic to Java?”

And the thought that occurred to me was, “What if I implemented something like Option<T>, but as a standard Java collection containing 0 or 1 element, nothing more?”

I've now done that. Since the implementation isn't really derived from Option any longer, I've chosen to call the new version Possible<T>. (That name also makes the factory methods read more nicely, IMO.)

Here is the previous example, rewritten to use Possible<T>:

public abstract class Possible<T> implements Iterable<T> {
   // Static factories: the only way clients create instances
   public static <T> Possible<T> value(T value) { ... }
   public static <T> Possible<T> value(T value, IStatus status) { ... }
   public static <T> Possible<T> emptyValue() { ... }
   public static <T> Possible<T> emptyValue(IStatus reason) { ... }
 
   // Collection-style queries
   public abstract boolean contains(Object o);
   public abstract boolean isEmpty();
   public abstract boolean hasValue();
   public abstract int size();
   public abstract Object[] toArray();
   public abstract <A> A[] toArray(A[] a);

   // Value access, with alternatives for the empty case
   public abstract T get();
   public abstract T getOrSubstitute(T defaultValue);
   public abstract <E extends Throwable> T getOrThrow(E exception) throws E;
   public abstract IStatus getStatus();
}
 
public Possible<Person> findPerson(int id) {
   boolean foundPerson = false;
   // look up person by ID
   if (foundPerson) {
      Person person = new Person(parameters);
      return Possible.value(person);
   } else {
      return Possible.emptyValue();
   }
}
 
public void whenWeNeedToFetchThePerson() {
   Possible<Person> optionalPerson = findPerson(42);
   if (!optionalPerson.isEmpty()) {
      Person person = optionalPerson.get();
      // process the case where we found the person
   } else {
      // process the case where we did not find the person
   }
}

From this code snippet, I would like to highlight a few things:

  • Possible<T> only makes you deal with its own class as API. Implementation classes are not API and are not exposed.
  • The API is largely familiar. With only a few minor additions, it's just Java's existing Collections API, with zero surprises.
  • The static factories are there, but in the tradition of more idiomatic Java, names are chosen so that the fully-qualified forms make sense as well as the statically-imported versions:
    • Possible<Integer> meaning = Possible.value(42);
    • Possible<Integer> meaning = Possible.emptyValue();

Up to this point, you might be saying, “This is nice, but why would I choose this over, say, the fine Optional class in Google's Guava project?”

First, I would heartily recommend Guava. It has a lot of really nice ideas and adaptations from the functional programming world to Java, and if that's the ultimate direction you want to choose, it's a great choice.

There are a few reasons you might consider this version:

  • It has no dependencies other than the JDK and org.eclipse.core.runtime (for IStatus); it is small and simple enough that you can “own” it yourself.
  • Possible<T>, in addition to containing a value or nothing, encapsulates an IStatus value which is useful in Eclipse development. (If no IStatus is specified, the value defaults to Status.OK_STATUS.)

For library authors who do not want to be opinionated about the logging framework choice, encapsulating an IStatus allows the library to return information in a form suitable for logging to any of several loggers, including Eclipse's.
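The idea of returning "nothing, plus the reason why" can be sketched without Eclipse on the classpath. In the example below, SimpleStatus is a hypothetical stand-in for IStatus (reduced to a flag and a message), and Result and parsePort are likewise invented for illustration; none of these names come from the Possible<T> library:

```java
import java.util.NoSuchElementException;

public class StatusSketch {
    /** Hypothetical stand-in for Eclipse's IStatus: just an ok flag and a message. */
    public static final class SimpleStatus {
        public static final SimpleStatus OK = new SimpleStatus(true, "OK");
        public final boolean ok;
        public final String message;

        public SimpleStatus(boolean ok, String message) {
            this.ok = ok;
            this.message = message;
        }
    }

    /** Hypothetical 0-or-1 container that also carries a status. */
    public static final class Result<T> {
        private final T value;            // only read when present is true
        private final boolean present;
        private final SimpleStatus status;

        private Result(T value, boolean present, SimpleStatus status) {
            this.value = value;
            this.present = present;
            this.status = status;
        }

        /** A value with the default OK status. */
        public static <T> Result<T> value(T value) {
            return new Result<T>(value, true, SimpleStatus.OK);
        }

        /** No value, plus the reason it is missing. */
        public static <T> Result<T> emptyValue(SimpleStatus reason) {
            return new Result<T>(null, false, reason);
        }

        public boolean hasValue() { return present; }
        public SimpleStatus getStatus() { return status; }

        public T get() {
            if (!present) throw new NoSuchElementException(status.message);
            return value;
        }
    }

    /** Library-style method: reports why it failed without picking a logger. */
    public static Result<Integer> parsePort(String text) {
        try {
            return Result.value(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return Result.emptyValue(new SimpleStatus(false, "Not a number: " + text));
        }
    }

    public static void main(String[] args) {
        Result<Integer> port = parsePort("http");
        if (!port.hasValue()) {
            // The caller decides where the message goes: stderr here, a logger elsewhere.
            System.err.println(port.getStatus().message);
        }
    }
}
```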

Possible<T> may be found on GitHub.


blog/improvements_on_null_safety_for_java_and_eclipse.txt · Last modified: 2014/10/17 22:08 (external edit)