Dave Orme muses about data-first development.

My current work emphasizes data engineering and analysis using Kubernetes, Clojure, Scala, Eclipse, and Google Cloud Platform or AWS.





Keywords:

Kubernetes, Docker, Streaming Data, Spark, Scala, Clojure, OSGi, Karaf, GCP, AWS, SQL

Disclaimer:

Everything I say here is my own opinion and not necessarily that of my employer.


If you're coding Scala without Scalaz...

…Here's a presentation link to consider: http://noelmarkham.github.io/scalaz-scala.io/#/

Then consider how much less pain you'd have if you adopted Scalaz. :)


2014/10/25 12:55 · djo

Lazy, Lazy Iterables

In Java, Iterable<T> is almost a Monad. Even so, it's still useful in a lot of ways that a monad is useful–such as for abstracting transformations from collections of one type to collections of another.

(Whaaaaaiiiiit! You used the word “MONAD”! Okay, I admit it; I did. But guess what? Although I'm going to talk about Monads here a bit, I'm NOT going to assume that you know what they are in order to get value from this post. Feel better yet?)

In Java 8, the Streams API helps a lot, but even without Java 8 (which I can't use just yet), it's still a really useful technique. One place I've found it particularly useful has been as an abstraction for data input.

Suppose you have a cluster and want to run commands remotely through SSH and collect the output. Assuming that you have set up SSH keys properly, something like the following is really useful:

SshClient remoteServer = new SshClient("username", "host", 22);
Iterable<String> lines = remoteServer.relay("tail /var/log/serverlog");

Traditionally, a construct like this would SSH to the server, run the specified command on the remote machine, collect the results, and return the resulting collection as an Iterable<String> of lines.

Simple and useful, yes?

But we can make it better than that! How? What about this?

SshClient remoteServer = new SshClient("username", "host", 22);
Iterable<String> lines = remoteServer.relay("tail -f /var/log/serverlog");

“Wait,” you ask, “that's exactly the same code as before, except that the 'relay' command won't ever terminate!”

Not so fast.

What if that “relay” command returned immediately after initiating the remote session and then returned lines in the “lines” Iterable as they became available?

In other words, what if “relay” implemented an infinite lazy list when appropriate?

“Then you would never close the SSH connection and might leak resources,” someone might object.

Fair comment. But what if that “lines” Iterable also implemented Closeable? I think this solves all of the technical challenges…
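Here is a minimal sketch of what such a lazy, closeable Iterable could look like. It is not the SshClient implementation: the names CloseableIterable and LazyLines are hypothetical, and the sketch assumes the remote output is available as a java.io.Reader. It also uses Java 8's UncheckedIOException for brevity and makes no attempt at thread safety.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical: an Iterable whose underlying resource can be released. */
interface CloseableIterable<T> extends Iterable<T>, Closeable {
    @Override
    void close(); // narrowed so callers need not handle a checked exception
}

/** Hypothetical: hands out lines only as the caller asks for them. */
final class LazyLines implements CloseableIterable<String> {
    private final BufferedReader reader;
    private boolean closed = false;

    LazyLines(Reader source) {
        this.reader = new BufferedReader(source);
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private String next; // one-line lookahead buffer

            @Override
            public boolean hasNext() {
                if (next != null) return true;
                if (closed) return false;
                try {
                    // Blocks until a line arrives or the stream ends.
                    next = reader.readLine();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return next != null;
            }

            @Override
            public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                String line = next;
                next = null;
                return line;
            }
        };
    }

    @Override
    public void close() {
        closed = true;
        try {
            reader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, a “tail -f” style source can be consumed in a plain for-each loop, and calling close() from the consumer ends the iteration and releases the connection.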

Then all that's needed is something like Guava's “Iterables” class to finish the Monad implementation for Java's Iterables so we can transform, transformAndFlatten, and so on, directly over an Iterable…
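To make the transform and transformAndFlatten idea concrete without pulling in Guava, here is a small sketch in plain Java. The class LazyIterables and both method signatures are made up for illustration (they are not Guava's API), and the sketch uses Java 8's Function interface and lambdas for brevity. The point is that nothing is computed until the caller actually iterates.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical helper: lazy transformations over any Iterable. */
final class LazyIterables {
    private LazyIterables() {}

    /** Applies f to each element, one at a time, as the result is iterated. */
    static <A, B> Iterable<B> transform(Iterable<A> source,
                                        Function<? super A, ? extends B> f) {
        return () -> {
            Iterator<A> it = source.iterator();
            return new Iterator<B>() {
                @Override public boolean hasNext() { return it.hasNext(); }
                @Override public B next() { return f.apply(it.next()); }
            };
        };
    }

    /** Maps each element to an Iterable and walks the results end to end. */
    static <A, B> Iterable<B> transformAndFlatten(
            Iterable<A> source,
            Function<? super A, ? extends Iterable<? extends B>> f) {
        return () -> new Iterator<B>() {
            private final Iterator<A> outer = source.iterator();
            private Iterator<? extends B> inner = Collections.<B>emptyIterator();

            @Override
            public boolean hasNext() {
                // Skip over empty inner results until an element is found.
                while (!inner.hasNext() && outer.hasNext()) {
                    inner = f.apply(outer.next()).iterator();
                }
                return inner.hasNext();
            }

            @Override
            public B next() {
                if (!hasNext()) throw new NoSuchElementException();
                return inner.next();
            }
        };
    }
}
```

Because both methods only wrap the source iterator, they also work over an infinite source such as a stream of log lines.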

Some will point out that the Guava engineers already use lazy Iterables.

You're right. I'm writing this because I want to tip my hat to them and give the idea some more visibility.

Here's the link: https://code.google.com/p/guava-libraries/wiki/CollectionUtilitiesExplained

Now go have fun!


2014/10/19 22:35 · djo

Improvements on Null Safety for Java and Eclipse

In their introduction to their implementation of the Optional<T> type, the Google Guava developers observe:

  • In the Google code base, approximately 95% of collections weren't supposed to have any null values in them.
  • It's rarely obvious what a null return value is supposed to mean.
    • For example, Map.get(key) can return null either because the value in the map is null, or the value is not in the map.
    • Null can mean failure, can mean success, can mean almost anything. Worst, a null value by itself gives no indication of what it means.
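The Map.get(key) ambiguity is easy to reproduce. The following sketch (the class and method names are just for illustration) shows that get() alone cannot tell “key mapped to null” apart from “key not in the map”; only a second call to containsKey() can.

```java
import java.util.HashMap;
import java.util.Map;

final class NullAmbiguityDemo {
    /** Distinguishes the three cases that get() collapses into two. */
    static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) return "present";
        // get() returned null: we need a second lookup to learn why.
        return map.containsKey(key) ? "present but null" : "absent";
    }

    /** Sample data: "bob" is mapped to null, "carol" is not in the map. */
    static Map<String, String> sample() {
        Map<String, String> m = new HashMap<>();
        m.put("alice", "admin");
        m.put("bob", null);
        return m;
    }
}
```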

Further, I have observed that:

  • Failing to check for the presence of null sometimes results in the null value propagating through collections, eventually to an unrelated API call, which then causes a NullPointerException.
    • These bugs can be painful to fix because by the time the NullPointerException occurs, the original place where the null check was supposed to happen may no longer even be on the stack.
  • Using a more concrete type, rather than null, makes one's intentions clearer.
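As an illustration of the propagation problem, consider this contrived sketch (all names are hypothetical). The missing null check is in collectNicknames, but the NullPointerException is thrown later in totalLength, when the method that produced the null is no longer on the stack.

```java
import java.util.ArrayList;
import java.util.List;

final class NullPropagationDemo {
    /** Hypothetical lookup that uses null to mean "no nickname". */
    static String lookupNickname(String name) {
        return "Bob".equals(name) ? "Bobby" : null;
    }

    /** The bug is here: the null result is stored without a check. */
    static List<String> collectNicknames(List<String> names) {
        List<String> out = new ArrayList<>();
        for (String name : names) {
            out.add(lookupNickname(name));
        }
        return out;
    }

    /** The failure shows up here, far from where the null was created. */
    static int totalLength(List<String> nicknames) {
        int total = 0;
        for (String nickname : nicknames) {
            total += nickname.length(); // throws if an element is null
        }
        return total;
    }
}
```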

All of this (and more) could be summarized as:

“I call [null] my billion-dollar mistake.” - Sir C. A. R. Hoare, inventor of the null reference

Alternatives to null

In Java, there appear to be three main alternatives to null references:

  1. Throw an exception; never return an “empty” object
  2. The Null Object Pattern
  3. Some variation on the Option pattern, sometimes also known as “Maybe” or “Optional”

Here is why I think the Java community should consider adopting some form of Option:

1) The Java community has been moving away from throwing checked exceptions, and to this blogger, unchecked exceptions do not seem appropriate as a replacement for something that may not represent an actual failure.

2) I expect you've probably heard of the Null Object Pattern before, and it's been well-described elsewhere. It's a good idea when you can use it, but there isn't always a suitable null object for a given type.

3) A lot of Java programmers who don't have a background in functional programming haven't heard of the Option pattern or the Maybe pattern. I think this is a pity because by itself, Option has nothing to do with functional programming. It's just a good object-oriented design pattern that happens to have been discovered by the functional programming people.
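For readers who haven't seen the second alternative, here is a minimal sketch of the Null Object Pattern. The AuditLog example is made up for illustration; it works because “do nothing” is a sensible behavior for a log. The pattern breaks down for types like Person, where no neutral stand-in exists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical collaborator that callers may or may not care about. */
interface AuditLog {
    void record(String event);
}

/** A real implementation that remembers what happened. */
final class ListAuditLog implements AuditLog {
    final List<String> events = new ArrayList<>();

    @Override
    public void record(String event) {
        events.add(event);
    }
}

/** The null object: same interface, deliberately does nothing. */
final class NoAuditLog implements AuditLog {
    static final AuditLog INSTANCE = new NoAuditLog();

    private NoAuditLog() {}

    @Override
    public void record(String event) {
        // intentionally empty
    }
}

final class Transfer {
    private final AuditLog log;

    Transfer(AuditLog log) {
        this.log = log;
    }

    /** No null check needed: there is always some AuditLog to call. */
    int apply(int balance, int amount) {
        log.record("transfer " + amount);
        return balance - amount;
    }
}
```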

So for the balance of this blog, I'd like to describe what Option is and how I've adapted Option to Java's idioms and coding patterns. I think that with a little consideration, you might grow to really like using this style.

Monad, Schmonad...

The first thing I'd like to get out of the way is this: Some of you have heard that Option is a Monad. Maybe you didn't listen past that or maybe that scared you if you don't already understand monads.

If this was true of you, no problem, because we're going to entirely ignore monads for the moment and demonstrate that our version of Option, which we will call Possible<T>, is plenty useful without them. (If you're interested, I'll blog about monads at some later date.)

A motivating example...

Back to the problem at hand: Most of the time when a null object is not possible, the purpose for returning or passing null is to indicate the “lack of an object”.

For the sake of our discussion, let's pretend that we are designing a data access object (DAO) layer and need to create a

findPerson(int id) 

API. The question is this:

  • If we are not going to return null or throw an exception, and if we feel that there is no reasonable “null Person object”, what do we return?

Return a collection?

Rather than return a Person object directly, one option might be to return a collection of Person objects instead, where the collection always has either 0 or 1 element(s). Something like the following could work:

public List<Person> findPerson(int id) { ... }
 
 
List<Person> foundPerson = findPerson(42);
if (!foundPerson.isEmpty()) {
   Person person = foundPerson.get(0);
} else {
   // handle the case where we didn't find the person if needed
}

Further, since the result is a Collection, if we do not care about the case where we didn't find a Person, we can 'foreach' over the results:

public List<Person> findPerson(int id) { ... }
 
 
List<Person> foundPerson = findPerson(42);
for (Person person : foundPerson) {
   // ...
}
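A runnable version of this idea might look like the following sketch. The Person fields and the in-memory PersonDao are assumptions made for the example; the JDK's Collections.emptyList() and Collections.singletonList() supply the 0-element and 1-element results.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical entity; the fields are assumed for the example. */
final class Person {
    final int id;
    final String name;

    Person(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Hypothetical in-memory DAO standing in for a real data store. */
final class PersonDao {
    private final Map<Integer, Person> byId = new HashMap<>();

    void save(Person person) {
        byId.put(person.id, person);
    }

    /** Returns a list holding exactly 0 or 1 Person, never null. */
    List<Person> findPerson(int id) {
        Person person = byId.get(id);
        return person == null
                ? Collections.<Person>emptyList()
                : Collections.singletonList(person);
    }
}
```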

Benefits and drawbacks of using List<T>

The benefits of using List<T> to contain 0 or 1 results include:

  • It's clear from the method signature that the API might return a List of 0 length.
  • Testing for the presence of a value is simple, straightforward, and familiar. After all, we're just returning and evaluating a collection.

The drawbacks include:

  • It's not clear from the method signature that the API will never return a list with size() > 1.

How might we avoid this drawback?

Easy: Let's build a Java collection type that can only contain 0 or 1 elements. Additionally, we could add a few special-purpose methods for handling the kinds of cases for which we normally would use null.

Implementing Option<T> for Eclipse

Traditionally, Option or Maybe is implemented using pattern matching (and in Scala, using case classes). In Java, this translates to something like the following:

public interface Option<T> {
   boolean isEmpty();
   T get();
}
 
public class Some<T> implements Option<T> { // ...
   public static <T> Some<T> some(T value) { return new Some<T>(value); } 
   // ...
}
 
public class None<T> implements Option<T> { // ... 
   public static <T> None<T> none() { return new None<T>(); }
   // ...
}
 
 
public Option<Person> findPerson(int id) {
   boolean foundPerson = false;
   // look up person by ID
   if (foundPerson) {
      Person person = new Person(parameters);
      return some(person);
   } else {
      return none();
   }
}
 
 
public void whenWeNeedToFetchThePerson() {
   Option<Person> optionalPerson = findPerson(42);
   if (!optionalPerson.isEmpty()) {
      Person person = optionalPerson.get();
      // process the case where we found the person
   } else {
      // process the case where we did not find the person
   }
}

I have used an Option implementation similar to this for approximately the last year-and-a-half. My observation in team usage is that programmers who are familiar with pattern matching and case classes in other languages are comfortable enough with it, but that programmers who are only familiar with Java find the static #some and #none factories counter-intuitive and problematic.

In case you like this style, some of the E4 code that was never adopted, and also some of the P2 Director integration tests, have Option implementations in this style.

Improving our Option implementation

After time with these implementations, the question returned to my mind, “Can I find a way to implement Option<T> that is more idiomatic to Java?”

And the thought that occurred to me was, “What if I implemented something like Option<T>, but as a standard Java collection containing 0 or 1 element, nothing more?”

I've now done that. Since the implementation isn't really derived from Option any longer, I've chosen to call the new version Possible<T>. (That name also makes the factory methods read more nicely, IMO.)

Here is the previous example, rewritten to use Possible<T>:

public abstract class Possible<T> implements Iterable<T> {
   // Static factories: the only way clients create instances
   public static <T> Possible<T> value(T value) { ... }
   public static <T> Possible<T> value(T value, IStatus status) { ... }
   public static <T> Possible<T> emptyValue() { ... }
   public static <T> Possible<T> emptyValue(IStatus reason) { ... }
 
   // Declared public so that callers outside this package can use them
   public abstract boolean contains(Object o);
   public abstract boolean isEmpty();
   public abstract boolean hasValue();
   public abstract int size();
   public abstract Object[] toArray();
   public abstract <A> A[] toArray(A[] a);
   public abstract T get();
   public abstract T getOrSubstitute(T defaultValue);
   public abstract <E extends Throwable> T getOrThrow(E exception) throws E;
   public abstract IStatus getStatus();
}
 
public Possible<Person> findPerson(int id) {
   boolean foundPerson = false;
   // look up person by ID
   if (foundPerson) {
      Person person = new Person(parameters);
      return Possible.value(person);
   } else {
      return Possible.emptyValue();
   }
}
 
public void whenWeNeedToFetchThePerson() {
   Possible<Person> optionalPerson = findPerson(42);
   if (!optionalPerson.isEmpty()) {
      Person person = optionalPerson.get();
      // process the case where we found the person
   } else {
      // process the case where we did not find the person
   }
}

From this code snippet, I would like to highlight a few things:

  • Possible<T> only makes you deal with its own class as API. Implementation classes are not API and are not exposed.
  • The API is largely familiar. With only a few minor additions, it's just Java's existing Collections API, with zero surprises.
  • The static factories are there, but in the tradition of more idiomatic Java, names are chosen so that the fully-qualified forms make sense as well as the statically-imported versions:
    • Possible<Integer> meaning = Possible.value(42);
    • Possible<Integer> meaning = Possible.emptyValue();
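If you want to experiment with the idea outside Eclipse, here is a stripped-down sketch. It is not the real Possible<T>: the name SimplePossible is hypothetical, the IStatus support is left out entirely, and the choices to reject null values and to have get() throw on an empty instance are assumptions made for this example.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical, simplified take on Possible<T> without IStatus. */
final class SimplePossible<T> implements Iterable<T> {
    private static final SimplePossible<Object> EMPTY = new SimplePossible<>(null);

    private final T value; // null means "no value"

    private SimplePossible(T value) {
        this.value = value;
    }

    static <T> SimplePossible<T> value(T value) {
        if (value == null) throw new IllegalArgumentException("value must not be null");
        return new SimplePossible<>(value);
    }

    @SuppressWarnings("unchecked") // safe: the empty instance holds no T
    static <T> SimplePossible<T> emptyValue() {
        return (SimplePossible<T>) EMPTY;
    }

    boolean hasValue() { return value != null; }
    boolean isEmpty() { return value == null; }
    int size() { return hasValue() ? 1 : 0; }

    /** Assumption: asking an empty instance for its value is an error. */
    T get() {
        if (isEmpty()) throw new NoSuchElementException("no value present");
        return value;
    }

    T getOrSubstitute(T defaultValue) {
        return hasValue() ? value : defaultValue;
    }

    <E extends Throwable> T getOrThrow(E exception) throws E {
        if (isEmpty()) throw exception;
        return value;
    }

    /** Iterates over zero or one element, so for-each just works. */
    @Override
    public Iterator<T> iterator() {
        return hasValue()
                ? Collections.singletonList(value).iterator()
                : Collections.<T>emptyIterator();
    }
}
```

Because it is an Iterable, callers who don't care about the empty case can simply write a for-each loop over the result, exactly as in the List<Person> example.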

Up to this point, you might be saying, “This is nice, but why would I choose this over, say, the fine Optional class in Google's Guava project?”

First, I would heartily recommend Guava. It has a lot of really nice ideas and adaptations from the functional programming world to Java and if that's the ultimate direction you want to choose, it's a great choice.

There are a few reasons you might consider this version:

  • It has no dependencies other than the JDK and org.eclipse.core.runtime; it is small and simple enough that you can “own” it yourself.
  • Possible<T>, in addition to containing a value or nothing, encapsulates an IStatus value which is useful in Eclipse development. (If no IStatus is specified, the value defaults to Status.OK_STATUS.)

For library authors who do not want to be opinionated about the logging framework choice, encapsulating an IStatus allows the library to return information in a form suitable for logging to any of several loggers, including Eclipse's.

Possible<T> may be found on GitHub.


2013/02/06 20:56 · djo

The Server-side Scaling Evolution

Last Friday, I blogged about how we are undergoing a client-side seismic shift. In this article, I'll talk about the somewhat less surprising changes that I see happening on the server. However, in my opinion, these server-side changes will also have a pretty far-reaching impact on how we program that will play out particularly over the next five years or so.

Multi-core programming

Multi-core is here.

As I write in early 2013, 8-way servers are common, and much larger ones are not unusual. 4-way cell phones are here. The laptop on which I'm writing this blog has 4 cores; so does my tablet.

But most software is still written to use a single thread of execution.

The more cores we have in our hardware, the more pronounced this mismatch will become, which will pressure the industry into changing software-development practices to take advantage of the new power.

Functional Programming

Functional programming is one promising means of taking advantage of multi-core processors. When code is written in a pure style, the compiler or runtime can often parallelize certain tasks automatically. Other operations–such as applying a single function over a large collection of data–can also be parallelized more automatically if coded in a functional style. And functional languages such as Erlang have popularized the Actor model of concurrency, a promising way of managing concurrency that simplifies many of the problems typically associated with it.
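As a small illustration of the “apply a function over a large collection” case, here is a sketch using the Java 8 Streams API. The class and method names are made up for the example. Because the mapping function is pure, the only difference between the two versions is the call to parallel(); the library decides how to split the work across cores.

```java
import java.util.stream.LongStream;

final class SumOfSquares {
    /** Single-threaded: squares each number in 1..n and adds them up. */
    static long sequential(long n) {
        return LongStream.rangeClosed(1, n)
                .map(x -> x * x)
                .sum();
    }

    /** Same pipeline, but the runtime may spread the work over all cores. */
    static long parallel(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }
}
```

Both methods return the same result for the same input. Whether the parallel version is actually faster depends on the size of the data and the cost of the function.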

I believe that Java 8 will mainstream functional programming techniques, much like C++ mainstreamed object-oriented programming.

However, whether Java 8 will become the de facto functional programming language or will just whet people's appetite for purer languages like Scala or Clojure remains to be seen.

What seems clear to me is that functional programming is coming, and that it is wiser to “skate to where the puck will be” rather than scrambling to overcome a deficit later.

Server-side Javascript

Javascript can be used as a functional language and, using Node.js, can run on the server side. I have little contact with it and currently work in a large Java shop. But to the extent that it crosses my radar, I'll make notes about that here too. There seems to be a lot to be said for running the same language on the client and on the server…


2013/01/27 20:55 · djo

The Client-side Seismic Shift

Since approximately 2003, I have spent most of my time developing high-performance client-side applications using Eclipse rich client platform (RCP) as my toolkit.

Although Eclipse RCP is no longer “trendy”, it is still the best tool for most business applications that require a performant, “native” look and feel–particularly for those that must run across platforms.

However, that is changing.

In 2003, the client-side market was mainly Windows and MacOS. Today, the client-side market also includes iOS, Android, and Windows Mobile. Additionally, with the rise of single-page web applications like Gmail and Google Docs, the web browser has evolved to become a capable platform for performant rich graphical applications. To add weight to this trend, as of late 2012, Samsung's Chromebook was the fastest-selling computer on Amazon.

In the past, if you wanted to “write once, run anywhere,” you had to use Java. Today, with the rise of the mobile and tablet platforms, Java no longer delivers this capability, yet with the increasing proliferation of platforms, it is needed more than ever.

However, Javascript *does* deliver this capability, both in the client (in the browser), and more recently using Node.js and V8, on the server side as well.

The transition from Java to Javascript as the enterprise language of choice will not happen quickly, but given the market forces noted above, it does appear to be inevitable.

Having noted this, projects like Eclipse's Orion and Ember.js appear to be more timely than ever.

I have set up my own Orion instance on my own cloud server and love how it lets me play with web sites and Javascript from literally anywhere. I think I can say that this truly is the future of coding.

However, on the server side, other forces are at work.

But I'll save my thoughts on that for another blog article.


2013/01/25 18:17 · djo

Databinding: Chaining ObservableValues together

Sometimes the standard data binding primitives are perfect: data gets converted, validated, and reported nicely. But sometimes one's validation requirements are more than fits nicely into a standard IValidator. And sometimes the data binding pipeline that Binding implements for you isn't quite as flexible as one might want.

What I realized after reading quite a bit about monads is that if we treat an IObservableValue as immutable for a given moment in time (e.g., it only gets changed by an actor external to the system, like the user or a clock or…) then we can chain IObservableValues together using a function that maps from one IObservableValue to another in much the same way one would use the map function to transform one functional collection to another. Implementing this idea lets us write code like this:

IObservableValue textFieldTarget = new WritableValue("", String.class);  // Bound to a UI field somewhere
IObservableValue model = new WritableValue(new Integer(1), Integer.class);

ValueBindingPipeline pipeline = new ValueBindingPipeline(bindingContext)
   .from(model)
      .map(new IntToStringConversion())
      .to(textFieldTarget);

pipeline.getReversePipeline()
   .from(textFieldTarget)
      .map(new IsValidIntegerValidation())
      .map(new ConvertToInteger())
      .map(new RangeCheckValidation())
      .map(new LessThanValidation(someOtherObservable))
      .to(model);

As you can see, the resulting code is nice and explicit about what happens everywhere in the pipeline. This is really similar to some ideas that Matt Hall had, just expressed in a more monadic form.
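To show the chaining idea on its own, here is a sketch in plain Java with no Eclipse data binding involved. It is not the ValueBindingPipeline implementation: Step, then, and TextToIntPipeline are hypothetical names, and observables are replaced by plain values so the example stays self-contained. Each stage either passes a value on or stops the chain with an error, which is roughly what the validation and conversion stages in the reverse pipeline need to do.

```java
import java.util.function.Function;

/** Hypothetical result of one stage: either a value or an error message. */
final class Step<T> {
    final T value;      // null when the stage failed
    final String error; // null when the stage succeeded

    private Step(T value, String error) {
        this.value = value;
        this.error = error;
    }

    static <T> Step<T> ok(T value) { return new Step<>(value, null); }
    static <T> Step<T> fail(String error) { return new Step<>(null, error); }

    boolean isOk() { return error == null; }

    /** Runs the next stage only if every earlier stage succeeded. */
    <R> Step<R> then(Function<? super T, Step<R>> next) {
        return isOk() ? next.apply(value) : Step.<R>fail(error);
    }
}

final class TextToIntPipeline {
    static Step<String> isValidInteger(String s) {
        return s.matches("-?\\d{1,9}")
                ? Step.ok(s)
                : Step.<String>fail("not an integer: " + s);
    }

    static Step<Integer> convertToInteger(String s) {
        return Step.ok(Integer.parseInt(s));
    }

    static Step<Integer> rangeCheck(int n, int min, int max) {
        return (n >= min && n <= max)
                ? Step.ok(n)
                : Step.<Integer>fail("out of range: " + n);
    }

    /** Validate, convert, then range-check, stopping at the first failure. */
    static Step<Integer> run(String text, int min, int max) {
        return Step.ok(text)
                .then(TextToIntPipeline::isValidInteger)
                .then(TextToIntPipeline::convertToInteger)
                .then(n -> rangeCheck(n, min, max));
    }
}
```

The first stage to fail determines the error, and later stages are never invoked, so the conversion never sees text that failed validation.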


2012/08/17 22:58 · djo

Orion works on Asus Transformer tablet

I've been looking for a handy way to edit code on the go and wondered a while ago if Orion plus my Asus Transformer Prime tablet would fit the bill or not.

I originally tried it several months ago using both the built-in Android browser and the Chrome beta for Android. At the time, neither worked in any usable form at all.

However, today I tried again, and I'm happy to report that using the latest Chrome Beta on the tablet, it works fine–with good performance to boot.

Congrats to the Orion team!

Wish list

Now for what I really want:

  • An Orion-based hosted service
  • With Play Framework installed/integrated
  • And Scala/Java Development Tools integrated

Not this year, probably, but… ;)


2012/05/18 22:43 · djo


start.txt · Last modified: 2014/10/20 15:40 (external edit)