The Iterable Monad

Informally

Informally: Executing a function over the contents of a boxed value and returning a new box with the results

Examples

The dot operator

The semicolon

(See Daniel Spiewak's blog)

Iterable<T>

Map<K,T>.keySet(), etc...

List<T>

...

Option<T>

Formally

f( M[A], A -> M[B] )

f may be implemented as a method

f may be implemented as a language feature

f may be implemented as a standalone function

f may be implemented as a fluent interface over some set of types

In Scala, a "monad" is ANY "boxed type" that

Has this method (in Scala, called "flatMap")

Follows ~the same algebraic laws as "+" and "*"

(These naming conventions let us metaprogram over any kind of monad)

The Monadic Axioms

Let us call "unit" the 1-arg constructor that "boxes" a value

Informally

The Monadic Laws--Expressed informally:

1) If you unbox a value and box it again, you'll get "the same"** monad as the original.

2) If you map/flatmap a function f over a boxed value v, the result is "the same" as if you'd computed f(v) directly.

3) "Associativity" works. If you chain flatmap operations together, executing the flatmap operations in order is the same as flatmapping each individual element inside M with F and collecting the results.

** "The same" in a monadic context means that #equals and #hashCode say that the objects are the same, not necessarily that the values are literally the same reference or that they are internally implemented identically.

Formally

Formally:

1) m flatmap unit === m

2) unit(x) flatmap f === f(x) === x.f()

3) m flatmap g flatmap f === m flatmap {x => g(x) flatmap f }

(Remember: Unit is the generic name for a monad's 1-arg constructor)

Thanks to James Iry's blog series for this formulation of the Monad Laws.

Informally:

1) If you unbox a value and box it again, you'll get "the same" monad as the original.

2) If you map/flatmap a function f over a boxed value v, the result is "the same" as if you'd computed f(v) directly.

3) "Associativity" works. If you chain flatmap operations together, executing the flatmap operations in order is the same as flatmapping each individual element inside M with F and collecting the results.

** "The same" in a monadic context means that #equals and #hashCode say that the objects are the same, not necessarily that the values are literally the same reference or that they are internally implemented identically.

Why?

Because you might like to operate on boxed types without explicitly unboxing

Because you might like to transform the elements inside one box into different values stored in another box of the same type

Because you might like to copy the elements inside one box into another box of the same type

Because you'd like to preserve reasonable algebraic properties over those operations

Because it would be *really* nice if all of these operations could run in parallel, automatically

Java isn't able to do this...or is it?

The Trigram Problem

For example, given the input:

I wish I may I wish I might

You might generate:

"I wish" => ["I", "I"]

"wish I" => ["may", "might"]

"may I" => ["wish"]

"I may" => ["I"]

This says that the words "I wish" are twice followed by the word "I", the words "wish I" are followed once by "may" and once by "might" and so on.

To generate new text from this analysis, choose an arbitrary word pair as a starting point. Use these to look up a random next word (using the table above) and append this new word to the text so far.

Trigram conversion pipeline

List of lines

List of words

List of trigrams

We've seen this problem...

In each case, we transform the contents of the container (the box) using a function and concatinate the results together into a new collection

f( M[A], A -> M[B] )

Example notes

ResultContainer is the bind function

transformAndFlatten is flatMap

Google Guava has something similar; they call it transformAndConcat, but they don't have a proper "bind" implementation, so you don't get the same collection type back.

Conclusion

Opportunities for future work

Iterate using continuations

Don't process the whole list at a time

Using continuations (JYield), observer patterns can be Iterable

Implement pipelined parallelism; each chained transformAndFlatten potentially gets a core

A *lot* of computing operations boil down to f( M[A], A -> M[B] )

Informally: Executing a function over the contents of a boxed value and returning a new box with the results