Dave Orme muses about agile and functional programming.

My current work emphasizes SOA applications using Scala, Kubernetes, and AWS with a React-based SPA front-end. I'm also interested in progressive web applications and developer tools.


'Enterprise Clojure' and Specs

Over the past two years I have been using Clojure to deliver an Extract-Transform-Load (ETL) pipeline. During this time, I have spoken with a number of other developers who use Clojure to deliver larger-scale applications, and among some of these developers a consensus has arisen: “Clojure is too hard to use at larger scales.”

In what ways might they be right? Is there anything we can do to improve this situation?

This post looks at one concern I have seen raised and proposes a direction the community could take toward a solution.

Naturally, feedback is welcome.

A while back I needed to parse a string into words, except that any single- or double-quoted substring had to be treated as a single word. Nested quotes are not supported.

This is similar to the way command line arguments are handled in Unix-like shells:

$ java com.hello.Hello 'Hello world'    # 'Hello world' is parsed as a single entity

Here is the Clojure code I initially wrote to parse this way:

(require '[clojure.string :as str])

;; Quote characters that may open or close a quoted substring.
;; The docstrings promise both single and double quotes, so both are listed.
(def ^:private delimiters [\' \"])
(def ^:private delimiter-set (set delimiters))

(defn merge-strings
  "Given a vector of strings, merge strings beginning/ending with quotes into
  a single string and return a vector of standalone words and quoted strings.
  Nested / unbalanced quotes will return undefined results."
  [[result delimiter merging] next]

  (let [start (first (seq next))
        end   (last (seq next))]
    (cond
      (and (delimiter-set start)
           (delimiter-set end))   [(conj result next) nil ""]
      (delimiter-set start)       [result start next]
      (delimiter-set end)         [(conj result (str merging " " next)) nil ""]
      (nil? delimiter)            [(conj result next) nil ""]
      :else                       [result delimiter (str merging " " next)])))


(defn delimited-words
  "Split a string into words, respecting single or double quoted substrings.
  Nested quotes are not supported.  Unbalanced quotes will return undefined
  results."
  [s]
  (let [words (str/split s #"\s")
        delimited-word-machine (reduce merge-strings [[] nil ""] words)
        merged-strings (first delimited-word-machine)
        remainder (last delimited-word-machine)
        delimiter (second delimited-word-machine)]
    (if (empty? remainder)
      merged-strings
      (conj merged-strings (str remainder delimiter)))))

At the time I wrote the code, it made perfect sense to me. Recently, though, I had to revisit it to write and update tests, which meant understanding it all over again.

And I found myself staring at the parameter list and code of the merge-strings function, trying to understand what values each parameter could take. I was surprised at how non-obvious it was to me a few short months later.

To my thinking, this illustrated a common pain point my colleagues have expressed about Clojure, namely…

As complexity increases, the data expected in a function's parameters can quickly become non-obvious

Even though I had written a docstring for the function, it is a reducer and not part of the public API, so I had not rigorously described each parameter's possible values or how the function uses them.
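
One way to recover that kind of missing description after the fact is to watch the accumulator change as the reducer runs. The sketch below is an illustration only: it repeats a condensed copy of merge-strings (with the second parameter renamed to word, an assumption made here to avoid shadowing clojure.core/next) so that it runs on its own, and it uses clojure.core/reductions to print every intermediate [result delimiter merging] triple for a made-up sample input.

```clojure
(require '[clojure.string :as str])

;; Condensed copy of the reducer, repeated so this sketch runs standalone.
(def delimiter-set #{\' \"})

(defn merge-strings
  [[result delimiter merging] word]
  (let [start (first word)
        end   (last word)]
    (cond
      (and (delimiter-set start)
           (delimiter-set end))  [(conj result word) nil ""]
      (delimiter-set start)      [result start word]
      (delimiter-set end)        [(conj result (str merging " " word)) nil ""]
      (nil? delimiter)           [(conj result word) nil ""]
      :else                      [result delimiter (str merging " " word)])))

;; reductions returns every intermediate accumulator, not just the final one.
(def trace
  (reductions merge-strings
              [[] nil ""]
              (str/split "say 'hello big world' now" #"\s")))

;; Print one accumulator state per line.
(doseq [state trace]
  (prn state))
```

Each printed vector is one state of the accumulator: the words collected so far, the quote character that is currently open (or nil), and the partially merged quoted string.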

This week, I decided to use the upcoming Specs library (clojure.spec) from Clojure 1.9 to document each parameter's possible values and see whether this helped the readability and maintainability of this particular example.

Specifically, I wanted to use Specs according to the following form from the documentation:

(defn person-name
  [person]
  {:pre [(s/valid? ::person person)]
   :post [(s/valid? string? %)]}
  (str (::first-name person) " " (::last-name person)))
 
(person-name 42)
;;=> java.lang.AssertionError: Assert failed: (s/valid? :my.domain/person person)
 
(person-name {::first-name "Elon" ::last-name "Musk" ::email "elon@example.com"})
;; Elon Musk
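
The snippet above relies on a ::person spec that is not shown in this post. As a minimal sketch of what such a spec could look like: the attribute predicates below are simplifying assumptions (a real ::email spec would likely be stricter), and the clojure.spec.alpha namespace is the one Clojure 1.9 shipped with.

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumed attribute specs; plain string? is used only to keep the sketch short.
(s/def ::first-name string?)
(s/def ::last-name  string?)
(s/def ::email      string?)

;; A person is a map that must contain all three namespaced keys.
(s/def ::person (s/keys :req [::first-name ::last-name ::email]))

(def elon {::first-name "Elon" ::last-name "Musk" ::email "elon@example.com"})

(println (s/valid? ::person elon))     ; a conforming map
(println (s/valid? ::person 42))       ; not a map at all
(println (s/explain-str ::person 42))  ; human-readable reason for the failure
```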

After trying this in a few places, I became dissatisfied with the repetitiveness of manually calling s/valid? for each (destructured) parameter value, so I wrote a macro to DRY this pattern up. (The code is in the clj-foundation project.) With the macro, the above defn can be rewritten in either of the following two ways:

(=> person-name [::person] string?
  "person->String"
  [person]
  (str (::first-name person) " " (::last-name person)))
 
;; Or:
 
(defn person-name
  "person->String"
  [person]
  (str (::first-name person) " " (::last-name person)))
 
(=> person-name [::person] string?)
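
The real macro lives in clj-foundation and is not reproduced here. To give a rough idea of how the second, two-step form might work, the sketch below defines a hypothetical check-fn macro; its name and implementation are assumptions made for illustration. It only checks positional arguments, so unlike => it does not look inside destructured parameters.

```clojure
(require '[clojure.spec.alpha :as s])

(defmacro check-fn
  "Hypothetical stand-in for the two-step form of =>. Rebinds the var named by
  f to a wrapper that validates each positional argument against the spec in
  the same position of arg-specs, and the return value against ret-spec."
  [f arg-specs ret-spec]
  `(alter-var-root
     (var ~f)
     (fn [orig#]
       (fn [& args#]
         ;; Pair each spec with its argument and fail on the first mismatch.
         (doseq [[spec# arg#] (map vector ~arg-specs args#)]
           (when-not (s/valid? spec# arg#)
             (throw (ex-info (str "Bad argument: " (s/explain-str spec# arg#))
                             {:value arg#}))))
         (let [ret# (apply orig# args#)]
           (when-not (s/valid? ~ret-spec ret#)
             (throw (ex-info (str "Bad return value: "
                                  (s/explain-str ~ret-spec ret#))
                             {:value ret#})))
           ret#)))))

;; Assumed minimal specs so the example is self-contained.
(s/def ::first-name string?)
(s/def ::last-name  string?)
(s/def ::person (s/keys :req [::first-name ::last-name]))

(defn person-name
  "person->String"
  [person]
  (str (::first-name person) " " (::last-name person)))

(check-fn person-name [::person] string?)

(println (person-name {::first-name "Elon" ::last-name "Musk"}))

;; A non-conforming argument is now rejected before the body runs.
(try
  (person-name 42)
  (catch clojure.lang.ExceptionInfo e
    (println "Rejected:" (.getMessage e))))
```

Wrapping the var after the fact keeps the defn itself untouched, which is the appeal of the two-step form: the spec information sits next to the function without changing how the function is written.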

To my eyes, the => form significantly enhanced the readability of the spec information added to the person-name function, so I applied the macro to my string parsing functions. That code now reads as follows:

;; Assumes clojure.string is aliased as str, clojure.spec as s, and that the
;; => macro from clj-foundation is referred into this namespace.

;; Quote characters that may open or close a quoted substring.
(def ^:private delimiters [\' \"])
(def ^:private delimiter-set (set delimiters))

(s/def ::word-vector     (s/coll-of string?))
(s/def ::maybe-delimiter #(or (delimiter-set %)
                              (nil? %)))
(s/def ::merge-result    (s/tuple ::word-vector ::maybe-delimiter string?))


(=> merge-strings [::word-vector ::maybe-delimiter string? string?] ::merge-result
  "Given a vector of strings, merge strings beginning/ending with quotes into
  a single string and return a vector of standalone words and quoted strings.
  Nested / unbalanced quotes will return undefined results."
  [[result delimiter merging] next]

  (let [start (first (seq next))
        end   (last (seq next))]
    (cond
      (and (delimiter-set start)
           (delimiter-set end))   [(conj result next) nil ""]
      (delimiter-set start)       [result start next]
      (delimiter-set end)         [(conj result (str merging " " next)) nil ""]
      (nil? delimiter)            [(conj result next) nil ""]
      :else                       [result delimiter (str merging " " next)])))


(=> delimited-words [string?] ::word-vector
  "Split a string into words, respecting single or double quoted substrings.
  Nested quotes are not supported.  Unbalanced quotes will return undefined
  results."
  [s]
  (let [words (str/split s #"\s")
        delimited-word-machine (reduce merge-strings [[] nil ""] words)
        merged-strings (first delimited-word-machine)
        remainder (last delimited-word-machine)
        delimiter (second delimited-word-machine)]
    (if (empty? remainder)
      merged-strings
      (conj merged-strings (str remainder delimiter)))))

With this code, it becomes easy to see that the result (destructured) parameter contains a word-vector, which must be a collection of strings. Similarly, without reading the function body, one can immediately note that the delimiter parameter is a character from the delimiter-set or nil.

And this bit of up-front information made (re)reading the body of the merge-strings function much easier.
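
The same specs can also be exercised directly at the REPL, independent of any macro. The following sketch repeats the three spec definitions so that it runs on its own and checks a few hand-written accumulator values against ::merge-result; the sample values are assumptions chosen for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

(def delimiter-set #{\' \"})

(s/def ::word-vector     (s/coll-of string?))
(s/def ::maybe-delimiter #(or (delimiter-set %)
                              (nil? %)))
(s/def ::merge-result    (s/tuple ::word-vector ::maybe-delimiter string?))

;; No quote open: delimiter is nil and nothing is being merged.
(def idle-state    [["say"] nil ""])

;; Inside a quoted substring: delimiter holds the opening quote character.
(def merging-state [["say"] \' "'hello"])

;; Wrong on purpose: the delimiter is a one-character string, not a character.
(def bad-state     [["say"] "'" "'hello"])

(println (s/valid? ::merge-result idle-state))
(println (s/valid? ::merge-result merging-state))
(println (s/valid? ::merge-result bad-state))

;; explain-str reports which element of the tuple failed and why.
(println (s/explain-str ::merge-result bad-state))
```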

Retrospective

With this in mind, I would like to offer the following thoughts about this experiment:

* I felt the experiment was successful. I believe the code I wound up with explains the original author's intentions better than the original code.
* Only time will validate the => macro, and I'm sure it will evolve over time. But I sincerely hope something like it makes it into Specs in the end.
* More generally, I feel that this code illustrates how even quite straightforward functions can become opaque very quickly, and how providing explicit specifications describing what data a function accepts and provides can significantly enhance communication.
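
For comparison, clojure.spec itself provides s/fdef for attaching specs to an existing function, and clojure.spec.test.alpha/instrument for checking calls against them. The sketch below shows that built-in route under the same assumed ::person specs used for illustration earlier; note that instrument validates only the :args spec, not the return value.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest])

;; Assumed minimal specs so the example is self-contained.
(s/def ::first-name string?)
(s/def ::last-name  string?)
(s/def ::person (s/keys :req [::first-name ::last-name]))

(defn person-name
  "person->String"
  [person]
  (str (::first-name person) " " (::last-name person)))

;; Describe the argument list and return value separately from the defn.
(s/fdef person-name
  :args (s/cat :person ::person)
  :ret  string?)

;; Turn on argument checking for this one function.
(stest/instrument `person-name)

(println (person-name {::first-name "Elon" ::last-name "Musk"}))

;; An instrumented call with a non-conforming argument throws ExceptionInfo.
(try
  (person-name 42)
  (catch clojure.lang.ExceptionInfo e
    (println "Rejected:" (.getMessage e))))
```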

...and a word from our sponsors

In closing, I'm available for new Clojure gigs right now. If this kind of thinking and expertise is welcome on your Clojure team or on your Clojure project, feel free to email me using the address on the “Contacts” page off my home page.

blog/enterprise_clojure_is_not_a_bad_phrase.1495678912.txt.gz · Last modified: 2017/05/24 22:21 by djo