Clojure Debugging ’13: Emacs, nREPL, and Ritz

[NOTE: The release of cider deprecates much of the content here. I will post an update on Clojure Debugging ’14 early in the new year.]

I’m ramping up for a new set of development projects in 2013 and 2014. My 2010-era setup with slime and swank-clojure is unlikely to remain a viable approach throughout the project. I’ve decided it is time to join the nREPL community and to take advantage of some of the architecture innovations there, which may make it easier to debug the distributed systems I’m going to be working on.

Features I’m accustomed to from common lisp slime/swank:

Code navigation via Meta-. and Meta-,
Fuzzy completion in editor windows and the repl
Documentation help in mini-buffer
Object inspector. Ability to walk any value in the system
Walkable backtraces with one-key navigation to offending source
Evaluate an expression in a specific frame, inspect result
Easy tracing of functions to the repl or a trace buffer (in emacs)
Trigger a continuable backtrace via watchpoint or breakpoint

Only the first three of these features are available in stock nREPL. The rest of this post discusses how to set up a reasonable approximation of this feature set in Emacs using nREPL middleware providers as of May 2013.

Continue reading “Clojure Debugging ’13: Emacs, nREPL, and Ritz” →

Cloud Computing: Hype or Revolution? It’s all about time and fixed costs, not operating margin

A colleague recently emailed an argument that higher-level computing providers like Heroku are doomed due to the “Fundamental Law of Another Mouth to Feed”, where the margin requirements of the services these providers are reselling mean they will eventually lose to someone without that margin in their supply chain. As with any asymptotic complexity argument, this is sound theoretical thinking, except it ignores the significance of constants in the equation. What follows is a slightly cleaned up version of my response.

Continue reading →

The Fog of Innovation

The Fog of War is invoked to describe the uncertainty that permeates combat operations. This term is generally attributed to a quote by Clausewitz: “The great uncertainty of all data in war is a peculiar difficulty, because all action must, to a certain extent, be planned in a mere twilight, which in addition not infrequently — like the effect of a fog or moonshine — gives to things exaggerated dimensions and unnatural appearance.” (via wikipedia).

The evolution of an innovation or innovative company faces a similar fog. At any point in time, we extrapolate from current conditions a set of possible future outcomes and take action to try to bring the best outcome into being. In one view, we should make economically rational decisions, those that maximize our expected return over the outcomes weighted by their likelihood.

However, Nassim Nicholas Taleb’s “The Black Swan” argues convincingly that the seeds of disruption and destruction lie in events that exist outside all reasonable extrapolations of a current state of affairs. The occurrence of a highly improbable event is, in fact, inevitable.

Since we can’t know anything for sure, innovators have to operate at a completely different level than analysts. It’s a chess game where the rules can change mid-play, forcing a complete re-evaluation of strategy. For example, the iPhone was a Black Swan to the cellular industry, as this analyst’s perspective shows. This dynamic is why you should ignore Forrester and other analysts – they tell you what the world will be like if nothing surprising happens, but surprising things always happen.

Continue reading “The Fog of Innovation” →

What can we learn from our E-Mail logs?

Recently I’ve been observing via RescueTime that I spend 3 or more hours in my e-mail application most days. However, I don’t have a good breakdown of how much of this is scheduling, looking up information, commenting on something substantive or social discourse. There is a tremendous amount of information locked up in the time-series of e-mails sent and received that can provide insight into aspects of my behavior such as focus of attention (time of day e-mail is sent), social relationships (what organizations I interact with in a given week), the density of idea generation, etc. E-mail logs contain a wealth of raw data that can be instrumented to uncover important information about our lives.

Our e-mail logs are also a rich archive of useful information such as phone numbers, addresses, what we said to someone, when we said it, edits to papers, attachments, etc. With a proper set of tools, many of which have been built for analyzing social media, we can turn this archive into a database of useful information that can significantly enhance e-mail-based instrumentation.

Continue reading “What can we learn from our E-Mail logs?” →

Schema support for Clojure HBase Client now called clojure-hbase-schemas

My GitHub fork of the Clojure library for HBase, clojure-hbase, is now deprecated. I’ve extracted the functionality from David Santiago’s original library (with permission) along with a duplicate of his admin functions to create a parallel repository with the schema-oriented API I developed.

The new repository is owned by Compass Labs and can be found here. The library can also be referenced via Maven / Leiningen:

com.compasslabs/clojure-hbase-schemas "0.90.4"
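For Leiningen users, the coordinate above would go in the :dependencies vector of project.clj. A minimal sketch follows; the project name my-app, its version, and the Clojure version are placeholders I’ve assumed for illustration, not something the library requires:

```clojure
;; project.clj (sketch; project name and Clojure version are assumed placeholders)
(defproject my-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]
                 ;; coordinate and version as given in this post
                 [com.compasslabs/clojure-hbase-schemas "0.90.4"]])
```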

Discipline and Innovation

David Fore of Lybba forwarded me this article on innovation. It provides a set of beautiful examples of the role of discipline and standardization in fostering innovation.

At my first startup company, Silicon Spice, we were extremely disciplined in how we structured internal communications, identified and followed technology standards and incorporated automatic testing into all our technology development efforts. The company-wide discipline we established allowed for an unprecedented pace of product innovation in the rapidly evolving communications market of the late 90’s. Peaking at only 120 employees, we constructed a complex set of technology targeting telco equipment vendors including reference voice switching boxes, a 2.5 Watt 21-core DSP processor, companion ASIC devices, a vector compiler, a real-time OS, a fully-featured set of voice processing software and a complete suite of development tools for our platform. A few people in key roles were able to spot opportunities that spanned numerous product sub-teams and we were able to quickly implement coordinated design changes across our home-grown technology stack.

A lead engineer from Texas Instruments, our primary competitor, once commented to me that they couldn’t believe that we had built our product in only 3-4 years with 100 people; I didn’t have the heart to tell him the product they were evaluating for acquisition was built by an average of 50 people over 16 months! We seized 70% of the carrier-class market from them in the 18 months following our acquisition.

Continue reading “Discipline and Innovation” →

Writing Java plugins for Flume in Clojure

I recently wrote a plugin in Clojure to add to the Cloudera Flume framework. As it was my first time writing a full Java class interface, I had to learn about the proper use of both proxy and gen-class. Given the poor error reporting at the Java-Clojure boundary, figuring out what you did wrong if you don’t get every detail exactly right (particularly when loading a class in the plugin’s final environment) can be difficult.
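To make the proxy / gen-class distinction concrete, here is a minimal sketch that has nothing to do with Flume itself. It implements the standard java.util.concurrent.Callable interface both ways; the namespace example.greeter, the class name, and the "greeter-" prefix are hypothetical names chosen for illustration.

```clojure
;; Sketch only: namespace, class name, and prefix are illustrative assumptions.
(ns example.greeter
  (:gen-class
    :name example.greeter.Greeter
    :implements [java.util.concurrent.Callable]
    :prefix "greeter-"))

;; gen-class: the generated class dispatches method `call` to the var
;; named <prefix><method>. The first argument is the instance itself.
(defn greeter-call [this]
  "hello from gen-class")

;; proxy: builds an anonymous instance at runtime, no compilation step needed.
(def callable-proxy
  (proxy [java.util.concurrent.Callable] []
    (call [] "hello from proxy")))

;; The proxy can be used immediately from the REPL.
(.call callable-proxy)
```

The practical difference is that proxy gives you an object right away, while the gen-class directive only produces a named class when the namespace is compiled ahead of time (for example by listing it under :aot in project.clj). A framework that loads plugins by class name needs the latter, which is where mistakes tend to surface late.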

Continue reading “Writing Java plugins for Flume in Clojure” →

New Abstractions for Clojure-HBase

I just pushed Compass Labs’ HBase Client API to my fork of the clojure-hbase library. The API includes support for table schemas (to auto-encode inputs and outputs) and a constraint language that generates filters and calls for Get and Scan operations. We decided to integrate with the existing fork to retain access to the excellent admin functionality already implemented there. We’ll be talking with the original author to see whether we’ll merge or split these two API development paths. In the meantime, you can use our fork of clojure-hbase.

Steps and Flows: Higher-order Composition for Clojure-Hadoop

The clojure-hadoop library is an excellent facility for running Clojure functions within the Hadoop MapReduce framework. At Compass Labs we’ve been using its job abstraction for elements of our production flow and found a number of limitations that motivated us to implement extensions to the base library. We’ve submitted these extensions for integration into a new release of clojure-hadoop, which should be out shortly.

There are still some design and implementation warts in the current release which should be fixed by ourselves or other users in the coming weeks.

Continue reading “Steps and Flows: Higher-order Composition for Clojure-Hadoop” →

Learning Bayesian statistics with R

I have a bad tendency in my research work to write my own code and libraries from scratch, in large part because I’ve decided to keep most of my coding in Common Lisp to leverage prior tools. However, I’ve recently been given a painful demonstration of how it is often faster to pay the up-front cost to learn the right tool than to rewrite (and maintain) the subsets you think you need. For example, I found myself venturing into Clojure/Java/Hadoop for my commercial work this year as a compromise between Lisp / dynamic language features and integration benefits. This week I’m finding the need to do some rather sophisticated work with graphical models and I need some tools to build and evaluate them.

I’ve looked at a wide variety of open source approaches such as Samiam (no continuous variables), WinBUGS (only Windows), OpenBUGS (not quite ready), HBC (inference only), Mallett (OK, but I don’t like Java and it doesn’t support all forms of real-valued random variables), Incanter (limited but growing support for graphical models) and R.

Continue reading “Learning Bayesian statistics with R” →