Dog-eared Digital Books
"If I die tomorrow, well, at least my children, who are approaching as we speak, at least they will have a very good idea of what I was like, of what my mind was like, because they will be able to read my books. So maybe there is an immortalizing principle at work even if it's just for your children. Even if they've forgotten you physically, they could never say that they didn't know what their father was like." - Martin Amis [1]
I recently attended the Books in Browsers (BiB) conference [2] at which I heard Bob Stein make the point that there is meaningful and emotional value in annotating written content [3]. While Martin Amis is referring to the act of writing a novel, we could all share experience, in text, across time. If I carefully annotated my thoughts (likes, associations and criticisms) on The Life of Pi [4], with contextual information (an autumnal commute, in a Mancunian bus, aged 24), my children could perform the same act, with the same text, at the same point in their lifetime. This experience could be extremely poignant.
While it occurs with physical books (more frequently in academic genres, perhaps), annotation could be promoted and enhanced through digital reading devices and services. A paper-based manuscript is limited by the rate at which it degrades and by its physical size; a digital equivalent (as long as its content can be deciphered) can last, and be expanded, indefinitely. Annotating is not a new concept for the web (and therefore for formats like ePub): projects like the Open Annotations Standard [5] have been attempting to solve this problem, with limited success, for some time, one key constraint being the transient nature of the source.
Other BiB talks touched on these subjects. Blaine Cook and Maureen Evans talked about authors being able to use the tools familiar to software developers (source control systems like Git [6]) and social-networkers to aid communal content creation and blur the line between writer and reader (Maureen nurtured a crowd-sourced cookery book based on these principles [7]).
Content creation in this style might not appeal to every reader, but note making could. James Bridle's talk covered some of the key features that bookmarks and annotations should provide [8]. One appealing idea he described was:
"Traditional books, physical paper books, are fantastic. You can read them cover to cover, bookmark them, dog-ear them, write notes in the margin, underline your favourite passages, treasure them, keep them, and lend them to your friends. Ebooks should let you do these things too - but sometimes they don't." - James Bridle [9]
Taking all of the above into account, there is a strong argument for implementing this kind of functionality within eReading tools. The Open Books Checklist [8] (item three) includes the following: "You should be able to save these marks separate from the book itself." The Open Annotation proposals share the same ideas. There's another way to look at this.
Borrowing further from software development practices, open source in particular, readers could purchase a "release" of the book. From that point on the book is theirs (their own "version"). Annotations and bookmarks can be added directly to it (in a format that can be hidden or revealed). This version can then be passed on, and on, gaining a life of its own, each reader leaving their mark (as developers leave their comments and footprints on a code base): the digital equivalent of browning, dog-eared, written-on pages.
As with software, the user's version may deviate from further "releases" made by the original author. These could be merged or ignored at the reader's discretion. This way the supplements can't be separated from their source material (physically or over time). Example usage might include a teacher annotating a text, then providing this version to their students, who further branch the content. The teacher could choose to merge the best student content back into their version and, when a new text is released, choose whether to merge it or continue with the version they have for their next class.
My copy of The Life of Pi becomes our family copy. Each of us leaves impressions; a new edition is irrelevant, because ours is the version that means the most.
[1] http://www.theparisreview.org/interviews/1156/the-art-of-fiction-no-151-marti...
[3] http://www.slideshare.net/synch101/bi-b-bob-stein
[4] http://en.wikipedia.org/wiki/Life_of_Pi
[5] http://www.openannotation.org/
[8] http://www.openbookmarks.org/checklist/
[9] http://booktwo.org/notebook/everything-is-the-same-only-different/
Simple
Rich Hickey's [1] approach to developing software (or just thinking about and building tools) really chimes with me. His recent talk Simple Made Easy [2] beautifully sums up how "easy" is a relative term, whereas "simple" is quantifiable and not necessarily "easy".
Following on from his previous talk on Hammock Driven Development [3], his approach is to solve the right problem, and to really think about how to solve the right problem well. Being able to hold a problem in your mind depends on simplicity: the more complex the problem, the more thinking power it requires. Keeping problems encapsulated and modular allows them to be evaluated fully, and simply.
This ties in with themes from previous posts [5] about reducing complexity and technical debt by keeping code small and well tended. Solving the right problem and not adding more than is necessary. To repeat a favorite quote:
"Real efficiency comes from elegant solutions, not optimized programs." Jonathan Sobel [4]
Efficiency, in this case, applies equally to the execution of the code and to the ability of the team to work with and improve it over a long period. While procedure and testing (agile, XP, CI, TDD and acceptance testing) are tools that I find helpful, they do not truly assist with the act of problem solving. Complex code can be well tested and deploy flawlessly.
Hopefully these references help clarify some of the points I try to make about keeping code small, simple and disposable [5]. Problem solving is an intellectual activity, solving the right problem is key and doesn't require more code. More code requires more mental effort to understand, and in many cases it is easier to solve the next problem by building on top, and the cycle continues.
Solve the right problem and only the right problem; write less code that does more; solve problems with your brain and constantly ask "do we need this?".
Links:
[1] https://twitter.com/#!/richhickey
[2] http://www.infoq.com/presentations/Simple-Made-Easy
[3] http://blip.tv/clojure/hammock-driven-development-4475586
[4] http://www.cs.indiana.edu/~jsobel/c455-c511.updated.txt
[5] http://joelhughes.co.uk/code-life-cycle-every-line-is-technical-debt
Code Life Cycle: Every Line is Technical Debt
Programming in teams is hard: members favour their own code; ownership is important; once committed, code is hard to remove. Training new members is hard: they may not have confidence in their contributions; senior members may protect "their" work; all are afraid of pollution. It is desirable, but usually not practical, to have the following:
- Freedom to experiment and make mistakes while learning, improving and contributing to production.
- A small, flexible code-base (particularly vital in JavaScript).
- Easily learnt and maintained tools.
- Ability to keep maintenance interesting and evenly applied.
This is particularly applicable to teams working in languages like JavaScript, where many approaches and styles can be applied to a given problem. Less experienced members may fear making polluting mistakes, as will their teammates. And, despite their brilliance, most engineers would concede that the code they wrote today is superior to last year's/last week's/yesterday's equivalent. We all live with past mistakes; everything decays over time.
Even with the most detailed foresight and planning, over the lifetime of a code-base, technical debt grows and maintenance becomes increasingly expensive. It is easier to add than remove; large applications are harder to learn; complexity leads to mistakes and introduces bugs. More code, more complexity, more bugs, more maintenance (less flexibility), more training for new team members, more developers and so on. It may not be possible to eradicate these flaws in the development process but there may be ways to reduce, or slow, their effects.
Freshness: a Possible Solution
Practices such as TDD, XP, code review, lint and quality control all aim to minimise the ravages of large teams and time on projects. However, these do little to keep the size of the code-base down. Less code is easier to manage, read, learn and maintain (as long as it stays small). Just as a concise, well-formed sentence is easier to read and understand, a brief but poorly worded passage is easier to discuss and edit. New developers should be encouraged to get involved and feel comfortable learning via their mistakes.
In his article The Carrying-Cost of Code: Taking Lean Seriously, Michael Feathers makes the case for a 3 month lifetime imposed on every line of code (at which point it would vanish from the repository). When a team is forced to completely rewrite applications over and over, it becomes impractical to make the code-base larger, as there simply wouldn't be time to cover it all.
This approach may be heavy handed: there would be cases where perfectly good code would be rewritten, preventing work in more valuable areas. However, requirements change over time; a perfect solution last week may become less so as the project's focus shifts. These areas are also ideal for training: a developer can re-solve a well understood problem (backed up with API contracts and unit tests) while colleagues advise and support. Coupled with the easy branching that tools like Git provide, rewriting, reviewing and merging/rejecting are straightforward.
Code can be graded by its "last modified" or "last reviewed" date. Regular code reviews and the ability to monitor "freshness" would ensure all members of the team understand the mechanics of their project.
The more central the code the shorter the life-span. With JavaScript in particular, more code means poorer performance (whether it is time to download, execute or understand). The modules and functions that are called most frequently are those that need the deepest understanding. Conversely, repetitive "cookie cutter" functionality, typically found in areas such as views and templates, could be given longer to atrophy.
Reaching the end of a life-span doesn't necessarily mean it has to be rewritten. If the code is still solving the problem it set out to, in the most efficient and readable manner, then its life-span can be extended. The code can be held up as an example of good practice and made the subject of comparison and code review.
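To make the grading idea concrete, here is a minimal sketch, assuming each file carries a "last reviewed" date and its own life-span in days (shorter for central code, longer for "cookie cutter" views). The names gradeFreshness and lifeSpanDays, the file paths and the 50% "ageing" threshold are all assumptions made for illustration, not part of any existing tool.

```javascript
// Hypothetical sketch: grade files by how much of their life-span they have used.
var DAY_MS = 24 * 60 * 60 * 1000;

function gradeFreshness(files, now) {
  return files.map(function (file) {
    var ageDays = Math.floor((now - file.lastReviewed) / DAY_MS);
    var used = ageDays / file.lifeSpanDays;
    // Thresholds are arbitrary assumptions: under half the life-span is "fresh".
    var grade = used < 0.5 ? 'fresh' : (used < 1 ? 'ageing' : 'expired');
    return { path: file.path, ageDays: ageDays, grade: grade };
  });
}

// Example data (invented): central code gets a shorter life-span than a view.
var now = new Date('2012-01-31T00:00:00Z');
var report = gradeFreshness([
  { path: 'core/router.js', lastReviewed: new Date('2011-12-01T00:00:00Z'), lifeSpanDays: 30 },
  { path: 'views/list.js', lastReviewed: new Date('2011-12-01T00:00:00Z'), lifeSpanDays: 90 }
], now);
// Both files are 61 days old: the router has "expired", the view is only "ageing".
```

Extending a life-span after a successful review would then simply mean resetting lastReviewed for that file.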
Next Time: Implementation
Although the intention is to make the process more behavioural than technical, there would need to be a method of visualising the ageing process. This could easily be incorporated into a documentation tool such as selfDoc, and then utilised during code reviews. The next post will cover the tooling (improvements to selfDoc.js) and practice (based on experiences from a trial run inside a large team building JS applications).
In Favour of Test Driven Development
"It sounds very mechanical, but the effect is the exact opposite. What it does is free you to write. It liberates you to write."
John McPhee on having a system to write to.
"Art is freedom; and in art, as in life, there is no freedom without law."
Martin Amis on the importance of having rules and a code of conduct.
While referring to literature, these quotes resonate strongly with my feelings on writing software (JavaScript in particular). Imposing constraints on the process leads to greater clarity of thought. In writing: correct grammar, consistent style and brevity; in code: strict lint rules, clear design and, again, brevity.
Test Driven Development (TDD) is another self-imposed constraint. The following relates to a recent StackExchange Podcast, whose hosts are not enthusiastic about its supposed benefits and have evidence to back up their assertions.
"[...] a survey of all of the studies that have been done on TDD have shown that the better the study done, the weaker the signal as to its benefit."
A small concession is made that, for the majority of projects, TDD is poorly understood and executed. In which case I agree: used for its own sake or inappropriately, TDD has little value. For example, retrofitting unit tests misses the point entirely.
Used on functional, self-contained code (JavaScript, perhaps), TDD provides constraining benefits that free the developer to solve their problem. Thinking about and exploring a solution is never a waste of effort; writing a test is a chance to frame a problem with greater clarity. Each test further tightens focus on the task. Each test is a tangible step forward, a possible break point and a starting point for the next push.
TDD has little benefit when interacting with existing APIs (DOM interactions, UI or web services). Only code over which the developer has complete control (new, independent functions) can be discovered creatively. Otherwise the tests are simply a collection of role-played guesses. They do not clarify the author's thoughts and intentions; instead they can easily reinforce her misconceptions.
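As a small illustration of a test framing a problem, the sketch below writes down the expectations for a hypothetical clamp function first and the implementation second. Both the eq helper and the clamp name are assumptions made for this example; they are not taken from any library mentioned here.

```javascript
// Minimal assertion helper (an assumption for this sketch, not a real library).
function eq(actual, expected) {
  if (actual !== expected) {
    throw new Error('expected ' + expected + ' but got ' + actual);
  }
}

// Step 1: the tests state the problem before any implementation exists.
function testClamp(clamp) {
  eq(clamp(5, 0, 10), 5);   // inside the range: unchanged
  eq(clamp(-3, 0, 10), 0);  // below the range: raised to the minimum
  eq(clamp(42, 0, 10), 10); // above the range: lowered to the maximum
}

// Step 2: the smallest implementation that satisfies those tests.
function clamp(value, min, max) {
  return Math.min(Math.max(value, min), max);
}

testClamp(clamp); // throws if any expectation fails
```

Each eq line is one of the "tangible steps": the function is self-contained, so the tests describe everything it depends on.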
"The fact is that everything I've written is very soon going to be absolutely nothing - and I mean nothing."
John McPhee, again, on leaving a legacy.
If it isn't providing satisfaction or feels like a drain on progress then TDD is counterproductive. In the hands of a developer who enjoys the process it becomes a valuable aid.
selfDoc.js: JavaScript that Documents Itself
Thanks to its interpreter, JavaScript can coerce functions to strings. These strings can be used as documentation. selfDoc.js is a JavaScript function that takes a function or object literal (containing a JavaScript application built in a modular style) and returns an object that describes the application's API. This can be combined with a templating tool (lm.js, perhaps) to produce a dynamic HTML document.
var test = function () {
    // A function that does nothing
};
test.subFunction = function () {
    // Another useless function
};
selfDoc("Test", test, "A test application using selfDoc.js");
// Returns an object describing the app
/**
{
    appName: "Test",
    comment: ["A function that does nothing"],
    overview: "A test application using selfDoc.js",
    properties: [{
        name: "subFunction",
        comment: ["Another useless function"],
        implementation: "function () { ... }"
    }]
}
*/
The documentation is refreshed every time selfDoc is called. selfDoc can be kept alongside unit tests or demos, providing a dynamic set of documentation that is always up to date. Combine the resulting object with your favorite templating tool and you have dynamic HTML documentation. The comments are only extracted from the first continuous block (at the top of each function).
selfDoc is low on dependencies: a browser is the only requirement. selfDoc coerces the public/current-scope functions to strings and extracts their comments (Firefox/JaegerMonkey removes comments at run time, so functionality is hampered there).
There are numerous alternatives. The majority are designed to be added into a build process, such as Maven or Ant. Typically they are written in a language other than JavaScript, requiring a development environment with these tools/languages installed. These may be more appropriate to large projects with established build systems.
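The coercion step can be sketched roughly as follows. This is not selfDoc's actual source: leadingComments and describe are hypothetical names, only // comments are handled, and the search for the first { assumes a plain function expression with no braces in its parameter list.

```javascript
// Hypothetical sketch of the comment-extraction idea; not selfDoc's real code.
function leadingComments(fn) {
  var source = Function.prototype.toString.call(fn);
  // Skip past the opening brace of the function body.
  var lines = source.slice(source.indexOf('{') + 1).split('\n');
  var comments = [];
  var i, line;
  for (i = 0; i < lines.length; i += 1) {
    line = lines[i].trim();
    if (line === '' && comments.length === 0) {
      continue; // ignore blank lines before the first comment
    }
    if (line.indexOf('//') !== 0) {
      break; // the first continuous comment block has ended
    }
    comments.push(line.slice(2).trim());
  }
  return comments;
}

// Describe a function and any functions attached to it as properties.
function describe(name, fn) {
  return {
    name: name,
    comment: leadingComments(fn),
    properties: Object.keys(fn)
      .filter(function (key) { return typeof fn[key] === 'function'; })
      .map(function (key) { return describe(key, fn[key]); })
  };
}

var app = function () {
  // A function that does nothing
};
app.subFunction = function () {
  // Another useless function
};
var doc = describe('Test', app);
```

Because the description is built from the live function objects, it only works where the engine keeps comments in the function's string form, which is the limitation noted above.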
- jsdoc toolkit (java)
- PDoc (ruby)
- Natural Docs (?)
- YUI Doc (python)
- AjaxDoc (C#)
There are also alternatives with native JavaScript implementations; docco looks particularly tasty.
Introducing LM.JS - Write HTML in JavaScript using Less Markup
Producing HTML from JavaScript is cumbersome. You're either writing out the mark-up using strings, or generating DOM nodes directly. Templating engines reduce effort and abstract away some of the inconvenience, but they have shortcomings when mixing logic with the generation of HTML/DOM (whether or not logic should be in a templating engine is another issue).
To get to the point: I saw :vana-templating, an elegant and expressive tool for Common Lisp. I wanted a JS version and built it. LM.JS is on GitHub for your perusal.
Using LM.JS
The aim was to be able to produce a :vana-like data structure that could represent HTML. JS Arrays are most like Lisp's lists, objects make good key-value stores for attributes, and nesting is handled by nesting the arrays. Recurse the array and evaluate each portion to a DOM node. There is no string parse step, only a small regular expression and little need for strict convention.
lm(['ul',
    ['li',
        ['p', 'one']],
    ['li',
        ['p', 'two']]]);
// Returns this DOM tree:
// <ul>
//     <li>
//         <p>one</p>
//     </li>
//     <li>
//         <p>two</p>
//     </li>
// </ul>
Add some attributes and nest within text:
lm(['p', 'Some text ', ['em', 'emphasised'], ' and a ', ['a', {href: '#nest'}, 'link']]);
// <p>Some text <em>emphasised</em> and a <a href="#nest">link</a></p>
What about logic? As the Array is JS, simply build the array using JavaScript's logic; there is no need to implement a mini-language.
lm(['ul', (function () {
    var rtn = [], i;
    for (i = 0; i < 10; i += 1) {
        rtn.push(['li', (i + ' item')]);
    }
    return rtn;
}())]);
// <ul>
//     <li>0 item</li>...
lm.js differs from templating solutions as JavaScript logic can be used in-line.
lm.render(function (obj) {
    return lm(['p', obj.text]);
}, {text: 'hello'});
// <p>hello</p>
Performance
The aim of this version was to gain the expressive markup and feel of vana in JavaScript. While performance is an issue, it was not the sole aim to be fast at the expense of ease of use.
This templating performance test on JSPerf has LM.JS coming out in last place, 1% behind mustache.js. In the coming weeks I aim to place LM.JS higher up the list. However, the techniques used to satisfy this particular test may not be desirable in the long run. Using more intelligent caching and priming those caches are techniques that could be employed. On the other hand, creating a cache of each element, or element group, has its own overhead in execution speed and memory. "Cold start" performance matters too and memory is a major factor on mobile.
One improvement may be to use string concatenation with innerHTML as opposed to the current DOM elements attached to document fragments (a technique covered in Nicholas Zakas's Google tech talk).
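A minimal sketch of that string-concatenation idea is below. It is not LM.JS code: toHtml, escapeHtml and isPlainObject are hypothetical names, and treating an array that does not start with a tag name as a list of children is an assumption based on the loop example above. Void elements such as br are not special-cased.

```javascript
// Hypothetical sketch: render an LM.JS-style array to an HTML string.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

function toHtml(node) {
  // Strings and numbers become escaped text.
  if (!Array.isArray(node)) {
    return escapeHtml(node);
  }
  // An array that does not start with a tag name is treated as a list of children.
  if (typeof node[0] !== 'string') {
    return node.map(toHtml).join('');
  }
  var tag = node[0];
  var rest = node.slice(1);
  var attrs = '';
  // An object directly after the tag name holds the attributes.
  if (isPlainObject(rest[0])) {
    var attrObj = rest.shift();
    attrs = Object.keys(attrObj).map(function (key) {
      return ' ' + key + '="' + escapeHtml(attrObj[key]) + '"';
    }).join('');
  }
  return '<' + tag + attrs + '>' + rest.map(toHtml).join('') + '</' + tag + '>';
}

var html = toHtml(['p', 'Some text ', ['em', 'emphasised'], ' and a ', ['a', {href: '#nest'}, 'link']]);
// html can then be assigned to an element's innerHTML in a browser.
```

Since the sketch never touches the DOM it would also run outside a browser, at the cost of having to escape text by hand.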
LM.JS - The pros
- Some of the style from :vana in JavaScript
- No string parsing, lexing or heavy validation needed
- Uses native JS Arrays as core data type
- Provides a very flexible way to deal with DOM generation
- Requires fewer characters to produce well formed HTML, smaller JS payloads too
- Can be used as a Markdown-like tool for quickly producing HTML
LM.JS - The cons
- Slower than a templating engine, although not by a lot and dependent on the benchmarking rules
- JS is not as syntactically suited to list processing as Lisp, vana is more elegant
- Templates are tightly bound to JS, it would be hard to replicate the exact logic in another environment
- Only works in browser, e.g. node.js would require a rewrite using string concatenation
Next Steps
The next steps are to build up LM.JS to cover more traditional templating functionality and to move up the performance table at JSPerf. Just how far up is a difficult question; optimisation could be handled by the user on a case-by-case basis.
Tools for Reading, wRiting and aRithmetic
The 3 Rs are the basis of education. So fundamental are these skills I inevitably use, improve and enjoy them daily. The tools needed to perform them vary, but I currently favour the following:
Reading: Amazon Kindle
Like the portable MP3 player's effect on music collections, portability has changed the way I consume writing. I often have three books on the go (programming for work, something educational and a novel), I read as I walk to work and I have the Kindle within reach pretty much the entire day. I still have plenty of physical material, but having my entire library at hand is a real luxury. I'd be very surprised if the momentum of e-reading doesn't equal that of portable audio soon. The Kindle is simple, light and lacks the distractions offered by more complex devices. Mix in some Slow-Media philosophy: using long-form reading tools like Instapaper and custom Hpricot scraping scripts, you can read web-content away from the distractions of a PC.
wRiting: Squared Whitelines Notepads (1 A4 Wire Bound, 1 Pocket), Pencils and a Kum Automatic Longpoint Sharpener
Notepad and pencil are a simple, effective, low-tech way to record ideas. Electronics are fragile (Kindle included), transient and, comparatively, hamstrung by their inputs. Tablets may be improving, but drawing and writing are elegantly straightforward and tactile the old way.
I like to have a slim pocket notebook to hand at all times for quick use. Moleskine's Cahier and Field Notes are great but, to me, Whitelines are the best match with an HB pencil for readability. An A4 pad is essential for prolonged note making. I prefer squared paper as it suits diagrams, notes and equations, and can be used in any direction.
I'm still experimenting with pencils (recommendations welcome), although most HBs will satisfy (I know many readers will have favourite pens, but there is a unique charm to the humble wood and graphite pencil). It is the sharpener that I think makes the largest difference. Thanks to Brendan Dawes, I've discovered that Kum's Automatic Longpoint Sharpener is a cut above any other I've used. Seriously, you may scoff but it's properly cool.
aRithmetic: HP-12c Finance Calculator
My wife rolled her eyes at the discovery of my HP-12c finance calculator. It looks dated, the reverse-Polish input is hard for the uninitiated to operate and it's expensive (compared with bog-standard models). But this calculator is a classic, a design that has barely evolved in the last 30 years. It hasn't changed because it doesn't need to. Scott Locklin gives a better account than I can, and the mechanical Curta could be a worthy contender, if I could get my hands on one.
These tools all have a certain low-tech appeal, even the Kindle. They feel like a gentle antidote to the hectic computing environment in which I spend most of each day. Although the HP-12c and Kindle are digital, and require power, they do not flicker, beep or emit light - and are, largely, dependency free (provided the Kindle is amply stocked). In combination, I feel that they could provide a life-time's worth of insight and enjoyment. Throw in a reliable solar charger and these tools could make a desert island a very enjoyable place to be marooned.
Functional, TDD JavaScript (influenced by Haskell, Lisp, Erlang...)
Having developed a taste for Functional Programming (FP), I've found that there are many aspects that make building software in a TDD style easier. Functions are the basis of FP, and a function that takes arguments and returns values is easy to test. If this function is side effect free (i.e. it doesn't affect the program outside its internal scope, and neither influences nor is influenced by state), you can be confident that it will always work the way your tests expect.
// SIDE EFFECT FREE
var fAddOne = function (num) {
    return (num + 1);
};
test("fAddOne", function () {
    eq(fAddOne(0), 1);
    eq(fAddOne(1), 2);
});
// SIDE EFFECT DEPENDENT
var obj = {
    num: 0,
    pAddOne: function () {
        this.num += 1;
    }
};
test("pAddOne", function () {
    obj.num = 1; // the test has to set up the state first
    obj.pAddOne();
    eq(obj.num, 2);
});
In the above examples, fAddOne will work anywhere within the app; pAddOne is side-effect based and needs to be called within the scope of an object with a num property. If a refactor is needed, fAddOne and its tests can move around or change applications. pAddOne has some requirements that unit tests don't describe as easily, so refactoring will be a little trickier. In effect pAddOne's tests are testing side effects, not functionality.
However, the example oversimplifies the problem. In reality side effects are essential; I/O can't be avoided. In (browser-based) JavaScript this is usually in the form of DOM API interaction. In GB.js I attempt to keep a CYOA/GameBook engine independent of side effects; in the demo, DOM building and events are kept to a minimum and try not to overlap. This is fine for individuals working to their own requirements. Teams have different problems: in JS, side effects are easy and the syntax encourages them, so in most cases it's easier to just cave in. It may even get work done faster (at first). But from a unit testing perspective quality drops or, at least, refactoring becomes trickier.
FP is a tool from which industry could gain more value, as it already does by adopting its features and principles (closures, currying, recursion etc.). The problem for me is blending Object Oriented (OO) and FP styles with TDD. Refactoring and reuse are important, and when a shortcut is made with OO then quality can suffer.
Another gain with FP can be shorter code, but when using TDD with loosely typed languages (JS, Erlang, Lisp etc.) type checking causes length to creep up. While rewriting some of the exercises from "The Little Schemer" (with TDD JS) it became apparent that if you want high confidence in the unit tests then a lot of type checking happens. This is why I have a set of type checking functions that I use constantly. So, if I'm type checking a lot, am I just re-inserting the type safety of a strongly typed language?
Looking at other FP languages, Haskell currently satisfies me the most. Separating side effects into Monads (I'm still in the process of learning this concept) and using strong typing (and a compiler) feels like it provides real confidence in quality. In fact, by having to decide types in the function definitions there is no need to have tests to cover type safety. So the extra "boiler plate" (that isn't even required by the compiler) can reduce the overall lines of code typed.
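One way to picture keeping an engine independent of side effects is sketched below. It does not reflect GB.js's actual API: the book data, choose and render are all invented for illustration, and the output function is passed in so that a test can supply a stub where a browser would supply a DOM writer.

```javascript
// Hypothetical gamebook data and function names; not GB.js's real API.
var book = {
  start: { text: 'You wake in a cave.', choices: { north: 'river', south: 'forest' } },
  river: { text: 'A river blocks the way.', choices: {} },
  forest: { text: 'The trees close in.', choices: {} }
};

// Pure: the next state depends only on the arguments, and the old state is untouched.
function choose(book, state, choice) {
  var target = book[state.page].choices[choice];
  if (!target) {
    return state; // unknown choice: stay on the same page
  }
  return { page: target, visited: state.visited.concat(target) };
}

// Side effect: the only place that writes to the outside world.
function render(book, state, write) {
  write(book[state.page].text);
}

var first = { page: 'start', visited: ['start'] };
var second = choose(book, first, 'north');

// In a test, "write" is a stub that records output instead of touching the DOM.
var output = [];
render(book, second, function (text) { output.push(text); });
```

The unit tests for choose need no set-up beyond their arguments, while render stays thin enough that there is little left in it to get wrong.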
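For a sense of what such helpers look like, here is a minimal sketch. These are not my actual functions: typeOf, isNumber, isList and addOne are hypothetical names chosen for the example.

```javascript
// Hypothetical type-checking helpers, for illustration only.
function typeOf(value) {
  // Object.prototype.toString yields "[object Array]", "[object Null]" and so on.
  return Object.prototype.toString.call(value).slice(8, -1).toLowerCase();
}

function isNumber(value) {
  return typeOf(value) === 'number' && !isNaN(value);
}

function isList(value) {
  return typeOf(value) === 'array';
}

// The guard adds lines that a statically typed language would not need.
function addOne(num) {
  if (!isNumber(num)) {
    throw new TypeError('addOne expects a number, got ' + typeOf(num));
  }
  return num + 1;
}
```

Every guarded function then needs at least one extra test for the rejected case, which is where the length creeps up.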
I'm hoping that by delving deeper into Haskell's approach I can get a clearer steer on how to construct functional JavaScript applications.
Moved (to Posterous). Giving away GoodBaad
It's been a long time since my last post, so to give myself a bit of a boost I've decided to move the blog to Posterous (sorry if this causes a flurry of reposts in your RSS reader). This should make posting a little easier and means I can get rid of my VPS server and save a few pennies.
I've also decided that I need an outlet for the code I produce out of working hours. Therefore I've started a GitHub account so that others can benefit from the work I've done.
The first chunk of code is my GoodBaad web-app, which was started around 2 years ago. It was a lot of fun and was a chance to try my hand at building a scalable PHP site from scratch (years of working with frameworks like CakePHP left me feeling that I needed to understand the process of building one myself). Although it's a bit old now, I think I learned a great deal about being economical with code and building small flexible units that were short, easy to understand and ran quickly. Some rough benchmarks on my workstation led me to think that this stack could out-perform the equivalent CakePHP stack by around 10 times. These gains were made without any data abstraction (I preferred raw SQL) or a templating engine (PHP is a templating engine if you ask me), and the app didn't need page caching as it was pretty dynamic (and you could always use memcached if it became a requirement).
The core MVC was loosely based on ideas described in Rasmus Lerdorf's no-framework PHP MVC framework blog post. I agreed that many frameworks simply bring too much code to an application, and for me this hid the most interesting parts of writing code. So, the result was an MVC app that sits somewhere between the no-framework stack and something larger like Cake.
So, if you'd like to run your own version of GoodBaad or need a lightweight PHP MVC stack you can take a look at mine for inspiration:
That's all for now, I'll be posting some functional programming stuff shortly (expect plenty of Lisp/Scheme/Clojure and JavaScript!).
Vim is a great text editor
Vim is a text editor (evolved from Vi) that originated on the Amiga. It started out as a command-line-only tool in which all text editing was carried out via keystrokes rather than via mouse and GUI. I first came across Vim on Linux servers; it was confusing and I struggled with the simplest editing tasks.
Today Vim has become my primary text editor. I love it. I use it on Windows and Linux as a standalone app and embedded into the Eclipse IDE.
What makes it so good? In short, once you have mastered a few basic commands (similar to shortcuts) you find yourself working with text in a more efficient/involved way. Your hands rarely leave the keyboard, you can complete complicated tasks in fewer steps resulting in a greater feeling of involvement with the code. The physical task of writing code hasn't changed a great deal over the years and Vim has been refined, over the last 18 years, to suit this task. Basically: "it doesn't get in the way".
If you want to get to know Vim a little better, here are a few links that I have found really helpful:
- Find the appropriate version of Vim for your OS and install.
- Start learning the ropes with these tutorials:
- Vim is very customisable; the settings are stored in the vimrc file, which resides in the install directory (C:\Program Files\Vim\_vimrc on my version of Windows). These links helped me out:
There may be a steep learning curve but the rewards are well worth it!

