Logan McGrath's Blog

Lessons From Sterling

I’ve spent the last seven months developing a language called Sterling. Sterling was intended to be an untyped functional scripting language, something like lazily-evaluated, immutable JavaScript. Last week I decided to shelve Sterling.

How Sterling Worked

Sterling’s evaluation model is very simple and I felt it held a lot of promise because it made the language very flexible. Everything in Sterling is an expression. Some expressions accept a single argument—these were called lambdas. All expressions also contain sub-expressions, which could be accessed as attributes. With a little sugar, a bag of attributes could be made self-referencing and thus become an object.

An assortment of basic expression types
// a constant expression which takes no arguments
anExpression = 2 + 2

// lambda expressions take only 1 argument
aLambda = (x) -> 2 + x

// function expressions take more than 1 argument
aFunction = (x y) -> x * y

// an object expression with constructor
anObject = (constructorArg) -> object {
    madeWith: constructorArg,
}

// an object expression that behaves like a lambda once constructed
invokableObject = (constructorArg) -> object {
    madeWith: constructorArg,
    invoke: (arg) -> "Made with #{self.madeWith} and invoked with #{arg}",
}

Expressions could be built up to carry a great deal of capability. Because Sterling is untyped, decoration and duck typing are used heavily to compose ever more features into expressions.

Sterling was directly inspired by Lambda Calculus. This had an enormous impact on the design of the language, the largest of which was how the language executed at runtime. Expressions in Sterling are represented as trees and leaves. Top-level expressions have names, and they could be inserted into other expressions by referencing those names.

A recursive named expression looks like this:
fibonacci = (n) -> if n <= 1 then
                       n
                   else
                       fibonacci (n - 1) + fibonacci (n - 2)
                   end

Because each expression was a tree, no expression needed to be executed until its result was absolutely needed. This lazy execution model allows for very large, complex expressions to be built in one function then returned to the outside world to be further processed and executed. Functions could be created inline and passed as arguments to other functions, or constructed within functions and returned.
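
To make the lazy model concrete, here is a minimal sketch in Python rather than Sterling. The `Thunk` class and its `force` method are hypothetical names chosen for illustration, not Sterling's actual implementation. An expression is wrapped in a deferred computation that does no work until its result is demanded:

```python
class Thunk:
    """A deferred computation: built now, evaluated only when forced."""

    def __init__(self, compute):
        self._compute = compute  # zero-argument callable

    def force(self):
        # Evaluate on demand; nothing runs before this call.
        return self._compute()


evaluated = []  # records which arguments were actually evaluated


def expensive(n):
    evaluated.append(n)
    return n * n


# Building the expression does no work yet.
expr = Thunk(lambda: expensive(12) + 1)
assert evaluated == []

# Work happens only when the result is demanded.
result = expr.force()
```

Note that this sketch re-runs the computation on every `force` call, which hints at why a lazy language ends up wanting memoization.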

Sterling’s tree-based structure naturally supported a prototype-based object model. To modify an expression tree, the tree needed to create a copy of itself with any changes applied. All expressions, thus, were effectively prototypes. This also had the benefit of directly supporting immutability and helped to enforce a functional programming paradigm.
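
A rough Python analogy of this copy-on-change idea, assuming an object is just a read-only bag of attributes. The `extend` helper is a hypothetical name, and Sterling's real trees were richer than a flat mapping:

```python
from types import MappingProxyType


def extend(prototype, **changes):
    """Return a new read-only object derived from `prototype`; the original is untouched."""
    return MappingProxyType({**prototype, **changes})


base = MappingProxyType({"name": "thing", "size": 1})
bigger = extend(base, size=2)  # a copy with one attribute changed
```

Every "modification" yields a new value, so any existing object can serve as the prototype for another.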

What Could Have Been

I intended Sterling to be a functional scripting language. In some ways, I was looking to create a JavaScript reboot that clung closer to JavaScript’s functional roots and would be used for general-purpose scripting.

Sterling’s syntax was designed to be very terse, readable, and orthogonal. By that I mean everything in Sterling should be an expression that can be used virtually anywhere for anything. Because Sterling was based on lambdas, this worked particularly well for argument expressions, because arguments could fold into the function call result on the left:

Consing a list by folding arguments, left-to-right
[] 1 2 3 4
> [1] 2 3 4
> [1, 2] 3 4
> [1, 2, 3] 4
> [1, 2, 3, 4]
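
The same left-to-right folding can be approximated in Python with a callable object. This is only a sketch: `ConsList` is a hypothetical name, and Python needs explicit call parentheses where Sterling used whitespace:

```python
class ConsList:
    """Immutable list that, applied to an argument, returns a new list with it appended."""

    def __init__(self, items=()):
        self.items = tuple(items)

    def __call__(self, arg):
        # Applying an argument never mutates; it yields a new list.
        return ConsList(self.items + (arg,))

    def __repr__(self):
        return repr(list(self.items))


empty = ConsList()
built = empty(1)(2)(3)(4)  # each call folds one argument into the result on its left
```

Each application returns something that can itself be applied, which is what lets the arguments chain.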

This folding capability meant that Sterling could support very expressive programming styles. Any function could be returned as the result of another function call and continue chaining against arguments. Sterling’s terse syntax also made defining functions very easy:

Some basic functions in Sterling
identity = (x) -> x
selfApply = (x) -> x x
apply = (x y) -> x y
selectFirst = (x y) -> x
selectSecond = (x y) -> y
conditional = (condition) -> if condition.true? then selectFirst else selectSecond end
friday? = say $ conditional (today.is :friday) 'Yay Friday!' 'Awww...'

Because Sterling was intended to be immutable, objects would be used to represent state and carry behavior to return new state resulting from an operation:

Printing arguments from an immutable list iterator
main = (args) ->
    print args.iterator // gets an Iterator

print = (iterator) ->
    say unless iterator.empty? then
        printNext iterator 0
    else
        'Empty iterator'
    end

printNext = (iterator index) ->
    unless iterator.empty? then
        "arg #{index} => #{iterator.current}\n" + printNext iterator.tail index.up
    end

Iterator = (elements position) -> object {
    empty?: position >= elements.length,
    head: Iterator elements 0,
    current: elements[position],
    tail: Iterator elements position.up,
}

Paul Hammant at one point suggested baking dependency injection directly into a language, and even proposed that I do this in Sterling. This drove development of a metadata system in Sterling that could be used to support metaprogramming and eventually dependency injection.

Meta attributes on expressions
@component { uses: [ :productionDb ] }
@useWhen (runtime -> runtime.env is :production)
Inventory = (db) -> object {
    numberOfItems: db.asInt $ db.scalarQuery "SELECT COUNT(*) FROM thingies",
    priceCheck: (thingy) -> db.asMoney $ db.scalarQuery "SELECT price FROM thingies WHERE id = :id" { id: thingy.id },
}

@provides :productionDb
createDb = ...

@fake? true
@component { name: :Inventory }
@useWhen (runtime -> runtime.env is :development)
FakeInventory = object {
    numberOfItems: 0,
    priceCheck: (thingy) -> thingy.price,
}

The metadata system was very flexible and could support arbitrary meta annotations. The above metadata translates to the following map structures at runtime:

What meta attributes look like if they were JavaScript
Inventory.meta = {
    "component": {
        "uses": [ "productionDb" ]
    },
    "useWhen": {
        "value": function (runtime) {
            return runtime["env"] == "production";
        }
    }
};

createDb.meta = {
    "provides": {
        "value": "productionDb",
    }
};

FakeInventory.meta = {
    "fake?": {
        "value": true
    },
    "component": {
        "name": "Inventory"
    },
    "useWhen": {
        "value": function (runtime) {
            return runtime["env"] == "development";
        }
    }
};
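
As a sketch of how such metadata might drive component selection at load time, here is a Python approximation. The `meta` decorator and `select_component` function are hypothetical, and both components are given a `name` here for simplicity, which differs slightly from the annotations above:

```python
def meta(**attributes):
    """Attach arbitrary metadata to a function as a `meta` dict (hypothetical helper)."""
    def decorate(target):
        target.meta = {**getattr(target, "meta", {}), **attributes}
        return target
    return decorate


@meta(component={"name": "Inventory"},
      use_when=lambda runtime: runtime["env"] == "production")
def inventory():
    return "real inventory"


@meta(component={"name": "Inventory"},
      use_when=lambda runtime: runtime["env"] == "development",
      fake=True)
def fake_inventory():
    return "fake inventory"


def select_component(name, candidates, runtime):
    """Return the first candidate named `name` whose use_when predicate accepts the runtime."""
    for candidate in candidates:
        info = candidate.meta
        if info["component"]["name"] == name and info["use_when"](runtime):
            return candidate
    raise LookupError(f"no component named {name!r} for {runtime}")


chosen = select_component("Inventory", [inventory, fake_inventory], {"env": "development"})
```

The point is only that annotations become plain data that a loader can inspect before deciding what to wire together.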

I felt these functional features and expressive syntax would make for an enjoyable and productive programming experience. The meta system in particular I felt could become quite powerful especially for customizing load-time behavior of Sterling programs. However, some of my goals came with a few problems.

The Problems

Speed

Sterling is amazingly slow. A natural consequence of a tree-based language is that trees must be copied and modified for many operations, no matter how “trivial” they may be (integer arithmetic, for example). Recursive functions like the fibonacci expression above had a particularly nasty characteristic of building enormous trees that took a lot of time to reduce to single values.

The speed issues in Sterling were partially mitigated using memoization.

Memoization: Blessing But Possibly A Curse

Memoization increased the possibility for static state to hang around in an application. Applying arguments to an object constructor, for instance, would return a previously-constructed object. I’m not entirely sure what the total impact of the “object constructor problem” could have been, as objects are not mutable, but I didn’t like this characteristic nonetheless. Immutability, however, wasn’t entirely true (see “Escaping The Matrix” below).

Named expressions are persistent in memory. If a named expression took a large argument, or returned a large result, then the total memory cost of a memoizing expression could become quite high over time.

The Impacts Of Typelessness

Types are actually quite nice to have, and I began to miss them quite a bit the more I worked on Sterling. While Sterling is very flexible (because it has no types) it also has very poor support for polymorphism (because it has no types). Want to do something else if you receive an Asteroid object rather than a Spaceship object?

The naïve solution is to implement an if-case for each expected type:

Spaceship = object {
    collideWith: (other) ->
        if other.meta.name is 'Asteroid' then
            say 'Spaceship collided with an asteroid!'
        else if other.meta.name is 'Spaceship' then
            say 'Spaceships collide!'
        end
}

Asteroid = object {
    collideWith: (other) ->
        if other.meta.name is 'Asteroid' then
            say 'Asteroids collide!'
        else if other.meta.name is 'Spaceship' then
            say 'Asteroid collided with a spaceship!'
        end
}

This is fragile, though, and the code is complex. What’s worse, there’s no way to ensure that a method is receiving an Asteroid and not another object that simply implements its API. A better solution is to let the colliding object select the proper method from the object it’s colliding with:

Spaceship = object {
    collideWith: (other) -> other.collideWithSpaceship self,
    collideWithSpaceship: (spaceship) -> say 'Spaceships collide!',
    collideWithAsteroid: (asteroid) -> say 'Spaceship collided with an asteroid!',
}

Asteroid = object {
    collideWith: (other) -> other.collideWithAsteroid self,
    collideWithSpaceship: (spaceship) -> say 'Asteroid collided with a spaceship!',
    collideWithAsteroid: (asteroid) -> say 'Asteroids collide!',
}

This solution is better. It’s also similar to implementing the visitor pattern in Java. I still don’t like it because there’s no type safety and adding support for more types requires violating the open/closed principle. For instance, in order for a Bunny to be correctly collided-with, a collideWithBunny method must be added to both Spaceship and Asteroid. Developers may find it easier instead to allow the Bunny to masquerade as an asteroid:

Spaceship-eating Bunny
Bunny = object {
    collideWith: (other) -> other.collideWithAsteroid self, // muahaha I'm an asteroid!
    collideWithSpaceship: (spaceship) -> say 'NOM NOM NOM NOM!',
    collideWithAsteroid: (asteroid) -> ...
}

This single-dispatch behavior means that for any argument applied to a method name, the same method will be dispatched. In the case of Java, this is determined by the type of a method’s arguments at compile time. Adding new methods for similarly-typed arguments requires all client code be recompiled. While Sterling may not have typing, it is still single-dispatch.

The lack of types became particularly painful when implementing arithmetic operations, and compile-time analysis was nearly impossible without collecting a great deal of superfluous metadata.

Escaping The Matrix

As I worked on Sterling, I required functionality that wasn’t yet directly supportable in the language itself. I solved this problem using the “glue” expression that could tie into a Java-based expression:

sterling/collection/_base.ag
EmptyIterator = glue 'sterling.lang.builtin.EmptyIterator'
List = glue 'sterling.lang.builtin.ListConstructor'
Set = glue 'sterling.lang.builtin.SetConstructor'
Tuple = glue 'sterling.lang.builtin.TupleConstructor'
Map = glue 'sterling.lang.builtin.MapConstructor'

For short-term problems, this option isn’t too bad, but it allows the programmer to escape the immutable “Matrix” of Sterling. For example, I implemented Sterling’s collections as thin wrappers around Java collections, and allowed them to be mutable. Actually, a lot of things in Sterling were mutable:

  • Method collections on expressions
  • Object methods
  • Maps
  • Lists

This, coupled with memoization, could cause a lot of issues with static state and had the potential to enable a lot of bad design decisions for programs written in Sterling.

The Good Parts

Despite the baggage, there are a few takeaways!

Sterling’s syntax is very small and terse. I particularly enjoyed not having to type a lot of parentheses, braces, commas, and semicolons. Separating arguments by spaces allowed the language to read like a book.

Most expressions can be delimited with whitespace alone, and because everything is an expression, objects could be created inline and if-cases could be used as arguments.

Operators are just methods. Any object or expression can define a “+” operator and customize what it does. With polymorphism supported with multi-methods, this can become an incredibly powerful feature.

Sterling also has the ability to define arbitrary metadata on any named expression. This metadata is gathered into a meta attribute and can be inspected at runtime to support a sort of meta programming.

What I’m Carrying Forward

I’m now working on a new language project that will be borrowing Sterling’s syntax. This time, however, I will be using types. Algebraic data types hold a certain fascination for me, and I’m interested in seeing what I can do with them. At the very least, I do intend on using multi-methods for better polymorphism support.
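
For contrast with the double-dispatch workaround shown earlier, here is a minimal multi-method sketch in Python. The registry, the `collision` decorator, and the `collide` function are hypothetical names, and the lookup matches exact runtime types only (no inheritance), so it is an illustration rather than a design for the new language:

```python
class Spaceship: pass
class Asteroid: pass
class Bunny: pass


_handlers = {}  # maps (left type, right type) to a handler function


def collision(left_type, right_type):
    """Register a handler for a pair of argument types."""
    def register(func):
        _handlers[(left_type, right_type)] = func
        return func
    return register


def collide(left, right):
    """Dispatch on the runtime types of both arguments."""
    handler = _handlers.get((type(left), type(right)))
    if handler is None:
        raise TypeError(
            f"no collision defined for {type(left).__name__} and {type(right).__name__}")
    return handler(left, right)


@collision(Spaceship, Asteroid)
def _(ship, rock):
    return "Spaceship collided with an asteroid!"


@collision(Spaceship, Spaceship)
def _(first, second):
    return "Spaceships collide!"


# Adding Bunny requires no change to Spaceship or Asteroid.
@collision(Bunny, Spaceship)
def _(bunny, ship):
    return "NOM NOM NOM NOM!"
```

Because handlers live outside the objects, supporting a new type means registering new pairs instead of editing every existing type.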

I don’t think I like declaring scope. It’s verbose. Or declaring types. That should be restricted to places where it impacts execution, like function signatures.

While Sterling’s meta system didn’t really go anywhere, I do intend on carrying it forward as a supplement to algebraic types. I may even still bake in dependency injection because I hate all the typing required to tie together an application.

I don’t believe I will carry forward mandatory immutability, though I may support some form of “immutability by default”.

Sterling’s lazy evaluation caused headaches more than a few times. I’ll probably not make any successor language lazily evaluated because memoization becomes a near requirement in order to make lazy evaluation useful.

My Holy Grail

  • A language that is interpreted and optionally compiled either AOT or JIT
  • Inferred typing as opposed to nominal typing
  • At least pseudo-declarative
  • Dynamic to some degree
  • Easy to write, easy to read
  • Highly composable
  • Simple closures
  • First-class functions, if not first-class everything

Sterling With Memoization

In my last post I wrote about performance in the Sterling programming language with a basic benchmark. Today I’m ticking off one @TODO item: Memoization.

Sterling now stores the results of each function/argument pair, returning respective results rather than forcing a recalculation of an already-known value. I’ve leveraged the benchmark from the previous post, and the difference in execution speed is very pronounced:

The Results
Java Benchmark
--------------
Iteration 0: executions = 100; elapsed = 6 milliseconds
Iteration 1: executions = 100; elapsed = 4 milliseconds
Iteration 2: executions = 100; elapsed = 4 milliseconds
Iteration 3: executions = 100; elapsed = 4 milliseconds
Iteration 4: executions = 100; elapsed = 4 milliseconds
Iteration 5: executions = 100; elapsed = 4 milliseconds
Iteration 6: executions = 100; elapsed = 4 milliseconds
Iteration 7: executions = 100; elapsed = 4 milliseconds
Iteration 8: executions = 100; elapsed = 4 milliseconds
Iteration 9: executions = 100; elapsed = 4 milliseconds
--------------
Average for 10 iterations X 100 executions: 4 milliseconds

Sterling Benchmark
------------------
Iteration 0: executions = 100; elapsed = 648 milliseconds
Iteration 1: executions = 100; elapsed = 0 milliseconds
Iteration 2: executions = 100; elapsed = 1 milliseconds
Iteration 3: executions = 100; elapsed = 0 milliseconds
Iteration 4: executions = 100; elapsed = 0 milliseconds
Iteration 5: executions = 100; elapsed = 0 milliseconds
Iteration 6: executions = 100; elapsed = 0 milliseconds
Iteration 7: executions = 100; elapsed = 0 milliseconds
Iteration 8: executions = 100; elapsed = 0 milliseconds
Iteration 9: executions = 100; elapsed = 0 milliseconds
------------------
Average for 10 iterations X 100 executions: 64 milliseconds

Without memoization, Sterling required on average 0.079 seconds (7,923 milliseconds per 100 executions) to calculate the 20th member of the Fibonacci sequence; with memoization, the average shrinks to roughly 0.0006 seconds (64 milliseconds per 100 executions). The time penalty only applies the first time the function is executed for a given argument, so subsequent calls become near-instantaneous.
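
The effect can be reproduced with a small Python sketch. The `memoize` decorator below is an illustration of caching per function/argument pair, not Sterling's implementation, and the `misses` counter is added only to make the behavior observable:

```python
def memoize(func):
    """Cache results per argument so each distinct argument is computed once."""
    cache = {}

    def wrapper(n):
        if n not in cache:
            wrapper.misses += 1  # count real computations only
            cache[n] = func(n)
        return cache[n]

    wrapper.cache = cache
    wrapper.misses = 0
    return wrapper


@memoize
def fibonacci(n):
    return n if n <= 1 else fibonacci(n - 1) + fibonacci(n - 2)


first = fibonacci(20)            # pays the full cost once
misses_after_first = fibonacci.misses
second = fibonacci(20)           # answered from the cache
misses_after_second = fibonacci.misses
```

The first call computes each of the 21 distinct arguments (0 through 20) once; the second call computes nothing. The cache also keeps all 21 entries alive, which is the memory cost of the speedup.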

Sterling is faster than Java!

Not really. But it is if I fiddle with the benchmark variables a bit (:

By changing the benchmark to execute the Fibonacci function 1000 times for 100 iterations, something interesting happens:

Fiddling with the benchmark
Java Benchmark
--------------
Iteration 0: executions = 1000; elapsed = 42 milliseconds
Iteration 1: executions = 1000; elapsed = 39 milliseconds
Iteration 2: executions = 1000; elapsed = 38 milliseconds
Iteration 3: executions = 1000; elapsed = 39 milliseconds
Iteration 4: executions = 1000; elapsed = 39 milliseconds
Iteration 5: executions = 1000; elapsed = 39 milliseconds
Iteration 6: executions = 1000; elapsed = 41 milliseconds
Iteration 7: executions = 1000; elapsed = 40 milliseconds
Iteration 8: executions = 1000; elapsed = 38 milliseconds
Iteration 9: executions = 1000; elapsed = 38 milliseconds
...
Iteration 99: executions = 1000; elapsed = 39 milliseconds
--------------
Average for 100 iterations X 1000 executions: 39 milliseconds

Sterling Benchmark
------------------
Iteration 0: executions = 1000; elapsed = 629 milliseconds
Iteration 1: executions = 1000; elapsed = 0 milliseconds
Iteration 2: executions = 1000; elapsed = 0 milliseconds
Iteration 3: executions = 1000; elapsed = 0 milliseconds
Iteration 4: executions = 1000; elapsed = 0 milliseconds
Iteration 5: executions = 1000; elapsed = 0 milliseconds
Iteration 6: executions = 1000; elapsed = 0 milliseconds
Iteration 7: executions = 1000; elapsed = 0 milliseconds
Iteration 8: executions = 1000; elapsed = 1 milliseconds
Iteration 9: executions = 1000; elapsed = 0 milliseconds
...
Iteration 99: executions = 1000; elapsed = 0 milliseconds
------------------
Average for 100 iterations X 1000 executions: 6 milliseconds

This benchmark smells funny

Yes, the performance in this benchmark is very contrived. But this does present an interesting potential property of applications written in Sterling: If an application performs a great deal of repeated calculations, it will run faster over time. A quick glance at the second benchmark will show that Java is performing the calculation every single time it is called, whereas Sterling only requires the first call and then it stores the result. For n repeated calls with the same argument, this suggests O(1) vs. O(n) computations in Sterling’s favor.

You won’t get this sort of performance for web applications because of their side effect-driven nature, but for number crunching Sterling may very well be a good idea.

@TODO

How does memoization impact memory?

Obviously, those calculated values get stored somewhere, and somewhere means memory is being used. I should perform another benchmark comparing memory requirements of the Fibonacci algorithm between pure Java and Sterling.

What if I don’t want memoization for a particular function?

There may be some cases where you want to recalculate a value for a known argument. For example, if I query a database I shouldn’t necessarily expect the same result each time. Sterling should give an easy way of signalling that a function should not leverage memoization.

Sterling Benchmarks

Since mid-January, I’ve been developing a functional scripting language I call Sterling. In the past few weeks, Sterling has become nearly usable, but it doesn’t seem to be very fast. So this weekend, I’ve taken the time to create a simple (read: naïve) benchmark.

The benchmark uses a recursive algorithm to calculate the Nth member of the Fibonacci sequence. I’ve implemented both Sterling and Java versions of the algorithm and I will be benchmarking each for comparison.

Sterling Implementation
fibonacci = n -> if n = 0 then 0
                 else if n = 1 then 1
                 else fibonacci (n - 1) + fibonacci (n - 2)

Java Implementation
static int fibonacci(int n) {
    if (n == 0) {
        return 0;
    } else if (n == 1) {
        return 1;
    } else {
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}

Why was the Fibonacci sequence chosen for the benchmark?

The algorithm for calculating the Nth member of the Fibonacci sequence has two key traits:

  • It’s recursive
  • It has O(2^n) complexity

Sterling as of right now performs zero optimizations, so I’m assuming this algorithm will bring out Sterling’s worst performance characteristics (muahahaha).
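
To get a feel for that growth, this Python sketch counts how many calls the naive recursion makes. The `count_calls` helper is hypothetical and exists only for this illustration:

```python
def count_calls(n):
    """Return (fib(n), number of calls made) for the naive recursion."""
    if n <= 1:
        return n, 1
    a, calls_a = count_calls(n - 1)
    b, calls_b = count_calls(n - 2)
    return a + b, calls_a + calls_b + 1


value, calls = count_calls(20)
# value == 6765, calls == 21891
```

A single evaluation for n = 20 already involves 21,891 calls, each of which is a function application that an unoptimized interpreter has to process.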

The benchmark execution plan

I’m using a very basic benchmark excluding Sterling’s compilation overhead and comparing the results to native Java. I will execute the Fibonacci algorithm 100 times for 10 iterations, providing an average of the time elapsed for each iteration.

Benchmark Pseudo-Java™
Expression input = IntegerConstant(20);
Expression sterlingFibonacci = load("sterling/math/fibonacci");

void javaBenchmark() {
    List<Interval> intervals;
    int value = input.getValue();
    for (int i : iterations) {
        long startTime = currentTimeMillis();
        for (int j : executions) {
            fibonacci(value);
        }
        intervals.add(currentTimeMillis() - startTime);
        printIteration(i, intervals.last());
    }
    printAverage(intervals);
}

void sterlingBenchmark() {
    List<Interval> intervals;
    for (int i : iterations) {
        long startTime = currentTimeMillis();
        for (int j : executions) {
            sterlingFibonacci.apply(input).evaluate();
        }
        intervals.add(currentTimeMillis() - startTime);
        printIteration(i, intervals.last());
    }
    printAverage(intervals);
}

The benchmark results

The Results
Java Benchmark
--------------
Iteration 0: executions = 100; elapsed = 4 milliseconds
Iteration 1: executions = 100; elapsed = 4 milliseconds
Iteration 2: executions = 100; elapsed = 4 milliseconds
Iteration 3: executions = 100; elapsed = 4 milliseconds
Iteration 4: executions = 100; elapsed = 4 milliseconds
Iteration 5: executions = 100; elapsed = 4 milliseconds
Iteration 6: executions = 100; elapsed = 4 milliseconds
Iteration 7: executions = 100; elapsed = 4 milliseconds
Iteration 8: executions = 100; elapsed = 4 milliseconds
Iteration 9: executions = 100; elapsed = 4 milliseconds
--------------
Average for 10 iterations X 100 executions: 4 milliseconds

Sterling Benchmark
------------------
Iteration 0: executions = 100; elapsed = 8,152 milliseconds
Iteration 1: executions = 100; elapsed = 7,834 milliseconds
Iteration 2: executions = 100; elapsed = 7,873 milliseconds
Iteration 3: executions = 100; elapsed = 7,873 milliseconds
Iteration 4: executions = 100; elapsed = 7,910 milliseconds
Iteration 5: executions = 100; elapsed = 7,973 milliseconds
Iteration 6: executions = 100; elapsed = 7,927 milliseconds
Iteration 7: executions = 100; elapsed = 7,793 milliseconds
Iteration 8: executions = 100; elapsed = 7,912 milliseconds
Iteration 9: executions = 100; elapsed = 7,986 milliseconds
------------------
Average for 10 iterations X 100 executions: 7,923 milliseconds

Immediate conclusions:

Sterling is REALLY slow!

Sterling executes directly against an abstract syntax tree representing operations and data. This tree is generally immutable, so the execution is performed by effectively rewriting the tree to reduce each node into an “atomic” expression, such as an integer constant or lambda (which can’t be further reduced without an applied argument).

References to functions are inserted into the tree by copying the function’s tree into the reference’s node. The function is then evaluated with a given argument to reduce the tree to a single node. These copy-and-reduce operations are very costly and are a likely reason for Sterling’s poor performance.
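
A toy version of this rewrite-to-reduce model, sketched in Python under the assumption of a tree containing only integer constants and addition. The `Const`, `Add`, `reduce_step`, and `evaluate` names are hypothetical and far simpler than Sterling's expression types:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def reduce_step(node):
    """Return a new tree one step closer to a constant; the input is never mutated."""
    if isinstance(node, Const):
        return node
    if isinstance(node.left, Const) and isinstance(node.right, Const):
        return Const(node.left.value + node.right.value)
    # Rebuild this node around its reduced children (a copy, not an update).
    return Add(reduce_step(node.left), reduce_step(node.right))


def evaluate(node):
    """Rewrite the tree repeatedly until only an atomic constant remains."""
    steps = 0
    while not isinstance(node, Const):
        node = reduce_step(node)
        steps += 1
    return node, steps


tree = Add(Add(Const(1), Const(2)), Add(Const(3), Const(4)))
result, steps = evaluate(tree)
```

Even this tiny tree is rebuilt on every pass, which gives a sense of why copying and reducing large function trees gets expensive.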

@TODO

Memoization

Copying and reducing a function tree for an argument is expensive. These operations should not need to be performed more than once for any function and argument pair.

Bytecode perhaps?

Given the sheer amount of recursion and method calls being performed to execute Sterling, does it make sense to compile the syntax tree into a bytecode that can be executed in a loop?

Promoting Changes With App-Config-App

The App-Config-App now lets you promote changes between environments!

How does it work?

Perforce lets you create mappings to define the relationship between two diverging code branches. This allows for easy integration of changes between the two branches by referencing the name of the mapping.

See Perforce’s documentation for more details on the how and why of branch mappings.

The App-Config-App reads these branch mappings in order to create paths for promotion between environments.

Promoting changes with App-Config-App

The App-Config-App setup_example.rb creates four branches with the following mappings:

Mapping        Source    Destination
------------------------------------
dev-qa         dev       qa
qa-staging     qa        staging
staging-prod   staging   prod

If you log in to App-Config-App and go to “Promote Changes,” you get an interface showing these relationships:

Changes between environments can be promoted in either direction along a mapping configuration. The receiving environment accepts all changes (developers would know this as an ‘accept-theirs’ resolution) and you are then allowed to review the changes by clicking on the “Pending Changes” link.
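
The way mappings turn into promotion paths can be sketched as follows. This is an illustration in Python, not App-Config-App's Ruby code; the `MAPPINGS` dictionary copies the example table, and `promotion_targets` is a hypothetical helper:

```python
# Branch mappings from the example setup: name -> (source, destination)
MAPPINGS = {
    "dev-qa": ("dev", "qa"),
    "qa-staging": ("qa", "staging"),
    "staging-prod": ("staging", "prod"),
}


def promotion_targets(environment, mappings=MAPPINGS):
    """List (mapping name, other environment) pairs reachable in either direction."""
    targets = []
    for name, (source, destination) in mappings.items():
        if source == environment:
            targets.append((name, destination))
        elif destination == environment:
            targets.append((name, source))
    return targets
```

For instance, `promotion_targets("qa")` yields both `("dev-qa", "dev")` and `("qa-staging", "staging")`, since a mapping can be followed in either direction.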

For example, I’ve promoted changes from “qa” to “dev”:

I can then review the changes by clicking on “Pending Changes”:

Changes may be edited or reverted before committing them.

Promoting changes using P4V

P4V is the Perforce visual client. Using P4V, you have much greater control over how changes get promoted, but it requires a little more work.

I’ve connected P4V to my App-Config-App user workspace to perform the same promotion from “qa” to “dev”:

Select the “qa” folder, then from the menu bar go to “Actions” > “Merge/Integrate”. This will bring up a wizard for performing the integration.

Select the following:

Merge method: "Use branch mapping"
Branch mapping: "dev-qa"
Automatically resolve files after merging: checked
Resolve option: "Accept source"

And ensure the direction of integration is “Target” < “Source”:

Finally, click “Merge”. If you expand the “dev” folder, you can see where the changes are:

You are now free to modify the files further before finally committing the changes.

How it compares

You get more options when using P4V to promote changes, but producing the same result as App-Config-App’s default behavior is fairly involved. If you aren’t paying attention or don’t know what you’re doing, you might break something :(

@TODO

More options for resolving changes

When you promote changes in App-Config-App, the source changes will overwrite the destination. This behavior reduces the chance for a conflict to happen, but it means you really have to pay attention to what’s changed in the destination config and possibly edit the config further before finally committing it.

Conflict resolution

If a conflict occurs after promoting changes, a screen should be available for viewing and editing the conflicting changes.

Better error reporting if promotion fails due to permissions

Users with read-only access to multiple environments will still be able to promote changes between them. The promotion doesn’t actually occur (the files remain unchanged) but the application doesn’t report any errors when this happens.

App-Config-App in Action

Paul Hammant found this cool Server-Side Piano and I’ve modified it to be configurable from a running App-Config-App. Because the sound is generated at the server, you’re able to see (hear) the Server-Side Piano change its configuration without reloading the UI.

Making it work for yourself

I’ve updated the App-Config-App with additional configuration to support choosing which instrument the Server-Side Piano will play. A clean install of App-Config-App using setup_example.rb will provide everything needed to run the Server-Side Piano.

The application’s configuration URL and credentials are located in web.xml. Additional details may be found in the application’s README.

SCM-Backed Application Configuration With Perforce

Continuing from my last post, I’ve forked Paul Hammant’s original App-Config-App and modified it to work against Perforce. I’ve decided not to continue using Perforce Chronicle as it is primarily intended for content management.

With this version, App-Config-App is written in Ruby, mostly using Sinatra, a lightweight web application framework. I’m still using AngularJS, but I’ve also added a few other things:

  • A .rvmrc file, so you automagically switch to Ruby 1.9.3
  • A Gemfile, so you don’t have to install everything individually :)
  • Sinatra-Contrib for view templating support
  • Rack Flash for flash messages
  • HighLine for masking passwords
  • json to manipulate JSON in native Ruby

Getting it to work.

App-Config-App requires a few things to work:

  • Ruby 1.9.3 and Bundler
  • p4 – the Perforce command line client
  • p4d – the Perforce server

All installation and example setup details may be found in App-Config-App’s README.

Using App-Config-App

When you log in, you should see this screen:

You’ll notice I made the extra effort to add colors and drop shadows :D The application works from the project root in Perforce, so the files in each branch are viewable here. Clicking on “Dev” > “aardvark_configuration.html” will bring up a form for editing aardvark_configuration.json as in the previous version:

Changes to the form data are automatically saved. After making a few edits, you can click “View Diff” to get the diffs or “Revert” your changes. Go ahead and change the email address and fiddle around with the banned nicks, then go click “Pending Changes”:

This screen shows all files that were changed and their diffs as well. You can “Revert” each file individually, and if you want to commit all changes, then enter a commit message and click “Commit Changes”. If you commit the changes and go back to “Dev” > “aardvark_configuration.html”, you’ll see the new values in the form:

Security and Permissions

Permissions and security are managed through Perforce. For users to be able to log in, they must have a user and client configured in Perforce. Those users must also have permissions configured in order to view or modify files.

The setup_example.rb script creates three test users to demonstrate branch permissions:

Username        Password   Write     Read
-------------------------------------------------
sally-runtime   bananas    prod      staging, dev
jimmy-qa        apples     staging   dev
joe-developer   oranges    dev    

Logging in as any of these users will hide branches that don’t have at least read-level access, and branches that don’t have write-level access won’t allow changes.
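
The visibility rule can be expressed as a short sketch. This is Python for illustration only; the real checks are performed by Perforce, the `PERMISSIONS` data is copied from the example table, and the sketch assumes write access implies read access:

```python
# Example users and their branch access, as listed in the table above.
PERMISSIONS = {
    "sally-runtime": {"write": {"prod"}, "read": {"staging", "dev"}},
    "jimmy-qa": {"write": {"staging"}, "read": {"dev"}},
    "joe-developer": {"write": {"dev"}, "read": set()},
}


def visible_branches(user):
    """Branches shown to the user: anything with at least read-level access."""
    grants = PERMISSIONS[user]
    return grants["write"] | grants["read"]


def can_edit(user, branch):
    """Only branches with write-level access accept changes."""
    return branch in PERMISSIONS[user]["write"]
```

Under these assumptions, jimmy-qa sees "staging" and "dev" but can only edit "staging".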

All users created by setup_example.rb are intended only as examples. In the real world, all application users should be set up with real logins and real permissions.

It is because of this support for users and per-branch permissions that I am using Perforce as the SCM backend rather than Git.

Application Users

The setup_example.rb script also sets up three application users to demonstrate how an application would consume configuration:

Username   Password   Read
-----------------------------
dev-app    s3cret1    dev
qa-app     s3cret2    staging
prod-app   s3cret3    prod

In theory, an application would periodically poll aardvark_configuration.md5 until the hash value changed, then load aardvark_configuration.json and reconfigure itself.
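The polling idea could be sketched roughly like this. It is not the App-Config-Java client, only a minimal illustration: `createConfigWatcher`, `fetchText`, and `applyConfig` are hypothetical names, and the sketch assumes the .md5 file contains nothing but the hash text.

```javascript
// Builds a checker that reloads configuration only when the published hash changes.
// fetchText(name) and applyConfig(config) are hypothetical callbacks supplied by
// the application; fetchText is assumed to return a Promise of the file's text.
function createConfigWatcher(fetchText, applyConfig) {
  let lastHash = null;
  return async function checkOnce() {
    const hash = (await fetchText("aardvark_configuration.md5")).trim();
    if (hash === lastHash) {
      return false; // nothing changed since the last poll
    }
    const json = await fetchText("aardvark_configuration.json");
    applyConfig(JSON.parse(json));
    lastHash = hash; // remember the hash only after a successful reload
    return true;
  };
}

// Usage sketch (the 30 second interval is an arbitrary choice):
// const check = createConfigWatcher(myFetchText, reconfigure);
// setInterval(() => check().catch(console.error), 30000);
```

Because the hash is stored only after the JSON has been parsed and applied, a failed reload is simply retried on the next poll.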

Application user accounts are configured in Perforce like any other user. I highly recommend that application users be given read-only access to individual files rather than entire branches.

Divergence

Right now, App-Config-App offers no UI tools for managing divergence and merging. Merges must be performed outside App-Config-App, and the safety nets that guard against malicious changes depend on your branch specs and permissions configuration.

There are also no tools to manage conflicts between existing edits and incoming changes from another user. If a Perforce sync fails due to a conflict, your best option is to revert all changes and enter them again.

@TODO

A better model for autosave

Autosave in AngularJS isn’t very good. AngularJS doesn’t integrate with DOM events the way idiomatic JavaScript does, or provide a reasonable abstraction the way Dojo Toolkit or jQuery do. Right now, autosave in App-Config-App triggers on every key press in the config forms and pummels the back-end server with Ajax POSTs.
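One common way to soften the per-keystroke posting is to debounce the save, so that it only fires once typing pauses. This is a minimal sketch of that general technique, not something App-Config-App currently does; `debounce`, the `timers` parameter, and `postConfig` are hypothetical names introduced for illustration.

```javascript
// Returns a wrapper that delays calling `fn` until `waitMs` milliseconds have
// passed without another call. Only the most recent arguments are used.
// `timers` exists so the scheduler can be swapped out (for example in tests).
function debounce(fn, waitMs, timers) {
  const clock = timers || {
    set: (callback, ms) => setTimeout(callback, ms),
    clear: (handle) => clearTimeout(handle),
  };
  let handle = null;
  return function debounced(...args) {
    if (handle !== null) {
      clock.clear(handle); // a newer call supersedes the pending one
    }
    handle = clock.set(() => {
      handle = null;
      fn(...args);
    }, waitMs);
  };
}

// Usage sketch: post at most once per half second of idle time.
// const autosave = debounce(postConfig, 500); // postConfig is hypothetical
```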

I’ve also noticed that the autosave triggers even when a value is invalid. The first time an email address, for example, becomes invalid, AngularJS will post back the JSON, but without the invalid email address field—the invalid field is entirely left out of the JSON structure. After that, AngularJS will stop autosaving until the value is valid. There are also no measures in place to prevent a user from leaving an invalid value and saving an incomplete JSON file.
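A cheap guard against saving an incomplete file would be to check the outgoing JSON for missing keys before posting it. The following is only a sketch of that idea: `missingFields`, `saveIfComplete`, and the `post` callback are hypothetical, and the list of expected keys would have to come from the application.

```javascript
// Lists the expected keys that are absent from a payload about to be saved.
function missingFields(expectedKeys, payload) {
  return expectedKeys.filter((key) => payload[key] === undefined);
}

// Saves only when nothing is missing; `post` is a hypothetical save callback.
function saveIfComplete(expectedKeys, payload, post) {
  const missing = missingFields(expectedKeys, payload);
  if (missing.length > 0) {
    return { saved: false, missing }; // refuse to write an incomplete file
  }
  post(payload);
  return { saved: true, missing: [] };
}
```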

A better model for validation

AngularJS does not offer a good validation API. The validation API is quite opaque and I haven’t found any real examples using it. The built-in form validation is inadequate. There are few ng-* HTML attributes exposing more than basic configuration parameters, and no hooks offered as extension points.

For example, I’m using regular expressions for date validation in App-Config-App. There isn’t a hook to provide custom validation checks, and regular expressions don’t perform sanity checks. Values such as “00/00/0000” will pass validation.
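If a custom hook were available, a sanity check on top of the regular expression could look something like this. It is a minimal sketch that assumes dates are written as M/D/YYYY; `isRealDate` is a hypothetical helper, not part of AngularJS.

```javascript
// Checks a date written as M/D/YYYY (assumed format) and confirms that the
// day actually exists on the calendar.
function isRealDate(text) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(text);
  if (!match) {
    return false;
  }
  const month = Number(match[1]);
  const day = Number(match[2]);
  const year = Number(match[3]);
  // Date rolls impossible values over (Feb 30 becomes Mar 1), so a real date
  // must survive the round trip unchanged. Years below 100 are also rejected,
  // because Date.UTC maps them into the 1900s.
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

console.log(isRealDate("00/00/0000")); // false
console.log(isRealDate("8/9/2012"));   // true
```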

More example clients than the Java one needed

The App-Config-Java client is enough to show the basic idea behind caching and reloading configuration from App-Config-App. I would like to create a few more examples on a couple of different platforms, possibly also showcasing “hot reconfiguration” for feature toggles.

Someone should port this to Subversion or TFS

App-Config-App should be usable by the largest possible audience. For instance, if you’re using Subversion, then you should be able to take advantage of the existing infrastructure.

The reason I point out Subversion and TFS is largely their support for per-branch permissions.

Using Perforce Chronicle for Application Configuration

Following Paul Hammant’s post App-config workflow using SCM and subsequent proof of concept backed by Git, I will show that an app-config application backed by Perforce is possible using Perforce Chronicle.

Perforce and permissions for branches

Perforce is an enterprise-class source control management (SCM) system, remarkably similar to Subversion (Subversion was inspired by Perforce :). Perforce is more bulletproof than Subversion in many ways and it’s generally faster. Git does not impose any security constraints or permissions on branches, whereas Perforce offers comprehensive security options that let you control access to different branches: for example, development, staging, and production. Subversion can also support permissions on branches, but only with some extra configuration (Apache plus mod_dav_svn/mod_authz_svn). For these reasons, Perforce is a better option for storing configuration data than either Git or Subversion.

Perforce CMS as an application server

Perforce Chronicle is a content management system (CMS) using Perforce as the back-end store for configuration and content. The app-config application is built on top of Chronicle because Perforce does not offer a web view into the depot the way Subversion can through Apache. Branching and maintaining divergence between environments can be managed through the user interface, and Chronicle provides user authentication and management, so access to different configuration files can be restricted appropriately. The INSTALL.txt file distributed with Chronicle makes installation easy; mine is set up to run locally from http://localhost.

There is a key issue in using Chronicle, however. The system is designed for the management of content and not necessarily arbitrary files. In order to make the app-config application work, I had to add a custom content type and write a module. Configuration and HTML are both plain-text content, so I created a “Plain Text” content type with the fields title and content:

  1. Go to “Manage” > “Content Types”
  2. Click “Add Content Type”
  3. Enter the following information:
Id:       plaintext
Label:    Plain Text
Group:    Assets
Elements:

[title]
type = text
options.label = Title
options.required = true
display.tagName = h1
display.filters.0 = HtmlSpecialChars

[content]
type = textarea
options.label = Content
options.required = true
display.tagName = pre
display.filters.0 = HtmlSpecialChars

  4. Click “Save”

The Config App

I’ve borrowed heavily from Paul’s app-config HTML page, which uses AngularJS to manage the UI and interaction with the server. Where Paul’s app-config app used the jshon command to encode and decode JSON, Zend Framework has a utility class for encoding, decoding, and pretty-printing JSON, and Chronicle also ships with the simplediff utility for performing diffs with PHP.

The source JSON configuration is the same, albeit sorted:

(stack_configuration.json)
{
 "bannedNicks":[
  "derek",
  "dino",
  "ffff",
  "jjjj",
  "werwer"
 ],
 "defaultErrorReciever":"piglet@thoughtworks.com",
 "lighton":true,
 "loadMaxPercent":"88",
 "nextShutdownDate":"8\/9\/2012"
}

The index.html page has been modified from the original to support only the basic commit and diffs functionality:

(index.html)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xmlns:ng="http://angularjs.org">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
  <title>Configuration application (alpha)</title>
  <script type="text/javascript" ng:autobind src="http://code.angularjs.org/0.9.19/angular-0.9.19.min.js"></script>
  <style type="text/css">
    ins { color: #00CC00; text-decoration: none; }
    del { color: #CC0000; text-decoration: none; }
  </style>
</head>
<body ng:controller="AppCfg">
<script type="text/javascript">
    function AppCfg($resource, $xhr) {
        var self = this;
        this.newNickname = "";
        this.svrMessage;
        this.message;
        this.cfg = $resource("/appconfig/stack_configuration.json").get({});

        this.save = function() {
            self.cfg.$save({message: self.message}, function() {
                alert("Config saved to server");
            }, function() {
                alert("ERROR on save");
            });
            self.message = "";
        };

        this.newNick = function() {
            self.cfg.bannedNicks.push(self.newNickname);
            self.newNickname = "";
        };

        this.diffs = function() {
            $xhr("post", "/appconfig/diffs/stack_configuration.json", angular.toJson(self.cfg), function(code, svrMessage) {
                self.svrMessage = svrMessage;
            });
        };

        this.deleteNick = function(nick) {
            var oldBannedNicks = self.cfg.bannedNicks;
            self.cfg.bannedNicks = [];
            angular.forEach(oldBannedNicks, function(n) {
                if (nick != n) {
                    self.cfg.bannedNicks.push(n);
                }
            });
        };
    }

    AppCfg.$inject = ["$resource", "$xhr"];
</script>
  Light is on:  <input type="checkbox" name="cfg.lighton"/> <br/>
  Default Error Reciever (email): <input name="cfg.defaultErrorReciever" ng:validate="email"/> <br/>
  Max Load Percentage: <input name="cfg.loadMaxPercent" ng:validate="number:0:100"/> <br/>
  Next Shutdown Date: <input name="cfg.nextShutdownDate" ng:validate="date"/> <br/>
  Banned nicks:
      <ol>
        <li ng:repeat="nick in cfg.bannedNicks"><span>{{nick}} &nbsp;&nbsp;<a ng:click="deleteNick(nick)">[X]</a></span></li>
    </ol>
  <form ng:submit="newNick()">
    <input type="text" name="newNickname" size="20"/>
    <input type="submit" value="&lt;-- Add Nick"/><br/>
  </form>
  <hr/>
  <button ng:click="diffs()">View Diffs</button><br/>
  <button ng:disabled="{{!message}}" ng:click="save()">Commit Changes</button> Commit Message: <input name="message"></input><br/>
  Last Server operation: <br/>
  <div ng:bind="svrMessage | html:'unsafe'">
  </div>
</body>
</html>

Both of these assets were added by performing:

  1. Click “Add” from the top navbar
  2. Click “Add Content”
  3. Select “Assets” > “Plain Text”
  4. For “Title”, enter “index.html” or “stack_configuration.json”
  5. Paste in the appropriate “Content”
  6. Click “URL”, select “Custom”, and enter the same value as “Title” (otherwise, Chronicle will convert underscores to dashes, so be careful!)
  7. Click “Save”, enter a commit message, then click the next “Save”
  8. Both assets should be viewable as mangled Chronicle content entries from http://localhost/index.html and http://localhost/stack_configuration.json. You normally will not use these URLs.

At this point, neither asset is actually usable. Most content is heavily decorated with additional HTML and then displayed within a layout template, but I want both the index.html and stack_configuration.json assets to be viewable as standalone files and provide a REST interface for AngularJS to work against.

Come back PHP! All is forgiven

Chronicle is largely built using Zend Framework and makes adding extra modules to the system pretty easy. My module needs to be able to display plaintext assets, update their content using an HTTP POST, and provide diffs between the last commit and the current content.

To create the module, the following paths need to be added:

  • INSTALL/application/appconfig
  • INSTALL/application/appconfig/controllers
  • INSTALL/application/appconfig/views/scripts/index

Declare the module with INSTALL/application/appconfig/module.ini:

(module.ini)
version = 1.0
description = Application config proof of concept
icon = images/icon.png
tags = config

[maintainer]
name = Perforce Software
email = support@perforce.com
url = http://www.perforce.com

[routes]
appconfig.type = Zend_Controller_Router_Route_Regex
appconfig.route = 'appconfig/(.+)'
appconfig.reverse = appconfig/%s
appconfig.defaults.module = appconfig
appconfig.defaults.controller = index
appconfig.defaults.action = index
appconfig.map.resource = 1

appconfig-operation.type = Zend_Controller_Router_Route_Regex
appconfig-operation.route = 'appconfig/([^/]+)/(.+)'
appconfig-operation.reverse = appconfig/%s/%s
appconfig-operation.defaults.module = appconfig
appconfig-operation.defaults.controller = index
appconfig-operation.defaults.action = index
appconfig-operation.map.action = 1
appconfig-operation.map.resource = 2
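
To make the two routes easier to picture, here is a small JavaScript stand-in showing which request paths each pattern captures. It is only an illustration, not how Zend Framework implements routing: `routes` and `matchRoute` are hypothetical, and the sketch assumes the router tries the more specific appconfig-operation route first and requires the pattern to match the whole path.

```javascript
// JavaScript stand-ins for the two regex routes declared in module.ini.
// Assumption: the more specific "operation" route is tried first.
const routes = [
  {
    name: "appconfig-operation",
    pattern: /^appconfig\/([^\/]+)\/(.+)$/,
    map: { action: 1, resource: 2 },
  },
  {
    name: "appconfig",
    pattern: /^appconfig\/(.+)$/,
    map: { resource: 1 },
  },
];

// Returns the matched route name and request parameters, or null if no route matches.
function matchRoute(path) {
  for (const route of routes) {
    const match = route.pattern.exec(path);
    if (!match) {
      continue;
    }
    // Defaults shared by both routes in module.ini.
    const params = { module: "appconfig", controller: "index", action: "index" };
    for (const [param, group] of Object.entries(route.map)) {
      params[param] = match[group]; // copy the numbered capture into the named param
    }
    return { route: route.name, params };
  }
  return null;
}

console.log(matchRoute("appconfig/index.html"));
console.log(matchRoute("appconfig/diffs/stack_configuration.json"));
```

Under these assumptions the first path resolves to the index action with resource index.html, and the second resolves to the diffs action with resource stack_configuration.json.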

Add a view script for displaying plaintext assets, INSTALL/application/appconfig/views/scripts/index/index.phtml:

(index.phtml)
<?=$this->entry->getValue('content') ?>

Add a view script for displaying diffs, INSTALL/application/appconfig/views/scripts/index/diffs.phtml:

(diffs.phtml)
<pre><?=$this->diffs ?></pre>

And a controller at INSTALL/application/appconfig/controllers/IndexController.php:

(IndexController.php)
<?php

defined('LIBRARY_PATH') or define('LIBRARY_PATH', dirname(__DIR__));
require_once LIBRARY_PATH . '/simplediff/simplediff.php';

class Appconfig_IndexController extends Zend_Controller_Action
{
    private $entry;

    private $mimeTypes = array(
        '.html' => 'text/html',
        '.json' => 'application/json',
    );

    public function preDispatch()
    {
        $request = $this->getRequest();
        $request->setParams(Url_Model_Url::fetch($request->getParam('resource'))->getParams());
        $this->entry = P4Cms_Content::fetch($request->getParam('id'), array('includeDeleted' => true));
    }

    public function indexAction()
    {
        $this->getResponse()->setHeader('Content-Type', $this->getMimeType(), true);
        $this->view->entry = $this->entry;

        if ($this->getRequest()->isPost()) {
            $this->entry->setValue('content', $this->getJsonPost());
            $this->entry->save($this->getRequest()->getParam('message'));
        }
    }

    private function getMimeType()
    {
        $url = $this->entry->getValue('url');
        $suffix = substr($url['path'], strrpos($url['path'], '.'));

        if (array_key_exists($suffix, $this->mimeTypes)) {
            return $this->mimeTypes[$suffix];
        } else {
            return 'text/plain';
        }
    }

    public function diffsAction()
    {
        $this->getResponse()->setHeader('Content-Type', 'text/html', true);
        $this->view->diffs = htmlDiff($this->entry->getValue('content'), $this->getJsonPost());
    }

    public function postDispatch()
    {
        $this->getHelper('layout')->disableLayout();
    }

    private function getJsonPost()
    {
        if ($this->getRequest()->isPost()) {
            return $this->prettyPrint(file_get_contents('php://input'));
        } else {
            throw new Exception('Can\'t get JSON without POST');
        }
    }

    private function prettyPrint($json)
    {
        $array = Zend_Json::decode($json);
        $this->sort($array);

        return Zend_Json::prettyPrint(Zend_Json::encode($array), array('indent' => ' '));
    }

    private function sort(array &$array)
    {
        if (count(array_filter(array_keys($array), 'is_string')) > 0) {
            ksort($array);
        }

        foreach($array as &$value) {
            if (is_array($value)) {
                $this->sort($value);
            }
        }
    }
}
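
The controller sorts keys before pretty-printing so that two documents with the same data always serialize to the same text, which keeps the diffs free of ordering noise. The same idea, sketched in JavaScript for comparison (`sortKeys` and `canonicalJson` are hypothetical names, and this is not a line-for-line port of the PHP above):

```javascript
// Recursively rebuilds a parsed JSON value with object keys in sorted order.
// Arrays keep their element order; only object keys are reordered.
function sortKeys(value) {
  if (Array.isArray(value)) {
    return value.map(sortKeys);
  }
  if (value !== null && typeof value === "object") {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeys(value[key]);
    }
    return sorted;
  }
  return value;
}

// Produces a stable, one-space-indented rendering of a JSON document.
function canonicalJson(text) {
  return JSON.stringify(sortKeys(JSON.parse(text)), null, 1);
}
```

With this in place, `canonicalJson('{"b":1,"a":2}')` and `canonicalJson('{"a":2,"b":1}')` produce identical text, so a diff between them shows no changes.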

AngularJS

After all files are in place, Chronicle needs to be notified that the new module exists by going to “Manage” > “Modules”, where the “Appconfig” module will be listed if all goes well :) Both assets will now be viewable from http://localhost/appconfig/index.html and http://localhost/appconfig/stack_configuration.json. AngularJS’ $resource service is used in index.html to fetch stack_configuration.json and post changes back.

From http://localhost/appconfig/index.html, the data from stack_configuration.json is loaded into the form:

Edits to stack_configuration.json can be made using the form, and the diffs viewed by clicking on “View Diffs”:

The changes can be saved by entering a commit message and clicking “Commit Changes”. After that, clicking “View Diffs” will show no changes:

To show that edits have in fact been made to stack_configuration.json, go to http://localhost/stack_configuration.json, select “History” and click on “History List”:

Chronicle also provides an interface for viewing diffs between revisions:

Disk Usage

Something to remember in using Chronicle is that each resource requested from Perforce is written to disk before being served to the client. This means that for each request to index.html, Chronicle allocates a new Perforce workspace, checks out the associated file, serves it to the client, then deletes the file and the workspace at the end of the request. This allocate/checkout/serve/delete cycle executes for stack_configuration.json and every other resource in the system.

@TODO

Security!

There’s one major flaw with the appconfig module: it performs zero access checks. By default, Chronicle can be configured to disallow anonymous access by going to “Manage” > “Permissions” and deselecting all permissions for “anonymous” and “members”. Logging out and attempting to access either http://localhost/appconfig/stack_configuration.json or http://localhost/appconfig/index.html will now give an error page and prompt you to log in. Clicking “New User” will also give an error, as anonymous users don’t have the permission to create users.

Access rights on content are checked by the content module, but are also hard-coded in the associated controllers as IF-statements. A better solution will be required for proper access management in the appconfig module.

Better integration

Chronicle’s content module provides JSON integration for most of its actions, but these mostly exist to support the Dojo Toolkit-enabled front-end. Integrating with these actions over JSON requires detailed knowledge of Chronicle’s form structures.

Chronicle has some nice interfaces for viewing diffs. If I could call those up from index.html I would be major happy :)

Automatic creation of plaintext content type

Before the appconfig module is usable, the plaintext content type has to be created. I would like to automate creation of the plaintext content type when the module is first enabled.

Making applications aware of updates to configuration

When stack_configuration.json is updated, there’s no way to notify applications of the change, and no interface provided so they may poll for changes. I’m not entirely sure at this point what an appropriate solution would look like. In order to complete the concept, I’d first have to create a client app dependent on that configuration.

Better interfaces for manipulating plaintext assets

I had to fiddle with index.html quite a bit. This basically involved editing a local copy of index.html, then pasting the entire contents into the associated form in Chronicle. I have not tried checking out index.html directly from Perforce, and I imagine that any edits would need to be made within Chronicle. GitHub offers an in-browser raw editor, and something like that would be real handy in Chronicle.

Handling conflicts

There is no logic in the appconfig module to catch conflicts when two users edit the same file. Conflicts are detectable because an exception is thrown when one occurs, but I’m not sure what the workflow for resolution is in Chronicle terms, or how to integrate with it. Who wins?

Working with branches

I did not take the time to see how Chronicle manages branches. I will need to verify that Chronicle and the appconfig module can work with development, staging, and production branches, with maintained divergence. For example, we’re still trying to figure out how to attach visual clients like P4V to the repository and work independently of Chronicle.

Kudos

I would like to thank the guys at Perforce, especially Randy Defauw, for their assistance and for answering all my questions as I worked with Chronicle.