I haven’t stopped fiddling with language development. Actually, I still do it
almost full-time in my free time. My latest prototype,
Scotch, which uses Haskell-like syntax
and semantics, is coming close to a point where I can start developing the main
libraries. Close, but there are a few kinks. Here I’ll be covering an issue I’m
Specifically, what I’m struggling to get my head around is how to model scoping
of values declared within
let declarations. (Read about
let declarations in
Learn You A Haskell.)
In this code, the child declaration
y uses child declaration
z, which is
used by the body of the
let. Consider for a moment: What if
z had side
1 2 3 4
How Are Side Effects A Problem In Child Scopes?
Scotch is compiled into JVM bytecode. Value and function declarations are encoded as static methods returning Java 8 lambdas. The declarations internally use thunks to suspend evaluation but also to retain the evaluated result.
1 2 3 4 5 6 7 8 9 10 11 12 13
1 2 3 4 5 6 7 8 9 10 11 12
The reason for using thunks is to support lazy evaluation. Arguments passed into functions are not evaluated until they are referenced. This also comes with the benefit that arguments are only ever evaluated once. Because child declarations form closures over variables in their parent scope, this also ensures local variables are also only evaluated once within these child declarations.
Top-level declarations, like values and functions, are encoded as static Java
methods. Each time they are referenced, a new thunk is returned. In
this poses a problem because every time a declaration is referenced, it returns
a new thunk. Say for example we have a child declaration called “username” which
fetches a username from a database. Every time the associated thunk is referenced,
the database is hit:
1 2 3 4 5 6 7 8
This becomes a major problem in Scotch because the values with side-effects look like variables!
1 2 3 4 5 6 7
Modeling Let As A Function
I’ve been mulling around a few crazy ways to model
let declarations. My favorite
so far is wrapping the
let body up in a function, passing in all declarations as
1 2 3 4 5 6 7 8 9 10 11 12
This leverages the existing way of modeling evaluation, but it’s kinda ugly. For
the near term, this solution is decent, however it adds the overhead of extra
arguments being passed around. I’m not entirely sure how this impacts runtime
performance in compiled code, though I would like another way of modeling
declarations to compare side-by-side.
Storing Nested Values As Variables
I really like this method of modeling
let because it behaves more like how a
let looks in source code. Instead of wrapping a
let within another function,
the declarations are assigned to local variables:
1 2 3 4 5 6 7 8 9
I’m not entirely sure what performance impact of variables in
let would be
compared to using arguments. Either way, I would be calling the methods associated
with each declaration and storing them somewhere. The only thing saved is an
additional function call, so any gains in performance would be minuscule.
Why The Indecision?
I’m not sure I’m following the best ways to model
let. And I also don’t have
any immediate way to test either solution short of branching the code and spiking
out the two solutions. The tradeoffs aren’t necessarily obvious, and I won’t see
which one is better in the long term compared to the one not chosen.
Compilation is hard. Scotch uses a multi-pass compiler
with 12 steps currently, so adding a new language feature or changing an existing
one can be very arduous tasks. Ideally, I only want to implement
I will be posting on progress and which solution I end up choosing, stay tuned.
- Write-up on lexical scoping in Scotch.
- Details on how Scotch compiles values and functions in Java 8 lambdas.
- How simple, non-recursive closures are implemented.
- Implementation of co-recursive closures.
Thanks to my dear friend Alexander Zagniotov who took time out of his busy day to review my multiple drafts.