Derw: the Gitbook, June 2022
This month’s focus has been on improving performance and documentation. I took quite a deep dive into both, which wasn’t helped by the heat we’re currently having. At the end of this month Derw is faster and better documented, and that makes me happy.
If you haven’t heard of Derw: it is a programming language aimed at those working with a TypeScript codebase but wishing to write in an ML, Haskell, or Elm style.
Performance comes in many forms. You have compile time, which is important for developer experience. You have render time and download time, which are important for the user experience. I decided to look into all of these this month, to scope out improvements and see where Derw stands against contemporaries such as React and Elm. Not all of these improvements have landed in production yet, mostly because they involve a lot of moving parts.
To do any good performance work, you need to establish a baseline: something that you can objectively, programmatically compare against to see whether the changes you made were beneficial or not. There’s a whole range of tools out there, and a bunch of techniques for measuring performance. For the compiler, I wanted to identify slow parts, particularly when starting up and when running the compiler itself.
Proper benchmarks and numbers will come at a later point - I’m setting up some infrastructure that allows for benchmarks to be run consistently.
The released, compiled version of Derw is pretty snappy, from an interaction standpoint. The most minimal command you can run in Derw is `derw init`, which simply writes a couple of small files. So I began there, and used Node’s profiling tools to see where most time was spent. A decent chunk of time was just spent on loading the TypeScript files, particularly the TypeScript libraries themselves, which are used for verifying generated TypeScript from Derw when using the `--verify` flag with `derw compile`. This is not needed in the majority of cases, so instead of always importing it, I switched it over to a dynamic import. This cut down a bunch of time. The main CLI file that’s the entry point for Derw’s compiler simply checks the subcommand, then calls the correct path. For example, `derw format` will parse the `format` part then call the format commands. In theory, that should mean that you only need to load the format code in that particular case, and not when doing `derw install` for example. But after testing, I found that any time saved was small and came at the cost of making the code more complex.
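To illustrate the dynamic import idea, here is a minimal TypeScript sketch. It is not Derw’s actual code: `lazyModule`, `loadVerifier` and `compile` are hypothetical names, and a stand-in loader is used in place of `() => import("typescript")` so that the example runs without any dependencies.

```typescript
// Hypothetical helper: wraps a loader so the expensive module is only
// loaded the first time it is asked for, and reused afterwards.
function lazyModule<T>(loader: () => Promise<T>): () => Promise<T> {
    let cached: Promise<T> | null = null;
    return () => {
        if (cached === null) {
            cached = loader();
        }
        return cached;
    };
}

// Counts how often the stand-in "module" is actually loaded.
let loadCount = 0;

// Assumption: in a real CLI the loader would be `() => import("typescript")`.
const loadVerifier = lazyModule(async () => {
    loadCount += 1;
    return { verify: (source: string): boolean => source.length > 0 };
});

// Only pays the loading cost when verification was requested.
async function compile(source: string, verify: boolean): Promise<string> {
    if (verify) {
        const verifier = await loadVerifier();
        if (!verifier.verify(source)) {
            throw new Error("verification failed");
        }
    }
    return source;
}
```

With this shape, a command that never asks for verification never triggers the load, which is where the startup saving comes from.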
Another pain point that I spotted was using ts-node vs regular node. For one thing, more people are likely to have node installed than ts-node. But since I’ve been working in a TypeScript-heavy environment, I defaulted to ts-node just to make life simpler. ts-node is typically slower than node, particularly because it does all the TypeScript-y type checking, which shouldn’t be necessary with my distributed Derw binary (since tsc is run to generate the binary). So I switched over to node, and saw a decent speed improvement on initial bootup. However, I realized that ts-node was necessary to allow the test runner to import TypeScript, which is an addressable problem but one I put to the side for a future fix. I reverted to ts-node in the meantime. In theory a faster ts-node could be achieved by disabling typechecking, either via swc or transpileOnly, but I haven’t yet figured out the ideal way to distribute Derw as a binary with those flags enabled in a cross-platform manner. The `env` implementation on FreeBSD (and therefore Mac) supports using `-S` to pass arguments to the command you pass to `env`, but there’s no such luck on some Linux distros.
Next I turned my focus to Derw’s Html output. I was pretty sure that the Html output would be pretty small, along with being quite performant. The reality turned out to be that the output was indeed small, and performance was roughly average. Not especially slow, but not especially fast either. To test bundle size, I used esbuild and uglify with gzip to compare sizes both when unpacked and when downloaded. The numbers were really good, but I noted a bunch of places that could be improved by stricter dead code elimination. TodoMVC for Derw, with Derw’s optimize flag, comes out to ~7.5kb gzipped. I do have a branch that cuts that even further, but I haven’t merged it yet. For comparison, Elm with uglify optimization gets to ~9kb. Derw with uglify is down to ~6.6kb. But that’s without any special rules or anything, so it could be further reduced. The smallest number I managed to produce with Derw was ~3.5kb. I’m also exploring using rollup for smaller builds, but I’m having trouble making the API work as I want.
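For anyone wanting to reproduce this kind of measurement, a minimal sketch of comparing raw and gzipped sizes using Node’s built-in `zlib` module could look like the following. `bundleSizes` is a hypothetical helper and the sample string is made up for illustration; it is not the script used for the figures above.

```typescript
import { gzipSync } from "zlib";

// Report the raw and gzipped byte sizes of a bundle's source text.
function bundleSizes(source: string): { raw: number; gzipped: number } {
    const raw = Buffer.byteLength(source, "utf8");
    const gzipped = gzipSync(source).length;
    return { raw, gzipped };
}

// Made-up stand-in for a bundled file; in practice you would read the
// bundler's output file from disk instead.
const sampleBundle = "function view(model) { return text(model.name); }\n".repeat(200);
const sampleSizes = bundleSizes(sampleBundle);

console.log(
    `${(sampleSizes.raw / 1024).toFixed(1)}kb raw, ` +
        `${(sampleSizes.gzipped / 1024).toFixed(1)}kb gzipped`
);
```

The gzipped number is the one closest to what a user actually downloads, while the raw number reflects what the browser has to parse.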
As a baseline for comparing projects, I’ve been using TodoMVC which has a bunch of examples in different languages and frameworks. Sadly it’s now pretty out of date, so I had to dig around and upgrade some of the examples in order to get current figures. I also wrote a hydrated example.
Content with the bundle size, I then moved to performance. There were a few ideas I had in mind: add identifier-based patching, switch patching and rendering from recursion to iteration, and cut down on CSS-triggering actions. I also considered some ideas like lazy loading and using WebAssembly.
Identifier-based patching is what you might use keys for in React or Elm. The concept is somewhat simple: given a list of nodes which will change in length, find the matching nodes between the current v-dom and the next v-dom, and patch those. This allows you to avoid doing a bunch of attribute and children changes when they’re not necessary. Adding support for that was relatively straightforward, and had good results.
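The matching step can be sketched roughly as below. This is a simplified illustration rather than Derw’s implementation: the `VNode` and `Patch` types and the `diffKeyed` function are hypothetical, each node carries only a key and some text, and reordering of nodes is deliberately ignored.

```typescript
// Hypothetical, heavily simplified virtual node: a key plus text content.
type VNode = { key: string; text: string };

type Patch =
    | { kind: "update"; key: string; text: string }
    | { kind: "insert"; key: string; text: string }
    | { kind: "remove"; key: string };

// Match nodes by key, and only emit patches for nodes that actually changed.
function diffKeyed(current: VNode[], next: VNode[]): Patch[] {
    const currentByKey = new Map<string, VNode>();
    for (const node of current) {
        currentByKey.set(node.key, node);
    }

    const patches: Patch[] = [];
    const seenKeys = new Set<string>();

    for (const node of next) {
        seenKeys.add(node.key);
        const previous = currentByKey.get(node.key);
        if (previous === undefined) {
            patches.push({ kind: "insert", key: node.key, text: node.text });
        } else if (previous.text !== node.text) {
            patches.push({ kind: "update", key: node.key, text: node.text });
        }
        // Same key and same content: no patch needed.
    }

    // Anything in the current list that is missing from the next list goes away.
    for (const node of current) {
        if (!seenKeys.has(node.key)) {
            patches.push({ kind: "remove", key: node.key });
        }
    }

    return patches;
}
```

Without keys, removing the first item of a list would make every later item look changed when compared by position. With keys, only the removed item produces a patch.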
Moving patching and rendering from recursive functions to iterative was mostly an exercise in finding out how the two algorithms would perform. The conclusion: performance is basically the same between the two, in both Chrome and Firefox. So I will ditch that change.
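To show what the two shapes look like, here is a small sketch that walks a tree of nodes both ways and collects tag names in document order. The `Tree` type and both functions are hypothetical stand-ins; a real renderer or patcher does more work per node, but the traversal structure is the part that changes.

```typescript
// Hypothetical, simplified node: a tag name and child nodes.
type Tree = { tag: string; children: Tree[] };

// Recursive walk: the call stack keeps track of where we are.
function tagsRecursive(node: Tree, tags: string[] = []): string[] {
    tags.push(node.tag);
    for (const child of node.children) {
        tagsRecursive(child, tags);
    }
    return tags;
}

// Iterative walk: an explicit stack replaces the call stack.
function tagsIterative(root: Tree): string[] {
    const tags: string[] = [];
    const stack: Tree[] = [root];
    while (stack.length > 0) {
        const node = stack.pop() as Tree;
        tags.push(node.tag);
        // Push children in reverse so the first child is visited first.
        for (let i = node.children.length - 1; i >= 0; i--) {
            stack.push(node.children[i]);
        }
    }
    return tags;
}
```

Both versions visit the same nodes in the same order, so any difference between them comes down to how the engine handles function calls versus array operations.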
Cutting down on CSS-triggering actions was a straightforward one: instead of setting both properties and attributes when an attribute is updated, only update properties when they need updating and the same for attributes. There are some fields in HTML where you need to use `setAttribute`, but others where you need to use `someNode[propertyName] = value`. I was doing both in every patch of an attribute, but have now switched over to doing it conditionally. This cut down on CSS being triggered and improved performance.
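A rough sketch of the conditional approach follows. It uses a fake node that counts writes so it can run outside a browser, and the list of names treated as properties is an assumption chosen for illustration, not the list Derw uses. `FakeNode`, `PROPERTY_NAMES` and `patchAttribute` are all hypothetical names.

```typescript
// Stand-in for a DOM node that records writes, so the sketch runs without a browser.
class FakeNode {
    attributes: Record<string, string> = {};
    properties: Record<string, string> = {};
    writes = 0;

    setAttribute(name: string, value: string): void {
        this.attributes[name] = value;
        this.writes += 1;
    }

    // In a browser this would be `node[name] = value`.
    setProperty(name: string, value: string): void {
        this.properties[name] = value;
        this.writes += 1;
    }
}

// Assumption: an illustrative set of names that are set as properties.
const PROPERTY_NAMES = new Set<string>(["value", "checked", "selected"]);

// Write either the property or the attribute, and only when the value changed.
function patchAttribute(
    node: FakeNode,
    name: string,
    previous: string | undefined,
    next: string
): void {
    if (previous === next) {
        return; // unchanged: skip the write entirely
    }
    if (PROPERTY_NAMES.has(name)) {
        node.setProperty(name, next);
    } else {
        node.setAttribute(name, next);
    }
}
```

Each update now results in at most one write to the node, rather than two writes on every patch.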
Lazy-loading is an idea which makes a lot of sense if you have a large html function which you only want to call in some cases. It ties in a bit with memoization, where you store previous function call results in order to cut down on compute time. I implemented this logic, but found that the approach I took wasn’t as performant as I’d like. Instead, I think I will approach lazy at a language level, and make Derw a lazy language.
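As a sketch of the general technique (not the approach I implemented, which isn’t shown here), a lazy wrapper can remember the last argument and result of a view function and skip the call when the argument is the same reference. `lazyView` and the example model are hypothetical.

```typescript
// Wrap a view function so it is only re-run when its argument changes.
// Assumption: comparing by reference (===) is good enough, which holds
// when models are treated as immutable values.
function lazyView<A, R>(view: (arg: A) => R): (arg: A) => R {
    let hasCached = false;
    let lastArg: A | undefined;
    let lastResult: R | undefined;

    return (arg: A): R => {
        if (hasCached && lastArg === arg) {
            return lastResult as R; // reuse the stored result
        }
        const result = view(arg);
        lastArg = arg;
        lastResult = result;
        hasCached = true;
        return result;
    };
}

// Example: a stand-in for an expensive view that counts how often it runs.
let viewCalls = 0;
const expensiveView = lazyView((model: { name: string }): string => {
    viewCalls += 1;
    return "<h1>" + model.name + "</h1>";
});
```

The trade-off is that the wrapper itself adds a comparison and some bookkeeping on every call, so it only pays off when the wrapped function is expensive enough.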
I was tipped off to look at blockdom, which claims to be the fastest v-dom in the world. It seems to me like there could be some lessons found there, but I’m still digesting what I’ve read. I also explored using cloneNode and other dom functions that allow you to create and modify existing nodes, but didn’t find a good performance improvement in there. There was also some advice to use class-based operations instead of functions and JSON objects, but I found no real improvement there.
Overall, the numbers currently put Derw, with some of these improvements, somewhere around the optimized-Elm mark. These changes aren’t all merged yet, since they involve big re-writes and I’d like to settle on some design decisions before getting there.
Check it out: https://docs.derw-lang.com/
If you want people to use your project, you need good documentation. You might gain users through people being determined to be the first testers or early adopters, but if you want the majority of people to use it, you need to tell them how - and guide them to the easiest path. When I first got into the Elm community, I took an active role in supporting people and answering their questions. That kind of interaction is a great way to help folks get started, but it doesn’t scale. So what better way to help a larger audience than to document the answers to their questions, before they ask them? So the Gitbook covers how to get Derw installed, how to set up VSCode, language features, and how to write webapps.
Since interop is such an important part of Derw, I’ve tried to document what each language feature is compiled to. The idea is that kernel code shouldn’t be a secret: it should be documented so that folks have a way to write their own Derw-compatible code, so that library support isn’t blocked on me.
Pull requests are most welcome to the Gitbook. Currently I’ve planned to expand with more examples on how to work with interop and web-api libraries, and server side Derw.
Better error messages when accidentally using incorrect syntax pairs (e.g. `let..return` instead of `let..in`)
Comments at the top-level are preserved when running format
Format now supports `--watch`
Add a format extension for vscode
Cut down on number of files compiled by Derw compile inside derw-packages
`import "./Maybe" as Maybe exposing (Maybe)` now allows you to use both the type Maybe and the namespace Maybe
Add support for boolean attributes in coed + html (e.g. `checked`)
Add syntax highlighting for != and !
Add stopPropagation and preventDefault to onWithOptions
Hydrated TodoMVC example
Reduce package size of Derw in npm
A pretty print function for the AST to make writing tests easier in the compiler
Fix module references inside lambdas
Add more newlines in places when using format
Set up derw-lang.com and docs.derw-lang.com