Derw status: January 2022 part 1 - Html support, infrastructure, and kernel code

Jan 17, 2022

This month has been a big one for Derw: Html support, packages that can be installed, writing functioning code both client and server side, ironing out some issues, and a bunch of code refactors. As this post reached the email length limit, I decided to split it into two, so stay tuned for a follow-up. The best place to stay up to date is on Twitter or by starring the Derw repo.

If you’ve just arrived here, Derw is a programming language in the ML family designed to enable better workflows with TypeScript codebases. If you’re wondering why, check out my blog post about exactly that. These blog posts are split into three parts: a changelog, some thoughts, and some meta comments on language development. Follow the blog with the button below.

Changelog

Html support

One of the big milestones before being publicly ready for beta testers has been Html support. I had already written a Html library in TypeScript called Coed, with a similar program setup to Elm. Most of the functions are identical to Elm, except that tags with children take three arguments (events, attributes and children) and tags without children take two (events and attributes).

There’s also a builtin program for working with model-view-update (MVU) structure as in Elm. If you’re unfamiliar with the concept, the model refers to the state of the program. This is immutable. There is a view function that takes the model and returns some html, including event messages triggered by interactions (such as clicks on an input). These event messages are passed to an update function that also takes in the current model. The update function returns the new model, which is then fed to the view function and used to update the DOM.
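To make the loop concrete, here is a minimal sketch of the model-view-update cycle in TypeScript. Every name in it (`CounterModel`, `CounterMsg`, `updateCounter`, `viewCounter`, `runCounter`) is hypothetical and chosen for illustration; it is not the Coed or Derw API, and the view returns a plain string where a real library would return virtual DOM nodes.

```typescript
// Hypothetical names throughout: this is not the Coed or Derw API.

// The model is the whole program state; it is never mutated in place.
type CounterModel = { readonly count: number };

// Event messages describe what happened, not how to react to it.
type CounterMsg = { kind: "Increment" } | { kind: "Decrement" };

// update takes a message and the current model and returns a new model.
function updateCounter(msg: CounterMsg, model: CounterModel): CounterModel {
    switch (msg.kind) {
        case "Increment":
            return { count: model.count + 1 };
        case "Decrement":
            return { count: model.count - 1 };
    }
}

// view turns a model into markup. A real library would return virtual DOM
// nodes carrying event handlers; a string keeps the sketch self-contained.
function viewCounter(model: CounterModel): string {
    return `<div><button>-</button>${model.count}<button>+</button></div>`;
}

// The runtime loop: feed each message through update, then render the result.
function runCounter(initial: CounterModel, msgs: CounterMsg[]): string {
    const finalModel = msgs.reduce(
        (model, msg) => updateCounter(msg, model),
        initial
    );
    return viewCounter(finalModel);
}
```

Because `updateCounter` and `viewCounter` are pure functions, the whole program can be exercised by handing `runCounter` a list of messages, with no DOM involved.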

Writing a thin wrapper around Coed in Derw allows it to be used from Derw itself. This means that Derw is now ready to be used for frontend code, and as a result I’ve started writing a few examples in it, and one work-related side project. I’m a big fan of dogfooding to ensure that my projects are easy to use. You can find that package at the html repo.

It’s possible to have multiple Derw and Coed apps in a single page, either from the same codebase or different ones.

Here’s an example of a counter from the official Elm examples, implemented in Derw:

And the code to get there:

Wordle in Derw

Wordle is a neat little game that’s getting quite a bit of internet attention, much as 2048 did a few years ago. Seeing how visually appealing it was, and the fact that it relies on state updates (e.g. when a new guess is entered), I thought it would be a great candidate for implementing in Derw. So I did: it can be found here and the code here.

Kernel code

In Elm, there is a way of writing code that isn’t type safe, mostly used for things that otherwise wouldn’t be possible in Elm, or for performance reasons. An example usage is calling JavaScript functions or operators (like bitwise operations). In Derw, we don’t need that as much, as most TypeScript can be called from Derw itself. However, there’s one case that breaks the mold: `typeof`. Since it’s not a function but takes arguments, it’s pretty unique in the world of JS. So in order to write boxing code that depends on the results of `typeof`, I wrote some kernel code. In Derw, kernel code can be roughly type safe thanks to the fact that our kernel code isn’t in JS, but is in TS.
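As an illustration of why `typeof` needs this treatment, the sketch below wraps the operator in an ordinary TypeScript function. The names `JsTypeTag` and `kernelTypeOf` are assumptions made up for this example and are not taken from Derw’s kernel code.

```typescript
// Hypothetical kernel helper: the names here are assumptions.
type JsTypeTag =
    | "string"
    | "number"
    | "bigint"
    | "boolean"
    | "symbol"
    | "undefined"
    | "object"
    | "function";

// typeof is an operator, so it cannot be passed around like a function.
// Wrapping it once in TypeScript gives other code a plain, typed function.
function kernelTypeOf(value: unknown): JsTypeTag {
    return typeof value;
}

// Once wrapped, it can be used anywhere a function is expected.
const exampleTags = ["hello", 5, null, () => 1].map(kernelTypeOf);
// exampleTags is ["string", "number", "object", "function"]
```

The wrapper stays type safe on the TypeScript side: the return type is a closed set of string literals rather than an arbitrary string.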

Switch from flags to commands for root actions

I originally intended for there to be a single binary which could be controlled via flags to switch what the binary was doing. This worked okay until I started doing packaging related tasks, and realized that each path may have different sub-flags. For example a `--compile` flag might be paired with `--output`, but `--output` makes no sense in the context of an `--init` flag. As a result I’ve now added compile, init, install, and test as root commands that can be used and have their own distinct code paths with unique flag parsers. The file was also getting a little complicated, so I moved each command out into its own folder, the remaining code being quite succinct.
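The shape of that change can be sketched roughly as follows. The command names and the `--output` and `--watch` flags come from this post; everything else (`commandFlags`, `parseCommand`, the parsing rules, and the empty flag lists for init and install) is an assumption for illustration, not how the Derw CLI is actually implemented.

```typescript
// Hypothetical sketch: the parsing rules below are invented for illustration.
type ParsedFlags = { [flag: string]: string | true };

// Each root command declares which flags it understands.
const commandFlags: { [command: string]: string[] } = {
    compile: ["--output", "--watch"],
    init: [],
    install: [],
    test: ["--watch"],
};

function parseCommand(argv: string[]): { command: string; flags: ParsedFlags } {
    const [command, ...rest] = argv;
    const allowed = commandFlags[command];
    if (allowed === undefined) {
        throw new Error(`Unknown command: ${command}`);
    }

    const flags: ParsedFlags = {};
    for (let i = 0; i < rest.length; i++) {
        const flag = rest[i];
        // A flag that belongs to another command is rejected here.
        if (allowed.indexOf(flag) === -1) {
            throw new Error(`${flag} makes no sense for ${command}`);
        }
        // A flag followed by a non-flag token takes that token as its value.
        const next = rest[i + 1];
        if (next !== undefined && next.slice(0, 2) !== "--") {
            flags[flag] = next;
            i++;
        } else {
            flags[flag] = true;
        }
    }
    return { command, flags };
}
```

With a table like this, `parseCommand(["compile", "--output", "dist"])` succeeds while `parseCommand(["init", "--output", "dist"])` throws, which is the separation a single flat flag parser made awkward.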

Installing packages

I’ve chosen to go for a lightweight route for now, with a solution that clones repos into a derw-packages directory, and then fetches a particular reference and checks it out. This probably won’t scale, and I’m not sure if GitHub wants to be used in that way. It seems they have retired the previously open /zip and /tarball endpoints, which Elm used initially. Or there might be something I’m missing.

Package resolution is initially very simple: all packages installed must use other packages at the same version. This could mean a particular branch.
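A rough sketch of that rule: collect every requested dependency and report any package that is asked for at more than one version. The `Dependency` shape, the function name, and the package names in the usage note are hypothetical; the post does not describe the actual data format used by the installer.

```typescript
// Hypothetical shapes: the real package metadata format is not shown here.
type Dependency = { packageName: string; version: string };

// Returns the names of packages requested at more than one version.
// A version may be a tag or a branch name; only equality matters.
function findVersionConflicts(requested: Dependency[]): string[] {
    // Object.create(null) avoids clashes with inherited keys like "constructor".
    const seen: { [packageName: string]: string } = Object.create(null);
    const conflicts: string[] = [];

    for (const dep of requested) {
        const previous = seen[dep.packageName];
        if (previous === undefined) {
            seen[dep.packageName] = dep.version;
        } else if (
            previous !== dep.version &&
            conflicts.indexOf(dep.packageName) === -1
        ) {
            conflicts.push(dep.packageName);
        }
    }
    return conflicts;
}
```

For example, if one package asks for `example/html` at `main` and another asks for it at `v2`, the function returns `["example/html"]` and the install would be rejected under this rule.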

Packages are compiled when running `derw compile` from within the project directory. You can then refer to those files via “../derw-packages/<package path>/src/<filename>”. A better setup will come with time, but this should be enough to get started with. You can check out how I use it in the Wordle example.

An info command

Sometimes when you’re working with a big module, it can be quite complicated to figure out what you have and haven’t exported. In order to make that debugging experience easier when releasing packages, I’ve added a `derw info` command. For example, in the case below there are several functions that I forgot to expose: node, map, render, class_, style_ and attribute.

Name lookups

I’ve added some basic name lookup to see if a given name is within the current scope. The current scope being: in the current file, imported from another file, given as an argument or in a let block within the current function. This isn’t perfect yet, as there’s no coverage for functions on objects from TS/JS. For example, `"hello".split("")`. As a result I’ve put it behind the `--names` flag.
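The check can be pictured as a membership test over those four sources of names. The sketch below uses a hypothetical `NameScope` type and `isNameInScope` function; it mirrors the description above rather than the compiler’s real data model.

```typescript
// Hypothetical structure: one field per source of names listed above.
type NameScope = {
    definedInFile: string[];
    imported: string[];
    functionArguments: string[];
    letBindings: string[];
};

// A name is in scope if any of the four sources contains it.
function isNameInScope(candidate: string, scope: NameScope): boolean {
    const allNames = scope.definedInFile.concat(
        scope.imported,
        scope.functionArguments,
        scope.letBindings
    );
    return allNames.indexOf(candidate) !== -1;
}

const exampleScope: NameScope = {
    definedInFile: ["view", "update"],
    imported: ["text", "div"],
    functionArguments: ["model"],
    letBindings: ["greeting"],
};
```

With this scope, `isNameInScope("model", exampleScope)` is true, but `isNameInScope("split", exampleScope)` is false: a method that lives on a JS string is not in any of the lists, which is the kind of false alarm that keeps the check behind a flag for now.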

Handling globals

There’s a mixture of various objects and functions that exist in the default namespace in JavaScript. Think of `isNaN`, `console`, etc. Not all of these are standardized across every platform, including `console`, which makes supporting them difficult unless you either have shims from a bundling step, or the compiler is aware of what platform you’re targeting. It might be worth later on adding a target: web | node attribute to derw-package.json, but for now I’ve added a default import called `globalThis` to every namespace, which is the supported way to access globals on most platforms except IE.
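In TypeScript terms, going through `globalThis` looks like the short sketch below. The helper `hasGlobal` and the constant names are made up for this example; `globalThis`, `parseInt`, `isNaN` and `Number` are standard JavaScript.

```typescript
// Reaching globals through globalThis works the same in browsers and Node,
// without assuming that either window or global exists.
const parsedValue: number = globalThis.parseInt("42", 10);
const notANumber: boolean = globalThis.isNaN(globalThis.Number("derw"));

// Hypothetical helper: platform-specific globals can be checked before use.
function hasGlobal(globalName: string): boolean {
    return globalName in globalThis;
}
```

A check like `hasGlobal("console")` gives code a way to degrade gracefully on a platform where a particular global is missing, without the compiler needing to know the target.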

Improvements to inference

Currently, Derw inference is mostly limited to literals and constructors. Functions aren’t yet looked up, meaning that for an expression like `text "hello"` the compiler does not know that `text` returns `HtmlNode msg`. Constructed values, like `Just 5`, are inferred though. I rewrote the compiling steps to provide each file with the full type tree of every imported Derw file, meaning that they can look up the union types and type aliases in a program.

Repl

One of the most common ways of exploring a new language is to run an interactive session via a REPL. Typical behaviour of a REPL is to evaluate given input and print it, allowing users to then continue the same session, e.g. to first define a function then use it.

In Derw, all constants and functions must have types. But a common way to use a REPL is to quickly write a short line and run it without going through the typical parsing engine. So I added `:eval` which allows a user to evaluate a value without needing to provide types for it.

Playground

I wanted to show people how examples in Derw look, both in terms of generated code and the runtime. I spotted that Ren had a playground similar to try.elm-lang.org which seemed simple and lightweight, so I took the same approach. It required bundling the compiler as a frontend-compatible package, which was fine except for one function I was using from the `path` module.

The playground allows you to enter any Derw, and choose between seeing it as TypeScript, JavaScript, Derw and Elm. The next step is to evaluate the generated JS and embed the generated program into the page, so that you could explore the html package.

Bundling

The expected developer experience when writing frontend code should be: compile the code, then include it in your index.html. But to get there from Derw code, it goes through a chain like: Derw → TypeScript → JavaScript → bundler, where bundler is some library that supports modules. Ideally I’d want that step to be as simple as possible, with no need to mess with complicated config files or plugin systems.

Parcel has been the tool I’ve used for TypeScript development, though I’ve encountered a couple of annoying things: a large and slow install time, and watch mode does not match production mode. There are bugs you can encounter where watch mode will work but production will give you an error. And those errors are often cryptic. So instead I looked into esbuild, and the speed blew me away.

I added a command for watching and bundling Derw files: `derw bundle`. With this addition there’s no need for any other tooling while developing Derw code; you can do it all from the Derw binary.

Tasks

Tasks are a representation of side-effectful bits of code. In Elm they have two type arguments: one type for the success state, then one type for the fail state. These tasks are run via commands and are managed by an effect manager, which then feeds the data into the main loop that’s used by Elm. This means that Tasks eventually must be wrapped in the msg type you used for the update loop.

In Derw, Tasks are thin wrappers around promises. They’re defined within kernel code, along with a function that will actually run them. Unlike Elm, Derw isn’t built entirely around a single update loop. What this means is that there’s no natural place for evaluated tasks to feed data to. Coed (and therefore Derw’s Html library) has a callback that can be used to feed data into the main loop, so that works out similarly to Elm. Dealing with that server-side, however, has not yet landed. Tasks can also currently only succeed, so the error state will be added soon.
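The idea of a task as a thin wrapper around a promise can be sketched in TypeScript like this. The names `SketchTask`, `succeedTask`, `mapTask` and `runTask` are hypothetical and this is not Derw’s kernel code; like the current state described above, the sketch only models the success case.

```typescript
// Hypothetical names: this is the same idea, not Derw's kernel code.
// A task describes work without starting it.
type SketchTask<value> = { run: () => Promise<value> };

// A task that immediately succeeds with the given value once run.
function succeedTask<value>(v: value): SketchTask<value> {
    return { run: () => Promise.resolve(v) };
}

// Transform the eventual result without running the task.
function mapTask<a, b>(fn: (x: a) => b, task: SketchTask<a>): SketchTask<b> {
    return { run: () => task.run().then(fn) };
}

// Running a task needs somewhere to send the result. In an MVU program that
// would be the callback feeding messages into the update loop.
function runTask<value>(
    task: SketchTask<value>,
    send: (v: value) => void
): Promise<void> {
    return task.run().then(send);
}
```

Keeping the promise behind a `run` function is what separates describing an effect from performing it: nothing happens until `runTask` is called with a destination for the result.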

Server side code

I’ve started a server package, which is currently a proof of concept to show that a wrapped http server from Node could work. Long term, it might be best to just provide a wrapper around Express rather than rewriting a server myself, but perhaps the server needs to be adapted for my intended workflow in Derw.

Homepage

There is now a homepage summarizing Derw with some useful links found here. It’s not written in Derw as there’s not much interaction or the need for dynamic content, so it’s plain HTML. I’m not the best salesman when it comes to pitching this kind of content, so if anyone feels like making a pull request to improve either the wording or the styling, please feel free to.

Case..of improvements

Now you can match a value name in a case..of against strings, and use default to apply a default case.

Additionally, it’s now possible to have a let..in definition within a branch of a case..of.

Small things

Don’t apply <any> to union type tags which don’t need it
Block reserved names like Object or Function from being used as constructors
Only compile files once per compile
Ensure const and return types are fully qualified
Fix handling of multiple type args (e.g. Either a b)
Nested arrays should handle function calls inside properly
Tests for init command
Nested object literals should work as expected
Published on npm as `derw` rather than `@eeue56/derw`
Fix handling of fields with type arguments in type aliases
Imports of derw files now support a derw file extension in the import name
Init now adds non-important derw files to gitignore
Add a `--watch` flag for derw test and derw compile
Empty type aliases now work

Thought cauldron

Objects vs type aliases vs dict

I’ve run into the problem of objects, which I’d thought about a bit during my Elm days. Objects in JavaScript are basically untyped, where keys can be anything and values can be anything. In languages where everything is typed, you must have a consistent type for the key, and a consistent type for the value. You could have a union type for either, where the type is a combination of two types, e.g. `number | string`. Elm solves this in a couple of ways. First, you can use type aliases to represent objects where you know all the fields and their types. With a type alias you’re free to represent properties entirely independently - that is to say, the type of one field does not impact the type of another. The second solution is a Dict, a map where you know that the type of all the keys will be the same, and all the values will be the same. E.g. a dict might have strings as the keys and numbers as the values, and cannot deviate from that. Dicts are therefore useful when you don’t know all the keys at compile time.

If we weren’t having interop with TypeScript and JavaScript, we could take the same approach as Elm. But since we’re going to work with complicated objects that might be used in unexpected ways, we need a way to look up properties at runtime and handle the appropriate value. Lookups should be straightforward - either use indexing to see if something is undefined, or use Object.keys. The response is more problematic, as that will be used in the rest of the code rather than just for lookups. So perhaps the idea would be to have a union type that represents all possible object values, and wrap retrieved values in that. The downside is that unboxing the value will be both slower programmatically, and slower in developer experience. For example if we have a union type `Number number | String string | Function (a → b)`, every time you want to use a retrieved value you’d need to pattern match on the response.

So I’m thinking there might be four ways of dealing with types:

Type-safe Dict as in Elm, implemented in pure Derw
Type aliases as in Elm and in Derw today
Unsafe Dict methods that take any and return any
Wrapped Dict methods that take an object and wrap the values and keys in a boxed type
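The fourth option, and the cost of unboxing, can be sketched in TypeScript as below. The union follows the `Number | String | Function` example above, with an extra `Unknown` case added as an assumption to cover everything else; `BoxedValue`, `MaybeBoxed`, `getBoxed` and `describeField` are hypothetical names.

```typescript
// Hypothetical names; the "Unknown" case is an assumption for other values.
type BoxedValue =
    | { kind: "Number"; value: number }
    | { kind: "String"; value: string }
    | { kind: "Function"; value: Function }
    | { kind: "Unknown"; value: unknown };

type MaybeBoxed = { kind: "Just"; value: BoxedValue } | { kind: "Nothing" };

// Look up a key at runtime and wrap whatever is found in a boxed type.
function getBoxed(obj: { [key: string]: unknown }, key: string): MaybeBoxed {
    if (!Object.prototype.hasOwnProperty.call(obj, key)) {
        return { kind: "Nothing" };
    }
    const raw = obj[key];
    if (typeof raw === "number") {
        return { kind: "Just", value: { kind: "Number", value: raw } };
    }
    if (typeof raw === "string") {
        return { kind: "Just", value: { kind: "String", value: raw } };
    }
    if (typeof raw === "function") {
        return { kind: "Just", value: { kind: "Function", value: raw } };
    }
    return { kind: "Just", value: { kind: "Unknown", value: raw } };
}

// Every use of a retrieved value has to match on the box first.
function describeField(obj: { [key: string]: unknown }, key: string): string {
    const found = getBoxed(obj, key);
    if (found.kind === "Nothing") {
        return "missing";
    }
    const boxed = found.value;
    switch (boxed.kind) {
        case "Number":
            return `number ${boxed.value}`;
        case "String":
            return `string ${boxed.value}`;
        case "Function":
            return "function";
        case "Unknown":
            return "something else";
    }
}
```

The lookup itself is short, but `describeField` shows the developer-experience cost: two layers of matching before the value can be used at all.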

Hiding the internals

There are a couple of approaches to dealing with internal/kernel/native code. The Elm approach is to minimize it as much as possible, leading to a small amount of untyped code that the core libraries rely on. Kernel code is restricted everywhere else outside of various hacks, in order to ensure that all other Elm code is valid, type safe, and unlikely to cause runtime errors (other than obvious logical issues like infinite recursion). The lack of unmanaged side effects also means that you can be sure that any package you install isn’t sending off your data to some dodgy server, unless it explicitly works with side effects (in the form of Tasks). Working with existing libraries therefore is quite difficult, and in some cases impossible. JavaScript and Elm just simply have a different model, even if Elm eventually becomes JavaScript.

The other approach is to allow users to write custom low level code, like the C that many Python packages rely on. This means that working with existing libraries is really easy, at the cost of all the unreliability of the underlying language and the mapping between Python and C or Fortran. That’s right, Python’s model for managing this is so powerful that even decades old Fortran is able to be used in Python’s numerical and scientific packages.

With Derw, the goal is to approach the developer experience of Elm, with the flexibility of Python. The cost of this is quite high: if Derw can call TypeScript, we lose all the type safety we had guaranteed in TypeScript. But here’s a powerful aspect: as Derw’s underlying code is TypeScript, and it always generates type-safe TypeScript, we can provide a greater level of safety than say a wrapper which ignores the underlying types.

Testing

I’m a big fan of codebases that have tests that follow these three rules: 1) tests should be quick to run so that it’s natural to run them often, 2) integration tests should represent reality as closely as possible, and 3) unit tests should be consistently named.

In Derw’s compiler, I’m using Bach to run tests. Each language feature or syntax has a file dedicated to it, in some cases multiple where edge cases are involved. Each file has consistently named functions, for example testParse and testGenerate. This means if I want I can quickly test a particular piece of syntax against a change I made to the parser or generator. Then I have a bunch of tests that compile my examples folder and the stdlib. These mostly make sure that generation is consistent between changes, acting as my integration tests. I need to expand my testing of command line options, to ensure that I don’t break some command while refactoring.

Currently, running the entire suite (~650 unit tests) takes 7 seconds locally. It’s around the same in CI. Most of this time is due to long start times of ts-node. As we approach 1 minute, I’ll look into making that faster. But locally I’m also able to filter a specific file or function very easily, using Bach’s `--file` or `--function` arguments. For example, if I want to just test parsing of nested arrays, I can run `npm run test -- --file src/tests/nested_array_test.ts --function testParse` which is quick.

New languages on GitHub

GitHub has a policy about when it adds a language to the stats and the syntax highlighting. It sits at 200 unique repos required, which basically means that unless your language gets a decent amount of packages made, your code is gonna be mislabeled and without highlighting. There’s no way to say “this syntax is basically 90% of X language”, so sadly Derw source files will remain without highlighting. This kinda sucks: surely there can’t be so many new languages that have got as far as having a syntax tree that it’s overwhelming? I’m sure there’s good reasoning behind it, but it seems to really harm language adoption. If I want my code to be as readable as it is for me in my editor, I have to share screenshots rather than direct links to code. Direct links would allow people to try out the code themselves and make changes.
