Designing an ML-family language on top of TypeScript: import and export syntax

Jul 18, 2022

Importing code from other files is a vital part of any language intended for more than simple scripts. Even with simple scripts, it’s often important to reuse other code. Imports might take the form of inlining code into a single generated file, or they might be more dynamic and done at run time. This is the exploration that led to Derw’s import syntax. We will only briefly touch on dependency management, and focus instead on the syntax for imports within the code itself.

Terminology

An item is something defined within the current language: functions, types, constants, and macros.
A package is a collection of files that compose a defined collection of items.
The scope is the collection of items that are accessible in the current file.
A namespace is the current scope. This is typically items defined within the current file, but can also be populated by importing items from other files.
An import is a definition informing the current file to load content from another file.
A module is a singular file within a package.

Note that behaviors in other languages and frameworks might differ from those stated here, but an attempt to be accurate to current standards was made.

What other languages do

One of the oldest import syntaxes that is still in use is C’s. With C, you can use header files to provide the function signatures, constant values, types, and macros through the #include directive. C allows you to either look up system-wide dependencies with the #include <file> directive, or local headers with #include "file". There are workarounds using the pre-processor, commonly known as include guards, to ensure that imported definitions are only evaluated once. Through macros it is also possible to selectively include different headers, for example when you want to include different files based on the platform being targeted.

The use of macros leads to C being statically compiled but having a very flexible import system that might seem dynamic, since includes can be conditional. So while imports are not evaluated at runtime, it’s possible to have a single codebase with quite different implementations per platform. Linking plays an important role here too: it is possible to either have dynamic linking, where system-wide dependencies are shared between multiple programs and looked up at runtime, or static linking, where all files are combined at compile time to generate a static binary with no external dependencies. Dynamic linking is useful for saving space and allowing dependencies to be updated without needing to recompile the binary. Static linking is good for avoiding problems with dependencies, meaning that the binary can be distributed as-is without needing to include instructions for installing dependencies.

On Linux, it’s typical to distribute header files separately from the compiled files, and only install header files when they are needed for development. On Windows, DLL files are used in projects - and often shipped with the final binary to avoid problems with version incompatibility.

Go’s imports are done through the import keyword, which can take either the name of a package or the path of a Git repo. Imports can be aliased via import m "math". If given a repo URL, then the name of the package is used as the alias. Typically Go imports are grouped together into a single import declaration, with no real benefit other than typing import less often. Functions and constants that begin with a capital letter are exported, and there is no explicit export keyword. It is possible to import everything a package exposes into the current namespace, as if it were defined in the current file itself, through dot syntax: import . "math".

Go’s story of distribution of shared code is a little complicated. Originally there were packages that would exist in your GOPATH directory, and running go get would fetch the dependencies and put them in there. Now Go modules exist, and work in a similar way, with some restrictions on how the repo looks in order to make it conform to the module layout, along with a module file (go.mod) that defines dependencies.

Java groups everything under a package. External packages can be imported through the import syntax, typically namespaced under the creator’s company name or domain. Multiple files can compose a single package by declaring the same package: package packagename. All public classes within a package can be imported via * syntax. By default, all public classes are accessible from any other class within that same package. Java’s imports are at the top of the file and exist outside of the normal scope of the classes and functions, meaning it is not possible to do dynamic imports. However, fully qualified names (e.g. java.lang.Math.PI) can be used within normal source code.

Python’s import syntax uses two main forms: import x and from x import name. The first will import x into the current namespace, whereas the second will add name into the current namespace. It is also possible to use the wildcard character (*) to import all things from a package into the current namespace, but it’s typically not recommended as it makes refactoring harder. Python’s imports can be anywhere within the file, so it is possible to conditionally import items, or import them only if they are needed. An example of conditionally importing a module might be a command line interface where a certain flag invokes functions that involve a lot of code and other imports: that code is only imported when the flag is passed. Python also allows you to call the import machinery directly as a function (for example importlib.import_module), so imports can be handled without the keyword. This is useful for working with generated Python modules. Imported items remain within their scope: a name imported inside a function is only accessible inside that function by default.

Python’s package system allows a file called __init__.py in each directory, which contains the imports to be exposed from that directory. This allows for a sense of encapsulation: it is possible to only expose “public” names through this method. With CPython, it is also possible to import Python-compatible C code, in the same way that regular Python can be imported. Imported module names can be aliased using import name as other.

Elm’s imports mostly deal only with Elm files. Each file is a module with a given name declared by the module Name exposing (someFunction) syntax. Both local modules and external dependencies use the same naming system, so it’s not possible to have two modules with the same name between them, i.e. if you have a module called Name, and there’s an external dependency exposing a module called Name, then there will be a collision and you’ll need to rename one of them. This is generally not a problem at small scale, as there are rarely times when names might collide. But packages which expose a module called View might find that it collides with most code written by users. It is possible to alias modules using import Name as Other. In core libraries in Elm, kernel code can be imported by referring to the fully-scoped filename, relative to the root of the package. For example src/Elm/Kernel/VirtualDom.js would be imported by import Elm.Kernel.VirtualDom.
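The kernel naming rule above is essentially a mapping between dotted module names and file paths. A minimal sketch of that mapping in TypeScript follows. The function moduleNameToPath and its parameters are hypothetical names used for illustration, not part of Elm’s tooling.

```typescript
// Hypothetical helper: turn a dotted module name into a file path
// relative to the root of the package.
function moduleNameToPath(moduleName: string, sourceDirectory: string, extension: string): string {
    // "Elm.Kernel.VirtualDom" becomes "Elm/Kernel/VirtualDom"
    const relativePath = moduleName.split(".").join("/");
    return `${sourceDirectory}/${relativePath}${extension}`;
}

// The example from the text: import Elm.Kernel.VirtualDom
const kernelPath = moduleNameToPath("Elm.Kernel.VirtualDom", "src", ".js");
// kernelPath is "src/Elm/Kernel/VirtualDom.js"
```

Because the mapping is one-to-one, two modules with the same dotted name map to the same identity, which is where the collisions described above come from.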

JavaScript has many ways of being bundled and combined. Before async and server-side JavaScript existed, it was simplest to include script tags in the order of evaluation. For example, if you depend on jQuery in one script, simply place jQuery before the script tag loading your code. As JavaScript grew in usage, so did the need for multiple files and code sharing. This led to tools that would combine multiple scripts into one file, often referred to as a bundle. To establish inter-file relationships, functions like require were used to specify which files need to load which other files. require takes a file to load, which is then evaluated, and returns an object containing that file’s exports. The JavaScript story gets complicated with alternative standards for dealing with this: AMD, CommonJS, RequireJS and ESModules, some of which were implemented directly in Node, JavaScript’s server-side runtime. Additionally, tools came into existence to deal with processing JavaScript content: browserify, gulp, grunt, webpack, esbuild, parcel, babel. All these tools exist with the intention of making working with JavaScript dependencies and transformations easier. As more HTML-like content came to be written in JavaScript rather than HTML itself, alternative syntaxes like JSX became popular, which need pre-processing before being served as JavaScript. Compile-to-JavaScript and compile-to-CSS steps became a standard part of any large project that dealt with TypeScript/SCSS/LESS/JSX.
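To make the require behaviour described above concrete, here is a toy module registry in TypeScript. It is a sketch of the general idea only, not how Node or any particular bundler is implemented. The names defineModule, requireModule, and ModuleFactory are hypothetical.

```typescript
// A toy registry in the spirit of require-style loading.
// All names here are hypothetical and exist only for this illustration.
type Exports = Record<string, unknown>;
type ModuleFactory = (requireFn: (name: string) => Exports, moduleExports: Exports) => void;

const factories = new Map<string, ModuleFactory>();
const moduleCache = new Map<string, Exports>();

// Register a "file" under a name, as a bundler might when combining scripts.
function defineModule(name: string, factory: ModuleFactory): void {
    factories.set(name, factory);
}

// Evaluate a module the first time it is requested and return its exports.
// Later calls return the cached exports without re-running the module body.
function requireModule(name: string): Exports {
    const cached = moduleCache.get(name);
    if (cached !== undefined) return cached;

    const factory = factories.get(name);
    if (factory === undefined) throw new Error(`Cannot find module '${name}'`);

    const moduleExports: Exports = {};
    moduleCache.set(name, moduleExports); // cache before evaluating so cyclic requires terminate
    factory(requireModule, moduleExports);
    return moduleExports;
}

let evaluations = 0;

defineModule("./math", (_requireFn, moduleExports) => {
    evaluations += 1; // count how often this module body runs
    moduleExports["double"] = (n: number): number => n * 2;
});

defineModule("./main", (requireFn, moduleExports) => {
    const math = requireFn("./math") as { double: (n: number) => number };
    moduleExports["result"] = math.double(21);
});

const mainExports = requireModule("./main");
requireModule("./math"); // already evaluated via "./main", so served from the cache
// mainExports["result"] is 42 and evaluations is 1
```

The order of script tags no longer matters in this style: each file states what it needs, and the loader works out when to evaluate it.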

This leads us to the current situation where Node, Deno and browsers have taken a bet on ESModules, which is what we’ll cover in more detail. ESModules define a way of dealing with imports and exports that is now part of the JavaScript standard. Imports look like import { name } from "./somefile.ts". These imports must have a file extension, which helps make imports cross-platform, unlike the old way without file extensions. Each imported item must be exported through the export keyword, which can be put before each item definition. It is also possible to specify a default export, or an export block, but named exports are clearer on what is being exported. An entire module can be imported under an alias using import * as somefile from "./somefile.ts". Individual items can be aliased when importing with import { name as anotherName } from "./somefile.js". Imports can be done dynamically through the import() function. External dependencies in Node are imported by name, as in import * as fetch from "cross-fetch", which looks up the dependency in node_modules, where the dependencies live. Deno allows dependencies to be imported from a URL.

What Derw does

The syntax for importing and exporting Derw items can be seen as a combination of Elm and ESModules. Local imports look like import "./View" exposing ( view ), which will first look for View.derw, and failing that, View.ts, and failing that, View.js. This is to allow TypeScript and JavaScript items to be imported into Derw files. Derw files should be named in CapsCase, with the extension .derw.

All Derw imports are either relative or, in the case of importing JavaScript or TypeScript from an external dependency, simply the name of the dependency. Note that imports of Derw dependencies are currently done through relative imports, i.e. import "../derw-packages/derw-lang/html/src/Html" as Html. This is to simplify handling of Derw imports: instead of two behaviours for Derw files, there is just one, and it’s all relative. It may be wise to have external Derw dependencies imported in some other fashion, but for now the goal is to keep things simple, especially considering that having to compensate for imports from node_modules is already a toll.

Kernel code should have the extension _kernel.ts, but it is not required. It is recommended simply to differentiate between the Derw code and the plain TypeScript or JavaScript files you wish to work with. Kernel code is intended to be Derw-compatible, and in the Derw docs, you can find the generated code format for each item in Derw.

If an import does not have an alias, then it is automatically aliased to the filename without the extension. There is no way to import all items from a module; instead, it is suggested to use an alias like Html, which is intended to make refactoring and debugging easier.
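The lookup order and the default alias rule can be sketched in TypeScript as below. This is an illustration of the behaviour described in this section, not Derw’s actual implementation. The names candidatePaths, resolveImport, defaultAlias, and the exists callback are assumptions made for the example.

```typescript
// Extensions are tried in this order: Derw first, then TypeScript, then JavaScript.
const EXTENSIONS = [".derw", ".ts", ".js"] as const;

// List the files an import such as "./View" could refer to, in priority order.
function candidatePaths(importPath: string): string[] {
    return EXTENSIONS.map((extension) => importPath + extension);
}

// Return the first candidate that exists, or null when none do.
// `exists` is a hypothetical callback standing in for a real file system check.
function resolveImport(importPath: string, exists: (path: string) => boolean): string | null {
    for (const candidate of candidatePaths(importPath)) {
        if (exists(candidate)) return candidate;
    }
    return null;
}

// An import without an explicit alias is aliased to the filename without its extension.
function defaultAlias(importPath: string): string {
    const fileName = importPath.split("/").pop() ?? importPath;
    const dot = fileName.lastIndexOf(".");
    return dot > 0 ? fileName.slice(0, dot) : fileName;
}

// A pretend file system for the example.
const knownFiles = new Set(["./View.ts", "./View.js", "./Main.derw"]);
const fileExists = (path: string): boolean => knownFiles.has(path);

const resolvedView = resolveImport("./View", fileExists); // "./View.ts": there is no View.derw, and .ts wins over .js
const resolvedMain = resolveImport("./Main", fileExists); // "./Main.derw"
const htmlAlias = defaultAlias("../derw-packages/derw-lang/html/src/Html"); // "Html"
```

Keeping a single, fixed search order means that the file an import refers to can be worked out by reading the import alone.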

Modules do not need a module declaration as in Elm, and exports are done through exposing (name) syntax, which can be given multiple names at once or repeated across multiple exposing declarations. Exposed items can be used in TypeScript or JavaScript, which is typically useful when you have types in Derw you’d like to reuse (e.g. from client-side Derw to server-side TypeScript), or when importing a view function to do hydration. There is no way to expose all items in a file, but test files will automatically expose functions beginning with test so that the test runner can run them.
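The exposure rule can be summarised with a small TypeScript sketch. The function exposedNames is hypothetical, and how a file is recognised as a test file is not covered here, so it is passed in as a plain flag.

```typescript
// Hypothetical helper: compute which names a file exposes.
// Explicitly exposed names are always included. In test files, functions
// whose names begin with "test" are exposed automatically as well.
function exposedNames(explicitlyExposed: string[], definedFunctions: string[], isTestFile: boolean): string[] {
    const exposed = new Set(explicitlyExposed);
    if (isTestFile) {
        for (const functionName of definedFunctions) {
            if (functionName.startsWith("test")) exposed.add(functionName);
        }
    }
    return Array.from(exposed);
}

// A regular file exposes only what it lists.
const regularFile = exposedNames(["view"], ["view", "testView", "helper"], false);
// regularFile is ["view"]

// A test file exposes its test functions without listing them.
const testFile = exposedNames([], ["testAdd", "helper", "testSubtract"], true);
// testFile is ["testAdd", "testSubtract"]
```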

There are no dynamic imports in Derw, but if they are needed they can be done in kernel files. It is possible to call the ESModules import function within Derw code, but it is not considered part of the language.

The philosophy behind the import system is that imports should be simple, obvious, and with the least friction. Relative imports being the main way of writing Derw imports is intended to make it clear what the search path is for a file. Likely this can be made simpler still when dealing with imports from TypeScript/JavaScript, probably through a more explicit “I want JavaScript” import format (e.g. file extensions or an importFromNodeModules-type declaration). Ideally there will be only one way to do imports in Derw, avoiding the fracture between CommonJS and ESModules. Underneath Derw is TypeScript with esbuild, so Derw will follow what they support.

The end result ends up looking something like this:

Found this interesting? Follow Derw on Twitter, or star it on Github.