Compiler

The Basilisp compiler is a three-step, form-at-a-time compiler. This means that, unlike statically compiled languages, the unit of compilation is the top-level form within a namespace rather than the entire namespace. This is very convenient for Basilisp’s metaprogramming capabilities: a previously defined function can be used inside of a macro on the next line! Even the initial functions that supply much of the functionality for macros and syntax quoting are defined in the beginning part of basilisp.core. However, this flexibility also introduces some significant drawbacks. Since the compiler cannot reason about larger units of code, it cannot make inferences or linkages that might improve performance or reduce the likelihood of introducing a subtle bug due to the dynamism of the language.

There are three steps to the Basilisp compiler: analysis, generation, and optimization. After a Basilisp form is read in by the Reader, it is passed off to the analyzer to produce an abstract syntax tree. The generator reads that AST and produces a Python AST (_not_ bytecode) using Python’s builtin ast module. Afterwards, the compiler passes the generated Python AST through a quick optimization phase to remove redundant branches and other artifacts of code generation. From there, the compiler injects the compiled code into a dynamically generated Python module which is associated with a Basilisp Namespace and executes the code so the generated objects are available.
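The last three stages can be sketched with nothing but the Python standard library. This is a minimal illustration, not Basilisp's implementation: the names compile_form, PruneConstantBranches, and example.ns are hypothetical, and Python source text stands in for generator output, which in the real compiler would be built directly as ast nodes.

```python
import ast
import types


class PruneConstantBranches(ast.NodeTransformer):
    """Toy optimization pass: replace `if <constant>:` with the branch taken."""

    def visit_If(self, node: ast.If):
        self.generic_visit(node)
        if isinstance(node.test, ast.Constant):
            taken = node.body if node.test.value else node.orelse
            # Returning a list splices the surviving statements in place;
            # fall back to `pass` so an enclosing block is never left empty.
            return taken or ast.Pass()
        return node


def compile_form(py_source: str, module: types.ModuleType) -> None:
    """Run one top-level form through generate -> optimize -> execute."""
    tree = ast.parse(py_source)                 # stand-in for "generation"
    tree = PruneConstantBranches().visit(tree)  # "optimization"
    ast.fix_missing_locations(tree)
    code = compile(tree, filename="<form>", mode="exec")
    exec(code, module.__dict__)                 # inject into the module


# Hypothetical module standing in for the one backing a namespace.
ns_module = types.ModuleType("example.ns")

# Forms are compiled one at a time, so the second form can use the first.
compile_form("def double(x):\n    return x * 2", ns_module)
compile_form(
    "if True:\n    result = double(21)\nelse:\n    result = None",
    ns_module,
)

print(ns_module.result)  # 42
```

Because each form is executed before the next one is compiled, anything the first form defines is already a live object when the second is processed, which is the property that form-at-a-time compilation relies on.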

The Basilisp compiler generates code and objects within the same process it is operating in, but it also caches the bytecode generated by the Python compiler. As such, the Basilisp compiler needs to be careful to create code in such a way that it does not assume the state of the current process, since it may either be executing cached code in a fresh process or it may be operating within the same process the Basilisp code was compiled in.

Configuration

The Basilisp compiler includes a few configuration options for emitting warnings and tweaking code generation that may be useful, particularly during development. The current Basilisp compiler options may be examined and modified using the dynamic Var basilisp.core/*compiler-opts*.

Warnings

The following settings enable and disable warnings from the Basilisp compiler during compilation.

  • warn-on-arity-mismatch - if true, emit warnings if a Basilisp function invocation is detected with an unsupported number of arguments

    • Environment Variable: BASILISP_WARN_ON_ARITY_MISMATCH

    • Default: true

  • warn-on-shadowed-name - if true, emit warnings if a local name is shadowed by another local name

    • Environment Variable: BASILISP_WARN_ON_SHADOWED_NAME

    • Default: false

  • warn-on-shadowed-var - if true, emit warnings if a Var name is shadowed by a local name

    • Environment Variable: BASILISP_WARN_ON_SHADOWED_VAR

    • Default: false

  • warn-on-unused-names - if true, emit warnings if a local name is bound and unused

    • Environment Variable: BASILISP_WARN_ON_UNUSED_NAMES

    • Default: true

  • warn-on-non-dynamic-set - if true, emit warnings if the compiler detects an attempt to set! a Var which is not marked as ^:dynamic

    • Environment Variable: BASILISP_WARN_ON_NON_DYNAMIC_SET

    • Default: true

  • warn-on-var-indirection - if true, emit warnings if a Var reference cannot be direct linked

    • Environment Variable: BASILISP_WARN_ON_VAR_INDIRECTION

    • Default: true

    • See also: Direct Linking

Generation Settings

The following settings can affect the generated Python code.

  • generate-auto-inlines - if true, the compiler will generate inline function definitions for any def'ed functions with the ^{:inline true} metadata (replacing the boolean ^:inline key with the inline function)

    • Environment Variable: BASILISP_GENERATE_AUTO_INLINES

    • Default: true

    • See also: Inlining

  • inline-functions - if true, any invocations of a function with a callable ^:inline metadata key will be replaced with the return value of that callable (as a macro)

    • Environment Variable: BASILISP_INLINE_FUNCTIONS

    • Default: true

    • See also: Inlining

  • use-var-indirection - if true, all Var accesses will be performed via Var indirection

    • Environment Variable: BASILISP_USE_VAR_INDIRECTION

    • Default: false

    • See also: Direct Linking

Namespace Caching

The Basilisp compiler aggressively caches compiled namespace modules because compilation is relatively expensive and leads to significant slowdowns when starting Basilisp. Basilisp namespaces are cached using the same mechanism the Python compiler uses – namespaces are cached as bytecode and only recompiled when the mtime of the source file differs from the mtime stored in the header of the cached file.
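The mtime comparison described above can be sketched against CPython's own timestamp-based .pyc files. The helper names cached_mtime and is_cache_fresh are hypothetical, and the sketch uses a plain Python source file; it assumes the header layout from PEP 552 and does not claim that Basilisp's cache files are byte-for-byte identical.

```python
import os
import py_compile
import struct
import tempfile


def cached_mtime(pyc_path: str) -> int:
    """Read the source mtime recorded in a timestamp-based .pyc header."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    # PEP 552 header: magic, flags, source mtime, source size (4 bytes each).
    _magic, _flags, mtime, _size = struct.unpack("<4sIII", header)
    return mtime


def is_cache_fresh(source_path: str, pyc_path: str) -> bool:
    """True if the cache exists and records the source's current mtime."""
    if not os.path.exists(pyc_path):
        return False
    source_mtime = int(os.stat(source_path).st_mtime) & 0xFFFFFFFF
    return cached_mtime(pyc_path) == source_mtime


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")
    pyc = os.path.join(tmp, "example.pyc")
    with open(src, "w") as f:
        f.write("x = 1\n")
    py_compile.compile(
        src,
        cfile=pyc,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    fresh_before = is_cache_fresh(src, pyc)

    # Simulate an edit by moving the source mtime forward ten seconds.
    st = os.stat(src)
    os.utime(src, (st.st_atime, st.st_mtime + 10))
    fresh_after = is_cache_fresh(src, pyc)

print(fresh_before, fresh_after)  # True False
```

Note that the check is an equality test on timestamps, not a content comparison, so a cached file is reused whenever the recorded mtime still matches the source.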

There may be times when the caching behavior is undesirable. During development in particular, cached namespaces may get out of sync with other uncached modules you are frequently updating, causing hard-to-diagnose bugs. In such cases, you can tell the Basilisp import mechanism to always ignore the cached copy of a namespace using the BASILISP_DO_NOT_CACHE_NAMESPACES environment variable.

export BASILISP_DO_NOT_CACHE_NAMESPACES=true

Direct Linking

By default, the Basilisp compiler attempts to generate direct links between generated Python code during compilation to improve performance. For example, if you defn a function, the compiler will generate a raw Python function and also intern that function in a Var in the current namespace. Accessing the function (for instance to call it) via its Var involves a dynamic lookup on the current Namespace (which may use a lock), whereas a direct linked reference to the function will circumvent the Var lookup entirely. This type of direct linking is similar to how you might reference a Python variable from within a Python function – no need for an extra lookup that must be performed at runtime.
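The difference between the two access paths can be modelled in plain Python. The Var and Namespace classes below are hypothetical, heavily simplified stand-ins rather than Basilisp's runtime types, and the direct link is modelled by resolving the function object once, ahead of the call.

```python
import threading


class Var:
    """Hypothetical stand-in for a Var: a mutable box holding a value."""

    def __init__(self, value):
        self.value = value


class Namespace:
    """Hypothetical stand-in for a Namespace with lock-guarded interns."""

    def __init__(self):
        self._lock = threading.Lock()
        self._interns = {}

    def intern(self, name, value):
        with self._lock:
            var = self._interns.get(name)
            if var is None:
                var = self._interns[name] = Var(value)
            else:
                var.value = value  # re-defining updates the existing Var
            return var

    def find(self, name):
        with self._lock:
            return self._interns[name]


ns = Namespace()
ns.intern("greet", lambda: "hello")

# Direct link: the function object is resolved once, up front.
direct_target = ns.find("greet").value


def call_direct():
    return direct_target()


def call_indirect():
    # Var indirection: look the Var up (taking the lock) on every call.
    return ns.find("greet").value()


# Re-define the function after both callers have been created.
ns.intern("greet", lambda: "goodbye")

print(call_direct())    # hello
print(call_indirect())  # goodbye
```

The direct caller avoids the lock and the dictionary lookup on every call, but it keeps using the object it resolved originally, while the indirect caller pays for the lookup and sees the new definition.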

There are cases where it may be impossible to emit a direct link to a Basilisp Var or name, such as when the Var is def-ed inside of a function. In such cases, the Basilisp compiler will emit a warning to let you know it is being forced to indirect through the Var. You can configure whether or not you see warnings for such things as described in Warnings.

Individual Vars may be accessed using indirection based on specific metadata even if direct linking is enabled. The ^:dynamic metadata key will force all accesses to the so-marked Var to be indirect to allow for thread-local sets (which are a feature of the Var, not the value inside the Var). The ^:redef metadata key can be used if you intend to re-def a Var later and you need changes to be propagated. Although it is unlikely you will want to, you can also configure the compiler to emit all Var accesses with indirection using the use-var-indirection configuration option in Generation Settings.

Note

Changes to Vars which were direct linked will not be propagated to any code that used the direct link, rather than Var indirection.

Note

It is possible to initially define a Var with ^:redef and then remove that metadata later. Uses compiled after the metadata is removed may be direct linked, while uses compiled while ^:redef was set will continue to use indirection.

Inlining

The Basilisp compiler supports inlining function calls directly into a call site for simple functions. Inline definitions can be provided for named (defn'ed) functions by providing an anonymous function on the :inline meta key. The compiler will automatically inline calls to functions annotated with such a function in their meta if inlining is enabled.

The compiler additionally supports automatically generating inline function definitions for simple functions. Functions annotated with a boolean :inline meta key will have inline definitions generated automatically at compile time and will thereafter be eligible to be inlined (subject to the current function inlining settings set on the compiler). Only “simple” functions are eligible for inlining. Simple functions are functions of a single _fixed_ arity (no variadic functions) with only a single body expression. Functions not meeting these criteria will trigger compile time errors if they are annotated with boolean :inline metadata.
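The eligibility rule and the call-site rewrite can be illustrated on Python ASTs. This is a naive sketch under stated assumptions, not the compiler's algorithm: inline_template, Inliner, and _Substitute are hypothetical names, a Python def with a single return stands in for a "simple" function, and arguments are substituted textually, so an argument expression used twice in the body is evaluated twice.

```python
import ast
import copy


def inline_template(fn_source: str):
    """Return (name, params, body expression) for a "simple" function.

    Raises ValueError for functions that are variadic, take defaults or
    keyword-only arguments, or have more than a single returned expression.
    """
    fn = ast.parse(fn_source).body[0]
    if not isinstance(fn, ast.FunctionDef):
        raise ValueError("expected a function definition")
    a = fn.args
    not_fixed_arity = (
        a.vararg or a.kwarg or a.kwonlyargs or a.defaults or a.posonlyargs
    )
    single_expr = (
        len(fn.body) == 1
        and isinstance(fn.body[0], ast.Return)
        and fn.body[0].value is not None
    )
    if not_fixed_arity or not single_expr:
        raise ValueError(f"{fn.name} is not eligible for inlining")
    return fn.name, [p.arg for p in a.args], fn.body[0].value


class _Substitute(ast.NodeTransformer):
    """Replace parameter names in the body with the call's arguments."""

    def __init__(self, bindings):
        self.bindings = bindings

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load) and node.id in self.bindings:
            return copy.deepcopy(self.bindings[node.id])
        return node


class Inliner(ast.NodeTransformer):
    """Rewrite calls to the named function into its body expression."""

    def __init__(self, name, params, body):
        self.name, self.params, self.body = name, params, body

    def visit_Call(self, node):
        self.generic_visit(node)
        if (
            isinstance(node.func, ast.Name)
            and node.func.id == self.name
            and len(node.args) == len(self.params)
            and not node.keywords
        ):
            bindings = dict(zip(self.params, node.args))
            return _Substitute(bindings).visit(copy.deepcopy(self.body))
        return node


template = inline_template("def square(x):\n    return x * x")
tree = Inliner(*template).visit(ast.parse("y = square(n + 1)"))
inlined_source = ast.unparse(tree)  # requires Python 3.9+
print(inlined_source)
```

After the rewrite the call to square no longer appears at the call site, which is also why an inlined function is absent from stack traces and cannot be swapped out at runtime.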

Note

Individual instances of inlining may be disabled by annotating the call site with the :no-inline metadata.

^:no-inline (first [1 2 3])

Warning

The boolean :inline key must be applied to the fn form or the optional fn name itself. The user-provided :inline function must be applied to the Var, which is generally done by applying the metadata to the def name itself. Users are encouraged to simply apply these meta keys with defn, which will always do the right thing regardless of where you apply the metadata.

Warning

Inlining has a clear benefit: it improves the performance of simple function calls.

However, inlining can come with some significant drawbacks if you aren’t careful. One such drawback is that an inlined function which references an imported, required, or referred symbol that is not available at the inlined call site will not work and will produce a compile time error. Another drawback is that inlining, like macros, occurs at compile time and thus changes the final generated code – stack traces will not include the original inlined function invocation, which can impede debugging. Relatedly, an inlined function cannot be re-def'ed, monkeypatched, or rebound at runtime.

Users should consider inlining primarily a Basilisp internal feature and use it extremely sparingly in user code.

Debugging

The compiler generates Python code by generating Python AST nodes, rather than emitting the raw Python code as text. This is convenient for the compiler, but inspecting Python AST nodes manually for bugs can be a bit of a challenge even with a debugger. For this reason, the Basilisp compiler can also use ast.unparse (or the astor library in versions of Python prior to 3.9) to generate raw Python code for visual inspection.
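The gap between the two views is easy to see with a small hand-built tree. The sketch below uses only the standard ast module (ast.unparse and the indent argument to ast.dump need Python 3.9 or later); the assignment it builds is an arbitrary example, not compiler output.

```python
import ast

# Build `answer = 6 * 7` directly from nodes, without any source text.
assign = ast.Assign(
    targets=[ast.Name(id="answer", ctx=ast.Store())],
    value=ast.BinOp(
        left=ast.Constant(6),
        op=ast.Mult(),
        right=ast.Constant(7),
    ),
)
module = ast.fix_missing_locations(ast.Module(body=[assign], type_ignores=[]))

# The node dump is precise but hard to scan for logic errors.
print(ast.dump(module, indent=2))

# The unparsed form is ordinary Python source.
generated_source = ast.unparse(module)
print(generated_source)  # answer = 6 * 7
```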

Currently, the compiler is configured to automatically generate Python code for all namespaces. This code generation isn’t slow, but it does add an appreciable amount of time to the compilation of each individual namespace. Users can disable this behavior using the BASILISP_EMIT_GENERATED_PYTHON environment variable. This setting will be changed to be off by default once Basilisp is in a stable release (e.g. at 1.0).

export BASILISP_EMIT_GENERATED_PYTHON=false

Logging

Basilisp ships with a disabled Python logger set to WARNING. For development, it may be useful to enable the logger or to change its log level. The former can be configured via the environment variable BASILISP_USE_DEV_LOGGER, while the latter may be set by BASILISP_LOGGING_LEVEL.

export BASILISP_USE_DEV_LOGGER=true
export BASILISP_LOGGING_LEVEL=INFO
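For readers unfamiliar with Python logging, the following sketch shows one way two such variables could map onto the standard logging module. It is an assumption-laden illustration, not Basilisp's code: the configure_logger function, the default logger name, and the exact parsing of the variables are all hypothetical.

```python
import logging
import os


def configure_logger(name: str = "basilisp") -> logging.Logger:
    """Configure a logger from the two environment variables (hypothetical)."""
    level = os.getenv("BASILISP_LOGGING_LEVEL", "WARNING").upper()
    use_dev = os.getenv("BASILISP_USE_DEV_LOGGER", "false").lower() == "true"

    logger = logging.getLogger(name)
    logger.setLevel(level)  # setLevel accepts level names such as "INFO"

    # A NullHandler discards records, which models a "disabled" logger;
    # a StreamHandler writes them to stderr for development use.
    handler = logging.StreamHandler() if use_dev else logging.NullHandler()
    logger.addHandler(handler)
    return logger
```

With both variables exported as shown above, a logger configured this way would emit INFO-level records to stderr; with neither set, records would be discarded.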