eucalypt

eucalypt is a tool, and a little language, for generating and transforming structured data formats like YAML, JSON and TOML.

If you use text-based templating to process these formats or you pipe these formats through several different tools or build steps, eucalypt might be able to help you generate your output more cleanly and with fewer cognitive somersaults.

eucalypt is a purely functional language that can be used quickly and easily from the command line.

It has the following features:

  • a concise native syntax that allows you to define data, functions, and operators
  • a simple embedding into YAML files to support in-place manipulation of the data (a la templating)
  • facilities for manipulating blocks (think JSON objects, YAML mappings)
  • facilities for manipulating text including string interpolation and regular expressions
  • an ergonomic command line interface and access to environment variables
  • metadata annotations and numerous extension points
  • a prelude of built-in functions, acting like a standard library

It can currently read YAML, JSON, JSON Lines, TOML, EDN, XML, CSV, plain text and eucalypt's own ("eu") syntax, and it can export YAML, JSON, TOML, EDN or plain text.

Warning: eucalypt is still in an early phase of development and subject to change.

A lightning tour

Eucalypt has a native syntax for writing blocks, lists and expressions. The YAML embedding consists of a few YAML tags used to embed eucalypt expressions in YAML, so a basic understanding of the native syntax is helpful.

A few micro-examples should help give a flavour of eucalypt's native syntax. If you want to follow along, see Quick Start for notes on installation.

Example 1

Here is a simple one:

target-zones: ["a", "b", "c"] map("eu-west-1{}")

You can put this in a file named test.eu and run it with just:

eu test.eu

This outputs the following YAML:

target-zones:
  - eu-west-1a
  - eu-west-1b
  - eu-west-1c

As an aside, although we're looking at the native eucalypt syntax here, this example could just as easily be embedded directly in a YAML file using the !eu tag. Pop the following in a test.yaml file and process it with: eu test.yaml. You'll get the same result.

target-zones: !eu ["a", "b", "c"] map("eu-west-1{}")

First, this example illustrates how we apply transformations like map simply by concatenation. This "pipelining" or "catenation" is the natural way to apply transformations to values in eucalypt.

In fact this is simply a function call with the arguments rearranged a bit. In this example, map is a function of two parameters. Its first argument is provided in parentheses and its second argument is the value of what came before.

Note: Users of languages like Elixir or OCaml may recognise an implicit |> operator here. Clojure users may see an invisible threading macro. Note that writing elements next to each other like this gives you the reverse of what you might expect in Haskell or OCaml or Lisp: we write x f not f x.

There is a lot of freedom in eucalypt to express ideas in different ways and develop colourful and cryptic expressions. In a larger or more ambitious language this could be viewed as rope to hang yourself with. Please be careful.

The string template, "eu-west-1{}", actually defines a function of one argument that returns a string. The key ingredients here are:

  • the interpolation syntax "{...}" which allows values to be inserted into the string
  • the (hidden) use of numeric anaphora in the interpolation syntax ({0}, {1}, {2}, ...), which causes the string to define a function, not just a sequence of characters
  • the use of the unnumbered anaphor ({}) which is numbered automatically for us, so in this case, {} is a convenient synonym for {0} - the first argument

Note: Anaphora crop up in various contexts in eucalypt and are generally preferable to the full generality of lambdas. If the idea is too complex to be expressed with anaphora, it should generally be explicitly named.

So:

a: 42 "The answer is {0}"

renders as

a: The answer is 42

eucalypt also has expression anaphora and block anaphora.

Note: Users of Groovy or Kotlin may recognise an equivalent of the it parameter. Seasoned Lisp hackers are familiar with anaphoric macros. Clojure users will recognise the %, %1, %2 forms from #(...) contexts. Unlike %, repeated uses of unnumbered anaphora in eucalypt refer to different parameters. "{}{}" is a two-argument function which concatenates strings.
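If you know Python, string anaphora behave a little like str.format, which also numbers bare {} placeholders automatically. The sketch below is only an analogy and says nothing about how eucalypt implements interpolation; the helper name template is hypothetical.

```python
def template(fmt):
    """Turn a format string into a function of its placeholders."""
    def apply(*args):
        return fmt.format(*args)
    return apply

# Analogue of: ["a", "b", "c"] map("eu-west-1{}")
zones = list(map(template("eu-west-1{}"), ["a", "b", "c"]))

# Analogue of: 42 "The answer is {0}"
answer = template("The answer is {0}")(42)

# Each bare {} takes the next argument, so "{}{}" needs two of them
joined = template("{}{}")("foo", "bar")
```

One difference worth keeping in mind: Python refuses to mix numbered and unnumbered placeholders in one string, and this sketch makes no claim about what eucalypt does in that case.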

Back to:

target-zones: ["a", "b", "c"] map("eu-west-1{}")

The whole line is a declaration. Declarations come in several types - this one is a property declaration. A block is written as a sequence of declarations enclosed in braces. For example:

{
  w: "foo" # a string
  x: 3     # a whole number
  y: 22.2  # a floaty number
  z: true  # the truth
}

(The # character introduces a comment which is ignored.)

Unlike YAML, indentation is never significant.

Unlike JSON, commas are not needed to separate declarations. Instead, the eucalypt parser determines the declarations mainly based on the location of colons. You can write:

{ x: 1 increment negate y: 2 }

...and eucalypt knows it's two declarations.

If that's a bit too crazy for you, then feel free to insert the commas. Eucalypt will accept them. Any of these are okay:

ok1: { a: 1 b: 2 c: 3 }
ok2: { a: 1, b: 2, c: 3 }
ok3: { a: 1, b: 2, c: 3, }

Note: Unlike Clojure, which makes commas optional by treating them as whitespace, eucalypt demands that if you are going to put commas in, they have to be in the right place: at the end of declarations. So you can use them if you believe it makes things clearer, but you are prevented from using them in ways that would mislead.

Our target-zones property declaration is at the top level so need not be surrounded by braces. Nevertheless it is in a block: the top level block, known as a unit, that is defined by the file that contains it. You can imagine the braces to be there if you like.

As a final point on this example, it is probably worthwhile documenting declarations. eucalypt offers an easy way to do that using declaration metadata which we squeeze in between a leading backtick and the declaration itself:

` "AZs to deploy alien widgets in"
target-zones: ["a", "b", "c"] map("eu-west-1{}")

In fact, all sorts of things can be wedged in there, but if a string appears on its own, it is interpreted as documentation.

Example 2

Let's look at another small example:

character(name): {
  resource-name: name
  created: io.epoch-time
}

prentice: character("Pirate Prentice") {
  laser-colour: "red"
}

slothrop: character("Tyrone Slothrop") {
  eye-count: 7
}

We've introduced a new type of declaration here of the form f(x):. This is a function declaration.

Remember we saw a property declaration earlier. Eucalypt also has operator declarations but we'll ignore those for now.

The function declaration declares a function called character, which accepts a single parameter (name) and returns a block containing two properties.

Functions, like everything else in eucalypt, are declared in and live in blocks but they are left out when output is rendered, so you won't see them in the YAML or JSON that eucalypt produces.

The braces in the definition of character are there to delimit the resulting block - not to define a function body. A function that returned a number would not need them:

inc(x): x + 1 # this defines an increment function

The next important ingredient in this example is block catenation.

Blocks can be treated as functions of a single parameter. When they are applied as functions, the effect is a block merge.

We've already seen that functions can be applied to arguments by concatenation.

So writing one block after another produces a merged block. It contains the contents of the second block merged "on top" of the first.

There is more to be said on block merge, but for now:

{ a: 1 } { b: 2 } evaluates to { a: 1 b: 2 }.

and

{ a: 1 } { a: 2 } evaluates to { a: 2 }.

In our example, the resulting YAML is just:

prentice:
  resource-name: Pirate Prentice
  created: 1526991765
  laser-colour: red

slothrop:
  resource-name: Tyrone Slothrop
  created: 1526991765
  eye-count: 7

As you can see, io.epoch-time evaluates to a unix timestamp.

This metadata is generated once at launch time, not each time the expression is evaluated. eucalypt the language is a pure functional language, and there are no side-effects or non-deterministic functions (although its command line driver can perform all sorts of side-effects as input to the evaluation and as output from the evaluation and there are one or two dirty tricks in the debugging functions). For this reason, prentice and slothrop will have the same timestamps.

Block merge can be a useful means of generating common content in objects. The common content can appear first as in this case, allowing it to be overridden. Or it could be applied second allowing it to override the existing detail. Or a mixture of both. Many more sophisticated means of combining block data are available too.
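As a rough mental model, shallow block merge behaves like merging two Python dictionaries where the right-hand side wins. The sketch below is an analogy under that assumption, not eucalypt's implementation; merge, LAUNCH_EPOCH and the log-level keys are hypothetical names used only for illustration.

```python
import time

# Captured once at start-up, in the spirit of io.epoch-time
LAUNCH_EPOCH = int(time.time())

def merge(base, overlay):
    """Shallow merge: keys in overlay win over keys in base."""
    return {**base, **overlay}

def character(name):
    return {"resource-name": name, "created": LAUNCH_EPOCH}

# Analogue of: character("Pirate Prentice") { laser-colour: "red" }
prentice = merge(character("Pirate Prentice"), {"laser-colour": "red"})
slothrop = merge(character("Tyrone Slothrop"), {"eye-count": 7})

# Common content first: the specific block can override it
overridable = merge({"log-level": "info"}, {"log-level": "debug"})

# Common content second: it overrides whatever was there
enforced = merge({"log-level": "debug"}, {"log-level": "info"})
```

Because the timestamp is read once and then treated as an ordinary value, both characters end up with the same created field, however long evaluation takes.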

Note: This merge is similar to the effect of merge keys in YAML, where a special << mapping key causes a similar merge to occur. Not all YAML processors support this and nor does eucalypt at present, but it probably will some day.

Be aware that eucalypt has nothing like virtual functions. The functions in scope when an expression is created are the ones that are applied. So if you redefine an f like this, in an overriding block...

{ f(x): x+1 a: f(2) } { f(x): x-2 }

...the definition of a will not see it.

a: 3

So block merge is only very loosely related to object oriented inheritance. Also by default you only get a shallow merge - deep merges are provided in the standard prelude. It is possible that a deep merge will become the default for block catenation in future.
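Both points can be pictured with a small Python analogy. It assumes, purely for illustration, that a block is a dictionary whose values were computed using the definitions in scope where they were written; first_block and second_block are hypothetical names.

```python
def first_block():
    def f(x):
        return x + 1
    # a uses the f that is in scope here, where a is written
    return {"f": f, "a": f(2)}

def second_block():
    def f(x):
        return x - 2
    return {"f": f}

# Merging replaces the entry for f but cannot reach back into a
merged = {**first_block(), **second_block()}
result = merged["a"]

# Shallow merge: a nested value is replaced wholesale, not combined
shallow = {**{"db": {"host": "localhost", "port": 5432}},
           **{"db": {"host": "prod"}}}
```

In this model result is 3 rather than 0, and the port entry disappears from shallow because only the top level is merged.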

Many more complicated ways of processing blocks are possible using functions, block anaphora and standard prelude functions.

Quick tour of the command line

On macOS you can install the eu command line tools using Homebrew with:

brew install curvelogic/homebrew-tap/eucalypt

Check the version you are running with:

eu version

eu is intended to be easy to use for common tasks and does its best to allow you to say what you want succinctly. The intention is to be easy to use in pipelines in combination with other tools like jq.

By default, it runs in ergonomic mode which will make a few assumptions in order to allow you to be a little less explicit. It also pulls in user-specific declarations from ~/.eucalypt. For repeatable builds and scripted usage, it is better to turn ergonomic mode off using the -B (--batch) switch.

The simplest usage is to specify a eucalypt file to evaluate and leave the default render format (YAML) and output (standard out) alone.

> eu test.eu

eu with no arguments will generally be taken to specify that input is coming from standard in. So the above is equivalent to:

> cat test.eu | eu

There is an -x switch to control output format explicitly (setting "yaml", "json", "text", "csv" or "eu") but for the very common case of requiring JSON output there is a shortcut:

> eu test.eu -j

You can, of course, redirect standard output to a file but if you specify the output file explicitly (with -o), eu will infer the output format from the extension:

> eu test.eu -o output.json # equivalent to eu test.eu -j > output.json

Small snippets of eucalypt can be passed in directly using the -e switch.

> eu -e '{ a: 8 * 8 }'

The fact that eucalypt makes relatively infrequent use of single quotes makes this straightforward for most shells.

By default, eu evaluates the entirety of the loaded source and uses all of it to render the result, leaving out any function values and other non-renderable content.

It is possible to select just parts of the eucalypt for rendering:

  1. A declaration in the source may be identified as the main target using the :main declaration metadata, and it becomes the part rendered by default.
  2. Targets may be defined and named using the :target declaration metadata, and those targets can then be specified using the -t option to eu.
  3. The -e option can be used in addition to other source file(s) to identify an expression to be rendered (e.g. eu test.eu -e x.y.z).

So eu's ability to read JSON and YAML natively, combined with the last option, gives a simple way to pick values out of structured data, which can be very handy for "querying" services that return YAML or JSON data.

> aws s3api list-buckets | eu -e 'Buckets map(lookup(:Name))'

There is much more to this story. For instance eu can:

  • accept several inputs to make definitions in earlier inputs available to subsequent inputs: eu test1.eu test2.eu test3.eu
  • accept YAML and JSON files as pure data to be merged in: eu data.yaml tools.eu
  • accept YAML or JSON annotated with eucalypt to execute: eu data.yaml
  • override the default extensions: eu yaml@info.txt
  • automatically use Eufile files in the current folder hierarchy

See CLI Reference for more complete documentation.

What is Eucalypt?

Eucalypt is a tool, and a little language, for generating, templating, rendering and processing structured data formats like YAML, JSON and TOML.

Key Features

  • A concise native syntax for defining data, functions, and operators
  • A simple embedding into YAML files for in-place manipulation (a la templating)
  • Facilities for manipulating blocks (think JSON objects, YAML mappings)
  • String interpolation and regular expressions
  • An ergonomic command line interface with environment variable access
  • Metadata annotations and numerous extension points
  • A prelude of built-in functions acting as a standard library

Supported Formats

Input: YAML, JSON, JSON Lines, TOML, EDN, XML, CSV, plain text, and eucalypt's own .eu syntax.

Output: YAML, JSON, TOML, EDN, or plain text.

When to Use Eucalypt

Eucalypt is a good fit when you need to:

  • Transform data between structured formats (e.g. JSON to YAML)
  • Generate configuration files with shared logic
  • Query and filter structured data from the command line
  • Template YAML or JSON with embedded expressions
  • Build data processing pipelines

Quick Start

Installation

On macOS via Homebrew

If you use Homebrew, you can install using:

brew install curvelogic/homebrew-tap/eucalypt

Linux / macOS (install script)

Alternatively, install the latest release binary directly:

curl -sSf https://raw.githubusercontent.com/curvelogic/eucalypt/master/install.sh | sh

This installs to ~/.local/bin. Set EUCALYPT_INSTALL_DIR to override the install location.

Otherwise binaries for macOS are available on the releases page.

On Linux

x86_64 and aarch64 binaries built in CI are available on the releases page.

On Windows

Sorry, haven't got there yet. But you could try installing from source.

From source

You will need a Rust installation and cargo.

Build and install should be as simple as:

cargo install --path .

Testing your installation

eu --version

...prints the version:

eu 0.3.0

...and...

eu --help

...shows command line help:

A functional language for structured data

Usage: eu [OPTIONS] [FILES]... [COMMAND]

Commands:
  run           Evaluate eucalypt code (default)
  test          Run tests
  dump          Dump intermediate representations
  version       Show version information
  explain       Explain what would be executed
  list-targets  List targets defined in the source
  fmt           Format eucalypt source files
  lsp           Start the Language Server Protocol server
  help          Print this message or the help of the given subcommand(s)

Arguments:
  [FILES]...  Files to process (used when no subcommand specified)

Options:
  -L, --lib-path <LIB_PATH>                Add directory to lib path
  -Q, --no-prelude                         Don't load the standard prelude
  -B, --batch                              Batch mode (no .eucalypt.d)
  -d, --debug                              Turn on debug features
  -S, --statistics                         Print metrics to stderr before exiting
      --statistics-file <STATISTICS_FILE>  Write statistics as JSON to a file
  -h, --help                               Print help
  -V, --version                            Print version

Use eu <command> --help for detailed help on each subcommand.

Your first program

Create a file called hello.eu:

greeting: "Hello, World!"

Run it:

eu hello.eu

Output:

greeting: Hello, World!

Try JSON output:

eu hello.eu -j
{"greeting": "Hello, World!"}

Eucalypt by Example

This page presents a collection of worked examples showing how eucalypt solves real-world problems. Each example includes the problem, the eucalypt code, and the expected output.

1. Format Conversion: JSON to YAML

Problem: Convert a JSON configuration file to YAML.

echo '{"database": {"host": "db.example.com", "port": 5432}}' | eu

Output:

database:
  host: db.example.com
  port: 5432

Eucalypt reads JSON natively and defaults to YAML output. No code needed.

2. Extracting Fields from API Data

Problem: Given a list of users in JSON, extract just their names.

eu -e 'map(.name)' <<'JSON'
[
  {"name": "Alice", "role": "admin"},
  {"name": "Bob", "role": "user"},
  {"name": "Charlie", "role": "user"}
]
JSON

Output:

- Alice
- Bob
- Charlie

Here .name looks up name in each element of the list. The explicit form uses the expression anaphor _: _.name means "look up name in whatever the argument is".

3. Filtering and Transforming Data

Problem: From a list of products, find those over a price threshold and format them.

# products.eu
products: [
  { name: "Widget" price: 9.99 }
  { name: "Gadget" price: 24.99 }
  { name: "Gizmo" price: 49.99 }
  { name: "Doohickey" price: 4.99 }
]

expensive: products
  filter(.price > 20)
  map(.name str.to-upper)

eu products.eu -e expensive

Output:

- GADGET
- GIZMO

4. Generating Configuration with Shared Defaults

Problem: Generate environment-specific configs that share common defaults.

# config.eu
base: {
  app: "my-service"
  port: 8080
  log-level: "info"
  db: { host: "localhost" port: 5432 }
}

production: base << {
  log-level: "warn"
  db: { host: "prod-db.internal" }
}

staging: base << {
  db: { host: "staging-db.internal" }
}

eu config.eu -e production -j

Output:

{
  "app": "my-service",
  "port": 8080,
  "log-level": "warn",
  "db": {
    "host": "prod-db.internal",
    "port": 5432
  }
}

The << operator deep-merges blocks, so nested keys like db.port are preserved while db.host is overridden.
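The difference from a shallow merge is that nested blocks are combined key by key rather than replaced. A minimal Python sketch of that idea follows; deep_merge is a hypothetical helper written for this illustration and is not eucalypt's implementation of <<.

```python
def deep_merge(base, overlay):
    """Recursively merge overlay into base; overlay wins on conflicts."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested mappings: merge them key by key
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

base = {
    "app": "my-service",
    "port": 8080,
    "log-level": "info",
    "db": {"host": "localhost", "port": 5432},
}

production = deep_merge(base, {
    "log-level": "warn",
    "db": {"host": "prod-db.internal"},
})
```

With a shallow merge the db entry would have been replaced outright and db.port would be lost.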

5. CSV to JSON Conversion

Problem: Read a CSV file and output as a JSON array.

Given people.csv:

name,age,city
Alice,30,London
Bob,25,Manchester
Charlie,35,Edinburgh

eu rows=people.csv -j -e 'rows map(.name)'

Output:

["Alice", "Bob", "Charlie"]

Or to transform the data:

eu rows=people.csv -j -e 'rows map{ name: •0.name age: •0.age num }'

This converts age from string to number (CSV values are always strings).

6. Generating Availability Zone Names

Problem: Generate AWS availability zone names from a list of zone letters.

eu -e '["a", "b", "c"] map("eu-west-2{}")'

Output:

- eu-west-2a
- eu-west-2b
- eu-west-2c

The string "eu-west-2{}" is a function: {} is a string anaphor that takes one argument.

7. Merging Multiple YAML Files

Problem: Combine values from several YAML files into a single output.

Given defaults.yaml:

timeout: 30
retries: 3

Given overrides.yaml:

timeout: 60
debug: true

eu defaults.yaml overrides.yaml

Output:

timeout: 60
retries: 3
debug: true

Later inputs override earlier ones. For a merged view with both available, use named inputs:

eu d=defaults.yaml o=overrides.yaml -e 'd << o'

8. Data Aggregation Pipeline

Problem: Compute summary statistics from structured data.

# sales.eu
sales: [
  { region: "North" amount: 1200 },
  { region: "South" amount: 800 },
  { region: "North" amount: 600 },
  { region: "South" amount: 1500 },
  { region: "East" amount: 900 }
]

` :suppress
amounts: sales map(.amount)
n: sales count

summary: {
  total: amounts sum
  count: n
  average: (amounts sum) / n
  max: amounts max-of
  min: amounts min-of
}

eu sales.eu -e summary

Output:

total: 5000
count: 5
average: 1000
max: 1500
min: 600

9. Querying Deeply Nested Configuration

Problem: Find all port numbers in a complex configuration.

eu -e '{
  web: { host: "0.0.0.0" port: 80 }
  api: { host: "0.0.0.0" port: 8080 }
  db: { host: "localhost" port: 5432 }
  cache: { host: "localhost" port: 6379 }
} deep-query("port")'

Output:

- 80
- 8080
- 5432
- 6379

deep-query recursively searches nested blocks. You can also use wildcards: deep-query("*.port", data) matches ports one level deep, while deep-query("**.port", data) matches at any depth.

10. String Processing: Parsing Log Lines

Problem: Extract timestamps and levels from log lines.

# logs.eu
lines: [
  "2024-03-15 10:30:00 ERROR Connection timeout"
  "2024-03-15 10:30:05 INFO Retry attempt 1"
  "2024-03-15 10:30:10 ERROR Connection timeout"
  "2024-03-15 10:30:15 INFO Connected"
]

` :suppress
parse(line): line str.match-with("(\S+ \S+) (\w+) (.*)") tail

parsed: lines map(parse) map({parts: •}.({
  timestamp: parts first
  level: parts second
  message: parts nth(2)
}))

errors: parsed filter(.level = "ERROR")

eu logs.eu -e errors

Output:

- timestamp: '2024-03-15 10:30:00'
  level: ERROR
  message: Connection timeout
- timestamp: '2024-03-15 10:30:10'
  level: ERROR
  message: Connection timeout
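For comparison, here is the same parsing step sketched in Python with the re module. It assumes, as the use of tail in parse suggests, that a match result lists the whole match first and the capture groups after it; that shape is an assumption of this sketch rather than a documented guarantee.

```python
import re

LOG_LINE = re.compile(r"(\S+ \S+) (\w+) (.*)")

lines = [
    "2024-03-15 10:30:00 ERROR Connection timeout",
    "2024-03-15 10:30:05 INFO Retry attempt 1",
    "2024-03-15 10:30:10 ERROR Connection timeout",
    "2024-03-15 10:30:15 INFO Connected",
]

def parse(line):
    match = LOG_LINE.match(line)
    # Assumed result shape: whole match first, then the three groups
    parts = [match.group(0), *match.groups()]
    # Dropping the first element plays the role of tail
    timestamp, level, message = parts[1:]
    return {"timestamp": timestamp, "level": level, "message": message}

errors = [entry for entry in map(parse, lines) if entry["level"] == "ERROR"]
```

The first group spans two whitespace-separated tokens, which is why the date and the time end up together in timestamp.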

11. Templating CloudFormation Resources

Problem: Generate YAML with custom tags for CloudFormation.

# cfn.eu
resource(type, props): {
  Type: type
  Properties: props
}

resources: {
  MyBucket: resource("AWS::S3::Bucket", {
    BucketName: "my-bucket"
  })
  MyQueue: resource("AWS::SQS::Queue", {
    QueueName: "my-queue"
  })
}

This example shows how a simple function can template repetitive structure.

12. Working with Dates

Problem: Filter events by date and format the output.

# events.eu
events: [
  { name: "Launch" date: t"2024-01-15" }
  { name: "Review" date: t"2024-06-01" }
  { name: "Release" date: t"2024-09-30" }
]

cutoff: t"2024-06-01"

upcoming: events
  filter(.date >= cutoff)
  map(.name)

eu events.eu -e upcoming

Output:

- Review
- Release

The t"..." syntax creates date-time literals that support comparison operators.

13. Generating a Lookup Table

Problem: Build a key-value mapping from two parallel lists.

eu -e '["Alice", 30, "London"] zip-kv[:name, :age, :city]'

Output:

name: Alice
age: 30
city: London

zip-kv pairs up symbols as keys with values to produce a block. Note that zip-kv[:name, :age, :city] is shorthand for zip-kv([:name, :age, :city]) — when a function takes a single list or block argument, the outer parentheses can be omitted.

14. Parameterised Scripts

Problem: Write a reusable script that accepts command-line arguments.

# greet.eu
name: io.args head-or("World")
times: io.args tail head-or("1") num

greetings: repeat("Hello, {name}!") take(times)

eu greet.eu -e greetings -- Alice 3

Output:

- Hello, Alice!
- Hello, Alice!
- Hello, Alice!

Arguments after -- are available via io.args as a list of strings. Use num to convert to numbers.

15. Set Operations: Finding Unique Values

Problem: Find the unique tags across multiple items and compute overlaps.

# tags.eu
items: [
  { name: "A" tags: ["fast", "reliable", "cheap"] }
  { name: "B" tags: ["fast", "expensive"] }
  { name: "C" tags: ["reliable", "cheap", "slow"] }
]

` :suppress
tag-sets: items map(.tags set.from-list)

all-tags: tag-sets foldl(set.union, ∅) set.to-list
common-tags: tag-sets foldl(set.intersect, tag-sets head) set.to-list

result: {
  all: all-tags
  common: common-tags
}

eu tags.eu -e result

Output:

all:
- cheap
- expensive
- fast
- reliable
- slow
common: []
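The empty common list can look surprising at first. The same folds sketched with Python's built-in sets show why: no single tag appears in all three items. Sorting the results is an assumption made here to get a stable order for comparison.

```python
from functools import reduce

items = [
    {"name": "A", "tags": ["fast", "reliable", "cheap"]},
    {"name": "B", "tags": ["fast", "expensive"]},
    {"name": "C", "tags": ["reliable", "cheap", "slow"]},
]

tag_sets = [set(item["tags"]) for item in items]

# Fold union over the sets, starting from the empty set
all_tags = sorted(reduce(set.union, tag_sets, set()))

# Fold intersection over the sets, starting from the first set
common_tags = sorted(reduce(set.intersection, tag_sets, tag_sets[0]))
```

Starting the intersection fold from the first set matters: starting from the empty set would always give an empty result.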

Blocks and Declarations

In this chapter you will learn:

  • What blocks are and how they relate to structured data formats
  • The three types of declarations: property, function, and operator
  • How top-level files work as implicit blocks (units)
  • How to annotate declarations with metadata

Blocks

A block is eucalypt's fundamental data structure. It corresponds to a JSON object, a YAML mapping, or a TOML table: an ordered collection of named values.

Blocks are written with curly braces:

person: {
  name: "Alice"
  age: 30
  role: "engineer"
}

Running this file produces:

person:
  name: Alice
  age: 30
  role: engineer

Blocks can be nested:

config: {
  database: {
    host: "localhost"
    port: 5432
  }
  cache: {
    host: "localhost"
    port: 6379
  }
}

Property Declarations

The simplest declaration is a property declaration: a name followed by a colon and an expression.

greeting: "Hello, World!"
count: 42
pi: 3.14159
active: true
nothing: null

These declare names bound to values. The values can be any expression: numbers, strings, booleans, null, lists, blocks, or computed expressions.

Commas are Optional

Declarations can be separated by commas or simply by whitespace. Line endings are not significant. All of these are equivalent:

a: { x: 1 y: 2 z: 3 }
b: { x: 1, y: 2, z: 3 }
c: { x: 1, y: 2, z: 3, }

eu -e '{ x: 1 y: 2 z: 3 }'
x: 1
y: 2
z: 3

Symbols

Symbols are written with a colon prefix and behave like interned strings. They are used as keys and as lightweight identifiers:

status: :active
tag: :important

status: active
tag: important

Function Declarations

Adding a parameter list creates a function declaration:

greet(name): "Hello, {name}!"
double(x): x * 2

message: greet("World")
result: double(21)

message: Hello, World!
result: 42

Functions are not rendered in the output -- only property values appear. Functions can take multiple parameters:

add(x, y): x + y
total: add(3, 4)

total: 7

Operator Declarations

You can define custom infix operators using symbolic names:

(x <+> y): [x, y]
pair: 1 <+> 2

pair:
- 1
- 2

Prefix and postfix unary operators are also possible:

(!! x): x * x
squared: !! 5

squared: 25

Operator precedence and associativity are controlled through metadata annotations (covered below). See the Operators chapter for full details.

Note: While function declarations are namespaced to their block, operators do not have a namespace and are available only where they are in scope.

Units: Top-Level Blocks

The top-level of a .eu file is itself a block, called a unit. It does not need surrounding braces. So this file:

name: "Alice"
age: 30

...is equivalent to a block { name: "Alice" age: 30 } and produces:

name: Alice
age: 30

Comments

Comments start with # and continue to the end of the line:

# This is a comment
name: "Alice"  # inline comment

Declaration Metadata

Metadata can be attached to any declaration by placing it between a leading backtick and the declaration:

` "A friendly greeting"
greeting: "Hello!"

` { doc: "Add two numbers" }
add(x, y): x + y

A bare string is shorthand for documentation metadata.

Some metadata keys activate special behaviour:

  • :suppress -- hides the declaration from output
  • :target -- marks the declaration as an export target
  • :main -- marks the default target

` :suppress
helper(x): x + 1

` { target: :my-output }
output: {
  result: helper(41)
}

Running eu file.eu -t my-output renders only the output block.

Block and Unit Metadata

A single expression may precede the declarations in any block and is treated as metadata for that block. At the top level of a file (the unit), this means the first item, if it is an expression rather than a declaration, becomes metadata for the entire unit:

{ doc: "Configuration generator" }

host: "localhost"
port: 8080

Scope and Visibility

Names declared in a block are visible within that block and in any nested blocks:

x: 99
inner: {
  y: x + 1  # x is visible here
}

x: 99
inner:
  y: 100

Names in nested blocks can shadow outer names:

x: 1
inner: {
  x: 2
  y: x  # refers to inner x
}

x: 1
inner:
  x: 2
  y: 2

Warning: Be careful with self-reference. Writing name: name inside a block creates an infinite recursion, because the declaration name refers to itself. This is true regardless of whether name is defined in an outer scope.

Key Concepts

  • Blocks are ordered collections of named values (like JSON objects or YAML mappings)
  • Property declarations bind a name to a value
  • Function declarations bind a name to a function (not rendered in output)
  • Operator declarations define custom infix, prefix, or postfix operators
  • Metadata annotations control export, documentation, and other special behaviour
  • The top-level file is a unit: an implicit block without braces

Expressions and Pipelines

In this chapter you will learn:

  • The primitive value types in eucalypt
  • How function application works via catenation (pipelining)
  • How partial application and currying work
  • How to compose pipelines of transformations

Primitive Values

Eucalypt has the following primitive types:

Type      Examples          Notes
Numbers   42, -7, 3.14      Integers and floats
Strings   "hello", "it's"   Double-quoted only
Symbols   :name, :active    Colon-prefixed identifiers
Booleans  true, false
Null      null              Renders as YAML ~ or JSON null

Lists

Lists are comma-separated values in square brackets (unlike in blocks, commas are required):

numbers: [1, 2, 3, 4, 5]
mixed: [1, "two", :three, true]
nested: [[1, 2], [3, 4]]
empty: []

Calling Functions

Functions can be called by placing arguments in parentheses directly after the function name (with no intervening space):

add(x, y): x + y
result: add(3, 4)

result: 7

Catenation: The Pipeline Style

One distinctive feature of eucalypt is catenation: applying a function by writing the argument before the function name, separated by whitespace.

result: 5 inc

This is equivalent to inc(5) and produces 6.

Catenation lets you chain operations into readable pipelines:

eu -e '[1, 2, 3, 4, 5] reverse head'
5

Each step in the pipeline passes its result to the next function. You can read it left to right: "take the list, reverse it, take the head."

Combining Catenation with Arguments

When a function takes multiple arguments, you can supply some in parentheses and the rest via catenation. The catenated value becomes the last argument:

result: [1, 2, 3] map(inc)

Here map takes two arguments: a function and a list. inc is provided in parentheses and [1, 2, 3] is provided by catenation. The result is [2, 3, 4].

This is the standard eucalypt pattern for data processing pipelines:

eu -e '[1, 2, 3, 4, 5] filter(> 3) map(* 10)'
- 40
- 50
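One way to picture catenation is as a left-to-right fold in which each step receives the previous result as its last argument. The Python sketch below models that reading; pipeline, map_ and filter_ are hypothetical helpers written for this illustration, not part of eucalypt or its prelude.

```python
from functools import partial, reduce

def pipeline(value, *steps):
    """Feed value through each step in turn, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)

def map_(fn, items):
    # Function first, list last, so the list can arrive from the left
    return [fn(x) for x in items]

def filter_(pred, items):
    return [x for x in items if pred(x)]

# Analogue of: [1, 2, 3, 4, 5] filter(> 3) map(* 10)
result = pipeline(
    [1, 2, 3, 4, 5],
    partial(filter_, lambda x: x > 3),
    partial(map_, lambda x: x * 10),
)

# Analogue of: [1, 2, 3, 4, 5] reverse head
last = pipeline([1, 2, 3, 4, 5], lambda xs: xs[::-1], lambda xs: xs[0])
```

Putting the data parameter last is what makes this work: the arguments in parentheses are fixed first, and the value flowing through the pipeline fills the remaining slot.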

Currying and Partial Application

All functions in eucalypt are curried: if you provide fewer arguments than a function expects, you get back a partially applied function.

add(x, y): x + y
add-five: add(5)
result: add-five(3)

result: 8

Curried application also works with multi-argument calls:

f(x, y, z): x + y + z

a: f(1, 2, 3)   # all at once
b: f(1)(2)(3)    # one at a time
c: f(1, 2)(3)    # mixed

All three produce 6.
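Python functions are not curried by default, but a small decorator can imitate the behaviour described here. This is a sketch of the general technique, with curry as a hypothetical helper; it is not a description of how eucalypt evaluates functions.

```python
from inspect import signature

def curry(fn):
    """Return a version of fn that accepts its arguments a few at a time."""
    arity = len(signature(fn).parameters)

    def curried(*args):
        if len(args) >= arity:
            return fn(*args)
        # Not enough arguments yet: wait for more
        return lambda *more: curried(*args, *more)

    return curried

@curry
def f(x, y, z):
    return x + y + z

@curry
def add(x, y):
    return x + y

a = f(1, 2, 3)    # all at once
b = f(1)(2)(3)    # one at a time
c = f(1, 2)(3)    # mixed

add_five = add(5)
result = add_five(3)
```

As in the eucalypt example, a, b and c all come out as 6, and add_five is an ordinary function that can be passed around or used in a pipeline.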

Lookup: The Dot Operator

The dot operator (.) accesses a named property within a block:

person: { name: "Alice" age: 30 }
name: person.name

person:
  name: Alice
  age: 30
name: Alice

Lookups can be chained:

config: { db: { host: "localhost" port: 5432 } }
host: config.db.host

config:
  db:
    host: localhost
    port: 5432
host: localhost

Warning: The dot operator binds very tightly (precedence 90). Writing list head.name is parsed as list (head.name), not (list head).name. Use explicit parentheses when combining lookup with catenation: (list head).name.

The ↑ (up arrow) prefix operator, which is shorthand for head, binds even tighter (precedence 95). So ↑xs.name means (↑xs).name.

"Juxtaposed" call syntax

When a function is passed only a single list argument or a single block argument, it is possible to omit the outer parentheses for brevity:

result: f[1, 2, 3] ∧ g{a: 1 b: 2}

The term "juxtaposition" reflects the rule that placing a function directly against any form of bracket (with no intervening whitespace) is call syntax, whereas an intervening space makes it pipeline syntax.

Juxtaposed call syntax: f(x), f[x], f{x}.

Pipeline syntax: x f, [x] f, {x} f.

Generalised Lookup (or "block-dot" notation)

Lookup can be generalised: any expression after the dot is evaluated in the context of the block to the left.

point: { x: 3 y: 4 }
sum: point.(x + y)
pair: point.[x, y]
label: point."{x},{y}"
point:
  x: 3
  y: 4
sum: 7
pair:
- 3
- 4
label: 3,4

This is particularly useful for creating temporary scopes:

result: { a: 10 b: 20 }.(a * b)
result: 200

"Block-dot" syntax is particularly significant in monadic blocks (see Monads and the monad() Utility).

Building Pipelines

Combining catenation, partial application, and the standard prelude creates powerful data processing pipelines:

eu -e '["alice", "bob", "charlie"] map(str.to-upper) filter(str.matches?("^[AB]"))'
- ALICE
- BOB

A more complete example:

people: [
  { name: "Alice" age: 30 }
  { name: "Bob" age: 25 }
  { name: "Charlie" age: 35 }
]

over-thirty: people filter(.age > 30) map(.name)
people:
- name: Alice
  age: 30
- name: Bob
  age: 25
- name: Charlie
  age: 35
over-thirty:
- Charlie

The then Function

The then function provides a pipeline-friendly conditional:

eu -e '5 > 3 then("yes", "no")'
yes

It is equivalent to if with the condition as the last argument, making it natural in pipelines:

result: [1, 2, 3] count (> 2) then("many", "few")
result: many

Key Concepts

  • Catenation applies a function by writing the argument before the function name: 5 inc means inc(5)
  • Pipelines are built by chaining catenation: data f g h
  • Functions are curried: partial application is automatic
  • The dot operator looks up properties: block.key
  • Generalised lookup evaluates expressions in a block's scope: block.(expr)
  • Combine these techniques for concise data processing pipelines

Lists and Transformations

In this chapter you will learn:

  • How to create and deconstruct lists
  • The core list operations: map, filter, foldl, foldr
  • Other useful list functions from the prelude
  • How to combine list operations into pipelines

Creating Lists

Lists are written with square brackets and commas:

numbers: [1, 2, 3, 4, 5]
strings: ["hello", "world"]
empty: []
nested: [[1, 2], [3, 4]]

Basic List Operations

head and tail

head returns the first element; tail returns everything after it:

eu -e '[10, 20, 30] head'
10
eu -e '[10, 20, 30] tail'
- 20
- 30

Use head-or to provide a default for empty lists:

eu -e '[] head-or(0)'
0

first and second

first is an alias for head. second returns the second element:

eu -e '[:a, :b, :c] second'
b

cons

cons prepends an element to a list:

eu -e 'cons(0, [1, 2, 3])'
- 0
- 1
- 2
- 3

nil?

Test whether a list is empty:

eu -e '[] nil?'
true

count

Count the elements:

eu -e '[10, 20, 30] count'
3

Transforming Lists

map

Apply a function to every element:

eu -e '[1, 2, 3] map(inc)'
- 2
- 3
- 4
eu -e '[1, 2, 3] map(* 10)'
- 10
- 20
- 30

filter

Keep only elements satisfying a predicate:

eu -e '[1, 2, 3, 4, 5, 6] filter(> 3)'
- 4
- 5
- 6

remove

The opposite of filter -- remove elements satisfying the predicate:

eu -e '[1, 2, 3, 4, 5] remove(> 3)'
- 1
- 2
- 3

Folding

Folds reduce a list to a single value by applying a binary function across all elements.

foldl

Left fold: foldl(op, init, list) applies op from the left:

eu -e 'foldl(+, 0, [1, 2, 3, 4, 5])'
15

foldr

Right fold: foldr(op, init, list) applies op from the right:

eu -e 'foldr(++, [], [[1, 2], [3, 4], [5]])'
- 1
- 2
- 3
- 4
- 5
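
The difference between the two folds only shows up with an operator where grouping matters, such as subtraction. The following Python sketch models both, assuming the conventional argument order (the accumulator comes first for foldl and second for foldr), which the text above does not spell out.

```python
from functools import reduce

def foldl(op, init, xs):
    """Group from the left: op(op(op(init, x1), x2), x3)."""
    return reduce(op, xs, init)

def foldr(op, init, xs):
    """Group from the right: op(x1, op(x2, op(x3, init)))."""
    return reduce(lambda acc, x: op(x, acc), reversed(xs), init)

sub = lambda a, b: a - b

left = foldl(sub, 0, [1, 2, 3])    # ((0 - 1) - 2) - 3
right = foldr(sub, 0, [1, 2, 3])   # 1 - (2 - (3 - 0))
print(left, right)  # -6 2
```

With an associative operation such as + or ++ both folds give the same result, so the choice matters mainly for non-associative operators.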

Slicing

take and drop

eu -e '[1, 2, 3, 4, 5] take(3)'
- 1
- 2
- 3
eu -e '[1, 2, 3, 4, 5] drop(3)'
- 4
- 5

take-while and drop-while

eu -e '[1, 2, 3, 4, 5] take-while(< 4)'
- 1
- 2
- 3

Combining Lists

append and ++

eu -e '[1, 2] ++ [3, 4]'
- 1
- 2
- 3
- 4

concat

Flatten a list of lists:

eu -e 'concat([[1, 2], [3], [4, 5]])'
- 1
- 2
- 3
- 4
- 5

mapcat

Map then flatten (also known as flatMap or concatMap):

eu -e '["ab", "cd"] mapcat(str.letters)'
- a
- b
- c
- d

Checking Lists

all-true? and any-true?

eu -e '[true, true, false] all-true?'
false
eu -e '[true, true, false] any-true?'
true

all and any

Test with a predicate:

eu -e '[2, 4, 6] all(> 0)'
true
eu -e '[1, 2, 3] any(zero?)'
false

Reordering

reverse

eu -e '[:a, :b, :c] reverse'
- c
- b
- a

zip-with

Combine two lists element by element:

eu -e 'zip-with(+, [1, 2, 3], [10, 20, 30])'
- 11
- 22
- 33

zip-with and pair to create blocks

eu -e 'zip-with(pair, [:x, :y, :z], [1, 2, 3]) block'
x: 1
y: 2
z: 3

Infinite Lists

Eucalypt supports lazy evaluation, so you can work with infinite lists:

eu -e 'repeat(:x) take(4)'
- x
- x
- x
- x

Use take to extract a finite portion.

Sorting

qsort

Sort with a comparison function:

eu -e '[5, 3, 1, 4, 2] qsort(<)'
- 1
- 2
- 3
- 4
- 5

sort-nums

A convenience for sorting numbers in ascending order:

eu -e '[30, 10, 20] sort-nums'
- 10
- 20
- 30

Putting It Together

Here is a more complete example combining multiple list operations:

data: [
  { name: "Alice" score: 85 }
  { name: "Bob" score: 92 }
  { name: "Charlie" score: 78 }
  { name: "Diana" score: 95 }
]

top-scorers: data
  filter(.score >= 90)
  map(.name)
data:
- name: Alice
  score: 85
- name: Bob
  score: 92
- name: Charlie
  score: 78
- name: Diana
  score: 95
top-scorers:
- Bob
- Diana

Key Concepts

  • Lists are created with [...] and can be heterogeneous
  • map, filter, and foldl/foldr are the core transformation functions
  • take, drop, reverse, append (++), and concat reshape lists
  • all, any, all-true?, and any-true? test list conditions
  • Lazy evaluation allows working with infinite lists via repeat
  • qsort sorts with a custom comparator; sort-nums sorts numbers

Strings and Text

In this chapter you will learn:

  • The two string literal types (raw and c-strings)
  • How to embed expressions in strings using {...} syntax
  • How strings with anaphora become functions
  • Format specifiers for controlling output
  • The string functions available in the str namespace

String Literal Types

Eucalypt has two kinds of string literal: raw strings and c-strings.

Raw Strings

A plain double-quoted string is a raw string — backslashes are literal characters with no escape processing. This is convenient for regular expression usage.

greeting: "Hello, World!"
path: "C:\Users\alice\docs"
regex: "^\d+\.\d+"

The r"..." prefix is equivalent and can be used for clarity when the string contains backslashes:

path: r"C:\Users\alice\docs"
regex: r"^\d+\.\d+"

Raw strings support interpolation with {...}. Use {{ and }} for literal braces.

C-Strings (c"...")

If you require C-style escapes, you can use C-strings:

Escape        Meaning
\n            Newline
\t            Tab
\r            Carriage return
\\            Literal backslash
\"            Literal quote
\{, \}        Literal braces
\xHH          Hex byte
\uHHHH        Unicode code point
\UHHHHHHHH    Extended Unicode

multiline: c"first line\nsecond line"

C-strings also support interpolation with {...}.

Basic Interpolation

Embed any expression inside a string using curly braces:

name: "World"
greeting: "Hello, {name}!"
name: World
greeting: Hello, World!

Interpolation braces accept names and dotted lookups. To use a computed value, bind it to a name first:

x: 3
y: 4
sum: x + y
result: "{x} + {y} = {sum}"
x: 3
y: 4
sum: 7
result: 3 + 4 = 7

Nested Lookups in Interpolation

You can use dotted paths inside interpolation:

data: { foo: { bar: 99 } }
label: "{data.foo.bar}"
data:
  foo:
    bar: 99
label: '99'

Note: Interpolation braces accept names and dotted lookups, but not arbitrary eucalypt expressions. If you need a computed value, bind it to a name first:

sum: x + y
result: "{sum}"

Escaping Braces

To include a literal brace in a string, double it:

example: "Use {{braces}} for interpolation"
example: Use {braces} for interpolation

This is also needed in regular expressions within interpolated strings:

pattern: "01234" str.match-with("\d{{4}}")

Format Specifiers

Add a format specifier after a colon inside the interpolation braces. These use printf-style format codes:

pi: 3.14159
formatted: "{pi:%.2f}"
padded: "{42:%06d}"
pi: 3.14159
formatted: '3.14'
padded: '000042'

String Anaphora

When a string contains {} (empty braces) or {0}, {1}, etc., the string literal actually defines a function rather than a plain string value:

eu -e '["a", "b", "c"] map("item: {}")'
- 'item: a'
- 'item: b'
- 'item: c'

Numbered anaphora control argument order:

reverse-pair: "{1},{0}"
result: reverse-pair(:a, :b)
result: b,a

You can mix named references and anaphora:

prefix: "Hello"
greet: "{prefix} {}!"
result: greet("World")
prefix: Hello
result: Hello World!
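
If you know Python, a loose analogy is the bound format method of a string, which is likewise a function waiting for its arguments. The sketch below is only a comparison: Python does not infer the number of arguments from the placeholders, and the names used are illustrative.

```python
from functools import partial

# A template with placeholders behaves like a function once .format is bound.
reverse_pair = "{1},{0}".format
item = "item: {}".format

# Mixing a named reference with an anonymous placeholder: the name is
# fixed in advance, the anonymous slot is filled at call time.
greet = partial("{prefix} {}!".format, prefix="Hello")

print(reverse_pair("a", "b"))               # b,a
print([item(x) for x in ["a", "b", "c"]])   # ['item: a', 'item: b', 'item: c']
print(greet("World"))                       # Hello World!
```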

String Functions

The str namespace contains functions for working with strings.

Conversion

eu -e '42 str.of'
'42'

Case Conversion

eu -e '"hello" str.to-upper'
HELLO
eu -e '"GOODBYE" str.to-lower'
goodbye

Splitting and Joining

Split a string on a pattern:

eu -e '"one-two-three" str.split-on("-")'
- one
- two
- three

Join a list of strings:

eu -e '["a", "b", "c"] str.join-on(", ")'
a, b, c

Prefix and Suffix

eu -e '"world" str.prefix("hello ")'
hello world
eu -e '"hello" str.suffix("!")'
hello!

Characters and Letters

eu -e '"hello" str.letters'
- h
- e
- l
- l
- o
eu -e '"hello" str.letters count'
5

Regular Expressions

Testing a Match

eu -e '"hello" str.matches?("^h.*o$")'
true

Extracting Matches

str.match-with returns the full match and capture groups:

eu -e '"192.168.0.1" str.match-with("(\d+)[.](\d+)[.](\d+)[.](\d+)") tail'
- '192'
- '168'
- '0'
- '1'

str.matches-of returns all occurrences of a pattern:

eu -e '"192.168.0.1" str.matches-of("\d+")'
- '192'
- '168'
- '0'
- '1'

Base64 and SHA-256

eu -e '"hello" str.base64-encode'
aGVsbG8=
eu -e '"hello" str.sha256'
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

Practical Examples

Generating URLs

base: "https://api.example.com"
endpoints: ["users", "posts", "comments"] map("{base}/{}")
base: https://api.example.com
endpoints:
- https://api.example.com/users
- https://api.example.com/posts
- https://api.example.com/comments

Formatting a Table

rows: [
  { name: "Alice" score: 85 }
  { name: "Bob" score: 92 }
]

` :suppress
format-row(r): "{r.name:%10s} | {r.score:%3d}"

table: rows map(format-row)

Key Concepts

  • Interpolation uses {expression} inside double-quoted strings
  • Empty braces {} and numbered braces {0}, {1} create string functions (anaphora)
  • Format specifiers follow a colon: {value:%06d}
  • Escape literal braces by doubling them: {{ and }}
  • The str namespace provides splitting, joining, case conversion, matching, and more

Functions and Combinators

In this chapter you will learn:

  • How to define and call functions
  • How destructuring parameters work
  • How currying and partial application work
  • The standard combinators: identity, const, compose, flip, and forward composition with ;
  • How to build functions from other functions without lambdas

Defining Functions

A function declaration has a parameter list in parentheses:

square(x): x * x
add(x, y): x + y
greet(name, greeting): "{greeting}, {name}!"

Functions can be called with arguments in parentheses:

a: square(5)
b: add(3, 4)
c: greet("Alice", "Hello")
a: 25
b: 7
c: Hello, Alice!

Or via catenation (see Expressions and Pipelines):

a: 5 square
b: 4 add(3)

Destructuring Parameters

Function parameters can be destructuring patterns that extract structure from an argument inline, binding its components as named variables in the function body.

Block destructuring

A block pattern binds named fields from a block argument. Shorthand form binds the field name directly:

sum-of-point({x y}): x + y

p: { x: 3 y: 4 }
result: sum-of-point(p)
result: 7

A rename form binds a field under a different local name, using a colon between the field name and the binding name:

scaled({x: a  y: b}, scale): a * scale + b * scale

result: scaled({x: 2  y: 3}, 10)
result: 50

Shorthand and rename can be mixed freely:

describe({x  y: height}): "x={x} h={height}"

result: describe({x: 1  y: 5})
result: x=1 h=5

Fixed-length list destructuring

A list pattern binds positional elements from a list argument:

add-pair([a, b]): a + b

result: add-pair([10, 20])
result: 30

Multiple elements at any position are supported:

third([a, b, c]): c

result: third([1, 2, 3])
result: 3

Head/tail list destructuring

A head/tail pattern separates a list into its first element and the remaining list. A colon inside square brackets separates the fixed elements (heads) from the tail binding:

first-of([x : xs]): x
rest-of([x : xs]): xs

a: first-of([1, 2, 3])
b: rest-of([1, 2, 3])
a: 1
b: [2, 3]

Multiple heads are separated by commas before the colon:

sum-first-two([a, b : rest]): a + b

result: sum-first-two([10, 20, 30])
result: 30

Juxtaposed call syntax

When a function takes a block or list argument, you may call it by placing the block or list immediately after the function name with no space:

sum-xy({x y}): x + y

a: sum-xy{x: 3 y: 4}    # same as sum-xy({x: 3 y: 4})
a: 7

Similarly for list arguments:

add-pair([a, b]): a + b

b: add-pair[10, 20]      # same as add-pair([10, 20])
b: 30

Combined with block destructuring, juxtaposed calls give named arguments as an emergent pattern — no extra language concept needed:

greet({name greeting}): "{greeting}, {name}!"

result: greet{name: "Alice" greeting: "Hello"}
result: Hello, Alice!

Juxtaposed definition syntax

The juxtaposed bracket syntax also works on the definition side. Writing the bracket or brace directly against the function name (no space) is sugar for the parenthesised destructuring form:

# These pairs are equivalent:
add-pair[a, b]: a + b         # sugar for add-pair([a, b]): a + b
add-block{x y}: x + y        # sugar for add-block({x y}): x + y
my-head[h : t]: h             # sugar for my-head([h : t]): h

The cons operator

The ‖ operator (U+2016, DOUBLE VERTICAL LINE) prepends a single element to a list. It is right-associative, so chains build lists left-to-right without parentheses:

a: 1 ‖ [2, 3]          # [1, 2, 3]
b: 1 ‖ 2 ‖ [3]         # [1, 2, 3]
c: 1 ‖ []              # [1]
a: [1, 2, 3]
b: [1, 2, 3]
c: [1]

The precedence of ‖ (55) is between comparison (50) and arithmetic (75), so it binds more tightly than comparisons but less tightly than addition or multiplication.

Mixing patterns

Normal parameters and destructuring patterns can be combined in any order:

weighted-sum(w, [a, b, c]): w * a + w * b + w * c

result: weighted-sum(2, [1, 3, 5])
result: 18

Multiple destructuring parameters are also allowed:

combine({x}, [a, b]): x + a + b

result: combine({x: 10}, [3, 7])
result: 20

Functions are Values

Functions are first-class values. They can be passed as arguments, returned from other functions, and stored in blocks:

apply-twice(f, x): x f f
result: apply-twice(inc, 5)
result: 7
ops: {
  double: * 2
  negate: 0 -
}

result: 5 ops.double ops.negate
result: -10

Currying

All functions are automatically curried. Providing fewer arguments than expected returns a partially applied function:

add(x, y): x + y

add-five: add(5)      # partially applied
result: add-five(3)   # completes the application
result: 8

This is particularly useful with map and filter:

multiply(x, y): x * y
triple: multiply(3)

results: [1, 2, 3] map(triple)
results:
- 3
- 6
- 9

Sections

Operators can be partially applied too. When an operator has a missing operand, eucalypt fills in an implicit parameter:

eu -e '[1, 2, 3] map(+ 10)'
- 11
- 12
- 13

Here + 10 is a section: a function that adds 10 to its argument. Similarly:

eu -e '[1, 2, 3, 4, 5] filter(> 3)'
- 4
- 5

Sections can be used as standalone values:

add: +
sub: -

result: add(2, 3)
diff: sub(8, 3)
result: 5
diff: 5

Passing operators to higher-order functions

Operators can be passed as arguments using their operator name or as a section in parentheses:

total: foldl(+, 0, [1, 2, 3, 4, 5])
total: 15

Standard Combinators

The prelude provides several fundamental combinators.

identity

Returns its argument unchanged:

eu -e '42 identity'
42

const

Returns a function that always produces the given value:

eu -e ':x const(99)'
99

Useful for replacing every element with a fixed value:

eu -e '[1, 2, 3] map(const(:done))'
- done
- done
- done

compose and ∘

Compose two functions: compose(f, g) produces a function that applies g first, then f:

eu -e '1 compose(zero?, dec)'
true

The ∘ operator is an infix form:

eu -e '(str.prefix("<") ∘ str.suffix(">"))("x")'
<x>

Forward composition ;

The ; operator composes in the other direction: f ; g applies f first, then g. This reads naturally in pipelines — left to right:

eu -e '"hello" (str.letters ; count)'
5

Forward composition is often simpler than ∘ because it follows the data flow:

` :suppress
shout: str.to-upper ; str.suffix("!")

result: "hello" shout
result: HELLO!

Use ; when building a pipeline from smaller steps:

eu -e '[3, 1, 4, 1, 5] map(inc ; (* 2))'
- 8
- 4
- 10
- 4
- 12
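
The two directions of composition can be compared side by side in a short Python model. The helper names compose and forward_compose are chosen here for illustration and stand in for ∘ and ; respectively.

```python
def compose(f, g):
    """Like f ∘ g: apply g first, then f."""
    return lambda x: f(g(x))

def forward_compose(f, g):
    """Like f ; g: apply f first, then g."""
    return lambda x: g(f(x))

inc = lambda x: x + 1
double = lambda x: x * 2

forward = [forward_compose(inc, double)(x) for x in [3, 1, 4, 1, 5]]
backward = [compose(inc, double)(x) for x in [3, 1, 4, 1, 5]]
print(forward)   # [8, 4, 10, 4, 12]  (increment, then double)
print(backward)  # [7, 3, 9, 3, 11]   (double, then increment)
```

The first list matches the output of map(inc ; (* 2)) shown above; swapping the direction of composition changes every result.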

flip

Swap the first two arguments of a function:

eu -e 'flip(-, 1, 3)'
2

flip is useful for adapting functions to a pipeline:

` :suppress
with-tags: merge flip ({ tags: [:a, :b] })

result: { name: "foo" } with-tags

complement

Negate a predicate:

eu -e '0 complement(zero?)'
false

apply

Apply a function to a list of arguments:

eu -e 'apply(+, [3, 4])'
7

uncurry

Convert a curried function to one that takes a pair (two-element list):

eu -e 'uncurry(+)([3, 4])'
7

curry

The inverse of uncurry -- convert a function expecting a pair to a curried function:

eu -e 'curry(first)("a", "b")'
a

Building Functions without Lambdas

Eucalypt does not have a lambda syntax. Instead, you build functions from:

  1. Named functions -- the clearest approach
  2. Partial application -- add(5), * 2
  3. Sections -- (+ 1), (> 0)
  4. Composition -- f ∘ g or g ; f
  5. Anaphora -- _ + 1, _0 * _0 (see next chapters)

These compose naturally:

` :suppress
process: filter(> 0) ∘ map(dec)

result: [3, 1, 0, 5, 2] process

Practical Example: Transforming Data

people: [
  { name: "alice", age: 30 },
  { name: "bob", age: 25 },
  { name: "charlie", age: 35 }
]

` :suppress
format(p): "{p.name}: age {p.age}"

directory: people
  filter(.age >= 30)
  map(format)
people:
- name: alice
  age: 30
- name: bob
  age: 25
- name: charlie
  age: 35
directory:
- 'alice: age 30'
- 'charlie: age 35'

Key Concepts

  • Functions are first-class values
  • All functions are curried: partial application is automatic
  • Sections give partial application for operators: (+ 1), (> 3)
  • Combinators like identity, const, compose (∘), forward-compose (;), and flip build new functions from existing ones
  • Prefer named functions for anything complex; use partial application and sections for simple cases

Operators

In this chapter you will learn:

  • How to define custom nullary, binary, prefix, and postfix operators
  • How precedence and associativity work
  • How to control operator behaviour with metadata
  • The built-in operators provided by the prelude

Defining Binary Operators

A binary operator is declared by writing the operand names and the operator symbol in parentheses:

(x <+> y): x + y + 1

result: 3 <+> 4
result: 8

Operator names use symbolic characters: +, -, *, /, <, >, |, &, !, @, #, ~, ^, and any Unicode symbol or punctuation characters.

Nullary Operators

A nullary operator takes no operands — it is a constant written as a symbol:

(∅): '__SET.EMPTY'
result: ∅ set.to-list
result: []

The empty set is the only nullary operator in the standard prelude, but you can define your own:

(★): 42
answer: ★

Prefix and Postfix Operators

Prefix operators have the operator before the operand:

(¬ x): not(x)
result: ¬ true
result: false

Postfix operators have the operator after the operand:

(x !!): x * x
result: 5 !!
result: 25

Precedence

Without precedence rules, operator expressions would be ambiguous. In eucalypt, precedence determines which operators bind more tightly.

The prelude defines the standard precedence levels:

Level  Name        Operators
95     prefix      ↑ (head)
90     lookup      .
88     bool-unary  !, ¬
85     exp         ^, !! (nth)
80     prod        *, /, ÷, %
75     sum         +, -
60     shift       (shift operators)
55     bitwise     (bitwise operators)
50     cmp         <, >, <=, >=
45     append      ++, <<
40     eq          =, !=
35     bool-prod   &&, ∧
30     bool-sum    ||, ∨
20     cat         (catenation)
10     apply       @
5      meta        //, //=, //=>, //=?, //!, //!!

Higher numbers bind more tightly:

eu -e '1 + 2 * 3'
7

Because * (precedence 80) binds tighter than + (precedence 75), this is parsed as 1 + (2 * 3), not (1 + 2) * 3.

Associativity

When the same operator (or operators at the same precedence) appear in sequence, associativity determines the grouping.

  • Left-associative: 1 - 2 - 3 = (1 - 2) - 3 = -4
  • Right-associative: a -> b -> c = a -> (b -> c)

Most arithmetic and comparison operators are left-associative.
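
To see how precedence and associativity together determine grouping, here is a small precedence-climbing parser in Python that prints the implied parentheses. It is a sketch of the general technique, not eucalypt's parser. The levels for +, - and * are taken from the table above; treating ^ as right-associative is an assumption made purely to demonstrate right grouping.

```python
import re

# (precedence, associativity); only four operators are modelled.
OPS = {
    "+": (75, "left"),
    "-": (75, "left"),
    "*": (80, "left"),
    "^": (85, "right"),  # right-associativity assumed for illustration
}

def tokenize(text):
    return re.findall(r"\d+|[-+*^()]", text)

def parse(tokens, min_prec=0):
    """Precedence climbing: return a fully parenthesised string."""
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # discard the closing ")"
    else:
        left = tok
    while tokens and tokens[0] in OPS and OPS[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = OPS[op]
        # A left-associative operator demands strictly tighter binding on
        # its right; a right-associative one accepts the same level again.
        right = parse(tokens, prec + 1 if assoc == "left" else prec)
        left = f"({left} {op} {right})"
    return left

print(parse(tokenize("1 + 2 * 3")))   # (1 + (2 * 3))
print(parse(tokenize("1 - 2 - 3")))   # ((1 - 2) - 3)
print(parse(tokenize("2 ^ 3 ^ 2")))   # (2 ^ (3 ^ 2))
```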

Setting Precedence and Associativity

Use declaration metadata to control your operator's precedence and associativity:

` { associates: :left
    precedence: :sum }
(x +++ y): x + y

` { associates: :right
    precedence: :prod }
(x *** y): x * y

Precedence can be specified as:

  • A named level: :sum, :prod, :exp, :cmp, :eq, :bool-prod, :bool-sum, :append, :map, :bool-unary, :cat, :apply, :meta, :shift, :bitwise
  • A numeric value: any integer (higher binds tighter)

Associativity can be :left, :right, or omitted (defaults to :left).

The Assertion Operators

Two special operators are provided for testing:

//= (assert equals)

Asserts that the left side equals the right side at runtime, and returns the value if true. Panics if false:

result: 2 + 2 //= 4

//=> (assert equals with metadata)

Like //= but also attaches the assertion as metadata:

checked: 2 + 2 //=> 4

Both are useful for embedding tests and sanity checks in code.

The Metadata Operator //

Attach metadata to any value:

tagged: 42 // { note: "the answer" }

The metadata can be retrieved with meta:

note: meta(tagged).note
note: the answer

See Advanced Topics for more on metadata.

The Deep Merge Operator <<

Deep merge combines two blocks, recursively merging nested blocks:

base: { a: { x: 1 y: 2 } b: 3 }
overlay: { a: { y: 9 z: 10 } }
result: base << overlay
base:
  a:
    x: 1
    y: 2
  b: 3
overlay:
  a:
    y: 9
    z: 10
result:
  a:
    x: 1
    y: 9
    z: 10
  b: 3

The Append Operator ++

Concatenate two lists:

eu -e '[1, 2] ++ [3, 4]'
- 1
- 2
- 3
- 4

Dot Sections

The dot operator can be used as a section to create lookup functions:

eu -e '[{x: 1}, {x: 2}, {x: 3}] map(.x)'
- 1
- 2
- 3

Idiot Brackets

Eucalypt lets you define custom Unicode bracket pairs that wrap and transform an expression. These are called idiot brackets (inspired by idiom brackets from applicative functor notation, but they are a general bracket overloading mechanism).

⌈ x ⌉: x * 2

doubled: ⌈ 3 + 4 ⌉
doubled: 14

The declaration ⌈ x ⌉: body defines a function named ⌈⌉ that takes one argument. Using ⌈ expr ⌉ in an expression calls that function with expr.

Any of the built-in Unicode bracket pairs can be used:

Open  Close  Name
⟦     ⟧      Mathematical white square brackets
⟨     ⟩      Mathematical angle brackets
⟪     ⟫      Mathematical double angle brackets
⌈     ⌉      Ceiling brackets
⌊     ⌋      Floor brackets
«     »      French guillemets

(and several others — see the syntax reference for the full list.)

Idiot brackets can also be given a monadic interpretation for sequencing — see Monads and the monad() Utility for details on bracket pair definitions with :monad metadata.

Key Concepts

  • Operators are declared with symbolic names in parentheses: (x op y):, (op x):, (x op):
  • Precedence controls binding strength; higher numbers bind tighter
  • Associativity determines grouping for equal-precedence operators
  • Use metadata to set precedence and associates on custom operators
  • The prelude provides standard arithmetic, comparison, boolean, and utility operators

Anaphora (Implicit Parameters)

Eucalypt has no lambda syntax as such and encourages other approaches in most cases where you would use a lambda:

  • named functions
  • function values from composites, combinators, partials
  • anaphoric expressions, blocks or strings

However, through the combination of two Eucalypt features, namely block anaphora and generalised lookup, you can express arbitrary lambdas as we'll see below.

The various alternatives are considered one by one.

Named functions

Very likely, the clearest way to square a list of numbers is to map an explicitly named square function across it.

square(x): x * x
squares: [1, 2, 3] map(square) //=> [1, 4, 9]

The drawbacks of this are:

  • polluting a namespace with a name that is needed only once
  • arguably, a slightly tedious verbosity

The first can be dealt with as follows:

squares: { square(x): x * x }.([1, 2, 3] map(square)) //=> [1, 4, 9]

This exploits a feature called generalised lookup.

Why "generalised lookup"? In the simple case below, the dot signifies the "lookup" of key a in the block preceding the dot:

x: { a: 3 b: 4 }.a //=> 3

We can generalise this by allowing an arbitrary expression in place of the a, evaluated in the context of the namespace introduced by the block to the left of the dot.

x: { a: 3 b: 4 }.(a + b) //=> 7

It works for any expression after the dot:

x: { a: 3 b: 4 }.[a, b] //=> [3, 4]
y: { a: 3 b: 4 }.{ c: a + b } //=> { c: 7 }
z: { a: 3 b: 4 }."{a} and {b}" //=> "3 and 4"

Warning: This is very effective for short and simple expressions but quickly gets very complicated and hard to understand if you use it too much. Nested or iterated generalised lookups are usually a bad idea.

In the squares example above, generalised lookup is used to restrict the scope in which square is visible right down to the only expression which needs it.

However in the case of a simple expression like the squaring example, a neater approach is to use expression anaphora.

Expression Anaphora

Any expression can become a function by referring to implicit parameters known as expression anaphora.

These parameters are called _0, _1, _2, and so on. There is also an unnumbered anaphor, _, which we'll come back to.

Just referring to these parameters is enough to turn an expression into a lambda.

So an expression that refers to _0 and _1 actually defines a function accepting two parameters:

f(x, y): x + 2 * y
xs: zip-with(f, [1, 2, 3], [1, 2, 3]) //=> [3, 6, 9]

# or more succinctly
xs: zip-with(_0 + 2 * _1, [1, 2, 3], [1, 2, 3]) //=> [3, 6, 9]

Warning: Anaphora are intended for use in simple cases where they are readable and readily understood. The scope of the implicit parameters is not easy to work out in complicated contexts. (It does not extend past catenation or commas in lists or function application tuples.) Anaphoric expressions are not, and not intended to be, a fully general lambda syntax. Unlike explicit lambda constructions, you cannot nest anaphoric expressions.

squares: [1, 2, 3] map(_0 * _0) //=> [1, 4, 9]

In cases where the position of the anaphora in the expression matches the parameter positions in the function call, you can omit the numbers. So, for instance, _0 + _1 can simply be written _ + _, and _0 * _1 + x * _2 can be written _ * _ + x * _.

Each _ represents a different implicit parameter, which is why we had to write _0 * _0 in our squares example - it was important that the same parameter was referenced twice.
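
As a point of comparison, these are the explicit Python lambdas that the anaphoric expressions in this section stand for. The variable names are illustrative only, and the comparison is informal: Python has no equivalent of the implicit parameters themselves.

```python
square = lambda a: a * a             # _0 * _0 : one parameter, used twice
weighted = lambda a, b: a + 2 * b    # _0 + 2 * _1
add = lambda a, b: a + b             # _ + _   : each _ is a new parameter

squares = list(map(square, [1, 2, 3]))
# map over two lists pairs elements up, much as zip-with does.
xs = list(map(weighted, [1, 2, 3], [1, 2, 3]))

print(squares)    # [1, 4, 9]
print(xs)         # [3, 6, 9]
print(add(3, 4))  # 7
```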

Sometimes you need explicit parentheses to clarify the scope of expression anaphora:

block: { a: 1 b: 2 }

x: block (_.a) //=> 1
y: block lookup(:a) //=> 1
#
# BUT NOT: block _.a
#

Sections

Even more conciseness is on offer in some cases where the anaphora can be entirely omitted. Eucalypt will automatically insert anaphora when it detects gaps in an expression based on its knowledge of an operator's type.

So it will automatically read (1 +) as (1 + _), for example, defining a function of one parameter. Or (*) as (_ * _), defining a function of two parameters. The parentheses may not even be necessary to delimit the expression:

x: foldl(+, 0, [1, 2, 3]) = 6

Again, use of sections is recommended only for short expressions or where the intention is obvious. This level of terseness can lead to baffling code if abused.

Block Anaphora

Expression anaphora are scoped by an expression which is roughly defined as something within parentheses or something which can be the right hand side of a declaration.

Sometimes however you would like to define a block-valued function. Imagine you wanted a two-parameter function which placed the parameters in a block with keys x and y:

f(x, y): {x: x y: y }

An attempt to define this using expression anaphora would fail. This defines a block with two identity functions:

f: {x: _ y: _ }

Instead, you can use block anaphora which are scoped by the block that contains them.

The block anaphora are named •0, •1, •2 with a special unnumbered anaphor •, playing the same role as _ does for expression anaphora.

• is the BULLET character (usually Option-8 on a Mac but you may find other convenient ways to type it). The slightly awkward character is chosen firstly because it looks like a hole and therefore makes sense as a placeholder, and secondly to discourage overuse of the feature...

The following defines the function we want:

f: { x: • y: • }

...and can, of course, be used:

x: [[1, 2], [3, 4], [5, 6]] map({ x: • y: • } uncurry)

Pseudo-lambdas

Astute observers may realise that by combining generalised lookup and block anaphora you end up with something that's not a million miles away from a lambda syntax:

f: { x: • y: • }.(x + y)

Indeed, this does allow declaration of anonymous functions with named parameters and can occasionally be useful, but it still falls short of a fully general lambda construction because it cannot (at least for now) be nested. Regard it as a stylistic anti-pattern and use alternatives where available.

String Anaphora

Analogously, Eucalypt's string interpolation syntax allows the use of anaphora {0}, {1}, {2} and the unnumbered {} to define functions which return strings.

x: [1, 2, 3] map("#{}") //=> ["#1", "#2", "#3"]

Summary

There are many ways to define functions, but the clearest is to declare them with names, and this should be the default for anything even slightly complicated. The only things you should be tempted to define on the spot are those simple enough for the various species of anaphora to handle neatly.

Block Manipulation

In this chapter you will learn:

  • How to merge blocks with catenation and merge
  • How to inspect, transform, and restructure blocks
  • Key prelude functions for working with blocks
  • Patterns for building and modifying configuration data

Block Merge by Catenation

When two blocks appear next to each other, the result is a shallow merge. Later values override earlier ones:

eu -e '{ a: 1 } { b: 2 }'
a: 1
b: 2
eu -e '{ a: 1 } { a: 2 }'
a: 2

This is the same as calling the merge function:

eu -e 'merge({ a: 1 }, { b: 2 })'
a: 1
b: 2

Deep Merge

Use deep-merge or the << operator for recursive merging of nested blocks:

base: { server: { host: "localhost" port: 8080 } }
override: { server: { port: 9090 debug: true } }
config: base << override
base:
  server:
    host: localhost
    port: 8080
override:
  server:
    port: 9090
    debug: true
config:
  server:
    host: localhost
    port: 9090
    debug: true

Note that << merges nested blocks but replaces lists entirely.
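
The merge rule just described (nested blocks merge, everything else is replaced) can be modelled in a few lines of Python using dictionaries for blocks. This is a sketch of the behaviour as stated here, not eucalypt's implementation; the tags key is added to the example data to show the list-replacement case.

```python
def deep_merge(base, overlay):
    """Recursively merge dicts; any non-dict value in overlay replaces base's."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value  # scalars and lists are replaced wholesale
    return merged

base = {"server": {"host": "localhost", "port": 8080}, "tags": ["a", "b"]}
override = {"server": {"port": 9090, "debug": True}, "tags": ["c"]}

config = deep_merge(base, override)
print(config)
# {'server': {'host': 'localhost', 'port': 9090, 'debug': True}, 'tags': ['c']}
```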

Inspecting Blocks

elements

Break a block into its list of key-value pairs:

eu -e '{ a: 1 b: 2 } elements'
- - a
  - 1
- - b
  - 2

Each element is a two-element list: [key, value].

keys and values

eu -e '{ a: 1 b: 2 c: 3 } keys'
- a
- b
- c
eu -e '{ a: 1 b: 2 c: 3 } values'
- 1
- 2
- 3

has

Check whether a block contains a key:

eu -e '{ a: 1 b: 2 } has(:a)'
true

lookup and lookup-or

Look up a value by symbol key:

eu -e '{ a: 1 b: 2 } lookup(:b)'
2

With a default for missing keys:

eu -e '{ a: 1 } lookup-or(:z, 99)'
99

Reconstructing Blocks

block

Build a block from a list of key-value pairs:

eu -e '[[:a, 1], [:b, 2], [:c, 3]] block'
a: 1
b: 2
c: 3

zip-kv

Build a block from separate key and value lists:

eu -e 'zip-kv([:x, :y, :z], [1, 2, 3])'
x: 1
y: 2
z: 3

merge-all

Merge a list of blocks into one:

eu -e '[{a: 1}, {b: 2}, {c: 3}] merge-all'
a: 1
b: 2
c: 3

Transforming Blocks

map-values

Apply a function to every value, keeping keys:

eu -e '{ a: 1 b: 2 c: 3 } map-values(* 10)'
a: 10
b: 20
c: 30

map-keys

Transform the keys of a block:

eu -e '{ a: 1 b: 2 } map-keys(sym ∘ str.prefix("x-") ∘ str.of)'
x-a: 1
x-b: 2

filter-values

Return the list of values that satisfy a predicate (note: this returns a list, not a block):

eu -e '{ a: 1 b: 20 c: 3 d: 40 } filter-values(> 10)'
- 20
- 40

To keep matching entries as a block, use filter-items with by-value:

eu -e '{ a: 1 b: 20 c: 3 d: 40 } filter-items(by-value(> 10)) block'
b: 20
d: 40
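The list-versus-block distinction above can be sketched in Python. The function names mirror the prelude names for readability, but this is a hypothetical model, not the prelude's implementation: a block is modelled as a dict and its elements as two-item lists.

```python
# Illustrative model only: a dict stands in for a block.

def elements(blk):
    """Break a block into [key, value] pairs."""
    return [[key, value] for key, value in blk.items()]


def block(items):
    """Rebuild a block from [key, value] pairs."""
    return {key: value for key, value in items}


def by_value(pred):
    """Lift a predicate on values to a predicate on [key, value] pairs."""
    return lambda item: pred(item[1])


def filter_values(pred, blk):
    """Keep only the matching values; the keys are discarded."""
    return [value for value in blk.values() if pred(value)]


def filter_items(pred, blk):
    """Keep the matching [key, value] pairs."""
    return [item for item in elements(blk) if pred(item)]


data = {"a": 1, "b": 20, "c": 3, "d": 40}

values_only = filter_values(lambda v: v > 10, data)
as_block = block(filter_items(by_value(lambda v: v > 10), data))

print(values_only)
print(as_block)
```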

map-kv

Apply a function to each key-value pair. The function receives the key and the value as two separate arguments (map-kv uses uncurry internally) and returns a transformed result:

eu -e '{ a: 1 b: 2 } map-kv("{}: {}")'
- 'a: 1'
- 'b: 2'

To produce a new block, combine with block:

eu -e '{ a: 1 b: 2 } map-kv(pair) block'
a: 1
b: 2

Modifying Individual Values

alter-value

Replace the value at a specific key:

config: { host: "localhost" port: 8080 }
updated: config alter-value(:port, 9090)
config:
  host: localhost
  port: 8080
updated:
  host: localhost
  port: 9090

update-value

Apply a function to the value at a specific key:

counters: { hits: 10 errors: 3 }
result: counters update-value(:hits, inc)
counters:
  hits: 10
  errors: 3
result:
  hits: 11
  errors: 3

set-value

Set a value, creating the key if it does not exist:

eu -e '{} set-value(:x, 42)'
x: 42

alter and update (nested)

Modify values deep in nested blocks using a key path:

config: { server: { db: { port: 5432 } } }
changed: config alter([:server, :db, :port], 3306)
bumped: config update([:server, :db, :port], inc)
config:
  server:
    db:
      port: 5432
changed:
  server:
    db:
      port: 3306
bumped:
  server:
    db:
      port: 5433
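One way to picture a key-path update is as a recursive copy along the path. The following Python sketch is an assumption-laden model (dicts for blocks, strings for symbol keys), not the prelude's code, and it does not model what eucalypt does when a key on the path is missing.

```python
# Illustrative model only: dicts stand in for blocks.

def update(path, fn, blk):
    """Return a copy of `blk` with `fn` applied to the value at `path`."""
    key, rest = path[0], path[1:]
    copy = dict(blk)
    if rest:
        copy[key] = update(rest, fn, blk[key])
    else:
        copy[key] = fn(blk[key])
    return copy


def alter(path, value, blk):
    """Return a copy of `blk` with the value at `path` replaced."""
    return update(path, lambda _old: value, blk)


config = {"server": {"db": {"port": 5432}}}
changed = alter(["server", "db", "port"], 3306, config)
bumped = update(["server", "db", "port"], lambda port: port + 1, config)

print(changed)
print(bumped)
print(config)  # the original is left untouched
```

Only the blocks along the path are copied, so the original value remains available under its own name, as config does in the example above.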

merge-at

Merge additional keys into a nested block:

config: { server: { db: { port: 5432 } } }
extended: config merge-at([:server, :db], { host: "10.0.0.1" })
config:
  server:
    db:
      port: 5432
extended:
  server:
    db:
      port: 5432
      host: 10.0.0.1

Patterns: Configuration Templating

A common pattern is to define a base configuration and layer environment-specific overrides on top:

base: {
  server: {
    host: "0.0.0.0"
    port: 8080
    workers: 4
  }
  logging: {
    level: "info"
    format: "json"
  }
}

production: base << {
  server: { workers: 16 }
  logging: { level: "warn" }
}

development: base << {
  server: { host: "localhost" }
  logging: { level: "debug" format: "text" }
}

Key Concepts

  • Block catenation merges two blocks; later keys override earlier ones
  • Deep merge (<<) recursively merges nested blocks
  • elements, keys, values decompose blocks; block and zip-kv reconstruct them
  • map-values and map-keys transform blocks; filter-values returns the matching values as a list
  • alter-value, update-value, set-value modify individual entries
  • alter, update, merge-at modify deeply nested values

Imports and Modules

In this chapter you will learn:

  • How to import other eucalypt files and data files
  • How to scope imports to specific declarations
  • How to use named imports for namespacing
  • How git imports work for external dependencies

Basic Imports

Import another eucalypt file using the import key in declaration metadata:

{ import: "helpers.eu" }

# Names from helpers.eu are now available
result: helper-function(42)

When the metadata is at unit level (the first item in the file), the imported names are available throughout the entire file.

Scoped Imports

Imports can be scoped to a specific declaration, limiting where the imported names are visible:

` { import: "math.eu" }
calculations: {
  # Names from math.eu are available only within this block
  result: advanced-calculation(10)
}

# math.eu names are NOT available here

Named Imports

Give an import a name to access its contents under a namespace:

{ import: "cfg=config.eu" }

host: cfg.host
port: cfg.port

This is especially useful when importing data files that might contain names which clash with your own:

{ import: "prod=production.yaml" }

url: "https://{prod.host}:{prod.port}"

Importing Multiple Files

Supply a list to import several files at once:

{ import: ["helpers.eu", "config.eu"] }

result: helper(config-value)

Named and unnamed imports can be mixed:

{ import: ["helpers.eu", "cfg=config.eu"] }

result: helper(cfg.port)

Importing Data Files

Eucalypt can import files in any supported format. The format is inferred from the file extension:

{ import: "data=records.yaml" }

first-record: data head

You can override the format when the extension is misleading:

{ import: "data=yaml@records.txt" }

Formats That Return Lists

Some formats (CSV, JSON Lines, text) produce lists rather than blocks. These must be given a name:

{ import: "rows=transactions.csv" }

total: rows map(.amount num) foldl(+, 0)

Nested Imports

Imports can be placed at any level of nesting:

deep: {
  nested: {
    ` { import: "local-config.eu" }
    config: {
      value: local-setting
    }
  }
}

Import Resolution Order

When an import path is a relative string (e.g. "helpers.eu" rather than an absolute path or a git import), eucalypt resolves it by searching in this order:

  1. Source-relative directory — the directory containing the file that contains the import declaration.
  2. Global lib path — the directories supplied via -L flags and the current working directory, searched in the order they were specified.

If the file is found in the source-relative directory it is used immediately and the global lib path is not consulted.

Transitive resolution

Resolution is always relative to the importing file, not to the entry-point file passed on the command line. This means:

  • main.eu imports "lib/utils.eu" → resolved as <dir-of-main>/lib/utils.eu
  • lib/utils.eu imports "helpers/misc.eu" → resolved as <dir-of-main>/lib/helpers/misc.eu (relative to utils.eu, not to main.eu)
  • lib/helpers/misc.eu imports "sub/detail.eu" → resolved as <dir-of-main>/lib/helpers/sub/detail.eu

Each file sees its own directory as the base for relative imports, no matter how deep the chain goes.
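The search order can be sketched as a small Python function. This is a hypothetical model of the rules stated above, not eucalypt's resolver: resolve_import and its parameters are invented for illustration, and the caller is assumed to pass the global lib path (including wherever the current working directory belongs in it) as lib_dirs.

```python
import tempfile
from pathlib import Path


def resolve_import(importing_file, import_path, lib_dirs):
    """Try the importing file's own directory first, then each lib dir."""
    candidates = [Path(importing_file).parent / import_path]
    candidates += [Path(directory) / import_path for directory in lib_dirs]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def demo():
    """Build a throwaway project tree and resolve three imports in it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "lib" / "helpers").mkdir(parents=True)
        (root / "shared").mkdir()  # hypothetical directory passed via -L
        for name in ["main.eu", "lib/utils.eu",
                     "lib/helpers/misc.eu", "shared/common.eu"]:
            (root / name).write_text("")
        lib_dirs = [root / "shared"]
        found = [
            resolve_import(root / "main.eu", "lib/utils.eu", lib_dirs),
            resolve_import(root / "lib" / "utils.eu", "helpers/misc.eu", lib_dirs),
            # Not next to utils.eu, so the lib path is consulted.
            resolve_import(root / "lib" / "utils.eu", "common.eu", lib_dirs),
        ]
        return [path.relative_to(root).as_posix() for path in found]


print(demo())
```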

Practical example

Suppose your project is laid out as follows:

project/
  main.eu
  lib/
    utils.eu
    helpers/
      misc.eu

main.eu can import lib/utils.eu using a path relative to itself:

{ import: "lib/utils.eu" }

result: util-function(42)

lib/utils.eu can import from lib/helpers/misc.eu using a path relative to its own location:

{ import: "helpers/misc.eu" }

util-function(x): misc-helper(x)

No -L flags or absolute paths are needed. If a file is not found source-relatively, eucalypt falls back to the global lib path, so existing projects that rely on -L continue to work without modification.

Git Imports

Import eucalypt code directly from a git repository. This is useful for sharing libraries without manually managing local copies:

{ import: { git: "https://github.com/user/eu-lib"
            commit: "abc123def456"
            import: "lib/helpers.eu" } }

result: lib-function(42)

The commit field is mandatory and should be a full SHA. This ensures the import is repeatable and cacheable.

Multiple git imports can be listed alongside simple imports:

{ import: [
  "local.eu",
  { git: "https://github.com/user/lib"
    commit: "abc123"
    import: "helpers.eu" }
] }

Streaming Imports

For large files, streaming imports read data lazily:

{ import: "events=jsonl-stream@events.jsonl" }

recent: events take(100)

Available streaming formats:

Format        Description
jsonl-stream  JSON Lines (one object per line)
csv-stream    CSV with headers
text-stream   Plain text (one string per line)
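Lazy reading means that only as much of the source is consumed as the program demands. A Python generator gives a rough feel for this; the sketch below is an analogy under that assumption, not a description of how eucalypt implements streaming imports, and jsonl_stream is a hypothetical helper.

```python
import io
import json
from itertools import islice


def jsonl_stream(lines):
    """Yield one parsed object per non-empty line, on demand."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# An in-memory stand-in for a large events.jsonl file.
source = io.StringIO("\n".join(json.dumps({"id": i}) for i in range(1000)))

# Comparable in spirit to `events take(3)`: only three lines are read.
recent = list(islice(jsonl_stream(source), 3))
next_unread = json.loads(source.readline())

print(recent)
print(next_unread)
```

The fourth record is still unread after the first three have been taken, which is the property that makes streaming suitable for large files.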

Combining Imports with the Command Line

Imports in .eu files complement the command line input system. You can use both together:

eu data.yaml transform.eu

Here data.yaml is a command-line input and transform.eu can also have its own { import: ... } declarations for helpers or configuration.

See The Command Line for details on the input system.

Practical Example: Configuration Layering

# base.eu
defaults: {
  host: "0.0.0.0"
  port: 8080
  workers: 4
}

# deploy.eu
{ import: "base.eu" }

production: defaults << {
  workers: 16
  host: "prod.example.com"
}

staging: defaults << {
  host: "staging.example.com"
}

Running eu deploy.eu produces layered configuration with shared defaults.

Key Concepts

  • Use { import: "file.eu" } in metadata to import files
  • Named imports ("name=file") provide namespace isolation
  • Imports can be scoped to individual declarations
  • Data files (YAML, JSON, CSV, etc.) can be imported like code
  • Relative paths resolve against the importing file's directory first, then the global lib path — no -L flags needed for co-located helpers
  • Git imports pull code directly from repositories at a specific commit
  • Streaming imports (jsonl-stream@, csv-stream@, text-stream@) handle large files lazily

Working with Data

In this chapter you will learn:

  • How to process JSON, YAML, TOML, CSV, and XML data
  • How to convert between formats on the command line
  • Patterns for querying and transforming structured data
  • How to combine multiple data sources

Format Conversion

The simplest use of eucalypt is converting between data formats. By default, output is YAML:

# JSON to YAML
eu data.json

# YAML to JSON
eu data.yaml -j

# JSON to TOML
eu data.json -x toml

Processing JSON

Pipe JSON from other tools into eucalypt:

curl -s https://api.example.com/users | eu -e 'map(.name)'

Or process a JSON file:

eu -e 'users filter(.active) map(.email)' data.json

Processing YAML

YAML files are read natively. All YAML features including anchors, aliases, and merge keys are supported:

# config.yaml
defaults: &defaults
  timeout: 30
  retries: 3

production:
  <<: *defaults
  debug: false

eu config.yaml -e 'production'
timeout: 30
retries: 3
debug: false

YAML Timestamps

YAML timestamps are automatically converted to date-time values:

created: 2024-03-15
updated: 2024-03-15T14:30:00Z

Quote the value to keep it as a string: created: "2024-03-15".

Processing TOML

eu config.toml -e 'database.port'
5432

Processing CSV

CSV files are imported as a list of blocks, where each row becomes a block with column headers as keys:

eu -e 'rows filter(_.age num > 30)' rows=people.csv

CSV values are always strings. Use num to convert to numbers when needed.
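The same shape of data, and the same need for explicit conversion, can be seen with Python's csv module. This is an analogy rather than eucalypt code; the sample rows are hypothetical and int stands in for num.

```python
import csv
import io

# Hypothetical contents of people.csv.
people_csv = io.StringIO("name,age\nAda,36\nBob,29\n")

# Each row becomes a mapping from column header to a string value.
rows = list(csv.DictReader(people_csv))

# Convert explicitly before comparing, as `num` does in the example above.
over_30 = [row["name"] for row in rows if int(row["age"]) > 30]

print(rows[0]["age"], type(rows[0]["age"]).__name__)
print(over_30)
```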

Processing XML

XML is imported as a nested list structure. Each element is represented as [tag, attributes, ...children]:

eu -e 'root' root=xml@data.xml

Use list functions to navigate the structure:

{ import: "root=xml@data.xml" }

# Get the tag name (first element)
tag: root first

# Get attributes (second element)
attrs: root second

# Get child elements (everything after the first two)
children: root drop(2)
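To make the [tag, attributes, ...children] shape concrete, the Python sketch below converts a small document into nested lists of that form. It is a model under stated assumptions: the sample XML is hypothetical, and representing text content as trimmed strings among the children is a guess for illustration, not something this section specifies.

```python
import xml.etree.ElementTree as ET


def to_nested_list(element):
    """Convert an element into [tag, attributes, ...children]."""
    node = [element.tag, dict(element.attrib)]
    # Assumption: text content appears as a string child, whitespace trimmed.
    if element.text and element.text.strip():
        node.append(element.text.strip())
    for child in element:
        node.append(to_nested_list(child))
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node


document = '<catalog region="eu"><item id="1">Widget</item><item id="2"/></catalog>'
root = to_nested_list(ET.fromstring(document))

tag = root[0]        # like `root first`
attrs = root[1]      # like `root second`
children = root[2:]  # like `root drop(2)`

print(root)
```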

Named Inputs

Use named inputs to make data available under a specific name:

eu users=users.json roles=roles.json -e 'users map(.name)'

Named inputs are essential for list-based formats (CSV, JSON Lines, text):

eu lines=text@log.txt -e 'lines filter(str.matches?("ERROR")) count'

Combining Multiple Sources

A powerful pattern is combining data from multiple sources:

eu users.yaml roles.yaml merge.eu

Where merge.eu contains logic that uses names from both inputs:

# merge.eu
summary: {
  user-count: users count
  role-count: roles count
}

Using Evaluands

The -e flag specifies an expression to evaluate against the loaded inputs:

# Select a nested value
eu config.yaml -e 'database.host'

# Transform and filter
eu data.json -e 'items filter(.price > 100) map(.name)'

# Aggregate
eu data.json -e 'items map(.price) foldl(+, 0)'

Collecting Inputs

The --collect-as (-c) flag gathers multiple files into a list:

eu -c configs *.yaml -e 'configs map(.name)'

Add --name-inputs (-N) to get a block keyed by filename:

eu -c configs -N *.yaml
configs:
  a.yaml:
    name: alpha
  b.yaml:
    name: beta

Output Formats

Control the output format:

Flag       Format
(default)  YAML
-j         JSON
-x json    JSON
-x toml    TOML
-x edn     EDN
-x text    Plain text

The format can also be inferred from the output file:

eu data.yaml -o output.json

Practical Example: Data Pipeline

Suppose you have a CSV of sales data and want to generate a JSON summary:

eu sales=sales.csv -j -e '{
  total: sales map(.amount num) foldl(+, 0)
  count: sales count
  regions: sales map(.region) unique
}'

Or as a reusable eucalypt file:

# report.eu
{ import: "sales=sales.csv" }

` :suppress
amounts: sales map(.amount num)

report: {
  total: amounts foldl(+, 0)
  count: sales count
  average: report.total / report.count
}

eu report.eu -j -e report

Key Concepts

  • Eucalypt reads JSON, YAML, TOML, CSV, XML, EDN, JSON Lines, and plain text
  • Output defaults to YAML; use -j for JSON or -x for other formats
  • Named inputs (name=file) give data a name for reference
  • The -e flag evaluates expressions against loaded data
  • --collect-as gathers multiple files into a list or block
  • CSV values are strings; use num to convert to numbers
  • Combine multiple sources with the command line input system or imports

The Command Line

In this chapter you will learn:

  • The eu command structure and subcommands
  • How to specify inputs, outputs, and evaluands
  • How to use targets, arguments, and environment variables
  • How to use the formatter and other tools

Command Structure

eu [GLOBAL_OPTIONS] [SUBCOMMAND] [SUBCOMMAND_OPTIONS] [FILES...]

When no subcommand is given, run is assumed:

eu file.eu          # same as: eu run file.eu

Subcommands

Command       Description
run           Evaluate and render (default)
test          Run embedded tests
dump          Dump intermediate representations
version       Show version information
explain       Show what would be executed
list-targets  List export targets
fmt           Format source files
lsp           Start the Language Server Protocol server

Inputs

File Inputs

Specify one or more files to process:

eu data.yaml transform.eu

Inputs are merged left to right. Names from earlier inputs are available to later ones. The final input determines what is rendered.

stdin

Use - to read from stdin, or simply pipe data when no files are specified:

curl -s https://api.example.com/data | eu -e 'items count'

Format Override

Override the assumed format with a format@ prefix:

eu yaml@data.txt json@-

Named Inputs

Prefix with name= to make the input available under a name:

eu config=settings.yaml app.eu

In app.eu, the YAML content is available as config:

port: config.port

Collecting Inputs

Gather multiple files into a named collection:

eu -c data *.json -e 'data map(.name)'

Add -N to key by filename:

eu -c data -N *.json

Outputs

Format

Output defaults to YAML. Common options:

eu file.eu            # YAML (default)
eu file.eu -j         # JSON (shortcut)
eu file.eu -x json    # JSON (explicit)
eu file.eu -x toml    # TOML
eu file.eu -x text    # Plain text

Output File

Write to a file (format inferred from extension):

eu data.eu -o output.json

Evaluands

The -e flag specifies an expression to evaluate:

eu -e '2 + 2'
4

When combined with file inputs, the evaluand has access to all loaded names:

eu data.yaml -e 'users filter(.active) count'

Multiple -e flags are allowed; the last one determines the output.

Quick Expressions

Use -e for quick data exploration:

# Inspect a value
eu config.yaml -e 'database'

# Count items
eu data.json -e 'items count'

# Extract and transform
eu data.json -e 'items map(.name) reverse'

Targets

Declarations annotated with :target metadata can be selected for rendering:

# multi-output.eu
` { target: :summary }
summary: { count: items count }

` { target: :detail }
detail: items

List available targets:

eu list-targets multi-output.eu

Select a target:

eu -t summary multi-output.eu

A target named main is rendered by default. If no :main target exists, the entire unit is the target.

Passing Arguments

Arguments after -- are available via io.args:

eu -e 'io.args' -- hello world
- hello
- world

Use in scripts:

# greet.eu
name: io.args head-or("World")
greeting: "Hello, {name}!"

eu greet.eu -e greeting -- Alice
Hello, Alice!

Arguments are strings. Use num to convert:

numbers: io.args map(num)
total: numbers foldl(+, 0)

Environment Variables

Access environment variables through io.env:

home: io.env lookup-or(:HOME, "/tmp")
path: io.env lookup(:PATH)

Random Seed

By default, random numbers use system entropy. Use --seed for reproducible output:

eu --seed 42 template.eu

The Formatter

Format eucalypt source files:

eu fmt file.eu              # print formatted to stdout
eu fmt --write file.eu      # format in place
eu fmt --check file.eu      # check (exit 1 if unformatted)
eu fmt --reformat file.eu   # full reformatting

Options:

  • -w, --width <N> -- line width (default: 80)
  • --indent <N> -- indent size (default: 2)

Debugging

Dumping Intermediate Representations

eu dump ast file.eu         # syntax tree
eu dump desugared file.eu   # core expression
eu dump stg file.eu         # compiled STG

Explaining Execution

eu explain file.eu          # show what would be executed

Statistics

eu -S file.eu               # print metrics to stderr

Batch Mode

Use -B for repeatable builds (disables ergonomic mode and ~/.eucalypt):

eu -B file.eu

Suppressing the Prelude

The standard prelude is loaded automatically. Suppress it with -Q:

eu -Q file.eu

Warning: Without the prelude, even true, false, if, and basic operators are unavailable.

Version Assertions

Ensure a minimum eu version in source files:

_ : eu.requires(">=0.3.0")

Check the current version:

eu version

LSP Server

Start a Language Server Protocol server for editor integration:

eu lsp

Provides syntax error diagnostics and formatting support.

Key Concepts

  • eu defaults to run when no subcommand is given
  • Inputs are merged left to right; the final input determines output
  • Named inputs (name=file) provide namespace isolation
  • -e evaluates an expression against loaded inputs
  • -t selects a named target for rendering
  • -- passes arguments available via io.args
  • eu fmt formats source files; eu test runs tests; eu lsp starts the language server

YAML Embedding

Eucalypt can be embedded in YAML files via the following tags:

  • eu
  • eu::suppress
  • eu::fn

The YAML embedding is not as capable as the native Eucalypt syntax but it is rich enough to be used for many YAML templating use cases, particularly when combined with the ability to specify several inputs on the command line.

Evaluating eucalypt expressions

As you would expect, YAML mappings correspond to Eucalypt blocks and bind names just as Eucalypt blocks do and YAML sequences correspond to Eucalypt lists.

YAML allows a wide variety of forms of expressing these (block styles and flow styles), to the extent that JSON is valid YAML.

Eucalypt expressions can be evaluated using the !eu tag and have access to all the names defined in the YAML unit and any others brought into scope by specifying inputs on the command line.

values:
  x: world
  y: hello

result: !eu "{values.y} {values.x}!"

...will render as:

values:
  x: world
  y: hello

result: hello world!

Suppressing rendering

Items can be hidden using the eu::suppress tag. This is equivalent to :suppress metadata in the eucalypt syntax.

values: !eu::suppress
  x: world
  y: hello

result: !eu "{values.y} {values.x}!"

...will render as:

result: hello world!

Defining functions

Functions can be defined using eu::fn and supplying an argument list:

values: !eu::suppress
  x: world
  y: hello
  greet: !eu::fn (h, w) "{h} {w}!"

result: !eu values.greet(values.y, values.x)

...will render as:

result: hello world!

The escape hatch

Larger chunks of eucalypt syntax can be embedded using YAML block scalars combined with !eu. Using this workaround you can access capabilities of eucalypt that are not yet available in the YAML embedding. (Operators, however, cannot be made available in YAML blocks because of the way that operator names are bound; see Operator Precedence Table.)

block: !eu |
  {
    x: 99
    (l ^^^ r): "{l} <_> {r}"
    f(n): n ^^^ x
  }

result: !eu block.f(99)

Testing with Eucalypt

Eucalypt has a built-in test runner which can be used to run tests embedded in eucalypt files.

Test mode is invoked by the eu test subcommand and:

  • analyses the file to build a test plan consisting of a list of test targets and validations to run
  • executes the test plan and generates an evidence file
  • applies validations against the evidence to generate a results file
  • outputs results and generates an HTML report

Simple tests

By default eucalypt searches for targets beginning with test- and runs each one to render YAML output. The result is read back in and parsed, and eucalypt checks for the presence of a RESULT key. If it finds the key and the value is PASS, the test passes. Anything else is considered a fail.

my-add(x, y): x + y

` { target: :test-add }
test: {
  RESULT: (my-add(2, 2) = 4) then(:PASS, :FAIL)
}

Several test targets can be embedded in one file. Each is run as a separate test.
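The pass/fail rule for simple tests can be summarised as a short Python model. This is a sketch of the rule as stated above, not the test runner's code: run_simple_test is a hypothetical name, and JSON stands in for the rendered format so the example needs only the standard library.

```python
import json


def run_simple_test(target_value):
    """Render the target, read it back in, and look for RESULT: PASS."""
    rendered = json.dumps(target_value)  # stands in for rendering the target
    parsed = json.loads(rendered)        # the output is parsed back in
    passed = isinstance(parsed, dict) and parsed.get("RESULT") == "PASS"
    return "PASS" if passed else "FAIL"


print(run_simple_test({"RESULT": "PASS"}))
print(run_simple_test({"RESULT": "FAIL"}))
print(run_simple_test({"answer": 4}))  # no RESULT key at all
```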

Test files

If your intention is not to embed tests in a eucalypt file but instead to write a test as a single file, then you can omit the test targets. Eucalypt will use a main target or run the entire file as usual and then validate the result (looking for a RESULT key, by default).

Other formats

In test mode, eucalypt processes the test subject to generate output and then parses that back to validate the result. This is to provide for validation of the rendered text and the parsing machinery.

By default YAML is generated and parsed back for each test target in the file but other formats can be selected in header metadata.

{
  test-targets: [:yaml, :json]
}

` { target: :test-add }
add: {
  RESULT: (2 + 2 = 4) then(:PASS, :FAIL)
}

` { target: :test-sub }
sub: {
  RESULT: (2 - 2 = 0) then(:PASS, :FAIL)
}

Running this file using eu test will result in four tests being run, two formats for each of the two targets.
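The count of four comes from pairing every target with every format. As a small illustration in Python (the order in which the real test plan lists its entries is an assumption here):

```python
from itertools import product

targets = ["test-add", "test-sub"]
formats = ["yaml", "json"]

# One test per (target, format) combination.
plan = list(product(targets, formats))

for target, fmt in plan:
    print(target, fmt)
```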

Using the default validator, for all formats for which eucalypt provides import and export capability, it shouldn't make any difference which format is used. However, custom validators provide the ability to check the precise text that is rendered.

Custom validators

When a test runs, the execution generates an evidence block which has the following keys:

  • exit: the exit code (0 on success) of the eucalypt execution
  • stdout: text as a list of strings
  • stderr: text as a list of strings
  • result: the stdout parsed back
  • stats: some statistics from the run

Date, Time, and Random Numbers

Zoned Date-Time (ZDT) Values

Eucalypt has native support for date-time values through the ZDT (Zoned Date-Time) type. ZDT values represent a point in time with timezone information.

ZDT Literals

Use the t"..." prefix to write date-time literals directly in eucalypt source:

today: t"2024-03-15"
meeting: t"2024-03-15T14:30:00Z"
local: t"2024-03-15T14:30:00+01:00"

The t"..." syntax accepts ISO 8601 formats:

Format              Example                        Notes
Date only           t"2024-03-15"                  Midnight UTC
UTC                 t"2024-03-15T14:30:00Z"
With offset         t"2024-03-15T14:30:00+05:00"
Fractional seconds  t"2024-03-15T14:30:00.123Z"

Parsing and Formatting

The cal namespace provides functions for working with date-time values:

# Parse from a string
d: cal.parse("2024-03-15T14:30:00Z")

# Format to a custom string
label: t"2024-03-15" cal.format("%Y-%m-%d")  # "2024-03-15"

Date-Time Arithmetic

ZDT values support comparison operators:

before: t"2024-01-01" < t"2024-12-31"   # true
same: t"2024-03-15" = t"2024-03-15"     # true

Sorting Date-Times

dates: [t"2024-12-25", t"2024-01-01", t"2024-07-04"]
sorted: dates sort-zdts  # [Jan 1, Jul 4, Dec 25]
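The literal forms, comparisons, and sorting shown above behave much like timezone-aware datetimes in other languages. The Python sketch below is an analogy, not eucalypt's implementation: parse_zdt is a hypothetical helper, and strftime is used as a stand-in for cal.format on the assumption that the pattern letters agree for this simple case.

```python
from datetime import datetime, timezone


def parse_zdt(text):
    """Parse an ISO 8601 string; a bare date is taken as midnight UTC."""
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


today = parse_zdt("2024-03-15")
meeting = parse_zdt("2024-03-15T14:30:00Z")
local = parse_zdt("2024-03-15T14:30:00+01:00")

# Values compare as points in time, so an offset changes the ordering.
local_is_earlier = local < meeting

dates = [parse_zdt("2024-12-25"), parse_zdt("2024-01-01"), parse_zdt("2024-07-04")]
ordered = [d.strftime("%Y-%m-%d") for d in sorted(dates)]

print(today.isoformat())
print(local_is_earlier)
print(ordered)
```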

YAML Timestamps

When importing YAML files, unquoted timestamp values are automatically converted to ZDT values:

created: 2024-03-15
updated: 2024-03-15T14:30:00Z

Quote the value to keep it as a string: created: "2024-03-15".

See Import Formats for full details.

Current Time

The io.epoch-time binding provides the current Unix epoch time in seconds:

now: io.epoch-time

Random Numbers

Eucalypt provides pseudo-random number generation via the random namespace, which is a state monad over a PRNG stream.

The random stream

A random stream is provided at startup as io.random. Each run produces different values unless you supply a seed:

eu --seed 42 example.eu

Single random values

For a single random value, pass io.random as the stream:

roll: random.int(6, io.random).value + 1
colour: random.choice(["red", "green", "blue"], io.random).value

Each operation returns a {value, rest} block — extract .value to get the result.

Multiple random values

When you need several random values, you must propagate the stream from each call to the next. Reusing io.random would give the same value each time:

# WRONG — both use the same stream, so d1 = d2
d1: random.int(6, io.random).value
d2: random.int(6, io.random).value

# RIGHT — thread the stream manually
r1: random.int(6, io.random)
r2: random.int(6, r1.rest)
total: r1.value + r2.value + 2

This manual threading is error-prone. The random monad can ease the overhead somewhat — use a { :random ... } monadic block or combinators like sequence and map-m:

# Monadic block — stream threads automatically
dice: { :random
  d6: random.int(6)
  d20: random.int(20)
}.[d6, d20]

result: dice(io.random).value  # e.g. [3, 14]

# Or use sequence for a list of actions
two-dice: random.sequence([random.int(6), random.int(6)], io.random).value
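Why reusing a stream repeats the value, and why threading it does not, becomes clearer with a pure generator modelled in Python. This is a sketch under loud assumptions: the linear congruential generator below is chosen for brevity and is not the generator eucalypt uses, and random_int and run_sequence are hypothetical stand-ins for random.int and random.sequence.

```python
# Illustrative model only: an integer stands in for the PRNG stream.
MODULUS = 2 ** 32
MULTIPLIER = 1664525      # well-known LCG constants, assumed for the sketch
INCREMENT = 1013904223


def random_int(n, stream):
    """Pure step: the same stream always gives the same {value, rest}."""
    advanced = (MULTIPLIER * stream + INCREMENT) % MODULUS
    return {"value": (advanced >> 16) % n, "rest": advanced}


def run_sequence(actions, stream):
    """Thread the stream through a list of actions, collecting values."""
    values = []
    for action in actions:
        step = action(stream)
        values.append(step["value"])
        stream = step["rest"]
    return {"value": values, "rest": stream}


seed = 42

# Reusing the same stream: both calls are the same computation.
d1 = random_int(6, seed)
d2 = random_int(6, seed)

# Threading the stream: the second call starts where the first ended.
r1 = random_int(6, seed)
r2 = random_int(6, r1["rest"])

two_dice = run_sequence([lambda s: random_int(6, s)] * 2, seed)

print(d1 == d2)
print(r1["rest"] != r2["rest"])
print(two_dice["value"])
```

The run_sequence helper does exactly the bookkeeping that the monadic block and sequence take off your hands.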

Other operations

shuffled: random.shuffle(["a", "b", "c", "d"], io.random).value
picked: random.sample(3, range(1, 50), io.random).value

See the Random Numbers reference for the full API and the Monads guide for details on the state monad pattern.

IO and Shell Commands

As of version 0.5.0 Eucalypt can execute IO by invoking shell commands, via the IO monad.

Eucalypt is not a scripting language. It is a small, lazy, dynamically typed, pure functional language for transforming and templating semi-structured data formats, and that remains its sweet spot. Do not get too ambitious.

However, prior to the IO monad, arranging the sources of data in and out of the program was limited to what could be named or streamed in from the shell, and piped to pre-planned destinations. Allowing more direct control over IO within the process (while staying within the functional paradigm) allows much more flexibility.

IO operations are sequenced strictly by the runtime and the monad idiom.

All IO operations require the --allow-io / -I flag.


The IO monad

This is not a monad tutorial. Suffice it to say that an IO action (which may have side effects) is represented by an object in Eucalypt. The monad and the runtime together ensure that the action is run at the appropriate time and that its output is safely incorporated into the computation without imperiling the referential transparency of non-IO code.

In initial versions of this functionality, the only IO capability available is running shell programs or binaries from eucalypt, and passing data in and out of them. Data passed into and out of the shelled program is not streamed incrementally, but buffered whole. This is not suitable for large data pipelines.

The monad provides several ways to create IO actions, and string them together. When you run a eucalypt program, if the value of the named target (or :main target) is an IO action, that is run.

Just embedding an IO action in a data structure is not sufficient to have it execute. It must either be the value of the top-level target of a program, or invoked in some way by that action.

The key monad functions bind and return are defined in the io namespace, along with several traditional monad combinators. This supports the use of { :io ... } monadic blocks to combine several monad actions into one. Monads and the monad() Utility explains Eucalypt's monad machinery in a little more detail.

A simple IO action that performs no IO at all but represents a constant value can be created with io.return, for example:

greeting: io.return("hello")

To create more interesting IO actions, read on.

Running a shell command

The simplest way to run a command is io.shell:

result: "echo hello" io.shell

This creates an IO action which runs the specified command via sh -c and returns a block:

stdout: "hello\n"
stderr: ""
exit-code: 0

To extract a specific field, either use generalised lookup syntax on an :io monad block...

{ :io r: io.shell("echo hello") }.(r.stdout)

... or use io.map to apply a section within the IO monad chain.

"echo hello" io.shell io.map(.stdout)

The { :io ... } monadic block

A block tagged :io desugars into nested io.bind calls. Each field is a bind step; the name becomes available in all subsequent steps. The .() expression after the closing brace is the return value.

{ :io
  r: io.shell("echo hello")
  _: io.check(r)
}.(r.stdout)

Important: unlike normal blocks, monadic blocks bind names sequentially — each step can only refer to names from earlier steps, not later ones. See Monads and the monad() Utility for full details on monadic block syntax, forms, and the sequential binding constraint.
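The desugaring into nested binds, and the fact that nothing runs until the finished action is executed, can be modelled in Python. This is a hypothetical sketch, not eucalypt's runtime: an action is modelled as a zero-argument function, and fake_shell returns a canned result instead of running a command.

```python
# Illustrative model only: an IO action is a zero-argument callable.
log = []


def io_return(value):
    """An action that performs no IO and yields `value`."""
    return lambda: value


def io_bind(action, continuation):
    """Run `action`, pass its result on, then run the action that comes back."""
    def run():
        result = action()
        return continuation(result)()
    return run


def fake_shell(command):
    """Stand-in for io.shell: records the command, returns a canned block."""
    def run():
        log.append(command)
        return {"stdout": "hello\n", "stderr": "", "exit-code": 0}
    return run


def check(result):
    """Stand-in for io.check: fail on a non-zero exit code."""
    def run():
        if result["exit-code"] != 0:
            raise RuntimeError(result["stderr"])
        return result
    return run


# Model of: { :io r: io.shell("echo hello")  _: io.check(r) }.(r.stdout)
program = io_bind(fake_shell("echo hello"),
                  lambda r: io_bind(check(r),
                                    lambda _ignored: io_return(r["stdout"])))

ran_before = list(log)  # building the action has not run anything yet
output = program()      # running it performs the steps in order

print(ran_before, log, repr(output))
```

Each continuation is nested inside the previous one, which is why a step can see names bound by earlier steps but not by later ones.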


Checking for errors

io.check inspects a command result. If the exit code is non-zero, it fails the IO monad with the stderr message. Otherwise it returns the result unchanged:

{ :io
  r: io.shell("grep pattern file.txt")
  _: io.check(r)
}.(r.stdout)

If grep finds no matches (exit code 1), this produces an io.fail error with whatever was on stderr.


Exec: running a binary directly

io.exec runs a binary without going via the shell. The argument is a list where the first element is the command and the rest are arguments:

{ :io
  r: io.exec(["git", "rev-parse", "HEAD"])
}.(r.stdout)

If the binary does not exist, io.exec returns a result block with exit-code 127 and the OS error in stderr, rather than failing outright.
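The shape of the result block, including the exit-code 127 case for a missing binary, can be mimicked with Python's subprocess module. This is a model of the behaviour described above, not eucalypt's implementation: exec_model is a hypothetical name and the binary name is deliberately one that should not exist.

```python
import subprocess
import sys


def exec_model(argv):
    """Run a binary directly (no shell) and return a result block."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True)
    except FileNotFoundError as err:
        # Report the failure in-band rather than raising.
        return {"stdout": "", "stderr": str(err), "exit-code": 127}
    return {"stdout": done.stdout, "stderr": done.stderr,
            "exit-code": done.returncode}


missing = exec_model(["no-such-binary-for-this-example"])
present = exec_model([sys.executable, "-c", "print('hello')"])

print(missing["exit-code"])
print(present["exit-code"], present["stdout"].strip())
```

Because a missing binary yields an ordinary result block, a caller can decide whether to treat it as fatal, for example by passing the result through a check step.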


Options: stdin, timeout

Both io.shell-with and io.exec-with accept an options block as the first argument. This is merged into the spec block, overriding defaults:

"cat" io.shell-with{stdin: "hello world", timeout: 60} io.map(.stdout)

Available options:

Option   Default  Description
stdin    (none)   String to pipe to the command's standard input
timeout  30       Maximum seconds before the command is killed

The pipeline style reads naturally: the command string flows into shell-with which receives the options.
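The phrase "merged into the spec block, overriding defaults" amounts to a simple merge in which caller options win. A Python sketch follows; the spec layout and the cmd key are hypothetical, since this section does not show the real spec block, and only the timeout default of 30 comes from the table above.

```python
# Illustrative model only: the real spec block's keys are not shown here.
DEFAULTS = {"timeout": 30}


def shell_with(options, command):
    """Build a command spec: defaults first, caller options on top."""
    return {"cmd": command, **DEFAULTS, **options}


plain = shell_with({}, "cat")
custom = shell_with({"stdin": "hello world", "timeout": 60}, "cat")

print(plain)
print(custom)
```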


Combining IO actions

Sequencing with bind

io.bind chains two actions. The continuation receives the result of the first action:

io.bind(io.shell("echo hello"),
  _(r): io.shell("echo got: {r.stdout}"))

The { :io ... } block is almost always preferable to explicit io.bind calls.

Mapping over a result

io.map applies a pure function to the result of an action without needing a new IO step. The function can, of course, be a composition of several functions:

` :main
result: "curl https://example.com/test.json" io.shell io.map((.stdout) ; parse-as(:json))

Failing explicitly

io.fail aborts the IO monad with an error message:

{ :io
  r: io.shell("some-command")
  _: (r.exit-code = 0) then(io.return(r), io.fail("command failed: {r.stderr}"))
}.(r.stdout)

This is what io.check does internally — it is a convenience wrapper around this pattern:

` :main
greeting: "echo hello world" io.shell io.check io.map(.stdout)

Practical examples

Git commit hash

` :main
hash: "git rev-parse --short HEAD" io.shell io.check io.map(.stdout)

Run a command and parse the output as JSON

` :main
data: "curl -s https://api.example.com/data" 
  io.shell 
  io.check 
  io.map((.stdout) ; parse-as(:json))

Pipe data through a command

` :main
text: { :io
  r: "jq '.name'" io.shell-with({stdin: render-as(:json, data)})
}.(r.stdout)

Multiple commands in sequence

` :main
main: { :io
  a: io.shell("date +%s")
  b: io.shell("hostname")
}.(
  { timestamp: a.stdout 
    host: b.stdout }
)

or, equivalently:

io.sequence[io.shell("date +%s") io.map(.stdout),
            io.shell("hostname") io.map(.stdout)] with-keys[:timestamp, :host]


Testing IO code

Test targets that use IO should include requires-io: true in their target metadata. The test runner (eu test) skips these tests gracefully when --allow-io is not set:

` { target: :test requires-io: true }
test:
  { :io r: io.exec(["echo", "hello"]) }.(
    if(r.stdout str.matches?("hello.*"),
      { RESULT: :PASS },
      { RESULT: :FAIL }))

Reference

For the full API table, see the IO prelude reference.

For the monadic programming model, monad() utility, and how to build your own monads, see Monads and the monad() Utility.

Monads and the monad() Utility

Eucalypt has first-class support for monadic programming. Two built-in namespaces — io and random — are monads, and the monad() utility lets you build additional ones from just bind and return.

What is a monad in eucalypt?

A monad is just two primitives, which are conventionally grouped in a namespace.

Primitive         Role
return(v)         Wrap a pure value as a monadic action
bind(action, f)   Run an action, pass its result to f, return a new action

With these two primitives available, an alternative interpretation of the block structure, the monadic block, can be used.

Other monadic functions — map, then, join, sequence, map-m, filter-m — are typically provided in the same namespace and can be derived automatically with the aid of the monad function.

Monads are required for performing IO from eucalypt code and may be used to simplify random number access. Users may define monads themselves and assign unicode brackets to them if they so wish.


Monadic blocks

Eucalypt provides syntactic sugar for chaining monadic actions. A block tagged with a monad namespace name desugars into nested bind calls automatically.

The most common form tags the block with :io:

{ :io
  r: io.shell("echo hello")
  _: io.check(r)
}.(r.stdout)

This desugars to:

io.bind(io.shell("echo hello"),
  λr. io.bind(io.check(r),
    λ_. io.return(r.stdout)))

Each field becomes a bind step. The bound name is available in all subsequent steps. The .() expression after the closing brace is the return expression, wrapped in the monad's return.
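To make the desugaring rule concrete, here is a small Python model of it. This is a sketch for intuition only, not eucalypt code and not how the eucalypt desugarer is implemented: the helper name desugar, the environment dictionary, and the use of an identity monad (bind(m, f) = f(m), return(a) = a) are all assumptions made for illustration.

```python
# Python model of monadic-block desugaring (illustrative only).

def desugar(bind, ret, steps, result):
    """Build nested bind calls from an ordered list of (name, make_action) steps.

    make_action and result both receive a dict of the names bound so far.
    """
    def go(index, env):
        if index == len(steps):
            # The expression after the closing brace is wrapped in return.
            return ret(result(env))
        name, make_action = steps[index]
        # Each step only sees names bound by earlier steps.
        return bind(make_action(env),
                    lambda value: go(index + 1, {**env, name: value}))
    return go(0, {})


# Identity monad: an action is just its value.
def id_bind(action, f):
    return f(action)


def id_return(value):
    return value


# Model of a block with steps x: 10 and y: x + 5, returning x + y.
total = desugar(id_bind, id_return,
                [("x", lambda env: 10),
                 ("y", lambda env: env["x"] + 5)],
                lambda env: env["x"] + env["y"])
print(total)

# A step that refers to a name bound by a later step fails.
saw_unbound = False
try:
    desugar(id_bind, id_return,
            [("a", lambda env: env["b"] + 1),
             ("b", lambda env: 1)],
            lambda env: env["a"])
except KeyError:
    saw_unbound = True
print("forward reference rejected:", saw_unbound)
```

Because every continuation is nested inside the previous bind, a name simply does not exist until its own step has run, which is why the order of declarations matters in a monadic block.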

Key constraint: sequential binding

Unlike normal blocks, names in a monadic block can only refer to names bound in earlier steps. Normal eucalypt blocks are declarative — bindings can refer to each other in any order. Monadic blocks are sequential — each step can only see what came before it, because the desugaring nests each continuation inside the previous one.

# WRONG: b is not yet bound when a is evaluated
{ :io
  a: io.shell("echo {b.stdout}")
  b: io.shell("echo 1")
}.(a.stdout)

# RIGHT: b is bound before a uses it
{ :io
  b: io.shell("echo 1")
  a: io.shell("echo {b.stdout}")
}.(a.stdout)

Block metadata forms

Several syntax forms are available for monadic blocks:

Form   Syntax                                        Monad source
1      { :name decls }.expr                          Namespace name in scope
2      { { monad: name } decls }.expr                Namespace name in scope
3      { { :monad namespace: name } decls }.expr     Namespace name in scope
4      { { :monad bind: f return: r } decls }.expr   Explicit f/r functions

Form 1 is the most common — { :io ... } tags a block with the io namespace. The desugarer looks up io.bind and io.return automatically.

Custom bracket pairs

You can define bracket pairs for monadic notation using the :monad metadata:

⟦{}⟧: { :monad bind: my-bind  return: my-return }

result: ⟦ x: some-action  y: other-action(x) ⟧.(x + y)

See the syntax reference for full details on bracket pair definitions.


The monad() utility

monad(m) takes a block with bind and return fields and returns a block of derived combinators:

my-monad: monad{bind: my-bind, return: my-return}

The returned block provides:

Combinator        Description
bind(action, f)   Passed through from m.bind
return(v)         Passed through from m.return
map(f, action)    Apply pure function f to the result of an action (fmap)
then(b, a)        Sequence two actions, discarding the result of the first. Pipeline: a m.then(b)
join(mm)          Flatten a nested monadic value
sequence(ms)      Run a list of actions in order, collecting results
map-m(f, xs)      Apply f to each element of xs, then sequence
filter-m(p, xs)   Monadic filter: keep elements where p returns a truthy action
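The following Python sketch shows one way the six derived combinators can be written in terms of bind and return alone. It is a model for intuition, not the prelude's actual source: the argument orders follow the table above, while the dictionary layout and the list-based "maybe" monad used to exercise it ([] for nothing, [a] for a value) are assumptions chosen for the example.

```python
def monad(bind, ret):
    """Derive map, then, join, sequence, map-m and filter-m from bind and return."""

    def map_(f, action):
        return bind(action, lambda value: ret(f(value)))

    def then(b, a):
        # Run a, discard its result, then run b.
        return bind(a, lambda _ignored: b)

    def join(nested):
        return bind(nested, lambda inner: inner)

    def sequence(actions):
        if not actions:
            return ret([])
        first, rest = actions[0], actions[1:]
        return bind(first, lambda value:
                    bind(sequence(rest), lambda values: ret([value] + values)))

    def map_m(f, xs):
        return sequence([f(x) for x in xs])

    def filter_m(p, xs):
        if not xs:
            return ret([])
        head, tail = xs[0], xs[1:]
        return bind(p(head), lambda keep:
                    bind(filter_m(p, tail), lambda kept:
                         ret([head] + kept if keep else kept)))

    return {"bind": bind, "return": ret, "map": map_, "then": then,
            "join": join, "sequence": sequence,
            "map-m": map_m, "filter-m": filter_m}


# A list-based maybe monad: [] short-circuits, [a] carries a value.
def maybe_bind(ma, f):
    return [] if ma == [] else f(ma[0])


def maybe_return(a):
    return [a]


maybe = monad(maybe_bind, maybe_return)


def half(n):
    """Succeeds only for even numbers."""
    return [n // 2] if n % 2 == 0 else []


print(maybe["map"](lambda x: x + 1, [41]))
print(maybe["sequence"]([[1], [2], [3]]))
print(maybe["map-m"](half, [2, 4]))
print(maybe["map-m"](half, [2, 3]))
```

Every derived combinator reaches the underlying monad only through bind and return, which is why supplying those two functions is enough.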

Building a monadic namespace

The typical pattern is to use monad() to produce the derived operations and then catenate (merge) domain-specific operations on top:

my-ns: monad{bind: my-bind, return: my-return} {
  some-extra-op(x): ...
}

Since monad() includes bind and return in its result, there is no need to repeat them in the right-hand block.

Why catenation (not <<)? Catenation is shallow merge — the right-hand block's fields override the left-hand block's fields at the top level. << is deep merge and is not appropriate here because the derived combinators are not nested blocks.
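The difference between the two kinds of merge can be modelled with plain Python dictionaries. This is only an analogy: the function names shallow_merge and deep_merge, and the sample blocks, are hypothetical and stand in for catenation and << respectively.

```python
def shallow_merge(left, right):
    """Model of catenation: right-hand fields replace left-hand fields at the top level."""
    return {**left, **right}


def deep_merge(left, right):
    """Model of <<: nested blocks are merged recursively instead of being replaced."""
    merged = dict(left)
    for key, value in right.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


base = {"map": "derived map", "opts": {"a": 1, "b": 2}}
override = {"map": "efficient map", "opts": {"b": 3}}

shallow = shallow_merge(base, override)
deep = deep_merge(base, override)

print(shallow)  # the whole opts block is replaced
print(deep)     # opts keeps a and takes the new b
```

For a namespace of combinators, the fields are functions rather than nested blocks, so replacing a field wholesale is exactly the behaviour wanted when overriding.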

Overriding a derived combinator — put the specialised implementation in the right-hand block; it shadows the derived one:

my-ns: monad{bind: my-bind, return: my-return} {
  # override map with a more efficient implementation
  map(f, action): my-efficient-map(f, action)
}

The IO monad

The io namespace is a monad built around effect execution. IO operations cause side effects (shell commands, reads, writes). The eucalypt runtime sequences them strictly.

IO operations require --allow-io / -I at the command line.

Primitives

io.return(v)             # wrap a pure value; no side effects
io.bind(action, f)       # run action, pass result to f

Practical example

result: { :io
  r: io.shell("git rev-parse HEAD")
  _: io.check(r)
}.(r.stdout)

Extending the IO monad

You can derive additional combinators or extend with domain-specific operations via catenation:

app-io: monad{bind: io.bind, return: io.return} {
  read-file(path): io.shell("cat {path}")
  git-hash: io.shell("git rev-parse HEAD")
}

IO combinators reference

Function             Description
io.return(a)         Wrap a pure value in the IO monad
io.bind(action, f)   Sequence two IO actions
io.map(f, action)    Apply pure function f to the result of an IO action
io.check(result)     Fail if exit-code is non-zero; otherwise return result

See IO and Shell Commands for practical usage and the IO reference for the full API.


The random state monad

The random namespace implements a state monad over a PRNG stream. A random stream is provided at startup as io.random (seeded from the system or from --seed). Each random operation consumes part of the stream and returns a {value, rest} block — value is the result and rest is the remaining stream for subsequent operations.

The pitfall: forgetting to propagate the stream

For a single random value, passing io.random directly is simple:

roll: random.int(6, io.random).value + 1

But when you need multiple random values, you must thread the .rest stream from each call into the next:

r1: random.int(6, io.random)
r2: random.int(6, r1.rest)       # must use r1.rest, not io.random!
total: r1.value + r2.value + 2

If you accidentally pass io.random to both calls, you get the same random value twice. This manual threading is error-prone and verbose — which is exactly what the random monad solves.

With the monad: automatic threading

A { :random ... } monadic block threads the stream automatically:

triple: { :random
  d6: random.int(6)
  d20: random.int(20)
  d100: random.int(100)
}.[d6, d20, d100]

result: triple(io.random).value  # e.g. [3, 14, 57]

Each step receives the stream left over from the previous step — no manual .rest threading needed. The return expression (.[d6, d20, d100]) collects the results into a list.

The monadic block produces an action — a function that takes a stream — so you run it by passing io.random and extracting .value.

For common patterns, sequence and map-m are even more concise:

# Sequence a list of actions
two-dice: random.sequence[random.int(6), random.int(6)]
result: two-dice(io.random).value  # list of two rolls

# Map an action over a list of die sizes
dice: random.map-m(random.int, [4, 6, 8, 10, 20])
rolls: dice(io.random).value   # e.g. [2, 4, 1, 7, 15]

Important: always extract .value before rendering — the .rest field is an infinite stream and will hang if you try to print it.

Deterministic seeds

For reproducible output (useful in tests), pass --seed on the command line or create a stream from a fixed seed:

rolls: random.map-m(random.int, [6, 6])(random.stream(42)).value
# always produces the same result for seed 42

How the state monad works

Under the hood, random.return and random.bind are:

# return: wrap a pure value as a do-nothing action
random-ret(v, stream): { value: v, rest: stream }

# bind: run m, pass result to f, run f's action with remaining stream
random-bind(m, f, stream): {
  r: m(stream)
  run: f(r.value)(r.rest)
  value: run.value
  rest: run.rest
}

random-bind is a three-argument function; calling it with only two arguments, as in random-bind(m, f), returns an action (a function of the stream). This is the curried partial application pattern used throughout eucalypt.
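The same state-threading idea can be sketched in Python. This is a model under stated assumptions, not eucalypt's PRNG: the stream is represented by a single integer state, and step is a small linear congruential generator standing in for the real generator. Only the shape of return and bind mirrors the definitions above.

```python
MODULUS = 2 ** 31


def step(state):
    """Advance a small linear congruential generator (stand-in for the real PRNG)."""
    return (1103515245 * state + 12345) % MODULUS


def random_int(n):
    """Action: random integer in [0, n)."""
    def action(stream):
        nxt = step(stream)
        return {"value": (nxt >> 16) % n, "rest": nxt}
    return action


def random_return(v):
    """Wrap a pure value as an action that leaves the stream untouched."""
    return lambda stream: {"value": v, "rest": stream}


def random_bind(m, f):
    """Run m, pass its value to f, and run f's action on the remaining stream."""
    def action(stream):
        r = m(stream)
        return f(r["value"])(r["rest"])
    return action


# Manual threading: the second call must use r1's rest.
r1 = random_int(6)(42)
r2 = random_int(6)(r1["rest"])

# The same computation with bind doing the threading.
two_dice = random_bind(random_int(6), lambda a:
           random_bind(random_int(6), lambda b:
           random_return([a, b])))
result = two_dice(42)

print(result["value"] == [r1["value"], r2["value"]])
# Reusing the same stream repeats the same value: the pitfall described earlier.
print(random_int(6)(42) == random_int(6)(42))
```

Nothing is consumed until the composed action is applied to a starting stream, which corresponds to passing io.random to a monadic block and reading .value.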

Random combinators reference

Function                 Description
random.bind(m, f)        State monad bind
random.return(v)         State monad return
random.float             Action: random float in [0,1)
random.int(n)            Action: random integer in [0,n)
random.choice(list)      Action: random element from list
random.shuffle(list)     Action: shuffled copy of list
random.sample(n, list)   Action: n elements sampled without replacement

Derived combinators (map, then, join, sequence, map-m, filter-m) are also available — see Random Numbers reference.


Writing your own monad

Any pair of functions satisfying the monad laws can be wrapped with monad(). Here is a minimal example — the identity monad, where "wrapping" a value is a no-op:

` :suppress
id-bind(m, f): f(m)
` :suppress
id-return(a): a
` :suppress
id-monad: monad{bind: id-bind, return: id-return}

` { target: :example }
example: id-monad.map-m(inc, [1, 2, 3])
# => [2, 3, 4]

A more useful example — a "maybe" monad where null short-circuits the computation:

` :suppress
maybe-return(v): v

` :suppress
maybe-bind(m, f): if(m = null, null, f(m))

` :suppress
maybe: monad{bind: maybe-bind, return: maybe-return}

safe-head(xs): if(xs nil?, null, xs head)

# Chain operations that may return null — short-circuits on first null
result: maybe.bind(safe-head([]), inc)   # => null (list was empty)
ok:     maybe.bind(safe-head([42]), inc) # => 43

Key concepts

  • monad(m) takes {bind, return} and returns a block with bind, return, and six derived combinators
  • Use catenation (juxtaposition) to merge derived and specialised operations — NOT <<
  • Monadic blocks ({ :io ... }, { :name ... }) desugar into nested bind calls — names are bound sequentially, not declaratively
  • The random namespace is a state monad: actions are functions of a stream, and bind threads the stream automatically
  • When running random actions, always extract .value before rendering; the .rest field is an infinite stream

Advanced Topics

In this chapter you will learn:

  • How the metadata system works
  • How to use sets for unique collections
  • How to search deep structures with deep-find and deep-query
  • Format specifiers and formatting techniques
  • Type predicates and other utilities

The Metadata System

Every value in eucalypt can carry metadata: a block of additional information that does not appear in the rendered output but can be inspected and used programmatically.

Attaching Metadata

Use the // operator to attach metadata to a value:

answer: 42 // { note: "the answer to everything" }

The value 42 is rendered normally; the metadata is hidden:

answer: 42

Reading Metadata

Use meta to retrieve the metadata block:

answer: 42 // { note: "the answer to everything" }
note: meta(answer).note

This renders as:

answer: 42
note: the answer to everything

Deep Merge Metadata

Use //<< to deep-merge additional metadata onto existing metadata:

x: 1 // { a: 1 }
y: x //<< { b: 2 }
result: meta(y)

This renders as:

x: 1
y: 1
result:
  a: 1
  b: 2

Declaration Metadata

Declaration metadata (written with the backtick syntax) is separate from value metadata. It controls how eucalypt processes the declaration:

` { doc: "A helper" export: :suppress }
helper(x): x + 1

Key metadata properties:

  • doc -- documentation string
  • export: :suppress -- hide from output
  • target -- mark as an export target
  • import -- import other files
  • associates / precedence -- operator fixity (see Operators for details)

YAML Tags

Metadata can carry a tag key which renders as a YAML tag:

ref: :my-resource // { tag: "!Ref" }

This renders as:

ref: !Ref my-resource

This is useful for generating CloudFormation, Kubernetes, and other tagged YAML formats.

Assertions with //= and //=>

The //= operator asserts equality at runtime:

result: 2 + 2 //= 4  # panics if not equal

The //=> operator additionally stores the assertion as metadata:

checked: 2 + 2 //=> 4
m: meta(checked)  # contains the assertion

Sets

Sets are unordered collections of unique values, provided by the set namespace.

Creating Sets

eu -e '[1, 2, 2, 3, 3, 3] set.from-list set.to-list'
- 1
- 2
- 3

Duplicates are removed, and set.to-list returns the elements in sorted order.

The empty set is written as ∅:

eu -e '∅ set.to-list'
[]

Membership and Size

eu -e '[1, 2, 3] set.from-list set.contains?(2)'
true
eu -e '[1, 2, 3] set.from-list set.size'
3
eu -e '∅ set.empty?'
true

Adding and Removing

eu -e '∅ set.add(1) set.add(2) set.add(1) set.to-list'
- 1
- 2
eu -e '[1, 2, 3] set.from-list set.remove(2) set.to-list'
- 1
- 3

Set Algebra

Union:

a: [1, 2] set.from-list
b: [2, 3] set.from-list
result: a set.union(b) set.to-list

This renders as:

result:
- 1
- 2
- 3

Intersection:

a: [1, 2, 3] set.from-list
b: [2, 3, 4] set.from-list
result: a set.intersect(b) set.to-list

This renders as:

result:
- 2
- 3

Difference:

a: [1, 2, 3] set.from-list
b: [2, 3] set.from-list
result: a set.diff(b) set.to-list

This renders as:

result:
- 1

Deep Find

deep-find recursively searches a nested block structure for all values associated with a given key name:

eu -e 'deep-find("host", { server: { host: "10.0.0.1" db: { host: "10.0.0.2" } } })'
- 10.0.0.1
- 10.0.0.2

deep-find-first

Return just the first match, or a default:

eu -e 'deep-find-first("host", "unknown", { server: { host: "10.0.0.1" } })'
10.0.0.1

deep-find-paths

Return the key paths to each match:

eu -e 'deep-find-paths("host", { server: { host: "a" db: { host: "b" } } })'
- - server
  - host
- - server
  - db
  - host
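The behaviour of the deep-find family can be modelled in a few lines of Python over nested dictionaries. This is a sketch of the semantics described above, not the prelude's implementation. In particular, continuing the search inside a value whose own key already matched is an assumption, and the Python function names are illustrative.

```python
def deep_find_paths(key, block, prefix=()):
    """Yield the key path of every occurrence of key, depth first."""
    for name, value in block.items():
        path = prefix + (name,)
        if name == key:
            yield list(path)
        if isinstance(value, dict):
            yield from deep_find_paths(key, value, path)


def lookup(block, path):
    """Follow a key path down through nested blocks."""
    for name in path:
        block = block[name]
    return block


def deep_find(key, block):
    """All values stored under key, at any depth."""
    return [lookup(block, path) for path in deep_find_paths(key, block)]


def deep_find_first(key, default, block):
    """First match, or default when there is none."""
    matches = deep_find(key, block)
    return matches[0] if matches else default


config = {"server": {"host": "10.0.0.1", "db": {"host": "10.0.0.2"}}}

print(deep_find("host", config))
print(list(deep_find_paths("host", config)))
print(deep_find_first("port", "unknown", config))
```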

Deep Query

deep-query provides a more powerful pattern-based search using dot-separated patterns with wildcards.

A bare key name searches recursively (equivalent to **.key):

eu -e 'deep-query("port", { web: { port: 80 } db: { port: 5432 } })'
- 80
- 5432

Dotted Path

A dotted path matches a specific path:

eu -e 'deep-query("server.host", { server: { host: "10.0.0.1" port: 80 } })'
- 10.0.0.1

Wildcard *

* matches exactly one level:

eu -e 'deep-query("*.port", { web: { port: 80 } db: { port: 5432 } name: "app" })'
- 80
- 5432

Double Wildcard **

** matches any depth:

eu -e 'deep-query("config.**.port", { config: { port: 9090 nested: { deep: { port: 3000 } } } })'
- 9090
- 3000
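The pattern rules (a bare key, dotted paths, * and **) can be summarised by a small Python matcher over nested dictionaries. It is a model of the behaviour shown in the examples above, not eucalypt's implementation. Two assumptions are made: dotted patterns are anchored at the root of the block, and ** may also match zero levels (which the config.**.port example implies).

```python
def match(parts, value):
    """Yield every value reached by following the pattern parts from value."""
    if not parts:
        yield value
        return
    head, rest = parts[0], parts[1:]
    if head == "**":
        # Zero levels: carry on with the rest of the pattern right here.
        yield from match(rest, value)
        # One or more levels: descend and keep ** in play.
        if isinstance(value, dict):
            for child in value.values():
                yield from match(parts, child)
    elif isinstance(value, dict):
        for name, child in value.items():
            if head == "*" or head == name:
                yield from match(rest, child)


def deep_query(pattern, block):
    parts = pattern.split(".")
    if len(parts) == 1 and parts[0] not in ("*", "**"):
        # A bare key name is shorthand for **.key
        parts = ["**"] + parts
    return list(match(parts, block))


services = {"web": {"port": 80}, "db": {"port": 5432}, "name": "app"}
nested = {"config": {"port": 9090, "nested": {"deep": {"port": 3000}}}}

print(deep_query("port", services))
print(deep_query("*.port", services))
print(deep_query("config.**.port", nested))
```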

Variants

  • deep-query-first(pattern, default, block) -- first match or default
  • deep-query-paths(pattern, block) -- key paths of matches

Type Predicates

Test the type of a value:

eu -e '{ a: 1 } block?'
true
eu -e '[1, 2] list?'
true

Sorting

qsort

Sort with a custom comparison:

eu -e '["banana", "apple", "cherry"] qsort(<)'
- apple
- banana
- cherry

Sort by a derived key. Here the comparator shorter orders two words by their letter counts:

` :suppress
shorter(a, b): (a str.letters count) < (b str.letters count)

words: ["one", "two", "three", "four", "five", "six"]
by-length: words qsort(shorter)

This renders as:

words:
- one
- two
- three
- four
- five
- six
by-length:
- one
- two
- six
- four
- five
- three

The prelude also provides sort-by(key-fn, cmp) as a convenience for this pattern. Here str.letters ; count is a forward composition (see Functions and Combinators) that extracts the letter count:

by-length: words sort-by(str.letters ; count, <)

sort-nums

Sort numbers in ascending order:

eu -e '[30, 10, 20] sort-nums'
- 10
- 20
- 30

Grouping

group-by

Group list elements by a key function:

items: [
  { type: "fruit" name: "apple" }
  { type: "veg" name: "carrot" }
  { type: "fruit" name: "banana" }
]

grouped: items group-by(.type)

This renders as:

items:
- type: fruit
  name: apple
- type: veg
  name: carrot
- type: fruit
  name: banana
grouped:
  fruit:
  - type: fruit
    name: apple
  - type: fruit
    name: banana
  veg:
  - type: veg
    name: carrot

Format Specifiers

Format specifiers in interpolation control output formatting. They use printf-style codes after a colon inside the interpolation braces.

results: {
  n: 42
  padded: "{n:%06d}"
  pi: 3.14159
  float: "{pi:%.2f}"
  h: 255
  hex: "{h:%x}"
}

This renders as:

results:
  padded: '000042'
  float: '3.14'
  hex: ff

Available Format Codes

Code     Description                    Example
%d, %i   Signed decimal integer         {42:%d} → 42
%u       Unsigned decimal integer       {42:%u} → 42
%o       Octal                          {255:%o} → 377
%x, %X   Hexadecimal (lower/upper)      {255:%x} → ff
%f, %F   Decimal notation               {3.14:%.1f} → 3.1
%e, %E   Scientific notation            {1000:%e} → 1e3
%g, %G   Auto (decimal or scientific)   {0.001:%g} → 0.001
%s       String                         {:hello:%s} → hello

Flags and Modifiers

  • %- — left-align
  • %+ — prepend + for positive numbers
  • %0 — zero-padding (e.g. %06d)
  • %# — alternate form (e.g. 0x prefix for hex)
  • Width: %10s — minimum field width
  • Precision: %.2f — decimal places for floats
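These flags follow the familiar C printf conventions. As a point of comparison only, the snippet below shows what the same specifiers produce with Python's printf-style % operator. It is not eucalypt code, and eucalypt's output for a given specifier is assumed, not guaranteed, to match in every detail.

```python
# Each pair is (format specifier, value); the trailing | marks the end of the field.
examples = [
    ("%-6s|", "ab"),     # left-align within a width of 6
    ("%+d", 42),         # explicit sign on positive numbers
    ("%06d", 42),        # zero padding to a width of 6
    ("%#x", 255),        # alternate form adds the 0x prefix
    ("%10s|", "hi"),     # minimum field width of 10, right-aligned
    ("%.2f", 3.14159),   # two decimal places
]

rendered = {spec: spec % value for spec, value in examples}

for spec, text in rendered.items():
    print(f"{spec:8} -> {text!r}")
```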

See also the Strings and Text chapter for more on string interpolation.

Version Assertions

Assert a minimum version of eu:

_ : eu.requires(">=0.3.0")

Access build metadata:

info: {
  version: eu.build.version
  prelude: eu.prelude.version
}

Key Concepts

  • Metadata (//) attaches hidden information to values; meta retrieves it
  • Sets (set.*) provide unique collections with union, intersection, and difference
  • Deep find recursively searches nested blocks by key name
  • Deep query supports pattern-based search with * and ** wildcards
  • Type predicates (block?, list?) test value types
  • qsort sorts with custom comparators; format specifiers control numeric output
  • eu.requires asserts a minimum version for compatibility

Language Syntax Reference

Eucalypt has a native syntax which emphasises the mappings-and-lists nature of its underlying data model but adds enhancements for functions and expressions. Eucalypt is written in .eu files.

While eu happily processes YAML inputs with embedded expressions, many features are not yet available in the YAML embedding, and the embedded expressions are themselves written in eucalypt syntax. An overview of how the syntax works is therefore necessary to do anything interesting with eucalypt.

A few aspects are unorthodox and experimental.

Overview

Eucalypt syntax comes about by the overlapping of two sub-languages.

  • the block DSL is how you write blocks and their declarations
  • the expression DSL is how you write expressions

They are entwined in a fairly typical way: block literals (from the block DSL) can be used in expressions (from the expression DSL) and expressions (from the expression DSL) appear in declarations (from the block DSL).

Comments can be interspersed throughout. Eucalypt only has line level comments.

foo: bar # Line comments start with '#' and run till the end of the line

Note: If you feel you need a block comment, you can use an actual block or a string property within a block and mark it with annotation metadata :suppress to ensure it doesn't appear in output.

Eucalypt has two types of names:

  • normal names, which are largely alphanumeric (e.g. f, blah, some-thing!) and are used to name properties and functions
  • operator names, which are largely symbolic (e.g. &&&, -+-|) and are used to name operators

See Operator Precedence Table for more.

The block DSL

A block is surrounded by curly braces:

... { ... }

...and contains declarations...

... {
  a: 1
  b: 2
  c: 3
}

...which may themselves have blocks as values...

... {
  foo: {
    bar: {
      baz: "hello world"
    }
  }
}

The top-level block in a file (a unit) does not have braces:

a: 1
b: 2
c: 3

A block may also have a single expression preceding its declarations; this expression constitutes the block's metadata and is optional.

So the block pattern is braces (except at the unit level) containing optional metadata followed by any number of declarations.

{ «metadata expression»?  [«declaration»]* }

Both the metadata expression and the declarations may contain blocks. This recursive application of the block pattern defines the major structure of any eucalypt code.

So far all the declarations we have seen have been property declarations which contain a name and an expression, separated by a colon.

Commas are entirely optional for delimiting declarations. Line endings are not significant. The following is a top-level block of three property declarations.

a: 1 b: 2 c: 3

There are other types of declarations. By specifying a parameter list, you get a function declaration:

# A function declaration
f(x, y): x + y

two: f(1, 1)

Function parameters can be destructuring patterns as well as simple names. A block pattern extracts named fields from a block argument; a list pattern extracts positional elements from a list argument:

# Block destructuring — shorthand binds field name as variable name
sum-xy({x y}): x + y

# Block destructuring — rename binds field under a new variable name
product-ab({x: a  y: b}): a * b

# Mixed shorthand and rename
mixed({x  y: b}): x + b

# Fixed-length list destructuring
f-sum([a, b, c]): a + b + c
f-first([a, b]): a

# Head/tail list destructuring — colon separates fixed heads from tail
get-head([x : xs]): x
get-tail([x : xs]): xs
drop-two([a, b : rest]): rest

Juxtaposed call syntax passes a block or list literal as a single argument without parentheses. There must be no space between the function name and the opening bracket:

# f{...} is sugar for f({...})
sum-xy{x: 10 y: 20}

# f[...] is sugar for f([...])
add-pair[1, 2]

Combined with block destructuring, this gives named arguments:

greet({name greeting}): "{greeting}, {name}!"
greet{name: "Alice" greeting: "Hello"}    # => "Hello, Alice!"

Juxtaposed syntax also works in definitions — the bracket or brace is written directly against the function name with no space. This is sugar for the parenthesised destructuring form:

# f[x, y]: ...  is sugar for  f([x, y]): ...
add-pair[a, b]: a + b

# f{x y}: ...   is sugar for  f({x y}): ...
add-block{x y}: x + y

# f[h : t]: ... is sugar for  f([h : t]): ...
my-head[h : t]: h

Destructuring patterns can be mixed with normal parameters:

f(n, [a, b]): n * (a + b)

The ‖ operator (U+2016, DOUBLE VERTICAL LINE) prepends an element to a list. It is right-associative:

1 ‖ [2, 3]        # => [1, 2, 3]
1 ‖ 2 ‖ [3]       # => [1, 2, 3]

See Functions and Combinators for more detail on destructuring.

...and using some brackets and suitable names, you can define operators too, either binary:

# A binary operator declaration
(x ^|^ y): "{x} v {y}"

...or prefix or postfix unary operators:

# A prefix operator declaration
(¬ x): not(x)

# A postfix operator declaration
(x ******): "maybe {x}"

Eucalypt should handle unicode gracefully and any unicode characters in the symbol or punctuation classes are fine for operators.

In addition to named operators, you can define idiot brackets — custom Unicode bracket pairs that wrap an expression and apply a function to it. The name is inspired by idiom brackets from applicative functor notation, but they are a general bracket overloading mechanism. A bracket pair declaration uses a Unicode bracket pair wrapping a single parameter directly (paren-free style):

# Ceiling brackets double
⌈ x ⌉: x * 2

# Floor brackets increment
⌊ x ⌋: x + 1

The older paren-wrapped style is still supported for backwards compatibility:

(⌈ x ⌉): x * 2    # paren style — still valid

Once declared, the bracket pair can be used as an expression:

doubled: ⌈ 3 + 4 ⌉    # => 14
bumped:  ⌊ 5 ⌋         # => 6

The declaration ⌈ x ⌉: body defines a function named ⌈⌉ (open then close bracket) that takes one argument x and returns body. Using ⌈ expr ⌉ in an expression calls that function with expr.

The following Unicode bracket pairs are built-in and can be used for idiot brackets without any registration:

Open   Close   Name
⟦      ⟧       Mathematical white square brackets
⟨      ⟩       Mathematical angle brackets
⟪      ⟫       Mathematical double angle brackets
⌈      ⌉       Ceiling brackets
⌊      ⌋       Floor brackets
⦃      ⦄       Mathematical white curly brackets
⟬      ⟭       Mathematical white tortoise shell brackets
⟮      ⟯       Mathematical flattened parentheses
«      »       French guillemets
【      】       CJK lenticular brackets
〔      〕       CJK tortoise shell brackets
〖      〗       CJK white lenticular brackets
〘      〙       CJK white tortoise shell brackets
〚      〛       CJK white square brackets

Monadic blocks

Eucalypt supports monadic sequencing — analogous to Haskell do-notation — through two mechanisms: bracket pair definitions and block metadata.

Bracket pair definitions

A bracket pair gains a monad spec when declared with an empty block {} as its parameter and a body marked with :monad metadata, supplying bind and return function names:

# Explicit bind/return functions
⟦{}⟧: { :monad bind: my-bind  return: my-return }

# Namespace reference — delegates to a block with bind and return members
⟦{}⟧: { :monad namespace: my-monad }

A bracket expression whose inner content contains top-level colons is parsed as a bracket block — a sequence of name: monadic-action declarations. The closing bracket must be followed by a dot and a return expression:

result: ⟦ a: ma  b: mb ⟧.return_expr

Block metadata forms

Regular blocks can also be desugared monadically when the block carries monad metadata and is immediately followed by .return_expr in an expression. Five forms are accepted:

Form   Syntax                                           Monad source
1      { :name decls }.expr                             Namespace name in scope
2      { { monad: name } decls }.expr                   Namespace name in scope
3      { { :monad namespace: name } decls }.expr        Namespace name in scope
4      { { :monad bind: f return: r } decls }.expr      Explicit f/r functions
5      ⟦{}⟧: { :monad namespace: name } (bracket def)   Namespace name in scope

For namespace forms (1–3 and 5), the named value must be a block in scope with bind and return member functions. The desugarer emits name.bind(…) and name.return(…) lookup expressions.

Desugaring

All monadic block forms desugar to a right-to-left bind chain:

bind(ma, (a): bind(mb, (b): return(return_expr)))

All declarations are bind steps. Each bound name is in scope for later actions and for the return expression. The return expression may be any single element: a name (.r), a parenthesised expression (.(x + y)), a list, or a block.

Example — identity monad (bracket pair, explicit functions):

id-bind(ma, f): f(ma)
id-return(a): a

⟦{}⟧: { :monad bind: id-bind  return: id-return }

result: ⟦ x: 10  r: x + 5 ⟧.r     # => 15

Example — maybe monad (namespace reference via block metadata):

maybe: { bind(ma, f): if(ma = [], [], f(ma head))  return(a): [a] }

just:    { :maybe x: [1]  y: [2] }.(x + y)   # => [3]
nothing: { :maybe x: []   y: [2] }.(x + y)   # => []

Example — maybe monad (bracket pair with namespace reference):

maybe: { bind(ma, f): if(ma = [], [], f(ma head))  return(a): [a] }

⌈{}⌉: { :monad namespace: maybe }

just:    ⌈ x: [1]  y: [2] ⌉.(x + y)   # => [3]
nothing: ⌈ x: []   y: [2] ⌉.(x + y)   # => []

To control the precedence and associativity of user defined operators, you need metadata annotations.

Declaration annotations allow us to specify arbitrary metadata against declarations. These can be used for documentation and similar.

To attach an annotation to a declaration, squeeze it between a leading backtick and the declaration itself:

` { doc: "This is a"}
a: 1

` { doc: "This is b"}
b: 2

Some metadata activate special handling, such as the associates and precedence keys you can put on operator declarations:

` { doc: "`(f ∘ g)` - return composition of `f` and `g`"
    associates: :right
    precedence: 88 }
(f ∘ g): compose(f,g)

Look out for other uses like :target, :suppress, :main.

Finally, you can specify metadata at a unit level. If the first item in a unit is an expression, rather than a declaration, it is treated as metadata that is applied to the whole unit.

{ doc: "This is just an example unit" }
a: 1 b: 2 c: 3

The expression DSL

Everything that can appear to the right of the colon in a declaration is an expression and defined by the expression DSL.

Primitives

First there are primitives.

...numbers...

123
-123
123.333

...double quoted strings...

"a string"

...symbols, prefixed by a colon...

:key

...which are currently very like strings, but used in circumstances where their internal structure is generally not significant (i.e. keys in a block's internal representation).

Finally, booleans (true and false) are pre-defined constants, as is null, a value which renders as YAML's or JSON's null but is not otherwise used by eucalypt itself.

Block literals

Block literals (in braces, as defined in the block DSL) are expressions and can be the values of declarations or passed as function arguments or operands in any of the contexts below:

foo: { a: 1 b: 2 c: 3}

List literals

List literals are enclosed in square brackets and contain a comma separated sequence of expressions:

list: [1, 2, :a, "boo"]

Names

Then there are names, which refer to the surrounding context. They might refer to properties:

x: 22
y: x

...or functions:

add-one(x): 1 + x
three: add-one(2)

...or operators:

(x &&& y): [x, x, x, y]
z: "da" &&& "dum"

Calling functions

Functions can be applied by suffixing an argument list in parens, with no intervening whitespace:

f(x, y): x + y
result: f(2, 2) # no whitespace

In the special case of applying a single argument, "catenation" can be used:

add-one(x): 1 + x
result: 2 add-one

...which allows succinct expressions of pipelines of operations.

In addition, functions are curried so can be partially applied:

add(x, y): x + y
increment: add(1)
result: 2 increment

...and placeholder underscores (or expression anaphora) can be used to define simple functions without the song and dance of a function declaration:

f: if(tuesday?, (_ * 32 / 12), (99 / _))
result: f(3)

In fact, in many cases the underscores can be omitted, leading to a construct very similar to Haskell's sections, except that even the brackets aren't necessary.

Note: Eucalypt uses its knowledge of the fixity and associativity of each operator to find "gaps" and fills them with the unwritten underscores. This is great for simple cases but worth avoiding for complicated expressions.

increment: + 1
result: 2 increment (126 /)

Both styles of function application together with partial application and sectioning can all be applied together:

result: [1, 2, 3] map(+1) filter(odd?) //=> [3]

(//=> is an assertion operator which causes a panic if the left and right hand expressions aren't found to be equal at run time, but returns that value if they are.)
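For readers more at home with explicit lambdas, the following Python sketch spells out what currying, sections and catenation amount to in the examples above. It is a loose analogy, not a description of how eucalypt evaluates these expressions; the pipeline helper is a hypothetical name introduced for the illustration.

```python
from functools import partial, reduce


def add(x, y):
    return x + y


# Partial application: add(1) in eucalypt behaves like a one-argument function.
increment = partial(add, 1)


def pipeline(value, *functions):
    """Model of catenation: feed value through each function from left to right."""
    return reduce(lambda acc, f: f(acc), functions, value)


# 2 increment (126 /): the section (126 /) stands for a function dividing 126 by its argument.
result = pipeline(2, increment, lambda x: 126 / x)

# [1, 2, 3] map(+1) filter(odd?): the section +1 stands for a function adding 1.
odds = pipeline([1, 2, 3],
                lambda xs: [x + 1 for x in xs],
                lambda xs: [x for x in xs if x % 2 == 1])

print(result)
print(odds)
```

In each case the "gap" in the eucalypt expression is where the lambda's parameter sits in the Python version.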

Note: There are no explicit lambda expressions in Eucalypt right now. For simple cases, expression or string anaphora should do the job. For more involved cases, you should use a named function declaration. See Anaphora for more.

Operators and Identifiers

Eucalypt distinguishes two different types of identifier: normal identifiers, like x, y, α, א, ziggety-zaggety, zoom?, and operator identifiers, like *, @, &&, ⊙⊙⊙, <> and so on.

It is entirely a matter of the component characters which category an identifier falls into. Normal identifiers contain letters (including non-ASCII characters), numbers, "-", "?", "$". Operator identifiers contain the usual suspects and anything identified as an operator or symbol in unicode. Neither can contain ":" or "," or brackets which are special in eucalypt.

Any sequence of characters at all can be treated as a normal identifier by surrounding them in single quotes. This is the only use of single quotes in eucalypt. This can be useful when you want to use file paths or other external identifiers as block keys for instance:

home: {
  '.bashrc': false
  '.emacs.d': false
  'notes.txt': true
}

z: home.'notes.txt'

Normal identifiers

Normal identifiers are brought into scope by declarations and can be referred to without qualification in their own block or in more nested blocks:

x: {
  z: 99
  foo: z //=> 99
  bar: {
    y: z //=> 99
  }
}

They can be accessed from within other blocks using the lookup operator:

x: {
  z: 99
}

y: x.z //=> 99

They can be overridden using generalised lookup:

z: 99
y: { z: 100 }."z is {z}" //=> "z is 100"

They can be shadowed:

z: 99
y: { z: 100 r: z //=> 100 }

But beware trying to access the outer value:

name: "foo"
x: { name: name } //=> infinite recursion

Accessing shadowed values is not yet easily possible unless you can refer to an enclosing block and use a lookup.

Prefix operators

Some operators are defined as prefix (unary) operators rather than infix (binary) operators. These bind tightly to the expression that follows.

For example, the ↑ operator is a tight-binding prefix form of head:

xs: [1, 2, 3]
first: ↑xs  //=> 1

Because it binds tightly (precedence 95), it works naturally in pipelines without parentheses:

xs: [[1, 2], [3, 4]]
result: xs map(↑)  # map head over list of lists

Other prefix operators include ! and ¬ for boolean negation, and ∸ for numeric negation.

Operator identifiers

Operator identifiers are more limited than normal identifiers.

They are brought into scope by operator declarations and available without qualification in their own block and more nested blocks:

( l -->> r): "{l} shoots arrow at {r}"

x: {
  y: 2 -->> 3 //=> "2 shoots arrow at 3"
}

...and can be shadowed:

(l !!! r): l + r

y: {
  (l !!! r): l - r
  z: 100 !!! 1 //=> 99
}

But:

  • they cannot be accessed by lookup, so there is no way of forming a qualified name to access an operator
  • they cannot be overridden by generalised lookup

Prelude Reference

The eucalypt prelude is a standard library of functions, operators, and constants that is automatically loaded before your code runs.

You can suppress the prelude with -Q if needed, though this leaves a very bare environment (even true, false, and if are defined in the prelude).

Categories

  • Lists -- list construction, transformation, folding, sorting (64 entries)
  • Blocks -- block construction, access, merging, transformation (52 entries)
  • Strings -- string manipulation, regex, formatting (26 entries)
  • Numbers and Arithmetic -- numeric operations and predicates (14 entries)
  • Booleans and Comparison -- boolean logic and comparison operators (13 entries)
  • Combinators -- function composition, application, utilities (12 entries)
  • Calendar -- date and time functions (5 entries)
  • Sets -- set operations (11 entries)
  • Random Numbers -- random number generation, monadic random: namespace (20 entries)
  • Metadata -- metadata and assertion functions (7 entries)
  • IO -- environment, time, argument access, and monad utility (16 entries)

240 documented entries in total.

Lists

Basic Operations

| Function | Description |
|---|---|
| cons | Construct new list by prepending item h to list t |
| snoc(x, l) | Append element x to the end of list l |
| head | Return the head item of list xs, panic if empty |
| (↑ xs) | Return first element of list xs. Tight-binding prefix operator |
| nil? | true if list xs is empty, false otherwise |
| non-nil? | true if list xs is non-empty, false otherwise |
| (x ✓) | true if x is not null, false otherwise. Postfix not-null check (precedence 88) |
| head-or(d, xs) | Return the head item of list xs or default d if empty |
| tail | Return list xs without the head item. [] causes error |
| tail-or(d, xs) | Return list xs without the head item or d for empty list |
| nil | Identical to [], the empty list |
| first | Return first item of list xs - error if the list is empty |
| second(xs) | Return second item of list - error if there is none |
| second-or(d, xs) | Return second item of list - default d if there is none |

List Construction

| Function | Description |
|---|---|
| repeat(i) | Return infinite list of instances of item i |
| iterate(f, i) | Return list of i with subsequent repeated applications of f to i |
| ints-from(n) | Return infinite list of integers from n upwards |
| range(b, e) | Return list of ints from b to e (not including e) |
| cycle(l) | Create infinite list by cycling elements of list l |
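
As a minimal sketch based only on the descriptions above, the infinite constructors can be combined with take (listed under Transformations) to obtain a finite prefix. The declaration names are made up for illustration:

```eu
# Take a finite prefix of each infinite list before rendering it
threes: repeat(3) take(2)               # [3, 3]
doubles: iterate((_ * 2), 1) take(4)    # [1, 2, 4, 8]
from-ten: ints-from(10) take(3)         # [10, 11, 12]
abab: cycle([1, 2]) take(5)             # [1, 2, 1, 2, 1]

# range excludes its upper bound
digits: range(0, 5)                     # [0, 1, 2, 3, 4]
```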

Transformations

| Function | Description |
|---|---|
| take(n, l) | Return initial segment of integer n elements from list l |
| drop(n, l) | Return result of dropping integer n elements from list l |
| take-while(p?, l) | Initial elements of list l while p? is true |
| take-until(p?) | Initial elements of list l while p? is false |
| drop-while(p?, l) | Skip initial elements of list l while p? is true |
| drop-until(p?) | Skip initial elements of list l while p? is false |
| map(f, l) | Map function f over list l |
| map2(f, l1, l2) | Map function f over lists l1 and l2, until the shorter is exhausted |
| cross(f, xs, ys) | Apply f to every combination of elements from xs and ys (cartesian product) |
| filter(p?, l) | Return list of elements of list l that satisfy predicate p? |
| remove(p?, l) | Return list of elements of list l that do not satisfy predicate p? |
| reverse(l) | Reverse list l |
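
A short sketch of how these transformations might be used, in both pipeline and call style. The names are illustrative and the commented results follow from the descriptions in the table:

```eu
xs: [1, 2, 3, 4, 5, 6]

first-two: xs take(2)                     # [1, 2]
last-two: xs drop(4)                      # [5, 6]
small: take-while((_ < 3), xs)            # [1, 2]
large: drop-while((_ < 3), xs)            # [3, 4, 5, 6]
positives: filter(pos?, [-2, 5, 0, 3])    # [5, 3]
non-zero: remove(zero?, [0, 1, 0, 2])     # [1, 2]
sums: map2((+), [1, 2, 3], [10, 20])      # [11, 22], stops at the shorter list
backwards: xs reverse                     # [6, 5, 4, 3, 2, 1]
```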

Combining Lists

| Function | Description |
|---|---|
| zip-with | Map function f over lists l1 and l2, until the shorter is exhausted |
| zip | List of pairs of elements l1 and l2, until the shorter is exhausted |
| append(l1, l2) | Concatenate two lists l1 and l2 |
| prepend | Concatenate two lists with l1 after l2 |
| concat(ls) | Concatenate all lists in ls together |
| mapcat(f) | Map items in l with f and concatenate the resulting lists |
| zip-apply(fs, vs) | Apply fns in list fs to corresponding values in list vs, until shorter is exhausted |
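
For illustration, a sketch using only the functions whose signatures are given in the table; inc and dec come from the numeric section of the prelude:

```eu
joined: append([1, 2], [3, 4])              # [1, 2, 3, 4]
flat: concat([[1], [2, 3], [4]])            # [1, 2, 3, 4]
applied: zip-apply([inc, dec], [10, 20])    # [11, 19]
```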

Splitting Lists

| Function | Description |
|---|---|
| split-at(n, l) | Split list in two at nth item and return pair |
| split-after(p?, l) | Split list where p? becomes false and return pair |
| split-when(p?, l) | Split list where p? becomes true and return pair |
| window(n, step, l) | List of lists of sliding windows over list l of size n and offset step |
| partition(n) | List of lists of non-overlapping segments of list l of size n |
| discriminate(pred, xs) | Return pair of xs for which pred(_) is true and xs for which pred(_) is false |
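
A minimal sketch of two of the splitting functions. It assumes that the first part returned by split-at holds the first n items; the names are illustrative:

```eu
xs: [1, 2, 3, 4, 5]

# Pair of the first two items and the remainder: [1, 2] and [3, 4, 5]
halves: split-at(2, xs)

# Pair of items satisfying pos? and items that do not: [2, 4] and [-1, -3]
by-sign: discriminate(pos?, [-1, 2, -3, 4])
```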

Folds and Scans

| Function | Description |
|---|---|
| foldl(op, i, l) | Left fold operator op over list l starting from value i |
| foldr(op, i, l) | Right fold operator op over list l ending with value i |
| scanl(op, i, l) | Left scan operator op over list l starting from value i |
| scanr(op, i, l) | Right scan operator op over list l ending with value i |
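
As a sketch, folds reduce a list to a single value, while scans keep the intermediate results. Operators are passed as functions by wrapping them in parentheses, as in the array examples later in this reference:

```eu
total: foldl((+), 0, [1, 2, 3, 4])      # 10
product: foldr((*), 1, [1, 2, 3, 4])    # 24

# Running totals; the result is a list of intermediate sums
running: scanl((+), 0, [1, 2, 3])
```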

Predicates

| Function | Description |
|---|---|
| all-true?(l) | True if and only if all items in list l are true |
| all(p?, l) | True if and only if all items in list l satisfy predicate p? |
| any-true?(l) | True if and only if any items in list l are true |
| any(p?, l) | True if and only if any items in list l satisfy predicate p? |

Sorting

| Function | Description |
|---|---|
| group-by(k, xs) | Group xs by key function returning block of key to subgroups, maintains order |
| qsort(lt, xs) | Sort xs using 'less-than' function lt |
| sort-nums(xs) | Sort list of numbers ascending |
| sort-strs(xs) | Sort list of strings or symbols ascending |
| sort-zdts(xs) | Sort list of zoned date-times ascending |
| sort-by(key-fn, cmp, xs) | Sort list xs by key extracted with key-fn using comparator cmp |
| sort-by-num(key-fn) | Sort list xs ascending by numeric key extracted with key-fn |
| sort-by-str(key-fn) | Sort list xs ascending by string key extracted with key-fn |
| sort-by-zdt(key-fn) | Sort list xs ascending by zoned date-time key extracted with key-fn |

Sorting Examples

nums: [3, 1, 4, 1, 5] sort-nums          # [1, 1, 3, 4, 5]
words: ["banana", "apple", "cherry"] sort-strs  # ["apple", "banana", "cherry"]

people: [{name: "Zara" age: 30}, {name: "Alice" age: 25}]
by-name: people sort-by-str(.name)       # sorted by name
by-age: people sort-by-num(.age)         # sorted by age

Other

| Function | Description |
|---|---|
| nth(n, l) | Return nth item of list if it exists, otherwise panic |
| (l !! n) | Return nth item of list if it exists, otherwise error. For arrays, n must be a coordinate list (e.g. [row, col]) and delegates to arr.get. |
| count(l) | Return count of items in list l |
| last | Return last element of list l |
| over-sliding-pairs(f, l) | Apply binary fn f to each overlapping pair in l to form new list |
| differences | Calculate difference between each overlapping pair in list of numbers l |

Blocks

Block Construction and Merging

| Function | Description |
|---|---|
| sym | Create symbol with name given by string s |
| merge | Shallow merge block b2 on top of b1 |
| deep-merge | Deep merge block b2 on top of b1, merges nested blocks but not lists |
| block? | True if and only if v is a block |
| list? | True if and only if v is a list |
| elements | Expose list of elements of block b |
| block | (re)construct block from list kvs of elements |
| has(s, b) | True if and only if block b has key (symbol) s |
| lookup(s, b) | Look up symbol s in block b, error if not found |
| lookup-in(b, s) | Look up symbol s in block b, error if not found |
| lookup-or(s, d, b) | Look up symbol s in block b, default d if not found |
| lookup-or-in(b, s, d) | Look up symbol s in block b, default d if not found |
| lookup-alts(syms, d, b) | Look up symbols syms in turn in block b until a value is found, default d if none |
| lookup-across(s, d, bs) | Look up symbol s in turn in each of blocks bs until a value is found, default d if none |
| lookup-path(ks, b) | Look up value at key path ks in block b |
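
The lookup variants differ mainly in argument order and in what happens when a key is missing. A sketch, assuming that a key path is written as a list of symbols; the block and names are illustrative:

```eu
server: { host: "localhost" }

h: server lookup(:host)                       # "localhost"
port: server lookup-or(:port, 8080)           # 8080, key is absent
port-in: lookup-or-in(server, :port, 8080)    # 8080, block-first argument order
has-port: has(:port, server)                  # false
alt: lookup-alts([:hostname, :host], "unknown", server)   # "localhost"

# Assumption: ks is a list of symbols
db-port: lookup-path([:db, :port], { db: { port: 5432 } })   # 5432
```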

Block Utilities

| Function | Description |
|---|---|
| merge-all(bs) | Merge all blocks in list bs together, later overriding earlier |
| key | Return key in a block element / pair |
| value | Return value in a block element / pair |
| keys(b) | Return keys of block |
| values(b) | Return values of block |
| sort-keys(b) | Return block b with keys sorted alphabetically |
| bimap(f, g, pr) | Apply f to first item of pair and g to second, return pair |
| map-first(f, prs) | Apply f to first elements of all pairs in list of pairs prs |
| map-second(f, prs) | Apply f to second elements of all pairs in list of pairs prs |
| map-kv(f, b) | Apply f(k, v) to each key / value pair in block b, returning list |
| map-as-block(f, syms) | Map each symbol in syms and create block mapping syms to mapped values |
| pair(k, v) | Form a block element from key (symbol) k and value v |
| zip-kv(ks, vs) | Create a block by zipping together keys ks and values vs |
| with-keys | Create block from list of values by assigning list of keys ks against them |
| map-values(f, b) | Apply f(v) to each value in block b |
| map-keys(f, b) | Apply f(k) to each key in block b |
| filter-items(f, b) | Return items from block b which match item match function f |
| by-key(p?) | Return item match function that checks predicate p? against the (symbol) key |
| by-key-name(p?) | Return item match function that checks predicate p? against string representation of the key |
| by-key-match(re) | Return item match function that checks string representation of the key matches regex re |
| by-value(p?) | Return item match function that checks predicate p? against the item value |
| match-filter-values(re, b) | Return list of values from block b with keys matching regex re |
| filter-values(p?, b) | Return items from block b where values match predicate p? |
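
A brief sketch of some of these utilities, with results that follow from the descriptions above. The block conf and the other names are made up for illustration:

```eu
conf: { a: 1, b: 2 }

ks: keys(conf)                                    # [:a, :b]
vs: values(conf)                                  # [1, 2]
bumped: conf map-values(inc)                      # {a: 2, b: 3}
rebuilt: zip-kv([:x, :y], [1, 2])                 # {x: 1, y: 2}
combined: merge-all([{ a: 1 }, { a: 2, c: 3 }])   # {a: 2, c: 3}

# Values whose keys begin with "a"
a-values: match-filter-values("^a.*", { apple: 1, avocado: 2, banana: 3 })   # [1, 2]
```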

Block Alteration

| Function | Description |
|---|---|
| alter-value(k, v, b) | Alter b.k to value v |
| update-value(k, f, b) | Update b.k to f(b.k) |
| alter(ks, v, b) | In nested block b alter value to value v at path-of-keys ks |
| update(ks, f, b) | In nested block b applying f to value at path-of-keys ks |
| update-value-or(k, f, d, b) | Set b.k to f(v) where v is current value, otherwise add with default value d |
| set-value(k, v) | Set b.k to v, adding if absent |
| tongue(ks, v) | Construct block with a single nested path-of-keys ks down to value v |
| merge-at(ks, v, b) | Shallow merge block v into block value at path-of-keys ks |
| deep-merge-at(ks, v, b) | Deep merge block v into block value at path-of-keys ks (preserves nested blocks) |
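
Because the block is the last argument in each case, these functions read naturally in pipelines. A sketch, assuming that a path-of-keys is a list of symbols; the names are illustrative:

```eu
b: { a: 1, n: { x: 10 } }

set-a: b alter-value(:a, 99)          # a becomes 99
inc-a: b update-value(:a, inc)        # a becomes 2

# Assumption: ks is a list of symbols
set-x: b alter([:n, :x], 11)          # n.x becomes 11
dbl-x: b update([:n, :x], (_ * 2))    # n.x becomes 20
path: tongue([:p, :q], 1)             # {p: {q: 1}}
```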

Deep Find and Query

| Function | Description |
|---|---|
| deep-find(k, b) | Return list of all values for key k at any depth in block b, depth-first |
| deep-find-first(k, d, b) | Return first value for key k at any depth in block b, or default d |
| deep-find-paths(k, b) | Return list of key paths to all occurrences of key k at any depth in block b |
| deep-query(pattern, b) | Query block b using dot-separated pattern string. * matches one level, ** matches any depth. Bare foo is sugar for **.foo |
| deep-query-first(pattern, d, b) | Return first match for pattern in block b, or default d |
| deep-query-paths(pattern, b) | Return list of key paths matching pattern in block b |

Deep Find

Searches for a key at any nesting level:

config: {
  server: { host: "localhost" port: 8080 }
  db: { host: "db.local" port: 5432 }
}

hosts: config deep-find(:host)  # ["localhost", "db.local"]
first-host: config deep-find-first(:host, "unknown")  # "localhost"

Deep Query

Queries using dot-separated patterns with wildcards:

  • Bare name foo is sugar for **.foo (find at any depth)
  • * matches one level
  • ** matches any depth

data: {
  us: { config: { host: "us.example.com" } }
  eu: { config: { host: "eu.example.com" } }
}

# Find all hosts under any config
hosts: data deep-query("config.host")  # ["us.example.com", "eu.example.com"]

# Wildcard: any key at one level, then host
wildcard-hosts: data deep-query("*.config.host")

Persistent Blocks (Experimental)

The pb: namespace provides persistent (immutable, structurally-shared) blocks backed by im::OrdMap. These offer O(log n) lookup and merge operations with structural sharing. This feature is experimental and is not included in generated documentation exports.

| Function | Description |
|---|---|
| pb.from-block(b) | Convert a standard block b into a persistent block |
| pb.lookup(k, d, p) | Look up key (symbol) k in persistent block p, returning default d if absent |
| pb.to-list(p) | Return an ordered list of [key, value] pairs from persistent block p |
| pb.merge(l, r) | Merge persistent blocks l and r; left-hand values win on key conflicts |
| pb.merge-with(f, l, r) | Merge persistent blocks l and r, resolving conflicts with f(left-val, right-val) |

# Round-trip a standard block through a persistent block
p: {a: 1, b: 2} pb.from-block
v: p pb.lookup(:a, 0)           # 1
l: p pb.to-list block           # {a: 1, b: 2}

# Merge two persistent blocks (left wins on conflict)
merged: ({a: 1} pb.from-block) pb.merge({a: 99, b: 2} pb.from-block)
# merged pb.lookup(:a, 0) => 1

Strings

String Processing

| Function | Description |
|---|---|
| str.of | Convert e to string |
| str.split | Split string s on separators matching regex re |
| str.split-on | Split string s on separators matching regex re |
| str.join | Join list of strings l by interposing string s |
| str.join-on | Join list of strings l by interposing string s |
| str.match | Match string s using regex re, return list of full match then capture groups |
| str.match-with | Match string s using regex re, return list of full match then capture groups |
| str.extract(re) | Use regex re (with single capture) to extract substring of s - or error |
| str.extract-or(re, d, s) | Use regex re (with single capture) to extract substring of s - or default d |
| str.matches | Return list of all matches in string s of regex re |
| str.matches-of | Return list of all matches in string s of regex re |
| str.matches?(re, s) | Return true if re matches full string s |
| str.suffix(b, a) | Return string b suffixed onto a |
| str.prefix(b, a) | Return string b prefixed onto a |
| str.letters | Return individual letters of s as list of strings |
| str.len | Return length of string in characters |
| str.fmt | Format x using printf-style format spec |
| str.to-upper | Convert string s to upper case |
| str.to-lower | Convert string s to lower case |
| str.lt(a, b) | True if string a is lexicographically less than b |
| str.gt(a, b) | True if string a is lexicographically greater than b |
| str.lte(a, b) | True if string a is lexicographically less than or equal to b |
| str.gte(a, b) | True if string a is lexicographically greater than or equal to b |
| str.replace(pattern, replacement, s) | Replace all matches of regex pattern with replacement in s |
| str.contains?(pattern, s) | True if s contains a match for regex pattern |
| str.trim | Trim leading and trailing whitespace |
| str.starts-with?(re, s) | True if s starts with a match for regex re |
| str.ends-with?(re, s) | True if s ends with a match for regex re |
| str.shell-escape(s) | Wrap in single quotes for safe shell use, escaping embedded ' |
| str.dq-escape(s) | Escape $, `, ", \ for use inside double quotes |
| str.base64-encode | Encode string s as base64 |
| str.base64-decode | Decode base64 string s back to its original string |
| str.sha256 | Return the SHA-256 hash of string s as lowercase hex |
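
As a sketch, the functions that take the string as their last argument work well in pipelines. Only functions whose full signature appears in the table are used, and the names are illustrative:

```eu
s: "hello world"

swapped: s str.replace("o", "0")          # "hell0 w0rld"
has-wor: s str.contains?("wor")           # true
starts: s str.starts-with?("hello")       # true
whole: str.matches?("[a-z ]+", s)         # true, the regex matches the full string
greeting: "world" str.prefix("hello ")    # "hello world"
file: "report" str.suffix(".txt")         # "report.txt"
```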

Character Constants

The ch namespace provides special characters:

  • ch.n -- Newline
  • ch.t -- Tab
  • ch.dq -- Double quote

Encoding and Hashing Examples

encoded: "hello" str.base64-encode    # "aGVsbG8="
decoded: "aGVsbG8=" str.base64-decode # "hello"
hash: "hello" str.sha256              # "2cf24dba5fb0a30e..."

Serialisation

Serialise eucalypt values to strings using a named output format. These are pure functions — no IO is required.

| Function | Description |
|---|---|
| render(value) | Serialise value to a YAML string |
| render-as(fmt, value) | Serialise value to a string in format fmt |

Supported formats for fmt: :yaml, :json, :toml, :text, :edn, :html.

Serialisation Examples

yaml-str: render({a: 1, b: 2})           # "a: 1\nb: 2\n"
json-str: render-as(:json, {a: 1, b: 2}) # "{\"a\":1,\"b\":2}"

These functions are backed by the RENDER_TO_STRING intrinsic, which traverses the evaluated heap value and serialises it using the same emitter pipeline as normal output.

Parsing

Parse a string of structured data back into eucalypt data. This is the inverse of render-as and is a pure function — no IO is required.

| Function | Description |
|---|---|
| parse-as(fmt, str) | Parse str as structured data in format fmt |

Supported formats for fmt: :json, :yaml, :toml, :csv, :xml, :edn, :jsonl.

:json and :yaml share the same parser.

Safety: parse-as always uses data-only mode. YAML !eu tags and other embedded-code constructs are returned as plain string values and never evaluated. It is safe to parse untrusted input (e.g. shell command output) with this function.

Parsing Examples

# Parse JSON
data: "{\"x\": 1}" parse-as(:json)
data.x  # 1

# Round-trip
original: {x: 1, y: 2}
recovered: render-as(:json, original) parse-as(:json)
recovered.x  # 1

# Pipeline style (parse-as with first arg partially applied)
{ :io r: io.shell("kubectl get configmap foo -o json") }.r.stdout
  parse-as(:json)

parse-as is backed by the PARSE_STRING intrinsic.

Numbers and Arithmetic

Arithmetic Operators

| Function | Description |
|---|---|
| (∸ n) | Unary minus; negate |

The binary arithmetic operators +, -, *, and / are polymorphic: when either operand is an n-dimensional array, the operation is performed element-wise. One operand may be a scalar number. For purely numeric operands, the operators behave as before (/ performs floor division for numbers, but element-wise float division for arrays).

Numeric Functions

| Function | Description |
|---|---|
| inc | Increment number x by 1 |
| dec | Decrement number x by 1 |
| negate | Negate number n |
| zero? | Return true if and only if number n is 0 |
| pos? | Return true if and only if number n is strictly positive |
| neg? | Return true if and only if number n is strictly negative |
| num | Parse number from string |
| floor | Round number downwards to nearest integer |
| ceiling | Round number upwards to nearest integer |
| pow(b, e) | Raise b to the power e |
| div(a, b) | Floor division; same as a / b |
| mod(a, b) | Floor modulus; same as a % b |
| quot(a, b) | Truncation division; rounds toward zero |
| rem(a, b) | Truncation remainder; result has same sign as dividend |
| sum | Sum a list of numbers |
| max(l, r) | Return max of numbers l and r |
| max-of(l) | Return max element in list of numbers l - error if empty |
| min(l, r) | Return min of numbers l and r |
| min-of(l) | Return min element in list of numbers l - error if empty |
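
The difference between floor and truncation division only shows up with negative operands. A small sketch with illustrative names:

```eu
# Floor division rounds towards negative infinity
floor-q: div(-7, 2)     # -4
floor-r: mod(-7, 2)     # 1, since -4 * 2 + 1 is -7

# Truncation division rounds towards zero
trunc-q: quot(-7, 2)    # -3
trunc-r: rem(-7, 2)     # -1, since -3 * 2 - 1 is -7
```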

Booleans and Comparison

Essentials

| Function | Description |
|---|---|
| null | A null value. To export as null in JSON or ~ in YAML |
| true | Constant logical true |
| false | Constant logical false |
| if | If c is true, return t else f |
| then(t, f, c) | Pipeline-friendly if: x? then(t, f) |
| when(p?, f, x) | When x satisfies p? apply f else pass through unchanged |
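
A sketch of the pipeline-oriented conditionals, where the value being tested arrives as the final argument. The names are illustrative:

```eu
sign: 5 pos? then("positive", "non-positive")   # "positive"
bumped: 5 when(pos?, inc)                       # 6
unchanged: 0 when(pos?, inc)                    # 0, since pos? is strict
```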

Error and Debug Support

| Function | Description |
|---|---|
| panic | Raise runtime error with message string s |
| assert(p?, s, v) | If v p? is true then return v otherwise error with message s. Composable in pipelines: x assert(non-nil?, "expected non-nil") |

Boolean Logic

| Function | Description |
|---|---|
| not | Toggle boolean |
| (! b) | Not b, toggle boolean |
| (¬ b) | Not b, toggle boolean |
| and | True if and only if l and r are true |
| or | True if and only if l or r is true |

Combinators

Combinators

| Function | Description |
|---|---|
| identity(v) | Identity function, return value v |
| const(k, _) | Return single arg function that always returns k |
| (-> k) | Const; return single arg function that always returns k |
| compose(f, g, x) | Apply function f to g(x) |
| apply(f, xs) | Apply function f to arguments in list xs |
| flip(f, x, y) | Flip arguments of function f, flip(f)(x, y) == f(y, x) |
| complement(p?) | Invert truth value of predicate function |
| curry(f, x, y) | Turn f([x, y]) into f' of two parameters (x, y) |
| uncurry(f, l) | Turn f(x, y) into f' that expects [x, y] as a list |
| cond(l, d) | In list l of [condition, value] select first true condition, returning value, else default d |
| juxt(f, g, x) | Return function of x returning list of f(x) and g(x) |
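
For illustration, a sketch of a few combinators applied to prelude functions. Because functions are curried, supplying fewer arguments (as with const(0) and complement(zero?)) yields a function; the names are illustrative:

```eu
seven: compose(inc, (_ * 2), 3)                         # inc(3 * 2) is 7
both: juxt(inc, dec, 5)                                 # [6, 4]
five: flip(div, 2, 10)                                  # div(10, 2) is 5
non-zero: filter(complement(zero?), [0, 1, 0, 2])       # [1, 2]
zeros: map(const(0), [1, 2, 3])                         # [0, 0, 0]
grade: cond([[false, "low"], [true, "high"]], "none")   # "high"
```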

Utilities

| Function | Description |
|---|---|
| fnil(f, v, x) | Return a function equivalent to f except it sees v instead of null when null is passed |

Calendar

Date and Time Functions

| Function | Description |
|---|---|
| cal.zdt | Create zoned date time from datetime components and timezone string (e.g. '+0100') |
| cal.datetime(b) | Convert block of time fields to zoned datetime (defaults: y=1, m=1, d=1, H=0, M=0, S=0, Z=UTC) |
| cal.parse | Parse an ISO8601 formatted date string into a zoned date time |
| cal.format | Format a zoned date time as ISO8601 |
| cal.fields | Decompose a zoned date time into a block of its component fields (y,m,d,H,M,S,Z) |

Sets

Set Operations

| Function | Description |
|---|---|
| set.from-list(xs) | Create a set from list xs of primitive values |
| set.to-list | Return sorted list of elements in set s |
| set.add | Add element e to set s |
| set.remove | Remove element e from set s |
| set.contains? | True if set s contains element e |
| set.size | Return number of elements in set s |
| set.empty?(s) | True if set s has no elements |
| set.union | Return union of sets a and b |
| set.intersect | Return intersection of sets a and b |
| set.diff | Return elements in set a that are not in set b |
| (∅) | The empty set |

s: set.from-list([1, 2, 3, 2, 1])
# s contains {1, 2, 3} (duplicates removed)

Set Algebra

a: set.from-list([1, 2, 3])
b: set.from-list([2, 3, 4])
u: set.union(a, b) set.to-list       # [1, 2, 3, 4]
i: set.intersect(a, b) set.to-list   # [2, 3]
d: set.diff(a, b) set.to-list        # [1]

Random Numbers

The random namespace provides pseudo-random number generation as a state monad over a PRNG stream.

The random stream

A random stream is provided at startup as io.random, seeded from system entropy. Each run produces different values unless you supply a fixed seed with --seed:

eu --seed 42 example.eu

You can also create a deterministic stream directly:

stream: random.stream(12345)

Using random operations

Each random operation takes a stream and returns a {value, rest} block. For a single value, pass io.random:

roll: random.int(6, io.random).value + 1

For multiple values, you must propagate .rest into the next call — reusing io.random gives the same value each time. Use a { :random ... } monadic block or combinators like sequence to handle this automatically:

dice: { :random
  a: random.int(6)
  b: random.int(6)
}.[a + b + 2]

result: dice(io.random).value

Always extract .value before rendering — the .rest field is an infinite stream.
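
As a sketch of what the monadic block saves you from, the stream can also be threaded by hand, passing each result's .rest to the next call. The second form uses random.sequence from the reference table. All names are illustrative:

```eu
# Manual threading: each call consumes the stream left by the previous one
r1: random.int(6, io.random)
r2: random.int(6, r1.rest)
rolls: [r1.value + 1, r2.value + 1]

# The same idea with sequence, which threads the stream for you
two-dice: random.sequence([random.int(6), random.int(6)])
raw-rolls: two-dice(io.random).value
```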

Reference

| Function | Description |
|---|---|
| random.stream(seed) | Create a PRNG stream from an integer seed |
| random.bind(m, f) | State monad bind: run action m, pass result to f, thread stream |
| random.return(v) | State monad return: wrap a pure value as an action |
| random.float | Action returning a random float in [0,1) |
| random.int(n) | Action returning a random integer in [0,n) |
| random.choice(list) | Action returning a random element from list |
| random.shuffle(list) | Action returning a shuffled copy of list |
| random.sample(n, list) | Action returning n elements sampled without replacement |
| random.map(f, action) | Apply pure function f to the result of an action (derived) |
| random.then(b, a) | Sequence two actions, discard first result (derived). Pipeline: a random.then(b) |
| random.join(mm) | Flatten a nested action (derived) |
| random.sequence(ms) | Sequence a list of actions, collect results (derived) |
| random.map-m(f, xs) | Map f over list producing actions, then sequence (derived) |
| random.filter-m(p, xs) | Monadic filter over a list of actions (derived) |

Metadata

Metadata is a powerful mechanism for attaching auxiliary information to any eucalypt expression. It is used for documentation, export control, import declarations, operator definitions, and testing assertions.

Attaching and Reading Metadata

Metadata Basics

| Function | Description |
|---|---|
| with-meta | Add metadata block m to expression e |
| meta | Retrieve expression metadata for e |
| raw-meta | Retrieve immediate metadata of e without recursing into inner layers |
| merge-meta(m, e) | Merge block m into e's metadata |
| validator(v) | Find the validator for a value v in its metadata |
| check(v) | True if v is valid according to assert metadata |
| checked(v) | Panic if value doesn't satisfy its validator |

Documentation Metadata

The backtick (`) before a declaration attaches metadata. When the value is a string, it sets the doc key:

` "Add two numbers together"
add(a, b): a + b

This is equivalent to:

` { doc: "Add two numbers together" }
add(a, b): a + b

For richer metadata, use a block:

` { doc: "Infix addition operator"
    precedence: :sum
    associates: :left }
(a + b): __ADD(a, b)

Common Metadata Keys

| Key | Purpose |
|---|---|
| doc | Documentation string |
| import | Import specification |
| target | Export target name |
| export | Export control (:suppress to hide) |
| precedence | Operator precedence level |
| associates | Operator associativity (:left, :right) |
| parse-embed | Embedded representation format |
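
As a small sketch of the export key, an intermediate value can be kept out of the rendered output while remaining usable by other declarations. The names base-port and http-port are illustrative:

```eu
` { doc: "Base port, used only to derive other values"
    export: :suppress }
base-port: 8000

# Only http-port is expected to appear in the output
http-port: base-port + 80
```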

IO

Prelude Versioning

| Function | Description |
|---|---|
| eu.prelude | Metadata about this version of the standard prelude |
| eu.build | Metadata about this version of the eucalypt executable |
| eu.requires | Assert that the eucalypt version satisfies the given semver constraint (e.g. '>=0.2.0') |

Runtime IO Values

| Name | Description |
|---|---|
| io.env | Read access to environment variables at time of launch |
| io.epoch-time | Unix epoch time at time of launch |
| io.args | Command-line arguments passed after -- separator |
| io.RANDOM_SEED | Seed for random number generation (from --seed or system time) |
| io.random | Infinite lazy stream of random floats in [0,1) |

IO Monad

The io namespace is a monad. Blocks tagged with :io are desugared into monadic bind chains automatically:

result: { :io
  r: io.shell("ls -la")
  _: io.check(r)
}.r.stdout

...desugars to io.bind(io.shell("ls -la"), λr. io.bind(io.check(r), λ_. io.return(r.stdout))).

IO operations require the --allow-io / -I flag at the command line.

Monad primitives

| Function | Description |
|---|---|
| io.return(a) | Wrap a pure value in the IO monad |
| io.bind(action, continuation) | Sequence two IO actions |

Shell execution

| Function | Description |
|---|---|
| io.shell(cmd) | Run cmd via sh -c. Returns {stdout: Str, stderr: Str, exit-code: Num} |
| io.shell-with(opts, cmd) | Run cmd via sh -c with extra options merged in (e.g. {stdin: s, timeout: 60}). Pipeline: "ls" io.shell-with({timeout: 60}) |
| io.exec([cmd : args]) | Run cmd directly (no shell). Argument is a single list: first element is the command, rest are args |
| io.exec-with(opts, [cmd : args]) | Run cmd directly with extra options merged in. Pipeline: ["git", "rev-parse", "HEAD"] io.exec-with({timeout: 60}) |

Default timeout is 30 seconds. Override with {timeout: N} in opts. Optional {stdin: s} pipes string s to the command's standard input.
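
A sketch of the two non-shell variants, following the same :io block pattern as the io.shell example above. The commands and declaration names are purely illustrative, and running them requires the --allow-io / -I flag:

```eu
# Run git directly, without a shell, and fail if it exits non-zero
head-rev: { :io
  r: io.exec(["git", "rev-parse", "HEAD"])
  _: io.check(r)
}.r.stdout

# Pipe a string to the command's stdin and raise the timeout to 60 seconds
shouted: { :io
  r: io.shell-with({ stdin: "hello", timeout: 60 }, "tr a-z A-Z")
  _: io.check(r)
}.r.stdout
```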

Combinators

| Function | Description |
|---|---|
| io.check(result) | If exit-code is non-zero, fail with the stderr message; otherwise return the result |
| io.checked | Pipeline-friendly check: bind the preceding IO action through io.check |
| io.fail(msg) | Fail the IO action with the given error message |
| io.map(f, action) | Apply a pure function to the result of an IO action (fmap) |
| io.and-then(f, action) | Pass the result of action to f (bind with flipped args, for pipeline use) |
| io.then(b, a) | Sequence two actions, discarding the result of the first. Pipeline: a io.then(b) |
| io.join(mm) | Flatten a nested IO action |
| io.sequence(ms) | Run a list of IO actions in order, collecting results into a list |
| io.map-m(f, xs) | Apply f to each element of xs (producing IO actions), then sequence |
| io.filter-m(p, xs) | Monadic filter: keep elements where p returns a truthy IO action |

The combinators map, then, join, sequence, map-m, and filter-m are derived automatically via monad(). See the Monads guide for details on the derivation pattern and the IO guide for practical usage.

| Function | Description |
|---|---|
| monad(m) | Derive standard monad combinators from m.bind and m.return |
| monad(m).bind | Passed through from m.bind |
| monad(m).return | Passed through from m.return |
| monad(m).map(f, action) | Apply pure function f to the result of a monadic action (fmap) |
| monad(m).and-then(f, action) | Pass the result of action to f (bind with flipped args, for pipeline use) |
| monad(m).then(b, a) | Sequence two monadic actions, discarding the result of the first. Pipeline: a m.then(b) |
| monad(m).join(mm) | Flatten a nested monadic value |
| monad(m).sequence(ms) | Sequence a list of monadic actions, collecting results into a list |
| monad(m).map-m(f, xs) | Apply f to each element of xs (producing actions), then sequence |
| monad(m).filter-m(p, xs) | Monadic filter: apply predicate p (returning a monadic bool) to each element |

Other

| Function | Description |
|---|---|
| alter?(k?, v!, k, v) | If k satisfies k? then v! else v |
| update?(k?, f, k, v) | If k satisfies k? then f(v) else v |

N-Dimensional Arrays

Eucalypt provides a native n-dimensional array type (also called tensors) backed by a flat contiguous store with shape metadata. Arrays enable efficient grid and matrix operations, in particular for Advent of Code-style grid simulations where list-based approaches are too slow.

Arrays are an immutable, purely functional data structure. All mutation operations return new arrays.

Construction

| Function | Description |
|---|---|
| arr.zeros(shape) | Create an array of zeros with the given shape list |
| arr.fill(shape, val) | Create an array filled with val with the given shape list |
| arr.from-flat(shape, vals) | Create an array from a flat list of numbers with the given shape |

The shape argument is always a list of integers, e.g. [3] for a 1D array of 3 elements or [2, 3] for a 2×3 matrix.

Access and Query

| Function | Description |
|---|---|
| arr.get(a, coords) | Get element at coordinate list coords in array a |
| arr.set(a, coords, val) | Return new array with element at coords set to val |
| arr.shape(a) | Return shape of array a as a list of integers |
| arr.rank(a) | Return number of dimensions of array a |
| arr.length(a) | Return total number of elements in array a |
| arr.to-list(a) | Return flat list of elements in row-major order |
| arr.array?(x) | true if x is an n-dimensional array |
| is-array?(x) | true if x is an n-dimensional array (alias) |

The coords argument to arr.get and arr.set is a list of integers, one per dimension, e.g. [row, col] for a 2D array.

The !! Operator

The indexing operator !! is overloaded for arrays. When the left operand is an array, !! delegates to arr.get:

my-2d-array !! [row, col]   # same as arr.get(my-2d-array, [row, col])
my-1d-array !! [idx]        # same as arr.get(my-1d-array, [idx])

For plain lists, !! retains its original list-index behaviour:

[10, 20, 30] !! 1   # => 20

Transformations

| Function | Description |
|---|---|
| arr.transpose(a) | Reverse all axes of array a |
| arr.reshape(a, shape) | Reshape array a to new shape (total elements must match) |
| arr.slice(a, axis, idx) | Take a slice along axis at idx, reducing rank by 1 |
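
A sketch of the three transformations on a 2×3 matrix. It assumes that axes and indices are zero-based, with axis 0 indexing rows, in line with the coordinate examples elsewhere in this section; the names are illustrative:

```eu
m: arr.from-flat([2, 3], [1, 2, 3, 4, 5, 6])

# Transposing a 2×3 matrix gives a 3×2 matrix
t: arr.transpose(m)
t-shape: arr.shape(t)              # [3, 2]
corner: arr.get(t, [2, 1])         # 6, the element at [1, 2] in m

# Reshaping keeps the row-major element order
r: arr.reshape(m, [3, 2])
r-list: arr.to-list(r)             # [1, 2, 3, 4, 5, 6]

# Assumption: axis 0 selects rows, so this takes the second row
row: arr.slice(m, 0, 1)
row-list: arr.to-list(row)         # [4, 5, 6]
```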

Arithmetic

Array-specific arithmetic (explicit namespace):

| Function | Description |
|---|---|
| arr.add(a, b) | Element-wise addition; b may be a scalar |
| arr.sub(a, b) | Element-wise subtraction; b may be a scalar |
| arr.mul(a, b) | Element-wise multiplication; b may be a scalar |
| arr.div(a, b) | Element-wise division; b may be a scalar |

The standard arithmetic operators +, -, *, / are polymorphic: when either operand is an array, the operation is applied element-wise.

a: arr.from-flat([3], [1, 2, 3])
b: arr.from-flat([3], [10, 20, 30])
c: a + b          # element-wise: [11, 22, 33]
d: a * 2          # scalar broadcast: [2, 4, 6]

Note: / on arrays performs element-wise float division (not floor division as it does for plain integers).
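
The element-wise and scalar-broadcast behaviour can be modelled in a few lines of Python. This is an illustrative sketch only: elementwise is a hypothetical helper, plain Python lists stand in for flat 1D arrays, and raising an error on mismatched lengths is an assumption rather than documented eucalypt behaviour.

```python
import operator


def elementwise(op, a, b):
    """Apply op pairwise; a bare number on either side is broadcast."""
    a_is_list, b_is_list = isinstance(a, list), isinstance(b, list)
    if a_is_list and b_is_list:
        if len(a) != len(b):
            # Assumption: mismatched shapes are treated as an error.
            raise ValueError("shapes must match")
        return [op(x, y) for x, y in zip(a, b)]
    if a_is_list:
        return [op(x, b) for x in a]
    if b_is_list:
        return [op(a, y) for y in b]
    return op(a, b)


a = [1, 2, 3]
b = [10, 20, 30]
print(elementwise(operator.add, a, b))      # [11, 22, 33]
print(elementwise(operator.mul, a, 2))      # [2, 4, 6]
print(elementwise(operator.truediv, a, 2))  # [0.5, 1.0, 1.5]
```

The last line mirrors the note above: division on arrays yields floats rather than floored integers.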

Higher-order Operations

| Function | Description |
|---|---|
| arr.indices(a) | Return list of coordinate lists for every element, in row-major order |
| arr.map(f, a) | Apply f to each element; return new array of same shape |
| arr.map-indexed(f, a) | Apply f(coords, val) to each element; return new array of same shape |
| arr.fold(f, init, a) | Left-fold f over all elements in row-major order, starting from init |
| arr.neighbours(a, coords, offsets) | Return list of values at valid in-bounds neighbours of coords, given a list of offset vectors |

arr.indices returns coordinates as lists; for a 2D array of shape [rows, cols], each entry is [row, col].

arr.neighbours silently skips any out-of-bounds coordinates, so it is safe to call on border elements without special-casing.

# Double every element of a 1D array
a: arr.from-flat([3], [1, 2, 3])
b: arr.map((_ * 2), a)   # => [2, 4, 6] (same shape)

# Sum all elements
total: arr.fold((+), 0, a)   # => 6

# List all coordinates of a 2×2 array
coords: arr.from-flat([2, 2], [0, 0, 0, 0]) arr.indices
# => [[0, 0], [0, 1], [1, 0], [1, 1]]

# Neighbours of centre cell in a 3×3 grid (4-connected)
grid: arr.from-flat([3, 3], [1, 2, 3, 4, 5, 6, 7, 8, 9])
ns: arr.neighbours(grid, [1, 1], [[-1, 0], [1, 0], [0, -1], [0, 1]])
# => [2, 8, 4, 6]
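
To make the out-of-bounds rule concrete, here is a small Python model of the neighbour lookup for the 2D case. The neighbours function below is a hypothetical stand-in, not eucalypt's implementation; it assumes a row-major flat list and that results are returned in the order the offsets are given.

```python
def neighbours(flat, shape, coords, offsets):
    """Collect values at coords + offset, skipping anything out of bounds."""
    rows, cols = shape
    values = []
    for d_row, d_col in offsets:
        r, c = coords[0] + d_row, coords[1] + d_col
        if 0 <= r < rows and 0 <= c < cols:
            values.append(flat[r * cols + c])
    return values


grid = [1, 2, 3, 4, 5, 6, 7, 8, 9]             # 3x3, row-major
four = [[-1, 0], [1, 0], [0, -1], [0, 1]]      # up, down, left, right

print(neighbours(grid, [3, 3], [1, 1], four))  # [2, 8, 4, 6]
print(neighbours(grid, [3, 3], [0, 0], four))  # [4, 2] (corner: two skipped)
```

The corner call shows why no special-casing is needed at the border: offsets that fall outside the grid simply contribute nothing.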

Example

# 3×3 grid initialised to zero
grid: arr.zeros([3, 3])

# Set a value at row 1, col 2
grid2: arr.set(grid, [1, 2], 42)

# Read back the value
val: arr.get(grid2, [1, 2])   # => 42
val2: grid2 !! [1, 2]         # same thing via !! operator

# Compute element-wise sum of two grids
a: arr.from-flat([2, 2], [1, 2, 3, 4])
b: arr.from-flat([2, 2], [5, 6, 7, 8])
c: a + b   # => [[6, 8], [10, 12]] (as flat list: [6, 8, 10, 12])

CLI Reference

Eucalypt is available as a command line tool, eu, which reads inputs and writes outputs.

Everything it does in between is purely functional and there is no mutable state.

It is intended to be simple to use in unix pipelines.

eu --version # shows the current eu version
eu --help # lists command line options

Command Structure

The eu command uses a subcommand structure for clarity and extensibility:

eu [GLOBAL_OPTIONS] [SUBCOMMAND] [SUBCOMMAND_OPTIONS] [FILES...]

Subcommands

  • run (default) - Evaluate eucalypt code
  • test - Run tests
  • dump - Dump intermediate representations
  • version - Show version information
  • explain - Explain what would be executed
  • list-targets - List targets defined in the source
  • fmt - Format eucalypt source files
  • lsp - Start the Language Server Protocol server

When no subcommand is specified, run is used by default, so these are equivalent:

eu file.eu
eu run file.eu

Inputs

Files / stdin

eu can read several inputs, specified by command line arguments.

Inputs specify text data from:

  • files
  • stdin
  • internal resources (ignored for now)
  • (in future) HTTPS URLs or Git refs

...of which the first two are the common case. In the simplest case, file inputs are specified by file name and stdin is specified by -.

So

eu a.yaml - b.eu

...will read input from a.yaml, stdin and b.eu. Each will be read into eucalypt's core representation and merged before output is rendered.

Input format

Inputs must be one of the formats that eucalypt supports, which at present, are:

  • yaml
  • json
  • jsonl (JSON Lines)
  • toml
  • edn
  • xml
  • csv
  • text

Of these yaml, json, toml, edn and xml return blocks; jsonl, csv and text return lists. Inputs that return lists frequently need to be named (see below) to allow them to be used.

Usually the format is inferred from file extension but it can be overridden on an input by input basis using a format@ prefix.

For instance:

eu yaml@a.txt json@- yaml@b.txt

...will read YAML from a.txt, JSON from stdin and YAML from b.txt.

Named inputs

Finally, inputs can be named using a name= prefix. This alters the way that data is merged by making the contents of an input available in a block or list with the specified name, instead of at the top level.

Suppose we have two inputs:

a.yaml

foo: bar

b.eu

x: 42

then

eu a.yaml b.eu

would generate:

foo: bar
x: 42

but

eu data=a.yaml b.eu

would generate:

data:
  foo: bar

x: 42

This can be useful for various reasons, particularly when:

  • the form of the input's content is not known in advance
  • the input's content is a list rather than a block

Full input syntax

The full input syntax is therefore:

[name=][format@][URL/file]

This applies at the command line and also when specifying imports in .eu files.
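
As an illustration of how the three parts fit together, the following Python sketch splits an input specification into name, format and locator. parse_input is a hypothetical helper and the splitting rules are assumptions made for the example; eucalypt's own parser may treat edge cases differently.

```python
def parse_input(spec):
    """Split '[name=][format@]locator' into its three parts."""
    name = fmt = None
    head, sep, rest = spec.partition("=")
    # Only treat the prefix as a name if it cannot be part of a path or format.
    if sep and head and "@" not in head and "/" not in head:
        name, spec = head, rest
    head, sep, rest = spec.partition("@")
    if sep and head and "/" not in head:
        fmt, spec = head, rest
    return {"name": name, "format": fmt, "locator": spec}


print(parse_input("data=yaml@a.txt"))
# {'name': 'data', 'format': 'yaml', 'locator': 'a.txt'}
print(parse_input("json@-"))
# {'name': None, 'format': 'json', 'locator': '-'}
print(parse_input("a.yaml"))
# {'name': None, 'format': None, 'locator': 'a.yaml'}
```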

stdin defaulting

When no inputs are specified and eu is being used in a pipeline, it will accept input from stdin by default, making it easy to pipe JSON or YAML from other tools into eu.

For example, this takes JSON from the aws CLI and formats it as YAML to stdout.

aws s3api list-buckets | eu

How inputs are merged

When several inputs are listed, names from earlier inputs become available to later inputs, but the content that will be rendered is that of the final input.

So for instance:

a.eu

x: 4
y: 8

b.eu

z: x + y

eu a.eu b.eu

will output

z: 12

The common use cases are:

  • a final input containing logic to inspect or process data provided by previous inputs
  • a final input which uses functions defined in earlier inputs to process data provided in previous inputs

If you want to render contents of earlier inputs, you need a named input to provide a name for that content which you can then use.

For instance:

eu r=a.eu b.eu -e r

will render:

x: 4
y: 8

--collect-as and --name-inputs

Occasionally it is useful to aggregate data from an arbitrary number of source files, typically specified by shell wildcards. To refer to this data we need to introduce a name for the collection of data.

This is what the command line switch --collect-as / -c is for.

eu --collect-as inputs *.eu

...will render:

inputs:
  - x: 4
    y: 8
  - z: 12

It is common to use -e to select an item to render:

eu -c inputs *.eu -e 'inputs head'

...renders:

x: 4
y: 8

If you are likely to need to refer to inputs by name, you can add --name-inputs / -N to pass inputs as a block instead of a list:

eu --collect-as inputs --name-inputs *.eu

...renders:

inputs:
  a.eu:
    x: 4
    y: 8
  b.eu:
    z: 12

This makes it easier to invoke specific functions from named inputs although you will need single-quote name syntax to use the generated names which contain .s.
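
The difference between the two shapes can be pictured with a short Python model. The collect function and its arguments are hypothetical and exist only to mirror the rendered output shown above; they say nothing about how eu implements these switches.

```python
def collect(inputs, name, name_inputs=False):
    """Gather (filename, content) pairs under a single top-level name."""
    if name_inputs:
        # --name-inputs: a block keyed by input name
        return {name: {filename: content for filename, content in inputs}}
    # --collect-as alone: a list in command-line order
    return {name: [content for _, content in inputs]}


inputs = [("a.eu", {"x": 4, "y": 8}), ("b.eu", {"z": 12})]

print(collect(inputs, "inputs"))
# {'inputs': [{'x': 4, 'y': 8}, {'z': 12}]}
print(collect(inputs, "inputs", name_inputs=True))
# {'inputs': {'a.eu': {'x': 4, 'y': 8}, 'b.eu': {'z': 12}}}
```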

Outputs

In the current version, eu can only generate one output.

Output format

Output is rendered as YAML by default. Other formats can be specified using the -x command line option:

eu -x json # for JSON
eu -x text # for plain text

JSON is such a common case that there is a shortcut: -j.

Output targets

By default, eucalypt renders all the content of the final input to output.

There are various ways to override this. First, :target metadata can be specified in the final input to identify different parts for potential export.

To list the targets found in the specified inputs, use the list-targets subcommand.

eu list-targets file.eu

...and a particular target can be selected for render using -t.

eu -t my-target

If there is a target called "main" it will be used by default unless another target is specified.

Evaluands

In addition to inputs, an evaluand can be specified at the command line. This is a eucalypt expression which has access to all names defined in the inputs and replaces the input body or targets as the data to export.

It can be used to select content or derive values from data in the inputs:

$ aws s3api list-buckets | eu -e 'Buckets map(lookup(:CreationDate)) head'
2016-12-25T14:22:30.000Z

...or just to test out short expressions or command line features:

$ eu -e '{a: 1 b: 2 * 2}' -j
{"a": 1, "b": 4}

Passing Arguments to Programs

You can pass command-line arguments to your eucalypt program using the -- separator. Arguments after -- are available via io.args:

$ eu -e 'io.args' -- foo bar baz
---
- foo
- bar
- baz

This is useful for writing eucalypt scripts that accept parameters:

# greet.eu
name: io.args head-or("World")
greeting: "Hello, {name}!"

$ eu greet.eu -e greeting -- Alice
---
Hello, Alice!

Arguments are passed as strings. Use num to convert numeric arguments:

# sum.eu
total: io.args map(num) foldl((+), 0)

$ eu sum.eu -e total -- 1 2 3 4 5
---
15

When no arguments are passed, io.args is an empty list:

$ eu -e 'io.args nil?'
---
true

Random Seed

By default, random numbers are seeded from system entropy and produce different results on each run. Use --seed for reproducible output:

eu --seed 42 template.eu

This sets io.RANDOM_SEED and seeds the io.random stream. See Random Numbers for the full random API.

IO Monad Operations

By default, eucalypt is a pure functional language with no side effects. To enable IO monad operations — specifically shell command execution — you must pass the --allow-io / -I flag:

eu -I script.eu
eu --allow-io script.eu

Without this flag, any program that attempts to execute an IO action will fail with an error:

IO operations require the --allow-io (-I) flag

Why this flag exists

The flag is a deliberate security measure. Eucalypt files are often used as configuration or data templates, and it would be unsafe for arbitrary .eu files loaded from the filesystem or network to execute shell commands without explicit consent. The --allow-io flag is your explicit acknowledgement that the program you are running may perform shell execution.

Usage with IO targets

IO programs typically use the :io monadic block syntax and are run with a named target:

eu -I --target main script.eu
eu -I -t main script.eu

See the IO monad design documentation for full details of the IO API.

Suppressing prelude

A standard prelude containing many functions and operators is automatically prepended to the input list.

This can be suppressed using -Q if it is not required or if you would like to provide an alternative.

Warning: Many very basic facilities -- like the definition of true and false and if -- are provided by the prelude so suppressing it leaves a very bare environment.

Debugging

eu has a variety of command line switches for dumping out internal representations or tracing execution. The dump subcommand provides access to intermediate representations:

eu dump ast file.eu          # Parse and dump syntax tree
eu dump desugared file.eu    # Dump core expression
eu dump stg file.eu          # Dump compiled STG syntax
eu list-targets file.eu      # List available targets

Use eu --help and eu <subcommand> --help for complete option lists.

Formatting Source Files

The fmt subcommand formats eucalypt source files for consistent style:

eu fmt file.eu              # Print formatted output to stdout
eu fmt --write file.eu      # Format in place
eu fmt --check file.eu      # Check formatting (exit 1 if not formatted)
eu fmt *.eu --write         # Format multiple files in place

Options

  • -w, --width <WIDTH> - Line width for formatting (default: 80)
  • --write - Modify files in place
  • --check - Check if files are formatted (exit 1 if not)
  • --reformat - Full reformatting mode (instead of conservative)
  • --indent <INDENT> - Indent size in spaces (default: 2)

The formatter has two modes:

  • Conservative mode (default) - Preserves original formatting choices where possible, only reformatting where necessary
  • Reformat mode (--reformat) - Full reformatting that applies consistent style throughout

Language Server Protocol

The lsp subcommand starts an LSP server for use with editors that support the Language Server Protocol (e.g., VS Code, Neovim):

eu lsp

The LSP server provides:

  • Syntax error diagnostics
  • Formatting support (via textDocument/formatting)

Configure your editor to use eu lsp as the language server command for .eu files. A VS Code extension is available in the editors/vscode/ directory of the repository.

Version Assertions

The eu.requires function allows eucalypt source files to assert a minimum version of the eucalypt executable:

{ import: [] }  # unit-level metadata not required for eu.requires

# Assert that eu version satisfies semver constraint
_ : eu.requires(">=0.3.0")

If the running version of eu does not satisfy the constraint, an error is raised immediately. This is useful for library code that depends on features introduced in a particular version.

The eu namespace also provides build metadata:

version: eu.build.version    # e.g., "0.3.0"

Backward Compatibility

All existing command patterns continue to work unchanged:

eu file.eu                   # Still works (uses run subcommand)
eu -e "expression"           # Still works (uses run subcommand)
eu -j file.eu                # Still works (JSON output)
eu -S -Q file.eu             # Still works (statistics, no prelude)

Import Formats

Eucalypt supports importing content from other units in a variety of ways.

Imported names can be scoped to specific declarations, made accessible under a specific namespace, and loaded from disk or directly from git repositories.

Import scopes

Imports are specified in declaration metadata and make the names in the imported unit available within the declaration that is annotated.

` { import: "config.eu" }
data: {
  # names from config are available here
  x: config-value
}

As described in Syntax Reference, declaration metadata can be applied at a unit level simply by including a metadata block as the very first thing in a eucalypt file:

{ import: "config.eu" }

# names from config are available here

x: config-value

Import syntax

Imports are specified using the key import in a declaration metadata block. The value may be a single import specification:

{ import: "dep-a.eu"}

or a list of import specifications:

{ import: ["dep-a.eu", "dep-b.eu"]}

The import specification itself can be either a simple import or a git import.

Simple imports

Simple imports are specified in exactly the same way as inputs are specified at the command line (see CLI Reference).

So you can override the format of the imported file when the file extension is misleading:

{ import: "yaml@dep.txt" }

...and provide a name under which the imported names will be available:

{ import: "cfg=config.eu" }

# names in config.eu are available by lookup in cfg:

x: cfg.x

In cases where the import format delivers a list rather than a block ("text", "csv", "jsonl", ...) a name is mandatory:

{ import: "txns=transactions.csv" }

Simple imports support exactly the same inputs as the command line, with the proviso that the stdin input ("-") will not be consumable if it has already been specified in the command line or another unit.

Git imports

Git imports allow you to import eucalypt direct from a git repository at a specified commit, combining the convenience of not having to explicitly manage a git working copy and a library path with the repeatability of a git SHA. A git import is specified as a block with the keys "git", "commit" and "import", all of which are mandatory:

{ import: { git: "https://github.com/gmorpheme/eu.aws"
            commit: "0140232cf882a922bdd67b520ed56f0cddbd0637"
            import: "aws/cloudformation.eu" } }

The git URL may be any format that the git command line expects.

commit is required and should be a SHA. It is intended to ensure the import is repeatable and cacheable.

import identifies the file within the repository to import.

Just as with simple imports, several git imports may be listed:

{ import: [{ git: ... }, { git: ... }]}

...and simple imports and git imports may be freely mixed.

YAML import features

When importing YAML files, eucalypt supports several YAML features that help reduce repetition and express data more naturally.

Anchors and aliases

YAML anchors (&name) and aliases (*name) allow you to define a value once and reference it multiple times. When eucalypt imports a YAML file with anchors and aliases, the aliased values are resolved to copies of the anchored expression.

# config.yaml
defaults: &defaults
  timeout: 30
  retries: 3

development:
  <<: *defaults
  debug: true

production:
  <<: *defaults
  debug: false

Anchors can be applied to any YAML value: scalars, lists, or mappings.

# Anchor on a scalar
name: &author "Alice"
books:
  - title: "First Book"
    author: *author
  - title: "Second Book"
    author: *author

# Anchor on a list
colours: &primary [red, green, blue]
palette:
  primary: *primary
  secondary: [yellow, cyan, magenta]

# Anchor on a mapping (block)
base: &base
  x: 1
  y: 2
ref: *base  # ref now has { x: 1, y: 2 }

Nested anchors are supported -- an anchored structure can itself contain anchored values:

outer: &outer
  inner: &inner 42
ref_outer: *outer   # { inner: 42 }
ref_inner: *inner   # 42

If you reference an undefined alias, eucalypt reports an error:

# This will fail: *undefined is not defined
value: *undefined

Merge keys

The YAML merge key (<<) allows you to merge entries from one or more mappings into another. This is useful for creating configuration variations that share a common base.

Single merge:

base: &base
  host: localhost
  port: 8080

server:
  <<: *base
  name: main
# server = { host: localhost, port: 8080, name: main }

Multiple merge:

When merging multiple mappings, later ones override earlier ones:

defaults: &defaults
  timeout: 30
  retries: 3

overrides: &overrides
  timeout: 60

config:
  <<: [*defaults, *overrides]
  name: myapp
# config = { timeout: 60, retries: 3, name: myapp }

Explicit keys override merged values:

Keys defined explicitly in the mapping (before or after the merge) always take precedence over merged values:

base: &base
  x: 1
  y: 2

derived:
  <<: *base
  y: 99
# derived = { x: 1, y: 99 }

Inline merge:

You can also merge an inline mapping directly:

config:
  <<: { timeout: 30, retries: 3 }
  name: myapp

The merge key value must be a mapping (or list of mappings). Attempting to merge a non-mapping value (e.g., <<: 42) results in an error.
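
The precedence rules described in this section can be summarised with a small Python model. resolve_merge is a hypothetical helper that works on already-parsed dictionaries; it follows the ordering stated above (later merged mappings override earlier ones, explicit keys override everything) and is not eucalypt's actual importer.

```python
def resolve_merge(mapping):
    """Resolve a '<<' key using the precedence rules described above."""
    sources = mapping.get("<<", [])
    if isinstance(sources, dict):
        sources = [sources]
    if not isinstance(sources, list):
        raise TypeError("merge value must be a mapping or a list of mappings")

    merged = {}
    for source in sources:
        if not isinstance(source, dict):
            raise TypeError("merge value must be a mapping or a list of mappings")
        merged.update(source)      # later sources override earlier ones
    for key, value in mapping.items():
        if key != "<<":
            merged[key] = value    # explicit keys always win
    return merged


defaults = {"timeout": 30, "retries": 3}
overrides = {"timeout": 60}

print(resolve_merge({"<<": [defaults, overrides], "name": "myapp"}))
# {'timeout': 60, 'retries': 3, 'name': 'myapp'}
print(resolve_merge({"y": 99, "<<": {"x": 1, "y": 2}}))
# {'x': 1, 'y': 99}
```

Because explicit keys are applied last, their position relative to the << entry does not matter.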

Timestamps

Eucalypt automatically converts YAML timestamps to ZDT (zoned date-time) expressions. Plain scalar values matching timestamp patterns are parsed and converted; quoted strings are left as strings.

Supported formats:

| Format | Example | Notes |
|---|---|---|
| Date only | 2023-01-15 | Midnight UTC |
| ISO 8601 UTC | 2023-01-15T10:30:00Z | |
| ISO 8601 offset | 2023-01-15T10:30:00+05:00 | |
| Space separator | 2023-01-15 10:30:00 | Treated as UTC |
| Fractional seconds | 2023-01-15T10:30:00.123456Z | |

Examples:

# These are converted to ZDT expressions:
created: 2023-01-15
updated: 2023-01-15T10:30:00Z
scheduled: 2023-06-01 09:00:00

# This remains a string (quoted):
date_string: "2023-01-15T10:30:00Z"

Invalid timestamps fall back to strings:

If a value looks like a timestamp but has invalid date components (e.g., month 13 or day 45), it remains a string:

invalid: 2023-13-45  # Remains string "2023-13-45"
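
The parse-or-fall-back behaviour can be sketched in Python as follows. read_scalar is a hypothetical helper that uses Python's datetime.fromisoformat as a stand-in for eucalypt's timestamp recognition, so the set of accepted patterns is only an approximation of the table above. It models plain scalars only; quoting is handled by the YAML parser before this step.

```python
from datetime import datetime, timezone


def read_scalar(text):
    """Return an aware datetime for timestamp-like text, else the text itself."""
    candidate = text.replace(" ", "T", 1)      # allow a space separator
    if candidate.endswith("Z"):
        candidate = candidate[:-1] + "+00:00"  # spell out UTC for the parser
    try:
        parsed = datetime.fromisoformat(candidate)
    except ValueError:
        return text                            # invalid dates stay strings
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


print(read_scalar("2023-01-15"))           # 2023-01-15 00:00:00+00:00
print(read_scalar("2023-06-01 09:00:00"))  # 2023-06-01 09:00:00+00:00
print(read_scalar("2023-13-45"))           # 2023-13-45 (still a string)
```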

To keep timestamps as strings:

If you need to preserve a timestamp-like value as a string rather than converting it to a ZDT, quote it:

# As ZDT:
actual_date: 2023-01-15

# As string:
date_label: "2023-01-15"

Streaming imports

For large files, eucalypt supports streaming import formats that read data lazily without loading the entire file into memory. Streaming formats produce a lazy list of records.

| Format | Description |
|---|---|
| jsonl-stream | JSON Lines (one JSON object per line) |
| csv-stream | CSV with headers (each row becomes a block) |
| text-stream | Plain text (each line becomes a string) |

Streaming formats are specified using the format@path syntax:

# Stream a JSONL file
eu -e 'data take(10)' data=jsonl-stream@events.jsonl

# Stream a large CSV
eu -e 'data filter(_.age > 30) count' data=csv-stream@people.csv

# Stream lines of text
eu -e 'log filter(str.matches?(".*ERROR.*"))' log=text-stream@app.log

Streaming imports can also be used via the import syntax in eucalypt source files:

{ import: "events=jsonl-stream@events.jsonl" }

recent: events take(100)

Note: Streaming imports require a name binding (e.g., data=) because they produce a list, not a block.

Note: text-stream supports reading from stdin using - as the path: eu -e 'data count' data=text-stream@-
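
For readers unfamiliar with lazy lists, the idea behind a streaming import can be illustrated with a Python generator. stream_jsonl is a hypothetical helper, and the in-memory source stands in for a large file; this is an analogy for the laziness described above rather than a description of eucalypt's internals.

```python
import io
import itertools
import json


def stream_jsonl(lines):
    """Yield one parsed record per non-empty line, on demand."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


source = io.StringIO('{"id": 1}\n{"id": 2}\n{"id": 3}\n')

# Analogous to `data take(2)`: only the first two lines are read.
first_two = list(itertools.islice(stream_jsonl(source), 2))
remaining = source.readline()   # the third line is still unread

print(first_two)   # [{'id': 1}, {'id': 2}]
print(remaining)   # {"id": 3}
```

Taking a prefix of the stream leaves the rest of the input untouched, which is what makes expressions such as data take(10) cheap on very large files.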

Export Formats

Detailed export format documentation is under construction.

Eucalypt can export to the following formats:

| Format | Flag | Notes |
|---|---|---|
| YAML | (default) | Default output format |
| JSON | -j or -x json | Compact JSON output |
| TOML | -x toml | TOML output |
| EDN | -x edn | EDN output |
| Text | -x text | Plain text output |

The output format can also be inferred from the output file extension when using -o:

eu input.eu -o output.json  # infers JSON format
eu input.eu -o output.toml  # infers TOML format

Error Messages Guide

This reference is under construction. It will provide a guide to understanding eucalypt error messages with examples and solutions.

Agent Reference

Dense, example-heavy reference for AI coding agents working with Eucalypt. All content verified against source. See the guide chapters for narrative explanations.


1. Syntax Reference

1.1 Primitives

| Type | Syntax | Examples |
|---|---|---|
| Integer | digits | 42, -7, 0 |
| Float | digits with . | 3.14, -0.5 |
| String | "..." | "hello" |
| Raw string | r"..." | r"C:\path", r"^\d+" |
| C-string | c"..." | c"line\nbreak", c"tab\there" |
| T-string (ZDT) | t"..." | t"2024-03-15", t"2024-03-15T14:30:00Z" |
| Symbol | :name | :key, :active |
| Boolean | literals | true, false |
| Null | literal | null |

String escape sequences (c-strings only): \n newline, \t tab, \r carriage return, \\ backslash, \" quote, \{ \} literal braces, \xHH hex byte, \uHHHH unicode, \UHHHHHHHH extended unicode.

String interpolation (all string types except t-strings): embed expressions with {expr}. Literal braces via {{ and }}. Format specifiers: {value:%.2f}, {n:%06d}.

Raw strings treat backslashes literally but still support {} interpolation.

1.2 Collections

# List (commas required)
[1, 2, 3]
[]
[1, "two", :three, true]

# Block (commas optional)
{ a: 1 b: 2 c: 3 }
{ a: 1, b: 2, c: 3 }

1.3 Declarations

| Form | Syntax | Notes |
|---|---|---|
| Property | name: expr | Defines a named value, rendered in output |
| Function | f(x, y): expr | Not rendered in output |
| List destructure | f([a, b]): expr | Single param, list destructured |
| Block destructure | f({x y}): expr | Single param, block destructured |
| Cons destructure | f([h : t]): expr | Single param, head/tail split |
| Juxtaposed list | f[a, b]: expr | Sugar for f([a, b]): expr |
| Juxtaposed block | f{x y}: expr | Sugar for f({x y}): expr |
| Binary operator | (l op r): expr | Symbolic name required |
| Prefix operator | (op x): expr | Unary prefix |
| Postfix operator | (x op): expr | Unary postfix |

Top-level unit: the file itself is an implicit block without braces.

1.4 Comments

# Line comment (to end of line)
x: 42  # inline comment

1.5 Metadata Annotations

Placed between a leading backtick and the declaration:

# Documentation shorthand (string = doc metadata)
` "Adds two numbers"
add(x, y): x + y

# Structured metadata
` { doc: "Deep merge operator"
    associates: :left
    precedence: :append }
(l << r): deep-merge(l, r)

# Suppress from output
` :suppress
helper(x): x + 1

# Export target
` { target: :my-output }
output: { result: 42 }

# Mark as main (default) target
` :main
main: { result: 42 }

Unit-level metadata: if the first item in a file is an expression (not a declaration), it becomes metadata for the entire unit:

{ import: "helpers.eu" }
result: helper(42)

1.6 Function Application

# Standard call (NO whitespace before paren)
f(x, y)

# Catenation / pipeline (single argument, becomes LAST param)
x f                    # = f(x)
x f g h                # = h(g(f(x)))
[1,2,3] map(inc)       # map(inc, [1,2,3])

# Partial application (all functions are curried)
add(1)                 # returns a function that adds 1
[1,2,3] map(+ 10)     # section: adds 10 to each

# Sections (operator with gaps filled by implicit params)
(+ 1)                  # function: _ + 1
(* 2)                  # function: _ * 2
(> 3)                  # function: _ > 3
(/)                    # function: _ / _ (two params)
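
For readers coming from other languages, catenation and partial application can be modelled in Python as below. pipeline and map_ are hypothetical names introduced for this sketch; functools.partial plays the role of eucalypt's automatic currying, which is why the piped value ends up as the last argument.

```python
from functools import partial, reduce


def pipeline(value, *stages):
    """Feed value through each stage in turn: x f g h == h(g(f(x)))."""
    return reduce(lambda acc, stage: stage(acc), stages, value)


def map_(f, xs):
    """Two-argument map with the list last, as in map(f, l)."""
    return [f(x) for x in xs]


def inc(n):
    return n + 1


# [1,2,3] map(inc): the partially applied map receives the list last.
print(pipeline([1, 2, 3], partial(map_, inc)))  # [2, 3, 4]

# x f g: stages run left to right.
print(pipeline(1, inc, str))                    # 2 (as the string "2")
```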

1.7 Lookup and Generalised Lookup

# Simple property lookup (dot, precedence 90)
block.key
config.db.host

# Generalised lookup (evaluate RHS in block scope)
{ a: 3 b: 4 }.(a + b)        # 7
{ a: 3 b: 4 }.[a, b]         # [3, 4]
{ a: 3 b: 4 }."{a} and {b}"  # "3 and 4"

Warning: . binds very tightly (precedence 90). list head.name parses as list (head.name), not (list head).name. Use parentheses: (list head).name.

1.8 Anaphora (Implicit Parameters)

| Type | Numbered | Unnumbered | Scope |
|---|---|---|---|
| Expression | _0, _1, _2 | _ (each _ = new param) | Expression |
| Block | •0, •1, •2 | • (each • = new param) | Block |
| String | {0}, {1}, {2} | {} (each {} = new param) | String literal |

# Expression anaphora
[1,2,3] map(_0 * _0)          # square: [1,4,9]
[1,2,3] filter(_ > 1)         # [2,3]

# Block anaphora (• = bullet, Option-8 on Mac)
{ x: • y: • }                 # two-param function returning block
[[1,2],[3,4]] map({ x: • y: • } uncurry)

# Pseudo-lambda via block anaphora + generalised lookup
{ x: • y: • }.(x + y)         # anonymous two-param function

# String anaphora
[1,2,3] map("item: {}")       # ["item: 1", "item: 2", "item: 3"]
"{1},{0}"                      # two-param function, reversed order

Important: anaphora cannot be nested. For complex cases, use named functions.

1.9 Imports

# Unit-level import (available everywhere in file)
{ import: "lib.eu" }

# Named import (access as namespace)
{ import: "cfg=config.yaml" }
host: cfg.host

# Multiple imports
{ import: ["helpers.eu", "cfg=config.eu"] }

# Format override
{ import: "data=yaml@records.txt" }

# Scoped import (on specific declaration)
` { import: "math.eu" }
calculations: { result: advanced-calc(10) }

# Git import
{ import: { git: "https://github.com/user/lib"
            commit: "abc123def456"
            import: "helpers.eu" } }

# Streaming imports (lazy, for large files)
{ import: "events=jsonl-stream@events.jsonl" }
{ import: "rows=csv-stream@big.csv" }
{ import: "lines=text-stream@log.txt" }

Import resolution order: relative paths are resolved by searching:

  1. The directory containing the importing .eu file (source-relative)
  2. The directories on the lib path (-L flags and CWD)

This means a file at lib/utils.eu that imports "helpers/misc.eu" will find lib/helpers/misc.eu without needing -L lib on the command line. This works transitively, so lib/helpers/misc.eu can in turn import "sub/detail.eu" and it will resolve as lib/helpers/sub/detail.eu.
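
The search order can be sketched in Python as follows. resolve_import is a hypothetical helper that models only the two steps listed above (source-relative first, then each lib path directory); the temporary directory layout is invented for the demonstration.

```python
import tempfile
from pathlib import Path


def resolve_import(importing_file, target, lib_path):
    """Return the first existing candidate, or None if nothing matches."""
    candidates = [Path(importing_file).parent / target]            # 1. source-relative
    candidates += [Path(directory) / target for directory in lib_path]  # 2. lib path
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lib" / "helpers").mkdir(parents=True)
    (root / "lib" / "utils.eu").write_text("")
    (root / "lib" / "helpers" / "misc.eu").write_text("")

    # lib/utils.eu imports "helpers/misc.eu"; the lib path holds only the root.
    found = resolve_import(root / "lib" / "utils.eu", "helpers/misc.eu", [root])
    found_relative = found.relative_to(root).as_posix()
    missing = resolve_import(root / "lib" / "utils.eu", "absent.eu", [root])

print(found_relative)  # lib/helpers/misc.eu
print(missing)         # None
```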

1.10 Quoted Identifiers

Single quotes turn any character sequence into a normal identifier:

home: {
  '.bashrc': false
  'notes.txt': true
}
z: home.'notes.txt'

2. Operator Precedence Table

All values verified against src/core/metadata.rs (named_precedence) and lib/prelude.eu operator metadata.

From highest (tightest) to lowest binding:

| Prec | Name | Assoc | Operators | Description |
|---|---|---|---|---|
| 95 | -- | prefix | | Head (tight prefix) |
| 90 | lookup | left | . (built-in) | Property lookup |
| 90 | call | left | (built-in) | Function call |
| 88 | bool-unary | prefix | !, ¬ | Boolean negation |
| 88 | bool-unary | postfix | | Not-null check (true if not null) |
| 88 | -- | -- | ∘, ; | Composition (actual prec 88 from prelude) |
| 85 | exp | right | ^ | Power |
| 85 | exp | -- | !! (nth) | Indexing |
| 80 | prod | left | *, /, ÷, % | Multiplication, floor division, precise division, floor modulo |
| 75 | sum | left | +, - | Addition, subtraction |
| 60 | shift | -- | (shift ops) | Reserved |
| 55 | bitwise | -- | (bitwise ops) | Reserved |
| 50 | cmp | left | <, >, <=, >= | Comparison |
| 45 | append | left | ++ | List concatenation |
| 45 | append | left | << | Deep merge |
| 42 | map | left | <$> | Functor map |
| 40 | eq | left | =, != | Equality |
| 35 | bool-prod | left | &&, ∧ | Logical AND |
| 30 | bool-sum | left | ||, ∨ | Logical OR |
| 20 | cat | left | (catenation) | Juxtaposition / pipeline |
| 10 | apply | right | @ | Function application |
| 5 | meta | left | //, //<<, //=, //=>, //=?, //!?, //!, //!! | Metadata and assertions |

Named precedence levels for use in operator metadata: :lookup, :call, :bool-unary, :exp, :prod, :sum, :shift, :bitwise, :cmp, :append, :map, :eq, :bool-prod, :bool-sum, :cat, :apply, :meta.

User-defined operators default to left-associative, precedence 50 (:cmp level). Set custom values via metadata:

` { associates: :right precedence: :sum }
(x +++ y): x + y

Composition operators ∘ and ; are defined at precedence 88 in the prelude (between bool-unary and exp):

  • f ∘ g — compose right-to-left (g then f), right-associative
  • f ; g — compose left-to-right (f then g), left-associative
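
The two directions of composition are easy to confuse, so here is a Python model of each. compose and then_ are hypothetical names standing in for the two operators; the sketch assumes single-argument functions.

```python
def compose(f, g):
    """Model of f ∘ g: apply g first, then f."""
    return lambda x: f(g(x))


def then_(f, g):
    """Model of f ; g: apply f first, then g."""
    return lambda x: g(f(x))


def inc(n):
    return n + 1


def double(n):
    return n * 2


print(compose(inc, double)(5))  # 11: double first, then inc
print(then_(inc, double)(5))    # 12: inc first, then double
```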

3. Top 30 Prelude Functions

All signatures verified against lib/prelude.eu. Pipeline style shown where idiomatic (catenated argument is always the last parameter).

3.1 List Functions

map(f, l) — transform each element

[1, 2, 3] map(inc)           # [2, 3, 4]
[1, 2, 3] map(* 10)          # [10, 20, 30]
["a","b"] map(str.to-upper)  # ["A", "B"]

filter(p?, l) — keep elements satisfying predicate

[1,2,3,4,5] filter(> 3)      # [4, 5]
[1,2,3,4,5] filter(pos?)     # [1, 2, 3, 4, 5]

foldl(op, i, l) — left fold

foldl(+, 0, [1,2,3,4,5])     # 15
[1,2,3] foldl(+, 0)          # ERROR: use foldl(+, 0, [1,2,3])

Note: foldl takes 3 args. In pipeline: the list is the last arg, so [1,2,3,4,5] foldl(+, 0) does NOT work — you must call foldl(+, 0, [1,2,3,4,5]) directly, or partially apply:

sum: foldl(+, 0)
[1,2,3,4,5] sum              # 15

foldr(op, i, l) — right fold

foldr(cons, [], [1,2,3])      # [1, 2, 3] (identity)
foldr(++, [], [[1,2],[3,4]])  # [1, 2, 3, 4]

head(xs) — first element (panics if empty)

[10, 20, 30] head            # 10

tail(xs) — all but first (panics if empty)

[10, 20, 30] tail            # [20, 30]

cons(h, t) — prepend element to list

cons(0, [1, 2, 3])           # [0, 1, 2, 3]

reverse(l) — reverse a list

[1, 2, 3] reverse            # [3, 2, 1]

count(l) — number of elements

[10, 20, 30] count           # 3

zip(l1, l2) — pair elements from two lists

zip([:a,:b,:c], [1,2,3])     # [[:a,1], [:b,2], [:c,3]]

concat(ls) — flatten one level of nesting

concat([[1,2], [3], [4,5]])   # [1, 2, 3, 4, 5]

mapcat(f, l) — map then concatenate (flatMap)

["ab","cd"] mapcat(str.letters)  # ["a","b","c","d"]

take(n, l) — first n elements

[1,2,3,4,5] take(3)          # [1, 2, 3]

drop(n, l) — remove first n elements

[1,2,3,4,5] drop(3)          # [4, 5]

any(p?, l) — true if any element satisfies p?

[1, 2, 3] any(> 2)           # true
[1, 2, 3] any(> 5)           # false

all(p?, l) — true if all elements satisfy p?

[2, 4, 6] all(> 0)           # true
[2, 4, 6] all(> 3)           # false

3.2 Block Functions

keys(b) — list of keys (as symbols)

{ a: 1 b: 2 c: 3 } keys     # [:a, :b, :c]

values(b) — list of values

{ a: 1 b: 2 c: 3 } values   # [1, 2, 3]

elements(b) — list of [key, value] pairs

{ a: 1 b: 2 } elements       # [[:a, 1], [:b, 2]]

merge(b1, b2) — shallow merge (b2 overrides b1)

merge({ a: 1 }, { b: 2 })    # { a: 1 b: 2 }
{ a: 1 } { a: 2 }            # { a: 2 } (catenation = merge)

has(s, b) — does block have key s? (s is a symbol)

{ a: 1 b: 2 } has(:a)        # true
{ a: 1 b: 2 } has(:z)        # false

lookup(s, b) — look up key s (panics if missing)

{ a: 1 b: 2 } lookup(:b)     # 2

lookup-or(s, d, b) — look up key s with default d

{ a: 1 } lookup-or(:z, 99)   # 99
{ a: 1 } lookup-or(:a, 99)   # 1

deep-find(k, b) — find all values for key k at any depth

{ a: { x: 1 } b: { x: 2 } } deep-find(:x)    # [1, 2]
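
A recursive search of this kind can be modelled in Python as below. This deep_find is a hypothetical re-implementation over nested dictionaries; the traversal order, and whether the search continues inside a value that already matched, are assumptions rather than documented behaviour.

```python
def deep_find(key, block):
    """Collect every value stored under key at any depth of nested dicts."""
    found = []
    for k, value in block.items():
        if k == key:
            found.append(value)
        if isinstance(value, dict):
            # Assumption: keep searching inside nested blocks, depth-first.
            found.extend(deep_find(key, value))
    return found


print(deep_find("x", {"a": {"x": 1}, "b": {"x": 2}}))  # [1, 2]
print(deep_find("z", {"a": {"x": 1}}))                 # []
```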

deep-query(pattern, b) — query with dot-separated glob pattern

Patterns: bare foo = **.foo; * = one level; ** = any depth.

{ a: { b: { c: 1 } } } deep-query("a.b.c")    # [1]
{ a: { x: 1 } b: { x: 2 } } deep-query("*.x") # [1, 2]

3.3 String Functions (str namespace)

str.split-on(re, s) — split string on regex

"one-two-three" str.split-on("-")      # ["one", "two", "three"]
"a.b.c" str.split-on("[.]")           # ["a", "b", "c"]

str.join-on(sep, l) — join list with separator

["a", "b", "c"] str.join-on(", ")     # "a, b, c"

str.to-upper(s) — convert to upper case

"hello" str.to-upper                   # "HELLO"

str.to-lower(s) — convert to lower case

"GOODBYE" str.to-lower                 # "goodbye"

str.matches?(re, s) — does regex match full string?

"hello" str.matches?("^h.*o$")        # true
"hello" str.matches?("^H")            # false

3.4 Combinators

See also: identity(v), const(k, _), compose(f, g, x), flip(f, x, y), complement(p?), curry(f, x, y), uncurry(f, l), cond(l, d).


4. Pipeline Patterns and Idioms

4.1 Basic Pipeline

Read left to right: data flows through transformations.

[1, 2, 3, 4, 5]
  filter(> 2)
  map(* 10)
  reverse
# Result: [50, 40, 30]

4.2 Pipeline with Named Stages

Use :suppress to hide intermediate values from output:

` :suppress
raw-data: [
  { name: "alice" score: 85 }
  { name: "bob" score: 92 }
  { name: "charlie" score: 78 }
]

result: raw-data
  filter(_.score >= 90)
  map(_.name)
  map(str.to-upper)

4.3 Conditional Pipeline

Use then(true-val, false-val) at the end of a pipeline:

[1, 2, 3] count (> 2) then("many", "few")   # "many"

Or use when(p?, f, x) to conditionally transform:

5 when(> 3, * 10)   # 50 (condition met, apply * 10)
2 when(> 3, * 10)   # 2  (condition not met, pass through)

4.4 Configuration Layering with Deep Merge

base: {
  server: { host: "0.0.0.0" port: 8080 workers: 4 }
  logging: { level: "info" format: "json" }
}

production: base << {
  server: { workers: 16 }
  logging: { level: "warn" }
}

development: base << {
  server: { host: "localhost" }
  logging: { level: "debug" format: "text" }
}

4.5 Building Blocks from Lists

# Zip keys and values into a block
zip-kv([:x, :y, :z], [1, 2, 3])
# Result: { x: 1 y: 2 z: 3 }

# Reconstruct from element pairs
[[:a, 1], [:b, 2]] block
# Result: { a: 1 b: 2 }

# Merge a list of blocks
[{a: 1}, {b: 2}, {c: 3}] merge-all
# Result: { a: 1 b: 2 c: 3 }

4.6 Transforming Block Values

# Apply function to all values
{ a: 1 b: 2 c: 3 } map-values(* 10)
# Result: { a: 10 b: 20 c: 30 }

# Transform keys
{ a: 1 b: 2 } map-keys(sym ∘ str.prefix("x-") ∘ str.of)
# Result: { x-a: 1 x-b: 2 }

4.7 Nested Block Modification

config: { server: { db: { port: 5432 } } }

# Set a nested value
config alter([:server, :db, :port], 3306)

# Apply a function to a nested value
config update([:server, :db, :port], inc)

# Shallow merge into a nested block (replaces nested blocks)
config merge-at([:server, :db], { host: "10.0.0.1" })

# Deep merge into a nested block (preserves nested blocks)
config deep-merge-at([:server, :db], { host: "10.0.0.1" })

4.8 Grouping and Sorting

# Sort numbers
[5, 3, 1, 4, 2] qsort(<)           # [1, 2, 3, 4, 5]
[30, 10, 20] sort-nums              # [10, 20, 30]

# Sort by extracted key
people sort-by-num(_.age)

# Group by key function
items group-by(_.category)

4.9 Composition

# Right-to-left composition (g then f)
shout: str.to-upper ∘ str.suffix("!")
"hello" shout   # "HELLO!"

# Left-to-right composition (f then g)
process: filter(> 0) ; map(* 2)
[-1, 2, -3, 4] process   # [4, 8]

4.10 Handling Data from External Sources

# CSV (values are always strings, use num to convert)
{ import: "rows=transactions.csv" }
total: rows map(_.amount num) foldl(+, 0)

# Command-line arguments
name: io.args head-or("World")
greeting: "Hello, {name}!"

# Environment variables
home: io.env lookup-or(:HOME, "/tmp")

4.11 Infinite Lists

# Infinite repetition
repeat(:x) take(4)                   # [:x, :x, :x, :x]

# Infinite sequence of integers
ints-from(1) take(5)                 # [1, 2, 3, 4, 5]

# Range (finite)
range(1, 6)                          # [1, 2, 3, 4, 5]

# Infinite iteration
iterate(* 2, 1) take(6)             # [1, 2, 4, 8, 16, 32]

5. Common Pitfalls

5.1 Catenation Applies the LAST Argument

When using pipeline style, the catenated value becomes the last parameter. [1,2,3] map(inc) means map(inc, [1,2,3]) because the list is the last argument.

For foldl(op, init, list), you cannot write [1,2,3] foldl(+, 0) — you need either foldl(+, 0, [1,2,3]) or define a partial: sum: foldl(+, 0) then [1,2,3] sum.

5.2 Dot Binds Tighter Than Catenation

. has precedence 90 vs catenation at 20. So:

list head.name    # WRONG: parses as list (head.name)
(list head).name  # RIGHT: get head, then lookup .name

The ↑ prefix operator (precedence 95) binds even tighter: ↑xs.name = (↑xs).name.

5.3 No Whitespace Before Parentheses in Calls

f(x)    # function call
f (x)   # catenation: f is piped into (x), i.e. x(f)

5.4 map vs mapcat

  • map(f, l) — applies f to each element, preserving list structure
  • mapcat(f, l) — applies f (which must return a list) and concatenates all results

["ab","cd"] map(str.letters)     # [["a","b"], ["c","d"]]
["ab","cd"] mapcat(str.letters)  # ["a", "b", "c", "d"]

5.5 lookup vs Dot Lookup

  • block.key — compile-time property access (key must be a literal name known at compile time)
  • lookup(:key, block) / block lookup(:key) — runtime dynamic lookup using a symbol value

Use lookup / lookup-or when the key is computed or stored in a variable.

5.6 merge vs deep-merge (<<)

  • merge(a, b) — shallow merge; nested blocks in b completely replace those in a
  • deep-merge(a, b) / a << b — recursively merges nested blocks, but lists are still replaced entirely

5.7 has Takes a Symbol, Not a String

{ a: 1 } has(:a)       # true
{ a: 1 } has("a")      # WRONG: "a" is a string, not symbol :a

Use sym("a") to convert a string to a symbol if needed.

5.8 deep-find Takes a Symbol, Not a String

{ a: { x: 1 } } deep-find(:x)     # [1] — correct
{ a: { x: 1 } } deep-find("x")    # WRONG — takes a symbol

deep-find, deep-find-first, and deep-find-paths take a symbol key, matching the rest of the prelude (has(:key), lookup-or(:key, ...)).

5.9 Self-Reference Creates Infinite Recursion

name: "foo"
x: { name: name }    # INFINITE RECURSION: inner name refers to itself

The inner name shadows the outer one. Use a different name or generalised lookup.

5.10 Anaphora Scope Limits

  • Expression anaphora (_, _0) do not cross commas, catenation boundaries, or function argument lists
  • Block anaphora (•, •0) are scoped to the enclosing block
  • String anaphora ({}, {0}) are scoped to the enclosing string
  • None of the anaphora types can be nested

5.11 Functions That Do NOT Exist

The following are commonly assumed but are not in the prelude:

  • str.replace — does not exist
  • str.trim — does not exist
  • str.starts-with? — does not exist
  • str.ends-with? — does not exist
  • str.contains? — does not exist
  • flatten — use concat (flattens one level)
  • unique — does not exist in prelude
  • abs — does not exist (use if(x < 0, negate(x), x))
  • even? / odd? — do not exist (use x % 2 = 0)
  • round / ceil — use floor and ceiling
  • select / dissoc — do not exist (use filter-items with by-key)

5.12 str.split-on Uses Regex, Not Literal Strings

"a.b.c" str.split-on(".")     # WRONG: "." matches any char
"a.b.c" str.split-on("[.]")   # RIGHT: escaped dot in regex

All str.match, str.split, str.matches? functions use regex patterns.


6. Quick CLI Reference

eu file.eu                  # Evaluate, output YAML
eu -j file.eu               # Output JSON
eu -x toml file.eu          # Output TOML
eu -x text file.eu          # Output plain text
eu -e 'expression'          # Evaluate inline expression
eu data.yaml transform.eu   # Merge inputs (left to right)
eu name=data.json app.eu    # Named input
eu -t target file.eu        # Render specific target
eu list-targets file.eu     # List available targets
eu test file.eu             # Run embedded tests
eu fmt file.eu              # Format source
eu fmt --write file.eu      # Format in place
eu -o output.json file.eu   # Write to file (format inferred)
eu -Q file.eu               # Suppress prelude
eu -B file.eu               # Batch mode (no ergonomic features)
eu --seed 42 file.eu        # Deterministic random
eu -e 'io.args' -- arg1 arg2  # Pass arguments

7. Complete str Namespace Reference

All functions verified against lib/prelude.eu:

| Function | Signature | Description |
|---|---|---|
| str.of | of(e) | Convert any value to string |
| str.split | split(s, re) | Split string s on regex re |
| str.split-on | split-on(re, s) | Split s on regex re (pipeline-friendly) |
| str.join | join(l, sep) | Join list l with separator sep |
| str.join-on | join-on(sep, l) | Join l with sep (pipeline-friendly) |
| str.match | match(s, re) | Match s against regex, return [full, groups...] |
| str.match-with | match-with(re, s) | Match s against re (pipeline-friendly) |
| str.extract | extract(re, s) | Extract first capture group or error |
| str.extract-or | extract-or(re, d, s) | Extract first capture group or default d |
| str.matches | matches(s, re) | All matches of re in s |
| str.matches-of | matches-of(re, s) | All matches (pipeline-friendly) |
| str.matches? | matches?(re, s) | True if re matches full string s |
| str.letters | letters(s) | List of individual characters |
| str.len | len(s) | String length in characters |
| str.fmt | fmt(x, spec) | Printf-style format |
| str.to-upper | to-upper(s) | Convert to upper case |
| str.to-lower | to-lower(s) | Convert to lower case |
| str.prefix | prefix(b, a) | Prepend b onto a |
| str.suffix | suffix(b, a) | Append b onto a |
| str.base64-encode | base64-encode(s) | Base64 encode |
| str.base64-decode | base64-decode(s) | Base64 decode |
| str.sha256 | sha256(s) | SHA-256 hash (lowercase hex) |

8. Assertion and Test Operators

All at precedence 5 (:meta).

Test expectations (return booleans, for use in test harnesses):

| Operator | Description |
|---|---|
| e //= v | Test e equals v, return boolean |
| e //=? f | Test f(e) is true, return boolean |

Assertions (panic on failure, return e on success):

| Operator | Description |
|---|---|
| e //=> v | Assert e equals v, panic with expected/actual on failure |
| e //! | Assert e is true, panic on failure |

Deprecated (use complement with positive forms instead):

| Operator | Replacement |
|---|---|
| e //!? f | e //=? complement(f) |
| e //!! | Removed — negate the condition instead |

Metadata operators:

| Operator | Description |
|---|---|
| e // m | Attach metadata block m to value e |
| e //<< m | Merge m into existing metadata of e |

Design Philosophy

eucalypt, the language, is unorthodox in many respects -- probably more than you might realise on first acquaintance.

People tend to have deep-seated and inflexible opinions about programming languages and language design and will quite possibly find something in here that they have a kneejerk reaction against.

However, the design is not unprincipled and, while it is experimental in some respects, I believe it's internally consistent. Several aspects of the design and the aesthetic are driven by the primary use case, templating and generating YAML. Maybe by exploring some of the inspiration and philosophy behind the language itself, I can pre-empt some of the knee jerks.

Accept crypticality for minimal intrusion

eucalypt is first and foremost a tool, rather than a language. It is intended to replace generation and transformation processes on semi-structured data formats. Many or most uses of eucalypt the language should just be simple one-liner tags in YAML files, or maybe eucalypt files that are predominantly data rather than manipulation.

The eucalypt language is the depth behind these one-liners that allows eucalypt to accommodate increasingly ambitious use cases without breaking the paradigm and reaching for a general purpose imperative scripting language or the lowest common denominator of text-based templating languages.

The pre-eminence of one-liners and small annotations and "logic mark-up", means that eucalypt often favours concise and cryptic over wordy and transparent. This is a controversial approach.

  • eucalypt logic should "get out of the way" of the data. Templating is attractive precisely because the generating source looks very like the result. Template tags are often short (with "cryptic" delimiters -- {{}}, <%= %>, [| ]...) because these are "marking up" the data, which is the main event. At the same time, the tags are often "noisy" or visually disruptive to ensure they cannot be ignored. eucalypt, via operator and bracket definitions, picks and chooses from a similar palette of expressive effects to try and be a sympathetic cohabitee with its accompanying data.

  • There are many cases where it makes sense to resist offering an incomplete understanding in favour of demanding full understanding. For example, it is spurious to say that bind(x, f) gives more understanding of what is going on than x >>= f -- unless you understand the monad abstraction and the role of bind in it, you gain nothing useful from the ideas that the word bind connotes when you are trying to understand program text.

  • eucalypt just plain ignores the notion that program text should be readable as English text. This (well motivated) idea has made a resurgence in recent years through the back door of internal DSLs and "fluent" Java interfaces. There is much merit in languages supple enough to allow the APIs to approach the natural means of expression of the problem domain. However, problem domains frequently have their own technical jargon and notation which suit their purpose better than natural language so it cuts both ways. Program text should be approachable by its target audience but that does not mean it should make no demands of its target audience.

These stances lead directly to several slightly esoteric aspects of eucalypt that may be obnoxious to some:

  • eucalypt tends to be operator-heavy. Operators are concise (if cryptic) and the full range of unicode is available to call upon. Using operators keeps custom logic visually out of the way of the data whilst also signposting it to attract closer attention.

  • eucalypt lets you define your own operators and specify their precedence and associativity (which are applied at a relatively late stage in the evaluation pipeline -- operator soup persists through the initial parse). There are no ternary operators.

  • For absolute minimal intrusion, merely the act of placing elements next to each other ("catenation"), x f, is meaningful in eucalypt. By default this is pipeline-order function application, but blocks can be applied as functions to make common transformations, like block merge, very succinct.

  • For even more power, eucalypt has some recent experimental features. You can alter the meaning of catenation via idiot brackets [1] («x y»: ...): the ability to define handling of arbitrary unicode bracket pairs. This is inspired by the idiom brackets that can be used to express applicative styles in functional programming [2], but they can be used much more generally; you could use them to define ternary operators, for instance.

  • You can also customise block handling by reinterpreting it as a monadic pattern similar to Haskell's do notation. Such block handling can be assigned to unicode bracket pairs. This is very experimental but can be used to tidy up random number patterns for instance which are currently awkward in the purely functional context of eucalypt.

Cohabitation of code and data

Just like templates, eucalypt source (or eucalypt-tagged YAML) should be almost entirely data.

The idea behind eucalypt is to adopt the basic maps-and-arrays organisation philosophy of these data formats but make the data active -- allowing lambdas to live in and amongst it and operate on it and allowing the data to express dispositions towards its environment by addition of metadata that controls import, export, and execution preferences.

eucalypt therefore collapses the separation of code and data to some degree. You can run eu against a mixture of YAML, JSON and eucalypt files and all the data and logic appear there together in the same namespace hierarchy. The namespace hierarchy just is the data.

However, code and data aren't unified in the sense of Lisp for instance. eucalypt is not homoiconic. The relationship is more like cohabitation; code lives in amongst the data it operates on but is stripped out before export.

Nevertheless eucalypt is heavily inspired by Lisp and aims for a similar fluidity through:

  • lazy evaluation (going some way towards matching uses of Lisp macros which control evaluation order -- in eucalypt, if is just a function)
  • economical syntax to facilitate (future) manipulation of code as data

Simplicity

  • eucalypt values simplicity in the sense of fewer moving parts (and therefore, hopefully, fewer things to go wrong). It values ease of use in the sense of offering a rich and powerful toolkit. You may not think it achieves either.

  • eucalypt values familiarity mostly in the "shallower" parts of the language where it only requires a couple of mental leaps for the average programmer in these areas -- the (ab)use of catenation being the key one.

  • However, eucalypt isn't ashamed of its dusty corners. Dusty corners are areas where novices and experts alike can get trapped and lose time but they're also rich seams for experimentation, innovation and discovery. If you have to venture too far off-piste to find what you need, we'll find a way to bring it onto the nursery slopes but we won't close off the mountain.


Footnotes

[1] Inspired by idiom brackets. If I didn't call them that, someone else would.

[2] Applicative Programming with Effects, Conor McBride and Ross Paterson. (2008) http://www.staff.city.ac.uk/~ross/papers/Applicative.html

Lazy Evaluation

This chapter is under construction.

Eucalypt uses lazy evaluation, meaning expressions are only evaluated when their values are needed. This has important consequences:

  • if is just a function (only the selected branch is ever evaluated)
  • Infinite lists are possible (e.g. repeat(1), ints-from(0))
  • Unused computations have no cost
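
One rough way to see why laziness lets if be an ordinary function is to model deferred arguments as closures. The Rust sketch below is only an analogy: lazy_if and ints_from are hypothetical names, and eucalypt itself needs no explicit closures for this. The branch that is not selected never runs, and an unbounded sequence is harmless as long as only a finite prefix is demanded.

```rust
/// A conditional written as an ordinary function. Each branch is passed as a
/// closure, so only the selected branch is ever run.
fn lazy_if<T>(
    cond: bool,
    then_branch: impl FnOnce() -> T,
    else_branch: impl FnOnce() -> T,
) -> T {
    if cond {
        then_branch()
    } else {
        else_branch()
    }
}

/// An unbounded sequence of integers. Elements are produced on demand, so
/// taking a finite prefix terminates.
fn ints_from(start: u64) -> impl Iterator<Item = u64> {
    start..
}
```

Calling lazy_if(true, || cheap(), || expensive()) never invokes expensive, which is the property that lets a lazy language dispense with a special conditional form.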

Eucalypt Architecture

This document provides a comprehensive overview of Eucalypt's design and implementation architecture.

Overview

Eucalypt is a functional programming language and tool for generating, templating, rendering, and processing structured data formats like YAML, JSON, and TOML. Written in Rust (~44,000 lines), it features a classic multi-phase compiler design with an STG (Spineless Tagless G-machine) runtime for lazy evaluation.

System Architecture

High-Level Pipeline

Source Code (*.eu files)
        │
        ▼
┌───────────────────────┐
│    Parsing Phase      │  src/syntax/
│  Lexer → Parser → AST │
└───────────────────────┘
        │
        ▼
┌───────────────────────┐
│     Core Phase        │  src/core/
│  Desugar → Cook →     │
│  Transform → Verify   │
└───────────────────────┘
        │
        ▼
┌───────────────────────┐
│   Evaluation Phase    │  src/eval/
│  STG Compile → VM →   │
│  Memory Management    │
└───────────────────────┘
        │
        ▼
┌───────────────────────┐
│    Export Phase       │  src/export/
│  JSON/YAML/TOML/etc   │
└───────────────────────┘

Module Structure

eucalypt/
├── src/
│   ├── bin/eu.rs           # CLI entry point
│   ├── lib.rs              # Library root
│   ├── common/             # Shared utilities
│   ├── syntax/             # Parsing and AST
│   │   └── rowan/          # Rowan-based incremental parser
│   ├── core/               # Core expression representation
│   │   ├── desugar/        # AST to core transformation
│   │   ├── cook/           # Operator fixity resolution
│   │   ├── transform/      # Expression transformations
│   │   ├── simplify/       # Optimisation passes
│   │   ├── inline/         # Inlining passes
│   │   ├── verify/         # Validation
│   │   └── analyse/        # Program analysis
│   ├── eval/               # Evaluation engine
│   │   ├── stg/            # STG syntax and compiler
│   │   ├── machine/        # Virtual machine
│   │   └── memory/         # Heap and garbage collection
│   ├── driver/             # CLI orchestration
│   ├── export/             # Output format generation
│   └── import/             # Input format parsing
├── lib/                    # Standard library (eucalypt source)
│   ├── prelude.eu          # Core prelude
│   ├── test.eu             # Test framework
│   └── markup.eu           # Markup utilities
└── docs/                   # Documentation

Parsing Pipeline

The parsing pipeline transforms source text into a structured AST using Rowan, an incremental parsing library that preserves full source fidelity including whitespace and comments.

Lexer

Implementation: src/syntax/rowan/lex.rs, src/syntax/rowan/string_lex.rs

The lexer (Lexer<C>) tokenises source text into a stream of SyntaxKind tokens:

pub struct Lexer<C: Iterator<Item = char>> {
    chars: Peekable<C>,           // Character stream with lookahead
    location: ByteIndex,           // Source position tracking
    last_token: Option<SyntaxKind>, // Context for disambiguation
    whitespace_since_last_token: bool,
    token_buffer: VecDeque<(SyntaxKind, Span)>,
}

Key features:

  • Unicode-aware identifier and operator recognition
  • Context-sensitive tokenisation (distinguishes OPEN_PAREN from OPEN_PAREN_APPLY)
  • String pattern lexing with interpolation support ("Hello {name}")
  • Preserves trivia (whitespace, comments) for full-fidelity AST

Token categories:

  • Delimiters: { } ( ) [ ] : , `
  • Identifiers: foo, 'quoted name', +, &&
  • Literals: Numbers, strings, symbols (:keyword)
  • Annotations: Whitespace, comments (#)
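
The context-sensitive treatment of parentheses noted under key features can be sketched as follows. This is a simplified assumption about the rule, not the code in lex.rs: the enum, the function and the last_token_can_be_applied parameter are illustrative names, loosely modelled on the last_token and whitespace_since_last_token fields above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ParenKind {
    /// Grouping parenthesis, as in `f (x)`
    OpenParen,
    /// Call parenthesis, as in `f(x)`
    OpenParenApply,
}

/// Decide how to tokenise a '(' from the lexer's context.
/// Assumed rule: a '(' that directly follows something that can be applied,
/// with no intervening whitespace, opens an argument list.
fn classify_open_paren(
    last_token_can_be_applied: bool,
    whitespace_since_last_token: bool,
) -> ParenKind {
    if last_token_can_be_applied && !whitespace_since_last_token {
        ParenKind::OpenParenApply
    } else {
        ParenKind::OpenParen
    }
}
```

Making this decision in the lexer means the parser sees two distinct token kinds and does not have to inspect whitespace itself.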

Parser

Implementation: src/syntax/rowan/parse.rs

The parser uses an event-driven recursive descent approach:

pub struct Parser<'text> {
    tokens: Vec<(SyntaxKind, &'text str)>,
    next_token: usize,
    sink_stack: Vec<Box<dyn EventSink>>,
    errors: Vec<ParseError>,
}

Parse events:

enum ParseEvent {
    StartNode(SyntaxKind),  // Begin syntax node
    Finish,                  // Complete node
    Token(SyntaxKind),      // Include token
}

Key parsing methods:

  • parse_unit() - Top-level file (no braces required)
  • parse_expression() / parse_soup() - Expression sequences
  • parse_block_expression() - { ... } blocks with declarations
  • parse_string_pattern() - Interpolated strings

The parser maintains error recovery for LSP support, collecting errors while continuing to parse.
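
To make the event-driven design more concrete, here is a minimal sketch of how a sink might fold a flat stream of events into a tree. It assumes string labels in place of SyntaxKind and a plain Tree type in place of Rowan's nodes; Event, Tree and build_tree are hypothetical names and not part of the eucalypt source.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartNode(&'static str), // Begin a node of the given kind
    Token(&'static str),     // Attach a token to the current node
    Finish,                  // Complete the current node
}

#[derive(Debug, PartialEq)]
enum Tree {
    Node(&'static str, Vec<Tree>),
    Leaf(&'static str),
}

/// Fold a stream of events into a tree. Returns None if the events are
/// unbalanced or a token appears outside any node.
fn build_tree(events: &[Event]) -> Option<Tree> {
    // Each stack entry is a node still under construction.
    let mut stack: Vec<(&'static str, Vec<Tree>)> = Vec::new();
    let mut root = None;
    for event in events {
        match event {
            Event::StartNode(kind) => stack.push((*kind, Vec::new())),
            Event::Token(text) => stack.last_mut()?.1.push(Tree::Leaf(*text)),
            Event::Finish => {
                let (kind, children) = stack.pop()?;
                let node = Tree::Node(kind, children);
                match stack.last_mut() {
                    Some(parent) => parent.1.push(node),
                    None => root = Some(node),
                }
            }
        }
    }
    if stack.is_empty() {
        root
    } else {
        None
    }
}
```

Separating event production from tree construction lets the same parser feed different sinks, which is one plausible reason for keeping a stack of sinks in the parser state.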

AST

Implementation: src/syntax/rowan/ast.rs

The AST uses a two-layer design:

  1. SyntaxNode (Rowan) - Rich, source-preserving tree
  2. AST Nodes - Typed wrappers via macros

macro_rules! ast_node {
    ($ast:ident, $kind:ident) => {
        pub struct $ast(SyntaxNode);
        impl AstNode for $ast { ... }
    }
}

Key AST types:

| Type | Purpose |
|---|---|
| Unit | Top-level file structure |
| Block | Enclosed { ... } block |
| Declaration | Property/function declaration |
| DeclHead | Declaration name (before :) |
| DeclBody | Declaration value (after :) |
| Soup | Unordered expression sequence |
| List | List expression [a, b, c] |
| Name | Identifier reference |
| Literal | Literal value |
| StringPattern | Interpolated string |

Core Expression Representation

The core representation is an intermediate language that facilitates powerful transformations while maintaining source information for error reporting.

Expression Type

Implementation: src/core/expr.rs

The Expr<T> enum (where T: BoundTerm<String>) represents all expression forms:

pub enum Expr<T> {
    // Variables
    Var(Smid, Var<String>),           // Free or bound variable

    // Primitives
    Literal(Smid, Primitive),          // Number, string, symbol, bool, null

    // Binding forms
    Let(Smid, LetScope<T>, LetType),   // Recursive let binding
    Lam(Smid, bool, LamScope<T>),      // Lambda abstraction

    // Application
    App(Smid, T, Vec<T>),              // Function application

    // Data structures
    List(Smid, Vec<T>),                // List literal
    Block(Smid, BlockMap<T>),          // Object/record literal

    // Operators (pre-cooking)
    Operator(Smid, Fixity, Precedence, T),
    Soup(Smid, Vec<T>),                // Unresolved operator soup

    // Anaphora (implicit parameters)
    BlockAnaphor(Smid, ...),
    ExprAnaphor(Smid, ...),

    // Access
    Lookup(Smid, T, String, Option<T>), // Property access

    // Metadata
    Meta(Smid, T, T),                   // Expression with metadata

    // Intrinsics
    Intrinsic(Smid, String),           // Built-in function reference

    // Error nodes
    ErrUnresolved, ErrRedeclaration, ...
}

Primitive types:

pub enum Primitive {
    Str(String),
    Sym(String),
    Num(Number),
    Bool(bool),
    Null,
}

Standard wrapper: RcExpr provides reference-counted immutable expressions with substitution and transformation methods.

Transformation Pipeline

The core pipeline transforms expressions through several phases:

AST → Desugar → Cook → Simplify → Inline → Verify → STG

Desugaring

Implementation: src/core/desugar/

Transforms parsed AST into core expressions:

  • Converts block declarations into recursive let bindings
  • Extracts targets and documentation metadata
  • Handles imports and cross-file references
  • Processes both native AST and embedded data (JSON/YAML)

Cooking

Implementation: src/core/cook/

Resolves operator precedence and anaphora:

  1. Fixity distribution - Propagate operator precedence info
  2. Anaphor filling - Infer missing implicit parameters ((+ 10) → (_ + 10))
  3. Shunting yard - Apply precedence climbing to linearise operator soup
  4. Anaphor processing - Wrap lambda abstractions around anaphoric expressions

Example transformation:

(+ 10)        →  (λ _ . (_ + 10))
a + b * c     →  (+ a (* b c))      // with standard precedence
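
The second line of the example can be reproduced with a small shunting-yard sketch. It handles only left-associative binary operators over string tokens, and the precedence numbers are illustrative assumptions rather than eucalypt's own (which come from operator declarations). The names precedence and cook_soup are hypothetical.

```rust
/// Illustrative precedence table: a higher number binds tighter.
fn precedence(op: &str) -> Option<u8> {
    match op {
        "+" | "-" => Some(1),
        "*" | "/" => Some(2),
        _ => None,
    }
}

/// Resolve a flat "soup" of operands and binary operators into a prefix
/// expression string, treating every operator as left-associative.
fn cook_soup(tokens: &[&str]) -> Option<String> {
    // Pop two operands and push the combined expression.
    fn reduce(operands: &mut Vec<String>, op: &str) -> Option<()> {
        let rhs = operands.pop()?;
        let lhs = operands.pop()?;
        operands.push(format!("({} {} {})", op, lhs, rhs));
        Some(())
    }

    let mut operands: Vec<String> = Vec::new();
    let mut operators: Vec<&str> = Vec::new();

    for &tok in tokens {
        match precedence(tok) {
            Some(p) => {
                // Reduce while the operator on top of the stack binds at
                // least as tightly as the incoming one.
                while let Some(&top) = operators.last() {
                    if precedence(top)? >= p {
                        operators.pop();
                        reduce(&mut operands, top)?;
                    } else {
                        break;
                    }
                }
                operators.push(tok);
            }
            None => operands.push(tok.to_string()),
        }
    }
    // Flush the remaining operators.
    while let Some(op) = operators.pop() {
        reduce(&mut operands, op)?;
    }
    if operands.len() == 1 {
        operands.pop()
    } else {
        None
    }
}
```

With these assumed precedences, cook_soup(&["a", "+", "b", "*", "c"]) yields Some("(+ a (* b c))"), matching the example above. Because precedence is consulted only at this stage, user-defined operators can be parsed as soup first and resolved later.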

Simplification and Inlining

Implementation: src/core/simplify/, src/core/inline/

  • Compression - Remove eliminated bindings
  • Pruning - Dead code elimination
  • Inlining - Inline marked expressions

Verification

Implementation: src/core/verify/

Validates transformed expressions before STG compilation:

  • Binding verification
  • Content validation

STG Compilation and Evaluation Model

Eucalypt uses a Spineless Tagless G-machine (STG) as its evaluation model, providing lazy evaluation with memoisation.

STG Syntax

Implementation: src/eval/stg/syntax.rs

The STG syntax represents executable code:

pub enum StgSyn {
    Atom { evaluand: Ref },              // Value or reference
    Case { scrutinee, branches, fallback }, // Pattern matching (evaluation point)
    Cons { tag: Tag, args: Vec<Ref> },   // Data constructor
    App { callable: Ref, args: Vec<Ref> }, // Function application
    Bif { intrinsic: u8, args: Vec<Ref> }, // Built-in intrinsic
    Let { bindings: Vec<LambdaForm>, body }, // Non-recursive let
    LetRec { bindings: Vec<LambdaForm>, body }, // Recursive let
    Ann { smid: Smid, body },            // Source annotation
    Meta { meta: Ref, body: Ref },       // Metadata wrapper
    DeMeta { scrutinee, handler, or_else }, // Metadata destructure
    BlackHole,                            // Uninitialized marker
}

Reference types:

pub enum Reference<T> {
    L(usize),  // Local environment index
    G(usize),  // Global environment index
    V(T),      // Direct value (Native)
}
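
As a hedged illustration of how the three reference forms differ, the sketch below resolves a reference against two flat slices standing in for the local and global environments. The enum is restated so the example is self-contained; resolve is a hypothetical helper, and real environments hold closures in allocated frames rather than plain values.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Reference<T> {
    L(usize), // Local environment index
    G(usize), // Global environment index
    V(T),     // Direct value
}

/// Look a reference up in the appropriate environment.
/// Returns None when an index is out of range.
fn resolve<T: Clone>(reference: &Reference<T>, locals: &[T], globals: &[T]) -> Option<T> {
    match reference {
        Reference::L(i) => locals.get(*i).cloned(),
        Reference::G(i) => globals.get(*i).cloned(),
        Reference::V(value) => Some(value.clone()),
    }
}
```

Index-based references avoid name lookups at run time: the compiler decides where each variable lives, and the machine only follows indices.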

Lambda forms control laziness:

  • Lambda - Function with explicit arity
  • Thunk - Lazy expression (evaluated and updated in-place)
  • Value - Already in WHNF (no update needed)

STG Compiler

Implementation: src/eval/stg/compiler.rs

The compiler transforms core expressions to STG syntax:

impl Compiler {
    fn compile_body(&mut self, expr: &RcExpr) -> ProtoSyntax;
    fn compile_binding(&mut self, expr: &RcExpr) -> ProtoBinding;
    fn compile_lambda(&mut self, expr: &RcExpr) -> ProtoSyntax;
    fn compile_application(&mut self, f: &RcExpr, args: &[RcExpr]) -> ProtoSyntax;
}

Key decisions:

  • Thunk creation: Expressions not in WHNF and used more than once become thunks
  • WHNF detection: Constructors, native values, and metadata wrappers are WHNF
  • Deferred compilation: ProtoSyntax allows deferring binding construction until environment size is known

Virtual Machine

Implementation: src/eval/machine/vm.rs

The STG machine is a state machine executing closures:

pub struct MachineState {
    root_env: SynEnvPtr,           // Empty root environment
    closure: SynClosure,           // Current (code, environment) pair
    globals: SynEnvPtr,            // Global bindings
    stack: Vec<Continuation>,      // Continuation stack
    terminated: bool,
    annotation: Smid,              // Current source location
}

Execution loop:

fn run(&mut self) {
    while !self.terminated {
        if self.gc_check_needed() {
            self.collect_garbage();
        }
        self.step();
    }
}

Instruction dispatch (handle_instruction):

| Code Form | Action |
|---|---|
| Atom | Resolve reference; push Update continuation if thunk |
| Case | Push Branch continuation, evaluate scrutinee |
| Cons | Return data constructor |
| App | Push ApplyTo continuation, evaluate callable |
| Bif | Execute intrinsic directly |
| Let | Allocate environment frame, continue in body |
| LetRec | Allocate frame with backfilled recursive references |

Continuations

Implementation: src/eval/machine/cont.rs

Four continuation types manage control flow:

  1. Branch - Pattern matching branches for CASE
  2. Update - Deferred thunk update (memoisation)
  3. ApplyTo - Pending arguments for function application
  4. DeMeta - Metadata destructuring handler

Lazy Evaluation

Laziness is achieved through thunks and updates:

  1. Thunk creation (compile time): Non-WHNF expressions become LambdaForm::Thunk
  2. Thunk evaluation (runtime): When a thunk is entered, push Update continuation
  3. Memoisation: After evaluation, update the environment slot with the result

// When entering a local reference
if closure.update() {
    stack.push(Continuation::Update { environment, index });
}

// After evaluation completes
Continuation::Update { environment, index } => {
    self.update(environment, index);  // Replace thunk with result
}
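
The update-in-place idea can be shown in isolation with a self-contained sketch. It assumes integer results and a flat vector of slots; Slot, Env and force are hypothetical names and do not reflect the machine's actual data structures. The black hole here plays the same role as an uninitialised marker: it flags a slot whose evaluation is in progress.

```rust
use std::cell::RefCell;

/// An environment slot: a suspended computation, its cached result, or a
/// marker for "evaluation in progress".
enum Slot {
    Thunk(Box<dyn FnOnce() -> i64>),
    Value(i64),
    BlackHole,
}

struct Env {
    slots: Vec<RefCell<Slot>>,
}

impl Env {
    /// Evaluate the slot at `index` at most once, caching the result.
    fn force(&self, index: usize) -> i64 {
        // Take the slot's contents, leaving a black hole while we evaluate.
        let slot = self.slots[index].replace(Slot::BlackHole);
        let value = match slot {
            Slot::Value(v) => v,
            Slot::Thunk(compute) => compute(),
            Slot::BlackHole => panic!("thunk depends on its own value"),
        };
        // The "update": overwrite the slot with the evaluated result.
        self.slots[index].replace(Slot::Value(value));
        value
    }
}
```

Forcing the same slot twice runs the suspended computation only once; the second call finds a Value and returns it directly.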

Memory Management and Garbage Collection

Eucalypt uses an Immix-inspired memory layout with mark-and-sweep collection.

Memory Layout

Implementation: src/eval/memory/heap.rs, src/eval/memory/bump.rs

Block (32KB)
├── Line 0 (128B) ┐
├── Line 1 (128B) │ 256 lines per block
├── ...           │
└── Line 255      ┘

Size classes:

  • Small (< 128 bytes) - Single line
  • Medium (128B - 32KB) - Multiple lines within block
  • Large (> 32KB) - Dedicated Large Object Block
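
The size classes follow directly from the block and line sizes. The sketch below encodes the boundaries exactly as listed, treating an allocation of exactly 32KB as medium; whether the real allocator counts the object header towards the size is not covered here, and the constant and function names are illustrative.

```rust
const LINE_SIZE: usize = 128;
const BLOCK_SIZE: usize = 32 * 1024;
const LINES_PER_BLOCK: usize = BLOCK_SIZE / LINE_SIZE;

#[derive(Debug, PartialEq)]
enum SizeClass {
    Small,  // Fits within a single line
    Medium, // Spans multiple lines within one block
    Large,  // Needs a dedicated large object block
}

/// Classify an allocation request by its size in bytes.
fn size_class(bytes: usize) -> SizeClass {
    if bytes < LINE_SIZE {
        SizeClass::Small
    } else if bytes <= BLOCK_SIZE {
        SizeClass::Medium
    } else {
        SizeClass::Large
    }
}
```

LINES_PER_BLOCK evaluates to 256, which agrees with the layout diagram above.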

Heap state:

pub struct HeapState {
    head: Option<BumpBlock>,           // Active small allocation
    overflow: Option<BumpBlock>,       // Active medium allocation
    recycled: VecDeque<BumpBlock>,     // Blocks with reusable holes
    rest: VecDeque<BumpBlock>,         // Used blocks pending collection
    lobs: Vec<LargeObjectBlock>,       // Large objects
}

Object Headers

Implementation: src/eval/memory/header.rs

Every object has a 16-byte header:

pub struct AllocHeader {
    bits: HeaderBits,                  // Mark bit + forwarded flag
    alloc_length: u32,                 // Object size
    forwarded_to: Option<NonNull<()>>, // For potential evacuation
}

Garbage Collection

Implementation: src/eval/memory/collect.rs

Mark phase:

  1. Reset line maps across all blocks
  2. Breadth-first root scanning from machine state
  3. Transitive closure following object references
  4. Mark lines containing live objects

Sweep phase:

  1. Scan line maps in each block
  2. Identify holes (2+ consecutive free lines)
  3. Move recyclable blocks to recycled list
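
Hole identification in the sweep phase amounts to finding runs of unmarked lines. The sketch below assumes the line map is a slice of booleans (true meaning the line holds a live object); find_holes is a hypothetical helper, not a function from collect.rs.

```rust
/// Return (start_line, length) for every run of at least `min_len`
/// consecutive unmarked lines in a block's line map.
fn find_holes(line_marks: &[bool], min_len: usize) -> Vec<(usize, usize)> {
    let mut holes = Vec::new();
    let mut run_start: Option<usize> = None;

    for (i, &marked) in line_marks.iter().enumerate() {
        match (marked, run_start) {
            // A free line begins a new run.
            (false, None) => run_start = Some(i),
            // A marked line ends the current run.
            (true, Some(start)) => {
                if i - start >= min_len {
                    holes.push((start, i - start));
                }
                run_start = None;
            }
            _ => {}
        }
    }
    // A run may extend to the end of the block.
    if let Some(start) = run_start {
        let len = line_marks.len() - start;
        if len >= min_len {
            holes.push((start, len));
        }
    }
    holes
}
```

Called with min_len set to 2, single free lines are skipped, matching the threshold given in step 2. A block that yields at least one hole would be a candidate for the recycled list.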

Collection triggering:

  • When --heap-limit-mib is set and limit exceeded
  • Check performed every 500 VM execution steps
  • Emergency collection on allocation failure
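
The first two triggering rules can be combined into a small policy object, sketched below under stated assumptions: GcPolicy and its fields are hypothetical, the heap limit is taken in MiB to mirror --heap-limit-mib, and emergency collection on allocation failure is not modelled.

```rust
const GC_CHECK_INTERVAL: u64 = 500;

struct GcPolicy {
    heap_limit_bytes: Option<usize>, // None when no heap limit is configured
    steps: u64,                      // VM steps executed so far
}

impl GcPolicy {
    fn new(heap_limit_mib: Option<usize>) -> Self {
        GcPolicy {
            heap_limit_bytes: heap_limit_mib.map(|mib| mib * 1024 * 1024),
            steps: 0,
        }
    }

    /// Call once per VM step. Returns true when a collection should run.
    fn should_collect(&mut self, heap_bytes_in_use: usize) -> bool {
        self.steps += 1;
        // Only consult the heap size on every 500th step.
        if self.steps % GC_CHECK_INTERVAL != 0 {
            return false;
        }
        match self.heap_limit_bytes {
            Some(limit) => heap_bytes_in_use > limit,
            None => false,
        }
    }
}
```

Checking only periodically keeps the cost of the test out of the hot path of the execution loop.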

See gc-implementation.md for detailed analysis.

The Prelude and Standard Library

Intrinsic Functions

Implementation: src/eval/intrinsics.rs, src/eval/stg/

Built-in functions are implemented in Rust and indexed by position.

Categories:

  • Control flow: __IF, __PANIC, __TRUE, __FALSE, __NULL
  • Lists: __CONS, __HEAD, __TAIL, __NIL, __REVERSE
  • Blocks: __MERGE, __DEEPMERGE, __ELEMENTS, __BLOCK, __LOOKUP
  • Arithmetic: __ADD, __SUB, __MUL, __DIV, __MOD, comparisons
  • Strings: __STR, __SPLIT, __JOIN, __MATCH, __FMT
  • Metadata: __META, __WITHMETA
  • Time: __ZDT, __ZDT.PARSE, __ZDT.FORMAT
  • I/O: __io.ENV, __io.EPOCHTIME
  • Emission: __RENDER, __EMIT* family

Each intrinsic implements the StgIntrinsic trait with direct access to machine state.

Prelude

Implementation: lib/prelude.eu

The prelude (~29KB) is written entirely in eucalypt, wrapping intrinsics with ergonomic functions:

List functions:

  • take, drop, nth (!!), fold/foldr, map, filter
  • append (++), concat, reverse, zip, group-by, qsort

Block functions:

  • merge-all, keys, values, map-kv, map-keys, map-values
  • lookup-path, alter-value, update-value

Combinators:

  • identity, const, compose (∘, ;), flip, curry, uncurry

String functions:

  • str.split, str.join, str.match, str.fmt, str.letters

Loading:

  • Prelude is embedded in the binary at compile time
  • Loaded by default unless --no-prelude / -Q flag is used

Driver and CLI Architecture

Entry Point

Implementation: src/bin/eu.rs, src/driver/options.rs

The CLI uses clap v4 with derive macros:

#[derive(Parser)]
struct EucalyptCli {
    #[command(subcommand)]
    command: Option<Commands>,
    files: Vec<String>,
    // Global options...
}

enum Commands {
    Run(RunArgs),
    Test(TestArgs),
    Dump(DumpArgs),
    Fmt(FmtArgs),
    Explain(ExplainArgs),
    ListTargets(ListTargetsArgs),
    Version,
}

Modes of Operation

Run (default):

eu file.eu                    # Implicit run
eu -e "expression" file.eu    # With evaluand
eu -x json file.eu            # JSON output
eu -t target file.eu          # Select target

Test:

eu test tests/              # Run all tests in directory
eu test -t specific file.eu  # Run specific test target

Format:

eu fmt file.eu              # Print formatted to stdout
eu fmt --write file.eu      # Modify in place
eu fmt --check file.eu      # Check formatting (exit 1 if needs format)

Dump:

eu dump ast file.eu          # Dump AST
eu dump desugared file.eu    # Dump after desugaring
eu dump stg file.eu          # Dump STG syntax
eu dump runtime              # Dump intrinsic definitions

Output Formats

Implementation: src/export/

Emitters for each format implement the Emitter trait:

  • YAML (yaml.rs) - Uses yaml_rust, supports tags from metadata
  • JSON (json.rs) - Uses serde_json
  • TOML (toml.rs) - Structured output
  • Text (text.rs) - Plain text
  • EDN (edn.rs) - Clojure-like format
  • HTML (html.rs) - Markup with serialisation

Input Handling

Inputs are merged in order:

  1. Prologue: Prelude, config files, build metadata, IO block
  2. Explicit: Files and options (-c/--collect-as)
  3. Epilogue: CLI evaluand (-e)

Names from earlier inputs are available to later inputs.

Key Design Decisions and Trade-offs

Why STG?

The STG machine provides a well-defined reference point for lazy functional language implementation:

  • Clear semantics for lazy evaluation with memoisation
  • Established compilation strategies
  • Potential for future optimisations

Trade-off: More complex than direct interpretation but provides a solid foundation.

Why Rowan for Parsing?

Rowan provides:

  • Incremental parsing for IDE support
  • Full source fidelity (preserves whitespace and comments)
  • Error recovery for partial parsing

Trade-off: More complex API than traditional parser generators.

Core as Intermediate Language

The core representation enables powerful transformations:

  • User-definable operator precedence resolved by syntax transformation
  • Block semantics (binding vs structuring) separated cleanly
  • Source information preserved through Smid for error reporting

Trade-off: Additional compilation phase, but enables experimentation.

Block Duality

Eucalypt blocks serve two roles:

  1. Name binding - Like let expressions
  2. Data structuring - Like objects/records

The core phase separates these into distinct Let and Block expressions.

Trade-off: Elegant surface syntax at cost of semantic complexity.

Immix-Inspired GC

The memory layout provides:

  • Efficient bump allocation
  • Cache-friendly organisation
  • Effective hole reuse

Current limitation: No evacuation/compaction (mark-sweep only).

See gc-implementation.md for detailed analysis.

Embedded Prelude

The prelude is:

  • Written in eucalypt itself
  • Compiled into the binary
  • Loaded unless explicitly disabled

Benefit: Dogfooding, consistent semantics

Trade-off: Slower startup if prelude is large

Performance Characteristics

Compilation

  • Parsing: O(n) with incremental support
  • Desugaring: O(n) pass over AST
  • Cooking: O(n log n) for operator precedence resolution
  • STG compilation: O(n) with binding analysis

Execution

  • Bump allocation: O(1) for most objects
  • Thunk updates: O(1) memoisation
  • GC: O(live objects) for marking, O(total blocks) for sweeping
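
The O(1) thunk update can be illustrated with a generic memoising cell. This is a sketch of the general technique only: Thunk and State are hypothetical types built on RefCell and bear no relation to the VM's actual closure representation.

```rust
use std::cell::RefCell;

enum State<T> {
    Pending(Box<dyn FnOnce() -> T>),
    Evaluating, // placeholder while the computation runs
    Forced(T),
}

/// A lazily evaluated value that is computed at most once.
pub struct Thunk<T> {
    state: RefCell<State<T>>,
}

impl<T: Clone> Thunk<T> {
    pub fn new(f: impl FnOnce() -> T + 'static) -> Self {
        Thunk {
            state: RefCell::new(State::Pending(Box::new(f))),
        }
    }

    /// Return the value, running the computation on first use only.
    pub fn force(&self) -> T {
        if let State::Forced(v) = &*self.state.borrow() {
            return v.clone();
        }
        // Take the pending computation out, leaving a marker behind.
        let previous = self.state.replace(State::Evaluating);
        let value = match previous {
            State::Pending(f) => f(),
            State::Forced(v) => v,
            State::Evaluating => panic!("thunk forced while being evaluated"),
        };
        // The update itself is a single overwrite of the cell.
        self.state.replace(State::Forced(value.clone()));
        value
    }
}
```

After the first call to force, later calls return the stored value without re-running the closure.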

Memory

  • 16-byte header overhead per object
  • Block-level allocation granularity
  • Large objects get dedicated blocks

Code Organisation Summary

Directory          Purpose                    Key Files
src/syntax/rowan/  Incremental parsing        lex.rs, parse.rs, ast.rs
src/core/          Expression representation  expr.rs
src/core/desugar/  AST to core                desugarer.rs
src/core/cook/     Operator resolution        shunt.rs, fixity.rs
src/eval/stg/      STG syntax and compiler    syntax.rs, compiler.rs
src/eval/machine/  Virtual machine            vm.rs, cont.rs, env.rs
src/eval/memory/   Heap and GC                heap.rs, collect.rs
src/driver/        CLI orchestration          options.rs, eval.rs
src/export/        Output formats             yaml.rs, json.rs
lib/               Standard library           prelude.eu

Further Reading

  • implementation.md - Brief implementation overview
  • gc-implementation.md - Detailed GC documentation
  • command-line.md - CLI usage guide
  • syntax.md - Language syntax reference
  • operators-and-identifiers.md - Operator definitions
  • anaphora-and-lambdas.md - Implicit parameter handling

Eucalypt Style Guide

Eucalypt is very flexible and allows you to write extremely ugly code.

Here is some general style guidance for writing eucalypt idiomatically.

List access

  • head/tail for list decomposition (variable-length, processing elements sequentially)
  • first/second/!! n for positional access into fixed-size tuples or records

Don't mix idioms in the same context. If you use second(xs), use first(xs) not head(xs). If you use !! 2, use !! 0 and !! 1 not first and second.

Conditionals

  • Prefer simple (unnested) conditionals.
  • Use <bool> then(a, b) over if(cond, a, b) for simple conditionals.
  • If nesting is unavoidable, use if; nested thens are confusing.
  • Never mix if() and then() in the same expression — it looks like they go together and is confusing.
  • Prefer restructuring to eliminate conditionals altogether: use nil? guards, max, min, default values (min-of-or, max-of-or), etc.
  • then(false, x) and then(true, x) are antipatterns. Refactor to disjunctions and conjunctions: cond ∧ x, cond ∨ x, etc. Use block bindings or parentheses to keep ∧/∨ separated from catenation pipelines:
    { ok: xs non-nil?
      result: xs map(f) sum
    }.(ok ∧ result > 0)
    

Catenation precedence

Catenation (juxtaposition / pipeline application) has very low operator precedence (20). All the common infix operators bind tighter, including ∧ (35), ∨ (30), = (40), + (75), etc. This means infix operators steal adjacent atoms from catenation pipelines:

  • xs f(a) + 1 parses as xs (f(a) + 1), so + grabs f(a) and 1
  • (k > 0) ∧ xs non-nil? parses as ((k > 0) ∧ xs) non-nil?, so ∧ grabs xs
  • xs tail ++ [0] parses as xs (tail ++ [0]), so ++ grabs tail and [0]

Fix with parens around the catenation: (k > 0) ∧ (xs non-nil?), (xs tail) ++ [0].

Pipelines

  • The clearest function definition is a straight pipeline. Structure programs to maximise them.
  • Prefer pipeline (catenation) style: xs head not head(xs), xs map(f) filter(g) not filter(g, map(f, xs)).
  • Use ; (compose) for point-free function definitions and embedding in pipelines: abs ; (+ 1).
  • ∘ is also acceptable, particularly in mathematical or strongly FP contexts.
  • Partial application for pipeline steps: map(area(p)), filter(v-spans(y)).

Parameter order

Choose parameter order to support partial application and pipeline style:

  • Put the "data" or "collection" parameter last so that partially applied functions slot into pipelines: g remove-node("dac") count-paths("svr") reads as a pipeline of transformations.
  • Put "configuration" or "small" parameters first so they can be fixed early: map(lookup-count(table)), foldl(dp-step(g), {}, order).
  • If a function will be used as a foldl accumulator, match the (acc, elem) signature: dfs-topo(g, state, node) allows foldl(dfs-topo(g), init, nodes).

When in doubt, ask: "how will this function most commonly be called?" and put the varying argument last.

Naming

  • Predicates end with ?: vertical?, nil?, at-y?.
  • Use descriptive names: make-edges not mk-edges.

Recursion

  • Prefer folds, scans, or specific prelude algorithms over explicit recursion where possible.

Documentation

  • Use backtick string metadata (` "...") for documenting declarations. Backtick metadata attaches to the next declaration, so only use it when the comment is specific to that declaration.
  • Use # comments for inline explanatory notes within blocks (e.g. section separators, notes that apply to a group of bindings rather than one declaration) and for disabling code.
  • Doc metadata should be markdown: use backquotes for param and function names.

Sections and anaphora

  • Prefer sections without superfluous brackets: iterate(+ 2, 0) not iterate((+ 2), 0), map(* 2) not map((* 2)).
  • Prefer sections over anaphora when a section suffices: map(* 2) not map(_ * 2).
  • Use anaphora when the expression genuinely needs more than a simple section: map(_ * _ + 1).

Blocks

  • Use blocks for local bindings: { x: ... y: ... }.(x + y), but limit this to one block; do not stack the construct.
  • Keep block "results" in .(...) concise, preferably simple pipelines or expressions, or even just {...}.result
  • Dynamic generalised lookup: a function can return a block whose names are then used as a namespace for subsequent pipelines, e.g. prepare(data).( edges take(k) ... ). Use very sparingly — it can defeat static analysis. Never nest or stack dynamic lookups.

Frequently Asked Questions

Getting Started

How do I install eucalypt?

On macOS, use Homebrew:

brew install curvelogic/homebrew-tap/eucalypt

On Linux or macOS without Homebrew, use the install script:

curl -sSf https://raw.githubusercontent.com/curvelogic/eucalypt/master/install.sh | sh

You can also download a binary from the GitHub releases page, or build from source with cargo install --path ..

Verify installation with:

eu version

How do I convert between data formats?

Pass a file in one format and specify the output format with -x or -j:

# YAML to JSON
eu data.yaml -j

# JSON to YAML (default output)
eu data.json

# YAML to TOML
eu data.yaml -x toml

What data formats does eucalypt support?

Input formats: YAML, JSON, JSON Lines (jsonl), TOML, EDN, XML, CSV, plain text, and eucalypt's own .eu syntax.

Output formats: YAML (default), JSON, TOML, EDN, and plain text.

Streaming input formats (for large files): jsonl-stream, csv-stream, text-stream.

How do I use eucalypt in a pipeline?

eu reads from stdin by default when used in a pipe and writes to stdout:

# Filter JSON from an API
curl -s https://api.example.com/data | eu -e 'items filter(.active)'

# Transform and re-export
cat data.yaml | eu transform.eu -j > output.json

Use -e to specify an expression to evaluate against the input data.

How do I pass arguments to a eucalypt program?

Use -- to separate eu flags from program arguments:

eu program.eu -- arg1 arg2 arg3

Inside your program, access them via io.args:

name: io.args head-or("World")
greeting: "Hello, {name}!"

Language

How do functions work in eucalypt?

Define functions with a parameter list after the name:

double(x): x * 2
result: double(21) //=> 42

Functions are curried -- applying fewer arguments than expected returns a partially applied function:

add(x, y): x + y
increment: add(1)
result: increment(9) //=> 10

What is catenation?

Catenation is eucalypt's pipeline syntax. Writing x f applies f to x as a single argument:

add-one(x): x + 1
result: 5 add-one //=> 6

Chain multiple transforms by writing them in sequence:

double(x): x * 2
add-one(x): x + 1
result: 5 double add-one //=> 11

This reads left to right: start with 5, double it (10), add one (11).

What are anaphora and when should I use them?

Anaphora are implicit parameters that let you define simple functions without naming them. There are three kinds:

Expression anaphora (_, _0, _1): turn an expression into a function.

squares: [1, 2, 3] map(_0 * _0) //=> [1, 4, 9]

String anaphora ({}, {0}, {1}): turn a string template into a function.

labels: [1, 2, 3] map("item-{}") //=> ["item-1", "item-2", "item-3"]

Block anaphora (•, •0, •1): turn a block into a function.

Use anaphora for simple, readable cases. For anything more complex, prefer a named function. See Anaphora for details.

Why is there no lambda syntax?

Eucalypt deliberately omits lambda expressions. Instead, use:

  1. Named functions for anything non-trivial
  2. Anaphora (_, {}) for simple one-liners
  3. Sections ((+ 1), (* 2)) for operator-based functions
  4. Partial application (add(1)) for curried functions

# All equivalent ways to add one:
add-one(x): x + 1
result1: [1, 2, 3] map(add-one) //=> [2, 3, 4]
result2: [1, 2, 3] map(_ + 1) //=> [2, 3, 4]
result3: [1, 2, 3] map(+ 1) //=> [2, 3, 4]

How does block merging work?

When you write one block after another (catenation), they merge:

base: { a: 1 b: 2 }
overlay: { b: 3 c: 4 }
merged: base overlay //=> { a: 1 b: 3 c: 4 }

The second block's values override the first. This is a shallow merge. For recursive deep merge, use the << operator:

base: { x: { a: 1 b: 2 } }
extra: { x: { c: 3 } }
result: base << extra
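
The difference between the two merges can be sketched outside eucalypt. The Rust below is a simplified model under assumptions: Value, block, shallow_merge and deep_merge are hypothetical names, values are limited to integers and nested blocks, and BTreeMap sorts keys rather than keeping declaration order.

```rust
use std::collections::BTreeMap;

/// Simplified value model: integers and nested blocks only.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Num(i64),
    Block(BTreeMap<String, Value>),
}

pub type Block = BTreeMap<String, Value>;

/// Convenience constructor for the examples.
pub fn block(pairs: Vec<(&str, Value)>) -> Block {
    pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect()
}

/// Shallow merge: every key in `overlay` replaces the key in `base` outright.
pub fn shallow_merge(base: &Block, overlay: &Block) -> Block {
    let mut out = base.clone();
    for (k, v) in overlay {
        out.insert(k.clone(), v.clone());
    }
    out
}

/// Deep merge: when both sides hold a block under the same key, recurse;
/// otherwise the overlay value wins.
pub fn deep_merge(base: &Block, overlay: &Block) -> Block {
    let mut out = base.clone();
    for (k, v) in overlay {
        let merged = match (out.get(k), v) {
            (Some(Value::Block(l)), Value::Block(r)) => Value::Block(deep_merge(l, r)),
            _ => v.clone(),
        };
        out.insert(k.clone(), merged);
    }
    out
}
```

In this model, merging { x: { a: 1 b: 2 } } with { x: { c: 3 } } shallowly leaves x as { c: 3 }, while the deep merge keeps a and b alongside c.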

How do I handle the lookup precedence gotcha?

The . (lookup) operator has higher precedence than catenation, so xs head.id parses as xs (head.id), not (xs head).id.

Use explicit parentheses:

data: [{ id: 1 }, { id: 2 }]
first-id: (data head).id //=> 1

See Syntax Gotchas for more.

Data Processing

How do I filter and transform lists?

Use map to transform and filter to select:

numbers: [1, 2, 3, 4, 5, 6]
small: numbers filter(< 4) //=> [1, 2, 3]
doubled: numbers map(* 2) //=> [2, 4, 6, 8, 10, 12]

Combine them in a pipeline:

result: [1, 2, 3, 4, 5, 6] filter(> 3) map(* 10) //=> [40, 50, 60]

How do I look up values in nested blocks?

Use chained . lookups for known paths:

config: { db: { host: "localhost" port: 5432 } }
host: config.db.host //=> "localhost"

For dynamic key lookup, use lookup with a symbol:

data: { name: "Alice" age: 30 }
field: data lookup(:name) //=> "Alice"

Use lookup-or to provide a default:

data: { name: "Alice" }
age: data lookup-or(:age, 0) //=> 0

How do I search deeply nested data?

Use deep-find for recursive key search:

# Finds all values for key "id" at any depth
ids: data deep-find("id")

Use lookup-path for a known sequence of keys:

data: { a: { b: { c: 42 } } }
result: data lookup-path([:a, :b, :c]) //=> 42

How do I sort data?

Sort lists with sort-nums or sort-strs:

names: ["Charlie", "Alice", "Bob"]
sorted-names: names sort-strs //=> ["Alice", "Bob", "Charlie"]
nums: [5, 1, 3, 2, 4]
sorted-nums: nums sort-nums //=> [1, 2, 3, 4, 5]

For sorting by a key, use sort-by-str or sort-by-num:

people: [{ name: "Zoe" age: 25 }, { name: "Amy" age: 30 }]
by-name: people sort-by-str(.name)
youngest: (by-name head).name //=> "Amy"

How do I work with dates?

Use t"..." literals for date-time values:

meeting: t"2024-03-15T14:30:00Z"
date-only: t"2024-03-15"
before: t"2024-01-01" < t"2024-12-31" //=> true

See Date, Time, and Random Numbers for parsing, formatting, and arithmetic.

Advanced

How do I attach metadata to declarations?

Use the backtick (`) prefix:

` "Compute the square of a number"
square(x): x * x

result: square(5) //=> 25

Metadata can be a string (documentation) or a block with structured data:

` { doc: "Custom operator" associates: :left precedence: 75 }
(l <+> r): l + r

How do imports work?

Imports are specified in declaration metadata using the import key:

{ import: "helpers.eu" }

result: helper-function(42)

For named imports (scoped access):

{ import: "cfg=config.eu" }

host: cfg.host

See Import Formats for the full syntax including git imports.

How do I write tests?

Use the //=> assertion operator to check values inline:

double(x): x * 2
result: double(21) //=> 42

If the assertion fails, eucalypt panics with a non-zero exit code. Other assertion operators:

x: 5
check1: (x > 3) //!
check2: (x = 0) //!!
check3: x //=? pos?

How do I generate random values?

Use io.random for a stream of random floats, and pass --seed for reproducible output:

die: random.int(6, io.random).value + 1

eu --seed 42 game.eu

See Random Numbers for the full API.

What are sets and how do I use them?

The set namespace provides set operations. Convert lists to sets with set.from-list:

sa: set.from-list([1, 2, 3, 4])
sb: set.from-list([3, 4, 5, 6])
common: sa set.intersect(sb) set.to-list //=> [3, 4]
combined: sa set.union(sb) set.to-list sort-nums //=> [1, 2, 3, 4, 5, 6]
diff: sa set.diff(sb) set.to-list sort-nums //=> [1, 2]

Syntax Cheat Sheet

A dense single-page reference covering all syntax forms, operators, common patterns, and key prelude functions.

Primitives

Type     Syntax         Examples
Integer  digits         42, -7, 0
Float    digits with .  3.14, -0.5
String   double quotes  "hello", "line\nbreak"
Symbol   colon prefix   :key, :name
Boolean  literals       true, false
Null     literal        null
ZDT      t"..." prefix  t"2024-03-15", t"2024-03-15T14:30:00Z"

Blocks

# Property declaration
name: expression

# Function declaration
f(x, y): expression

# Operator declaration (binary)
(l ++ r): expression

# Operator declaration (prefix / postfix)
(! x): expression
(x ******): expression

# Block literal
{ a: 1 b: 2 c: 3 }

# Commas are optional
{ a: 1, b: 2, c: 3 }

# Nested blocks
{ outer: { inner: "value" } }

Top-level unit: the file itself is an implicit block (no braces needed).

Lists

# List literal
[1, 2, 3]

# Empty list
[]

# Mixed types
[1, "two", :three, true]

String Interpolation

# Insert expressions with {braces}
"Hello, {name}!"

# String anaphora (defines a function)
"#{}"           # one-parameter function
"{0} and {1}"   # two-parameter function

Comments

# Line comment (to end of line)
x: 42 # inline comment

Declarations

Form              Syntax                Notes
Property          name: expr            Defines a named value
Function          f(x, y): expr         Named function with parameters
Block pattern     f({x y}): expr        Destructures block argument fields
Block rename      f({x: a y: b}): expr  Destructures with renamed bindings
List pattern      f([a, b, c]): expr    Destructures fixed-length list
Cons pattern      f([h : t]): expr      Destructures head and tail of list
Binary operator   (l op r): expr        Infix operator
Prefix operator   (op x): expr          Unary prefix
Postfix operator  (x op): expr          Unary postfix
Idiot bracket     ⟦ x ⟧: expr           Custom Unicode bracket pair

Idiot Brackets

Idiot brackets allow custom Unicode bracket pairs to wrap and transform expressions — a general bracket overloading mechanism.

# Declare a bracket pair function
⟦ x ⟧: my-functor(x)

# Use the bracket pair in expressions
result: ⟦ some-expression ⟧  # calls my-functor(some-expression)

Built-in bracket pairs: ⟦⟧, ⟨⟩, ⟪⟫, ⌈⌉, ⌊⌋, ⦃⦄, ⦇⦈, ⦉⦊, «», 【】, 〔〕, 〖〗, 〘〙, 〚〛.

Monadic Blocks

Monadic sequencing (like do-notation) is supported via bracket pairs or block metadata. A bracket block or annotated block followed by .expr desugars to a bind chain.

# Bracket pair definition — explicit functions
⟦{}⟧: { :monad bind: my-bind  return: my-return }

# Bracket pair definition — namespace reference
⟦{}⟧: { :monad namespace: my-monad }

# Block metadata forms (all followed by .return_expr):
{ :name decls }.expr                          # Form 1: bare symbol namespace
{ { monad: name } decls }.expr               # Form 2: monad key
{ { :monad namespace: name } decls }.expr    # Form 3: tagged namespace
{ { :monad bind: f return: r } decls }.expr  # Form 4: explicit inline

# ⟦ a: ma  b: mb ⟧.return_expr
# desugars to: bind(ma, (a): bind(mb, (b): return(return_expr)))
result: ⟦ a: ma  b: mb ⟧.(a + b)

All declarations are bind steps whose names are in scope for later declarations and the return expression. The return expression follows the closing bracket (or block) as .name, .(expr), or .[list].

Metadata Annotations

# Declaration metadata (backtick prefix)
` "Documentation string"
name: value

# Structured metadata
` { doc: "description" associates: :left precedence: 50 }
(l op r): expr

# Unit-level metadata (first expression in file)
{ :doc "Unit description" }
a: 1

Special metadata keys: :target, :suppress, :main, associates, precedence, import.

Function Application

# Parenthesised application (no whitespace before paren)
f(x, y)

# Catenation (pipeline style, single argument)
x f              # equivalent to f(x)
x f g h          # equivalent to h(g(f(x)))

# Partial application (curried)
add(1)           # returns a function adding 1

# Sections (operator with gaps)
(+ 1)            # function: add 1
(* 2)            # function: multiply by 2
(/)              # function: floor divide (two params)
(÷)              # function: precise divide (two params)

Lookup and Generalised Lookup

# Simple lookup
block.key

# Generalised lookup (evaluate RHS in block's scope)
{ a: 3 b: 4 }.(a + b)        # 7
{ a: 3 b: 4 }.[a, b]         # [3, 4]
{ a: 3 b: 4 }."{a} and {b}"  # "3 and 4"

Anaphora (Implicit Parameters)

Type        Numbered       Unnumbered                 Scope
Expression  _0, _1, _2     _ (each use = new param)   Expression
Block       •0, •1, •2     • (each use = new param)   Block
String      {0}, {1}, {2}  {} (each use = new param)  String

# Expression anaphora
map(_0 * _0)        # square each element
map(_ + 1)          # increment (each _ is a new param)

# Block anaphora (bullet = Option-8 on Mac)
{ x: •0 y: •1 }    # two-parameter block function

# String anaphora
map("item: {}")     # format each element

Operator Precedence Table

From highest to lowest binding:

Prec  Name        Assoc    Operators            Description
95    --          prefix                        Tight prefix (head)
90    lookup      left     .                    Field access / lookup
88    bool-unary  prefix   !, ¬                 Boolean negation
88    bool-unary  postfix                       Not-null check (true if not null)
85    exp         right    ^, ∘, ;              Power, composition
80    prod        left     *, /, ÷, %           Multiplication, floor division, precise division, floor modulo
75    sum         left     +, -                 Addition, subtraction
55    --          right                         List cons (prepend element)
50    cmp         left     <, >, <=, >=         Comparison
45    append      right    ++, <<               List append, deep merge
42    map         left     <$>                  Functor map
40    eq          left     =, !=                Equality
35    bool-prod   left     &&, ∧                Logical AND
30    bool-sum    left     ||, ∨                Logical OR
20    cat         left     (catenation)         Juxtaposition / pipeline
10    apply       right    @                    Function application
5     meta        right    //, //<<, //=, //=>  Metadata / assertions

User-defined operators default to left-associative, precedence 50. Set custom values via metadata: ` { precedence: 75 associates: :right }

Named precedence levels for use in metadata: lookup, call, bool-unary, exp, prod, sum, shift, bitwise, cmp, append, map, eq, bool-prod, bool-sum, cat, apply, meta.

Block Merge

# Catenation of blocks performs a shallow merge
{ a: 1 } { b: 2 }       # { a: 1 b: 2 }
{ a: 1 } { a: 2 }       # { a: 2 }

# Deep merge operator
{ a: { x: 1 } } << { a: { y: 2 } }  # { a: { x: 1 y: 2 } }

Imports

# Unit-level import
{ import: "lib.eu" }

# Named import
{ import: "cfg=config.eu" }

# Multiple imports
{ import: ["dep-a.eu", "dep-b.eu"] }

# Format override
{ import: "yaml@data.txt" }

# Git import
{ import: { git: "https://..." commit: "sha..." import: "file.eu" } }

Key Prelude Functions

Lists

Function        Description
head            First element
tail            All but first
cons(x, xs)     Prepend element
map(f)          Transform each element
filter(p?)      Keep elements matching predicate
foldl(f, init)  Left fold
foldr(f, init)  Right fold
sort-by(f)      Sort by key function
take(n)         First n elements
drop(n)         Remove first n
zip             Pair elements from two lists
zip-with(f)     Combine elements with function
flatten         Flatten nested lists one level
reverse         Reverse a list
count           Number of elements
range(a, b)     Integers from a to b-1
nil?            Is the list empty?
any?(p?)        Does any element match?
all?(p?)        Do all elements match?
unique          Remove duplicates

Blocks

Function                 Description
lookup(key)              Look up a key (symbol)
lookup-or(key, default)  Look up with default
has(key)                 Does block contain key?
keys                     List of keys (as symbols)
values                   List of values
elements                 List of {key, value} pairs
map-keys(f)              Transform keys
map-values(f)            Transform values
select(keys)             Keep only listed keys
dissoc(keys)             Remove listed keys
merge(b)                 Shallow merge
deep-merge(b)            Deep recursive merge
sort-keys                Sort by key name

Strings (str namespace)

Function                  Description
str.len(s)                String length
str.upper(s)              Upper case
str.lower(s)              Lower case
str.starts-with?(prefix)  Starts with prefix?
str.ends-with?(suffix)    Ends with suffix?
str.contains?(sub)        Contains substring?
str.matches?(regex)       Matches regex?
str.split(sep)            Split by separator
str.join(sep)             Join list with separator
str.replace(from, to)     Replace occurrences
str.trim                  Remove surrounding whitespace

Serialisation and Parsing

Function               Description
render(value)          Serialise to YAML string
render-as(fmt, value)  Serialise to string in named format
parse-as(fmt, str)     Parse string as structured data (inverse of render-as)

Formats for render-as: :yaml, :json, :toml, :text, :edn, :html. Formats for parse-as: :json, :yaml, :toml, :csv, :xml, :edn, :jsonl.

parse-as is safe for untrusted input — YAML !eu tags are never evaluated.

Combinators

Function                Description
identity                Returns its argument unchanged
const(k)                Always returns k
compose(f, g) or f ∘ g  Compose functions
flip(f)                 Swap argument order
complement(p?)          Negate a predicate
curry(f)                Curry a function taking a pair
uncurry(f)              Uncurry to take a pair

Numbers

Function               Description
num                    Parse string to number
abs                    Absolute value
negate                 Negate number
inc / dec              Increment / decrement
max(a, b) / min(a, b)  Maximum / minimum
even? / odd?           Parity predicates
zero? / pos? / neg?    Sign predicates
floor / ceil / round   Rounding

Arrays (arr namespace)

Function                                     Description
arr.zeros(shape)                             Create array of zeros; shape is a list of integers
arr.fill(shape, val)                         Create array filled with val
arr.from-flat(shape, vals)                   Create array from flat list of numbers
arr.get(a, coords)                           Element at coordinate list coords
arr.set(a, coords, val)                      New array with element at coords set to val
arr.shape(a)                                 Shape as list of integers
arr.rank(a)                                  Number of dimensions
arr.length(a)                                Total number of elements
arr.to-list(a)                               Flat list of elements in row-major order
arr.array?(x) / is-array?(x)                 Is x an array?
arr.transpose(a)                             Reverse all axes
arr.reshape(a, shape)                        Reshape (total elements must match)
arr.slice(a, axis, idx)                      Slice along axis at idx (reduces rank by 1)
arr.add(a, b) / arr.sub / arr.mul / arr.div  Element-wise arithmetic; b may be scalar
a !! coords                                  Index operator; for arrays, coords is a list e.g. [row, col]
arr.indices(a)                               List of coordinate lists for every element (row-major)
arr.map(f, a)                                Apply f to each element; same shape
arr.map-indexed(f, a)                        Apply f(coords, val) to each element; same shape
arr.fold(f, init, a)                         Left-fold over all elements in row-major order
arr.neighbours(a, coords, offsets)           Values at valid in-bounds neighbours given offset vectors

The standard +, -, *, / operators are polymorphic and apply element-wise when either operand is an array. Scalar broadcasting is supported.

IO

Binding / Function                Description
io.env                            Block of environment variables
io.epoch-time                     Unix timestamp at launch
io.args                           Command-line arguments (after --)
io.random                         Infinite lazy stream of random floats
io.RANDOM_SEED                    Current random seed
io.return(a)                      Wrap a pure value in the IO monad
io.bind(action, cont)             Sequence two IO actions
io.shell(cmd)                     Run cmd via sh -c; returns {stdout, stderr, exit-code}
io.shell-with(opts, cmd)          Run cmd with extra options (e.g. {stdin: s, timeout: 60}). Pipeline: "cmd" shell-with(opts)
io.exec([cmd : args])             Run cmd directly (no shell); argument is a list [cmd, arg1, arg2, ...]
io.exec-with(opts, [cmd : args])  Run cmd directly with extra options. Pipeline: ["git", "log"] exec-with(opts)
io.check(result)                  Fail with stderr if exit-code is non-zero; otherwise return result
io.map(f, action)                 Apply pure function to result of IO action (fmap)

Assertion Operators

Operator  Description
e //=> v  Assert e equals v (panic if not)
e //= v   Assert equals (silent, returns e)
e //!     Assert e is true
e //!!    Assert e is false
e //=? f  Assert f(e) is true
e //!? f  Assert f(e) is false

Command Line Quick Reference

eu file.eu                  # Evaluate file, output YAML
eu -j file.eu               # Output JSON
eu -x text file.eu          # Output plain text
eu -e 'expression'          # Evaluate expression
eu a.yaml b.eu              # Merge inputs
eu -t target file.eu        # Render specific target
eu list-targets file.eu     # List targets
eu --seed 42 file.eu        # Deterministic random
eu -Q file.eu               # Suppress prelude
eu fmt file.eu              # Format source
eu dump stg file.eu         # Dump STG syntax
eu -- arg1 arg2             # Pass arguments (io.args)

Syntax Gotchas

This document records unintuitive consequences of Eucalypt's syntax design decisions that can lead to subtle bugs or confusion.

Operator Precedence Issues

Eucalypt's catenation (juxtaposition) operator has very low precedence (20) — lower than all arithmetic and comparison operators. This means infix operators bind more tightly than catenation, which can produce surprising parses.

The precedence hierarchy (highest to lowest):

Precedence  Name         Operators
95                       (head prefix)
90          lookup/call  . (lookup), f(x, y)
88          bool-unary   ! (not)
85          exp          ^ (power)
80          prod         *, /, ÷, %
75          sum          +, -
50          cmp          <, >, <=, >=
45          append       ++
40          eq           =, !=
35          bool-prod    ∧ (and)
30          bool-sum     ∨ (or)
20          cat          catenation (juxtaposition)
10          apply        @
5           meta         ` (metadata)

Field Access vs Catenation

Problem: The lookup operator (.) has higher precedence (90) than catenation (precedence 20), which can lead to unexpected parsing.

Gotcha: Writing objects head.id is parsed as objects (head.id) rather than (objects head).id.

Example:

# This doesn't work as expected:
objects: range(0, 5) map({ id: _ })
result: objects head.id  # Parsed as: objects (head.id)

# Correct syntax requires parentheses:
result: (objects head).id  # Explicitly groups the field access

Error Message: When this occurs, you may see confusing errors like:

  • cannot return function into case table without default
  • bad index 18446744073709551615 into environment (under memory pressure)

Solution: Always use parentheses to group the expression you want to access fields from:

  • Use (expression).field instead of expression target.field
  • Be explicit about precedence when combining catenation with field access

Arithmetic After a Pipeline

Problem: Infix operators like +, -, * have higher precedence than catenation. When an infix operator follows a pipeline (catenation chain), it binds to the last function in the chain, not to the result of the whole pipeline.

Gotcha: Writing xs foldl(f, 0) + 1 is not parsed as (xs foldl(f, 0)) + 1. Instead, the shunting-yard algorithm sees:

[xs, cat@20, foldl, call@90, (f,0), +@75, 1]

Because + (75) has higher precedence than cat (20), the + binds first: foldl(f, 0) + 1 is evaluated (adding 1 to a partial application — a type error), and then the result is applied to xs.

Example:

# WRONG — fails at runtime:
n: xs foldl(+, 0) + 1        # Parsed as: (foldl(+, 0) + 1)(xs)

# CORRECT — use parentheses:
n: (xs foldl(+, 0)) + 1      # Explicit grouping

# CORRECT — split into two bindings:
m: xs foldl(+, 0)
n: m + 1

Error Messages:

  • cannot return function into case table without default — the + intrinsic receives a function (the partial application) instead of a number.

Rule of thumb: If a pipeline result feeds into an infix operator, always use parentheses around the pipeline or bind it to a name first.
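
The grouping behaviour can be reproduced with a small standalone sketch. It uses precedence climbing rather than the shunting-yard algorithm mentioned above (the two agree on how binary operators group), and it covers only a handful of operators from the table. The function names group, parse and infix are hypothetical, and a call such as foldl(f,0) is treated as a single atom because calls bind at 90.

```rust
/// Precedence and right-associativity for a few infix operators.
fn infix(tok: &str) -> Option<(u8, bool)> {
    match tok {
        "*" | "/" => Some((80, false)),
        "+" | "-" => Some((75, false)),
        "<" | ">" => Some((50, false)),
        "++" => Some((45, true)),
        "=" => Some((40, false)),
        "∧" => Some((35, false)),
        "∨" => Some((30, false)),
        _ => None,
    }
}

/// Catenation is an implicit left-associative operator at precedence 20.
const CAT: u8 = 20;

/// Return the fully parenthesised grouping, or None for malformed input.
pub fn group(tokens: &[&str]) -> Option<String> {
    let mut pos = 0;
    let out = parse(tokens, &mut pos, 0)?;
    if pos == tokens.len() { Some(out) } else { None }
}

fn parse(tokens: &[&str], pos: &mut usize, min_prec: u8) -> Option<String> {
    let first = *tokens.get(*pos)?;
    if infix(first).is_some() {
        return None; // an operand was expected here
    }
    let mut lhs = first.to_string();
    *pos += 1;

    while let Some(&tok) = tokens.get(*pos) {
        // An operator token is explicit; another atom means catenation.
        let (prec, right, explicit) = match infix(tok) {
            Some((p, r)) => (p, r, true),
            None => (CAT, false, false),
        };
        if prec < min_prec {
            break;
        }
        if explicit {
            *pos += 1;
        }
        let next_min = if right { prec } else { prec + 1 };
        let rhs = parse(tokens, pos, next_min)?;
        lhs = if explicit {
            format!("({lhs} {tok} {rhs})")
        } else {
            format!("({lhs} {rhs})")
        };
    }
    Some(lhs)
}
```

Calling group(&["xs", "foldl(f,0)", "+", "1"]) yields "(xs (foldl(f,0) + 1))": the + claims foldl(f,0) before catenation gets a chance to apply it to xs.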

Anaphora and Function Syntax

Lambda Syntax Does Not Exist

Problem: Eucalypt does not have lambda expressions or arrow functions. There is no syntax for anonymous multi-parameter functions.

Gotcha: The -> token is the const operator (returns its left operand, ignoring the right). Writing x -> x + 1 does not create a function — it evaluates x, discards x + 1, and returns x.

Invalid Examples:

# NONE of these create functions in Eucalypt:
map(\x -> x + 1)     # Invalid syntax
map(|x| x + 1)       # Invalid syntax
map(fn(x) => x + 1)  # Invalid syntax
filter(x -> x > 0)   # -> is const, not lambda!

Correct Approach: Use sections, anaphora (_, _0, _1), named functions, or partial application:

# Operator sections (preferred for simple cases):
map(+ 1)             # Section: adds 1
map(^ 2)             # Section: squares
filter(> 0)          # Section: positive?

# Identity anaphor — passes an identity function:
map(_)               # Same as map(identity)

# Anaphora in expressions:
map(_ + 1)           # Anonymous single-parameter function
filter(_ > 0)        # Anonymous predicate

# Multiple `_` — each is a separate parameter:
f: (_ * _)           # f is a 2-arg multiply function
f(3, 4)              # => 12

# Numbered anaphora for same-param reuse:
sq: (_0 * _0)        # sq(5) => 25

# Using named function + partial application:
add-one(x): x + 1
map(add-one)

# For multi-parameter needs, define a named helper:
has-y(y, h): first(h) = y
filter(has-y(target-y))   # Partial application

Important: Each _ introduces a separate parameter. _ * _ is a 2-arg function (like (_0 * _1)), not (_0 * _0). Use numbered anaphora _0 * _0 when you need to reference the same parameter twice.

Reference: See Anaphora for detailed explanation of anaphora usage.
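
A small, self-contained sketch that puts these forms side by side may help. The binding names (xs, mul, incremented, squares, products) are illustrative only and are not part of the prelude:

```eu
xs: [1, 2, 3]

# Two separate parameters, as in the f example above
mul: (_ * _)

incremented: xs map(+ 1)          # operator section
squares: xs map(_0 * _0)          # one parameter, used twice
products: zip-with(mul, xs, xs)   # named two-parameter function
```

Under the rules described above, incremented should render as [2, 3, 4], and squares and products should both render as [1, 4, 9].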

Anaphor Scoping: Parens Are Opaque

Problem: Parentheses create an anaphor scope boundary. Anaphors inside parens form a lambda at the paren level, not at the enclosing expression.

Scoping rules:

  1. Parens are opaque by default — anaphors inside parens create a lambda scoped to that paren group. This applies to both _ and _0/_1.

  2. Subsumption — if the enclosing expression already has direct anaphors, inner paren groups become transparent (their anaphors join the outer scope).

  3. ArgTuples follow the same rules — function call arguments are opaque unless subsumed by an outer anaphoric scope.

Examples:

# Parens opaque — paren group forms its own lambda:
(_ + 1)               # λ(a). a + 1
(_ * _)               # λ(a, b). a * b
(_ = :quux) ∘ tag     # (λ(a). a = :quux) ∘ tag  ✓ composition works

# Without parens — whole expression is the lambda body:
_ + 1                 # λ(a). a + 1   (same result here)
_ * _                 # λ(a, b). a * b

# Opaque parens break a cross-operator average:
(_0 + _1) / 2         # (λ(a,b). a+b) / 2  — type error!
# Dropping the parens yields a single lambda, but it is not an average:
_0 + _1 / 2           # λ(a, b). a + b/2  (b is halved, not the sum)

# Subsumption: outer _ makes inner (_ * _) transparent:
_ + (_ * _)           # λ(a, b, c). a + (b * c)  ✓

# ArgTuple opaque by default:
map(_ + 1)            # map(λ(a). a + 1)   ✓
filter(_ > 0)         # filter(λ(a). a > 0)  ✓

# ArgTuple subsumed when outer has anaphors:
_0 * (_1 + 2)         # λ(a, b). a * (b + 2)  ✓

Common mistake: Expecting (_0 + _1) / 2 to be a 2-argument averaging function. Under the opaque parens model, (_0 + _1) forms its own lambda and / 2 tries to divide that lambda by 2 — a type error. Use a named helper instead:

avg2(a, b): (a + b) / 2
zip-with(avg2, xs, ys)

Expanding scope with subsumption: The subsumption rule can be exploited to make the outer expression anaphoric, causing inner paren groups to become transparent. A common idiom is to place a direct anaphor — such as a not-nil check _0✓ — at the outer level:

# _0✓ makes the whole expression anaphoric in _0,
# so _0 inside count(_0) is subsumed — both refer to the same parameter.
_0✓ && count(_0) >= 4    # λ(a). a != null && count(a) >= 4

Without the outer _0✓, the _0 inside count(_0) would form a separate lambda at the ArgTuple level, and the outer && would see a function value rather than a boolean. The not-nil postfix is often the most natural way to introduce the outer anaphor when you want to also guard against null.
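
As a sketch of how such a predicate might be used, the following binds it to a name and passes it to filter. The names long-enough, candidates and kept are hypothetical. The parentheses around the predicate rely on the opaque-parens rule described above: the paren group forms the lambda, and the direct _0✓ makes count(_0) transparent.

```eu
long-enough: (_0✓ && count(_0) >= 4)

candidates: [[1, 2, 3, 4], [1, 2]]
kept: candidates filter(long-enough)
```

Only the first list has four or more elements, so it is the only one expected to survive the filter.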

Metadata vs Comments

Backtick Is Metadata, Not a Comment

Problem: The backtick (`) attaches metadata to the next declaration. It is not a comment syntax.

Gotcha: Writing ` "some text" with no declaration following it causes a parse error, often reported at an unexpected location (e.g., line 1).

Example:

# WRONG — dangling metadata with nothing to attach to:
` "This is not a comment"

# CORRECT — use # for comments:
# This is a comment

# CORRECT — metadata attached to a declaration:
` "Documentation for my-fn"
my-fn(x): x + 1

Rule: Use # for comments. Only use ` when you intend to attach metadata to the immediately following declaration.

Single Quote Identifiers

Single Quotes Are Not String Delimiters

Problem: Single quotes (') in Eucalypt are used to create identifiers, not strings.

Gotcha: Coming from languages where single quotes delimit strings, developers might expect 'text' to be a string literal.

Key Rules:

  • Single quotes create normal identifiers that can contain any characters
  • The identifier name is the content between the quotes (quotes are stripped)
  • This is the only use of single quotes in Eucalypt
  • String literals use double quotes (") only

Examples:

# Single quotes create identifiers (variable names):
'my-file.txt': "content"     # Creates identifier: my-file.txt
home: {
  '.bashrc': false           # Creates identifier: .bashrc
  '.emacs.d': false          # Creates identifier: .emacs.d
  'notes.txt': true          # Creates identifier: notes.txt
}

# Access using lookup:
z: home.'notes.txt'          # Looks up identifier: notes.txt

# NOT string literals:
'hello' = 'hello'            # Compares two variable references (not strings)
"hello" = "hello"            # Compares two string literals (correct)

Division Operators

/ Is Floor Division

Problem: The / operator performs floor division (integer division), not precise division.

Gotcha: 7 / 2 evaluates to 3, not 3.5.

Example:

7 / 2    # => 3 (floor division)
7 ÷ 2    # => 3.5 (exact division)

Rule: Use ÷ for exact/precise division. Use / only when you want integer (floor) division.
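
The difference matters most when computing ratios such as a mean. A minimal sketch, with the hypothetical bindings xs, mean and floor-mean; the pipeline is parenthesised because infix operators bind more tightly than catenation:

```eu
xs: [1, 2, 3, 4]

mean: (xs foldl(+, 0)) ÷ count(xs)         # 10 ÷ 4 => 2.5
floor-mean: (xs foldl(+, 0)) / count(xs)   # 10 / 4 => 2
```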

Cons Patterns vs Normal Lists

The [h : t] cons pattern is only valid in a function parameter position. A colon inside a list literal in an expression context is not valid:

# Valid — cons pattern in a function parameter
list-head([h : t]): h

# Invalid — colon is not a list separator in expression context
bad: [1 : rest]   # parse error

In expression context, use head and tail to take a list apart, and the cons operator ‖ (precedence 55) or the cons prelude function to build one:

xs: [1, 2, 3]
first: xs head         # = 1
rest: xs tail          # = [2, 3]
built: 1 ‖ [2, 3]     # = [1, 2, 3]
also: cons(1, [2, 3])  # = [1, 2, 3]
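
The two contexts can be combined. This is a sketch with the hypothetical names second, snd and with-zero: the cons pattern destructures in the parameter position, while the operator builds a list in the expression position.

```eu
# Cons pattern in a parameter: t is bound to the tail of the argument
second([h : t]): t head

xs: [1, 2, 3]
snd: second(xs)        # = 2
with-zero: 0 ‖ xs      # = [0, 1, 2, 3]
```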

Block-Dot Lookup Applies to Any Block Literal

The generalised lookup syntax {...}.field and {...}.(expr) works on any block literal, not only IO monadic blocks. The dot binds the lookup to the block immediately to its left:

# Field lookup on a plain block
{ x: 1, y: 2 }.x          # → 1

# Parenthesised expression scoped over the block's bindings
{ x: 1, y: 2 }.(x + y)    # → 3

# The same syntax is used for IO monadic block return expressions
{ :io r: io.shell("echo hello") }.(r.stdout)
{ :io r: io.shell("echo hello") }.r.stdout

Precedence: . binds tightly (precedence 90), so the lookup attaches to the block literal, not to the result of any surrounding expression. Use parentheses when you need to apply a lookup to a computed value:

# Parsed as: list (head.name)  — probably not what you want
list head.name

# Correct: (list head).name
(list head).name

IO monadic block desugaring: { :io r: cmd }.(expr) desugars to io.bind(cmd, lambda(r). io.return(expr)). The .() return expression is part of the general block-dot syntax, not IO-specific.

Bare-expression files: A .eu file containing only a block-dot expression (no outer key: declaration) is supported when the expression starts with a block literal {...}:

# This works as a standalone .eu file:
{ :io r: io.shell("echo hello") }.(r.stdout)

Block Field Names Shadow Outer Bindings — {x: x} Is Always Self-Reference

Problem: Every declaration inside a block literal introduces a new binding that is visible to all other expressions in that block, including its own right-hand side. The name on the left of : immediately shadows any outer binding with the same name, so {x: x} does not copy an outer x into the block; it creates a self-referential binding:

# WRONG — infinite loop: cmd refers to itself, not the function parameter
shell-spec(cmd): {:io-shell cmd: cmd, timeout: 30}

Running eu -e '{cmd: cmd}' gives error: infinite loop detected: binding refers to itself.

Why it bites functions: When a function parameter has the same name as a block field you want to populate, the RHS expression sees the block's own binding rather than the parameter:

fn(cmd): {cmd: cmd}   # cmd: cmd is self-reference — fn's parameter is invisible

Fix: Use a different name for the function parameter so it is not shadowed:

# Correct: parameter c is not shadowed by the block field cmd
shell-spec(c): {:io-shell cmd: c, timeout: 30}

Rule: Never write {key: key} expecting it to read an outer binding named key. If you need a block field to hold a value from an outer scope, the outer name and the field name must differ.
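
A constructor-style function might follow this rule as shown here. This is a sketch with hypothetical names (make-point, px, py, origin):

```eu
# Parameters px and py differ from the field names x and y,
# so the right-hand sides refer to the parameters, not to the fields.
make-point(px, py): { x: px, y: py }

origin: make-point(0, 0)
```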

Future Improvements

These gotchas highlight areas where the language and its tooling could be improved:

  1. Better Error Messages: More specific error messages when precedence issues occur
  2. Linting Rules: Static analysis to catch common precedence mistakes
  3. IDE Support: Syntax highlighting and warnings for ambiguous expressions
  4. Documentation: Better examples showing correct precedence usage

Migration from v0.2 to v0.3

The migration guide is under construction.

Key changes in v0.3:

  • Subcommand structure: eu test replaces -T, eu dump replaces -p/--dump-xxx
  • All existing command patterns continue to work (backward compatible)
  • New eu fmt and eu lsp subcommands