Working with Data
In this chapter you will learn:
- How to process JSON, YAML, TOML, CSV, and XML data
- How to convert between formats on the command line
- Patterns for querying and transforming structured data
- How to combine multiple data sources
Format Conversion
The simplest use of eucalypt is converting between data formats. By default, output is YAML:
# JSON to YAML
eu data.json
# YAML to JSON
eu data.yaml -j
# JSON to TOML
eu data.json -x toml
Processing JSON
Pipe JSON from other tools into eucalypt:
curl -s https://api.example.com/users | eu -e 'map(.name)'
Or process a JSON file:
eu -e 'users filter(.active) map(.email)' data.json
Processing YAML
YAML files are read natively. All YAML features including anchors, aliases, and merge keys are supported:
# config.yaml
defaults: &defaults
  timeout: 30
  retries: 3
production:
  <<: *defaults
  debug: false
eu config.yaml -e 'production'
timeout: 30
retries: 3
debug: false
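The merge key behaves much like a dictionary merge in which keys written directly in the mapping win over merged ones. The following is a rough Python sketch of that semantics, using plain dicts in place of the parsed YAML. The variable names are illustrative only, and this is not a description of how eucalypt implements merging internally.

```python
# Plain dicts standing in for the parsed YAML mappings (illustrative only).
defaults = {"timeout": 30, "retries": 3}

# "<<: *defaults" contributes the anchored mapping's keys first;
# keys written directly in the mapping are applied afterwards and win on conflict.
production = {**defaults, "debug": False}

print(production)
```

If production also set its own timeout, that value would replace the one inherited from defaults.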
YAML Timestamps
YAML timestamps are automatically converted to date-time values:
created: 2024-03-15
updated: 2024-03-15T14:30:00Z
Quote the value to keep it as a string: created: "2024-03-15".
Processing TOML
eu config.toml -e 'database.port'
5432
Processing CSV
CSV files are imported as a list of blocks, where each row becomes a block with column headers as keys:
eu -e 'rows filter(_.age num > 30)' rows=people.csv
CSV values are always strings. Use num to convert to numbers when
needed.
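To make the shape of imported CSV data concrete, here is a small Python sketch of the same model: a list of records keyed by column header, with every value held as a string. The sample data is hypothetical, and Python's csv module is used only as an analogy for what eucalypt produces.

```python
import csv
import io

# Hypothetical contents of people.csv.
PEOPLE_CSV = "name,age\nAda,36\nLinus,28\nGrace,45\n"

# Each row becomes a dict keyed by the column headers; every value is a string.
rows = list(csv.DictReader(io.StringIO(PEOPLE_CSV)))

# The age column holds strings such as "36", so convert before comparing,
# just as the eucalypt expression applies num before the comparison.
over_30 = [row for row in rows if int(row["age"]) > 30]

print(over_30)
```

Forgetting the conversion is the most common mistake with CSV input, because the values look numeric when printed.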
Processing XML
XML is imported as a nested list structure. Each element is
represented as [tag, attributes, ...children]:
eu -e 'root' root=xml@data.xml
Use list functions to navigate the structure:
{ import: "root=xml@data.xml" }
# Get the tag name (first element)
tag: root first
# Get attributes (second element)
attrs: root second
# Get child elements (everything after the first two)
children: root drop(2)
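The [tag, attributes, ...children] layout can be easier to picture with a concrete value. The Python sketch below builds that shape from a small, hypothetical XML document using the standard library. How eucalypt itself represents text nodes and whitespace is an assumption here (text is shown as plain string children, with whitespace-only text dropped).

```python
import xml.etree.ElementTree as ET


def to_nested_list(element):
    """Convert an element to [tag, attributes, ...children] (assumed layout)."""
    children = []
    # Text directly inside the element, before any child element.
    if element.text and element.text.strip():
        children.append(element.text.strip())
    for child in element:
        children.append(to_nested_list(child))
        # Text that follows a child element but still belongs to this element.
        if child.tail and child.tail.strip():
            children.append(child.tail.strip())
    return [element.tag, dict(element.attrib), *children]


# Hypothetical contents of data.xml.
DATA_XML = (
    '<catalog region="eu">'
    '<item id="1">Widget</item>'
    '<item id="2">Gadget</item>'
    "</catalog>"
)

root = to_nested_list(ET.fromstring(DATA_XML))

tag = root[0]        # corresponds to: root first
attrs = root[1]      # corresponds to: root second
children = root[2:]  # corresponds to: root drop(2)

print(tag, attrs, children)
```

Because every child element has the same three-part layout, the same first, second, and drop(2) steps apply at each level of nesting.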
Named Inputs
Use named inputs to make data available under a specific name:
eu users=users.json roles=roles.json -e 'users map(.name)'
Named inputs are essential for list-based formats (CSV, JSON Lines, text), which produce a list rather than a block and so need a name to be referenced by:
eu lines=text@log.txt -e 'lines filter(str.matches?("ERROR")) count'
Combining Multiple Sources
A powerful pattern is combining data from multiple sources:
eu users.yaml roles.yaml merge.eu
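Where merge.eu contains logic that uses names defined by both inputs (this example assumes users.yaml defines a top-level users key and roles.yaml a top-level roles key):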
# merge.eu
summary: {
  user-count: users count
  role-count: roles count
}
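As a mental model for combining unnamed inputs, the Python sketch below treats each file as a dictionary and merges their top-level keys into one namespace before computing the summary. It rests on two assumptions that are not confirmed by the text: that each file defines a single top-level key (users and roles respectively), and that unnamed inputs share one top-level namespace. The sample data is hypothetical.

```python
# Hypothetical parsed contents of the two input files.
users_yaml = {"users": [{"name": "ada"}, {"name": "linus"}]}
roles_yaml = {"roles": [{"title": "admin"}]}

# Assumed behaviour: top-level keys from every unnamed input are visible together.
namespace = {**users_yaml, **roles_yaml}

# The equivalent of the summary block in merge.eu.
summary = {
    "user-count": len(namespace["users"]),
    "role-count": len(namespace["roles"]),
}

print(summary)
```

If the files do not define those top-level keys, naming the inputs explicitly (users=users.yaml roles=roles.yaml) gives the same names without relying on the file contents.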
Using Evaluands
The -e flag specifies an expression to evaluate against the loaded
inputs:
# Select a nested value
eu config.yaml -e 'database.host'
# Transform and filter
eu data.json -e 'items filter(.price > 100) map(.name)'
# Aggregate
eu data.json -e 'items map(.price) foldl(+, 0)'
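For readers more familiar with general-purpose languages, the filter, map, and fold pipeline above corresponds roughly to the following Python, shown here against hypothetical sample data. It is meant only as a reading aid for the eucalypt expressions, not as a statement about how they are evaluated.

```python
from functools import reduce
import json
import operator

# Hypothetical contents of data.json.
DATA_JSON = """
{"items": [
  {"name": "desk",  "price": 250},
  {"name": "pen",   "price": 3},
  {"name": "chair", "price": 120}
]}
"""

items = json.loads(DATA_JSON)["items"]

# items filter(.price > 100) map(.name)
expensive_names = [item["name"] for item in items if item["price"] > 100]

# items map(.price) foldl(+, 0)
total = reduce(operator.add, (item["price"] for item in items), 0)

print(expensive_names, total)
```

Each stage consumes the list produced by the previous one, which is why the eucalypt expressions read left to right.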
Collecting Inputs
The --collect-as (-c) flag gathers multiple files into a list:
eu -c configs *.yaml -e 'configs map(.name)'
Add --name-inputs (-N) to get a block keyed by filename:
eu -c configs -N *.yaml
configs:
  a.yaml:
    name: alpha
  b.yaml:
    name: beta
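The difference between the two collection modes is the shape of the resulting value. The Python sketch below contrasts them using hypothetical parsed file contents. The ordering of the list form (command-line order) is an assumption.

```python
# Hypothetical parsed contents of the collected files.
parsed = {
    "a.yaml": {"name": "alpha"},
    "b.yaml": {"name": "beta"},
}

# -c configs: a list of the file contents (order assumed to follow the command line).
as_list = {"configs": list(parsed.values())}

# -c configs -N: a block keyed by filename.
as_block = {"configs": dict(parsed)}

# configs map(.name) against the list form.
names = [config["name"] for config in as_list["configs"]]

print(names)
print(as_block["configs"]["b.yaml"]["name"])
```

The list form suits uniform processing of every file, while the keyed form is useful when the result must be traced back to the file it came from.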
Output Formats
Control the output format:
| Flag | Format |
|---|---|
| (default) | YAML |
| -j | JSON |
| -x json | JSON |
| -x toml | TOML |
| -x edn | EDN |
| -x text | Plain text |
The format can also be inferred from the extension of the output file:
eu data.yaml -o output.json
Practical Example: Data Pipeline
Suppose you have a CSV of sales data and want to generate a JSON summary:
eu sales=sales.csv -j -e '{
  total: sales map(.amount num)
  count: sales count
  regions: sales map(.region) unique
}'
Or as a reusable eucalypt file:
# report.eu
{ import: "sales=sales.csv" }
` :suppress
amounts: sales map(.amount num)
report: {
  total: amounts foldl(+, 0)
  count: sales count
  average: report.total / report.count
}
eu report.eu -j -e report
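To check the result of a report like this against an independent calculation, the same summary can be sketched in a few lines of Python. The sample CSV below is hypothetical, and the column names amount and region are taken from the expressions above.

```python
import csv
import io
import json

# Hypothetical contents of sales.csv.
SALES_CSV = "region,amount\nnorth,100\nsouth,250\nnorth,70\n"

sales = list(csv.DictReader(io.StringIO(SALES_CSV)))

# CSV values arrive as strings, so convert before doing arithmetic.
amounts = [int(row["amount"]) for row in sales]

report = {"total": sum(amounts), "count": len(sales)}
report["average"] = report["total"] / report["count"]

print(json.dumps(report))
```

Running both versions on the same file and comparing the JSON is a quick way to validate a new eucalypt report.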
Key Concepts
- Eucalypt reads JSON, YAML, TOML, CSV, XML, EDN, JSON Lines, and plain text
- Output defaults to YAML; use -j for JSON or -x for other formats
- Named inputs (name=file) give data a name for reference
- The -e flag evaluates expressions against loaded data
- --collect-as gathers multiple files into a list or block
- CSV values are strings; use num to convert to numbers
- Combine multiple sources with the command line input system or imports