Transforms
A building block is a specification — it defines a data model that data can conform to. Transforms complement that by defining a reusable conversion library: for data that conforms to this building block, here is how to convert it into another format, encoding, or building block representation. Clients and tools can discover these transforms from the building block register and use them as ready-made adapters without having to implement the conversion logic themselves.
Typical conversions include encoding translations (e.g. XML to JSON), schema or structural transformations, semantic uplift to RDF, and vocabulary or terminology mappings.
Transforms are declared in a transforms.yaml file in the building block directory. During postprocessing, example
snippets that match a transform’s declared input media types are automatically run through it — this demonstrates the
transform works and gives clients a concrete preview of the output. The transform library itself, however, is the
primary artifact; the snippet outputs are illustrative.
transforms.yaml structure
transforms:
  - id: my-transform            # required; alphanumeric and dashes only
    description: What it does   # optional; Markdown accepted
    type: jq                    # required; see supported types below
    inputs:
      mediaTypes:
        - application/json      # media types this transform accepts
    outputs:
      mediaTypes:
        - application/json      # media types this transform produces
    code: |                     # inline code/script
      .foo = "bar"
The transform code can be declared inline with code, or referenced from a separate file with ref:
- id: my-transform
  type: jq
  ref: transforms/my-script.jq
Input and output media types can be given as plain strings (application/json) or as objects when a file extension is
needed for the output:
outputs:
  mediaTypes:
    - mimeType: text/csv
      defaultExtension: csv
Common short-form aliases such as json, xml, or turtle are also accepted and will be normalized to their canonical
MIME types.
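The normalization of string, object, and alias forms can be pictured with the short sketch below. It is only an illustration: the ALIASES table and the normalize_media_type helper are hypothetical names, and the postprocessor's real alias list and implementation may differ.

```python
# Hypothetical alias table; the real postprocessor may recognize more (or other) aliases.
ALIASES = {
    "json": "application/json",
    "xml": "application/xml",
    "turtle": "text/turtle",
}


def normalize_media_type(entry):
    """Return a {'mimeType', 'defaultExtension'} dict for one mediaTypes entry.

    `entry` is either a plain string (MIME type or alias) or an object with
    `mimeType` and an optional `defaultExtension`, as written in transforms.yaml.
    """
    if isinstance(entry, str):
        entry = {"mimeType": entry}
    mime = entry["mimeType"].strip()
    # Resolve a known alias; anything else is assumed to be a MIME type already.
    mime = ALIASES.get(mime.lower(), mime)
    return {"mimeType": mime, "defaultExtension": entry.get("defaultExtension")}


print(normalize_media_type("turtle"))
print(normalize_media_type({"mimeType": "text/csv", "defaultExtension": "csv"}))
```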
Supported transform types
jq
Applies a jq expression to JSON input.
- Default inputs: application/json
- Default outputs: application/json

- id: add-type
  type: jq
  code: |
    .type = "ex:MyFeature"
sparql-construct
Runs a SPARQL CONSTRUCT query on RDF input, producing an RDF graph.
- Default inputs: application/ld+json, text/turtle
- Default outputs: text/turtle

- id: to-geosparql
  type: sparql-construct
  ref: transforms/to-geosparql.sparql
sparql-update
Runs a SPARQL UPDATE statement on an RDF graph in-place.
- Default inputs: application/ld+json, text/turtle
- Default outputs: same as input

- id: remap-predicates
  type: sparql-update
  ref: transforms/remap.sparql
shacl-af-rule
Applies SHACL Advanced Features rules (SPARQL-based) to an RDF graph.
- Default inputs: application/ld+json, text/turtle
- Default outputs: text/turtle

- id: infer-types
  type: shacl-af-rule
  ref: transforms/infer-types.shacl.ttl
xslt
Applies an XSLT stylesheet to XML input.
- Default inputs: application/xml
- Default outputs: application/xml

- id: normalise-xml
  type: xslt
  ref: transforms/normalise.xslt
json-ld-frame
Applies a JSON-LD frame to JSON-LD or RDF input.
- Default inputs: application/ld+json, text/turtle
- Default outputs: application/ld+json

- id: frame-feature
  type: json-ld-frame
  ref: transforms/frame.jsonld
semantic-uplift
Applies a semantic uplift mapping (as used by the OGC NA tools) to JSON input, producing RDF.
- Default inputs: application/json
- Default outputs: text/turtle

- id: uplift
  type: semantic-uplift
  ref: transforms/uplift.yaml
python
Runs a Python code snippet. The snippet receives input_data (a string) and must assign its result to output_data.
A transform_metadata namespace is also available with the following attributes:
| Attribute | Description |
|---|---|
| source_mime_type | MIME type of the input snippet |
| target_mime_type | MIME type of the declared output |
| metadata | Extra metadata from the transform declaration (keys starting with _ excluded). Supports both attribute access (transform_metadata.metadata.mode) and dict-style access (transform_metadata.metadata['mode'], for k in transform_metadata.metadata, etc.) |
| context | Transform context namespace |
- id: uppercase-keys
  type: python
  inputs:
    mediaTypes: [ application/json ]
  outputs:
    mediaTypes: [ application/json ]
  code: |
    import json
    data = json.loads(input_data)
    output_data = json.dumps({k.upper(): v for k, v in data.items()}, indent=2)

With dependencies:

- id: to-csv
  type: python
  inputs:
    mediaTypes: [ application/json ]
  outputs:
    mediaTypes:
      - mimeType: text/csv
        defaultExtension: csv
  metadata:
    dependencies:
      pip: pandas>=1.5
      python: ">=3.10"   # optional; skipped if not met
  code: |
    import json, pandas as pd
    data = json.loads(input_data)
    output_data = pd.DataFrame(data if isinstance(data, list) else [data]).to_csv(index=False)
pip accepts any specifier that pip install understands, including GitHub URLs. If python is set to
a PEP 440 version specifier, the transform is silently skipped when the runtime
does not meet the requirement.
The snippet can be adapted into a standalone script by reading from stdin and printing to stdout — input_data is just
a string variable, and output_data is whatever string you assign.
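As a rough illustration of that adaptation, the uppercase-keys snippet above could be wrapped as follows. The transform and main function names are choices made for this sketch, not anything the postprocessor requires.

```python
import json
import sys


def transform(input_data):
    """Same logic as the uppercase-keys snippet; the return value plays the role of output_data."""
    data = json.loads(input_data)
    return json.dumps({k.upper(): v for k, v in data.items()}, indent=2)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the whole input from stdin and write the transformed text to stdout."""
    stdout.write(transform(stdin.read()))
    stdout.write("\n")
```

Calling main() under an `if __name__ == "__main__":` guard makes the file usable as a command-line filter, for example `python uppercase_keys.py < input.json` (the file name is made up for this example).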
Binary data: input_data may be a bytes object when the input comes from a binary-producing transform in a
chain. Assign bytes to output_data to produce binary output; the postprocessor will detect this and open the
output file in binary mode. print() calls are captured and do not interfere with the output.
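A snippet that may run at any position in a chain can guard against both input types. The sketch below is illustrative only: the to_text helper and the UTF-8 assumption are not part of the postprocessor, and input_data is given a sample value here so that the block runs on its own.

```python
import gzip

# Stand-in for the injected variable; inside a real transform it is already defined.
input_data = b'{"name": "example"}'


def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it arrived as bytes.

    UTF-8 is an assumption; use whatever encoding the upstream transform emits.
    """
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


text = to_text(input_data)

# Assigning bytes (here: the gzip-compressed text) produces binary output.
output_data = gzip.compress(text.encode("utf-8"))
```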
Python transforms can also call transforms from other building blocks using the
get_transformer() builtin.
node
Runs a Node.js code snippet. The snippet receives inputData (a string) and must assign its result to outputData.
A transformMetadata object is also available with the following properties:
| Property | Description |
|---|---|
| sourceMimeType | MIME type of the input snippet |
| targetMimeType | MIME type of the declared output |
| metadata | Extra metadata from the transform declaration (keys starting with _ excluded) |
| context | Transform context object (snake_case keys) |
- id: add-metadata
  type: node
  inputs:
    mediaTypes: [ application/json ]
  outputs:
    mediaTypes: [ application/json ]
  code: |
    const data = JSON.parse(inputData);
    data.generatedBy = 'my-transform';
    outputData = JSON.stringify(data, null, 2);
With dependencies:
- id: to-csv
  type: node
  inputs:
    mediaTypes: [ application/json ]
  outputs:
    mediaTypes:
      - mimeType: text/csv
        defaultExtension: csv
  metadata:
    dependencies:
      npm: json2csv
      node: ">=18"   # optional; skipped if not met
  code: |
    const { Parser } = require('json2csv');
    // inputData is a string, so parse it before checking for an array
    const data = JSON.parse(inputData);
    const rows = Array.isArray(data) ? data : [data];
    outputData = new Parser().parse(rows);
npm accepts any package name or specifier that npm install understands. If node is set to a semver range, the
transform is silently skipped when the runtime does not meet the requirement.
Binary data: inputData may be a Buffer when the input comes from a binary-producing transform in a chain.
Assign a Buffer to outputData to produce binary output; assigning a string produces text output. console.log()
calls are captured and do not interfere with the output.
Node transforms can also call transforms from other building blocks using the
getTransformer() function.
get_transformer / getTransformer
Python and Node transforms can call any transform defined in any building block — including transforms of a different type — using a built-in composition helper. This lets you build complex pipelines by reusing transforms across building blocks without duplicating logic.
The callable returned by the helper accepts the content to transform plus optional parameters, runs the target transform in a sub-process, and returns the result.
Python: get_transformer(bblock_id, transform_id)
get_transformer is injected as a built-in into every Python snippet. Call it to obtain a callable for a specific
transform, then invoke that callable with the data you want to transform.
# In a python transform
# Get a callable for another building block's transform
convert = get_transformer('ogc.example.other-bblock', 'my-jq-transform')
result_str = convert(input_data)
output_data = result_str
Callable signature:
callable(content, source_mime_type=None, extra_metadata=None)
| Parameter | Type | Description |
|---|---|---|
| content | str or bytes | The input data to transform |
| source_mime_type | str or None | Optional MIME type hint passed to the target transform |
| extra_metadata | dict or None | Optional dict merged into the target transform’s metadata (caller values take precedence over the target’s declared metadata) |
The callable returns a str (or bytes for binary outputs), or None if the target transform produced no output.
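Because the result can be a str, bytes, or None, a caller may want to normalize it before assigning output_data. The following is a minimal sketch under stated assumptions: convert_or_raise is a hypothetical helper, fake_convert stands in for a callable obtained from get_transformer (which only exists inside the postprocessor), and treating None as an error is a policy chosen for this example.

```python
def convert_or_raise(convert, content, source_mime_type=None):
    """Call a get_transformer() callable and normalize its result to str."""
    result = convert(content, source_mime_type=source_mime_type)
    if result is None:
        # The target ran but produced nothing; this sketch treats that as an error.
        raise ValueError("target transform produced no output")
    if isinstance(result, bytes):
        # Assumes the bytes are UTF-8 text; keep them as bytes for real binary formats.
        result = result.decode("utf-8")
    return result


# Stand-in for get_transformer('ogc.example.other-bblock', 'my-jq-transform').
def fake_convert(content, source_mime_type=None, extra_metadata=None):
    return content.upper() if content else None


print(convert_or_raise(fake_convert, "abc", source_mime_type="text/plain"))
```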
Node: getTransformer(bblockId, transformId)
getTransformer is injected into every Node snippet. The returned callable accepts the content and an options object.
// In a node transform
const convert = getTransformer('ogc.example.other-bblock', 'my-python-transform');
const data = JSON.parse(inputData);
const result = convert(JSON.stringify(data), { sourceMimeType: 'application/json' });
outputData = result;
Callable signature:
callable(content, opts?)
| Parameter | Type | Description |
|---|---|---|
| content | string or Buffer | The input data to transform |
| opts.sourceMimeType | string | Optional MIME type hint passed to the target transform |
| opts.extraMetadata | object | Optional object merged into the target transform’s metadata |
The callable returns a string (or Buffer for binary outputs), or null if the target transform produced no output.
Supported target types
Both helpers can call transforms of the following types: python, node, jq, xslt, json-ld-frame.
SPARQL, SHACL-AF, and semantic-uplift transforms are not supported as targets.
Cross-type chaining
Transforms of any supported type can call each other, nested to any depth. For example, a Python transform can call a jq transform defined in another building block, or a Node transform can call a Python transform that in turn calls an XSLT transform. Composition works the same way in either direction across language boundaries.
Cycle detection
If a transform is already executing in the current call chain, calling it again via get_transformer /
getTransformer raises a RuntimeError (Python) or throws an Error (Node) immediately. Cycle detection works
across process and language boundaries.
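The idea behind the check can be sketched in a few lines. This is an in-process illustration, not the postprocessor's implementation: run_transform and the registry dict are hypothetical, and a real implementation would additionally have to hand the call chain to each sub-process so that the check holds across process and language boundaries.

```python
def run_transform(registry, key, content, chain=()):
    """Run registry[key], refusing to re-enter a transform that is already in `chain`.

    `key` is a (bblock_id, transform_id) tuple; `chain` lists the keys currently executing.
    """
    if key in chain:
        path = " -> ".join("/".join(k) for k in chain + (key,))
        raise RuntimeError(f"Transform cycle detected: {path}")
    new_chain = chain + (key,)

    # Each transform receives a helper, bound to the extended chain, for calling others.
    def get_transformer(bblock_id, transform_id):
        return lambda data: run_transform(registry, (bblock_id, transform_id), data, new_chain)

    return registry[key](content, get_transformer)


# Two transforms that call each other, plus one that calls nothing.
registry = {
    ("a", "t1"): lambda data, get_transformer: get_transformer("b", "t2")(data),
    ("b", "t2"): lambda data, get_transformer: get_transformer("a", "t1")(data),
    ("c", "ok"): lambda data, get_transformer: data + "!",
}
```

Calling run_transform(registry, ("a", "t1"), "x") raises a RuntimeError as soon as the second transform tries to re-enter the first, instead of recursing without end.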
Metadata scoping
Each transform always receives its own declared metadata from transforms.yaml — the caller’s metadata is
not inherited. If you need to pass values from the calling transform into the target, use extra_metadata:
# Python — forward the caller's metadata to the sub-transform
convert = get_transformer('other.bblock', 'some-transform')
result = convert(data, extra_metadata=transform_metadata.metadata)
// Node — same idea
const convert = getTransformer('other.bblock', 'some-transform');
const result = convert(data, { extraMetadata: transformMetadata.metadata });
extra_metadata / extraMetadata is merged on top of the target’s own declared metadata, so the target’s keys
take lower priority than what the caller explicitly passes.
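The precedence rule amounts to a dictionary merge in which caller-supplied values win. The sketch below assumes a shallow merge; whether nested values are merged more deeply is not stated here, and effective_metadata is a hypothetical name.

```python
def effective_metadata(declared, extra_metadata=None):
    """Merge caller-supplied metadata on top of the target's declared metadata.

    Shallow merge (an assumption): on a key clash the caller's value wins.
    """
    merged = dict(declared)               # start from the target's own declaration
    merged.update(extra_metadata or {})   # caller values override on conflict
    return merged


declared = {"mode": "strict", "indent": 2}
print(effective_metadata(declared, {"mode": "lenient"}))
```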
_nested_transform metadata flag
The target transform’s metadata will contain _nested_transform: true when invoked via get_transformer /
getTransformer. This lets a transform behave differently when called as a sub-transform versus running as a
top-level postprocessing step.
Output profile validation
A transform’s outputs can be validated against one or more building blocks by declaring them as profiles. During postprocessing, every output file produced by the transform is validated against each declared profile using the same validators that run on regular test resources (JSON Schema, JSON-LD context, and SHACL).
Profiles are declared under outputs.profiles as a list of building block identifiers, using the
bblocks:// URI scheme:
transforms:
  - id: to-geojson-feature
    type: jq
    inputs:
      mediaTypes: [ application/json ]
    outputs:
      mediaTypes: [ application/geo+json ]
      profiles:
        - bblocks://ogc.geo.features.feature
Both locally-defined building blocks and imported building blocks from other registers are supported.
What gets produced
For each declared profile, postprocessing creates a subdirectory named after the profile identifier alongside the transform outputs and writes:
- A .validation_{passed|failed}.txt text report for each output file
- Semantic uplift side-outputs (.jsonld, .ttl) when the profile includes a JSON-LD context
- A consolidated _report.json covering all validated outputs for that profile
The per-snippet transform result in register.json gains a profilesValidation map keyed by
profile identifier:
"profilesValidation": {
  "ogc.geo.features.feature": {
    "result": true,
    "report": "build/tests/my.bblock/transforms/ogc.geo.features.feature/_report.json",
    "upliftedFiles": {
      "jsonld": "build/tests/my.bblock/transforms/ogc.geo.features.feature/output.jsonld",
      "ttl": "build/tests/my.bblock/transforms/ogc.geo.features.feature/output.ttl"
    }
  }
}
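A tool consuming the register could use this map to flag failing profiles. The sketch below works on a hand-written fragment shaped like the example above; the second profile identifier (my.other.profile) and the failed_profiles helper are made up for illustration, and locating the transform result inside register.json is left out because that layout is not described here.

```python
import json

# Fragment shaped like the profilesValidation example; "my.other.profile" is hypothetical.
transform_result = json.loads("""
{
  "profilesValidation": {
    "ogc.geo.features.feature": {
      "result": true,
      "report": "build/tests/my.bblock/transforms/ogc.geo.features.feature/_report.json"
    },
    "my.other.profile": {
      "result": false,
      "report": "build/tests/my.bblock/transforms/my.other.profile/_report.json"
    }
  }
}
""")


def failed_profiles(result):
    """Return the sorted identifiers of profiles whose validation did not pass."""
    validation = result.get("profilesValidation") or {}
    return sorted(pid for pid, entry in validation.items() if not entry.get("result"))


print(failed_profiles(transform_result))
```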
Transform context
All executable transform types (Python, Node, and plugins) receive a transform context with metadata about
the building block, example, and postprocessing run. In Python snippets it is transform_metadata.context
(a SimpleNamespace); in Node snippets it is transformMetadata.context (a plain object); in plugins it is
metadata.ctx (a SimpleNamespace). All fields use snake_case.
Most transforms only need a handful of these fields; the full set is listed here for reference.
Building block:
| Field | Type | Description |
|---|---|---|
| bblock_id | str | Building block identifier |
| bblock_name | str or None | Human-readable name |
| bblock_version | str or None | Version string |
| bblock_tags | list | Tags declared in bblock.json |
| bblock_files_path | str | Absolute path to the building block source directory |
| bblock_annotated_path | str | Absolute path to the annotated output directory |
| bblock_metadata | dict | Full building block metadata snapshot at transform time |
| source_schema_path | str or None | Relative path to the source schema file, or URL if declared as a remote reference |
| annotated_schema_path | str or None | Relative path to the annotated schema, if generated |
| jsonld_context_path | str or None | Relative path to the generated JSON-LD context, if present |
| shacl_shapes_paths | list | Relative paths or URLs of SHACL shapes (local files are relativized to CWD; remote references are preserved as URLs) |
Example and snippet:
| Field | Type | Description |
|---|---|---|
| example_index | int | Zero-based index of the current example |
| example | dict | Full example object (title, prefixes, base-output-filename, etc.); snippets excluded |
| snippet_index | int | Zero-based index of the current snippet within the example |
| snippet | dict | Full snippet object (language, url, ref, json-path, prefixes, etc.); code excluded (use input_data) |
Note: When json-path is set on a snippet, snippet['full-code'] contains the complete content of the referenced file before path extraction. This is useful when the transform needs context beyond the extracted value.
Note: snippet['shacl-closure'] contains the merged list of SHACL closure entries from both the building block’s shaclClosures (bblock.json) and the snippet’s own shacl-closure (examples.yaml), deduplicated. Entries may be URLs or paths relative to the building block source directory. To resolve a relative path:
import os
closures = context.snippet.get('shacl-closure') or []
resolved = [
    c if c.startswith('http') else os.path.join(context.working_dir, context.bblock_files_path, c)
    for c in closures
]
Output:
| Field | Type | Description |
|---|---|---|
| output_file | str | Absolute path where this transform’s output will be written |
| output_dir | str | Absolute path to the transform output directory for this building block |
| working_dir | str | Working directory at postprocessing time |
Register and configuration:
| Field | Type | Description |
|---|---|---|
| base_url | str or None | Base URL for generated output |
| github_base_url | str or None | GitHub repository base URL (e.g. https://github.com/org/repo/) |
| git_repository | str or None | Git remote URL |
| id_prefix | str | Building block identifier prefix from bblocks-config.yaml |
| imported_register_urls | list | Register import URLs from bblocks-config.yaml |
| transform_plugins | list | Active transform plugins |
Note: bblock_metadata reflects the state at transform time — fields populated after the transforms step
(such as shaclShapes URLs and documentation) will not be present yet.
Transform plugins
Declaring a transform type not listed in Supported transform types is valid — it will be included in the building block register for other tools or systems that support it, and skipped during postprocessing unless a matching plugin is declared here.
You can add support for custom transform types by declaring transform plugins in bblocks-config.yaml:
plugins:
  transforms:
    - pip: git+https://github.com/example/my-bblocks-plugin.git
      modules:
        - my_bblocks_plugin
Note: The legacy transform-plugins.yml file is still accepted but deprecated. Move its contents to the plugins.transforms key in bblocks-config.yaml.
Each plugin entry installs one or more pip packages and scans the listed Python modules for transformer classes. A transformer class is recognised by duck typing — it needs:
- transform_types: a non-empty list of type name strings
- transform(metadata): a callable that accepts a metadata object and returns a string or bytes, or raises an exception on failure
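The duck-typing rule can be illustrated with a small scanner. This is a sketch of the idea only: find_transformer_classes is a hypothetical function, not the postprocessor's actual discovery code, and the demo module is built in memory so that the block runs on its own.

```python
import inspect
import types


def find_transformer_classes(module):
    """Return the classes in `module` that look like transformers (duck typing)."""
    found = []
    for _, obj in inspect.getmembers(module, inspect.isclass):
        type_names = getattr(obj, "transform_types", None)
        has_types = (
            isinstance(type_names, (list, tuple))
            and len(type_names) > 0
            and all(isinstance(t, str) for t in type_names)
        )
        if has_types and callable(getattr(obj, "transform", None)):
            found.append(obj)
    return found


class Good:
    transform_types = ["my-type"]

    def transform(self, metadata):
        return ""


class NoTypes:
    transform_types = []  # empty list, so the class is not recognised

    def transform(self, metadata):
        return ""


# In-memory stand-in for an installed plugin module.
demo = types.ModuleType("demo_plugin")
demo.Good = Good
demo.NoTypes = NoTypes

print([cls.__name__ for cls in find_transformer_classes(demo)])
```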
Each plugin runs in its own isolated virtualenv (created automatically under the postprocessing sandbox), so dependency conflicts between plugins, or between a plugin and the postprocessor itself, are not a concern.
pip accepts any specifier that pip install understands, including version constraints, GitHub URLs,
and local paths. It can be a string or a list when multiple packages are needed.
The postprocessor automatically derives a human-facing URL from the pip specifier (PyPI page for
package names, repository URL for git+https:// references). You can override this with an explicit
url field:
plugins:
  transforms:
    - pip: git+https://github.com/example/my-bblocks-plugin.git
      url: https://github.com/example/my-bblocks-plugin
      modules:
        - my_bblocks_plugin
Plugin metadata (types, class names, pip reference, and URL) is included in register.json under
transformPlugins, allowing viewers and tooling to attribute each transform type to its plugin.
The metadata object
The metadata argument passed to transform() is a plain namespace with the following attributes:
| Attribute | Type | Description |
|---|---|---|
| type | str | The transform type identifier (e.g. jinja2) |
| transform_content | str | The code or script declared in transforms.yaml (code or ref) |
| input_data | str | The example snippet text |
| source_mime_type | str | MIME type of the input snippet |
| target_mime_type | str | MIME type of the declared output |
| metadata | namespace / dict | Extra metadata from the transform declaration (keys starting with _ excluded). Supports both attribute access and dict-style access |
| sandbox_dir | None | Always None in the plugin subprocess context |
| ctx | SimpleNamespace | Transform context |
Return value and error handling
Return a str or bytes to produce output. Return None to produce no output (not an error).
Raise any exception to signal failure — the full traceback becomes the transform’s stderr output.
Any output written to stdout or stderr during transform() (e.g. print() calls) is captured and
logged at DEBUG level. To see it, run the postprocessor with --log-level DEBUG.
Transformer class attributes
| Attribute | Required | Description |
|---|---|---|
| transform_types | yes | List of type name strings this class handles |
| default_inputs | no | Default input media types (used when inputs is not declared in transforms.yaml) |
| default_outputs | no | Default output media types (used when outputs is not declared in transforms.yaml) |
Example plugin
The following skeleton shows the minimal structure. metadata.transform_content carries the
user-supplied code or script from transforms.yaml, so the transform logic is data-driven rather
than hard-coded in the plugin.
# my_bblocks_plugin/__init__.py
import json


class MyTransformer:
    transform_types = ['my-type']
    default_inputs = ['application/json']
    default_outputs = ['text/plain']

    def transform(self, metadata):
        data = json.loads(metadata.input_data)
        # metadata.transform_content holds the code/expression from transforms.yaml
        # return a string or bytes, or raise on error
        return str(data)
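A plugin class like this can be exercised outside the postprocessor by handing it a hand-built metadata object. The harness below is a sketch: make_metadata is a hypothetical helper, the field values (including the my.bblock identifier) are invented, and the object the postprocessor really passes may carry more detail than this stand-in.

```python
import json
from types import SimpleNamespace


class MyTransformer:
    """Copy of the skeleton above, repeated so that this block runs on its own."""
    transform_types = ['my-type']
    default_inputs = ['application/json']
    default_outputs = ['text/plain']

    def transform(self, metadata):
        data = json.loads(metadata.input_data)
        return str(data)


def make_metadata(input_data, transform_content="", **overrides):
    """Build a stand-in metadata object with the attributes listed in the table above."""
    fields = dict(
        type="my-type",
        transform_content=transform_content,
        input_data=input_data,
        source_mime_type="application/json",
        target_mime_type="text/plain",
        metadata={},
        sandbox_dir=None,
        ctx=SimpleNamespace(bblock_id="my.bblock"),  # invented context value
    )
    fields.update(overrides)
    return SimpleNamespace(**fields)


result = MyTransformer().transform(make_metadata('{"a": 1}'))
print(result)
```

Invalid JSON input makes json.loads raise, which matches the documented way of signalling failure from transform().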
A real-world example is the bblocks-jinja2-transform-plugin, which adds a jinja2 transform type that renders Jinja2 templates against JSON input.