Flowmetal

A shining mercurial metal, laden with sensors and almost infinitely reconfigurable. The stuff of which robots and servitors are made.

Flowmetal is a new clustered scripting platform, designed to make it easy to write high-reliability automation (usually called workflows) in networked environments.

Some example applications:

  • 🗹 Implementing CI flows
  • 🗹 Reacting to webhooks
  • 🗹 Integrating between RESTful systems
  • 🗹 Orchestrating batch jobs (via REST or other APIs)

Key Features

Simple, Familiar Syntax

No new DSL notation to learn. Just a Python subset (Starlark).

load("pkg.flowmetal.io/prelude/v1", "flow", "FlowContext")

def _whoami_impl(fctx: FlowContext) -> str:
    return fctx.quota.username

whoami = flow(implementation = _whoami_impl)

def hello():
    print("Hello from Flowmetal, {}!".format(whoami()))

if __name__ == "__flow__":
    hello()

whoami is a flow — a durable, checkpointed unit of execution. hello is just a normal function that happens to call one. Most of your code doesn’t need to know or care about the difference.

Scripting ease

Just flowmetal run ./script.flow. No need to wait for scheduled deployments or slow builds.

Composable by default

Flows are just functions. Call them, pass them around, run them in parallel.

# Inside a flow implementation, where fctx is the flow context:
analyze_value = unwrap(analyze())

# Run two flows in parallel
save_task = fctx.actions.run(save, data = analyze_value)
report_task = fctx.actions.run(report, data = analyze_value)

# Wait for both to complete
results = fctx.actions.wait([save_task, report_task])

Resources clean up after themselves

Declare what your flow needs. The runtime opens resources before execution, closes them when the flow completes, and reopens them if a flow resumes after failure. No leaked connections. No orphaned containers.

gha_workflow = flow(
    implementation = _gha_impl,
    resources = {
        "container": lambda fctx: docker.container(
            image = fctx.args.scratch_image,
        ),
    },
)

Designed for multitenancy

Resource quotas and isolation come out of the box. Every flow belongs to a quota tree — administrators can limit, inspect, and cancel any flow under their authority.
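The shape of a quota tree can be sketched in a few lines. This is an illustrative model only: Flowmetal's real quota API is not shown here, so the names (`QuotaNode`, `admit`, `cancel_subtree`) are hypothetical. It shows the core idea: each node caps the flows beneath it, and an administrator acting at any node can inspect or cancel the whole subtree.

```python
from dataclasses import dataclass, field

@dataclass
class QuotaNode:
    """One node in a hypothetical quota tree."""
    name: str
    limit: int                                  # max concurrent flows in this subtree
    running: list = field(default_factory=list)
    children: dict = field(default_factory=dict)

    def subtree_running(self):
        # Everything running at this node or anywhere beneath it.
        total = list(self.running)
        for child in self.children.values():
            total += child.subtree_running()
        return total

    def admit(self, flow_id):
        # Admit a flow only if this subtree still has headroom. (A real
        # implementation would also check every ancestor's limit.)
        if len(self.subtree_running()) >= self.limit:
            return False
        self.running.append(flow_id)
        return True

    def cancel_subtree(self):
        # An administrator at this node can cancel every flow beneath it.
        cancelled = self.subtree_running()
        self.running.clear()
        for child in self.children.values():
            child.cancel_subtree()
        return cancelled

org = QuotaNode("org", limit=2)
org.children["team"] = QuotaNode("org/team", limit=2)

assert org.children["team"].admit("flow-1")
assert org.admit("flow-2")
assert not org.admit("flow-3")      # subtree limit reached: admission denied
cancelled = org.cancel_subtree()    # admin cancels everything under org
print(sorted(cancelled))            # -> ['flow-1', 'flow-2']
```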

Reliability for free

Your flow calls an external API. The worker dies mid-request. Flowmetal replays from the last checkpoint, declares the interrupted request failed, and your retry logic picks up where it left off — on a different worker. You didn’t write any recovery code. It’s structural.

Easy to operate

Flowmetal workers are stateless. They interpret Starlark. They don’t run your builds, hold your connections, or store your state. A worker dying costs nothing because the flow replays from checkpoint. Scale up, scale down, roll workers — flows don’t notice.

How It Works

             +--------------+
 Users  <--> |  Client API  |
             +--------------+
                    |
                    V
             +--------------+
             |   Flow DB    | <--> Connectors
             +--------------+
                    ^
                    |
             +--------------+
Workers <--> |  Worker API  |
             +--------------+

Flowmetal is a grid execution environment: a user-facing API, a database, a worker-facing API, and a fleet of stateless workers.

flowmetal run --context=default ./hello.flow assembles this script and its dependencies into a bundle, uploads it to the grid, and requests instantiation under the connected user’s role and quota.

A worker picks up the flow, downloads the bundle, and executes the Starlark it contains. Execution happens with normal Python-style semantics — __name__ is bound to __flow__, and a global flow context is established.
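The `__name__` binding above can be demonstrated with plain Python. This is a minimal sketch, not Flowmetal's actual loader: it executes a script with `__name__` bound to `"__flow__"`, so the familiar entry-point guard fires under flow execution but not when the same file is loaded as a module.

```python
# A toy script using the entry-point guard from the README example.
script = '''
def hello():
    return "Hello from the flow entry point"

result = None
if __name__ == "__flow__":
    result = hello()
'''

# Execute as a flow: the runtime binds __name__ to "__flow__".
flow_env = {"__name__": "__flow__"}
exec(script, flow_env)
print(flow_env["result"])        # the guard fired

# Execute as an ordinary module load: the guard does not fire.
import_env = {"__name__": "some_module"}
exec(script, import_env)
print(import_env["result"])      # -> None
```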

Every fctx.actions.* call is written into a commit log before any side effects are performed. This means it is always safe to restart any flow from its last committed log state. The worst case is that the last requested action went awry and was lost, which the runtime cleans up by declaring it a failure and producing an error response.
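The log-then-act rule can be sketched in miniature. The names here are hypothetical (this is not Flowmetal's code): each action commits its intent to the log before its side effect runs, so on replay a finished action returns its recorded result without re-executing, and the single action that was in flight when the worker died is declared failed.

```python
log = []   # durable commit log: (action, result) pairs; result None means in flight

def run_action(index, action, perform):
    if index < len(log):
        recorded_action, result = log[index]
        assert recorded_action == action          # replay must be deterministic
        if result is None:
            # The worker committed the intent but never recorded a result:
            # declare the action failed so the flow's retry logic can run.
            log[index] = (action, "error: interrupted")
            return "error: interrupted"
        return result                             # already done; side effect not replayed
    log.append((action, None))                    # commit intent first...
    result = perform()                            # ...then perform the side effect
    log[-1] = (action, result)
    return result

# First execution: one action completes, then the worker dies mid-action.
run_action(0, "http_get", lambda: "200 OK")
log.append(("sleep", None))                       # simulated crash: intent without result

# Replay from the log on a fresh worker.
print(run_action(0, "http_get", lambda: "should not run"))  # -> 200 OK
print(run_action(1, "sleep", lambda: "done"))               # -> error: interrupted
```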

Connector services extend the built-in actions (clock, sleep, inter-flow messaging, HTTP) with specialized capabilities by listening to job logs and asynchronously returning results.
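The connector pattern can be sketched as a small loop. This is a hypothetical illustration, not a real Flowmetal connector: in-memory queues stand in for the Flow DB, and the connector tails the job log for actions it knows how to serve, writing results back asynchronously.

```python
import queue

job_log = queue.Queue()   # actions awaiting a connector (stand-in for the Flow DB)
results = {}              # action id -> result, visible to the flow on replay

def http_connector():
    """Serve one pending 'http' action from the log, if any."""
    try:
        action_id, kind, payload = job_log.get_nowait()
    except queue.Empty:
        return
    if kind == "http":
        # A real connector would perform the request; we fake the response.
        results[action_id] = "GET {} -> 200".format(payload)

# The flow requests an action by committing it to the log, then waits
# for the connector to publish a result.
job_log.put(("a1", "http", "https://example.com"))
http_connector()
print(results["a1"])   # -> GET https://example.com -> 200
```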

We are indebted to Goldstein et al. for A.M.B.R.O.S.I.A., which describes this approach in detail.

I would like to learn more!

We’ve also prepared some other material that doesn’t fit here:

A live prototype console with some standing traffic is available at https://console.flowmetal.io/console.

Can I try it?

While these ideas have had years of refinement, Flowmetal is in the early stages of development.

In the meantime, feedback and inquiries may be directed to flowmetal AT tirefireind DOT us. We’d love to hear what you’d like to use a platform like this for, or any commentary on the DSL!