Cloud

Embedded integration tests,
run like software tests.

Most teams can't script "inject a gyro fault at t=1000ms, press reset at t=3000ms": a test like that needs a person at a bench. Here it's a few lines of TypeScript: build the board in code, flash real firmware, schedule faults in simulation time, and assert on what the firmware did. It runs on every commit, in CI.

smart-fan-controller.test.ts
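// `graph` is the board graph obtained from the SDK (setup not shown here).
// Build the board: an STM32F405 MCU board plus an MPU6050 IMU.
const mcu = graph.addComponent(Components.ADAFRUIT_STM32F405_EXPRESS)
const imu = graph.addComponent(Components.MPU6050)

// Flash real firmware and wire the I2C bus pin to pin.
mcu.setFlash("./firmware.elf")
graph.connect(imu.pins.sda, mcu.pins.sda)
graph.connect(imu.pins.scl, mcu.pins.scl)

// Bounded run: 5000 ms of simulation time.
const run = await graph.run({ duration: 5000 })

// Schedule the gyro fault and the reset press in simulation time.
await run.at(1000).do(() => imu.setYGyro(5))
await run.at(3000).do(() => mcu.pressReset())

// Collect the recorded logs to assert on what the firmware did.
const logs = await run.logs()

TOPOLOGY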

Boards are code

Compose MCUs, sensors, and displays into a graph and connect bus-level signals — pin to pin, not string to string. The board definition lives in your repo, versioned with the firmware it tests.
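
The "pin to pin, not string to string" idea can be sketched in plain TypeScript. This is an illustrative model only, not the SDK: `SignalKind`, `Pin`, `pin()`, and `Board` are hypothetical names, and the real SDK types may look quite different.

```typescript
// Illustrative model only: none of these names come from the SDK.
type SignalKind = "i2c_sda" | "i2c_scl";

interface Pin {
  readonly owner: string;
  readonly name: string;
  readonly signal: SignalKind;
}

function pin(owner: string, name: string, signal: SignalKind): Pin {
  return { owner, name, signal };
}

class Board {
  private readonly nets: Array<[Pin, Pin]> = [];

  // Connections take pin objects, so a mismatched pair is rejected here
  // instead of surfacing later as a bus that never answers.
  connect(a: Pin, b: Pin): void {
    if (a.signal !== b.signal) {
      throw new Error(
        `cannot connect ${a.owner}.${a.name} (${a.signal}) to ${b.owner}.${b.name} (${b.signal})`
      );
    }
    this.nets.push([a, b]);
  }

  netlist(): string[] {
    return this.nets.map(([a, b]) => `${a.owner}.${a.name} <-> ${b.owner}.${b.name}`);
  }
}

const mcuPins = { sda: pin("mcu", "sda", "i2c_sda"), scl: pin("mcu", "scl", "i2c_scl") };
const imuPins = { sda: pin("imu", "sda", "i2c_sda"), scl: pin("imu", "scl", "i2c_scl") };

const board = new Board();
board.connect(imuPins.sda, mcuPins.sda);
board.connect(imuPins.scl, mcuPins.scl);
console.log(board.netlist().join("\n"));
```

Because the pins are object properties rather than strings, a misspelled pin name fails at compile time, and the signal check catches SDA wired to SCL before anything runs.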

OBSERVABILITY

Streams and records

Subscribe to live streams — sensor outputs, CPU registers — or record USART, I2C, and RTT during the run and query the logs after. Runs are persisted, so postmortems don't require reproducing anything.
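
Querying recorded logs after a run might look roughly like the sketch below. It is a self-contained model, not the SDK: the `LogRecord` shape, the `queryLogs` helper, and the sample records are all made up for illustration.

```typescript
// Hypothetical record shape; the SDK's real log format is not shown in this page.
type LogChannel = "usart" | "i2c" | "rtt";

interface LogRecord {
  readonly timeMs: number; // simulation time, not wall-clock time
  readonly channel: LogChannel;
  readonly text: string;
}

// Returns records on one channel inside the half-open window [fromMs, toMs).
function queryLogs(
  records: readonly LogRecord[],
  channel: LogChannel,
  fromMs: number,
  toMs: number
): LogRecord[] {
  return records.filter(
    (r) => r.channel === channel && r.timeMs >= fromMs && r.timeMs < toMs
  );
}

// Made-up sample data standing in for a persisted run.
const recorded: LogRecord[] = [
  { timeMs: 10, channel: "usart", text: "boot ok" },
  { timeMs: 1002, channel: "i2c", text: "gyro read" },
  { timeMs: 1005, channel: "usart", text: "gyro fault detected" },
  { timeMs: 3001, channel: "usart", text: "boot ok" },
];

// What did the firmware print between the fault and the reset?
const afterFault = queryLogs(recorded, "usart", 1000, 3000);
console.log(afterFault.map((r) => `${r.timeMs} ms: ${r.text}`).join("\n"));
```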

AUTOMATION

Built for CI

Bounded runs in simulation time, scheduled fault injection, and queryable results — hardware regression tests that run on every commit, with no hardware in the loop.
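
As a rough mental model of "bounded runs" with "scheduled fault injection", consider a tiny event scheduler that fires actions in simulation-time order and stops at the run's duration. `BoundedRun` and its methods are hypothetical and only mimic the shape of the idea; this is not how the simulator is implemented.

```typescript
interface ScheduledAction {
  readonly atMs: number;
  readonly action: () => void;
}

// Hypothetical stand-in for a run with a fixed duration in simulation time.
class BoundedRun {
  private readonly durationMs: number;
  private readonly scheduled: ScheduledAction[] = [];

  constructor(durationMs: number) {
    this.durationMs = durationMs;
  }

  at(atMs: number, action: () => void): this {
    this.scheduled.push({ atMs, action });
    return this;
  }

  // Fires actions in time order; anything at or past the duration never runs.
  // Returns the simulation times at which actions fired.
  execute(): number[] {
    const fired: number[] = [];
    const ordered = [...this.scheduled].sort((a, b) => a.atMs - b.atMs);
    for (const item of ordered) {
      if (item.atMs >= this.durationMs) break;
      item.action();
      fired.push(item.atMs);
    }
    return fired;
  }
}

const timeline: string[] = [];
const firedAt = new BoundedRun(5000)
  .at(3000, () => timeline.push("reset pressed"))
  .at(1000, () => timeline.push("gyro fault injected"))
  .at(6000, () => timeline.push("outside the run"))
  .execute();
console.log(firedAt, timeline);
```

Because the schedule is expressed in simulation time rather than wall-clock time, the same test gives the same ordering on a laptop and on a loaded CI runner.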

Digital Twins

Tests end. Twins keep running.

A run with a duration is a test. Take the duration away and the same graph becomes a digital twin: a simulated copy of your product that accumulates uptime for weeks, queryable through the SDK the whole time.

fleet monitor — fan-ctrl-0412 drift detected

Physical unit

stalled

Unit in the field. The fan stopped, the loop is crawling — but on its own it just looks quiet.

Digital twin

nominal

Same firmware, same inputs, simulated. This is what the unit should be doing right now.

Signal      Physical    Twin        Offset
fan_rpm     0           1,450       Δ −1,450 ▲
loop_time   212 ms      12 ms       Δ +200 ms ▲
heap_free   9.4 KB      31.6 KB     Δ −22.2 KB ▲
uptime      31d 04:12   31d 04:12   Δ 0

Run a twin next to the real thing. When the two drift apart, the offsets tell you what failed — before the support ticket does.
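
The offset comparison can be sketched as a small function. The signal values below are the ones from the fleet monitor panel; the `findDrift` helper, the key names, and the tolerance values are assumptions for illustration, not part of the SDK.

```typescript
type Telemetry = Record<string, number>;

interface Drift {
  readonly signal: string;
  readonly physical: number;
  readonly twin: number;
  readonly offset: number; // physical minus twin
}

// Flags every signal whose offset exceeds its tolerance (default tolerance: 0).
function findDrift(physical: Telemetry, twin: Telemetry, tolerance: Telemetry): Drift[] {
  const drifted: Drift[] = [];
  for (const signal of Object.keys(twin)) {
    const p = physical[signal];
    const t = twin[signal];
    if (p === undefined || t === undefined) continue; // signal missing on one side
    const offset = p - t;
    if (Math.abs(offset) > (tolerance[signal] ?? 0)) {
      drifted.push({ signal, physical: p, twin: t, offset });
    }
  }
  return drifted;
}

// Values from the panel; uptime 31d 04:12 expressed in seconds.
const physicalUnit: Telemetry = { fan_rpm: 0, loop_time_ms: 212, heap_free_kb: 9.4, uptime_s: 2693520 };
const digitalTwin: Telemetry = { fan_rpm: 1450, loop_time_ms: 12, heap_free_kb: 31.6, uptime_s: 2693520 };

// Hypothetical tolerances; real thresholds depend on the product.
const allowed: Telemetry = { fan_rpm: 50, loop_time_ms: 5, heap_free_kb: 2 };

const drift = findDrift(physicalUnit, digitalTwin, allowed);
console.log(drift.map((d) => `${d.signal}: offset ${d.offset}`).join("\n"));
```

With these inputs the fan speed, loop time, and free heap are flagged, while uptime is not, which matches the panel.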

The long-run bugs that live in firmware

Memory leaks, heap fragmentation, counter wraps, watchdog timeouts: bugs that need days of continuous runtime to show up, long after the bench test passed. A twin is never torn down: let it accumulate runtime, snapshot at any moment, and query the heap and stack history the instant something drifts. It won't model temperature, analog effects, or flash wear, but it will show you the heap that leaks four bytes per transaction.
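
Counter wraps are a good example of why uptime matters. The sketch below models a 32-bit millisecond tick counter in TypeScript (the firmware itself would typically be C); the function names are illustrative. Such a counter wraps after 2^32 ms, a little under 50 days, so a timeout check that compares against a wrapped deadline only misbehaves near that boundary.

```typescript
const U32_RANGE = 2 ** 32;

// Days of uptime before a 32-bit millisecond counter wraps to zero.
const daysToWrap = U32_RANGE / (24 * 60 * 60 * 1000);

// Buggy pattern: the deadline itself wraps, so shortly before the counter
// wraps the comparison reports "expired" far too early.
function naiveExpired(now: number, start: number, timeoutMs: number): boolean {
  const deadline = (start + timeoutMs) >>> 0; // wraps like uint32_t addition
  return now >= deadline;
}

// Wrap-safe pattern: unsigned subtraction yields the true elapsed time
// as long as less than 2^32 ms have passed since `start`.
function wrapSafeExpired(now: number, start: number, timeoutMs: number): boolean {
  const elapsed = (now - start) >>> 0;
  return elapsed >= timeoutMs;
}

// 500 ms before the wrap, start a 1000 ms timeout and check it 100 ms later.
const start = U32_RANGE - 500;
const now = U32_RANGE - 400;

console.log(daysToWrap.toFixed(1), "days until wrap");
console.log("naive:", naiveExpired(now, start, 1000));
console.log("wrap-safe:", wrapSafeExpired(now, start, 1000));
```

Here the naive check reports the timeout as expired after only 100 ms, while the wrap-safe check does not. A bench test that runs for minutes would never reach this state.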

Rehearse the OTA before production sees it

Push a firmware update to a fleet of twins before a single production device — each with its own hardware config, firmware version, and simulated sensor environment. Watch which configurations apply cleanly and which fall back, all from the same SDK calls your tests already use. It exercises your update logic, not your radio.
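
A rehearsal report could be assembled along these lines. This is a sketch under stated assumptions: `TwinConfig`, `rehearse`, the fleet entries, and the stand-in update rule are all hypothetical, and in practice the outcome would come from actually running the update on each twin rather than from a predicate.

```typescript
type Outcome = "applied" | "fell_back";

interface TwinConfig {
  readonly id: string;
  readonly hardwareRev: string;
  readonly firmwareVersion: string;
}

// Runs one update attempt per twin and groups twin ids by outcome.
function rehearse(
  fleet: readonly TwinConfig[],
  attempt: (twin: TwinConfig) => Outcome
): Record<Outcome, string[]> {
  const report: Record<Outcome, string[]> = { applied: [], fell_back: [] };
  for (const twin of fleet) {
    report[attempt(twin)].push(twin.id);
  }
  return report;
}

// Made-up fleet for illustration.
const twinFleet: TwinConfig[] = [
  { id: "twin-a", hardwareRev: "B", firmwareVersion: "2.3.0" },
  { id: "twin-b", hardwareRev: "A", firmwareVersion: "1.9.4" },
  { id: "twin-c", hardwareRev: "B", firmwareVersion: "2.1.7" },
];

// Stand-in rule: pretend that 1.x firmware cannot take the update directly.
const otaReport = rehearse(twinFleet, (twin) =>
  twin.firmwareVersion.startsWith("1.") ? "fell_back" : "applied"
);
console.log(otaReport);
```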

Agents

And it pairs well with Claude Code.

None of the above needs AI. But if your team uses coding agents, the SDK is what makes them useful on embedded projects: agents are only as good as their feedback loop, and on embedded the loop usually ends at "flash it and see." The SDK closes it — every simulation result is queryable text, which is exactly what an agent can reason about.

Agent + lab bench

  • OpenOCD configs, probe drivers, and a J-Link that works on Tuesdays — before the agent writes a single line
  • Output lives on a scope screen and a serial terminal — the agent can't read either
  • Press reset, rewire a sensor, reflash — every step needs a human at the desk
  • One board, one test at a time, business hours only

Agent + Simulator86

  • npm install — the whole lab is an API key
  • Every signal is queryable text: logs, registers, bus traffic, RTT
  • Reset, rewire, inject faults — all SDK calls the agent makes itself
  • Parallel runs, overnight, in CI

Write, run, read, fix — unattended

Point Claude Code at a repo with the SDK installed and it can do what no agent can do against a physical bench: compile your firmware, run it on a simulated board, read the logs, registers, and bus traffic, and iterate until the test passes — overnight, in parallel, without a board on anyone's desk. Your engineers review passing runs instead of babysitting benches.

Trust the run, not the model

Your team is right to be skeptical of AI-written firmware — in embedded, a hallucination isn't a bad merge, it's a field failure. That's the point of the SDK: nothing the agent writes has to be believed. Every change must survive a simulated run before a human sees it, and what reaches review is code plus evidence — the logs, register transitions, and bus traffic to prove it behaves.

Introduce Simulation Culture to Your Team

Imagination is more important than knowledge.

· A. Einstein ·

Start Building