Agent Flight Recorder - a Hugging Face Space by RFTSystems

RFTSystems · January 9, 2026, 12:04am

Who’s had enough of unverifiable ‘memory’ claims, silent edits, and post-hoc explanations that don’t survive scrutiny?

Hi everyone, I’ve just published a new Space meant to solve a problem I keep seeing in agent demos and “AI memory” claims: people describe what happened, but you can’t independently verify it. This Space is a proof-first flight recorder for AI/agent runs.

Instead of asking anyone to trust logs, screenshots, or vibes, it produces a tamper-evident, hash-chained event timeline and lets you export a ZIP bundle that any third party can verify locally.

What it does (in plain terms)

Records each step of an agent run as an event (prompt, tool call/result, output, memory read/write, retrieval, errors, notes, etc.)
Stores events in an append-only JSONL log where every event links to the previous one via hash (prev_event_hash_sha256)
If a single line is edited, removed, or reordered, verification fails (that’s the point)
Optional Ed25519 signatures if you want cryptographic “this came from my key” provenance
Finalisation commits a session anchor and the recorder refuses any further writes to that session (no post-hoc “quiet changes”)
Exportable proof bundles (rft_flight_bundle_<session_id>.zip) that others can upload back into the Space (or verify locally)
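
For anyone who wants the hash-chain idea in code, here is a minimal sketch. It is not the Space's actual implementation: only the prev_event_hash_sha256 field name comes from the description above. The other field names (seq, type, payload), the all-zero genesis value, and the choice to hash the exact JSONL line are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed "previous hash" for the very first event


def line_hash(line: str) -> str:
    """SHA-256 of the exact JSONL line as written (an assumed hashing scheme)."""
    return hashlib.sha256(line.encode("utf-8")).hexdigest()


def append_event(log_lines: list, event_type: str, payload: dict) -> None:
    """Append one event that links back to the line before it."""
    event = {
        "seq": len(log_lines),   # hypothetical field
        "type": event_type,      # hypothetical field
        "payload": payload,      # hypothetical field
        "prev_event_hash_sha256": line_hash(log_lines[-1]) if log_lines else GENESIS,
    }
    log_lines.append(json.dumps(event, sort_keys=True))


def verify_chain(log_lines: list) -> bool:
    """Walk the log from the top and recompute every link."""
    prev = GENESIS
    for expected_seq, line in enumerate(log_lines):
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            return False
        if not isinstance(event, dict):
            return False
        if event.get("seq") != expected_seq:
            return False  # a line was removed or reordered
        if event.get("prev_event_hash_sha256") != prev:
            return False  # an earlier line was edited
        prev = line_hash(line)
    return True


log = []
append_event(log, "prompt", {"text": "What is the weather?"})
append_event(log, "tool_call", {"name": "weather_api"})
append_event(log, "output", {"text": "Sunny"})
print(verify_chain(log))                       # True
print(verify_chain([log[0], log[2], log[1]]))  # False (reordered)
```

One thing this sketch makes visible: the chain on its own cannot notice the newest line being edited or trimmed off, because nothing points at it yet. In this sketch that gap would have to be closed by something like the session anchor written at finalisation.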

The UX: no guessing what to click

There’s a Quickstart (1-click) tab that runs a complete demo flow:
Start session → append events → verify → finalise → export bundle
…and it auto-fills the session id across the other tabs so you can explore without getting lost.
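
The Quickstart flow can be pictured as a small session object. This is a rough sketch under assumptions, not the Space's code: the FlightSession class, its method names, the anchor fields, and the file names inside the ZIP are all hypothetical. Only the rft_flight_bundle_<session_id>.zip naming and the "no writes after finalisation" rule come from the post.

```python
import hashlib
import io
import json
import uuid
import zipfile

GENESIS = "0" * 64  # assumed "previous hash" for the first event


class FlightSession:
    """Hypothetical in-memory session: start, append, verify, finalise, export."""

    def __init__(self):
        self.session_id = uuid.uuid4().hex
        self.lines = []      # append-only JSONL lines
        self.anchor = None   # set once by finalise()

    def _head(self) -> str:
        if not self.lines:
            return GENESIS
        return hashlib.sha256(self.lines[-1].encode("utf-8")).hexdigest()

    def append(self, event_type: str, payload: dict) -> None:
        if self.anchor is not None:
            raise RuntimeError("session is finalised; no further writes")
        event = {
            "type": event_type,
            "payload": payload,
            "prev_event_hash_sha256": self._head(),
        }
        self.lines.append(json.dumps(event, sort_keys=True))

    def verify(self) -> bool:
        prev = GENESIS
        for line in self.lines:
            if json.loads(line).get("prev_event_hash_sha256") != prev:
                return False
            prev = hashlib.sha256(line.encode("utf-8")).hexdigest()
        # After finalisation the head must also match the anchor.
        return self.anchor is None or self.anchor["head_hash_sha256"] == prev

    def finalise(self) -> dict:
        self.anchor = {
            "session_id": self.session_id,
            "event_count": len(self.lines),
            "head_hash_sha256": self._head(),
        }
        return self.anchor

    def export_bundle(self):
        # Assumption of this sketch: export only after finalisation,
        # following the Quickstart order.
        if self.anchor is None:
            raise ValueError("finalise the session before exporting")
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.writestr("events.jsonl", "\n".join(self.lines) + "\n")
            zf.writestr("anchor.json", json.dumps(self.anchor, sort_keys=True))
        return f"rft_flight_bundle_{self.session_id}.zip", buf.getvalue()


session = FlightSession()                      # start session
session.append("prompt", {"text": "hello"})    # append events
session.append("output", {"text": "hi there"})
print(session.verify())                        # verify
session.finalise()                             # finalise
name, data = session.export_bundle()           # export bundle
print(name)
```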

Why this exists

The real governance question isn’t “can agents access memory?”
It’s: can you prove they didn’t silently rewrite it?
This Space makes the “audit trail” the actual permission layer: if the history is changed, it won’t verify.

Included “brutal tests”

This repo includes a brutal_test.py script with two hard checks:

Two-tab spam test (concurrent writers): session must still PASS verification
Tamper ZIP test: modify exported event payload → import verification must FAIL
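
To give a feel for what these two checks exercise, here is a self-contained sketch. It is not the contents of brutal_test.py: the SharedLog class, the bundle layout (events.jsonl plus anchor.json), and every function name are assumptions. Two threads stand in for the two browser tabs, and a lock is one assumed way to keep the "read the head hash, then write" step atomic.

```python
import hashlib
import io
import json
import threading
import zipfile

GENESIS = "0" * 64  # assumed "previous hash" for the first event


def line_hash(line):
    return hashlib.sha256(line.encode("utf-8")).hexdigest()


class SharedLog:
    """Hypothetical log shared by several writers."""

    def __init__(self):
        self.lines = []
        self._lock = threading.Lock()

    def append(self, writer, n):
        with self._lock:  # reading the head and appending must not interleave
            prev = line_hash(self.lines[-1]) if self.lines else GENESIS
            event = {"writer": writer, "n": n, "prev_event_hash_sha256": prev}
            self.lines.append(json.dumps(event, sort_keys=True))


def chain_head(lines):
    """Return the hash of the last line, or None if any link is broken."""
    prev = GENESIS
    for line in lines:
        if json.loads(line).get("prev_event_hash_sha256") != prev:
            return None
        prev = line_hash(line)
    return prev


def make_bundle(lines, head):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("events.jsonl", "\n".join(lines) + "\n")
        zf.writestr("anchor.json", json.dumps({"head_hash_sha256": head}))
    return buf.getvalue()


def verify_bundle(data):
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        lines = zf.read("events.jsonl").decode("utf-8").splitlines()
        anchor = json.loads(zf.read("anchor.json"))
    head = chain_head(lines)
    return head is not None and head == anchor["head_hash_sha256"]


def two_writer_spam_test(events_per_writer=200):
    log = SharedLog()

    def spam(writer):
        for n in range(events_per_writer):
            log.append(writer, n)

    threads = [threading.Thread(target=spam, args=(w,)) for w in ("tab-a", "tab-b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(log.lines) == 2 * events_per_writer and chain_head(log.lines) is not None


def tamper_zip_test():
    log = SharedLog()
    for n in range(3):
        log.append("tab-a", n)
    head = chain_head(log.lines)
    good = make_bundle(log.lines, head)
    # Rewrite the payload of the last event but keep the original anchor.
    forged = list(log.lines)
    event = json.loads(forged[-1])
    event["n"] = 999
    forged[-1] = json.dumps(event, sort_keys=True)
    bad = make_bundle(forged, head)
    return verify_bundle(good) and not verify_bundle(bad)


print("spam test passed:", two_writer_spam_test())
print("tamper test passed:", tamper_zip_test())
```

In this sketch someone who rewrites anchor.json as well as the events would still get through, which is the kind of gap a signature over the anchor (for example the optional Ed25519 signing mentioned earlier) would be meant to close.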

Part of the RFTSystems verification suite

This Space is part of a wider collection focused on live verification / receipts / auditability. If you’re into verifiable agent behaviour, you’ll probably like the full suite.
Collection:

Related Spaces in the suite:

What would be great (feedback)

If you try it, I’d love feedback on:

missing event types you’d want for real agent runs
whether the exported bundle format is clear enough for third-party review
what would make “verification” feel more obvious to non-technical users
any edge cases where you think the recorder could be tricked

If you build agents and care about reproducibility + auditability, this should be useful. Cheers.

RFTSystems - Liam Grinstead

#AI #Agents #LLM #Auditability #Verification #Reproducibility #Security #MLOps #AIEngineering #Provenance #Cryptography #Ed25519 #HashChain #Forensics #Observability #Governance #RAG #MemorySystems #Gradio #OpenSource
