What an audit trail looks like.
Every picture 1snob AstroRoom outputs comes with a chain of hashes. The chain says yes, this is the picture I made, and yes, this is exactly how I made it. Both halves are necessary.
| seq | event | prev_hash | this_hash |
|---|---|---|---|
| 0 | session:created | 000… | 9c7af2… |
| 1 | archive:ingested · chandra · 47 obsids | 9c7af2… | 3f81ce… |
| 2 | compat:grouped · 3 stack-ready sets | 3f81ce… | 2d4ab8… |
| 3 | stack:built · sigma=2.5 · n=47 | 2d4ab8… | 71e90c… |
| 4 | recipe:applied · hubble_heritage | 71e90c… | 4ae512… |
| 5 | evidence:investigation_built · merkle=…b08f | 4ae512… | e0c34a… |
verify_chain(session_id). Each row stores the SHA-256 hash of the previous row. Change any row above and every hash from there forward changes.
The reproducibility crisis in science is a problem of not knowing what happened. The fix is not more notes; the fix is a record that resists being edited later. 1snob AstroRoom keeps two of them in parallel.
— 01 Why a regular log is not enough
A regular log is editable. You wrote down the parameters, you ran the analysis, you got a result; later, you tried a different parameter, got a different result, and updated the log to match. By the time the paper is written, the log is the version of the run that agrees with the paper. This is not malice. It is what software's save button does. The log lost the part of the truth where you and the data disagreed.
— 02 How a hash chain prevents that
A hash chain is a linked list in which each entry includes the SHA-256 hash of the entry before it. To change one entry without breaking the chain, you have to recompute every hash from there forward. 1snob AstroRoom's event ledger does exactly this: each session has a head row pointing at the last event's hash, and each event includes its own payload hash plus the hash of the previous event. The implementation lives in backend/app/services/event_ledger.py; the function verify_chain(session_id) walks the entire session and reports the first mismatch, if any.
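The linking described above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of event_ledger.py: the row layout, the all-zero genesis value, the JSON canonicalisation, and the helper names append_event and find_first_mismatch are all assumptions made for the example. It works on an in-memory list rather than a session id.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumption: the first event links to an all-zero hash


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the payload, then bind it to the previous event's hash."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    payload_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return hashlib.sha256((prev_hash + payload_hash).encode("ascii")).hexdigest()


def append_event(chain: list, payload: dict) -> dict:
    """Append a row whose prev_hash is the current head of the chain."""
    prev_hash = chain[-1]["this_hash"] if chain else GENESIS
    row = {
        "seq": len(chain),
        "payload": payload,
        "prev_hash": prev_hash,
        "this_hash": event_hash(prev_hash, payload),
    }
    chain.append(row)
    return row


def find_first_mismatch(chain: list):
    """Walk the chain; return the seq of the first bad row, or None."""
    expected_prev = GENESIS
    for row in chain:
        link_ok = row["prev_hash"] == expected_prev
        hash_ok = row["this_hash"] == event_hash(row["prev_hash"], row["payload"])
        if not (link_ok and hash_ok):
            return row["seq"]
        expected_prev = row["this_hash"]
    return None


chain = []
append_event(chain, {"event": "session:created"})
append_event(chain, {"event": "stack:built", "sigma": 2.5, "n": 47})
append_event(chain, {"event": "recipe:applied", "recipe": "hubble_heritage"})
```

Editing the payload of row 1 in place makes find_first_mismatch(chain) report row 1. Recomputing that row's own hash to hide the edit only moves the break to row 2, whose prev_hash no longer matches.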
— 03 How a Merkle root proves the whole run
For a complete pipeline run (ingest, stack, develop, evidence), 1snob AstroRoom also computes a Merkle root over every artefact produced. Each artefact is hashed individually; pairs of hashes are combined into parent nodes; parents are combined into grandparents; the final root is a single sixty-four-character hexadecimal string that, for practical purposes, uniquely fingerprints the whole run. Implementation: backend/app/services/evidence_merkle.py. The functions build_merkle_root and verify_merkle_root are deliberately built on the standard library's hashlib only, with no external dependencies. Hand a colleague a Merkle root and a manifest, and they can verify, on their own machine, that the manifest matches the run that produced it.
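The pairing scheme in the paragraph above can be sketched with hashlib alone. The functions merkle_root and matches_root below are illustrative stand-ins, since the source does not show the real signatures of build_merkle_root and verify_merkle_root. The use of SHA-256, the leaf and node prefix bytes, and the rule of carrying an unpaired node up unchanged are likewise assumptions.

```python
import hashlib

LEAF = b"\x00"  # assumed prefix so a leaf hash can never pass for a parent
NODE = b"\x01"  # assumed prefix for interior nodes


def merkle_root(artefacts: list) -> str:
    """Return the hex Merkle root over a non-empty, ordered list of bytes."""
    if not artefacts:
        raise ValueError("need at least one artefact")
    # Hash every artefact individually to form the leaf level.
    level = [hashlib.sha256(LEAF + blob).digest() for blob in artefacts]
    while len(level) > 1:
        parents = []
        # Combine neighbouring pairs into parent nodes.
        for i in range(0, len(level) - 1, 2):
            pair = NODE + level[i] + level[i + 1]
            parents.append(hashlib.sha256(pair).digest())
        # An unpaired last node is carried up to the next level unchanged.
        if len(level) % 2 == 1:
            parents.append(level[-1])
        level = parents
    return level[0].hex()


def matches_root(artefacts: list, expected_root: str) -> bool:
    """Recompute the root locally and compare it with the published one."""
    return merkle_root(artefacts) == expected_root.lower()


# Placeholder byte strings standing in for real pipeline artefacts.
run = [b"ingest-output", b"stack-output", b"develop-output"]
root = merkle_root(run)
```

A colleague holding the same artefacts in the same order gets matches_root(run, root) == True. Changing one byte of any artefact, or reordering them, yields a different root.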
— 04 What gets logged
Everything. The frames that went into the stack and the ones that didn't. The sigma-clip threshold. The exposure-time weight. The recipe id used by the false-colour engine. The user who clicked, and the second on which they clicked. The version of 1snob AstroRoom in production at that moment. The datasets pulled from Chandra, MAST, SkyView, with their archive timestamps. A peer reviewer with the manifest and the input set should be able to reproduce the exact pixel.
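As a rough picture of how the items listed above could travel together, here is a hypothetical manifest record with a deterministic digest. The class name RunManifest, every field name, and all the sample values are invented for illustration. The source does not describe the real manifest format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class RunManifest:
    """Hypothetical manifest; the field names are assumptions."""
    frames_used: tuple        # frames that went into the stack
    frames_rejected: tuple    # frames that did not
    sigma_clip: float         # sigma-clip threshold
    exposure_weight: float    # exposure-time weight
    recipe_id: str            # recipe used by the false-colour engine
    user: str                 # who clicked
    clicked_at: str           # when, to the second (ISO 8601)
    app_version: str          # software version in production
    datasets: tuple           # (archive, archive timestamp) pairs


def manifest_digest(manifest: RunManifest) -> str:
    """SHA-256 over a canonical JSON form, so equal manifests hash equally."""
    canonical = json.dumps(asdict(manifest), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder values only; none of these come from a real run.
manifest = RunManifest(
    frames_used=("frame-a", "frame-b"),
    frames_rejected=("frame-c",),
    sigma_clip=2.5,
    exposure_weight=1.0,
    recipe_id="hubble_heritage",
    user="example-user",
    clicked_at="2000-01-01T00:00:00Z",
    app_version="0.0.0",
    datasets=(("chandra", "2000-01-01T00:00:00Z"),),
)
digest = manifest_digest(manifest)
```

Because the serialisation sorts keys and fixes separators, two reviewers who build the same manifest get the same digest, and changing any single field, such as the sigma-clip threshold, changes it.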
— 05 Why this matters for the picture
A picture without a manifest is decoration. A picture with a manifest is evidence. The first one belongs in a brochure. The second one belongs in a paper. 1snob AstroRoom outputs both, but the second one is what the project is about. The whole architecture exists so that the answer to the question "how did you make this?" is a single hash, and so that anyone who cares can follow the hash back to the truth.