How we built sub-8ms sealing at the edge

May 11, 2026

The engineering decisions behind Sentinel's seal latency, mostly about what we removed.

Interception, not logging

The shift came when we stopped thinking of sealing as logging and started thinking of it as stream interception. The Sentinel SDK doesn't wait for a response to complete before it starts working. It opens a stream, intercepts the token flow at the edge node closest to the model's datacenter, and begins constructing the SHA-3 hash incrementally, token by token, as the tokens arrive. By the time the last token lands, 90% of the seal is already computed.
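
To make the incremental part concrete, here is a minimal sketch in Python (chosen for brevity; it is not the SDK's code). The class name `StreamSealer` is hypothetical, and SHA3-256 is an assumption, since all we have said above is "SHA-3". The interceptor forwards each token untouched and feeds it into the running hash as it passes.

```python
import hashlib
from typing import Iterable, Iterator


class StreamSealer:
    """Hypothetical interceptor: forwards tokens and hashes them as they pass."""

    def __init__(self) -> None:
        # Assumption: SHA3-256. The hash state is updated per token, so no
        # buffering of the full response is needed.
        self._hasher = hashlib.sha3_256()

    def intercept(self, tokens: Iterable[str]) -> Iterator[str]:
        for token in tokens:
            self._hasher.update(token.encode("utf-8"))
            yield token  # the caller sees the token unchanged

    def digest(self) -> str:
        # Once the stream ends, only the finalization step is left.
        return self._hasher.hexdigest()


sealer = StreamSealer()
received = list(sealer.intercept(["Hello", ", ", "world"]))
# Hashing incrementally gives the same digest as hashing the whole response.
same = sealer.digest() == hashlib.sha3_256(b"Hello, world").hexdigest()
```

Because the hash state is carried across tokens, the work left when the final token arrives is a single finalization step rather than a pass over the whole response.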

Removing the database from the hot path

Early versions wrote directly to a Postgres instance on every seal. That gave us clean reads but added 60–90ms of write latency to every call. We replaced it with a write-ahead log at the edge: a content-addressed, append-only structure that flushes to the Merkle ledger asynchronously. The seal commits in memory first. Ledger confirmation happens within 400ms in the background, and the caller never waits for it.
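
The shape of that structure can be sketched as follows. This is an illustration under stated assumptions, not our production log: `EdgeWriteAheadLog` and `ledger_sink` are hypothetical names, the sink is a stand-in for the Merkle ledger, and entries are keyed by a SHA3-256 digest of their content.

```python
import hashlib
import queue
import threading
from typing import Callable, Dict, List, Optional


class EdgeWriteAheadLog:
    """Hypothetical content-addressed, append-only log with a background flush."""

    def __init__(self, ledger_sink: Callable[[str, bytes], None]) -> None:
        self._entries: Dict[str, bytes] = {}  # content address -> seal bytes
        self._order: List[str] = []           # append order, never rewritten
        self._pending: "queue.Queue[Optional[str]]" = queue.Queue()
        self._sink = ledger_sink              # stand-in for the ledger write
        self._worker = threading.Thread(target=self._flush_loop, daemon=True)
        self._worker.start()

    def commit(self, seal: bytes) -> str:
        # Hot path: in-memory only. The key is derived from the content,
        # so committing the same seal twice is a no-op.
        key = hashlib.sha3_256(seal).hexdigest()
        if key not in self._entries:
            self._entries[key] = seal
            self._order.append(key)
            self._pending.put(key)
        return key

    def _flush_loop(self) -> None:
        # Background path: drain pending keys to the ledger in commit order.
        while True:
            key = self._pending.get()
            if key is None:  # sentinel value pushed by close()
                break
            self._sink(key, self._entries[key])

    def close(self) -> None:
        # Flush everything still queued, then stop the worker.
        self._pending.put(None)
        self._worker.join()
```

`commit` returns as soon as the entry is in memory; the call to `ledger_sink` happens on the worker thread, which is why the caller's latency does not include the ledger write.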

The TLS problem

The third thing we removed was TLS renegotiation on the internal hop between the interception layer and the edge node. Instead, we pre-warm persistent connections during the SDK handshake. That single change saved 22ms on cold paths.
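
The pattern is simple enough to show in a few lines. This is a sketch, not the SDK's connection layer: `PrewarmedHop` and `open_connection` are hypothetical names, and the connection factory is injected so the example stays independent of any particular transport. In practice the factory would be whatever opens the TLS connection to the edge node.

```python
from typing import Callable, Generic, Optional, TypeVar

C = TypeVar("C")


class PrewarmedHop(Generic[C]):
    """Hypothetical holder for one persistent connection, opened ahead of use."""

    def __init__(self, open_connection: Callable[[], C]) -> None:
        self._open = open_connection
        self._conn: Optional[C] = None

    def prewarm(self) -> None:
        # Called during the SDK handshake: the connection setup cost
        # is paid here, before any seal is requested.
        if self._conn is None:
            self._conn = self._open()

    def get(self) -> C:
        # Hot path: reuse the warm connection. Opening here is only a
        # fallback for the case where prewarm() was never called.
        if self._conn is None:
            self._conn = self._open()
        return self._conn
```

The first seal on a fresh SDK instance then takes the same path as every later one, which is where the cold-path saving comes from.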

What's left

After you remove the database, the post-processing model, and the TLS overhead, what remains is a tight stream processor running on a V8 isolate with one job: hash this, sign it, queue it. That's where 8ms lives: not in the code we wrote, but in the code we deleted.
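
For the "sign it, queue it" half of that job, a rough sketch looks like this. It is written in Python for readability rather than as code for a V8 isolate, and it rests on assumptions: `finalize_seal` is a hypothetical name, HMAC with SHA3-256 stands in for whatever signature scheme is actually used, and a `deque` stands in for the queue.

```python
import hashlib
import hmac
from collections import deque
from typing import Deque, Dict


def finalize_seal(
    digest: bytes, signing_key: bytes, outbox: Deque[Dict[str, str]]
) -> Dict[str, str]:
    """Sign an already-computed stream digest and queue the result."""
    # Assumption: a keyed MAC as a placeholder for the real signature step.
    signature = hmac.new(signing_key, digest, hashlib.sha3_256).hexdigest()
    record = {"digest": digest.hex(), "signature": signature}
    outbox.append(record)  # queued for the asynchronous ledger flush
    return record


outbox: Deque[Dict[str, str]] = deque()
stream_digest = hashlib.sha3_256(b"full model response").digest()
record = finalize_seal(stream_digest, b"demo-key", outbox)
```

Nothing in this step touches the network or a database, which is the point: everything slow has been moved either earlier (connection setup) or later (ledger confirmation).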

Lev Mikhailov

May 4, 2026

Sentinel in regulated industries

May 12, 2026

Who is liable when the model is wrong?

FAQ

Everything you need to know, answered.

Latency, data exposure, compliance, cost, self-hosting. We answer the hard ones here without the marketing language. If something is missing, the spec is public.

Does Sentinel add latency to my LLM calls?

Does Sentinel see my prompts or responses?

What happens if Sentinel goes down?

Is this compliant with HIPAA, SOC 2, and GDPR?

How is this different from just logging my LLM calls?

What does a seal actually contain?

The proof layer your stack is missing

Sentinel seals every prompt, every response, every inference. Ship AI with the same confidence you ship code.

Try free for 14 days