Engineering Evidence

We test wavebird the way we would want our infrastructure tested. Every claim on this page is backed by reproducible evidence from controlled benchmark runs and a comprehensive pre-pilot validation campaign with fault injection.

Last updated: 2026-03-30

What we measured

Speed

28.76 ms end-to-end

Measured runtime path from request entry to a sponsoring decision being ready, excluding the AI model’s own wait time. p99 means 99% of requests were this fast or faster.
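
To make the p99 definition above concrete, here is a generic nearest-rank percentile sketch (an illustration of the statistic itself, not the wavebird harness code; the sample latencies are made up):

```python
def percentile(latencies_ms, p):
    """Nearest-rank percentile: the value that p% of samples fall at or below."""
    ordered = sorted(latencies_ms)
    # Nearest-rank method: ceil(p/100 * n), expressed with ceiling division.
    rank = max(1, -(-len(ordered) * p // 100))
    return ordered[rank - 1]

# Ten hypothetical request latencies in milliseconds.
latencies = [12.1, 14.7, 9.8, 28.8, 15.2, 13.0, 11.4, 16.9, 10.2, 14.1]
p99 = percentile(latencies, 99)  # 28.8: with 10 samples, p99 is the slowest one
```

With small sample counts the p99 is dominated by the single slowest request, which is why benchmark runs use thousands of measured requests after warmup.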

Reliability

0 missing proofs

Over 8 hours (263,534 total slots processed) with fault injection active, we observed 0 missing proofs across the 260,321 terminal slots expected to produce proof.

Resilience

7 SSP failure scenarios

We simulated seven exchange failure modes (plus three PostgreSQL failures). Result: 0 crashes, 0 unrecoverable states, correct circuit breaker activation and recovery.

Terms used on this page
Slot
One sponsoring opportunity in the runtime (one decision attempt).
Proof
A signed evidence record produced for filled slots and used for audit and settlement.
Beacon
A post-render signal from the wrapper/app confirming a creative was rendered.
Mock-SSP
A simulated ad exchange response used to measure the internal ad path without public network noise.
Fault injection
Deliberate, randomized failures introduced during the campaign (latency jitter, slow responses, HTTP errors, malformed responses, drops, no-bid spikes).
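
A signed evidence record of the kind described under "Proof" can be illustrated with a generic HMAC sketch (the field names and key are hypothetical; this is not the actual proof-pack format or signing scheme):

```python
import hashlib
import hmac
import json

def sign_proof(slot_id, price_micro, key):
    """Serialize a minimal proof record and attach an HMAC-SHA256 signature."""
    record = {"slot_id": slot_id, "price_micro": price_micro}
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record

def verify_proof(record, key):
    """Recompute the signature over the unsigned fields and compare in constant time."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])

key = b"audit-demo-key"
proof = sign_proof("slot-001", 1_250, key)
valid = verify_proof(proof, key)  # True; tampering with any field breaks verification
```

The point of such a signature is that any later audit or settlement step can detect a record that was altered after the decision was made.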

How we tested

In March 2026, we ran our pre-pilot validation campaign: a set of automated tests designed to find problems before the first real partner connects. We did not test under ideal conditions. We deliberately broke things.

Our Mock-SSP chaos mode randomly injected network delays, server errors, malformed responses, dropped connections, and traffic spikes into the test runs. The goal is simple: prove correct behavior under failure before we connect a live partner.

Mock-SSP

Mock-SSP simulates an ad exchange response inside the benchmark harness and inside the pre-pilot chaos campaign so we can measure the internal ad path without public network noise.

Proof integrity

We processed 10,000 sponsoring slots at 100 concurrent connections with fault injection active. Result: 0 missing proofs, 0 invalid signatures, 0 orphaned beacons.

Settlement accuracy

We ran 5,000 slots through 6 billing scenarios — including micro-unit price boundaries, duplicate detection, and multi-SSP fallback attribution. Result: exact reconciliation in every scenario (0 billing errors).
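
A reconciliation pass with duplicate detection can be sketched as follows (a simplified illustration with hypothetical slot-record fields, not the actual settlement code; integer micro-units stand in for the price-boundary handling mentioned above):

```python
def reconcile(slot_records):
    """Sum fill prices in integer micro-units, counting each slot ID exactly once.

    Returns (total, duplicate_ids) so callers can assert exact reconciliation.
    """
    seen, duplicates, total = set(), [], 0
    for record in slot_records:
        if record["slot_id"] in seen:
            duplicates.append(record["slot_id"])  # billed once, flagged for review
            continue
        seen.add(record["slot_id"])
        total += record["price_micro"]  # integers avoid floating-point drift
    return total, duplicates

records = [
    {"slot_id": "a1", "price_micro": 1_250},
    {"slot_id": "a2", "price_micro": 990},
    {"slot_id": "a1", "price_micro": 1_250},  # duplicate delivery report
]
total, dupes = reconcile(records)
balanced = (total == 2_240)  # exact reconciliation: ledger total matches expectation
```

Working in integer micro-units is a common design choice here: it makes "exact reconciliation" a strict equality check rather than a tolerance comparison.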

Resilience

We tested 7 SSP failure scenarios plus 3 PostgreSQL failure scenarios. Result: 0 crashes and correct circuit breaker activation and recovery in all scenarios.

Found and fixed during the campaign

Settlement attribution bug in multi-SSP fallback: slots were incorrectly attributed to the timed-out primary SSP.

Full campaign details

In March 2026, we ran a comprehensive pre-pilot validation campaign with chaos fault injection active. The campaign tested proof integrity, settlement accuracy, resilience, concurrency limits, and sustained stability.

Proof Chain Integrity

  • 10,000 slots processed at concurrency 100 with chaos faults active.
  • 294 latency jitter faults, 42 slow responses, 18 HTTP errors, 7 malformed responses, and 7 connection drops injected.
  • Result: 0 missing proofs, 0 invalid signatures, 0 orphaned beacons.
  • Every filled slot has a correctly signed proof pack.

Settlement Accuracy

  • 5,000 slots across 6 test scenarios.
  • Standard mixed-outcome run, micro-unit price boundaries, multi-SSP fallback attribution, duplicate detection, CS profile breakdown, and 30-minute duration stability.
  • Result: exact reconciliation in all scenarios, 0 billing errors.

Found and fixed: settlement attribution bug in multi-SSP fallback. Slots were incorrectly attributed to the timed-out primary SSP.

Resilience Under Failure

  • 7 SSP failure scenarios tested: connection refused, timeout, HTTP 500, HTTP 429, partial failure with fallback, flapping, and slow response.
  • 3 PostgreSQL failure scenarios: mid-runtime drop, never available, and slow queries.
  • Redis fail-policy: explicitly changed from implicit fail-open to configurable fail-closed (`CSL_RATE_LIMIT_REDIS_FAIL_POLICY`).
  • Result: 0 crashes, 0 unrecoverable states, correct circuit breaker activation and recovery in all scenarios.
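
The circuit breaker behavior referenced above follows a standard pattern: open after repeated failures, block calls while open, then allow a probe after a cooldown. A minimal generic sketch (not wavebird's actual implementation; thresholds are illustrative):

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; probe again after `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=5.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures, self.opened_at = 0, None

    def allow(self):
        if self.opened_at is None:
            return True  # closed: calls pass through
        # Half-open: let a probe through once the cooldown has elapsed.
        return time.monotonic() - self.opened_at >= self.cooldown

    def record_success(self):
        self.failures, self.opened_at = 0, None  # close the breaker

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()  # open the breaker

breaker = CircuitBreaker(threshold=2, cooldown=0.01)
breaker.record_failure()
breaker.record_failure()        # breaker opens
blocked = breaker.allow()       # False: calls are rejected while open
time.sleep(0.02)
probe = breaker.allow()         # True: half-open probe allowed after cooldown
breaker.record_success()        # successful probe closes the breaker again
```

"Correct activation and recovery" in the results above corresponds to this cycle: the breaker opens under a failing dependency, sheds load instead of crashing, and closes once the dependency recovers.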

Sustained Load (8 Hours)

  • 263,534 slots processed over 8 hours with chaos faults active.
  • 0 missing proofs across 260,321 proofable terminal slots.
  • All 8 hourly quality gates passed.
  • 0 handle leaks; metric cardinality remained bounded (growing only from 67 to 81).
  • Chaos faults injected: 242 latency jitter, 38 slow responses, 12 HTTP errors, 8 malformed responses, 8 connection drops, 83 no-bid spikes.

Open finding: in-memory accumulation causes memory growth over extended runs. Slot eviction and ledger compaction are implemented and active. This is under continued optimization.

Under load

We pushed the system from 10 to 200 concurrent connections to find where it starts to struggle. The answer: it never crashes. It gets slower, but it keeps working.

“c100” means 100 concurrent connections.

Concurrent connections   Response time (p99)   Throughput   Errors
10                       64 ms                 333 ops/s    0
25                       293 ms                126 ops/s    0
50                       695 ms                92 ops/s     0
75                       1,203 ms              73 ops/s     0
100                      1,764 ms              64 ops/s     0
150                      3,267 ms              52 ops/s     0
200                      3,590 ms              33 ops/s     0

At 200 concurrent connections, p99 response time increases to 3.6 seconds but every response is still valid (2xx). Under that extreme load we see decision poll timeouts; when load drops back to 25 connections, the system recovers within 30 seconds.

How to read this table

The “Errors” column is HTTP-level errors. In these runs, every response was 2xx at every concurrency level. Under extreme load we do observe decision poll timeouts (2 at c100, 130 at c150, and 1,871 at c200). The system degrades gracefully rather than failing hard. Spike recovery from c200 to c25 completes within 30 seconds.

Sustained operation (8 hours)

We ran the system continuously for 8 hours with fault injection active, processing 263,534 sponsoring slots. All 8 hourly quality gates passed. 0 missing proofs across 260,321 terminal slots expected to produce proof. 0 handle leaks.

Open finding

What we found: memory usage grows over extended runs because in-memory state accumulates faster than it is cleaned up. Slot eviction and ledger compaction are implemented and active. This is under continued optimization.

Detailed methodology

The benchmark suite and the pre-pilot campaign were both run under controlled conditions. The goal was to measure the wavebird runtime itself, not the public internet or live model providers.

Evidence date
2026-03-23 (benchmarks), 2026-03-30 (pre-pilot campaign)
Execution mode
Local host benchmark harness
Exchange substitute
Mock-SSP
Runs
7
Warmup requests
1000
Measured requests per run after warmup
Not yet published in the current sanitized evidence bundle
Selection method
Median per benchmark
Pre-pilot campaign
Chaos fault injection via configurable Mock-SSP chaos mode with latency jitter (30%), slow responses (3%), HTTP errors (2%), malformed responses (1%), connection drops (0.5%), and periodic no-bid spikes.
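
The published injection rates imply a per-request fault draw roughly like the following (an illustrative sketch using the configured rates above; the function and fault names are hypothetical, and real requests and slots need not map one-to-one):

```python
import random

# Per-request fault probabilities from the campaign configuration above.
FAULT_RATES = {
    "latency_jitter": 0.30,
    "slow_response": 0.03,
    "http_error": 0.02,
    "malformed_response": 0.01,
    "connection_drop": 0.005,
}

def draw_fault(rng):
    """Independently roll each fault type; return the faults injected for one request."""
    return [name for name, rate in FAULT_RATES.items() if rng.random() < rate]

rng = random.Random(42)  # seeded for reproducible chaos runs
faults = [draw_fault(rng) for _ in range(10_000)]
jitter_count = sum("latency_jitter" in f for f in faults)  # ≈ 3,000 of 10,000
```

Seeding the random source is what makes a chaos campaign reproducible: the same seed replays the same fault schedule against a fixed build.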

Per-run variation exists internally and will be published once the sanitized artifact bundle is ready. The original benchmark methodology remains unchanged and the March 23 results remain valid.

Full benchmark metrics

Benchmarks

March 23, 2026

Firewall p99 latency

0.22 ms

Filtering step before any ad request leaves the runtime.

Mock-SSP round-trip p99 latency

15.28 ms

Internal ad path against a controlled exchange substitute.

End-to-end p99 latency

28.76 ms

Measured runtime path with external model wait time excluded.

Settlement max runtime

887.58 ms

Longest measured settlement run in the current evidence pack.

Mock-SSP request throughput

1,364.82 ops/s

Controlled request throughput inside the benchmark harness.

Pre-pilot campaign

March 30, 2026

Proof integrity

10,000 slots

Processed at c100 with 0 missing proofs.

Settlement accuracy

5,000 slots

6 scenarios with exact reconciliation.

SSP resilience

7 failure modes

0 crashes across SSP failure scenarios.

Concurrency tested

c10–c200

Graceful degradation under spike load.

Sustained load

263,534 slots

Processed over 8 hours with 0 proof gaps.

What this does not claim

We are transparent about what this evidence does and does not prove:

  • These are internal measurements, not third-party audits.
  • Latency was measured locally, not across the public internet or live model providers.
  • The exchange partner was simulated (Mock-SSP), not a live partner.
  • These numbers are not a production SLA.
  • The first live partner integration is the next milestone.

What is still open

Two things are not where we want them yet: beacon processing slows down above 50 concurrent connections, and the 8-hour sustained run shows more memory growth than our target allows. Both are under active optimization.

  • Beacon p99 at concurrency 50 and 100 remains above target in the original benchmark suite.
  • In-memory state accumulation during extended sustained load is under active optimization. Slot eviction and ledger compaction are implemented and reducing growth, but the 8-hour soak test does not yet meet the <20% memory growth target.
  • Jobs/sec remains below target in the original benchmark suite.

Artifacts

Downloadable artifacts will be published once the sanitized bundle is ready for public release. Pre-pilot campaign reports are available internally as machine-readable JSON artifacts.

Next step

See how it integrates

If the runtime evidence is what you needed, the next step is the integration path.