Evidence
Engineering Evidence
The primary evidence on this page is the 7-run public-v2 performance test pack: repeated local-host benchmark runs with published sample counts, target status, run spread, and future evaluation items.
Partner-facing proof
Can I prove what happened?
Yes. wavebird connects request logs, decision state, render beacons, click beacons, settlement rows, and proof exports into one operator-presentable path.
What can be proven
Decision, render, beacon, settlement
The evidence path shows which slot was requested, what decision returned, which beacons fired, and how billable settlement values were computed.
Operator demo
Admin trace + export
For pilot conversations, the operator can show dashboard logs, proof records, and settlement exports without exposing secrets or raw user data.
Current scope
Pilot first, SSP live later
This page supports partner confidence without claiming public ad-market readiness before configured live partners are active.
Evidence pack checklist
A request id links API calls, decisions, and errors.
Render and visibility beacons create the lifecycle trace.
Settlement export contains the commercial math for review.
Proof integrity material explains how records can be audited.
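As a rough illustration of the first checklist item, the sketch below groups lifecycle records by request id so that one request reads as a single trace. The record fields and event kinds are hypothetical and are not the wavebird schema.

```python
from collections import defaultdict

# Hypothetical lifecycle records; field names are illustrative assumptions.
events = [
    {"request_id": "req-1", "kind": "request"},
    {"request_id": "req-1", "kind": "decision"},
    {"request_id": "req-1", "kind": "render_beacon"},
    {"request_id": "req-1", "kind": "visibility_beacon"},
    {"request_id": "req-2", "kind": "request"},
    {"request_id": "req-2", "kind": "decision"},
]


def build_traces(records):
    """Group event kinds by request id, preserving arrival order."""
    traces = defaultdict(list)
    for record in records:
        traces[record["request_id"]].append(record["kind"])
    return dict(traces)


traces = build_traces(events)
for request_id, kinds in traces.items():
    print(request_id, " -> ".join(kinds))
```

A trace that stops at the decision step, like the second request here, would simply show no beacons for that request.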
Primary performance test
7-run public-v2 benchmark pack
7 measured runs · 1000 warmup requests · 2026-03-23 source pack
This is the main performance evidence on the page. It uses a local-host harness with an embedded wavebird runtime and an embedded Mock-SSP. The pack reports 1000 warmup requests, 7 measured runs, median-per-benchmark selection, sample counts, run spread, target status, and explicit future evaluation baselines.
The repo performance report identifies this 7-run packet as the strongest quotable public evidence pack. The newer April 19 packet is placed lower on the page as a validation snapshot because it has only one measured run.
Required claim boundary
Measured on a local host benchmark harness with Mock-SSP and without external GenAI wait time; this is reproducible engineering benchmark evidence, not a production SLA, internet-wide latency claim, or third-party audit.
Firewall p99 latency
Filtering path before request egress; 7/7 runs cleared target.
Mock-SSP round-trip p99
Local Mock-SSP path, not a public SSP edge; 7/7 runs cleared target.
End-to-end p99
Full local flow with external GenAI wait time excluded; 7/7 runs cleared target.
Sequential beacon p99
Sequential beacon ingress on the local benchmark harness.
Settlement max runtime
Maximum runtime in the seeded settlement benchmark.
Mock-SSP request throughput
Best local Mock-SSP request throughput; target was >=500 ops/s.
Method context
- Execution mode: local_host
- Runtime: embedded_local
- Exchange substitute: Mock-SSP (embedded_local)
- Environment: Node v22.16.0 on win32 10.0.26200, 12th Gen Intel Core i9-12900H, 31.68 GiB memory
- Warmup: 1000 requests
- Measured runs: 7
- Selection: median_per_benchmark
- Boundaries: No public internet, no real SSP edge, no external GenAI wait time, not an SLA or audit
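The page does not spell out how `median_per_benchmark` selection works. One plausible reading, shown here as a sketch and not as the harness implementation, is that each benchmark reports the run whose p99 sits at the median of the measured runs. The run values below are made up for illustration.

```python
import statistics

# Hypothetical p99 values (ms) for one benchmark across 7 measured runs.
runs = [
    {"run": 1, "p99_ms": 12.0},
    {"run": 2, "p99_ms": 9.5},
    {"run": 3, "p99_ms": 15.2},
    {"run": 4, "p99_ms": 10.1},
    {"run": 5, "p99_ms": 11.3},
    {"run": 6, "p99_ms": 9.9},
    {"run": 7, "p99_ms": 13.4},
]


def select_median_run(measured_runs, metric="p99_ms"):
    """Return the run whose metric is the (low) median across all runs."""
    median_value = statistics.median_low(r[metric] for r in measured_runs)
    return next(r for r in measured_runs if r[metric] == median_value)


selected = select_median_run(runs)
print(selected)
```

With an odd run count the median is always an actual run, so the reported row comes from one measured run and is not an average across runs.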
Run-spread notes
- Headline latency: 7/7 target-cleared runs
- End-to-end p99 CV: 9.02%
- Mock-SSP p99 CV: 23.58%
- Firewall p99 CV: 36.74%
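CV here is the coefficient of variation: the standard deviation of a metric across runs divided by its mean, expressed as a percentage. The sketch below assumes the sample standard deviation (n - 1 denominator); the page does not say which variant the report uses, and the input values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return stdev / mean as a percentage (sample standard deviation)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


# Hypothetical p99 values (ms) from 7 runs of one benchmark.
p99_per_run = [10.0, 12.0, 8.0, 11.0, 9.0, 10.0, 10.0]
cv = coefficient_of_variation(p99_per_run)
print(f"CV: {cv:.2f}%")
```

A higher CV on a very small absolute value, such as a sub-millisecond firewall p99, can reflect timer noise as much as real run-to-run drift.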
Latency results
| Benchmark | Status | Samples | p50 | p95 | p99 / max | Target |
|---|---|---|---|---|---|---|
| firewall_bench | Target cleared | 12000 | 0.04 ms | 0.12 ms | 0.22 ms | p99 <= 100 ms |
| ssp_roundtrip_bench | Target cleared | 6000 | 7.11 ms | 13.03 ms | 15.28 ms | p99 <= 50 ms |
| e2e_bench | Target cleared | 1200 | 12.98 ms | 24.35 ms | 28.76 ms | p99 <= 500 ms |
| beacon_bench_seq | Target cleared | 10000 | 0.37 ms | 0.93 ms | 1.39 ms | p99 <= 20 ms |
| beacon_bench_c10 | Target cleared | 10000 | 5.45 ms | 8.96 ms | 10.62 ms | p99 <= 20 ms |
| beacon_bench_c50 | Baseline recorded | 10000 | 23.83 ms | 36.46 ms | 41.76 ms | p99 <= 20 ms |
| beacon_bench_c100 | Baseline recorded | 10000 | 52.31 ms | 73.23 ms | 78.75 ms | p99 <= 20 ms |
| settlement_bench | Target cleared | 2 | 832.93 ms | 887.58 ms | max 887.58 ms | max <= 5000 ms |
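To make the table columns concrete, here is a minimal sketch of a nearest-rank percentile and a target check. The percentile method is an assumption, since the page does not state which one the harness uses, and the sample data is synthetic. The status labels mirror the ones in the table.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_status(p99_ms, target_ms):
    """Label a latency row the way the table does."""
    return "Target cleared" if p99_ms <= target_ms else "Baseline recorded"


# Synthetic latencies: 1 ms through 100 ms.
latencies_ms = list(range(1, 101))
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p50, p95, p99)

# Two p99 values taken from the table above.
print(latency_status(15.28, 50))  # ssp_roundtrip_bench
print(latency_status(41.76, 20))  # beacon_bench_c50
```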
Throughput results
| Benchmark | Status | Best concurrency | Best result | p95 | p99 | Target |
|---|---|---|---|---|---|---|
| ssp_requests_per_second | Target cleared | 50 | 1364.82 ops/s | 56.51 ms | 83.80 ms | >=500 ops/s |
| settlement_slots_per_minute | Target cleared | 1 | 104483.02 ops/s as reported | 717.82 ms | 717.82 ms | >=1666.67 ops/s |
| jobs_per_second | Baseline recorded | 100 | 798.99 ops/s | 175.05 ms | 220.32 ms | >=1000 ops/s |
| beacons_per_second | Baseline recorded | 1 | 2027.93 ops/s | 1.12 ms | 1.64 ms | >=5000 ops/s |
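The settlement row is named per minute while its target is listed in ops/s. As a reading aid, and assuming the target column is per second, 1666.67 ops/s corresponds to roughly 100,000 slots per minute. The sketch below shows that conversion and the throughput status check; the function names are illustrative.

```python
def per_second_to_per_minute(ops_per_second):
    """Convert a per-second rate to a per-minute rate."""
    return ops_per_second * 60


def throughput_status(best_ops_s, target_ops_s):
    """Label a throughput row the way the table does (higher is better)."""
    return "Target cleared" if best_ops_s >= target_ops_s else "Baseline recorded"


target_per_minute = per_second_to_per_minute(1666.67)
print(round(target_per_minute))

# Values taken from the table above.
print(throughput_status(1364.82, 500))   # ssp_requests_per_second
print(throughput_status(798.99, 1000))   # jobs_per_second
```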
Future evaluation from this pack
- Beacon c50 and c100 latency are measured at 41.76 ms and 78.75 ms p99; future runs should evaluate these against the internal <=20 ms target.
- Beacon throughput is measured at 2027.93 ops/s and jobs throughput at 798.99 ops/s; future runs should evaluate movement against the >=5000 and >=1000 ops/s targets.
- Mock-SSP results remain local exchange-substitute measurements, so future partner validation should separately evaluate real SSP edge and public-internet paths.
- The benchmark excludes external GenAI wait time; future end-to-end validation should keep that boundary explicit or measure it separately.
Latest validation snapshot
Speed
Full local flow from job creation through decision, billable beacons, and settlement export in the April 19 one-run validation snapshot.
Throughput
Best measured Mock-SSP request throughput from the latest validation snapshot. This is the internal exchange substitute path, not a public-internet SSP edge.
Tracked baseline
The April 19 snapshot records beacon throughput at 3908.69 ops/s. A future rerun will evaluate movement against the internal >=5000 ops/s target.
Terms used on this page
- Slot: One sponsoring opportunity in the runtime (one decision attempt).
- Proof: A signed evidence record produced for filled slots and used for audit and settlement.
- Beacon: A post-render signal from the wrapper/app confirming a creative was rendered.
- Mock-SSP: A simulated ad exchange response used to measure the internal ad path without public network noise.
- Fault injection: Deliberate, randomized failures introduced during the campaign (latency jitter, slow responses, HTTP errors, malformed responses, drops, no-bid spikes).
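The page does not describe how proof records are signed. Purely as an illustration of what a signed evidence record can look like, the sketch below uses HMAC-SHA256 over a canonical JSON payload. The scheme, the record fields, and the key are all assumptions and should not be read as the wavebird proof format.

```python
import hashlib
import hmac
import json


def sign_record(record, key):
    """Sign a record with HMAC-SHA256 over its canonical JSON form."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record, signature, key):
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_record(record, key), signature)


# Hypothetical proof record and demo key.
key = b"demo-key-not-a-real-secret"
proof = {"request_id": "req-1", "slot": "slot-a", "rendered": True}
signature = sign_record(proof, key)

tampered = dict(proof, rendered=False)
print(verify_record(proof, signature, key))
print(verify_record(tampered, signature, key))
```

Sorting keys before signing keeps the signature stable regardless of field order, which matters when a record is re-serialized during an audit.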
How we measured
The main performance test on this page is the 7-run public-v2 packet from March 23, 2026. That pack is more useful for external diligence because it includes warmup, repeated runs, sample counts, run spread, target status, and future evaluation items.
The latest validation snapshot was generated on April 19, 2026 from the benchmark tooling inside this repo. It is kept lower on the page as a newer one-run validation snapshot, not as the main diligence pack.
Both source tiers are intentionally precise about what the numbers mean. They show reproducible engineering benchmarks for the runtime itself. They do not claim SLA behavior, live-model latency, or partner-edge performance.
Mock-SSP
Mock-SSP is the controlled exchange substitute used by the benchmark harness so the page can describe the internal ad path without public-network noise.
Latency path clears target
In the April 19 snapshot, firewall p99 is 0.24 ms, Mock-SSP round-trip p99 is 7.66 ms, and end-to-end p99 is 19.51 ms. Every headline latency metric in that packet clears its internal target.
Throughput improved materially
The April 19 packet clears the >=1000 ops/s jobs-per-second target at 1028.75 ops/s and records Mock-SSP request throughput at 1486.64 ops/s.
Throughput baseline is documented
Beacon throughput is recorded at 3908.69 ops/s against the internal >=5000 ops/s target. The page keeps that measured baseline visible for future reruns.
Why the 7-run pack is the main test
The April 19 snapshot is newer, but the curated performance report identifies the March 23 public-v2 packet as the strongest quotable pack because it uses 7 measured runs, 1000 warmup requests, and published run-to-run spread.
Latest validation packet details
This disclosure mirrors the April 19 validation packet: packet scope, website-safe metrics, throughput results, claim boundaries, and the one measured throughput baseline for future evaluation.
Packet scope
- Generated 2026-04-19T09:40:45.434Z from the current repo benchmark harness.
- Execution mode: local_host.
- Runtime source: embedded_local.
- Mock-SSP source: embedded_local.
- External GenAI wait time is excluded from the end-to-end benchmark.
This is benchmark evidence for the runtime and measurement harness, not a production or internet-wide claim.
Website-safe headline metrics
- Firewall p99 latency: 0.24 ms.
- Mock-SSP round-trip p99 latency: 7.66 ms.
- End-to-end p99 latency: 19.51 ms.
- Sequential beacon p99 latency: 0.72 ms.
- Settlement max runtime: 0.28 ms.
Throughput results
- Jobs/sec best: 1028.75 ops/s (target cleared).
- Mock-SSP requests/sec best: 1486.64 ops/s (target cleared).
- Settlement slots/minute best: 47619.05 (target cleared).
- Beacons/sec best: 3908.69 ops/s (tracked baseline).
What this proves
- The repo contains reproducible benchmark tooling, raw artifacts, and code to rerun the same benchmark family.
- The measured local host environment reached the reported latency and throughput values on the stated hardware and software stack.
- Claim language can stay tied to published artifacts instead of hand-maintained estimates.
Future evaluation baseline
- beacons_per_second: 3908.69 ops/s measured against internal target >=5000 ops/s.
- This creates a clear throughput baseline for the next benchmark packet.
Artifacts in the repo
- Raw benchmark JSON and markdown are published under services/csl/perf/results/public-evidence-latest*.
- The generator, raw source artifacts, and derived CSV exports are already included in the repo.
- The page no longer relies on older April 7 / April 9 hard-coded headline values.
Beacon concurrency snapshot
The April 19 packet includes three beacon concurrency slices. All three published p99 results clear the <=20 ms target.
“c100” means 100 concurrent connections.
| Concurrent connections | Response time (p99) | Throughput | Errors |
|---|---|---|---|
| 10 | 2.51 ms | 3713.06 ops/s | 0 |
| 50 | 2.52 ms | 3741.53 ops/s | 0 |
| 100 | 2.11 ms | 4424.19 ops/s | 0 |
Across c10, c50, and c100, beacon ingress p99 latency stayed between 2.11 ms and 2.52 ms and cleared the target in every published concurrency slice.
How to read this table
These rows are benchmark cases, not live traffic. The current packet reports target clearance across c10, c50, and c100; the “Errors” column remains 0 because this benchmark slice is latency-focused rather than a fault-injection matrix.
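For readers unfamiliar with concurrency slices, the sketch below shows the general shape of such a measurement: a fixed number of workers share a request budget, and each request is timed individually. It uses a sleeping stand-in instead of a real beacon endpoint and is not the repo's harness. All names are hypothetical.

```python
import asyncio
import time


async def fake_beacon_handler():
    """Stand-in for a beacon request; a real harness would call the endpoint."""
    await asyncio.sleep(0.001)


async def run_slice(concurrency, total_requests):
    """Run total_requests across `concurrency` workers; return latencies and ops/s."""
    latencies_ms = []
    remaining = total_requests  # shared budget; asyncio runs workers on one thread

    async def worker():
        nonlocal remaining
        while remaining > 0:
            remaining -= 1
            start = time.perf_counter()
            await fake_beacon_handler()
            latencies_ms.append((time.perf_counter() - start) * 1000)

    started = time.perf_counter()
    await asyncio.gather(*(worker() for _ in range(concurrency)))
    elapsed = time.perf_counter() - started
    return latencies_ms, total_requests / elapsed


latencies, ops_per_second = asyncio.run(run_slice(concurrency=10, total_requests=200))
print(len(latencies), "requests measured")
```

The absolute numbers from this stand-in mean nothing; the point is that latency is recorded per request while throughput is derived from the wall-clock time of the whole slice.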
Run stability and packet strength
The current public packet works better as a reproducible snapshot than as a variation study because it contains only one measured run after warmup. The page keeps that limitation explicit instead of implying more statistical certainty than the artifact supports.
Current reading
The April 19 artifact is useful as the latest validation snapshot: end-to-end latency clears target, jobs-per-second clears target, and beacon throughput is documented as the next measured baseline for reruns. The March 23 detailed pack remains the stronger diligence packet because it repeats the measurements across 7 runs.
Detailed methodology
The evidence page uses two explicit source tiers: the primary March 23 7-run detailed packet and the latest April 19 validation snapshot. The aim is to keep public claim language aligned with actual artifacts.
- Evidence date: 2026-04-19
- Execution mode: local_host (local host harness)
- Runtime source: embedded_local
- Exchange substitute: Mock-SSP (embedded_local)
- Runs: 1 measured run after 5 warmup requests
- Selection method: median_per_benchmark
- Environment: Node v22.16.0 on win32 10.0.26200, 12th Gen Intel Core i9-12900H, 31.68 GiB memory
- Exclusions: No external GenAI wait time and no public-internet SSP path in these measurements
The March 23 public-v2 pack is the primary detailed benchmark evidence. The April 19 packet remains visible as a lower-priority validation snapshot.
Full benchmark metrics
Headline benchmarks
2026-04-19
Firewall p99 latency
Filtering path before any request leaves the runtime.
Mock-SSP round-trip p99 latency
Controlled exchange-substitute round trip inside the local harness.
End-to-end p99 latency
Full local flow with external GenAI wait time explicitly excluded.
Sequential beacon p99 latency
Single-beacon ingress on pre-warmed visible assets.
Settlement max runtime
Maximum runtime for the seeded settlement benchmark in the current packet.
Latest validation snapshot
2026-04-19
Headline metrics cleared
Seven of eight published headline metrics clear their internal targets; beacon throughput is kept as a measured baseline.
Jobs/sec best
Current jobs-per-second benchmark, now above the >=1000 target.
Beacons/sec best
Measured beacon throughput baseline against the >=5000 target.
SSP requests/sec best
Best Mock-SSP throughput from the published packet.
Measured runs
One measured run after warmup in the current public artifact.
What this does not claim
The page is explicit about what this evidence does and does not prove:
- These are internal benchmark measurements, not a third-party audit.
- The numbers are not a production SLA.
- They are not public-internet latency claims.
- They do not prove absolute latency against external GenAI providers or external SSPs.
- They do not certify live-market settlement behavior outside the measured harness.
What we will evaluate next
The page keeps measured baselines visible so future benchmark packets can show movement without overstating the current evidence.
- April 19 validation snapshot: beacon throughput is documented at 3908.69 ops/s against the internal >=5000 ops/s target.
- 7-run detailed pack: beacon c50/c100 latency, beacon throughput, and jobs/sec are documented as future evaluation baselines.
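One way to express movement between packets is percent of target and percent change, sketched below with the beacon throughput values quoted on this page. Because the March 23 figure comes from a 7-run pack and the April 19 figure from a single run, the result is indicative only. The helper names are illustrative.

```python
def percent_of_target(measured, target):
    """Measured value as a percentage of its target."""
    return measured / target * 100


def percent_change(old, new):
    """Relative change from old to new, as a percentage."""
    return (new - old) / old * 100


# Beacon throughput values quoted on this page (ops/s).
march_pack = 2027.93
april_snapshot = 3908.69
target = 5000.0

attainment = percent_of_target(april_snapshot, target)
movement = percent_change(march_pack, april_snapshot)
print(f"{attainment:.1f}% of target, {movement:+.1f}% vs the 7-run pack")
```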
Artifacts
The current repo already contains the public evidence summary, companion markdown, raw benchmark JSON, derived CSV exports, and the generator used to produce them.
Next step
See how the measured path maps into integration
If the evidence packet is what you needed, the next step is to inspect the integration surface and the proof model together.