Performance

Latency profiles, accuracy benchmarks, and methodology.

Latency overview

ChessGrammar is designed for low-latency analysis. Latency depends on the depth mode, the number of patterns present, and whether move sequences are requested.

Mode	Median	p99	Description
L1 (structural scan)	~3ms	< 15ms	Fast geometric pattern candidate detection
L2 (forcing tree)	~42ms	< 500ms	Full confirmation with alpha-beta pruning
L2 + with_sequence	~205ms/tactic	varies	Includes forcing move sequences

All measurements are per position, taken on the production deployment (engine-side time, excluding network overhead).
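As an illustration of how the Median and p99 columns could be derived from raw per-position timings, here is a minimal sketch. The sample values are invented for the example, and the inclusive quantile method is an assumption; the actual benchmark script may aggregate differently.

```python
import statistics


def latency_summary(samples_ms):
    """Return (median, p99) for per-position latencies given in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    median = statistics.median(samples_ms)
    # quantiles(n=100) returns the 99 cut points p1..p99; the last one is p99.
    p99 = statistics.quantiles(samples_ms, n=100, method="inclusive")[-1]
    return median, p99


# Hypothetical timings, not real ChessGrammar measurements.
samples = [2.8, 3.1, 2.9, 3.4, 3.0, 12.5, 3.2, 2.7, 3.3, 3.0]
median_ms, p99_ms = latency_summary(samples)
print(f"median={median_ms:.2f}ms p99={p99_ms:.2f}ms")
```

A single slow outlier barely moves the median but dominates p99, which is why the two columns can differ by an order of magnitude.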

Latency by pattern

Different patterns have different computational costs. Simple structural patterns (smothered mate, skewer) are fastest; patterns requiring deeper forcing tree search (double check, interference) take longer.

Pattern	L1 p50	L2 p50	Notes
Fork	2ms	38ms	Geometric detection with iterative SEE
Pin	4ms	38ms
Skewer	1ms	1ms	Fastest — structural only
Discovered Attack	5ms	114ms	Blocker-ray geometric pre-filter
Double Check	6ms	457ms	Geometric blocker mask, deep forcing tree
Back Rank Mate	2ms	20ms
Smothered Mate	3ms	7ms	Low L1 and L2 — strict structural condition
Deflection	2ms	132ms	Requires defender analysis
Interference	3ms	246ms	Requires line analysis
Trapped Piece	10ms	111ms	Requires full mobility check

Detection accuracy

Accuracy is measured against a curated dataset of annotated positions from international tournament play and established puzzle databases.

Metric	Value
Overall accuracy	97.3%
Dataset size	25,000 annotated positions
False positive rate (L2)	< 2%
False negative rate (L2)	< 4%
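The sketch below shows one way such metrics can be computed from annotated positions. The definitions are assumptions, since this page does not spell them out: accuracy is the share of correct verdicts, the false positive rate is FP / (FP + TN), and the false negative rate is FN / (FN + TP). The input data is made up.

```python
def detection_metrics(expected, detected):
    """Compare annotated labels with engine verdicts (parallel lists of booleans)."""
    if not expected or len(expected) != len(detected):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(expected, detected))
    tp = sum(1 for e, d in pairs if e and d)          # tactic present and detected
    tn = sum(1 for e, d in pairs if not e and not d)  # no tactic, none reported
    fp = sum(1 for e, d in pairs if not e and d)      # reported but not annotated
    fn = sum(1 for e, d in pairs if e and not d)      # annotated but missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical annotations and detections for eight positions.
expected = [True, True, True, False, False, False, False, True]
detected = [True, True, False, False, False, True, False, True]
metrics = detection_metrics(expected, detected)
```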

Accuracy by pattern

Pattern	Accuracy	Notes
Fork	98.1%
Pin	97.8%
Skewer	96.5%
Discovered Attack	97.2%
Double Check	99.1%	Second highest; binary condition
Back Rank Mate	98.4%
Smothered Mate	99.3%	Highest — strict structural condition
Deflection	95.2%	Most subjective pattern
Interference	94.8%
Trapped Piece	96.1%

Note on L1 accuracy: L1 has a higher false positive rate (~15-20%) since it detects structural candidates without confirmation. L2 is recommended for applications requiring precision.

Game analysis performance

Full game analysis (PGN) processes each position sequentially. Performance depends on the tactical density of the game.

Game length	L1 estimate	L2 estimate
20 moves (40 plies)	~0.2s	~1.5s
40 moves (80 plies)	~0.5s	~5s
60 moves (120 plies)	~0.8s	~8s

For a two-pass strategy, use depth: "l1" to scan the whole game quickly, then run L2 only on the specific positions of interest.
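The two-pass strategy can be sketched as follows. The analyze callable and its (fen, depth) signature are hypothetical stand-ins for whatever client you use, not the ChessGrammar API, and the stub results are invented so the example runs on its own.

```python
from typing import Callable, Dict, List

# Hypothetical analyzer signature (an assumption, not the ChessGrammar API):
# takes a FEN string and a depth ("l1" or "l2"), returns a list of tactics.
Analyzer = Callable[[str, str], List[dict]]


def two_pass_scan(fens: List[str], analyze: Analyzer) -> Dict[int, List[dict]]:
    """Scan every position at L1, then rerun L2 only where L1 found candidates."""
    confirmed: Dict[int, List[dict]] = {}
    for ply, fen in enumerate(fens):
        if not analyze(fen, "l1"):    # cheap structural pass found nothing
            continue
        tactics = analyze(fen, "l2")  # confirmation pass on flagged positions only
        if tactics:
            confirmed[ply] = tactics
    return confirmed


# Stub analyzer with made-up results, standing in for a real client.
calls = []


def fake_analyze(fen: str, depth: str) -> List[dict]:
    calls.append((fen, depth))
    l1_hits = {"pos_b", "pos_c"}  # positions with structural candidates
    l2_hits = {"pos_c"}           # positions where the candidate is confirmed
    hits = l1_hits if depth == "l1" else l2_hits
    return [{"pattern": "fork"}] if fen in hits else []


result = two_pass_scan(["pos_a", "pos_b", "pos_c"], fake_analyze)
```

In this stub run, L2 is invoked for two of the three positions, and only the confirmed one appears in the result. The saving grows with the share of positions that L1 clears outright.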

Engine v2.0 improvements

The v2.0 engine introduces significant performance and detection improvements:

Geometric detection for fork (5x faster), double check (160x faster), and discovered attack (3x faster)
Alpha-beta pruning in the L2 forcing tree for fast candidate rejection
Quiescence search prevents horizon-effect evaluation errors
More tactics detected: structural detection without premature gain filtering catches patterns that v1 missed
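For readers unfamiliar with alpha-beta pruning, here is a generic textbook sketch on a toy tree. It is not the engine's implementation: the tree shape, leaf scores, and the visited list are illustrative assumptions. It only shows how a branch is abandoned once it cannot change the result, which is the idea behind fast candidate rejection.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax with alpha-beta pruning over nested lists; leaves are numbers."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)  # record which leaves were actually evaluated
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if alpha >= beta:     # opponent already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if alpha >= beta:         # this line is already refuted, stop searching it
            break
    return best


# Toy tree: three candidate moves, each with two replies (scores are made up).
tree = [[3, 5], [2, 9], [0, 1]]
visited = []
value = alphabeta(tree, visited=visited)
```

On this tree the search returns 3 after evaluating four of the six leaves; the second reply in each of the last two branches is never examined.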

Methodology

Dataset: Positions sourced from FIDE-rated tournament games (2000+ Elo) and curated puzzle databases
Annotation: Each position manually verified by titled players (FM+)
Measurement: Latency measured on production Vercel deployment (cold start excluded, engine-side time)
Reproducibility: All benchmarks can be reproduced using benchmarks/capture_baseline.py