Performance

Latency profiles, accuracy benchmarks, and methodology.

Latency overview

ChessGrammar is designed for low-latency analysis. Latency depends on the depth mode, the number of patterns present, and whether move sequences are requested.

| Mode | Median | p99 | Description |
| --- | --- | --- | --- |
| L1 (structural scan) | ~3 ms | < 15 ms | Fast geometric pattern candidate detection |
| L2 (forcing tree) | ~42 ms | < 500 ms | Full confirmation with alpha-beta pruning |
| L2 + with_sequence | ~205 ms/tactic | varies | Includes forcing move sequences |

All measurements are per position, taken on the production deployment (engine-side time, excluding network overhead).
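
For readers who want to compare their own timings against the median and p99 columns, the following is a minimal sketch of how such summaries can be computed from raw per-position latencies. It assumes the nearest-rank definition of a percentile, which may not be the definition used for the published figures, and the sample values are synthetic. The function names are illustrative and are not part of ChessGrammar.

```python
import math
import statistics


def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile (0 < pct <= 100) by the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based rank covering pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarize_latencies(samples_ms):
    """Summarize per-position latencies as a median and a p99."""
    return {
        "median_ms": statistics.median(samples_ms),
        "p99_ms": nearest_rank_percentile(samples_ms, 99),
    }


# Made-up timings for illustration only: 98 fast positions and two slow ones.
synthetic_samples_ms = [3.0] * 98 + [12.0, 40.0]
summary = summarize_latencies(synthetic_samples_ms)
print(summary)
```

A handful of slow positions barely moves the median but dominates the p99, which is why the two columns in the table can differ by an order of magnitude.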

Latency by pattern

Different patterns have different computational costs. Simple structural patterns (smothered mate, skewer) are fastest; patterns requiring deeper forcing tree search (double check, interference) take longer.

| Pattern | L1 p50 | L2 p50 | Notes |
| --- | --- | --- | --- |
| Fork | 2 ms | 38 ms | Geometric detection with iterative SEE |
| Pin | 4 ms | 38 ms | |
| Skewer | 1 ms | 1 ms | Fastest; structural only |
| Discovered Attack | 5 ms | 114 ms | Blocker-ray geometric pre-filter |
| Double Check | 6 ms | 457 ms | Geometric blocker mask, deep forcing tree |
| Back Rank Mate | 2 ms | 20 ms | |
| Smothered Mate | 3 ms | 7 ms | Low L1 and L2; strict structural condition |
| Deflection | 2 ms | 132 ms | Requires defender analysis |
| Interference | 3 ms | 246 ms | Requires line analysis |
| Trapped Piece | 10 ms | 111 ms | Requires full mobility check |

Detection accuracy

Accuracy is measured against a curated dataset of annotated positions from international tournament play and established puzzle databases.

| Metric | Value |
| --- | --- |
| Overall accuracy | 97.3% |
| Dataset size | 25,000 annotated positions |
| False positive rate (L2) | < 2% |
| False negative rate (L2) | < 4% |

Accuracy by pattern

| Pattern | Accuracy | Notes |
| --- | --- | --- |
| Fork | 98.1% | |
| Pin | 97.8% | |
| Skewer | 96.5% | |
| Discovered Attack | 97.2% | |
| Double Check | 99.1% | Second highest; binary condition |
| Back Rank Mate | 98.4% | |
| Smothered Mate | 99.3% | Highest; strict structural condition |
| Deflection | 95.2% | Most subjective pattern |
| Interference | 94.8% | |
| Trapped Piece | 96.1% | |

Note on L1 accuracy: L1 has a higher false positive rate (~15-20%) since it detects structural candidates without confirmation. L2 is recommended for applications requiring precision.
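
As a quick consistency check on the per-pattern table, the sketch below averages the ten published figures and finds the best and worst patterns. The unweighted mean comes out at about 97.25%, close to the reported overall accuracy of 97.3%. This is only an observation: the source does not say how the overall figure is computed, and it could equally be weighted by the number of positions per pattern.

```python
from statistics import mean

# Per-pattern accuracy (percent), copied from the table above.
PER_PATTERN_ACCURACY = {
    "Fork": 98.1,
    "Pin": 97.8,
    "Skewer": 96.5,
    "Discovered Attack": 97.2,
    "Double Check": 99.1,
    "Back Rank Mate": 98.4,
    "Smothered Mate": 99.3,
    "Deflection": 95.2,
    "Interference": 94.8,
    "Trapped Piece": 96.1,
}

# Unweighted mean: every pattern counts equally, regardless of sample size.
unweighted_mean = mean(PER_PATTERN_ACCURACY.values())
best = max(PER_PATTERN_ACCURACY, key=PER_PATTERN_ACCURACY.get)
worst = min(PER_PATTERN_ACCURACY, key=PER_PATTERN_ACCURACY.get)

print(f"unweighted mean: {unweighted_mean:.2f}%")
print(f"best: {best}, worst: {worst}")
```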

Game analysis performance

Full game analysis (PGN) processes each position sequentially, so total time depends on the length of the game and on its tactical density.

| Game length | L1 estimate | L2 estimate |
| --- | --- | --- |
| 20 moves (40 plies) | ~0.2 s | ~1.5 s |
| 40 moves (80 plies) | ~0.5 s | ~5 s |
| 60 moves (120 plies) | ~0.8 s | ~8 s |

For fast game scanning, use a two-pass strategy: scan every position with depth: "l1", then run L2 only on the specific positions of interest.
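
The two-pass strategy can be sketched as follows. This is an illustration under stated assumptions, not the ChessGrammar client: the `analyze` callable, its `(position, depth)` signature, the `"l2"` depth value (inferred by analogy with `"l1"`), and the shape of the returned tactics are all hypothetical. The stub analyzer exists only so the example runs without network access.

```python
from typing import Callable

# Hypothetical analyzer signature: (position, depth) -> list of detected tactics.
Analyzer = Callable[[str, str], list]


def two_pass_scan(positions, analyze: Analyzer):
    """Run L1 on every position, then L2 only where L1 found candidates.

    Returns a dict mapping position index to the tactics confirmed by L2.
    """
    confirmed = {}
    for index, position in enumerate(positions):
        # Pass 1: cheap structural scan.
        if not analyze(position, "l1"):
            continue
        # Pass 2: confirmation, run only on positions that L1 flagged.
        tactics = analyze(position, "l2")
        if tactics:
            confirmed[index] = tactics
    return confirmed


# Stub standing in for a real API call; all data below is made up.
calls = []


def fake_analyze(position, depth):
    calls.append((position, depth))
    l1_candidates = {"pos-a": [{"pattern": "pin"}], "pos-c": [{"pattern": "fork"}]}
    l2_confirmed = {"pos-c": [{"pattern": "fork"}]}  # pos-a is an L1 false positive
    table = l1_candidates if depth == "l1" else l2_confirmed
    return table.get(position, [])


result = two_pass_scan(["pos-a", "pos-b", "pos-c"], fake_analyze)
l2_calls = [call for call in calls if call[1] == "l2"]
print(result)
print(f"L2 calls: {len(l2_calls)} of {len(calls)} total")
```

In this toy run L2 is invoked for two of the three positions, and one L1 candidate is rejected. That rejection is the trade-off described in the note on L1 accuracy: the first pass is cheap but over-reports, so confirmed results should come from the second pass.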

Engine v2.0 improvements

The v2.0 engine introduces significant performance and detection improvements:

  • Geometric detection for fork (5x faster), double check (160x faster), and discovered attack (3x faster)
  • Alpha-beta pruning in the L2 forcing tree for fast candidate rejection
  • Quiescence search prevents horizon-effect evaluation errors
  • More tactics detected: structural detection without premature gain filtering catches patterns that v1 missed

Methodology

  • Dataset: Positions sourced from FIDE-rated tournament games (players rated 2000+ Elo) and curated puzzle databases
  • Annotation: Each position manually verified by titled players (FM+)
  • Measurement: Latency measured on production Vercel deployment (cold start excluded, engine-side time)
  • Reproducibility: All benchmarks can be reproduced using benchmarks/capture_baseline.py

Performance characteristics may vary during the Developer Preview as the engine evolves.