OCC ADAPTIVE POLICY V1
Purpose:
Define when GLYPH should use scalar Occ and when it should use AVX2 Occ.
Measured basis:
OCC_STEP_BENCH_V1
Machine:
AMD EPYC 4344P
Observed breakpoint:
scan_len around 64 bytes.
Measured interpretation:
scan_len <= 32
    scalar and AVX2 are effectively tied
scan_len 64
    AVX2 begins reducing tail latency
scan_len >= 128
    AVX2 becomes clearly useful
Policy V1:
if scan_len < 64
    use scalar Occ
else
    use AVX2 Occ when available
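The rule above can be sketched as a small selection function. This is a minimal illustration, not GLYPH's actual code: the names OCC_AVX2_MIN_SCAN_LEN, occ_path_t and occ_choose_path are hypothetical, and avx2_available is assumed to come from a CPU feature check done elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>

/* Policy V1 threshold in bytes (from the observed breakpoint). */
#define OCC_AVX2_MIN_SCAN_LEN 64u

/* Hypothetical names for the two Occ execution paths. */
typedef enum { OCC_PATH_SCALAR, OCC_PATH_AVX2 } occ_path_t;

/* Scalar below the threshold or when AVX2 is missing, AVX2 otherwise. */
static occ_path_t occ_choose_path(size_t scan_len, bool avx2_available) {
    if (scan_len < OCC_AVX2_MIN_SCAN_LEN || !avx2_available) {
        return OCC_PATH_SCALAR;
    }
    return OCC_PATH_AVX2;
}
```

With avg_scan_bytes around 28, a call such as occ_choose_path(28, true) takes the scalar branch, while occ_choose_path(64, true) selects AVX2.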
Rationale:
Current GLYPH latency layout keeps avg_scan_bytes around 28.
Scalar is therefore at least as fast as AVX2 on the common short-scan path.
AVX2 is valuable for longer scan paths created by larger checkpoint_step values.
The 64-byte threshold avoids forcing SIMD overhead onto tiny scans.
Architecture implication:
SIMD is not a global replacement for scalar Occ.
SIMD is a conditional execution path.
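A sketch of what "conditional execution path" could look like for a byte-layout scan. Everything here is an assumption for illustration: the function names are hypothetical, the scan is modeled as counting one symbol in a byte buffer, and the AVX2 kernel relies on GCC/Clang extensions (the target attribute, __builtin_cpu_supports, __builtin_popcount) on x86-64. On other compilers or targets only the scalar path is built.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar path: count bytes equal to c in buf[0..len). */
static size_t occ_scan_scalar(const uint8_t *buf, size_t len, uint8_t c) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        n += (buf[i] == c);
    }
    return n;
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define OCC_SKETCH_HAVE_AVX2 1

/* AVX2 path: compare 32 bytes at a time, popcount the match mask. */
__attribute__((target("avx2")))
static size_t occ_scan_avx2(const uint8_t *buf, size_t len, uint8_t c) {
    const __m256i needle = _mm256_set1_epi8((char)c);
    size_t n = 0;
    size_t i = 0;
    for (; i + 32 <= len; i += 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(buf + i));
        unsigned mask = (unsigned)_mm256_movemask_epi8(_mm256_cmpeq_epi8(v, needle));
        n += (size_t)__builtin_popcount(mask);
    }
    for (; i < len; i++) {  /* scalar tail for the last len % 32 bytes */
        n += (buf[i] == c);
    }
    return n;
}
#endif

/* Dispatcher: SIMD only for scans of 64 bytes or more, and only if present. */
static size_t occ_scan(const uint8_t *buf, size_t len, uint8_t c) {
#ifdef OCC_SKETCH_HAVE_AVX2
    if (len >= 64 && __builtin_cpu_supports("avx2")) {
        return occ_scan_avx2(buf, len, c);
    }
#endif
    return occ_scan_scalar(buf, len, c);
}
```

The scalar function stays the reference implementation and the AVX2 kernel is an optional alternative behind the same interface, so short scans never pay for vector setup.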
Future layout profiles:
LATENCY
    checkpoint_step around 32
    expected scan below the 64-byte breakpoint
    default Occ path:
        scalar
BALANCED
    checkpoint_step around 128
    expected scan around the 64-byte breakpoint
    default Occ path:
        adaptive scalar/AVX2
COMPACT
    checkpoint_step 256+
    expected scan above the 64-byte breakpoint
    default Occ path:
        AVX2 preferred
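One possible way to encode these profiles is a small lookup table. This is only a sketch: the type and constant names are hypothetical, and the checkpoint_step numbers are nominal values standing in for "around 32", "around 128" and "256+".

```c
/* Hypothetical profile identifiers for the three planned layouts. */
typedef enum {
    GLYPH_PROFILE_LATENCY,
    GLYPH_PROFILE_BALANCED,
    GLYPH_PROFILE_COMPACT
} glyph_profile_t;

/* Default Occ path per profile. */
typedef enum {
    OCC_DEFAULT_SCALAR,          /* always scalar */
    OCC_DEFAULT_ADAPTIVE,        /* choose per scan by scan_len */
    OCC_DEFAULT_AVX2_PREFERRED   /* AVX2 when available */
} occ_default_t;

typedef struct {
    const char   *name;
    unsigned      nominal_checkpoint_step; /* illustrative, not a fixed spec */
    occ_default_t occ_default;
} glyph_profile_desc_t;

/* Table indexed by glyph_profile_t. */
static const glyph_profile_desc_t GLYPH_PROFILES[] = {
    [GLYPH_PROFILE_LATENCY]  = { "LATENCY",  32,  OCC_DEFAULT_SCALAR },
    [GLYPH_PROFILE_BALANCED] = { "BALANCED", 128, OCC_DEFAULT_ADAPTIVE },
    [GLYPH_PROFILE_COMPACT]  = { "COMPACT",  256, OCC_DEFAULT_AVX2_PREFERRED },
};
```

Keeping the default path in data rather than in branching code would let a future profile be added without touching the Occ kernels.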
Future branch:
bit-plane Occ layout
It is separate from the byte-comparison AVX2 path that this policy covers.
Bit-plane layout may become GLYPH_LAYOUT_V2.
Current policy applies only to byte-layout BWT.