Segmented Real 1GB Retrieval — GLYPH v0.2
Branch:
- feature/segmented-v0.2
Setup:
- HDFS 1GB corpus
- split into 2 independent 512MB shards
- independent SA/BWT/FM built per shard
Manifest:
- manifests/segmented_1gb_real.json
Query:
- blk_-100000266894974466
Result:
- shard0 count: 18
- shard1 count: 0
- merged total_count: 18
Latency:
- query_time_sec: ~0.000319
Interpretation:
This is the first real segmented retrieval validation.
Each shard has an independent corpus and independent FM index.
The segmented query layer:
- dispatches queries across shards
- merges deterministic counts
- preserves exact retrieval semantics