BrewPulse Coffee
Synthetic operational corpus for Retrieval-Augmented Generation demos. Eight interconnected documents simulating a real enterprise knowledge base — where no single file answers any meaningful question alone.
AI-generated, manually curated. Designed with deliberate retrieval failure modes: indirect references, multi-hop causal chains, and ambiguous attribution.
Corpus Structure
01_north_regional_ops_report.md
North England Regional Operations Report — March 2024
Regional Ops Report: Sarah Mitchell's overview of Leeds Central and Manchester Piccadilly. The entry-point document that surfaces the core problem cluster without fully explaining any single issue.
02_incident_leeds_espresso_failure.md
Incident Report — Espresso Machine Failure, Leeds Central
Incident Report: Formal incident log INC-2024-0312 by duty manager Daniel Park. Precise timeline of the Bar 2 valve failure. Key technical anchor for cross-document reasoning.
03_supplier_northbrew_oat_milk.md
Supplier Update — NorthBrew Supplies Oat Milk Disruption
Supplier Update: Dominic Ferrara's account of the Wakefield depot logistics failure. Documents a previous Sep 2023 incident, which is critical for establishing recurrence.
04_customer_feedback_north.md
Customer Feedback Summary — Leeds & Manchester
Customer Feedback: Amara Osei's March compilation. 47 submissions, 62% negative. Connects "watery espresso" to equipment faults without naming the root cause.
05_staffing_issues_north.md
Staffing Issues Report — North Region, Q1 2024
Staffing Issue: Sarah Mitchell and HR partner Gemma Holroyd's Q1 assessment. Tom Okafor resigned, Priya Nair transferred. Identifies pay gap and shift patterns as structural drivers.
06_maintenance_report_north.md
Maintenance Report — North England Branches, March 2024
Maintenance Report: Marcus Webb's technical log. Confirms the same valve failure class at both Leeds and Manchester, a pattern invisible in any other single document.
07_regional_performance_q1_north.md
Regional Performance Summary — North England Q1 2024
Performance Summary: KPI review. Leeds CSS dropped 3.8→3.1 (steepest recorded decline). Manchester mobile adoption reached 23%, but with an 18% complaint rate in weeks 1–2. Sheffield serves as the control case.
08_logistics_mobile_ordering_disruption.md
Logistics & Technology Disruption — Mobile Ordering Rollout
Logistics Report: Lena Frost's post-implementation analysis. Documents the Orda POS modifier sync bug and its 2-week fix lag. Bridge document linking the supplier thread to the technology thread.
The Central Causal Chain
This four-hop chain spans four files, the kind of reasoning that requires multi-document retrieval to reconstruct. No single document tells the full story. It is Query 20 in the demo and serves as the showcase example.
Wakefield depot failure → Google Review complaint
- Wakefield depot systems migration fails (file 03)
- 40% oat milk shortfall at Leeds Central (files 01, 03)
- Mobile oat flat white order unfulfillable (file 08)
- Formal complaint + Google Review posted (file 04)
Why this matters: Graph traversal reconstructs the full chain by following entity links (NorthBrew → Wakefield → oat milk shortage → mobile order → complaint). Semantic search retrieves isolated fragments — it finds the complaint and the depot failure, but cannot reliably bridge all four hops.
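As a rough illustration of the traversal idea, the sketch below runs a breadth-first search over the four chain steps and collects the files that back each step. The node names and file numbers come from the chain above; the dictionary layout and the `find_chain` and `files_for` helpers are assumptions made for illustration, not the demo's actual graph schema.

```python
from collections import deque

# Chain steps and their source files, as listed in the chain above.
NODE_FILES = {
    "Wakefield depot migration failure": {"03"},
    "Oat milk shortfall at Leeds Central": {"01", "03"},
    "Mobile oat flat white order unfulfillable": {"08"},
    "Formal complaint and Google Review": {"04"},
}

# Directed causal links between steps (hypothetical edge representation).
EDGES = {
    "Wakefield depot migration failure": ["Oat milk shortfall at Leeds Central"],
    "Oat milk shortfall at Leeds Central": ["Mobile oat flat white order unfulfillable"],
    "Mobile oat flat white order unfulfillable": ["Formal complaint and Google Review"],
}


def find_chain(start, goal):
    """Breadth-first search; returns the shortest node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in EDGES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def files_for(path):
    """Sorted union of the files that document each step on the path."""
    return sorted(set().union(*(NODE_FILES[node] for node in path)))


chain = find_chain("Wakefield depot migration failure",
                   "Formal complaint and Google Review")
chain_files = files_for(chain)
```

Here `chain` holds the four steps in order and `chain_files` lists every file a retriever would need to surface to answer Query 20 in full.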
Entity Cross-Reference Map
Named entities act as retrieval anchors. The more files an entity appears in, the more it can bridge unrelated-seeming documents during graph traversal.
| Entity | Type | Referenced in Files |
|---|---|---|
| NorthBrew Supplies | supplier | 01, 03, 04, 07, 08 |
| Sarah Mitchell | person | 01, 02, 03, 05, 06, 07 |
| Leeds Central | branch | 01, 02, 03, 04, 05, 06, 07, 08 |
| Manchester Piccadilly | branch | 01, 04, 06, 07, 08 |
| Espresso machine failure | incident | 01, 02, 04, 06, 07 |
| Oat milk shortage | supply | 01, 02, 03, 04, 07, 08 |
| Mobile ordering rollout | technology | 01, 04, 05, 07, 08 |
| Staffing shortage | hr | 01, 04, 05, 07 |
| Marcus Webb | person | 02, 06 |
| Dominic Ferrara | person | 03, 07, 08 |
| Amara Osei | person | 04, 07 |
| James Rowley | person | 01, 08 |
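One way to read the table is as a lookup from entity to files, from which the entities that bridge any two documents can be derived. The sketch below assumes a plain dictionary representation; the `bridges` helper and variable names are hypothetical, while the entity-to-file data is copied from the table.

```python
# Entity -> files it appears in, copied from the cross-reference table.
ENTITY_FILES = {
    "NorthBrew Supplies": "01 03 04 07 08".split(),
    "Sarah Mitchell": "01 02 03 05 06 07".split(),
    "Leeds Central": "01 02 03 04 05 06 07 08".split(),
    "Manchester Piccadilly": "01 04 06 07 08".split(),
    "Espresso machine failure": "01 02 04 06 07".split(),
    "Oat milk shortage": "01 02 03 04 07 08".split(),
    "Mobile ordering rollout": "01 04 05 07 08".split(),
    "Staffing shortage": "01 04 05 07".split(),
    "Marcus Webb": "02 06".split(),
    "Dominic Ferrara": "03 07 08".split(),
    "Amara Osei": "04 07".split(),
    "James Rowley": "01 08".split(),
}


def bridges(file_a, file_b):
    """Entities mentioned in both files, i.e. candidate graph links."""
    return sorted(
        name for name, files in ENTITY_FILES.items()
        if file_a in files and file_b in files
    )


# Entities ranked by how many files they appear in (most connected first).
most_connected = sorted(
    ((name, len(files)) for name, files in ENTITY_FILES.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

incident_to_maintenance = bridges("02", "06")
```

In this representation, `incident_to_maintenance` lists the entities that connect the Leeds incident report (file 02) to the maintenance report (file 06).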
20 Demo Queries
All 20 questions require retrieval across multiple documents. None can be answered correctly from a single file. Grouped by theme for demo navigation.
- 01. Which operational issues were directly caused or worsened by the NorthBrew Supplies disruption?
- 02. What branches were affected by the oat milk shortage, and what were the downstream consequences at each?
- 03. Has NorthBrew Supplies caused supply problems before, and how does the March 2024 situation compare?
- 04. What steps has BrewPulse taken to reduce dependency on NorthBrew Supplies?
- 05. Which branches experienced espresso machine failures, and what is the common root cause?
- 06. What is the relationship between the Leeds Central and Manchester Piccadilly equipment faults?
- 07. Are the espresso machine issues at Leeds and Manchester likely to recur at other branches?
- 08. What maintenance actions are currently outstanding and which carry the highest operational risk?
- 09. What problems emerged after the mobile ordering system launched in North England?
- 10. Why did the oat milk ordering bug take two weeks to fix, and what was the customer impact?
- 11. How did the timing of the mobile ordering rollout interact with other operational problems?
- 12. What should be done differently before the Midlands cohort rollout in May?
- 13. Which branches had staffing shortages in Q1 2024, and what caused them?
- 14. How did understaffing at Leeds Central compound the impact of the equipment failure?
- 15. What is the risk to Easter trading given current headcount levels?
- 16. Which cities mentioned both staffing shortages and customer complaints in the same period?
- 17. What is the connection between the espresso machine fault and customer complaints about drink quality?
- 18. Which customer complaints can be traced back to a supplier issue rather than a branch-level failure?
- 19. If you were the Head of Operations reviewing Q1 2024, what would you identify as the single most important systemic risk to address?
- 20. Trace the full chain of events from the NorthBrew Supplies Wakefield depot failure through to the Google Reviews complaints at Leeds Central.
Retrieval Challenge Design
Six deliberate design properties ensure the dataset rewards multi-document retrieval and punishes naive keyword search or single-document lookup.
Indirect References
Causes and effects are named differently across documents. The system must bridge vocabulary gaps semantically.
Multi-Hop Causal Chains
The full cause-effect story spans 3–4 documents. Single-document retrieval gives an incomplete and potentially misleading answer.
Ambiguous Attribution
Some complaints could plausibly be blamed on the app, the supplier, or the equipment. Only multi-doc retrieval disambiguates.
Timeline References
February events appear in March documents as assumed context. The system must reconstruct timelines across report dates.
Partial Information
Each document holds a piece of the puzzle. A correct answer requires synthesising fragments from 2–5 files.
Control Case (Sheffield)
Sheffield's good performance is explained by a decision documented across two separate files — requiring synthesis to understand why.
Dataset Methodology
BrewPulse Coffee is synthetic: generated with Claude and manually curated. This is disclosed openly because the methodology is itself part of the engineering work on show, not a limitation to hide.
AI-assisted synthetic corpus design
Documents were generated with Claude, co-designed with deliberate retrieval failure modes in mind. Entity recurrence, vocabulary gaps, and multi-hop dependencies were specified upfront — not left to chance.
Manual curation and ground truth
Each of the 20 queries has a manually verified ground truth mapping: expected files, expected retrieval method winner, and the reasoning. This enables precision/recall evaluation of any retrieval change.
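A file-level precision/recall check against such a ground-truth mapping might look like the sketch below. The expected set for Query 20 follows the four files named in the causal chain; the `retrieved` list is a made-up example, and the function name is an assumption rather than part of the demo's code.

```python
def precision_recall(retrieved, expected):
    """File-level precision and recall for one query.

    retrieved: file IDs returned by the retriever (duplicates ignored)
    expected:  file IDs in the manually verified ground truth
    """
    retrieved_set = set(retrieved)
    expected_set = set(expected)
    hits = len(retrieved_set & expected_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(expected_set) if expected_set else 0.0
    return precision, recall


# Query 20: expected files per the causal chain (01, 03, 04, 08).
expected_q20 = {"01", "03", "04", "08"}

# Hypothetical retriever output, for illustration only.
retrieved = ["03", "04", "07"]

precision, recall = precision_recall(retrieved, expected_q20)
```

Running the same function over all 20 queries before and after a retrieval change gives a simple regression signal.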
Custom graph extractor (not off-the-shelf)
Graphify v0.8.13 was tested and rejected — 30–40% entity recall and a streaming bug. A custom extractor was written using Claude tool use with domain-specific entity typology and confidence scoring. Build cost: $0.11.
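The typology and confidence scoring can be pictured as a post-extraction filter. The sketch below is an assumption about how such a filter could work, not the extractor's actual code: the type names are taken from the entity table, while the `Entity` class, the `accept` helper, the 0.7 threshold, and the sample entities are all hypothetical.

```python
from dataclasses import dataclass

# Entity types as they appear in the cross-reference table.
ENTITY_TYPES = {"supplier", "person", "branch", "incident",
                "supply", "technology", "hr"}

# Hypothetical cut-off; the real extractor's threshold is not documented here.
MIN_CONFIDENCE = 0.7


@dataclass(frozen=True)
class Entity:
    name: str
    type: str
    confidence: float


def accept(entities, min_confidence=MIN_CONFIDENCE):
    """Keep entities whose type is in the typology and whose score clears the threshold."""
    return [
        e for e in entities
        if e.type in ENTITY_TYPES and e.confidence >= min_confidence
    ]


# Made-up extractor output, for illustration only.
raw = [
    Entity("NorthBrew Supplies", "supplier", 0.95),
    Entity("Bar 2", "equipment", 0.90),     # rejected: type not in typology
    Entity("Marcus Webb", "person", 0.40),  # rejected: below threshold
]
kept = accept(raw)
```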
File-level chunking via Voyage AI
Documents are short (~640 words). File-level chunking is used — no sliding window needed. Voyage AI voyage-3 embeddings stored in pgvector on Supabase. Total embedding cost: $0.0004.
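With one embedding per file, retrieval reduces to ranking eight vectors by similarity to the query vector. The sketch below shows that ranking in plain Python using cosine similarity; in the real setup the vectors would come from voyage-3 and the ranking would run inside pgvector. The three-dimensional vectors are toy values invented for illustration, and `rank_files` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_files(query_vec, file_vecs, k=2):
    """Return the k file IDs whose vectors are most similar to the query."""
    scored = sorted(
        file_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [file_id for file_id, _ in scored[:k]]


# Toy file-level vectors (one per file); real embeddings are much larger.
file_vecs = {
    "03": [1.0, 0.0, 0.0],
    "04": [0.0, 1.0, 0.0],
    "08": [0.7, 0.7, 0.0],
}
query_vec = [1.0, 0.2, 0.0]

top_files = rank_files(query_vec, file_vecs, k=2)
```

Because each file is a single chunk, a hit always returns the whole document, so no chunk reassembly step is needed.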
Ready to explore the retrieval?
Try the 20 queries live — see graph vs semantic side by side.