A structured evidence appendix presenting the complete double diamond design thinking process — from Point of View through Ideation, Consolidation, Feasibility Study, Concept Development, Success Criteria, and Methodology Critique.
The canonical short form was used as the only briefing material for the three anonymous ideation sessions. Each session received this statement and nothing else — no project context, no financial data, no knowledge base access.
The primary methodological risk in this project is the use of prompted, managed satisfaction metrics as the primary measure of customer health. NPS and post-interaction surveys are administered by the business and answered in the context of a direct relationship with the brand. They consistently overstate satisfaction because they measure willingness to respond positively in a managed context, not the actual state of the customer relationship.
The unmediated peer-to-peer community forum — where customers speak to each other, not to the brand — produces a materially different signal. In this context, customers express authentic sentiment without the social pressure of a direct brand relationship. The divergence between these two data sources is not noise. It is the finding.
Any concept developed in response to P1 must use unmediated community signal as its primary data source. Concepts that rely on prompted metrics — even as a secondary input — risk replicating the measurement failure they are designed to solve.
Human-in-the-loop oversight is required at every point where AI-generated signal is translated into a customer communication or operational decision. The system detects. The human decides.
The decision to run three isolated sessions with no shared context was the single most consequential structural choice in the methodology. Each session received only the anonymised POV statement. No session could see the others' output before generating its own. This manufactured genuine divergence rather than the pseudo-divergence of sequential prompting within a single conversation.
Why this matters: Convergence across isolated sessions is structurally different from convergence within a single session. When three isolated processes reach the same idea independently, that convergence is evidence — not an artefact of shared context. Community Vocabulary Shift appeared independently in all three sessions, making it the single most structurally validated idea in the corpus.
Financial context and project knowledge were deliberately excluded from all three ideation sessions. Ideas should not be filtered for commercial viability at the point of generation — premature commercial anchoring eliminates structurally important ideas before they can be assessed. The feasibility study made the commercial case for retained ideas retrospectively at Stage 4, not at Stage 2.
| Cluster | Ideas | Focus | Territory |
|---|---|---|---|
| Signal Intelligence | 01–04 | Making the unmanaged community signal legible | Signal |
| Pre-Departure Interception | 05–07 | Detecting the decision before it is announced | Signal |
| Relational Stewardship | 08–11 | Mutual recognition, reciprocity, and radical transparency | Relational |
| Co-Ownership & Influence | 12–13 | Highest-value subscriber as structural stakeholder | Relational |
| Physical Delivery as Trust Signal | 14–16 | The box as a trust instrument, not a fulfilment event | Operational |
| Legacy & Memory | 17–18 | Making the subscriber's investment visible and honoured | Relational |
| # | Idea Title | Cluster | Horizon |
|---|---|---|---|
| 01 | Forum Divergence Score | Signal Intelligence | Quick Win |
| 02 | The Silence Classifier | Signal Intelligence | Quick Win |
| 03 | Community Vocabulary Shift | Signal Intelligence | Quick Win |
| 04 | Referral Velocity Reversal | Signal Intelligence | Quick Win |
| 05 | The Anticipation Metric | Pre-Departure Interception | Quick Win |
| 06 | The Warm Goodbye Classifier | Pre-Departure Interception | Medium-Term |
| 07 | The Peer Influence Map | Pre-Departure Interception | Medium-Term |
| 08 | Exit Interview Before They Leave | Relational Stewardship | Quick Win |
| 09 | Commitment Reciprocity Test | Relational Stewardship | Quick Win |
| 10 | The Relationship Ledger | Relational Stewardship | Medium-Term |
| 11 | The Honest Annual Report | Relational Stewardship | Medium-Term |
| 12 | The Tenure Council | Co-Ownership & Influence | Ambitious |
| 13 | The Disagreement Forum | Co-Ownership & Influence | Medium-Term |
| 14 | Failure-First Insert | Physical Delivery as Trust Signal | Quick Win |
| 15 | The Grief Protocol Box | Physical Delivery as Trust Signal | Quick Win |
| 16 | Counterintuitive Downgrade Offer | Physical Delivery as Trust Signal | Medium-Term |
| 17 | Contribution Archaeology | Legacy & Memory | Ambitious |
| 18 | The Alumni Network | Legacy & Memory | Medium-Term |
| Metric | Figure | Source |
|---|---|---|
| Annual churn rate — highest LTV cohort | 24% | CRM analysis |
| Estimated UK high-LTV Angel cohort | ~24,000 | Segment estimate |
| Annualised revenue per Angel | ~£324 | HY26 derived |
| Replacement cost per churned Angel | ~£398 | CAC estimate |
| Acquisition payback window | 44 months | HY26 derived |
| Annual replacement burden (current) | ~£2.3m | Calculated |
| Revenue preserved per 5pp churn reduction | ~£478,000 | Calculated |
| Group NPS | 76 | HY26 reported |
| ICO maximum fine exposure (group) | ~£8m | 4% of £200m group revenue |
| Dimension | RAG | Assessment |
|---|---|---|
| Strategic Feasibility | GREEN | The prioritised ideas align tightly with the stated business direction of a smaller, materially more profitable business centred on core Angel retention. The seven Quick Win ideas are low-capital, high-signal interventions operating on existing data infrastructure. The strategic direction demands high-LTV retention as the primary value driver; the ideation output treats this as the organising principle throughout. |
| Commercial Feasibility | GREEN | The commercial case is strongly positive. At 24% annual churn in the highest-LTV cohort, with replacement cost of approximately £398 per churned Angel and a 44-month acquisition payback window, the cost of inaction at ~£2.3m annually substantially exceeds any credible implementation cost for the ideas prioritised. The Quick Win ideas individually carry negligible implementation cost relative to this figure. |
| Technical Feasibility | GREEN | All AI capabilities required — NLP sentiment analysis, temporal pattern recognition, change-point detection, generative personalisation — are mature, commercially available, and deployable without bespoke model development. The Quick Win ideas rely on NLP techniques operational in production environments in 2025–26. No capability gap exists that would prevent deployment within a 6–12 month timeframe. |
| Operational Feasibility | AMBER | The Quick Win ideas are implementable within the constraints of a business undergoing active cost reduction and restructuring, provided implementation is sequenced correctly. Ideas 01–05 are derivable from existing CRM and email platform data with modest engineering investment. The full 18-idea set implemented simultaneously would exceed operational capacity — but the Quick Win subset is operationally viable as a phased first tranche. |
| Human & Ethical Feasibility | AMBER | The ideation output is ethically well-constructed in intent. However, an Amber rating is warranted because the ideas do not specify the HITL architecture in sufficient detail. The Silence Classifier and Community Vocabulary Shift depend on NLP models that, if poorly calibrated, could reduce the nuanced signal of concluded loyalty to a binary churn flag — recreating the managed metric problem the project was designed to solve. Resolvable at concept development stage. |
| Reputational Feasibility | GREEN | The UK Angel community is values-aware and digitally engaged. The ideas most likely to generate positive reputational impact demonstrate radical transparency: the Honest Annual Report, the Commitment Reciprocity Test, and the Failure-First Insert each signal that the company trusts the Angel with the truth before she discovers it independently in the forum. Proactive transparency is the reputationally correct posture for the UK market specifically. |
| # | Condition | Detail |
|---|---|---|
| 1 | DPIA Completed | A Data Protection Impact Assessment under UK GDPR Article 35 must be completed and approved before any AI-driven behavioural profiling is deployed. This is a hard legal gate. ICO maximum fine exposure at group level is approximately £8m at current revenue. |
| 2 | HITL Design Specified | The Human-in-the-Loop architecture must be designed and documented before deployment. The interface through which community signal reaches human reviewers must preserve verbatim community language, not only model scores. The individuals authorised to act must have relationship authority, not only retention incentives. |
| 3 | Quick Win Sequencing Agreed | Recommended first tranche: Forum Divergence Score, Community Vocabulary Shift, and Failure-First Insert — three ideas requiring no new customer-facing infrastructure, validatable against existing data before broader rollout. |
| 4 | Target Cohort Defined | Before Exit Interview Before They Leave or Commitment Reciprocity Test are deployed, the specific tenure cohort at risk must be defined from CRM data — minimum 5-year tenure, pre-funded balance, historical referral activity. |
| 5 | Senior Leadership Sponsorship | The Tenure Council and Disagreement Forum require that senior leadership are genuinely willing to receive and respond to structured criticism from the Angel community. Without that commitment, both ideas become performative — and performative transparency with a values-aware community is more damaging than no transparency at all. |
P1 is the silent churn of long-tenure high-LTV subscribers caused by operational delivery failure — invisible to prompted satisfaction metrics but evidenced in unsolicited peer-to-peer community forum sentiment. The company's NPS of 76 says everything is fine. The unmanaged community forum says the highest-value customers have already decided.
The Quiet Signal System is designed to close the gap between what the managed metric measures and what the unmanaged community reveals — and to do so before the subscriber acts on their conclusion rather than after.
Before the Quiet Signal System is live, the Angel rates 9/10 when asked, posts less frequently in the community forum, stops referring friends, and opens delivery emails later. No complaint is lodged. The company's dashboard shows health. She has already decided.
After the system is live, the Angel receives a letter — not a survey, not a discount code. The letter names her specific five-year contribution, her referrals, a product she helped shape. It asks one question: what must we do to deserve year six? It arrives before she has acted on her conclusion. She did not ask for this. She did not expect it.
What changes is not the system's sophistication — it is the relational register. The company stops measuring her and starts listening to her. The gap between the managed metric and her lived reality is closed not by a better model, but by a human who read what she actually wrote and chose to respond as though it mattered.
The Quiet Signal System passes all five conditions set in the Feasibility Study. The Quick Win ideas (Forum Divergence Score, Community Vocabulary Shift, Silence Classifier, Anticipation Metric, Failure-First Insert, and Grief Protocol Box) are deployable within existing CRM and email platform infrastructure without requiring new customer-facing technology. The HITL architecture is explicitly designed to surface verbatim forum language to human reviewers rather than model scores — directly addressing the Amber rating on Human and Ethical Feasibility. The DPIA and legal review workstreams are designated as parallel prerequisites, not sequential gates. The Ambitious ideas (Tenure Council and Contribution Archaeology) are scoped as a separate governance conversation that does not block the Quick Win sprint. The commercial case — at £2.3m annual replacement burden with a 44-month payback window — is sufficient to justify the concept's development and prototype investment.
Incorporated in the concept: Ideas 01–05 (Signal Intelligence and Pre-Departure Interception Quick Wins), 08–09 (Relational Stewardship Quick Wins), 14–15 (Physical Delivery Quick Wins), 06 (Warm Goodbye Classifier), and 18 (Alumni Network). These form the complete operating loop of the Quiet Signal System from detection through intervention to post-outcome stewardship.
What is being tested: The single foundational assumption the entire system depends on — that AI can reliably distinguish concluded departure from active complaint in unmediated peer community language. This is the specific signal that algorithmic sanitisation consistently misses and that makes the metric-truth gap possible.
Why this is the correct place to start: Before any relational intervention is built, before any letter is sent, before any human steward is trained to act, the system must prove that Layer 1 is reliable. If the Silence Classifier cannot reliably distinguish 'I am not angry, I am finished' from 'I am frustrated but engaged,' all subsequent intervention layers are built on a false positive. Layer 1 must be validated first.
What the dashboard does: The relationship steward opens the dashboard and sees a short list of Angel subscribers whose Layer 1 signals have crossed the escalation threshold. For each subscriber, the dashboard surfaces: the verbatim community forum language that triggered the signal, not a score; a five-year history brief showing tenure, referral count, pre-funded balance, and last meaningful community contribution; the Anticipation Metric decay curve showing email open-velocity over tenure; and the specific Layer 1 signal type that triggered the alert — vocabulary shift, silence classification, or divergence score.
The steward reads first. Then decides. The system advises — the human acts. No communication is generated, no intervention is triggered, and no signal is acted upon without the steward's explicit decision. This is the HITL architecture made operational.
The storyboard and service blueprint below were produced as the visual outputs of the Concept Development stage. They are rendered here in their original form, as produced by the Quiet Signal System concept development prompt.
| What Is Being Measured | Successful Result | Measurement Tool / Approach |
|---|---|---|
| Whether the Angel who receives a relational intervention experiences it as genuine, not commercial | At least 70% of Angels who receive a letter report (when asked) that it did not feel like a retention script. The letter referenced specific, accurate details about their tenure. | Post-intervention qualitative interview (small sample, not a survey). Verbatim responses reviewed by human steward, not scored by NLP. |
| Whether the intervention arrives before the Angel has taken any cancellation action | 100% of Layer 1 alerts result in human steward review before the subscriber's next billing cycle. No intervention arrives after a cancellation decision has been made. | CRM timestamp comparison: alert generation date vs. cancellation action date. Alert must precede cancellation by a minimum of 14 days to qualify as a genuine interception. |
| Whether the Angel's community language recovers after a successful intervention | Community Vocabulary Shift model detects recovery of 'we' language (vs. 'they') within 90 days of intervention in at least 50% of renewed subscribers. | Layer 1 longitudinal NLP monitoring post-intervention. Pronoun ratio tracked at 30, 60, and 90 days. Compared against pre-intervention baseline. |
| What Is Being Measured | Successful Result | Measurement Tool / Approach |
|---|---|---|
| Whether the dashboard output is readable and actionable without further interpretation | Actionable is defined as: the steward can read the dashboard entry and make a binary decision — intervene or monitor — within 10 minutes, without consulting any other data source. The verbatim forum language at the top of the entry must be the primary input to that decision. | Steward time-to-decision logged per alert. Steward confidence score (1–5) collected per alert. Correlation between confidence score and intervention outcome tracked across prototype period. |
| Whether the AI draft letter requires substantial rewriting or only minor editing | At least 60% of AI draft letters are approved by the steward with edits of fewer than 50 words. Letters requiring substantial rewrite (150+ words changed) are flagged as Layer 3 failures and reviewed. | Word diff between AI draft and steward-approved version, logged per letter. Patterns in high-edit letters reviewed monthly to identify systematic Layer 3 weaknesses. |
| Whether the steward reports that the verbatim forum language accurately reflects the subscriber's departure trajectory | At least 80% of steward post-intervention assessments confirm that the forum language surfaced by the dashboard was consistent with the subscriber's subsequent behaviour (cancellation, renewal, or continued engagement). | Steward retrospective assessment at 90 days post-intervention. Binary: did the signal prove accurate? Logged per case. Used to calibrate Layer 1 threshold settings. |
| What Is Being Measured | Successful Result | Measurement Tool / Approach |
|---|---|---|
| Whether the prototype demonstrates a statistically meaningful churn reduction in the intervened cohort | Churn rate in the prototype cohort (Angels who received a Layer 1 alert and a subsequent intervention) is at least 8 percentage points lower than the control cohort churn rate over the same period. At current £398 replacement cost, this represents approximately £380,000 in saved replacement cost per 1,000 Angels in the prototype cohort. | Randomised control trial design: 50% of Angels flagged by Layer 1 in the prototype period receive intervention; 50% are monitored without intervention. 12-month churn comparison. CRM survival analysis. |
| Whether the commercial case for building the full system is validated by the prototype result | If the prototype demonstrates an 8pp churn reduction in the tested cohort, the projected annual benefit at full deployment (24,000 Angel cohort) exceeds the full system build cost within 18 months of deployment. Prototype result becomes the commercial gate for full system investment approval. | Financial model updated with actual prototype churn results. Presented to senior leadership at prototype close-out. Commercial gate decision documented. |
The following simulation comments are designed to rigorously test the full range of signal types the Layer 1 classifier must detect. Satisfactory examples confirm correct classification; unsatisfactory examples expose the limits and failure modes the classifier must be trained to handle.
Each indicator becomes measurable only once the prototype has validated Layer 1. None of these targets should be pursued as operational goals before the Horizon 1 prototype has confirmed the classifier's reliability.
| Stage | Phase | Where Run | Purpose | Structural Logic |
|---|---|---|---|---|
| 1 | POV Generation | Project chat with full knowledge base | Synthesise strongest strategic POV grounded in P1 | Single session, no isolation needed — the POV is a convergent output, not a divergent one |
| 2 | Ideation | Three anonymous browser windows — no shared context | Generate maximum idea diversity without cross-contamination | Isolation manufactures genuine divergence; convergence across sessions becomes evidence, not artefact |
| 3 | Consolidation | Separate standalone chat — no project knowledge base | De-duplicate, cluster, and select 15–20 ideas | No pre-imposed taxonomy; natural clusters must emerge from the ideas themselves |
| 4 | Feasibility Study | Project chat with full knowledge base including financials | Assess viability; quantify cost of inaction; identify critical gaps | Financial context enters here, not at ideation — protecting early-stage ideas from premature commercial filtering |
| 5 | Concept Development | Project chat — builds on feasibility findings | Synthesise ideas into a single coherent named concept with AI capability stack | Concept is bounded by feasibility conditions; HITL architecture made explicit in prototype specification |
| 5a | Success Criteria | Project chat — adds Defining Success framework | Generate measurable criteria across two horizons: prototype and full system | Horizon 1 tests the single foundational assumption before Horizon 2 is pursued |
The decision to run three isolated sessions using only the anonymised POV, with no shared context and no cross-contamination between windows, was the single most consequential structural choice in the entire methodology. The evidence that it worked is not the volume of ideas produced — 161 ideas across nine territory-sessions is a quantity any single session could approximate — but the convergence signal it generated. Community Vocabulary Shift appeared independently in all three sessions.
When three isolated sessions converge on the same idea, that convergence is the signal.
That convergence finding could not have been produced by a single session, however well-prompted. The methodology manufactured a form of triangulation that is structurally impossible in conventional chat-based use. This was deliberate, and it worked.
Running the consolidation in a separate chat with no project knowledge base, using only the three pasted outputs, meant the consolidator had to find natural patterns rather than confirm predetermined ones. The six clusters that emerged — Signal Intelligence, Pre-Departure Interception, Relational Stewardship, Co-Ownership and Influence, Physical Delivery as Trust Signal, Legacy and Memory — were not imposed in advance. The feasibility study then uses these clusters as its organising architecture, which means the emergent structure held up through subsequent scrutiny. Imposed taxonomies rarely do.
Embedding the HITL Critical Finding as a named, labelled knowledge base entry that every subsequent prompt was required to address directly prevented a predictable failure mode: that a design process about detecting unmediated signal would itself produce outputs optimised for managed metrics. The feasibility study's Amber rating on Human and Ethical Feasibility — specifically the observation that the Silence Classifier could recreate the managed metric problem it was designed to solve — demonstrates that the constraint was actively applied, not merely cited. A constraint that does not generate friction has not been embedded.
The Financial Context document entered the knowledge base at the Feasibility Study stage, not at Ideation. This sequencing meant ideation outputs were not anchored to commercial viability at the point of generation. Ideas like the Last Box Protocol, the Alumni Network, and the Tenure Council are commercially expensive to defend in an ideation room. Had the £398 replacement cost and 44-month payback window been present during ideation, these ideas would have been self-censored before reaching the consolidation. The Feasibility Study made the commercial case for them retrospectively — which is the correct sequence.
The Financial Context document entered the knowledge base at Stage 4 — the feasibility study — rather than Stage 2 during ideation. This sequencing was almost certainly not a deliberate decision to protect the ideation phase from commercial anchoring; it reads more like the financial data being assembled when needed. But the accidental effect was significant: the ideation outputs were not constrained by the £398 replacement cost figure or the 44-month payback window.
Ideas like the Last Box Protocol, the Alumni Network, and the Tenure Council — which are commercially expensive to defend at ideation stage — survived into the consolidation because no one was filtering against a cost model. The feasibility study then made the commercial case for them retrospectively. If the financial context had been present during ideation, some of the most structurally interesting ideas would likely have been self-censored before they reached the consolidation.
The verbatim "I am not angry. I am finished." is embedded in the POV and appears repeatedly across all three ideation outputs and the feasibility study. Its persistence across sessions suggests it did genuine generative work — anchoring the ideation in a specific human register rather than a generic churn problem.
The methodology benefited from having access to unusually precise qualitative evidence, and the prompt design was good enough not to dilute it. That combination was fortunate. A less specific verbatim — or no verbatim at all — would have produced a materially different ideation corpus. The difference between a well-evidenced discovery phase and a generic brief is visible in every subsequent output.
This is the most significant structural weakness. The feasibility study correctly identifies the mandatory Data Protection Impact Assessment as a hard legal deployment gate, and the AI Risk Register is thorough on UK GDPR Article 35 exposure, Equality Act obligations, and ICO fine risk. But both documents treat legal compliance as a condition to be satisfied after the concept is formed. The DPIA gap, the missing Standard Contractual Clauses, the absence of a Legitimate Interests Assessment — these are named in the feasibility study as remediation requirements, not as constraints that shaped the concept.
The consequence is that several of the highest-priority ideas — the Silence Classifier, the Community Vocabulary Shift monitor, the Peer Influence Map — involve behavioural profiling of identifiable subscribers at a scale that may require material redesign once legal review is completed. The feasibility study acknowledges this but defers it: "the concept development phase must include legal and data protection review as a parallel workstream." A parallel workstream to concept development is not the same as a design input to it.
If the DPIA had been scoped — even at a high level — before the consolidation and concept development stages, the AI capability stack might have been designed differently from the outset. Running the process again, a simplified privacy-by-design checklist should be present in the knowledge base from Stage 3 onwards, not introduced as a remediation condition in Stage 4.
| Critique Dimension | Verdict | Most Significant Evidence |
|---|---|---|
| Isolation at Ideation Stage | Worked by design | Community Vocabulary Shift convergence across all three sessions — impossible to produce without structural isolation. |
| Consolidation as Distinct Stage | Worked by design | Six emergent clusters held up as the organising architecture through the feasibility study and concept development — not replaced or revised. |
| HITL Critical Finding as Persistent Constraint | Worked by design | Feasibility Study Amber rating on Human & Ethical Feasibility — the constraint generated active friction, not just citation. |
| Financial Context Timing | Worked by accident | Last Box Protocol, Alumni Network, and Tenure Council survived into consolidation because no commercial filter was present at ideation. |
| POV Emotional Specificity | Worked by accident | "I am not angry. I am finished." appeared in all three sessions and in the feasibility study — generative anchor that was not designed to function this way. |
| DPIA as Post-Hoc Condition | Structural weakness | Several highest-priority ideas may require material redesign after legal review — the concept is commercially and strategically compelling but legally provisional. |