# Graphical representation of the Continuous Improvement Flywheel

flowchart TD
    %% Style definitions
    classDef development fill:#F8F4FF,stroke:#6B46C1,stroke-width:2px,color:#1F2937
    classDef evaluation fill:#F0FDF4,stroke:#16A34A,stroke-width:2px,color:#1F2937
    classDef ci fill:#FEF2F2,stroke:#DC2626,stroke-width:2px,color:#1F2937
    classDef deployment fill:#F0F9FF,stroke:#2563EB,stroke-width:2px,color:#1F2937
    classDef monitoring fill:#FFFBEB,stroke:#D97706,stroke-width:2px,color:#1F2937
    classDef analysis fill:#FAFAF9,stroke:#78716C,stroke-width:2px,color:#1F2937
    classDef improvement fill:#FFF7ED,stroke:#EA580C,stroke-width:2px,color:#1F2937

    %% Main cycle nodes
    A["🛠️ 1. Develop & Analyze (Initial)<br/><br/>📥 Inputs:<br/>• Initial prompts<br/>• RAG configuration<br/>• Connected tools<br/><br/>📤 Outputs:<br/>• Initial failure modes<br/>• Error analysis (1st pass)"]
    
    B["📊 2. Measure & Build Evaluations<br/><br/>🎯 Actions:<br/>• Translate failures to metrics<br/>• Create automated evaluators<br/>• Build initial golden dataset<br/>• Collaborative evaluation<br/><br/>📁 Artifacts:<br/>• Golden dataset<br/>• Evaluators<br/>• Metrics"]
    
    C["⚙️ 3. Integrate Evaluation in CI<br/><br/>🔧 Actions:<br/>• Integrate evaluators as tests<br/>• Add golden dataset as regression tests<br/><br/>📋 Artifacts:<br/>• CI pipeline with tests"]
    
    D["🚀 4. Deploy (CD) with Observability<br/><br/>📡 Actions:<br/>• Instrumented deployment<br/>• Evaluators on real traffic<br/>• LLM-Judge pinning"]
    
    E["📈 5. Monitor Online Performance<br/><br/>👀 Actions:<br/>• Track quality metrics (θ̂, confidence interval)<br/>• Dashboards + alerts<br/>• Product metrics<br/>• User feedback"]
    
    F["🚨 6. Identify Deviations/New Failures<br/><br/>🔍 Actions:<br/>• Drift detection<br/>• Feedback analysis<br/>• Proactive discovery<br/>• Human sampling"]
    
    G["🧠 7. Re-Analyze (New Errors)<br/><br/>🔬 Actions:<br/>• Error analysis on problem traces<br/>• Review recently identified issues"]
    
    H["🔄 8. Update Evaluation Artifacts<br/><br/>✨ Actions:<br/>• Add to golden dataset<br/>• Refine evaluators<br/>• Validate LLM-judge (TPR/TNR)<br/>• Update CI tests"]
    
    I["⚡ 9. Improve Pipeline<br/><br/>🎨 Strategies:<br/>• Refine prompts<br/>• Decompose tasks<br/>• Adjust RAG<br/>• Improve tools<br/>• Judicious fine-tuning"]
    
    J["🔁 10. Redeploy & Iterate<br/><br/>🚀 Actions:<br/>• Launch improved version<br/>• Resume monitoring"]

    %% Main cycle flow
    A --> B
    B --> C
    C --> D
    D --> E
    E --> F
    F --> G
    G --> H
    H --> C
    H --> I
    I --> J
    J --> D

    %% Feedback connections
    F -.-> E
    G -.-> B
    I -.-> A

    %% Apply styles
    class A development
    class B evaluation
    class C ci
    class D deployment
    class E monitoring
    class F analysis
    class G analysis
    class H evaluation
    class I improvement
    class J deployment

    %% Explanatory notes (left side)
    K["💡 CYCLE CHARACTERISTICS<br/><br/>🔄 Non-linear loop:<br/>   Phases can be executed<br/>   in a different order<br/>   depending on context<br/><br/>⚡ Step skipping:<br/>   Phases can be omitted<br/>   when not needed<br/><br/>🔁 Internal cycles:<br/>   Iterations within<br/>   process subsections<br/><br/>📡 Continuous feedback:<br/>   Information flows in<br/>   multiple directions"]
    
    %% Position notes to the left
    K ~~~ A
    
    style K fill:#FAFAFA,stroke:#6B7280,stroke-width:1px,color:#374151,stroke-dasharray: 3 3
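
Steps 5 and 8 of the diagram mention a quality estimate with a confidence interval and validating the LLM judge through its TPR/TNR. The diagram does not say how these are computed, so the following is only a minimal sketch under stated assumptions: that θ̂ denotes the pass rate after correcting for judge error (using the standard Rogan-Gladen correction), and that the interval comes from a percentile bootstrap over judged production traces with TPR and TNR treated as fixed. The function names `judge_rates`, `corrected_pass_rate`, and `bootstrap_ci` are hypothetical and not taken from any library.

```python
import random


def judge_rates(human_labels, judge_labels):
    """Return (TPR, TNR) of the judge measured against human pass/fail labels."""
    pairs = list(zip(human_labels, judge_labels))
    pos = sum(1 for h, _ in pairs if h)
    neg = len(pairs) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need both passing and failing human labels")
    tp = sum(1 for h, j in pairs if h and j)
    tn = sum(1 for h, j in pairs if not h and not j)
    return tp / pos, tn / neg


def corrected_pass_rate(observed_rate, tpr, tnr):
    """Rogan-Gladen correction of the judge's observed pass rate."""
    denom = tpr + tnr - 1.0
    if denom <= 0:
        raise ValueError("judge is no better than chance; correction undefined")
    theta = (observed_rate + tnr - 1.0) / denom
    return min(1.0, max(0.0, theta))  # clip to a valid proportion


def bootstrap_ci(judge_verdicts, tpr, tnr, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the corrected rate.

    Assumption: only the production verdicts are resampled; TPR and TNR
    are treated as known, which understates the true uncertainty.
    """
    rng = random.Random(seed)
    n = len(judge_verdicts)
    estimates = []
    for _ in range(n_boot):
        sample = [judge_verdicts[rng.randrange(n)] for _ in range(n)]
        estimates.append(corrected_pass_rate(sum(sample) / n, tpr, tnr))
    estimates.sort()
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Illustrative, made-up labels: 20 human-labeled traces to validate the judge.
human = [True] * 10 + [False] * 10
judge = [True] * 9 + [False] + [False] * 8 + [True] * 2
tpr, tnr = judge_rates(human, judge)

# Illustrative, made-up production verdicts from the same judge.
verdicts = [True] * 75 + [False] * 25
theta_hat = corrected_pass_rate(sum(verdicts) / len(verdicts), tpr, tnr)
ci_low, ci_high = bootstrap_ci(verdicts, tpr, tnr)
```

A dashboard in step 5 could then plot `theta_hat` with `(ci_low, ci_high)` per time window, and step 8 would re-run `judge_rates` whenever the golden dataset or the judge prompt changes.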