<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <id>https://jigsawflux.org/blog/</id>
    <title>JigsawFlux Blog</title>
    <updated>2026-06-28T00:00:00.000Z</updated>
    <generator>https://github.com/jpmonette/feed</generator>
    <link rel="alternate" href="https://jigsawflux.org/blog/"/>
    <subtitle>Updates from the JigsawFlux open-source community</subtitle>
    <icon>https://jigsawflux.org/blog/img/favicon.ico</icon>
    <rights>Copyright © 2026 JigsawFlux</rights>
    <entry>
        <title type="html"><![CDATA[Picking an Open-Source Agent Framework: LangGraph, CrewAI, and AutoGen]]></title>
        <id>https://jigsawflux.org/blog/comparing-agent-frameworks</id>
        <link href="https://jigsawflux.org/blog/comparing-agent-frameworks"/>
        <updated>2026-06-28T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A hands-on comparison of three open-source agentic frameworks — LangGraph, CrewAI, and AutoGen — implementing the same two-agent pipeline with Anthropic Claude, with telemetry and a decision guide.]]></summary>
        <content type="html"><![CDATA[<p>The first decision in any agentic project isn't which model to use. It's which framework will orchestrate it. Get that wrong and you inherit a stack you can't run locally, can't afford to scale, and can't escape when the vendor changes the API.</p>
<p>This is a <a href="https://github.com/JigsawFlux" target="_blank" rel="noopener noreferrer">JigsawFlux</a> project. JigsawFlux builds open-source tools for health tech, humanitarian response, and crisis management — in places where "cloud-native" is not an option and IT budgets are measured in grants, not headcount. That context imposes hard constraints on every architecture decision: <strong>portability</strong>, <strong>cost</strong>, and <strong>freedom from vendor lock-in</strong>.</p>
<p>The frameworks here — <strong>LangGraph</strong>, <strong>CrewAI</strong>, and <strong>AutoGen</strong> — were chosen because they meet those constraints. They are open source, actively maintained, and run entirely on hardware you own. Alternatives like Microsoft Semantic Kernel or Amazon Bedrock Agents are capable, but they introduce hard dependencies on specific cloud ecosystems. That trade-off doesn't fit the JigsawFlux model.</p>
<p>Each framework is also the natural home for a different family of agentic patterns. That is the other reason for this grouping, and why this is Part 1 of a two-part series. Part 2 implements those patterns directly.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="the-shared-task">The shared task<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#the-shared-task" class="hash-link" aria-label="Direct link to The shared task" title="Direct link to The shared task">​</a></h2>
<p>Every framework runs the same pipeline: a <strong>Researcher agent</strong> uses a DuckDuckGo web search tool to gather facts on a given topic, then a <strong>Writer agent</strong> consumes those notes and produces a structured Markdown report. The topic in all runs was <em>solid-state batteries</em>.</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">python run.py --framework all --topic "solid-state batteries"</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>This symmetry is deliberate. Same task, same model (<code>claude-sonnet-4-6</code>), same tool — only the orchestration framework changes. That isolation means the telemetry at the end reflects framework behaviour and overhead, not model variation.</p>
<p>The full source is at <a href="https://github.com/JigsawFlux/comparing-agent-frameworks" target="_blank" rel="noopener noreferrer">github.com/JigsawFlux/comparing-agent-frameworks</a>.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="langgraph-the-stateful-graph">LangGraph: the stateful graph<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#langgraph-the-stateful-graph" class="hash-link" aria-label="Direct link to LangGraph: the stateful graph" title="Direct link to LangGraph: the stateful graph">​</a></h2>
<p>LangGraph models execution as a directed graph where nodes are functions and edges are routing decisions. State flows through every node as a typed dictionary — you define what it contains, and reducers control how it gets updated. <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-1">[1]</a></sup></p>
<p>The researcher loop is cyclic by design. The researcher calls <code>web_search</code>, gets results back via the tool node, and repeats until the model stops requesting tool calls, at which point the conditional edge routes to the writer.</p>
<p>The state schema makes the data contract explicit:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token keyword" style="font-style:italic">class</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(255, 203, 107)">AgentState</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">TypedDict</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    messages</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> Annotated</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">Sequence</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">BaseMessage</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> add_messages</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    topic</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">   
 research_notes</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    final_report</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
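<p>To make the role of a reducer concrete, here is a minimal plain-Python sketch of the idea. It is an illustration only, not LangGraph's implementation: <code>append_reducer</code>, <code>REDUCERS</code>, and <code>apply_update</code> are hypothetical names, and the real <code>add_messages</code> reducer does more than append.</p>
<pre><code class="language-python">def append_reducer(current, update):
    """Combine the old and new values by appending (hypothetical helper)."""
    return list(current) + list(update)


# Assumption for this sketch: only "messages" has a reducer;
# every other key is overwritten by the latest update.
REDUCERS = {"messages": append_reducer}


def apply_update(state, update):
    """Merge a node's partial update into the state, key by key."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is None:
            merged[key] = value
        else:
            merged[key] = reducer(state.get(key, []), value)
    return merged


state = {
    "messages": ["user: research solid-state batteries"],
    "topic": "solid-state batteries",
    "research_notes": "",
    "final_report": "",
}
state = apply_update(state, {"messages": ["ai: calling web_search"]})
state = apply_update(state, {"research_notes": "notes v1"})
print(len(state["messages"]), state["research_notes"])
</code></pre>
<p>The message history grows while the other fields are replaced, which is the behaviour the <code>Annotated</code> field in the schema asks for.</p>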
<p>The routing logic is a single function:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">should_continue</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> AgentState</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    last_message </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"messages"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token number" style="color:rgb(247, 140, 108)">1</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> last_message</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">.</span><span class="token plain">tool_calls</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"tools"</span><span class="token plain">   </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># loop back through tool node</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"writer"</span><span class="token plain">      </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># research complete</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>And the graph wiring reduces to four calls:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_edge</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">START</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"researcher"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_conditional_edges</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token string" style="color:rgb(195, 232, 141)">"researcher"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    should_continue</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 
141)">"tools"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"tools"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"writer"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"writer"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_edge</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"tools"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"researcher"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain">   </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># the cyclic loop</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_edge</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 
141)">"writer"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> END</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
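<p>The control flow those edges describe can be emulated in plain Python, which may help when reasoning about the loop. This is a sketch under stated assumptions: the three node functions are stubs that make no model or search calls, messages are plain dictionaries, and <code>run</code> is a hypothetical stand-in for the compiled graph.</p>
<pre><code class="language-python">def researcher(state):
    """Stub researcher: requests one search, then stops asking for tools."""
    has_result = any(m["role"] == "tool" for m in state["messages"])
    tool_calls = [] if has_result else ["web_search"]
    state["messages"].append({"role": "ai", "tool_calls": tool_calls})


def tools(state):
    """Stub tool node: pretends to run the requested search."""
    state["messages"].append({"role": "tool", "content": "stub search result"})


def writer(state):
    """Stub writer: turns the collected tool results into a report."""
    found = [m["content"] for m in state["messages"] if m["role"] == "tool"]
    state["final_report"] = "# Report\n" + "\n".join(found)


def should_continue(state):
    """Same rule as the routing function above, on dictionary messages."""
    return "tools" if state["messages"][-1]["tool_calls"] else "writer"


def run(state):
    """Walk the graph by hand and record which nodes were visited."""
    visited = []
    node = "researcher"                    # START edge
    while node != "END":
        visited.append(node)
        if node == "researcher":
            researcher(state)
            node = should_continue(state)  # conditional edge
        elif node == "tools":
            tools(state)
            node = "researcher"            # the cyclic edge
        else:
            writer(state)
            node = "END"                   # writer to END
    return visited


state = {"messages": [], "final_report": ""}
path = run(state)
print(path)
</code></pre>
<p>With these stubs the walk visits the researcher twice, once before and once after the tool node, and then hands off to the writer.</p>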
<p>What you get from this: fine-grained control over every routing decision, complete visibility into state at every step, and trivial output extraction (<code>state["final_report"]</code>). What it costs: you are building a graph. The mental model is powerful but requires internalising nodes, edges, reducers, and the distinction between cyclic and acyclic topologies before you can be productive.</p>
<p><strong>Agentic patterns this enables:</strong> ReAct <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-5">[5]</a></sup> (the researcher loop is a ReAct loop), Plan-and-Execute <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-6">[6]</a></sup>, ReWOO <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-7">[7]</a></sup>, Reflexion <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-8">[8]</a></sup>, DAG pipelines, human-in-the-loop via <code>interrupt()</code>.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="crewai-declarative-roles">CrewAI: declarative roles<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#crewai-declarative-roles" class="hash-link" aria-label="Direct link to CrewAI: declarative roles" title="Direct link to CrewAI: declarative roles">​</a></h2>
<p>CrewAI inverts the mental model. Instead of defining a graph, you define <strong>agents</strong> with a role, goal, and backstory, and <strong>tasks</strong> with a description and expected output. Hand them to a <code>Crew</code> and it handles the orchestration. <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-2">[2]</a></sup></p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">researcher </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> Agent</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    role</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"Senior Technology Researcher"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    goal</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"Conduct deep research on '</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">topic</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">' and compile key insights"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    backstory</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token 
punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"You are a highly analytical research specialist. To stay within strict "</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"API rate limits, use the search_tool exactly ONCE with a broad query. "</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"Produce precise, structured notes."</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    tools</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">search_tool</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    llm</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">llm</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token 
plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">writer </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> Agent</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    role</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"Expert Technical Writer"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    goal</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"Synthesize research notes into a professional technical report on '</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">topic</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">'"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    backstory</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"You are a veteran technical publisher who specialises in explaining complex "</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"advancements in clean, structured Markdown."</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    llm</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">llm</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Tasks are similarly declarative — each specifies what it needs and what it should produce:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">research_task </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> Task</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    description</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"Research '</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">topic</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">' using web_search. 
Identify: core concept, benefits, key players, barriers."</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    expected_output</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"Detailed, structured notes listing research facts."</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    agent</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">researcher</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">write_task </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> Task</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    description</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"Write a professional report on '</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation 
interpolation">topic</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">' from the research notes."</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    expected_output</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"A structured technical report in Markdown format."</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    agent</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">writer</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">crew </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> Crew</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    agents</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">researcher</span><span 
class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> writer</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    tasks</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">research_task</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> write_task</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    process</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">Process</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">sequential</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">result </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> crew</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">kickoff</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre></div></div>
<p>Context passing between tasks is implicit — CrewAI injects the previous task's output as input to the next one. You never touch state directly.</p>
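<p>To make the implicit hand-off concrete, here is a framework-free sketch of what a sequential process amounts to: each task's output is appended to the prompt the next task sees. The names <code>SketchTask</code> and <code>run_sequential</code> are hypothetical and are not part of CrewAI's API, and the stub workers stand in for LLM-backed agents. It illustrates the idea only, not how CrewAI implements it internally.</p>

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SketchTask:
    """Hypothetical stand-in for a task: a description plus a worker callable."""
    description: str
    worker: Callable[[str], str]  # receives the full prompt, returns the output


def run_sequential(tasks: list[SketchTask]) -> list[str]:
    """Run tasks in order, injecting the previous output into the next prompt."""
    outputs: list[str] = []
    previous = ""
    for task in tasks:
        prompt = task.description
        if previous:
            # The implicit hand-off: the caller never wires this up by hand.
            prompt += "\n\nContext from previous task:\n" + previous
        previous = task.worker(prompt)
        outputs.append(previous)
    return outputs


# Stub workers stand in for LLM-backed agents (assumption for illustration).
research = SketchTask("Research the topic.", lambda prompt: "NOTES: fact A, fact B")
write = SketchTask(
    "Write the report.",
    lambda prompt: "REPORT based on: " + prompt.split("\n")[-1],
)

results = run_sequential([research, write])
print(results[1])
```

<p>The writer's stub only ever sees the research output because the runner put it in the prompt, which is the sense in which you never touch state directly.</p>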
<p>This produced the richest research notes of the three runs: 9,351 characters versus LangGraph's 5,573. The declarative <code>backstory</code> and <code>role</code> fields likely give the model stronger framing to work from, and CrewAI's verbose reasoning traces generate more intermediate content. The trade-off is that sequential execution is the easy path; anything more complex, such as conditional routing or cyclic reasoning, requires switching to <code>Process.hierarchical</code> and adding a manager agent.</p>
<p><strong>Agentic patterns this enables:</strong> Hierarchical agent (add a manager with <code>Process.hierarchical</code>), role-based pipelines, multi-agent delegation.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="autogen-conversation-based">AutoGen: conversation-based<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#autogen-conversation-based" class="hash-link" aria-label="Direct link to AutoGen: conversation-based" title="Direct link to AutoGen: conversation-based">​</a></h2>
<p>AutoGen treats agent coordination as a conversation. Each agent is a participant; they exchange messages, call tools, and signal completion via a termination string. There is no graph, no task object — just agents talking. <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-3">[3]</a><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-9">[9]</a></sup></p>
<p>The pipeline runs in two separate phases. Phase 1 is the research conversation:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">researcher </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> autogen</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">AssistantAgent</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    name</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"Researcher"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    system_message</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"You are a Senior Researcher. Use the web_search tool ONCE to gather facts. "</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"Compile structured research notes. 
When complete, end with TERMINATE."</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    llm_config</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">llm_config</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">user_proxy </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> autogen</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">UserProxyAgent</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    name</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"UserProxy"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    human_input_mode</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" 
style="color:rgb(195, 232, 141)">"NEVER"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    max_consecutive_auto_reply</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token number" style="color:rgb(247, 140, 108)">5</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    is_termination_msg</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token keyword" style="font-style:italic">lambda</span><span class="token plain"> x</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"TERMINATE"</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">in</span><span class="token plain"> x</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">get</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"content"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">""</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    code_execution_config</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token boolean" style="color:rgb(255, 88, 116)">False</span><span 
class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">register_function</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">web_search</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> caller</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">researcher</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> executor</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">user_proxy</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">user_proxy</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">initiate_chat</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">researcher</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> message</span><span class="token 
operator" style="color:rgb(137, 221, 255)">=</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"Research '</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">topic</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">'..."</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre></div></div>
<p>The <code>UserProxyAgent</code> is not a human — it is the tool executor. When the researcher emits a tool call, the proxy executes it and sends the result back as a message. The loop ends when the researcher appends <code>TERMINATE</code>.</p>
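<p>The proxy loop described above can be sketched without AutoGen at all. In this minimal illustration, <code>fake_researcher</code>, <code>run_chat</code>, and the message dictionaries with a <code>tool_call</code> key are all hypothetical; they mimic the shape of the exchange and are not AutoGen's API or wire format. The <code>max_turns</code> argument plays the role that <code>max_consecutive_auto_reply</code> plays in the real configuration.</p>

```python
from typing import Callable


def fake_researcher(history: list[dict]) -> dict:
    """Hypothetical stand-in for the LLM-backed researcher agent."""
    last = history[-1]
    if last["role"] == "tool":
        # A tool result arrived: summarise it and signal completion.
        return {"role": "assistant",
                "content": "Notes: " + last["content"] + "\nTERMINATE"}
    # First turn: request a tool call instead of answering directly.
    return {"role": "assistant", "content": None,
            "tool_call": {"name": "web_search", "args": {"query": last["content"]}}}


def web_search(query: str) -> str:
    """Stub tool; a real one would call a search API."""
    return "3 results for '" + query + "'"


def run_chat(agent: Callable[[list[dict]], dict], tools: dict, message: str,
             max_turns: int = 5) -> list[dict]:
    """Proxy loop: forward messages, execute tool calls, stop on TERMINATE."""
    history = [{"role": "user", "content": message}]
    for _ in range(max_turns):
        reply = agent(history)
        history.append(reply)
        # Tool-call turns may carry content=None, so normalise before checking.
        if "TERMINATE" in (reply.get("content") or ""):
            break
        call = reply.get("tool_call")
        if call:
            # The proxy, not the model, runs the tool and reports back.
            result = tools[call["name"]](**call["args"])
            history.append({"role": "tool", "content": result})
    return history


history = run_chat(fake_researcher, {"web_search": web_search}, "solid-state batteries")
for msg in history:
    print(msg["role"], "|", msg.get("content"))
```

<p>Note the <code>or ""</code> guard in the termination check: a turn that only requests a tool may have no text content, and a bare substring test against <code>None</code> would raise a <code>TypeError</code>.</p>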
<p>Phase 2 repeats the pattern with a Writer agent and a fresh proxy, passing the extracted research notes as the opening message:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">user_proxy2</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">initiate_chat</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    writer</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    message</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"Write a professional report on '</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">topic</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">' based on:\n\n</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">research_notes</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">"</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre></div></div>
<p>AutoGen's conversation-native model is best suited to topologies where agents negotiate, debate, or vote — patterns where the back-and-forth is the mechanism, not just the means to an end.</p>
<p><strong>Agentic patterns this enables:</strong> Peer-to-peer network, Consensus/Joint, Human-in-the-loop (swap <code>UserProxyAgent</code> for an actual human).</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="telemetry-what-the-runs-produced">Telemetry: what the runs produced<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#telemetry-what-the-runs-produced" class="hash-link" aria-label="Direct link to Telemetry: what the runs produced" title="Direct link to Telemetry: what the runs produced">​</a></h2>
<p>All three ran against the same topic on the same hardware with <code>claude-sonnet-4-6</code>. <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-12">[12]</a></sup></p>
<table><thead><tr><th style="text-align:left">Framework</th><th style="text-align:left">Status</th><th style="text-align:right">Time (s)</th><th style="text-align:right">Notes (chars)</th><th style="text-align:right">Report (chars)</th></tr></thead><tbody><tr><td style="text-align:left"><strong>LangGraph</strong></td><td style="text-align:left">Success</td><td style="text-align:right">129.64</td><td style="text-align:right">5,573</td><td style="text-align:right">18,178</td></tr><tr><td style="text-align:left"><strong>CrewAI</strong></td><td style="text-align:left">Success</td><td style="text-align:right">170.92</td><td style="text-align:right">9,351</td><td style="text-align:right">18,643</td></tr><tr><td style="text-align:left"><strong>AutoGen</strong></td><td style="text-align:left">Success</td><td style="text-align:right">76.19</td><td style="text-align:right">0</td><td style="text-align:right">322</td></tr></tbody></table>
<p>Three things stand out.</p>
<p><strong>LangGraph and CrewAI both produced full reports.</strong> The 41-second gap between them reflects CrewAI's more verbose internal reasoning — it generates more intermediate text per agent turn, which explains the longer notes. Both took the same 15-second rate-limit sleep between agent phases.</p>
<p><strong>AutoGen was fastest and produced almost nothing.</strong> It finished in 76 seconds with 0 characters of notes and a 322-character "report" that turned out to be the writer's input prompt echoed back. This is not a model failure: the model ran successfully. It is a message-extraction bug in the runner. AutoGen stores conversation history per agent pair (<code>user_proxy.chat_messages[researcher]</code>), and the runner stripped the termination message before extracting the notes, removing content that overlapped with them. The model did its job; the output pipeline silently dropped the result.</p>
<p>This is worth pausing on because it reveals a real architectural difference. LangGraph's typed state makes output extraction trivial — the final report is simply <code>state["final_report"]</code>. CrewAI exposes it via <code>task.output.raw</code>. With AutoGen, you parse message history, and subtle bugs in that parsing can silently produce nothing. The conversational flexibility that makes AutoGen powerful for multi-agent debate also makes structured output extraction more fragile.</p>
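<p>One way to make history parsing less fragile is to extract first, strip the termination marker second, and fail loudly when nothing usable is found. The sketch below is not the runner's actual code. It assumes each message is a dictionary with <code>role</code> and <code>content</code> keys, and <code>extract_notes</code> is a hypothetical helper. In a real AutoGen history the role labels depend on which agent's view of the conversation you read, so the role filter is an assumption to adjust.</p>

```python
def extract_notes(messages: list[dict], marker: str = "TERMINATE",
                  role: str = "assistant") -> str:
    """Return the last non-empty message from `role` with the marker removed."""
    for msg in reversed(messages):
        if msg.get("role") != role:
            continue
        # Tool-call turns can carry content=None, so normalise before use.
        text = (msg.get("content") or "").replace(marker, "").strip()
        if text:
            return text
    # Fail loudly: an empty result should never pass silently downstream.
    raise ValueError("no message with usable content found")


# Sample history (invented for illustration, not real telemetry).
sample = [
    {"role": "user", "content": "Research 'solid-state batteries'..."},
    {"role": "assistant", "content": None},
    {"role": "tool", "content": "search results"},
    {"role": "assistant", "content": "Key players: ...\nTERMINATE"},
    {"role": "assistant", "content": "TERMINATE"},
]
notes = extract_notes(sample)
print(notes)
```

<p>Raising on an empty result would turn a zero-character notes field into an immediate error in the runner instead of a near-empty report discovered in the telemetry.</p>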
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="a-note-on-n8n">A note on n8n<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#a-note-on-n8n" class="hash-link" aria-label="Direct link to A note on n8n" title="Direct link to A note on n8n">​</a></h2>
<p>LangGraph is sometimes compared to <a href="https://n8n.io/" target="_blank" rel="noopener noreferrer">n8n</a> — a visual, platform-managed workflow tool that also supports AI agents. Both can orchestrate multi-step agentic workflows, but they operate at different layers. <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-4">[4]</a><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-11">[11]</a></sup></p>
<table><thead><tr><th style="text-align:left"></th><th style="text-align:left">LangGraph</th><th style="text-align:left">n8n</th></tr></thead><tbody><tr><td style="text-align:left"><strong>Interface</strong></td><td style="text-align:left">Code-first (Python / TypeScript)</td><td style="text-align:left">Visual canvas (drag-and-drop nodes)</td></tr><tr><td style="text-align:left"><strong>State management</strong></td><td style="text-align:left">Typed central state with reducers</td><td style="text-align:left">Step-by-step JSON payload passing</td></tr><tr><td style="text-align:left"><strong>Cyclic loops</strong></td><td style="text-align:left">Native graph edges</td><td style="text-align:left">Implicit via agent thought loops</td></tr><tr><td style="text-align:left"><strong>Triggers</strong></td><td style="text-align:left">Manual / custom API server</td><td style="text-align:left">Native webhooks, crons, event listeners</td></tr><tr><td style="text-align:left"><strong>Human-in-the-loop</strong></td><td style="text-align:left"><code>interrupt()</code> + checkpointer</td><td style="text-align:left">Built-in Wait and Form nodes</td></tr><tr><td style="text-align:left"><strong>Integrations</strong></td><td style="text-align:left">Write your own tool wrappers</td><td style="text-align:left">400+ out of the box, including Slack, Jira, Notion, Postgres</td></tr><tr><td style="text-align:left"><strong>On-premises</strong></td><td style="text-align:left">✅ Any Python environment</td><td style="text-align:left">✅ Self-hosted Docker</td></tr></tbody></table>
<p>Use LangGraph when the logic is complex, cyclic, and needs to live inside your existing Python application. Use n8n when you need rapid API integration, non-developer maintainers, or built-in webhook triggers without writing ingestion routes.</p>
<p>For JigsawFlux use cases — clinics, crisis response, volunteer-run operations — either works on-premises. The deciding factor is usually who will maintain it: developers reach for LangGraph, operational staff reach for n8n.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="framework-decision-guide">Framework decision guide<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#framework-decision-guide" class="hash-link" aria-label="Direct link to Framework decision guide" title="Direct link to Framework decision guide">​</a></h2>
<table><thead><tr><th style="text-align:left">Framework</th><th style="text-align:left">Paradigm</th><th style="text-align:left">Best for</th></tr></thead><tbody><tr><td style="text-align:left"><strong>LangGraph</strong></td><td style="text-align:left">Stateful graph</td><td style="text-align:left">ReAct loops, DAG pipelines, human-in-the-loop, fine-grained state control</td></tr><tr><td style="text-align:left"><strong>CrewAI</strong></td><td style="text-align:left">Declarative roles</td><td style="text-align:left">Role-based teams, hierarchical topologies, rapid prototyping</td></tr><tr><td style="text-align:left"><strong>AutoGen</strong></td><td style="text-align:left">Conversation-based</td><td style="text-align:left">Peer-to-peer networks, consensus patterns, multi-agent debate</td></tr></tbody></table>
<p>All three are open source and free to use, and they run on hardware you already own. None requires a cloud subscription to develop or deploy.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="whats-next">What's next<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#whats-next" class="hash-link" aria-label="Direct link to What's next" title="Direct link to What's next">​</a></h2>
<p>This comparison is the foundation for Part 2, which implements and benchmarks <strong>agentic patterns</strong> directly — using whichever framework is the natural fit for each:</p>
<p><strong>Single-agent patterns</strong></p>
<ul>
<li><strong>ReAct</strong> <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-5">[5]</a></sup> — reason-act loops using LangGraph's cyclic edges</li>
<li><strong>Plan-and-Execute</strong> <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-6">[6]</a></sup> — separate planning phase from execution</li>
<li><strong>ReWOO</strong> <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-7">[7]</a></sup> — plan all tool calls upfront, then execute without intermediate observation</li>
<li><strong>Reflexion</strong> <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-8">[8]</a></sup> — self-critique and iterative self-improvement</li>
</ul>
<p><strong>Multi-agent topologies</strong></p>
<ul>
<li><strong>Hierarchical</strong> — orchestrator delegates to specialised sub-agents (CrewAI)</li>
<li><strong>DAG</strong> — directed pipeline with no feedback loops (LangGraph)</li>
<li><strong>Peer-to-peer network</strong> — lateral agent communication without a central manager (AutoGen)</li>
<li><strong>Consensus/Joint</strong> — multiple agents debate and converge on a shared answer (AutoGen)</li>
</ul>
<p>The project source is on GitHub: <a href="https://github.com/JigsawFlux/comparing-agent-frameworks" target="_blank" rel="noopener noreferrer">github.com/JigsawFlux/comparing-agent-frameworks</a>. <sup><a href="https://jigsawflux.org/blog/comparing-agent-frameworks#ref-10">[10]</a></sup></p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="references">References<a href="https://jigsawflux.org/blog/comparing-agent-frameworks#references" class="hash-link" aria-label="Direct link to References" title="Direct link to References">​</a></h2>
<p><strong>Frameworks</strong></p>
<p><span id="ref-1">[1]</span> LangGraph — <a href="https://github.com/langchain-ai/langgraph" target="_blank" rel="noopener noreferrer">langchain-ai/langgraph</a>, GitHub.</p>
<p><span id="ref-2">[2]</span> CrewAI — <a href="https://github.com/crewAIInc/crewAI" target="_blank" rel="noopener noreferrer">crewAIInc/crewAI</a>, GitHub.</p>
<p><span id="ref-3">[3]</span> AutoGen — <a href="https://github.com/microsoft/autogen" target="_blank" rel="noopener noreferrer">microsoft/autogen</a>, GitHub.</p>
<p><span id="ref-4">[4]</span> n8n — <a href="https://n8n.io/" target="_blank" rel="noopener noreferrer">n8n.io</a>, workflow automation platform.</p>
<p><strong>Research papers</strong></p>
<p><span id="ref-5">[5]</span> Yao, S. et al. (2022). <em>ReAct: Synergizing Reasoning and Acting in Language Models</em>. <a href="https://arxiv.org/abs/2210.03629" target="_blank" rel="noopener noreferrer">arXiv:2210.03629</a>.</p>
<p><span id="ref-6">[6]</span> Wang, L. et al. (2023). <em>Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models</em>. <a href="https://arxiv.org/abs/2305.04091" target="_blank" rel="noopener noreferrer">arXiv:2305.04091</a>.</p>
<p><span id="ref-7">[7]</span> Xu, B. et al. (2023). <em>ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models</em>. <a href="https://arxiv.org/abs/2305.18323" target="_blank" rel="noopener noreferrer">arXiv:2305.18323</a>.</p>
<p><span id="ref-8">[8]</span> Shinn, N. et al. (2023). <em>Reflexion: Language Agents with Verbal Reinforcement Learning</em>. <a href="https://arxiv.org/abs/2303.11366" target="_blank" rel="noopener noreferrer">arXiv:2303.11366</a>.</p>
<p><span id="ref-9">[9]</span> Wu, Q. et al. (2023). <em>AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework</em>. <a href="https://arxiv.org/abs/2308.08155" target="_blank" rel="noopener noreferrer">arXiv:2308.08155</a>.</p>
<p><strong>Project sources</strong></p>
<p><span id="ref-10">[10]</span> JigsawFlux. <em>comparing-agent-frameworks</em> — <a href="https://github.com/JigsawFlux/comparing-agent-frameworks/blob/main/README.md" target="_blank" rel="noopener noreferrer">README.md</a>, framework comparison matrix and architecture overview.</p>
<p><span id="ref-11">[11]</span> JigsawFlux. <em>comparing-agent-frameworks</em> — <a href="https://github.com/JigsawFlux/comparing-agent-frameworks/blob/main/langgraph_vs_n8n.md" target="_blank" rel="noopener noreferrer">langgraph_vs_n8n.md</a>, architectural comparison of LangGraph and n8n.</p>
<p><span id="ref-12">[12]</span> JigsawFlux. <em>comparing-agent-frameworks</em> — telemetry from <code>python run.py --framework all --topic "solid-state batteries"</code> using <code>claude-sonnet-4-6</code> (2026-06-28). <a href="https://github.com/JigsawFlux/comparing-agent-frameworks" target="_blank" rel="noopener noreferrer">github.com/JigsawFlux/comparing-agent-frameworks</a>.</p>
<hr>
<p>This is a JigsawFlux project. JigsawFlux builds open-source tools for health tech, humanitarian response, and crisis management — tools designed to work on constrained budgets, unreliable infrastructure, and donated hardware. If you are working on something in this space, or want to contribute, the <a href="https://github.com/JigsawFlux" target="_blank" rel="noopener noreferrer">JigsawFlux GitHub organisation</a> is where the work happens.</p>]]></content>
        <author>
            <name>Suresh Thomas</name>
            <uri>https://github.com/st185229</uri>
        </author>
        <category label="langgraph" term="langgraph"/>
        <category label="crewai" term="crewai"/>
        <category label="autogen" term="autogen"/>
        <category label="agents" term="agents"/>
        <category label="claude" term="claude"/>
        <category label="multi-agent" term="multi-agent"/>
        <category label="open-source" term="open-source"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Agent Clinic: Human-in-the-Loop Medical Consultations with LangGraph and AWS Bedrock]]></title>
        <id>https://jigsawflux.org/blog/agent-clinic</id>
        <link href="https://jigsawflux.org/blog/agent-clinic"/>
        <updated>2026-06-21T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[A walkthrough of a proof-of-concept medical consultation system where AI agents handle patient intake and prescriptions while a doctor stays in the loop — built with LangGraph, AWS Bedrock Claude Haiku 4.5, and Streamlit for under $0.01 per consultation.]]></summary>
        <content type="html"><![CDATA[<p>The name cuts two ways. It's a clinic — for patients. And the clinic runs on agents.</p>
<p>The problem this POC targets is specific: small and charity hospitals where doctor time is genuinely scarce and IT budgets are measured in hundreds of dollars, not thousands. A consultation isn't just a diagnosis — it's intake, medical history retrieval, triage sorting, prescription recording, pharmacy stock checking. The typical workflow hands all of that to a doctor anyway, because there's no other option. The result: a clinician spending 40% of their time on work that doesn't require clinical judgment.</p>
<p>The premise here is simple. AI handles everything that doesn't require a clinician. The doctor steps in exactly once — to read the AI-produced intake summary and give a diagnosis. That's it. The prescription agent takes over from there.</p>
<p>This is a <a href="https://github.com/JigsawFlux" target="_blank" rel="noopener noreferrer">JigsawFlux</a> project. JigsawFlux builds open-source tools for health tech and other things that matter: tools that have to work in the real world, not the well-funded one. That means two hard constraints shaped every architecture decision here: <strong>cost</strong> and <strong>deployability</strong>. Viable on a shoestring budget. Runnable in places where "cloud-native" isn't an option, such as a clinic with a single server, unreliable internet, and an IT team of one.</p>
<p>Built on <strong>AWS Bedrock</strong> (Claude Haiku 4.5), <strong>LangGraph</strong> for orchestration, <strong>LangChain</strong> <code>@tool</code> wrappers for data access, and <strong>Streamlit</strong> for the UI. Total cost: <strong>&lt; $0.01 per consultation</strong>. Deployable on a £25/month VPS or a clinic's own hardware, with the option to go fully on-premises as models improve.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="why-not-lambda--bedrock">Why Not Lambda + Bedrock?<a href="https://jigsawflux.org/blog/agent-clinic#why-not-lambda--bedrock" class="hash-link" aria-label="Direct link to Why Not Lambda + Bedrock?" title="Direct link to Why Not Lambda + Bedrock?">​</a></h2>
<p>The obvious AWS pattern for an LLM-powered app is Lambda + Bedrock: a Lambda function receives a request, calls the model, returns a response. It works well for single-turn, stateless interactions.</p>
<p>A medical consultation is not a single-turn interaction. It looks like this:</p>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">Patient submits symptoms        [seconds]</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">↓ intake agent runs             [~5 seconds]</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">↓ graph pauses — doctor queued  [minutes to hours]</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">↓ doctor submits diagnosis</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">↓ prescription agent runs       [~3 seconds]</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">↓ patient receives prescription</span><br></span></code></pre></div></div>
<p>That middle gap — between intake completing and the doctor responding — can be minutes or hours. Lambda invocations are stateless. Every invocation discards all context. Modelling a pause-and-resume workflow with Lambda means rebuilding state management, custom polling, webhook handlers, and retry logic from scratch. The honest Lambda-native path for a stateful workflow like this is Lambda + Step Functions + API Gateway + DynamoDB — a real AWS stack with real AWS complexity and real AWS consulting costs.</p>
<p>But there's a second problem with Lambda for this use case: <strong>it is irrevocably cloud-only</strong>. A charity clinic in a low-connectivity region cannot run Lambda on their own hardware. If the internet goes down, consultations stop. Patient data has to leave the building to be processed. There is no on-premises option, no hybrid option, no path to data sovereignty.</p>
<p>LangGraph running on a single Python process is none of that. It runs on a clinic's existing server, a cheap VPS, a donated laptop. The only network call during a consultation is the Bedrock API invocation — a small JSON payload, only made when the model is actually thinking. Patient records stay local. Connectivity interruptions only affect the model call, not the workflow state. And in a future phase, Bedrock can be swapped for a locally-hosted model (Ollama running <code>llama3.2</code> or similar), taking the cloud dependency to zero.</p>
<table><thead><tr><th>Concern</th><th>Lambda + Bedrock</th><th>LangGraph + Bedrock</th></tr></thead><tbody><tr><td>State between steps</td><td>Application must manage</td><td>Built-in typed state (<code>TypedDict</code>)</td></tr><tr><td>Cyclic workflows</td><td>Manual loop logic</td><td>First-class graph edges</td></tr><tr><td>Human handoff</td><td>Custom webhook + polling</td><td><code>interrupt()</code> primitive</td></tr><tr><td>Pause &amp; resume</td><td>Rebuild from scratch</td><td>Checkpoint + resume built-in</td></tr><tr><td>Multi-agent routing</td><td>Custom routing code</td><td>Conditional edges</td></tr><tr><td>Deployment options</td><td>AWS cloud only</td><td>Any Python environment</td></tr><tr><td>On-premises / hybrid</td><td>❌ Not possible</td><td>✅ Native — runs on local hardware</td></tr><tr><td>Data leaves the building</td><td>Always (Lambda processes it)</td><td>Only the model invocation payload</td></tr><tr><td>Infrastructure overhead</td><td>Lambda + Step Functions + API Gateway + DynamoDB</td><td>Single Python process</td></tr></tbody></table>
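<p>To make the pause-and-resume row concrete, here is a minimal standard-library sketch of the underlying idea: save the workflow state when the doctor is queued, then reload it when the diagnosis arrives. This is an illustration only, not how LangGraph implements checkpointing. The helper names (<code>save_checkpoint</code>, <code>load_checkpoint</code>, <code>run_intake</code>, <code>resume_with_diagnosis</code>) and the JSON-file storage are assumptions made for the example, and the intake summary is a stand-in for the model call.</p>

```python
import json
import tempfile
from pathlib import Path

# Hypothetical helpers: a stdlib-only illustration of checkpoint-and-resume,
# not LangGraph's implementation.


def save_checkpoint(directory: Path, session_id: str, state: dict) -> None:
    """Persist the workflow state so nothing is lost while the doctor is queued."""
    (directory / f"{session_id}.json").write_text(json.dumps(state))


def load_checkpoint(directory: Path, session_id: str) -> dict:
    """Reload the state that was saved at the pause point."""
    return json.loads((directory / f"{session_id}.json").read_text())


def run_intake(directory: Path, session_id: str, symptoms: str) -> dict:
    """First half of the workflow: summarise the intake, then pause."""
    state = {
        "session_id": session_id,
        "symptoms": symptoms,
        # Stand-in for the model call that writes the intake summary.
        "intake_summary": f"Patient reports: {symptoms}",
        "status": "awaiting_doctor",
    }
    save_checkpoint(directory, session_id, state)
    return state


def resume_with_diagnosis(directory: Path, session_id: str, doctor_notes: str) -> dict:
    """Second half: pick the saved state back up minutes or hours later."""
    state = load_checkpoint(directory, session_id)
    state["doctor_notes"] = doctor_notes
    state["status"] = "prescribing"
    save_checkpoint(directory, session_id, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint_dir = Path(tmp)
        run_intake(checkpoint_dir, "demo-1", "persistent cough")
        resumed = resume_with_diagnosis(checkpoint_dir, "demo-1", "likely bronchitis")
        print(resumed["status"])
```

<p>Everything the second half needs comes from the saved state, so the two halves can run in separate requests or even separate processes. That is the property the Lambda path has to rebuild by hand.</p>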
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="the-architecture">The Architecture<a href="https://jigsawflux.org/blog/agent-clinic#the-architecture" class="hash-link" aria-label="Direct link to The Architecture" title="Direct link to The Architecture">​</a></h2>
<p>Three layers, each with a clear responsibility:</p>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">┌─────────────────────────────────────────────────┐</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  LangGraph                                      │  Orchestration</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  StateGraph · nodes · edges · interrupt()       │  (workflow logic, state, routing)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">├─────────────────────────────────────────────────┤</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  LangChain AWS  (langchain-aws)                 │  Integration</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  ChatBedrock · @tool wrappers                   │  (model wrappers, tool schemas)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">├─────────────────────────────────────────────────┤</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  AWS Bedrock  (Claude Haiku 4.5)                │  Intelligence</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  Foundation model · pay-per-token               │  (reasoning, generation)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">└─────────────────────────────────────────────────┘</span><br></span></code></pre></div></div>
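<p>The LangGraph <code>StateGraph</code> encodes the consultation as nodes (agents) and edges (routing logic). Intake routes either to the emergency protocol, which ends the run, or to doctor review. Doctor review routes either to the clarification agent, which loops back to doctor review, or to the prescription agent, which ends the consultation.</p>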
<p>Every node reads from and writes to a single typed state object that flows through the entire graph:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># graph.py</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">class</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(255, 203, 107)">ConsultationState</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">TypedDict</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    session_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    patient_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    symptoms</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    patient_history</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">dict</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    intake_summary</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    is_emergency</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">bool</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    triage_score</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">int</span><span class="token plain">    </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># 1=Routine, 2=Minor, 3=Moderate, 4=Severe, 5=Urgent non-emergency</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    triage_reason</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain">   </span><span class="token comment" style="color:rgb(105, 112, 
152);font-style:italic"># one-sentence AI justification for the score</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    doctor_clarification_req</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    patient_clarification_ans</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    doctor_notes</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    prescription</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    status</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token 
builtin" style="color:rgb(130, 170, 255)">str</span><span class="token plain">          </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># intake | emergency | awaiting_doctor | clarifying | prescribing | complete</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    messages</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">list</span><span class="token plain">       </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># rendered in patient chat UI</span><br></span></code></pre></div></div>
<p>This typed contract is what makes the graph inspectable and debuggable. At any pause point you can call <code>graph.get_state(config)</code> and see exactly what every field holds.</p>
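<p>As a rough illustration of what a typed contract buys you, the sketch below builds a starting state and checks a state dictionary against the declared field types using only the standard library. The class repeats the fields shown above. <code>initial_state</code> and <code>type_mismatches</code> are hypothetical helpers written for this example and are not part of LangGraph or the project's <code>graph.py</code>. The default values are assumptions too.</p>

```python
from typing import TypedDict, get_type_hints


class ConsultationState(TypedDict):
    # Same fields as the state class shown in the article.
    session_id: str
    patient_id: str
    symptoms: str
    patient_history: dict
    intake_summary: str
    is_emergency: bool
    triage_score: int
    triage_reason: str
    doctor_clarification_req: str
    patient_clarification_ans: str
    doctor_notes: str
    prescription: str
    status: str
    messages: list


def initial_state(session_id: str, patient_id: str, symptoms: str) -> ConsultationState:
    """Hypothetical helper: a fully populated starting state with assumed defaults."""
    return ConsultationState(
        session_id=session_id,
        patient_id=patient_id,
        symptoms=symptoms,
        patient_history={},
        intake_summary="",
        is_emergency=False,
        triage_score=1,
        triage_reason="",
        doctor_clarification_req="",
        patient_clarification_ans="",
        doctor_notes="",
        prescription="",
        status="intake",
        messages=[],
    )


def type_mismatches(state: dict) -> list:
    """Hypothetical helper: list fields that are missing or hold the wrong type."""
    problems = []
    for field, expected in get_type_hints(ConsultationState).items():
        if field not in state:
            problems.append(f"{field}: missing")
        elif not isinstance(state[field], expected):
            actual = type(state[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


if __name__ == "__main__":
    state = initial_state("demo-1", "patient-42", "persistent cough")
    print(type_mismatches(state))
```

<p>A check like this could run on whatever state a paused graph reports, which is one way to catch a node that wrote the wrong shape of data before the doctor ever sees it.</p>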
<p>Wiring the graph up is straightforward — <code>build_graph()</code> is the only assembly point:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># graph.py</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">MODEL_ID </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> os</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">getenv</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"BEDROCK_MODEL_ID"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"us.anthropic.claude-haiku-4-5-20251001-v1:0"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">_get_model</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> ChatBedrock</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        model_id</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">MODEL_ID</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        model_kwargs</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"temperature"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token number" style="color:rgb(247, 140, 108)">0.3</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        region_name</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">os</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">getenv</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"AWS_DEFAULT_REGION"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"us-east-1"</span><span class="token punctuation" 
style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">build_graph</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> StateGraph</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">ConsultationState</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"intake_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">,</span><span class="token plain">        intake_agent_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"emergency_protocol"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">  emergency_protocol_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_review"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">       doctor_review_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"clarification_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> clarification_agent_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"prescription_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">  prescription_agent_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">set_entry_point</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"intake_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_conditional_edges</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"intake_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        should_escalate</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"emergency_protocol"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"emergency_protocol"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_review"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_review"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_edge</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"emergency_protocol"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> END</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_conditional_edges</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_review"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        should_clarify</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"clarification_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"clarification_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"prescription_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"prescription_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_edge</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"clarification_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_review"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">add_edge</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"prescription_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> END</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    checkpointer </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> MemorySaver</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> workflow</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token builtin" style="color:rgb(130, 170, 255)">compile</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">checkpointer</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">checkpointer</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre></div></div>
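<p>The routing that <code>build_graph()</code> wires up can also be read as a small lookup table. The sketch below restates those edges as a plain dictionary and walks them with a list of branch decisions, which is a quick way to sanity-check the topology without any model calls. It does not use LangGraph. <code>EDGES</code>, <code>walk</code>, and the local <code>END</code> string are names assumed for this example.</p>

```python
END = "END"  # local sentinel for this sketch, not LangGraph's END constant

# The same edges that build_graph() declares, as node -> possible next nodes.
EDGES = {
    "intake_agent": ["emergency_protocol", "doctor_review"],
    "emergency_protocol": [END],
    "doctor_review": ["clarification_agent", "prescription_agent"],
    "clarification_agent": ["doctor_review"],
    "prescription_agent": [END],
}


def walk(decisions):
    """Follow the edges from intake, consuming one decision at each branch point."""
    pending = list(decisions)
    node = "intake_agent"
    path = [node]
    while node != END:
        options = EDGES[node]
        if len(options) == 1:
            # Unconditional edge: only one place to go.
            node = options[0]
        else:
            if not pending:
                raise ValueError(f"no decision supplied at {node}")
            choice = pending.pop(0)
            if choice not in options:
                raise ValueError(f"{choice} is not reachable from {node}")
            node = choice
        path.append(node)
    return path


if __name__ == "__main__":
    # One clarification round, then a prescription.
    print(walk(["doctor_review", "clarification_agent", "prescription_agent"]))
```

<p>Because the clarification agent always returns to doctor review, the loop can repeat as many times as the decisions list asks for before the prescription agent ends the run.</p>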
<p>The conditional edge functions are one-liners that read directly from state:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">should_escalate</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> Literal</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"emergency_protocol"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_review"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"emergency_protocol"</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span 
class="token plain">get</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"is_emergency"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">else</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_review"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">should_clarify</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> Literal</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"clarification_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"prescription_agent"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" 
style="font-style:italic">return</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"clarification_agent"</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">get</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_clarification_req"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">else</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"prescription_agent"</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
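<p>Because these routers are pure functions of state, they can be checked without building the graph or calling a model. A minimal sketch, with plain dicts standing in for <code>ConsultationState</code> (the sample question text is made up for illustration):</p>

```python
from typing import Literal


def should_escalate(state: dict) -> Literal["emergency_protocol", "doctor_review"]:
    # Same logic as the edge function above: a truthy flag escalates.
    return "emergency_protocol" if state.get("is_emergency") else "doctor_review"


def should_clarify(state: dict) -> Literal["clarification_agent", "prescription_agent"]:
    # A non-empty clarification request sends the flow back to the patient.
    return "clarification_agent" if state.get("doctor_clarification_req") else "prescription_agent"


# Plain dicts stand in for ConsultationState here.
assert should_escalate({"is_emergency": True}) == "emergency_protocol"
assert should_escalate({}) == "doctor_review"  # a missing key falls through
assert should_clarify({"doctor_clarification_req": "Any rash?"}) == "clarification_agent"
assert should_clarify({"doctor_clarification_req": ""}) == "prescription_agent"  # "" is falsy
```

<p>Note that <code>state.get(...)</code> treats a missing key, <code>None</code>, and an empty string the same way, so an empty <code>doctor_clarification_req</code> routes to the prescription agent.</p>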
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="human-in-the-loop-how-interrupt-actually-works">Human-in-the-Loop: How <code>interrupt()</code> Actually Works<a href="https://jigsawflux.org/blog/agent-clinic#human-in-the-loop-how-interrupt-actually-works" class="hash-link" aria-label="Direct link to human-in-the-loop-how-interrupt-actually-works" title="Direct link to human-in-the-loop-how-interrupt-actually-works">​</a></h2>
<p>The <code>doctor_review</code> node is not a model call. It's a pause point — here's the real code:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># graph.py</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">doctor_review_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ConsultationState</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">dict</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    context </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"session_id"</span><span class="token 
punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"session_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"patient_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"patient_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"intake_summary"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"intake_summary"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"patient_history"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> state</span><span class="token 
punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"patient_history"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">get</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"patient_clarification_ans"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        context</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"clarification"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token string" style="color:rgb(195, 232, 141)">"question"</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">:</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_clarification_req"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token string" style="color:rgb(195, 232, 141)">"answer"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"patient_clarification_ans"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># Graph pauses here until Streamlit calls graph.invoke(Command(resume=doctor_input), config)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    doctor_input </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> interrupt</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token 
plain">context</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> doctor_input</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">get</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"action"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">==</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"clarify"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        clarification_req </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> doctor_input</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"question"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        _update_consultation</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            state</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"session_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            doctor_clarification_req</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">clarification_req</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            status</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"clarifying"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_clarification_req"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> clarification_req</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span 
class="token plain">            </span><span class="token string" style="color:rgb(195, 232, 141)">"patient_clarification_ans"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">""</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token string" style="color:rgb(195, 232, 141)">"status"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"clarifying"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    doctor_notes </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> doctor_input</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">get</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"notes"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">""</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#bfc7d5"><span class="token plain">    _update_consultation</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"session_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        doctor_notes</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">doctor_notes</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        status</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">"prescribing"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"doctor_notes"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> 
doctor_notes</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"status"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"prescribing"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>When <code>interrupt(context)</code> fires, LangGraph serialises the entire <code>ConsultationState</code> to the checkpoint store (<code>MemorySaver</code> in this POC; DynamoDB in production) and stops. The doctor's Streamlit tab polls the database for pending consultations and shows the intake summary.</p>
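<p>The database side of that handoff is not shown in this post, so the following is only a sketch of what it could look like. It assumes SQLite, a hypothetical <code>consultations</code> table, and a hypothetical <code>awaiting_doctor</code> status; <code>update_consultation</code> mirrors the call shape of <code>_update_consultation</code> above but is not the project's implementation:</p>

```python
import sqlite3

# Hypothetical schema: only the columns this sketch needs.
ALLOWED_COLUMNS = {"status", "doctor_notes", "doctor_clarification_req"}


def update_consultation(conn: sqlite3.Connection, session_id: str, **fields: str) -> None:
    """Update whitelisted columns on one consultation row."""
    unknown = set(fields) - ALLOWED_COLUMNS
    if unknown:
        # Column names are interpolated into SQL, so reject anything unexpected.
        raise ValueError(f"unexpected columns: {sorted(unknown)}")
    if not fields:
        return
    assignments = ", ".join(f"{name} = ?" for name in fields)
    conn.execute(
        f"UPDATE consultations SET {assignments} WHERE session_id = ?",
        (*fields.values(), session_id),
    )
    conn.commit()


def pending_for_doctor(conn: sqlite3.Connection) -> list:
    """Return session ids waiting on the doctor (the status name is an assumption)."""
    rows = conn.execute(
        "SELECT session_id FROM consultations WHERE status = ? ORDER BY session_id",
        ("awaiting_doctor",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consultations ("
    "session_id TEXT PRIMARY KEY, status TEXT, "
    "doctor_notes TEXT, doctor_clarification_req TEXT)"
)
conn.execute("INSERT INTO consultations (session_id, status) VALUES ('s1', 'awaiting_doctor')")
conn.execute("INSERT INTO consultations (session_id, status) VALUES ('s2', 'awaiting_doctor')")

# The doctor finishes s1, so only s2 is still pending.
update_consultation(conn, "s1", doctor_notes="Rest and fluids", status="prescribing")
print(pending_for_doctor(conn))  # ['s2']
```

<p>The doctor's tab would call something like <code>pending_for_doctor</code> on each rerun to decide which consultations to list.</p>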
<p>When the doctor acts, Streamlit resumes the graph with a single call:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># app.py — doctor submits diagnosis</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">graph</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">invoke</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    Command</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">resume</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"action"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"diagnose"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"notes"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> final_notes</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    config</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># app.py — doctor asks patient a clarifying question</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">graph</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">invoke</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    Command</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">resume</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"action"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"clarify"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"question"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span 
class="token plain"> question</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">strip</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    config</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># app.py — patient answers the doctor's question</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">graph</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">invoke</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">Command</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">resume</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">answer</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">strip</span><span class="token 
punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> config</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
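<p>Each <code>Command(resume=...)</code> call resumes the paused node from the saved checkpoint: LangGraph re-enters the node function, and <code>interrupt()</code> returns the resume value instead of pausing again. Nothing upstream is repeated: no rebuilding state, no re-running the intake agent, no webhooks.</p>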
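<p>One practical detail: on resume, LangGraph runs the interrupted node again from its first line, so any code placed before <code>interrupt()</code> executes on every entry. The toy model below imitates that behaviour in plain Python. It is a simplified illustration, not LangGraph's implementation, and <code>ToyRunner</code> and <code>Paused</code> are names invented for this sketch:</p>

```python
class Paused(Exception):
    """Raised by the toy interrupt() to stop the run and surface a payload."""

    def __init__(self, payload):
        super().__init__("paused")
        self.payload = payload


_NO_VALUE = object()  # sentinel: no resume value has been supplied yet


class ToyRunner:
    """Runs a single node function with pause/resume semantics."""

    def __init__(self, node):
        self.node = node
        self.saved_state = None   # stands in for the checkpoint store
        self._resume = _NO_VALUE
        self.node_entries = 0     # how many times the node body started

    def interrupt(self, payload):
        if self._resume is _NO_VALUE:
            raise Paused(payload)                      # first pass: stop here
        value, self._resume = self._resume, _NO_VALUE
        return value                                   # second pass: hand back the input

    def start(self, state):
        self.saved_state = dict(state)
        return self._run()

    def resume(self, value):
        self._resume = value
        return self._run()

    def _run(self):
        self.node_entries += 1
        try:
            update = self.node(dict(self.saved_state), self.interrupt)
        except Paused as pause:
            return {"paused_with": pause.payload}
        self.saved_state.update(update)
        return self.saved_state


def doctor_review(state, interrupt):
    context = {"intake_summary": state["intake_summary"]}  # runs on every entry
    doctor_input = interrupt(context)
    return {"doctor_notes": doctor_input["notes"]}


runner = ToyRunner(doctor_review)
first = runner.start({"intake_summary": "headache and fever"})
final = runner.resume({"action": "diagnose", "notes": "viral, rest"})
print(first)                # {'paused_with': {'intake_summary': 'headache and fever'}}
print(runner.node_entries)  # 2
```

<p>This is why <code>doctor_review_node</code> only builds a context dict before <code>interrupt()</code> and keeps its database writes after it: work placed before the pause point should be cheap and safe to repeat.</p>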
<p>The Streamlit UI shares the in-process graph across all browser tabs via <code>@st.cache_resource</code>. This is what makes the patient → doctor state handoff work without any network calls:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># app.py</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">@st</span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">cache_resource</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">get_graph</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">return</span><span class="token plain"> build_graph</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain">  </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># single graph instance, shared across all sessions</span><br></span></code></pre></div></div>
<p><code>MemorySaver</code> lives inside that cached object. Patient tab writes state; doctor tab reads it — same process, same memory, zero coordination overhead.</p>
<p>The clarification loop uses the same <code>interrupt()</code> mechanism a second time. <code>clarification_agent_node</code> pauses the graph waiting for the patient's answer; when they reply, the graph resumes and routes back to <code>doctor_review</code> via a fixed edge. A cyclic human workflow that would need custom state machines, polling infrastructure, and webhook handlers in a Lambda-based system becomes three graph edges and two <code>interrupt()</code> calls.</p>
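<p>The pause-and-resume cycle can be pictured with a plain Python generator. This is only a stdlib analogy, not LangGraph code: <code>consultation_flow</code> and the message strings are hypothetical, and each <code>yield</code> stands in for an <code>interrupt()</code> call that hands control to a human.</p>

```python
from typing import Generator


def consultation_flow(intake_summary: str) -> Generator[str, str, str]:
    """Hypothetical stand-in for the graph: each yield plays the role of interrupt()."""
    context = intake_summary
    while True:
        # Pause 1: wait for the doctor (doctor_review in the article's graph).
        doctor_action = yield f"DOCTOR_REVIEW: {context}"
        if not doctor_action.startswith("ask:"):
            return f"DIAGNOSIS: {doctor_action}"
        question = doctor_action[len("ask:"):].strip()
        # Pause 2: wait for the patient's answer (clarification_agent_node).
        answer = yield f"CLARIFY: {question}"
        # Fixed edge back to doctor review, now carrying the extra context.
        context = f"{context} | Q: {question} A: {answer}"


def run_demo() -> list[str]:
    flow = consultation_flow("headache and fever")
    log = [next(flow)]                        # runs until the first pause
    log.append(flow.send("ask: How long has the fever lasted?"))
    log.append(flow.send("Three days"))       # resumes and loops back to review
    try:
        flow.send("Likely viral infection")
    except StopIteration as done:             # the generator returned its result
        log.append(done.value)
    return log


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

<p>The loop back to the doctor is just the <code>while True</code>, which mirrors the fixed edge from clarification to <code>doctor_review</code>. In the real graph, the checkpointer keeps the paused state between the two browser tabs.</p>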
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="demo-walkthrough">Demo Walkthrough<a href="https://jigsawflux.org/blog/agent-clinic#demo-walkthrough" class="hash-link" aria-label="Direct link to Demo Walkthrough" title="Direct link to Demo Walkthrough">​</a></h2>
<p>Here's a complete consultation using Patient P001 (Alice Thompson, headache and fever).</p>
<p><strong>1. Login — role and PIN selection</strong></p>
<p><img decoding="async" loading="lazy" alt="Login screen" src="https://jigsawflux.org/blog/assets/images/login-9330e527640c39be6d15557f48759e9b.jpg" width="676" height="877" class="img_ev3q"></p>
<p>The login screen is a placeholder for Amazon Cognito. The hardcoded PINs (<code>1234</code> for patients, <code>doctor</code> for doctors) mark exactly where real authentication would plug in; the rest of the graph doesn't change.</p>
<p><strong>2. Patient submits symptoms — AI intake runs</strong></p>
<p>The patient types their symptoms and clicks <strong>Start Consultation</strong>. The intake agent fetches the patient's record and medical history from SQLite, searches the knowledge base, assigns a triage score (1–5), and hands off to the doctor queue. The entire intake takes a few seconds.</p>
<p><img decoding="async" loading="lazy" alt="AI intake complete — case passed to doctor" src="https://jigsawflux.org/blog/assets/images/patient_handover-672ceb37b9f2a1c07bd898bdb25169bf.jpg" width="1197" height="532" class="img_ev3q"></p>
<p>The graph is now paused at <code>doctor_review</code>. Nothing in the system runs until the doctor acts.</p>
<p><strong>3. Doctor desktop — triage queue and response panel</strong></p>
<p>In a separate browser tab, the doctor logs in. The queue shows all pending consultations sorted by triage severity. For Alice's case:</p>
<p><img decoding="async" loading="lazy" alt="Doctor desktop — intake summary and response panel" src="https://jigsawflux.org/blog/assets/images/doc-response-5441d93ed356ea8ec2f98bbb8e4d339a.jpg" width="1708" height="1257" class="img_ev3q"></p>
<p>The doctor sees the triage badge (🟡 Level 3 — Moderate), the AI-generated intake summary with clinical context drawn from Alice's history and allergies, and a response panel with two actions: <strong>Submit Diagnosis</strong> or <strong>Ask Patient</strong>.</p>
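<p>Sorting the queue by severity could look like the sketch below. The <code>PendingConsult</code> record and its field names are hypothetical, and the sketch assumes a higher triage score means a more urgent case, which the article does not state; flip the sort key if the scale runs the other way.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingConsult:
    # Hypothetical record; the real queue entries may carry different fields.
    session_id: str
    patient_name: str
    triage_score: int    # 1-5, assumed here: higher = more urgent
    submitted_at: float  # seconds since epoch


def sort_queue(pending: list[PendingConsult]) -> list[PendingConsult]:
    """Most urgent first; ties go to whoever has waited longest."""
    return sorted(pending, key=lambda c: (-c.triage_score, c.submitted_at))


if __name__ == "__main__":
    queue = [
        PendingConsult("s1", "Alice Thompson", 3, 1000.0),
        PendingConsult("s2", "Patient B", 5, 1200.0),
        PendingConsult("s3", "Patient C", 3, 900.0),
    ]
    for consult in sort_queue(queue):
        print(consult.triage_score, consult.session_id)
```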
<p><strong>4. Patient receives the prescription</strong></p>
<p>After the doctor submits, the prescription agent checks pharmacy inventory and then records the prescription. Amoxicillin is seeded as out of stock, so the agent flags this and suggests an alternative. The patient's view updates:</p>
<p><img decoding="async" loading="lazy" alt="Patient consultation complete with prescription" src="https://jigsawflux.org/blog/assets/images/patient_complete_consultation-6264c0e9d043f2b14d0a7f1122359bbf.jpg" width="1708" height="737" class="img_ev3q"></p>
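<p>The out-of-stock check might be sketched as follows. The table layout, the <code>alternative</code> column, and the seeded rows are assumptions made for illustration; only the tool name <code>check_pharmacy_inventory</code> and the out-of-stock Amoxicillin come from the demo. The substitute name is a placeholder, not clinical guidance.</p>

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """In-memory inventory with an assumed schema and placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (drug TEXT PRIMARY KEY, stock INTEGER, alternative TEXT)"
    )
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        [
            ("Amoxicillin", 0, "Substitute-A"),  # seeded as out of stock
            ("Substitute-A", 40, None),
            ("Paracetamol", 120, None),
        ],
    )
    return conn


def check_pharmacy_inventory(conn: sqlite3.Connection, drug: str) -> dict:
    """Report availability and, when nothing is in stock, the configured alternative."""
    row = conn.execute(
        "SELECT stock, alternative FROM inventory WHERE drug = ?", (drug,)
    ).fetchone()
    if row is None:
        return {"drug": drug, "known": False, "in_stock": False, "alternative": None}
    stock, alternative = row
    in_stock = stock > 0
    return {
        "drug": drug,
        "known": True,
        "in_stock": in_stock,
        "alternative": None if in_stock else alternative,
    }


if __name__ == "__main__":
    db = make_demo_db()
    print(check_pharmacy_inventory(db, "Amoxicillin"))
    print(check_pharmacy_inventory(db, "Paracetamol"))
```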
<p><strong>Try these scenarios yourself:</strong></p>
<table><thead><tr><th>Scenario</th><th>How to trigger</th></tr></thead><tbody><tr><td>Emergency path</td><td>Report chest pain radiating to the left arm — red banner, no queue entry</td></tr><tr><td>Clarification loop</td><td>Doctor uses "Ask Patient" before diagnosing — patient answers, routes back</td></tr><tr><td>Out-of-stock pharmacy</td><td>Prescription agent flags Amoxicillin and suggests an alternative</td></tr><tr><td>Multilingual intake</td><td>Submit symptoms in French or Spanish — intake responds in kind; doctor always gets English</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="the-tool-layer-tool-now-mcp-later">The Tool Layer: <code>@tool</code> Now, MCP Later<a href="https://jigsawflux.org/blog/agent-clinic#the-tool-layer-tool-now-mcp-later" class="hash-link" aria-label="Direct link to the-tool-layer-tool-now-mcp-later" title="Direct link to the-tool-layer-tool-now-mcp-later">​</a></h2>
<p>The intake and prescription agents access data through LangChain <code>@tool</code>-decorated functions:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">@tool</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">get_patient_record</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">patient_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">dict</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token 
plain"></span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">@tool</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">get_medical_history</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">patient_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">list</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token builtin" style="color:rgb(130, 170, 255)">dict</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">@tool</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">search_knowledge_base</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">query</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">list</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token builtin" style="color:rgb(130, 170, 255)">dict</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">@tool</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" 
style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">record_prescription</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">session_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> prescription</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">str</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><br></span></code></pre></div></div>
<p>The first three are bound to the Haiku model in <code>intake_agent_node</code>, which runs a standard tool-calling loop: invoke, check for tool calls, execute them, feed the results back, and repeat until the model produces a final JSON response or the iteration cap is reached:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># graph.py — intake_agent_node (core loop)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">intake_agent_node</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">state</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ConsultationState</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">-</span><span class="token operator" style="color:rgb(137, 221, 255)">&gt;</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">dict</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    llm </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> _get_model</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    
intake_tools </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">get_patient_record</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> get_medical_history</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> search_knowledge_base</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    llm_with_tools </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> llm</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">bind_tools</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">intake_tools</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    tool_map </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain">t</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> t </span><span class="token keyword" style="font-style:italic">for</span><span class="token plain"> t </span><span class="token keyword" style="font-style:italic">in</span><span class="token plain"> intake_tools</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    messages </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        SystemMessage</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">content</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">INTAKE_SYSTEM</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        HumanMessage</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">content</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"Patient ID: </span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">state</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string-interpolation interpolation string" style="color:rgb(195, 232, 141)">'patient_id'</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token 
string-interpolation string" style="color:rgb(195, 232, 141)">\nSymptoms: </span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">state</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string-interpolation interpolation string" style="color:rgb(195, 232, 141)">'symptoms'</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">for</span><span class="token plain"> _ </span><span class="token keyword" style="font-style:italic">in</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">range</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token number" style="color:rgb(247, 140, 108)">6</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain">  </span><span class="token comment" 
style="color:rgb(105, 112, 152);font-style:italic"># cap tool-calling iterations</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        response </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> llm_with_tools</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">invoke</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">messages</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        messages</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">append</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">response</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">not</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(130, 170, 255)">getattr</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">response</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"tool_calls"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token 
plain"> </span><span class="token boolean" style="color:rgb(255, 88, 116)">None</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token keyword" style="font-style:italic">break</span><span class="token plain">  </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># model produced final answer — no more tool calls</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">for</span><span class="token plain"> tc </span><span class="token keyword" style="font-style:italic">in</span><span class="token plain"> response</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">tool_calls</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            fn </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> tool_map</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">get</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"name"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> fn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">                result </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> fn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">invoke</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">tc</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"args"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">                messages</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">append</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">ToolMessage</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">content</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">json</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">dumps</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">result</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token 
punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> tool_call_id</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">tc</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># parse the structured JSON response and return updated state fields ...</span><br></span></code></pre></div></div>
<p>The model is instructed via <code>INTAKE_SYSTEM</code> to respond only with a JSON object containing <code>intake_summary</code>, <code>is_emergency</code>, <code>triage_score</code>, <code>triage_reason</code>, and a <code>patient_message</code> in the patient's detected language. The prescription agent follows the same loop pattern, using <code>check_pharmacy_inventory</code> and <code>record_prescription</code> as its tools.</p>
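<p>Parsing that response defensively might look like this. The field names come from the article; the <code>IntakeResult</code> dataclass, the <code>parse_intake_response</code> helper, and the validation rules are assumptions, not the project's actual code.</p>

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class IntakeResult:
    intake_summary: str
    is_emergency: bool
    triage_score: int
    triage_reason: str
    patient_message: str


REQUIRED_FIELDS = (
    "intake_summary", "is_emergency", "triage_score",
    "triage_reason", "patient_message",
)


def parse_intake_response(raw: str) -> IntakeResult:
    """Validate the model's final message before it touches graph state."""
    # Models sometimes wrap JSON in prose, so keep only the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model response")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError

    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    score = data["triage_score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not (1 <= score <= 5):
        raise ValueError(f"triage_score must be an integer from 1 to 5, got {score!r}")
    if not isinstance(data["is_emergency"], bool):
        raise ValueError("is_emergency must be true or false")
    return IntakeResult(**{name: data[name] for name in REQUIRED_FIELDS})


if __name__ == "__main__":
    reply = (
        '{"intake_summary": "Headache and fever", "is_emergency": false, '
        '"triage_score": 3, "triage_reason": "Moderate symptoms", '
        '"patient_message": "A doctor will review your case."}'
    )
    print(parse_intake_response(reply))
```

<p>Failing loudly here keeps a malformed reply from silently placing a case in the wrong part of the queue.</p>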
<p>For a POC this is the right choice: zero infrastructure, synchronous SQLite queries, and an all-Python stack.</p>
<p>In production, three problems make this untenable:</p>
<ol>
<li><strong>Security:</strong> the tools run in the agent's process with unrestricted database access, so nothing enforces that the intake agent reads records only for the <em>current</em> patient.</li>
<li><strong>Reuse:</strong> a second agent type (specialist referral, pharmacy checker) would need to duplicate or tightly share these functions.</li>
<li><strong>Auditability:</strong> there is no independent record of which agent accessed which data, and such a record is a compliance requirement in clinical settings.</li>
</ol>
<p><strong>Model Context Protocol (MCP)</strong> addresses all three. Each tool becomes a standalone server process that, deployed on AWS, gets its own IAM role, access policy, and CloudWatch audit log:</p>
<table><thead><tr><th></th><th><code>@tool</code> wrappers (POC)</th><th>MCP Servers (Production)</th></tr></thead><tbody><tr><td>Setup time</td><td>Minutes</td><td>Days</td></tr><tr><td>Security boundary</td><td>None (same process)</td><td>IAM role per server</td></tr><tr><td>Audit trail</td><td>LangGraph traces only</td><td>CloudWatch per call</td></tr><tr><td>Reusability</td><td>Duplicated per agent</td><td>Shared across all agents</td></tr><tr><td>Cost</td><td>$0</td><td>~$2–5/month (Lambda)</td></tr><tr><td>Graph changes required</td><td>None</td><td>None</td></tr></tbody></table>
<p>The last row is the key. The LangGraph node functions in <code>graph.py</code> don't change — they still call <code>get_patient_record()</code> and <code>search_knowledge_base()</code>. Only the <em>implementations</em> in <code>tools.py</code> swap from direct SQLite calls to MCP client calls. The tool signatures are the interface contract; the interface is already final.</p>
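<p>One way to keep the signature fixed while swapping the implementation is a small backend interface. Everything below is a hypothetical sketch: <code>RecordBackend</code>, <code>SqliteBackend</code>, <code>RemoteBackend</code>, and <code>configure</code> are invented names, the schema is assumed, and the remote variant takes an injected callable rather than a real MCP client.</p>

```python
import sqlite3
from typing import Callable, Optional, Protocol


class RecordBackend(Protocol):
    def fetch_patient(self, patient_id: str) -> dict: ...


class SqliteBackend:
    """POC path: query the local database directly."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def fetch_patient(self, patient_id: str) -> dict:
        row = self._conn.execute(
            "SELECT id, name FROM patients WHERE id = ?", (patient_id,)
        ).fetchone()
        return {"id": row[0], "name": row[1]} if row else {}


class RemoteBackend:
    """Production path: delegate to a tool server through an injected call function."""

    def __init__(self, call_tool: Callable[[str, dict], dict]) -> None:
        self._call_tool = call_tool  # stand-in for an MCP client session

    def fetch_patient(self, patient_id: str) -> dict:
        return self._call_tool("get_patient_record", {"patient_id": patient_id})


_backend: Optional[RecordBackend] = None


def configure(backend: RecordBackend) -> None:
    """Choose the implementation once, at startup."""
    global _backend
    _backend = backend


def get_patient_record(patient_id: str) -> dict:
    """Same signature as the article's tool; callers never see which backend runs."""
    if _backend is None:
        raise RuntimeError("call configure() first")
    return _backend.fetch_patient(patient_id)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (id TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO patients VALUES ('P001', 'Alice Thompson')")
    configure(SqliteBackend(conn))
    print(get_patient_record("P001"))
```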
<p>In Phase 2, Amazon Verified Permissions (Cedar policies) adds the access enforcement: <em>"intake_agent may call get_patient_record only for the patient_id present in the current session."</em></p>
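<p>The intent of that rule can be sketched in plain Python. This is not Cedar or a managed policy service, only an in-process illustration: <code>Session</code>, <code>scoped_to_session</code>, and the in-memory <code>AUDIT_LOG</code> are hypothetical names, and the record lookup is stubbed.</p>

```python
import functools
from dataclasses import dataclass
from typing import Callable

AUDIT_LOG: list[dict] = []  # stand-in for an external, append-only audit sink


@dataclass(frozen=True)
class Session:
    agent: str
    patient_id: str


def scoped_to_session(tool: Callable[[str], dict]) -> Callable[[Session, str], dict]:
    """Allow a tool call only for the patient bound to the current session."""

    @functools.wraps(tool)
    def guarded(session: Session, patient_id: str) -> dict:
        allowed = patient_id == session.patient_id
        # Record every attempt, including denied ones.
        AUDIT_LOG.append({
            "agent": session.agent,
            "tool": tool.__name__,
            "patient_id": patient_id,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(
                f"{session.agent} may not read {patient_id} in this session"
            )
        return tool(patient_id)

    return guarded


@scoped_to_session
def get_patient_record(patient_id: str) -> dict:
    return {"id": patient_id}  # stubbed lookup for the sketch


if __name__ == "__main__":
    session = Session(agent="intake_agent", patient_id="P001")
    print(get_patient_record(session, "P001"))
    try:
        get_patient_record(session, "P002")
    except PermissionError as err:
        print("denied:", err)
    print(len(AUDIT_LOG), "audit entries")
```

<p>An in-process guard like this shares the agent's memory space, so it documents the rule rather than enforcing it against a compromised agent. That is the gap a separate server process and an external policy engine are meant to close.</p>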
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="cost">Cost<a href="https://jigsawflux.org/blog/agent-clinic#cost" class="hash-link" aria-label="Direct link to Cost" title="Direct link to Cost">​</a></h2>
<p>For a small or charity hospital, cost isn't an optimisation — it's a constraint. A £2,000/month telemedicine SaaS subscription is simply not available. £20/month might be.</p>
<table><thead><tr><th>Item</th><th>POC (local)</th><th>Production (50 consults/day)</th><th>Lambda-equivalent stack</th></tr></thead><tbody><tr><td>Claude Haiku 4.5</td><td>~$0.001/consult</td><td>~$1.50/month</td><td>~$1.50/month (same)</td></tr><tr><td>Compute</td><td>$0 (developer laptop)</td><td>~$10–15/month (t3.micro or VPS)</td><td>~$5–10/month (Lambda)</td></tr><tr><td>State / DB</td><td>$0 (SQLite)</td><td>~$1–2/month (DynamoDB on-demand)</td><td>~$5–10/month (DDB + Step Functions)</td></tr><tr><td>API / orchestration</td><td>$0</td><td>$0 (LangGraph in-process)</td><td>~$10–15/month (API Gateway + Step Functions)</td></tr><tr><td><strong>Total</strong></td><td><strong>$0</strong></td><td><strong>&lt; $20/month</strong></td><td><strong>~$25–40/month</strong></td></tr></tbody></table>
<p>LangGraph on a cheap VM is modestly cheaper than the Lambda-native equivalent — but the bigger difference is operational complexity and the option to run on hardware you already own. A clinic that already has a server pays only for Bedrock tokens.</p>
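<p>The arithmetic behind the production column can be checked in a few lines. The figures are copied from the table; the 30-day month is an assumption of this sketch.</p>

```python
CONSULTS_PER_DAY = 50
DAYS_PER_MONTH = 30        # assumption: a 30-day month
COST_PER_CONSULT = 0.001   # USD, the approximate per-consult figure from the table


def monthly_token_cost(consults_per_day, cost_per_consult, days=DAYS_PER_MONTH):
    """Model spend per month for a steady daily consult volume."""
    return consults_per_day * cost_per_consult * days


token_cost = monthly_token_cost(CONSULTS_PER_DAY, COST_PER_CONSULT)

# (low, high) USD/month ranges from the "Production" column.
production = {
    "model": (token_cost, token_cost),
    "compute": (10, 15),
    "state": (1, 2),
    "orchestration": (0, 0),
}
low = sum(lo for lo, _ in production.values())
high = sum(hi for _, hi in production.values())
print(f"tokens: ${token_cost:.2f}/month, total: ${low:.2f} to ${high:.2f}/month")
```

<p>Token spend comes to about $1.50 a month and the total lands between $12.50 and $18.50, which is where the "under $20" figure comes from.</p>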
<p><strong>Hybrid deployment options:</strong></p>
<table><thead><tr><th>Deployment</th><th>Compute cost</th><th>Cloud dependency</th><th>Data stays local?</th></tr></thead><tbody><tr><td>Developer laptop (POC)</td><td>$0</td><td>Bedrock API only</td><td>✅ Yes</td></tr><tr><td>£25/month VPS</td><td>~£25/month</td><td>Bedrock API only</td><td>✅ Yes</td></tr><tr><td>Clinic's own server</td><td>$0 (sunk cost)</td><td>Bedrock API only</td><td>✅ Yes</td></tr><tr><td>Fully on-premises (future)</td><td>$0</td><td>None (local model)</td><td>✅ Yes</td></tr><tr><td>Lambda + Step Functions</td><td>Per-invocation</td><td>AWS cloud required</td><td>❌ No</td></tr></tbody></table>
<p>The "Bedrock API only" entry in the cloud-dependency column is worth emphasising. During a consultation, the only data that leaves the clinic's network is the prompt sent to the model: the patient's symptoms and the anonymised intake context. The patient database, medical history, and prescription records never leave the machine. That matters for GDPR compliance and for clinics operating in jurisdictions with strict patient data rules.</p>
<p>Two model decisions keep the token cost low:</p>
<p><strong>Haiku 4.5, not Sonnet 4.</strong> Haiku is roughly 3× cheaper per token at list prices. In this system, the doctor is always the final clinical authority; the triage score is a queue-sorting mechanism, not a clinical decision. The AI's job is to produce a useful intake summary, not to diagnose. Haiku 4.5 does that reliably, including structured JSON output (<code>triage_score</code>, <code>intake_summary</code>) and native multilingual support for non-English-speaking patients at no extra cost.</p>
<p><strong>Cross-region inference profile.</strong> One gotcha worth flagging: Claude Haiku 4.5 requires a cross-region inference profile, not a direct model ID. Invoking the bare model ID returns a <code>ValidationException</code>. The default in <code>graph.py</code> is already the correct form:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic"># graph.py</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">MODEL_ID </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> os</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">getenv</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token string" style="color:rgb(195, 232, 141)">"BEDROCK_MODEL_ID"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"us.anthropic.claude-haiku-4-5-20251001-v1:0"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic">#                                                         ^^^</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic">#                                          cross-region prefix — required for on-demand throughput</span><br></span></code></pre></div></div>
<p>If you override <code>BEDROCK_MODEL_ID</code> with a bare model ID (e.g. <code>anthropic.claude-haiku-4-5-20251001-v1:0</code>), switch it back to the <code>us.</code> prefixed version.</p>
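<p>One way to guard against that misconfiguration is to normalise the ID at start-up. This is a sketch, not code from <code>graph.py</code>: <code>resolve_model_id</code> is a hypothetical helper, and the list of geography prefixes is an assumption that should be checked against the Bedrock inference-profile documentation for your region.</p>

```python
import os
import warnings

# Assumption: geography prefixes used by cross-region inference profiles.
PROFILE_PREFIXES = ("us.", "eu.", "apac.", "global.")
DEFAULT_MODEL_ID = "us.anthropic.claude-haiku-4-5-20251001-v1:0"


def resolve_model_id(env=None, default_prefix="us."):
    """Return a profile-style model ID, prefixing a bare Anthropic ID if needed."""
    env = os.environ if env is None else env
    model_id = env.get("BEDROCK_MODEL_ID", DEFAULT_MODEL_ID)
    if model_id.startswith(PROFILE_PREFIXES):
        return model_id                      # already an inference profile ID
    if model_id.startswith("anthropic."):
        warnings.warn(f"bare model ID {model_id!r}; using {default_prefix}{model_id}")
        return default_prefix + model_id
    return model_id                          # anything else (e.g. an ARN) is left alone
```

<p>Failing loudly instead of rewriting the ID is an equally reasonable choice; the warning here keeps a demo running while still surfacing the mistake.</p>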
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="poc--production-roadmap">POC → Production Roadmap<a href="https://jigsawflux.org/blog/agent-clinic#poc--production-roadmap" class="hash-link" aria-label="Direct link to POC → Production Roadmap" title="Direct link to POC → Production Roadmap">​</a></h2>
<p>The architectural bet at the centre of this design: <code>graph.py</code> — the nodes, edges, conditional routing, and <code>interrupt()</code> calls — <strong>never changes across any production phase</strong>. Infrastructure evolves; the workflow doesn't.</p>
<table><thead><tr><th>Component</th><th>POC</th><th>Phase 1</th><th>Phase 2</th><th>Phase 3</th><th>Phase 4</th><th>Phase 5</th></tr></thead><tbody><tr><td><strong>LLM</strong></td><td>Haiku (Bedrock)</td><td>← same</td><td>← same</td><td>← same</td><td>← same</td><td>← same</td></tr><tr><td><strong>Graph topology</strong></td><td>5 nodes, 2 interrupts</td><td>← same</td><td>← same</td><td>← same</td><td>← same</td><td>← same</td></tr><tr><td><strong>Checkpointing</strong></td><td>MemorySaver</td><td>DynamoDB</td><td>← same</td><td>← same</td><td>← same</td><td>+ PITR</td></tr><tr><td><strong>Patient DB</strong></td><td>SQLite</td><td>RDS PostgreSQL</td><td>← same</td><td>+ encryption</td><td>← same</td><td>+ Multi-AZ</td></tr><tr><td><strong>Tool access</strong></td><td><code>@tool</code> → SQLite</td><td><code>@tool</code> → RDS</td><td><strong>MCP servers</strong></td><td>+ Cedar policies</td><td>← same</td><td>+ audit logs</td></tr><tr><td><strong>Auth</strong></td><td>Hardcoded PIN</td><td>← same</td><td>← same</td><td>Amazon Cognito</td><td>← same</td><td>← same</td></tr><tr><td><strong>Frontend</strong></td><td>Streamlit (local)</td><td>Streamlit (App Runner)</td><td>← same</td><td>React / Next.js</td><td>+ WebSocket</td><td>← same</td></tr><tr><td><strong>Infra cost/month</strong></td><td>$0</td><td>~$25</td><td>~$30</td><td>~$50</td><td>~$55</td><td>~$80–120</td></tr></tbody></table>
<p>Phase 2 is the architectural pivot. Everything before it is scaffolding. Everything after it scales. The move from <code>@tool</code> wrappers to MCP servers is the only change that touches security, reusability, and auditability simultaneously — and it does so without touching the graph.</p>
<p>Beyond the core phases, the design also supports additive extensions for low-resource environments: a <strong>WhatsApp/SMS patient interface</strong> via Twilio (the graph's <code>interrupt()</code> is channel-agnostic, so state simply sits in DynamoDB until the next SMS arrives), <strong>voice note intake</strong> via Amazon Transcribe or self-hosted Whisper, and <strong>native multilingual support</strong> (already live in the POC: Haiku 4.5 detects the patient's language and responds in kind while always producing the doctor summary in English).</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="whats-next">What's Next<a href="https://jigsawflux.org/blog/agent-clinic#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next">​</a></h2>
<p>The full source — <code>graph.py</code>, <code>tools.py</code>, <code>app.py</code>, <code>seed_db.py</code>, and all architecture documentation — is on GitHub: <a href="https://github.com/JigsawFlux/agentic-clinic" target="_blank" rel="noopener noreferrer">github.com/JigsawFlux/agentic-clinic</a>.</p>
<p>To run it locally:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">git clone https://github.com/JigsawFlux/agentic-clinic</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">cd agentic-clinic</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">python -m venv .venv &amp;&amp; source .venv/bin/activate</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">pip install -r requirements.txt</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">python seed_db.py</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">streamlit run app.py</span><br></span></code></pre></div></div>
<p>You'll need AWS credentials and Bedrock model access for <code>us.anthropic.claude-haiku-4-5-20251001-v1:0</code> in <code>us-east-1</code>. The <a href="https://github.com/JigsawFlux/agentic-clinic/blob/main/README.md" target="_blank" rel="noopener noreferrer">README</a> has the full setup steps, including the Bedrock console access request and a troubleshooting section for the two most common errors, <code>ValidationException</code> and <code>ResourceNotFoundException</code>.</p>
<p>The four scenarios in the demo section are good starting points — emergency path, clarification loop, out-of-stock pharmacy substitution, and multilingual intake. Each exercises a different branch of the graph.</p>
<hr>
<p>This is a JigsawFlux project. JigsawFlux builds open-source tools for health tech, humanitarian response, and crisis management — tools designed to work on constrained budgets, unreliable infrastructure, and donated hardware. If you're working on something in this space, or you want to contribute to this project, the <a href="https://github.com/JigsawFlux" target="_blank" rel="noopener noreferrer">JigsawFlux GitHub organisation</a> is where the work happens.</p>]]></content>
        <author>
            <name>Suresh Thomas</name>
            <uri>https://github.com/st185229</uri>
        </author>
        <category label="langgraph" term="langgraph"/>
        <category label="aws-bedrock" term="aws-bedrock"/>
        <category label="langchain" term="langchain"/>
        <category label="agents" term="agents"/>
        <category label="human-in-the-loop" term="human-in-the-loop"/>
        <category label="architecture" term="architecture"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Stop Burning AI Credits: A Framework for Right-Sizing Model Usage]]></title>
        <id>https://jigsawflux.org/blog/ai-credits-optimization</id>
        <link href="https://jigsawflux.org/blog/ai-credits-optimization"/>
        <updated>2026-06-11T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[How enterprises can cut AI token waste with complexity-based model routing and a dual-gateway architecture.]]></summary>
        <content type="html"><![CDATA[<p>Four months into 2026, Uber's AI budget for the year was already gone — thousands of engineers with un-gated access to Claude Code, bills reportedly running $500–$2,000 per person per month, and leadership asking very loud questions about what exactly all those tokens were buying. Around the same time, an unnamed "mystery company" reportedly burned $500 million on Claude credits in a single month — not from a runaway model or a billing bug, but because nobody had thought to put a usage cap on employee licences.</p>
<p>Neither story is about bad engineering. Both are about one broken default: when developers get unrestricted access to frontier models, they use frontier models for everything.</p>
<p>I've been working through a framework to fix this — not by restricting access, but by routing the right task to the right model. The goal is frictionless development that doesn't quietly drain your budget. Here's how it works.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="the-shift-that-changed-everything">The shift that changed everything<a href="https://jigsawflux.org/blog/ai-credits-optimization#the-shift-that-changed-everything" class="hash-link" aria-label="Direct link to The shift that changed everything" title="Direct link to The shift that changed everything">​</a></h2>
<p>Until early 2026, enterprise AI tooling largely ran on flat-rate subscriptions. You paid a seat fee, your team used whatever they needed, and costs were predictable. That model is gone. GitHub Copilot retired flat-rate allowances in favour of token-metered "AI Credits" on June 1, 2026. Every major provider has followed the same trajectory.</p>
<p>The problem isn't the pricing model; it's that usage habits haven't caught up. Complex agentic tasks and heavy reasoning models consume far more tokens per request than simple completions, and frontier models charge more for each of those tokens. A developer running Claude Opus to autocomplete a for-loop is swinging a sledgehammer at a drawing pin. The pin goes in either way; the difference is what the swing cost.</p>
<p>That's the sledgehammer problem, and it's the single largest source of AI budget waste: identical output, ten times the price.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="tier-your-workload">Tier your workload<a href="https://jigsawflux.org/blog/ai-credits-optimization#tier-your-workload" class="hash-link" aria-label="Direct link to Tier your workload" title="Direct link to Tier your workload">​</a></h2>
<p>The fix is simpler than it sounds: match the model to the complexity of the task. Not every request needs deep reasoning. Not every request needs a lightweight model either. Three tiers cover almost everything.</p>
<table><thead><tr><th style="text-align:left">Task Complexity</th><th style="text-align:left">Typical Use Cases</th><th style="text-align:left">Recommended Tier</th><th style="text-align:left">Cost Profile</th></tr></thead><tbody><tr><td style="text-align:left"><strong>Tier 1 — Simple &amp; Deterministic</strong></td><td style="text-align:left">Inline completion, boilerplate, unit tests, regex, Dockerfiles</td><td style="text-align:left">Efficient models (Haiku, GPT-4o-mini, Gemini Flash)</td><td style="text-align:left">📉 Lowest</td></tr><tr><td style="text-align:left"><strong>Tier 2 — Moderate &amp; Generative</strong></td><td style="text-align:left">Component logic, API endpoints, Mermaid diagram generation</td><td style="text-align:left">Balanced models (Sonnet, Gemini Pro)</td><td style="text-align:left">⚖️ Medium</td></tr><tr><td style="text-align:left"><strong>Tier 3 — Complex Reasoning</strong></td><td style="text-align:left">Architecture design, C++ memory debugging, large repo refactoring</td><td style="text-align:left">Frontier models (Opus, GPT-5, Gemini Ultra)</td><td style="text-align:left">📈 Highest</td></tr></tbody></table>
<p>The goal isn't to ban frontier models — it's to use them where they actually earn their keep. A well-structured RAG pipeline or a cross-cutting refactor across a million-line codebase? That's a Tier 3 problem. A standard API endpoint in Go? It isn't.</p>
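<p>As a rough illustration of the tiering table, a router can be as small as a lookup. The task labels below are hypothetical names derived from the use cases in the table, and defaulting unknown tasks to the middle tier is an assumption of this sketch, not a recommendation from the framework.</p>

```python
from enum import Enum


class Tier(Enum):
    EFFICIENT = 1   # Tier 1: simple and deterministic
    BALANCED = 2    # Tier 2: moderate and generative
    FRONTIER = 3    # Tier 3: complex reasoning


# Hypothetical task labels mapped to the tiers from the table.
TASK_TIERS = {
    "inline_completion": Tier.EFFICIENT,
    "boilerplate": Tier.EFFICIENT,
    "unit_test": Tier.EFFICIENT,
    "regex": Tier.EFFICIENT,
    "dockerfile": Tier.EFFICIENT,
    "component_logic": Tier.BALANCED,
    "api_endpoint": Tier.BALANCED,
    "mermaid_diagram": Tier.BALANCED,
    "architecture_design": Tier.FRONTIER,
    "memory_debugging": Tier.FRONTIER,
    "large_refactor": Tier.FRONTIER,
}


def route(task_type: str) -> Tier:
    """Pick a tier for a task; unknown tasks get the balanced tier (assumption)."""
    return TASK_TIERS.get(task_type, Tier.BALANCED)
```

<p>With this mapping, <code>route("api_endpoint")</code> resolves to the balanced tier and only the three labels in the last group ever reach a frontier model.</p>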
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="two-traffic-lanes-ide-vs-programmatic">Two traffic lanes: IDE vs programmatic<a href="https://jigsawflux.org/blog/ai-credits-optimization#two-traffic-lanes-ide-vs-programmatic" class="hash-link" aria-label="Direct link to Two traffic lanes: IDE vs programmatic" title="Direct link to Two traffic lanes: IDE vs programmatic">​</a></h2>
<p>Tiering the workload is half the answer. The other half is routing enforcement — making sure models are actually selected based on task complexity rather than developer habit. The architecture I recommend splits AI traffic into two distinct lanes, each governed differently.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="lane-a-ide-traffic-via-github-copilot">Lane A: IDE traffic via GitHub Copilot<a href="https://jigsawflux.org/blog/ai-credits-optimization#lane-a-ide-traffic-via-github-copilot" class="hash-link" aria-label="Direct link to Lane A: IDE traffic via GitHub Copilot" title="Direct link to Lane A: IDE traffic via GitHub Copilot">​</a></h3>
<p>Everyday inline coding and IDE chat stays within the GitHub Enterprise account. Governance happens through native Copilot policies: seat-based licensing, Targeted Model Rules, and usage caps. The key move is enforcing "Auto" mode as the default for standard users — it selects an appropriate model rather than defaulting to the most expensive one. Senior architects and principal engineers can be granted Tier 3 access for the work that justifies it.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="lane-b-programmatic-traffic-via-azure-apim">Lane B: Programmatic traffic via Azure APIM<a href="https://jigsawflux.org/blog/ai-credits-optimization#lane-b-programmatic-traffic-via-azure-apim" class="hash-link" aria-label="Direct link to Lane B: Programmatic traffic via Azure APIM" title="Direct link to Lane B: Programmatic traffic via Azure APIM">​</a></h3>
<p>Custom scripts, CI/CD pipelines, and internal tools route through a single Azure API Management gateway. Rather than each team managing its own provider keys and burning Tier 3 credits by default, everything flows through a central control point that provides:</p>
<ul>
<li><strong>Identity-based access control</strong> via Entra ID</li>
<li><strong>Semantic caching</strong>: identical or semantically similar prompts return cached responses at zero token cost</li>
<li><strong>Intelligent routing</strong> across Azure AI Foundry and AWS Bedrock</li>
<li><strong>Dollar-based budgets</strong> enforced per team or pipeline</li>
</ul>
<p>The caching point deserves emphasis. An automated pipeline making 50 near-identical classification requests doesn't need to pay for 50 model calls. It pays for one, caches the response, and the remaining 49 return instantly — the most straightforward cost reduction in the entire framework.</p>
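<p>The mechanics can be sketched with the standard library alone. Real semantic caches compare embedding vectors; this toy version swaps in <code>difflib</code> string similarity as a stand-in, and the class name, threshold, and <code>call_model</code> callable are all assumptions made for the example.</p>

```python
from difflib import SequenceMatcher


class SimilarityCache:
    """Toy cache: reuse a response when a new prompt closely matches a cached one."""

    def __init__(self, call_model, threshold=0.9):
        self.call_model = call_model   # hypothetical callable: prompt -> response
        self.threshold = threshold     # assumed similarity cut-off
        self.entries = []              # list of (prompt, response) pairs
        self.model_calls = 0

    def ask(self, prompt):
        for cached_prompt, response in self.entries:
            if SequenceMatcher(None, prompt, cached_prompt).ratio() >= self.threshold:
                return response        # cache hit: no model call, no token cost
        self.model_calls += 1          # cache miss: pay for one model call
        response = self.call_model(prompt)
        self.entries.append((prompt, response))
        return response


cache = SimilarityCache(call_model=lambda prompt: "low")
for build in range(50):
    cache.ask(f"Classify severity: build {build} failed on lint step")
print(cache.model_calls)
```

<p>All fifty prompts differ only in the build number, so the first one is paid for and the rest are served from the cache. The threshold is the lever to watch: set it too low and genuinely different requests start sharing an answer.</p>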
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="architecture">Architecture<a href="https://jigsawflux.org/blog/ai-credits-optimization#architecture" class="hash-link" aria-label="Direct link to Architecture" title="Direct link to Architecture">​</a></h2>
<!-- -->
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="what-this-looks-like-day-to-day">What this looks like day-to-day<a href="https://jigsawflux.org/blog/ai-credits-optimization#what-this-looks-like-day-to-day" class="hash-link" aria-label="Direct link to What this looks like day-to-day" title="Direct link to What this looks like day-to-day">​</a></h2>
<p>Three scenarios that show the framework working in practice:</p>
<p><strong>The developer writing .NET boilerplate in VS Code</strong> opens a file and starts typing. Copilot's Auto mode kicks in with a Tier 1 model — fast, cheap, accurate for the task. The developer never thinks about model selection, and the team's AI Credits aren't quietly being drained by Opus completions for standard controller logic.</p>
<p><strong>The architect testing a RAG pipeline</strong> writes a Python script to benchmark embedding strategies. Instead of managing three different provider API keys, they authenticate once via Entra ID and send all calls through the APIM gateway. The gateway checks their team budget, routes to the appropriate tier, and logs everything. If their budget is close to the limit, a threshold alert fires — not a surprise invoice.</p>
<p><strong>The data pipeline categorising code risk across 50 repositories</strong> runs nightly. The first repository's analysis is processed and cached. Repositories two through fifty hit the semantic cache and return at zero token cost. A task that would otherwise consume 50× the tokens costs the same as one.</p>
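<p>The budget check in the second scenario reduces to a small piece of state per team. The sketch below is illustrative only: <code>TeamBudget</code> and the 80% alert threshold are assumptions, and a real gateway would enforce this in policy configuration instead of application code.</p>

```python
from dataclasses import dataclass


@dataclass
class TeamBudget:
    limit_usd: float
    alert_fraction: float = 0.8   # assumption: alert once 80% of the limit is used
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Return 'ok', 'alert', or 'blocked' for a proposed spend."""
        projected = self.spent_usd + cost_usd
        if projected > self.limit_usd:
            return "blocked"      # refuse the call; nothing is recorded
        self.spent_usd = projected
        if projected >= self.alert_fraction * self.limit_usd:
            return "alert"        # allowed, but the threshold alert fires
        return "ok"
```

<p>The ordering matters: the alert fires on a call that still succeeds, so the team hears about the limit before a request is ever refused.</p>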
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="rolling-this-out">Rolling this out<a href="https://jigsawflux.org/blog/ai-credits-optimization#rolling-this-out" class="hash-link" aria-label="Direct link to Rolling this out" title="Direct link to Rolling this out">​</a></h2>
<p>Three phases, and the order is deliberate:</p>
<ol>
<li><strong>Gateway first.</strong> Stand up Azure APIM and migrate all programmatic API calls to the unified endpoint. Establish identity-based routing and per-team budgets before anything else. You can't govern what you can't see.</li>
<li><strong>IDE governance second.</strong> Once programmatic traffic is visible and controlled, implement GitHub Enterprise Targeted Model Rules. Restrict Tier 3 access to roles that genuinely need it; set Auto as the default for everyone else.</li>
<li><strong>Team education last.</strong> Infrastructure changes enforce behaviour. Education reinforces it. Once the system is in place, rolling out guidelines on prompt scoping, context compression, and model selection across VS Code, Cursor, and Antigravity lands on a foundation that already supports the habits you're trying to build.</li>
</ol>
<p>The temptation is to start with education — it feels fast and low-risk. But guidelines without enforcement fade. Get the infrastructure right first.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="the-non-profit-dimension">The non-profit dimension<a href="https://jigsawflux.org/blog/ai-credits-optimization#the-non-profit-dimension" class="hash-link" aria-label="Direct link to The non-profit dimension" title="Direct link to The non-profit dimension">​</a></h2>
<p>Everything above assumes an enterprise with a budget to optimise. But the same billing shock hits harder when you're running on grants, donations, or a volunteer-funded open-source project. Non-profits face a specific version of this problem: the same productivity pressure, the same tooling expectations from technical staff, and far less capacity to absorb a surprise $2,000-per-engineer monthly bill.</p>
<p>The tiering framework still applies — but for cost-constrained organisations, there are additional levers worth considering beyond just routing between model tiers.</p>
<p><strong>Local inference</strong> removes marginal token cost entirely. Tools like <a href="https://ollama.com/" target="_blank" rel="noopener noreferrer">Ollama</a> let you run open models (Llama, Mistral, Phi, Gemma) on local hardware or a small cloud VM — I've written about exactly this setup before, <a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab">running Ollama on a Kubernetes home lab</a>. The per-query cost drops to near zero; you trade API fees for infrastructure and a ceiling on model capability. For Tier 1 tasks — boilerplate, unit tests, simple completions — a locally hosted 7B or 13B model is often indistinguishable in output quality from a cloud API call, at a fraction of the ongoing cost.</p>
<p>Local hosting earns its keep in two situations beyond pure cost. The first is <strong>edge deployment</strong>: when inference needs to run on devices in the field — clinics with unreliable connectivity, crisis-response hardware, remote sensors — a cloud API isn't slow, it's unavailable. The second is subtler: for high-frequency, low-complexity workloads, the <strong>network round trip itself becomes a cost</strong>. Every cloud call pays latency and egress on top of the token price. A local model answering in tens of milliseconds, with no per-call fee, beats a cloud frontier model answering the same mundane question in two seconds — both on responsiveness and on the invoice.</p>
<p><strong>Corporate cloud hosting</strong> sits between local inference and direct API access. Running models through AWS Bedrock or Azure AI Foundry, rather than calling Anthropic or Google directly, can cost 20–40% less per token under negotiated enterprise agreements. More importantly for non-profits, it keeps data within a known compliance boundary, avoids egress fees from mixed-cloud setups, and makes budget governance easier through existing cloud billing tools. If your organisation is already on Azure or AWS, routing AI workloads through Foundry or Bedrock is often the fastest path to meaningful cost reduction without changing tooling.</p>
<p>The decision tree for cost-constrained teams:</p>
<ol>
<li>Can a local model (Ollama + Llama/Mistral) handle this task acceptably? Use it.</li>
<li>Is your org already on Azure or AWS? Route through Foundry or Bedrock first.</li>
<li>Does the task genuinely need a frontier capability? Then pay for the direct API — but only then.</li>
</ol>
<p>The same tiering logic, extended one level further down.</p>
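<p>Written as code, the three questions form a short chain of early returns. The function and its return labels are hypothetical, and the final <code>"unresolved"</code> case is an addition of this sketch for inputs the list above does not cover.</p>

```python
def choose_route(local_model_ok: bool, on_azure_or_aws: bool, needs_frontier: bool) -> str:
    """Walk the three questions in order and stop at the first 'yes'."""
    if local_model_ok:
        return "local"          # 1. a local model handles the task acceptably
    if on_azure_or_aws:
        return "cloud_hosted"   # 2. route through Foundry or Bedrock first
    if needs_frontier:
        return "direct_api"     # 3. pay for the direct API, but only then
    return "unresolved"         # assumption: no rule applies, so flag for review
```

<p>Because the checks run in order, a task that a local model can handle never reaches the paid options, even when the organisation also has a cloud agreement.</p>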
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="references">References<a href="https://jigsawflux.org/blog/ai-credits-optimization#references" class="hash-link" aria-label="Direct link to References" title="Direct link to References">​</a></h2>
<ol>
<li><em>Visual Studio Magazine</em>, "Copilot Billing Shock Hits Developers" (June 4, 2026)</li>
<li><em>InfoWorld</em>, "GitHub shifts Copilot to usage-based billing" (April 28, 2026)</li>
<li><em>Forbes</em>, <a href="https://www.forbes.com/sites/janakirammsv/2026/05/17/uber-burns-its-2026-ai-budget-in-four-months-on-claude-code/" target="_blank" rel="noopener noreferrer">"Uber Burns Its 2026 AI Budget in Four Months on Claude Code"</a> (May 17, 2026)</li>
<li><em>AI Magazine</em>, <a href="https://aimagazine.com/news/why-uber-has-already-burned-through-its-ai-budget" target="_blank" rel="noopener noreferrer">"Why Uber Has Already Burned Through Its AI Budget"</a></li>
<li><em>Inc. Magazine</em>, "1 Company Spent Half a Billion Dollars on Claude in a Single Month" (June 5, 2026)</li>
<li><em>Tom's Hardware</em>, "Mystery company accidentally blew $500 million on Claude AI in a single month" (May 29, 2026)</li>
</ol>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="appendix-how-this-post-was-written">Appendix: How this post was written<a href="https://jigsawflux.org/blog/ai-credits-optimization#appendix-how-this-post-was-written" class="hash-link" aria-label="Direct link to Appendix: How this post was written" title="Direct link to Appendix: How this post was written">​</a></h2>
<p>This post was drafted and iterated entirely using Claude Code — which makes it a small live example of the tiering approach it describes. Three sessions, three different kinds of work, with measurably different token costs for each.</p>
<p><strong>Session 1 — structural work (Sonnet + Haiku subagent)</strong></p>
<p>The first session covered the heavy lifting: converting a raw strategy document into a structured blog post, rewriting from third-person enterprise-doc tone to first-person narrative, and setting up the publishing infrastructure (frontmatter, truncate markers, draft log).</p>
<table><thead><tr><th style="text-align:left">Step</th><th style="text-align:left">Model</th><th style="text-align:left">Use case</th><th style="text-align:left">Cost</th></tr></thead><tbody><tr><td style="text-align:left">Iter 0</td><td style="text-align:left">claude-sonnet-4-6</td><td style="text-align:left">Frontmatter, CLAUDE.md setup, draft log scaffolding</td><td style="text-align:left">—</td></tr><tr><td style="text-align:left">Iter 1</td><td style="text-align:left">claude-sonnet-4-6</td><td style="text-align:left">Full structural rewrite — enterprise doc → first-person blog</td><td style="text-align:left">—</td></tr><tr><td style="text-align:left">Subagent</td><td style="text-align:left">claude-haiku-4-5</td><td style="text-align:left">Read-only codebase lookup (existing blog tone analysis)</td><td style="text-align:left">$0.02</td></tr><tr><td style="text-align:left"><strong>Session 1 total</strong></td><td style="text-align:left"></td><td style="text-align:left">1.5k input · 18k output · 1.6m cache read</td><td style="text-align:left"><strong>$1.30</strong></td></tr></tbody></table>
<p><strong>Session 2 — prose and new sections (Sonnet)</strong></p>
<p>The second session expanded the post: narrative restructuring, the non-profit section, and the first version of this appendix. Same model as Session 1, but cheaper — targeted edits on a stable structure produce far fewer output tokens than a full rewrite.</p>
<table><thead><tr><th style="text-align:left">Step</th><th style="text-align:left">Model</th><th style="text-align:left">Use case</th><th style="text-align:left">Cost</th></tr></thead><tbody><tr><td style="text-align:left">Iter 2–3</td><td style="text-align:left">claude-sonnet-4-6</td><td style="text-align:left">Prose polish, non-profit section, appendix with real cost data</td><td style="text-align:left">—</td></tr><tr><td style="text-align:left"><strong>Session 2 total</strong></td><td style="text-align:left"></td><td style="text-align:left">0.7k input · 11.1k output · 1.0m cache read</td><td style="text-align:left"><strong>~$0.56</strong></td></tr></tbody></table>
<p><strong>Session 3 — narrative pass (Fable 5)</strong></p>
<p>The final session switched to <code>claude-fable-5</code>, Anthropic's narrative-optimised model, for a light editorial pass: metaphor consistency, sentence rhythm, and voice. The smallest change set of the three sessions — and, unexpectedly, the most expensive.</p>
<table><thead><tr><th style="text-align:left">Step</th><th style="text-align:left">Model</th><th style="text-align:left">Use case</th><th style="text-align:left">Cost</th></tr></thead><tbody><tr><td style="text-align:left">Iter 4</td><td style="text-align:left">claude-fable-5</td><td style="text-align:left">Editorial polish — metaphor consistency, rhythm, voice</td><td style="text-align:left">$2.25</td></tr><tr><td style="text-align:left"><strong>Session 3 total</strong></td><td style="text-align:left"></td><td style="text-align:left">645 input · 5.1k output</td><td style="text-align:left"><strong>$2.25</strong></td></tr></tbody></table>
<p>Look at the numbers side by side. Sonnet generated 29.1k output tokens (18k + 11.1k) across two sessions of structural and prose work for roughly $1.84, which is the two session totals less the $0.02 Haiku subagent. Fable generated 5.1k tokens, about a sixth of the volume, for $2.25. The lightest session cost the most, because premium per-token pricing dominated token count entirely.</p>
<p>That accidental result is a cleaner demonstration of this post's thesis than anything I planned: <strong>model choice drives cost more than workload size does.</strong> Whether the premium model was worth it for an editorial pass is exactly the kind of question the tiering framework exists to force. Total across all three sessions: <strong>~$4.11</strong> ($1.30 + ~$0.56 + $2.25).</p>
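<p>A crude way to compare the sessions is cost per thousand output tokens, using the session totals from the three tables. This ignores input and cache-read tokens, and Session 1's total includes the $0.02 Haiku subagent, so treat the ratios as indicative only.</p>

```python
# Output tokens and session totals copied from the tables above.
sessions = {
    "session_1_sonnet": {"output_tokens": 18_000, "cost_usd": 1.30},
    "session_2_sonnet": {"output_tokens": 11_100, "cost_usd": 0.56},
    "session_3_fable": {"output_tokens": 5_100, "cost_usd": 2.25},
}


def usd_per_1k_output(session):
    """Session cost divided by output volume, in USD per 1,000 output tokens."""
    return session["cost_usd"] / (session["output_tokens"] / 1000)


rates = {name: round(usd_per_1k_output(s), 3) for name, s in sessions.items()}
most_expensive = max(rates, key=rates.get)
print(rates, most_expensive)
```

<p>The two Sonnet sessions come out at roughly $0.07 and $0.05 per thousand output tokens; the Fable session is about $0.44.</p>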
<p><em>Full token breakdown: <code>_DRAFT_LOG.md</code> in this post's directory.</em></p>]]></content>
        <author>
            <name>Suresh Thomas</name>
            <uri>https://github.com/st185229</uri>
        </author>
        <category label="ai-credits" term="ai-credits"/>
        <category label="github-copilot" term="github-copilot"/>
        <category label="azure" term="azure"/>
        <category label="cost-optimization" term="cost-optimization"/>
        <category label="enterprise" term="enterprise"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Running a Local LLM on Kubernetes — A Home Lab Setup]]></title>
        <id>https://jigsawflux.org/blog/local-llm-kubernetes-home-lab</id>
        <link href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab"/>
        <updated>2026-06-01T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Part 2 of the local LLM series moves Ollama from bare-metal Linux into a MicroK8s cluster on an Intel NUC home lab server — covering StatefulSet deployment, Open-WebUI, MetalLB ingress, and the honest truth about running CPU-only inference on an AMD GPU machine.]]></summary>
        <content type="html"><![CDATA[<p>In <a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike">Part 1</a> I ran Ollama directly on a Linux machine and wired it up through an MCP layer to a small web app. It worked. But bare-metal has friction — if the process crashes, it stays down. Adding Open-WebUI means managing another process. Resource limits are manual. There's no clean internal networking between services.</p>
<p>This post moves the whole thing into Kubernetes. The goal isn't enterprise-grade infrastructure — it's a home lab setup that's reliable, easy to extend, and honest about its limitations.</p>
<p><em>Manifests are in the <a href="https://github.com/JigsawFlux/ollama-mcp-starter" target="_blank" rel="noopener noreferrer"><code>ollama-mcp-starter</code></a> repo under <code>backend/k8s-deployment/</code>.</em></p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="the-hardware">The Hardware<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#the-hardware" class="hash-link" aria-label="Direct link to The Hardware" title="Direct link to The Hardware">​</a></h2>
<p>The server is an Intel NUC Hades Canyon (NUC8i7HVK) — a small-form-factor machine with a skull logo on the lid and a surprisingly capable spec for its size:</p>
<table><thead><tr><th>Component</th><th>Detail</th></tr></thead><tbody><tr><td>CPU</td><td>Intel Core i7-8809G, 4C/8T, 3.1 GHz base / 4.2 GHz turbo</td></tr><tr><td>RAM</td><td>32 GB DDR4</td></tr><tr><td>Storage</td><td>NVMe SSD (M.2 PCIe)</td></tr><tr><td>GPU</td><td>AMD Radeon RX Vega M GH — 4 GB HBM2</td></tr><tr><td>Network</td><td>2× Intel Gigabit LAN</td></tr><tr><td>Power</td><td>230W external adapter</td></tr></tbody></table>
<p>It draws modest power for a home server, runs quietly, and fits on a shelf. These are the things that matter when it's on 24/7.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="the-gpu-caveat">The GPU caveat<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#the-gpu-caveat" class="hash-link" aria-label="Direct link to The GPU caveat" title="Direct link to The GPU caveat">​</a></h3>
<p>The Vega M GH is a capable GPU for graphics workloads, but <strong>Ollama's main GPU acceleration path is CUDA, an NVIDIA-only technology</strong>. Ollama does support AMD GPUs through ROCm, but that route requires manual configuration and is not covered by the GPU add-on that MicroK8s ships, which installs the NVIDIA GPU Operator.</p>
<p>In practice: Ollama on this box runs <strong>CPU-only</strong>. The i7-8809G handles inference well enough for personal and home lab use — expect 10–20 tokens/second with <code>llama3.1:8b</code>. More on this in <a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#lessons-learned">Lessons Learned</a>.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="why-move-to-kubernetes">Why Move to Kubernetes?<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#why-move-to-kubernetes" class="hash-link" aria-label="Direct link to Why Move to Kubernetes?" title="Direct link to Why Move to Kubernetes?">​</a></h2>
<p>Running Ollama on bare metal works, but once you add a second service (Open-WebUI), you're managing two processes manually. A third service and it becomes unwieldy. Kubernetes solves this cleanly:</p>
<ul>
<li><strong>Auto-restart</strong> — pods restart automatically on crash; no babysitting processes</li>
<li><strong>Resource limits</strong> — cap CPU and memory per service so one greedy process can't starve the others</li>
<li><strong>Persistent storage</strong> — PersistentVolumeClaims keep model data across pod restarts</li>
<li><strong>Internal DNS</strong> — services talk to each other by name (<code>ollama-service.ollama.svc.cluster.local</code>) rather than fragile IP addresses</li>
<li><strong>Ingress</strong> — one load balancer IP, hostname-based routing to multiple services</li>
</ul>
<p>For a single-node home lab, <strong>MicroK8s</strong> is the right choice. It installs as a snap package, ships with the add-ons you need, and doesn't require a multi-node cluster to be useful.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="architecture-overview">Architecture Overview<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#architecture-overview" class="hash-link" aria-label="Direct link to Architecture Overview" title="Direct link to Architecture Overview">​</a></h2>
<p>Before diving into setup, here's the logical view of the full system. There are two access paths to Ollama:</p>
<ul>
<li><strong>MCP path</strong> (from Part 1) — Browser → Static Frontend → Node.js Backend → MCP Server → <code>ollama.local</code></li>
<li><strong>Direct path</strong> — Browser → <code>ai.local</code> → Open-WebUI → Ollama (internal ClusterIP)</li>
</ul>
<p>Both paths go through MetalLB and Nginx on the NUC. The MCP app and frontend still run on the dev machine; only the inference layer lives in Kubernetes.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="microk8s-setup">MicroK8s Setup<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#microk8s-setup" class="hash-link" aria-label="Direct link to MicroK8s Setup" title="Direct link to MicroK8s Setup">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="install">Install<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#install" class="hash-link" aria-label="Direct link to Install" title="Direct link to Install">​</a></h3>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">sudo snap install microk8s --classic</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">sudo usermod -aG microk8s $USER</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">newgrp microk8s</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Verify the node is ready:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">microk8s kubectl get nodes</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="enable-add-ons">Enable Add-ons<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#enable-add-ons" class="hash-link" aria-label="Direct link to Enable Add-ons" title="Direct link to Enable Add-ons">​</a></h3>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">microk8s enable dns</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">microk8s enable storage</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">microk8s enable ingress</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">microk8s enable metallb</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>When enabling MetalLB, you'll be prompted for an IP range. Use a small slice of your local subnet that won't conflict with DHCP — for example:</p>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">192.168.1.50-192.168.1.60</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>MetalLB will assign IPs from this pool to LoadBalancer services. The ingress controller picks up <code>192.168.1.54</code> in this setup.</p>
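<p>For reference, on MetalLB releases that configure pools through custom resources (0.13 and later), the range above corresponds roughly to the objects below. This is a sketch, not the add-on's actual output: the names <code>home-lab-pool</code> and <code>home-lab-l2</code> are hypothetical, and the <code>metallb-system</code> namespace is an assumption about where MetalLB is installed.</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv">apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: home-lab-pool          # hypothetical name
  namespace: metallb-system    # assumed namespace
spec:
  addresses:
    - 192.168.1.50-192.168.1.60   # same range as entered at the prompt
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: home-lab-l2            # hypothetical name
  namespace: metallb-system
spec:
  ipAddressPools:
    - home-lab-pool            # announce addresses from the pool above on the LAN
</code></pre></div></div>
<p>Keeping the pool outside the router's DHCP range matters more than its size. A single-node lab rarely needs more than a handful of LoadBalancer addresses.</p>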
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="kubectl-alias">kubectl alias<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#kubectl-alias" class="hash-link" aria-label="Direct link to kubectl alias" title="Direct link to kubectl alias">​</a></h3>
<p>MicroK8s ships its own <code>kubectl</code>. To use the standard <code>kubectl</code> command (useful for remote access from another machine):</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">microk8s config &gt; ~/.kube/config</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
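<p>This overwrites any existing <code>~/.kube/config</code>, so back that file up first if you already manage other clusters. For remote access, copy the exported config to your client machine, install <code>kubectl</code> there, and point it at that file.</p>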
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="gpu-operator">GPU Operator<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#gpu-operator" class="hash-link" aria-label="Direct link to GPU Operator" title="Direct link to GPU Operator">​</a></h3>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">microk8s enable gpu</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>This installs the <strong>NVIDIA GPU Operator</strong>. It works well if your node has an NVIDIA card — it handles driver installation, container runtime configuration, and makes GPUs schedulable as resources.</p>
<p>On this NUC, the GPU is AMD (Vega M GH), so the operator deploys but has nothing to drive. Ollama falls back to CPU. The <code>gpu-operator-resources</code> namespace will exist but be idle. See <a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#lessons-learned">Lessons Learned</a>.</p>
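<p>For contrast, on a node with an NVIDIA card a container would claim a GPU through the <code>nvidia.com/gpu</code> extended resource that the operator's device plugin advertises. The fragment below is a generic sketch and is not part of this deployment's manifests:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"># Container-level fragment; only schedulable on a node with an NVIDIA GPU
resources:
  limits:
    nvidia.com/gpu: 1   # request one whole GPU
</code></pre></div></div>
<p>On the NUC, a pod carrying this limit would sit in <code>Pending</code>, because no node advertises the resource. That is why the Ollama manifests here set only CPU and memory limits.</p>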
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="deploying-ollama">Deploying Ollama<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#deploying-ollama" class="hash-link" aria-label="Direct link to Deploying Ollama" title="Direct link to Deploying Ollama">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="why-a-statefulset">Why a StatefulSet?<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#why-a-statefulset" class="hash-link" aria-label="Direct link to Why a StatefulSet?" title="Direct link to Why a StatefulSet?">​</a></h3>
<p>Ollama stores pulled models in <code>/root/.ollama</code>, so that directory has to survive pod restarts. A <code>Deployment</code> gives pods random name suffixes and treats them as interchangeable, which is a poor fit for a single instance tied to one volume. A <code>StatefulSet</code> gives a stable pod name (<code>ollama-0</code>), ordered startup and shutdown, and a consistent binding between the pod and its PersistentVolumeClaim.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="the-stack-manifest">The Stack Manifest<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#the-stack-manifest" class="hash-link" aria-label="Direct link to The Stack Manifest" title="Direct link to The Stack Manifest">​</a></h3>
<p><code>ollama-stack.yaml</code> creates four resources in sequence:</p>
<ol>
<li>The <code>ollama</code> namespace</li>
<li>A 50 Gi PVC (models are large — <code>llama3.1:8b</code> is ~5 GB; headroom matters)</li>
<li>The Ollama StatefulSet</li>
<li>A ClusterIP service on port 11434</li>
</ol>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token key atrule">apiVersion</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> apps/v1</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token key atrule">kind</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> StatefulSet</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token key atrule">metadata</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token key atrule">name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ollama</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token key atrule">namespace</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ollama</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token key atrule">spec</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token key atrule">replicas</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span 
class="token plain"> </span><span class="token number" style="color:rgb(247, 140, 108)">1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token key atrule">template</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token key atrule">spec</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">      </span><span class="token key atrule">containers</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain"> </span><span class="token key atrule">name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ollama</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">          </span><span class="token key atrule">image</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ollama/ollama</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain">latest</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">          </span><span class="token key atrule">resources</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token key 
atrule">limits</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">              </span><span class="token key atrule">cpu</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"4"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">              </span><span class="token key atrule">memory</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"16Gi"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token key atrule">requests</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">              </span><span class="token key atrule">cpu</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"2"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">              </span><span class="token key atrule">memory</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"4Gi"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">          </span><span class="token key atrule">volumeMounts</span><span class="token punctuation" 
style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">            </span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain"> </span><span class="token key atrule">name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ollama</span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain">volume</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">              </span><span class="token key atrule">mountPath</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> /root/.ollama</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The memory limit of 16 Gi gives <code>llama3.1:8b</code> enough headroom to load the full model into RAM, and it leaves the other half of the NUC's 32 GB for Open-WebUI, the Kubernetes components, and the OS.</p>
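<p>The excerpt above covers only the StatefulSet. A minimal sketch of the other three resources in the list might look like the following. The PVC name <code>ollama-pvc</code>, the <code>app: ollama</code> label, and the <code>microk8s-hostpath</code> storage class are assumptions made for illustration; the service name follows the in-cluster DNS name used in this post. Check the manifests in the repo for the actual values.</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv">apiVersion: v1
kind: Namespace
metadata:
  name: ollama
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-pvc                      # hypothetical name
  namespace: ollama
spec:
  accessModes:
    - ReadWriteOnce                     # single node, single writer
  storageClassName: microk8s-hostpath   # assumed: class provided by the storage add-on
  resources:
    requests:
      storage: 50Gi
---
apiVersion: v1
kind: Service
metadata:
  name: ollama-service                  # resolves as ollama-service.ollama.svc.cluster.local
  namespace: ollama
spec:
  type: ClusterIP
  selector:
    app: ollama                         # assumed pod label
  ports:
    - port: 11434
      targetPort: 11434
</code></pre></div></div>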
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="model-preload-with-initcontainer">Model Preload with initContainer<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#model-preload-with-initcontainer" class="hash-link" aria-label="Direct link to Model Preload with initContainer" title="Direct link to Model Preload with initContainer">​</a></h3>
<p>The first time Ollama starts, no models are pulled. If your app tries to chat before a model exists, it fails with a confusing error. <code>ollama-automated.yaml</code> solves this with an <code>initContainer</code> that pulls the model before the main container starts:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token key atrule">initContainers</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain"> </span><span class="token key atrule">name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> pull</span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain">model</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token key atrule">image</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ollama/ollama</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain">latest</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token key atrule">command</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"/bin/sh"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"-c"</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token key atrule">args</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">      </span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">|</span><span class="token scalar string" style="color:rgb(195, 232, 141)"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token scalar string" style="color:rgb(195, 232, 141)">        ollama serve &amp;</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token scalar string" style="color:rgb(195, 232, 141)">        sleep 5</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token scalar string" style="color:rgb(195, 232, 141)">        ollama pull llama3.1:8b</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token scalar string" style="color:rgb(195, 232, 141)">        kill %1</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token key atrule">volumeMounts</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">      </span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain"> </span><span class="token key atrule">name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> ollama</span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token 
plain">volume</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token key atrule">mountPath</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> /root/.ollama</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The init container starts <code>ollama serve</code> in the background, sleeps for five seconds to give the server time to come up, pulls the model, then stops the server. The main container finds the model already on disk and starts serving immediately.</p>
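<p>The <code>sleep 5</code> in the manifest above is a fixed delay, which can be too short on a slow or busy node. One possible variation, sketched here rather than taken from the repo, polls the server until it answers before pulling. It assumes <code>ollama list</code> exits non-zero while the server is unreachable, and the 60-attempt cap is an arbitrary choice.</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv">initContainers:
  - name: pull-model
    image: ollama/ollama:latest
    command: ["/bin/sh", "-c"]
    args:
      - |
        ollama serve &amp;
        SERVER_PID=$!
        # Poll once per second, up to 60 tries, until the API responds
        tries=0
        until ollama list &gt;/dev/null 2&gt;&amp;1; do
          tries=$((tries + 1))
          if [ "$tries" -ge 60 ]; then
            echo "ollama did not become ready" &gt;&amp;2
            exit 1
          fi
          sleep 1
        done
        ollama pull llama3.1:8b
        # Stop the background server by PID instead of by job number
        kill "$SERVER_PID"
    volumeMounts:
      - name: ollama-volume
        mountPath: /root/.ollama
</code></pre></div></div>
<p>Exiting non-zero on timeout makes Kubernetes restart the init container, so a slow start shows up as a visible retry instead of a pull against a server that is not listening yet.</p>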
<p>Apply it:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/ollama-stack.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># or, to pre-pull the model on first boot:</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/ollama-automated.yaml</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="deploying-open-webui">Deploying Open-WebUI<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#deploying-open-webui" class="hash-link" aria-label="Direct link to Deploying Open-WebUI" title="Direct link to Deploying Open-WebUI">​</a></h2>
<p>Open-WebUI is a polished chat interface that connects directly to Ollama. <code>open-webui-stack.yaml</code> deploys it in the same <code>ollama</code> namespace:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token key atrule">env</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain"> </span><span class="token key atrule">name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> OLLAMA_BASE_URL</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token key atrule">value</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> http</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain">//ollama</span><span class="token punctuation" style="color:rgb(199, 146, 234)">-</span><span class="token plain">service.ollama.svc.cluster.local</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token number" style="color:rgb(247, 140, 108)">11434</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" 
class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The key line is <code>OLLAMA_BASE_URL</code>. It uses Kubernetes internal DNS (<code>&lt;service&gt;.&lt;namespace&gt;.svc.cluster.local</code>) to reach Ollama — no hard-coded IPs, no reliance on external networking. If the Ollama pod restarts and gets a new IP, the DNS name still resolves correctly.</p>
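<p>As a rough illustration of that naming pattern, the sketch below assembles the same URL from its parts. The helper name <code>cluster_service_url</code> is hypothetical and not part of the repo, and it assumes the default <code>cluster.local</code> cluster domain.</p>

```python
def cluster_service_url(service: str, namespace: str, port: int, scheme: str = "http") -> str:
    """Build the in-cluster URL for a Kubernetes Service from its DNS name."""
    # Pattern: SERVICE.NAMESPACE.svc.cluster.local
    # (assumes the default cluster domain, cluster.local)
    host = f"{service}.{namespace}.svc.cluster.local"
    return f"{scheme}://{host}:{port}"


# Same service, namespace and port as the OLLAMA_BASE_URL value above
ollama_url = cluster_service_url("ollama-service", "ollama", 11434)
print(ollama_url)
```

<p>The pod IP appears nowhere in the result, which is why a restart that changes the IP does not invalidate the setting.</p>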
<p>A 10 Gi PVC stores chat history and Open-WebUI configuration.</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/open-webui-stack.yaml</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Once running, Open-WebUI is available at <code>http://ai.local</code> (after ingress is configured below).</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="metallb--nginx-ingress">MetalLB + Nginx Ingress<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#metallb--nginx-ingress" class="hash-link" aria-label="Direct link to MetalLB + Nginx Ingress" title="Direct link to MetalLB + Nginx Ingress">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="why-metallb">Why MetalLB?<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#why-metallb" class="hash-link" aria-label="Direct link to Why MetalLB?" title="Direct link to Why MetalLB?">​</a></h3>
<p>In a cloud cluster, <code>type: LoadBalancer</code> services get a public IP automatically from the cloud provider. On bare metal, nothing assigns that IP — services stay in <code>&lt;pending&gt;</code>. MetalLB fills that gap by assigning IPs from a configured pool to LoadBalancer services on your local network.</p>
<p><code>ingress-lb.yaml</code> creates a LoadBalancer service for the Nginx ingress controller:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/ingress-lb.yaml</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>MetalLB assigns <code>192.168.1.54</code> from the configured pool. Verify:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl get svc -n ingress</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># ingress-loadbalancer   LoadBalancer   ...   192.168.1.54   80:...,443:...</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
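<p>If the service picks up an unexpected address, a quick sanity check is to confirm the IP actually falls inside the pool you gave MetalLB. The sketch below is one way to do that offline; the pool <code>192.168.1.50-192.168.1.60</code> is a made-up example, since the real pool is whatever your MetalLB configuration defines, and <code>in_pool</code> is a hypothetical helper.</p>

```python
import ipaddress


def in_pool(ip: str, pool: str) -> bool:
    """Return True if ip falls inside a pool written as 'start-end' or as CIDR."""
    addr = ipaddress.ip_address(ip)
    if "-" in pool:
        # Range form, e.g. "192.168.1.50-192.168.1.60"
        start_text, end_text = pool.split("-", 1)
        start = ipaddress.ip_address(start_text.strip())
        end = ipaddress.ip_address(end_text.strip())
        return start <= addr <= end
    # CIDR form, e.g. "192.168.1.0/24"
    return addr in ipaddress.ip_network(pool, strict=False)


# Hypothetical pool; substitute the range from your own MetalLB config
print(in_pool("192.168.1.54", "192.168.1.50-192.168.1.60"))
```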
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="ingress-rules">Ingress Rules<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#ingress-rules" class="hash-link" aria-label="Direct link to Ingress Rules" title="Direct link to Ingress Rules">​</a></h3>
<p>Three ingress objects route hostnames to services:</p>
<table><thead><tr><th>File</th><th>Hostname</th><th>Target Service</th></tr></thead><tbody><tr><td><code>ollama-ingress.yaml</code></td><td><code>ollama.local</code></td><td><code>ollama-service:11434</code></td></tr><tr><td><code>open-webui-ingress.yaml</code></td><td><code>ai.local</code></td><td><code>open-webui-service:80</code></td></tr><tr><td><code>dashboard-ingress.yaml</code></td><td><code>dashboard.local</code></td><td>Kubernetes dashboard</td></tr></tbody></table>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/ollama-ingress.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/open-webui-ingress.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/dashboard-ingress.yaml</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="client-hosts-file">Client Hosts File<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#client-hosts-file" class="hash-link" aria-label="Direct link to Client Hosts File" title="Direct link to Client Hosts File">​</a></h3>
<p>Any machine that wants to reach these hostnames needs a single <code>/etc/hosts</code> entry pointing the names at the MetalLB IP:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">192.168.1.54  dashboard.local ollama.local ai.local</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>On macOS and Linux: <code>/etc/hosts</code>. On Windows: <code>C:\Windows\System32\drivers\etc\hosts</code>.</p>
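<p>With several client machines it can help to generate the line and check which names are still missing, rather than editing by hand. This is a minimal sketch that works on the file's text only, so it needs no elevated privileges; <code>hosts_entry</code> and <code>mapped_hostnames</code> are hypothetical helper names.</p>

```python
def hosts_entry(ip: str, hostnames: list[str]) -> str:
    """Format a single hosts-file line mapping every hostname to one IP."""
    return f"{ip}  {' '.join(hostnames)}"


def mapped_hostnames(hosts_text: str, ip: str) -> set[str]:
    """Collect the hostnames that hosts_text already maps to ip."""
    names: set[str] = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split on whitespace
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == ip:
            names.update(fields[1:])
    return names


wanted = ["dashboard.local", "ollama.local", "ai.local"]
existing = "127.0.0.1 localhost\n192.168.1.54 ai.local  # added earlier\n"
missing = [h for h in wanted if h not in mapped_hostnames(existing, "192.168.1.54")]
print(hosts_entry("192.168.1.54", missing))
```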
<p>After this, <code>http://ai.local</code> opens Open-WebUI in the browser, and <code>http://ollama.local</code> is the Ollama API endpoint.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="connecting-the-mcp-app-from-part-1">Connecting the MCP App from Part 1<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#connecting-the-mcp-app-from-part-1" class="hash-link" aria-label="Direct link to Connecting the MCP App from Part 1" title="Direct link to Connecting the MCP App from Part 1">​</a></h2>
<p>The MCP app from Part 1 pointed at the bare-metal Ollama IP. Switching to the k8s deployment is a one-line <code>.env</code> change in both <code>backend/.env</code> and <code>mcp-server/.env</code>:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Before (bare metal)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">OLLAMA_HOST=http://192.168.1.80:11434</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># After (k8s ingress)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">OLLAMA_HOST=http://ollama.local</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">OLLAMA_MODEL=llama3.1:8b</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">OLLAMA_TIMEOUT=180000</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The app now survives an Ollama pod restart without any manual intervention — the pod comes back up, the DNS name resolves, and the next request succeeds. The 3-minute timeout (<code>180000</code> ms) accounts for the slower CPU-only inference on <code>llama3.1:8b</code>.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="verifying-the-deployment">Verifying the Deployment<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#verifying-the-deployment" class="hash-link" aria-label="Direct link to Verifying the Deployment" title="Direct link to Verifying the Deployment">​</a></h2>
<p>Check that everything is running:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl get pods,svc,ingress -n ollama</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Expected output:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">NAME                             READY   STATUS    RESTARTS</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">pod/ollama-0                     1/1     Running   1</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">pod/open-webui-648d966b5-jsvfl   1/1     Running   2</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">NAME                         TYPE        CLUSTER-IP       PORT(S)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">service/ollama-service       ClusterIP   10.152.183.132   11434/TCP</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">service/open-webui-service   ClusterIP   10.152.183.48    80/TCP</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">NAME                                    HOSTS        ADDRESS</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">ingress/ollama-ingress                  ollama.local 127.0.0.1</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">ingress/open-webui-ingress             ai.local     127.0.0.1</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" 
aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Test Ollama directly:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">curl http://ollama.local/api/tags</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
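<p>The <code>/api/tags</code> response is JSON with a <code>models</code> array, so the same check is easy to script. The sketch below parses a response body and lists the model names; the sample body is trimmed and illustrative (real responses carry more fields per model), and <code>installed_models</code> is a hypothetical helper.</p>

```python
import json


def installed_models(tags_body: str) -> list[str]:
    """Extract model names from an Ollama /api/tags JSON response body."""
    payload = json.loads(tags_body)
    return [model["name"] for model in payload.get("models", [])]


# Trimmed, illustrative body; pipe real curl output in instead
sample = '{"models": [{"name": "llama3.1:8b"}, {"name": "llama3.2:3b"}]}'
names = installed_models(sample)
print("llama3.1:8b" in names)
```

<p>A check like this could gate a smoke test: if the expected model is not listed, the preload has not finished yet.</p>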
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="demo">Demo<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#demo" class="hash-link" aria-label="Direct link to Demo" title="Direct link to Demo">​</a></h2>
<p>With everything running, Open-WebUI is available at <code>http://ai.local</code> — a full chat interface served entirely from the home lab, with no cloud API involved.</p>
<p><img decoding="async" loading="lazy" alt="Open-WebUI running at ai.local with llama3.1:8b responding to a question about kids&#39; art" src="https://jigsawflux.org/blog/assets/images/working_screenshot-3636e7b8e7d1c109d9a92b8c8f1cf515.jpg" width="2696" height="1808" class="img_ev3q"></p>
<p><code>llama3.1:8b</code> handling a practical question with a detailed, well-structured response. The model name and URL in the browser confirm it's running locally through the k8s ingress.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="lessons-learned">Lessons Learned<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#lessons-learned" class="hash-link" aria-label="Direct link to Lessons Learned" title="Direct link to Lessons Learned">​</a></h2>
<p><strong>The GPU assumption cost time.</strong> I installed the NVIDIA GPU operator early and assumed Ollama would detect the AMD Vega M and use it. It didn't: the NVIDIA operator targets CUDA, AMD GPUs need ROCm, and the two are separate stacks. The Ollama container never saw the GPU. Lesson: check the GPU vendor before planning acceleration. For AMD, ROCm support in Ollama requires a custom image and a different operator setup, which may become a future post if I get there.</p>
<p><strong>CPU inference is slower than expected on larger models.</strong> <code>llama3.2:3b</code> from Part 1 was snappy on CPU. Upgrading to <code>llama3.1:8b</code> dropped throughput noticeably, to around 10–15 tokens/second. The 3B model was fine for quick tests; the 8B model gives better-quality answers but requires patience. The 3-minute timeout in the MCP app exists because of this.</p>
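<p>To see why three minutes is a reasonable ceiling, here is the back-of-the-envelope arithmetic using the 10–15 tokens/second figure and the <code>180000</code> ms timeout. It is a rough sketch that ignores model load and prompt-processing time, so real budgets are somewhat lower; <code>token_budget</code> is a hypothetical helper.</p>

```python
def token_budget(timeout_ms: int, tokens_per_second: float) -> int:
    """Upper bound on tokens generated before the timeout fires."""
    # Convert milliseconds to seconds, then multiply by the generation rate
    return int(timeout_ms / 1000 * tokens_per_second)


low = token_budget(180000, 10)   # slow end of the observed range
high = token_budget(180000, 15)  # fast end of the observed range
print(low, high)
```

<p>At these rates the timeout leaves room for roughly 1,800 to 2,700 generated tokens, enough for a long answer but not unlimited.</p>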
<p><strong>The <code>initContainer</code> preload is worth the complexity.</strong> On first deploy without it, the Ollama pod starts, the Open-WebUI pod starts, a user immediately tries to chat, and gets a confusing "model not found" error. The initContainer adds maybe 5 minutes to first boot (model download) and eliminates that failure mode entirely.</p>
<p><strong>StatefulSet vs Deployment matters less than the PVC.</strong> I spent time deliberating over this choice. What actually matters is the PVC binding: models must persist across restarts. A StatefulSet makes that binding explicit; a Deployment with a manually bound PVC would also work. StatefulSet is cleaner.</p>
<p><strong>Ingress address shows <code>127.0.0.1</code> — that's normal.</strong> MicroK8s reports the ingress ADDRESS as <code>127.0.0.1</code> rather than the MetalLB IP. This confused me initially. The actual external IP lives on the LoadBalancer service in the <code>ingress</code> namespace, not on the ingress objects themselves.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="whats-next">What's Next<a href="https://jigsawflux.org/blog/local-llm-kubernetes-home-lab#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next">​</a></h2>
<p>The infrastructure is stable. The next steps are on the application side:</p>
<ul>
<li><strong>Streaming responses</strong> — the MCP app waits for the full model completion before showing anything; streaming would feel much more interactive</li>
<li><strong>Containerise the app</strong> — the Node.js backend and MCP server still run locally on the dev machine; packaging them as containers and deploying to the same cluster would make the whole system self-contained</li>
<li><strong>AMD GPU support</strong> — ROCm-based Ollama on the Vega M is worth exploring; a 4 GB HBM2 GPU could significantly improve throughput even at reduced model precision</li>
</ul>
<p><em>The full source, including all manifests, is on <a href="https://github.com/JigsawFlux/ollama-mcp-starter" target="_blank" rel="noopener noreferrer">GitHub</a>. Feedback and contributions welcome.</em></p>]]></content>
        <author>
            <name>Suresh Thomas</name>
            <uri>https://github.com/st185229</uri>
        </author>
        <category label="local-llm" term="local-llm"/>
        <category label="ollama" term="ollama"/>
        <category label="kubernetes" term="kubernetes"/>
        <category label="microk8s" term="microk8s"/>
        <category label="open-source" term="open-source"/>
        <category label="home-lab" term="home-lab"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Claude™, Copilot™ & Gemini™ for Architects — A 2026 Field Guide]]></title>
        <id>https://jigsawflux.org/blog/ai-for-architects-beyond-code</id>
        <link href="https://jigsawflux.org/blog/ai-for-architects-beyond-code"/>
        <updated>2026-05-20T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[AI Tools for Architects: Beyond Code Generation]]></summary>
        <content type="html"><![CDATA[<h2 class="anchor anchorWithStickyNavbar_LWe7" id="ai-tools-for-architects-beyond-code-generation">AI Tools for Architects: Beyond Code Generation<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#ai-tools-for-architects-beyond-code-generation" class="hash-link" aria-label="Direct link to AI Tools for Architects: Beyond Code Generation" title="Direct link to AI Tools for Architects: Beyond Code Generation">​</a></h2>
<blockquote>
<p><strong>Practical use cases for Solution, Infrastructure &amp; Enterprise Architects</strong>
Featuring <strong>Claude™</strong>, <strong>GitHub Copilot™</strong>, and <strong>Gemini™ + Antigravity™</strong></p>
</blockquote>
<p>The rapid explosion of LLMs and AI tools to assist software engineers and architects is mind-boggling. Selecting the right tools for each use case can be a daunting task. I have been experimenting with three superpowers: Claude Code™, GitHub Copilot™ + VS Code, and Antigravity™ + Gemini™. Variations such as using Cursor with Claude™ can easily be derived from this.</p>
<p>This article also shares a <a href="https://github.com/JigsawFlux/kafka-optimize-public-transport" target="_blank" rel="noopener noreferrer">GitHub</a> repo that demonstrates some of the architecture use cases.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="table-of-contents">Table of Contents<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#table-of-contents" class="hash-link" aria-label="Direct link to Table of Contents" title="Direct link to Table of Contents">​</a></h2>
<table><thead><tr><th>#</th><th>Section</th><th>Topic</th></tr></thead><tbody><tr><td>1</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#1-three-superpowers">Three Superpowers</a></td><td>AI Tools for Architects — Beyond Code Generation</td></tr><tr><td>2</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#2-what-changed-the-20252026-leap">What Changed</a></td><td>The 2025–2026 Leap</td></tr><tr><td>3</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#3-three-tools-three-architectural-superpowers">Three Superpowers</a></td><td>Claude · Copilot · Gemini + Antigravity</td></tr><tr><td>4</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#4-architectural-use-cases-at-a-glance-2026">Use Case Matrix</a></td><td>Architect use cases rated across all three tools</td></tr><tr><td>5</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#5-case-study-claude--architecture-reasoner">Case Study: Claude</a></td><td>Reverse-engineer architecture → ADR</td></tr><tr><td>6</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#6-case-study-2--github-copilot--agentic-devops-partner">Case Study: Copilot</a></td><td>IaC scaffold + security scan + coding agent</td></tr><tr><td>7</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#7-case-study-3--gemini--antigravity--agent-first-cloud-architect">Case Study: Gemini</a></td><td>Upload diagram → component spec + GCP mapping</td></tr><tr><td>8</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#8-elevated-use-cases-docs-security--togaf">Elevated Use Cases</a></td><td>ADRs, Security, TOGAF, POC/MVP</td></tr><tr><td>9</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#9-the-ai-augmented-architecture-workflow-2026">Workflow</a></td><td>Discover → Design → Build → Validate</td></tr><tr><td>10</td><td><a 
href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#10-case-study-output">Case Study Output</a></td><td>ADRs, Architecture &amp; Trade-off docs on GitHub</td></tr><tr><td>11</td><td><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#11-start-today">Start Today</a></td><td>This Week / This Month / This Quarter actions</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="1-three-superpowers">1. Three Superpowers<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#1-three-superpowers" class="hash-link" aria-label="Direct link to 1. Three Superpowers" title="Direct link to 1. Three Superpowers">​</a></h2>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>From Generators to Agents</summary><div><div class="collapsibleContent_i85q"><p>All three tools — Claude, GitHub Copilot, and Google Gemini (including Google's new Antigravity IDE) — have evolved massively in the past year. They're no longer just code generators. They now have agentic capabilities, long context windows, multimodal understanding, and integrated security scanning. This talk shows how architects can leverage each tool's unique strengths for elevated use cases: reverse-engineering architecture, generating TOGAF artefacts, security analysis, and building POC/SPIKEs. Note: Antigravity is Google's agent-first IDE (VS Code fork) released in Nov 2025 alongside Gemini 3 — it's the IDE where Gemini truly comes alive.</p></div></div></details>
<p><img decoding="async" loading="lazy" alt="Slide 1 — Title" src="https://jigsawflux.org/blog/assets/images/slide3-01-3ffc3573f5392f60a3e720762f0c71cc.jpg" width="1300" height="732" class="img_ev3q"></p>
<p><strong>AI Tools for Architects — Beyond Code Generation</strong></p>
<p>ARCHITECTURE × ARTIFICIAL INTELLIGENCE</p>
<p>Three tools, differentiated by strength: <strong>Claude™</strong> · <strong>GitHub Copilot™</strong> · <strong>Gemini™ + Antigravity™</strong></p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="2-what-changed-the-20252026-leap">2. What Changed: The 2025–2026 Leap<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#2-what-changed-the-20252026-leap" class="hash-link" aria-label="Direct link to 2. What Changed: The 2025–2026 Leap" title="Direct link to 2. What Changed: The 2025–2026 Leap">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 2 — What Changed" src="https://jigsawflux.org/blog/assets/images/slide3-02-9530293ca0821fd4281f555fdbdb6b54.jpg" width="1300" height="732" class="img_ev3q"></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="agentic-capabilities">Agentic Capabilities<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#agentic-capabilities" class="hash-link" aria-label="Direct link to Agentic Capabilities" title="Direct link to Agentic Capabilities">​</a></h3>
<p>All three tools now operate as agents — not just suggesting code, but autonomously planning, executing multi-step tasks, and iterating on their own output. Copilot has a coding agent that creates PRs from issues. Claude Code runs shell commands and edits files. Gemini Code Assist has full agent mode.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="massive-context-windows">Massive Context Windows<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#massive-context-windows" class="hash-link" aria-label="Direct link to Massive Context Windows" title="Direct link to Massive Context Windows">​</a></h3>
<p>Claude supports 200K+ tokens. Gemini offers 1M+ tokens. This means entire codebases, not just files, can be analysed in a single session — transforming architecture understanding from guesswork to comprehensive analysis.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="built-in-security--review">Built-in Security &amp; Review<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#built-in-security--review" class="hash-link" aria-label="Direct link to Built-in Security &amp; Review" title="Direct link to Built-in Security &amp; Review">​</a></h3>
<p>Copilot now runs code scanning, secret scanning, and dependency checks inside its agent workflow. Gemini does automated PR reviews on GitHub. Claude's multi-agent system can reason about threat models holistically.</p>
<blockquote>
<p><em>These capabilities move AI tools from 'coding assistants' to 'architecture collaborators'.</em></p>
</blockquote>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>Transformative changes in agentic mode</summary><div><div class="collapsibleContent_i85q"><p>Key message: since late 2025, these are fundamentally different tools than what most architects tried a year ago. The context window expansion is particularly transformative — Gemini's 1M tokens means you can feed an entire microservices codebase. Claude's 200K tokens handles most real-world repos. And agentic mode means the tool doesn't just suggest — it plans, executes, and self-corrects. For architects, this means tasks like 'understand this legacy system' or 'generate a threat model' are now genuinely viable.</p></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="3-three-tools-three-architectural-superpowers">3. Three Tools, Three Architectural Superpowers<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#3-three-tools-three-architectural-superpowers" class="hash-link" aria-label="Direct link to 3. Three Tools, Three Architectural Superpowers" title="Direct link to 3. Three Tools, Three Architectural Superpowers">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 3 — Three Superpowers" src="https://jigsawflux.org/blog/assets/images/slide3-03-3ee930fce3f346349ba1a07944e6353c.jpg" width="1300" height="732" class="img_ev3q"></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="claude--the-architecture-reasoner">Claude™ — <em>The Architecture Reasoner</em><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#claude--the-architecture-reasoner" class="hash-link" aria-label="Direct link to claude--the-architecture-reasoner" title="Direct link to claude--the-architecture-reasoner">​</a></h3>
<ul>
<li>Deep reasoning + 200K context</li>
<li>Claude Code: agentic CLI (shell + files)</li>
<li>Multi-agent: sub-agents for parallel tasks</li>
<li>TOGAF artefacts, ADRs, trade-off analysis</li>
<li>Projects for persistent architecture context</li>
</ul>
<p><strong>Sample use case:</strong> Reverse-engineer codebase → Mermaid diagram + ADR</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="github-copilot--the-agentic-devops-partner">GitHub Copilot™ — <em>The Agentic DevOps Partner</em><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#github-copilot--the-agentic-devops-partner" class="hash-link" aria-label="Direct link to github-copilot--the-agentic-devops-partner" title="Direct link to github-copilot--the-agentic-devops-partner">​</a></h3>
<ul>
<li>Coding agent: assign issues → auto-PRs</li>
<li>Agent mode: multi-file edits + self-healing</li>
<li>Built-in security scanning (code, secrets, deps)</li>
<li>Multi-model: GPT, Claude, Gemini inside Copilot</li>
<li>MCP integration for external tools + context</li>
</ul>
<p><strong>Sample use case:</strong> IaC scaffold + security scan + code explanation</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="gemini--antigravity--the-agent-first-cloud-architect">Gemini™ + Antigravity™ — <em>The Agent-First Cloud Architect</em><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#gemini--antigravity--the-agent-first-cloud-architect" class="hash-link" aria-label="Direct link to gemini--antigravity--the-agent-first-cloud-architect" title="Direct link to gemini--antigravity--the-agent-first-cloud-architect">​</a></h3>
<ul>
<li>1M+ token context (entire codebase analysis)</li>
<li>Antigravity IDE: up to 5 parallel agents</li>
<li>Native multimodal: diagrams + code + docs</li>
<li>Built-in Chrome browser for visual verification</li>
<li>Gemini 3.1 Pro: frontier reasoning + coding</li>
</ul>
<p><strong>Sample use case:</strong> Upload diagram → component spec + GCP mapping</p>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>Recent Updates — 2026</summary><div><div class="collapsibleContent_i85q"><p>Updated differentiation for 2026:</p><p><strong>CLAUDE:</strong> No longer just a chat window. Claude Code is a full agentic CLI that runs in your terminal, executes commands, edits files, and uses a multi-agent architecture. It's strongest for deep architectural reasoning — understanding large codebases, generating structured documentation, and trade-off analysis. Projects allow persistent context across sessions.</p><p><strong>COPILOT:</strong> Now has a full coding agent that can be assigned GitHub issues and autonomously creates PRs. Agent mode does multi-file edits with self-healing. Built-in security scanning (code scanning, secret scanning, dependency checks) runs automatically. Multi-model support means you can use Claude or Gemini models inside Copilot.</p><p><strong>GEMINI + ANTIGRAVITY:</strong> 1M+ token context is the headline differentiator for architecture work. Gemini Code Assist now has agent mode in the IDE plus automated PR reviews on GitHub. Native multimodality means you can upload whiteboard photos, Visio diagrams, and architecture screenshots. Gemini 3.1 Pro offers frontier-class reasoning. Antigravity is Google's agent-first IDE — a VS Code fork with a Manager View for up to 5 parallel agents and a built-in Chrome browser for visual verification.</p></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="4-architectural-use-cases-at-a-glance-2026">4. Architectural Use Cases at a Glance (2026)<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#4-architectural-use-cases-at-a-glance-2026" class="hash-link" aria-label="Direct link to 4. Architectural Use Cases at a Glance (2026)" title="Direct link to 4. Architectural Use Cases at a Glance (2026)">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 4 — Use Case Matrix" src="https://jigsawflux.org/blog/assets/images/slide3-04-8845538af181ff5f5aed7207cffbd849.jpg" width="1300" height="732" class="img_ev3q"></p>
<table><thead><tr><th>Architect Use Case</th><th>Claude</th><th>Copilot</th><th>Gemini + Antigravity</th></tr></thead><tbody><tr><td><strong>Reverse-engineer architecture</strong></td><td>★★★ Deep reasoning + diagrams</td><td>★★ Agent explores repo</td><td>★★★ 1M context + multimodal</td></tr><tr><td><strong>Generate TOGAF artefacts</strong></td><td>★★★ ADRs, BDAT layers, ArchiMate</td><td>★★ Via agent + custom prompts</td><td>★★ Doc generation + search</td></tr><tr><td><strong>IaC scaffolding &amp; review</strong></td><td>★★ Claude Code in terminal</td><td>★★★ Native IDE + agent mode</td><td>★★★ GCP-native + agent mode</td></tr><tr><td><strong>Security &amp; threat modelling</strong></td><td>★★★ STRIDE threat models</td><td>★★★ Auto scan: code+secrets+deps</td><td>★★ PR review + policy checks</td></tr><tr><td><strong>POC/MVP rapid prototyping</strong></td><td>★★★ Multi-agent builds + tests</td><td>★★★ Assign issue → auto PR</td><td>★★★ Agent mode + GCP deploy</td></tr><tr><td><strong>Trade-off documentation</strong></td><td>★★★ Structured analysis + ADRs</td><td>★★ Chat-based comparison</td><td>★★ Research-grounded with search</td></tr></tbody></table>
<blockquote>
<p><em>Key insight: the tools have converged in many areas. The differentiator is now where each tool lives — your terminal, your IDE, or your browser.</em></p>
</blockquote>
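<p>The matrix above can also be treated as data, which makes it easy to re-rank the tools for your own priorities. The sketch below copies the star ratings from the table; the function names and the idea of per-use-case weights are illustrative assumptions, not part of the original slide.</p>

```python
# Star ratings copied from the use case matrix (3 = strongest fit).
RATINGS = {
    "Reverse-engineer architecture": {"Claude": 3, "Copilot": 2, "Gemini + Antigravity": 3},
    "Generate TOGAF artefacts": {"Claude": 3, "Copilot": 2, "Gemini + Antigravity": 2},
    "IaC scaffolding and review": {"Claude": 2, "Copilot": 3, "Gemini + Antigravity": 3},
    "Security and threat modelling": {"Claude": 3, "Copilot": 3, "Gemini + Antigravity": 2},
    "POC/MVP rapid prototyping": {"Claude": 3, "Copilot": 3, "Gemini + Antigravity": 3},
    "Trade-off documentation": {"Claude": 3, "Copilot": 2, "Gemini + Antigravity": 2},
}


def best_tools(use_case: str) -> list[str]:
    """Return every tool that shares the top rating for one use case."""
    scores = RATINGS[use_case]
    top = max(scores.values())
    return [tool for tool, stars in scores.items() if stars == top]


def weighted_totals(weights: dict[str, int]) -> dict[str, int]:
    """Sum ratings per tool; use cases missing from weights count once."""
    totals: dict[str, int] = {}
    for use_case, scores in RATINGS.items():
        weight = weights.get(use_case, 1)
        for tool, stars in scores.items():
            totals[tool] = totals.get(tool, 0) + weight * stars
    return totals
```

<p>With no weights the totals come out close, which matches the point that the tools have converged. Weighting one or two use cases heavily is what separates them.</p>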
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>Tool Integration and IDE Evolution</summary><div><div class="collapsibleContent_i85q"><p>Updated matrix reflects 2026 reality: the tools have converged significantly. All three now have agent modes. All can do code explanation and security analysis. The real differentiators are: WHERE the tool lives (terminal/CLI for Claude Code, IDE for Copilot, browser+IDE+cloud for Gemini), CONTEXT SIZE (Gemini wins with 1M tokens), REASONING DEPTH (Claude wins for structured architectural analysis), and WORKFLOW INTEGRATION (Copilot wins — it's embedded in GitHub issues, PRs, and CI/CD).</p></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="5-case-study-claude--architecture-reasoner">5. Case Study 1 · Claude · Architecture Reasoner<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#5-case-study-claude--architecture-reasoner" class="hash-link" aria-label="Direct link to 5. Case Study 1 · Claude · Architecture Reasoner" title="Direct link to 5. Case Study 1 · Claude · Architecture Reasoner">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 5 — Claude Demo" src="https://jigsawflux.org/blog/assets/images/slide3-05-e5c082941d490165d16d23731c97fc1d.jpg" width="1300" height="732" class="img_ev3q"></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="what-is-done">What is done<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#what-is-done" class="hash-link" aria-label="Direct link to What is done" title="Direct link to What is done">​</a></h3>
<ol>
<li>Upload a microservices codebase to a Claude Project (persistent context across sessions)</li>
<li>Ask: <em>"Reverse-engineer the architecture — identify services, dependencies, data flows, and concerns"</em></li>
<li>Claude produces a structured architecture narrative + Mermaid C4 diagram</li>
<li>Follow-up: <em>"Generate an ADR for replacing sync REST with async event-driven messaging"</em></li>
<li>Result: Full MADR-format ADR with trade-offs, risks, and migration path</li>
</ol>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="prompts">Prompts<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#prompts" class="hash-link" aria-label="Direct link to Prompts" title="Direct link to Prompts">​</a></h3>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">You are a senior solution architect.</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">Analyse this codebase and:</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">1. Identify all services and responsibilities</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">2. Map the dependency graph</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">3. Highlight architectural concerns</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">   (coupling, resilience, scalability)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">4. Produce a Mermaid C4 context diagram</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">5. Suggest 3 improvement priorities</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">[PASTE CODE or use Project context]</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">Then generate an ADR (MADR format) for</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">migrating to async event-driven messaging.</span><br></span></code></pre></div></div>
<p><strong>Output:</strong> Architecture narrative · C4 Mermaid diagram · MADR ADR · Improvement priorities</p>
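<p>To review the ADR that comes back, it helps to know what a MADR-format record is expected to contain. The sketch below renders a bare skeleton as Markdown. The section headings follow the public MADR template as we understand it; the <code>Option</code> class, the <code>render_adr</code> function, and its parameters are hypothetical names for illustration, not part of MADR or of any Claude API.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    """One considered option with its arguments for and against."""
    name: str
    pros: list[str] = field(default_factory=list)
    cons: list[str] = field(default_factory=list)


def render_adr(title: str, context: str, options: list[Option],
               chosen: str, rationale: str) -> str:
    """Render a minimal MADR-style ADR as a Markdown string."""
    if chosen not in [option.name for option in options]:
        raise ValueError(f"chosen option {chosen!r} is not among the considered options")
    lines = [
        f"# {title}", "",
        "## Context and Problem Statement", "", context, "",
        "## Considered Options", "",
    ]
    lines += [f"* {option.name}" for option in options]
    lines += [
        "", "## Decision Outcome", "",
        f'Chosen option: "{chosen}", because {rationale}',
        "", "## Pros and Cons of the Options",
    ]
    for option in options:
        lines += ["", f"### {option.name}", ""]
        lines += [f"* Good, because {pro}" for pro in option.pros]
        lines += [f"* Bad, because {con}" for con in option.cons]
    return "\n".join(lines) + "\n"
```

<p>A skeleton like this doubles as a review checklist: if the generated ADR lacks considered options or a stated rationale, ask for them before accepting it.</p>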
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>Claude Case Study</summary><div><div class="collapsibleContent_i85q"><p><strong>1 — Claude</strong> (claude.ai or Claude Code CLI).</p><p><strong>Study A (Web):</strong> Create a Claude Project, upload the codebase as context, then prompt.
<strong>Study B (CLI):</strong> Use Claude Code in your terminal — it can read files, run commands, and iterate.</p><p>Key observations:</p><ul>
<li>Claude's 200K token context handles most real-world repos in one session</li>
<li>Projects provide persistent context — come back tomorrow and it still knows your codebase</li>
<li>Multi-agent architecture means Claude spawns sub-agents for parallel analysis</li>
<li>Claude excels at structured reasoning: ADRs, trade-off docs, TOGAF artefacts</li>
<li>Claude Code adds agentic capabilities: it can read your repo, run tests, and generate diagrams</li>
</ul><p><strong>IMPORTANT:</strong> This has been a 'thinking partner' study — Claude's strength is deep architectural reasoning, not just code generation.</p></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="6-case-study-2--github-copilot--agentic-devops-partner">6. Case Study 2 · GitHub Copilot · Agentic DevOps Partner<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#6-case-study-2--github-copilot--agentic-devops-partner" class="hash-link" aria-label="Direct link to 6. Case Study 2 · GitHub Copilot · Agentic DevOps Partner" title="Direct link to 6. Case Study 2 · GitHub Copilot · Agentic DevOps Partner">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 6 — Copilot Demo" src="https://jigsawflux.org/blog/assets/images/slide3-06-d853d022a0838c0db8948f674e8f4b54.jpg" width="1300" height="732" class="img_ev3q"></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="scenario-a-agent-mode--iac--security">Scenario A: Agent Mode — IaC + Security<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#scenario-a-agent-mode--iac--security" class="hash-link" aria-label="Direct link to Scenario A: Agent Mode — IaC + Security" title="Direct link to Scenario A: Agent Mode — IaC + Security">​</a></h3>
<p><em>For Infrastructure / Solution Architects</em></p>
<ol>
<li>Open VS Code → Agent mode in Copilot Edits</li>
<li>Prompt: <em>"Create AKS cluster with private networking, AAD, and RBAC in Terraform"</em></li>
<li>Agent scaffolds multi-file Terraform config with self-healing</li>
<li>Ask Chat: <em>"Scan for security vulnerabilities"</em> — built-in code + secret + dependency scan runs automatically</li>
</ol>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="scenario-b-coding-agent--issue-to-pr">Scenario B: Coding Agent — Issue to PR<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#scenario-b-coding-agent--issue-to-pr" class="hash-link" aria-label="Direct link to Scenario B: Coding Agent — Issue to PR" title="Direct link to Scenario B: Coding Agent — Issue to PR">​</a></h3>
<p><em>For Enterprise / Domain Architects</em></p>
<ol>
<li>In GitHub, assign an issue to Copilot: <em>"Add OpenTelemetry tracing to order service"</em></li>
<li>Coding agent researches the repo, plans implementation, creates branch</li>
<li>Agent self-reviews with Copilot Code Review before tagging you</li>
<li>Review the PR — security scanning already passed, iterate via PR comments</li>
</ol>
<blockquote>
<p><em>2026 update: Copilot now supports multi-model (GPT, Claude, Gemini), MCP integration, and runs in VS Code, JetBrains, Eclipse, and Xcode.</em></p>
</blockquote>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>GitHub Copilot Scenarios</summary><div><div class="collapsibleContent_i85q"><p><strong>GitHub Copilot</strong> in VS Code + GitHub.</p><p><strong>Scenario A</strong> shows the updated Agent Mode (not the old chat). In agent mode, Copilot autonomously plans, edits multiple files, runs terminal commands, and self-heals errors. The security scanning is the big update — code scanning, secret scanning, and dependency vulnerability checks now run automatically inside the agent workflow.</p><p><strong>Scenario B</strong> shows the Coding Agent — this is the async, background agent announced May 2025. You assign a GitHub issue to Copilot, and it autonomously researches the repo, creates a plan, implements across multiple files, runs its own code review, runs security scans, and creates a PR. You review when it's done.</p><p>Key 2026 updates to mention:</p><ul>
<li>Multi-model support: you can now use Claude Opus 4.7 or Gemini inside Copilot</li>
<li>MCP integration: connect external tools (Jira, Slack, databases) to the agent</li>
<li>Agentic code review: reviews now gather full project context before suggesting changes</li>
<li>Available in VS Code, JetBrains, Eclipse, Xcode</li>
<li>Coding agent runs on GitHub Actions infrastructure</li>
</ul></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="7-case-study-3--gemini--antigravity--agent-first-cloud-architect">7. Case Study 3 · Gemini + Antigravity · Agent-First Cloud Architect<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#7-case-study-3--gemini--antigravity--agent-first-cloud-architect" class="hash-link" aria-label="Direct link to 7. Case Study 3 · Gemini + Antigravity · Agent-First Cloud Architect" title="Direct link to 7. Case Study 3 · Gemini + Antigravity · Agent-First Cloud Architect">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 7 — Gemini + Antigravity Demo" src="https://jigsawflux.org/blog/assets/images/slide3-07-349a57bc95bb3ae60e7117f1c6011467.jpg" width="1300" height="732" class="img_ev3q"></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="what-is-done-1">What is done<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#what-is-done-1" class="hash-link" aria-label="Direct link to What is done" title="Direct link to What is done">​</a></h3>
<ol>
<li>Upload a whiteboard/Visio architecture diagram to Gemini (browser or Antigravity IDE)</li>
<li>Ask: <em>"Identify components, integration patterns, and data flows from this diagram"</em></li>
<li>Follow-up: <em>"What concerns do you see? Generate a component spec."</em></li>
<li>In Antigravity: spawn parallel agents — one maps to GCP services, another writes IaC</li>
<li>Bonus: paste the codebase alongside the diagram; the 1M+ token context lets Gemini cross-reference both</li>
</ol>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="antigravity--gemini-why-this-changes-architecture">Antigravity + Gemini: Why This Changes Architecture<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#antigravity--gemini-why-this-changes-architecture" class="hash-link" aria-label="Direct link to Antigravity + Gemini: Why This Changes Architecture" title="Direct link to Antigravity + Gemini: Why This Changes Architecture">​</a></h3>
<ul>
<li>🚀 <strong>Antigravity:</strong> agent-first IDE (VS Code fork) with Manager View for up to 5 parallel agents</li>
<li>🖼️ <strong>Built-in Chrome browser:</strong> agents verify UI changes visually — no context-switch</li>
<li>📋 <strong>1M+ tokens:</strong> analyse entire repos + diagrams + docs in a single session</li>
<li>☁️ <strong>Native GCP integration:</strong> deploy, diagnose K8s, review Cloud Console from the IDE</li>
</ul>
<blockquote>
<p><em>Antigravity: free public preview. VS Code fork — your extensions carry over. Powered by Gemini 3.1 Pro + supports Claude Sonnet/Opus.</em></p>
</blockquote>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>Gemini &amp; Antigravity</summary><div><div class="collapsibleContent_i85q"><p><strong>DEMO 3 — Gemini + Antigravity IDE.</strong></p><p>Antigravity is Google's agent-first IDE, released Nov 2025 alongside Gemini 3. It's a VS Code fork, so all your extensions, keybindings, and themes carry over. But the value is the layer on top:</p><ul>
<li><strong>Manager View:</strong> spawn up to 5 parallel agents working on different tasks simultaneously</li>
<li><strong>Built-in Chrome browser:</strong> agents can navigate to localhost, interact with UI, take screenshots to verify their own work</li>
<li><strong>Multi-model:</strong> Gemini 3.1 Pro (default), Gemini 3 Flash, Claude Sonnet 4.6, Claude Opus 4.6, GPT-OSS-120B</li>
<li><strong>Artifacts:</strong> every agent produces structured outputs — plans, diffs, screenshots, test results</li>
<li>Free in public preview as of May 2026</li>
</ul><p>For the demo: open Antigravity, upload an architecture diagram to Gemini, show the multimodal analysis. Then show the Manager View — spawn one agent to write the component spec, another to generate Terraform for GCP. Show the built-in browser verifying a deployed service.</p><p>Key observations for architects: Antigravity shifts your role from 'writer of code' to 'mission controller'. You direct agents, review their artifacts, and focus on architectural decisions. This is the closest any tool comes to the architect-as-orchestrator paradigm.</p><p>Note on Gemini ecosystem: Antigravity (IDE), Gemini Code Assist (IDE extension for VS Code/JetBrains), Gemini CLI (terminal tool), Gemini web app (browser). Each serves a different workflow.</p></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="8-elevated-use-cases-docs-security--togaf">8. Elevated Use Cases: Docs, Security &amp; TOGAF<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#8-elevated-use-cases-docs-security--togaf" class="hash-link" aria-label="Direct link to 8. Elevated Use Cases: Docs, Security &amp; TOGAF" title="Direct link to 8. Elevated Use Cases: Docs, Security &amp; TOGAF">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 8 — Elevated Use Cases" src="https://jigsawflux.org/blog/assets/images/slide3-08-41465a11b4b0070e983922847f9f417c.jpg" width="1300" height="732" class="img_ev3q"></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="architecture-decision-records--claude-">Architecture Decision Records — <code>Claude (★★★)</code><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#architecture-decision-records--claude-" class="hash-link" aria-label="Direct link to architecture-decision-records--claude-" title="Direct link to architecture-decision-records--claude-">​</a></h3>
<p>Describe the problem, constraints, and options. Claude outputs a MADR/RFC-format ADR with context, options, pros/cons, and rationale. Use Projects to maintain architectural context across multiple ADRs.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="security-scanning--threat-modelling--copilot--claude">Security: Scanning + Threat Modelling — <code>Copilot + Claude</code><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#security-scanning--threat-modelling--copilot--claude" class="hash-link" aria-label="Direct link to security-scanning--threat-modelling--copilot--claude" title="Direct link to security-scanning--threat-modelling--copilot--claude">​</a></h3>
<p>Copilot's agent now auto-runs code scanning, secret detection, and dependency checks. Claude generates STRIDE threat models from architecture descriptions. Gemini reviews PRs for policy violations.</p>
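<p>For readers new to STRIDE, the sketch below shows the shape of the checklist a threat model walks through: six threat categories applied to each component. The category names are the standard STRIDE expansion; the prompt questions and the <code>stride_checklist</code> function are our own illustrative wording, not output from Claude or any tool.</p>

```python
# The six STRIDE categories, each with an illustrative review question.
STRIDE = {
    "Spoofing": "Can an attacker impersonate a user or service?",
    "Tampering": "Can data be modified in transit or at rest?",
    "Repudiation": "Can an action be denied for lack of audit evidence?",
    "Information disclosure": "Can data leak to an unauthorised party?",
    "Denial of service": "Can the component be made unavailable?",
    "Elevation of privilege": "Can a caller gain rights it should not have?",
}


def stride_checklist(components: list[str]) -> list[tuple[str, str, str]]:
    """Return one (component, category, question) row per pairing."""
    return [
        (component, category, question)
        for component in components
        for category, question in STRIDE.items()
    ]
```

<p>Passing the component list from an architecture description, for example <code>stride_checklist(["order service", "Kafka broker"])</code>, yields twelve rows to work through. The value of an AI-generated threat model is in how well it answers each row, so a human reviewer should still check coverage against a grid like this.</p>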
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="togaf-businessdataapptech-layers--claude--gemini">TOGAF Business/Data/App/Tech Layers — <code>Claude + Gemini</code><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#togaf-businessdataapptech-layers--claude--gemini" class="hash-link" aria-label="Direct link to togaf-businessdataapptech-layers--claude--gemini" title="Direct link to togaf-businessdataapptech-layers--claude--gemini">​</a></h3>
<p>Claude reasons about ArchiMate-aligned artefacts across all four TOGAF layers with mapping tables. Gemini adds value by processing existing architecture diagrams as visual input alongside requirements.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="pocspikemvp-rapid-build--all-three-">POC/SPIKE/MVP Rapid Build — <code>All three (★★★)</code><a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#pocspikemvp-rapid-build--all-three-" class="hash-link" aria-label="Direct link to pocspikemvp-rapid-build--all-three-" title="Direct link to pocspikemvp-rapid-build--all-three-">​</a></h3>
<p>Copilot's coding agent builds features from issues. Claude Code scaffolds and tests full projects in your terminal. Gemini's agent mode builds inside the IDE. Assign the boring parts to AI, focus on design.</p>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>ADR, Security, TOGAF &amp; MVP</summary><div><div class="collapsibleContent_i85q"><p>These four tiles represent the highest-value use cases beyond coding.</p><p><strong>ADRs:</strong> Claude remains the strongest for structured reasoning. With Projects, you can maintain persistent architecture context — feed in your tech radar, principles, and existing ADRs, then generate new ones in that context.</p><p><strong>Security:</strong> The big 2026 change is Copilot's built-in scanning. The coding agent auto-runs code scanning, secret scanning, and dependency vulnerability checks before even opening a PR. Combine with Claude for holistic STRIDE threat modelling.</p><p><strong>TOGAF:</strong> Claude for reasoning, Gemini for multimodal input (existing diagrams + requirements docs).</p><p><strong>POC/MVP:</strong> All three tools now have genuine agentic build capabilities. Copilot's coding agent is particularly strong here — assign an issue, come back to a PR.</p></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="9-the-ai-augmented-architecture-workflow-2026">9. The AI-Augmented Architecture Workflow (2026)<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#9-the-ai-augmented-architecture-workflow-2026" class="hash-link" aria-label="Direct link to 9. The AI-Augmented Architecture Workflow (2026)" title="Direct link to 9. The AI-Augmented Architecture Workflow (2026)">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 9 — Workflow" src="https://jigsawflux.org/blog/assets/images/slide3-09-bd65909a3da576871c63b061a51d2e2c.jpg" width="1300" height="732" class="img_ev3q"></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="1️⃣-discover">1️⃣ Discover<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#1%EF%B8%8F%E2%83%A3-discover" class="hash-link" aria-label="Direct link to 1️⃣ Discover" title="Direct link to 1️⃣ Discover">​</a></h3>
<p>Upload code + diagrams to Claude (Projects) or Gemini/Antigravity (1M context, parallel agents). Get architecture narrative and concern flags.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="2️⃣-design">2️⃣ Design<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#2%EF%B8%8F%E2%83%A3-design" class="hash-link" aria-label="Direct link to 2️⃣ Design" title="Direct link to 2️⃣ Design">​</a></h3>
<p>Generate options + ADRs with Claude. Scaffold IaC with Copilot Agent. Cross-reference diagrams in Gemini.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="3️⃣-build">3️⃣ Build<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#3%EF%B8%8F%E2%83%A3-build" class="hash-link" aria-label="Direct link to 3️⃣ Build" title="Direct link to 3️⃣ Build">​</a></h3>
<p>Assign issues to Copilot Coding Agent. Use Claude Code for terminal-based builds. Antigravity: parallel agents for multi-task builds.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="4️⃣-validate">4️⃣ Validate<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#4%EF%B8%8F%E2%83%A3-validate" class="hash-link" aria-label="Direct link to 4️⃣ Validate" title="Direct link to 4️⃣ Validate">​</a></h3>
<p>Copilot auto-scans for security. Claude generates threat models. Gemini reviews PRs. All before human review.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="guiding-principles">Guiding Principles<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#guiding-principles" class="hash-link" aria-label="Direct link to Guiding Principles" title="Direct link to Guiding Principles">​</a></h3>
<ul>
<li>AI augments judgment, it doesn't replace it — always validate outputs against your architectural principles</li>
<li>Combine tools: Gemini for intake, Claude for reasoning, Copilot for implementation and security</li>
<li>Treat AI output as a first draft. Review like you'd review a junior architect's work — verify, refine, approve.</li>
</ul>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>DDBV cycle — Discover, Design, Build, and Validate</summary><div><div class="collapsibleContent_i85q"><p>The workflow has evolved from 'copy-paste into chat' to a genuine multi-tool architecture workflow.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="discover">Discover<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#discover" class="hash-link" aria-label="Direct link to Discover" title="Direct link to Discover">​</a></h4><p>Claude Projects maintain context across sessions — upload your codebase once, query it for weeks. Gemini's 1M tokens let you feed an entire repo + existing docs.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="design">Design<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#design" class="hash-link" aria-label="Direct link to Design" title="Direct link to Design">​</a></h4><p>Claude generates structured ADRs and trade-off docs. Copilot agent mode scaffolds IaC with self-healing.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="build">Build<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#build" class="hash-link" aria-label="Direct link to Build" title="Direct link to Build">​</a></h4><p>The biggest 2026 change. Copilot's coding agent takes GitHub issues and autonomously creates PRs. Claude Code runs in your terminal as a full agentic tool. Gemini Code Assist has agent mode in the IDE.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="validate">Validate<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#validate" class="hash-link" aria-label="Direct link to Validate" title="Direct link to Validate">​</a></h4><p>Copilot now auto-runs security scanning (code, secrets, deps) in the agent workflow. This is genuinely new — security is shifting left into the AI agent itself.</p></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="10-case-study-output">10. Case Study Output<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#10-case-study-output" class="hash-link" aria-label="Direct link to 10. Case Study Output" title="Direct link to 10. Case Study Output">​</a></h2>
<p><em>The full source is on <a href="https://github.com/JigsawFlux/kafka-optimize-public-transport" target="_blank" rel="noopener noreferrer">GitHub</a>. See the docs folder. Contributions and feedback welcome.</em></p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="adrs">ADRs<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#adrs" class="hash-link" aria-label="Direct link to ADRs" title="Direct link to ADRs">​</a></h3>
<ul>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus">ADR - Event Bus</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry">ADR - Schema Registry</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres">ADR - Kafka Connect</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql">ADR - KSQL</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer">ADR - Producer</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard">ADR - Dashboard</a></li>
</ul>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="architecture-and-trade-off-analysis">Architecture and trade-off analysis<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#architecture-and-trade-off-analysis" class="hash-link" aria-label="Direct link to Architecture and trade-off analysis" title="Direct link to Architecture and trade-off analysis">​</a></h3>
<ul>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture">Architecture</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis">Trade-off Analysis</a></li>
</ul>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="11-start-today">11. Start Today<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#11-start-today" class="hash-link" aria-label="Direct link to 11. Start Today" title="Direct link to 11. Start Today">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Slide 10 — Call to Action" src="https://jigsawflux.org/blog/assets/images/slide3-10-6e03f7ae869a9c4128520d4d31fa7506.jpg" width="1300" height="732" class="img_ev3q"></p>
<blockquote>
<p><em>You don't need a new tool — you need a new reflex.</em></p>
</blockquote>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="this-week">This Week<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#this-week" class="hash-link" aria-label="Direct link to This Week" title="Direct link to This Week">​</a></h3>
<ul>
<li>→ Try Claude on one legacy codebase — ask it to explain the architecture and generate a Mermaid diagram</li>
<li>→ Use Copilot Agent Mode to scaffold a Terraform module with security review</li>
<li>→ Upload a whiteboard diagram to Gemini (or Antigravity) and ask it to generate a component spec</li>
</ul>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="this-month">This Month<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#this-month" class="hash-link" aria-label="Direct link to This Month" title="Direct link to This Month">​</a></h3>
<ul>
<li>→ Generate your first AI-assisted ADR for a real architectural decision using Claude Projects</li>
<li>→ Assign a low-complexity GitHub issue to Copilot's Coding Agent and review the PR it creates</li>
<li>→ Run a security review: Copilot scans code + Claude generates a STRIDE threat model</li>
</ul>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="this-quarter">This Quarter<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#this-quarter" class="hash-link" aria-label="Direct link to This Quarter" title="Direct link to This Quarter">​</a></h3>
<ul>
<li>→ Establish a team AI-augmented architecture workflow (Discover → Design → Build → Validate)</li>
<li>→ Document 3 trade-off analyses using AI as a first draft, then refine with team review</li>
<li>→ Share learnings — what worked, what hallucinated, what's not ready yet</li>
</ul>
<p><strong>Let's build something together.</strong></p>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>Concerns</summary><div><div class="collapsibleContent_i85q"><ul>
<li><strong>'Is our code safe?'</strong> — All three offer enterprise tiers with data protection. Don't paste proprietary code into free consumer tiers. Claude Enterprise, Copilot Enterprise, and Gemini Enterprise all have no-training guarantees.</li>
<li><strong>'What about hallucinations?'</strong> — Treat AI output like a junior colleague's work. Always validate. The agentic tools are getting better at self-correction (Copilot's coding agent runs its own code review + security scan before PRs).</li>
<li><strong>'Which model is inside Copilot?'</strong> — Multi-model now: GPT-4o (default), Claude, Gemini. You choose.</li>
<li><strong>'What is Antigravity?'</strong> — Google's agent-first IDE (VS Code fork), released Nov 2025. Free public preview. Runs Gemini 3.1 Pro by default but also supports Claude Sonnet/Opus 4.6. Up to 5 parallel agents. Built-in Chrome browser. Your VS Code extensions carry over.</li>
<li><strong>'What's the difference between Antigravity and Gemini Code Assist?'</strong> — Antigravity is a full standalone IDE (agent-first). Gemini Code Assist is an extension for existing IDEs (VS Code, JetBrains). Use Antigravity for greenfield/agent-heavy work, Code Assist for day-to-day coding in your existing setup.</li>
<li><strong>Costs:</strong> Claude Pro ~£20/mo, Copilot from £8/mo, Antigravity free in preview, Gemini Code Assist has a free tier.</li>
</ul></div></div></details>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="resources--links">Resources &amp; Links<a href="https://jigsawflux.org/blog/ai-for-architects-beyond-code#resources--links" class="hash-link" aria-label="Direct link to Resources &amp; Links" title="Direct link to Resources &amp; Links">​</a></h2>
<table><thead><tr><th>Tool</th><th>Link</th><th>Notes</th></tr></thead><tbody><tr><td>Claude™</td><td><a href="https://claude.ai/" target="_blank" rel="noopener noreferrer">claude.ai</a></td><td>Web app, Projects, 200K context</td></tr><tr><td>Claude Code™</td><td><a href="https://code.claude.com/" target="_blank" rel="noopener noreferrer">code.claude.com</a></td><td>Agentic CLI for terminal</td></tr><tr><td>GitHub Copilot™</td><td><a href="https://github.com/features/copilot" target="_blank" rel="noopener noreferrer">github.com/features/copilot</a></td><td>IDE + Coding Agent + Security</td></tr><tr><td>Google Antigravity™</td><td><a href="https://antigravity.google/" target="_blank" rel="noopener noreferrer">antigravity.google</a></td><td>Agent-first IDE (VS Code fork)</td></tr><tr><td>Gemini Code Assist™</td><td><a href="https://cloud.google.com/gemini/docs/codeassist/overview" target="_blank" rel="noopener noreferrer">cloud.google.com/gemini/docs/codeassist</a></td><td>IDE extension for VS Code/JetBrains</td></tr><tr><td>Gemini™</td><td><a href="https://gemini.google.com/" target="_blank" rel="noopener noreferrer">gemini.google.com</a></td><td>Web app, 1M+ context, multimodal</td></tr></tbody></table>
<hr>
<p><em>Created in May 2026. Content reflects tool capabilities as of this date — these tools evolve rapidly.</em></p>
<p><em>Trademark notice: Claude, Claude Code, GitHub Copilot, GitHub, Gemini, Gemini Code Assist, Antigravity, and related product names are trademarks of their respective owners.</em></p>]]></content>
        <author>
            <name>Suresh Thomas</name>
            <uri>https://github.com/st185229</uri>
        </author>
        <category label="gemini" term="gemini"/>
        <category label="architecture" term="architecture"/>
        <category label="claude" term="claude"/>
        <category label="copilot" term="copilot"/>
        <category label="ai-beyond-coding" term="ai-beyond-coding"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Architecture Document — CTA Public Transport Optimisation System]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Version: 1.0]]></summary>
        <content type="html"><![CDATA[<p><strong>Version:</strong> 1.0<br>
<strong>Date:</strong> 2026-03-12<br>
<strong>Status:</strong> Baselined<br>
<strong>Standard:</strong> 4+1 Architectural View Model (Kruchten, 1995)<br>
<strong>Notation:</strong> ArchiMate 3.1 concepts rendered as Mermaid diagrams</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="table-of-contents">Table of Contents<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#table-of-contents" class="hash-link" aria-label="Direct link to Table of Contents" title="Direct link to Table of Contents">​</a></h2>
<ol>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#1-document-purpose-and-scope">Document Purpose and Scope</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#2-architectural-drivers">Architectural Drivers</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#3-use-case-view-1">Use Case View (+1)</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#4-logical-view">Logical View</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#5-process-view">Process View</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#6-development-view">Development View</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#7-physical-view">Physical View</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#8-architectural-decisions-summary">Architectural Decisions Summary</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#9-risks-and-technical-debt">Risks and Technical Debt</a></li>
</ol>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="1-document-purpose-and-scope">1. Document Purpose and Scope<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#1-document-purpose-and-scope" class="hash-link" aria-label="Direct link to 1. Document Purpose and Scope" title="Direct link to 1. Document Purpose and Scope">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="11-purpose">1.1 Purpose<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#11-purpose" class="hash-link" aria-label="Direct link to 1.1 Purpose" title="Direct link to 1.1 Purpose">​</a></h3>
<p>This document describes the software architecture of the <strong>Chicago Transit Authority (CTA)
Public Transport Optimisation System</strong>.  It is structured according to the <strong>4+1 Architectural
View Model</strong> (Kruchten, IEEE Software 1995), which organises the architecture into five
complementary views, each addressing the concerns of a different stakeholder group:</p>
<table><thead><tr><th>View</th><th>Primary Audience</th><th>Central Concern</th></tr></thead><tbody><tr><td>Use Case (+1)</td><td>All stakeholders</td><td>Scenarios that drive architectural decisions</td></tr><tr><td>Logical</td><td>Architects, developers</td><td>Functional decomposition and key abstractions</td></tr><tr><td>Process</td><td>Architects, integrators</td><td>Concurrency, data flows, runtime behaviour</td></tr><tr><td>Development</td><td>Developers, build engineers</td><td>Module structure, package organisation</td></tr><tr><td>Physical</td><td>Operations, DevOps</td><td>Deployment topology, infrastructure mapping</td></tr></tbody></table>
<p>Diagrams use <strong>Mermaid</strong> syntax and follow <strong>ArchiMate 3.1</strong> layering conventions:</p>
<ul>
<li><strong>Technology Layer</strong> — infrastructure elements (brokers, databases, containers)</li>
<li><strong>Application Layer</strong> — software components and their interfaces</li>
<li><strong>Business Layer</strong> — business processes and actors that the system serves</li>
</ul>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="12-system-overview">1.2 System Overview<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#12-system-overview" class="hash-link" aria-label="Direct link to 1.2 System Overview" title="Direct link to 1.2 System Overview">​</a></h3>
<p>The system is a real-time streaming pipeline that ingests simulated operational data from the
CTA elevated rail network ("L"), processes it through multiple transformation stages, and presents
a live transit status dashboard.  It demonstrates a full <strong>Event-Driven Architecture (EDA)</strong> on
the Confluent Kafka platform.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="13-scope">1.3 Scope<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#13-scope" class="hash-link" aria-label="Direct link to 1.3 Scope" title="Direct link to 1.3 Scope">​</a></h3>
<ul>
<li>Three train lines: <strong>Blue</strong>, <strong>Red</strong>, <strong>Green</strong> (each with 10 trains, bidirectional)</li>
<li>Station arrival events, turnstile ridership counts, and weather telemetry</li>
<li>Static station reference data from PostgreSQL</li>
<li>A browser-accessible real-time status dashboard</li>
</ul>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="2-architectural-drivers">2. Architectural Drivers<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#2-architectural-drivers" class="hash-link" aria-label="Direct link to 2. Architectural Drivers" title="Direct link to 2. Architectural Drivers">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="21-quality-attribute-requirements">2.1 Quality Attribute Requirements<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#21-quality-attribute-requirements" class="hash-link" aria-label="Direct link to 2.1 Quality Attribute Requirements" title="Direct link to 2.1 Quality Attribute Requirements">​</a></h3>
<table><thead><tr><th>ID</th><th>Quality Attribute</th><th>Scenario</th><th>Architectural Response</th></tr></thead><tbody><tr><td>QA-01</td><td><strong>Throughput</strong></td><td>3 lines × stations × 10 trains produce arrival events every 5 s</td><td>10-partition Kafka topic; AvroProducer batching</td></tr><tr><td>QA-02</td><td><strong>Decoupling</strong></td><td>New consumers must not require producer changes</td><td>All communication via Kafka topics (no direct calls)</td></tr><tr><td>QA-03</td><td><strong>Schema Evolution</strong></td><td>Fields may be added to events over time</td><td>Avro + Schema Registry with compatibility enforcement</td></tr><tr><td>QA-04</td><td><strong>Replayability</strong></td><td>Dashboard must recover state on restart</td><td>Consumers start from <code>offset_earliest</code>; Faust rebuilds table from log</td></tr><tr><td>QA-05</td><td><strong>Responsiveness</strong></td><td>Dashboard must serve HTTP requests without stalling Kafka polling</td><td>Tornado async IO loop; consumers as coroutines</td></tr><tr><td>QA-06</td><td><strong>Extensibility</strong></td><td>Station reference data changes without code deployment</td><td>Kafka Connect JDBC connector; consumers subscribe to topic</td></tr></tbody></table>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="22-constraints">2.2 Constraints<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#22-constraints" class="hash-link" aria-label="Direct link to 2.2 Constraints" title="Direct link to 2.2 Constraints">​</a></h3>
<ul>
<li>Python-only application code (no JVM services authored in-house)</li>
<li>Single-host Docker Compose deployment (development / demonstration environment)</li>
<li>Confluent Platform 5.2.2 (fixed version)</li>
</ul>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="3-use-case-view-1">3. Use Case View (+1)<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#3-use-case-view-1" class="hash-link" aria-label="Direct link to 3. Use Case View (+1)" title="Direct link to 3. Use Case View (+1)">​</a></h2>
<p>The Use Case View captures the key scenarios that motivated and validate the architectural
decisions.  In the 4+1 model this view acts as the glue — each scenario exercises a slice
through every other view.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="31-actor-diagram">3.1 Actor Diagram<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#31-actor-diagram" class="hash-link" aria-label="Direct link to 3.1 Actor Diagram" title="Direct link to 3.1 Actor Diagram">​</a></h3>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="32-key-scenarios">3.2 Key Scenarios<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#32-key-scenarios" class="hash-link" aria-label="Direct link to 3.2 Key Scenarios" title="Direct link to 3.2 Key Scenarios">​</a></h3>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="uc-01--view-live-transit-status">UC-01 — View Live Transit Status<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#uc-01--view-live-transit-status" class="hash-link" aria-label="Direct link to UC-01 — View Live Transit Status" title="Direct link to UC-01 — View Live Transit Status">​</a></h4>
<p><strong>Trigger:</strong> Transit operator opens <code>http://localhost:8888</code>.<br>
<strong>Flow:</strong> Tornado serves <code>status.html</code> populated from in-memory <code>Lines</code> and <code>Weather</code> state
that is continuously updated by four Kafka consumers running as async coroutines.<br>
<strong>Architectural relevance:</strong> Drives the Tornado async server choice (ADR-006) and the
requirement for in-process Kafka consumer coroutines.</p>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="uc-02--publish-train-arrival-event">UC-02 — Publish Train Arrival Event<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#uc-02--publish-train-arrival-event" class="hash-link" aria-label="Direct link to UC-02 — Publish Train Arrival Event" title="Direct link to UC-02 — Publish Train Arrival Event">​</a></h4>
<p><strong>Trigger:</strong> Simulation time step advances; a train moves to the next station.<br>
<strong>Flow:</strong> <code>Station.run()</code> → <code>AvroProducer.produce()</code> → Schema Registry validates Avro →
Kafka topic <code>org.chicago.cta.station.arrivals.t001</code> → <code>KafkaConsumer</code> in server →
<code>Lines.process_message()</code> → UI state updated.<br>
<strong>Architectural relevance:</strong> Establishes the end-to-end Kafka + Avro pipeline (ADR-001, ADR-002).</p>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="uc-04--publish-weather-reading">UC-04 — Publish Weather Reading<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#uc-04--publish-weather-reading" class="hash-link" aria-label="Direct link to UC-04 — Publish Weather Reading" title="Direct link to UC-04 — Publish Weather Reading">​</a></h4>
<p><strong>Trigger:</strong> Simulation hour boundary.<br>
<strong>Flow:</strong> <code>Weather.run()</code> → HTTP POST to Kafka REST Proxy → Kafka topic
<code>org.chicago.cta.weather.v1</code> → <code>KafkaConsumer</code> in server → <code>Weather.process_message()</code>.<br>
<strong>Architectural relevance:</strong> Demonstrates the REST Proxy integration path (ADR-005).</p>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="uc-06--aggregate-rider-counts">UC-06 — Aggregate Rider Counts<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#uc-06--aggregate-rider-counts" class="hash-link" aria-label="Direct link to UC-06 — Aggregate Rider Counts" title="Direct link to UC-06 — Aggregate Rider Counts">​</a></h4>
<p><strong>Trigger:</strong> Continuous turnstile events on <code>com.cta.stations.turnstile.entry</code>.<br>
<strong>Flow:</strong> KSQL <code>turnstile</code> table materialises from topic → KSQL <code>TURNSTILE_SUMMARY</code> GROUP BY
aggregation → new Kafka topic → <code>KafkaConsumer (is_avro=False)</code> in server → UI ridership count.<br>
<strong>Architectural relevance:</strong> Drives the KSQL aggregation decision (ADR-004).</p>
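<p>The GROUP BY step in UC-06 can be illustrated in plain Python. This is only a batch sketch of the idea, not the KSQL statement the project runs: the <code>station_id</code> field name and the <code>summarise_turnstiles</code> helper are assumptions made for illustration, and a real KSQL table updates continuously rather than over a finished list.</p>

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def summarise_turnstiles(events: Iterable[Mapping[str, int]]) -> Dict[int, int]:
    """Count entry events per station, in the spirit of COUNT(*) ... GROUP BY."""
    counts: Counter = Counter()
    for event in events:
        # Assumption: each turnstile event carries a numeric "station_id".
        counts[event["station_id"]] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [{"station_id": 40010}, {"station_id": 40020}, {"station_id": 40010}]
    print(summarise_turnstiles(sample))
```

<p>The dashboard only needs the per-station totals, which is why a declarative aggregation is a reasonable fit here.</p>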
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="uc-07--transform-station-schema">UC-07 — Transform Station Schema<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#uc-07--transform-station-schema" class="hash-link" aria-label="Direct link to UC-07 — Transform Station Schema" title="Direct link to UC-07 — Transform Station Schema">​</a></h4>
<p><strong>Trigger:</strong> Kafka Connect pushes a raw station row to <code>com.cta.stations.data.rawt001.stations</code>.<br>
<strong>Flow:</strong> Faust <code>transform_stations</code> agent reads record → resolves <code>red/blue/green</code> booleans
to <code>line</code> string → writes <code>TransformedStation</code> to <code>org.chicago.cta.stations.table.v1t001</code>
and updates Faust in-memory table.<br>
<strong>Architectural relevance:</strong> Drives the Faust stream processor choice (ADR-004).</p>
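<p>The record-level transform in UC-07 can be sketched without Faust. The example below is a minimal illustration under stated assumptions: only the <code>red</code>/<code>blue</code>/<code>green</code> booleans and the <code>line</code> string come from the description above, while the <code>RawStation</code> class, the other field names, and the behaviour when no flag is set are hypothetical.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawStation:
    # Hypothetical subset of the raw station row; only the three
    # colour booleans are named in the scenario text.
    station_id: int
    station_name: str
    red: bool
    blue: bool
    green: bool


@dataclass
class TransformedStation:
    station_id: int
    station_name: str
    line: Optional[str]


def resolve_line(raw: RawStation) -> Optional[str]:
    """Collapse the three colour booleans into a single line name."""
    if raw.red:
        return "red"
    if raw.blue:
        return "blue"
    if raw.green:
        return "green"
    return None  # assumption: no flag set means the line is unknown


def transform(raw: RawStation) -> TransformedStation:
    """Build the slimmer record that would be written to the output topic."""
    return TransformedStation(raw.station_id, raw.station_name, resolve_line(raw))
```

<p>In a Faust agent, a function like <code>transform</code> would be applied to each incoming record before the result is sent to the output topic and stored in the table.</p>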
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="4-logical-view">4. Logical View<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#4-logical-view" class="hash-link" aria-label="Direct link to 4. Logical View" title="Direct link to 4. Logical View">​</a></h2>
<p>The Logical View describes the system's functional decomposition into key abstractions,
their responsibilities, and their relationships.  This view follows ArchiMate's
<strong>Application Layer</strong> notation.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="41-component-overview">4.1 Component Overview<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#41-component-overview" class="hash-link" aria-label="Direct link to 4.1 Component Overview" title="Direct link to 4.1 Component Overview">​</a></h3>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="42-key-abstractions">4.2 Key Abstractions<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#42-key-abstractions" class="hash-link" aria-label="Direct link to 4.2 Key Abstractions" title="Direct link to 4.2 Key Abstractions">​</a></h3>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="producer-hierarchy">Producer Hierarchy<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#producer-hierarchy" class="hash-link" aria-label="Direct link to Producer Hierarchy" title="Direct link to Producer Hierarchy">​</a></h4>
<!-- -->
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="consumer--model-hierarchy">Consumer / Model Hierarchy<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#consumer--model-hierarchy" class="hash-link" aria-label="Direct link to Consumer / Model Hierarchy" title="Direct link to Consumer / Model Hierarchy">​</a></h4>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="43-kafka-topic-catalogue">4.3 Kafka Topic Catalogue<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#43-kafka-topic-catalogue" class="hash-link" aria-label="Direct link to 4.3 Kafka Topic Catalogue" title="Direct link to 4.3 Kafka Topic Catalogue">​</a></h3>
<table><thead><tr><th>Topic</th><th>Producer</th><th>Consumer(s)</th><th>Format</th><th>Partitions</th></tr></thead><tbody><tr><td><code>org.chicago.cta.station.arrivals.t001</code></td><td>Station (AvroProducer)</td><td>Tornado server</td><td>Avro</td><td>10</td></tr><tr><td><code>com.cta.stations.turnstile.entry</code></td><td>Turnstile (AvroProducer)</td><td>KSQL</td><td>Avro</td><td>10</td></tr><tr><td><code>org.chicago.cta.weather.v1</code></td><td>Weather (REST Proxy)</td><td>Tornado server</td><td>Avro</td><td>10</td></tr><tr><td><code>com.cta.stations.data.rawt001.stations</code></td><td>Kafka Connect JDBC</td><td>Faust</td><td>JSON (Connect)</td><td>1</td></tr><tr><td><code>org.chicago.cta.stations.table.v1t001</code></td><td>Faust</td><td>Tornado server</td><td>JSON</td><td>1</td></tr><tr><td><code>TURNSTILE_SUMMARY</code></td><td>KSQL</td><td>Tornado server</td><td>JSON</td><td>—</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="5-process-view">5. Process View<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#5-process-view" class="hash-link" aria-label="Direct link to 5. Process View" title="Direct link to 5. Process View">​</a></h2>
<p>The Process View describes the system's dynamic behaviour — how processes start, how data
flows between them at runtime, and how concurrency is managed.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="51-system-startup-sequence">5.1 System Startup Sequence<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#51-system-startup-sequence" class="hash-link" aria-label="Direct link to 5.1 System Startup Sequence" title="Direct link to 5.1 System Startup Sequence">​</a></h3>
<p>The diagram below shows the mandatory startup order.  Components further right depend on
components to their left being fully initialised.</p>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="52-end-to-end-data-flow--train-arrival">5.2 End-to-End Data Flow — Train Arrival<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#52-end-to-end-data-flow--train-arrival" class="hash-link" aria-label="Direct link to 5.2 End-to-End Data Flow — Train Arrival" title="Direct link to 5.2 End-to-End Data Flow — Train Arrival">​</a></h3>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="53-end-to-end-data-flow--turnstile-aggregation">5.3 End-to-End Data Flow — Turnstile Aggregation<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#53-end-to-end-data-flow--turnstile-aggregation" class="hash-link" aria-label="Direct link to 5.3 End-to-End Data Flow — Turnstile Aggregation" title="Direct link to 5.3 End-to-End Data Flow — Turnstile Aggregation">​</a></h3>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="54-concurrency-model">5.4 Concurrency Model<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#54-concurrency-model" class="hash-link" aria-label="Direct link to 5.4 Concurrency Model" title="Direct link to 5.4 Concurrency Model">​</a></h3>
<!-- -->
<p>The entire consumer application runs in a <strong>single OS thread</strong> using cooperative multitasking.
Each Kafka poll is bounded by a 0.1 s timeout, so no consumer holds the thread for long.  The HTTP handler is synchronous but executes
between coroutine yield points, keeping UI latency low.</p>
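<p>The cooperative model can be demonstrated with plain <code>asyncio</code>, which Tornado's IO loop builds on. This is a sketch with no broker involved: the <code>consume</code> coroutine is a hypothetical stand-in for a Kafka consumer, and appending to a list stands in for handling a polled message.</p>

```python
import asyncio
from typing import List


async def consume(name: str, log: List[str], polls: int) -> None:
    """Stand-in for one Kafka consumer coroutine (no real broker involved)."""
    for i in range(polls):
        log.append(f"{name}:{i}")  # pretend a short poll returned here
        await asyncio.sleep(0)     # yield so the other coroutines can run


async def main() -> List[str]:
    log: List[str] = []
    # Four consumers share one thread and one event loop.
    await asyncio.gather(*(consume(f"c{n}", log, polls=2) for n in range(4)))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

<p>Each coroutine gives up control at its <code>await</code>, so the four consumers interleave on one thread. An HTTP handler scheduled on the same loop would run in those same gaps.</p>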
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="6-development-view">6. Development View<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#6-development-view" class="hash-link" aria-label="Direct link to 6. Development View" title="Direct link to 6. Development View">​</a></h2>
<p>The Development View describes the organisation of the software in the development environment —
module structure, package dependencies, and build artefacts.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="61-module-structure">6.1 Module Structure<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#61-module-structure" class="hash-link" aria-label="Direct link to 6.1 Module Structure" title="Direct link to 6.1 Module Structure">​</a></h3>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="62-package-dependencies">6.2 Package Dependencies<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#62-package-dependencies" class="hash-link" aria-label="Direct link to 6.2 Package Dependencies" title="Direct link to 6.2 Package Dependencies">​</a></h3>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="63-entry-points-and-startup-commands">6.3 Entry Points and Startup Commands<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#63-entry-points-and-startup-commands" class="hash-link" aria-label="Direct link to 6.3 Entry Points and Startup Commands" title="Direct link to 6.3 Entry Points and Startup Commands">​</a></h3>
<table><thead><tr><th>Process</th><th>Entry Point</th><th>Command</th></tr></thead><tbody><tr><td>Data producer + simulation</td><td><code>producers/simulation.py</code></td><td><code>python simulation.py</code></td></tr><tr><td>Station stream transformer</td><td><code>consumers/faust_stream.py</code></td><td><code>faust -A faust_stream worker -l info</code></td></tr><tr><td>Turnstile KSQL setup</td><td><code>consumers/ksql.py</code></td><td><code>python ksql.py</code></td></tr><tr><td>Dashboard web server</td><td><code>consumers/server.py</code></td><td><code>python server.py</code></td></tr></tbody></table>
<blockquote>
<p><strong>Note:</strong> The Faust, KSQL, and dashboard processes (rows 2 to 4 above) have an implicit startup ordering dependency.
The Kafka Connect JDBC connector (configured by the simulation) must produce station data
before the Faust app can transform it; the KSQL tables must exist before the dashboard starts.
There is no orchestration script enforcing this order.</p>
</blockquote>
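<p>One way to enforce the ordering is a small readiness check that runs before each process starts. The helper below is a hedged sketch, not part of the repository: <code>wait_for_port</code> is a hypothetical name, and a successful TCP connection only shows that a port is open, not that the service behind it has finished initialising.</p>

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0,
                  interval_s: float = 0.5) -> bool:
    """Return True once a TCP connection succeeds, False if the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)  # back off briefly before retrying
```

<p>A wrapper script could call this for the Kafka broker (9092), Schema Registry (8081), and KSQL server (8088) listed in the port map, and only then launch <code>faust_stream.py</code>, <code>ksql.py</code>, and <code>server.py</code> in turn.</p>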
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="7-physical-view">7. Physical View<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#7-physical-view" class="hash-link" aria-label="Direct link to 7. Physical View" title="Direct link to 7. Physical View">​</a></h2>
<p>The Physical View maps software components onto physical (or virtualised) infrastructure.
This view follows ArchiMate's <strong>Technology Layer</strong>.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="71-container-deployment-diagram">7.1 Container Deployment Diagram<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#71-container-deployment-diagram" class="hash-link" aria-label="Direct link to 7.1 Container Deployment Diagram" title="Direct link to 7.1 Container Deployment Diagram">​</a></h3>
<!-- -->
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="72-network-port-map">7.2 Network Port Map<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#72-network-port-map" class="hash-link" aria-label="Direct link to 7.2 Network Port Map" title="Direct link to 7.2 Network Port Map">​</a></h3>
<table><thead><tr><th>Port</th><th>Service</th><th>Protocol</th><th>Consumer(s)</th></tr></thead><tbody><tr><td>2181</td><td>Zookeeper</td><td>TCP</td><td>Kafka broker (internal)</td></tr><tr><td>9092</td><td>Kafka broker</td><td>PLAINTEXT</td><td>Python producers, Python consumers, Faust</td></tr><tr><td>8081</td><td>Schema Registry</td><td>HTTP</td><td>AvroProducer, AvroConsumer, Kafka Connect</td></tr><tr><td>8082</td><td>Kafka REST Proxy</td><td>HTTP</td><td>Weather producer</td></tr><tr><td>8083</td><td>Kafka Connect REST API</td><td>HTTP</td><td><code>connector.py</code> setup</td></tr><tr><td>8084</td><td>Connect UI</td><td>HTTP</td><td>Operator browser</td></tr><tr><td>8085</td><td>Topics UI</td><td>HTTP</td><td>Operator browser</td></tr><tr><td>8086</td><td>Schema Registry UI</td><td>HTTP</td><td>Operator browser</td></tr><tr><td>8088</td><td>KSQL Server</td><td>HTTP</td><td><code>ksql.py</code> setup</td></tr><tr><td>5432</td><td>PostgreSQL</td><td>TCP</td><td>Kafka Connect JDBC</td></tr><tr><td>8888</td><td>Tornado Dashboard</td><td>HTTP</td><td>Transit Operator browser</td></tr></tbody></table>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="73-data-persistence-boundary">7.3 Data Persistence Boundary<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#73-data-persistence-boundary" class="hash-link" aria-label="Direct link to 7.3 Data Persistence Boundary" title="Direct link to 7.3 Data Persistence Boundary">​</a></h3>
<!-- -->
<p>All in-process state is rebuilt from Kafka on restart.  Durable state exists only in PostgreSQL
(station reference data) and the Kafka topic logs.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="8-architectural-decisions-summary">8. Architectural Decisions Summary<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#8-architectural-decisions-summary" class="hash-link" aria-label="Direct link to 8. Architectural Decisions Summary" title="Direct link to 8. Architectural Decisions Summary">​</a></h2>
<p>The table below cross-references the detailed ADR documents in <code>docs/adr/</code>.</p>
<table><thead><tr><th>ID</th><th>Decision</th><th>Rationale</th><th>ADR</th></tr></thead><tbody><tr><td>AD-01</td><td>Apache Kafka as the central event bus</td><td>Decoupling, replayability, fan-out</td><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus">ADR-001</a></td></tr><tr><td>AD-02</td><td>Avro + Schema Registry for all first-party topics</td><td>Schema evolution, contract enforcement</td><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry">ADR-002</a></td></tr><tr><td>AD-03</td><td>Kafka Connect JDBC Source for PostgreSQL</td><td>Zero custom ingestion code; handles offset/retry</td><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres">ADR-003</a></td></tr><tr><td>AD-04</td><td>Faust for station transformation</td><td>Python-native; record-level transform</td><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql">ADR-004</a></td></tr><tr><td>AD-05</td><td>KSQL for turnstile aggregation</td><td>Declarative SQL GROUP BY; no Python state management</td><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql">ADR-004</a></td></tr><tr><td>AD-06</td><td>Kafka REST Proxy for weather</td><td>Demonstrates HTTP-based produce path</td><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer">ADR-005</a></td></tr><tr><td>AD-07</td><td>Tornado async web server</td><td>Single-thread concurrency for Kafka + HTTP</td><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard">ADR-006</a></td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="9-risks-and-technical-debt">9. Risks and Technical Debt<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#9-risks-and-technical-debt" class="hash-link" aria-label="Direct link to 9. Risks and Technical Debt" title="Direct link to 9. Risks and Technical Debt">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="91-risks">9.1 Risks<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#91-risks" class="hash-link" aria-label="Direct link to 9.1 Risks" title="Direct link to 9.1 Risks">​</a></h3>
<table><thead><tr><th>ID</th><th>Risk</th><th>Severity</th><th>Affected View</th><th>Mitigation</th></tr></thead><tbody><tr><td>R-01</td><td>Single Kafka broker — SPOF</td><td>High</td><td>Physical</td><td>Add 2 additional brokers; set <code>replication_factor=3</code></td></tr><tr><td>R-02</td><td>Replication factor 1 on all topics</td><td>High</td><td>Physical</td><td>Increase to 3 in production</td></tr><tr><td>R-03</td><td>Hard-coded <code>localhost</code> addresses in both <code>constants.py</code> files</td><td>Medium</td><td>Development</td><td>Externalise via environment variables or a config file</td></tr><tr><td>R-04</td><td>Hard-coded DB credentials in <code>connector.py</code></td><td>High</td><td>Physical</td><td>Use Kafka Connect secrets management or environment injection</td></tr><tr><td>R-05</td><td>Manual startup ordering with no orchestration</td><td>Medium</td><td>Process</td><td>Add a readiness-check script or use <code>depends_on</code> with health checks</td></tr><tr><td>R-06</td><td><code>AvroProducer</code> is a deprecated Confluent API</td><td>Medium</td><td>Development</td><td>Migrate to <code>SerializingProducer</code> + <code>AvroSerializer</code></td></tr></tbody></table>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="92-technical-debt">9.2 Technical Debt<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture#92-technical-debt" class="hash-link" aria-label="Direct link to 9.2 Technical Debt" title="Direct link to 9.2 Technical Debt">​</a></h3>
<table><thead><tr><th>ID</th><th>Description</th><th>Location</th><th>Effort</th></tr></thead><tbody><tr><td>TD-01</td><td><code>TURNSTILE_SUMMARY</code> uses JSON while all other topics use Avro — inconsistency in serialisation convention</td><td><code>consumers/ksql.py</code>, <code>consumers/server.py:87</code></td><td>Low</td></tr><tr><td>TD-02</td><td>Faust Table uses <code>store="memory://"</code> — state lost on restart, rebuild time increases with topic size</td><td><code>consumers/faust_stream.py:38</code></td><td>Medium</td></tr><tr><td>TD-03</td><td>Both <code>producers/constants.py</code> and <code>consumers/constants.py</code> duplicate identical constant values</td><td>Both files</td><td>Low</td></tr><tr><td>TD-04</td><td>No unit or integration tests present in the repository</td><td>Entire codebase</td><td>High</td></tr><tr><td>TD-05</td><td>Weather schema JSON loaded on every <code>Weather.__init__</code> call via file I/O (class variables mitigate partially)</td><td><code>producers/models/weather.py:49-55</code></td><td>Low</td></tr><tr><td>TD-06</td><td><code>connector.py</code> exits the process on connector creation failure, preventing graceful recovery</td><td><code>producers/connector.py:51-53</code></td><td>Low</td></tr></tbody></table>
<hr>
<p><em>Document generated by reverse-engineering the source code on 2026-03-12.
All diagrams use <a href="https://mermaid.js.org/" target="_blank" rel="noopener noreferrer">Mermaid</a> and render natively on GitHub.</em></p>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[Architectural Trade-off Analysis — CTA Public Transport Optimisation System]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Version: 1.0]]></summary>
        <content type="html"><![CDATA[<p><strong>Version:</strong> 1.0
<br><strong>Date:</strong> 2026-03-12
<br><strong>Authors:</strong> Architecture Review (reverse-engineered from codebase)
<br><strong>References:</strong> <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/architecture">architecture.md</a> · <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus">ADR-001</a> through <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard">ADR-006</a></p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="table-of-contents">Table of Contents<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#table-of-contents" class="hash-link" aria-label="Direct link to Table of Contents" title="Direct link to Table of Contents">​</a></h2>
<ol>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#1-evaluation-framework">Evaluation Framework</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#2-decision-trade-off-analysis">Decision Trade-off Analysis</a>
<ul>
<li>2.1 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#21-event-bus--kafka-vs-alternatives">Event Bus — Kafka vs Alternatives</a></li>
<li>2.2 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#22-serialisation--avro-vs-alternatives">Serialisation — Avro vs Alternatives</a></li>
<li>2.3 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#23-db-ingestion--kafka-connect-vs-custom-producer">DB Ingestion — Kafka Connect vs Custom Producer</a></li>
<li>2.4 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#24-stream-processing--faust--ksql-vs-alternatives">Stream Processing — Faust + KSQL vs Alternatives</a></li>
<li>2.5 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#25-weather-producer--rest-proxy-vs-native-client">Weather Producer — REST Proxy vs Native Client</a></li>
<li>2.6 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#26-dashboard-server--tornado-vs-alternatives">Dashboard Server — Tornado vs Alternatives</a></li>
</ul>
</li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#3-cross-cutting-trade-off-analysis">Cross-Cutting Trade-off Analysis</a>
<ul>
<li>3.1 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#31-serialisation-consistency">Serialisation Consistency</a></li>
<li>3.2 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#32-state-management-strategy">State Management Strategy</a></li>
<li>3.3 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#33-concurrency-model">Concurrency Model</a></li>
<li>3.4 <a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#34-operational-complexity">Operational Complexity</a></li>
</ul>
</li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#4-architecture-fitness-function">Architecture Fitness Function</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#5-strategic-recommendations">Strategic Recommendations</a></li>
<li><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#6-trade-off-summary-heatmap">Trade-off Summary Heatmap</a></li>
</ol>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="1-evaluation-framework">1. Evaluation Framework<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#1-evaluation-framework" class="hash-link" aria-label="Direct link to 1. Evaluation Framework" title="Direct link to 1. Evaluation Framework">​</a></h2>
<p>Every trade-off is scored against the six quality attributes (QAs) derived from the system's
architectural drivers (see <code>architecture.md §2</code>).</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="11-quality-attribute-weights">1.1 Quality Attribute Weights<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#11-quality-attribute-weights" class="hash-link" aria-label="Direct link to 1.1 Quality Attribute Weights" title="Direct link to 1.1 Quality Attribute Weights">​</a></h3>
<table><thead><tr><th>ID</th><th>Quality Attribute</th><th>Weight</th><th>Justification</th></tr></thead><tbody><tr><td>QA-01</td><td><strong>Throughput</strong></td><td>20 %</td><td>System processes events from 3 lines × stations × 10 trains @ 5 s intervals</td></tr><tr><td>QA-02</td><td><strong>Decoupling</strong></td><td>20 %</td><td>Producers and consumers must evolve independently</td></tr><tr><td>QA-03</td><td><strong>Schema Evolution</strong></td><td>15 %</td><td>Fields may be added; consumers must not break</td></tr><tr><td>QA-04</td><td><strong>Replayability</strong></td><td>15 %</td><td>Dashboard must rebuild state on restart</td></tr><tr><td>QA-05</td><td><strong>Responsiveness</strong></td><td>15 %</td><td>Dashboard HTTP latency must not stall Kafka polling</td></tr><tr><td>QA-06</td><td><strong>Extensibility</strong></td><td>15 %</td><td>New data sources/consumers without code changes</td></tr></tbody></table>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="12-scoring-scale">1.2 Scoring Scale<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#12-scoring-scale" class="hash-link" aria-label="Direct link to 1.2 Scoring Scale" title="Direct link to 1.2 Scoring Scale">​</a></h3>
<table><thead><tr><th>Score</th><th>Meaning</th></tr></thead><tbody><tr><td>5</td><td>Fully meets the quality attribute</td></tr><tr><td>4</td><td>Meets with minor gaps</td></tr><tr><td>3</td><td>Partial / neutral</td></tr><tr><td>2</td><td>Partially undermines the quality attribute</td></tr><tr><td>1</td><td>Significantly undermines the quality attribute</td></tr></tbody></table>
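<p>As a rough illustration of how the weights in section 1.1 combine with the 1 to 5 scores, the sketch below computes a weighted total. The function name <code>weighted_total</code> and the dictionary layout are assumptions made for this example; the sample scores are the Kafka column from section 2.1.</p>

```python
# Weights from the table in section 1.1, in percent (they sum to 100).
WEIGHTS = {
    "QA-01": 20,  # Throughput
    "QA-02": 20,  # Decoupling
    "QA-03": 15,  # Schema Evolution
    "QA-04": 15,  # Replayability
    "QA-05": 15,  # Responsiveness
    "QA-06": 15,  # Extensibility
}


def weighted_total(scores: dict[str, int]) -> float:
    """Return the weighted total of 1-5 scores, rounded to two decimals."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the six quality attributes")
    # Work in integer percent points to avoid floating-point drift.
    points = sum(WEIGHTS[qa] * scores[qa] for qa in WEIGHTS)
    return round(points / 100, 2)


# Kafka column from the matrix in section 2.1.
kafka = {"QA-01": 5, "QA-02": 5, "QA-03": 5, "QA-04": 5, "QA-05": 4, "QA-06": 5}
print(weighted_total(kafka))  # 4.85
```

<p>Because the weights sum to 100 %, an option that scores 5 on every attribute totals exactly 5.00, which gives each matrix a fixed upper bound.</p>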
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="13-risk-scale">1.3 Risk Scale<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#13-risk-scale" class="hash-link" aria-label="Direct link to 1.3 Risk Scale" title="Direct link to 1.3 Risk Scale">​</a></h3>
<table><thead><tr><th>Level</th><th>Symbol</th><th>Meaning</th></tr></thead><tbody><tr><td>Critical</td><td>🔴</td><td>Likely to cause production incidents</td></tr><tr><td>High</td><td>🟠</td><td>Significant impact under normal load</td></tr><tr><td>Medium</td><td>🟡</td><td>Impact under edge cases or growth</td></tr><tr><td>Low</td><td>🟢</td><td>Manageable with standard practices</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="2-decision-trade-off-analysis">2. Decision Trade-off Analysis<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#2-decision-trade-off-analysis" class="hash-link" aria-label="Direct link to 2. Decision Trade-off Analysis" title="Direct link to 2. Decision Trade-off Analysis">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="21-event-bus--kafka-vs-alternatives">2.1 Event Bus — Kafka vs Alternatives<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#21-event-bus--kafka-vs-alternatives" class="hash-link" aria-label="Direct link to 2.1 Event Bus — Kafka vs Alternatives" title="Direct link to 2.1 Event Bus — Kafka vs Alternatives">​</a></h3>
<blockquote>
<p><strong>Decision:</strong> ADR-001 — Apache Kafka as the single event bus</p>
</blockquote>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="weighted-scoring-matrix">Weighted Scoring Matrix<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#weighted-scoring-matrix" class="hash-link" aria-label="Direct link to Weighted Scoring Matrix" title="Direct link to Weighted Scoring Matrix">​</a></h4>
<table><thead><tr><th>Quality Attribute</th><th>Weight</th><th><strong>Kafka</strong></th><th>RabbitMQ</th><th>Redis Streams</th><th>REST Polling</th></tr></thead><tbody><tr><td>Throughput (QA-01)</td><td>20 %</td><td><strong>5</strong></td><td>4</td><td>4</td><td>2</td></tr><tr><td>Decoupling (QA-02)</td><td>20 %</td><td><strong>5</strong></td><td>4</td><td>3</td><td>1</td></tr><tr><td>Schema Evolution (QA-03)</td><td>15 %</td><td><strong>5</strong></td><td>3</td><td>2</td><td>2</td></tr><tr><td>Replayability (QA-04)</td><td>15 %</td><td><strong>5</strong></td><td>2</td><td>3</td><td>1</td></tr><tr><td>Responsiveness (QA-05)</td><td>15 %</td><td>4</td><td>4</td><td><strong>5</strong></td><td>2</td></tr><tr><td>Extensibility (QA-06)</td><td>15 %</td><td><strong>5</strong></td><td>4</td><td>3</td><td>1</td></tr><tr><td><strong>Weighted Total</strong></td><td></td><td><strong>4.85</strong></td><td>3.55</td><td>3.35</td><td>1.50</td></tr></tbody></table>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="trade-off-narrative">Trade-off Narrative<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#trade-off-narrative" class="hash-link" aria-label="Direct link to Trade-off Narrative" title="Direct link to Trade-off Narrative">​</a></h4>
<p><strong>Why Kafka wins here:</strong>
Kafka's append-only, partitioned log is the single feature that unlocks <strong>replayability</strong> and
<strong>fan-out</strong> simultaneously, a combination that a classic queue-based broker such as RabbitMQ does not provide
out of the box, because a queue discards each message once it is acknowledged. The ability to start a new consumer at <code>offset_earliest</code> and rebuild the full
station/weather state is architecturally critical for the dashboard's cold-start scenario.</p>
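<p>The two properties can be illustrated with a toy in-memory log. This is only a sketch of the idea, not Kafka's implementation: the class <code>ToyLog</code>, its methods, and the sample events are all invented for this example.</p>

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for one Kafka partition: an append-only list."""

    def __init__(self):
        self._records = []
        # Each consumer group tracks its own read position (offset).
        self._offsets = defaultdict(int)

    def append(self, record):
        self._records.append(record)

    def poll(self, group, from_beginning=False):
        """Return unread records for `group` and advance its offset."""
        if from_beginning:
            self._offsets[group] = 0  # rewind: reading never deletes records
        start = self._offsets[group]
        batch = self._records[start:]
        self._offsets[group] = len(self._records)
        return batch


log = ToyLog()
for event in ["arrival:40010", "arrival:40020", "weather:sunny"]:
    log.append(event)

# Fan-out: two independent groups each receive every record.
dashboard = log.poll("dashboard")
audit = log.poll("audit")

# Replay: a restarted dashboard rewinds and rebuilds its state from offset 0.
rebuilt = log.poll("dashboard", from_beginning=True)
```

<p>A queue that deletes a message once it has been consumed could serve <code>dashboard</code> or <code>audit</code>, but not both, and could not serve the rewind at all.</p>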
<p><strong>What is sacrificed:</strong></p>
<ul>
<li><strong>Operational simplicity.</strong> Kafka requires ZooKeeper (in CP 5.x), and this deployment adds Schema Registry and REST
Proxy as satellites. A RabbitMQ cluster is simpler to operate.</li>
<li><strong>Latency at p99.</strong> Kafka producers batch records before sending them; for the weather producer that
posts once per simulated hour, this is irrelevant, but it rules Kafka out for
sub-millisecond latency use cases.</li>
</ul>
<p><strong>Key risk introduced:</strong> 🔴 Single-broker deployment with <code>replication_factor=1</code>.
In production, a broker failure takes every topic offline, and because no message is replicated, losing the broker's disk loses all of them.</p>
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="22-serialisation--avro-vs-alternatives">2.2 Serialisation — Avro vs Alternatives<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#22-serialisation--avro-vs-alternatives" class="hash-link" aria-label="Direct link to 2.2 Serialisation — Avro vs Alternatives" title="Direct link to 2.2 Serialisation — Avro vs Alternatives">​</a></h3>
<blockquote>
<p><strong>Decision:</strong> ADR-002 — Apache Avro + Confluent Schema Registry</p>
</blockquote>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="weighted-scoring-matrix-1">Weighted Scoring Matrix<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#weighted-scoring-matrix-1" class="hash-link" aria-label="Direct link to Weighted Scoring Matrix" title="Direct link to Weighted Scoring Matrix">​</a></h4>
<table><thead><tr><th>Quality Attribute</th><th>Weight</th><th><strong>Avro + Registry</strong></th><th>Plain JSON</th><th>Protobuf</th><th>MessagePack</th></tr></thead><tbody><tr><td>Throughput (QA-01)</td><td>20 %</td><td><strong>5</strong></td><td>3</td><td><strong>5</strong></td><td>4</td></tr><tr><td>Decoupling (QA-02)</td><td>20 %</td><td><strong>5</strong></td><td>2</td><td><strong>5</strong></td><td>2</td></tr><tr><td>Schema Evolution (QA-03)</td><td>15 %</td><td><strong>5</strong></td><td>1</td><td><strong>5</strong></td><td>2</td></tr><tr><td>Replayability (QA-04)</td><td>15 %</td><td><strong>5</strong></td><td>3</td><td>4</td><td>3</td></tr><tr><td>Responsiveness (QA-05)</td><td>15 %</td><td>4</td><td><strong>5</strong></td><td>4</td><td>4</td></tr><tr><td>Extensibility (QA-06)</td><td>15 %</td><td><strong>5</strong></td><td>2</td><td>4</td><td>2</td></tr><tr><td><strong>Weighted Total</strong></td><td></td><td><strong>4.85</strong></td><td>2.65</td><td>4.55</td><td>2.85</td></tr></tbody></table>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="trade-off-narrative-1">Trade-off Narrative<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#trade-off-narrative-1" class="hash-link" aria-label="Direct link to Trade-off Narrative" title="Direct link to Trade-off Narrative">​</a></h4>
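<p><strong>Why Avro wins:</strong>
The Schema Registry's compatibility check acts as the publish-time equivalent of a <strong>compile-time check</strong>:
a change that violates the subject's compatibility mode (for example, renaming a field that has no default) is rejected at schema registration, before a single consumer can be broken. With the Confluent wire format, each message
carries a 5-byte header (a magic byte plus a 4-byte schema ID) rather than the schema itself, making messages far more compact than equivalent JSON.</p>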
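<p>For reference, the Confluent wire format prefixes each Avro payload with one magic byte (0) and a 4-byte big-endian schema ID. The sketch below shows that framing using only the standard library; the helper names <code>frame</code> and <code>unframe</code> are made up for this example, and the payload bytes are a placeholder rather than real Avro output.</p>

```python
import struct

MAGIC_BYTE = 0  # first byte of every Confluent-framed message
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an Avro-encoded payload with the 5-byte Confluent header."""
    # ">bI" = big-endian: 1 signed byte (magic) + 4-byte unsigned int (schema ID)
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe(message: bytes) -> tuple[int, bytes]:
    """Split a framed message into (schema_id, avro_payload)."""
    if HEADER_SIZE > len(message):
        raise ValueError("message shorter than the 5-byte header")
    magic, schema_id = struct.unpack(">bI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError("not a Confluent-framed message")
    return schema_id, message[HEADER_SIZE:]


framed = frame(42, b"\x02hi")  # placeholder payload, not real Avro
schema_id, payload = unframe(framed)
```

<p>A consumer uses the extracted schema ID to fetch (and cache) the writer schema from the registry, which is why the schema never has to travel with the record.</p>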
<p><strong>Protobuf is the credible alternative:</strong> Protobuf achieves nearly identical scores. The
differentiator is ecosystem fit: <code>confluent-kafka-python</code>'s <code>AvroProducer</code>/<code>AvroConsumer</code>
were the idiomatic Python Confluent API at CP 5.2.2, whereas Schema Registry did not support Protobuf until CP 5.5.
On current Confluent Platform releases, Protobuf is first-class; migrating would be viable.</p>
<p><strong>Key inconsistency introduced:</strong> 🟠 <code>TURNSTILE_SUMMARY</code> uses JSON while all other topics use
Avro. This forces consumers to branch on <code>is_avro</code> and removes schema-enforcement for rider
counts — the metric that most directly feeds the UI.</p>
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="23-db-ingestion--kafka-connect-vs-custom-producer">2.3 DB Ingestion — Kafka Connect vs Custom Producer<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#23-db-ingestion--kafka-connect-vs-custom-producer" class="hash-link" aria-label="Direct link to 2.3 DB Ingestion — Kafka Connect vs Custom Producer" title="Direct link to 2.3 DB Ingestion — Kafka Connect vs Custom Producer">​</a></h3>
<blockquote>
<p><strong>Decision:</strong> ADR-003 — Kafka Connect JDBC Source Connector</p>
</blockquote>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="weighted-scoring-matrix-2">Weighted Scoring Matrix<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#weighted-scoring-matrix-2" class="hash-link" aria-label="Direct link to Weighted Scoring Matrix" title="Direct link to Weighted Scoring Matrix">​</a></h4>
<table><thead><tr><th>Quality Attribute</th><th>Weight</th><th><strong>Kafka Connect JDBC</strong></th><th>Custom Python Producer</th><th>Direct DB Read in Consumer</th><th>Debezium CDC</th></tr></thead><tbody><tr><td>Throughput (QA-01)</td><td>20 %</td><td>4</td><td>4</td><td>3</td><td><strong>5</strong></td></tr><tr><td>Decoupling (QA-02)</td><td>20 %</td><td><strong>5</strong></td><td>3</td><td>1</td><td><strong>5</strong></td></tr><tr><td>Schema Evolution (QA-03)</td><td>15 %</td><td>3</td><td>2</td><td>1</td><td><strong>5</strong></td></tr><tr><td>Replayability (QA-04)</td><td>15 %</td><td><strong>5</strong></td><td>4</td><td>1</td><td><strong>5</strong></td></tr><tr><td>Responsiveness (QA-05)</td><td>15 %</td><td>4</td><td>4</td><td>2</td><td>4</td></tr><tr><td>Extensibility (QA-06)</td><td>15 %</td><td><strong>5</strong></td><td>3</td><td>1</td><td><strong>5</strong></td></tr><tr><td><strong>Weighted Total</strong></td><td></td><td><strong>4.35</strong></td><td>3.35</td><td>1.55</td><td><strong>4.85</strong></td></tr></tbody></table>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="trade-off-narrative-2">Trade-off Narrative<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#trade-off-narrative-2" class="hash-link" aria-label="Direct link to Trade-off Narrative" title="Direct link to Trade-off Narrative">​</a></h4>
<p><strong>Why Kafka Connect wins over a custom producer:</strong>
Zero-code ingestion eliminates an entire class of bugs: offset tracking, error handling, and
retry logic are handled by a battle-tested framework.  The connector is <strong>idempotent</strong> — safe
to re-run on simulation restart.</p>
<p><strong>Why Debezium CDC scores higher but was rejected:</strong>
Debezium captures every INSERT/UPDATE/DELETE from the PostgreSQL write-ahead log (WAL), which is more
correct (it would capture station updates, not just inserts). However, enabling logical replication
requires DBA-level PostgreSQL configuration (<code>wal_level=logical</code>), which is overkill when
the <code>stations</code> table is quasi-static reference data loaded once from CSV.</p>
<p><strong>Hidden cost of the chosen approach:</strong> 🟡 <code>mode=incrementing</code> only detects new rows by
monotonically increasing <code>stop_id</code>.  A station name correction or line reassignment will
silently remain stale in the Kafka topic and in the dashboard until the connector is manually
reset and replayed.</p>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="when-the-decision-should-be-revisited">When the decision should be revisited<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#when-the-decision-should-be-revisited" class="hash-link" aria-label="Direct link to When the decision should be revisited" title="Direct link to When the decision should be revisited">​</a></h4>
<p>If station data becomes writable (e.g. an admin UI for updating station names), migrate to
Debezium CDC to capture UPDATE and DELETE events.</p>
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="24-stream-processing--faust--ksql-vs-alternatives">2.4 Stream Processing — Faust + KSQL vs Alternatives<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#24-stream-processing--faust--ksql-vs-alternatives" class="hash-link" aria-label="Direct link to 2.4 Stream Processing — Faust + KSQL vs Alternatives" title="Direct link to 2.4 Stream Processing — Faust + KSQL vs Alternatives">​</a></h3>
<blockquote>
<p><strong>Decision:</strong> ADR-004 — Faust for record transformation + KSQL for aggregation</p>
</blockquote>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="weighted-scoring-matrix-3">Weighted Scoring Matrix<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#weighted-scoring-matrix-3" class="hash-link" aria-label="Direct link to Weighted Scoring Matrix" title="Direct link to Weighted Scoring Matrix">​</a></h4>
<table><thead><tr><th>Quality Attribute</th><th>Weight</th><th><strong>Faust + KSQL</strong></th><th>Faust Only</th><th>KSQL Only</th><th>Kafka Streams</th><th>Spark SS</th></tr></thead><tbody><tr><td>Throughput (QA-01)</td><td>20 %</td><td>4</td><td>4</td><td><strong>5</strong></td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td>Decoupling (QA-02)</td><td>20 %</td><td><strong>5</strong></td><td><strong>5</strong></td><td><strong>5</strong></td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td>Schema Evolution (QA-03)</td><td>15 %</td><td>4</td><td>4</td><td>4</td><td><strong>5</strong></td><td>4</td></tr><tr><td>Replayability (QA-04)</td><td>15 %</td><td>3</td><td>3</td><td>4</td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td>Responsiveness (QA-05)</td><td>15 %</td><td>4</td><td>4</td><td>4</td><td>4</td><td>3</td></tr><tr><td>Extensibility (QA-06)</td><td>15 %</td><td><strong>5</strong></td><td>4</td><td>4</td><td><strong>5</strong></td><td>4</td></tr><tr><td><strong>Weighted Total</strong></td><td></td><td><strong>4.20</strong></td><td>4.05</td><td>4.40</td><td><strong>4.85</strong></td><td>4.40</td></tr></tbody></table>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="trade-off-narrative-3">Trade-off Narrative<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#trade-off-narrative-3" class="hash-link" aria-label="Direct link to Trade-off Narrative" title="Direct link to Trade-off Narrative">​</a></h4>
<p><strong>The dual-engine pattern is a deliberate pedagogical trade-off:</strong>
The use of two tools increases operational complexity (two processes, two different programming
models) but each tool is used where it excels:</p>
<table><thead><tr><th>Concern</th><th>Faust</th><th>KSQL</th></tr></thead><tbody><tr><td>Programming model</td><td>Async Python coroutines</td><td>Declarative SQL</td></tr><tr><td>Best for</td><td>Arbitrary code logic, Python type safety</td><td>GROUP BY, windowed aggregations</td></tr><tr><td>State store</td><td>In-memory (dev) / RocksDB (prod)</td><td>Kafka-backed materialised table</td></tr><tr><td>Restart behaviour</td><td>Replays topic from earliest</td><td>Persistent table survives restart</td></tr></tbody></table>
<p><strong>Key risk:</strong> 🟡 Faust's <code>store="memory://"</code> means station state is rebuilt from the full topic
on every restart.  As the station topic grows this adds startup latency.  Replace with
<code>store="rocksdb://"</code> for a persistent local state store.</p>
<p><strong>Operational debt:</strong> 🟠 No orchestration enforces the startup order:</p>
<ol>
<li>Kafka Connect must publish station data</li>
<li>Faust must transform it</li>
<li>KSQL must create <code>TURNSTILE_SUMMARY</code></li>
<li>Only then can the dashboard start</li>
</ol>
<p>A failure anywhere in this chain requires manual intervention.</p>
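<p>One way to enforce that order is a small readiness gate that blocks on each step before moving to the next. The sketch below is a minimal version under stated assumptions: <code>wait_for</code> and <code>start_in_order</code> are hypothetical helpers, and the checks are stubs where a real script would probe Kafka Connect, Faust, and KSQL.</p>

```python
import time


def wait_for(name, is_ready, timeout_s=60.0, interval_s=2.0):
    """Block until is_ready() returns True, or raise after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{name} not ready after {timeout_s} s")
        time.sleep(interval_s)


def start_in_order(steps, **wait_kwargs):
    """Run readiness checks strictly in sequence; return the names that passed."""
    passed = []
    for name, is_ready in steps:
        wait_for(name, is_ready, **wait_kwargs)
        passed.append(name)
    return passed


# Hypothetical stub checks; real ones would query each service.
attempts = {"count": 0}


def connect_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3  # succeeds on the third probe


order = start_in_order(
    [("kafka-connect", connect_ready), ("faust", lambda: True), ("ksql", lambda: True)],
    timeout_s=1.0,
    interval_s=0.01,
)
```

<p>Because a timeout raises instead of returning, a failed step stops the chain with a named error rather than letting the dashboard start against missing topics.</p>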
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="25-weather-producer--rest-proxy-vs-native-client">2.5 Weather Producer — REST Proxy vs Native Client<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#25-weather-producer--rest-proxy-vs-native-client" class="hash-link" aria-label="Direct link to 2.5 Weather Producer — REST Proxy vs Native Client" title="Direct link to 2.5 Weather Producer — REST Proxy vs Native Client">​</a></h3>
<blockquote>
<p><strong>Decision:</strong> ADR-005 — Kafka REST Proxy for the Weather producer</p>
</blockquote>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="weighted-scoring-matrix-4">Weighted Scoring Matrix<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#weighted-scoring-matrix-4" class="hash-link" aria-label="Direct link to Weighted Scoring Matrix" title="Direct link to Weighted Scoring Matrix">​</a></h4>
<table><thead><tr><th>Quality Attribute</th><th>Weight</th><th><strong>REST Proxy</strong></th><th>Native AvroProducer</th></tr></thead><tbody><tr><td>Throughput (QA-01)</td><td>20 %</td><td>3</td><td><strong>5</strong></td></tr><tr><td>Decoupling (QA-02)</td><td>20 %</td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td>Schema Evolution (QA-03)</td><td>15 %</td><td>4</td><td><strong>5</strong></td></tr><tr><td>Replayability (QA-04)</td><td>15 %</td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td>Responsiveness (QA-05)</td><td>15 %</td><td>3</td><td><strong>5</strong></td></tr><tr><td>Extensibility (QA-06)</td><td>15 %</td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td><strong>Weighted Total</strong></td><td></td><td><strong>4.15</strong></td><td><strong>5.00</strong></td></tr></tbody></table>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="trade-off-narrative-4">Trade-off Narrative<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#trade-off-narrative-4" class="hash-link" aria-label="Direct link to Trade-off Narrative" title="Direct link to Trade-off Narrative">​</a></h4>
<p><strong>This decision scores lowest of all six</strong> because there is no functional reason to diverge from
the native client — only demonstration value.</p>
<table><thead><tr><th>Dimension</th><th>REST Proxy</th><th>Native AvroProducer</th></tr></thead><tbody><tr><td>Extra network hop</td><td>Yes (+ ~1–5 ms per request)</td><td>No</td></tr><tr><td>Schema sent in every request</td><td>Yes (wasteful, ~2 KB)</td><td>No (schema ID only after first register)</td></tr><tr><td>Error handling</td><td>Silent drop on HTTP failure</td><td>Delivery callback with retry</td></tr><tr><td>Maintenance burden</td><td>Two integration patterns to understand</td><td>One</td></tr><tr><td>Polyglot value</td><td>Useful if producer is non-Python</td><td>Not applicable here</td></tr></tbody></table>
<p><strong>Verdict:</strong> 🟡 The REST Proxy choice adds cognitive overhead for no functional gain in a
Python-only system.  If the goal is demonstration, the inconsistency should be documented
clearly (it now is in ADR-005).  For a production system, weather should use <code>AvroProducer</code>
like every other producer, and the REST Proxy demo should be a separate isolated example.</p>
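<p>To make the "schema sent in every request" cost concrete, the sketch below builds the body of an Avro produce call in the REST Proxy v2 format, where the value schema travels as a JSON-encoded string next to the records. The helper <code>build_request</code>, the topic name, and the two-field schema are assumptions for illustration, not the project's actual weather schema.</p>

```python
import json

AVRO_V2 = "application/vnd.kafka.avro.v2+json"  # REST Proxy v2 Avro content type


def build_request(topic, value_schema, records):
    """Return (path, headers, body) for an Avro produce call to REST Proxy."""
    body = json.dumps({
        # The schema travels as a JSON-encoded string inside the JSON body.
        "value_schema": json.dumps(value_schema),
        "records": [{"value": record} for record in records],
    })
    return f"/topics/{topic}", {"Content-Type": AVRO_V2}, body


# Hypothetical weather schema and topic name, for illustration only.
schema = {
    "type": "record",
    "name": "weather",
    "fields": [
        {"name": "temperature", "type": "float"},
        {"name": "status", "type": "string"},
    ],
}
path, headers, body = build_request(
    "weather.v1", schema, [{"temperature": 21.5, "status": "sunny"}]
)

# Fraction of the request body taken up by the schema rather than the data.
schema_share = len(json.dumps(schema)) / len(body)
```

<p>Even with this tiny schema, the schema text makes up a large share of each request body, whereas a native producer sends only the schema ID after the first registration.</p>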
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="26-dashboard-server--tornado-vs-alternatives">2.6 Dashboard Server — Tornado vs Alternatives<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#26-dashboard-server--tornado-vs-alternatives" class="hash-link" aria-label="Direct link to 2.6 Dashboard Server — Tornado vs Alternatives" title="Direct link to 2.6 Dashboard Server — Tornado vs Alternatives">​</a></h3>
<blockquote>
<p><strong>Decision:</strong> ADR-006 — Tornado async web server</p>
</blockquote>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="weighted-scoring-matrix-5">Weighted Scoring Matrix<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#weighted-scoring-matrix-5" class="hash-link" aria-label="Direct link to Weighted Scoring Matrix" title="Direct link to Weighted Scoring Matrix">​</a></h4>
<table><thead><tr><th>Quality Attribute</th><th>Weight</th><th><strong>Tornado</strong></th><th>Flask (sync)</th><th>aiohttp</th><th>FastAPI</th><th>Separate Consumer + Redis</th></tr></thead><tbody><tr><td>Throughput (QA-01)</td><td>20 %</td><td>4</td><td>2</td><td><strong>5</strong></td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td>Decoupling (QA-02)</td><td>20 %</td><td>4</td><td>4</td><td>4</td><td>4</td><td><strong>5</strong></td></tr><tr><td>Schema Evolution (QA-03)</td><td>15 %</td><td>4</td><td>4</td><td>4</td><td>4</td><td>4</td></tr><tr><td>Replayability (QA-04)</td><td>15 %</td><td><strong>5</strong></td><td>3</td><td><strong>5</strong></td><td><strong>5</strong></td><td>4</td></tr><tr><td>Responsiveness (QA-05)</td><td>15 %</td><td><strong>5</strong></td><td>2</td><td><strong>5</strong></td><td><strong>5</strong></td><td><strong>5</strong></td></tr><tr><td>Extensibility (QA-06)</td><td>15 %</td><td>4</td><td>3</td><td>4</td><td><strong>5</strong></td><td>4</td></tr><tr><td><strong>Weighted Total</strong></td><td></td><td><strong>4.30</strong></td><td>3.00</td><td>4.50</td><td><strong>4.65</strong></td><td>4.55</td></tr></tbody></table>
<h4 class="anchor anchorWithStickyNavbar_LWe7" id="trade-off-narrative-5">Trade-off Narrative<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#trade-off-narrative-5" class="hash-link" aria-label="Direct link to Trade-off Narrative" title="Direct link to Trade-off Narrative">​</a></h4>
<p><strong>Why Tornado is a reasonable but not optimal choice:</strong>
Tornado's <code>IOLoop</code> integrates naturally with <code>confluent_kafka</code>'s callback-based API and was the
idiomatic async web server in the Python ecosystem before <code>asyncio</code> matured.  The chosen design
— <code>spawn_callback</code> for consumers, synchronous GET handler — achieves the goal with minimal code.</p>
<p><strong>aiohttp / FastAPI score higher today</strong> because:</p>
<ul>
<li>Both are built natively on <code>asyncio</code> (no legacy compatibility shim)</li>
<li>FastAPI provides automatic OpenAPI documentation</li>
<li>The <code>aiokafka</code> library provides a fully async consumer compatible with both</li>
</ul>
<p><strong>Flask's fatal flaw in this context:</strong> A synchronous web server cannot co-locate Kafka
consumer polling in the same process without threads.  Using threads reintroduces shared-state
locking complexity that the async model eliminates.</p>
<p><strong>Key risk:</strong> 🟡 All four consumers share a single Kafka <code>group.id</code>.  Starting a second
dashboard instance would split partition ownership, causing each instance to see only a
subset of events — producing an incoherent UI state.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="3-cross-cutting-trade-off-analysis">3. Cross-Cutting Trade-off Analysis<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#3-cross-cutting-trade-off-analysis" class="hash-link" aria-label="Direct link to 3. Cross-Cutting Trade-off Analysis" title="Direct link to 3. Cross-Cutting Trade-off Analysis">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="31-serialisation-consistency">3.1 Serialisation Consistency<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#31-serialisation-consistency" class="hash-link" aria-label="Direct link to 3.1 Serialisation Consistency" title="Direct link to 3.1 Serialisation Consistency">​</a></h3>
<p>The system uses <strong>two serialisation formats</strong> across its six topics:</p>
<table><thead><tr><th>Dimension</th><th>Avro path (5 topics)</th><th>JSON path (TURNSTILE_SUMMARY)</th></tr></thead><tbody><tr><td>Schema enforcement</td><td>Registry rejects breaking changes</td><td>None</td></tr><tr><td>Consumer code</td><td><code>AvroConsumer</code> (auto-deserialise)</td><td>Manual JSON decode</td></tr><tr><td>Wire size</td><td>Compact (binary payload prefixed with a schema ID, no embedded schema)</td><td>Verbose</td></tr><tr><td>Debuggability</td><td>Schema Registry UI</td><td>Raw JSON readable in Topics UI</td></tr><tr><td>Risk of silent breakage</td><td>Low</td><td><strong>High</strong></td></tr></tbody></table>
<p><strong>Recommendation:</strong> Register an Avro schema for <code>TURNSTILE_SUMMARY</code> and change
<code>VALUE_FORMAT='AVRO'</code> in the KSQL CREATE TABLE statement. This removes the <code>is_avro=False</code>
branch from the consumer and makes the serialisation model uniform.</p>
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="32-state-management-strategy">3.2 State Management Strategy<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#32-state-management-strategy" class="hash-link" aria-label="Direct link to 3.2 State Management Strategy" title="Direct link to 3.2 State Management Strategy">​</a></h3>
<p>The system employs four distinct state-management patterns with different durability guarantees:</p>
<table><thead><tr><th>State Store</th><th>Pattern</th><th>Cold-Start Cost</th><th>Data Loss Risk</th><th>Recovery</th></tr></thead><tbody><tr><td>PostgreSQL</td><td>Source of truth</td><td>None</td><td>Low (volume)</td><td>Re-seed from CSV</td></tr><tr><td>Kafka logs</td><td>Event log</td><td>None</td><td>🔴 <code>replication_factor=1</code></td><td>None if broker lost</td></tr><tr><td>Faust table (<code>memory://</code>)</td><td>Materialised view</td><td>Replay full topic</td><td>None (replays)</td><td>Automatic</td></tr><tr><td>Tornado in-process</td><td>Derived state</td><td>Replay all 4 topics</td><td>None (replays)</td><td>Automatic</td></tr></tbody></table>
<p><strong>Structural tension:</strong> The design makes all in-process state <strong>reconstructable from Kafka</strong>,
which is elegant and correct.  However, it assumes the Kafka logs are themselves durable,
an assumption violated by <code>replication_factor=1</code>.</p>
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="33-concurrency-model">3.3 Concurrency Model<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#33-concurrency-model" class="hash-link" aria-label="Direct link to 3.3 Concurrency Model" title="Direct link to 3.3 Concurrency Model">​</a></h3>
<p>Three different concurrency approaches coexist across the system:</p>
<table><thead><tr><th>Component</th><th>Concurrency Model</th><th>Thread-safe?</th><th>Scale-out strategy</th></tr></thead><tbody><tr><td><code>simulation.py</code></td><td>Sequential (single Python process, no async)</td><td>N/A</td><td>N/A</td></tr><tr><td><code>faust_stream.py</code></td><td>Asyncio event loop (Faust worker)</td><td>Yes</td><td>Multiple Faust worker instances</td></tr><tr><td><code>ksql.py</code></td><td>Single HTTP request, then exits</td><td>N/A</td><td>N/A</td></tr><tr><td><code>server.py</code></td><td>Tornado IOLoop + <code>spawn_callback</code> coroutines</td><td>Single-thread cooperative</td><td>🔴 Blocked by shared <code>group.id</code></td></tr></tbody></table>
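<p><strong>Scale-out constraint for the dashboard:</strong> because all four consumers in <code>server.py</code> share one
<code>group.id</code>, Kafka treats every dashboard instance as a member of the same consumer group and divides the
partitions between them, so no single instance sees every event.</p>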
<p><strong>Fix:</strong> Each dashboard instance should use a <strong>unique <code>group.id</code></strong> (e.g. append a UUID suffix)
so every instance receives the full partition set and sees all events.</p>
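<p>A minimal sketch of the fix, assuming the consumer settings are assembled as a plain dictionary before being handed to the Kafka client. The function name <code>dashboard_consumer_config</code> and the base group name <code>cta-dashboard</code> are hypothetical. The keys <code>bootstrap.servers</code>, <code>group.id</code>, and <code>auto.offset.reset</code> are standard Kafka client settings.</p>

```python
import uuid


def dashboard_consumer_config(base_group_id, bootstrap_servers):
    """Build consumer settings whose group.id is unique to this process."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # A fresh suffix per process means no two dashboard instances share a
        # consumer group, so each one is assigned every partition.
        "group.id": f"{base_group_id}-{uuid.uuid4().hex}",
        # A brand-new group has no committed offsets, so read from the start
        # of the log to rebuild the in-process state.
        "auto.offset.reset": "earliest",
    }


if __name__ == "__main__":
    # "cta-dashboard" is a placeholder group name, not taken from the project.
    print(dashboard_consumer_config("cta-dashboard", "localhost:9092"))
```

<p>One trade-off to keep in mind: a random suffix creates a new consumer group on every start, so committed offsets are never reused and each start replays the topics in full. That matches the replay-based recovery described in section 3.2, but it leaves idle groups on the broker until their offsets expire.</p>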
<hr>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="34-operational-complexity">3.4 Operational Complexity<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#34-operational-complexity" class="hash-link" aria-label="Direct link to 3.4 Operational Complexity" title="Direct link to 3.4 Operational Complexity">​</a></h3>
<p>The system requires <strong>7 infrastructure containers + 4 Python processes</strong> to be started in a
specific order:</p>
<p><strong>Total components: 11</strong></p>
<table><thead><tr><th>Layer</th><th>Components</th><th>Startup dependencies</th><th>Failure impact</th></tr></thead><tbody><tr><td>Infrastructure</td><td>7 Docker containers</td><td>Ordered by <code>depends_on</code></td><td>Total system down</td></tr><tr><td>Producers</td><td>1 Python process</td><td>Kafka + Schema Registry + Connect up</td><td>No events produced</td></tr><tr><td>Stream processors</td><td>2 Python processes</td><td>Kafka + producer running</td><td>Dashboard sees no data</td></tr><tr><td>Dashboard</td><td>1 Python process</td><td>Stream processors running</td><td>No UI</td></tr></tbody></table>
<p><strong>Operational risk:</strong> 🟠 There is no automated readiness check or restart policy for the Python
processes.  A crash at any layer requires manual diagnosis and ordered restart.</p>
<p><strong>Mitigation options:</strong></p>
<table><thead><tr><th>Option</th><th>Effort</th><th>Benefit</th></tr></thead><tbody><tr><td>Add <code>healthcheck</code> to <code>docker-compose.yaml</code> for each service</td><td>Low</td><td>Detect infrastructure failures automatically</td></tr><tr><td>Wrap Python processes in a <code>Makefile</code> with retry logic</td><td>Low</td><td>Reduce manual restart toil</td></tr><tr><td>Add a startup probe script (<code>wait-for-it.sh</code> pattern)</td><td>Medium</td><td>Enforce ordering without manual timing</td></tr><tr><td>Convert Python processes to Docker services</td><td>Medium</td><td>Unified <code>docker-compose up</code> startup</td></tr><tr><td>Migrate to Kubernetes with init containers + readiness probes</td><td>High</td><td>Production-grade orchestration</td></tr></tbody></table>
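<p>The startup-probe option can be sketched in a few lines of Python that a process would run before connecting to its dependencies. This is an illustration of the <code>wait-for-it.sh</code> pattern, not code from the project; <code>wait_for_port</code> and its parameters are hypothetical names.</p>

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect is the readiness signal; close it at once.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


if __name__ == "__main__":
    # Example: block until the local Kafka broker port is reachable.
    if not wait_for_port("localhost", 9092, timeout_s=120.0):
        raise SystemExit("Kafka broker did not become reachable in time")
```

<p>An open TCP port only shows that the process is listening. It does not prove that a broker has finished starting or that the required topics exist, so a probe like this complements a container <code>healthcheck</code> rather than replacing it.</p>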
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="4-architecture-fitness-function">4. Architecture Fitness Function<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#4-architecture-fitness-function" class="hash-link" aria-label="Direct link to 4. Architecture Fitness Function" title="Direct link to 4. Architecture Fitness Function">​</a></h2>
<p>A fitness function scores how well the <strong>as-built</strong> architecture meets each quality attribute.</p>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">Scale: ████████░░ = partially met   ██████████ = fully met   ████░░░░░░ = significantly unmet</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<table><thead><tr><th>QA</th><th>Attribute</th><th style="text-align:center">Current Score</th><th>Evidence</th><th>Gap</th></tr></thead><tbody><tr><td>QA-01</td><td>Throughput</td><td style="text-align:center">🟢 4.0/5</td><td>Kafka partitioning, AvroProducer batching handle demo load</td><td>Single broker caps real scale</td></tr><tr><td>QA-02</td><td>Decoupling</td><td style="text-align:center">🟢 4.5/5</td><td>All flows via Kafka; zero direct service calls</td><td>REST Proxy + native producer inconsistency</td></tr><tr><td>QA-03</td><td>Schema Evolution</td><td style="text-align:center">🟡 3.5/5</td><td>Avro + Schema Registry on 5/6 topics</td><td>TURNSTILE_SUMMARY bypasses registry</td></tr><tr><td>QA-04</td><td>Replayability</td><td style="text-align:center">🟡 3.5/5</td><td><code>offset_earliest</code> on all consumers; Faust rebuilds</td><td><code>replication_factor=1</code> — log loss is unrecoverable</td></tr><tr><td>QA-05</td><td>Responsiveness</td><td style="text-align:center">🟢 4.0/5</td><td>Tornado async model; non-blocking poll</td><td>Hard <code>exit(1)</code> if topics missing blocks startup</td></tr><tr><td>QA-06</td><td>Extensibility</td><td style="text-align:center">🟢 4.5/5</td><td>New consumers subscribe without producer changes</td><td>Startup ordering is implicit, not automated</td></tr></tbody></table>
<p><strong>Overall fitness: 4.0 / 5.0 (80 %) — suitable for demonstration, not production</strong></p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="5-strategic-recommendations">5. Strategic Recommendations<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#5-strategic-recommendations" class="hash-link" aria-label="Direct link to 5. Strategic Recommendations" title="Direct link to 5. Strategic Recommendations">​</a></h2>
<p>Prioritised by risk reduction value vs implementation effort:</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="priority-1--critical-address-before-any-production-use">Priority 1 — Critical (address before any production use)<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#priority-1--critical-address-before-any-production-use" class="hash-link" aria-label="Direct link to Priority 1 — Critical (address before any production use)" title="Direct link to Priority 1 — Critical (address before any production use)">​</a></h3>
<table><thead><tr><th>#</th><th>Recommendation</th><th>ADR</th><th>Risk Addressed</th><th>Effort</th></tr></thead><tbody><tr><td>P1-1</td><td>Set <code>replication_factor=3</code> on all topics; add 2 Kafka brokers</td><td>ADR-001</td><td>🔴 Data loss on broker failure</td><td>Medium</td></tr><tr><td>P1-2</td><td>Externalise credentials and connection settings (<code>DB_USER</code>, <code>DB_PASS</code>, <code>BOOTSTRAP_SERVERS</code>) to environment variables</td><td>ADR-003</td><td>🔴 Credential exposure</td><td>Low</td></tr><tr><td>P1-3</td><td>Assign unique <code>group.id</code> per dashboard instance</td><td>ADR-006</td><td>🔴 Incoherent state on scale-out</td><td>Low</td></tr></tbody></table>
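<p>As an illustration of P1-2, the sketch below reads the three variables named in the table from the environment and fails fast when a secret is missing. The function name <code>load_settings</code> and the shape of the returned dictionary are assumptions; only the variable names come from the recommendation.</p>

```python
import os


def load_settings(env=os.environ):
    """Read connection settings from the environment, failing fast on gaps."""
    missing = [name for name in ("DB_USER", "DB_PASS") if not env.get(name)]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return {
        "db_user": env["DB_USER"],
        "db_pass": env["DB_PASS"],
        # Not a secret, so a local-development default is reasonable here.
        "bootstrap_servers": env.get("BOOTSTRAP_SERVERS", "localhost:9092"),
    }
```

<p>Passing the environment in as a parameter keeps the function easy to test with a plain dictionary, and raising on missing credentials avoids silently falling back to a hard-coded value.</p>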
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="priority-2--high-address-in-first-production-sprint">Priority 2 — High (address in first production sprint)<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#priority-2--high-address-in-first-production-sprint" class="hash-link" aria-label="Direct link to Priority 2 — High (address in first production sprint)" title="Direct link to Priority 2 — High (address in first production sprint)">​</a></h3>
<table><thead><tr><th>#</th><th>Recommendation</th><th>ADR</th><th>Risk Addressed</th><th>Effort</th></tr></thead><tbody><tr><td>P2-1</td><td>Register Avro schema for <code>TURNSTILE_SUMMARY</code>; change <code>VALUE_FORMAT='AVRO'</code></td><td>ADR-002</td><td>🟠 Silent schema breaks on rider count</td><td>Low</td></tr><tr><td>P2-2</td><td>Replace <code>AvroProducer</code> with <code>SerializingProducer</code> + <code>AvroSerializer</code></td><td>ADR-002</td><td>🟠 Deprecated API removal</td><td>Medium</td></tr><tr><td>P2-3</td><td>Add automated startup ordering (health checks + wait scripts)</td><td>ADR-004</td><td>🟠 Manual restart toil on failure</td><td>Medium</td></tr><tr><td>P2-4</td><td>Replace <code>store="memory://"</code> with <code>store="rocksdb://"</code> in Faust</td><td>ADR-004</td><td>🟠 Startup latency grows with topic size</td><td>Low</td></tr></tbody></table>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="priority-3--medium-address-in-backlog">Priority 3 — Medium (address in backlog)<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#priority-3--medium-address-in-backlog" class="hash-link" aria-label="Direct link to Priority 3 — Medium (address in backlog)" title="Direct link to Priority 3 — Medium (address in backlog)">​</a></h3>
<table><thead><tr><th>#</th><th>Recommendation</th><th>ADR</th><th>Risk Addressed</th><th>Effort</th></tr></thead><tbody><tr><td>P3-1</td><td>Unify all producers to use <code>AvroProducer</code>; remove REST Proxy dependency</td><td>ADR-005</td><td>🟡 Cognitive overhead for maintainers</td><td>Low</td></tr><tr><td>P3-2</td><td>Migrate <code>faust_stream.py</code> + <code>server.py</code> to FastAPI + aiokafka</td><td>ADR-006</td><td>🟡 Tornado is legacy; FastAPI is modern async standard</td><td>High</td></tr><tr><td>P3-3</td><td>Merge the two <code>constants.py</code> files into a shared config module</td><td>—</td><td>🟡 Duplication / drift risk</td><td>Low</td></tr><tr><td>P3-4</td><td>Add unit tests for all producer models and consumer models</td><td>—</td><td>🟡 No regression safety net</td><td>High</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="6-trade-off-summary-heatmap">6. Trade-off Summary Heatmap<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#6-trade-off-summary-heatmap" class="hash-link" aria-label="Direct link to 6. Trade-off Summary Heatmap" title="Direct link to 6. Trade-off Summary Heatmap">​</a></h2>
<p>The heatmap below shows the contribution of each architectural decision (rows) to each quality
attribute (columns).  Green = positive contribution; Red = negative contribution.</p>
<table><thead><tr><th>Decision</th><th style="text-align:center">Throughput</th><th style="text-align:center">Decoupling</th><th style="text-align:center">Schema Evol.</th><th style="text-align:center">Replayability</th><th style="text-align:center">Responsiveness</th><th style="text-align:center">Extensibility</th><th style="text-align:center"><strong>Net</strong></th></tr></thead><tbody><tr><td><strong>ADR-001</strong> Kafka</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢🟢</td><td style="text-align:center"><strong>+10</strong></td></tr><tr><td><strong>ADR-002</strong> Avro</td><td style="text-align:center">🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢🟢</td><td style="text-align:center"><strong>+9</strong></td></tr><tr><td><strong>ADR-003</strong> Connect</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢🟢</td><td style="text-align:center"><strong>+6</strong></td></tr><tr><td><strong>ADR-004</strong> Faust+KSQL</td><td style="text-align:center">🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟡</td><td style="text-align:center">🔴</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢🟢</td><td style="text-align:center"><strong>+4</strong></td></tr><tr><td><strong>ADR-005</strong> REST Proxy</td><td style="text-align:center">🔴</td><td style="text-align:center">🟢</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢</td><td style="text-align:center">🔴</td><td style="text-align:center">🟢</td><td style="text-align:center"><strong>+1</strong></td></tr><tr><td><strong>ADR-006</strong> Tornado</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟢🟢</td><td style="text-align:center">🟡</td><td style="text-align:center"><strong>+4</strong></td></tr><tr><td><strong>ADR-002 gap</strong> JSON KSQL</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center">🔴🔴</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center"><strong>-2</strong></td></tr><tr><td><strong>ADR-001 gap</strong> RF=1</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center">🔴🔴</td><td style="text-align:center">🟡</td><td style="text-align:center">🟡</td><td style="text-align:center"><strong>-2</strong></td></tr></tbody></table>
<p><strong>Key:</strong></p>
<ul>
<li>🟢🟢 Strong positive (+2)</li>
<li>🟢 Positive (+1)</li>
<li>🟡 Neutral (0)</li>
<li>🔴 Negative (−1)</li>
<li>🔴🔴 Strong negative (−2)</li>
</ul>
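<p>The Net column is the sum of the point values in the key across the six quality attributes. A short Python sketch of that calculation follows. The cell values are transcribed from the heatmap; the names <code>POINTS</code>, <code>HEATMAP</code>, and <code>net</code> are illustrative.</p>

```python
# Point value of each heatmap cell, as defined in the key.
POINTS = {"🟢🟢": 2, "🟢": 1, "🟡": 0, "🔴": -1, "🔴🔴": -2}

# Cells per decision, in column order: Throughput, Decoupling, Schema Evol.,
# Replayability, Responsiveness, Extensibility.
HEATMAP = {
    "ADR-001 Kafka": ["🟢🟢", "🟢🟢", "🟢🟢", "🟢🟢", "🟡", "🟢🟢"],
    "ADR-002 Avro": ["🟢", "🟢🟢", "🟢🟢", "🟢🟢", "🟡", "🟢🟢"],
    "ADR-003 Connect": ["🟡", "🟢🟢", "🟡", "🟢🟢", "🟡", "🟢🟢"],
    "ADR-004 Faust+KSQL": ["🟢", "🟢🟢", "🟡", "🔴", "🟡", "🟢🟢"],
    "ADR-005 REST Proxy": ["🔴", "🟢", "🟡", "🟢", "🔴", "🟢"],
    "ADR-006 Tornado": ["🟡", "🟡", "🟡", "🟢🟢", "🟢🟢", "🟡"],
    "ADR-002 gap JSON KSQL": ["🟡", "🟡", "🔴🔴", "🟡", "🟡", "🟡"],
    "ADR-001 gap RF=1": ["🟡", "🟡", "🟡", "🔴🔴", "🟡", "🟡"],
}


def net(cells):
    """Sum the point values of one heatmap row."""
    return sum(POINTS[cell] for cell in cells)


if __name__ == "__main__":
    for decision, cells in HEATMAP.items():
        print(f"{decision}: {net(cells):+d}")
```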
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="overall-assessment">Overall Assessment<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/trade-off-analysis#overall-assessment" class="hash-link" aria-label="Direct link to Overall Assessment" title="Direct link to Overall Assessment">​</a></h3>
<p>The core architectural spine — <strong>Kafka + Avro + Schema Registry + Kafka Connect</strong> — scores
highly and is well-suited to the problem.  The gaps are concentrated in two areas:</p>
<ol>
<li><strong>Operational resilience</strong> (single broker, no startup orchestration)</li>
<li><strong>Serialisation consistency</strong> (KSQL JSON bypass undermines the otherwise strong Avro contract)</li>
</ol>
<p>Addressing the P1 recommendations above would raise the overall fitness score from
<strong>4.0 / 5.0 → ~4.6 / 5.0</strong>, making the architecture production-worthy.</p>
<hr>
<p><em>Trade-off analysis generated by reverse-engineering source code as of 2026-03-12.
Weighted scores are analytical judgements based on code evidence, not empirical benchmarks.</em></p>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[ADR-001: Apache Kafka as the Central Event Bus]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Date: 2026-03-12]]></summary>
        <content type="html"><![CDATA[<p><strong>Date:</strong> 2026-03-12
<strong>Status:</strong> Accepted
<strong>Deciders:</strong> Engineering Team</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="context">Context<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus#context" class="hash-link" aria-label="Direct link to Context" title="Direct link to Context">​</a></h2>
<p>The CTA (Chicago Transit Authority) public transport optimisation system must ingest and distribute
high-frequency, heterogeneous events from multiple sources:</p>
<ul>
<li>Train arrivals at every station on three colour lines (Blue, Red, Green), each carrying 10 trains</li>
<li>Turnstile entry counts produced per time-step at every station</li>
<li>Hourly weather readings</li>
<li>Static station reference data held in a relational database</li>
</ul>
<p>A naive polling or REST-request-per-event approach would not scale to the volume, would couple
producers tightly to consumers, and would make it difficult to replay events for new
consumers.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="decision">Decision<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus#decision" class="hash-link" aria-label="Direct link to Decision" title="Direct link to Decision">​</a></h2>
<p>Apache Kafka (Confluent Platform 5.2.2) is used as the single, central event streaming backbone.
All data flows in and out of Kafka topics; no service communicates directly with another.</p>
<p>Evidence from code:</p>
<table><thead><tr><th>Source</th><th>Topic</th><th>Producer mechanism</th></tr></thead><tbody><tr><td>Train simulation</td><td><code>org.chicago.cta.station.arrivals.t001</code></td><td><code>confluent_kafka</code> <code>AvroProducer</code></td></tr><tr><td>Turnstile simulation</td><td><code>com.cta.stations.turnstile.entry</code></td><td><code>confluent_kafka</code> <code>AvroProducer</code></td></tr><tr><td>Weather simulation</td><td><code>org.chicago.cta.weather.v1</code></td><td>Kafka REST Proxy (HTTP POST)</td></tr><tr><td>PostgreSQL stations table</td><td><code>com.cta.stations.data.rawt001.stations</code></td><td>Kafka Connect JDBC Source</td></tr><tr><td>Faust stream processor</td><td><code>org.chicago.cta.stations.table.v1t001</code></td><td>Faust internal producer</td></tr><tr><td>KSQL aggregation</td><td><code>TURNSTILE_SUMMARY</code></td><td>KSQL internal producer</td></tr></tbody></table>
<p>Topics are created with LZ4 compression, a short delete-retention window (2 s) suitable for
real-time transit dashboards, and 10 partitions by default for arrival topics
(<code>producers/models/producer.py:18-32</code>).</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="alternatives-considered">Alternatives Considered<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus#alternatives-considered" class="hash-link" aria-label="Direct link to Alternatives Considered" title="Direct link to Alternatives Considered">​</a></h2>
<table><thead><tr><th>Alternative</th><th>Reason Rejected</th></tr></thead><tbody><tr><td>RabbitMQ / AMQP</td><td>No log-replay; difficult to add consumers without re-engineering</td></tr><tr><td>REST polling from dashboard</td><td>Tight coupling, synchronous latency, no fan-out</td></tr><tr><td>Redis Streams</td><td>Weaker ecosystem for schema enforcement and SQL-style aggregations</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="consequences">Consequences<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus#consequences" class="hash-link" aria-label="Direct link to Consequences" title="Direct link to Consequences">​</a></h2>
<p><strong>Positive</strong></p>
<ul>
<li>Producers and consumers are fully decoupled; new consumers (e.g. analytics) can subscribe
independently without touching producers.</li>
<li>Log retention enables late-joining consumers to replay from the earliest offset
(<code>offset_earliest=True</code> in <code>consumers/server.py:73-91</code>).</li>
<li>Kafka's partition model provides horizontal scale-out for high-throughput arrival events.</li>
</ul>
<p><strong>Negative / Risks</strong></p>
<ul>
<li>The single-broker setup (<code>kafka0</code> in <code>docker-compose.yaml</code>) is a single point of failure. That is
acceptable for local development, but production would require at least 3 brokers.</li>
<li>Replication factor is set to <code>1</code> throughout, so a broker failure loses data.</li>
<li>Hard-coded <code>localhost:9092</code> in producer bootstrap config (<code>producers/models/producer.py:66</code>)
couples the code to the local Docker environment.</li>
</ul>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[ADR-002: Avro Schemas + Confluent Schema Registry for Message Contracts]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Date: 2026-03-12]]></summary>
        <content type="html"><![CDATA[<p><strong>Date:</strong> 2026-03-12
<strong>Status:</strong> Accepted
<strong>Deciders:</strong> Engineering Team</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="context">Context<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry#context" class="hash-link" aria-label="Direct link to Context" title="Direct link to Context">​</a></h2>
<p>Multiple independent processes — producers written in Python and stream processors written with
Faust and KSQL — exchange messages over Kafka.  Without a shared, versioned contract, a schema
change in a producer silently breaks downstream consumers.  The system needs:</p>
<ol>
<li>A machine-readable schema for every message type.</li>
<li>A registry that enforces backward/forward compatibility on publish.</li>
<li>Consumers that can deserialise messages without embedding the schema in every message.</li>
</ol>
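<p>Requirement 3 is what Schema Registry framing provides: each message carries a small schema ID instead of the full schema. The sketch below splits a message using the Confluent wire format (one magic byte of <code>0</code>, a 4-byte big-endian schema ID, then the Avro-encoded payload). The function name <code>split_confluent_frame</code> is hypothetical, and the project itself relies on <code>AvroConsumer</code> to do this work rather than on hand-written parsing.</p>

```python
import struct

MAGIC_BYTE = 0  # First byte of every Schema Registry framed message.
HEADER_LEN = 5  # One magic byte plus a 4-byte schema ID.


def split_confluent_frame(message):
    """Split a framed Kafka message value into (schema_id, avro_payload)."""
    if not (len(message) >= HEADER_LEN and message[0] == MAGIC_BYTE):
        raise ValueError("not a Schema Registry framed message")
    # The schema ID is an unsigned 32-bit integer in network byte order.
    (schema_id,) = struct.unpack(">I", message[1:HEADER_LEN])
    return schema_id, message[HEADER_LEN:]
```

<p>A consumer uses the returned ID to fetch (and cache) the writer schema from the registry, then decodes the remaining bytes with it.</p>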
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="decision">Decision<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry#decision" class="hash-link" aria-label="Direct link to Decision" title="Direct link to Decision">​</a></h2>
<p>Apache Avro is the default serialisation format for first-party Kafka topics, with Confluent
Schema Registry (port 8081) acting as the central schema store.  Python producers that use the
shared producer base class publish Avro-encoded messages via <code>AvroProducer</code>, and consumers use
Avro deserialisation for those Avro-backed topics via <code>AvroConsumer</code> from
<code>confluent-kafka-python</code>.</p>
<p>There are explicit exceptions to that default path.  Weather data is produced via the REST Proxy
(<code>producers/models/weather.py</code>) rather than through <code>AvroProducer</code>.  The dashboard also consumes
some JSON topics with <code>is_avro=False</code> in <code>consumers/server.py</code>, including the stations table and
the <code>TURNSTILE_SUMMARY</code> topic.</p>
<p>Schema files are stored as JSON alongside the producer models:</p>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">producers/models/schemas/</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  arrival_key.json</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  arrival_value.json</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  turnstile_key.json</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  turnstile_value.json</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  weather_key.json</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  weather_value.json</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Representative schema (<code>arrival_value.json</code>):</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token property">"namespace"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"com.udacity"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"record"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"arrival.value"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token property">"fields"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> 
</span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"station_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">       </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"int"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"train_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">         </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"string"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token 
plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"direction"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">        </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"string"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"line"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">             </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"null"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token string" style="color:rgb(195, 232, 141)">"string"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span 
class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"train_status"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">     </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"null"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token string" style="color:rgb(195, 232, 141)">"string"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"prev_station_id"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">  </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 
232, 141)">"null"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token string" style="color:rgb(195, 232, 141)">"int"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token property">"name"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"prev_direction"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain">   </span><span class="token property">"type"</span><span class="token operator" style="color:rgb(137, 221, 255)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token string" style="color:rgb(195, 232, 141)">"null"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token string" style="color:rgb(195, 232, 141)">"string"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><br></span></code></pre></div></div>
<p>The producer base class wires schemas at construction time
(<code>producers/models/producer.py:75-77</code>):</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">self</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">avroProducer </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> AvroProducer</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"bootstrap.servers"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"..."</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"schema.registry.url"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"http://localhost:8081"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    default_key_schema</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">self</span><span class="token 
punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">key_schema</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> default_value_schema</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">self</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">value_schema</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre></div></div>
<p>The exception is the <code>TURNSTILE_SUMMARY</code> topic produced by KSQL, which uses JSON encoding
(<code>VALUE_FORMAT='JSON'</code>) and is consumed without Avro deserialisation
(<code>consumers/server.py:87-91</code>).</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="alternatives-considered">Alternatives Considered<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry#alternatives-considered" class="hash-link" aria-label="Direct link to Alternatives Considered" title="Direct link to Alternatives Considered">​</a></h2>
<table><thead><tr><th>Alternative</th><th>Reason Rejected</th></tr></thead><tbody><tr><td>JSON (plain)</td><td>No schema enforcement; brittle under field renames</td></tr><tr><td>Protobuf</td><td>Supported by Confluent but less native to the Python confluent-kafka library at the time</td></tr><tr><td>MessagePack</td><td>No registry ecosystem; debugging harder</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="consequences">Consequences<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry#consequences" class="hash-link" aria-label="Direct link to Consequences" title="Direct link to Consequences">​</a></h2>
<p><strong>Positive</strong></p>
<ul>
<li>Schema Registry enforces compatibility before messages are published.</li>
<li>Schema IDs are embedded in the Avro wire format — consumers can always retrieve the exact schema
used to write a message.</li>
<li>Faust's <code>faust.Record</code> dataclasses mirror the Avro schema structure, making the contract
explicit in both the registry and the Python type system
(<code>consumers/faust_stream.py:14-33</code>).</li>
</ul>
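<p>The schema-ID point can be made concrete with a minimal sketch. It assumes the standard Confluent wire format (one zero "magic" byte, then a 4-byte big-endian schema ID, then the Avro body). <code>split_confluent_frame</code> is a hypothetical helper name used for illustration only; it is not part of this project or of <code>confluent-kafka</code>, and the sample payload is made up.</p>

```python
import struct

MAGIC_BYTE = 0   # marker byte that opens every Confluent-framed message
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def split_confluent_frame(payload: bytes) -> tuple[int, bytes]:
    """Return (schema_id, avro_body) for a Confluent-framed message.

    Hypothetical helper: it only reads the header and does not decode Avro.
    """
    try:
        # ">bI" = big-endian, one signed byte followed by one unsigned 32-bit int
        magic, schema_id = struct.unpack(">bI", payload[:HEADER_SIZE])
    except struct.error as exc:
        raise ValueError("payload is shorter than the 5-byte header") from exc
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, payload[HEADER_SIZE:]


# Illustrative frame: schema ID 42 followed by placeholder body bytes.
frame = struct.pack(">bI", MAGIC_BYTE, 42) + b"avro-body"
schema_id, body = split_confluent_frame(frame)
```

<p>A plain JSON value, such as one read from <code>TURNSTILE_SUMMARY</code>, begins with <code>{</code> rather than the magic byte, so this check rejects it. That is one way to see why the JSON topic has to be handled on a separate code path.</p>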
<p><strong>Negative / Risks</strong></p>
<ul>
<li><code>AvroProducer</code> is marked as a legacy API in newer Confluent SDK versions; migration to
<code>SerializingProducer</code> with <code>AvroSerializer</code> will be needed.</li>
<li>The KSQL <code>TURNSTILE_SUMMARY</code> topic diverges from the Avro convention (uses JSON), creating an
inconsistency that consumers must handle explicitly (<code>is_avro=False</code>).</li>
<li>Schema files live inside <code>producers/</code> only; the consumer side has no local copy, creating a
coupling between producer deployment and consumer startup.</li>
</ul>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[ADR-003: Kafka Connect JDBC Source for PostgreSQL Station Data]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Date: 2026-03-12]]></summary>
        <content type="html"><![CDATA[<p><strong>Date:</strong> 2026-03-12<br>
<strong>Status:</strong> Accepted<br>
<strong>Deciders:</strong> Engineering Team</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="context">Context<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres#context" class="hash-link" aria-label="Direct link to Context" title="Direct link to Context">​</a></h2>
<p>Station reference data (stop IDs, names, line membership, ordering) is stored in a PostgreSQL
table (<code>stations</code>) seeded from a CSV file at container start-up (<code>load_stations.sql</code>).  The
consumer side needs this data in Kafka so that stream processors (Faust) can enrich and transform
it alongside real-time event streams.</p>
<p>Two ingestion options were on the table:</p>
<ol>
<li>Write a bespoke Python producer that reads from the database and publishes to Kafka.</li>
<li>Use an existing, off-the-shelf connector that understands JDBC semantics.</li>
</ol>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="decision">Decision<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres#decision" class="hash-link" aria-label="Direct link to Decision" title="Direct link to Decision">​</a></h2>
<p>Confluent Kafka Connect with the <code>JdbcSourceConnector</code> is used to stream the <code>stations</code> table
from PostgreSQL into Kafka automatically.</p>
<p>The connector is configured programmatically at simulation start-up via the Kafka Connect REST API
(<code>producers/connector.py:16-57</code>):</p>
<table><thead><tr><th>Config key</th><th>Value</th><th>Rationale</th></tr></thead><tbody><tr><td><code>connector.class</code></td><td><code>io.confluent.connect.jdbc.JdbcSourceConnector</code></td><td>Standard JDBC source</td></tr><tr><td><code>mode</code></td><td><code>incrementing</code></td><td>Detects new rows via monotonically increasing <code>stop_id</code></td></tr><tr><td><code>incrementing.column.name</code></td><td><code>stop_id</code></td><td>Primary key / surrogate key for new-row detection</td></tr><tr><td><code>table.whitelist</code></td><td><code>stations</code></td><td>Scope connector to a single table</td></tr><tr><td><code>topic.prefix</code></td><td><code>com.cta.stations.data.rawt001.</code></td><td>Output topic = prefix + table name</td></tr><tr><td><code>poll.interval.ms</code></td><td><code>3600000</code> (1 h)</td><td>Station data is quasi-static; hourly polling is sufficient</td></tr><tr><td><code>batch.max.rows</code></td><td><code>500</code></td><td>Limits per-poll memory footprint</td></tr></tbody></table>
<p>Connector setup is idempotent: if the connector already exists, the setup function returns early
(<code>producers/connector.py:19-22</code>).</p>
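<p>A minimal sketch of that check-then-register flow follows. It is not the code in <code>producers/connector.py</code>. The REST endpoint <code>http://localhost:8083/connectors</code>, the connector name <code>stations</code>, and the helper names <code>build_config</code> and <code>ensure_connector</code> are assumptions made for illustration, and the database connection settings are left out. The HTTP calls are passed in as plain callables so the logic can be exercised without a running Kafka Connect worker.</p>

```python
import json
from typing import Callable

CONNECT_URL = "http://localhost:8083/connectors"  # assumed Kafka Connect REST endpoint
CONNECTOR_NAME = "stations"                       # hypothetical connector name


def build_config() -> dict:
    """Connector settings from the table above; connection details are omitted."""
    return {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "mode": "incrementing",
        "incrementing.column.name": "stop_id",
        "table.whitelist": "stations",
        "topic.prefix": "com.cta.stations.data.rawt001.",
        "poll.interval.ms": str(60 * 60 * 1000),  # one hour in milliseconds
        "batch.max.rows": "500",
    }


def ensure_connector(
    get_status: Callable[[str], int],
    post_json: Callable[[str, str], int],
) -> bool:
    """Register the connector unless it already exists.

    Returns True if a registration request was sent, False if it was skipped.
    """
    if get_status(f"{CONNECT_URL}/{CONNECTOR_NAME}") == 200:
        return False  # already registered, so repeated calls change nothing
    body = json.dumps({"name": CONNECTOR_NAME, "config": build_config()})
    status = post_json(CONNECT_URL, body)
    if status not in (200, 201):
        raise RuntimeError(f"connector registration failed with HTTP {status}")
    return True


# In-memory stand-ins for the HTTP calls, for demonstration only.
posted = []
created = ensure_connector(lambda url: 404, lambda url, body: posted.append(body) or 201)
skipped = ensure_connector(lambda url: 200, lambda url, body: 500)
```

<p>In the first call the status lookup reports 404, so one registration request is sent. In the second the lookup reports 200, so nothing is posted.</p>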
<p>Faust then reads from the output topic <code>com.cta.stations.data.rawt001.stations</code>
(<code>consumers/faust_stream.py:40</code>).</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="alternatives-considered">Alternatives Considered<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres#alternatives-considered" class="hash-link" aria-label="Direct link to Alternatives Considered" title="Direct link to Alternatives Considered">​</a></h2>
<table><thead><tr><th>Alternative</th><th>Reason Rejected</th></tr></thead><tbody><tr><td>Custom Python producer reading from PostgreSQL</td><td>More code to maintain; no built-in retry or offset tracking</td></tr><tr><td>Debezium CDC connector</td><td>Overkill for quasi-static reference data; requires PostgreSQL WAL configuration</td></tr><tr><td>Reading CSV directly in producer</td><td>Bypasses the Kafka pipeline; consumers cannot subscribe independently</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="consequences">Consequences<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres#consequences" class="hash-link" aria-label="Direct link to Consequences" title="Direct link to Consequences">​</a></h2>
<p><strong>Positive</strong></p>
<ul>
<li>Zero custom ingestion code; the connector handles polling, batching, and offset management.</li>
<li>Decouples the database schema from producer code — schema changes propagate via the connector.</li>
<li>New consumers of station data subscribe to the Kafka topic without touching the database.</li>
</ul>
<p><strong>Negative / Risks</strong></p>
<ul>
<li><code>incrementing</code> mode only detects inserts, not updates or deletes; stale station data will not
be corrected unless the connector is reset.</li>
<li>Hard-coded credentials (<code>cta_admin</code> / <code>chicago</code>) in <code>producers/connector.py:43-44</code> must be externalised
for any non-local environment.</li>
<li>The connector is registered once at simulation start; a crash before registration completes
leaves no station data in Kafka.</li>
</ul>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[ADR-004: Dual Stream-Processing Engines — Faust (Python) + KSQL]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Date: 2026-03-12]]></summary>
        <content type="html"><![CDATA[<p><strong>Date:</strong> 2026-03-12<br>
<strong>Status:</strong> Accepted<br>
<strong>Deciders:</strong> Engineering Team</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="context">Context<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql#context" class="hash-link" aria-label="Direct link to Context" title="Direct link to Context">​</a></h2>
<p>Two distinct stream-processing requirements exist:</p>
<ol>
<li>
<p><strong>Station enrichment</strong> — raw station rows arriving from the JDBC connector carry boolean
<code>red</code>/<code>blue</code>/<code>green</code> columns.  A downstream topic is needed that replaces these booleans with
a single <code>line</code> string field and retains only the fields required by the UI model.</p>
</li>
<li>
<p><strong>Turnstile aggregation</strong> — individual turnstile-entry events must be aggregated into a count
per station so the dashboard can display a single rider-count per station rather than a
stream of raw entry records.</p>
</li>
</ol>
<p>These two problems have different shapes: the first is a stateless record-by-record
transformation; the second is a stateful GROUP BY aggregation.</p>
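<p>The difference in shape can be sketched in plain Python, without Faust or KSQL. The helper names <code>to_line</code> and <code>count_entries</code> and the sample rows are illustrative assumptions; only the <code>red</code>/<code>blue</code>/<code>green</code> and <code>station_id</code> field names come from the data described above.</p>

```python
from collections import Counter
from typing import Iterable, Optional


def to_line(row: dict) -> Optional[str]:
    """Stateless: the result depends only on the single row passed in."""
    for colour in ("red", "blue", "green"):
        if row.get(colour):
            return colour
    return None  # no line flag set


def count_entries(events: Iterable[dict]) -> Counter:
    """Stateful: a running count per station has to persist across events."""
    counts: Counter = Counter()
    for event in events:
        counts[event["station_id"]] += 1
    return counts


# Illustrative data only.
rows = [
    {"station_id": 1, "red": True},
    {"station_id": 2, "red": False, "blue": True},
]
lines = [to_line(row) for row in rows]

events = [{"station_id": 1}, {"station_id": 2}, {"station_id": 1}]
totals = count_entries(events)
```

<p><code>to_line</code> can run on any row in any order, because it keeps nothing between calls. <code>count_entries</code> is only correct while its <code>Counter</code> survives from one event to the next, and that retained state is what a stream processor must store and recover for a grouped aggregation.</p>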
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="decision">Decision<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql#decision" class="hash-link" aria-label="Direct link to Decision" title="Direct link to Decision">​</a></h2>
<p>Two separate stream-processing tools are used, each chosen for its natural fit with one problem:</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="faust--station-transformation-consumersfaust_streampy">Faust — station transformation (<code>consumers/faust_stream.py</code>)<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql#faust--station-transformation-consumersfaust_streampy" class="hash-link" aria-label="Direct link to faust--station-transformation-consumersfaust_streampy" title="Direct link to faust--station-transformation-consumersfaust_streampy">​</a></h3>
<p>Faust is a Python-native stream-processing library.  It is used to:</p>
<ul>
<li>Subscribe to <code>com.cta.stations.data.rawt001.stations</code></li>
<li>Produce <code>TransformedStation</code> records to <code>org.chicago.cta.stations.table.v1t001</code></li>
<li>Maintain an in-memory Faust Table as a materialised view keyed by <code>station_id</code></li>
</ul>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">@app</span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token decorator annotation punctuation" style="color:rgb(199, 146, 234)">agent</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">in_topic</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">async</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">def</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">transform_stations</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">in_stations</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">async</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">for</span><span class="token plain"> sn </span><span class="token keyword" style="font-style:italic">in</span><span class="token plain"> in_stations</span><span class="token punctuation" 
style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        t </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> TransformedStation</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">sn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">station_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> sn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">station_name</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> sn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">order</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"na"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">if</span><span class="token plain"> sn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">red</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain">   t</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">line </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"red"</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">elif</span><span class="token plain"> sn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">blue</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> t</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">line </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"blue"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">elif</span><span class="token plain"> sn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">green</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> t</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">line </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"green"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token keyword" style="font-style:italic">else</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">continue</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        table</span><span class="token 
punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">sn</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">station_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"> </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> t</span><br></span></code></pre></div></div>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="ksql--turnstile-aggregation-consumersksqlpy">KSQL — turnstile aggregation (<code>consumers/ksql.py</code>)<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql#ksql--turnstile-aggregation-consumersksqlpy" class="hash-link" aria-label="Direct link to ksql--turnstile-aggregation-consumersksqlpy" title="Direct link to ksql--turnstile-aggregation-consumersksqlpy">​</a></h3>
<p>KSQL (now ksqlDB) is used to express a SQL aggregation over the turnstile topic:</p>
<div class="language-sql codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-sql codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token keyword" style="font-style:italic">CREATE</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">TABLE</span><span class="token plain"> turnstile </span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">WITH</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">KAFKA_TOPIC</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">'com.cta.stations.turnstile.entry'</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> VALUE_FORMAT</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">'AVRO'</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">KEY</span><span class="token operator" 
style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">'station_id'</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token keyword" style="font-style:italic">CREATE</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">TABLE</span><span class="token plain"> TURNSTILE_SUMMARY </span><span class="token keyword" style="font-style:italic">WITH</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">VALUE_FORMAT</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token string" style="color:rgb(195, 232, 141)">'JSON'</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">AS</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">SELECT</span><span class="token plain"> station_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token function" style="color:rgb(130, 170, 255)">count</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">station_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">as</span><span class="token plain"> 
COUNT</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token keyword" style="font-style:italic">FROM</span><span class="token plain"> turnstile </span><span class="token keyword" style="font-style:italic">GROUP</span><span class="token plain"> </span><span class="token keyword" style="font-style:italic">BY</span><span class="token plain"> station_id</span><span class="token punctuation" style="color:rgb(199, 146, 234)">;</span><br></span></code></pre></div></div>
<p>The KSQL statement is submitted via the KSQL REST API at consumer start-up and is idempotent —
it is skipped if <code>TURNSTILE_SUMMARY</code> already exists.</p>
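<p>The skip-if-exists behaviour can be sketched roughly as follows. This is a minimal illustration, not the project's code: <code>KSQL_URL</code>, <code>build_ksql_request</code>, and <code>submit_once</code> are hypothetical names, and the existence check and HTTP call are passed in as callables so the logic can be exercised without a running ksqlDB server. The <code>/ksql</code> endpoint and the <code>application/vnd.ksql.v1+json</code> content type are those of the ksqlDB REST API.</p>

```python
import json
from typing import Callable

# Hypothetical values for illustration; the real URL lives in the project code.
KSQL_URL = "http://localhost:8088"
SUMMARY_TABLE = "TURNSTILE_SUMMARY"


def build_ksql_request(statement: str) -> dict:
    """Build the keyword arguments for a POST to the ksqlDB /ksql endpoint."""
    return {
        "url": f"{KSQL_URL}/ksql",
        "headers": {"Content-Type": "application/vnd.ksql.v1+json"},
        "data": json.dumps({
            "ksql": statement,
            # Read the source topic from the beginning so counts cover all history.
            "streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"},
        }),
    }


def submit_once(
    statement: str,
    topic_exists: Callable[[str], bool],
    post: Callable[..., object],
) -> bool:
    """Submit the statement unless the summary topic already exists.

    Returns True if a request was sent and False if it was skipped.
    """
    if topic_exists(SUMMARY_TABLE):
        return False
    post(**build_ksql_request(statement))
    return True
```

<p>In a real deployment <code>post</code> would typically be <code>requests.post</code> and <code>topic_exists</code> a lookup against the broker's topic list; both are assumptions here.</p>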
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="alternatives-considered">Alternatives Considered<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql#alternatives-considered" class="hash-link" aria-label="Direct link to Alternatives Considered" title="Direct link to Alternatives Considered">​</a></h2>
<table><thead><tr><th>Alternative</th><th>Reason Rejected</th></tr></thead><tbody><tr><td>Kafka Streams (Java)</td><td>Project is Python-only; JVM dependency is undesirable</td></tr><tr><td>Single Faust app for both transformations</td><td>Aggregation with Faust Tables is more complex than KSQL GROUP BY; KSQL is more expressive for SQL aggregations</td></tr><tr><td>Single KSQL for both transformations</td><td>KSQL cannot run arbitrary Python logic; Faust keeps the transformation in the same language as the rest of the application</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="consequences">Consequences<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql#consequences" class="hash-link" aria-label="Direct link to Consequences" title="Direct link to Consequences">​</a></h2>
<p><strong>Positive</strong></p>
<ul>
<li>Each tool is used for its core strength: Faust for Python-idiomatic record transformation,
KSQL for declarative aggregation.</li>
<li>The Faust app and KSQL statements are independently deployable and restartable.</li>
</ul>
<p><strong>Negative / Risks</strong></p>
<ul>
<li>Two different stream-processing runtimes increase operational surface area (two separate
processes to start, monitor, and upgrade).</li>
<li>The Faust Table uses <code>store="memory://"</code> — state is lost on restart; the table is rebuilt from
Kafka on each startup, which adds startup latency.</li>
<li>KSQL's output (<code>TURNSTILE_SUMMARY</code>) uses JSON while all other topics use Avro, creating
a serialisation inconsistency (see ADR-002).</li>
<li>The <code>consumers/server.py</code> startup guard (<code>topic_check</code>) prevents the dashboard from starting until both Faust
and KSQL have produced their output topics, creating an implicit startup ordering dependency.</li>
</ul>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[ADR-005: Kafka REST Proxy for the Weather Producer]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Date: 2026-03-12]]></summary>
        <content type="html"><![CDATA[<p><strong>Date:</strong> 2026-03-12<br>
<strong>Status:</strong> Accepted<br>
<strong>Deciders:</strong> Engineering Team</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="context">Context<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer#context" class="hash-link" aria-label="Direct link to Context" title="Direct link to Context">​</a></h2>
<p>The weather simulation model (<code>producers/models/weather.py</code>) needs to publish Avro-encoded
records to Kafka.  All other producers in the system use the <code>confluent-kafka</code> Python library's
<code>AvroProducer</code> directly against the broker.</p>
<p>During development, a second integration path was explored: the Confluent Kafka REST Proxy
(port 8082), which accepts HTTP POST requests with embedded schemas and records.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="decision">Decision<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer#decision" class="hash-link" aria-label="Direct link to Decision" title="Direct link to Decision">​</a></h2>
<p>The <code>Weather</code> producer uses the Kafka REST Proxy instead of a native Kafka producer client.</p>
<p>Implementation (<code>producers/models/weather.py:71-86</code>):</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">resp </span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain"> requests</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">post</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">f"</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">constants</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token string-interpolation interpolation">Constants</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token string-interpolation interpolation">rest_proxy_url</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">topics/</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string-interpolation interpolation">constants</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token 
string-interpolation interpolation">Constants</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token string-interpolation interpolation">weather_topic_name</span><span class="token string-interpolation interpolation punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token string-interpolation string" style="color:rgb(195, 232, 141)">"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    headers</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"Content-Type"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"application/vnd.kafka.avro.v2+json"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    data</span><span class="token operator" style="color:rgb(137, 221, 255)">=</span><span class="token plain">json</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">dumps</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"key_schema"</span><span class="token punctuation" style="color:rgb(199, 146, 
234)">:</span><span class="token plain">   json</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">dumps</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">Weather</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">key_schema</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"value_schema"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> json</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">dumps</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">Weather</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">value_schema</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">        </span><span class="token string" style="color:rgb(195, 232, 141)">"records"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"value"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span 
class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(195, 232, 141)">"key"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token string" style="color:rgb(195, 232, 141)">"timestamp"</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"> self</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">time_millis</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre></div></div>
<p>Schema JSON is inlined in every POST payload rather than being pre-registered with the Schema
Registry.  The REST Proxy handles registration transparently.</p>
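<p>One detail that is easy to miss in the request above is that the schemas are JSON strings nested inside a JSON body, so they are encoded twice. The sketch below isolates that payload construction. The function name <code>build_avro_payload</code> and the two cut-down schemas are hypothetical and are not taken from <code>weather.py</code>.</p>

```python
import json

# Hypothetical, cut-down schemas used only for this illustration.
EXAMPLE_KEY_SCHEMA = {
    "type": "record",
    "name": "WeatherKey",
    "fields": [{"name": "timestamp", "type": "long"}],
}
EXAMPLE_VALUE_SCHEMA = {
    "type": "record",
    "name": "WeatherValue",
    "fields": [
        {"name": "temperature", "type": "float"},
        {"name": "status", "type": "string"},
    ],
}


def build_avro_payload(key_schema: dict, value_schema: dict, key: dict, value: dict) -> str:
    """Return the JSON body for a REST Proxy v2 Avro produce request.

    The schemas are serialised to strings first, then embedded in the outer
    JSON document, which is why json.dumps appears at two levels.
    """
    return json.dumps({
        "key_schema": json.dumps(key_schema),
        "value_schema": json.dumps(value_schema),
        "records": [{"key": key, "value": value}],
    })


body = build_avro_payload(
    EXAMPLE_KEY_SCHEMA,
    EXAMPLE_VALUE_SCHEMA,
    key={"timestamp": 1741737600000},
    value={"temperature": 41.5, "status": "cloudy"},
)
```

<p>Decoding <code>body</code> once yields a dictionary whose <code>key_schema</code> and <code>value_schema</code> entries are still strings; they need a second <code>json.loads</code> to recover the schema objects.</p>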
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="alternatives-considered">Alternatives Considered<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer#alternatives-considered" class="hash-link" aria-label="Direct link to Alternatives Considered" title="Direct link to Alternatives Considered">​</a></h2>
<table><thead><tr><th>Alternative</th><th>Reason Rejected</th></tr></thead><tbody><tr><td>Native <code>AvroProducer</code> (used by other producers)</td><td>Both approaches produce identical results; REST Proxy was chosen to demonstrate the capability</td></tr><tr><td><code>requests</code> posting to the broker directly</td><td>Kafka wire protocol is binary and not HTTP-accessible without a proxy</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="consequences">Consequences<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer#consequences" class="hash-link" aria-label="Direct link to Consequences" title="Direct link to Consequences">​</a></h2>
<p><strong>Positive</strong></p>
<ul>
<li>Demonstrates an HTTP-based integration path useful for polyglot producers (languages without a
native Kafka client library).</li>
<li>No Kafka client dependency required in the producing service.</li>
</ul>
<p><strong>Negative / Risks</strong></p>
<ul>
<li>An additional network hop (producer → REST Proxy → broker) adds latency compared to the native
client path.</li>
<li>Inlining the full schema JSON in every request is wasteful; the Schema Registry already holds
the schema after the first publish.</li>
<li>Error handling on HTTP failures is minimal: when <code>raise_for_status()</code> raises, the error is logged but
the weather event is dropped rather than retried.</li>
<li>Inconsistency: weather uses REST Proxy while all other producers use the native client,
increasing cognitive overhead for maintainers.</li>
</ul>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[ADR-006: Tornado Async Web Server for the Real-Time Transit Dashboard]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Date: 2026-03-12]]></summary>
        <content type="html"><![CDATA[<p><strong>Date:</strong> 2026-03-12<br>
<strong>Status:</strong> Accepted<br>
<strong>Deciders:</strong> Engineering Team</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="context">Context<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard#context" class="hash-link" aria-label="Direct link to Context" title="Direct link to Context">​</a></h2>
<p>The consumer layer must simultaneously:</p>
<ol>
<li>Poll four Kafka topics continuously and update in-memory state (weather, line status, arrivals,
turnstile counts).</li>
<li>Serve HTTP GET requests that render the current state as an HTML page.</li>
</ol>
<p>A blocking web server would stall Kafka consumption while handling HTTP requests.  A blocking
Kafka consumer would stall HTTP responses while waiting for new messages.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="decision">Decision<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard#decision" class="hash-link" aria-label="Direct link to Decision" title="Direct link to Decision">​</a></h2>
<p>The Tornado asynchronous web framework is used as the server runtime
(<code>consumers/server.py</code>).  Kafka consumers are scheduled as Tornado IO loop callbacks:</p>
<div class="language-python codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-python codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token keyword" style="font-style:italic">for</span><span class="token plain"> consumer </span><span class="token keyword" style="font-style:italic">in</span><span class="token plain"> consumers</span><span class="token punctuation" style="color:rgb(199, 146, 234)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    tornado</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">ioloop</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">IOLoop</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">current</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">spawn_callback</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token plain">consumer</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">consume</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">tornado</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">ioloop</span><span class="token punctuation" style="color:rgb(199, 
146, 234)">.</span><span class="token plain">IOLoop</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">current</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><span class="token punctuation" style="color:rgb(199, 146, 234)">.</span><span class="token plain">start</span><span class="token punctuation" style="color:rgb(199, 146, 234)">(</span><span class="token punctuation" style="color:rgb(199, 146, 234)">)</span><br></span></code></pre></div></div>
<p>Each <code>KafkaConsumer.consume()</code> is an <code>async</code> coroutine that yields control between polls
via <code>await gen.sleep(self.sleep_secs)</code> (<code>consumers/consumer.py:70-76</code>).  The HTTP handler
renders state synchronously on GET without blocking Kafka consumption.</p>
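<p>The poll-then-yield pattern described above can be sketched as follows. This is an illustration under stated assumptions rather than the code in <code>consumers/consumer.py</code>: the class name <code>PollingConsumer</code> and its <code>poll</code> and <code>handle</code> callables are hypothetical, and <code>asyncio.sleep</code> stands in for Tornado's <code>gen.sleep</code> so the example runs with the standard library alone.</p>

```python
import asyncio
from typing import Callable, Optional


class PollingConsumer:
    """Cooperative polling loop (illustrative only, not the project's class)."""

    def __init__(
        self,
        poll: Callable[[], Optional[str]],
        handle: Callable[[str], None],
        sleep_secs: float = 1.0,
    ) -> None:
        self.poll = poll              # returns one message, or None when idle
        self.handle = handle          # updates in-memory state for one message
        self.sleep_secs = sleep_secs
        self.running = True

    async def consume(self) -> None:
        while self.running:
            # Drain whatever is available right now without waiting.
            while (message := self.poll()) is not None:
                self.handle(message)
            # Yield to the event loop so other coroutines and HTTP handlers run.
            await asyncio.sleep(self.sleep_secs)


async def demo() -> list:
    """Run one consumer briefly against a fake two-message queue."""
    pending = ["weather-1", "weather-2"]
    seen: list = []
    consumer = PollingConsumer(
        poll=lambda: pending.pop(0) if pending else None,
        handle=seen.append,
        sleep_secs=0,
    )
    task = asyncio.ensure_future(consumer.consume())
    await asyncio.sleep(0.01)   # other work gets a turn while the consumer sleeps
    consumer.running = False
    await task
    return seen
```

<p>Because each loop iteration ends in an <code>await</code>, several such coroutines can share one event loop with the HTTP server, which is the property the decision relies on.</p>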
<p>Four consumers are registered on startup:</p>
<table><thead><tr><th>Consumer</th><th>Topic</th><th>Avro?</th></tr></thead><tbody><tr><td>Weather</td><td><code>org.chicago.cta.weather.v1</code></td><td>Yes</td></tr><tr><td>Stations table</td><td><code>^org.chicago.cta.stations.table.*</code></td><td>No (regex, JSON)</td></tr><tr><td>Train arrivals</td><td><code>^org.chicago.cta.station.arrivals.*</code></td><td>Yes (regex)</td></tr><tr><td>Turnstile summary</td><td><code>TURNSTILE_SUMMARY</code></td><td>No (JSON)</td></tr></tbody></table>
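<p>Two of the subscriptions in the table start with <code>^</code>, which the underlying librdkafka client treats as a regular-expression subscription rather than a literal topic name. The sketch below approximates that matching with Python's <code>re</code> module; the helper <code>matching_topics</code> is hypothetical, and the broker-side regex dialect may differ in edge cases.</p>

```python
import re
from typing import List


def matching_topics(subscription: str, topics: List[str]) -> List[str]:
    """Return the topics a subscription string would select.

    A leading "^" marks a regex subscription; anything else is matched literally.
    """
    if not subscription.startswith("^"):
        return [topic for topic in topics if topic == subscription]
    pattern = re.compile(subscription)
    return [topic for topic in topics if pattern.match(topic)]


# Topic names as they appear elsewhere in these ADRs.
KNOWN_TOPICS = [
    "org.chicago.cta.weather.v1",
    "org.chicago.cta.stations.table.v1t001",
    "org.chicago.cta.station.arrivals.t001",
    "TURNSTILE_SUMMARY",
]

stations = matching_topics("^org.chicago.cta.stations.table.*", KNOWN_TOPICS)
arrivals = matching_topics("^org.chicago.cta.station.arrivals.*", KNOWN_TOPICS)
summary = matching_topics("TURNSTILE_SUMMARY", KNOWN_TOPICS)
```

<p>Note that the unescaped dots in these patterns match any character, so the patterns are slightly broader than the literal prefixes they resemble.</p>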
<p>All consumers share a single <code>group.id</code> (<code>com.chicago.transport.consumer.group.1</code>).</p>
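<p>For reference, a shared group looks roughly like this in client configuration. The sketch is an assumption-laden illustration: <code>consumer_config</code> and its <code>instance_id</code> parameter are hypothetical, while <code>bootstrap.servers</code>, <code>group.id</code>, and <code>auto.offset.reset</code> are standard librdkafka configuration keys. Passing an instance identifier shows one way independent dashboards could each receive every message, which the current design does not do.</p>

```python
from typing import Optional

BASE_GROUP_ID = "com.chicago.transport.consumer.group.1"


def consumer_config(broker_url: str, instance_id: Optional[str] = None) -> dict:
    """Build a consumer configuration dictionary.

    With no instance_id every consumer joins the same group, as described above.
    With an instance_id (hypothetical option) each dashboard gets its own group
    and therefore its own full copy of the topic data.
    """
    group_id = BASE_GROUP_ID if instance_id is None else f"{BASE_GROUP_ID}.{instance_id}"
    return {
        "bootstrap.servers": broker_url,
        "group.id": group_id,
        # Start from the oldest retained message when no offset is committed.
        "auto.offset.reset": "earliest",
    }


shared = consumer_config("PLAINTEXT://localhost:9092")
dashboard_a = consumer_config("PLAINTEXT://localhost:9092", instance_id="a")
dashboard_b = consumer_config("PLAINTEXT://localhost:9092", instance_id="b")
```

<p>The broker address shown is a placeholder, not a value taken from the project.</p>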
<p>The server listens on port 8888 and serves a single route (<code>/</code>) rendered from
<code>consumers/templates/status.html</code>.</p>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="alternatives-considered">Alternatives Considered<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard#alternatives-considered" class="hash-link" aria-label="Direct link to Alternatives Considered" title="Direct link to Alternatives Considered">​</a></h2>
<table><thead><tr><th>Alternative</th><th>Reason Rejected</th></tr></thead><tbody><tr><td>Flask / Django (synchronous)</td><td>Cannot multiplex Kafka polling with HTTP serving without threads</td></tr><tr><td>asyncio + aiohttp</td><td>Viable alternative; Tornado chosen for built-in IOLoop integration matching <code>confluent_kafka</code> callback style</td></tr><tr><td>Separate Kafka consumer process + shared state store (Redis)</td><td>Over-engineered for a dashboard with a single user</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="consequences">Consequences<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard#consequences" class="hash-link" aria-label="Direct link to Consequences" title="Direct link to Consequences">​</a></h2>
<p><strong>Positive</strong></p>
<ul>
<li>Single process handles both Kafka consumption and HTTP serving without threading.</li>
<li>Tornado's <code>spawn_callback</code> allows an arbitrary number of consumers to coexist on one event loop.</li>
<li>Simple HTML template rendering — no JavaScript framework needed for the status page.</li>
</ul>
<p><strong>Negative / Risks</strong></p>
<ul>
<li>State is stored in Python objects (<code>Weather</code>, <code>Lines</code>) in process memory; any restart loses
accumulated state until topics are re-consumed from the earliest offset.</li>
<li>All four consumers share one <code>group.id</code>, so a second dashboard instance would join the same
consumer group and take over some of the partitions, leaving each instance with only part of the data.</li>
<li>The dashboard refuses to start if the KSQL or Faust output topics are not yet available
(hard <code>exit(1)</code> at <code>consumers/server.py:49-57</code>), so the stream processors must be running before the dashboard is launched.</li>
<li>No authentication or HTTPS on port 8888.</li>
</ul>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[Architecture Decision Records]]></title>
        <id>https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/README</id>
        <link href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/README"/>
        <updated>2026-05-19T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[This directory contains Architecture Decision Records (ADRs) for the]]></summary>
        <content type="html"><![CDATA[<p>This directory contains Architecture Decision Records (ADRs) for the
<strong>CTA Public Transport Optimisation</strong> system.  The ADRs were generated by
reverse-engineering the codebase as of 2026-03-12.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="index">Index<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/README#index" class="hash-link" aria-label="Direct link to Index" title="Direct link to Index">​</a></h2>
<table><thead><tr><th>ADR</th><th>Title</th><th>Status</th></tr></thead><tbody><tr><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-001-kafka-as-central-event-bus">ADR-001</a></td><td>Apache Kafka as the Central Event Bus</td><td>Accepted</td></tr><tr><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-002-avro-schema-registry">ADR-002</a></td><td>Avro Schemas + Confluent Schema Registry</td><td>Accepted</td></tr><tr><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-003-kafka-connect-jdbc-postgres">ADR-003</a></td><td>Kafka Connect JDBC Source for PostgreSQL</td><td>Accepted</td></tr><tr><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-004-dual-stream-processing-faust-ksql">ADR-004</a></td><td>Dual Stream Processing — Faust + KSQL</td><td>Accepted</td></tr><tr><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-005-rest-proxy-for-weather-producer">ADR-005</a></td><td>Kafka REST Proxy for Weather Producer</td><td>Accepted</td></tr><tr><td><a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/ADR-006-tornado-async-dashboard">ADR-006</a></td><td>Tornado Async Web Server for Dashboard</td><td>Accepted</td></tr></tbody></table>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="system-architecture-overview">System Architecture Overview<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/README#system-architecture-overview" class="hash-link" aria-label="Direct link to System Architecture Overview" title="Direct link to System Architecture Overview">​</a></h2>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">┌─────────────────────────────────────────────────────────────────────┐</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                        PRODUCER LAYER                               │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                                                                     │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  simulation.py                                                      │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    ├─ Line (Blue/Red/Green)                                         │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    │    ├─ Station ──────────────────► org.chicago.cta.station.     │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    │    │   (AvroProducer)              arrivals.t001               │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    │    └─ Turnstile ─────────────────► com.cta.stations.           
│</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    │        (AvroProducer)               turnstile.entry            │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    └─ Weather ──────────────────────►  org.chicago.cta.weather.v1  │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│         (REST Proxy HTTP POST)                                      │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                                                                     │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  connector.py ──[JDBC Source]──► com.cta.stations.data.rawt001.    │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  (Kafka Connect)                  stations  (from PostgreSQL)       │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">└────────────────────────────┬────────────────────────────────────────┘</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">                             │  Apache Kafka  (+ Schema Registry)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">┌────────────────────────────▼────────────────────────────────────────┐</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                    STREAM PROCESSING LAYER                          │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                                                                     │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  faust_stream.py                                                    │</span><br></span><span class="token-line" 
style="color:#bfc7d5"><span class="token plain">│    com.cta.stations.data.rawt001.stations                           │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│      ──[transform]──► org.chicago.cta.stations.table.v1t001        │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                                                                     │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  ksql.py                                                            │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    com.cta.stations.turnstile.entry                                 │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│      ──[GROUP BY station_id]──► TURNSTILE_SUMMARY (JSON)            │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">└────────────────────────────┬────────────────────────────────────────┘</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">                             │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">┌────────────────────────────▼────────────────────────────────────────┐</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                      CONSUMER / DASHBOARD LAYER                     │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│                                                                     │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│  server.py  (Tornado, port 8888)                                    │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    KafkaConsumer × 4  ──► in-memory Weather + 
Lines state           │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│    GET /  ──► status.html                                           │</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">└─────────────────────────────────────────────────────────────────────┘</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="key-technology-choices">Key Technology Choices<a href="https://jigsawflux.org/blog/2026/05/19/ai-tools-for-architects/docs/adr/README#key-technology-choices" class="hash-link" aria-label="Direct link to Key Technology Choices" title="Direct link to Key Technology Choices">​</a></h2>
<table><thead><tr><th>Concern</th><th>Technology</th><th>ADR</th></tr></thead><tbody><tr><td>Event streaming</td><td>Apache Kafka (Confluent 5.2.2)</td><td>ADR-001</td></tr><tr><td>Schema enforcement</td><td>Apache Avro + Confluent Schema Registry</td><td>ADR-002</td></tr><tr><td>DB-to-Kafka ingestion</td><td>Kafka Connect JDBC Source</td><td>ADR-003</td></tr><tr><td>Station transformation</td><td>Faust (Python stream processor)</td><td>ADR-004</td></tr><tr><td>Turnstile aggregation</td><td>KSQL</td><td>ADR-004</td></tr><tr><td>HTTP-based produce path</td><td>Kafka REST Proxy</td><td>ADR-005</td></tr><tr><td>Real-time web dashboard</td><td>Tornado async web server</td><td>ADR-006</td></tr><tr><td>Infrastructure</td><td>Docker Compose (single-broker dev cluster)</td><td>ADR-001</td></tr></tbody></table>]]></content>
    </entry>
    <entry>
        <title type="html"><![CDATA[Welcome to the JigsawFlux Blog]]></title>
        <id>https://jigsawflux.org/blog/welcome</id>
        <link href="https://jigsawflux.org/blog/welcome"/>
        <updated>2026-05-15T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Welcome to the JigsawFlux Blog — a space for updates, ideas, and stories from the JigsawFlux open-source community.]]></summary>
        <content type="html"><![CDATA[<p>Welcome to the JigsawFlux Blog — a space for updates, ideas, and stories from the JigsawFlux open-source community.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="what-is-jigsawflux">What is JigsawFlux?<a href="https://jigsawflux.org/blog/welcome#what-is-jigsawflux" class="hash-link" aria-label="Direct link to What is JigsawFlux?" title="Direct link to What is JigsawFlux?">​</a></h2>
<p>JigsawFlux is an open-source organisation focused on building tools that make a real difference in health tech, crisis management, and humanitarian response. Every project we build is open, non-profit in spirit, and built to solve real problems that affect real people.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="what-this-blog-is-for">What this blog is for<a href="https://jigsawflux.org/blog/welcome#what-this-blog-is-for" class="hash-link" aria-label="Direct link to What this blog is for" title="Direct link to What this blog is for">​</a></h2>
<p>This blog will be updated weekly or monthly with:</p>
<ul>
<li><strong>Project updates</strong> — what we're building, what's shipping, what's next</li>
<li><strong>Technical deep-dives</strong> — architecture decisions, lessons learned, interesting engineering problems</li>
<li><strong>Community stories</strong> — contributors, use cases, and the humans behind the code</li>
<li><strong>Ideas and proposals</strong> — things we're thinking about, inviting input before we build</li>
</ul>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="get-involved">Get involved<a href="https://jigsawflux.org/blog/welcome#get-involved" class="hash-link" aria-label="Direct link to Get involved" title="Direct link to Get involved">​</a></h2>
<p>If you want to contribute — code, documentation, design, testing, or ideas — the best place to start is <a href="https://jigsawflux.org/contribute" target="_blank" rel="noopener noreferrer">jigsawflux.org/contribute</a>. Every bit helps.</p>
<p>You can also follow along via the <a href="https://jigsawflux.org/blog/rss.xml">RSS feed</a> or the <a href="https://github.com/JigsawFlux" target="_blank" rel="noopener noreferrer">GitHub organisation</a>.</p>
<p>Here's to building tools that matter.</p>]]></content>
        <author>
            <name>Suresh Thomas</name>
            <uri>https://github.com/st185229</uri>
        </author>
        <category label="open-source" term="open-source"/>
        <category label="health-tech" term="health-tech"/>
        <category label="humanitarian" term="humanitarian"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Running a Local LLM with Ollama and MCP — An Architecture Spike]]></title>
        <id>https://jigsawflux.org/blog/local-llm-ollama-mcp-spike</id>
        <link href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike"/>
        <updated>2026-05-15T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[AI inference doesn't have to mean a cloud API call. This post walks through a spike I built to run a locally hosted language model through a clean, layered architecture using Ollama and the Model Context Protocol (MCP).]]></summary>
        <content type="html"><![CDATA[<p>AI inference doesn't have to mean a cloud API call. This post walks through a spike I built to run a locally hosted language model through a clean, layered architecture using Ollama and the Model Context Protocol (MCP).</p>
<p><em>The full source is on <a href="https://github.com/JigsawFlux/ollama-mcp-starter" target="_blank" rel="noopener noreferrer">GitHub</a>. Contributions and feedback welcome.</em></p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="why-local-llms">Why Local LLMs?<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#why-local-llms" class="hash-link" aria-label="Direct link to Why Local LLMs?" title="Direct link to Why Local LLMs?">​</a></h2>
<p>Most AI integrations today rely on external APIs — you send a prompt to a cloud provider, they run the model, you get a response. That works fine for many use cases, but it comes with real tradeoffs:</p>
<ul>
<li><strong>Privacy</strong>: your prompts leave your network</li>
<li><strong>Connectivity</strong>: requires reliable internet</li>
<li><strong>Cost</strong>: token-based pricing adds up quickly at scale</li>
<li><strong>Latency</strong>: round-trip to a remote data centre</li>
</ul>
<p>For health tech, crisis response, and humanitarian tools (the areas where JigsawFlux focuses), these tradeoffs can matter a great deal. Patient data is sensitive. Field teams in disaster zones may have limited connectivity. Running inference locally keeps prompts on your own network, removes the dependency on an internet connection, and replaces per-token fees with the fixed cost of hardware you own.</p>
<p>Local LLMs have become genuinely useful. Models like <code>llama3.2:3b</code> run comfortably on consumer hardware and handle a wide range of practical tasks: summarisation, triage, Q&amp;A, and structured extraction.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="what-is-ollama">What is Ollama?<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#what-is-ollama" class="hash-link" aria-label="Direct link to What is Ollama?" title="Direct link to What is Ollama?">​</a></h2>
<p><a href="https://ollama.com/" target="_blank" rel="noopener noreferrer">Ollama</a> is an open-source tool that makes running LLMs locally straightforward. Think of it as a package manager for AI models — pull a model, run it, and interact with it over a local HTTP API.</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">ollama pull llama3.2:3b</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">ollama run llama3.2:3b</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Once running, Ollama exposes a REST API on <code>localhost:11434</code> (or a networked host). It handles model loading, memory management, and inference. You interact with it much as you would a cloud API, just without the round-trip to a remote data centre.</p>
<p>For this spike, Ollama runs on a separate machine on my local network at <code>192.168.1.80</code> — a common setup where a more capable machine hosts the model and lighter clients query it.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="setting-up-ollama-on-linux">Setting Up Ollama on Linux<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#setting-up-ollama-on-linux" class="hash-link" aria-label="Direct link to Setting Up Ollama on Linux" title="Direct link to Setting Up Ollama on Linux">​</a></h2>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="installation">Installation<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#installation" class="hash-link" aria-label="Direct link to Installation" title="Direct link to Installation">​</a></h3>
<p>The quickest way is the official install script — one line and it handles everything:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">curl -fsSL https://ollama.com/install.sh | sh</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>If you prefer to install manually (air-gapped environments, or you want control over the binary location):</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Download the Linux binary</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">curl -L https://ollama.com/download/ollama-linux-amd64 -o ollama</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Make it executable</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">chmod +x ollama</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Move to a directory in your PATH</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">sudo mv ollama /usr/local/bin/</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="starting-the-server">Starting the Server<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#starting-the-server" class="hash-link" aria-label="Direct link to Starting the Server" title="Direct link to Starting the Server">​</a></h3>
<p>By default, <code>ollama serve</code> binds to <code>127.0.0.1</code> — only accessible from the same machine. For this spike, Ollama runs on a dedicated machine on the local network and the application connects to it from a different host. To allow that, bind to all interfaces:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">OLLAMA_HOST=0.0.0.0 ollama serve</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>To make this permanent (e.g. as a systemd service override):</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">sudo systemctl edit ollama.service</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Add the following and save:</p>
<div class="language-ini codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-ini codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">[Service]</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">Environment="OLLAMA_HOST=0.0.0.0"</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Then reload and restart:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">sudo systemctl daemon-reload</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">sudo systemctl restart ollama</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
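<p>The server now listens on port <code>11434</code> on all network interfaces. Other machines on the same network can reach it at <code>http://&lt;host-ip&gt;:11434</code>. Ollama's API has no built-in authentication, so only bind to all interfaces on a network you trust, or restrict access with a firewall rule.</p>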
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="pulling-a-model">Pulling a Model<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#pulling-a-model" class="hash-link" aria-label="Direct link to Pulling a Model" title="Direct link to Pulling a Model">​</a></h3>
<p>Before the server can handle requests, pull the model you want to use:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">ollama pull llama3.2:3b</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
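<p>To confirm the pull worked from another machine, one option is to query Ollama's <code>/api/tags</code> endpoint, which lists the models available locally. The TypeScript sketch below is illustrative only: the helper names <code>hasModel</code> and <code>isModelAvailable</code> are assumptions and are not taken from the spike's source.</p>

```typescript
// Subset of Ollama's GET /api/tags response that this check needs.
interface TagsResponse {
  models: { name: string }[];
}

// Pure check: is the exact model tag present in the list?
function hasModel(tags: TagsResponse, model: string): boolean {
  return tags.models.some((m) => m.name === model);
}

// Hypothetical helper: fetch the tag list from a running Ollama server.
async function isModelAvailable(baseUrl: string, model: string): Promise<boolean> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  return hasModel((await res.json()) as TagsResponse, model);
}
```

<p>The comparison is an exact match on the tag, so <code>llama3.2</code> and <code>llama3.2:3b</code> are treated as different names.</p>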
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="testing-the-api">Testing the API<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#testing-the-api" class="hash-link" aria-label="Direct link to Testing the API" title="Direct link to Testing the API">​</a></h3>
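<p>Ollama exposes a simple HTTP API. Once the server is running, you can test it directly with <code>curl</code>; no SDK is needed. The commands in this section assume <code>OLLAMA_HOST</code> holds the server's base URL, for example <code>http://192.168.1.80:11434</code>.</p>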
<p><strong>Generate completion</strong> (single prompt, no conversation context):</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">curl $OLLAMA_HOST/api/generate -d '{</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  "model": "llama3.2:3b",</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  "prompt": "Explain kubernetes in one paragraph",</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  "stream": false</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">}'</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p><strong>Chat completion</strong> (conversation format with message roles):</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">curl $OLLAMA_HOST/api/chat -d '{</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  "model": "llama3.2:3b",</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  "messages": [</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">    {"role": "user", "content": "What is Docker?"}</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  ],</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">  "stream": false</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">}'</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>The <code>/api/generate</code> endpoint takes a single prompt: one prompt in, one response out. The <code>/api/chat</code> endpoint accepts a <code>messages</code> array, so the client can send the conversation history with each request and get contextually aware replies. The server does not remember earlier turns on its own. This spike uses <code>/api/chat</code> for both the chat and summarise tools.</p>
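<p>As a rough illustration of how a client might carry history into <code>/api/chat</code>, here is a minimal TypeScript sketch. The function names <code>buildChatRequest</code> and <code>sendChat</code> are assumptions made for this example; the request and response fields follow Ollama's documented chat API with <code>"stream": false</code>.</p>

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: false;
}

// Append the new user turn to the prior turns without mutating them.
// The full conversation is sent on every call.
function buildChatRequest(model: string, priorTurns: ChatMessage[], prompt: string): ChatRequest {
  return { model, messages: [...priorTurns, { role: "user", content: prompt }], stream: false };
}

// Send the request and return the assistant's reply text.
async function sendChat(baseUrl: string, request: ChatRequest): Promise<string> {
  const res = await fetch(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  const data = (await res.json()) as { message: ChatMessage };
  return data.message.content;
}
```

<p>To continue a conversation, the caller would append the returned reply to its own history as an <code>assistant</code> message before building the next request.</p>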
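<p>On the client side, set <code>OLLAMA_HOST</code> in your environment or <code>.env</code> file to the full URL of the machine running Ollama. Note that this differs from the server-side setting shown earlier, where the same variable holds the bind address:</p>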
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">OLLAMA_HOST=http://192.168.1.80:11434</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
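<p>Because the value may be missing, lack a scheme, or carry a trailing slash, it can help to normalise it once before use. The following TypeScript sketch is one hedged way to do that; <code>resolveOllamaBaseUrl</code> and <code>DEFAULT_OLLAMA_URL</code> are hypothetical names, not identifiers from the spike.</p>

```typescript
// Default matches Ollama's standard local address.
const DEFAULT_OLLAMA_URL = "http://localhost:11434";

// Normalise an OLLAMA_HOST-style value into a base URL without a trailing slash.
function resolveOllamaBaseUrl(env: Record<string, string | undefined>): string {
  const raw = (env.OLLAMA_HOST ?? "").trim();
  if (raw === "") return DEFAULT_OLLAMA_URL;
  // Assume plain HTTP when no scheme is given (an assumption for this sketch).
  const withScheme = /^https?:\/\//.test(raw) ? raw : `http://${raw}`;
  return withScheme.replace(/\/+$/, "");
}
```

<p>In a Node.js process this would typically be called as <code>resolveOllamaBaseUrl(process.env)</code>.</p>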
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="what-is-mcp">What is MCP?<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#what-is-mcp" class="hash-link" aria-label="Direct link to What is MCP?" title="Direct link to What is MCP?">​</a></h2>
<p>The <a href="https://modelcontextprotocol.io/" target="_blank" rel="noopener noreferrer">Model Context Protocol</a> (MCP) is an open standard from Anthropic for structuring communication between AI hosts and the tools or services they call. It defines a consistent way to:</p>
<ul>
<li><strong>Expose tools</strong> — name them, describe their inputs/outputs</li>
<li><strong>Call tools</strong> — invoke them with structured arguments</li>
<li><strong>Handle responses</strong> — receive results in a predictable format</li>
</ul>
<p>The key idea is separation of concerns. Your application logic doesn't need to know the details of how a model is invoked — it just calls a tool. The MCP server handles the translation to whatever inference backend is running.</p>
<p>This makes the architecture composable. Swap Ollama for another model runner, add a new tool, or connect a different MCP client — each layer stays isolated.</p>
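<p>Under the hood, MCP messages are JSON-RPC 2.0: tools are listed with the <code>tools/list</code> method and invoked with <code>tools/call</code>. The TypeScript sketch below is a deliberately simplified in-memory dispatcher that shows the shape of that exchange. It is not the official MCP SDK, the <code>ToolDefinition</code> and <code>handleRpc</code> names are assumptions, and handlers are synchronous only to keep the example short.</p>

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: object; // JSON Schema describing the arguments
  handler: (args: Record<string, unknown>) => string;
}

interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: { name?: string; arguments?: Record<string, unknown> };
}

// Handle the two tool-related methods; anything else is "method not found".
function handleRpc(tools: ToolDefinition[], req: RpcRequest): object {
  if (req.method === "tools/list") {
    // Expose only the public description of each tool, never the handler.
    const listed = tools.map(({ name, description, inputSchema }) => ({ name, description, inputSchema }));
    return { jsonrpc: "2.0", id: req.id, result: { tools: listed } };
  }
  if (req.method === "tools/call") {
    const tool = tools.find((t) => t.name === req.params?.name);
    if (!tool) {
      return { jsonrpc: "2.0", id: req.id, error: { code: -32602, message: "Unknown tool" } };
    }
    const text = tool.handler(req.params?.arguments ?? {});
    return { jsonrpc: "2.0", id: req.id, result: { content: [{ type: "text", text }] } };
  }
  return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: "Method not found" } };
}
```

<p>The caller only ever sees tool names, schemas, and text results, which is what lets the inference backend behind a tool change without the caller noticing.</p>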
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="architecture-of-the-spike">Architecture of the Spike<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#architecture-of-the-spike" class="hash-link" aria-label="Direct link to Architecture of the Spike" title="Direct link to Architecture of the Spike">​</a></h2>
<p>The system has four main layers: the browser frontend, the Express backend, the MCP server, and Ollama.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="layer-1-browser-frontend">Layer 1: Browser (Frontend)<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#layer-1-browser-frontend" class="hash-link" aria-label="Direct link to Layer 1: Browser (Frontend)" title="Direct link to Layer 1: Browser (Frontend)">​</a></h3>
<p>A vanilla HTML/CSS/JS frontend with two tabs — <strong>Chat</strong> and <strong>Summarise</strong>. No framework, no build step. The UI handles user input, loading states, and error display. It talks only to the Express backend and never directly to Ollama.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="layer-2-express-backend-api-layer">Layer 2: Express Backend (API Layer)<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#layer-2-express-backend-api-layer" class="hash-link" aria-label="Direct link to Layer 2: Express Backend (API Layer)" title="Direct link to Layer 2: Express Backend (API Layer)">​</a></h3>
<p>A Node.js/Express server that exposes three endpoints:</p>
<table><thead><tr><th>Endpoint</th><th>Description</th></tr></thead><tbody><tr><td><code>GET /health</code></td><td>Returns status of backend, MCP, and Ollama</td></tr><tr><td><code>POST /chat</code></td><td>Accepts a prompt, returns a model response</td></tr><tr><td><code>POST /summarise</code></td><td>Accepts text and optional style, returns a summary</td></tr></tbody></table>
<p>The backend holds a <strong>persistent MCP client session</strong> — one long-lived connection to the MCP server rather than spinning up a new process per request. It also assigns a correlation ID to each request for tracing.</p>
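<p>The two ideas in that paragraph, one shared session and one ID per request, can be sketched in TypeScript as follows. This is an assumption-laden illustration rather than the spike's code: <code>ToolSession</code> is a hypothetical stand-in for an MCP client session, and this version connects lazily on first use instead of at startup.</p>

```typescript
import { randomUUID } from "node:crypto";

// Minimal stand-in for an MCP client session (hypothetical interface).
interface ToolSession {
  callTool(name: string, args: Record<string, unknown>): Promise<string>;
}

// Returns a getter that connects on first use and then reuses the same session.
function createSessionProvider(connect: () => Promise<ToolSession>): () => Promise<ToolSession> {
  let pending: Promise<ToolSession> | undefined;
  return () => {
    if (!pending) {
      pending = connect().catch((err: unknown) => {
        pending = undefined; // a failed connect should not be cached
        throw err;
      });
    }
    return pending;
  };
}

// One ID per incoming request, attached to every log line for that request.
function newCorrelationId(): string {
  return randomUUID();
}

function logWithId(correlationId: string, message: string): string {
  return `[${correlationId}] ${message}`;
}
```

<p>Caching the promise rather than the resolved session means concurrent requests that arrive during startup all wait on the same connection attempt.</p>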
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="layer-3-mcp-server-tool-layer">Layer 3: MCP Server (Tool Layer)<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#layer-3-mcp-server-tool-layer" class="hash-link" aria-label="Direct link to Layer 3: MCP Server (Tool Layer)" title="Direct link to Layer 3: MCP Server (Tool Layer)">​</a></h3>
<p>A stdio-based MCP server written in TypeScript. It registers three tools:</p>
<ul>
<li><code>health_check</code> — pings Ollama and returns status</li>
<li><code>chat</code> — sends a prompt to the model</li>
<li><code>summarise</code> — sends a text block for summarisation</li>
</ul>
<p>The MCP server contains an <strong>Ollama adapter</strong> — a single module that centralises all Ollama API communication. Connection errors, timeouts, and model-not-found responses are normalised here into consistent, user-actionable messages.</p>
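<p>A minimal sketch of that kind of error normalisation is shown below in TypeScript. The <code>OllamaFailure</code> classification, the <code>toUserMessage</code> name, and the message wording are all assumptions for illustration; the actual adapter may classify and phrase failures differently.</p>

```typescript
// The failure modes the adapter distinguishes (an assumed classification).
type OllamaFailure =
  | { kind: "unreachable"; host: string }
  | { kind: "timeout"; afterMs: number }
  | { kind: "http"; status: number; model: string };

// Map each failure to a message that tells the user what to do next.
function toUserMessage(failure: OllamaFailure): string {
  switch (failure.kind) {
    case "unreachable":
      return `Cannot reach Ollama at ${failure.host}. Check that the server is running and OLLAMA_HOST is correct.`;
    case "timeout":
      return `Ollama did not respond within ${failure.afterMs} ms. Try a shorter prompt or a smaller model.`;
    case "http":
      // This sketch treats HTTP 404 as "model not pulled" (an assumption).
      return failure.status === 404
        ? `Model "${failure.model}" was not found. Run: ollama pull ${failure.model}`
        : `Ollama returned HTTP ${failure.status}.`;
  }
}
```

<p>Keeping the mapping in one function means every tool reports the same failure in the same words.</p>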
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="layer-4-ollama-inference-layer">Layer 4: Ollama (Inference Layer)<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#layer-4-ollama-inference-layer" class="hash-link" aria-label="Direct link to Layer 4: Ollama (Inference Layer)" title="Direct link to Layer 4: Ollama (Inference Layer)">​</a></h3>
<p>The model runner. This spike uses <code>llama3.2:3b</code>, a 3-billion-parameter model that runs well on modest hardware. Ollama receives requests at <code>/api/chat</code> and returns model completions.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="request-flow">Request Flow<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#request-flow" class="hash-link" aria-label="Direct link to Request Flow" title="Direct link to Request Flow">​</a></h2>
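<p>Here is what happens when you submit a prompt in the chat UI: the browser sends it to the backend's <code>POST /chat</code> endpoint, the backend assigns a correlation ID and calls the <code>chat</code> tool over its persistent MCP session, and the MCP server's Ollama adapter forwards the prompt to Ollama at <code>/api/chat</code>. The completion then travels back along the same path to the browser.</p>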
<p>The response includes the model name and duration — useful for understanding latency and knowing which model answered.</p>
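<p>One way to derive those two fields is from Ollama's own reply, which reports <code>model</code> and a <code>total_duration</code> in nanoseconds. The sketch below assumes that mapping; <code>ChatReply</code> and <code>toChatReply</code> are hypothetical names, and the spike's real response shape may differ.</p>

```typescript
// Subset of the fields Ollama returns from a non-streaming /api/chat call.
interface OllamaChatResponse {
  model: string;
  message: { role: string; content: string };
  total_duration?: number; // nanoseconds
}

// Assumed shape of what the backend hands to the UI.
interface ChatReply {
  text: string;
  model: string;
  durationMs: number | null;
}

// Keeps the answer text, the model that produced it, and the duration in milliseconds.
function toChatReply(res: OllamaChatResponse): ChatReply {
  return {
    text: res.message.content,
    model: res.model,
    durationMs: res.total_duration === undefined ? null : Math.round(res.total_duration / 1e6),
  };
}
```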
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="key-design-decisions">Key Design Decisions<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#key-design-decisions" class="hash-link" aria-label="Direct link to Key Design Decisions" title="Direct link to Key Design Decisions">​</a></h2>
<p><strong>Isolated Ollama adapter.</strong> All HTTP calls to Ollama live in one file (<code>ollama-adapter.ts</code>). Changing the inference backend, adding retry logic, or switching models only requires touching this one module.</p>
<p><strong>Stdio MCP transport.</strong> The MCP server runs as a child process communicating over stdio rather than a network socket. This keeps deployment simple, with no extra port to manage, and makes it easy to integrate with other MCP-compatible clients such as GitHub Copilot in VS Code.</p>
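<p>To make the stdio transport concrete: MCP messages are JSON-RPC objects written one per line. The sketch below shows that framing by hand. In practice an MCP SDK handles it, so <code>encodeMessage</code> and <code>LineDecoder</code> are illustrative names only, and the <code>prompt</code> argument of the <code>chat</code> tool is an assumption.</p>

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// One message per line: JSON.stringify never emits raw newlines, so "\n" is a safe delimiter.
function encodeMessage(msg: JsonRpcRequest): string {
  return JSON.stringify(msg) + "\n";
}

// Accumulates output chunks and returns complete messages; a partial line waits for more data.
class LineDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the unfinished tail for the next chunk
    return lines.filter((l) => l.trim().length > 0).map((l) => JSON.parse(l));
  }
}

// A tool invocation as it would appear on the wire (argument name "prompt" is assumed).
const call: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "chat", arguments: { prompt: "Hello" } },
};
```

<p>Because the channel is just the child process's stdin and stdout, there is no listener to secure and no port to configure.</p>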
<p><strong>Single persistent MCP session.</strong> The backend creates one MCP client session at startup and reuses it across requests. This avoids the overhead of spawning a new process per request and keeps session state coherent.</p>
<p><strong>Vanilla frontend.</strong> No React, no Vite, no build pipeline. The frontend is plain HTML/CSS/JS served directly. This keeps the system portable — it can run anywhere a static file server exists.</p>
<p><strong>Security baseline.</strong> The browser never talks to Ollama directly. All inference is proxied through the backend. Request body size limits are applied. Sensitive text is redacted from error logs.</p>
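<p>Log redaction of the kind mentioned above might look like the following. This is a sketch under assumptions: the list of sensitive field names and the placeholder format are made up for illustration and are not the spike's actual rules.</p>

```typescript
// Field names treated as user-supplied text (assumed for this sketch).
const SENSITIVE_KEYS = ["prompt", "text", "content"];

// Returns a copy that is safe to log: sensitive strings are replaced by a length marker.
function redactForLog(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactForLog(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] =
        SENSITIVE_KEYS.includes(key) && typeof inner === "string"
          ? `[redacted ${inner.length} chars]`
          : redactForLog(inner);
    }
    return out;
  }
  return value;
}
```

<p>Logging the length instead of the text keeps enough signal to debug oversized or empty inputs without writing user content to disk.</p>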
<h3 class="anchor anchorWithStickyNavbar_LWe7" id="deployment-view">Deployment View<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#deployment-view" class="hash-link" aria-label="Direct link to Deployment View" title="Direct link to Deployment View">​</a></h3>
<!-- -->
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="running-it-locally">Running It Locally<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#running-it-locally" class="hash-link" aria-label="Direct link to Running It Locally" title="Direct link to Running It Locally">​</a></h2>
<p>The project structure looks like this:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">ollama-claude/</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">├── mcp-server/           # MCP server + Ollama adapter (TypeScript)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">├── backend/</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│   ├── ...               # Express API + MCP client session (TypeScript)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">│   └── k8s-deployment/   # Kubernetes manifests for MicroK8s</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">├── frontend/             # Static UI (HTML/CSS/JS)</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">├── start.sh              # Unix startup script</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">└── start.bat             # Windows startup script</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" 
d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>To get started:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Copy environment files and configure your Ollama host</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">cp .env.example .env</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Edit OLLAMA_HOST to point at your Ollama instance</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Start all three services</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./start.sh          # macOS/Linux</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">start.bat           # Windows</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Then open <code>http://localhost:3000</code> in your browser.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="deploying-on-kubernetes-microk8s">Deploying on Kubernetes (MicroK8s)<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#deploying-on-kubernetes-microk8s" class="hash-link" aria-label="Direct link to Deploying on Kubernetes (MicroK8s)" title="Direct link to Deploying on Kubernetes (MicroK8s)">​</a></h2>
<p>The <code>backend/k8s-deployment/</code> folder contains production-ready manifests for running the Ollama stack on MicroK8s with MetalLB and Nginx ingress. The key files are:</p>
<table><thead><tr><th>Manifest</th><th>Purpose</th></tr></thead><tbody><tr><td><code>ollama-stack.yaml</code></td><td><code>ollama</code> namespace, 50 Gi PVC, Ollama StatefulSet, ClusterIP service</td></tr><tr><td><code>ollama-automated.yaml</code></td><td>Variant that pre-pulls <code>llama3.1:8b</code> via an <code>initContainer</code> on first boot</td></tr><tr><td><code>open-webui-stack.yaml</code></td><td>Open WebUI deployment with 10 Gi PVC, pointed at the internal Ollama service</td></tr><tr><td><code>ollama-ingress.yaml</code></td><td>Exposes Ollama at <code>ollama.local</code></td></tr><tr><td><code>open-webui-ingress.yaml</code></td><td>Exposes Open WebUI at <code>ai.local</code></td></tr><tr><td><code>dashboard-ingress.yaml</code></td><td>Exposes the Kubernetes dashboard at <code>dashboard.local</code> (SSL passthrough)</td></tr><tr><td><code>ingress-lb.yaml</code></td><td>LoadBalancer service for the Nginx ingress controller (ports 80/443)</td></tr></tbody></table>
<p>Apply them in order:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#bfc7d5;background-color:#292d3e"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/ollama-stack.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/open-webui-stack.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/ollama-ingress.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/open-webui-ingress.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/dashboard-ingress.yaml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">kubectl apply -f backend/k8s-deployment/ingress-lb.yaml</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Prerequisites: MicroK8s with <code>dns</code>, <code>storage</code>, <code>ingress</code>, and <code>metallb</code> add-ons enabled. Add <code>ollama.local</code>, <code>ai.local</code>, and <code>dashboard.local</code> to your <code>/etc/hosts</code> pointing at the MetalLB-assigned IP.</p>
<p>GPU support is optional — uncomment the <code>nvidia.com/gpu: 1</code> resource limit in <code>ollama-stack.yaml</code> after running <code>microk8s enable gpu</code>.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="demo">Demo<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#demo" class="hash-link" aria-label="Direct link to Demo" title="Direct link to Demo">​</a></h2>
<p>In this demo, I asked the same crazy question of both my locally hosted Ollama instance and Microsoft Copilot.</p>
<p>The response below is from my locally hosted Ollama instance:</p>
<p><img decoding="async" loading="lazy" alt="Demo screenshot" src="https://jigsawflux.org/blog/assets/images/demo-screenshot-635cbeab60411a800cc496f69275cd46.png" width="795" height="1185" class="img_ev3q"></p>
<p>The chat tab lets you send prompts directly to the model. The health indicator in the header shows live status of the MCP and Ollama connections. If either goes down, the indicator updates and requests fail with a clear error message rather than a silent timeout.</p>
<p>The response below is from Microsoft Copilot:</p>
<p><img decoding="async" loading="lazy" alt="Demo copilot" src="https://jigsawflux.org/blog/assets/images/copilot-538063f765f067c2f4433355ccc48299.png" width="823" height="1067" class="img_ev3q"></p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="whats-next">What's Next<a href="https://jigsawflux.org/blog/local-llm-ollama-mcp-spike#whats-next" class="hash-link" aria-label="Direct link to What's Next" title="Direct link to What's Next">​</a></h2>
<p>This is a spike — a working proof of concept, not production software. The obvious next steps are:</p>
<ul>
<li><strong>Streaming responses</strong> — currently the UI waits for the full completion; streaming would feel much more interactive</li>
<li><strong>Conversation history</strong> — right now each prompt is stateless; persisting context would allow follow-up questions</li>
<li><strong>Structured logging</strong> — correlation IDs are assigned but not yet threaded through all log lines</li>
<li><strong>Docker packaging</strong> — containerising all three services would make deployment reproducible anywhere</li>
</ul>
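<p>For the streaming item, one possible starting point: with <code>stream</code> enabled, Ollama's <code>/api/chat</code> returns newline-delimited JSON chunks, each carrying a fragment of the reply and a <code>done</code> flag. The sketch below folds such chunks into the text so far. <code>appendChunks</code> is a hypothetical helper, and it assumes each call receives whole lines, which a real implementation would need to buffer for.</p>

```typescript
// Subset of a streamed /api/chat chunk.
interface StreamChunk {
  message?: { content: string };
  done: boolean;
}

// Folds newline-delimited JSON chunks into the text received so far.
function appendChunks(soFar: string, ndjson: string): { text: string; done: boolean } {
  let text = soFar;
  let done = false;
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines between chunks
    const chunk = JSON.parse(line) as StreamChunk;
    text += chunk.message?.content ?? "";
    done = done || chunk.done;
  }
  return { text, done };
}
```

<p>The UI could call this on every batch of lines and re-render the partial text until <code>done</code> is true.</p>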
<p>For JigsawFlux, the interesting application of this architecture is in tools for field teams and health workers, where data privacy matters, connectivity is unreliable, and a locally running model on a shared device could provide genuine decision support.</p>
<p>The full source is on <a href="https://github.com/JigsawFlux/ollama-mcp-starter" target="_blank" rel="noopener noreferrer">GitHub</a>. Contributions and feedback welcome.</p>]]></content>
        <author>
            <name>Suresh Thomas</name>
            <uri>https://github.com/st185229</uri>
        </author>
        <category label="local-llm" term="local-llm"/>
        <category label="ollama" term="ollama"/>
        <category label="mcp" term="mcp"/>
        <category label="architecture" term="architecture"/>
        <category label="open-source" term="open-source"/>
    </entry>
</feed>