[{"data":1,"prerenderedAt":1542},["ShallowReactive",2],{"article-\u002Fblog\u002Feleven-steps-you-dont-type":3,"related-\u002Fblog\u002Feleven-steps-you-dont-type":644},{"id":4,"title":5,"author":6,"body":14,"date":629,"description":630,"extension":631,"image":632,"meta":633,"navigation":634,"path":635,"seo":636,"stem":637,"tags":638,"__hash__":643},"blog\u002Fblog\u002Feleven-steps-you-dont-type.md","Eleven Steps You Don't Type",{"name":7,"headshot":8,"role":9,"contact":10},"Levente Simon","\u002Fheadshots\u002FLS.jpeg","creator of dethernety",{"linkedin":11,"email":12,"twitter":13},"https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Flevente-simon\u002F","levente.simon@dether.net","https:\u002F\u002Fx.com\u002FLevente_Simon",{"type":15,"value":16,"toc":616},"minimark",[17,21,28,33,36,39,42,45,48,51,59,66,73,76,79,84,99,106,109,130,133,140,147,150,153,156,160,167,176,179,182,185,188,191,195,198,205,208,242,247,286,289,300,303,306,309,312,316,319,325,332,338,342,345,348,351,357,371,377,383,389,392,395,398,402,414,417,420,424,427,430,433,436,439,442,445,448,477,480,487,490,493,496,499,506,510,513,523,530,536,543,547,550,553,556,559,562],[18,19,5],"h1",{"id":20},"eleven-steps-you-dont-type",[22,23,24],"p",{},[25,26,27],"em",{},"Staged delegation, and the shape of a guided workflow that actually gets used",[22,29,30],{},[25,31,32],{},"First in a series on Dethernety and Dethereal.",[22,34,35],{},"Threat modeling has a UX problem. Shift-left made it worse.",[22,37,38],{},"For most of those twenty years, in most organizations, it was a specialist activity: a small number of security architects, offline, a schedule set by the security team, an artifact delivered once and filed. The tool was whatever the security team happened to use. Visio, a spreadsheet, a commercial suite none of the developers had ever opened.",[22,40,41],{},"DevSecOps changed who is expected to do it. 
The current orthodoxy says threat modeling belongs with the engineers building and running the system, not with a security team that does it once at the design-review gate and never again. The artifact is supposed to be living. The audience includes the author. The work is supposed to happen early and continuously.",[22,43,44],{},"It has not. The reasons are plural: engineers often lack the threat-intel background, incentives reward feature velocity over security artifacts, security teams still gate-keep review, and most organizations have no loop between the model and the runtime controls that would give skipping it a consequence. A better tool does not fix any of those. It only removes the excuse that the tool is in the way.",[22,46,47],{},"This piece is about the reason closest to the tool itself: the tools did not follow. Engineers were told to threat-model but handed tools that either assumed they already thought in trust boundaries or assumed they did not write code. Neither assumption fit. The result, predictably, is that most teams either don't do it, do it once during design review and never touch it again, or outsource it back to the security team and pretend otherwise.",[22,49,50],{},"The tools we have come in three shapes, and all three fail in the same way: they put the wrong cognitive load on the wrong person at the wrong time.",[22,52,53,54,58],{},"The first shape is the ",[55,56,57],"strong",{},"diagram",". A canvas tool, sometimes a dedicated threat-modeling application, increasingly a diagrams-as-code format checked into the repo. The diagrams-as-code version solves the staleness problem that killed the canvas version: the diagram lives in git and gets updated in pull requests. 
The serious tools in this family go further: they run rule engines against the declarative model and emit threats, countermeasures, and compliance mappings for the author to adjudicate, and the best of them seed the initial diagram from IaC or reference architectures rather than a blank canvas. That shift, from recall toward adjudication, is the right direction. What the tools still require is that the author declare the structure the seeding guessed at — trust boundaries, data flows, decomposition depth — with no scaffolded adjudication flow around those declarations. The diagram is the question masquerading as the input.",[22,60,61,62,65],{},"The second shape is the ",[55,63,64],{},"form",". Modern form-based threat modeling tools ask the modeler to describe the system through structured inputs and check results against built-in threat libraries and compliance mappings. Some ship with pattern libraries that provide a starter shape for common architectures — microservice behind API gateway, lambda behind queue — which helps on the recognisable cases and does nothing off the template. They validate more than the STRIDE spreadsheets they replaced: they cross-reference countermeasures, flag gaps, link findings to frameworks. What they cannot validate is whether the modeler answered their questions well. The form asks: is this boundary a trust boundary? Is this data PII? A modeler who does not know the answer selects one anyway, and the form records the guess as fact. The form is complete long before the model is.",[22,67,68,69,72],{},"The third shape is the ",[55,70,71],{},"chat box",". The naive version is \"describe your system and I'll generate a threat model.\" The serious versions, scoped by a host product's data model so they cannot invent components outside its vocabulary, ask follow-up questions and validate against built-in libraries. 
Both still ask the modeler to know, up front, what the tool needs: sensitivity, boundaries, adversary classes, compliance drivers, crown jewels. The serious versions do shift work toward adjudication: the tool proposes, the user confirms. What they do poorly is scope and sequence the proposals. The output arrives as a long prose draft that the author validates line by line, hoping to catch hallucinated components and missed ones in flow. Staged delegation produces the same class of proposals in small structured batches instead of one continuous draft, and the batch is the unit of adjudication. Same instinct, different shape.",[22,74,75],{},"Three failure modes, one common root: the tool asks the human to structure the problem, and the human is not good at it. Diagrams demand graphical structure the modeler does not have. Forms demand taxonomic structure the modeler has not internalized. Chat demands a well-formed prompt the modeler cannot produce because they don't yet know what the tool needs.",[22,77,78],{},"There is a fourth shape, and it has been hiding in plain sight.",[80,81,83],"h2",{"id":82},"the-third-paradigm","The third paradigm",[22,85,86,87],{},"Jakob Nielsen has argued that AI marks a paradigm shift in user interfaces: away from command-based systems, where users \"strike every blow\" by executing step-by-step instructions, and toward intent-based systems, where users specify desired outcomes and let the system figure out procedures. The user stops being an operator and becomes a supervisor. The computer stops being a tool and becomes an agent.",[88,89,90],"sup",{},[91,92,98],"a",{"href":93,"ariaDescribedBy":94,"dataFootnoteRef":96,"id":97},"#user-content-fn-1",[95],"footnote-label","","user-content-fnref-1","1",[22,100,101,102,105],{},"The shift is real, but it has a failure mode Nielsen himself names: the ",[55,103,104],{},"articulation barrier",". Intent-based UIs assume the user can express what they want in a single well-formed statement. 
In expert-knowledge domains, the user often cannot. An engineer asked to threat-model their system does not walk up to a chat box already knowing the crown jewels, adversary classes, compliance drivers, trust boundaries, and decomposition depth they need to articulate. They know those things after thinking about the system, which is the work they are trying to do.",[22,107,108],{},"So both ends of Nielsen's spectrum fail for this kind of work. Command-based fails because the novice does not know which command to run next. Intent-based fails because the novice cannot state the intent; they do not yet know what the tool needs from them.",[22,110,111,112,116,117,116,120,116,123,126,127,129],{},"Consider what a command-driven interface looks like when stripped to its bones: ",[113,114,115],"code",{},"discover",", ",[113,118,119],{},"classify",[113,121,122],{},"enrich",[113,124,125],{},"sync",". Clean. Also unusable. Nobody running ",[113,128,119],{}," on a blank directory knows what they are classifying, or why, or in what order, or against which taxonomy.",[22,131,132],{},"Consider the opposite: a single prompt, a long system message, and an agent free to ask whatever it wants. This fails differently. The agent goes three turns into a conversation, decides it has enough context, and generates a model. Or it burrows into one component for thirty turns and forgets the rest of the system exists. The output looks like a threat model written by someone who has read a lot of threat models but has never debugged one.",[22,134,135,136,139],{},"Nielsen proposes one resolution to the articulation barrier: ",[55,137,138],{},"intent by discovery",", helping users recognize what they want through exploration. The user starts without a clear intent and surfaces it through interaction with the system. 
This is the right resolution when the problem is that the user does not yet know what they want.",[22,141,142,143,146],{},"There is a second resolution, appropriate when the user roughly knows what they want (\"a threat model of this system\") but cannot articulate the structure that definition requires. The resolution is not to let them explore until they discover it. It is to break the final artifact into its component parts, in a fixed order, and to hand each part to a specialist agent that can do most of the articulation work on the user's behalf. The user is left with the part they are actually good at: recognising whether a proposal is right, and saying where it is not. Call this ",[55,144,145],{},"staged delegation",": break one large intent into an ordered sequence of smaller ones, delegate each to the actor best placed to articulate it (the user where only the user knows the answer, a specialist agent where the answer is discoverable from the codebase or the model), and require the user to supervise every proposal before it becomes part of the model.",[22,148,149],{},"The cognitive shift is from recall (what do I want?) to recognition (is this right?), which is the easier problem for a non-expert author by a wide margin.",[22,151,152],{},"Against Nielsen's frame, the diagnosis is the same and the resolution is different: staged delegation rather than intent by discovery. This is not a new UI era. 
It is a pattern for a specific kind of problem: expert-knowledge work where the user is both author and novice, and where the artifact has enough internal structure to decompose.",[22,154,155],{},"What follows is what one implementation looks like, concretely.",[80,157,159],{"id":158},"meet-the-user-where-they-work","Meet the user where they work",[22,161,162,163,166],{},"Before the eleven steps, a question a fair reader is probably already asking: ",[25,164,165],{},"if the whole workflow is a conversation in a terminal, what about the security analyst who wants the graph, or the reviewer who wants to see the whole model laid out?"," The answer has two parts.",[22,168,170],{"align":169},"center",[171,172],"img",{"src":173,"alt":174,"width":175},"\u002Fimages\u002Fblog\u002Fdiagram-architecture.svg","Two user populations, two interfaces, one platform: the plugin and the web UI both talk to the Dethernety backend, which loads modules that call a graph DB, OPA, and an analysis engine",1000,[22,177,178],{},"The workflow runs inside Claude Code. This is not an implementation detail; it is the point. An engineer already in Claude Code does not want to stop working, open a browser, create an account on a threat modeling SaaS, upload a diagram, and answer a form. The moment you break their flow, the threat model stops happening. The context-switch tax is one of the reasons shift-left threat modeling stalls in practice. Telling an engineer to model a change, and then making them leave their editor to do it, is enough to kill an activity that carries no immediate reward.",[22,180,181],{},"So the conversation lives in the editor. The model is a directory tree on disk. The output is committable. The workflow is resumable across sessions. An engineer reviewing a pull request can threat-model the change in the same window where they are reading the diff.",[22,183,184],{},"This does not mean the web UI is obsolete. 
A security analyst reviewing an attack surface wants the graph: a visual editor where boundaries, flows, and exposures are laid out spatially. Forcing them into a CLI is the mirror-image mistake of forcing an engineer into a web form. The two interfaces serve different populations and different tasks: the plugin for authors in the loop, the web UI for analysts and reviewers who need to see the whole model at once. Both read and write the same underlying graph.",[22,186,187],{},"Pair modeling — a developer and a security engineer sitting together for the exercise, still the most productive way to do this work — happens in the same session, at the same terminal. The tool does not replace the collaboration. It gives both people a shared surface to argue over.",[22,189,190],{},"Threat modeling has to meet users where they work.",[80,192,194],{"id":193},"eleven-steps","Eleven steps",[22,196,197],{},"The guided workflow has eleven steps. They are not commands. The user does not pick them. The agent walks through them in order, and each one corresponds to a specific transition in the model's state machine.",[22,199,200],{"align":169},[171,201],{"src":202,"alt":203,"width":204},"\u002Fimages\u002Fblog\u002Fdiagram-state-machine.svg","Eleven steps across six states, with the session break between step 5 and step 6",900,[22,206,207],{},"The steps themselves:",[209,210,211,218,224,230,236],"ol",{},[212,213,214,217],"li",{},[55,215,216],{},"Scope Definition"," — what is this system, what are the crown jewels, what compliance drivers apply",[212,219,220,223],{},[55,221,222],{},"Discovery"," — scan the codebase for infrastructure, containers, IaC, API definitions",[212,225,226,229],{},[55,227,228],{},"Model Review"," — confirm the discovered elements, match them against the platform's class library",[212,231,232,235],{},[55,233,234],{},"Boundary Refinement"," — adjust trust boundaries, set enforcement attributes",[212,237,238,241],{},[55,239,240],{},"Data Flow Mapping"," — 
connect components, add operational flows the scanner missed",[22,243,244],{},[25,245,246],{},"— Session Break —",[209,248,250,256,262,268,274,280],{"start":249},6,[212,251,252,255],{},[55,253,254],{},"Classification"," — LLM-assisted class assignment for ambiguous elements",[212,257,258,261],{},[55,259,260],{},"Data Item Classification"," — tag sensitive data on cross-boundary flows",[212,263,264,267],{},[55,265,266],{},"Enrichment"," — security attributes and credentials against each class's schema",[212,269,270,273],{},[55,271,272],{},"Validation"," — quality score, gate checks, readiness assessment",[212,275,276,279],{},[55,277,278],{},"Sync"," — push to the platform for analysis",[212,281,282,285],{},[55,283,284],{},"Post-Sync Linking"," — link countermeasures to exposures",[22,287,288],{},"The eleven-step shape is not arbitrary. Each split is where it is for one of three reasons: a distinct reasoning mode, a distinct agent invocation, or a distinct moment when the user has to decide something. The rest of this section walks through the three places that make the shape's logic visible.",[22,290,291,292],{},"Scope comes first because every later decision depends on it. You cannot classify a data item as Restricted under PCI-DSS if you have not established that PCI-DSS is in scope. You cannot tag a component as a crown jewel if you have not named the crown jewels. The scope file is short, the questions are conversational, and the answers are referenced all the way through to validation.",[88,293,294],{},[91,295,299],{"href":296,"ariaDescribedBy":297,"dataFootnoteRef":96,"id":298},"#user-content-fn-2",[95],"user-content-fnref-2","2",[22,301,302],{},"Discovery is separate from classification because the two require different reasoning. 
Discovery is \"does this thing exist in the codebase, and what is it called.\" Classification is \"what kind of thing is it, and how does it fit in our taxonomy.\" Collapsing them produces a model full of plausibly-named components that turn out to be config files, or job schedulers that got classified as web servers because they happened to bind a port.",[22,304,305],{},"The session break between step five and step six is deliberate. Steps one through five build the structure of the model: what components exist, how they connect, which boundaries they sit in. Steps six through ten populate that structure with security context. The two phases are different cognitive modes. Discovery reasons from evidence to structure; enrichment reasons from structure to security properties. Keeping them in one session mixes two kinds of thinking in the same working memory, for the LLM and the human both. The practical consequence is that starting enrichment in a fresh session produces better output at lower cost, and the session break makes that explicit. It also gives the user a natural place to commit the structural model to git before the richer, more revisable enrichment passes happen.",[22,307,308],{},"Those three are enough to see the pattern. Each sits where it sits because moving it or dropping it breaks something concrete, usually not at the step you touched but downstream. The others rhyme.",[22,310,311],{},"Eleven is not a magic number. A reasonable decomposition of this artifact could have landed on nine or thirteen; the commitments are that every split sit on one of the three reasons above, and that every step be a precondition for the next. 
Eleven is what that produced here.",[80,313,315],{"id":314},"two-layers","Two layers",[22,317,318],{},"Seen from above, the workflow has two layers.",[22,320,321],{"align":169},[171,322],{"src":323,"alt":324,"width":204},"\u002Fimages\u002Fblog\u002Fdiagram-two-layers.svg","The outer layer is the fixed step sequence; the inner layer is a specialist agent proposing a batch that the user adjudicates before anything is persisted",[22,326,327,328,331],{},"The ",[55,329,330],{},"outer layer"," is the fixed sequence above. It is not command-based, because the user does not pick which step runs when. It is not intent-based, because the workflow itself infers nothing; its shape is fixed by the shape of the artifact the tool has to produce. Each step corresponds to a specific, named part of the model that must exist before the next step can proceed. Not a suggested order — a required one.",[22,333,327,334,337],{},[55,335,336],{},"inner layer"," is what happens inside each step. A specialist agent runs a well-defined procedure (scan the codebase, match elements against the class library, fill attribute schemas, score quality) and presents its output as a batch the user confirms, modifies, or rejects before anything is written to disk. Scope definition at the start is the exception: the user articulates the crown jewels, the compliance drivers, what is in scope and what is out, and the agent's job is to capture, not propose. Everywhere else the direction is inverted — the agent proposes, the user adjudicates — and the user supervises a pipeline of specialist proposals rather than a single black-box agent.",[80,339,341],{"id":340},"the-multi-brain","The multi-brain",[22,343,344],{},"The other half of the system is that the agent is not one agent. It is four. 
The first version was one, and it drifted.",[22,346,347],{},"A single orchestrator with a long system prompt covering discovery, classification, enrichment, and validation produces the failure mode described earlier: plausible-looking models with fabricated details, because the agent has no structural reason to separate \"I am scanning for infrastructure\" from \"I am filling security attributes against a class schema.\" In one context, it is all one job, and the job drifts in whichever direction the last few turns pushed it.",[22,349,350],{},"The four-agent split mirrors the four cognitive jobs.",[22,352,327,353,356],{},[55,354,355],{},"threat-modeler"," is the orchestrator. It reads the discovery report, presents it to the user, writes the confirmed elements to disk, and drives the workflow through its states. It owns the state machine. It handles user confirmations, batch-table presentations, and state transitions. It delegates enrichment and review rather than doing them inline, because each task has different context-budget needs.",[22,358,327,359,362,363,366,367,370],{},[55,360,361],{},"infrastructure-scout"," scans the codebase. It is read-only. It does not write any model files. It produces a discovery report: a structured list of components, each one carrying the source that produced it (file, line, resource) and two confidence buckets, one for ",[25,364,365],{},"existence",", one for ",[25,368,369],{},"classification",", each picked against a fixed rubric (high for an explicit declaration like a Kubernetes Service or a Terraform resource, medium for a strong inference like a Docker image or an import statement, low for a weak inference like a string literal or a comment). The buckets are rubric assignments against observable source properties, not a free-form self-assessment. Its exploration budget is bounded: discovery on a real codebase needs to look at a lot of files, but not an unbounded number of them. It has no concept of security attributes, classes, or MITRE. 
It does one thing.",[22,372,327,373,376],{},[55,374,375],{},"security-enricher"," writes attributes. It is the only sub-agent with write access to attribute files. It runs a two-pass classification (embedding-based matching against the class library first, LLM-assisted for the residue), pulls each matched class's attribute schema from the backend, and fills those attributes from what the scout discovered. It produces the credential topology — which identities hold which credentials to reach which resources. It does not assign ATT&CK techniques; that mapping happens on the platform, deterministically, from the attributes once the model is synced. Its budget is larger than the scout's because enrichment on a medium-sized model touches a lot of files. It has no opinion on whether a component should exist in the first place; that is the modeler's job.",[22,378,327,379,382],{},[55,380,381],{},"model-reviewer"," is a read-only auditor. It cannot modify any files. It computes a seven-factor quality score and evaluates three quality gates. The three gate-relevant factors — classification coverage, attribute completion, flow coverage — are grounded by construction: the numbers come from counting conditions over the graph, not from LLM judgment about whether a classification is sensible. The other four factors in the score carry some heuristic weighting and inform the dashboard, but they do not gate the workflow. The LLM's role at the review step is narrating the result, not producing it.",[22,384,385],{"align":169},[171,386],{"src":387,"alt":388,"width":204},"\u002Fimages\u002Fblog\u002Fdiagram-four-agent-permissions.svg","Permission matrix: each agent gets a tool allowlist that scopes it to one narrow job",[22,390,391],{},"Four agents, four roles, four permission sets, four exploration budgets scaled to the task. Each one has a narrow, well-defined job. None of them can accidentally do each other's work, because the tooling does not let them. 
The tools live on an MCP server the plugin ships with — twenty-two in total. Each agent starts with a role-scoped allowlist; the scout's omits write primitives entirely, and it cannot write a file even if its prompt told it to. The reviewer cannot mutate state. The enricher cannot create new components. The boundaries are enforced by the allowlist, not by the prompt.",[22,393,394],{},"That is what staged delegation needs on the agent side: specialization enforced at the tool layer, not in the system message. The scout's tools end where the modeler's begin. The reviewer can read everything and write nothing. The enricher owns its attribute files. A single generalist agent would have access to everything and would use it, all the time, for every task, exactly as the first version of the system did.",[22,396,397],{},"Tool permissions prevent the scout from writing an attribute file, but they do not prevent the scout from hallucinating a component with a plausible filename and confidence bucket. Role separation solves the cross-contamination problem: the scout cannot quietly rewrite attributes. It does not solve the hallucination problem, and the three agents that touch the model are not checked equally. The enricher classifies against a class library defined outside the agent and fills attributes whose schema lives on the backend; the reviewer counts conditions over the platform's graph. Both check themselves against something external. The scout does not — its evidence is its own narration of its own work, and a hallucinated component can come with plausible-looking evidence. Which is why step three — Model Review — is the adjudication step where the author's attention matters most: the scout's cited sources (file, line, resource) and rubric-based buckets exist so the author walks the evidence rather than rubber-stamping the conclusion. 
Permissions and grounding are different problems, and the less grounded agent is the one whose output the human has to touch first.",[80,399,401],{"id":400},"what-this-shape-is-not-for","What this shape is not for",[22,403,404,405,413],{},"Staged delegation is not a substitute for the live conversation. The continuous-threat-modeling school",[88,406,407],{},[91,408,412],{"href":409,"ariaDescribedBy":410,"dataFootnoteRef":96,"id":411},"#user-content-fn-3",[95],"user-content-fnref-3","3"," argues two things worth taking seriously: that the right unit of threat modeling is the change — a pull request, a design decision, a sprint ticket — not a quarterly all-day session, and that any artifact-centric workflow risks producing a clean document that gives the team permission to stop thinking. The document becomes the point, the practice withers, and a threat model in the repo becomes a worse outcome than no threat model at all, because it looks like coverage.",[22,415,416],{},"On the cadence claim, staged delegation is not in competition. The workflow lives on the pull request and is resumable at the diff level. The same sequence that produces the initial model runs on a later change, with only the stale elements going through classification and enrichment again. That is the cadence CTM demands — embedded in the engineer's loop, at the unit of the change — with the difference that the capture step is cheap enough that the continuous practice produces a durable trail instead of evaporating.",[22,418,419],{},"On the artifact-crowding-out-practice worry, the concession is real. A usable artifact can become a substitute for the thinking that produced it. The defence is not in the tool; it is in how the team uses it. Staged delegation does not enforce conversation, and a team that runs the workflow solo every time has given up the thing that made the CTM school right to begin with. 
What the artifact does offer is a surface where self-deception is more expensive: attributes are either filled or they are not, and the reviewer counts what is missing. The workflow raises the cost of lying to yourself; it does not eliminate it. Calling the artifact the output is not the same as calling it the point.",[80,421,423],{"id":422},"the-constraint-is-the-feature","The constraint is the feature",[22,425,426],{},"Staged delegation trades flexibility for structure, and the trade is deliberate.",[22,428,429],{},"What it gives up is flexibility. A power user who knows exactly what they want cannot skip straight to enrichment without at least a degenerate pass through scope and discovery. There is a command-based interface for those users, and individual commands work fine on their own. But even the commands enforce the state-machine preconditions: you cannot enrich a component that does not exist, you cannot classify a data item without a flow. When the preconditions are not met the commands fail loudly rather than auto-running the upstream steps and producing a silent partial model. The ordering is not in the UX; it is in the model. The guided workflow simply makes the ordering explicit and comfortable for a user who would otherwise have to discover it by running into errors.",[22,431,432],{},"It also assumes the shape of the artifact is roughly known. For the common case (a service with a codebase, IaC, a CI pipeline, a recognisable architecture) the shape fits. For systems where it does not — brownfield models inherited from an acquisition with no source to scan, SaaS integrations where most of the system is someone else's, regulated environments where scope is dictated by an auditor rather than a conversation, or architectures whose components fall outside the installed class library (a mainframe tier, a bespoke message bus, a medical-device subsystem) — the eleven-step shape is the wrong shape. 
The individual commands remain available and the guided workflow is not forced. What is given up is the scaffolding, and with it, the population the scaffolding was built for.",[22,434,435],{},"Two engineers running the same eleven steps on the same repo will not produce byte-identical models. The LLM-assisted steps (pass-two classification, the enricher's attribute inference) are non-deterministic, and two authors will make different calls when they adjudicate. The workflow narrows the variance by fixing scope, by matching against a class library, by forcing the same sequence of questions, but it does not eliminate it. The same engineer running the same workflow against the same repo six months later, after the model provider has updated the underlying LLM, will also see variance.",[22,437,438],{},"For regulated environments where reproducibility matters — an auditor reviewing the model against a specific point-in-time version of the system — this is not just a variance concern but a reproducibility one. The model lives on disk and commits to the same repo as the codebase, so git carries the versioning for both together: a tag pins the commit, which pins the model, the codebase, and the timestamp. Pinning the LLM is not enough on its own. The class library and the ATT&CK\u002FD3FEND graphs version too, and reproducibility needs those recorded alongside the commit — a note in the commit message, a footer in the scope file, whatever the team's practice allows. There is no separate manifest in the tool; there is git and there is the author's discipline. If you want byte-identical models without that plumbing, you write them by hand, which brings the articulation barrier back. The trade is reducing variance while keeping articulation affordable, not zeroing it.",[22,440,441],{},"And the quality floor is still set by the adjudicator. 
An engineer who cannot recognise a bad proposal will accept one, and a user who clicks through proposals without reading them reproduces the form failure one layer up — the workflow completes, the model is wrong, the failure has just moved from the input side to the adjudication side. Staged delegation cannot prevent this. What it offers is proposals that are small enough and specific enough that reading them costs less than ignoring them.",[22,443,444],{},"There is a related limit worth naming, and it has two parts.",[22,446,447],{},"The first is what the scout cannot see no matter how much access you grant it: systems behind credentials it does not hold, third-party SaaS whose APIs it cannot reach, human-process steps that form part of the real threat model (a Slack approval gate in a deployment pipeline), and anything that lives only in someone's head. Even with full cluster access, runtime-only behaviour — cron jobs buried in container entrypoints, sidecar injections from admission controllers, service meshes that rewrite traffic paths — is only partially visible: what the scout sees and what the system does at runtime are not the same set. These gaps do not close with more tooling; they close only with an author who notices and fills them in by hand.",[22,449,450,451,116,454,457,458,461,462,465,466,469,470,473,474,476],{},"The second is what it can see but might not be allowed to. The scout reads the codebase by default, and if the workstation has ",[113,452,453],{},"kubectl",[113,455,456],{},"aws",", or ",[113,459,460],{},"terraform"," configured, it can introspect live infrastructure through read-only commands and pick up runtime-only components that never appear in source. Whether you grant it that access is a trust decision, not a default. An LLM running ",[113,463,464],{},"aws describe"," against a production account is not a choice to make casually, and the answers address two different risks. 
Read-only roles and test-environment restrictions mitigate the ",[25,467,468],{},"blast radius",": even if the agent misbehaves, it cannot mutate state it was not given permission to mutate. Pre-extracting the data and handing the agent a file is a different mitigation — it addresses the ",[25,471,472],{},"data egress"," risk. Everything the scout reads, including ",[113,475,453],{}," stdout, IAM role names, security group rules, S3 ARNs, along with the Terraform code and the source itself, is sent to whoever hosts the model as context. Read-only does not mean read-nothing-sensitive.",[22,478,479],{},"Which model provider sees that context is part of the workflow decision, not just the procurement decision. A regulated shop picks along three axes: self-hosted inference (no third-party provider sees the context), redaction at the scout boundary (identifiers, secrets, and customer data stripped before they leave the workstation), or scope restriction (point the agent only at what you are already willing to send outside the perimeter). This is true of every AI-assisted workflow, not just this tool.",[22,481,482,483,486],{},"A related risk neither half of the split addresses: the agent can be manipulated by its inputs. An IaC comment, a Dockerfile, or ",[113,484,485],{},"kubectl describe"," output in a compromised repo is attacker-controlled in adversarial settings, and a prompt-injection payload riding that input is something neither read-only roles nor pre-extraction prevents. 
It lives in the same class of problem as malicious pull-request reviews and needs the same kinds of defence — input sanitisation, scope limits on what tools the agent can call on what it reads — with the caveat that sanitisation here is best-effort: LLM context has no parameterised-query equivalent.",[22,488,489],{},"The workflow is only as good as what the scout can see, given the access you are willing to grant it, plus what the author adds by hand.",[22,491,492],{},"The trade buys three things: resumability, inspectability, composability. Each falls out of the state machine and the directory tree on disk. None of them survive in a free-form chat.",[22,494,495],{},"Because the workflow is a state machine, the user can stop at any step, close the session, come back a week later, and pick up exactly where they left off. The progress table shows what is done, what is auto-skipped, what is current, and what is pending. When the session dies in a chat the context dies with it; here the context is the directory.",[22,497,498],{},"Every step produces an output that lives on disk in a human-readable format. Scope is a JSON file. Discovery produces a structure file. Classification updates class fields. Enrichment writes per-element attribute files. The user can read any of it, edit any of it in a text editor, commit any of it to git, and point an auditor at any of it. The model is not a hidden state inside an agent; it is a directory tree on the user's disk, under their filesystem permissions, in their git history.",[22,500,501,502,505],{},"And because the steps are semantically well-defined, the agent can auto-skip steps whose conditions are already met. If every discovered component matches a class unambiguously on the first pass, step six shows a green check and the LLM-assisted pass has nothing to resolve. 
If the user adds a component during enrichment, the state reverts to ",[113,503,504],{},"STRUCTURE_COMPLETE",", the new element gets flagged as stale, and enrichment re-runs only on the stale element. Staleness propagates: a changed component invalidates flows crossing it, which invalidates data items on those flows, and the agent computes the closure before re-running. The same logic holds across sessions — a developer opening a pull request two months later runs the workflow on the diff, and only the stale elements go through classification and enrichment again. None of this would work if the steps were just narrative waypoints in a long prompt.",[80,507,509],{"id":508},"three-things-underneath","Three things underneath",[22,511,512],{},"This shape rests on three things the later pieces in the series will take on directly.",[22,514,515,516,519,520,522],{},"First, a ",[55,517,518],{},"graph-native backend"," whose topology actually enforces state transitions. A graph with typed nodes, typed edges, and enforced classes, where ",[113,521,504],{}," is not a flag on a document but a condition on the graph: every component belongs to exactly one boundary, no orphan flows, every flow has a classified source and target. The state machine above is the user-visible expression of invariants the graph already enforces.",[22,524,525,526,529],{},"Second, a ",[55,527,528],{},"modular analysis layer"," on the platform. Analyses like attack path generation, compliance mapping, and control coverage read the same graph through the same interface and produce exposures — reachable attack paths, policy violations, control gaps — rather than proprietary output. Because MITRE ATT&CK and D3FEND are loaded into the same graph, each exposure is linked to the ATT&CK techniques an adversary would use against it and, where a countermeasure applies, to the D3FEND techniques that defend against it. 
\"Deterministic\" here means the mapping is a pure function of the graph state, given a fixed ruleset and taxonomy version. Not authoritative. Reproducible. Attacker and defender views meet on the same model, not in a spreadsheet next to it.",[22,531,532],{"align":169},[171,533],{"src":534,"alt":535,"width":204},"\u002Fimages\u002Fblog\u002Fdiagram-graph-fragment.svg","Exposures are written by the plugin; their links to ATT&CK and D3FEND techniques are added deterministically on the backend, closing the loop between attacker and defender views",[22,537,538,539,542],{},"Third, a ",[55,540,541],{},"role-separated agent architecture"," where cross-agent permission boundaries are enforced at the tool layer rather than in the prompt. That last guarantee is narrower than it sounds — permissions stop one agent from doing another's work, they do not prevent any of them from being wrong — but it is real, and almost nothing else in the multi-agent space bothers to enforce it.",[80,544,546],{"id":545},"the-paradigm-is-transferable","The paradigm is transferable",[22,548,549],{},"The shape is not really about threat modeling.",[22,551,552],{},"Any expert-knowledge domain where a non-expert has to produce a structured artifact, at a pace that allows per-step supervision, runs into the same articulation barrier. Architecture review is the clearest example. An engineer proposing a design is asked to produce a component diagram, a trust-boundary analysis, a failure-mode table, and a written rationale in a specific format. The established tooling there — ADRs, C4 diagrams, RFC templates — gives them the outline but not the content. They cannot articulate what goes inside in one prompt, not because they lack skill but because the work of thinking about it is exactly what they are being asked to do. 
A guided workflow that delegates the legwork to specialists — one agent reads the codebase and proposes the component diagram, another walks the trust boundaries, a third scans for failure modes, a reviewer checks against architectural principles — does not turn them into a senior architect. It lets them produce a usable review by recognising good answers rather than recalling them, inside the editor they are already writing the design in. Compliance gap analysis, regulated-document drafting, and clinical decision support have the same shape. Real-time adversarial work does not: an analyst paged at two in the morning has no time for eleven supervised steps, and incident response is about moving faster than the attacker, not producing an inspectable artifact. The pattern is for slow expert-knowledge work.",[22,554,555],{},"Staged delegation is one operational answer to the articulation barrier: a fixed outer workflow, specialist agents that articulate where they can, supervised proposals where the user has to adjudicate, delivered in the tool the user already has open. It generalizes to any domain where the artifact has enough internal structure to decompose. The hard part is not the technology. It is the discipline of saying no to the free-form chat box, to the general-purpose agent, and to the power-user shortcut.",[22,557,558],{},"A better interface removes the excuse that the tool was in the way. It does not remove the rest. The incentive and loop problems named at the start are still there, untouched, and a team that does not threat-model will not start just because the tool got better. Those are different conversations, and this piece was about only one of them.",[22,560,561],{},"The constraint is not a limitation to apologize for. 
It is what makes a language model useful for this kind of work.",[563,564,567,572],"section",{"className":565,"dataFootnotes":96},[566],"footnotes",[80,568,571],{"className":569,"id":95},[570],"sr-only","Footnotes",[209,573,574,594,603],{},[212,575,577,578,586,587],{"id":576},"user-content-fn-1","Jakob Nielsen, ",[91,579,583],{"href":580,"rel":581},"https:\u002F\u002Fwww.uxtigers.com\u002Fpost\u002Fintent-ux",[582],"nofollow",[25,584,585],{},"Intent by Discovery: Designing the AI User Experience",", March 26, 2026. \"Articulation barrier\" and \"intent by discovery\" are his terms, picked up here because they name the problem cleanly. The broader HCI tradition (mixed-initiative interaction, scaffolded workflows, progressive disclosure, wizard-style UIs) has been working this territory for decades and the debt is acknowledged. \"Staged delegation\" is used here to name a distinct resolution to the same problem. ",[91,588,593],{"href":589,"ariaLabel":590,"className":591,"dataFootnoteBackref":96},"#user-content-fnref-1","Back to reference 1",[592],"data-footnote-backref","↩",[212,595,597,598],{"id":596},"user-content-fn-2","How the workflow produces artifacts an auditor will accept — the connection between attributes, the compliance taxonomies loaded in the graph, and what lands in an evidence bundle — is deferred to a later piece on the analysis layer. ",[91,599,593],{"href":600,"ariaLabel":601,"className":602,"dataFootnoteBackref":96},"#user-content-fnref-2","Back to reference 2",[592],[212,604,606,607,610,611],{"id":605},"user-content-fn-3","The book-length articulation is Izar Tarandach and Matthew J. Coles, ",[25,608,609],{},"Threat Modeling: A Practical Guide for Development Teams"," (O'Reilly, 2020), and the continuous-threat-modeling community that grew around it. 
",[91,612,593],{"href":613,"ariaLabel":614,"className":615,"dataFootnoteBackref":96},"#user-content-fnref-3","Back to reference 3",[592],{"title":96,"searchDepth":617,"depth":617,"links":618},2,[619,620,621,622,623,624,625,626,627,628],{"id":82,"depth":617,"text":83},{"id":158,"depth":617,"text":159},{"id":193,"depth":617,"text":194},{"id":314,"depth":617,"text":315},{"id":340,"depth":617,"text":341},{"id":400,"depth":617,"text":401},{"id":422,"depth":617,"text":423},{"id":508,"depth":617,"text":509},{"id":545,"depth":617,"text":546},{"id":95,"depth":617,"text":571},"2026-04-20","Threat modeling stalls in shift-left workflows because intent-based interfaces run into an articulation barrier. Staged delegation inside the engineer's editor, backed by specialist agents and a graph-native model, is one resolution.","md","\u002Fimages\u002Fblog\u002Feleven-steps-cover.jpg",{},true,"\u002Fblog\u002Feleven-steps-you-dont-type",{"title":5,"description":630},"blog\u002Feleven-steps-you-dont-type",[639,640,641,642],"threat modeling","agents","interface design","graph","wNDFIuDykVbCqCf3ArjGrkpDRXPEhD6oMiFWCsLEgv8",[645,1259],{"id":646,"title":647,"author":648,"body":650,"date":1241,"description":1242,"extension":631,"image":1243,"meta":1244,"navigation":634,"path":1249,"seo":1250,"stem":1251,"tags":1252,"__hash__":1258},"blog\u002Fblog\u002Fthe_lost_science.md","The Lost Science: How We Forgot Security Was a Graph 
Problem",{"name":7,"headshot":8,"role":9,"contact":649},{"linkedin":11,"email":12,"twitter":13},{"type":15,"value":651,"toc":1216},[652,655,660,663,669,672,675,679,682,687,690,695,701,708,722,729,733,736,739,742,756,759,762,766,772,775,790,793,819,825,832,836,843,846,853,856,860,863,866,870,873,876,880,883,886,889,892,896,899,913,916,919,922,926,929,1010,1013,1016,1020,1023,1027,1030,1053,1060,1063,1067,1070,1077,1081,1084,1087,1094,1097,1108,1112,1115,1122,1129,1132,1138,1148,1152,1155,1158,1161,1164,1168,1171,1174,1177,1181,1184,1201,1206],[18,653,647],{"id":654},"the-lost-science-how-we-forgot-security-was-a-graph-problem",[22,656,657],{},[25,658,659],{},"In 1976, computer scientists proved that security is a graph traversal problem. Then we forgot. Here's why, and why it matters now.",[22,661,662],{},"If you ask a modern security architect how access control works, they'll describe Access Control Lists: users, groups, permissions, roles. Flat tables. Lookup operations.",[22,664,665,666],{},"But if you asked the same question to a computer scientist in 1976, they would have drawn you a graph. Nodes for subjects and objects. Directed edges for rights. And they would have told you: ",[25,667,668],{},"\"Security is the question of whether a path exists.\"",[22,670,671],{},"We knew this. We proved it. Then we forgot it. It was not wrong, but we couldn't afford to compute it.",[22,673,674],{},"This is the story of how security became a graph problem, why we abandoned that insight, and why graph databases now let us pick it back up.",[80,676,678],{"id":677},"the-golden-age-when-security-was-mathematics-1973-1983","The golden age: when security was mathematics (1973-1983)",[22,680,681],{},"The early 1970s were an unusual period for computer security. The field wasn't dominated by vendors selling products. 
It was dominated by mathematicians asking fundamental questions:",[22,683,684],{},[25,685,686],{},"\"Can we prove that a system is secure?\"",[22,688,689],{},"The answers they developed weren't heuristics or best practices. They were formal models: mathematical frameworks that could provide actual guarantees. And almost all of them were graph problems.",[691,692,694],"h3",{"id":693},"bell-lapadula-confidentiality-as-information-flow-1973","Bell-LaPadula: confidentiality as information flow (1973)",[22,696,697,698],{},"David Elliott Bell and Leonard LaPadula, working for MITRE under a US Air Force contract, asked: ",[25,699,700],{},"\"How do we prevent classified information from leaking to unauthorized users?\"",[22,702,703,704,707],{},"They modeled security clearances as a lattice—a mathematical structure where elements have a defined ordering (",[113,705,706],{},"Top Secret > Secret > Confidential > Unclassified","). Then they defined two simple rules:",[209,709,710,716],{},[212,711,712,715],{},[55,713,714],{},"No Read Up (Simple Security):"," A subject cannot read an object at a higher classification level.",[212,717,718,721],{},[55,719,720],{},"No Write Down (Star Property):"," A subject cannot write to an object at a lower classification level.",[22,723,724,725,728],{},"Information flow is directional. If you model clearances as nodes and permitted flows as edges, then a security violation is simply ",[25,726,727],{},"a path that shouldn't exist",". 
Confidentiality becomes a graph reachability problem.",[691,730,732],{"id":731},"biba-the-integrity-inverse-1977","Biba: the integrity inverse (1977)",[22,734,735],{},"Kenneth Biba, working at MITRE on a different problem, realized that integrity is the mirror image of confidentiality.",[22,737,738],{},"Where Bell-LaPadula asks \"Can secrets leak down?\", Biba asks \"Can corruption flow up?\"",[22,740,741],{},"His model inverted the rules:",[209,743,744,750],{},[212,745,746,749],{},[55,747,748],{},"No Read Down:"," A subject cannot read from a lower integrity level (don't trust untrusted data).",[212,751,752,755],{},[55,753,754],{},"No Write Up:"," A subject cannot write to a higher integrity level (don't corrupt trusted data).",[22,757,758],{},"Same lattice structure. Same graph problem. Different direction of concern.",[22,760,761],{},"Together, Bell-LaPadula and Biba showed that both confidentiality and integrity could be modeled as constrained information flow on a graph. Security was about proving that certain paths could never exist. 
Not checking permissions.",[691,763,765],{"id":764},"take-grant-security-as-graph-rewriting-1976","Take-Grant: security as graph rewriting (1976)",[22,767,768,769],{},"While Bell-LaPadula and Biba focused on information flow, Jones, Lipton, and Snyder asked a different question: ",[25,770,771],{},"\"How do permissions propagate?\"",[22,773,774],{},"Their Take-Grant model was explicitly a directed graph:",[776,777,778,784],"ul",{},[212,779,780,783],{},[55,781,782],{},"Nodes:"," Subjects (users, processes) and Objects (files, resources)",[212,785,786,789],{},[55,787,788],{},"Edges:"," Rights (read, write, take, grant)",[22,791,792],{},"The model defined four operations that could modify the graph:",[776,794,795,801,807,813],{},[212,796,797,800],{},[55,798,799],{},"Take:"," If A has \"take\" rights to B, and B has rights to C, then A can acquire B's rights to C.",[212,802,803,806],{},[55,804,805],{},"Grant:"," If A has \"grant\" rights to B, A can give B any rights that A possesses.",[212,808,809,812],{},[55,810,811],{},"Create:"," A subject can create new nodes.",[212,814,815,818],{},[55,816,817],{},"Remove:"," A subject can remove edges it controls.",[22,820,821,822],{},"The security question became: ",[25,823,824],{},"\"Given an initial graph and these rewriting rules, can subject X ever acquire right R to object Y?\"",[22,826,827,828,831],{},"This is pure graph theory, and it comes with a strong result: ",[55,829,830],{},"the safety problem in Take-Grant is decidable in linear time",". You can actually prove whether a right can ever leak.",[691,833,835],{"id":834},"harrison-ruzzo-ullman-the-limits-of-decidability-1976","Harrison-Ruzzo-Ullman: the limits of decidability (1976)",[22,837,838,839,842],{},"Not all news was good. 
Harrison, Ruzzo, and Ullman studied a more general access control model and proved a harder result: ",[55,840,841],{},"in the general case, the safety problem is undecidable",".",[22,844,845],{},"You cannot write an algorithm that will always correctly determine whether a given access control system can ever reach an unsafe state.",[22,847,848,849,852],{},"But they also showed that restricted models ",[25,850,851],{},"are"," decidable. If you constrain the graph structure (limit the operations, bound the complexity), you can recover formal guarantees.",[22,854,855],{},"The theorists had mapped the terrain: security is a graph problem, and the key question is whether your graph is constrained enough to be analyzable.",[80,857,859],{"id":858},"the-pragmatic-retreat-why-we-over-corrected-1980s-2000s","The pragmatic retreat: why we over-corrected (1980s-2000s)",[22,861,862],{},"If security was solved in theory by 1983, why are we still struggling with access control in 2026?",[22,864,865],{},"Two forces pushed at once, and we over-corrected for both.",[691,867,869],{"id":868},"the-hardware-reality","The hardware reality",[22,871,872],{},"In 1976, a PDP-11 had 64KB of RAM. Graph traversal algorithms that are trivial today were prohibitively expensive. Running a Take-Grant safety analysis on a real system with thousands of users and objects? Impossible.",[22,874,875],{},"The formal models were correct but practically useless.",[691,877,879],{"id":878},"the-models-own-limitations","The models' own limitations",[22,881,882],{},"Computational cost wasn't the only problem. The formal models themselves had rough edges that made practitioners skeptical.",[22,884,885],{},"Bell-LaPadula's \"tranquility\" property required that security labels never change during operation. That was too rigid for real systems where users legitimately need to reclassify data. 
Biba's integrity lattice couldn't capture what commercial systems actually needed: separation of duties, well-formed transactions, audit trails.",[22,887,888],{},"Clark and Wilson recognized this in 1987. Their integrity model abandoned the lattice approach entirely. Real-world integrity, they argued, isn't about information flow direction. It's about ensuring that data is only modified through authorized, validated procedures. They were right about that.",[22,890,891],{},"But the lesson the industry drew was broader than Clark and Wilson intended. The takeaway wasn't \"lattice models need refinement.\" It was \"formal models don't work in practice.\" That was the over-correction. The models needed better hardware and more nuance. Instead, we threw out the graph entirely.",[691,893,895],{"id":894},"the-compromise-flatten-the-graph","The compromise: flatten the graph",[22,897,898],{},"System designers needed something that would actually run. Their solution was to flatten the graph into tables:",[776,900,901,907],{},[212,902,903,906],{},[55,904,905],{},"Access Control Lists (ACLs):"," For each object, list who can access it.",[212,908,909,912],{},[55,910,911],{},"Capability Lists:"," For each subject, list what they can access.",[22,914,915],{},"Both are projections of the underlying graph onto flat structures. 
They're fast to query (O(1) lookup), easy to store, and simple to understand.",[22,917,918],{},"But they can't answer path questions.",[22,920,921],{},"An ACL can tell you \"Alice has read access to File X.\" It cannot tell you \"If Alice compromises Service A, can she eventually reach Database Z?\" That's a multi-hop traversal, exactly what ACLs were designed not to compute.",[691,923,925],{"id":924},"what-we-lost","What we lost",[22,927,928],{},"When we flattened the graph, we lost the ability to answer the questions that actually matter:",[930,931,932,948],"table",{},[933,934,935],"thead",{},[936,937,938,942,945],"tr",{},[939,940,941],"th",{},"Question",[939,943,944],{},"Graph Model",[939,946,947],{},"ACL Model",[949,950,951,962,977,988,999],"tbody",{},[936,952,953,957,960],{},[954,955,956],"td",{},"\"Can Alice access File X?\"",[954,958,959],{},"Trivial",[954,961,959],{},[936,963,964,971,974],{},[954,965,966,967,970],{},"\"Can Alice ",[25,968,969],{},"ever"," reach Database Z?\"",[954,972,973],{},"Path query",[954,975,976],{},"Manual analysis",[936,978,979,982,985],{},[954,980,981],{},"\"If we add this permission, what new paths open?\"",[954,983,984],{},"Graph diff",[954,986,987],{},"Unknown",[936,989,990,993,996],{},[954,991,992],{},"\"Where are the transitive trust relationships?\"",[954,994,995],{},"Traversal",[954,997,998],{},"Invisible",[936,1000,1001,1004,1007],{},[954,1002,1003],{},"\"Is our permission structure safe?\"",[954,1005,1006],{},"Decidable (constrained)",[954,1008,1009],{},"Undecidable (in practice, unknown)",[22,1011,1012],{},"We traded formal guarantees for performance and simplicity. For 30 years, that was a reasonable trade.",[22,1014,1015],{},"But the constraints that forced it have been gone for over a decade. 
At some point, a reasonable compromise becomes an unexamined habit.",[80,1017,1019],{"id":1018},"the-irony-we-rebuilt-the-graph-badly","The irony: we rebuilt the graph (badly)",[22,1021,1022],{},"The irony is that modern enterprise security has recreated the graph problem at massive scale while still pretending we have flat ACLs.",[691,1024,1026],{"id":1025},"cloud-iam-the-implicit-graph","Cloud IAM: the implicit graph",[22,1028,1029],{},"Look at AWS IAM:",[776,1031,1032,1035,1038,1041,1047],{},[212,1033,1034],{},"IAM Users and Roles are subjects",[212,1036,1037],{},"Resources (S3 buckets, EC2 instances, Lambda functions) are objects",[212,1039,1040],{},"Policies define edges",[212,1042,1043,1046],{},[55,1044,1045],{},"AssumeRole"," is literally the \"take\" operation from Take-Grant",[212,1048,1049,1052],{},[55,1050,1051],{},"Resource-based policies"," create cross-account edges",[22,1054,1055,1056,1059],{},"AWS IAM ",[25,1057,1058],{},"is"," a graph. But AWS gives you no native tools to query it as one. You get the IAM Policy Simulator: a point query tool in a world that needs path analysis.",[22,1061,1062],{},"So security teams discover that a misconfigured role in Account A can assume into Account B, which has a policy that allows access to Account C's production database. Three hops. Invisible to any single ACL review.",[691,1064,1066],{"id":1065},"the-kubernetes-permission-graph","The Kubernetes permission graph",[22,1068,1069],{},"Kubernetes might be the worst offender. ServiceAccounts, Roles, RoleBindings, ClusterRoles, ClusterRoleBindings. All edges in a graph. Namespace boundaries create subgraphs. Pod security contexts add more nodes.",[22,1071,1072,1073,1076],{},"And the graph has hidden edges that RBAC doesn't model. A ServiceAccount with permission to list Secrets in a namespace can read every token stored there, including tokens for more privileged ServiceAccounts. 
That's a path through two different edge types (RBAC grants access to Secrets, Secrets contain credentials) that no single ",[113,1074,1075],{},"kubectl auth can-i"," check will ever surface. It's Bell-LaPadula's read-up problem in miniature: low-trust workloads reading their way up to high-trust credentials.",[691,1078,1080],{"id":1079},"active-directory-the-original-sin","Active Directory: the original sin",[22,1082,1083],{},"Active Directory has been a graph since 1999. Users, Groups, OUs, GPOs, Trust Relationships. All edges in a directed graph. Nested group memberships create transitive paths. Trust relationships create cross-domain paths.",[22,1085,1086],{},"Every AD privilege escalation attack (Kerberoasting, DCSync, Golden Ticket paths) is a graph traversal exploit. The attackers know this. In 2016, the BloodHound project made it explicit: ingest AD relationships, build a directed graph, find the shortest path to Domain Admin. It works devastatingly well precisely because it models AD as what it actually is.",[22,1088,1089,1090,1093],{},"Defenders, meanwhile, run ",[113,1091,1092],{},"Get-ADUser"," queries and review group memberships in spreadsheets.",[22,1095,1096],{},"We've spent 25 years defending against graph attacks with table tools.",[22,1098,1099,1100,1103,1104,1107],{},"BloodHound has been open-source since 2016. Defenders can use it too. But BloodHound answers an attacker's question: ",[25,1101,1102],{},"\"What's the shortest path to Domain Admin?\""," The defensive inverse — ",[25,1105,1106],{},"\"show me everything that can reach our crown jewels, continuously, across every environment\""," — needs different tooling and a different architectural commitment. One most security teams haven't made, because nobody is selling it to them.",[80,1109,1111],{"id":1110},"the-return-graph-databases-make-this-practical","The return: graph databases make this practical",[22,1113,1114],{},"In 2007, Neo4j released the first production graph database. 
By 2015, graph databases were mainstream. The computational barrier that forced us to abandon formal models in the 1980s no longer exists.",[22,1116,1117,1118,1121],{},"Graph traversal that was impossible on a PDP-11 now runs in milliseconds on commodity hardware. A path query that answers ",[25,1119,1120],{},"\"Can Principal X ever reach Resource Y through any chain of permissions?\""," is a single Cypher statement. Tools like AWS Access Analyzer have started nibbling at this problem, but they're still point queries against specific policy combinations, not full path traversals across trust boundaries.",[22,1123,1124,1125,1128],{},"The difference matters. A point query tells you whether one specific permission is granted. A graph query tells you whether a ",[25,1126,1127],{},"path"," exists that you never intended to create. The three-hop AWS role chain, the Kubernetes Secret that bridges two privilege levels, the nested AD group that grants Domain Admin through six degrees of membership. These are all paths. They're invisible to point queries and obvious to graph traversal.",[22,1130,1131],{},"The 1970s papers showed that if you model your system as a graph with appropriate constraints, you can prove security properties. For 40 years, nobody had the hardware to act on that. Now we do. A graph database holding your IAM policies, network topology, trust relationships, and data flows can answer questions that no combination of ACLs, spreadsheets, and manual reviews can touch.",[22,1133,1134,1135,1137],{},"The safety guarantees that Bell, LaPadula, Biba, and the Take-Grant authors described are implementable now, provided you constrain the model appropriately. The HRU undecidability result still holds for the general case. But most real systems ",[25,1136,851],{}," constrained, and that's exactly where the formal results apply.",[22,1139,1140,1141,1144,1145,1147],{},"Some vendors are catching on. Wiz builds attack graphs across cloud environments. 
XM Cyber models attacker paths to critical assets. These are real steps forward — they ask path questions, not point questions. But they solve half the original problem: they find paths that exist ",[25,1142,1143],{},"right now",". The formal question the 1970s models posed was stronger: can this system ",[25,1146,969],{}," reach an unsafe state? That's the difference between a snapshot and a proof. Graph databases give us the machinery for both. The industry has mostly picked up the first half. The mainstream vendor ecosystem is still selling better ACL management.",[80,1149,1151],{"id":1150},"the-question-we-should-be-asking","The question we should be asking",[22,1153,1154],{},"The security industry has spent two decades building increasingly sophisticated ACL management tools. Better UIs for permission tables. More granular RBAC. More complex policy languages. All of it optimizing the lookup.",[22,1156,1157],{},"None of it asks whether the path exists.",[22,1159,1160],{},"The 1970s theorists were decades ahead of the hardware. They understood that security is about paths, flows, and reachability. They built formal models to prove it, and then had to shelve those models because nothing could run them fast enough.",[22,1162,1163],{},"The hardware caught up 15 years ago. The question is why we're still pretending that flattening a graph into ACLs is anything other than a legacy compromise.",[691,1165,1167],{"id":1166},"whats-left","What's left",[22,1169,1170],{},"That said, recognizing the problem and fixing it are different things. Graph databases are mature. A few vendors are asking path questions. The attack side has been thinking in graphs for a decade. But this still isn't how most defenders work.",[22,1172,1173],{},"The theory has been there since the 1970s. The compute is there now. The attackers figured it out a decade ago. 
We're still waiting for the defenders to close the loop.",[1175,1176],"hr",{},[80,1178,1180],{"id":1179},"references-further-reading","References & further reading",[22,1182,1183],{},"The original papers, if you're curious:",[776,1185,1186,1189,1192,1195,1198],{},[212,1187,1188],{},"Bell, D.E. & LaPadula, L.J. (1973). \"Secure Computer Systems: Mathematical Foundations\" - MITRE Technical Report",[212,1190,1191],{},"Biba, K.J. (1977). \"Integrity Considerations for Secure Computer Systems\" - MITRE Technical Report",[212,1193,1194],{},"Harrison, M.A., Ruzzo, W.L., & Ullman, J.D. (1976). \"Protection in Operating Systems\" - Communications of the ACM",[212,1196,1197],{},"Clark, D.D. & Wilson, D.R. (1987). \"A Comparison of Commercial and Military Computer Security Policies\" - IEEE Symposium on Security and Privacy",[212,1199,1200],{},"Lipton, R.J. & Snyder, L. (1977). \"A Linear Time Algorithm for Deciding Subject Security\" - Journal of the ACM",[22,1202,1203],{},[25,1204,1205],{},"These papers are freely available and shorter than you'd expect. 
The notation looks dated, but the proofs hold.",[22,1207,1208],{},[25,1209,1210,1211,842],{},"This article was originally published on  ",[91,1212,1215],{"href":1213,"rel":1214},"https:\u002F\u002Fleventesimon.com\u002Finsights\u002Fthe_lost_science",[582],"leventesimon.com",{"title":96,"searchDepth":617,"depth":617,"links":1217},[1218,1225,1231,1236,1237,1240],{"id":677,"depth":617,"text":678,"children":1219},[1220,1222,1223,1224],{"id":693,"depth":1221,"text":694},3,{"id":731,"depth":1221,"text":732},{"id":764,"depth":1221,"text":765},{"id":834,"depth":1221,"text":835},{"id":858,"depth":617,"text":859,"children":1226},[1227,1228,1229,1230],{"id":868,"depth":1221,"text":869},{"id":878,"depth":1221,"text":879},{"id":894,"depth":1221,"text":895},{"id":924,"depth":1221,"text":925},{"id":1018,"depth":617,"text":1019,"children":1232},[1233,1234,1235],{"id":1025,"depth":1221,"text":1026},{"id":1065,"depth":1221,"text":1066},{"id":1079,"depth":1221,"text":1080},{"id":1110,"depth":617,"text":1111},{"id":1150,"depth":617,"text":1151,"children":1238},[1239],{"id":1166,"depth":1221,"text":1167},{"id":1179,"depth":617,"text":1180},"2026-03-09","In the 1970s, we proved security is a graph problem. Then we abandoned the math for flat ACLs. 
Now graph databases let us pick it back up.","\u002Fimages\u002Fblog\u002Fthe_lost_science.jpg",{"audio":1245,"audioLabel":1246,"category":1247},"\u002Faudio\u002FGraph_security_versus_access_control_lists.mp3","AI-generated debate",[1248],"Thinking in Graphs","\u002Fblog\u002Fthe_lost_science",{"title":647,"description":1242},"blog\u002Fthe_lost_science",[1253,1254,1255,1256,1257],"graph theory","access control","security history","formal methods","security architecture","QrTxjmWfVSyFETf4dCC04tPBpWzk-yjX25RrImBUvO4",{"id":1260,"title":1261,"author":1262,"body":1264,"date":1528,"description":1529,"extension":631,"image":1530,"meta":1531,"navigation":634,"path":1534,"seo":1535,"stem":1536,"tags":1537,"__hash__":1541},"blog\u002Fblog\u002Ftmi2_alarm_flood.md","TMI2 Alarm Flood",{"name":7,"headshot":8,"role":9,"contact":1263},{"linkedin":11,"email":12,"twitter":13},{"type":15,"value":1265,"toc":1519},[1266,1270,1275,1278,1281,1284,1290,1293,1300,1304,1307,1310,1313,1316,1319,1322,1326,1333,1336,1343,1346,1349,1352,1355,1359,1362,1365,1368,1375,1378,1384,1387,1390,1394,1397,1404,1407,1410,1413,1417,1420,1423,1426,1429,1432,1446,1449,1452,1459,1463,1470,1473,1476,1479,1482,1486,1489,1492,1495,1498,1500,1509,1511],[18,1267,1269],{"id":1268},"_847-alarms-at-4-am","847 Alarms at 4 AM",[22,1271,1272],{},[25,1273,1274],{},"At 4:00 AM on March 28, 1979, a pressure relief valve stuck open at Three Mile Island Unit 2. What followed was the most studied nuclear accident in American history, and it had almost nothing to do with the valve.",[22,1276,1277],{},"The reactor's safety systems did exactly what they were designed to do. The SCRAM triggered. Emergency coolant activated. Alarms sounded.",[22,1279,1280],{},"All of them. At once.",[22,1282,1283],{},"Over 100 alarms fired in the first few minutes. The control room had no alarm prioritization. No way to suppress low-relevance alerts or separate the critical from the routine. 
Every indicator screamed with equal urgency: the critical and the routine, the cause and the symptom, the thing that mattered and the hundred things that didn't.",[22,1285,1286,1287],{},"The operators stood in front of a wall of flashing lights and had to answer one question: ",[25,1288,1289],{},"what do we fix first?",[22,1291,1292],{},"They got it wrong. Instruments showed high water levels in the pressurizer, and the operators, unable to distinguish cause from symptom in the flood of alarms, concluded the reactor had too much coolant. They turned off the emergency cooling system. The reactor was actually losing coolant through the stuck valve. They had shut off the one thing keeping the core alive. Within hours, it partially melted.",[22,1294,1295,1296,1299],{},"The defense system worked. The defense system's ",[25,1297,1298],{},"output"," caused the meltdown.",[80,1301,1303],{"id":1302},"_847-alarms-at-4-am-1","847 alarms at 4 AM",[22,1305,1306],{},"Replace the control room with a Slack channel. Replace the flashing lights with a vulnerability report. Replace the 100 simultaneous alarms with 847 CVEs.",[22,1308,1309],{},"A security scanner runs against 12 production clusters. It finds 847 vulnerabilities. It scores each one with CVSS. It produces a report. It sends the report to the platform team.",[22,1311,1312],{},"The platform team has three engineers.",[22,1314,1315],{},"The report tells them everything and nothing. It lists every CVE but not which ones are reachable from the internet, not what breaks if a given service is compromised. It does not tell them what to fix first.",[22,1317,1318],{},"So the engineers do what the TMI-2 operators did. They stand in front of the wall of flashing lights and start guessing. Manual correlation. Spreadsheets. Tribal knowledge about which clusters matter more.",[22,1320,1321],{},"This is a methodology problem. Not staffing, not tooling. 
The distinction matters, because better tools built on a broken methodology will reproduce the same failure at higher resolution.",[80,1323,1325],{"id":1324},"the-valve-indicator-problem","The valve indicator problem",[22,1327,1328,1329,1332],{},"The pressure relief valve was stuck open. Coolant was draining from the reactor. But the indicator on the control panel didn't show whether the valve was open or closed. It showed whether the valve had been ",[25,1330,1331],{},"commanded"," to close.",[22,1334,1335],{},"The command had been sent. The indicator showed \"closed.\" The valve was open. The operators trusted the indicator.",[22,1337,1338,1339,1342],{},"CVSS scores have the same problem. A CVSS score, even supplemented by EPSS or KEV data, tells you how exploitable a vulnerability is ",[25,1340,1341],{},"in theory",", under laboratory conditions, absent any context about your environment. It tells you the command was sent. It does not tell you the state of the valve.",[22,1344,1345],{},"A CVE with a CVSS score of 9.8 on an air-gapped internal build server with no inbound network paths is not a 9.8 in your environment. A CVE with a score of 5.3 on a public-facing service that chains with two other medium-severity issues to reach your database? That might be your actual 9.8.",[22,1347,1348],{},"CVSS measures theoretical exploitability. It says nothing about whether an attacker can reach the service, whether this CVE chains with others into a viable attack path, or what happens downstream if the service is compromised.",[22,1350,1351],{},"Calculating risk from CVSS alone is like calculating insurance premiums from the probability of a hurricane without checking whether the house is in Kansas or on the Florida coast.",[22,1353,1354],{},"The TMI-2 operators didn't lack data. They were drowning in it. 
What they lacked was a model that connected the data to reality.",[80,1356,1358],{"id":1357},"the-real-failure-mode-risk-displacement","The real failure mode: risk displacement",[22,1360,1361],{},"Most organizations handle vulnerability management the same way TMI-2 handled its alarms.",[22,1363,1364],{},"The reactor's alarm system was designed by one team. The control room was operated by another. The alarm designers built a comprehensive system: every possible anomaly would trigger a notification. Complete coverage. Nothing missed.",[22,1366,1367],{},"They were right. Nothing was missed.",[22,1369,1370,1371,1374],{},"The problem was that \"nothing missed\" and \"useful to the operator\" are not the same thing. The alarm system's completeness became the operator's paralysis. The designers had optimized for ",[25,1372,1373],{},"their"," metric (coverage) and displaced the actual hard problem (prioritization) to someone else.",[22,1376,1377],{},"This is exactly what happens when a security team runs a scanner, generates a report of 847 CVEs, and sends it to the platform team. The security team's job, by their own metrics, is done. Complete coverage. Nothing missed.",[22,1379,1380,1381,842],{},"The platform team now owns the triage. They have the list but not the context, not the topology, not the blast radius analysis. They have a wall of flashing lights and a reactor that needs attention ",[25,1382,1383],{},"now",[22,1385,1386],{},"Call it what it is: risk displacement. The burden of analysis moves from the team that understands threats to the team that doesn't have the tools or the mandate to prioritize them.",[22,1388,1389],{},"The TMI-2 alarm system didn't protect the operators. It made their job harder. The scanner report doesn't protect the platform team. 
It creates work and calls it security.",[80,1391,1393],{"id":1392},"what-the-nuclear-industry-learned","What the nuclear industry learned",[22,1395,1396],{},"After TMI-2, the nuclear industry redesigned the entire alarm methodology.",[22,1398,1399,1400,1403],{},"The reforms introduced alarm prioritization: suppress low-relevance notifications during high-stress events. They added contextual displays that show operators the ",[25,1401,1402],{},"state of the system"," rather than a list of deviations from normal. And they formalized alarm rationalization, determining which alarms matter under which conditions, and what the operator should actually do about it.",[22,1405,1406],{},"More alarms do not mean more safety. An alarm system that fires 100 alerts when 3 are critical is worse than one that fires 3. The operator's attention is finite, and every irrelevant alarm steals cognitive resources from the ones that matter.",[22,1408,1409],{},"The nuclear industry learned that the alarm system's job is not to tell the operator everything that's wrong. It's to tell the operator what to do next.",[22,1411,1412],{},"Vulnerability management hasn't learned this yet.",[80,1414,1416],{"id":1415},"from-lists-to-topology","From lists to topology",[22,1418,1419],{},"The cybersecurity industry's response has been to build better alarms. The scanner now produces a sorted list instead of an unsorted one. It adds a risk score. Maybe it cross-references with the CISA KEV catalog or flags \"actively exploited in the wild.\"",[22,1421,1422],{},"These are improvements, not solutions. You can't meaningfully sort 847 CVEs without understanding the topology they exist in. Sorting requires knowing which services are reachable, what they connect to, and what breaks if they're compromised. That knowledge doesn't live in a scanner. It lives in the relationships between assets.",[22,1424,1425],{},"A sorted list of CVEs is still a list. 
You can't ask a list \"what's the shortest path from the internet to my database through these vulnerabilities?\"",[22,1427,1428],{},"That's a graph question. Your assets, services, and identities are nodes. The connections between them, network paths, trust relationships, data flows, are edges. Vulnerabilities attach to nodes, but exploitability is a function of the path, not the node.",[22,1430,1431],{},"The 3-person platform team managing 12 clusters doesn't need a better list. They need answers to questions a list can't answer:",[776,1433,1434,1437,1440,1443],{},[212,1435,1436],{},"Which of these 847 CVEs sit on services reachable from the internet?",[212,1438,1439],{},"Which of those services connect to data stores with customer data?",[212,1441,1442],{},"If this service is compromised, what's the shortest path to a critical asset?",[212,1444,1445],{},"Which three patches would eliminate the most attack paths?",[22,1447,1448],{},"In a graph, blast radius is a query, not a guess. You traverse outward from the compromised node and measure what's reachable. Prioritization is a calculation over topology: which vulnerabilities sit on the most paths to the things that matter?",[22,1450,1451],{},"The scanner flags a critical CVE on a high-profile production service. The team scrambles to patch it. Meanwhile, a chain of three medium-severity CVEs on a forgotten internal service provides a clear path to the same database. Nobody sees the chain because nobody's modeling the relationships.",[22,1453,1454,1455,1458],{},"The TMI-2 valve wasn't dangerous because it was stuck open. It was dangerous because it was stuck open ",[25,1456,1457],{},"on the path between the reactor core and the environment",". Location in the topology defined the severity, not the defect itself.",[80,1460,1462],{"id":1461},"the-tool-is-the-process","The tool is the process",[22,1464,1465,1466,1469],{},"The nuclear industry's post-TMI redesign went further. 
They embedded the methodology in the control room itself. Alarm rationalization wasn't a document operators consulted alongside their instruments. It became how the instruments worked. The tool ",[25,1467,1468],{},"was"," the methodology.",[22,1471,1472],{},"This is where the \"just buy better tooling\" argument gets it half right. The right tool does embed methodology: reachability analysis, asset relationship mapping, exposure context should be built into how your team works, not bolted on as a separate triage step.",[22,1474,1475],{},"But a tool can only implement a methodology that exists. Reachability and exposure data don't tell you whether a compromised internal API matters more than an exposed storage bucket. That ranking comes from understanding business impact, and business impact is an organizational decision. Someone has to decide what the crown jewels are. The graph can model the paths, but the weight you assign to each destination is a business call.",[22,1477,1478],{},"The NRC understood this. Beyond the control room redesign, they mandated simulator training, licensing requirements, crew resource management borrowed from aviation. They retrained the people, not just the instruments. Because someone still has to look at the output and make the call, and that takes people who understand what the business loses if an attack path gets exploited.",[22,1480,1481],{},"Skip either step and you're back in the TMI-2 control room. A tool without methodology is a fancier wall of flashing lights. A methodology nobody follows is a PDF on a SharePoint nobody opens.",[80,1483,1485],{"id":1484},"before-you-send-the-next-report","Before you send the next report",[22,1487,1488],{},"The operators at TMI-2 were trained, competent, and trying their best. They still turned off the one system keeping the reactor alive. 
The information architecture made the right decision invisible and the wrong decision obvious.",[22,1490,1491],{},"Before you send the next vulnerability report, ask yourself: am I giving my platform team a decision, or am I giving them a wall of flashing lights?",[22,1493,1494],{},"The nuclear industry answered that question in 1979. It cost them a reactor.",[22,1496,1497],{},"What's it costing you?",[1175,1499],{},[22,1501,1502,1505,1506],{},[55,1503,1504],{},"Historical note:"," ",[25,1507,1508],{},"The Three Mile Island Unit 2 reactor was never restarted. Cleanup took 14 years and cost approximately $1 billion. The President's Commission on the Accident (the Kemeny Commission) concluded that the primary cause was not mechanical failure but \"human factors,\" operator confusion compounded by inadequate instrumentation and training. The control room's alarm system was specifically cited as a contributing factor. Unit 1 continued operating until 2019.",[1175,1510],{},[22,1512,1513,1514,842],{},"This article was originally published on ",[91,1515,1518],{"href":1516,"rel":1517},"https:\u002F\u002Fmedium.com\u002F@levente.simon\u002Fthe-meltdown-before-the-meltdown-what-three-mile-island-teaches-about-cve-management-b7ad6fa92f70",[582],"Medium",{"title":96,"searchDepth":617,"depth":617,"links":1520},[1521,1522,1523,1524,1525,1526,1527],{"id":1302,"depth":617,"text":1303},{"id":1324,"depth":617,"text":1325},{"id":1357,"depth":617,"text":1358},{"id":1392,"depth":617,"text":1393},{"id":1415,"depth":617,"text":1416},{"id":1461,"depth":617,"text":1462},{"id":1484,"depth":617,"text":1485},"2026-03-04","The Three Mile Island operators were drowning in alerts when they shut off the emergency cooling. Your platform team is drowning in CVEs. 
Both problems have the same root cause — and the nuclear industry solved it decades ago.","\u002Fimages\u002Fblog\u002Ftmi2-alarm-flood.jpg",{"audioLabel":1246,"audio":1532,"category":1533},"\u002Faudio\u002FWhy_flat_vulnerability_lists_paralyze_engineers.mp3",[1248],"\u002Fblog\u002Ftmi2_alarm_flood",{"title":1261,"description":1529},"blog\u002Ftmi2_alarm_flood",[1538,1539,639,1253,1540],"vulnerability management","risk prioritization","security methodology","-h9ISSqQnHFWpEyQHeJR8dN_aXSNO6g4vdzHgMqvFNI",1778307524677]