Discovery & Probes

Your Enterprise Knowledge Graph,
populated automatically.

Probes are read-only multi-target scanners that connect to your real systems — GitHub, AWS, RDS, ECS, Okta, Datadog, Confluence — and populate your tenant's Enterprise Knowledge Graph as Records and Links. Library-first extraction keeps LLM cost bounded. AI-gated semantic enrichment with full provenance. Completeness gates prevent hallucinated artefacts. Built on the same platform substrate that runs Workflows, ChangeSets, and Policy.

  • 7 probe categories
  • 4 v1 concrete probes
  • ~180 Records from one PostgreSQL scan
  • <$0.50 LLM cost per GitHub probe run
  • 99% gross margin on activity credits

Section 1 · Why this matters

Time-to-Knowledge-Graph
is the EA bottleneck. CXO Architect

Enterprises spend 6–18 months manually surveying tools, transcribing architectures into Visio, and hand-mapping team ownership before they can answer a basic question like "which services would fail if we deprecate this database?" Probes collapse that into days.

  • ~45 min · Full self-discovery: Arcitopsia scanning its own GitHub + AWS + RDS + ECS stack end-to-end (§24 dogfood)
  • ≥ 180 · Records / DB scan: PostgreSQL DataProbe against a real production schema (schemas, tables, views, columns, FKs, routines)
  • ≥ 80 · Provenance citations: auto-generated Solution Architecture doc cites only Records discovered by probes, with zero hallucination
  • ≤ 5,500 · Credits / full sweep: full Arcitopsia stack end-to-end + EA artefact synthesis + doc generation, total cost ceiling

The substrate is already there. Workflow execution, ConnectorInstance, Record / SpecType / LinkType, ChangeSet semantics via WorkflowJob.changeSetId, AIPromptTemplate, and ToolConfiguration scope-resolver all exist in the platform. Probes add a first-class concept of a scheduled or on-demand scanner, multi-target tool configuration, probe → org-hierarchy routing, and a shared pattern library that grounds AI generation in tenant context.

Section 2 · v1 mandatory probes

Four concrete probes ship in v1. CXO Sales

The end-of-v1 demo is the Probe Analyzer running these four probes against the live Arcitopsia production stack and producing an Enterprise Knowledge Graph rich enough to generate a complete, traceable Solution Architecture Document about Arcitopsia itself — the dogfood acceptance contract (§24).

GitHub Source-Code Probe

Octokit + tree-sitter · 6 sub-stages · LIBRARY-mostly

Scans github.com/arcitopsia/arcitopsia-application across the default branch. Discovers the Next.js app + BullMQ workers, ≥50 library dependencies, CI workflows, and CODEOWNERS-derived ownership.

  • Emits: code-repo, service, library, api-spec, pipeline
  • Read-only PAT — repo:read, read:org, metadata:read
  • ~$0.50 LLM cost per full repo scan

AWS Infrastructure Probe

AWS SDK v3 · 4 sub-stages · LIBRARY-only · 0 LLM tokens

Scans account 848111426925 in us-east-1. Discovers VPCs, subnets, security groups, IAM roles, IAM users, ECR repositories and images.

  • Emits: cloud-account, vpc, subnet, security-group, iam-role, iam-user, container-image-registry
  • STS role-assumption to arcitopsia-probe-discovery-role (ReadOnlyAccess)
  • ~$0.00 LLM cost — pure SDK calls

AWS ECS / Fargate Probe

AWS SDK v3 · 5 sub-stages · LIBRARY-only · 0 LLM tokens

Discovers cluster arcitopsia-cluster + service arcitopsia-service + task definitions. Captures container env-var names only (never values — secrets stay opaque). CloudWatch log-group ARNs captured; log contents never read in v1.

  • Emits: cluster, service-deployment, container-workload, deployment-environment
  • Same arcitopsia-probe-discovery-role as Infrastructure probe
  • Links back to the cloud-account Record from Stage 1
Data sensitivity is enforced in three layers. probe-output-connector.ts synchronously rejects any SQL that targets a non-system table. The PostgreSQL session is opened with SET default_transaction_read_only = on. And the database user's grants only include SELECT on information_schema.* + pg_catalog.* — even if the application layer were bypassed, the database itself would refuse a write.

Section 3 · Probe taxonomy

Seven first-class
probe categories. Architect LLM

Every probe carries a category that determines the SpecTypes it can emit, the LinkTypes it can create, and the default ownership-routing rules. v1 ships the four concrete probes above; v1.5+ adds the remaining categories one probe per PR.

| Category | Reads from | Emits SpecTypes | Status |
|---|---|---|---|
| SOURCE_CODE | GitHub, GitLab, Bitbucket | application, service, library, api-spec, code-repo, pipeline | v1 (GitHub) |
| INFRASTRUCTURE | AWS, GCP, Azure, Terraform Cloud, Kubernetes | cloud-account, vpc, compute-instance, terraform-module, deployment-environment, cluster | v1 (AWS Infra + ECS) |
| DATA | RDS, Snowflake, BigQuery, Databricks | database, schema, table, view, dataset, etl-pipeline, data-lake | v1 (PostgreSQL) |
| IDENTITY | Okta, Azure AD, Google Workspace, GitHub teams | user, group, role, permission-policy | v1.5 |
| CI_CD | GitHub Actions, Jenkins, ArgoCD | pipeline, deployment, release | v1.5 |
| OBSERVABILITY | Datadog, Grafana, Prometheus, New Relic | monitor, dashboard, slo, alert-policy | v1.5 |
| DOCUMENT / EA_CONTENT | Confluence, Notion, SharePoint, Google Drive | architecture-doc, runbook, decision-record, customer-ea-document-candidate | v1.5 (skeleton in v1.1) |

Section 4 · Probe Analyzer

A seven-phase wizard turns "we have lots of tools" into a deterministic execution plan. Architect

Probes work best when the platform already knows which systems are authoritative for which facts. The Probe Analyzer captures that upfront, runs a shallow discovery sweep to validate scope, and produces a stage-ordered DAG of which probes to run, with which targets, in what sequence, and where parallelism is safe.

flowchart LR
    A["A · Inventory<br/>What systems<br/>do you have?"]:::phase
    Ap["A' · Connect<br/>Map systems to<br/>ConnectorInstance"]:::phase
    B["B · Source-of-Truth<br/>Pick primary source<br/>per FactCategory"]:::phase
    Bp["B' · EA Content<br/>Pre-Probe + Package<br/>Gap Analysis"]:::phase
    C["C · Discovery Sweep<br/>Shallow probes<br/>~30-120s"]:::phase
    D["D · Plan Generation<br/>Pure function<br/>topological sort"]:::phase
    E["E · Approve & Run<br/>Credit reservation<br/>+ ProbeExecutionRun"]:::phase
    A --> Ap --> B --> Bp --> C --> D --> E
    classDef phase fill:#161616,stroke:#a3e635,stroke-width:1.5px,color:#e5e5e5,font-family:JetBrains Mono;

The 7-phase wizard (Phase B' added in v1.1 for EA content pre-probe).

Inputs to plan generation

  • OrganizationProbeProfile — singleton per tenant
  • SystemRegistration[] — declared systems by kind
  • FactCategoryMapping[] — primary source per FactCategory
  • DiscoverySnapshot[] — shallow-sweep findings per system

Outputs

  • ProbeExecutionPlan — versioned, supersedeable
  • ProbeExecutionPlanStage[] — barrier-ordered
  • ProbeExecutionPlanItem[] — probes within stage, with overrides
  • ProbeExecutionRun — on Approve & Run click

Canonical probe DAG

flowchart TB
    subgraph S0["Stage 0 · Foundation"]
        HR[HRProbe]:::s0
        ID[IdentityProbe]:::s0
        CO[CloudOrgProbe]:::s0
        VC[VCSOrgProbe]:::s0
    end
    subgraph S1["Stage 1 · System Scanning"]
        SC[SourceCodeProbe]:::s1
        IN[InfrastructureProbe]:::s1
        DP[DataPlatformOrgProbe]:::s1
        CI[CI_CDProbe]:::s1
    end
    subgraph S2["Stage 2 · Detail Scanning"]
        DA[DataProbe]:::s2
        OB[ObservabilityProbe]:::s2
        DO[DocumentProbe]:::s2
    end
    subgraph S3["Stage 3 · Analytical / Derived"]
        PM[PatternMatchProbe]:::s3
        GA[GapAnalysisProbe]:::s3
        HS[HierarchySynthesisProbe]:::s3
    end
    subgraph S4["Stage 4 · Generative"]
        BG[BacklogGenerationWorkflow]:::s4
        DG[DocDraftGenerationWorkflow]:::s4
    end

    HR --> SC
    ID --> SC
    CO --> IN
    VC --> SC
    SC --> DA
    IN --> DA
    DP --> DA
    SC --> OB
    IN --> OB
    SC --> DO
    DA --> PM
    DA --> GA
    DA --> HS
    PM --> DG
    GA --> BG
    HS --> DG

    classDef s0 fill:#0d1a0d,stroke:#a3e635,color:#fff;
    classDef s1 fill:#1e2a3a,stroke:#38bdf8,color:#fff;
    classDef s2 fill:#3b2e1a,stroke:#fbbf24,color:#fff;
    classDef s3 fill:#3b0764,stroke:#c084fc,color:#fff;
    classDef s4 fill:#3f1d1d,stroke:#fb7185,color:#fff;
                

Hard barriers between stages — Stage 1 only runs after every Stage 0 probe reaches terminal status.
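
The barrier ordering can be illustrated with a small layering function. This is a minimal sketch, assuming the plan is given as probe names plus dependency edges; `layerStages` and the edge shape are hypothetical names, not the platform's plan generator. It derives stages purely from dependencies (Kahn's algorithm, one layer per stage), whereas the canonical DAG above also groups probes by purpose.

```typescript
// Hypothetical types and names; not the platform's actual plan generator.
type Edge = [string, string]; // [upstream probe, downstream probe]

// Groups probes into barrier-ordered stages: a probe lands in the first
// stage after all of its upstream probes (Kahn's algorithm, layer by layer).
function layerStages(probes: string[], edges: Edge[]): string[][] {
  const indegree: Record<string, number> = {};
  const downstream: Record<string, string[]> = {};
  for (const p of probes) {
    indegree[p] = 0;
    downstream[p] = [];
  }
  for (const [from, to] of edges) {
    downstream[from].push(to);
    indegree[to] += 1;
  }

  const stages: string[][] = [];
  let ready = probes.filter((p) => indegree[p] === 0);
  let placed = 0;
  while (ready.length > 0) {
    stages.push(ready);
    placed += ready.length;
    const next: string[] = [];
    for (const p of ready) {
      for (const d of downstream[p]) {
        indegree[d] -= 1;
        if (indegree[d] === 0) next.push(d);
      }
    }
    ready = next;
  }
  if (placed !== probes.length) {
    throw new Error("Cycle detected: plan is not a DAG");
  }
  return stages;
}

const stages = layerStages(
  ["VCSOrgProbe", "CloudOrgProbe", "SourceCodeProbe", "InfrastructureProbe", "DataProbe"],
  [
    ["VCSOrgProbe", "SourceCodeProbe"],
    ["CloudOrgProbe", "InfrastructureProbe"],
    ["SourceCodeProbe", "DataProbe"],
    ["InfrastructureProbe", "DataProbe"],
  ],
);
console.log(stages);
```

Because the function takes only its arguments and returns a fresh array, the same inputs always yield the same stage list, which is what makes the plan reviewable before anything runs.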

Section 5 · Intra-probe sub-stages

Each probe is itself a layered scan, not one LLM blast. Architect LLM

A single LLM context cannot ingest an entire 1,500-routine schema. A single workflow cannot reliably emit 500 Records in one transaction. So every probe carries an ordered list of ProbeStageDefinition rows; stages execute sequentially per target with cross-target fan-out in parallel.

Worked example — PostgreSQL DataProbe

| # | Goal | Extraction strategy | Output SpecTypes | LLM tokens |
|---|---|---|---|---|
| 1 | Schema Inventory: schemas, tables, views | LIBRARY: information_schema.tables | schema, table, view | 0 |
| 2 | Structural Detail: columns, PKs, FKs, indexes | LIBRARY: information_schema.columns, pg_indexes | column-metadata, references Links | 0 |
| 3 | Routines & Triggers: stored procedures, functions | HYBRID: pg_proc for structure; LLM for top-50 summaries via §19 fan-out | stored-procedure, function | ~18K |
| 4 | Routine Dependencies: read/write graph | LIBRARY: pg_depend + node-sql-parser | reads-from, writes-to, calls Links | 0 |
| 5 | Lineage Stitching: table → SP → table chains | COMPUTED: no DB hit, pure graph traversal | derives-from Links | 0 |

First run of a 200-table schema burns ~$0.20 in LLM cost. Compare with a naive "describe-everything-at-once" approach which would exceed $50 just to summarise the routines. The architectural lever is decomposition (§19) + library-first (§17).

Per-stage failure modes

| failureMode | If this stage fails… |
|---|---|
| BLOCK_DOWNSTREAM | Stop the probe run for this target. Mark all later stages BLOCKED. Default for Stage 1: without inventory, nothing else makes sense. |
| DEGRADE_DOWNSTREAM | Mark later stages degradedDownstream=true. They still run, but AI calls auto-downgrade to PENDING_DISCOVERY. Default for Stage 3. |
| CONTINUE | Later stages unaffected. Default for Stage 5: lineage stitching is best-effort. |
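
The three failure modes can be sketched as a sequential runner for one target. This is an illustrative sketch only: the `StageDef` and `StageResult` shapes and the synchronous `run` callback are assumptions, and the real ProbeStageDefinition is not shown in this document.

```typescript
// Hypothetical shapes; the real ProbeStageDefinition is not shown in the text.
type FailureMode = "BLOCK_DOWNSTREAM" | "DEGRADE_DOWNSTREAM" | "CONTINUE";
type StageStatus = "SUCCEEDED" | "FAILED" | "BLOCKED";

interface StageDef {
  name: string;
  failureMode: FailureMode;
  run: (degradedDownstream: boolean) => void; // throws on failure
}

interface StageResult {
  name: string;
  status: StageStatus;
  degradedDownstream: boolean;
}

// Runs the stages for one target in order and applies each stage's failureMode.
function runStages(stages: StageDef[]): StageResult[] {
  const results: StageResult[] = [];
  let blocked = false;
  let degraded = false;
  for (const stage of stages) {
    if (blocked) {
      results.push({ name: stage.name, status: "BLOCKED", degradedDownstream: degraded });
      continue;
    }
    try {
      stage.run(degraded);
      results.push({ name: stage.name, status: "SUCCEEDED", degradedDownstream: degraded });
    } catch {
      results.push({ name: stage.name, status: "FAILED", degradedDownstream: degraded });
      if (stage.failureMode === "BLOCK_DOWNSTREAM") blocked = true;
      if (stage.failureMode === "DEGRADE_DOWNSTREAM") degraded = true;
      // CONTINUE: later stages are unaffected.
    }
  }
  return results;
}

const ok = () => {};
const fail = () => {
  throw new Error("stage failed");
};

// Stages 1, 3 and 5 of the worked example, with their documented defaults.
const results = runStages([
  { name: "Schema Inventory", failureMode: "BLOCK_DOWNSTREAM", run: ok },
  { name: "Routines & Triggers", failureMode: "DEGRADE_DOWNSTREAM", run: fail },
  { name: "Lineage Stitching", failureMode: "CONTINUE", run: ok },
]);
console.log(results);
```

Here the failed routines stage does not stop lineage stitching; it only marks it as degraded, so a downstream AI call could choose to downgrade its output.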

Section 6 · Library-First Doctrine

LLM tokens are economics, not magic. Most extraction is library work. Architect CXO

Every ProbeStageDefinition declares extractionStrategy ∈ {LIBRARY, LLM, HYBRID, COMPUTED}. LLM strategy without a justification (whyLlm) fails package validation. Plan-review surfaces the library coverage ratio; anything below 70% earns a yellow warning.
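
A rough sketch of both checks follows. The field names mirror the text (`extractionStrategy`, `whyLlm`), but the validator functions are hypothetical, the sample `whyLlm` string is illustrative, and the coverage formula (share of stages that make no LLM call) is an assumption, since the document does not define how the ratio is computed.

```typescript
type ExtractionStrategy = "LIBRARY" | "LLM" | "HYBRID" | "COMPUTED";

interface StageDefinition {
  name: string;
  extractionStrategy: ExtractionStrategy;
  whyLlm?: string;
}

// Returns validation errors: an LLM stage must carry a non-empty whyLlm.
function validateStages(stages: StageDefinition[]): string[] {
  return stages
    .filter((s) => s.extractionStrategy === "LLM" && !(s.whyLlm && s.whyLlm.trim()))
    .map((s) => `Stage "${s.name}" uses LLM without whyLlm`);
}

// Assumption: coverage = share of stages that make no LLM call (LIBRARY or COMPUTED).
function libraryCoverage(stages: StageDefinition[]): number {
  if (stages.length === 0) return 1;
  const libraryOnly = stages.filter(
    (s) => s.extractionStrategy === "LIBRARY" || s.extractionStrategy === "COMPUTED",
  );
  return libraryOnly.length / stages.length;
}

// The five PostgreSQL DataProbe stages from the worked example.
const dataProbe: StageDefinition[] = [
  { name: "Schema Inventory", extractionStrategy: "LIBRARY" },
  { name: "Structural Detail", extractionStrategy: "LIBRARY" },
  { name: "Routines & Triggers", extractionStrategy: "HYBRID", whyLlm: "Summaries need semantics" },
  { name: "Routine Dependencies", extractionStrategy: "LIBRARY" },
  { name: "Lineage Stitching", extractionStrategy: "COMPUTED" },
];

const errors = validateStages(dataProbe);
const coverage = libraryCoverage(dataProbe);
const yellowWarning = coverage < 0.7; // warning threshold from the text
console.log({ errors, coverage, yellowWarning });
```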

Tooling catalogue (v1 bundled)

| Probe | Library | Why preferred over LLM | Tokens avoided / use |
|---|---|---|---|
| GitHub source-code | @octokit/rest, @octokit/graphql | Native API; rate-limited but deterministic | ~3K × repo |
| GitHub: TS/JS service discovery | tree-sitter, tree-sitter-typescript | Universal AST; 100+ grammars; incremental | ~10K × class |
| GitHub: manifests | yaml, native JSON.parse | Deterministic structural parsing | ~2K × file |
| AWS Infra/ECS | @aws-sdk/client-* v3 | First-party; consistent error surface | ~5K × resource |
| PostgreSQL: schema | Native SQL → information_schema | Zero LLM cost; type-safe metadata | ~3K × table |
| PostgreSQL: SQL parsing | node-sql-parser | Deterministic SP parsing | ~5K × routine |

When LLM is justified (v1)

Section 7 · Hierarchy Synthesis

Auto-create the EA chain upstream of every discovered Record. Architect

A probe discovers payments-postgres-prod (Delivery Stream Record owned by Payments Team). For governance to function, that Record needs a chain of EA Stream Records — Postgres Standards, Postgres Guardrails, Reference Architecture, OLTP Pattern — each owned by the correct EA team. In an immature tenant none of these exist. The Hierarchy Synthesizer fixes this.

flowchart TB
    DR["Discovered Record<br/>payments-postgres-prod<br/>SpecType: database"]:::disc
    DT["Delivery Team<br/>Payments Team"]:::team
    DR -- owned-by --> DT
    AS["Architecture Standard<br/>Postgres 15 Standard"]:::ea
    AG["Architecture Guardrail<br/>Postgres Prod Guardrails"]:::ea
    AP["Architecture Pattern<br/>OLTP DB Pattern"]:::ea
    RA["Reference Architecture<br/>Standard Postgres Deployment"]:::ea
    DR -- governed-by --> AS
    DR -- subject-to --> AG
    DR -- conforms-to --> AP
    DR -- instantiates-from --> RA
    TST["Tech-Stack Team<br/>Data Platform Team"]:::eateam
    SDT["Sub-Domain Team<br/>Data Storage"]:::eateam
    DT2["Domain Team<br/>Data Architecture"]:::eateam
    AS --> TST
    AG --> TST
    AP --> TST
    RA --> TST
    TST -- child-of --> SDT
    SDT -- child-of --> DT2
    classDef disc fill:#3b2e1a,stroke:#fbbf24,color:#fff;
    classDef team fill:#1e2a3a,stroke:#38bdf8,color:#fff;
    classDef ea fill:#0d1a0d,stroke:#a3e635,color:#fff;
    classDef eateam fill:#3b0764,stroke:#c084fc,color:#fff;

The EA chain auto-built when a Delivery-Stream Record is discovered for the first time.

The four governance LinkTypes

| LinkType | From → To | Semantics |
|---|---|---|
| governed-by | any tech Record → architecture-standard | "This thing must comply with this Standard" |
| subject-to | any tech Record → architecture-guardrail | "This Guardrail applies here" |
| conforms-to | any tech Record → architecture-pattern | "This implements this Pattern" |
| instantiates-from | any tech Record → reference-architecture | "This is an instance of this RefArch" |

Idempotent on re-run. Every step uses upsert-by-composite-key. A second run against the same tenant emits zero new Records / zero new Links. Rejecting an EA artefact in the review queue does NOT cascade into re-creation on the next probe run — see Direction of Authority.
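
The idempotency claim can be illustrated with an in-memory upsert keyed on a composite key. This is a sketch under assumptions: the key columns (`tenantId`, `specTypeKey`, `naturalKey`) and the `RecordStore` class are hypothetical, since the document does not name the real composite key or the persistence layer.

```typescript
interface RecordInput {
  tenantId: string;
  specTypeKey: string;
  naturalKey: string;
  title: string;
}

// Hypothetical composite key; the source does not name the real key columns.
function compositeKey(r: RecordInput): string {
  return [r.tenantId, r.specTypeKey, r.naturalKey].join("::");
}

class RecordStore {
  private rows: { [key: string]: RecordInput } = {};

  // Returns true when a new row was created, false when an existing row was updated.
  upsert(input: RecordInput): boolean {
    const key = compositeKey(input);
    const created = !(key in this.rows);
    this.rows[key] = input;
    return created;
  }
}

// Upserts every artefact in the chain and reports how many were newly created.
function synthesize(store: RecordStore, inputs: RecordInput[]): number {
  return inputs.filter((i) => store.upsert(i)).length;
}

const store = new RecordStore();
const chain: RecordInput[] = [
  { tenantId: "t1", specTypeKey: "architecture-standard", naturalKey: "postgres", title: "Postgres 15 Standard" },
  { tenantId: "t1", specTypeKey: "architecture-pattern", naturalKey: "oltp-db", title: "OLTP DB Pattern" },
];
const firstRun = synthesize(store, chain);
const secondRun = synthesize(store, chain);
console.log({ firstRun, secondRun });
```

The second run touches the same keys and therefore creates nothing. The sketch does not model the review-queue rule; honouring a previous rejection would need an extra status check before the upsert.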

Section 8 · Publication policy

Auto-publish or human-review is policy-driven, not a single boolean. Architect

A DBA running the DataProbe should auto-publish discovered table metadata but must review newly drafted stored-procedure semantic summaries. ProbeDefinition.publicationPolicyKey references a PublicationPolicy row; each policy carries an ordered rule list (first-match wins).

Worked policy — data-probe-default

{
  "key": "data-probe-default",
  "defaultDecision": "PENDING_DISCOVERY",
  "rules": [
    { "specTypeKey": "table",
      "matchSensitivity": ["PII"],
      "decision": "PENDING_DISCOVERY",
      "requiredApproverPersonas": ["data-steward"],
      "rationale": "PII tables require data steward sign-off" },

    { "specTypeKey": "table",
      "matchSynthesisOrigin": ["DISCOVERED"],
      "decision": "AUTO_PUBLISH",
      "rationale": "Structural facts from JDBC are reliable" },

    { "specTypeKey": "stored-procedure",
      "matchAiConfidenceLt": 0.85,
      "decision": "PENDING_DISCOVERY",
      "requiredApproverRoleCategories": ["DBA"] },

    { "specTypeKey": "architecture-standard",
      "matchSynthesisOrigin": ["AI_DRAFTED_ARTIFACT"],
      "decision": "PENDING_DISCOVERY",
      "requiredApproverPersonas": ["ea-architect", "domain-architect"] }
  ]
}

Bulk-approve endpoints (POST /api/probe-runs/:id/approve) slice their updateMany by matching rule; the caller's roles/personas determine which Records flip to APPROVED in a single call.
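
First-match-wins evaluation of an ordered rule list can be sketched as follows. The rule fields reuse the names from the worked policy; the `Candidate` shape and the `matches` / `decide` helpers are hypothetical, and the sample rules are a reduced illustration, not the shipped policy.

```typescript
type Decision = "AUTO_PUBLISH" | "PENDING_DISCOVERY";

interface PolicyRule {
  specTypeKey: string;
  matchSynthesisOrigin?: string[];
  matchSensitivity?: string[];
  matchAiConfidenceLt?: number;
  decision: Decision;
}

// Hypothetical view of a discovered Record, reduced to the matched fields.
interface Candidate {
  specTypeKey: string;
  synthesisOrigin: string;
  sensitivity?: string;
  aiConfidence?: number;
}

// A rule matches when every condition it declares holds for the candidate.
function matches(rule: PolicyRule, c: Candidate): boolean {
  if (rule.specTypeKey !== c.specTypeKey) return false;
  if (rule.matchSynthesisOrigin && !rule.matchSynthesisOrigin.includes(c.synthesisOrigin)) return false;
  if (rule.matchSensitivity) {
    if (c.sensitivity === undefined || !rule.matchSensitivity.includes(c.sensitivity)) return false;
  }
  if (rule.matchAiConfidenceLt !== undefined) {
    if (c.aiConfidence === undefined || c.aiConfidence >= rule.matchAiConfidenceLt) return false;
  }
  return true;
}

// First matching rule wins; otherwise the policy's defaultDecision applies.
function decide(rules: PolicyRule[], defaultDecision: Decision, c: Candidate): Decision {
  const hit = rules.find((r) => matches(r, c));
  return hit ? hit.decision : defaultDecision;
}

const rules: PolicyRule[] = [
  { specTypeKey: "table", matchSensitivity: ["PII"], decision: "PENDING_DISCOVERY" },
  { specTypeKey: "table", matchSynthesisOrigin: ["DISCOVERED"], decision: "AUTO_PUBLISH" },
  { specTypeKey: "stored-procedure", matchAiConfidenceLt: 0.85, decision: "PENDING_DISCOVERY" },
];

const plainTable = decide(rules, "PENDING_DISCOVERY", { specTypeKey: "table", synthesisOrigin: "DISCOVERED" });
const piiTable = decide(rules, "PENDING_DISCOVERY", {
  specTypeKey: "table",
  synthesisOrigin: "DISCOVERED",
  sensitivity: "PII",
});
console.log({ plainTable, piiTable });
```

Under first-match semantics, rule order is part of the policy: a narrow rule such as the PII one only takes effect if it sits above any broader rule that would also match the same Record.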

Section 9 · Child-workflow decomposition

Fan out, consolidate. The same pattern powers all batched AI work. Architect

LLM context windows cannot ingest "describe every stored procedure in a 1,500-routine schema." The diagram-generation workflow in ea-package-demo3 solves this by splitting into N child workflows and consolidating outputs. Probes reuse the exact same primitives.

flowchart LR
    P["Parent Stage<br/>50 routines<br/>chunkSize=10"]:::parent
    C1["Child #1<br/>routines 1-10"]:::child
    C2["Child #2<br/>routines 11-20"]:::child
    C3["Child #3<br/>routines 21-30"]:::child
    C4["Child #4<br/>routines 31-40"]:::child
    C5["Child #5<br/>routines 41-50"]:::child
    M["Consolidate<br/>mergeStrategy=CONCAT"]:::merge
    P --> C1
    P --> C2
    P --> C3
    P --> C4
    P --> C5
    C1 --> M
    C2 --> M
    C3 --> M
    C4 --> M
    C5 --> M
    classDef parent fill:#0d1a0d,stroke:#a3e635,color:#fff;
    classDef child fill:#1e2a3a,stroke:#38bdf8,color:#fff;
    classDef merge fill:#3b2e1a,stroke:#fbbf24,color:#fff;

Token cost vs naive single-LLM approach: ~98% reduction. Latency slightly higher, but children run in parallel.

Decomposition thresholds — scope-resolved

All four decomposition thresholds are resolved at probe-execution time via the existing ToolConfiguration scope hierarchy (RECORD → TEAM → USER → ROLE → TENANT). This lets ops tune cost against latency per deployment context.

| Setting | Default | Hard floor | Hard ceiling |
|---|---|---|---|
| decompositionTriggerItemCount | 50 | 1 | 10,000 |
| chunkSize (LLM strategies) | 10 | 5 | 50 |
| chunkSize (LIBRARY strategies) | 100 | 50 | 1,000 |
| maxConcurrentChildren | 20 | 1 | 50 |
| childTimeoutMs | 60,000 | 5,000 | 600,000 |
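
The combination of scope resolution, hard limits, and chunking can be sketched as below. This is illustrative only: `resolveSetting` and `chunk` are hypothetical helpers, and the sketch assumes the leftmost scope in the listed hierarchy is the most specific and wins, which the text implies but does not state.

```typescript
interface Bounds {
  default: number;
  floor: number;
  ceiling: number;
}

// chunkSize for LLM strategies, taken from the thresholds table.
const CHUNK_SIZE_LLM: Bounds = { default: 10, floor: 5, ceiling: 50 };

// Assumption: scopes are listed from most to least specific.
const SCOPES = ["RECORD", "TEAM", "USER", "ROLE", "TENANT"] as const;
type Scope = (typeof SCOPES)[number];

// Picks the most specific override, falls back to the default, then clamps
// the result into the hard floor/ceiling range.
function resolveSetting(bounds: Bounds, overrides: Partial<Record<Scope, number>>): number {
  let value = bounds.default;
  for (const scope of SCOPES) {
    const override = overrides[scope];
    if (override !== undefined) {
      value = override;
      break;
    }
  }
  return Math.min(bounds.ceiling, Math.max(bounds.floor, value));
}

// Splits the parent stage's items into child-workflow batches.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const routines = Array.from({ length: 50 }, (_, i) => `routine-${i + 1}`);
const chunkSize = resolveSetting(CHUNK_SIZE_LLM, {});
const children = chunk(routines, chunkSize);
const clamped = resolveSetting(CHUNK_SIZE_LLM, { TEAM: 500 });
console.log({ chunkSize, childCount: children.length, clamped });
```

With no overrides, the 50 routines from the diagram split into five children of ten. An out-of-range override is clamped, not rejected, in this sketch; a real implementation might prefer to fail validation instead.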

Section 10 · Context Resolution Pipeline

Generated artefacts are grounded in real Records. Nothing else. Architect LLM

A context-resolver is a Record of SpecType context-resolver (so resolvers are tenant-customisable, versioned, and shipped via PDL). Its body is a declarative DSL: anchor SpecType, facets (multi-hop graph queries with dependencies), and a completeness gate.

Resolver DSL — solution-architecture-doc-context (excerpt)

{
  "anchor": { specTypeKey: "solution-architecture", required: true },
  "facets": [
    { key: "technologyStack",
      traversal: { from: "anchor", linkType: "uses-technology", direction: "OUTBOUND" },
      required: true,
      completenessRule: "AT_LEAST_ONE" },

    { key: "standardsPerTech",
      dependsOn: "technologyStack",
      traversal: { fromEach: "technologies", linkType: "governed-by",
                   filterSpecType: "architecture-standard", filterStatusIn: ["APPROVED"] },
      required: true,
      completenessRule: "AT_LEAST_ONE_PER_INPUT" },

    // ... 19 more facets: refArchs, guardrails, patterns, ADRs, NFRs, compliance, ...
  ],
  "completenessGate": {
    minimumRequiredFacetsPresent: 1.0,
    onMissing: "BLOCK_GENERATION"
  }
}

BLOCK_GENERATION

Engine throws. AI workflow does not run. UI surfaces which facets are missing and offers "Open backlog" (creates Tasks for each gap).

DEGRADE

Engine returns what it has. Template must include placeholders like {standards | "[no documented]"} so the LLM acknowledges the gap.

WARN

Full payload + warnings[]. Generation proceeds; warnings persisted on the generated Record's contextResolverWarnings.
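
The three `onMissing` behaviours can be sketched as one gate function. The `FacetResult` and `GateOutcome` shapes and `applyGate` are hypothetical; the sketch also assumes a required facet counts as present when it resolved at least one Record, which simplifies the per-facet completeness rules shown in the DSL.

```typescript
type OnMissing = "BLOCK_GENERATION" | "DEGRADE" | "WARN";

interface FacetResult {
  key: string;
  required: boolean;
  recordIds: string[];
}

interface GateOutcome {
  missing: string[];  // required facets that resolved nothing
  warnings: string[]; // populated only in WARN mode
}

// Assumption: a required facet is "present" when it resolved at least one Record.
function applyGate(
  facets: FacetResult[],
  minimumRequiredFacetsPresent: number,
  onMissing: OnMissing,
): GateOutcome {
  const required = facets.filter((f) => f.required);
  const missing = required.filter((f) => f.recordIds.length === 0).map((f) => f.key);
  const presentRatio = required.length === 0 ? 1 : (required.length - missing.length) / required.length;

  if (presentRatio >= minimumRequiredFacetsPresent) {
    return { missing, warnings: [] };
  }
  if (onMissing === "BLOCK_GENERATION") {
    // The AI workflow must not run; the caller surfaces the missing facets.
    throw new Error(`Generation blocked; missing facets: ${missing.join(", ")}`);
  }
  const warnings = onMissing === "WARN" ? missing.map((k) => `Facet "${k}" did not resolve`) : [];
  return { missing, warnings };
}

const facets: FacetResult[] = [
  { key: "technologyStack", required: true, recordIds: ["rec-1"] },
  { key: "standardsPerTech", required: true, recordIds: [] },
];

const degraded = applyGate(facets, 1.0, "DEGRADE");
const warned = applyGate(facets, 1.0, "WARN");
console.log({ degraded, warned });
```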

Reverse traceability — every generated Record knows what fed it

Section 11 · Two-gate injection completeness

Fetched ≠ Injected. Two gates close two silent-failure paths. Architect LLM

The Resolver Completeness Gate verifies that all required facets were fetched from the graph. But fetched data is not the same as injected data. Two silent failure modes exist between resolver output and the LLM call: token-budget truncation, and template-authoring omission. The Injection Completeness Gate closes both.

flowchart LR
    R["Resolver runs<br/>fetches 21 facets"]:::start
    G1{"Gate 1<br/>Completeness?"}:::gate
    Bk["BLOCK<br/>Required facet<br/>didn't resolve"]:::fail
    Pr["Prompt renderer<br/>renders template<br/>vs facet data"]:::middle
    G2{"Gate 2<br/>Mandatory facets<br/>in prompt text?"}:::gate
    Bk2["BLOCK<br/>Mandatory facet truncated<br/>or unreferenced"]:::fail
    L["LLM call proceeds<br/>with full context"]:::success
    R --> G1
    G1 -- pass --> Pr
    G1 -- fail --> Bk
    Pr --> G2
    G2 -- pass --> L
    G2 -- fail --> Bk2
    classDef start fill:#0d1a0d,stroke:#a3e635,color:#fff;
    classDef gate fill:#3b2e1a,stroke:#fbbf24,color:#fff;
    classDef middle fill:#1e2a3a,stroke:#38bdf8,color:#fff;
    classDef fail fill:#3f1d1d,stroke:#fb7185,color:#fff;
    classDef success fill:#0a3b16,stroke:#86efac,color:#fff;

Both gates must pass for an AI invocation to proceed. Skipping either is a hallucination vector.

Token-budget truncation policy. When rendered prompt > 80% of model context window: priority order is mandatory facets → required facets → other facets in dependsOn order. If truncation reaches a mandatory facet, the call is aborted — don't silently call with known-incomplete context. AIUsageLog.contextTruncated=true records the event.
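
The truncation policy can be sketched as a budget-fitting pass over rendered facets. This is a simplified illustration: `fitToBudget` is a hypothetical helper, token counts are assumed to be precomputed, truncation is modelled at whole-facet granularity, and the input array is assumed to be in dependsOn order already.

```typescript
type Priority = "MANDATORY" | "REQUIRED" | "OTHER";

interface RenderedFacet {
  key: string;
  priority: Priority;
  tokens: number; // assumed to be precomputed by the prompt renderer
}

interface BudgetResult {
  included: string[];
  dropped: string[];
  contextTruncated: boolean;
}

const PRIORITY_ORDER: Priority[] = ["MANDATORY", "REQUIRED", "OTHER"];

// Keeps facets in priority order until 80% of the context window is used.
// Everything after the first facet that does not fit is dropped; if a
// mandatory facet would be dropped, the call is aborted instead.
function fitToBudget(facets: RenderedFacet[], contextWindow: number): BudgetResult {
  const budget = Math.floor(contextWindow * 0.8);
  const ordered = ([] as RenderedFacet[]).concat(
    ...PRIORITY_ORDER.map((p) => facets.filter((f) => f.priority === p)),
  );

  let used = 0;
  let cut = false;
  const included: string[] = [];
  const dropped: string[] = [];
  for (const f of ordered) {
    if (!cut && used + f.tokens <= budget) {
      used += f.tokens;
      included.push(f.key);
      continue;
    }
    cut = true;
    if (f.priority === "MANDATORY") {
      throw new Error(`Aborting LLM call: mandatory facet "${f.key}" would be truncated`);
    }
    dropped.push(f.key);
  }
  return { included, dropped, contextTruncated: dropped.length > 0 };
}

const fitted = fitToBudget(
  [
    { key: "adrs", priority: "OTHER", tokens: 300 },
    { key: "technologyStack", priority: "MANDATORY", tokens: 400 },
    { key: "standardsPerTech", priority: "REQUIRED", tokens: 300 },
  ],
  1000,
);
console.log(fitted);
```

The `contextTruncated` flag in the result corresponds to the AIUsageLog.contextTruncated event the policy mentions; persisting it is left out of the sketch.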

Section 12 · Governed Entity Pattern

Canonical state + version snapshots + project change proposals. One pattern, ten governed types. Architect

The canonical-vs-change separation applies to every governed entity type that multiple projects can modify concurrently: Applications, Services, APIs, Workflows, Databases, Systems, Tech Stacks, Reference Architectures, EA artefacts. The same three SpecTypes + six LinkTypes + one concurrency-analyser workflow handle them all (parameterised by canonical SpecType).

| Governed Entity | Canonical SpecType (EA-owned) | Implementation Snapshot | Change Proposal (Delivery-owned) |
|---|---|---|---|
| Applications | application | application-version-snapshot | application-change-proposal |
| Services | service | service-version-snapshot | service-change-proposal |
| APIs | api-spec | api-spec-version-snapshot | api-change-proposal |
| Workflows | workflow-definition | workflow-version-snapshot | workflow-change-proposal |
| Database Schemas | database-schema | database-schema-version-snapshot | database-schema-change-proposal |
| Systems | system | system-version-snapshot | system-change-proposal |
| Tech Stacks | technology-stack | technology-stack-version-snapshot | tech-stack-change-proposal |
| Reference Architectures | reference-architecture | RecordVersion serves | reference-architecture-change-proposal |
| EA Standards / Guardrails / Patterns | architecture-standard, architecture-guardrail, architecture-pattern | RecordVersion serves | <artefact>-change-proposal |

Concurrent change detection

The change-concurrency-analyzer workflow is parameterised by canonicalSpecTypeKey — not hardcoded for Applications. Triggered on every new/updated change-proposal Record across all 10 entity types, plus a nightly catch-up sweep. For each canonical entity with ≥ 2 in-flight proposals, runs a structural diff (deterministic), then an LLM judge call (§15 confidence-gate) only for ambiguous text cases. Emits change-conflict-finding Records with severity BLOCKER | HIGH | LOW.

Section 13 · Direction of Authority

The apparent paradox — EA gates generation, but probes write EA Stream. Resolved. Architect

Different gates apply at different lifecycle stages. Probes are never gated by EA completeness — only generation is. Five direction-of-authority rules are enforced in code with unit tests per rule.

| Conflict | Rule |
|---|---|
| Probe re-discovers a tech the architect previously REJECTED | Synthesizer does not re-create. Emits policy-divergence-finding to App Architecture Team. |
| Architect-approved Standard claims "Postgres 13" but probe finds Postgres 16 | Probe does not silently update the Standard. Emits tech-taxonomy-drift-finding; architect must update Standard or fix prod. |
| Two probe runs disagree about the same fact (different sources, different values) | Last-write-wins only when sources agree on authority. Otherwise emits discovery-conflict-finding and routes to review. |
| Architect manually edits a probe-discovered Record | Edit sticks. Record.lastHumanEditAt timestamp protects against silent overwrite on next probe run. If reality still differs, emits discovery-divergence-finding. |
| Probe needs to delete a probe-discovered Record (entity removed from source) | Never hard-delete. Update Record.status=ARCHIVED, set archivedAt, retain full audit trail. |

Section 14 · EA Content Pre-Probe + Package Gap Analysis

Don't overwrite the customer's existing investment. CXO Sales

Many enterprise customers hold years of EA documentation in Confluence / SharePoint / Notion / Git docs. Installing PDL packages first risks overwriting or duplicating that material, devaluing prior investment. The customer-friendly approach: probe the existing EA corpus first, compare against what PDL packages provide, and surface per-item architect consent.

Per-item consent options

Install Package

Package's Record created in EKG as-is on Phase E execution.

Keep Customer

Record created with package SpecType scaffolding; body / metadata from matched customer document.

Hybrid / Merge

Both versions imported; customer version lands PENDING_DISCOVERY for manual merge.

Skip

Neither installed; architect deems out of scope.

Decisions persist as OrganizationProbeProfile.packageInstallationDecisions — durable record of {packageKey, itemKey, choice, decidedByUserId, decidedAt, matchedCandidateId?} per item. Re-running gap analysis after a package update only re-evaluates changed items, preserving prior decisions. Adopted customer documents are marked synthesisOrigin = CUSTOMER_DOC_ADOPTED with source URL + last-modified date.

Section 15 · Credentials Management

Five credential families. Read-only enforcement. Five-step validation gauntlet. Architect Sales

Probes are read-only by design. The wizard's Phase A' refuses to persist credentials that fail the read-only scope check. AWS Secrets Manager + tenant-scoped KMS keys; raw secrets never reach the DOM more than once.

Five credential families

| Family | Example probes | Validation gate |
|---|---|---|
| STATIC_TOKEN | GitHub PAT, Confluence API token, Datadog API key | API call → inspect x-oauth-scopes against allowed/forbidden lists |
| AWS_ROLE_ASSUMPTION | AWS Infrastructure, AWS ECS, RDS IAM auth | sts:AssumeRole + iam:SimulatePrincipalPolicy against representative write actions; all must return implicitDeny |
| DATABASE_CONNECTION_STRING | PostgreSQL, MySQL, MSSQL, Oracle, Snowflake | SELECT has_table_privilege(current_user, 'public.x', 'INSERT') must be FALSE; canary INSERT in rolled-back savepoint must fail with insufficient-privilege |
| MULTI_SECRET | GitHub App, Azure AD service principal | Per-connector handshake (GitHub App: App-ID + private key → installation access token) |
| OAUTH_AUTHORIZATION_CODE | Confluence Cloud, Notion, Google Workspace, Okta admin (v1.5) | Full OAuth handshake; refresh-token storage |

The five-step save-time gauntlet

  1. Schema validation — form values match ConnectorDefinition.configSchema (type, required, pattern, minLength)
  2. Connectivity test — call validation.endpoint; non-2xx → reject with UX-surfaced error
  3. Identity verification — confirm the authenticated identity matches the architect's declared identity (e.g. GitHub login)
  4. Scope enforcement — per-family check (above). Rejected if any forbidden capability is present.
  5. Read-only confirmation — architect checks a box confirming read-only intent before save proceeds.
Read-only scope enforcement is a hard gate in v1. A PAT with repo:write scope is rejected with INSUFFICIENT_PRIVILEGE_ENFORCEMENT_FAILED — no "save anyway" override. Architect must regenerate with reduced scope. Every read goes through lib/probes/credentials/access-gateway.ts, which writes a CredentialAccessLog row first.
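
For the STATIC_TOKEN family, the scope check can be sketched as a pure function over the scope header value. This is a hedged illustration: `enforceReadOnly` and `ScopePolicy` are hypothetical, the scope names are the ones quoted in this document, not a statement about any provider's actual scope vocabulary, and rejecting scopes that are merely absent from the allow-list is a stricter assumption than the text requires.

```typescript
interface ScopePolicy {
  allowed: string[];
  forbidden: string[];
}

interface ScopeCheck {
  ok: boolean;
  code?: string;
  offending: string[];
}

// Parses a comma-separated scope header value such as "repo:read, read:org".
function parseScopes(header: string): string[] {
  return header
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Hard gate with no override. Assumption: a scope outside the allow-list is
// treated as offending, in addition to anything on the forbidden list.
function enforceReadOnly(header: string, policy: ScopePolicy): ScopeCheck {
  const offending = parseScopes(header).filter(
    (s) => policy.forbidden.includes(s) || !policy.allowed.includes(s),
  );
  if (offending.length === 0) {
    return { ok: true, offending };
  }
  return { ok: false, code: "INSUFFICIENT_PRIVILEGE_ENFORCEMENT_FAILED", offending };
}

// Scope names as quoted in this document.
const githubPolicy: ScopePolicy = {
  allowed: ["repo:read", "read:org", "metadata:read"],
  forbidden: ["repo:write"],
};

const accepted = enforceReadOnly("repo:read, read:org, metadata:read", githubPolicy);
const rejected = enforceReadOnly("repo:read, repo:write", githubPolicy);
console.log({ accepted, rejected });
```

Returning the offending scopes alongside the error code gives the wizard something concrete to show the architect when asking for a regenerated token.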

Encryption at rest

Storage: AWS Secrets Manager. Secret naming pattern: arcitopsia/<tenantId>/<connectorKey>/<targetKey>.
Key management: tenant-scoped KMS key alias alias/arcitopsia-tenant-<tenantId>; annual auto-rotation.
In the DB: only the secret reference (secretRef) is stored. Resolved at probe-run time via the access-gateway, held in memory for one probe stage only, never logged, never written to telemetry.
Cross-tenant access is impossible: each tenant's secrets encrypted with their own KMS key + IAM role's kms:Decrypt permission scoped per-tenant via kms:ResourceAliases condition.

Section 16 · AI LLM Integration

Ten integration points. One confidence-gate. Full provenance. Architect LLM

Every LLM call in v1 goes through lib/ai/confidence-gate.ts — context injection, schema validation, confidence thresholds, provenance recording, and cost caps in one gateway. Below threshold → Record forced to PENDING_DISCOVERY regardless of autoPublish.

| # | Where | Output | Autonomy | Threshold |
|---|---|---|---|---|
| 1 | SpecType classification | {specTypeKey, confidence, reasoning?} | Auto-apply if ≥ threshold | 0.85 |
| 2 | EA Domain / Tech tagging | {eaDomainKey, subDomainKey?, techStackKey?, productKey?} | Auto-tag if ≥ threshold | 0.80 |
| 3 | Ownership inference (step 5 of §3.2) | {ownerTeamId, confidence, rationale} | < threshold forces PENDING_DISCOVERY | 0.70 |
| 4 | Identity resolution / dedup | {verdict: SAME \| DIFFERENT \| UNSURE, mergeKey?} | Auto-merge if SAME ≥ 0.9 | 0.90 / 0.70 |
| 5 | Document parsing | {extractedFields, summary, mentionedEntityIds[]} | Never auto-apply; review queue | n/a |
| 6 | EA artefact drafting (RefArch/Standard/Guardrail/Pattern) | {title, body Markdown, structured metadata} | Never auto-apply; PENDING_DISCOVERY | n/a |
| 7 | Pattern match scoring | [{patternId, score, gaps[], strengths[]}] | Top match ≥ threshold sets Record.patternId | 0.60 |
| 8 | Diagram generation (Mermaid) | {c4Component?, c4Container?, sequence?, er?} | Always review queue | n/a |
| 9 | Anomaly flagging | [{severity, finding, suggestedAction}] | Auto-emit if severity ≥ MEDIUM | severity-gated |
| 10 | Analyzer guidance | {suggestions: string[]} | Suggestion only; never auto-action | n/a |
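
The core threshold rule can be sketched in a few lines. This is a simplified illustration, not the contents of lib/ai/confidence-gate.ts: the `GateInput` shape is hypothetical, and the "n/a" rows (which route to a review queue or PENDING_DISCOVERY) are collapsed into a single PENDING_DISCOVERY outcome.

```typescript
type GateDecision = "AUTO_APPLY" | "PENDING_DISCOVERY";

interface GateInput {
  confidence: number;
  threshold: number | null; // null models the "n/a" rows: never auto-apply
  autoPublish: boolean;
}

// Below-threshold output is forced to PENDING_DISCOVERY regardless of autoPublish.
function confidenceGate(input: GateInput): GateDecision {
  if (input.threshold === null) return "PENDING_DISCOVERY";
  if (input.confidence < input.threshold) return "PENDING_DISCOVERY";
  return input.autoPublish ? "AUTO_APPLY" : "PENDING_DISCOVERY";
}

// SpecType classification (threshold 0.85 in the table).
const confident = confidenceGate({ confidence: 0.91, threshold: 0.85, autoPublish: true });
const unsure = confidenceGate({ confidence: 0.8, threshold: 0.85, autoPublish: true });
// EA artefact drafting: never auto-applied, whatever the confidence.
const drafted = confidenceGate({ confidence: 0.99, threshold: null, autoPublish: true });
console.log({ confident, unsure, drafted });
```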

Confidence sources (template-declared)

Section 17 · Credit economics

Reserve worst-case → settle actual. 99% gross margin at scale. CXO Sales

Per PAYG §5.0: every credit-bearing activity is priced at 20% of the documented manual cost to produce the equivalent outcome. baseCredits = round(manualCostBasisUSD × captureRatePct) where captureRatePct defaults to 0.20 (negotiable 0.15–0.25 per TenantCreditContract).

Worked example — PostgreSQL DataProbe with full hierarchy synthesis

200 tables, 50 stored procedures (10 require AI summarisation), 4 EA artefacts drafted per tech across 7 techs. Tenant on MULTINATIONAL tier (×1.30) with 20% Scale-pack discount (globalCreditMultiplier = 0.80), non-BYOLLM.

| Line item | Activity | Manual basis | Base credits |
|---|---|---|---|
| Probe Discovery Run | probe-discovery-run | $500 | 100 |
| Bulk Import 200 tables | bulk-import-per-100-records | $50 / 100 | 200 |
| AI Classification × 10 routines | ai-enrichment-per-record | $100 / record | 200 |
| Hierarchy Synthesis × 4 artefacts × 7 techs (subset) | probe-ea-artifact-draft | $250 / artefact | 200 |
| Σ baseCredits | | | 700 |

Worst-case reservation at Phase E

maxReserve = baseCredits × maxComplexityMultiplier × tierMultiplier × globalMultiplier
           = 700 × 3.0 × 1.30 × 0.80
           = 2,184 credits   // ≈ $1,747 at Scale-pack rate

Actual settled (after 10-factor complexity computation with log₂ dampening)

complexityScore       = 3.225           // weighted sum across 10 dimensions
rawMultiplier         = log₂(3.225 + 1) × 1.44
                      = 2.99
complexityMultiplier  = clamp(2.99, min=0.5, max=3.0) = 2.99

finalCredits = round(700 × 2.99 × 1.30 × 0.80) = 2,177

  • 2,184 · Reserved: worst-case at Approve & Run
  • 2,177 · Settled: actual after complexity
  • 7 · Released: back to balance on settle
  • $0.24 · Internal cost: LLM + compute + storage (internal-only)

Gross margin: 99.99% on this run. Per PAYG §7.0 two-layer rule: customer never sees internal-cost or margin numbers — only the credit-side ledger visible in /admin/credits.
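
The reserve-then-settle arithmetic from the worked example can be reproduced in a short sketch. Function names are illustrative. One assumption is needed to match the published numbers: the raw complexity multiplier is rounded to two decimals (2.99) before clamping, which is what the worked example shows; without that rounding the settled figure would differ slightly.

```typescript
// baseCredits = round(manualCostBasisUSD × captureRatePct), default rate 0.20.
function baseCredits(manualCostBasisUSD: number, captureRatePct = 0.2): number {
  return Math.round(manualCostBasisUSD * captureRatePct);
}

// log2 dampening, then clamp to [min, max].
function complexityMultiplier(score: number, min = 0.5, max = 3.0): number {
  // Assumption: rounded to two decimals before clamping, to reproduce the
  // 2.99 shown in the worked example.
  const raw = Math.round(Math.log2(score + 1) * 1.44 * 100) / 100;
  return Math.min(max, Math.max(min, raw));
}

function credits(
  base: number,
  complexity: number,
  tierMultiplier: number,
  globalMultiplier: number,
): number {
  return Math.round(base * complexity * tierMultiplier * globalMultiplier);
}

const sumBaseCredits = 700; // Σ baseCredits from the line-item table
const tier = 1.3;           // MULTINATIONAL
const scalePack = 0.8;      // globalCreditMultiplier

const reserved = credits(sumBaseCredits, 3.0, tier, scalePack);
const settled = credits(sumBaseCredits, complexityMultiplier(3.225), tier, scalePack);
const released = reserved - settled;
console.log({ reserved, settled, released });
```

Reserving at the maximum multiplier guarantees the settled amount can never exceed the reservation, so settlement only ever releases credits back to the balance.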

Failure scenario. If Stage 2 fails after importing 50 tables and 3 routine summaries: releaseReservation(executionId) flips reservation status to RELEASED_ON_FAILURE. Full 2,184 credits returned to balance. Zero credits consumed per PAYG §5.4 — "credits are never deducted for failed jobs." Internal telemetry still recorded (compute consumed by the failed run is iArchitron's cost).

Section 18 · Self-Discovery Dogfood Acceptance

v1 ships when an architect can scan Arcitopsia itself and generate a real architecture doc. CXO Sales Architect

A "framework with no probes" is not deliverable. The v1 acceptance test is an architect sitting down, completing the wizard against the Arcitopsia production stack, running the plan, and producing a Solution Architecture Document that cites only Records discovered by the four v1 probes — and passes architect spot-check.

  • 18 · Steps: in the end-to-end acceptance walkthrough
  • ≤ 5,500 · Credits: total spend ceiling (MULTINATIONAL tier)
  • ≤ 45 min · Wall-clock: end-to-end including review-queue approval
  • ≥ 80 · Citations: contextRecordIds.length on generated doc

The acceptance test in one paragraph

Open /admin/probe-analyzer/ → register 4 SystemRegistrations (GitHub org arcitopsia, AWS account 848111426925, AWS-RDS prod DB, ECS cluster arcitopsia-cluster) → connect each via Phase A' with the 5-step gauntlet → assign FactCategoryMappings → run Discovery Sweep → Plan Generation produces a 4-stage DAG → Approve & Run reserves ≤ 5,000 credits → 4 probes execute in correct order → Hierarchy Synthesis auto-drafts ~28 EA artefacts → architect approves a subset in /admin/ea-review-queue/ → click "Generate Solution Architecture Document" on the synthesised application:arcitopsia-platform Record → Context Resolver application-context runs (completeness gate passes after approvals) → 11-section doc generated via §19 child-workflow fan-out → architect spot-checks 10 random claims → all 10 trace cleanly to source Records.

What the dogfood test validates. Passing §24.6 validates the framework end-to-end: multi-target connectors, intra-probe sub-stages, library-first doctrine, persona/role gating, child-workflow decomposition, context resolver completeness, application architecture separation, Hierarchy Synthesis, AI gateway + two-gate injection completeness, credit reservation + settle, data sensitivity safeguards. Failing any step triggers fix-forward, not scope reduction.

Section 19 · Probe API surface

Ten REST routes. Tenant-scoped. Same auth as the rest of the platform. Architect

Every route uses requireAuth + tenantId scoping per memory/multi-tenancy-patterns.md. Cross-tenant requests return 404 (not 403 — avoids existence-leakage). No DELETE in v1; soft-delete via PATCH { isArchived: true }.

Route | Verbs | Purpose
/api/probes | GET, POST | List / create ProbeDefinition
/api/probes/:id | GET, PATCH | Get / update probe
/api/probes/:id/targets | GET, POST, PATCH | Manage ProbeConnectionTarget rows for a probe
/api/probes/:id/runs | GET, POST | List runs / trigger an on-demand run
/api/probe-runs/:id | GET | Run details with per-stage status
/api/probe-runs/:id/discoveries | GET | Discovered Records (paginated, filterable by SpecType)
/api/probe-runs/:id/approve | POST | Bulk-approve, sliced per PublicationPolicy rule
/api/probe-runs/:id/reject | POST | Bulk-reject with optional reason
/api/probe-runs/:id/lineage | GET | Per-Record ownership-resolver decision audit
/api/connection-targets | GET | Cross-probe view of all ProbeConnectionTarget rows for the tenant
Analyzer-specific routes ship in parallel under /api/probe-analyzer/* — 8 routes covering the wizard phases (profile / inventory / mappings / sweep / plan / approve / runs). Gated behind canConfigureTenant from lib/auth/permissions/admin-bypass.ts.
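A minimal sketch of the two route-level rules stated for the probe routes: cross-tenant requests return 404 rather than 403, and deletion is a PATCH that sets isArchived. The in-memory table, the ApiResult shape, and the function names are hypothetical; the platform's requireAuth middleware and data layer are not shown in the text and are not reproduced here.

```typescript
// Sketch of tenant-scoped lookup and soft delete. All names are hypothetical.
interface ProbeRow {
  id: string;
  tenantId: string;
  key: string;
  isArchived: boolean;
}

interface ApiResult {
  httpStatus: number;
  body?: ProbeRow;
}

const probeTable: ProbeRow[] = [
  { id: "p1", tenantId: "tenant-a", key: "github", isArchived: false },
  { id: "p2", tenantId: "tenant-b", key: "postgres", isArchived: false },
];

// Every lookup filters on tenantId, so a row owned by another tenant is
// indistinguishable from a row that does not exist: both yield 404.
function getProbe(callerTenantId: string, id: string): ApiResult {
  const row = probeTable.find((p) => p.id === id && p.tenantId === callerTenantId);
  return row ? { httpStatus: 200, body: row } : { httpStatus: 404 };
}

// Soft delete: PATCH { isArchived: true } flips a flag instead of removing the row.
function patchProbe(
  callerTenantId: string,
  id: string,
  patch: { isArchived?: boolean },
): ApiResult {
  const found = getProbe(callerTenantId, id);
  if (!found.body) return found;
  if (patch.isArchived !== undefined) found.body.isArchived = patch.isArchived;
  return { httpStatus: 200, body: found.body };
}

const crossTenant = getProbe("tenant-a", "p2"); // httpStatus 404, not 403
const missing = getProbe("tenant-a", "nope"); // httpStatus 404 as well
const archived = patchProbe("tenant-a", "p1", { isArchived: true });
// archived.body.isArchived is true and probeTable still holds two rows
```

Because the tenant filter is part of the lookup itself, there is no separate authorization branch that could leak whether an id exists in another tenant.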

Section 20 · PDL Distribution

Probes, policies, resolvers, and TaxMaps ship as PDL packages. Architect

The Package Definition Language (PDL) is the platform's portable, idempotent, version-controlled installer format. Every Probe Framework artefact has a corresponding PDL component type so packages travel cleanly across tenants.

New PDL component types added for Probe Framework

Component type | Phase | Installer | Idempotent on
probes[] (24th type) | 4 | lib/platform/installers/probe-installer.ts | (tenantId, probeDefinition.key)
publicationPolicies[] | 12 | publication-policy-installer.ts | (tenantId, policy.key)
techTaxonomyMaps[] | 7 | tech-taxonomy-map-installer.ts | (tenantId, productKey)
factCategoryMatrix | 5 | analyzer-extension installer | tenant-singleton merge
probeDependencyEdges[] | 5 | analyzer-extension installer | per-edge upsert
aiPromptTemplates[] (extended) | 8 | prompt-template-installer.ts | (tenantId, template.key)
contextResolvers[] (Records) | 14 | standard record installer | (tenantId, specType, key)

Two reference distributions ship with v1: ea-foundation v-next (new SpecTypes + LinkTypes for the governance chain + Governed Entity Pattern) and ea-probes-arcitopsia-v1 (the 4 dogfood probe definitions + their ProbeConnectionTarget seeds for the Arcitopsia prod tenant — not installed by default; architect installs manually for the §24 acceptance test).

Section 21 · EA Review Queue

The human-in-the-loop surface for everything AI drafts. Architect CXO

Lives at /admin/ea-review-queue/. Lists every Record in PENDING_DISCOVERY with synthesisOrigin ∈ {AUTO_CREATED_HIERARCHY, AI_DRAFTED_ARTIFACT, CUSTOMER_DOC_ADOPTED}, plus every Team with creationMode = AUTO_PROVISIONED that is awaiting sign-off. It is the single workflow that closes the loop on probes → drafts → governance.

Available actions per item

Inspect

Full AI provenance: model, prompt template, confidence, prompt hash, the source Records that fed the draft. One click to the source-Record list.

Edit

Refine title / body / structured metadata before approval. The edit is recorded as a manual override; aiProvenance.overriddenAt timestamp is set.

Approve

Flips Record to APPROVED. Existing approval workflow fires (cascades to dependents, lifecycle transitions, notifications).

Merge

Pick two Records of the same SpecType drafted for the same tech. Edges from the loser repoint to the winner. Audit trail preserved.

Reject

Flips to REJECTED. Cascades: inbound governance Links are deleted. If a future probe run discovers Records that would re-trigger this artefact, it emits a policy-divergence-finding instead of re-creating the artefact (per §13 Direction of Authority).

Bulk approve by template

"Approve all standards drafted by template postgres-standard-draft-v1 with confidence ≥ 0.85." Slices the updateMany per PublicationPolicy rule + the caller's persona.

Persona / role enforcement is server-side. The bulk-approve endpoint checks the caller's roles against each Record's matching PublicationPolicy rule. Records the caller cannot approve come back in a notApproved[] array with the required-persona hint — never silently approved.
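The server-side check described above could look roughly like the following. The notApproved[] array and the required-persona hint come from the text; the rule shape (one required persona per SpecType), the persona names, the security-policy SpecType, and the treatment of Records with no matching rule are assumptions made for illustration.

```typescript
// Sketch of server-side persona enforcement for bulk approve.
// notApproved[] comes from the text; everything else is a hypothetical shape.
interface DraftRecord {
  id: string;
  specType: string;
  confidence: number;
}

interface PolicyRule {
  specType: string;
  requiredPersona: string;
}

interface BulkApproveResult {
  approved: string[];
  notApproved: { id: string; requiredPersona: string }[];
}

function bulkApprove(
  drafts: DraftRecord[],
  rules: PolicyRule[],
  callerPersonas: Set<string>,
  minConfidence: number,
): BulkApproveResult {
  const result: BulkApproveResult = { approved: [], notApproved: [] };
  for (const draft of drafts) {
    if (draft.confidence < minConfidence) continue; // outside the requested slice
    const rule = rules.find((r) => r.specType === draft.specType);
    // Assumption: a Record with no matching rule is treated as not approvable.
    const required = rule ? rule.requiredPersona : "UNKNOWN";
    if (rule && callerPersonas.has(required)) {
      result.approved.push(draft.id);
    } else {
      result.notApproved.push({ id: draft.id, requiredPersona: required });
    }
  }
  return result;
}

const outcome = bulkApprove(
  [
    { id: "r1", specType: "architecture-standard", confidence: 0.9 },
    { id: "r2", specType: "security-policy", confidence: 0.95 },
    { id: "r3", specType: "architecture-standard", confidence: 0.6 },
  ],
  [
    { specType: "architecture-standard", requiredPersona: "ARCHITECT" },
    { specType: "security-policy", requiredPersona: "SECURITY_LEAD" },
  ],
  new Set(["ARCHITECT"]),
  0.85,
);
// outcome.approved    → ["r1"]
// outcome.notApproved → [{ id: "r2", requiredPersona: "SECURITY_LEAD" }]
```

Returning the rejected ids with a hint, rather than failing the whole batch, lets the caller hand the remainder to someone with the right persona.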

Section 22 · Application Architecture Ownership

EA owns the canonical Application. Projects own change proposals. They link, never duplicate. CXO Architect

The original motivating case for the Governed Entity Pattern (§12). One Application has one canonical Record (slow-changing, EA-owned) and many in-flight application-change-proposal Records (one per project modifying it). Project proposals carry only the delta — the canonical Record stays stable across many parallel projects.

The three Application-tier SpecTypes

SpecType | Owner | Lifecycle | Cardinality / Application
application | EA team | slow-changing canonical | 1
application-version-snapshot | EA team | append-only history | N (one per merged change)
application-change-proposal | Delivery team | ephemeral (lives for the project) | M (one per active project)

Cross-project concurrency detection

The change-concurrency-analyzer workflow (per §12 — generic across all 10 Governed Entity types) fires whenever an application-change-proposal is created or updated, plus a nightly catch-up sweep. For each application with ≥ 2 in-flight proposals, it runs a structural diff (deterministic) over the proposal bodies — and only escalates to an LLM judge for ambiguous text cases. Emits change-conflict-finding Records with severity BLOCKER | HIGH | LOW, owner = App Architecture Team.
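The deterministic part of this analysis can be illustrated with a small sketch. It assumes each proposal body has already been flattened into a path → value map (the text does not specify the body format), groups proposals by Application, and reports every path that two in-flight proposals set to different values. Severity assignment and the LLM escalation for ambiguous text are left out, and all type and function names are hypothetical.

```typescript
// Sketch of a structural diff across in-flight proposals. The flattened
// "delta" map and all names are assumptions; severity and LLM escalation
// are intentionally not modelled.
interface ChangeProposal {
  id: string;
  applicationId: string;
  delta: { [path: string]: string };
}

interface ConflictFinding {
  applicationId: string;
  proposalIds: [string, string];
  path: string;
}

function findStructuralConflicts(proposals: ChangeProposal[]): ConflictFinding[] {
  // Group in-flight proposals by the Application they modify.
  const byApp = new Map<string, ChangeProposal[]>();
  for (const p of proposals) {
    const group = byApp.get(p.applicationId) ?? [];
    group.push(p);
    byApp.set(p.applicationId, group);
  }

  const findings: ConflictFinding[] = [];
  byApp.forEach((group, applicationId) => {
    if (group.length < 2) return; // only Applications with ≥ 2 proposals
    group.forEach((a, i) => {
      group.slice(i + 1).forEach((b) => {
        for (const path of Object.keys(a.delta)) {
          const bothTouch = Object.prototype.hasOwnProperty.call(b.delta, path);
          if (bothTouch && a.delta[path] !== b.delta[path]) {
            findings.push({ applicationId, proposalIds: [a.id, b.id], path });
          }
        }
      });
    });
  });
  return findings;
}

const conflicts = findStructuralConflicts([
  { id: "p1", applicationId: "app-1", delta: { "database.engine": "postgres-15" } },
  { id: "p2", applicationId: "app-1", delta: { "database.engine": "postgres-16", "cache": "redis" } },
  { id: "p3", applicationId: "app-2", delta: { "database.engine": "mysql-8" } },
]);
// conflicts → one finding: app-1, ["p1", "p2"], path "database.engine"
```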

Visibility surfaces (v1.5 UI). Each canonical application Record gets an "In-Flight Changes" panel listing every application-change-proposal currently linked via proposes-change-to. Each proposal Record gets a "Cross-Project Conflicts" panel listing every change-conflict-finding it's implicated in. App Architecture Team gets a portfolio dashboard rolling up conflicts across every Application they own. v1 ships the data + API; v1.5 ships the rich UI panels.

Section 23 · Package Recommendations

Probes never auto-install packages. They recommend, you decide. Architect Sales

Packages mutate SpecTypes, Workflows, and permissions — too high-stakes for autonomous action. Instead, probes emit package-recommendation Records that surface in /admin/package-recommendations/ for architect-gated install.

Three detection points

1. Probe Analyzer Phase D

Plan-generator sees a SystemKind=DATABASE_PLATFORM, productKey=snowflake registered with no ea-data-snowflake-extension installed → emits recommendation referencing the missing package.

2. Post-probe gap analyzer

Gap-findings cluster around a missing SpecType (e.g., ≥ 10 services need data-classification Records but the SpecType isn't installed) → recommends ea-data-classification-v1.

3. Hierarchy Synthesizer

Unmapped tech whose canonical TechTaxonomyMap could be supplied by a known package → emits recommendation alongside (not instead of) the unmapped-technology-finding.

Anti-spam guarantee

@@unique([tenantId, recommendedPackageKey]) while status=OPEN. Re-triggers append to the existing recommendation's triggeringProbeRunIds[] array rather than inserting a duplicate row. A single recommendation can cite 20 probe runs without polluting the admin queue.
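As a rough illustration of that guarantee, the sketch below keeps at most one OPEN recommendation per (tenantId, recommendedPackageKey) and appends later triggers to it. The field names recommendedPackageKey and triggeringProbeRunIds and the OPEN / INSTALLED statuses come from the text; the in-memory array and the recordRecommendation function are hypothetical stand-ins for the real table and writer.

```typescript
// Sketch of the de-duplication rule using an in-memory array as the "table".
interface PackageRecommendation {
  tenantId: string;
  recommendedPackageKey: string;
  status: "OPEN" | "INSTALLED";
  triggeringProbeRunIds: string[];
}

const recommendations: PackageRecommendation[] = [];

function recordRecommendation(
  tenantId: string,
  packageKey: string,
  probeRunId: string,
): PackageRecommendation {
  // Look for an OPEN row with the same (tenantId, recommendedPackageKey).
  const existing = recommendations.find(
    (r) =>
      r.tenantId === tenantId &&
      r.recommendedPackageKey === packageKey &&
      r.status === "OPEN",
  );
  if (existing) {
    // Re-trigger: append the run id instead of inserting a duplicate row.
    if (!existing.triggeringProbeRunIds.includes(probeRunId)) {
      existing.triggeringProbeRunIds.push(probeRunId);
    }
    return existing;
  }
  const created: PackageRecommendation = {
    tenantId,
    recommendedPackageKey: packageKey,
    status: "OPEN",
    triggeringProbeRunIds: [probeRunId],
  };
  recommendations.push(created);
  return created;
}

recordRecommendation("tenant-a", "ea-data-classification-v1", "run-1");
recordRecommendation("tenant-a", "ea-data-classification-v1", "run-2");
recordRecommendation("tenant-a", "ea-data-classification-v1", "run-2");
// recommendations.length is 1; its triggeringProbeRunIds is ["run-1", "run-2"]
```

Scoping the uniqueness to OPEN rows means that once a recommendation is INSTALLED, a later gap for the same package can open a fresh row.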

Install lifecycle

  1. Recommendation lands in /admin/package-recommendations/ + the EA Review Queue
  2. Architect inspects (gap description, expected findings closed, dependencies, risk notes)
  3. Picks version, confirms; install runs through standard lib/platform/package-installer.ts
  4. Originating gap-findings auto-resolve on next resolver run
  5. Recommendation flips to INSTALLED with the install ChangeSet ID for audit

Section 24 · Resolver Inferencer

Starter resolvers inferred from observed graph topology. Architect LLM

Hand-authoring Context Resolver DSL is precise but expensive. The Resolver Inferencer drafts starter resolvers automatically by observing the LinkType usage patterns in a working tenant — ea-package-demo3 in v1. A mature, well-linked tenant graph encodes its own best practices; the inferencer extracts them mechanically.

Algorithm

function inferResolverForAnchor(anchorSpecTypeKey, tenantGraph) {
  candidates = top-20 most-linked Records of SpecType anchorSpecTypeKey
  for each candidate:
    walk OUTBOUND Links → tally (LinkType, target SpecType) pairs
    walk INBOUND  Links → tally same
    walk 2-hop transitive for governance LinkTypes
  rank facets by frequency-across-candidates
  set required=true for facets present on ≥80% of candidates
  set completenessRule="AT_LEAST_ONE_PER_INPUT" for fan-out facets
  order facets by dependency (foundation first — team, program, technology)
  return draft ContextResolver DSL JSON
}
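For readers who want something executable, here is a sketch of the facet-tallying core of the pseudocode above. The graph shape (flat node and link arrays), the type names, and the owned-by LinkType in the example are assumptions; the 2-hop governance walk, completenessRule assignment, and dependency ordering are deliberately omitted, so this is not the full algorithm.

```typescript
// Sketch of the tally / rank / required steps only. All shapes are assumed.
interface GraphNode {
  id: string;
  specType: string;
}

interface GraphLink {
  linkType: string;
  fromId: string;
  toId: string;
}

interface Facet {
  direction: "OUTBOUND" | "INBOUND";
  linkType: string;
  specType: string;
  coverage: number; // share of candidates that exhibit this facet
  required: boolean;
}

function inferFacets(
  anchorSpecType: string,
  nodes: GraphNode[],
  links: GraphLink[],
  topN = 20,
): Facet[] {
  const byId = new Map<string, GraphNode>();
  for (const n of nodes) byId.set(n.id, n);
  const degree = (id: string) =>
    links.filter((l) => l.fromId === id || l.toId === id).length;

  // Candidates: the most-linked Records of the anchor SpecType.
  const candidates = nodes
    .filter((n) => n.specType === anchorSpecType)
    .sort((a, b) => degree(b.id) - degree(a.id))
    .slice(0, topN);

  // For each facet, count how many candidates show it at least once.
  const tally = new Map<string, { direction: Facet["direction"]; linkType: string; specType: string; count: number }>();
  for (const c of candidates) {
    const seen = new Set<string>();
    for (const l of links) {
      const direction = l.fromId === c.id ? "OUTBOUND" : l.toId === c.id ? "INBOUND" : null;
      if (!direction) continue;
      const other = byId.get(direction === "OUTBOUND" ? l.toId : l.fromId);
      if (!other) continue;
      const key = `${direction}|${l.linkType}|${other.specType}`;
      if (seen.has(key)) continue;
      seen.add(key);
      const entry = tally.get(key) ?? { direction, linkType: l.linkType, specType: other.specType, count: 0 };
      entry.count += 1;
      tally.set(key, entry);
    }
  }

  // Rank by frequency across candidates; mark facets on ≥ 80% as required.
  return Array.from(tally.values())
    .sort((a, b) => b.count - a.count)
    .map((t) => {
      const coverage = t.count / candidates.length;
      return { direction: t.direction, linkType: t.linkType, specType: t.specType, coverage, required: coverage >= 0.8 };
    });
}

const facets = inferFacets(
  "service",
  [
    { id: "s1", specType: "service" },
    { id: "s2", specType: "service" },
    { id: "s3", specType: "service" },
    { id: "t1", specType: "team" },
    { id: "pg", specType: "technology" },
  ],
  [
    { linkType: "owned-by", fromId: "s1", toId: "t1" },
    { linkType: "owned-by", fromId: "s2", toId: "t1" },
    { linkType: "owned-by", fromId: "s3", toId: "t1" },
    { linkType: "uses-technology", fromId: "s1", toId: "pg" },
  ],
);
// facets[0] is the owned-by → team facet (coverage 1, required);
// facets[1] is uses-technology → technology (coverage 1/3, optional)
```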

v1 — build-time only

A CI job runs scripts/generate-starter-resolvers.ts against a freshly-rehydrated ea-package-demo3 tenant on every release branch. Output is committed to packages/ea-foundation-v-next/resolvers/. Tenants get the starter resolvers by installing the package; they edit per-tenant in the standard record editor afterwards.

v2 — tenant-runnable. Same engine, exposed under /admin/context-resolvers/inferencer. A tenant outgrowing the platform-default resolvers can re-run the inferencer against their own graph topology to draft tenant-customised resolvers. Compared against the platform starter; deltas surfaced for architect review.

Section 25 · Multi-Target Connectors

One ConnectorInstance, N targets. Native fan-out at probe time. Architect

The original ConnectorInstance design assumed "one Slack workspace, one Jira instance, one GitHub org" — its @@unique([tenantId, connectorKey]) meant exactly one instance per tenant per connector. That breaks for "scan all 5 AWS accounts under our Org" or "introspect all 30 GitHub orgs we own." The ProbeConnectionTarget table solves this without changing the connector model.

ProbeConnectionTarget — the multi-target row

ProbeConnectionTarget
  id                   String   // PK
  tenantId             String   // for the multi-tenancy guard
  toolConfigurationId  String   // FK → ToolConfiguration
  targetKey            String   // tenant-stable id, e.g. "aws-prod-848111426925"
  targetType           String   // "AWS_ACCOUNT" | "GITHUB_ORG" | "SNOWFLAKE_WAREHOUSE" | …
  externalId           String?  // native id at the source
  displayName          String
  environment          String?  // "prod" | "stage" | "dev"; null = unspecified
  region               String?
  tier                 String?  // "tier-0" | "tier-1" | "tier-2"
  connectionConfig     Json     // per-target credential overrides (secretRef + scope hints)
  isEnabled            Boolean  // per-target enable/disable without deleting

  @@unique([tenantId, toolConfigurationId, targetKey])

Per-target credential override

Different AWS accounts often need different cross-account role ARNs; different GitHub orgs may require different installation tokens. The probe runtime resolves credentials in this order:

  1. Target-level ProbeConnectionTarget.connectionConfig.secretRef (most specific)
  2. ToolConfiguration scope-resolved connectionConfigs[connectorKey] per ENVIRONMENT → TEAM → PROJECT → PROGRAM → TENANT walk
  3. ConnectorInstance.configuration tenant default (least specific)
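The three-level lookup above can be sketched as a first-match walk. The secretRef field, connectionConfigs[connectorKey], and the scope order follow the text; the input shape (scope values already looked up and passed in), the function name, and the source labels in the result are assumptions.

```typescript
// Sketch of the credential resolution order. Input shapes are hypothetical.
type ScopeLevel = "ENVIRONMENT" | "TEAM" | "PROJECT" | "PROGRAM" | "TENANT";
const SCOPE_WALK: ScopeLevel[] = ["ENVIRONMENT", "TEAM", "PROJECT", "PROGRAM", "TENANT"];

interface SecretHolder {
  secretRef?: string;
}

interface CredentialInputs {
  connectorKey: string;
  target: { connectionConfig: SecretHolder };
  // ToolConfiguration values for each scope the run belongs to (assumed shape).
  toolConfigByScope: Partial<Record<ScopeLevel, { connectionConfigs: Record<string, SecretHolder> }>>;
  connectorInstance: { configuration: SecretHolder };
}

interface ResolvedSecret {
  secretRef: string;
  source: string;
}

function resolveSecretRef(input: CredentialInputs): ResolvedSecret | undefined {
  // 1. Target-level override is the most specific.
  const fromTarget = input.target.connectionConfig.secretRef;
  if (fromTarget) return { secretRef: fromTarget, source: "TARGET" };

  // 2. Walk ToolConfiguration scopes from most to least specific.
  for (const level of SCOPE_WALK) {
    const scoped = input.toolConfigByScope[level]?.connectionConfigs[input.connectorKey]?.secretRef;
    if (scoped) return { secretRef: scoped, source: level };
  }

  // 3. Tenant default on the ConnectorInstance is the last resort.
  const fallback = input.connectorInstance.configuration.secretRef;
  return fallback ? { secretRef: fallback, source: "CONNECTOR_INSTANCE" } : undefined;
}

const resolved = resolveSecretRef({
  connectorKey: "aws",
  target: { connectionConfig: {} },
  toolConfigByScope: {
    TEAM: { connectionConfigs: { aws: { secretRef: "secret/team-role" } } },
    TENANT: { connectionConfigs: { aws: { secretRef: "secret/tenant-role" } } },
  },
  connectorInstance: { configuration: { secretRef: "secret/default" } },
});
// resolved → { secretRef: "secret/team-role", source: "TEAM" }
```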

Fan-out at probe execution time

A single ProbeRun creates one parent WorkflowJob + N child WorkflowJobs (one per ProbeConnectionTarget), all sharing one changeSetId for audit / rollback. Stages run in parallel across targets; within a target, stages keep their declared order per §5. Hard cap of 100 concurrent child jobs per tenant (configurable via OrganizationProbeProfile.maxConcurrentTargets).
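A small planning sketch of that fan-out: one parent job, one child per target, and a single changeSetId shared by all of them. The job id format, the PlannedJob shape, and the planFanOut function are hypothetical. Two simplifications are also assumptions rather than stated behaviour: targets with isEnabled = false are skipped, and the concurrency cap is modelled as fixed-size waves instead of a rolling pool.

```typescript
// Sketch of fan-out planning. Names, id format and wave model are assumed.
interface TargetRow {
  targetKey: string;
  isEnabled: boolean;
}

interface PlannedJob {
  jobId: string;
  parentJobId: string | null;
  targetKey: string | null;
  changeSetId: string;
}

interface FanOutPlan {
  parent: PlannedJob;
  children: PlannedJob[];
  waves: PlannedJob[][];
}

function planFanOut(
  probeRunId: string,
  changeSetId: string,
  targets: TargetRow[],
  maxConcurrentTargets = 100,
): FanOutPlan {
  const parent: PlannedJob = {
    jobId: `${probeRunId}:parent`,
    parentJobId: null,
    targetKey: null,
    changeSetId,
  };

  // One child job per enabled target, all sharing the parent's changeSetId.
  const children: PlannedJob[] = targets
    .filter((t) => t.isEnabled)
    .map((t) => ({
      jobId: `${probeRunId}:${t.targetKey}`,
      parentJobId: parent.jobId,
      targetKey: t.targetKey,
      changeSetId,
    }));

  // Never schedule more than maxConcurrentTargets children at once.
  const waves: PlannedJob[][] = [];
  for (let i = 0; i < children.length; i += maxConcurrentTargets) {
    waves.push(children.slice(i, i + maxConcurrentTargets));
  }
  return { parent, children, waves };
}

const plan = planFanOut(
  "run-42",
  "cs-7",
  [
    { targetKey: "aws-prod", isEnabled: true },
    { targetKey: "aws-stage", isEnabled: true },
    { targetKey: "aws-dev", isEnabled: false },
    { targetKey: "aws-sandbox", isEnabled: true },
    { targetKey: "aws-shared", isEnabled: true },
  ],
  2, // small cap to make the waves visible
);
// plan.children has 4 jobs; plan.waves has 2 waves of 2 jobs each
```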

Auto-discovery of targets. Some probes (AWS Organizations probe, GitHub Enterprise probe) discover their own targets — they list all member accounts / orgs and emit ProbeConnectionTarget rows via the probe.upsertConnectionTarget operation. A follow-up probe (the actual scanner) then fans out across those targets on its next run.

Section 26 · EA Stream vs Delivery Stream Routing

A Record's stream is derived, not stored. Architect

Routing to EA Stream vs Delivery Stream is automatic the moment a Record's ownership is resolved. Existing platform permission logic (lib/auth/permissions/ea-delivery-separation.ts) keys off record.ownerTeam.program.programType — probes inherit it for free, no new column needed.

The derivation rule

record.stream = record.ownerTeam.program.programType
              ∈ { EA, GOVERNANCE, PLATFORM }  → "EA Stream"
              ∈ { DELIVERY, SHARED }            → "Delivery Stream"
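Expressed as a function, the rule is a pure mapping from programType to stream. The five program types and both stream names come from the rule above; the OwnedRecord type and the deriveStream name are illustrative.

```typescript
// Sketch of the derivation rule as a pure function. Type names are assumed.
type ProgramType = "EA" | "GOVERNANCE" | "PLATFORM" | "DELIVERY" | "SHARED";
type Stream = "EA Stream" | "Delivery Stream";

interface OwnedRecord {
  ownerTeam: { program: { programType: ProgramType } };
}

function deriveStream(rec: OwnedRecord): Stream {
  switch (rec.ownerTeam.program.programType) {
    case "EA":
    case "GOVERNANCE":
    case "PLATFORM":
      return "EA Stream";
    case "DELIVERY":
    case "SHARED":
      return "Delivery Stream";
  }
}

const platformOwned = deriveStream({ ownerTeam: { program: { programType: "PLATFORM" } } });
const deliveryOwned = deriveStream({ ownerTeam: { program: { programType: "DELIVERY" } } });
// platformOwned → "EA Stream"; deliveryOwned → "Delivery Stream"
```

Because nothing is stored, reassigning a Record to a team in a different program changes its stream with no backfill.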

Five-step ownership resolution (§3.2)

For every Record a probe is about to upsert, the resolver walks this pipeline in order and stops at the first match. If Step 5 (AI inference) returns a confidence below the threshold, the Record is forced to PENDING_DISCOVERY regardless of the probe's autoPublish setting.

# | Source | Example | Confidence
1 | Explicit signal from the probed system | GitHub CODEOWNERS file maps to a tenant-known team; AWS resource tag arc:ownerTeam=payments | 1.00
2 | DeliveryTeamType.specTypeKeys[] lookup | SpecType service claimed by exactly one DTT in the tenant | 0.95
3 | DeliveryTeamType.govLayerLevel routing | L3 for App/Integration, L6 for Data, L7 for Infra/Cloud, L8 Security, L9 DevOps | 0.85
4 | ProbeDefinition.defaultOwnerTeamId fallback | Configured at probe registration time | 0.75
5 | AI inference (last resort) | Calls record-owner-inference AIPromptTemplate with tenant team list | model-reported
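The five steps can be sketched as a first-match chain that returns the winning step and its confidence. The step order and the fixed confidences (1.00, 0.95, 0.85, 0.75) come from the table; the OwnershipSignals input shape, the function names, and the 0.8 threshold used in the example are assumptions, since the text does not give the threshold value.

```typescript
// Sketch of first-match ownership resolution. Input shape is hypothetical:
// each signal is assumed to have been looked up already by the caller.
interface OwnershipSignals {
  explicitTeamId?: string; // step 1: CODEOWNERS or resource tag
  specTypeClaimantTeamIds: string[]; // step 2: teams whose DTT lists the SpecType
  govLayerTeamId?: string; // step 3: govLayerLevel routing
  defaultOwnerTeamId?: string; // step 4: ProbeDefinition fallback
  aiInference?: { teamId: string; confidence: number }; // step 5: model-reported
}

interface OwnerDecision {
  teamId: string;
  step: 1 | 2 | 3 | 4 | 5;
  confidence: number;
  forcePendingDiscovery: boolean;
}

function resolveOwner(signals: OwnershipSignals, aiThreshold: number): OwnerDecision | undefined {
  const decide = (teamId: string, step: OwnerDecision["step"], confidence: number): OwnerDecision => ({
    teamId,
    step,
    confidence,
    // Only a low-confidence AI inference forces review.
    forcePendingDiscovery: step === 5 && confidence < aiThreshold,
  });

  if (signals.explicitTeamId) return decide(signals.explicitTeamId, 1, 1.0);

  // Step 2 matches only when exactly one team claims the SpecType.
  const [soleClaimant, ...others] = signals.specTypeClaimantTeamIds;
  if (soleClaimant !== undefined && others.length === 0) return decide(soleClaimant, 2, 0.95);

  if (signals.govLayerTeamId) return decide(signals.govLayerTeamId, 3, 0.85);
  if (signals.defaultOwnerTeamId) return decide(signals.defaultOwnerTeamId, 4, 0.75);
  if (signals.aiInference) {
    return decide(signals.aiInference.teamId, 5, signals.aiInference.confidence);
  }
  return undefined;
}

const tagged = resolveOwner({ explicitTeamId: "payments", specTypeClaimantTeamIds: [] }, 0.8);
// tagged → step 1, confidence 1, forcePendingDiscovery false

const guessed = resolveOwner(
  { specTypeClaimantTeamIds: ["team-a", "team-b"], aiInference: { teamId: "platform", confidence: 0.6 } },
  0.8, // threshold value is made up for the example
);
// guessed → step 5, confidence 0.6, forcePendingDiscovery true
```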

Conflict case — shared infrastructure

A Kafka cluster used by both Delivery and EA streams: the probe writes one Record owned by the Platform team (L7/L9 DTT) and emits cross-stream Links (used-by) to the consuming Delivery teams. No duplication, no ambiguity about which Record is authoritative.

Section 27 · v1 scope vs v1.5+ deferrals

What ships in v1, and what's deliberately deferred. Sales Architect

v1 ships

  • Probe framework + 4 concrete probes (GitHub, AWS Infra, AWS ECS, PostgreSQL)
  • Probe Analyzer 6-phase wizard + plan generation
  • Hierarchy Synthesis (4 governance LinkTypes + AI artefact drafting)
  • AI gateway + confidence-gate + provenance + two-gate completeness
  • Context Resolution Pipeline + 1 starter resolver (solution-architecture-doc-context)
  • Publication policy engine with default policies
  • Library-first doctrine + tooling registry for the 4 v1 probes
  • Child-workflow decomposition (scope-resolved thresholds)
  • Governed Entity Pattern for 10 entity types + concurrency analyser
  • Credentials management (3 of 5 families: STATIC_TOKEN, AWS_ROLE_ASSUMPTION, DATABASE_CONNECTION_STRING)
  • Intelligence Credits integration (estimate / reserve / settle / release)
  • Self-Discovery Dogfood Acceptance against the live Arcitopsia stack

Deferred to v1.5 / v2

  • Concrete probes for everything beyond v1 (GitLab, GCP, Azure, Snowflake, Okta, Datadog, etc.)
  • OAuth credential family (Confluence Cloud, Notion, Google Workspace)
  • MULTI_SECRET credential family (GitHub App)
  • Scheduler worker + drift detector + scheduled re-analysis
  • Embedding-similarity match strategy (RAG vector store) — v2
  • Tenant-runnable Context Resolver Inferencer — v2 (build-time only in v1)
  • Visual DSL editor for resolvers — v2
  • Stripe payouts API roundtrip for reconciliation — v1.5
  • Annual committed credit packages with carryover — v2
  • Razorpay (v1.5) / Chargebee (v2) / Paddle (v3) payment-processor add-ons
  • Auto-merge of redundant project-architecture Records — v2
  • Real-time credit consumption websocket dashboard — v2

Section 28 · Glossary

Vocabulary for LLM grounding + onboarding. LLM Architect

A canonical glossary keyed to the runtime types. When grounding an LLM in Arcitopsia concepts, paste this section + the §3 taxonomy table + the §4 analyzer DAG.

Record
A typed instance of a SpecType in a tenant's Knowledge Graph. Carries status (PENDING_DISCOVERY / APPROVED / REJECTED / ARCHIVED), ownerTeamId, programId, projectId, and arbitrary data JSON.
SpecType
The schema definition for a class of Record (e.g. service, database, architecture-standard). Versioned; tenant-customisable.
Link
A typed edge between two Records. Has source SpecType, target SpecType, and a LinkType (e.g. uses-technology, governed-by).
LinkType
The schema for a class of Link. Defines from/to SpecType constraints, cardinality, and traversal directionality.
ProbeDefinition
A scanner spec: category, workflowId, target selector, publication policy, default owner team, list of ProbeStageDefinition sub-stages.
ProbeRun
One execution of a ProbeDefinition. Wraps a parent WorkflowJob + fans out into N child WorkflowJobs (one per ProbeConnectionTarget). Shares one changeSetId for audit / rollback.
ProbeStageRun
One sub-stage execution of a ProbeRun against a specific target. Strict intra-target ordering; cross-target parallel.
ProbeConnectionTarget
A specific scan target under a ProbeDefinition (e.g. one AWS account, one GitHub org). Carries per-target credentials + scope hints.
SystemRegistration
Probe-Analyzer wizard Phase A output. Declares "we have this kind of system" — kind ∈ {IDP, HR_SYSTEM, CLOUD_ORG, VCS_ORG, CI_CD_PLATFORM, OBSERVABILITY, DOC_REPO, DATABASE_PLATFORM, MESSAGING, DATA_WAREHOUSE, GRC_TOOL, FINANCE_SYSTEM, MANUAL_INPUT}.
FactCategoryMapping
Phase B output. Maps a FactCategory (e.g. ORG_HIERARCHY, CLOUD_ACCOUNT_INVENTORY) to a primary source + sourcing decision (PROBED / MANUAL / AI_INFERRED / DEFERRED).
ProbeExecutionPlan
Phase D output. Versioned, supersedeable DAG: stages → items. Pure function of inputs — re-runnable deterministically.
Hierarchy Synthesis
The post-Stage-1 probe that walks the EA chain upward from each newly-discovered Delivery-Stream Record, auto-creating missing Teams + drafting missing EA artefacts as PENDING_DISCOVERY. Idempotent.
Context Resolver
A Record of SpecType context-resolver. Body is a DSL declaring an anchor SpecType + multi-hop facets + a completeness gate. Used by AI workflows to fetch grounded context.
Completeness Gate
The mechanism by which a resolver refuses to feed an LLM call with incomplete context. Modes: BLOCK_GENERATION, DEGRADE, WARN.
Two-Gate Injection
The pair of gates that protect against silent AI degradation: (1) Resolver Completeness — facets fetched; (2) Injection Completeness — mandatory facets reach the rendered prompt text.
Governed Entity
Any SpecType that follows the canonical-vs-change separation: one EA-owned canonical Record + N implementation snapshots + M project change proposals. Applies to ten entity types in v1.
EA Stream vs Delivery Stream
Streams are derived, not stored: record.stream = record.ownerTeam.program.programType. EA / GOVERNANCE / PLATFORM programs → EA Stream; DELIVERY / SHARED programs → Delivery Stream. Governs visibility + approval rules.
Direction of Authority
The set of rules (§13) that resolve probes-vs-architects conflicts: probes never silently overwrite architect decisions; architects never silently override discovered reality; both flag divergence as findings.
Library-First Doctrine
The rule: every stage that can use a deterministic library MUST use one. LLM is for semantic judgment, not structural extraction. Validated by package validator + plan-review library-coverage ratio.
Confidence Gate
lib/ai/confidence-gate.ts — the single AI gateway. Looks up template, injects context, calls model, validates output schema, computes confidence, enforces threshold + fallback, persists AIUsageLog with full provenance.
Provenance
Record.aiProvenance + Record.contextRecordIds[] + Record.aiInvocationIds[] + AIUsageLog entries. Every AI-touched Record traces back to its model, prompt template, confidence, source Records, and pinned versions at generation time.
Credit Ledger
The PSP-agnostic CreditTransaction table that records every credit movement (DEDUCTION / TOPUP / MONTHLY_GRANT / EXPIRY / REFUND). Idempotent via (executionId, type) unique index. Never deleted.
Reserve / Settle / Release
The credit lifecycle: reserve the worst-case at Phase E (hold against balance); settle the actual after complexity computation (deduct, release remainder); release the full reservation if execution failed.
Complexity Multiplier
10 runtime dimensions × weights, sum normalised, then log₂(score + 1) × 1.44 dampening, clamped to [contract.min, contract.max]. Captures real cost variation without runaway bills on legitimately large scans.
PSP-Agnostic Invariant
Per PAYG §9.6: the credit ledger (credit-engine.ts, credit-service.ts, complexity-calculator.ts) contains zero Stripe-specific types/IDs/imports. Stripe is isolated to stripe-credits.ts + webhook handler. Future PSPs (Razorpay v1.5, Chargebee v2, Paddle v3) drop in without touching the ledger.
Two-Layer Measurement
Per PAYG §7.0: internal telemetry (iArchitron-only — WorkflowExecutionTelemetry with compute units, tokens, LLM cost, gross margin) is strictly separated from customer-facing credits (CreditTransaction with fixed activity-type base × complexity × tier).

Ready to See Discovery in Action?

Book a personalised demo and we'll run the four v1 probes against a sandbox tenant — or your own systems if you bring credentials.

by iArchitron Software Inc. · iarchitron.ai