STPA-TRiSM Integration Specification

Automatic Hazard-to-Governance Mapping

Version: 1.0
Status: Specification
Target: Q2-Q3 2026

Executive Summary

This specification defines how ATSF's STPA (System-Theoretic Process Analysis) hazard analysis automatically populates the AI TRiSM governance pillars. The integration turns safety engineering outputs into enforceable governance controls, forming a closed loop in which:

1. STPA identifies hazards and unsafe control actions (UCAs).
2. Hazards map automatically to TRiSM pillars.
3. TRiSM enforces controls according to hazard severity.
4. The Cognitive Cube tracks causal chains for continuous learning.

This removes the manual translation step between safety analysis and governance enforcement.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                        STPA Analysis Engine                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐              │
│  │   Hazards    │  │    UCAs      │  │   Scenarios  │              │
│  │  Identified  │──▶│   Defined    │──▶│   Generated  │              │
│  └──────────────┘  └──────────────┘  └──────────────┘              │
└────────────────────────────┬────────────────────────────────────────┘
                             │
                             ▼ Auto-Mapping
┌─────────────────────────────────────────────────────────────────────┐
│                      STPA-TRiSM Mapper                               │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │  Hazard Type → TRiSM Pillar Routing                          │   │
│  │  • Safety hazards → Security Pillar                          │   │
│  │  • Privacy hazards → Privacy Pillar                          │   │
│  │  • Behavioral hazards → Explainability Pillar                │   │
│  │  • Operational hazards → ModelOps Pillar                     │   │
│  └─────────────────────────────────────────────────────────────┘   │
└────────────────────────────┬────────────────────────────────────────┘
                             │
                             ▼ Control Generation
┌─────────────────────────────────────────────────────────────────────┐
│                       AI TRiSM Pillars                               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌─────────┐ │
│  │ Explainability│  │   ModelOps   │  │   Security   │  │ Privacy │ │
│  │    Pillar    │  │    Pillar    │  │    Pillar    │  │  Pillar │ │
│  └──────────────┘  └──────────────┘  └──────────────┘  └─────────┘ │
└────────────────────────────┬────────────────────────────────────────┘
                             │
                             ▼ Causal Tracking
┌─────────────────────────────────────────────────────────────────────┐
│                      Cognitive Cube                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐              │
│  │     TKG      │  │     ART      │  │   Granger    │              │
│  │   (Causal    │  │  (Cluster    │  │  (Causality  │              │
│  │    Graph)    │  │   Hazards)   │  │   Testing)   │              │
│  └──────────────┘  └──────────────┘  └──────────────┘              │
└─────────────────────────────────────────────────────────────────────┘

Hazard-to-Pillar Mapping Rules

Mapping Matrix

Hazard Category	Primary Pillar	Secondary Pillars	Auto-Generated Controls
H-SAFETY (Physical harm)	Security	ModelOps	Kill-switch trigger, Action blocking
H-PRIVACY (Data exposure)	Privacy	Security	PII filtering, Access control
H-BIAS (Unfair outcomes)	Explainability	Privacy	Bias probing, Outcome monitoring
H-DRIFT (Behavioral change)	Explainability	ModelOps	Drift detection, Rollback triggers
H-SECURITY (Adversarial attack)	Security	All	Injection detection, Input validation
H-RELIABILITY (System failure)	ModelOps	Security	Health checks, Failover triggers
H-COMPLIANCE (Regulatory)	All	--	Audit logging, Compliance reports
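
The matrix can be encoded as a plain lookup table. The following is a minimal sketch rather than ATSF's implementation: `ALL_PILLARS`, `PILLAR_ROUTES`, and `route_hazard` are hypothetical names, and "All" is assumed to expand to the four pillars shown in the architecture diagram.

```python
from typing import Dict, List, Tuple

# The four pillars from the architecture overview.
ALL_PILLARS = ["explainability", "modelops", "security", "privacy"]

# Hazard category -> (primary pillars, secondary pillars), transcribed from the matrix.
PILLAR_ROUTES: Dict[str, Tuple[List[str], List[str]]] = {
    "H-SAFETY": (["security"], ["modelops"]),
    "H-PRIVACY": (["privacy"], ["security"]),
    "H-BIAS": (["explainability"], ["privacy"]),
    "H-DRIFT": (["explainability"], ["modelops"]),
    "H-SECURITY": (["security"], ALL_PILLARS),
    "H-RELIABILITY": (["modelops"], ["security"]),
    "H-COMPLIANCE": (ALL_PILLARS, []),
}


def route_hazard(category: str) -> Dict[str, List[str]]:
    """Return the primary and secondary pillars for a hazard category."""
    if category not in PILLAR_ROUTES:
        raise ValueError(f"Unknown hazard category: {category}")
    primary, secondary = PILLAR_ROUTES[category]
    # A pillar already listed as primary is not repeated as secondary.
    secondary = [p for p in secondary if p not in primary]
    return {"primary": list(primary), "secondary": secondary}
```

For `H-SECURITY`, this returns `security` as the primary pillar and the remaining three pillars as secondary, which is one way to read "All" in the Secondary Pillars column.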

Detailed Mapping Rules

H-SAFETY to Security Pillar

SAFETY_HAZARD_MAPPING = {
    "physical_harm": {
        "pillar": "security",
        "controls": [
            "action_blocking",
            "human_approval_required",
            "kill_switch_threshold_lower"
        ],
        "severity_multiplier": 2.0,  # Double risk score
        "auto_escalate": True
    },
    "resource_damage": {
        "pillar": "security",
        "controls": [
            "resource_access_restrict",
            "action_monitoring_enhanced"
        ],
        "severity_multiplier": 1.5
    },
    "system_compromise": {
        "pillar": "security",
        "controls": [
            "containment_protocol_activate",
            "network_isolation"
        ],
        "severity_multiplier": 1.8,
        "auto_escalate": True
    }
}
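
How `severity_multiplier` and `auto_escalate` combine with a base risk score is not spelled out in this section. One plausible reading is sketched below under stated assumptions: the base score lies in [0, 1], the scaled score is capped at 1.0, and a rule without `auto_escalate` does not escalate. `SAFETY_RULES` is a trimmed excerpt of the mapping, and `apply_hazard_rule` is a hypothetical helper.

```python
from typing import Any, Dict

# Excerpt of SAFETY_HAZARD_MAPPING, trimmed to the fields used here.
SAFETY_RULES: Dict[str, Dict[str, Any]] = {
    "physical_harm": {"severity_multiplier": 2.0, "auto_escalate": True},
    "resource_damage": {"severity_multiplier": 1.5},
}


def apply_hazard_rule(rule: Dict[str, Any], base_risk: float) -> Dict[str, Any]:
    """Scale a base risk score (assumed to lie in [0, 1]) by the rule's multiplier."""
    # Assumption: scores are capped at 1.0 after scaling.
    scaled = min(base_risk * rule.get("severity_multiplier", 1.0), 1.0)
    # Assumption: a missing auto_escalate flag means no automatic escalation.
    return {"risk_score": scaled, "escalate": rule.get("auto_escalate", False)}


harm = apply_hazard_rule(SAFETY_RULES["physical_harm"], base_risk=0.25)
damage = apply_hazard_rule(SAFETY_RULES["resource_damage"], base_risk=0.8)
```

Under these assumptions a base score of 0.25 for `physical_harm` doubles to 0.5 and escalates, while 0.8 for `resource_damage` saturates at the cap without escalating.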

H-PRIVACY to Privacy Pillar

PRIVACY_HAZARD_MAPPING = {
    "pii_exposure": {
        "pillar": "privacy",
        "controls": [
            "pii_filter_strict",
            "output_redaction",
            "access_logging_enhanced"
        ],
        "severity_multiplier": 1.5,
        "compliance_flags": ["GDPR", "CCPA"]
    },
    "data_aggregation_risk": {
        "pillar": "privacy",
        "controls": [
            "aggregation_threshold_enforce",
            "k_anonymity_check"
        ],
        "severity_multiplier": 1.2
    },
    "inference_attack": {
        "pillar": "privacy",
        "secondary": "security",
        "controls": [
            "differential_privacy_apply",
            "query_rate_limit"
        ],
        "severity_multiplier": 1.4
    }
}

H-BIAS to Explainability Pillar

BIAS_HAZARD_MAPPING = {
    "demographic_bias": {
        "pillar": "explainability",
        "controls": [
            "bias_probe_activate",
            "outcome_distribution_monitor",
            "fairness_metrics_track"
        ],
        "severity_multiplier": 1.3,
        "compliance_flags": ["EEOC", "FHA"]
    },
    "selection_bias": {
        "pillar": "explainability",
        "controls": [
            "input_distribution_check",
            "sampling_audit"
        ],
        "severity_multiplier": 1.1
    }
}

H-DRIFT to Explainability and ModelOps Pillars

DRIFT_HAZARD_MAPPING = {
    "behavioral_drift": {
        "pillar": "explainability",
        "secondary": "modelops",
        "controls": [
            "drift_detection_enable",
            "baseline_comparison_continuous",
            "rollback_checkpoint_create"
        ],
        "severity_multiplier": 1.2
    },
    "performance_degradation": {
        "pillar": "modelops",
        "controls": [
            "performance_threshold_alert",
            "auto_scaling_trigger"
        ],
        "severity_multiplier": 1.0
    }
}
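
Because the mapping tables are hand-written configuration, a load-time consistency check can catch typos in pillar names or empty control lists. The following is a sketch only: `KNOWN_PILLARS`, `validate_mapping`, and the deliberately broken `broken_example` entry are hypothetical and not part of the specification.

```python
from typing import Any, Dict, List

KNOWN_PILLARS = {"explainability", "modelops", "security", "privacy"}


def validate_mapping(name: str, mapping: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return human-readable problems found in one hazard mapping table."""
    problems = []
    for hazard, rule in mapping.items():
        if "pillar" not in rule:
            problems.append(f"{name}.{hazard}: missing 'pillar'")
        # Both the primary and the optional secondary pillar must be known.
        for key in ("pillar", "secondary"):
            value = rule.get(key)
            if value is not None and value not in KNOWN_PILLARS:
                problems.append(f"{name}.{hazard}: unknown {key} value '{value}'")
        if not rule.get("controls"):
            problems.append(f"{name}.{hazard}: no controls listed")
    return problems


# Excerpt of DRIFT_HAZARD_MAPPING plus one deliberately broken entry.
drift_rules = {
    "behavioral_drift": {
        "pillar": "explainability",
        "secondary": "modelops",
        "controls": ["drift_detection_enable"],
        "severity_multiplier": 1.2,
    },
    "broken_example": {"pillar": "governance", "controls": []},
}
problems = validate_mapping("DRIFT_HAZARD_MAPPING", drift_rules)
```

Here `problems` contains two entries, both for `broken_example`: an unknown pillar name and an empty control list.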

UCA (Unsafe Control Action) Processing

UCA Structure

from dataclasses import dataclass
from typing import Literal

@dataclass
class UnsafeControlAction:
    uca_id: str
    controller: str  # Component that issues the control action
    control_action: str  # The action being analyzed
    context: str  # Conditions under which the action is unsafe
    hazard_type: str  # Hazard category the UCA can lead to

    # STPA UCA types
    uca_type: Literal[
        "not_provided",  # Control not provided when needed
        "provided_incorrectly",  # Control provided but wrong
        "wrong_timing",  # Provided too early or too late
        "stopped_too_soon",  # Duration insufficient
        "applied_too_long"  # Duration excessive
    ]

    # Severity assessment
    severity: Literal["low", "medium", "high", "critical"]
    likelihood: Literal["rare", "unlikely", "possible", "likely", "certain"]
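
The `severity` and `likelihood` fields are ordinal, so they can be combined into a single priority value for ranking UCAs. The specification does not define such a score; the sketch below uses a conventional risk-matrix product purely as an illustration. `SEVERITY_RANK`, `LIKELIHOOD_RANK`, and `risk_priority` are hypothetical names.

```python
# Ordinal ranks for the Literal values declared on UnsafeControlAction.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}
LIKELIHOOD_RANK = {"rare": 1, "unlikely": 2, "possible": 3, "likely": 4, "certain": 5}


def risk_priority(severity: str, likelihood: str) -> float:
    """Normalized severity x likelihood product in the range (0, 1]."""
    max_score = max(SEVERITY_RANK.values()) * max(LIKELIHOOD_RANK.values())
    return SEVERITY_RANK[severity] * LIKELIHOOD_RANK[likelihood] / max_score
```

A UCA rated `high` and `possible` scores 9 out of a maximum of 20, or 0.45, on this scale.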

UCA to TRiSM Control Generation

from typing import List

def generate_trism_controls(uca: UnsafeControlAction) -> List[TRiSMControl]:
    """
    Generate TRiSM controls from an unsafe control action.

    Example:
        UCA: "Agent does not request approval before executing high-risk action"
             (uca_type="not_provided", severity="high")
        Generated controls:
        - Security: mandatory_action (the control action must be issued)
        - Security: human_oversight (approval required, immediate escalation)
    """
    controls = []

    # Map UCA type to control patterns
    if uca.uca_type == "not_provided":
        controls.append(TRiSMControl(
            pillar="security",
            control_type="mandatory_action",
            parameters={"action": uca.control_action, "required": True}
        ))

    elif uca.uca_type == "provided_incorrectly":
        controls.append(TRiSMControl(
            pillar="explainability",
            control_type="validation_check",
            parameters={"action": uca.control_action, "validators": ["semantic", "safety"]}
        ))

    elif uca.uca_type == "wrong_timing":
        controls.append(TRiSMControl(
            pillar="modelops",
            control_type="timing_constraint",
            parameters={"action": uca.control_action, "timing_rules": ["sequence_check"]}
        ))

    # Note: "stopped_too_soon" and "applied_too_long" have no type-specific
    # rule here; such UCAs receive only the severity-based control below.

    # Add severity-based controls
    if uca.severity in ("high", "critical"):
        controls.append(TRiSMControl(
            pillar="security",
            control_type="human_oversight",
            parameters={"approval_required": True, "escalation_path": "immediate"}
        ))

    return controls
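
To make the branching concrete, the sketch below restates the three type-specific branches as a lookup table and runs it on the approval example. It is a simplified illustration, not the ATSF API: the `TRiSMControl` dataclass is a stand-in whose fields mirror the keyword arguments used in the function, `TYPE_RULES` and `controls_for` are hypothetical names, `hold_lock` is a made-up control action, and the parameters are reduced to the essentials.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TRiSMControl:
    """Stand-in control type with the fields used by the generator."""
    pillar: str
    control_type: str
    parameters: Dict[str, Any] = field(default_factory=dict)


# uca_type -> (pillar, control_type), matching the three type-specific branches.
TYPE_RULES = {
    "not_provided": ("security", "mandatory_action"),
    "provided_incorrectly": ("explainability", "validation_check"),
    "wrong_timing": ("modelops", "timing_constraint"),
}


def controls_for(uca_type: str, control_action: str, severity: str) -> List[TRiSMControl]:
    """Table-driven version of the type and severity rules."""
    controls = []
    if uca_type in TYPE_RULES:
        pillar, control_type = TYPE_RULES[uca_type]
        controls.append(TRiSMControl(pillar, control_type, {"action": control_action}))
    if severity in ("high", "critical"):
        controls.append(TRiSMControl("security", "human_oversight", {"approval_required": True}))
    return controls


example = controls_for("not_provided", "request_approval", "high")
summary = [(c.pillar, c.control_type) for c in example]
duration_only = controls_for("applied_too_long", "hold_lock", "low")
```

The approval example yields two Security controls (`mandatory_action` and `human_oversight`). A low-severity duration UCA yields no controls at all, which may be worth covering with an explicit rule.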

Cognitive Cube Integration

Causal Chain Recording

When STPA identifies a hazard scenario, the causal chain is recorded in the TKG:

from datetime import datetime
from typing import List

def record_hazard_causal_chain(
    cube: CognitiveCube,
    hazard: Hazard,
    ucas: List[UnsafeControlAction],
    scenario: LossScenario
) -> None:
    """
    Record hazard causal chain in Temporal Knowledge Graph.

    Creates relationships:
    - (UCA) --causes--> (Hazard)
    - (Hazard) --leads_to--> (Loss)

    The (Control) --mitigates--> (Hazard) relationship belongs to the same
    graph schema but is not written by this function.
    """
    # Use one timestamp so every edge in the chain shares the same valid_from
    recorded_at = datetime.now()

    # Record hazard node
    cube.tkg.add_node(
        node_id=hazard.hazard_id,
        node_type="hazard",
        label=hazard.description,
        metadata={"severity": hazard.severity, "category": hazard.category}
    )

    # Record UCA → Hazard relationships
    for uca in ucas:
        cube.tkg.add_node(
            node_id=uca.uca_id,
            node_type="uca",
            label=f"{uca.controller}: {uca.control_action}",
            metadata={"type": uca.uca_type}
        )

        cube.tkg.add_edge(
            subject=uca.uca_id,
            predicate="causes",
            obj=hazard.hazard_id,
            valid_from=recorded_at,
            confidence=0.9  # Fixed confidence for analyst-defined UCA links
        )

    # Record Hazard → Loss relationship
    cube.tkg.add_edge(
        subject=hazard.hazard_id,
        predicate="leads_to",
        obj=scenario.loss_id,
        valid_from=recorded_at,
        confidence=scenario.likelihood_score
    )
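
The TKG interface itself is not defined in this specification. To show what the recorded chain looks like and why `valid_from` matters, here is a hypothetical in-memory stand-in that accepts the same `add_node` and `add_edge` keyword arguments. `InMemoryTKG`, `Edge`, `causes_of`, the loss ID `L-001`, the dates, and the 0.6 confidence are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class Edge:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    confidence: float


class InMemoryTKG:
    """Hypothetical stand-in exposing the add_node/add_edge calls used by the recorder."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: str, node_type: str, label: str,
                 metadata: Optional[Dict[str, Any]] = None) -> None:
        self.nodes[node_id] = {"type": node_type, "label": label, "metadata": metadata or {}}

    def add_edge(self, subject: str, predicate: str, obj: str,
                 valid_from: datetime, confidence: float) -> None:
        self.edges.append(Edge(subject, predicate, obj, valid_from, confidence))

    def causes_of(self, node_id: str, as_of: datetime) -> List[str]:
        """Subjects of 'causes' edges into node_id that were already valid at as_of."""
        return [e.subject for e in self.edges
                if e.obj == node_id and e.predicate == "causes" and e.valid_from <= as_of]


tkg = InMemoryTKG()
tkg.add_node("H-001", "hazard", "Agent executes action without human approval")
tkg.add_node("UCA-001", "uca", "agent: request_approval", {"type": "not_provided"})
tkg.add_edge("UCA-001", "causes", "H-001", valid_from=datetime(2026, 4, 1), confidence=0.9)
tkg.add_edge("H-001", "leads_to", "L-001", valid_from=datetime(2026, 4, 1), confidence=0.6)

before = tkg.causes_of("H-001", as_of=datetime(2026, 3, 1))
after = tkg.causes_of("H-001", as_of=datetime(2026, 5, 1))
```

Querying as of a date before the edge was recorded returns no causes, while a later date returns `UCA-001`. That time-scoped view is what distinguishes a temporal graph from a static one.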

Granger Causality for Hazard Prediction

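Granger causality testing identifies action patterns whose history helps forecast a hazard. Matching recent actions against those patterns flags likely hazards before they occur: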

from typing import List

MATCH_THRESHOLD = 0.7  # Minimum pattern match score that raises a prediction

def predict_hazard_from_actions(
    cube: CognitiveCube,
    recent_actions: List[AgentAction],
    known_hazards: List[Hazard]
) -> List[HazardPrediction]:
    """
    Use Granger causality to predict whether recent actions lead to known hazards.

    Returns at most one prediction per hazard, with the best pattern match
    score as its confidence.
    """
    predictions = []

    for hazard in known_hazards:
        # Get action patterns that historically led to this hazard
        causal_patterns = cube.granger.get_causal_patterns(
            effect=hazard.hazard_id,
            lookback_window=100
        )

        # Score recent actions against every pattern and keep the best match,
        # so a hazard matched by several patterns is reported only once
        best_score = max(
            (pattern.match(recent_actions) for pattern in causal_patterns),
            default=0.0
        )

        if best_score > MATCH_THRESHOLD:
            predictions.append(HazardPrediction(
                hazard_id=hazard.hazard_id,
                confidence=best_score,
                triggering_actions=[a.action_id for a in recent_actions[-5:]],
                recommendation="Activate mitigation controls"
            ))

    return predictions

ART Clustering for Hazard Groups

Use ART (Adaptive Resonance Theory) to cluster similar hazards:

from collections import defaultdict
from typing import Dict, List

def cluster_hazards(
    cube: CognitiveCube,
    hazards: List[Hazard]
) -> Dict[str, List[Hazard]]:
    """
    Cluster hazards using ART for pattern recognition.

    Returns clusters of related hazards that may share mitigation strategies.
    """
    # Convert hazards to feature vectors
    vectors = [hazard.to_feature_vector() for hazard in hazards]

    # Cluster using ART; a higher vigilance yields more, tighter clusters
    clusters = cube.art.cluster(vectors, vigilance=0.8)

    # Group hazards by cluster
    hazard_groups = defaultdict(list)
    for hazard, cluster_id in zip(hazards, clusters):
        hazard_groups[cluster_id].append(hazard)

    return dict(hazard_groups)
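
The behavior of `cube.art.cluster` is not defined in this document. To illustrate how a vigilance parameter controls cluster granularity, the sketch below implements a small Fuzzy ART variant with complement coding and fast learning. It assumes feature values scaled to [0, 1]; `fuzzy_art_cluster`, `alpha`, and `beta` are names chosen for this example, not ATSF parameters.

```python
from typing import List, Sequence


def fuzzy_art_cluster(vectors: Sequence[Sequence[float]], vigilance: float = 0.8,
                      alpha: float = 0.001, beta: float = 1.0) -> List[int]:
    """Assign each vector (features in [0, 1]) to a Fuzzy ART category index."""
    weights: List[List[float]] = []
    labels: List[int] = []
    for vec in vectors:
        # Complement coding keeps the input norm constant at len(vec).
        coded = list(vec) + [1.0 - v for v in vec]
        norm = sum(coded)

        def overlap(j: int) -> float:
            # Size of the fuzzy AND (element-wise minimum) of input and weights.
            return sum(min(a, w) for a, w in zip(coded, weights[j]))

        # Rank existing categories by the choice function, best first.
        ranked = sorted(range(len(weights)),
                        key=lambda j: overlap(j) / (alpha + sum(weights[j])),
                        reverse=True)
        chosen = None
        for j in ranked:
            # Vigilance test: accept the category only if the match is close enough.
            if overlap(j) / norm >= vigilance:
                weights[j] = [beta * min(a, w) + (1 - beta) * w
                              for a, w in zip(coded, weights[j])]
                chosen = j
                break
        if chosen is None:
            # No category passed the vigilance test, so create a new one.
            weights.append(coded)
            chosen = len(weights) - 1
        labels.append(chosen)
    return labels


# Two pairs of similar hypothetical hazard feature vectors.
labels = fuzzy_art_cluster([[0.9, 0.1], [0.88, 0.12], [0.1, 0.9], [0.12, 0.88]], vigilance=0.8)
```

With a vigilance of 0.8 the four vectors fall into two categories, `[0, 0, 1, 1]`. Raising the vigilance toward 1.0 makes the match test stricter and splits the pairs into separate categories.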

API Specification

REST Endpoints

# STPA-TRiSM Integration Endpoints

POST /stpa/hazards:
  description: Register a new hazard and auto-generate TRiSM controls
  request:
    hazard_id: string
    description: string
    category: enum[safety, privacy, bias, drift, security, reliability, compliance]
    severity: enum[low, medium, high, critical]
    ucas: array[UCA]
  response:
    generated_controls: array[TRiSMControl]
    pillar_mappings: object
    cognitive_cube_entries: integer

GET /stpa/hazards/{hazard_id}/controls:
  description: Get TRiSM controls generated for a hazard
  response:
    controls: array[TRiSMControl]
    pillar_breakdown: object

POST /stpa/scenarios/analyze:
  description: Analyze a loss scenario and map to TRiSM
  request:
    scenario_description: string
    involved_agents: array[string]
    context: object
  response:
    identified_hazards: array[Hazard]
    ucas: array[UCA]
    recommended_controls: array[TRiSMControl]

GET /stpa/predictions/{agent_id}:
  description: Get hazard predictions based on recent agent behavior
  response:
    predictions: array[HazardPrediction]
    recommended_actions: array[string]

POST /stpa/reports/generate:
  description: Generate STPA-TRiSM compliance report
  request:
    agent_ids: array[string]
    time_range: object
    format: enum[pdf, json, markdown]
  response:
    report_url: string
    summary: object
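
A client can run basic checks before calling `POST /stpa/hazards`. The sketch below builds and validates a request body with the standard library only. It assumes that every listed request field is required and that a UCA serializes as a flat JSON object; neither is stated in the endpoint definition. `build_hazard_request` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List

# Severity values listed for the hazard registration request.
SEVERITIES = ("low", "medium", "high", "critical")
# Assumption: all request fields are mandatory.
REQUIRED_FIELDS = ("hazard_id", "description", "category", "severity", "ucas")


def build_hazard_request(hazard: Dict[str, Any], ucas: List[Dict[str, Any]]) -> str:
    """Serialize a POST /stpa/hazards body after basic client-side checks."""
    body = {**hazard, "ucas": ucas}
    missing = [name for name in REQUIRED_FIELDS if name not in body]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")
    if body["severity"] not in SEVERITIES:
        raise ValueError(f"Invalid severity: {body['severity']}")
    return json.dumps(body, sort_keys=True)


payload = build_hazard_request(
    {
        "hazard_id": "H-001",
        "description": "Agent executes action without human approval",
        "category": "safety",
        "severity": "high",
    },
    [{"uca_id": "UCA-001", "uca_type": "not_provided"}],
)
```

The resulting JSON string can be sent with any HTTP client. Rejecting an invalid severity locally avoids a round trip for an error the server would report anyway.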

Python SDK

from atsf import ATSF
from atsf.stpa import STPAAnalyzer, Hazard, UCA
from atsf.ai_trism import TRiSMEngine

# Initialize
atsf = ATSF()
stpa = STPAAnalyzer()
trism = TRiSMEngine()

# Define hazard
hazard = Hazard(
    hazard_id="H-001",
    description="Agent executes action without human approval",
    category="safety",
    severity="high"
)

# Define UCAs
ucas = [
    UCA(
        uca_id="UCA-001",
        controller="agent",
        control_action="request_approval",
        context="high_risk_action",
        hazard_type="safety",
        uca_type="not_provided",
        severity="high",
        likelihood="possible"
    )
]

# Auto-generate controls
controls = stpa.generate_trism_controls(hazard, ucas)

# Apply to TRiSM
for control in controls:
    trism.apply_control(control)

# Get predictions
predictions = stpa.predict_hazards(agent_id="agent_001")

# Generate report
report = stpa.generate_report(
    agents=["agent_001", "agent_002"],
    format="pdf"
)

Implementation Roadmap

Phase 1: Core Mapping (Q2 2026)

Task	Status	Owner
Hazard category taxonomy	Done	Core team
UCA to Control generator	In progress	Bounty
Basic TKG integration	In progress	Core team
REST endpoints	Planned	Core team

Phase 2: Cognitive Integration (Q3 2026)

Task	Status	Owner
Granger causality for prediction	Planned	Research
ART hazard clustering	Planned	Research
Automated scenario generation	Planned	Community
Report generator (PDF)	Planned	Bounty

Phase 3: Advanced Features (Q4 2026)

Task	Status	Owner
Real-time hazard prediction	Planned	Core team
Cross-agent hazard correlation	Planned	Research
Regulatory template library	Planned	Community
Visual hazard explorer	Planned	Community

References

Leveson, N. (2011). Engineering a Safer World. MIT Press.
Gartner (2023). AI TRiSM Framework.
ATSF Security Layers Documentation
NIST AI RMF 1.0

Safety analysis meets governance enforcement.
