A cyber intelligence analyst responding to an intrusion alert does not start with a graph. They start with a list: a file hash from an endpoint detection platform, a suspicious IP address in the firewall logs, a domain flagged by a threat feed. Each indicator is isolated. None of them, on its own, tells the analyst what the adversary was doing, how far they got, or who they are.
Attack chain visualization solves this by converting that flat list of indicators of compromise (IOCs) and observed tactics, techniques, and procedures (TTPs) into a directed graph that represents the adversary's campaign as a coherent narrative. When the graph is built correctly, the analyst can see which kill chain phase each observed technique maps to, which infrastructure nodes connect to a known threat actor, and which gaps in the observed chain suggest detections the organization missed. This is the difference between reactive IOC blocking and genuine cyber intelligence analysis.
What attack chain visualization is solving
The core problem is a structural one. Modern intrusion campaigns generate evidence across multiple detection surfaces — endpoint, network, cloud, email, DNS — and that evidence arrives in different formats, at different times, with different levels of confidence. An analyst working from a SIEM dashboard sees individual alerts. They do not automatically see that the PowerShell execution event at 03:14 is causally connected to the phishing email that arrived six hours earlier and the lateral movement detected on a domain controller twelve hours later.
Attack chain visualization makes those causal connections explicit. The graph shows the adversary's intended operational sequence and lets the analyst map observed evidence onto that sequence. Gaps in the graph — phases where no evidence was collected — are as informative as the evidence itself: they identify blind spots in detection coverage that the adversary exploited or could exploit in a future campaign.
For defense and government analysts specifically, this capability matters beyond a single incident. Persistent state-sponsored actors run multiple campaigns against multiple targets over months or years, reusing infrastructure and tools. A graph that accumulates evidence across campaigns, rather than resetting after each incident, builds an institutional picture of adversary behavior that enables proactive defense — detecting the early phases of a new campaign because the infrastructure or techniques match a known actor profile.
The data model: STIX 2.1 and typed graph relationships
The foundation of any attack chain visualization is the underlying data model. STIX 2.1 (Structured Threat Information eXpression) provides a well-specified object model that maps cleanly to a property graph. The key STIX Domain Object types become graph node types:
Threat Actor: a named or tracked adversary entity.
Intrusion Set: a cluster of adversary activity, often spanning several campaigns, attributed to an actor.
Malware and Tool: software used in the attack.
Attack Pattern: a specific TTP, typically referenced by MITRE ATT&CK technique ID.
Infrastructure: command-and-control servers, staging hosts, exploit kits.
Identity: targeted organizations or sectors.
Indicator: a pattern (IP, domain, hash, YARA rule) that identifies malicious activity when observed.
STIX Relationship Objects become typed directed edges between these nodes. The relationship_type field defines the semantics: uses (Threat Actor uses Tool), delivers (Attack Pattern delivers Malware), targets (Intrusion Set targets Identity), indicates (Indicator indicates Malware), attributed-to (Intrusion Set attributed-to Threat Actor). These relationship types are not cosmetic: they determine which graph traversal queries are meaningful and which layout algorithm produces a legible diagram.
Each edge should carry provenance properties: the source feed or report, the ingestion timestamp, a confidence score (0.0–1.0), and the TLP classification of the originating intelligence. Confidence propagation is critical — a chain of high-confidence edges leading to a low-confidence attribution should surface the attribution uncertainty visually rather than hiding it in the data layer.
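A minimal sketch of this node and edge model in Python, using only the standard library. The class names Node and Edge, the field names, and the placeholder identifiers are illustrative assumptions, not part of STIX 2.1 or of any particular platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Node:
    stix_id: str     # placeholder IDs below; real STIX IDs are "<type>--<uuid>"
    stix_type: str   # STIX Domain Object type, used as the graph node label
    name: str


@dataclass(frozen=True)
class Edge:
    source: str              # stix_id of the source node
    relationship_type: str   # e.g. "uses", "indicates", "attributed-to"
    target: str              # stix_id of the target node
    source_feed: str         # provenance: originating feed or report
    confidence: float        # 0.0-1.0 scale, as described above
    tlp: str                 # TLP label of the originating intelligence
    ingested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        # Reject out-of-range scores at write time rather than at query time.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")


actor = Node("threat-actor--example-1", "threat-actor", "Tracked Group A")
intrusion = Node("intrusion-set--example-1", "intrusion-set", "Example Cluster")
edge = Edge(
    source=intrusion.stix_id,
    relationship_type="attributed-to",
    target=actor.stix_id,
    source_feed="vendor-report-001",   # hypothetical report identifier
    confidence=0.7,
    tlp="TLP:AMBER",
)
```

Keeping provenance on the edge rather than on the node means two reports can assert the same relationship with different confidence values without overwriting each other.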
Graph database options for CTI workloads
The choice of graph database determines what analytical operations are practical at scale and what latency the analyst workflow can tolerate. Three options dominate CTI platform architectures.
Neo4j
Neo4j is widely deployed in CTI platforms and is a practical default for most defense organizations. Its Cypher query language makes multi-hop relationship traversal readable and maintainable. A query like MATCH (actor:ThreatActor)-[:USES*1..3]->(infra:Infrastructure) WHERE actor.name = 'Tracked Group A' RETURN DISTINCT infra finds all infrastructure reachable from a named actor by following one to three USES relationships. This is the graph traversal that underlies most "expand actor context" operations in an analyst workflow.
Neo4j's limitations become relevant at scale: ingesting tens of millions of nodes with real-time write throughput requires careful index design and clustering configuration. For most defense CTI deployments — which deal in hundreds of thousands to low millions of nodes — this is not a constraint.
TigerGraph
TigerGraph is optimized for analytical graph workloads at very large scale — billions of edges with sub-second traversal latency. Its GSQL query language is more powerful than Cypher for complex pattern matching but requires more specialized expertise. TigerGraph is the right choice for national-level CTI platforms aggregating intelligence across many organizations, where Neo4j's write throughput or traversal latency becomes a bottleneck. For a single defense organization's CTI platform, the additional operational complexity rarely pays off.
In-memory graph
For real-time attack chain construction — where an analyst needs a graph populated within seconds of ingesting a new intelligence feed — an in-memory graph (NetworkX in Python, or a custom structure backed by a hash map) provides maximum query speed at the cost of scale and persistence. This approach is viable for session-scoped analysis: the analyst loads a relevant subgraph into memory, performs traversals and layout calculations, exports the result, and the in-memory state is discarded. The persistent knowledge base still lives in a durable graph database; the in-memory layer is the visualization cache.
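As a rough illustration of the session-scoped approach, the sketch below stores a subgraph in a plain dictionary and runs a bounded-hop traversal comparable to the Cypher pattern shown earlier. The node names, the add_edge and reachable helpers, and the sample edges are all hypothetical. Unlike the Cypher query, it returns every reachable node, so a node-type filter would still need to be applied afterwards.

```python
from collections import defaultdict, deque

# Adjacency map: node -> list of (relationship_type, neighbor)
graph = defaultdict(list)


def add_edge(src, rel, dst):
    graph[src].append((rel, dst))


add_edge("Tracked Group A", "uses", "Tool X")
add_edge("Tool X", "uses", "C2 Server 1")
add_edge("Tracked Group A", "uses", "Staging Host 1")
add_edge("C2 Server 1", "uses", "Relay 1")
add_edge("Relay 1", "uses", "Relay 2")   # four hops from the actor


def reachable(start, rel, max_hops):
    """Return {node: hop_count} for nodes reachable via `rel` edges within max_hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for edge_rel, nxt in graph.get(node, []):
            if edge_rel == rel and nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[start]
    return hops


result = reachable("Tracked Group A", "uses", 3)
```

Here "Relay 2" is left out of result because it sits four hops from the actor.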
MITRE ATT&CK navigator integration
MITRE ATT&CK provides the most important reference taxonomy for attack chain visualization: a structured enumeration of adversary techniques organized by tactic phase, from Reconnaissance through to Impact. Integrating ATT&CK into the graph means tagging each Attack Pattern node with its technique ID (e.g., T1566.001 — Spearphishing Attachment) and its parent tactic (Initial Access).
This tagging enables two distinct visualizations. The first is the kill chain diagram: nodes are placed in tactic-phase lanes, and directed edges show the adversary's observed progression through phases. An analyst can immediately see that this campaign was observed in Initial Access and Execution phases but showed no evidence in Persistence — either the adversary did not establish persistence, or persistence mechanisms were not detected.
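The lane assignment and gap check can be sketched as follows. The tactic list follows the Enterprise ATT&CK matrix order. The build_lanes and chain_gaps helpers and the sample observations are illustrative assumptions, and the sketch assigns each technique to a single tactic even though some ATT&CK techniques belong to several.

```python
TACTIC_ORDER = [
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion",
    "Credential Access", "Discovery", "Lateral Movement", "Collection",
    "Command and Control", "Exfiltration", "Impact",
]


def build_lanes(observed):
    """Group observed (technique_id, tactic) pairs into tactic-phase lanes."""
    lanes = {tactic: [] for tactic in TACTIC_ORDER}
    for technique_id, tactic in observed:
        lanes[tactic].append(technique_id)
    return lanes


def chain_gaps(lanes):
    """Tactics between the first and last observed phase that have no evidence."""
    seen = [i for i, tactic in enumerate(TACTIC_ORDER) if lanes[tactic]]
    if not seen:
        return []
    return [t for t in TACTIC_ORDER[seen[0]:seen[-1] + 1] if not lanes[t]]


observed = [
    ("T1566.001", "Initial Access"),
    ("T1059.001", "Execution"),
    ("T1021.002", "Lateral Movement"),
]
lanes = build_lanes(observed)
gaps = chain_gaps(lanes)
```

A gap does not prove the adversary skipped a phase. It marks where the analyst should check whether telemetry was available at all.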
The second is the coverage heat map: an ATT&CK Navigator-style matrix where each technique cell is colored based on whether the organization has a detection rule covering it, and whether that technique has been observed in tracked campaigns. Overlaying these two layers identifies the highest-priority detection gaps: techniques that adversaries actively use against organizations in the same sector but for which the defending organization has no detection coverage.
For defense CTI platforms, coverage heat maps should be generated per actor profile, not just globally. An actor known to exclusively use living-off-the-land techniques (LOLBins, WMI, scheduled tasks) has a very different coverage priority profile than an actor known to deploy custom implants via supply chain compromise.
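One way to compute the per-actor overlay is to classify every technique cell from two sets: the techniques attributed to the actor and the techniques covered by a detection rule. The state names, the helper functions, and the sample sets below are assumptions for illustration only.

```python
def cell_state(technique, observed, covered):
    """Classify one heat map cell from the two overlay layers."""
    if technique in observed and technique not in covered:
        return "gap"                # used by the actor, no detection rule
    if technique in observed:
        return "observed-covered"   # used by the actor and detected
    if technique in covered:
        return "covered-only"       # detected, not seen from this actor
    return "none"


def actor_heat_map(matrix, actor_techniques, covered):
    return {t: cell_state(t, actor_techniques, covered) for t in matrix}


matrix = ["T1047", "T1053.005", "T1195.002", "T1566.001"]
covered = {"T1053.005", "T1566.001"}      # hypothetical detection coverage

lotl_actor = {"T1047", "T1053.005"}       # WMI, scheduled tasks
supply_chain_actor = {"T1195.002"}        # software supply chain compromise

lotl_map = actor_heat_map(matrix, lotl_actor, covered)
supply_map = actor_heat_map(matrix, supply_chain_actor, covered)
```

The same coverage set yields different "gap" cells for each actor, which is why a single global heat map hides actor-specific priorities.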
Automated chain construction from threat reports
Manual graph population does not scale. A mature CTI program ingests dozens of threat reports per week — vendor research publications, government advisories, open-source blog posts — and each one potentially contains new nodes and edges relevant to the knowledge graph. Automation is not optional; it is the only way to keep the graph current.
The automation pipeline has three stages. The first is NLP extraction: a named entity recognition model fine-tuned on cybersecurity corpora extracts candidate entities (threat actor names, malware families, CVE identifiers, IP addresses, domain names, file hashes, ATT&CK technique references) and candidate relationships from unstructured text. Models fine-tuned on security-domain corpora substantially outperform general-purpose NER on this task — the vocabulary and entity boundaries in threat reporting are domain-specific.
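A fine-tuned NER model is outside the scope of a short example, but the pattern-like entities (IP addresses, hashes, CVE identifiers, ATT&CK technique references) are commonly pre-extracted with regular expressions before the model handles names and relationships. The sketch below shows only that regex baseline. The patterns are simplified assumptions, and the report text is invented.

```python
import re

# Simplified patterns: the IPv4 pattern does not validate octet ranges, and
# none of these handle defanged indicators such as "203[.]0[.]113[.]10".
PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b"),
    "technique": re.compile(r"\bT\d{4}(?:\.\d{3})?\b"),
}


def extract_candidates(text):
    """Return de-duplicated candidate entities per entity kind."""
    return {
        kind: sorted(set(pattern.findall(text)))
        for kind, pattern in PATTERNS.items()
    }


report = (
    "The actor exploited CVE-2021-44228, ran T1059.001, "
    "and beaconed to 203.0.113.10."
)
candidates = extract_candidates(report)
```

Everything returned here is a candidate only. It still has to pass through entity resolution and analyst review before it becomes a graph node.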
The second stage is entity resolution: extracted entities are matched against existing graph nodes. "Sandworm," "Voodoo Bear," and "TeleBots" are different names for the same tracked actor — the resolution step must merge these to the canonical node rather than creating duplicates. Resolution uses fuzzy string matching, alias tables maintained by the intelligence team, and, for infrastructure indicators, direct identifier matching.
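The alias-table and fuzzy-matching steps might look like the sketch below, which uses difflib from the Python standard library. The ALIASES table, the resolve function, and the 0.85 cutoff are illustrative assumptions. A production alias table would be maintained by the intelligence team, and the cutoff would need tuning against real data.

```python
import difflib

# Lowercased alias -> canonical node name (illustrative entries only)
ALIASES = {
    "sandworm": "Sandworm",
    "voodoo bear": "Sandworm",
    "telebots": "Sandworm",
}


def resolve(name, cutoff=0.85):
    """Map an extracted actor name to its canonical node, or None if unknown."""
    key = name.strip().lower()
    if key in ALIASES:                      # exact alias hit
        return ALIASES[key]
    close = difflib.get_close_matches(key, list(ALIASES), n=1, cutoff=cutoff)
    if close:                               # near miss, e.g. a typo or OCR error
        return ALIASES[close[0]]
    return None                             # leave for analyst review


exact = resolve("Voodoo Bear")
fuzzy = resolve("Sandw0rm")
unknown = resolve("Unknown Group")
```

Returning None instead of creating a new node keeps unresolved names out of the graph until an analyst decides whether they are a new actor or a new alias.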
The third stage is graph population: resolved entities and relationships are written to the graph database as new nodes and edges, with a lower baseline confidence score (0.6–0.7 for auto-extracted vs. 0.9+ for manually reviewed) and the source report as provenance. A review queue shows analysts the new auto-extracted edges, so they confirm or reject attributions rather than building the graph from scratch.
Layout algorithms: Sugiyama for kill chains, force-directed for attribution
The layout algorithm determines whether the graph is analytically legible or a tangle of crossing edges. Two algorithms dominate CTI visualization.
The Sugiyama layered algorithm is the natural fit for kill chain diagrams. Attack chains have an inherent temporal and causal directionality (Initial Access precedes Execution, which precedes Persistence) that Sugiyama encodes as ordered layers, drawn as columns when the diagram flows left to right. Nodes in the same ATT&CK tactic phase are placed in the same layer. The algorithm then reorders nodes within each layer to reduce edge crossings between adjacent layers, producing a left-to-right flow diagram where the adversary's progression is immediately visible. For kill chain visualization, Sugiyama is not a style preference; it matches the layered structure of the data.
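The Sugiyama framework has several phases: cycle removal, layer assignment, crossing reduction, and coordinate assignment. The sketch below covers only crossing reduction for one pair of adjacent layers, using the common barycenter heuristic, and assumes layer assignment is already given by tactic phase. Function names and the sample techniques are illustrative.

```python
def count_crossings(upper, lower, edges):
    """Count pairwise edge crossings between two adjacent layers."""
    pos_u = {node: i for i, node in enumerate(upper)}
    pos_l = {node: i for i, node in enumerate(lower)}
    pts = [(pos_u[u], pos_l[v]) for u, v in edges]
    return sum(
        1
        for i in range(len(pts))
        for j in range(i + 1, len(pts))
        # Two edges cross when their endpoints are ordered oppositely.
        if (pts[i][0] - pts[j][0]) * (pts[i][1] - pts[j][1]) < 0
    )


def barycenter_order(upper, lower, edges):
    """Reorder `lower` by the mean position of each node's neighbors in `upper`."""
    pos_u = {node: i for i, node in enumerate(upper)}

    def barycenter(node):
        nbrs = [pos_u[u] for u, v in edges if v == node]
        # Nodes with no neighbor in the upper layer sort to the end.
        return sum(nbrs) / len(nbrs) if nbrs else float("inf")

    return sorted(lower, key=barycenter)


initial_access = ["T1566.001", "T1190"]
execution = ["T1059.003", "T1059.001"]
edges = [("T1566.001", "T1059.001"), ("T1190", "T1059.003")]

before = count_crossings(initial_access, execution, edges)
reordered = barycenter_order(initial_access, execution, edges)
after = count_crossings(initial_access, reordered, edges)
```

A full implementation sweeps this reordering back and forth across all layers several times. The heuristic reduces crossings but does not guarantee the minimum.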
Force-directed layouts (D3-force is the most commonly used implementation for web-based CTI dashboards) work better for attribution graphs — where the primary analytical question is "which infrastructure nodes cluster around which actor?" rather than "in what sequence did the adversary act?" Force-directed layouts place strongly connected nodes near each other, making clusters of shared infrastructure, tools used by multiple actors, or overlapping campaign activity visually apparent. The analyst sees overlaps that would be invisible in a tabular view.
For large graphs (more than 200 nodes in a single view), edge bundling, which groups parallel edges between the same pair of clusters into a single visual bundle, is necessary to preserve legibility. Without bundling, a graph with 500+ edges degrades into an unreadable tangle. D3 supports hierarchical edge bundling through its bundle curve, and Cytoscape.js bundles parallel edges between the same pair of nodes with its bezier edge style.
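The grouping step behind bundling can be sketched independently of any rendering library: edges are keyed by the clusters of their endpoints and counted, and the count can then drive bundle thickness. Curve routing is left to the renderer. The cluster assignments, node names, and bundle_edges helper below are hypothetical.

```python
from collections import Counter

# Hypothetical cluster assignment, e.g. from community detection or actor tags
cluster_of = {
    "c2-1": "Actor A infra",
    "c2-2": "Actor A infra",
    "vic-1": "Sector X",
    "vic-2": "Sector X",
    "tool-1": "Shared tooling",
}

edges = [
    ("c2-1", "vic-1"),
    ("c2-1", "vic-2"),
    ("c2-2", "vic-1"),
    ("tool-1", "c2-1"),
    ("c2-1", "c2-2"),   # intra-cluster edge
]


def bundle_edges(edges, cluster_of):
    """Count edges per (source cluster, target cluster) pair."""
    bundles = Counter()
    for src, dst in edges:
        cs, ct = cluster_of[src], cluster_of[dst]
        if cs != ct:    # intra-cluster edges are drawn individually
            bundles[(cs, ct)] += 1
    return bundles


bundles = bundle_edges(edges, cluster_of)
```

In this sample, three edges collapse into a single bundle from "Actor A infra" to "Sector X".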
The analyst workflow: from IOC to attribution to report
The visualization is only as useful as the workflow it supports. A well-designed attack chain visualization tool should support four analyst operations without requiring query authorship.
Pivot from IOC. The analyst enters a specific indicator — an IP address, a domain, a file hash — and the tool expands the graph to show all nodes directly connected to that indicator, with relationship types labeled. From a single IP, the analyst should be able to see immediately: which malware family it has been associated with, which campaigns used it, what other infrastructure appeared in the same campaign, and whether any of those nodes connect to a tracked actor profile.
Expand attribution. The analyst follows the graph from infrastructure back to actor. The query path is: Indicator → Malware → Tool → Intrusion Set → Threat Actor. Each hop may carry a different confidence level, and the visualization should propagate that uncertainty: if the edges are treated as independent, a chain of three 0.8-confidence edges produces an overall attribution confidence of approximately 0.51 (0.8³), not 0.8. Analysts who present automated attribution without uncertainty quantification produce unreliable intelligence products.
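The multiplication rule is easy to express in code. The sketch below assumes edge confidences are independent probabilities, which is a simplification. The chain_confidence function and the min-based alternative are illustrative, not a prescribed scoring method.

```python
import math


def chain_confidence(edge_confidences):
    """Overall confidence of a path, treating edges as independent."""
    return math.prod(edge_confidences)


def weakest_link(edge_confidences):
    """Alternative, more optimistic rule: the path is as strong as its weakest edge."""
    return min(edge_confidences)


path = [0.8, 0.8, 0.8]
product_score = chain_confidence(path)   # about 0.512
min_score = weakest_link(path)           # 0.8
```

Whichever rule a platform adopts, the per-edge values should stay visible alongside the combined score so the analyst can see which hop is driving the uncertainty.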
Compare to known actor profiles. The analyst selects a tracked actor from the knowledge base and overlays their historical TTP profile — which techniques they have used, which infrastructure they have operated, which targets they have prioritized — against the current incident's observed evidence. Matches and divergences are both informative: divergences may indicate a false attribution or an actor adapting their TTPs.
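For the technique dimension, this comparison reduces to set operations. The sketch below reports matches, divergences in both directions, and a Jaccard similarity score. The function name, result keys, and technique sets are illustrative assumptions.

```python
def compare_profile(profile, observed):
    """Compare an actor's historical techniques with the current incident."""
    matches = profile & observed
    union = profile | observed
    return {
        "matches": sorted(matches),
        "unseen_from_profile": sorted(profile - observed),  # expected, not observed
        "new_to_actor": sorted(observed - profile),         # observed, not in profile
        "jaccard": len(matches) / len(union) if union else 0.0,
    }


actor_profile = {"T1566.001", "T1059.001", "T1047"}
incident = {"T1566.001", "T1059.001", "T1021.002"}

comparison = compare_profile(actor_profile, incident)
```

A similarity score alone should not drive attribution, because commodity techniques such as spearphishing overlap across many actors. The "new_to_actor" list is often the more useful output.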
Generate a report. The analyst selects the relevant subgraph — typically one intrusion set and its connected nodes — and exports it as a structured report. The report format should include the visual diagram, a table of all nodes with their properties, a MITRE ATT&CK heat map for the observed techniques, and a STIX 2.1 bundle for machine consumption by partner organizations. Automated report generation from a confirmed subgraph reduces reporting time from hours to minutes.
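The STIX 2.1 bundle part of the export can be sketched with the standard library alone. The helper names and object names below are hypothetical, and only the required common properties are populated. STIX 2.1 expresses confidence as an integer from 0 to 100, so the graph's 0.0 to 1.0 score is rescaled. A real exporter would more likely rely on a maintained STIX library and validate its output.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_id(obj_type):
    return f"{obj_type}--{uuid.uuid4()}"


def timestamp():
    # RFC 3339 UTC timestamp with millisecond precision
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def stix_object(obj_type, **props):
    ts = timestamp()
    return {
        "type": obj_type,
        "spec_version": "2.1",
        "id": stix_id(obj_type),
        "created": ts,
        "modified": ts,
        **props,
    }


actor = stix_object("threat-actor", name="Tracked Group A")
intrusion_set = stix_object("intrusion-set", name="Example Intrusion Set")
relationship = stix_object(
    "relationship",
    relationship_type="attributed-to",
    source_ref=intrusion_set["id"],
    target_ref=actor["id"],
    confidence=round(0.7 * 100),   # graph score 0.7 -> STIX confidence 70
)

bundle = {
    "type": "bundle",
    "id": stix_id("bundle"),
    "objects": [actor, intrusion_set, relationship],
}
print(json.dumps(bundle, indent=2))
```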
For analysts working on OSINT-based threat monitoring, the same visualization workflow applies to open-source intelligence: Telegram channel posts, dark web forum activity, and domain registration patterns all produce nodes and edges that populate the graph and support the pivot-and-expand workflow.
Implementation trade-offs for defense deployments
Several implementation decisions are specific to defense and government deployments and differ from commercial CTI platform design.
Classification handling. Graph nodes and edges sourced from classified intelligence feeds must carry TLP or national classification labels that propagate through the graph. A query that traverses from an unclassified indicator to a classified node must not return the classified node to an analyst without appropriate clearance. This requires classification-aware access control at the graph query layer, not just at the data ingestion layer.
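A minimal sketch of query-layer filtering, assuming labels can be placed on a single ordered scale. It uses TLP 2.0 labels as the example scale, although TLP is a sharing marking rather than a clearance level and national classification schemes add compartments that a simple ranking cannot express. The traversal neither returns nor passes through nodes above the analyst's level, which is a deliberately conservative design choice. All names and sample data are hypothetical.

```python
from collections import deque

LABEL_RANK = {
    "TLP:CLEAR": 0,
    "TLP:GREEN": 1,
    "TLP:AMBER": 2,
    "TLP:AMBER+STRICT": 3,
    "TLP:RED": 4,
}


def visible_reachable(graph, labels, start, analyst_level):
    """Nodes reachable from `start` without touching any node above analyst_level."""
    limit = LABEL_RANK[analyst_level]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            # Restricted nodes are neither returned nor expanded, so their
            # existence cannot be inferred from what lies beyond them.
            if nxt not in seen and LABEL_RANK[labels[nxt]] <= limit:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


graph = {
    "ip-1": ["malware-1"],
    "malware-1": ["report-1"],
    "report-1": ["actor-1"],
}
labels = {
    "ip-1": "TLP:CLEAR",
    "malware-1": "TLP:GREEN",
    "report-1": "TLP:RED",
    "actor-1": "TLP:GREEN",
}

green_view = visible_reachable(graph, labels, "ip-1", "TLP:GREEN")
red_view = visible_reachable(graph, labels, "ip-1", "TLP:RED")
```

In the sample data, "actor-1" carries a low label but is reachable only through a restricted report, so it stays hidden from the lower-level view.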
Air-gapped operation. Defense networks often have segments that cannot reach external services. The graph database, NLP extraction pipeline, and visualization frontend must all be capable of operating without external network calls. Commercial graph visualization tools that embed CDN-loaded JavaScript libraries or cloud-based rendering services are architecturally incompatible with air-gapped deployments.
Latency requirements. Tactical cyber operations may require attack chain analysis within minutes of detecting an intrusion. The difference between a Neo4j query that returns in 200ms and one that takes 8 seconds matters when an analyst is pivoting through a live incident. Index design, query caching, and subgraph pre-computation for known actor profiles are all worth engineering effort in high-operational-tempo environments.
Corvus.Sense automates attack chain construction from Telegram monitoring and OSINT streams, populating a continuously updated knowledge graph that supports the full pivot-and-expand analyst workflow — without manual report parsing or graph authoring.