NATO's Artificial Intelligence Strategy, adopted by Allied Defence Ministers in October 2021, established the Alliance's framework for responsible AI adoption across the full range of military and civilian operations. While the strategy itself is a high-level policy document, it has generated a set of concrete requirements and expectations that translate directly into procurement criteria for AI-enabled defence software. For software vendors seeking to sell AI capabilities into the NATO market, understanding these requirements is not optional: it is a baseline qualification.
The strategy has continued to develop since 2021. The 2022 Strategic Concept reinforced AI as a priority technology area. The DIANA programme (Defence Innovation Accelerator for the North Atlantic) has integrated the strategy's principles into its evaluation criteria. And individual NATO member states have been translating the Alliance-level principles into national procurement requirements that affect what they will and will not buy.
NATO AI Strategy: Six Principles of Responsible AI
The core of NATO's AI strategy is a set of six principles for responsible use of AI in defence. These principles were developed in alignment with the OECD AI Principles and reflect the Alliance's commitment to maintaining human oversight and ethical constraints on AI systems even under operational pressure. The six principles are:
Lawfulness: AI applications must be developed and used in accordance with national and international law, including international humanitarian law and human rights law. For software vendors, this means that AI systems intended for operational military use must be designed with explicit constraints that prevent them from enabling unlawful uses, and the vendor must be able to document and demonstrate these constraints.
Responsibility and accountability: There must be clear and unambiguous accountability for AI systems, both during development and in operational use. This principle has significant implications for AI system architecture — it requires clear traceability from AI system outputs to the human decision-makers who act on those outputs, and clear documentation of the responsibility chain for AI system behaviour.
Explainability and traceability: AI applications must be understandable and traceable in a manner appropriate to the context of their use. This is one of the most technically demanding principles for AI vendors, because many high-performing AI approaches (particularly deep learning models) are inherently difficult to explain. NATO's expectation is not that all AI systems must use inherently interpretable approaches, but that vendors must be able to provide contextually appropriate explanations of system behaviour for the level of decision authority at which the system operates.
Reliability: AI applications must function reliably in accordance with their intended purpose across the full range of operational conditions, not just under test conditions. This directly echoes the distinction between battle-tested and lab-tested systems: a system that performs reliably in a test environment but degrades unexpectedly under operational conditions does not meet the reliability principle.
Governability: AI systems must be designed to allow appropriate human oversight and control, with the ability to intervene, adjust, or disable AI functions as necessary. This principle translates into concrete architectural requirements: AI systems must expose human override interfaces, must respond predictably and immediately to override commands, and must not exhibit emergent behaviour that circumvents human control mechanisms.
Bias mitigation: Efforts must be taken to minimise unintended bias in AI applications that could affect decision-making, particularly in applications that affect people. For intelligence, surveillance, and targeting applications, this requires documented testing for performance disparities across different environmental conditions, target types, and data sources, with mitigation measures applied where disparities are identified.
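To make the testing requirement concrete, the sketch below shows the shape of a basic disparity check: per-condition accuracy is compared against overall accuracy, and slices that fall more than a set margin behind are flagged for mitigation. The condition names, data, and threshold are illustrative, not drawn from any NATO standard.

```python
from collections import defaultdict

# Hypothetical evaluation records: (condition, true_label, predicted_label).
# In practice these would come from a labelled test set tagged by
# environmental condition, target type, or data source.
RESULTS = [
    ("clear_day", "vehicle", "vehicle"),
    ("clear_day", "vehicle", "vehicle"),
    ("night_ir", "vehicle", "clutter"),
    ("night_ir", "vehicle", "vehicle"),
    ("fog", "vehicle", "clutter"),
]

def disparity_report(results, max_gap=0.10):
    """Flag condition slices whose accuracy falls more than
    `max_gap` below overall accuracy."""
    overall_acc = sum(1 for _, t, p in results if t == p) / len(results)
    by_condition = defaultdict(list)
    for cond, t, p in results:
        by_condition[cond].append(t == p)
    report = {}
    for cond, outcomes in by_condition.items():
        acc = sum(outcomes) / len(outcomes)
        report[cond] = {"accuracy": acc, "flagged": overall_acc - acc > max_gap}
    return overall_acc, report

overall, per_slice = disparity_report(RESULTS)
print(f"overall accuracy: {overall:.2f}")
for cond, stats in sorted(per_slice.items()):
    flag = "  <-- disparity, mitigation needed" if stats["flagged"] else ""
    print(f"{cond}: {stats['accuracy']:.2f}{flag}")
```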
Requirements for AI Systems: Explainability, Auditability, Robustness
Translating the six principles into procurement requirements, the most consistently cited technical requirements for NATO AI systems are explainability, auditability, and robustness. These three requirements are interconnected and together define the practical compliance standard for AI defence software.
Explainability in a NATO context does not mean that the AI system must provide a full technical explanation of its model internals. It means that for a given output or recommendation, the system must be able to provide a contextually appropriate explanation that enables the human operator to evaluate the recommendation and make an informed decision to accept or reject it. For an image classification system used in intelligence analysis, this might mean highlighting the features in the image that contributed most strongly to the classification. For a logistics optimisation system, it might mean showing which constraints drove the recommended routing decision. The level of explainability required is calibrated to the level of decision authority involved.
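As an illustration of the decision-support case, here is a minimal sketch of a recommendation object that carries its own explanation: the binding constraints that drove the routing choice, ranked by influence. All identifiers and values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class RouteRecommendation:
    """A recommendation that carries its own contextual explanation:
    the constraints that drove the choice, ranked by influence."""
    route_id: str
    score: float
    # (constraint description, relative influence on the decision)
    binding_constraints: list = field(default_factory=list)

    def explanation(self) -> str:
        lines = [f"Recommended route {self.route_id} (score {self.score:.2f}) because:"]
        for desc, weight in sorted(self.binding_constraints, key=lambda c: -c[1]):
            lines.append(f"  - {desc} (influence {weight:.0%})")
        return "\n".join(lines)

rec = RouteRecommendation(
    route_id="R-17",
    score=0.91,
    binding_constraints=[
        ("bridge weight limit on alternative route", 0.55),
        ("fuel resupply window at waypoint 3", 0.30),
        ("forecast visibility below minimum on coastal leg", 0.15),
    ],
)
print(rec.explanation())
```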
Auditability requires that AI system behaviour can be reconstructed after the fact for review, whether for operational learning, investigation of errors, or legal/policy accountability. This means that AI systems must log their inputs, reasoning processes, and outputs with sufficient detail to enable post-hoc analysis. The logging requirements for classified military systems are stringent and must be considered in the system architecture from the outset — retrofitting audit logging to an existing system is significantly more expensive than designing for it.
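A minimal sketch of what such logging might look like in practice, assuming an append-only JSON-lines log: each entry links a model version and a hashed input to the output produced and the human decision taken on it, which also serves the traceability demanded by the responsibility and accountability principle. Field names and values are illustrative, not a prescribed NATO schema.

```python
import datetime
import hashlib
import json

def audit_record(model_id, model_version, input_payload, output, operator_id, decision):
    """Build one append-only audit entry linking an AI output to the
    human decision taken on it. Hashing the input keeps the log compact
    while still allowing post-hoc verification against archived inputs."""
    return {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_id": model_id,
        "model_version": model_version,
        "input_sha256": hashlib.sha256(input_payload).hexdigest(),
        "output": output,
        "operator_id": operator_id,
        "operator_decision": decision,  # e.g. "accepted" / "rejected" / "overridden"
    }

with open("audit.log", "a") as log:
    entry = audit_record(
        model_id="imint-classifier",
        model_version="2.3.1",
        input_payload=b"<raw sensor frame bytes>",
        output={"label": "vehicle", "confidence": 0.87},
        operator_id="analyst-042",
        decision="accepted",
    )
    log.write(json.dumps(entry) + "\n")
```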
Robustness addresses the system's performance under distribution shift — when operational inputs differ from the distribution of training data. Military environments are characterised by exactly this kind of distribution shift: sensors operate under conditions that were not represented in training data, adversaries actively attempt to create inputs that fool AI systems, and operational tempo creates data quality problems. Robust AI systems are those whose performance degrades gracefully under distribution shift, rather than failing catastrophically. Demonstrating robustness requires systematic adversarial testing and performance evaluation across diverse environmental conditions.
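One way to operationalise "degrades gracefully" in a test harness is to evaluate the system at increasing corruption severities and flag any single step where performance falls off a cliff. The sketch below assumes per-severity accuracies have already been measured; the numbers and threshold are illustrative.

```python
def degradation_profile(accuracies_by_severity, max_step_drop=0.15):
    """Check that accuracy degrades gracefully as input corruption
    severity increases: no single severity step may cause a drop
    larger than `max_step_drop` (a catastrophic 'cliff')."""
    cliffs = []
    severities = sorted(accuracies_by_severity)
    for prev, curr in zip(severities, severities[1:]):
        drop = accuracies_by_severity[prev] - accuracies_by_severity[curr]
        if drop > max_step_drop:
            cliffs.append((prev, curr, drop))
    return cliffs

# Hypothetical accuracies measured on test sets with increasing
# levels of sensor noise (severity 0 = clean data).
measured = {0: 0.94, 1: 0.91, 2: 0.87, 3: 0.62, 4: 0.55}
for prev, curr, drop in degradation_profile(measured):
    print(f"catastrophic drop between severity {prev} and {curr}: -{drop:.2f}")
```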
Key insight: NATO's AI requirements are not primarily about limiting AI capability — they are about creating trust in AI systems sufficient to allow their operational adoption. A vendor who frames compliance with NATO AI principles as a constraint is missing the point. The principles exist because AI systems that cannot be explained, audited, or overridden will not be adopted by operational military users regardless of their technical performance.
Implications for Defence Software Vendors
For software vendors developing AI-enabled defence products, the NATO AI strategy has concrete implications for product development, documentation, and go-to-market positioning. The most important are:
Design for explainability from the start. Choosing model architectures and training approaches that support explainability is significantly cheaper than retrofitting explanation mechanisms after the fact. For vendors working in computer vision, Gradient-weighted Class Activation Mapping (Grad-CAM) and similar attribution methods should be standard features. For NLP-based systems, attention visualisation and feature attribution should be built in. For decision-support systems, showing the key factors driving recommendations should be a standard output format.
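For vision systems, the core of Grad-CAM is compact enough to sketch. The example below, written against PyTorch and torchvision, hooks the last convolutional block of a ResNet-18, weights each feature channel by its mean gradient for the predicted class, and produces a coarse heat map of the image regions that drove the prediction. The model, layer choice, and random input are stand-ins; production use would add upsampling and an image overlay.

```python
import torch
from torchvision.models import resnet18

# Minimal Grad-CAM over the last convolutional block of a ResNet-18.
# The model and layer choice are illustrative; any CNN with a
# spatial feature map works the same way.
model = resnet18(weights=None).eval()
activations, gradients = {}, {}

def fwd_hook(_module, _inp, out):
    activations["feat"] = out.detach()

def bwd_hook(_module, _gin, gout):
    gradients["feat"] = gout[0].detach()

model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)  # stand-in for a real input image
logits = model(x)
cls = logits.argmax(dim=1).item()
model.zero_grad()
logits[0, cls].backward()

# Weight each feature channel by its mean gradient, then sum and ReLU:
# high values mark regions that pushed the predicted class score up.
weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)    # (1, C, 1, 1)
cam = torch.relu((weights * activations["feat"]).sum(dim=1))  # (1, H, W)
cam = cam / (cam.max() + 1e-8)  # normalise to [0, 1] for overlay
print(cam.shape)  # upsample to image size and overlay as a heat map
```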
Document your training data and testing methodology. NATO procurement evaluators will ask about training data provenance, labelling methodology, and test set composition. Vendors who cannot answer these questions in detail raise immediate concerns about whether their system's stated performance is a reliable indicator of operational performance. Maintaining a model card for each AI component — documenting training data, performance benchmarks, known limitations, and appropriate use conditions — is the minimum documentation standard for NATO-aligned AI products.
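A model card need not be elaborate to be useful; a machine-readable record covering the items above is a reasonable starting point. The sketch below uses a plain dataclass with illustrative field values; it is not an official NATO template or the original Model Cards schema.

```python
from dataclasses import asdict, dataclass, field
import json

@dataclass
class ModelCard:
    """Machine-readable model card covering the documentation items
    procurement evaluators ask about. All field values are illustrative."""
    model_id: str
    version: str
    training_data_provenance: str
    labelling_methodology: str
    test_set_composition: str
    performance_benchmarks: dict = field(default_factory=dict)
    known_limitations: list = field(default_factory=list)
    appropriate_use_conditions: list = field(default_factory=list)

card = ModelCard(
    model_id="imint-classifier",
    version="2.3.1",
    training_data_provenance="Licensed EO satellite imagery, 2019-2023, 4 sensor types",
    labelling_methodology="Dual-analyst labelling with adjudication of disagreements",
    test_set_composition="Held-out scenes, geographically disjoint from training data",
    performance_benchmarks={"accuracy_clear": 0.94, "accuracy_night_ir": 0.81},
    known_limitations=["Degraded performance in dense fog",
                       "Not validated for maritime scenes"],
    appropriate_use_conditions=["Analyst-in-the-loop intelligence triage only"],
)
print(json.dumps(asdict(card), indent=2))
```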
Build human override into the architecture, not as an afterthought. The governability principle means that human override must be an architectural feature rather than a procedural workaround. Systems that require a user to navigate through multiple screens to override an AI recommendation, or that do not clearly distinguish between AI recommendations and validated human decisions, are unlikely to pass NATO procurement evaluation.
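One way to enforce that distinction architecturally is to make it a property of the type system rather than of the user interface: recommendations and validated decisions are different states, override is a single step available at the same level as validation, and the execution layer refuses anything a human has not signed off on. The sketch below is illustrative; all names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Status(Enum):
    AI_RECOMMENDATION = "ai_recommendation"  # advisory only, never actionable
    HUMAN_VALIDATED = "human_validated"
    HUMAN_OVERRIDDEN = "human_overridden"

@dataclass(frozen=True)
class Decision:
    action: str
    status: Status
    operator_id: Optional[str] = None

def validate(rec: Decision, operator_id: str) -> Decision:
    """Single-step human validation: the only path from recommendation
    to an actionable decision."""
    return Decision(rec.action, Status.HUMAN_VALIDATED, operator_id)

def override(rec: Decision, operator_id: str, new_action: str) -> Decision:
    """Single-step override, at the same interface level as validation;
    no multi-screen navigation required."""
    return Decision(new_action, Status.HUMAN_OVERRIDDEN, operator_id)

def execute(decision: Decision) -> None:
    # The execution layer refuses anything a human has not signed off on.
    if decision.status is Status.AI_RECOMMENDATION:
        raise PermissionError("AI recommendations require human validation")
    print(f"executing '{decision.action}' ({decision.status.value}, {decision.operator_id})")

rec = Decision("reroute convoy via R-17", Status.AI_RECOMMENDATION)
execute(override(rec, "operator-7", "hold convoy at waypoint 2"))
```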
AI Governance Frameworks: ALTAI and the EU AI Act
Beyond NATO-specific requirements, defence AI vendors operating in Europe must also navigate the EU's AI governance framework. The EU AI Act, which entered into force in 2024, includes a specific exemption for AI systems used exclusively for military and national security purposes, which means that purely military AI applications are not subject to the Act's conformity assessment requirements. However, dual-use AI systems — those with both civil and military applications — may fall under the Act's high-risk AI category, triggering requirements for conformity assessment, CE marking, and registration in the EU AI database.
The Assessment List for Trustworthy AI (ALTAI), developed by the EU's High-Level Expert Group on AI, provides a self-assessment tool that aligns closely with both NATO's six principles and the EU AI Act's requirements. Using ALTAI as a development checklist creates a documented trail of responsible AI development that is useful in both NATO and EU procurement contexts, making it an efficient single framework for companies operating across both markets.
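One lightweight way to build that documented trail is to record each assessment as a dated entry linking one of ALTAI's seven requirement areas to concrete evidence artefacts such as test reports, model cards, or audit log samples. The sketch below is an illustrative structure of this kind, not the official ALTAI tool; statuses and evidence names are hypothetical.

```python
import datetime
import json

# The seven ALTAI requirement areas, used as top-level checklist headings.
ALTAI_REQUIREMENTS = [
    "Human agency and oversight",
    "Technical robustness and safety",
    "Privacy and data governance",
    "Transparency",
    "Diversity, non-discrimination and fairness",
    "Societal and environmental well-being",
    "Accountability",
]

def assessment_entry(requirement, status, evidence):
    """One dated checklist entry linking a requirement area to concrete
    evidence artefacts (test reports, model cards, audit log samples)."""
    assert requirement in ALTAI_REQUIREMENTS
    return {
        "requirement": requirement,
        "status": status,  # e.g. "satisfied", "partial", "open"
        "evidence": evidence,
        "assessed_on": datetime.date.today().isoformat(),
    }

trail = [
    assessment_entry("Transparency", "satisfied",
                     ["model card v2.3.1", "feature attribution report"]),
    assessment_entry("Human agency and oversight", "partial",
                     ["override interface design doc", "operator validation workflow"]),
]
print(json.dumps(trail, indent=2))
```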