Every defense organization that runs wargames eventually confronts the same decision: do we run a facilitated, human-umpired tabletop session, or do we run the scenario through a software platform that adjudicates outcomes automatically? Framing this as a binary choice (old versus new, manual versus digital) misses what each format actually optimizes for. Kriegsspiel and AI-assisted wargaming are not competing products. They are tools for different problems, and choosing the wrong one for the problem at hand wastes the most constrained resource a military organization has: the time of the people in the room.
This article examines both formats with enough technical specificity to support a genuine procurement or program decision. It covers what each approach does well, where each falls short, how hybrid architectures bridge the gap, and how to match the format to the use case. It draws on the broader literature on wargaming in doctrine development and on current practice in AI-assisted military simulation.
A brief history of Kriegsspiel
The word Kriegsspiel, German for war game, has referred to a specific method since 1811, when Georg von Reisswitz began developing a tabletop simulation system for the Prussian army. His son, the younger Reisswitz, presented a refined version to the Prussian General Staff in 1824. Chief of the General Staff Karl von Müffling's reported reaction, "this is not a game, this is training for war", captures precisely why the format spread rapidly through European and eventually global military establishments.
The core architecture of the original Kriegsspiel remains recognizable in modern tabletop wargaming. A map represents the operational area, typically at a scale that allows meaningful tactical decision-making. Physical pieces represent military units. Players issue orders in writing, which preserves the friction and delay of actual command. A human umpire, the Schiedsrichter, adjudicates the outcomes of combat and movement using dice and reference tables, but retains the authority to override results that violate common sense or contradict operational realities the tables cannot capture.
The umpire is the mechanism that makes Kriegsspiel work as a training instrument. Because the umpire is a subject matter expert who can exercise judgment, the game can handle situations that no rule set anticipates. A player who finds an unexpected route through terrain, improvises a novel use of a support asset, or convinces the umpire that their deception operation would genuinely deceive the adversary — all of these can be accommodated, rewarded, and then discussed in the after-action review. The umpire is, in effect, a human adjudication engine that can reason from first principles.
What manual Kriegsspiel does well
The strengths of manual Kriegsspiel are directly related to what the umpire enables. The most important is the handling of genuine ambiguity. Military decision-making is characterized by incomplete information, conflicting reports, and the need to act before all facts are known. A skilled umpire can simulate this fog of war in ways that feel authentic — withholding information, introducing false reports, allowing players to request intelligence products with varying degrees of reliability. The result is a training environment where participants must make decisions under conditions that closely approximate operational reality.
Manual Kriegsspiel also excels at free-form play. Players are not constrained by a menu of available actions or a predefined action space. A commander can propose any action they can describe in military orders format; the umpire decides what happens. This freedom is training-critical for senior leader development. The point of the exercise is not to practice filling in a software interface — it is to practice the mental process of operational decision-making, command team interaction, and risk tolerance calibration. Any format that constrains the action space constrains the training value.
The deliberation quality in a well-run Kriegsspiel session is also distinctive. Because the game proceeds slowly — umpire adjudication of a single engagement can take several minutes — there is time for discussion, disagreement, and revision of assessments. This is not a bug; it is the core mechanism through which senior leaders internalize lessons. The deliberation is the training, not the obstacle to training.
The bottlenecks are equally clear. Manual Kriegsspiel is slow, requiring hours or days to play through scenarios that AI systems resolve in seconds. It is umpire-dependent in a way that creates inconsistency across events: the same scenario run by two different umpire teams will produce different experiences, sometimes so different that their results cannot be meaningfully compared. And it does not scale: running a dozen simultaneous Kriegsspiel sessions requires a dozen experienced umpire teams, a resource that most organizations cannot sustain.
What AI wargaming adds
AI-assisted wargaming addresses precisely the bottlenecks that limit Kriegsspiel. The most fundamental contribution is consistent adjudication at speed. A software adjudication engine applies the same combat model every time, across every engagement, without fatigue, inconsistency, or the interpersonal dynamics that sometimes cause umpires to soft-pedal outcomes that are uncomfortable for senior participants. When the model says a force is attrited to below combat effectiveness, it is attrited — regardless of rank in the room.
Speed enables a capability that manual wargaming cannot provide: statistical analysis across many runs. A single Kriegsspiel session produces one data point — one narrative of how the engagement unfolded under one set of decisions. An AI wargaming platform can run the same scenario hundreds or thousands of times overnight, varying adversary courses of action, initial conditions, and random seeds, and produce a probability distribution over outcomes. This is not a marginal improvement; it is a qualitatively different analytical capability.
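The many-runs idea can be illustrated with a minimal sketch. The combat model here is a deliberately toy one (each round, one side loses a unit with probability proportional to the opposing side's share of total strength) and is an assumption made for illustration, not the model of any real platform. The names run_once and outcome_distribution are likewise hypothetical.

```python
import random
from collections import Counter

def run_once(blue_strength, red_strength, rng):
    """Toy stochastic engagement (illustrative assumption, not a real combat model).

    Each round exactly one side loses one unit; the chance that blue takes
    the loss is red's share of the total remaining strength.
    """
    blue, red = blue_strength, red_strength
    while blue > 0 and red > 0:
        if rng.random() < red / (blue + red):
            blue -= 1
        else:
            red -= 1
    return "blue_win" if blue > 0 else "red_win"

def outcome_distribution(n_runs, blue_strength, red_strength, base_seed=0):
    """Run the same scenario n_runs times and return outcome frequencies."""
    counts = Counter()
    for i in range(n_runs):
        # One seed per run, so any individual run can be replayed exactly.
        rng = random.Random(base_seed + i)
        counts[run_once(blue_strength, red_strength, rng)] += 1
    return {outcome: n / n_runs for outcome, n in counts.items()}

# Hypothetical starting strengths, chosen only for the example.
dist = outcome_distribution(2000, blue_strength=12, red_strength=10)
print(dist)
```

A single manual session corresponds to one call to run_once; the distribution only appears once the scenario is repeated under varied seeds. Seeding each run separately is one way to keep an interesting or anomalous run reproducible for later review.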
For large-force simulation — brigade, division, and above — manual wargaming becomes increasingly unwieldy. The number of units, the simultaneity of engagements across the operational area, and the logistics complexity are simply beyond what a human umpire team can track in real time. AI-assisted platforms handle these scales naturally, with adjudication engines that simultaneously resolve hundreds of engagements across the map and logistics models that track supply consumption and distribution in parallel with combat.
Staff training at scale is another AI wargaming strength that Kriegsspiel cannot match. When the training audience is a battalion staff that needs to practice the battle rhythm of planning, briefing, and decision-making, you need an engine that produces realistic problems continuously and at a pace that matches the training objectives. A human umpire team managing a battalion-level staff exercise is managing a much higher cognitive load than a company-level Kriegsspiel session — AI adjudication reduces that load and allows the umpire team to focus on observation and coaching rather than mechanics.
Where AI wargaming still struggles
The limitations of AI wargaming are real and are not simply engineering problems awaiting a software update. The deepest limitation is the handling of genuine novelty. An AI adjudication engine is a model of combat based on historical data, doctrine, and engineering judgments about weapon effects and unit capabilities. It adjudicates reliably only within the envelope of what it was trained or configured to handle. A new operational concept (a novel use of an unmanned system, an untested electronic warfare technique, a combined-arms approach with no historical analog) falls outside that envelope, and the engine will either fail silently, producing a result based on the wrong template, or flag the action as one it cannot adjudicate.
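The difference between failing silently and flagging an action can be sketched in a few lines. This is a simplified illustration under stated assumptions: the set of modeled action types, the Adjudication record, and the adjudicate function are all hypothetical, and a real engine's envelope would be far richer than a lookup in a set.

```python
from dataclasses import dataclass

# Hypothetical list of action types the combat model was configured for.
MODELED_ACTIONS = {"direct_fire", "indirect_fire", "movement", "resupply"}

@dataclass
class Adjudication:
    status: str  # "resolved" or "needs_umpire"
    detail: str

def adjudicate(action_type: str, description: str) -> Adjudication:
    """Resolve modeled actions; explicitly flag anything outside the envelope.

    The silent-failure alternative would be to map an unknown action onto the
    nearest known template and return a confident-looking result.
    """
    if action_type not in MODELED_ACTIONS:
        return Adjudication(
            "needs_umpire",
            f"no model for '{action_type}': {description}",
        )
    return Adjudication("resolved", f"resolved '{action_type}' with the standard model")

print(adjudicate("movement", "2nd Battalion advances along the river road"))
print(adjudicate("decoy_drone_swarm", "swarm simulates an armored column"))
```

The design choice worth noting is that the out-of-envelope branch returns a distinct status instead of a best-guess result, so the gap in the model is visible to whoever reviews the run.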
Human creativity and unexpected behavior present a related challenge. Experienced players quickly find the edges of a simulation's model and exploit them — not to game the system maliciously, but because clever military thinking naturally looks for unconventional approaches. If the simulation adjudicates a novel deception operation the same as a conventional frontal approach, it has failed to model a core military capability, and experienced players will lose confidence in the results. A human umpire confronted with the same deception operation can reason about its plausibility and reward or penalize it appropriately.
Strategic ambiguity — the political, coalition, and information environment that shapes military operations — is consistently undermodeled in AI wargaming platforms. Most platforms focus on the operational and tactical levels where combat mechanics are tractable. The strategic context that determines what objectives are achievable, which coalition partners will cooperate, and how political constraints shape military options is typically handled through scenario design and human facilitation, not AI adjudication. A wargame that produces a tactically successful outcome within a strategically impossible scenario has produced misleading results.
Hybrid approaches: the practical state of the art
The most effective current practice for complex wargaming programs is a hybrid architecture that assigns work to the format best suited to it. The division of labor is roughly: AI handles force resolution, humans handle friction.
Concretely, this means the AI adjudication engine owns combat mechanics — attrition, movement rates, logistics consumption, electronic warfare effects, sensor detection — and produces continuous updates to the simulation state. Human umpires own the high-level decision nodes: a commander's decision to commit the reserve, a surprise intelligence report that changes the operational picture, a coalition partner's unexpected refusal to provide fire support. The umpire team monitors the AI output and intervenes when the situation requires human judgment, not at every step.
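One way to picture this division of labor is as a router that sends each event either to the engine or to an umpire queue. The sketch below is an assumption-laden illustration: the event kinds, the HybridAdjudicator class, and its methods are hypothetical names, and the "engine" is reduced to a dictionary update standing in for real combat mechanics.

```python
from dataclasses import dataclass, field

# Hypothetical set of event kinds owned by the AI engine. Everything else,
# including decision nodes and unclassified events, defaults to the umpire.
AI_OWNED = {"attrition", "movement", "logistics", "ew_effect", "sensor_detection"}

@dataclass
class Event:
    kind: str
    payload: dict

@dataclass
class HybridAdjudicator:
    state: dict = field(default_factory=dict)
    umpire_queue: list = field(default_factory=list)

    def submit(self, event: Event) -> str:
        """Route an event and report who owns it ("ai" or "umpire")."""
        if event.kind in AI_OWNED:
            # Stand-in for the engine's continuous state update.
            self.state.update(event.payload)
            return "ai"
        self.umpire_queue.append(event)
        return "umpire"

    def umpire_rule(self, index: int, ruling: dict) -> Event:
        """Apply a human ruling for a queued event and remove it from the queue."""
        event = self.umpire_queue.pop(index)
        self.state.update(ruling)
        return event

game = HybridAdjudicator()
game.submit(Event("attrition", {"red_1bde_strength": 0.72}))
game.submit(Event("commit_reserve", {"unit": "blue_reserve_bn"}))
game.umpire_rule(0, {"blue_reserve_committed": True})
print(game.state)
```

Defaulting unknown event kinds to the umpire queue is a deliberate choice in this sketch: the humans see everything the engine was not explicitly given, and are left out of the routine mechanics.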
This architecture preserves the most valuable function of the human umpire — the ability to inject friction, surprise, and judgment at the moments that matter most for training — while removing the umpire as the bottleneck for routine mechanics. It also allows a smaller, less experienced umpire team to run a more complex exercise, because the team's cognitive resources are concentrated on high-value interventions rather than spread across every combat resolution calculation.
Design principle for hybrid wargaming: Define the boundary between AI and human adjudication in the wargame design document before the event. The boundary should be based on what generates the most training value from human umpire time — not on what is easiest to configure in the software. Routine mechanics belong to the AI; consequential uncertainty belongs to the human.
Use case matching: a practical guide
Senior leader decision-making seminars
For a two-day event with a brigade or division commander and their key staff, the training objective is typically the quality of the decision-making process — how the commander exercises command authority, how the staff interacts under pressure, how risk tolerance is communicated and acted on. This is quintessential Kriegsspiel territory. The format's slowness is an asset, not a liability, because the deliberation is the training. A single carefully designed Kriegsspiel scenario, well umpired, will produce more lasting behavioral change in a senior leader than ten AI wargaming runs.
Staff training and exercise repetition
When the training audience is a battalion or brigade staff and the objective is to build procedural competence — the battle rhythm of plans-to-orders, the staff's ability to process and present intelligence, the speed and accuracy of targeting cycles — AI wargaming is the right format. The staff needs volume: many problems, many decisions, rapid feedback on whether the process produced the right output. AI adjudication provides this at a pace that manual umpiring cannot sustain across a multi-day training cycle.
Concept testing and requirements analysis
When the question is analytical rather than educational (does concept X outperform concept Y across a range of adversary actions?), AI wargaming is the only practical option. Statistical confidence requires many runs, and many runs require automated adjudication. Manual Kriegsspiel cannot produce the sample size needed to support capability development decisions, and any attempt to use it for this purpose produces the appearance of analytical rigor without the substance.
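The sample-size point can be made concrete with a short calculation. The sketch below compares two win rates using a normal-approximation 95% confidence interval for the difference of two proportions. The win counts are hypothetical numbers chosen for illustration, and the function name diff_with_ci is likewise an assumption; a real analysis would likely use a more careful method for small samples.

```python
import math

def diff_with_ci(wins_x, wins_y, n, z=1.96):
    """Difference in win rate (X minus Y) with a normal-approximation interval.

    Assumes both concepts were run n times and that runs are independent.
    """
    px, py = wins_x / n, wins_y / n
    se = math.sqrt(px * (1 - px) / n + py * (1 - py) / n)
    diff = px - py
    return diff, (diff - z * se, diff + z * se)

# Hypothetical results: concept X wins 60% of runs, concept Y wins 50%.
for n, wins_x, wins_y in [(10, 6, 5), (1000, 600, 500)]:
    diff, (low, high) = diff_with_ci(wins_x, wins_y, n)
    verdict = "spans zero" if low < 0 < high else "excludes zero"
    print(f"n={n}: diff={diff:.2f}, 95% CI=({low:.3f}, {high:.3f}) -> {verdict}")
```

With the same observed 10-point gap, the interval at 10 runs per concept spans zero, so the data cannot distinguish the concepts, while at 1,000 runs per concept it excludes zero. Ten manual sessions per concept would already be a large program, which is the practical reason the analytical use case depends on automated adjudication.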
Novel doctrine and emerging capabilities
When the concept being tested is new enough that no existing AI model can adjudicate it confidently, a human-umpired format is the only viable option. This is not a criticism of AI wargaming — it is a structural reality of any model-based system. Attempting to test genuinely novel doctrine in an AI platform that lacks a model for the relevant capabilities produces results that reflect the model's assumptions, not the doctrine's potential. For multi-domain operations wargaming, where cross-domain effects are still poorly understood, this constraint applies to significant portions of the action space.
WARG: built for the hybrid requirement
The hybrid architecture described above — AI adjudication for force resolution, human control for high-value decision nodes — is the design philosophy behind WARG. The platform provides an AI adjudication engine that resolves combat, logistics, and sensor mechanics at speed, while preserving full umpire control over scenario injects, decision node triggers, and adjudication overrides. Players interact through a command interface that does not constrain the action space to a predefined menu; the system accepts orders in structured natural language and routes them to the appropriate resolution mechanism. The result is a platform that scales to large forces and high repetition rates without sacrificing the human judgment that makes wargaming a genuine training instrument rather than a combat animation tool.
WARG is designed to run the hybrid wargames that manual Kriegsspiel cannot scale and that purely automated systems cannot adjudicate faithfully. If you are evaluating wargaming platforms for staff training, concept testing, or senior leader exercises, see what WARG delivers.