Edge AI hardware selection in defense is a constrained optimization problem disguised as a shopping list. The right part is the one that runs the production model fast enough, inside the size, weight, power, and cost envelope of the platform, with a supply chain that survives the program lifecycle, and a software stack the team can actually ship. Most program teams default to "Jetson AGX, we'll figure out the budget later" and discover, six months in, that the payload bay does not have 60 W of cooling. This article walks through the trade-offs across the parts that matter in 2026.
1. The Edge AI Selection Problem
Every edge AI decision starts with three constraints: the platform envelope, the model, and the supply chain. SWaP-C (size, weight, power, and cost, with cooling as the practical limit on power) is set by the host platform and is rarely negotiable. A Group 1 UAS gives you maybe 15 W and 100 g for the entire compute stack. A vehicle-mounted ISR gimbal gives you 50 W and forced air. A rifle-mounted sight gives you 5 W and passive conduction. These numbers gate everything else.
The model determines the floor. A YOLOv8-n object detector at 640×640 needs roughly 8 GFLOPs per inference; at 30 FPS that is 240 GFLOPs/s — well inside a Hailo-8 or a Jetson Orin Nano. Swap to a transformer-based perception model with multi-modal fusion and the budget jumps an order of magnitude. Model optimization can recover a factor of 2-4x, but not 10x — choose hardware that fits the model you actually intend to ship, not the model you wish you had.
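The arithmetic above can be sketched as a quick budget check. This is a rough sketch under stated assumptions: the function names are hypothetical, the `utilization` derating factor (the share of peak throughput a real model sustains) is a placeholder rather than a vendor figure, and one model FLOP is treated as one accelerator op, which glosses over MAC counting and INT8 versus floating-point differences.

```python
def required_gflops_per_s(gflops_per_inference: float, fps: float) -> float:
    """Sustained compute demand for a fixed-rate inference stream."""
    return gflops_per_inference * fps


def fits(required_gflops: float, peak_tops: float, utilization: float = 0.3) -> bool:
    """True if the demand fits inside the usable share of peak throughput.

    utilization is an assumed derating factor, not a measured value.
    """
    usable_gops = peak_tops * 1000.0 * utilization  # 1 TOPS = 1000 GOPS
    return required_gflops <= usable_gops


# Figures from the text: about 8 GFLOPs per inference at 30 FPS.
demand = required_gflops_per_s(8.0, 30.0)
print(demand)  # 240.0

# Hypothetical 20 TOPS part, derated to 30 percent of peak.
print(fits(demand, peak_tops=20.0))
```

The useful habit is to run this check against sustained throughput measured on the target at the target power mode, not against the datasheet peak.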
The supply chain sets the ceiling. Defense programs run 5-10 years. Commercial silicon turns over every 18-24 months. Country-of-origin, ECCN classification, and second-source availability are not procurement paperwork — they are engineering inputs that shape which parts are even allowed on the BOM.
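A back-of-the-envelope sketch makes the mismatch concrete. It only divides the program length by the silicon turnover interval quoted above; the function name is hypothetical, and real end-of-life dates depend on the vendor's published lifecycle commitments.

```python
def generations_during_program(program_years: float, turnover_months: float) -> float:
    """Number of silicon generations that elapse over the program lifetime."""
    return program_years * 12.0 / turnover_months


# Ranges from the text: 5-10 year programs, 18-24 month turnover.
best_case = generations_during_program(5, 24)
worst_case = generations_during_program(10, 18)

print(f"best case:  {best_case:.1f} generations")   # best case:  2.5 generations
print(f"worst case: {worst_case:.1f} generations")  # worst case: 6.7 generations
```

Even in the best case the part selected at program start is more than two generations old by the end of sustainment.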
2. NVIDIA Jetson Orin Family
The Jetson Orin family is the default answer for a reason. Three SKUs cover most defense edge envelopes. Orin Nano (20-40 TOPS INT8, 7-15 W configurable) carries the small-UAS and handheld bracket. Orin NX (70-100 TOPS, 10-25 W) sits in the sweet spot for tactical ISR payloads, ground vehicles, and unmanned surface vessels. Orin AGX (200-275 TOPS, 15-60 W) handles multi-stream, multi-modal workloads — typical use cases include simultaneous EO/IR detection plus tracking plus on-board SLAM.
The decisive argument is the software stack. CUDA, cuDNN, and TensorRT have a decade of model coverage and tooling maturity that no competitor matches. ONNX-to-TensorRT conversion works on almost everything; INT8 calibration is well-understood; DeepStream handles the video pipeline; ROS 2 integration is first-class. For most teams the engineering hours saved by sticking to the NVIDIA stack are worth more than any TOPS/watt deficit.
The downsides are real. Jetson Orin runs hot relative to NPU alternatives — sustained 25 W on Orin NX means real conduction cooling, not a heatsink-and-prayer enclosure. Cost per unit is 3-5x a Hailo-equivalent. And the parts are export-controlled (ECCN 4A003), which adds friction for non-US program partners. NVIDIA's defense-channel lifecycle commitment is solid, but the Jetson roadmap is still tied to NVIDIA's commercial priorities, not yours.
3. Hailo-8 and Hailo-15
Hailo is the TOPS/watt leader and it is not close. Hailo-8 delivers 26 TOPS INT8 at roughly 2.5 W typical — about 10 TOPS/W against Jetson Orin NX's 4-7 TOPS/W in practice. Hailo-15, the SoC variant, integrates a quad-core Arm Cortex-A53, an ISP, and 20 TOPS of NPU in a sub-3 W envelope — purpose-built for smart-camera form factors. For a small tethered drone, a body-worn ISR rig, or a helmet-mounted compute brick, the SWaP math is decisive.
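The efficiency comparison can be reproduced with a few lines. This is a minimal sketch: the `Accelerator` class and `within_power_budget` helper are hypothetical, and pairing the low and high ends of the Orin NX TOPS and power ranges quoted earlier is an assumption about how its power modes line up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Accelerator:
    name: str
    peak_tops: float
    power_w: float

    @property
    def tops_per_watt(self) -> float:
        return self.peak_tops / self.power_w


# Figures quoted in this article; the Orin NX pairings are assumed.
PARTS = [
    Accelerator("Hailo-8 (typical)", 26.0, 2.5),
    Accelerator("Orin NX (low-power mode)", 70.0, 10.0),
    Accelerator("Orin NX (max mode)", 100.0, 25.0),
]


def within_power_budget(parts: List[Accelerator], budget_w: float) -> List[Accelerator]:
    """Keep only the parts whose power draw fits the platform budget."""
    return [p for p in parts if p.power_w <= budget_w]


for part in PARTS:
    print(f"{part.name}: {part.tops_per_watt:.1f} TOPS/W")

# A 5 W envelope, such as the rifle-mounted sight case, leaves one candidate.
print([p.name for p in within_power_budget(PARTS, 5.0)])
```

Filtering on the power budget first and ranking by efficiency second mirrors the order in which the platform actually constrains the choice.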
The workflow is where it gets harder. The Hailo Dataflow Compiler takes an ONNX or TFLite model and emits a Hailo Executable Format (HEF) binary. Quantization to INT8 is mandatory — there is no FP16 fallback. The model zoo is solid for vision (YOLO family, MobileNet, EfficientDet, segmentation backbones) but thin for transformer architectures, custom ops, and anything outside the supported operator list. Expect a real porting effort for non-vanilla models; budget two weeks of engineering minimum for a first port, less for subsequent variants.
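Before committing to a port, it helps to check operator coverage up front. The sketch below is illustrative only: the `SUPPORTED` set is a hypothetical allow-list, not Hailo's actual operator list, and the operator names are typed in by hand. In practice the names could be read from each node's `op_type` field in the ONNX graph, and the allow-list would come from the vendor's documentation.

```python
from typing import Iterable, List


def unsupported_ops(model_ops: Iterable[str], supported_ops: Iterable[str]) -> List[str]:
    """Return the operator types used by the model that the target does not list."""
    return sorted(set(model_ops) - set(supported_ops))


# Hypothetical allow-list for illustration; not any vendor's real list.
SUPPORTED = {"Conv", "Relu", "MaxPool", "Concat", "Resize", "Add", "Sigmoid", "Mul"}

cnn_ops = ["Conv", "Relu", "Conv", "MaxPool", "Concat", "Resize"]
transformer_ops = ["MatMul", "Softmax", "LayerNormalization", "Add", "Gather"]

print(unsupported_ops(cnn_ops, SUPPORTED))
# []
print(unsupported_ops(transformer_ops, SUPPORTED))
# ['Gather', 'LayerNormalization', 'MatMul', 'Softmax']
```

An empty result does not guarantee a clean compile, but a non-empty one is an early, cheap signal that the port will need graph surgery.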
Country-of-origin is Israel, generally acceptable for NATO programs but worth confirming with your contracts team. Hailo's defense traction is growing — the parts ship in counter-UAS systems, ISR drones, and a handful of NATO-adjacent programs. The lifecycle commitment is shorter than NVIDIA's but improving.
4. Google Coral (Edge TPU)
The Coral Edge TPU was the original quantization-first edge accelerator: 4 TOPS INT8 at about 2 W, available in M.2, Mini PCIe, USB, and Dev Board form factors. For lightweight INT8 vision (MobileNet-class detectors, small classifiers) on a power-constrained platform, Coral still delivers. The TFLite tooling is mature, INT8 calibration is well-documented, and the parts are cheap.
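For readers new to quantization-first parts, the core idea of INT8 calibration can be sketched in plain Python. This shows generic symmetric per-tensor quantization with a max-abs scale, which is one common scheme and not necessarily what the Edge TPU toolchain does internally; the function names are hypothetical.

```python
from typing import Sequence


def calibrate_scale(samples: Sequence[float], qmax: int = 127) -> float:
    """Pick a scale so the largest calibration magnitude maps to qmax."""
    max_abs = max(abs(x) for x in samples)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(x: float, scale: float, qmax: int = 127) -> int:
    """Map a float to a signed 8-bit level, clamping out-of-range values."""
    return max(-qmax, min(qmax, round(x / scale)))


def dequantize(q: int, scale: float) -> float:
    """Map a quantized level back to an approximate float."""
    return q * scale


calibration = [-1.0, 0.5, 2.0]
scale = calibrate_scale(calibration)

# 10.0 lies outside the calibrated range and saturates at the top level.
for x in calibration + [10.0]:
    q = quantize(x, scale)
    print(x, q, round(dequantize(q, scale), 4))
```

Values inside the calibrated range come back with an error of at most half a quantization step; values outside it saturate, which is why the calibration set has to be representative of deployment data.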
The problem for defense is the supply-chain question. Coral parts are EAR99 and US-origin (TSMC fab), which is fine on paper. But Google has communicated no clear successor roadmap, the commercial channel is the only source, and the defense-program lifecycle commitment is effectively absent. Coral is acceptable for prototypes, training rigs, and non-mission-critical roles where a parts swap mid-program is tolerable. For production defense programs with 5+ year sustainment, planning a Hailo or Jetson alternative is the safer call.
The model coverage is also narrower than it looks. Coral handles INT8 CNN architectures from a curated list well; anything outside that list — transformers, custom ops, dynamic shapes — requires significant restructuring or just will not compile.
5. Qualcomm RB6 / Snapdragon Compute
The Qualcomm RB6 dev kit (QRB5165 SoC plus 5G modem) and the broader Snapdragon Compute line target a different problem: integrated AI plus cellular connectivity on a single SoC. The QRB5165 delivers about 15 TOPS across CPU, GPU, DSP, and Hexagon NPU at roughly 5-7 W, plus an integrated X55 5G modem and the full Qualcomm ISP stack.
The use case is the cellular-edge sensor: a body-worn or vehicle-mounted node that runs local AI inference and streams compressed metadata over 5G/LTE back to a command node. The integrated modem saves a discrete cellular module, a PCB layer, and 1-2 W of power — meaningful at the bottom of the SWaP envelope.
The downsides are software and licensing. The Qualcomm AI Engine SDK (formerly SNPE) is less mature than TensorRT or Hailo's toolchain, with thinner ONNX coverage and a steeper learning curve. The IP licensing terms around Qualcomm modems carry restrictions some defense customers find awkward. And the parts have a commercial-grade lifecycle commitment, not a defense-grade one. For programs where integrated cellular is the deciding factor, RB6 is the right answer; for everything else, Jetson or Hailo wins on tooling alone.
6. FPGA Alternatives
For a specific class of defense workloads, GPUs and NPUs are the wrong answer entirely. AMD (formerly Xilinx) Versal AI Edge (VE2302 through VE2802) combines hard AI Engines, programmable logic, and Arm cores on a single die, giving usable AI throughput in the 50-200 TOPS range plus tight integration with custom DSP front ends. Altera (formerly Intel) Stratix 10 NX targets the high end with tensor blocks integrated into the FPGA fabric.
FPGAs win when three things are true: (1) the workload demands deterministic sub-millisecond latency, (2) custom pre/post-processing — radar signal conditioning, EW front-end, custom sensor fusion — needs to sit on the same die as the AI block, and (3) the program lifecycle is long enough to amortize the development cost. Typical fits are radar signal processing, electronic warfare receivers, missile seekers, and any system where the AI is a stage in a tight DSP pipeline rather than the whole show.
The cost is real. FPGA development is 3-5x the engineering hours of a GPU equivalent. The toolchain (Vitis AI, Quartus Prime) has a steep curve. Headcount with HLS plus AI experience is genuinely scarce. For a program where the FPGA is justified, this cost is recoverable across the lifecycle; for a program that drifted into FPGA territory by accident, it is a budget killer.
7. Supply Chain and ITAR Considerations
Country-of-origin is the first BOM filter. Jetson Orin and Coral are US-origin. Hailo is Israeli. Qualcomm SoCs are US-designed, mixed-fab. Versal and Stratix are US-origin. For ITAR-controlled platforms the calculus is straightforward — favor US or trusted-ally parts and document the determination. For EAR-only programs the door is wider, but the export classification (EAR99 vs CCL entries like 4A003 for high-performance compute) still drives licensing and end-use restrictions.
Second-source planning is non-negotiable. The brutal commercial reality is that any single accelerator family will end-of-life inside the program lifecycle. The mitigations are layered: keep the model pipeline ONNX-first so a port to a successor part is a runtime swap rather than a rewrite; isolate vendor-specific code (TensorRT, Hailo Runtime, Qualcomm AI Engine) behind a thin abstraction; execute last-time-buy reserves at end-of-life notification; and validate at least one alternate accelerator in parallel during development. Programs that skip this discipline pay for it in year 4.
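The thin abstraction mentioned above might look like the following. This is a minimal sketch, not a reference design: `InferenceBackend`, `ReferenceBackend`, `register`, and `create_backend` are hypothetical names, and the stand-in backend simply echoes its input where a real one would wrap a vendor runtime.

```python
from typing import Callable, Dict, List, Protocol, Sequence


class InferenceBackend(Protocol):
    """Minimal interface the application codes against (hypothetical)."""

    def load(self, model_path: str) -> None: ...

    def infer(self, inputs: Sequence[float]) -> List[float]: ...


class ReferenceBackend:
    """Stand-in backend for tests; a real one would wrap a vendor runtime."""

    def __init__(self) -> None:
        self.model_path = ""

    def load(self, model_path: str) -> None:
        self.model_path = model_path

    def infer(self, inputs: Sequence[float]) -> List[float]:
        # Identity "model": echoes inputs so the plumbing can be exercised.
        return list(inputs)


_REGISTRY: Dict[str, Callable[[], InferenceBackend]] = {}


def register(name: str, factory: Callable[[], InferenceBackend]) -> None:
    """Make a backend selectable by name, for example from a config file."""
    _REGISTRY[name] = factory


def create_backend(name: str, model_path: str) -> InferenceBackend:
    """Instantiate the named backend and load the model into it."""
    if name not in _REGISTRY:
        raise ValueError(f"unknown backend {name!r}; known: {sorted(_REGISTRY)}")
    backend = _REGISTRY[name]()
    backend.load(model_path)
    return backend


register("reference", ReferenceBackend)

backend = create_backend("reference", "detector.onnx")
print(backend.infer([0.1, 0.9]))  # [0.1, 0.9]
```

With this shape, moving to a successor part means registering one new backend class and changing a configuration string; the application code that calls `infer` does not change.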
ITAR-free pipelines matter for export sales. A system built entirely from EAR99 parts plus open-source models can be sold under far lighter restrictions than one that pulls in CCL accelerators or US-origin defense IP. For multi-national NATO programs and FMS-adjacent sales, an ITAR-free configuration as a deliverable variant — not the only variant, but one of them — opens markets that an ITAR-locked stack closes. The defense AI landscape rewards architectural flexibility here.
8. Ruggedization and Lifecycle
Commercial developer kits are not deployable hardware. The Jetson Orin Nano Dev Kit is great for prototyping; it dies the first time you ship it in a vehicle. Production ruggedization adds a layer of engineering most teams underestimate.
Operating temperature is the headline spec. Military edge hardware typically requires -40 to +71 °C operation, conduction-cooled enclosures, and validated sustained (not peak) performance at the upper end of that range. The Jetson Orin NX has an industrial-temp variant for this reason; the commercial Coral does not. Vibration and shock follow MIL-STD-810 profiles: connectors must lock, solder joints must not crack, and rotating storage and friction-fit removable media are out (NVMe with proper underfill, not hard drives or SD cards). EMI/EMC certification (typically MIL-STD-461) gates platform integration; the accelerator board, the carrier board, and the enclosure all participate.
The conduction-cooled enclosure is where commercial parts meet defense reality. A Jetson Orin AGX at 50 W needs real thermal mass and a real conduction path to a chassis baseplate. A Hailo-8 at 2.5 W can be cooled with a thermal pad to the enclosure wall. The accelerator choice and the mechanical design are coupled — choose them together, not sequentially.
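The coupling between accelerator choice and mechanical design shows up in a simple steady-state estimate, junction temperature = ambient + power × thermal resistance. The sketch below is a first-order model only: the 95 °C device limit is an assumed placeholder (check the actual datasheet), the function names are hypothetical, and real designs need margin for transients and hot spots.

```python
def junction_temp_c(ambient_c: float, power_w: float, r_theta_c_per_w: float) -> float:
    """Steady-state junction temperature for a given thermal path."""
    return ambient_c + power_w * r_theta_c_per_w


def max_thermal_resistance(ambient_c: float, power_w: float, t_max_c: float) -> float:
    """Largest junction-to-ambient resistance (in C/W) that stays under t_max_c."""
    return (t_max_c - ambient_c) / power_w


AMBIENT_C = 71.0  # upper operating limit cited in the text
T_MAX_C = 95.0    # assumed device limit, for illustration only

for name, power_w in [("Jetson Orin AGX", 50.0), ("Hailo-8", 2.5)]:
    r_max = max_thermal_resistance(AMBIENT_C, power_w, T_MAX_C)
    print(f"{name}: {power_w} W needs R_theta <= {r_max:.2f} C/W")
# Jetson Orin AGX: 50.0 W needs R_theta <= 0.48 C/W
# Hailo-8: 2.5 W needs R_theta <= 9.60 C/W
```

Under these assumptions the 50 W part needs a thermal path about twenty times better than the 2.5 W part, which is the difference between a machined conduction path to the baseplate and a thermal pad against the enclosure wall.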
Key insight: Lifecycle is the silent killer of edge AI defense programs. Commercial accelerator families turn over every 18-24 months; defense programs need 5-10 years of sustained supply. The mitigation is not "buy more inventory"; it is architectural: ONNX-first model pipelines, vendor-abstracted runtime layers, and a validated second-source accelerator on standby. Side-by-side benchmarks are useful inputs, but the lifecycle plan is what determines whether the program ships in year 5.
The summary is unromantic: there is no single right answer. Jetson Orin wins on tooling and breadth; Hailo wins on TOPS/watt and thermal headroom; Coral fills a narrow INT8 niche; Qualcomm RB6 wins when cellular integration matters; FPGAs win for deterministic, tightly-coupled signal-chain workloads. The right call is the part that survives a five-year program, runs a model you can actually ship, and matches the headcount and tools the engineering team has to support it.