Tactical operators lose between 35 and 40 percent of their common operational picture (COP) management time to menu navigation. This is not a design failure unique to any one application; it is a structural consequence of trying to map the full operational information space onto a touchscreen interface that was designed for fingers, not gloves, and for calm, not contact. ATAK, WinTAK, and browser-based equivalents such as CloudTAK are powerful tools. They are also deeply menu-driven, and every layer of that menu tree adds cognitive load at exactly the moment the operator has the least bandwidth to spare.

The hypothesis behind TAKpilot is straightforward: the fastest interface to a complex system is the one the operator already knows how to use — language. If a squad leader can say "place a marker at grid 37U DP 12345 67890 with callsign ALPHA-3" to a radio operator and have it appear on the map in ten seconds, then a system that accepts the same phrase as a chat message and does the same thing in two seconds is a meaningful capability improvement. TAKpilot is that system: an AI copilot for CloudTAK that translates natural language into TAK API operations using LLM function calling.

The menu-navigation problem in tactical software

Consider the typical workflow for placing an enemy contact marker in ATAK: open the map long-press menu, select "New Point", navigate to the CoT type selector, choose the hostile unit type from a hierarchical MIL-STD-2525 tree, enter the grid coordinates in the coordinate entry dialog (switching between MGRS, decimal degrees, and DMS depending on what the operator memorized), add callsign and remarks, tap confirm. If the operator's fingers are cold, if they are under fire, if they are in a vehicle on a dirt road, each of those taps carries a real error rate. Misplace a contact by one grid square and a fire mission goes wrong.

Time-and-motion studies from training exercises consistently show that experienced ATAK operators spend 35–40% of their map-management time on UI navigation rather than decision-making. The remaining 60–65% is split between reading the picture, communicating on radio, and updating their own position data. The navigation overhead is not a small rounding error — it is more than a third of the cognitive budget that could instead go to situational awareness.

Natural language does not eliminate the need for precision — "grid 37U DP 12345 67890" still requires the operator to know the grid — but it eliminates the navigation overhead entirely. The operator speaks (or types) the action; the system executes it. The cognitive path from "I need to place this contact" to "the contact is on the map" is one step instead of seven.

Chat-native architecture: LLM function calling as a TAK API layer

TAKpilot's core architecture is built on LLM function calling — the capability of modern large language models to not only generate text, but to select and parameterize structured function calls from a defined tool library. Each TAK API operation exposed by CloudTAK is wrapped in a tool definition: a JSON schema specifying the function name, description, and typed parameters with validation constraints.

A representative tool definition for marker placement looks like this:

{
  "name": "place_marker",
  "description": "Place a point-of-interest marker on the CloudTAK map at a specified grid coordinate.",
  "parameters": {
    "type": "object",
    "properties": {
      "mgrs": {
        "type": "string",
        "description": "MGRS grid reference, e.g. '37U DP 12345 67890'"
      },
      "callsign": {
        "type": "string",
        "description": "Callsign or label for the marker"
      },
      "cot_type": {
        "type": "string",
        "description": "MIL-STD-2525C CoT type string, e.g. 'a-h-G-U-C' for hostile ground combat"
      },
      "remarks": { "type": "string" }
    },
    "required": ["mgrs", "callsign", "cot_type"]
  }
}

When the operator sends "place a hostile infantry contact at 37U DP 12345 67890, callsign CONTACT-7", the model receives the message plus the full tool library and selects place_marker with the appropriate parameters — including resolving the natural-language "hostile infantry" to the correct CoT type string. TAKpilot executes the function call, the marker appears on the map, and the operator sees a collapsible tool-call card in the chat showing the function name, input parameters, execution time in milliseconds, and the HTTP response status from CloudTAK.
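Before a model-selected call is executed, its arguments can be checked against the tool definition. The following is a minimal sketch, assuming the model's output arrives as a dict with "name" and "arguments" keys. The check_arguments helper, the simplified type map, and the 'a-h-G-U-C-I' infantry type string are illustrative assumptions, not TAKpilot's actual code.

```python
# Hypothetical, simplified mirror of the place_marker tool definition.
PLACE_MARKER = {
    "name": "place_marker",
    "properties": {"mgrs": str, "callsign": str, "cot_type": str, "remarks": str},
    "required": ["mgrs", "callsign", "cot_type"],
}


def check_arguments(tool, arguments):
    """Return a list of problems; an empty list means the call is well-formed."""
    problems = []
    for name in tool["required"]:
        if name not in arguments:
            problems.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        expected = tool["properties"].get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
        elif not isinstance(value, expected):
            problems.append(f"{name} must be of type {expected.__name__}")
    return problems


# Assumed shape of the model-selected call for the example message.
call = {
    "name": "place_marker",
    "arguments": {
        "mgrs": "37U DP 12345 67890",
        "callsign": "CONTACT-7",
        "cot_type": "a-h-G-U-C-I",  # assumed type string for hostile infantry
    },
}
print(check_arguments(PLACE_MARKER, call["arguments"]))  # []
```

A check like this only confirms structure. Whether the grid and type string are plausible is a separate question for a stricter validator.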

The full TAKpilot tool library covers the major CloudTAK operational verbs: place and update markers, create and close missions, list active tracks with an optional sector filter, subscribe to and unsubscribe from data channels, create data packages, and query unit status. Complex multi-step operations, such as "create a CAS mission for BRAVO-7 at grid 37U DP 98765 43210 and notify all units in channel ALPHA", are handled by the model chaining multiple tool calls in sequence, with each intermediate result visible in the chat as a separate tool-call card.
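The card-per-call behavior can be sketched as a small loop. This is an illustration under stated assumptions: in a real function-calling loop the model would choose each next call after seeing the previous result, whereas here the sequence is fixed for brevity. The run_chain helper, the notify_channel tool name, and the execute callback (assumed to return an HTTP status code) are hypothetical.

```python
import time


def run_chain(calls, execute):
    """Run tool calls in order, recording one card per call; stop on first failure."""
    cards = []
    for call in calls:
        started = time.perf_counter()
        status = execute(call)  # assumed to return an HTTP status code
        elapsed_ms = round((time.perf_counter() - started) * 1000, 1)
        cards.append({
            "tool": call["name"],
            "input": call["arguments"],
            "ms": elapsed_ms,
            "status": status,
        })
        if status >= 400:
            break  # later steps depend on this one, so do not continue
    return cards


# Stand-in for the real HTTP layer: every call succeeds.
chain = [
    {"name": "create_mission", "arguments": {"callsign": "BRAVO-7"}},
    {"name": "notify_channel", "arguments": {"channel": "ALPHA"}},
]
for card in run_chain(chain, lambda call: 200):
    print(card["tool"], card["status"])
```

Stopping at the first failed step keeps a half-finished chain visible in the chat instead of letting later calls run against a mission that was never created.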

Vision pipeline: from SITREP sketch to map placement

A significant portion of tactical information still arrives as images: hand-drawn sketches photographed on a phone, scanned situation reports, PDF overlays transmitted over email or messaging apps. TAKpilot's vision pipeline processes these inputs and converts them into structured map objects through a three-stage chain.

Stage 1 — Entity extraction. The image (PNG, JPG, or PDF converted to high-resolution PNG) is passed to a vision-capable model with a structured extraction prompt. The model identifies every map-relevant entity in the image: grid references, callsigns, unit type symbols (using MIL-STD-2525 or APP-6 recognition), bearing lines, phase lines, and free-text annotations. Output is a JSON array of entities with type, extracted text, and confidence score.

Stage 2 — Chain-of-thought confirmation. TAKpilot generates a CoT (chain-of-thought, not Cursor on Target) message presenting each extracted entity to the operator: "I found 4 entities in your SITREP. Entity 1: hostile mechanized infantry platoon at 37U DP 12345 67890 (confidence: 0.94). Entity 2: friendly observation post at 37U DP 11111 22222 (confidence: 0.88)…" Each entity is rendered with its proposed NATO symbology icon and the exact grid that will be used for placement. The operator reviews the list and either confirms all, selects individual items, or corrects misread grids before anything touches the map.

Stage 3 — Map placement. On operator confirmation, TAKpilot issues the corresponding place_marker or create_mission tool calls for each confirmed entity, batched and parallelized for speed. A ten-entity SITREP that would take four to six minutes to manually enter in ATAK is processed and placed in under thirty seconds.
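One way to parallelize the placement step is a thread pool, since each placement is an independent HTTP request. The sketch below assumes a place callable that performs one request and returns its result; both place_all and the stub are hypothetical stand-ins, not TAKpilot's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def place_all(entities, place, max_workers=8):
    """Call place(entity) concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(place, entities))


# Stub that pretends to place a marker and echoes the callsign.
confirmed = [{"callsign": "CONTACT-1"}, {"callsign": "OP-2"}, {"callsign": "LOG-3"}]
print(place_all(confirmed, lambda entity: entity["callsign"]))
# ['CONTACT-1', 'OP-2', 'LOG-3']
```

Because pool.map preserves input order, the results can be matched back to the confirmed entities without extra bookkeeping.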

The vision pipeline degrades gracefully for low-quality inputs: if confidence on an entity falls below 0.70, TAKpilot explicitly flags it as uncertain and asks the operator to verify the grid before confirming placement. It does not silently place low-confidence entities.
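The threshold rule can be expressed in a few lines. This is a sketch assuming each extracted entity is a dict with a "confidence" field, as in the Stage 1 output; the triage function and the field names are illustrative.

```python
CONFIDENCE_FLOOR = 0.70  # threshold named in the text


def triage(entities, floor=CONFIDENCE_FLOOR):
    """Split entities into ready-to-confirm and must-verify lists."""
    ready, uncertain = [], []
    for entity in entities:
        if entity["confidence"] < floor:
            uncertain.append(entity)
        else:
            ready.append(entity)
    return ready, uncertain


extracted = [
    {"text": "37U DP 12345 67890", "confidence": 0.94},
    {"text": "37U DP 11111 22222", "confidence": 0.88},
    {"text": "37U DP 3?456 00000", "confidence": 0.61},
]
ready, uncertain = triage(extracted)
print(len(ready), len(uncertain))  # 2 1
```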

Key insight: The approval gate before map placement is not a UX nicety; it is a hard safety requirement. A vision model that misreads "37U DP 12345 67890" as "37U DP 23456 67890" shifts the easting by 11,111 m and places a friendly unit about 11.1 km from where it actually is. In a CAS scenario, that error is mission-critical. The confirmation step turns a potential false placement into a detected and corrected one.
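A misread digit maps directly to a distance on the ground, because within one 100 km square the five-digit easting and northing are metres. The sketch below computes that offset for two references in the same square. It is a simplified illustration (space-separated input, five-digit precision only), and the helper names are hypothetical.

```python
import math


def parse_mgrs(ref):
    """Split a space-separated MGRS string with 5-digit easting and northing."""
    zone, square, easting, northing = ref.split()
    return zone, square, int(easting), int(northing)


def offset_m(a, b):
    """Grid distance in metres between two references in the same 100 km square."""
    zone_a, square_a, e_a, n_a = parse_mgrs(a)
    zone_b, square_b, e_b, n_b = parse_mgrs(b)
    if (zone_a, square_a) != (zone_b, square_b):
        raise ValueError("references are in different 100 km squares")
    return math.hypot(e_b - e_a, n_b - n_a)


print(offset_m("37U DP 12345 67890", "37U DP 23456 67890"))  # 11111.0
```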

Approval gating for destructive operations

TAKpilot distinguishes between two classes of operations: additive (place marker, create mission, subscribe channel) and destructive (delete mission, remove track, clear data package, unsubscribe all channels). Additive operations execute immediately after the model selects them — the operator can see the result in the chat card and undo it with a follow-up "remove the marker I just placed". Destructive operations are gated behind an explicit confirmation step.

The approval gate design addresses a specific failure mode: the operator sends an ambiguous message like "clean up the mission list" or "remove old contacts", and the model, interpreting it plausibly, generates a batch of delete_mission calls. Without a gate, those deletes execute and the data is gone, because TAK Server has no built-in undo. With the gate, the operator sees a confirmation prompt: "I am about to delete 7 missions. Here are the affected records:" followed by a rendered list with NATO symbology icons, mission names, assigned callsigns, and last-modified timestamps. The operator must type "confirm" or click the explicit confirm button before execution proceeds.
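The gate itself is a small piece of logic: destructive calls are held back and released only on an explicit confirmation. The following is a minimal sketch. Apart from delete_mission, the tool names are hypothetical labels for the destructive operations listed earlier, and ApprovalGate is an illustrative class, not TAKpilot's actual implementation.

```python
# Tool names other than delete_mission are hypothetical labels.
DESTRUCTIVE = {
    "delete_mission",
    "remove_track",
    "clear_data_package",
    "unsubscribe_all_channels",
}


def split_calls(calls):
    """Separate calls that may run immediately from calls that need approval."""
    immediate = [c for c in calls if c["name"] not in DESTRUCTIVE]
    gated = [c for c in calls if c["name"] in DESTRUCTIVE]
    return immediate, gated


class ApprovalGate:
    """Holds gated calls until the operator answers with an explicit 'confirm'."""

    def __init__(self, calls):
        self.calls = list(calls)

    def prompt(self):
        return f"I am about to run {len(self.calls)} destructive operation(s)."

    def resolve(self, operator_reply):
        # Anything other than the literal word 'confirm' releases nothing.
        if operator_reply.strip().lower() == "confirm":
            return self.calls
        return []


batch = [{"name": "place_marker"}, {"name": "delete_mission"}, {"name": "delete_mission"}]
immediate, gated = split_calls(batch)
gate = ApprovalGate(gated)
print(gate.prompt())
print(len(gate.resolve("ok")), len(gate.resolve("confirm")))  # 0 2
```

Treating every reply other than "confirm" as a refusal means an ambiguous answer can never trigger a delete.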

The rendering of NATO MIL-STD-2525/APP-6 symbology in the confirmation prompt is deliberate: operators recognize their map symbols faster than they parse text. A confirmation that shows the SFGPU (friendly ground unit) icon next to "3rd Platoon, CHARLIE Company, assigned: BRAVO-7" is processed faster and with lower error rate than a plain-text list. TAKpilot renders the relevant symbol SVGs inline in the confirmation card using the same symbol set CloudTAK uses on the map itself.

Model selection across operating environments

TAKpilot supports multiple model backends and the active model is configurable per session. The selection is driven primarily by connectivity and latency requirements rather than capability preferences.

HQ and rear-echelon nodes with internet access use Claude Sonnet via the Anthropic API. Sonnet provides the best balance of reasoning quality, function-call accuracy, and latency for operational use — it correctly resolves natural-language unit descriptions to CoT type strings in over 97% of cases in testing, and handles multi-step mission-creation requests with reliable tool-call chaining. Claude Opus is available for complex SITREP vision processing where maximum extraction accuracy justifies higher latency and token cost.

Forward-edge nodes with intermittent or no internet connectivity use locally hosted models. TAKpilot's edge model stack supports Llama 3 8B quantized (Q4_K_M, approximately 5 GB of model weights), Qwen 2.5 7B, and Mistral 7B Instruct. These run on NVIDIA Jetson AGX Orin, tactical laptops with a discrete GPU, or any x86 system with at least 8 GB VRAM. The local model handles natural-language parsing and function-call generation; TAK API calls go to a local CloudTAK instance on the same LAN segment. The full TAKpilot stack (CloudTAK, TAK Server, and local inference) can operate with zero external network dependencies.

Edge model accuracy for function-call generation is lower than Claude Sonnet's: approximately 89% correct tool selection for straightforward commands, dropping to 78% for complex multi-step operations. TAKpilot compensates with a stricter validation layer: if the local model generates a tool call with an invalid CoT type string or an out-of-bounds grid reference, the validator rejects it and prompts the model to retry before presenting the action to the operator. This catches most structural errors before they reach the confirmation gate.
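A validate-and-retry loop of this kind might look like the sketch below. The regular expressions are deliberately simplified assumptions (a subset of CoT affiliations, space-separated MGRS with five-digit precision) and do not represent TAKpilot's actual rules. The generate callable is a hypothetical stand-in for the local model, which receives the previous attempt's errors as feedback.

```python
import re

# Simplified patterns for illustration only.
COT_TYPE = re.compile(r"^a-[fhnu]-[A-Z](-[A-Za-z])*$")
MGRS = re.compile(r"^(\d{1,2})([C-HJ-NP-X]) ([A-HJ-NP-Z]{2}) (\d{5}) (\d{5})$")


def validate(call):
    """Return a list of structural errors in a place_marker-style call."""
    errors = []
    args = call.get("arguments", {})
    if not COT_TYPE.match(str(args.get("cot_type", ""))):
        errors.append("invalid cot_type")
    match = MGRS.match(str(args.get("mgrs", "")))
    if match is None or not 1 <= int(match.group(1)) <= 60:
        errors.append("invalid or out-of-bounds mgrs")
    return errors


def generate_validated(generate, max_attempts=3):
    """Ask the model for a call, feeding back errors until it validates."""
    feedback = []
    for _ in range(max_attempts):
        call = generate(feedback)
        feedback = validate(call)
        if not feedback:
            return call
    return None  # give up and surface the failure to the operator


good = {"arguments": {"mgrs": "37U DP 12345 67890", "cot_type": "a-h-G-U-C"}}
bad = {"arguments": {"mgrs": "99U DP 12345 67890", "cot_type": "hostile"}}
print(validate(good), validate(bad))
```

Capping the retries matters on edge hardware: a model that keeps producing the same malformed call should fail visibly, not loop.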

Claude Haiku is available as a middle tier — cloud-hosted, lower cost and latency than Sonnet, higher accuracy than local models — for nodes that have limited but reliable internet connectivity (VSAT, tactical satcom).
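The tiering described in this section reduces to a lookup on connectivity. The sketch below shows one possible default, assuming four connectivity classes. The class names and the returned tier labels are illustrative and are not real API model identifiers; the operator can still override the choice per session.

```python
def default_backend(connectivity, vision_heavy=False):
    """Map a connectivity class to a model tier (labels are illustrative)."""
    if connectivity == "full":
        # Opus only where SITREP vision accuracy justifies latency and cost.
        return "claude-opus" if vision_heavy else "claude-sonnet"
    if connectivity == "limited-reliable":  # e.g. VSAT, tactical satcom
        return "claude-haiku"
    if connectivity in ("intermittent", "none"):
        return "local"
    raise ValueError(f"unknown connectivity class: {connectivity}")


print(default_backend("full"))              # claude-sonnet
print(default_backend("limited-reliable"))  # claude-haiku
print(default_backend("none"))              # local
```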

Security model: session sandboxing and identity attribution

Every TAKpilot session is bound to the operator's CloudTAK authentication token, which is passed to the TAKpilot Node.js service at session initialization and used for all downstream TAK API calls. The LLM agent never has direct database access — it generates function calls, and TAKpilot's API execution layer uses the operator's token to make the actual CloudTAK HTTP requests. All operations are gated by the same RBAC policies that apply to direct CloudTAK use: an operator who cannot delete missions through the CloudTAK UI cannot delete missions through TAKpilot either.

Operator attribution is preserved end-to-end through the audit log. Where a direct CloudTAK action logs "user: sgt_kovalenko — action: create_mission", a TAKpilot-mediated action logs "user: sgt_kovalenko via TAKpilot — action: create_mission — nlp_input: 'create a logistics mission for sector BRAVO'". This maintains the forensic integrity of the operational audit trail and allows after-action review to distinguish AI-assisted actions from direct UI actions.

Uploaded files (SITREP images, sketches, PDF overlays) are processed in a per-session temporary directory and deleted immediately after the vision pipeline returns its structured output. File content is never persisted to disk beyond the processing session, never stored in the conversation context sent to the LLM (only the structured extraction result is included), and never transmitted to the model provider as a stored file. The model sees the raw image once, and then the image is gone.
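The delete-after-processing guarantee maps naturally onto a scoped temporary directory. The following is a sketch of the pattern, not TAKpilot's code (the service itself is Node.js). The process_upload function and the extract callback, which stands in for the vision pipeline, are hypothetical.

```python
import tempfile
from pathlib import Path


def process_upload(data, filename, extract):
    """Write the upload to a scratch directory, extract, then remove everything."""
    with tempfile.TemporaryDirectory(prefix="takpilot-") as workdir:
        # Keep only the base name so a crafted filename cannot escape workdir.
        path = Path(workdir) / Path(filename).name
        path.write_bytes(data)
        result = extract(path)
    # The directory and the file are gone here; only structured output survives.
    return result


entities = process_upload(
    b"\x89PNG placeholder bytes",
    "sitrep.png",
    lambda path: [{"type": "grid", "text": "37U DP 12345 67890", "confidence": 0.94}],
)
print(entities[0]["text"])  # 37U DP 12345 67890
```

Only the return value of extract leaves the function, which mirrors the rule that the conversation context carries the structured result and never the raw file.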

Session isolation is enforced at the Node.js service layer: each WebSocket connection gets its own session context, and session state — conversation history, uploaded files, pending confirmation gates — is stored in memory keyed to the session ID. There is no shared mutable state between concurrent operator sessions.
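The isolation property can be illustrated with an in-memory store keyed by session ID, where every session gets its own freshly created containers. This is a sketch of the idea in Python; the Session and SessionStore names and fields are hypothetical, and the real service is Node.js.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    token: str  # operator's CloudTAK auth token
    # default_factory creates new lists per session, so nothing is shared.
    history: list = field(default_factory=list)
    pending_gates: list = field(default_factory=list)


class SessionStore:
    """In-memory session state, one entry per connection."""

    def __init__(self):
        self._sessions = {}

    def open(self, session_id, token):
        if session_id in self._sessions:
            raise ValueError("session already exists")
        self._sessions[session_id] = Session(token=token)
        return self._sessions[session_id]

    def get(self, session_id):
        return self._sessions.get(session_id)

    def close(self, session_id):
        # Dropping the entry discards history and pending gates together.
        self._sessions.pop(session_id, None)


store = SessionStore()
alpha = store.open("ws-1", "token-a")
bravo = store.open("ws-2", "token-b")
alpha.history.append("place a marker at 37U DP 12345 67890")
print(len(alpha.history), len(bravo.history))  # 1 0
```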

Integration with existing TAK infrastructure

TAKpilot was designed to add capability without requiring infrastructure changes. It runs as a Node.js service alongside an existing CloudTAK deployment and communicates with TAK Server through CloudTAK's API layer — the same HTTP endpoints that the CloudTAK web client uses. There is no separate TAK Server plugin, no additional port to open on the TAK Server firewall, and no modification to TAK Server's federation configuration.

TAK Server federation — the mechanism by which multiple TAK Server instances share their COP across a WAN — works transparently with TAKpilot because TAKpilot operates at the CloudTAK layer, not the TAK Server layer. An operator at a federated node can use TAKpilot to place a marker that, via normal federation, propagates to all connected TAK Server instances. The natural language command "place a marker at 37U DP 12345 67890" results in a CoT event that travels the same federation path as any other marker placed through the CloudTAK UI.

Data package automation is exposed through the TAKpilot tool library: operators can create, populate, and distribute data packages through conversational commands. "Create a data package with all missions in sector ALPHA and send it to channel BRAVO" triggers a multi-step tool chain: list missions with sector filter, create data package, add mission items, publish to channel. Each step appears as a tool-call card in the chat with timing and status.

Channel subscription management — subscribing to and unsubscribing from CloudTAK data channels — is one of the higher-frequency operational tasks that benefits from natural language control. Operators routinely need to adjust their channel subscriptions as the mission picture evolves: "subscribe me to channel DELTA and unsubscribe from channel ECHO" is a two-tool-call chain that replaces four separate navigation actions in the CloudTAK UI.

Demo scenarios

SITREP processing. A platoon leader receives a handwritten situation report from a forward observer and photographs it with their phone. They attach the photo to TAKpilot and type "process this SITREP". TAKpilot's vision pipeline extracts six entities: two enemy vehicle positions, one friendly OP, one logistics node, a phase line, and a no-fire area boundary. The confirmation card renders each entity with its NATO symbol and proposed grid. The platoon leader corrects one misread grid, confirms the rest, and all six items appear on the CloudTAK map within eight seconds of confirmation. Total time from "I have a new SITREP" to "it's on the map": under ninety seconds, including the time to read and correct the confirmation.

Multi-track CAS coordination. A fires coordinator is managing close air support for three simultaneous engagements. Instead of switching between three open mission windows in the CloudTAK UI, they work through TAKpilot: "set BRAVO-7 status to engaged", "show me all active tracks in sector CHARLIE", "what's the last reported position of EAGLE-1?". Each command executes in under two seconds and the result appears in the chat. The coordinator's eyes stay on the map, not on the menu tree. When EAGLE-1 completes its run, "close the CAS mission for EAGLE-1 and set status to complete" triggers the approval gate — the coordinator confirms, and the mission closes.

Logistics request via chat. A company XO needs to submit a resupply request. They type: "create a logistics mission for 3rd Platoon, priority URGENT, requesting 400 rounds 5.56, 4 batteries BA-5590, at grid 37U DP 55555 44444, assign to callsign LOG-1". TAKpilot creates the mission with the correct category (logistics), priority, assigned callsign, and location marker in a single tool-call chain. The XO verifies the details on the resulting tool-call card, and the mission is live on the map and visible to LOG-1 within five seconds. No form-filling, no category tree navigation, no coordinate entry dialog.

Availability: TAKpilot is AGPL-3.0 open source, available on GitHub at UA-WCV/takpilot, and listed on the Brave1 defense marketplace. Commercial support, deployment assistance, and integration with existing CloudTAK and TAK Server infrastructure are available from Corvus Intelligence at corvusintell.com/takpilot.