What is master data management (MDM) in a defense context?

Master data management in a defense context is the discipline of creating and maintaining a single, authoritative version of each core entity — person, organization, equipment, and location — that all operational systems agree to use as the reference. Without MDM, a soldier appears in the HR system under one personnel number, in the C2 system under a different unit-assigned ID, and in the logistics system under a third identifier tied to equipment allocation. When an analyst joins those systems, they cannot determine whether three records refer to the same individual or three different people. MDM solves this by running an entity resolution process that compares records across systems, links duplicates under a single global identifier, and constructs a golden record that represents the best-known state of each entity. All downstream analytics and operational systems consume the golden record via a shared API rather than querying each source system independently. The result is consistent entity references across C2, logistics, HR, and ISR, which is the prerequisite for any reliable multi-domain situational picture.

What are the three hub models in MDM and which is best for defense?

The three standard MDM hub models are registry, consolidation, and coexistence. In the registry model, the MDM hub stores only the cross-reference table — a mapping of source system identifiers to the global entity ID — without storing any master attribute data itself. Each system retains its own copy of entity attributes, and consumers query the source systems directly using the cross-reference to resolve identity. In the consolidation model, the hub copies and stores entity attributes from all source systems, enriches and de-duplicates them, and serves the consolidated result; source systems are not modified. In the coexistence model, the hub both stores the consolidated master and writes the golden record attributes back to the source systems, creating a shared authoritative state across all participating applications. For defense environments, the consolidation model is the most practical: it does not require modification of authoritative source systems (HR, logistics, C2 systems are rarely modifiable), does not demand that all systems accept hub-initiated writes (write-back authorization is complex in classified environments), and still provides a central location where analytics and intelligence systems can query consistent entity data. The registry model is appropriate when storage replication is infeasible due to classification segregation requirements.

What entity types does a defense MDM system manage?

A defense MDM system manages four primary entity domains. The person domain covers military personnel — soldiers, officers, contractors, and coalition partners — with attributes drawn from HR, personnel management, clearance management, and C2 systems. The organization domain covers units, commands, sub-units, and task forces, including their hierarchical relationships and current task organization structure, which changes frequently during operational planning. The equipment domain covers platforms, vehicles, weapons systems, and end-item inventory, with attributes drawn from logistics, maintenance, and property accountability systems; each equipment item has a serial number and a unique item identifier (UII) that must be matched across systems that may record different identifier formats. The location domain covers facilities, installations, named areas of interest, and tactical positions, requiring reconciliation between systems that reference the same physical location using military grid reference system (MGRS) coordinates, geographic coordinates, or natural language place names. Each domain requires domain-specific matching algorithms and stewardship assignments, because the attributes, identifier schemes, and data quality patterns differ substantially between domains.

How does entity resolution work at the scale of a large defense dataset?

Entity resolution at defense scale — millions of equipment records, hundreds of thousands of personnel records across multiple source systems — requires a blocking step before any pairwise comparison can be performed. Blocking partitions the full record set into smaller candidate groups within which all records could plausibly refer to the same entity. Without blocking, comparing every record against every other record is quadratically expensive and computationally infeasible for large datasets. Blocking strategies for defense entity sets include phonetic blocking on surname fields (Soundex, Metaphone) for personnel records, prefix blocking on serial number fields for equipment records, and geohash blocking for location records. After blocking, a matching model evaluates each candidate pair within a block. Deterministic matching applies a fixed rule set — if two records share an exact serial number and the same national stock number, they match. Probabilistic matching computes a match confidence score from a weighted combination of field-level similarity scores and compares it against a threshold. ML-based matching trains a classifier on labeled match/non-match pairs from the specific dataset, learning the relative weight of each field and the patterns of data entry errors specific to the source systems in the environment.

What is a golden record and how is it constructed from conflicting source data?

A golden record is the MDM system's authoritative, unified representation of a single entity, assembled from the attributes contributed by all source systems that contain records for that entity. When two or more source systems provide different values for the same attribute (for example, the equipment's assigned unit differs between the logistics system and the C2 system), a survivorship rule determines which source value wins. Common survivorship rules are: most recent wins (the source that most recently updated the attribute takes precedence), most trusted source wins (a fixed source priority ranking defines which system's values are authoritative for each attribute type), and most complete wins (the value from the source with the lowest null rate for that attribute is preferred). In practice, a production MDM system combines rules: unit assignment is authoritative from the C2 system, maintenance status is authoritative from the logistics system, and identification attributes (serial number, nomenclature) are authoritative from the property accountability system. Each attribute in the golden record carries a confidence score and a provenance reference indicating which source supplied the value and when it was last confirmed. Attributes derived from low-confidence matches or from sources with known data quality issues receive lower confidence scores, and those scores are visible to downstream consumers assessing the reliability of the entity data.

What is data stewardship and how does it work in a defense MDM deployment?

Data stewardship is the human governance layer that resolves match and attribute conflicts that the automated MDM system cannot resolve with sufficient confidence. A data steward is an individual with domain expertise and system access authority who reviews ambiguous matches, adjudicates conflicting attribute values, and approves or rejects proposed golden record constructions. In a defense MDM deployment, stewardship is organized by entity domain: an equipment steward holds authority over the equipment domain and is typically embedded in the property accountability or logistics function; a personnel steward holds authority over the person domain and is typically embedded in the human resources or personnel management function. A dispute resolution workflow routes low-confidence match decisions and high-value attribute conflicts to the appropriate steward's queue with a structured case package: the candidate records from each source, the similarity scores for each matched field, the automated recommendation, and the alternative resolutions the steward can choose from. The steward's decision is recorded in the MDM audit trail with their identity, the decision made, and the rationale. This audit trail is both an operational record and a governance artifact demonstrating that the golden record was reviewed by an authorized domain expert.

How should MDM be architected to survive in a disconnected forward-deployed environment?

MDM survivability in disconnected environments requires a tiered architecture where a lightweight MDM cache is deployed forward alongside the tactical systems it serves, and the authoritative MDM hub remains in the rear echelon. The forward cache contains a pre-positioned snapshot of the entity master — the golden records and cross-reference table for the entities relevant to the forward element's operational area and task organization. Tactical systems in the disconnected environment query the local cache for entity resolution rather than requiring a network connection to the central hub. When the forward element operates disconnected, any changes to entity records are written to a local change log — a bounded queue of create/update/merge/split operations — rather than being lost. On reconnect, the change log is transmitted to the central hub, which applies a deterministic conflict resolution protocol to merge the forward updates with any changes that occurred at the hub or at other forward nodes during the disconnected period. Conflict resolution applies timestamp-based ordering by default, with steward-assisted resolution for conflicts where the same entity attribute was updated by both sides during the disconnected window.
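The change log and reconnect merge described above can be sketched as follows. This is a minimal illustration, not a reference design: the names `ChangeOp`, `ForwardChangeLog`, and `merge_at_hub` are hypothetical, only attribute-level updates are modelled (merge and split operations would need their own handling), and the hub is assumed to remember the time of the last successful synchronization.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Tuple

Key = Tuple[str, str]  # (entity global ID, attribute name)


@dataclass(frozen=True)
class ChangeOp:
    op: str           # "create", "update", "merge", or "split"
    entity_id: str
    attribute: str
    value: str
    timestamp: float  # node-local time at which the change was made


class ForwardChangeLog:
    """Bounded queue of operations recorded while the node is disconnected."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._ops: Deque[ChangeOp] = deque()

    def record(self, op: ChangeOp) -> None:
        # Fail loudly instead of silently discarding an update when full.
        if len(self._ops) >= self._capacity:
            raise OverflowError("change log is full")
        self._ops.append(op)

    def drain(self) -> List[ChangeOp]:
        """Return all queued operations and empty the log (called on reconnect)."""
        ops = list(self._ops)
        self._ops.clear()
        return ops


def merge_at_hub(
    hub_state: Dict[Key, Tuple[str, float]],
    forward_ops: List[ChangeOp],
    last_sync: float,
) -> List[ChangeOp]:
    """Apply forward operations in timestamp order.

    Operations touching an attribute that the hub also changed after
    last_sync are not applied; they are returned for steward review.
    """
    hub_changed = {k for k, (_, ts) in hub_state.items() if ts > last_sync}
    conflicts: List[ChangeOp] = []
    for op in sorted(forward_ops, key=lambda o: o.timestamp):
        key = (op.entity_id, op.attribute)
        if key in hub_changed:
            conflicts.append(op)
        else:
            hub_state[key] = (op.value, op.timestamp)
    return conflicts
```

Raising on overflow rather than dropping the oldest entry is a design assumption made here so that no forward update is lost silently; a real deployment would size the queue to the expected disconnected window.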

What is the difference between deterministic and probabilistic entity matching?

Deterministic matching applies a fixed rule set where a match decision is binary: two records either meet all the matching criteria or they do not. A deterministic rule for equipment records might state: two records match if and only if their serial numbers are identical after stripping whitespace and normalizing case, and their national stock numbers (NSNs) agree to the first nine digits. Deterministic rules are easy to audit, require no training data, and produce no false positive matches provided the rule is correctly specified and the identifier values themselves were recorded accurately. They are the appropriate choice for attributes that are intended to be globally unique identifiers maintained by authoritative sources. Probabilistic matching computes a weighted composite score from multiple field comparisons, each of which produces a partial similarity value rather than a binary yes/no. A personnel record comparison might weight surname phonetic similarity at 0.35, given name similarity at 0.25, date of birth exact match at 0.30, and unit assignment at 0.10, yielding a composite score that is compared against a configurable match threshold. Records above the threshold are candidate matches; records in a review band just below the threshold are routed to a steward. Probabilistic matching handles the dirty data reality of defense systems (records with typographic errors, variant name spellings, missing fields, or inconsistent encoding), where deterministic rules would produce unacceptably high miss rates.
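As a worked illustration of the weighted composite, the sketch below uses the four example weights from the paragraph above. The field names and the example similarity values are assumptions chosen for illustration; how each field-level similarity is computed is left out.

```python
from typing import Dict

# Example weights from the text; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "surname_phonetic": 0.35,
    "given_name": 0.25,
    "date_of_birth": 0.30,
    "unit_assignment": 0.10,
}


def composite_score(similarities: Dict[str, float]) -> float:
    """Weighted sum of field similarities, each expected in [0.0, 1.0].

    A field missing from the input contributes 0.0.
    """
    return sum(w * similarities.get(field, 0.0) for field, w in WEIGHTS.items())


# Hypothetical pair: same surname sound, similar given name, same date of
# birth, different unit. 0.35*1.0 + 0.25*0.8 + 0.30*1.0 + 0.10*0.0 is about 0.85.
example = {
    "surname_phonetic": 1.0,
    "given_name": 0.8,
    "date_of_birth": 1.0,
    "unit_assignment": 0.0,
}
score = composite_score(example)
```

Note that a differing unit assignment costs this pair only 0.10, which fits the earlier point that operational attachments routinely differ between HR and C2 for the same person.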

How are golden record changes audited in a defense MDM system?

Every change to a golden record in a defense MDM system must be captured in an immutable audit trail that records the complete before-and-after state of the affected attributes, the source of the change (automated survivorship rule, manual steward decision, or incoming source system update), the identity of the principal who authorized the change (system service account for automated changes, authenticated human identity for steward decisions), the timestamp of the change, and a reference to the source event that triggered it (the ingest batch ID for automated updates, the steward case ID for manual decisions). The audit trail is stored append-only — existing audit records can never be modified or deleted, only new records can be appended. This immutability is enforced at the storage layer using object lock or database-level write-once policies rather than relying on application-layer enforcement alone. The audit trail enables two critical operational use cases: reconstruction of the state of any entity's golden record at any past point in time (answering the question 'what did the MDM system believe about this unit's equipment at 06:00 on a given date') and tracing the provenance of a specific attribute value in the current golden record back to the source system update and ingest event that introduced it.
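The two use cases named above, point-in-time reconstruction and provenance tracing, can be sketched against an in-memory append-only list. This is an illustration only: `AuditRecord` and `AuditTrail` are hypothetical names, the requirement that timestamps never decrease is an assumption of this sketch, and real immutability would come from the storage layer as the text says, not from a Python class.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AuditRecord:
    entity_id: str
    attribute: str
    before: Optional[str]
    after: Optional[str]
    change_source: str  # "survivorship_rule", "steward_decision", or "source_update"
    principal: str      # service account or authenticated steward identity
    timestamp: float
    trigger_ref: str    # ingest batch ID or steward case ID


class AuditTrail:
    """Append-only log: records can be added but never changed or removed."""

    def __init__(self) -> None:
        self._records: List[AuditRecord] = []

    def append(self, record: AuditRecord) -> None:
        # Assumption of this sketch: records arrive in timestamp order.
        if self._records and record.timestamp < self._records[-1].timestamp:
            raise ValueError("audit records must be appended in time order")
        self._records.append(record)

    @property
    def records(self) -> Tuple[AuditRecord, ...]:
        return tuple(self._records)

    def state_at(self, entity_id: str, as_of: float) -> Dict[str, Optional[str]]:
        """Replay the log to rebuild an entity's attributes as of a past time."""
        state: Dict[str, Optional[str]] = {}
        for r in self._records:
            if r.timestamp > as_of:
                break
            if r.entity_id == entity_id:
                state[r.attribute] = r.after
        return state

    def provenance(self, entity_id: str, attribute: str) -> Optional[AuditRecord]:
        """Return the record that introduced the attribute's current value."""
        for r in reversed(self._records):
            if r.entity_id == entity_id and r.attribute == attribute:
                return r
        return None
```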

How does eventual consistency work in a defense MDM system with multiple disconnected nodes?

Eventual consistency in a defense MDM deployment means that all nodes in the MDM topology (the central hub and all forward caches) are guaranteed to converge to the same entity state after reconnection and synchronization, but may hold divergent states during a disconnected period. The consistency model is deliberately relaxed from strong consistency (where all nodes agree on the current state at all times) because strong consistency requires continuous network connectivity, which is incompatible with the disconnected operation requirement. In the eventual consistency model, each node processes updates locally and records them in its change log. Change logs use vector clocks or hybrid logical clocks rather than wall-clock timestamps to establish a partial ordering of events across nodes without requiring tightly synchronized clocks, which matters in forward-deployed environments where NTP synchronization may be unavailable. On reconnect, the hub collects change logs from all nodes, merges them by applying the partial order, and produces a converged golden record state that incorporates all updates from all nodes. Where two nodes made conflicting updates to the same attribute during the disconnected window (a condition detectable by examining the vector clock partial order), the conflict is resolved by the configured survivorship rule (most trusted source, most recent by logical clock) or escalated to a steward if the survivorship rule does not produce a clear winner. After convergence, the hub propagates the merged state back to all nodes so they re-synchronize to the consistent view.
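The conflict detection step can be illustrated with a small vector clock comparison. This is a generic sketch of the standard technique, with hypothetical node names; it shows only how two updates are classified as ordered or concurrent, not how the hub then resolves them.

```python
from typing import Dict

VectorClock = Dict[str, int]  # node name -> count of events seen from that node


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the relationship between two vector clocks.

    Returns "equal", "before" (a happened before b), "after"
    (b happened before a), or "concurrent" (neither saw the other).
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used once two histories have been reconciled."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


# The hub and a forward node each updated the same attribute while
# disconnected: neither clock dominates, so the updates conflict.
hub_update = {"hub": 3, "fwd1": 1}
fwd_update = {"hub": 2, "fwd1": 2}
relation = compare(hub_update, fwd_update)  # "concurrent"
```

Only the "concurrent" case needs a survivorship rule or a steward; "before" and "after" already state which update supersedes the other.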

Master data management for defense: entity resolution, golden record and cross-system consistency

A battalion-level logistics officer requests a readiness report. The C2 system lists 47 vehicles assigned to the unit. The property accountability system records 51 vehicles under the same unit identifier. The maintenance system tracks 44 of those vehicles under different serial number formats, and three are flagged as deadlined under identifiers that do not appear in either of the other two systems. No one can answer the question "how many operational vehicles does this unit have right now" without a phone call — because the same physical entities are represented by different records, different identifiers, and different attribute values across systems that were never designed to agree with each other.

This is the defense MDM problem in its most common form. Master data management (MDM) is the discipline that creates a single authoritative representation of each core entity — person, organization, equipment, location — that all systems reference consistently. It is not a reporting tool and it is not a data warehouse. It is the layer that makes cross-system joins meaningful, multi-domain analytics trustworthy, and operational decisions based on data reliable. This article covers the architecture of a defense MDM system from hub model selection through entity resolution, golden record construction, stewardship workflows, and the survivability requirements that make MDM viable in contested environments.

The defense MDM problem — equipment records in three systems with three different IDs, personnel records diverging between HR and C2, the cost of inconsistency

Defense organizations accumulate source-of-truth fragmentation organically. The HR system was procured to manage personnel administration. The C2 system was built to track unit structure and tactical assignments. The logistics system was designed to manage supply and property accountability. Each was designed by a different program office, deployed in a different decade, and uses a different data model. None was designed to interoperate with the others at the entity identity level.

The result is a set of identifier schemes that are incompatible by design. A person may be identified by their personnel number in HR, their military occupational specialty code and rank combination in C2, and their equipment custodian identifier in logistics: three identifiers for one human being, with no system-maintained mapping between them. Equipment is worse: an M1A2 tank may carry a bumper number stenciled on the hull (used by the crew and C2 operators), a logistics record keyed to the owning unit's Department of Defense activity address code (DoDAAC), a unique item identifier (UII) mark in property accountability, and a maintenance work order number in the maintenance management system. No two of these identifiers share a format, and the systems that use them were not built to translate between them.

The cost of this inconsistency is not merely inconvenience. When a readiness analyst joins C2 unit assignment data against logistics supply data to compute a unit's operational capability score, the join fails to match records that refer to the same physical equipment under different identifiers. The resulting capability score is wrong — and it is wrong in a direction the analyst cannot detect without independently verifying each system, which defeats the purpose of the analytical model. For an integrated view of how these integration failures manifest across the broader data architecture, see our treatment of defense data integration patterns.

Personnel record divergence between HR and C2 is an additional, distinct problem. HR maintains the administrative record: permanent rank, duty position, assigned unit, clearance level, training history. C2 maintains the operational picture: which person is physically present with which unit, what role they are filling in the current task organization, what systems they are credentialed to operate. During stable garrison conditions, these records agree reasonably well. During operations — when personnel are attached to different units, when task organization deviates from table of organization and equipment (TO&E), when temporary duty assignments create secondary affiliations — the two records diverge rapidly. An MDM system that manages the person entity must reconcile both the administrative and operational representations into a coherent golden record that is useful for both readiness reporting and operational planning.

MDM hub architecture — registry vs consolidation vs coexistence hub models, selection criteria for defense environments, hub placement for classified vs unclassified data

Three hub models dominate production MDM deployments, each with a distinct relationship to the source systems it serves and a different set of operational trade-offs that matter in defense environments.

The registry model stores only the cross-reference table — a mapping of each source system identifier to the MDM-assigned global entity ID — without replicating or storing any entity attributes in the hub itself. When a consumer needs entity data, it queries the hub for the global ID, then queries the appropriate source systems using the source-specific identifiers returned by the cross-reference. The registry model has the lowest data footprint and requires no synchronization of attribute data into the hub, making it attractive for environments where replicating classified data into a central location raises authorization issues. Its limitation is that it forces every consumer to resolve entities across multiple source systems at query time, which is operationally impractical for high-frequency operational queries.
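A registry hub is essentially a two-way lookup table. The sketch below shows the shape of that cross-reference under simple assumptions; `CrossReferenceRegistry` and the system names used with it are hypothetical, and attribute retrieval from the source systems is deliberately absent because the registry model does not store attributes.

```python
from typing import Dict, Optional, Tuple


class CrossReferenceRegistry:
    """Maps (source system, source identifier) pairs to a global entity ID."""

    def __init__(self) -> None:
        self._to_global: Dict[Tuple[str, str], str] = {}
        self._to_sources: Dict[str, Dict[str, str]] = {}

    def link(self, global_id: str, system: str, source_id: str) -> None:
        """Record that a source identifier refers to the given global entity."""
        self._to_global[(system, source_id)] = global_id
        self._to_sources.setdefault(global_id, {})[system] = source_id

    def resolve(self, system: str, source_id: str) -> Optional[str]:
        """Return the global ID for a source identifier, if one is known."""
        return self._to_global.get((system, source_id))

    def source_ids(self, global_id: str) -> Dict[str, str]:
        """Return the identifiers a consumer must use in each source system."""
        return dict(self._to_sources.get(global_id, {}))
```

A consumer holding a logistics identifier would call `resolve` to get the global ID, then `source_ids` to learn which identifier to use when querying C2 or HR. That second round of per-system queries is the cost the paragraph above describes.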

The consolidation model copies and normalizes entity attributes from all contributing source systems into the hub, runs entity resolution to link duplicates, and serves a unified entity view to consumers. Source systems are not modified — the hub is a read-optimized consumer of source data, not a writer back into it. This model is the most practical for defense environments because it does not require source system modification rights (C2, HR, and logistics systems are typically not modifiable by the MDM program), and it concentrates the entity resolution computation in the hub rather than distributing it across consumers. The military data lake architecture typically consumes the MDM consolidation hub's golden records as a curated reference layer rather than joining raw source tables directly.

The coexistence model adds write-back capability to the consolidation model: the hub constructs the golden record and then propagates authoritative attribute values back to the source systems, overwriting locally maintained values with the hub-determined authoritative value. This model produces the strongest consistency across systems but requires that every participating source system accept hub-initiated writes — a requirement that is frequently blocked by system authorization constraints, vendor change control processes, and operational risk aversion in live systems.

Hub placement for classified vs unclassified data requires separate hub instances at each classification boundary. A single MDM hub that processes both classified and unclassified entity data would require the hub to operate at the higher classification level, which would prevent unclassified consumers from accessing any entity data from the hub — defeating its purpose. The practical architecture deploys an unclassified hub for entities that exist entirely in unclassified source systems (general reference data, unclassified location data, commercial supplier records), and a classified hub at the appropriate classification level for entities that carry classified attributes. Cross-classification entity resolution — determining that an entity in the unclassified system corresponds to the same physical entity in a classified system — requires a specialized cross-domain solution guard to carry only the identifier cross-reference across the classification boundary, never the classified attributes themselves.

Entity types in defense MDM — person (military personnel), organization (unit/command), equipment (platform/end item), location (facility/position)

Defense MDM manages four primary entity domains, each with distinct source systems, identifier schemes, attribute structures, and data quality challenges.

The person entity domain covers military personnel, contractors, and coalition partners. Key source systems are the personnel management system (administrative record, permanent assignment, rank), the C2 system (current operational assignment, task organization position), and the clearance management system (clearance level, compartment authorizations, access expiration dates). The primary identifier challenge is that the same individual may appear under their permanent duty unit in HR while being operationally attached to a different unit in C2 — both records are correct in their respective systems, but they represent two different operational states of the same entity. Golden record construction for the person domain must capture both the administrative and operational states as distinct attribute groups rather than collapsing them into a single unit assignment field.

The organization entity domain covers units, commands, sub-units, and task forces. Organizations in defense environments have a hierarchical parent-child structure (brigade contains battalions, battalion contains companies) and a temporal dimension (task forces are created, merge, reorganize, and dissolve on operational timescales). The MDM system must maintain the current organizational hierarchy, its historical states at past timestamps, and the relationships between administrative TO&E structure and current operational task organization. Source systems are the unit hierarchy in the C2 system, the TO&E in the personnel management system, and the command element record in logistics systems.

The equipment entity domain covers platforms, vehicles, weapons systems, and end-item inventory. This is the most identifier-fragmented domain in practice: a single physical item is typically tracked under four or more distinct identifier schemes across the systems that manage it. The UII (unique item identifier) is the Department of Defense standard globally unique identifier for serialized equipment items, typically marked on the item as a 2D Data Matrix symbol, and should serve as the master identifier for the equipment domain, but legacy systems predating the UII standard use proprietary identifiers that require a mapping table to link to the UII space. Matching across systems requires fuzzy matching on serial number fields that may include different separator characters, leading zeros, or manufacturer code prefixes depending on which system recorded the identifier.
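Much of the serial number variation described above can be removed by normalization before any fuzzy comparison is attempted. The following sketch assumes a hypothetical list of manufacturer prefixes supplied by the caller. Stripping leading zeros is also an assumption: it can merge serials that are genuinely distinct in some numbering schemes, so it would need validating against real data.

```python
import re
from typing import Sequence


def normalize_serial(raw: str, manufacturer_prefixes: Sequence[str] = ()) -> str:
    """Reduce a serial number to a canonical comparison key.

    Steps: drop separators and whitespace, upper-case, remove one known
    manufacturer prefix if present, then strip leading zeros.
    """
    s = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    for prefix in manufacturer_prefixes:
        p = prefix.upper()
        # Keep the value intact if the prefix is the whole serial.
        if s.startswith(p) and len(s) > len(p):
            s = s[len(p):]
            break
    return s.lstrip("0") or "0"


# Three renderings of what is assumed to be the same serial number.
variants = ["00-1234 ab", "001234AB", "GD-001234-AB"]
keys = {normalize_serial(v, manufacturer_prefixes=("GD",)) for v in variants}
```

Here all three variants reduce to the single key `1234AB`. The normalized key should be used for comparison only, with the original value preserved in the source record for audit.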

The location entity domain covers facilities, installations, named areas of interest, and tactical positions. Location matching is fundamentally a geospatial problem: the same physical location may be referenced by MGRS coordinates in one system, decimal-degree geographic coordinates in another, and a natural language place name in a third. Geospatial blocking groups candidate location records by proximity using a geohash grid, and matching determines whether two spatially proximate records refer to the same facility or to distinct nearby locations. Location entities also have a temporal dimension — facility classification, operational status, and controlling force change over time — requiring the golden record to carry temporal validity intervals for key location attributes. The real-time intelligence fusion layer that correlates ISR detections against known locations depends directly on the accuracy and completeness of the location entity master.

Entity resolution: matching across source systems — blocking strategies for large-scale defense entity sets, ML matching models, deterministic vs probabilistic matching

Entity resolution is the process that determines which records across source systems refer to the same real-world entity and links them under a shared global identifier. At the scale of a large defense dataset — millions of equipment records across logistics, maintenance, and property accountability systems; hundreds of thousands of personnel records across HR, C2, and clearance systems — naive pairwise comparison of all records against all other records is computationally infeasible. The matching pipeline must be structured as a two-stage process: blocking, which reduces the candidate pair space to a tractable size; followed by matching, which evaluates candidate pairs with sufficient precision to separate true matches from near-misses.
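The scale argument can be made concrete with a pair count. The figures below are illustrative assumptions (two million records, evenly sized blocks of 200), not measurements from any real system.

```python
from math import comb
from typing import Iterable


def pairs_without_blocking(n: int) -> int:
    """Number of unordered record pairs among n records: n * (n - 1) / 2."""
    return comb(n, 2)


def pairs_with_blocking(block_sizes: Iterable[int]) -> int:
    """Pairs compared when comparison is restricted to within each block."""
    return sum(comb(size, 2) for size in block_sizes)


n = 2_000_000
full = pairs_without_blocking(n)                # 1,999,999,000,000 pairs
blocked = pairs_with_blocking([200] * 10_000)   # 199,000,000 pairs
reduction = full / blocked                      # roughly a 10,000-fold reduction
```

Real blocks are never evenly sized, and a few very large blocks (a common surname, a shared serial prefix) can dominate the total, so block size distribution is worth monitoring alongside recall.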

Blocking strategies for defense entity domains use domain-specific partitioning keys to group records that could plausibly be the same entity. For personnel records, phonetic blocking on the surname field using the Soundex or Double Metaphone algorithm groups records where different systems have transcribed the same name with variant spellings, extra spaces, or hyphenation differences — all common in personnel management systems that predate Unicode normalization. For equipment records, prefix blocking on the first six characters of the serial number (after normalizing whitespace and case) groups records from systems that represent the same serial number with different separator conventions. For location records, a geohash grid at precision level 6 (approximately 1.2 km cell width) groups spatially proximate records while excluding obviously distinct locations. The blocking design must be validated against a gold-standard dataset of known matches before deployment — the blocking step must retain at least 95% of true match pairs in the candidate set, or the resolution pipeline will produce systematic miss errors for the excluded record types.
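A minimal sketch of two of these blocking keys, plus the recall check against a gold-standard set, follows. The Soundex function implements the standard American Soundex rules; the helper names and the sample records are hypothetical, and geohash blocking is left out for brevity.

```python
from collections import defaultdict
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Set

_CODES = {
    **dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"), "L": "4",
    **dict.fromkeys("MN", "5"), "R": "6",
}


def soundex(name: str) -> str:
    """Four-character American Soundex code, or "" if no letters are present."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(c, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return (encoded + "000")[:4]


def serial_prefix_key(serial: str) -> str:
    """First six characters after removing whitespace and upper-casing."""
    return "".join(serial.split()).upper()[:6]


def block_by(values: Dict[str, str], key_fn: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group record IDs by the blocking key computed from each value."""
    blocks: Dict[str, List[str]] = defaultdict(list)
    for record_id, value in values.items():
        blocks[key_fn(value)].append(record_id)
    return dict(blocks)


def candidate_pairs(blocks: Dict[str, List[str]]) -> Set[FrozenSet[str]]:
    """All unordered pairs of record IDs that share a block."""
    return {frozenset(p) for ids in blocks.values() for p in combinations(ids, 2)}


def pair_completeness(candidates: Set[FrozenSet[str]], true_matches: Set[FrozenSet[str]]) -> float:
    """Share of known true match pairs that survive blocking."""
    if not true_matches:
        return 1.0
    return len(candidates & true_matches) / len(true_matches)


surnames = {"hr:1": "Smith", "c2:7": "Smyth", "hr:2": "Jones", "c2:8": "Garcia"}
pairs = candidate_pairs(block_by(surnames, soundex))
recall = pair_completeness(pairs, {frozenset({"hr:1", "c2:7"})})
```

`pair_completeness` is the quantity the 95% criterion above refers to: if it falls below 0.95 on the gold-standard set, the blocking key is discarding true matches before the matcher ever sees them.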

Within each candidate block, the matching model evaluates each pair:

Deterministic matching applies a fixed rule set producing a binary match/non-match decision. A deterministic rule for equipment records: two records match if and only if their UIIs are identical (after stripping non-alphanumeric characters), or if their serial numbers are identical and their national stock numbers agree to the first nine digits. Deterministic rules require no training data, are fully auditable, and produce no false positives provided the rule is correctly specified and the identifier values themselves were recorded accurately. They are appropriate for attributes that are intended to be globally unique and maintained under data entry controls.
Probabilistic matching computes a composite match score from weighted field-level similarity metrics. A personnel record comparison might apply Jaro-Winkler similarity on the given name, phonetic matching on the surname, exact match on date of birth, and fuzzy matching on the rank abbreviation (to handle variant formatting), combined with a logistic regression or gradient boosted tree classifier trained on labeled match/non-match pairs from the actual source systems. The trained model learns the relative importance of each field in the specific dirty-data environment — in some source systems, date of birth is highly reliable; in others it has a 3% error rate that makes it a weak discriminating feature.
ML matching models extend probabilistic matching with learned representations of entity attribute text. A Siamese neural network trained on personnel name pairs learns a vector representation of names such that phonetically or orthographically similar names have similar vectors — capturing similarity patterns that hand-crafted string distance metrics miss. ML models require larger labeled datasets for training and are harder to audit than deterministic rules, but they outperform classical probabilistic models on high-noise entity sets where the data entry error patterns are complex and non-uniform across source systems.
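Of the three approaches, the deterministic rule is the simplest to write down directly. The sketch below encodes the equipment rule stated above; the dictionary keys `uii`, `serial_number`, and `nsn` are hypothetical field names, and treating a missing identifier as "cannot match on this clause" is an assumption.

```python
import re
from typing import Dict, Optional

Record = Dict[str, Optional[str]]


def _alnum(value: Optional[str]) -> str:
    """Strip every non-alphanumeric character; None becomes an empty string."""
    return re.sub(r"[^A-Za-z0-9]", "", value or "")


def _digits(value: Optional[str]) -> str:
    """Keep digits only, for example dropping the dashes in an NSN."""
    return re.sub(r"\D", "", value or "")


def deterministic_match(a: Record, b: Record) -> bool:
    """Equipment rule: same UII, or same serial number plus NSN prefix.

    Clause 1: UIIs identical after stripping non-alphanumeric characters.
    Clause 2: serial numbers identical and NSNs agree on the first nine digits.
    """
    uii_a, uii_b = _alnum(a.get("uii")), _alnum(b.get("uii"))
    if uii_a and uii_a == uii_b:
        return True
    serial_a, serial_b = a.get("serial_number"), b.get("serial_number")
    nsn_a, nsn_b = _digits(a.get("nsn")), _digits(b.get("nsn"))
    return (
        bool(serial_a)
        and serial_a == serial_b
        and len(nsn_a) >= 9
        and nsn_a[:9] == nsn_b[:9]
    )
```

Because the function has no tunable parameters, every match it produces can be explained by pointing at the clause that fired, which is what makes this style of rule straightforward to audit.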

The output of the matching pipeline for each candidate pair is a match decision (match / non-match / review) and a confidence score. Pairs above the match threshold are automatically linked under a shared global entity ID. Pairs in the review band — where the confidence score falls between the non-match threshold and the match threshold — are routed to a human steward for adjudication rather than being resolved automatically.
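The three-way decision and the linking step can be sketched together. The threshold values 0.60 and 0.85 are placeholders, and so is the choice to treat a score exactly on a threshold as belonging to the higher band. Linking uses a union-find structure, which implies transitive closure: if A matches B and B matches C, all three share one global ID. That is a common design but an assumption here, not something the text prescribes.

```python
from typing import Dict, Iterable, List, Tuple

ScoredPair = Tuple[str, str, float]  # (record ID, record ID, confidence score)


def decide(score: float, non_match_threshold: float = 0.60,
           match_threshold: float = 0.85) -> str:
    """Map a confidence score to "match", "review", or "non-match"."""
    if score >= match_threshold:
        return "match"
    if score >= non_match_threshold:
        return "review"
    return "non-match"


def link_matches(record_ids: Iterable[str], scored_pairs: Iterable[ScoredPair]):
    """Assign a global ID per cluster of matched records.

    Returns (global_ids, review_queue). Pairs in the review band are not
    linked; they are queued for a steward.
    """
    parent: Dict[str, str] = {r: r for r in record_ids}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    review_queue: List[ScoredPair] = []
    for a, b, score in scored_pairs:
        decision = decide(score)
        if decision == "match":
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Use the smaller ID as root so the result is deterministic.
                parent[max(root_a, root_b)] = min(root_a, root_b)
        elif decision == "review":
            review_queue.append((a, b, score))
    global_ids = {r: "G-" + find(r) for r in parent}
    return global_ids, review_queue


ids = ["hr:1", "c2:9", "log:4"]
scored = [("hr:1", "c2:9", 0.93), ("c2:9", "log:4", 0.70), ("hr:1", "log:4", 0.20)]
global_ids, review_queue = link_matches(ids, scored)
```

In this example the HR and C2 records are linked automatically, the C2 and logistics pair goes to the steward queue, and the logistics record keeps its own global ID until the steward decides.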

Golden record construction and maintenance — survivorship rules for conflicting source attributes, golden record confidence scoring, automated vs steward-assisted resolution

Once entity resolution has linked records from multiple source systems to the same global entity ID, the MDM system constructs the golden record: the authoritative, unified representation of that entity whose attributes are drawn from the best available source for each field. The golden record is not a simple merge of all source attributes — when sources disagree, a survivorship rule must determine which value the golden record carries.

Survivorship rules are defined per attribute per entity domain and encode the authority hierarchy for that attribute:

# Equipment entity — survivorship rules by attribute
serial_number:        source_priority=[property_acct, maintenance, logistics]
national_stock_num:   source_priority=[property_acct, logistics]
unit_assignment:      source_priority=[c2_system, logistics]
maintenance_status:   source_priority=[maintenance, logistics]
condition_code:       source_priority=[maintenance, property_acct]
location_mgrs:        most_recent_update  # C2 or logistics, whichever updated last
end_item_code:        source_priority=[property_acct]  # Single authoritative source

# Personnel entity — survivorship rules by attribute
permanent_rank:       source_priority=[hr_system]
operational_unit:     source_priority=[c2_system, hr_system]
clearance_level:      source_priority=[clearance_mgmt]
duty_position:        most_recent_update  # HR or C2 depending on assignment type
name_legal:           source_priority=[hr_system]
name_preferred:       most_complete  # Source with fewest null name fields wins
            

Each attribute in the golden record carries three metadata fields in addition to its value: a source reference (which source system provided this value and the identifier of the specific source record), an update timestamp (when the source system last confirmed this value), and a confidence score (a normalized value reflecting the reliability of the source and the quality of the match that linked this record to the golden entity). A confidence score of 1.0 indicates a deterministic match on a globally unique identifier from a highly reliable source; lower scores reflect probabilistic match results, sources with known data quality issues, or attributes where survivorship was contested between sources with conflicting values.
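As a rough sketch of how the `source_priority` and `most_recent_update` strategies might be evaluated, the Python below picks a surviving value from a set of source candidates while keeping the three metadata fields alongside it. The `SourceValue` type and both function names are hypothetical, and the sample values are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class SourceValue:
    source: str              # source system, e.g. "maintenance"
    record_id: str           # identifier of the specific source record
    value: Optional[str]     # None when the source has no value
    updated_at: datetime     # when the source last confirmed the value
    confidence: float        # normalized 0.0 to 1.0


def by_source_priority(candidates: List[SourceValue],
                       priority: List[str]) -> Optional[SourceValue]:
    """Return the non-null value from the highest-priority source, or None."""
    # If one source supplies several records, the last one listed is used.
    by_source = {c.source: c for c in candidates if c.value is not None}
    for source in priority:
        if source in by_source:
            return by_source[source]
    return None


def by_most_recent_update(candidates: List[SourceValue]) -> Optional[SourceValue]:
    """Return the non-null value with the latest update timestamp, or None."""
    non_null = [c for c in candidates if c.value is not None]
    return max(non_null, key=lambda c: c.updated_at, default=None)
```

Because the winner is returned as a whole `SourceValue`, the golden record can store its source reference, timestamp, and confidence without a second lookup.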

The confidence score is not a decoration — it is operationally significant. A readiness analyst building a capability assessment can filter the golden record query to exclude attributes below a confidence threshold, or the analytics layer can weight each equipment record's contribution to the unit readiness score by the confidence of its maintenance status attribute. An analyst who receives a readiness metric without visibility into the underlying confidence scores cannot distinguish between a high-confidence assessment and a figure assembled from low-quality matches and stale source data.
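One way to combine the two uses described above, filtering by a threshold and weighting by confidence, is sketched below. The function name `weighted_readiness` and the input shape (a mission-capable flag paired with the confidence of the maintenance status attribute) are assumptions made for illustration; a real readiness model would likely be more involved.

```python
from typing import Iterable, Optional, Tuple


def weighted_readiness(records: Iterable[Tuple[bool, float]],
                       min_confidence: float = 0.0) -> Optional[float]:
    """Confidence-weighted share of mission-capable equipment.

    Each record is (is_mission_capable, confidence of maintenance_status).
    Records below min_confidence are excluded before weighting.
    """
    kept = [(ok, conf) for ok, conf in records if conf >= min_confidence]
    total_weight = sum(conf for _, conf in kept)
    if total_weight == 0:
        return None  # nothing trustworthy enough to report
    return sum(conf for ok, conf in kept if ok) / total_weight
```

Returning `None` when no record survives the filter keeps the caller from mistaking "no usable data" for "zero readiness".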

Golden record maintenance is a continuous process, not a one-time batch operation. When a source system updates an entity record, the MDM ingest pipeline receives the update, re-evaluates survivorship for all affected attributes, and updates the golden record accordingly. If the update causes the golden record to change for an attribute that downstream systems have consumed, a change notification is published to the MDM event bus so consuming systems can re-query the updated golden record. The change notification carries the entity global ID, the list of changed attributes, and the before/after values — enough information for a consuming system to determine whether the change affects any of its active operational data without requiring a full re-fetch of the golden record.
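A change notification with the three elements listed above could look like the following Python sketch. The class and field names are hypothetical, and the event bus transport itself is out of scope here; the point is only that a consumer can decide locally whether a re-query is needed.

```python
from dataclasses import dataclass
from typing import Any, Dict, Set, Tuple


@dataclass(frozen=True)
class ChangeNotification:
    entity_global_id: str
    # attribute name -> (before value, after value)
    changes: Dict[str, Tuple[Any, Any]]


def affects_consumer(note: ChangeNotification,
                     watched_entities: Set[str],
                     watched_attributes: Set[str]) -> bool:
    """True if the notification touches an entity and attribute the consumer uses."""
    if note.entity_global_id not in watched_entities:
        return False
    return any(attr in watched_attributes for attr in note.changes)
```

A logistics consumer that tracks only `location_mgrs` for its own equipment could ignore a `condition_code` change without fetching the golden record at all.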

Data stewardship workflows — steward assignment by domain (equipment steward, person steward), dispute resolution workflow, audit trail for golden record changes

Automated entity resolution and survivorship rules handle the majority of matching and attribute conflict cases in a production MDM system — typically 85 to 95 percent of records can be resolved without human intervention when the matching pipeline is well calibrated to the specific source system data. The remaining 5 to 15 percent of cases — low-confidence matches, contested attribute values where multiple sources claim equal authority, and entity splits or merges that require a judgment call — must be routed to a human data steward for adjudication.

Stewardship is organized by entity domain, with each domain assigned to a steward who holds both the domain expertise and the system access authority needed to resolve disputes:

Equipment steward — typically embedded in the property accountability or logistics function, holds authority to determine the correct equipment record when systems disagree on serial numbers, unit assignment, or condition code. The equipment steward has direct access to physical records and can verify the ground truth by querying the originating paper trail or coordinating with equipment custodians.
Personnel steward — typically embedded in the HR or personnel management function, holds authority to resolve conflicting name spellings, duplicate personnel records (common when a person re-enlists and receives a new system-generated identifier), and assignment discrepancies between administrative and operational records.
Organization steward — typically the J1 or J3 staff element, holds authority to resolve unit hierarchy ambiguities during task organization changes, to confirm unit activation and inactivation dates, and to adjudicate parent-unit assignment when sub-units are temporarily attached to multiple headquarters.
Location steward — typically the geospatial intelligence or facilities management function, holds authority to confirm whether two spatially proximate records represent the same facility or distinct co-located facilities, and to establish the canonical coordinate and name for a location that appears under different designations in different systems.

The dispute resolution workflow presents each steward case as a structured package. The case package contains: the candidate records from each source system that the MDM system believes may refer to the same entity, field-level similarity scores for each compared attribute, the MDM system's automated recommendation (match, non-match, or the recommended winning value for a contested attribute), and a clear display of where the sources agree and disagree. The steward selects from the available resolution options — confirm the automated recommendation, override it with an alternative resolution, or flag the case as requiring additional information before it can be resolved. A free-text rationale field allows the steward to document the reasoning for non-obvious decisions.

The audit trail for golden record changes records every state transition in the golden record's history, regardless of whether the change was triggered by an automated survivorship update or a manual steward decision. Each audit record contains:

The entity global ID and the entity domain
The attribute(s) changed and the before/after values for each
The change trigger: automated survivorship rule (with rule ID and version), incoming source update (with source system ID and source record ID), or steward decision (with steward case ID)
The authenticated identity of the principal responsible for the change — the system service account for automated changes, the steward's authenticated identity for manual decisions
The change timestamp (logical clock value for distributed deployments, wall clock for single-hub deployments)

Audit records are written to append-only storage and cannot be modified after write. This immutability allows the MDM system to reconstruct the complete state of any golden record at any past point in time by replaying the audit trail from the initial record creation forward — a capability that is operationally necessary when analysts need to assess what the MDM system believed about an entity at a specific historical moment, and a compliance requirement for classified entity management systems.
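The replay idea can be illustrated with a small in-memory Python sketch. It is a simplification under stated assumptions: the `AuditLog` class is hypothetical, each record covers a single attribute, and the timestamp is a single integer rather than the logical clock a distributed deployment would use. Real append-only storage would enforce immutability at the storage layer, not in application code.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class AuditRecord:
    entity_global_id: str
    attribute: str
    before: Any
    after: Any
    trigger: str     # e.g. "survivorship_rule", "source_update", "steward_decision"
    principal: str   # service account or steward identity
    timestamp: int   # simplified scalar clock, non-decreasing within the log


class AuditLog:
    def __init__(self) -> None:
        self._records: List[AuditRecord] = []

    def append(self, record: AuditRecord) -> None:
        """Add a record at the end; existing records are never modified."""
        if self._records and record.timestamp < self._records[-1].timestamp:
            raise ValueError("audit records must be appended in timestamp order")
        self._records.append(record)

    def state_at(self, entity_global_id: str, as_of: int) -> Dict[str, Any]:
        """Rebuild an entity's attributes as of a past timestamp by replay."""
        state: Dict[str, Any] = {}
        for rec in self._records:
            if rec.timestamp > as_of:
                break  # log is ordered, so later records cannot apply
            if rec.entity_global_id == entity_global_id:
                state[rec.attribute] = rec.after
        return state
```

Calling `state_at` with an earlier timestamp returns what the hub believed at that moment, which is the historical query described above.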

MDM survivability in disconnected environments — local MDM cache for forward-deployed systems, conflict resolution on reconnect, eventual consistency model for disconnected nodes

A centralized MDM hub that requires network connectivity to the rear echelon for every entity query is operationally unusable in a contested environment where communications are degraded or severed. Forward-deployed C2 systems, logistics systems, and operational planning tools all depend on entity resolution to function correctly — if the MDM hub is unreachable, they must fall back to raw source identifiers that may not match across systems, returning the operational picture to the pre-MDM state of fragmented entity identity. MDM survivability design prevents this regression by deploying a local cache forward and defining a rigorous synchronization protocol for the reconnect phase.

The forward MDM cache is a pre-positioned read replica of the golden records and cross-reference table for the entity subset relevant to the forward element's operational scope. "Relevant scope" is defined by two dimensions: geographic area of operations (all location entities within the AO, all equipment entities with a last-known location within the AO) and task organization (all person, organization, and equipment entities associated with units in the forward element's task organization). The cache is populated before the forward element deploys, using a snapshot export from the central hub. It is deployed on hardware that operates independently of the wide-area network — a ruggedized server co-located with the forward C2 node, or embedded in the tactical command post computing environment.

During disconnected operation, tactical systems query the local cache exactly as they would query the central hub — the same API, the same response format. Entity resolution requests are served from the local cross-reference table. Golden record queries return the cached attribute values with a freshness timestamp indicating when each attribute was last confirmed by the central hub. Tactical systems are responsible for presenting this freshness timestamp to users when the data may be stale, rather than presenting cached data as if it were current.
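A tactical system might attach the freshness information to each value it displays along the lines of the sketch below. The function name, the returned dictionary shape, and the staleness cutoff are assumptions; the appropriate cutoff would depend on the attribute and the mission.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def annotate_freshness(value: Any, last_confirmed: datetime,
                       now: datetime, stale_after: timedelta) -> Dict[str, Any]:
    """Wrap a cached attribute value with its age and a staleness flag."""
    age = now - last_confirmed
    return {
        "value": value,
        "last_confirmed": last_confirmed.isoformat(),
        "age_hours": round(age.total_seconds() / 3600, 1),
        "stale": age > stale_after,
    }


# Usage with illustrative values: a status confirmed 24 hours ago, 6-hour cutoff.
confirmed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
checked = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
display = annotate_freshness("FMC", confirmed, checked, timedelta(hours=6))
```

Here `display["stale"]` is true, so the user interface would show the value together with a caveat rather than as current data.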

All entity creates, updates, merges, and splits that occur during the disconnected period are written to a local change log — an ordered, bounded queue of entity change events. Each event carries a vector clock timestamp that establishes a partial ordering of events across disconnected nodes without requiring synchronized wall clocks. The change log capacity limits the duration of disconnected operation: a change log sized for 72 hours of typical operational entity update rates provides 72 hours of independent operation before synchronization becomes mandatory to prevent log overflow. The operational planning process must account for this constraint.
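The partial ordering that vector clocks provide, and the capacity arithmetic behind the 72-hour example, can be sketched as follows. Representing a vector clock as a dictionary of node name to counter is a common convention but an assumption here, as are both function names.

```python
from typing import Dict


def compare(a: Dict[str, int], b: Dict[str, int]) -> str:
    """Compare two vector clocks: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    # A missing entry means that node's counter has not been seen: treat as 0.
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither event causally precedes the other


def max_disconnected_hours(log_capacity_events: int, events_per_hour: float) -> float:
    """Hours of independent operation before the bounded change log fills."""
    if events_per_hour <= 0:
        raise ValueError("events_per_hour must be positive")
    return log_capacity_events / events_per_hour
```

A result of `"concurrent"` is the case where ordering alone cannot pick a winner and a survivorship rule or a steward has to. For sizing, a hypothetical log of 72,000 events at 1,000 events per hour gives `max_disconnected_hours(72000, 1000) == 72.0`; a surge in update rate shortens that window proportionally.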

The reconnect synchronization protocol operates in three phases when the forward element re-establishes network connectivity with the central hub:

Phase 1 — Change log transmission:
  Forward node transmits its full change log to the central hub.
  Hub acknowledges receipt and assigns each event a global sequence number.
  Forward node continues serving local queries from cache during this phase.

Phase 2 — Conflict detection and resolution:
  Hub replays the forward change log against its current entity state.
  For each forward event, the hub checks whether the same attribute was
  updated at the hub or at another forward node during the disconnected period.

  No conflict: forward change is applied as-is.
  Conflict (same attribute updated by two sources):
    Apply configured survivorship rule (most trusted source, latest logical clock).
    If survivorship rule produces no clear winner: route to steward queue.

Phase 3 — Cache resynchronization:
  Hub transmits the merged post-conflict entity state to the forward cache.
  Forward cache replaces its snapshot with the merged state.
  Hub transmits the resolved change log to all other forward caches
    so they also receive updates that originated at this forward node.
  All nodes confirm receipt before hub marks the sync cycle complete.
            

The eventual consistency model that underlies this architecture provides a guarantee: after reconnection and successful completion of the synchronization protocol, all nodes in the MDM topology (the central hub and all forward caches) hold identical state for every entity they have in common; a forward cache still carries only the subset of entities in its operational scope. During a disconnected period, nodes may diverge. The synchronization protocol is designed so that divergence is bounded, detectable, and resolvable without data loss: no entity update made during the disconnected period is discarded. Updates are only reordered and potentially overridden by a survivorship rule that applies the organization's defined authority hierarchy.
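The conflict detection step of the reconnect protocol can be sketched as below. This is a deliberately reduced model: it resolves conflicts by source authority only and ignores the logical clock tiebreak, and the names `ChangeEvent` and `reconcile` are hypothetical. Overridden events are not deleted from the change log; they simply do not win the attribute.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

Key = Tuple[str, str]  # (entity global ID, attribute name)


@dataclass(frozen=True)
class ChangeEvent:
    entity_global_id: str
    attribute: str
    value: Any
    source: str


def reconcile(forward_events: List[ChangeEvent],
              hub_changes: Dict[Key, ChangeEvent],
              source_priority: Dict[str, List[str]]):
    """Replay forward events against hub-side changes from the same period.

    Returns (applied, steward_queue): the winning event per key, and the
    conflicting pairs that the survivorship rule could not decide.
    """
    applied: Dict[Key, ChangeEvent] = {}
    steward_queue: List[Tuple[ChangeEvent, ChangeEvent]] = []
    for ev in forward_events:
        key = (ev.entity_global_id, ev.attribute)
        rival = hub_changes.get(key)
        if rival is None:
            applied[key] = ev  # no conflict: apply the forward change as-is
            continue
        ranking = source_priority.get(ev.attribute, [])
        ranks = [ranking.index(e.source) if e.source in ranking else None
                 for e in (ev, rival)]
        if None in ranks or ranks[0] == ranks[1]:
            steward_queue.append((ev, rival))  # no clear winner
        else:
            applied[key] = ev if ranks[0] < ranks[1] else rival
    return applied, steward_queue
```

Keeping the undecided pairs together in the steward queue gives the steward both competing values in one case rather than two unrelated alerts.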

The operational implication is that analysts and systems using the forward cache during a disconnected period must understand they are working from a snapshot of uncertain currency. The forward cache should expose a "last sync timestamp" field at the cache level and a "last confirmed" timestamp at the individual golden record attribute level, giving downstream systems the information they need to present appropriate confidence caveats when operational decisions depend on entity data that has not been recently confirmed by the central hub. A unit readiness assessment built on equipment golden records from a snapshot that was already 12 hours old at deployment and has since gone 71 hours without synchronization warrants a different level of confidence than one built on records synchronized 30 minutes before the assessment was generated. Making this distinction visible in the data, rather than hiding it behind a uniform presentation of golden record attributes, is what makes MDM useful rather than misleading in contested operational environments.