Telegram has become one of the highest-signal data sources for real-time cyber threat intelligence. State-aligned threat actors, hacktivist collectives, and criminal groups conducting attacks against government, critical infrastructure, and defense targets routinely announce operations before or immediately after they occur – naming victims, claiming attack vectors, and posting evidence. The problem is volume and structure: hundreds of channels, thousands of messages per day, almost all unstructured natural language, mixed with noise, reposts, and misdirection.
Corvus.Sense is built to solve this problem at production scale. At its core is a multi-stage LLM pipeline that ingests raw Telegram message streams and produces structured threat intelligence records with sector classification, geographic attribution, attack vector tagging, and confidence scoring – in under 90 seconds from message publication. This article describes how that pipeline is architected and why each design decision was made.
Why LLMs and not rule-based extraction
The first design decision was whether to use deterministic extraction (regex, keyword matching, named entity recognition) or generative LLM inference for classification. We evaluated both approaches extensively on a labeled dataset of 12,000 confirmed attack announcements across 34 Telegram channels. The conclusions were unambiguous.
Rule-based systems achieved acceptable precision for well-known actor groups with consistent posting patterns but collapsed on new actors, code-switching (messages mixing Ukrainian, Russian, and English), abbreviations, deliberate obfuscation, and stylistic variation. False negative rates above 30% for new actor channels made rule-based extraction operationally inadequate – missing one in three real attack announcements is not a viable intelligence product.
LLM-based classification achieved over 91% F1 on the same evaluation set, including on code-switching messages and novel actor channels not present in the training data. The tradeoff is latency and cost per message, both of which we address through the pipeline architecture described below.
Pipeline stage 1: ingestion and preprocessing
Corvus.Sense connects to Telegram channels via the Telegram API using dedicated service accounts. Each configured channel is polled on a configurable interval (default 30 seconds). New messages since the last poll timestamp are fetched, deduplicated against the message ID index, and queued for processing.
Preprocessing handles several data quality issues before any inference occurs. Messages shorter than 20 tokens are discarded – they carry insufficient semantic content for classification. Forwarded messages are tracked with their original source channel; if the original has already been processed, the forward is marked as a duplicate and skipped, preventing the same announcement from generating multiple alert records. Media-only messages (images, video without caption) are queued separately for a vision-based pipeline outside the scope of this article.
Language detection runs on each message to tag the source language (ISO 639-1). This tag is passed downstream so that the classification prompt can include few-shot examples in the appropriate language.
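The preprocessing rules above can be sketched as a small routing function. This is an illustrative sketch, not the Corvus.Sense implementation: the `Message` and `Preprocessor` names are hypothetical, tokens are approximated by whitespace splitting (the article does not say which tokenizer is used), and language detection is left out because it depends on an external library.

```python
from dataclasses import dataclass
from typing import Optional

MIN_TOKENS = 20  # threshold from the article; whitespace tokenization is an assumption


@dataclass(frozen=True)
class Message:
    channel_id: int
    message_id: int
    text: str
    # (channel_id, message_id) of the original post when this message is a forward
    forwarded_from: Optional[tuple] = None
    has_media: bool = False


class Preprocessor:
    def __init__(self) -> None:
        self.seen = set()  # stand-in for the message ID index

    def route(self, msg: Message) -> str:
        """Return 'duplicate', 'vision_queue', 'discard', or 'classify'."""
        key = (msg.channel_id, msg.message_id)
        if key in self.seen:
            return "duplicate"
        self.seen.add(key)
        if msg.forwarded_from is not None:
            # A forward whose original was already processed is skipped.
            if msg.forwarded_from in self.seen:
                return "duplicate"
            # Assumption: remember the original so a later poll of it is skipped.
            self.seen.add(msg.forwarded_from)
        if msg.has_media and not msg.text.strip():
            return "vision_queue"  # media without caption goes to the vision pipeline
        if len(msg.text.split()) < MIN_TOKENS:
            return "discard"
        return "classify"
```

Checking media-only messages before the length filter matters here: an image without a caption has zero tokens and would otherwise be discarded rather than queued for the vision pipeline.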
Pipeline stage 2: binary relevance classification
The full LLM classification call is expensive relative to the volume of messages processed. A lightweight binary classifier gate runs before any LLM inference to filter out non-operational content. This classifier is a fine-tuned encoder model (350M parameters) trained to distinguish operational attack announcements from commentary, news reposts, recruitment posts, propaganda, and general channel content.
The binary classifier operates at under 200 milliseconds per message on CPU-only inference hardware. On the production evaluation set, it achieves 94.3% precision and 89.7% recall. The decision threshold is deliberately biased toward recall rather than precision: a false negative at this stage (a real announcement that never proceeds to LLM classification) is costly, whereas a false positive costs only one full LLM inference call, which is the controlled tradeoff.
Key insight: The binary gate is not the accuracy bottleneck – it is a cost filter. Accuracy is delivered by the LLM stage. The gate exists to ensure the LLM handles only candidate operational messages, reducing per-day LLM calls by approximately 78% compared to running LLM inference on the full message stream.
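One way to set a recall-oriented gate threshold is to choose it from a labeled validation set: take the highest score cutoff that still lets a target share of true announcements through. The sketch below assumes the classifier exposes a score in [0, 1] per message; `pick_threshold` and `gate` are hypothetical names, and the article does not describe how the production threshold is actually chosen.

```python
import math
from typing import Callable, Iterable, List, Sequence


def pick_threshold(scores: Sequence[float], labels: Sequence[bool],
                   target_recall: float) -> float:
    """Highest threshold that keeps at least target_recall of the positives."""
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    positives = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    if not positives:
        raise ValueError("validation set contains no positive examples")
    # Number of positives that must score at or above the threshold.
    needed = max(1, math.ceil(target_recall * len(positives)))
    return positives[needed - 1]


def gate(messages: Iterable[str], score: Callable[[str], float],
         threshold: float) -> List[str]:
    """Keep only messages the classifier scores at or above the threshold."""
    return [m for m in messages if score(m) >= threshold]
```

Lowering `target_recall` raises the threshold and saves LLM calls; raising it does the opposite. That is the cost and recall dial the gate exists to provide.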
Pipeline stage 3: LLM classification and enrichment
Messages passing the binary gate enter the LLM classification stage. Corvus.Sense uses a structured output prompt that instructs the model to extract and classify each of the following fields from the message text:
Victim organization. The named or implied target organization, normalized to a canonical form. Where the message names a specific organization (e.g. a ministry, utility, or financial institution), that name is extracted verbatim. Where the victim is implied by sector and geography without a specific name, the field is populated as null and flagged for analyst review.
Sector classification. One of eight fixed taxonomy labels: critical infrastructure, financial, government, telecom, energy, defense, healthcare, or transport. The fixed taxonomy is intentional – open-ended classification produces inconsistent labels that cannot be aggregated reliably. The LLM is provided with definitions for each category and instructed to select the single best-fit label.
Geography attribution. ISO 3166-1 alpha-2 country code for the victim's country of operation. Where multiple countries are named as targets, all are extracted as an array. The model is explicitly instructed to distinguish the victim country from the actor's presumed origin country – a common source of error in naive extraction approaches.
Attack vector. One of six vector categories: DDoS, defacement, data exfiltration, ransomware, credential theft, or supply chain compromise. Multi-vector attacks are represented as an array.
Confidence scores. For each extracted field, the model returns a confidence score from 0 to 1. The prompt instructs the model to represent genuine epistemic uncertainty – a message that says "we will attack Ukrainian energy" yields high confidence on geography (UA) and sector (energy) but lower confidence on attack vector (not specified) and victim organization (not named). The scores are not calibrated post hoc; they are derived directly from the model's uncertainty representation during generation.
The LLM prompt is structured to produce a JSON response conforming to a strict schema. Response parsing validates the schema on receipt; malformed responses trigger an automatic retry with an additional instruction to correct the formatting error. Retries are capped at two; messages that still produce malformed output after both retries are flagged for analyst review and removed from automated processing.
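A minimal sketch of the validate-and-retry loop described above, using only the standard library. The JSON key names (`victim`, `sector`, `countries`, `vectors`, `confidence`) and the `call_llm` callable are assumptions made for illustration; the article does not publish the actual schema or the model client.

```python
import json
from typing import Callable, Optional

SECTORS = {"critical infrastructure", "financial", "government", "telecom",
           "energy", "defense", "healthcare", "transport"}
VECTORS = {"DDoS", "defacement", "data exfiltration", "ransomware",
           "credential theft", "supply chain compromise"}
FIELDS = ("victim", "sector", "countries", "vectors")  # hypothetical key names
MAX_RETRIES = 2


def validate(raw: str) -> dict:
    """Parse one model response; raise ValueError if it violates the schema."""
    try:
        rec = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(rec, dict):
        raise ValueError("top-level value must be an object")
    missing = [k for k in (*FIELDS, "confidence") if k not in rec]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if rec["victim"] is not None and not isinstance(rec["victim"], str):
        raise ValueError("victim must be a string or null")
    if not isinstance(rec["sector"], str) or rec["sector"] not in SECTORS:
        raise ValueError("sector is not one of the fixed taxonomy labels")
    countries = rec["countries"]
    if not isinstance(countries, list) or not all(
            isinstance(c, str) and len(c) == 2 and c.isalpha() and c.isupper()
            for c in countries):
        raise ValueError("countries must be a list of alpha-2 codes")
    vectors = rec["vectors"]
    if not isinstance(vectors, list) or not all(
            isinstance(v, str) and v in VECTORS for v in vectors):
        raise ValueError("vectors must be a list of fixed vector labels")
    conf = rec["confidence"]
    if not isinstance(conf, dict):
        raise ValueError("confidence must be an object")
    for key in FIELDS:
        score = conf.get(key)
        if isinstance(score, bool) or not isinstance(score, (int, float)) \
                or not 0 <= score <= 1:
            raise ValueError(f"confidence.{key} must be a number in [0, 1]")
    return rec


def classify(message: str, call_llm: Callable[[str], str]) -> Optional[dict]:
    """One initial attempt plus up to MAX_RETRIES retries; None means analyst review."""
    prompt = message
    for _ in range(1 + MAX_RETRIES):
        try:
            return validate(call_llm(prompt))
        except ValueError as err:
            # Feed the validation error back as a formatting correction.
            prompt = (f"{message}\n\nYour previous response was rejected "
                      f"({err}). Return only JSON matching the schema.")
    return None
```

Rejecting any label outside the fixed sets at parse time is what enforces the closed taxonomy: a response containing "power grid" as a sector fails validation instead of leaking a new synonym downstream.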
Key insight: The fixed taxonomy constraint for sector and attack vector is critical for operational usability. An LLM left to generate free-text classification labels will produce inconsistent synonyms – "power grid," "electricity infrastructure," and "utility sector" all refer to energy sector targets but cannot be aggregated without a normalization step. Constraining to a fixed label set at inference time eliminates this entire class of data quality problems downstream.
Attack chain graph construction
Each classified message record is written to the attack chain graph database after LLM classification. The graph models the threat landscape as a property graph with three node types: threat actors, victim organizations, and attack events. Edges represent relationships: "conducted" (actor to event), "targeted" (event to victim), and "used vector" (event to attack vector taxonomy node).
When a new classified record arrives, the graph engine performs entity resolution: checking whether the named victim organization already exists as a node (using fuzzy name matching and country code disambiguation) and whether the source Telegram channel maps to a known actor profile. If both resolve, an edge is created connecting the actor node to the victim node via a new attack event node. If the actor is new (channel not yet mapped to a profile), a provisional actor node is created for analyst review.
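The victim side of entity resolution can be illustrated with the standard library's `difflib`. This is a sketch under assumptions: the article does not say which fuzzy matching algorithm or similarity cutoff Corvus.Sense uses, so `SequenceMatcher`, the 0.85 cutoff, and the `resolve_victim` helper are stand-ins.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def normalize(name: str) -> str:
    """Lowercase, drop periods and commas, collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def resolve_victim(name: str, country: str,
                   nodes: Dict[str, Tuple[str, str]],
                   min_ratio: float = 0.85) -> Optional[str]:
    """Return the id of the best-matching victim node, or None if nothing matches.

    nodes maps node id -> (canonical name, ISO 3166-1 alpha-2 country code).
    """
    target = normalize(name)
    best_id, best_ratio = None, min_ratio
    for node_id, (canonical, node_country) in nodes.items():
        # Country code disambiguation: same name in another country is a different node.
        if node_country != country:
            continue
        ratio = SequenceMatcher(None, target, normalize(canonical)).ratio()
        if ratio >= best_ratio:
            best_id, best_ratio = node_id, ratio
    return best_id
```

A `None` result would correspond to creating a new victim node; comparing only within the same country code keeps "Ministry of Energy" in Ukraine and in Poland as separate nodes.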
The graph enables queries that flat record databases cannot support efficiently. Examples from analyst workflows: "Show all organizations in the energy sector targeted by this actor in the past 90 days, with attack vector breakdown." "Which actors have targeted both defense and financial sector organizations in Poland this month?" "What is the time distribution of attacks by this group relative to kinetic events in the theater?" These queries run as graph traversals, returning results in seconds on graphs of tens of thousands of nodes.
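To make the traversal idea concrete, here is the first example query expressed over a tiny in-memory graph. The article does not name the graph database or its query language, so plain dictionaries stand in for the edge types, and `AttackGraph` and `victims_by_sector` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class AttackGraph:
    conducted: dict = field(default_factory=dict)      # actor id -> [event ids]
    targeted: dict = field(default_factory=dict)       # event id -> victim id
    used_vector: dict = field(default_factory=dict)    # event id -> [vector labels]
    event_time: dict = field(default_factory=dict)     # event id -> aware datetime
    victim_sector: dict = field(default_factory=dict)  # victim id -> sector label

    def add_event(self, event_id, actor, victim, sector, vectors, when):
        self.conducted.setdefault(actor, []).append(event_id)
        self.targeted[event_id] = victim
        self.used_vector[event_id] = list(vectors)
        self.event_time[event_id] = when
        self.victim_sector[victim] = sector


def victims_by_sector(graph, actor, sector, now, days=90):
    """Victims in `sector` hit by `actor` in the last `days`, with vector counts."""
    cutoff = now - timedelta(days=days)
    result = {}
    # Traverse actor -> conducted -> event -> targeted -> victim.
    for event_id in graph.conducted.get(actor, []):
        if graph.event_time[event_id] < cutoff:
            continue
        victim = graph.targeted[event_id]
        if graph.victim_sector.get(victim) != sector:
            continue
        result.setdefault(victim, Counter()).update(graph.used_vector[event_id])
    return result
```

The query starts from one actor node and only touches that actor's events, which is why this kind of question stays cheap as the overall graph grows.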
Before LLM-based extraction, OSINT-based threat monitoring could not reach this level of structure at the scale and accuracy needed to populate a graph continuously from open sources. Previous approaches required significant manual analyst effort per record, which constrained graph density and freshness.
Pattern-of-life analysis for threat actor groups
Once a threat actor profile accumulates sufficient history in the graph (typically 7 or more days of ingestion), Corvus.Sense computes pattern-of-life metrics. These are derived from the temporal and structural properties of the actor's attack event nodes in the graph.
Activity hour distribution. Attack event timestamps are binned by UTC hour of day and day of week. Most state-aligned groups operate during business hours in their home time zone; deviations from this pattern (unusual late-night surges, weekend spikes) can indicate operational tempo changes or involvement of multiple geographically distributed sub-groups. The activity hour histogram updates daily.
Target preference heatmap. The ratio of attacks by sector and geography is computed across the actor's full event history. This surfaces consistent targeting preferences – an actor that has attacked Ukrainian energy infrastructure in 73% of events is clearly specialized, and new announcements against energy targets from that actor should receive elevated priority regardless of confidence score.
TTP evolution tracking. Attack vector distributions are computed across rolling 30-day windows and compared to the actor's historical baseline. A group that historically conducted DDoS operations and is now being classified as conducting data exfiltration represents a TTP shift – a high-value intelligence signal indicating capability development or changed objectives.
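The three metrics above reduce to simple counting over an actor's event history. The sketch below is illustrative only: the function names are hypothetical, timestamps are assumed to be timezone-aware, and total variation distance is used as the baseline comparison because the article does not say which distance measure Corvus.Sense applies.

```python
from collections import Counter
from datetime import datetime, timezone


def activity_histogram(timestamps):
    """Count events per (weekday, hour) in UTC; Monday is weekday 0."""
    hist = Counter()
    for ts in timestamps:
        utc = ts.astimezone(timezone.utc)  # timestamps must be timezone-aware
        hist[(utc.weekday(), utc.hour)] += 1
    return hist


def _distribution(labels):
    """Relative frequency of each label; empty input gives an empty dict."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {k: n / total for k, n in counts.items()} if total else {}


def target_preference(events):
    """Share of events per (sector, country) pair across the full history."""
    return _distribution(events)


def vector_shift(baseline_vectors, recent_vectors):
    """Total variation distance between two attack vector distributions.

    0.0 means identical mixes; 1.0 means no overlap at all.
    """
    p = _distribution(baseline_vectors)
    q = _distribution(recent_vectors)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))
```

An actor whose baseline is entirely DDoS and whose latest 30-day window is half DDoS and half data exfiltration scores 0.5 on `vector_shift`; where to set the alerting cutoff on that value is a deployment decision.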
Key insight: Pattern-of-life analysis is most valuable not for confirming what you already know about a threat actor but for detecting when their behavior changes. Stable patterns are useful baselines; deviations from those patterns are the signal that warrants analyst attention and potential escalation to senior intelligence consumers.
Automated executive summary generation
Corvus.Sense includes an automated summary generation pipeline that produces human-readable intelligence products from the structured graph data. Summaries are generated on a configurable schedule (daily, weekly, or on-demand) or triggered by threshold events (attack count by a tracked actor exceeding a configured limit within a time window).
The summary pipeline queries the graph for the relevant actor, sector, or geography scope defined by the report template, retrieves the structured event records and pattern-of-life metrics, and passes this structured context to a generation model with a narrative synthesis prompt. The output is a prose intelligence brief in the register appropriate for executive consumers – no JSON, no field labels, no confidence scores unless they are analytically significant.
Critically, the summary generation model operates on structured data retrieved from the graph, not on raw Telegram message text. This architectural separation prevents hallucination from ambiguous source material: the generation model can only reference events that exist as validated classified records in the graph. If a claimed attack has not passed classification quality checks, it does not appear in a summary.
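The trigger condition and the structured-context step can be sketched as follows. Everything here is an assumption made for illustration: the `validated` flag, the record layout, and both function names are hypothetical, and the actual generation prompt is not shown in the article.

```python
import json
from datetime import datetime, timedelta, timezone


def should_trigger(event_times, now, window, limit):
    """True when the count of events inside the window exceeds the limit."""
    recent = [t for t in event_times if now - window <= t <= now]
    return len(recent) > limit


def build_context(records):
    """Serialize only validated records as context for the generation model.

    Raw message text is deliberately excluded, so the model can only refer
    to events that exist as classified records.
    """
    allowed = ("timestamp", "actor", "victim", "sector", "countries", "vectors")
    validated = [
        {k: r[k] for k in allowed if k in r}
        for r in records
        if r.get("validated")  # hypothetical flag set after quality checks
    ]
    validated.sort(key=lambda r: r.get("timestamp", ""))
    return json.dumps(validated, ensure_ascii=False, indent=2)
```

Whitelisting the keys that reach the prompt is one simple way to enforce the separation described above: a `raw_text` field attached to a record never enters the generation context.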
Confidence scoring and uncertainty handling
Every classified record in Corvus.Sense carries field-level confidence scores. These scores flow through to all downstream consumers: the analyst dashboard displays confidence visually, alert rules can be configured to trigger only above a minimum threshold per field, and the STIX export maps confidence scores to the STIX confidence property.
Records where any critical field (sector, geography, or actor attribution) falls below the configured threshold are placed in the analyst review queue rather than generating automated alerts. The threshold is configurable per deployment: high-sensitivity installations monitoring critical infrastructure can lower thresholds to maximize recall; broader monitoring deployments can raise thresholds to reduce analyst queue volume.
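The routing rule is small enough to state in code. In this sketch the field keys and the `route_record` helper are hypothetical; the threshold values would come from the deployment configuration.

```python
CRITICAL_FIELDS = ("sector", "geography", "actor")


def route_record(confidence: dict, thresholds: dict) -> tuple:
    """Return (queue, weak_fields) for one classified record.

    A record goes to analyst review if any critical field scores below its
    configured threshold; a missing score is treated as 0.0.
    """
    weak = [f for f in CRITICAL_FIELDS
            if confidence.get(f, 0.0) < thresholds[f]]
    queue = "analyst_review" if weak else "automated_alert"
    return queue, weak
```

For example, with all thresholds at 0.80 a record scoring 0.70 on geography is queued for review, while the same record under a 0.65 threshold generates an automated alert. That is the recall versus queue volume tradeoff the per-deployment setting controls.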
For fields where the LLM's confidence is marginal (between 0.65 and 0.80 by default), Corvus.Sense optionally submits the message for a second independent LLM pass using a different prompt formulation. When both passes agree on a field value, the confidence score is elevated; when they disagree, the field is flagged as contested and both candidate values are surfaced to the analyst.
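A sketch of the second-pass logic for a single field. The 0.65 to 0.80 band is the default stated above; treating the band as inclusive, the size of the confidence increase on agreement (`AGREEMENT_BOOST`), and the `reconcile` helper itself are assumptions, since the article does not specify them.

```python
from typing import Any, Callable

MARGINAL_LOW, MARGINAL_HIGH = 0.65, 0.80  # default band from the article
AGREEMENT_BOOST = 0.10                    # hypothetical; no value is given


def reconcile(value: Any, confidence: float,
              second_pass: Callable[[], Any]) -> dict:
    """Re-check a marginal field with a second, differently prompted pass."""
    if not (MARGINAL_LOW <= confidence <= MARGINAL_HIGH):
        # Outside the marginal band the second pass is never invoked.
        return {"value": value, "confidence": confidence,
                "status": "single_pass"}
    other = second_pass()
    if other == value:
        boosted = min(1.0, confidence + AGREEMENT_BOOST)
        return {"value": value, "confidence": boosted, "status": "confirmed"}
    # Disagreement: surface both candidates to the analyst.
    return {"value": None, "candidates": [value, other],
            "confidence": confidence, "status": "contested"}
```

Passing the second pass as a callable keeps the extra LLM call lazy: it is only paid for when the first score actually falls inside the marginal band.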
Configuring Corvus.Sense to track a specific threat actor
The following sequence describes how to set up Corvus.Sense for targeted monitoring of a named hacker group across its Telegram channels.
Step 1 – Identify the actor's Telegram channels. Compile numeric channel IDs and @usernames for all known channels operated by or affiliated with the target group, including mirror and backup channels. Corvus.Sense accepts both formats.
Step 2 – Create an actor profile. In the Actors panel, create a new profile with the canonical group name and known aliases. Assign MITRE ATT&CK technique IDs reflecting the group's known TTPs. Link the channel identifiers to this profile. From this point, all messages from those channels are associated with this actor node in the graph.
Step 3 – Configure sector and geography scope. Select the sectors and country codes you want to monitor for this actor. Out-of-scope attacks are still ingested and classified but excluded from actor-specific alert generation. This allows broad ingestion while keeping alert volume focused on operationally relevant events.
Step 4 – Set confidence thresholds and alert delivery. Configure minimum confidence thresholds per field. For defense and critical infrastructure sectors, lower thresholds (0.65) maximize recall. Configure alert delivery to email, webhook, or a SIEM integration endpoint. Corvus.Sense supports CEF and JSON alert formats for SIEM ingestion.
Step 5 – Review and correct initial classifications. During the first 72 hours, review all classified records in the analyst queue for this actor, regardless of confidence score. Inline correction tools allow field-level edits. Corrections are logged and can be submitted to improve model calibration for this actor's linguistic patterns over time.
Step 6 – Enable pattern-of-life analysis. After 7 days of accumulated event data, enable the pattern-of-life view. Activity hour distributions, target preference heatmaps, and TTP histograms are computed from the graph and updated daily. This view is the primary input for anticipating future targeting behavior.
Step 7 – Export structured intelligence. Use the actor profile export to generate intelligence products in JSON, PDF, or STIX 2.1 bundle format. The STIX export maps actor profile data to STIX Threat Actor and Campaign objects for sharing via TAXII or import into external CTI platforms.
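For readers integrating the export from Step 7, the following sketch shows roughly what a STIX 2.1 Threat Actor object inside a bundle looks like when built by hand with the standard library. It is not the Corvus.Sense exporter: the helper names are hypothetical, and scaling a 0 to 1 score to the STIX 0 to 100 integer `confidence` property by simple rounding is an assumed mapping.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_timestamp(dt: datetime) -> str:
    """Format an aware datetime as an RFC 3339 UTC timestamp with milliseconds."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def threat_actor(name: str, aliases: list, confidence: float,
                 now: datetime) -> dict:
    """Build a minimal STIX 2.1 threat-actor object."""
    ts = stix_timestamp(now)
    return {
        "type": "threat-actor",
        "spec_version": "2.1",
        "id": f"threat-actor--{uuid.uuid4()}",
        "created": ts,
        "modified": ts,
        "name": name,
        "aliases": list(aliases),
        # STIX confidence is an integer from 0 to 100 (assumed linear mapping).
        "confidence": round(confidence * 100),
    }


def make_bundle(objects: list) -> dict:
    """Wrap STIX objects in a bundle for file export or TAXII sharing."""
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}",
            "objects": list(objects)}
```

Serializing the result with `json.dumps(make_bundle([...]), indent=2)` gives a file that external CTI platforms can import; Campaign objects and the relationships between them would be added to the same `objects` list.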