How does Corvus.Sense distinguish a real attack announcement from noise on Telegram?

Corvus.Sense applies a multi-stage filtering pipeline before any classification occurs. A lightweight binary classifier first separates content that semantically resembles an operational announcement (target named, claim of action, timeframe) from general commentary, reposts of news articles, and recruitment posts. Only messages that pass this gate enter the LLM classification stage. The binary classifier is fine-tuned on a labeled dataset of confirmed attack announcements and false positives, achieving over 94% precision on held-out evaluation data.

What output format does the LLM classification pipeline produce?

Each classified message produces a structured JSON object containing: extracted victim organization name and normalized organization type, sector classification from a fixed taxonomy (critical infrastructure, financial, government, telecom, energy, defense, healthcare, transport), ISO 3166-1 alpha-2 country code for the victim geography, attack vector classification (DDoS, defacement, data exfiltration, ransomware, credential theft, supply chain), confidence score between 0 and 1 for each field, and the source message metadata (channel ID, message ID, timestamp). This output feeds directly into the graph database and the analyst dashboard.
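To make the record shape concrete, here is a minimal sketch of such an object as a Python dataclass. The field names (`victim_org`, `org_type`, `attack_vector`, and so on) are assumptions chosen for illustration; only the list of fields and the two fixed taxonomies come from the description above, and the actual schema may differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

SECTORS = {"critical infrastructure", "financial", "government", "telecom",
           "energy", "defense", "healthcare", "transport"}
VECTORS = {"DDoS", "defacement", "data exfiltration", "ransomware",
           "credential theft", "supply chain"}


@dataclass
class ClassifiedMessage:
    # Field names are illustrative assumptions, not the product's schema.
    victim_org: Optional[str]
    org_type: Optional[str]
    sector: str
    country: str          # ISO 3166-1 alpha-2, e.g. "UA"
    attack_vector: str
    confidence: dict      # per-field score in [0, 1]
    channel_id: int
    message_id: int
    timestamp: str        # ISO 8601

    def __post_init__(self):
        # Reject labels outside the fixed taxonomies and malformed values.
        if self.sector not in SECTORS:
            raise ValueError(f"unknown sector: {self.sector!r}")
        if self.attack_vector not in VECTORS:
            raise ValueError(f"unknown attack vector: {self.attack_vector!r}")
        if not (len(self.country) == 2 and self.country.isalpha()
                and self.country.isupper()):
            raise ValueError(f"not an alpha-2 code: {self.country!r}")
        for name, score in self.confidence.items():
            if not 0.0 <= score <= 1.0:
                raise ValueError(f"confidence out of range for {name}")


record = ClassifiedMessage(
    victim_org=None, org_type=None, sector="energy", country="UA",
    attack_vector="DDoS",
    confidence={"sector": 0.93, "country": 0.95, "attack_vector": 0.41},
    channel_id=1001, message_id=42, timestamp="2024-01-01T09:30:00Z",
)
print(json.dumps(asdict(record), indent=2))
```

Validating at construction time means a record with an off-taxonomy label never reaches the graph or the dashboard.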

What is the end-to-end latency from a Telegram message being posted to it appearing in the analyst dashboard?

Under normal operating conditions, end-to-end latency from message publication on Telegram to a structured record appearing in the analyst console is under 90 seconds. The Telegram API polling interval is configurable (default 30 seconds). Binary classification takes under 200 milliseconds. The LLM classification call, including prompt construction and response parsing, averages 8-12 seconds per message. Graph enrichment and dashboard write takes an additional 2-4 seconds.
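The figures above can be added up as a rough latency budget. The sketch below takes the upper end of each quoted range and assumes the worst case for polling (a message posted immediately after a poll); the stage names are ours, and queueing or retry time is not itemized in the source.

```python
# Per-stage worst-case latency in seconds, using the figures quoted above.
STAGE_SECONDS = {
    "poll_wait": 30.0,            # default polling interval, worst case
    "binary_classifier": 0.2,     # "under 200 milliseconds"
    "llm_classification": 12.0,   # upper end of the 8-12 second average
    "graph_and_dashboard": 4.0,   # upper end of the 2-4 second range
}
TARGET_SECONDS = 90.0

worst_case = sum(STAGE_SECONDS.values())
headroom = TARGET_SECONDS - worst_case
print(f"worst case ~{worst_case:.1f}s, headroom ~{headroom:.1f}s")
```

The itemized stages account for roughly half of the 90-second target, which suggests the remainder is headroom for queueing and retries.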

How does Corvus.Sense handle low-confidence classifications?

Records where the LLM returns a confidence score below the configured threshold (default 0.72) are flagged with a low-confidence indicator in the dashboard and excluded from automated alert generation. Analysts can review these records manually and apply corrections, which feed back into the fine-tuning dataset. For critical fields like sector and geography, Corvus.Sense requests a second independent LLM pass when confidence is marginal and accepts only the values on which both passes agree.

Can Corvus.Sense track hacker groups across multiple Telegram channels simultaneously?

Yes. Corvus.Sense ingests from multiple Telegram channels concurrently. Each channel is associated with one or more tracked threat actor profiles in the system configuration. When a message is classified, it is linked to the originating channel and thereby to the associated actor node in the attack chain graph. This enables cross-channel correlation: the same actor operating across multiple channels is consolidated into a single activity timeline in the graph view.

How Corvus.Sense uses LLMs to classify and triage cyber threats at scale

Telegram has become one of the highest-signal data sources for real-time cyber threat intelligence. State-aligned threat actors, hacktivist collectives, and criminal groups conducting attacks against government, critical infrastructure, and defense targets routinely announce operations before or immediately after they occur – naming victims, claiming attack vectors, and posting evidence. The problem is volume and structure: hundreds of channels, thousands of messages per day, almost all unstructured natural language, mixed with noise, reposts, and misdirection.

Corvus.Sense is built to solve this problem at production scale. At its core is a multi-stage LLM pipeline that ingests raw Telegram message streams and produces structured threat intelligence records with sector classification, geographic attribution, attack vector tagging, and confidence scoring – in under 90 seconds from message publication. This article describes how that pipeline is architected and why each design decision was made.

Why LLMs and not rule-based extraction

The first design decision was whether to use deterministic extraction (regex, keyword matching, named entity recognition) or generative LLM inference for classification. We evaluated both approaches extensively on a labeled dataset of 12,000 confirmed attack announcements across 34 Telegram channels. The conclusions were unambiguous.

Rule-based systems achieved acceptable precision for well-known actor groups with consistent posting patterns but collapsed on new actors, code-switching (messages mixing Ukrainian, Russian, and English), abbreviations, deliberate obfuscation, and stylistic variation. False negative rates above 30% for new actor channels made rule-based extraction operationally inadequate – missing one in three real attack announcements is not a viable intelligence product.

LLM-based classification achieved over 91% F1 on the same evaluation set, including on code-switching messages and novel actor channels not present in the training data. The tradeoff is latency and cost per message, both of which we address through the pipeline architecture described below.

Pipeline stage 1: ingestion and preprocessing

Corvus.Sense connects to Telegram channels via the Telegram API using dedicated service accounts. Each configured channel is polled on a configurable interval (default 30 seconds). New messages since the last poll timestamp are fetched, deduped against the message ID index, and queued for processing.
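A minimal sketch of the poll-and-dedupe step follows. The `ChannelPoller` class and the message dict shape (`id`, `ts`, `text`) are hypothetical, and the actual Telegram fetch call is deliberately left out: the caller is assumed to fetch messages newer than `last_poll_ts` with whatever client it uses and pass the batch to `ingest`.

```python
from collections import deque


class ChannelPoller:
    """Per-channel polling state. All names here are hypothetical."""

    def __init__(self):
        self.last_poll_ts = 0      # Unix time of the newest message seen
        self.seen_ids = set()      # stand-in for the message ID index
        self.queue = deque()       # messages awaiting preprocessing

    def ingest(self, batch):
        """Queue unseen messages from one poll; return how many were queued."""
        queued = 0
        for msg in sorted(batch, key=lambda m: m["ts"]):
            if msg["id"] in self.seen_ids:
                continue           # already fetched by an earlier poll
            self.seen_ids.add(msg["id"])
            self.queue.append(msg)
            self.last_poll_ts = max(self.last_poll_ts, msg["ts"])
            queued += 1
        return queued


poller = ChannelPoller()
poller.ingest([{"id": 1, "ts": 100, "text": "a"}, {"id": 2, "ts": 130, "text": "b"}])
# An overlapping batch only contributes the message that was not seen before.
poller.ingest([{"id": 2, "ts": 130, "text": "b"}, {"id": 3, "ts": 160, "text": "c"}])
```

Deduplicating on message ID as well as timestamp guards against overlapping fetch windows returning the same message twice.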

Preprocessing handles several data quality issues before any inference occurs. Messages shorter than 20 tokens are discarded – they carry insufficient semantic content for classification. Forwarded messages are tracked with their original source channel; if the original has already been processed, the forward is marked as a duplicate and skipped, preventing the same announcement from generating multiple alert records. Media-only messages (images, video without caption) are queued separately for a vision-based pipeline outside the scope of this article.
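The routing rules in this paragraph can be sketched as a single function. The message shape, the `forward_from` tuple, and the whitespace-based token count are all assumptions made for illustration; a production tokenizer would count differently.

```python
MIN_TOKENS = 20


def route_message(msg, processed_keys):
    """Return 'media', 'discard', 'duplicate', or 'classify' for one message.

    msg is a dict with 'channel_id', 'id', 'text' and an optional
    'forward_from' tuple (channel_id, message_id); this shape is hypothetical.
    """
    text = (msg.get("text") or "").strip()
    if not text:
        return "media"                    # image or video without a caption
    if len(text.split()) < MIN_TOKENS:    # crude whitespace token count
        return "discard"
    # A forward is identified by its original message, not by its own ID.
    key = msg.get("forward_from") or (msg["channel_id"], msg["id"])
    if key in processed_keys:
        return "duplicate"
    processed_keys.add(key)
    return "classify"


seen = set()
body = " ".join(["word"] * 25)
original = {"channel_id": 1, "id": 10, "text": body}
forward = {"channel_id": 2, "id": 99, "text": body, "forward_from": (1, 10)}
print(route_message(original, seen), route_message(forward, seen))
```

One consequence of keying on the original message is that an original arriving after its own forward is also treated as a duplicate in this sketch; whether the product behaves that way is not stated.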

Language detection runs on each message to tag the source language (ISO 639-1). This tag is passed downstream to the LLM prompt to enable language-appropriate few-shot examples in the classification prompt.

Pipeline stage 2: binary relevance classification

The full LLM classification call is expensive relative to the volume of messages processed. A lightweight binary classifier gate runs before any LLM inference to filter out non-operational content. This classifier is a fine-tuned encoder model (350M parameters) trained to distinguish operational attack announcements from commentary, news reposts, recruitment posts, propaganda, and general channel content.

The binary classifier operates at under 200 milliseconds per message on CPU-only inference hardware. On the production evaluation set, it achieves 94.3% precision and 89.7% recall. Precision is intentionally not pushed higher – the cost of a false negative at this stage (a real announcement that does not proceed to LLM classification) is high, so the decision threshold is set conservatively to favor recall. A false positive at this stage costs one full LLM inference call, which is the controlled tradeoff.
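The trade-off between precision and recall at the gate comes down to where the decision threshold sits. The sketch below uses a handful of made-up scores (not evaluation data from Corvus.Sense) to show how lowering the threshold recovers missed announcements at the price of extra LLM calls.

```python
def precision_recall(scored, threshold):
    """scored: list of (score, is_attack) pairs; returns (precision, recall)."""
    tp = sum(1 for s, y in scored if s >= threshold and y)
    fp = sum(1 for s, y in scored if s >= threshold and not y)
    fn = sum(1 for s, y in scored if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy classifier scores with ground-truth labels, purely illustrative.
SCORED = [(0.95, True), (0.90, True), (0.80, True), (0.70, False),
          (0.60, True), (0.40, False), (0.30, True), (0.20, False)]

strict = precision_recall(SCORED, 0.75)    # fewer LLM calls, more misses
lenient = precision_recall(SCORED, 0.50)   # more LLM calls, fewer misses
print(strict, lenient)
```

On this toy data the stricter threshold passes no false positives but misses two of five real announcements, while the lower one lets one false positive through and misses only one.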

Key insight: The binary gate is not the accuracy bottleneck – it is a cost filter. Accuracy is delivered by the LLM stage. The gate exists to ensure the LLM handles only candidate operational messages, reducing per-day LLM calls by approximately 78% compared to running LLM inference on the full message stream.

Pipeline stage 3: LLM classification and enrichment

Messages passing the binary gate enter the LLM classification stage. Corvus.Sense uses a structured output prompt that instructs the model to extract and classify each of the following fields from the message text:

Victim organization. The named or implied target organization, normalized to a canonical form. Where the message names a specific organization (e.g. a ministry, utility, or financial institution), that name is extracted verbatim. Where the victim is implied by sector and geography without a specific name, the field is populated as null and flagged for analyst review.

Sector classification. One of eight fixed taxonomy labels: critical infrastructure, financial, government, telecom, energy, defense, healthcare, or transport. The fixed taxonomy is intentional – open-ended classification produces inconsistent labels that cannot be aggregated reliably. The LLM is provided with definitions for each category and instructed to select the single best-fit label.

Geography attribution. ISO 3166-1 alpha-2 country code for the victim's country of operation. Where multiple countries are named as targets, all are extracted as an array. The model is explicitly instructed to distinguish the victim country from the actor's presumed origin country – a common source of error in naive extraction approaches.

Attack vector. One of six vector categories: DDoS, defacement, data exfiltration, ransomware, credential theft, or supply chain compromise. Multi-vector attacks are represented as an array.

Confidence scores. For each extracted field, the model returns a confidence score from 0 to 1. The prompt instructs the model to represent genuine epistemic uncertainty – a message that says "we will attack Ukrainian energy" yields high confidence on geography (UA) and sector (energy) but lower confidence on attack vector (not specified) and victim organization (not named). Scores are not post-hoc calibration; they are derived directly from the model's uncertainty representation during generation.

The LLM prompt is structured to produce a JSON response conforming to a strict schema. Response parsing validates the schema on receipt; malformed responses trigger an automatic retry with an additional instruction to correct the formatting error. Retry logic caps at two attempts; messages that still produce malformed output after two retries are flagged for analyst review and removed from automated processing.
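The validate-and-retry loop described here can be sketched as follows. `call_llm` is a hypothetical callable standing in for the model client, and the required field names are assumptions; the only details taken from the text are schema validation on receipt, a corrective instruction on retry, and the cap of two retries.

```python
import json

REQUIRED_FIELDS = {"victim_org", "sector", "countries", "attack_vectors", "confidence"}
MAX_RETRIES = 2


def parse_response(raw):
    """Parse and minimally validate one model response; raise ValueError if bad."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data


def classify_with_retry(call_llm, message):
    """Return the parsed record, or None if the output stays malformed.

    call_llm(message, correction) -> str is a hypothetical model wrapper.
    """
    correction = None
    for _ in range(1 + MAX_RETRIES):      # one initial call plus two retries
        raw = call_llm(message, correction)
        try:
            return parse_response(raw)
        except ValueError as err:
            correction = f"Previous output was invalid ({err}). Return valid JSON only."
    return None                           # caller flags for analyst review
```

Returning `None` rather than raising keeps the decision about the analyst review queue with the caller.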

Key insight: The fixed taxonomy constraint for sector and attack vector is critical for operational usability. An LLM left to generate free-text classification labels will produce inconsistent synonyms – "power grid," "electricity infrastructure," and "utility sector" all refer to energy sector targets but cannot be aggregated without a normalization step. Constraining to a fixed label set at inference time eliminates this entire class of data quality problems downstream.

Attack chain graph construction

Each classified message record is written to the attack chain graph database after LLM classification. The graph models the threat landscape as a property graph with three node types: threat actors, victim organizations, and attack events. Edges represent relationships: "conducted" (actor to event), "targeted" (event to victim), and "used vector" (event to attack vector taxonomy node).

When a new classified record arrives, the graph engine performs entity resolution: checking whether the named victim organization already exists as a node (using fuzzy name matching and country code disambiguation) and whether the source Telegram channel maps to a known actor profile. If both resolve, an edge is created connecting the actor node to the victim node via a new attack event node. If the actor is new (channel not yet mapped to a profile), a provisional actor node is created for analyst review.
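As a rough illustration of the entity resolution step, the sketch below uses `difflib.SequenceMatcher` from the Python standard library for fuzzy name matching. The matching algorithm, the 0.85 threshold, and the node shapes are our assumptions; the source does not say which technique the graph engine uses.

```python
from difflib import SequenceMatcher


def resolve_victim(name, country, nodes, threshold=0.85):
    """Return the best-matching existing victim node, or None.

    nodes: list of dicts with 'name' and 'country' (hypothetical shape).
    The country code disambiguates organizations with similar names.
    """
    best, best_score = None, 0.0
    for node in nodes:
        if node["country"] != country:
            continue
        score = SequenceMatcher(None, name.lower(), node["name"].lower()).ratio()
        if score > best_score:
            best, best_score = node, score
    return best if best_score >= threshold else None


def resolve_actor(channel_id, channel_to_actor):
    """Map a channel to its actor profile, or create a provisional one."""
    actor = channel_to_actor.get(channel_id)
    if actor is None:
        return {"name": f"provisional-{channel_id}", "provisional": True}
    return {"name": actor, "provisional": False}


nodes = [{"name": "Example Power Grid Ltd", "country": "UA"}]
match = resolve_victim("Example Power Grid", "UA", nodes)
actor = resolve_actor(555, {100: "actor-a"})
```

A string-similarity threshold is a blunt instrument; it is shown here only to make the resolve-or-create decision explicit.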

The graph enables queries that flat record databases cannot support efficiently. Examples from analyst workflows: "Show all organizations in the energy sector targeted by this actor in the past 90 days, with attack vector breakdown." "Which actors have targeted both defense and financial sector organizations in Poland this month?" "What is the time distribution of attacks by this group relative to kinetic events in the theater?" These queries run as graph traversals, returning results in seconds on graphs of tens of thousands of nodes.
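The first example query can be sketched as a traversal over a tiny in-memory graph. This is a stand-in only: the article does not name the graph database or its query language, and every node ID and attribute name below is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Tiny in-memory property graph; IDs and attribute names are hypothetical.
VICTIMS = {"v1": {"name": "Example Power Grid", "sector": "energy"},
           "v2": {"name": "Example Bank", "sector": "financial"}}
EVENTS = {
    "e1": {"ts": datetime(2024, 3, 1, tzinfo=timezone.utc), "vectors": ["DDoS"]},
    "e2": {"ts": datetime(2024, 3, 9, tzinfo=timezone.utc), "vectors": ["DDoS", "defacement"]},
    "e3": {"ts": datetime(2024, 3, 10, tzinfo=timezone.utc), "vectors": ["ransomware"]},
}
CONDUCTED = {"actor-a": ["e1", "e2", "e3"]}        # actor -> events
TARGETED = {"e1": "v1", "e2": "v1", "e3": "v2"}    # event -> victim


def sector_targets(actor, sector, now, days=90):
    """Walk actor -> event -> victim and tally attack vectors per victim."""
    cutoff = now - timedelta(days=days)
    breakdown = {}
    for event_id in CONDUCTED.get(actor, []):
        event = EVENTS[event_id]
        victim = VICTIMS[TARGETED[event_id]]
        if event["ts"] >= cutoff and victim["sector"] == sector:
            breakdown.setdefault(victim["name"], Counter()).update(event["vectors"])
    return breakdown


result = sector_targets("actor-a", "energy",
                        now=datetime(2024, 4, 1, tzinfo=timezone.utc))
```

Starting from the actor node and following its "conducted" edges touches only that actor's events, which is why such queries stay fast as the graph grows.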

OSINT-based threat monitoring at this level of structure was not achievable before LLM-based extraction reached the scale and accuracy needed to populate a graph continuously from open sources. Previous approaches required significant manual analyst effort per record, which constrained graph density and freshness.

Pattern-of-life analysis for threat actor groups

Once a threat actor profile accumulates sufficient history in the graph (typically 7 or more days of ingestion), Corvus.Sense computes pattern-of-life metrics. These are derived from the temporal and structural properties of the actor's attack event nodes in the graph.

Activity hour distribution. Attack event timestamps are binned by UTC hour of day and day of week. Most state-aligned groups operate during business hours in their home time zone; deviations from this pattern (unusual late-night surges, weekend spikes) can indicate operational tempo changes or involvement of multiple geographically distributed sub-groups. The activity hour histogram updates daily.

Target preference heatmap. The ratio of attacks by sector and geography is computed across the actor's full event history. This surfaces consistent targeting preferences – an actor that has attacked Ukrainian energy infrastructure in 73% of events is clearly specialized, and new announcements against energy targets from that actor should receive elevated priority regardless of confidence score.

TTP evolution tracking. Attack vector distributions are computed across rolling 30-day windows and compared to the actor's historical baseline. A group that historically conducted DDoS operations and is now being classified as conducting data exfiltration represents a TTP shift – a high-value intelligence signal indicating capability development or changed objectives.
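The three metrics above can be sketched in a few lines each. The event dict shape is hypothetical, and the use of total variation distance to quantify a TTP shift is our choice for illustration; the source does not say how the comparison against the baseline is scored.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def activity_histogram(events):
    """Count events per (weekday, UTC hour); weekday 0 is Monday."""
    hist = Counter()
    for ev in events:
        utc = ev["ts"].astimezone(timezone.utc)
        hist[(utc.weekday(), utc.hour)] += 1
    return hist


def target_preferences(events):
    """Share of events per (sector, country) over the full history."""
    counts = Counter((ev["sector"], ev["country"]) for ev in events)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def ttp_shift(events, now, window_days=30):
    """Total variation distance between recent and baseline vector mixes.

    Returns a value in [0, 1], or None when either window is empty.
    """
    cutoff = now - timedelta(days=window_days)
    recent = Counter(ev["vector"] for ev in events if ev["ts"] >= cutoff)
    baseline = Counter(ev["vector"] for ev in events if ev["ts"] < cutoff)
    if not recent or not baseline:
        return None
    r_total, b_total = sum(recent.values()), sum(baseline.values())
    vectors = set(recent) | set(baseline)
    return 0.5 * sum(abs(recent[v] / r_total - baseline[v] / b_total)
                     for v in vectors)
```

A shift of 0 means the recent vector mix matches the baseline exactly, and 1 means the two share no vectors at all, as in the DDoS-to-exfiltration example.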

Key insight: Pattern-of-life analysis is most valuable not for confirming what you already know about a threat actor but for detecting when their behavior changes. Stable patterns are useful baselines; deviations from those patterns are the signal that warrants analyst attention and potential escalation to senior intelligence consumers.

Automated executive summary generation

Corvus.Sense includes an automated summary generation pipeline that produces human-readable intelligence products from the structured graph data. Summaries are generated on a configurable schedule (daily, weekly, or on-demand) or triggered by threshold events (attack count by a tracked actor exceeding a configured limit within a time window).

The summary pipeline queries the graph for the relevant actor, sector, or geography scope defined by the report template, retrieves the structured event records and pattern-of-life metrics, and passes this structured context to a generation model with a narrative synthesis prompt. The output is a prose intelligence brief in the register appropriate for executive consumers – no JSON, no field labels, no confidence scores unless they are analytically significant.

Critically, the summary generation model operates on structured data retrieved from the graph, not on raw Telegram message text. This architectural separation prevents hallucination from ambiguous source material: the generation model can only reference events that exist as validated classified records in the graph. If a claimed attack has not passed classification quality checks, it does not appear in a summary.

Confidence scoring and uncertainty handling

Every classified record in Corvus.Sense carries field-level confidence scores. These scores flow through to all downstream consumers: the analyst dashboard displays confidence visually, alert rules can be configured to trigger only above a minimum threshold per field, and the STIX export maps confidence scores to the STIX confidence property.
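STIX 2.1 defines the `confidence` property as an integer from 0 to 100, so a 0-to-1 score has to be rescaled on export. A minimal sketch of that mapping follows; the function name is ours, and how Corvus.Sense combines several field-level scores into one object-level value is not described in the source.

```python
def to_stix_confidence(score):
    """Rescale a 0-1 confidence score to the 0-100 integer range STIX uses."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return int(round(score * 100))


stix_value = to_stix_confidence(0.72)
```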

Records where any critical field (sector, geography, or actor attribution) falls below the configured threshold are placed in the analyst review queue rather than generating automated alerts. The threshold is configurable per deployment: high-sensitivity installations monitoring critical infrastructure can lower thresholds to maximize recall; broader monitoring deployments can raise thresholds to reduce analyst queue volume.
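The routing rule can be expressed compactly. In this sketch the field keys and return values are hypothetical, and 0.72 is used as the default threshold because that is the default quoted in the FAQ; a missing score is treated as zero confidence.

```python
CRITICAL_FIELDS = ("sector", "geography", "actor")


def route_record(confidence, threshold=0.72):
    """Return ('alert', []) or ('review_queue', [fields below threshold]).

    confidence maps field name -> score in [0, 1]; keys are hypothetical.
    """
    low = [f for f in CRITICAL_FIELDS if confidence.get(f, 0.0) < threshold]
    return ("review_queue", low) if low else ("alert", [])


decision = route_record({"sector": 0.91, "geography": 0.68, "actor": 0.88})
# The same record passes when a deployment lowers the threshold to 0.65.
relaxed = route_record({"sector": 0.91, "geography": 0.68, "actor": 0.88},
                       threshold=0.65)
```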

For fields where the LLM's confidence is marginal (between 0.65 and 0.80 by default), Corvus.Sense optionally submits the message for a second independent LLM pass using a different prompt formulation. When both passes agree on a field value, the confidence score is elevated; when they disagree, the field is flagged as contested and both candidate values are surfaced to the analyst.
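A sketch of the two-pass reconciliation logic follows. The marginal band of 0.65 to 0.80 comes from the text; the size of the confidence boost on agreement (`boost=0.1`) and the result shape are assumptions, since the source says only that the score is "elevated".

```python
def needs_second_pass(confidence, low=0.65, high=0.80):
    """True when a field's confidence falls in the marginal band."""
    return low <= confidence <= high


def reconcile(first, second, boost=0.1):
    """Combine two independent (value, confidence) results for one field.

    The boost applied on agreement is a hypothetical choice.
    """
    if first[0] == second[0]:
        confidence = min(1.0, max(first[1], second[1]) + boost)
        return {"value": first[0], "confidence": confidence, "contested": False}
    # Disagreement: surface both candidates to the analyst.
    return {"value": None, "candidates": [first[0], second[0]], "contested": True}


agreed = reconcile(("energy", 0.70), ("energy", 0.75))
contested = reconcile(("energy", 0.70), ("critical infrastructure", 0.72))
```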

Configuring Corvus.Sense to track a specific threat actor

The following sequence describes how to set up Corvus.Sense for targeted monitoring of a named hacker group across its Telegram channels.

Step 1 – Identify the actor's Telegram channels. Compile numeric channel IDs and @usernames for all known channels operated by or affiliated with the target group, including mirror and backup channels. Corvus.Sense accepts both formats.

Step 2 – Create an actor profile. In the Actors panel, create a new profile with the canonical group name and known aliases. Assign MITRE ATT&CK technique IDs reflecting the group's known TTPs. Link the channel identifiers to this profile. From this point, all messages from those channels are associated with this actor node in the graph.

Step 3 – Configure sector and geography scope. Select the sectors and country codes you want to monitor for this actor. Out-of-scope attacks are still ingested and classified but excluded from actor-specific alert generation. This allows broad ingestion while keeping alert volume focused on operationally relevant events.

Step 4 – Set confidence thresholds and alert delivery. Configure minimum confidence thresholds per field. For defense and critical infrastructure sectors, lower thresholds (0.65) maximize recall. Configure alert delivery to email, webhook, or a SIEM integration endpoint. Corvus.Sense supports CEF and JSON alert formats for SIEM ingestion.

Step 5 – Review and correct initial classifications. During the first 72 hours, review all classified records in the analyst queue for this actor, regardless of confidence score. Inline correction tools allow field-level edits. Corrections are logged and can be submitted to improve model calibration for this actor's linguistic patterns over time.

Step 6 – Enable pattern-of-life analysis. After 7 days of accumulated event data, enable the pattern-of-life view. Activity hour distributions, target preference heatmaps, and TTP histograms are computed from the graph and updated daily. This view is the primary input for anticipating future targeting behavior.

Step 7 – Export structured intelligence. Use the actor profile export to generate intelligence products in JSON, PDF, or STIX 2.1 bundle format. The STIX export maps actor profile data to STIX Threat Actor and Campaign objects for sharing via TAXII or import into external CTI platforms.
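For the SIEM delivery configured in Step 4, a CEF alert line might be assembled as sketched below. The layout follows the general CEF convention of a pipe-delimited header followed by key=value extensions; the vendor name, version string, severity, and the mapping of sector, country, and actor onto the `cs1` to `cs3` custom string fields are placeholders of ours, not the product's actual mapping.

```python
def _esc_header(value):
    # CEF header fields escape backslash and pipe.
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def _esc_ext(value):
    # CEF extension values escape backslash and the equals sign.
    return str(value).replace("\\", "\\\\").replace("=", "\\=")


def to_cef(record, severity=7):
    """Format one alert as a CEF line; vendor and version are placeholders."""
    header = "|".join(_esc_header(part) for part in (
        "CEF:0", "ExampleVendor", "Corvus.Sense", "1.0",
        record["attack_vector"],   # used here as the event class ID
        f"{record['attack_vector']} claim against {record['sector']}",
        severity,
    ))
    extension = {
        "cs1Label": "sector", "cs1": record["sector"],
        "cs2Label": "country", "cs2": record["country"],
        "cs3Label": "actor", "cs3": record["actor"],
    }
    pairs = " ".join(f"{key}={_esc_ext(val)}" for key, val in extension.items())
    return header + "|" + pairs


line = to_cef({"attack_vector": "DDoS", "sector": "energy",
               "country": "UA", "actor": "actor-a"})
```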
