If you work in security operations, you know the pain: hundreds of alerts a day across EDR, CSPM, application security, DNS threat feeds, and vulnerability scanners - most of which are noise. Analysts spend hours pulling context from five different consoles just to determine whether something is a true positive or a known false positive they already triaged last week.
I built a GenAI-powered triage engine to fix this. It takes a ticket ID, automatically enriches the alert with context from every relevant source, runs the analysis through Claude, and posts a structured triage report back to the ticketing system - all in under a minute.
Here's what I learned building it.
The Problem: Alert Fatigue Is Real
In any enterprise environment, security alerts come from everywhere:
- Cloud security posture (CSPM) - Indicators of Attack across AWS, Azure, and GCP
- Endpoint detection (EDR) - Behavioral detections, malware, suspicious processes
- Application security - Web attacks, SQL injection attempts, scanner activity
- DNS threat feeds - Malware C2 domains, phishing, suspicious queries
- Vulnerability detections - Network-exploitable CVEs flagged across the fleet
- Phishing and email threats - Reported phishing emails, suspicious attachments, credential harvesting attempts
Each alert type requires different enrichment. A cloud IOA needs cloud account context, permission analysis, and IP reputation. A DNS threat needs process-level context from the SIEM - what actually made the DNS query? A vulnerability needs CVSS scoring, CISA KEV status, and patch availability.
An analyst doing this manually is opening 5-10 tabs per alert. Multiply that by dozens of daily tickets, and you get burnout.
The Approach: Context-First Triage
The core idea is simple: gather all the context an analyst would need, then let an LLM reason over it.
The system works as a webhook-triggered service. When a new security ticket is created, it receives the ticket ID and kicks off the full triage pipeline:
1. Fetch ticket details + conversation history
2. Detect alert type (CSPM, EDR, AppSec, DNS, Vuln, Phishing)
3. Route to type-specific enrichment pipeline
4. Pull context from 5-10 APIs in parallel
5. Search knowledge base for similar past triages
6. Search ticketing system for related historical tickets
7. Build comprehensive prompt with all context
8. Run analysis through Claude with MCP tool access
9. Post structured triage report back to ticket
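The pipeline above can be sketched as a small dispatcher. This is a minimal illustration, not the production code: the keyword table, function names, and injected collaborators (`fetch_ticket`, `run_llm`, `post_report`) are all assumptions for the sake of the example.

```python
# Illustrative alert-type detection and routing; keywords are assumptions.
ALERT_TYPE_KEYWORDS = {
    "cspm": ["indicator of attack", "cloud", "iam"],
    "edr": ["process", "malware", "behavioral"],
    "appsec": ["sql injection", "waf", "http"],
    "dns": ["dns", "domain", "c2"],
    "vuln": ["cve-", "cvss"],
    "phishing": ["phishing", "suspicious email"],
}

def detect_alert_type(ticket_text: str) -> str:
    """Pick the alert type whose keywords match the ticket text most often."""
    text = ticket_text.lower()
    scores = {
        alert_type: sum(kw in text for kw in kws)
        for alert_type, kws in ALERT_TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def triage(ticket_id: str, fetch_ticket, enrichers: dict, run_llm, post_report):
    """Skeleton of the webhook-triggered pipeline: fetch, detect, enrich,
    analyze, post. Collaborators are injected so each stage can be swapped."""
    ticket = fetch_ticket(ticket_id)
    alert_type = detect_alert_type(ticket["description"])
    context = enrichers.get(alert_type, lambda t: {})(ticket)
    report = run_llm(ticket, alert_type, context)
    post_report(ticket_id, report)
    return alert_type, report
```

Injecting the stages as callables also makes the pipeline trivial to unit-test with stubs before wiring in real APIs.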
Multi-Source Enrichment
The enrichment layer is where the real value lives. For each alert type, the system pulls different context:
Cloud IOA (CSPM) Detections
- Parse the detection payload for user, source IP, resource, and action
- Look up the cloud account to identify the owning team
- Classify any OAuth scopes or permissions by risk level (sensitive vs. benign)
- Check IP reputation via threat intelligence APIs
- Determine if the source host is managed (corporate device vs. unknown)
- Pull prior tickets mentioning the same application or resource
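The fan-out to multiple lookups can run concurrently with the standard library. A sketch under stated assumptions: the three lookup functions stand in for real authenticated API clients (cloud inventory, threat intel, MDM), and the return payloads are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for real API clients; in production each would be an HTTP call.
def lookup_cloud_account(alert): return {"owner_team": "platform-eng"}
def lookup_ip_reputation(alert): return {"verdict": "clean"}
def lookup_device_management(alert): return {"managed": True}

ENRICHERS = {
    "cloud_account": lookup_cloud_account,
    "ip_reputation": lookup_ip_reputation,
    "device": lookup_device_management,
}

def enrich(alert: dict, timeout: float = 30.0) -> dict:
    """Run every enrichment lookup concurrently and collect results by name.
    A failed lookup is recorded instead of aborting the whole triage."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(ENRICHERS)) as pool:
        futures = {name: pool.submit(fn, alert) for name, fn in ENRICHERS.items()}
        for name, fut in futures.items():
            try:
                results[name] = fut.result(timeout=timeout)
            except Exception as exc:
                results[name] = {"error": str(exc)}
    return results
```

Capturing per-lookup errors matters in practice: one slow or flaky threat-intel API shouldn't sink the triage of every alert.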
Application Security Alerts
- Fetch the raw signal data from the SIEM, including attack payloads
- Analyze HTTP status codes - did the attack actually succeed (200) or get blocked (403/404)?
- Check if the source IP is a known scanner or hosting provider
- Determine if the targeted environment is production or staging
- Look at whether the WAF blocked the request before it reached the application
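The succeeded-vs-blocked logic above boils down to a small classifier. The event schema here (`status`, `waf_action` fields) is an assumption about the SIEM log format, not a fixed contract.

```python
def assess_appsec_alert(events: list[dict]) -> dict:
    """Classify an AppSec alert from raw SIEM events. Assumes each event
    carries an HTTP 'status' code and an optional 'waf_action' field."""
    succeeded = [e for e in events if 200 <= e["status"] < 300]
    blocked = [e for e in events
               if e["status"] in (403, 404) or e.get("waf_action") == "block"]
    if succeeded:
        verdict = "likely_true_positive"   # attack payload reached the app
    elif len(blocked) == len(events):
        verdict = "likely_false_positive"  # everything stopped at the edge
    else:
        verdict = "needs_review"
    return {"verdict": verdict, "successful_requests": len(succeeded)}
```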
DNS Threat Detections
- Extract the queried domain and check reputation across threat intelligence feeds
- Identify which internal hosts made the query
- Query the SIEM for process-level context -which process on which host triggered the DNS lookup
- Check if the host is running a managed endpoint agent
- Assess hit count: single probe vs. repeated beacon-like queries
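The probe-vs-beacon distinction can be approximated from query timestamps: C2 beacons tend to check in at near-constant intervals. The thresholds below are illustrative, not tuned values.

```python
from statistics import mean, pstdev

def assess_dns_pattern(query_times: list[float]) -> str:
    """Heuristic sketch: one query is a probe; several queries at highly
    regular intervals look beacon-like; anything else is irregular."""
    if len(query_times) <= 1:
        return "single_probe"
    intervals = [b - a for a, b in zip(query_times, query_times[1:])]
    # Near-constant period (stdev under 10% of the mean interval) suggests
    # automated check-ins rather than human browsing.
    if len(intervals) >= 3 and pstdev(intervals) < 0.1 * mean(intervals):
        return "beacon_like"
    return "irregular"
```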
Vulnerability Detections
- Parse the CVE, CVSS score, and attack vector
- Check CISA Known Exploited Vulnerabilities catalog
- Query for all affected hosts and their exposure (internet-facing vs. internal)
- Check patch availability and remediation status
- Prioritize: CISA KEV + network-exploitable + production = critical
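The prioritization rule in the last bullet translates directly into code. Field names (`cisa_kev`, `attack_vector`, `internet_facing`, `environment`) are assumptions about the scanner's output, and the lower tiers are illustrative additions.

```python
def prioritize_vuln(cve: dict) -> str:
    """Sketch of the rule from the text: CISA KEV + network-exploitable +
    internet-facing production host = critical."""
    kev = cve.get("cisa_kev", False)
    network = cve.get("attack_vector") == "NETWORK"
    exposed = cve.get("internet_facing", False) and cve.get("environment") == "production"
    if kev and network and exposed:
        return "critical"
    if kev or (network and cve.get("cvss", 0) >= 9.0):
        return "high"
    if cve.get("cvss", 0) >= 7.0:
        return "medium"
    return "low"
```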
Phishing and Email Threat Analysis
- Extract sender details, reply-to addresses, and header anomalies (SPF/DKIM/DMARC failures)
- Analyze URLs in the email body - check against threat intelligence feeds and sandbox detonation results
- Inspect attachments for known malware signatures and suspicious file types
- Assess targeting: mass campaign vs. spear-phishing (who received it, how many users clicked)
- Check if the sender domain is a lookalike or typosquat of a known trusted domain
- Correlate with EDR telemetry - did any user who clicked the link have subsequent suspicious process execution?
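The lookalike-domain check can be sketched with a plain edit-distance comparison against a trusted-domain list. The `TRUSTED_DOMAINS` list and the distance threshold are illustrative assumptions; a production version would also handle homoglyphs and subdomain tricks.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

TRUSTED_DOMAINS = ["example.com", "examplecorp.com"]  # illustrative allow-list

def lookalike_of(sender_domain: str, max_distance: int = 2):
    """Flag a sender domain within a small edit distance of a trusted
    domain but not an exact match - a simple typosquat heuristic."""
    for trusted in TRUSTED_DOMAINS:
        if sender_domain != trusted and levenshtein(sender_domain, trusted) <= max_distance:
            return trusted
    return None
```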
How It Pulls Context: MCP (Model Context Protocol)
This is the core differentiator. The system doesn't use traditional REST API calls to gather context. Instead, it uses MCP (Model Context Protocol) to give Claude direct, live access to security tools.
MCP turns security platforms into tools that Claude can call on its own. Instead of writing rigid API integration code for every enrichment step, you define MCP servers for your security tools, and Claude decides which ones to call based on the alert context:
- EDR/XDR tools - Claude queries host details, alert entities, device posture, and detection context directly through MCP
- SIEM - Claude runs log queries to fetch process-level context, DNS events, and correlated signals
- Threat intelligence - IP reputation, domain reputation, and IOC lookups happen as MCP tool calls
- Ticketing systems - Claude fetches ticket details, conversation history, and searches for similar past tickets
- Vulnerability scanners - Claude queries for affected hosts, CVE details, and remediation status
The power of this approach is that Claude can adaptively investigate. If a host lookup reveals something interesting, it can pivot and query related alerts for that host, check device posture, or look up the user's recent activity. For complex detections, it can take multiple investigative turns, querying different MCP tools to build a complete picture. This mirrors how a human analyst actually works.
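The adaptive loop described above looks roughly like this. The model client and tool registry are stubbed out; in the real system the tools are MCP servers and the model is Claude via its API, both abstracted here into plain callables.

```python
# Minimal sketch of an adaptive investigation loop with stubbed collaborators.
def investigate(alert: dict, model, tools: dict, max_turns: int = 5) -> list:
    """Let the model request tool calls until it declares a verdict or hits
    the turn cap; each result is appended to the transcript so the model
    can pivot to follow-up queries, like a human analyst would."""
    transcript = [("alert", alert)]
    for _ in range(max_turns):
        # The model returns ("call", tool_name, kwargs) or ("done", verdict).
        action = model(transcript)
        if action[0] == "done":
            transcript.append(("verdict", action[1]))
            break
        _, name, args = action
        transcript.append((name, tools[name](**args)))
    return transcript
```

The turn cap is the important safety valve: it bounds cost and latency even if the model keeps finding new threads to pull.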
On top of MCP-based live investigation, the system also builds a comprehensive prompt that includes:
- The original alert data and ticket description
- All pre-fetched enrichment results (IP reputation, host status, cloud account info, etc.)
- Alert-type-specific triage guidance (decision frameworks for TP/FP classification)
- Relevant knowledge base entries from past triages
- Similar historical tickets and their resolutions
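Assembling those sections into one prompt is mechanical. The section headers and ordering below are illustrative choices, not a required format.

```python
import json

def build_prompt(alert: dict, enrichment: dict, guidance: str,
                 kb_entries: list, similar_tickets: list) -> str:
    """Concatenate the context sections listed above into a single prompt."""
    sections = [
        ("Alert", json.dumps(alert, indent=2)),
        ("Enrichment", json.dumps(enrichment, indent=2)),
        ("Triage guidance", guidance),
        ("Knowledge base matches", "\n".join(kb_entries) or "none"),
        ("Similar past tickets", "\n".join(similar_tickets) or "none"),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)
```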
The Knowledge Base: Learning from Every Triage
One of the most impactful features is the self-growing knowledge base. After every triage, the system generates a structured entry:
- Searchable keywords extracted from the alert
- A concise summary: what happened, root cause, resolution, and guidance for next time
- The alert category and classification result
On the next triage, the system searches this knowledge base for relevant past entries and includes them in the prompt. This means:
- Repeated patterns get faster. If the same Azure policy fires every month during certificate rotation, the KB knows it's a known false positive.
- Institutional knowledge persists. Even when team members rotate, the reasoning behind past decisions is preserved.
- Edge cases get handled. The KB captures nuances that wouldn't fit in a static runbook - like "this specific app was approved by the security team in ticket #12345."
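A minimal sketch of the KB, assuming simple keyword indexing; the real system persists entries to cloud storage and generates the keywords and summaries with the LLM, both of which are omitted here.

```python
import re

class KnowledgeBase:
    """Keyword-indexed store of past triage decisions (in-memory sketch)."""
    def __init__(self):
        self.entries = []

    def add(self, keywords: list[str], summary: str, classification: str):
        self.entries.append({"keywords": {k.lower() for k in keywords},
                             "summary": summary,
                             "classification": classification})

    def search(self, alert_text: str, top_k: int = 3) -> list[dict]:
        """Rank entries by keyword overlap with the new alert's text."""
        tokens = set(re.findall(r"[a-z0-9.-]+", alert_text.lower()))
        scored = [(len(e["keywords"] & tokens), e) for e in self.entries]
        return [e for score, e in sorted(scored, key=lambda s: -s[0])
                if score > 0][:top_k]
```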
TP/FP Classification: Not Just Rules
The triage logic isn't a simple if/else decision tree. It combines structured guidance with contextual reasoning:
False Positive Indicators
- Action from a managed corporate device + clean IP reputation + known team/owner
- Known scanner IP (hosting provider) + all requests blocked (403/404) + no successful access
- Routine activity matching a known KB pattern (e.g., monthly certificate rotation)
- Low-risk OAuth scopes only (openid, profile) with no sensitive access requested
True Positive Indicators
- Unknown application requesting sensitive permissions (mail access, directory write)
- Source IP with malicious reputation + action outside business hours + unmanaged device
- Successful HTTP requests (200 status) with attack payloads reaching the application
- CISA KEV vulnerability with no patch on an internet-facing production host
- DNS query to a known C2 domain from a process that shouldn't be making external calls
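One way to picture how those indicators combine - purely illustrative, since the real weighing happens in the LLM's reasoning rather than a fixed formula - is a signed score over the signal names, with the narrative analysis layered on top.

```python
# Hypothetical weights; the production system lets the model reason over
# the raw context instead of using a fixed scoring table.
FP_SIGNALS = {"managed_device": -2, "clean_ip": -1,
              "known_kb_pattern": -3, "all_blocked": -2}
TP_SIGNALS = {"malicious_ip": 3, "sensitive_scopes": 2, "successful_attack": 3,
              "kev_internet_facing": 3, "c2_query": 3}

def score_alert(signals: set[str]) -> tuple[int, str]:
    """Sum indicator weights into a rough leaning toward TP or FP."""
    score = sum(FP_SIGNALS.get(s, 0) + TP_SIGNALS.get(s, 0) for s in signals)
    leaning = ("true_positive" if score > 0
               else "false_positive" if score < 0
               else "inconclusive")
    return score, leaning
```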
The LLM weighs all of these signals together, considers the knowledge base, checks similar past tickets, and produces a prioritized assessment with specific next steps.
Deployment: Serverless and Scalable
The service runs as a containerized application on a serverless platform. Key design decisions:
- Webhook-triggered: Ticketing system fires a webhook on new ticket creation. No polling, no cron jobs.
- Auto-scaling: Scales from zero to handle bursts. Most of the time, it's idle.
- Persistent knowledge base: The KB is stored on a cloud storage volume mount, surviving container restarts and redeployments.
- Secret management: All API credentials managed through the cloud provider's secret manager. Nothing hardcoded.
- Audit trail: Every API call made during triage is logged and returned in the response.
Results
After deploying the system:
- Triage time dropped from 15-30 minutes to under 60 seconds per alert
- Consistent quality: Every triage follows the same enrichment and analysis process, regardless of the analyst's experience level
- Knowledge compounds: The KB now has 70+ entries, and common alert patterns are resolved almost instantly with references to past decisions
- Analyst focus shifts: Instead of spending time on context gathering, analysts review the AI's triage report and make the final call on escalation
- Cost tracking: Every triage logs token usage and cost, making it easy to measure ROI
Lessons Learned
1. Context is everything
An LLM is only as good as the context you give it. The difference between a useless "this alert might be suspicious" and a precise "this is a false positive because the action was performed by the IT admin from a managed device during a scheduled maintenance window" is entirely about enrichment quality.
2. Start with the analyst's workflow
I built this by literally documenting what I do when I triage each alert type. What tabs do I open? What do I search for? What questions do I ask? The system automates that exact workflow.
3. The knowledge base is the moat
The KB is what makes this system compound in value over time. Static runbooks go stale. A self-growing KB that captures the reasoning behind every decision stays relevant because it evolves with the threat landscape.
4. MCP changes the game
Being able to give the AI live API access means it can investigate, not just analyze. The difference between "here's some data, what do you think?" and "here are some tools, go investigate this alert" is transformative.
5. Keep the human in the loop
This system triages, it doesn't remediate. The output is a recommendation posted to the ticket. A human analyst reviews it and decides on action. This is intentional - for security decisions, you want human judgment on the final call.
What's Next
I'm exploring extending the system to handle automated response actions for high-confidence false positives (auto-close with documented reasoning), integrating more threat intelligence feeds, and expanding the alert type coverage. The goal is to get analysts focused on the 10% of alerts that actually matter, while the system handles the 90% that are noise.
If you're building something similar or want to discuss GenAI in security operations, feel free to reach out on LinkedIn.