The Detection Rule That Fires Daily and Catches Nothing

The security operations team had a rule. It was written twelve months ago by an analyst who no longer worked at the company. It fired every single day — sometimes forty or fifty times a day — and had done so for eight of those twelve months without producing a single actionable finding.

Nobody disabled it. Nobody investigated why it fired so frequently. Nobody questioned whether a rule that fires fifty times a day and produces zero confirmed incidents is doing anything useful at all.

The rule was watching for failed authentication attempts against the company's internal VPN gateway. It triggered on five or more failed attempts from the same IP within a ten-minute window. It had been firing daily because the company's own remote employees — distributed across seven time zones — regularly locked themselves out of VPN on Monday mornings after weekend password resets.

Eight months after the rule was written, a threat actor conducted a low-and-slow credential stuffing attack against the same VPN gateway — two or three attempts per source IP, rotating across a /16 subnet, spread over four hours. The attack succeeded. A valid credential was compromised.

The detection rule never fired. The attack was specifically designed to stay below the threshold that had been set based on noise, not adversary behaviour. The breach was discovered six weeks later during an unrelated audit.
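The gap is easy to reproduce. The sketch below is a hypothetical Python model of the per-IP rule described above (five or more failures from one IP within ten minutes), not the team's actual SIEM query. The class name, the IP addresses, and the simulated traffic are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class PerIpFailureRule:
    """Fires when one source IP reaches `threshold` failures inside the window."""

    def __init__(self, threshold: int = 5, window_seconds: float = 600.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # source IP -> recent failure times

    def observe(self, timestamp: float, source_ip: str) -> bool:
        """Record one failed login; return True if the rule fires."""
        times = self._failures[source_ip]
        times.append(timestamp)
        # Drop failures that have aged out of the window.
        while times and times[0] < timestamp - self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


# A remote employee fumbling a new password: 6 failures in one minute.
rule = PerIpFailureRule()
lockout_fired = any(rule.observe(i * 10.0, "192.0.2.10") for i in range(6))

# Simulated low-and-slow attack: 3 attempts per IP, 500 IPs, spread over 4 hours.
rule = PerIpFailureRule()
attack_fired = False
t = 0.0
for _attempt in range(3):
    for i in range(500):
        ip = f"198.18.{i // 256}.{i % 256}"  # reserved benchmarking range
        attack_fired = rule.observe(t, ip) or attack_fired
        t += 9.6  # 1,500 attempts across 14,400 seconds

print(lockout_fired, attack_fired)
```

In this simulation the rule fires on the employee and stays silent on the attacker, because no single attacking IP ever accumulates more than one failure inside any ten-minute window.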

The Metric Nobody Reports: Detection Coverage vs. Detection Volume

Every SOC has a dashboard that shows alert volume. Most have one that shows mean time to detect and mean time to respond. Very few have a dashboard that answers the question that actually matters:

Of the attacks that occurred in the last 90 days, what percentage did our detection rules identify?

This is detection coverage, and it is almost never measured, because measuring it requires knowing about attacks that your rules didn't catch. That in turn requires threat intelligence, purple team exercises, retrospective analysis of confirmed incidents, and a willingness to treat every gap as a measurement rather than as something to avoid acknowledging.
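As a rough illustration, coverage can be computed once known attacks are recorded alongside whether any rule fired on them. The record layout, field names, and sample data below are assumptions for the sketch, not a standard schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownAttack:
    """One attack known to have occurred (hypothetical record layout)."""
    attack_id: str
    source: str      # e.g. "incident", "purple_team", "retrospective"
    detected: bool   # True if any detection rule fired on it


def detection_coverage(attacks: list[KnownAttack]) -> float:
    """Fraction of known attacks that at least one rule detected."""
    if not attacks:
        raise ValueError("coverage is undefined without any known attacks")
    return sum(a.detected for a in attacks) / len(attacks)


# Illustrative data only, not real incidents.
attacks = [
    KnownAttack("A-1", "purple_team", True),
    KnownAttack("A-2", "purple_team", False),
    KnownAttack("A-3", "incident", False),
    KnownAttack("A-4", "retrospective", True),
]
coverage = detection_coverage(attacks)
print(f"Detection coverage: {coverage:.0%}")
```

The denominator is the hard part: the figure is only as good as the effort spent finding attacks the rules missed.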

Alert volume is easy to measure and communicates nothing about security posture. A SOC generating 10,000 alerts per day with a 0.1% confirmed incident rate has worse detection engineering than a SOC generating 100 alerts per day with a 40% confirmed incident rate. Analysts in the first SOC must sift through 9,990 false positives a day to find 10 real incidents; analysts in the second find 40 real incidents among 100 alerts. The first SOC is overwhelmed with noise and missing real attacks in the flood. The second SOC has tuned its detection to fire on things that are actually happening.

Most SOC reporting optimises for volume metrics because they are easy to collect and look impressive in management reports. Coverage metrics require admitting that gaps exist — which they always do — and treating those gaps as engineering problems to be solved rather than weaknesses to be concealed.

Why High-Volume Low-Fidelity Rules Are Worse Than No Rule

The intuition behind keeping noisy rules is understandable: "it's better to have the rule than not have it." This intuition is wrong for a specific and measurable reason.

High-volume low-fidelity rules cause alert fatigue. Analysts who review fifty alerts per day from the same rule, all of which turn out to be false positives, develop a conditioned response: the rule becomes noise, the alerts become part of the background, and the cognitive weight of reviewing them trains analysts to close them faster and with less scrutiny.

When a real attack arrives in the same alert stream, it is processed with the same conditioned dismissal as the false positives that preceded it. This is not analyst failure — it is a predictable human response to signal-to-noise ratio. The noisy rule has not just failed to help; it has actively degraded the analyst's ability to detect the real attack.

The counterfactual — no rule at all — is often safer from a detection perspective because it does not create the false confidence that comes from having a rule and does not degrade analyst response through false positive conditioning.

What a Properly Engineered Detection Rule Looks Like

The threshold must be based on adversary behaviour, not internal noise:

The VPN failed-authentication rule should have been set based on the question "what does a credential stuffing attack look like against this endpoint?" — not "what threshold avoids firing on our own employees?"

A credential stuffing attack in 2024 typically distributes attempts across many source IPs to avoid per-IP thresholds. The detection logic should have been:

alert if:
  failed_vpn_authentications > 50
  and distinct_source_ips > 20
  within a 60-minute window

This detects distributed low-and-slow attacks while not triggering on a single user locking themselves out.
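A minimal Python sketch of that logic follows, assuming failed-authentication events arrive in timestamp order. The class and field names are hypothetical, and the thresholds are simply the ones from the rule above; a real deployment would express this in the SIEM's own query language.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FailedAuth:
    timestamp: float   # seconds since some reference point
    source_ip: str


class DistributedFailureRule:
    """Sliding-window check on total failures and distinct source IPs."""

    def __init__(self, min_failures=50, min_source_ips=20, window_seconds=3600.0):
        self.min_failures = min_failures
        self.min_source_ips = min_source_ips
        self.window_seconds = window_seconds
        self._events = deque()

    def observe(self, event: FailedAuth) -> bool:
        """Add one failed authentication; return True if the rule fires."""
        self._events.append(event)
        cutoff = event.timestamp - self.window_seconds
        # Discard events older than the window.
        while self._events and self._events[0].timestamp < cutoff:
            self._events.popleft()
        distinct_ips = {e.source_ip for e in self._events}
        return (len(self._events) > self.min_failures
                and len(distinct_ips) > self.min_source_ips)


# Simulated distributed attack: 40 IPs, 2 attempts each, one attempt every 30 s.
attack = [
    FailedAuth(timestamp=n * 30.0, source_ip=f"198.51.100.{n % 40}")
    for n in range(80)
]
rule = DistributedFailureRule()
attack_fired = any(rule.observe(e) for e in attack)

# One user locking themselves out: 10 failures from a single IP.
lockout = [FailedAuth(timestamp=n * 5.0, source_ip="192.0.2.10") for n in range(10)]
rule = DistributedFailureRule()
lockout_fired = any(rule.observe(e) for e in lockout)

print(attack_fired, lockout_fired)
```

Whether these thresholds catch a given campaign depends on its attempt rate, so the numbers should be validated against simulated attacks and against normal Monday-morning traffic before the rule goes live.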

The rule must have a defined true positive rate:

Before a detection rule goes to production, it should be tested against known attack patterns to establish how often, when it fires, it is firing on something real. A rule with a defined expected true positive rate of 60% will still generate false positives, but analysts know to expect them, and the rule can be monitored for drift away from that baseline.
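One way to monitor for drift is to compare the observed rate from triage outcomes against the expected rate. In the sketch below, the function names, the 15-point tolerance, and the sample counts are assumptions. Note that "true positive rate" is used here in the article's sense (the share of alerts that were real), which statisticians would call precision.

```python
def true_positive_rate(confirmed_malicious: int, total_alerts: int) -> float:
    """Share of a rule's alerts that were confirmed malicious."""
    if total_alerts <= 0:
        raise ValueError("rule has not fired; rate is undefined")
    return confirmed_malicious / total_alerts


def has_drifted(expected: float, observed: float, tolerance: float = 0.15) -> bool:
    """True if the observed rate has fallen more than `tolerance` below expected."""
    return observed < expected - tolerance


expected_rate = 0.60                         # baseline from pre-production testing
observed_rate = true_positive_rate(18, 50)   # illustrative triage counts
drifted = has_drifted(expected_rate, observed_rate)
print(f"observed {observed_rate:.0%}, drifted: {drifted}")
```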

The rule must have a defined adversary behaviour it detects:

Rule name: VPN-CRED-STUFFING-DISTRIBUTED-001
MITRE ATT&CK technique: T1110.004 (Credential Stuffing)
Expected adversary behaviour: Multiple source IPs each attempting 1-3 logins within 60 minutes
False positive sources: Automated systems, roaming employees (low probability, review if triggered)
True positive indicator: Geographic spread of source IPs, known credential breach databases

Rules without defined adversary mappings cannot be maintained or improved systematically.
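Keeping this metadata in a structured form makes it possible to check mechanically which rules lack a mapping. The dataclass below is a hypothetical layout that mirrors the fields above; the second rule name is invented for the example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DetectionRuleSpec:
    name: str
    attack_technique: str
    expected_behaviour: str
    false_positive_sources: str
    true_positive_indicators: str


def missing_fields(spec: DetectionRuleSpec) -> list[str]:
    """Names of fields left blank, in declaration order."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name).strip()]


vpn_rule = DetectionRuleSpec(
    name="VPN-CRED-STUFFING-DISTRIBUTED-001",
    attack_technique="T1110.004",
    expected_behaviour="Multiple source IPs each attempting 1-3 logins within 60 minutes",
    false_positive_sources="Automated systems, roaming employees",
    true_positive_indicators="Geographic spread of source IPs, known credential breach databases",
)

# Hypothetical legacy rule with no adversary mapping recorded.
legacy_rule = DetectionRuleSpec(
    name="VPN-FAILED-AUTH-LEGACY",
    attack_technique="",
    expected_behaviour="",
    false_positive_sources="Remote employees after password resets",
    true_positive_indicators="",
)

print(missing_fields(vpn_rule))
print(missing_fields(legacy_rule))
```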

The Alert Fatigue Audit

If your SOC has rules that fire more than 5 times per day with a confirmed incident rate below 5%, they require immediate review. The specific questions:

1. What is the alert's true positive rate over the last 90 days?

Every alert should have documented outcomes: confirmed malicious, confirmed false positive, inconclusive. Rules with true positive rates below 10% need to be suppressed, tuned, or eliminated.

2. Is the threshold based on adversary behaviour or operational noise?

If the threshold was set by asking "what avoids false positives from our internal systems," it is likely wrong. The threshold should be set by asking "what does this specific attack technique look like in telemetry data."

3. Does the rule detect what it claims to detect?

Run a purple team test: have a red team execute the technique the rule is supposed to detect, and verify that the rule fires. A significant percentage of detection rules, when tested, fail to detect the technique they were written to detect.

4. Has the rule been reviewed in the last 90 days?

Detection rules are not static. Adversary techniques evolve. Infrastructure changes. A rule that was accurate six months ago may be firing on legitimate behaviour that emerged after its last review.
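The first question can be answered from triage records alone. The sketch below assumes each closed alert is stored as a (rule name, outcome) pair covering the last 90 days, and flags rules that exceed 5 alerts per day with a confirmed rate under 5%. The function name, outcome labels, the first rule name, and the sample data are all hypothetical.

```python
from collections import Counter


def flag_noisy_rules(outcomes, days=90, max_alerts_per_day=5.0, min_confirmed_rate=0.05):
    """Return rule names that are both high-volume and low-fidelity."""
    totals = Counter(rule for rule, _ in outcomes)
    confirmed = Counter(rule for rule, outcome in outcomes if outcome == "malicious")
    flagged = []
    for rule, total in totals.items():
        alerts_per_day = total / days
        confirmed_rate = confirmed[rule] / total
        if alerts_per_day > max_alerts_per_day and confirmed_rate < min_confirmed_rate:
            flagged.append(rule)
    return sorted(flagged)


# Illustrative 90-day triage history.
outcomes = (
    [("VPN-FAILED-AUTH-PER-IP", "false_positive")] * 4500       # 50 per day, none real
    + [("VPN-CRED-STUFFING-DISTRIBUTED-001", "malicious")] * 36
    + [("VPN-CRED-STUFFING-DISTRIBUTED-001", "false_positive")] * 54
)
flagged = flag_noisy_rules(outcomes)
print(flagged)
```

A rule that appears in this list still needs the other three questions asked of it; the script only identifies where to start.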

The Detection Engineering Improvement Loop

Detection engineering is not a one-time activity. It is a continuous improvement cycle that mirrors the adversary's evolution:

1. Map coverage: identify which MITRE ATT&CK techniques you have detection rules for and which you don't
2. Test detection: purple team exercises verify that rules fire when techniques are used
3. Measure fidelity: track true positive rates and set alerts for rules that drop below threshold
4. Tune or eliminate: rules below acceptable fidelity are tuned or removed
5. Close gaps: techniques without detection coverage are prioritised for new rule development
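The mapping and gap-closing steps reduce to a set comparison once each rule carries a technique ID. In this sketch, the list of relevant techniques and the second rule name are assumptions; the technique IDs themselves are real MITRE ATT&CK identifiers.

```python
def coverage_report(
    relevant: set[str], rule_to_technique: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Split relevant techniques into those with a rule and those without."""
    covered = relevant & set(rule_to_technique.values())
    gaps = relevant - covered
    return sorted(covered), sorted(gaps)


# Techniques judged relevant to a VPN gateway (illustrative selection).
relevant = {
    "T1110.003",  # Password Spraying
    "T1110.004",  # Credential Stuffing
    "T1078",      # Valid Accounts
    "T1133",      # External Remote Services
}
rules = {
    "VPN-CRED-STUFFING-DISTRIBUTED-001": "T1110.004",
    "VPN-NEW-GEO-LOGIN-001": "T1078",  # hypothetical rule
}
covered, gaps = coverage_report(relevant, rules)
print("covered:", covered)
print("gaps:", gaps)
```

Having a rule mapped to a technique only shows intent. Whether the rule actually fires is what the testing step establishes.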

The goal is not more rules. It is better rules. A SOC with 200 high-fidelity rules that detect real attacks reliably is significantly more effective than a SOC with 2,000 rules that collectively produce noise and miss the attacks they were supposed to catch.

The rule that fires daily and catches nothing is not a safety net. It is technical debt that is actively making your SOC worse.