HCAD Early Threat Detection

HCAD (Host & Context Anomaly Detection) collects behavioural signals from authenticated HTTP requests, scores them, and starts the asynchronous AI analysis pipeline before the request ever reaches a protected resource (@Protectable). HCAD itself never blocks or challenges — it only decides whether the LLM pipeline should start analysing early.

Overview

HCADFilterConfigurer inserts HCADFilter before Spring Security's AuthorizationFilter, so early threat detection runs before any controller or @Protectable method is reached. The configurer currently runs at order SecurityConfigurer.HIGHEST_PRECEDENCE + 115, followed by AuthenticatedPendingAnomalyTriggerFilter at HIGHEST_PRECEDENCE + 116. On every request HCAD collects nine dimensions of behavioural signals, combines them into a risk score using anchor and corroborating signal weights, and emits an early-analysis event when the score reaches redlineScore = 70 with at least one anchor signal present.

Design principle. HCAD always lets requests through (fail-open) and never blocks or challenges on its own. Enforcement is still handled exclusively by the existing asynchronous Zero Trust path backed by ZeroTrustActionRepository. HCAD only makes that path start earlier.
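
The fail-open principle can be pictured with a small, framework-free sketch. The Chain interface and the FailOpenFilterSketch class are hypothetical stand-ins for the servlet filter chain and HCADFilter; the real filter's API is not shown in this document, so this only illustrates the idea that a collection failure must never stop the request.

```java
import java.util.function.Consumer;

// Hypothetical stand-in for a servlet FilterChain; not the real HCADFilter API.
interface Chain {
    void proceed(String request);
}

final class FailOpenFilterSketch {
    private final Consumer<String> signalCollector;

    FailOpenFilterSketch(Consumer<String> signalCollector) {
        this.signalCollector = signalCollector;
    }

    // Collects signals, swallows any collection failure, and always continues the chain.
    void doFilter(String request, Chain chain) {
        try {
            signalCollector.accept(request);
        } catch (RuntimeException e) {
            // Fail-open: a scoring problem must never block or challenge the request.
        }
        chain.proceed(request);
    }
}
```

Even when the collector throws, the request still reaches the next element of the chain, which mirrors the "never blocks" guarantee described above.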

From request arrival to decision

Stage 1: HTTP request. Authenticated session enters the filter chain.
Stage 2: HCADFilter. Collects signals. Never blocks.
Stage 3: Signal scoring. 9 dimensions → risk score.
Stage 4: Early analysis. Fires an AI event when the threshold is crossed.
Stage 5: LLM pipeline. Joins the existing async analysis path.

When analysis begins

Without HCAD early detection, AI analysis begins only when a protected resource is reached. With HCAD enabled, requests that already carry strong anomaly signals get their analysis scheduled at the filter stage — before any controller code runs.

Baseline path (no early detection): Authentication → Filter chain → Idle wait for controller → LLM analysis → Enforcement
With HCAD early detection: Authentication → HCAD collection → Early LLM analysis → Enforcement

Nine behavioural signal dimensions

On every request, HCAD builds an HCADContext value object that carries nine dimensions of information: identity, session, device, geography, behaviour, authentication, resource, intent, and baseline. The same object is used by the scorer and later by the LLM pipeline as shared context.

ID
Request identity
  • userId · sessionId
  • requestPath · httpMethod
  • remoteIp
GEO
Time & geography
  • country · city
  • latitude · longitude
  • currentAccessHour
NEW
Novelty
  • isNewSession
  • isNewDevice
  • isNewUser
AUTH
Authentication
  • authenticationMethod
  • failedLoginAttempts
  • hasValidMFA
BHV
Behaviour pattern
  • recentRequestCount (5 min)
  • lastRequestInterval
  • previousPath
BL
Normal-behaviour baseline
  • baselineConfidence
  • updateCount
  • avgTrustScore
RES
Resource
  • resourceType
  • isSensitiveResource
INT
Intent
  • intentBotUserAgent
  • intentMissingReferer
VEC
Vectorisation
  • toVector() — 384 dims
  • toJson()
  • toCompactString()

Signal weight model

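Signals fall into two classes. Anchor signals carry the largest weights, and at least one must be present before a request can enter the early-analysis path. Corroborating signals carry meaning only when several accumulate or when they reinforce an anchor. The final rule is "at least one anchor AND total ≥ 70". No single signal reaches 70 on its own (the heaviest anchor weighs 45), so the rule is met by combinations such as IMPOSSIBLE_TRAVEL (45) + NEW_DEVICE (25), or IMPOSSIBLE_TRAVEL (45) + SENSITIVE_SURFACE (20) + BASELINE_UNCERTAIN (5).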

Anchor signals

IMPOSSIBLE_TRAVEL: 45
AUTH_CONTEXT_INCONSISTENT: 40
FAILED_LOGIN_BURST: 30
NEW_DEVICE: 25

Corroborating signals

SENSITIVE_SURFACE: 20
REQUEST_BURST: 10
RAPID_SEQUENCE: 10
PREVIOUS_PATH_JUMP: 10
BASELINE_UNCERTAIN: 5
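
The weight model can be sketched as an enum plus two small functions. The signal names and weights come from the lists above; the type names WeightedSignalSketch and SignalScoreSketch are hypothetical, and capping the sum at 100 is an assumption made here only to stay on the documented 0 to 100 scale.

```java
import java.util.Set;

// Names and weights follow the lists above; the type itself is illustrative.
enum WeightedSignalSketch {
    IMPOSSIBLE_TRAVEL(45, true),
    AUTH_CONTEXT_INCONSISTENT(40, true),
    FAILED_LOGIN_BURST(30, true),
    NEW_DEVICE(25, true),
    SENSITIVE_SURFACE(20, false),
    REQUEST_BURST(10, false),
    RAPID_SEQUENCE(10, false),
    PREVIOUS_PATH_JUMP(10, false),
    BASELINE_UNCERTAIN(5, false);

    final int weight;
    final boolean anchor;

    WeightedSignalSketch(int weight, boolean anchor) {
        this.weight = weight;
        this.anchor = anchor;
    }
}

final class SignalScoreSketch {
    static final int REDLINE_SCORE = 70;

    // Sums the weights of the detected signals.
    // Assumption: the sum is capped at 100 to keep the score on a 0-100 scale.
    static int score(Set<WeightedSignalSketch> detected) {
        int sum = detected.stream().mapToInt(s -> s.weight).sum();
        return Math.min(sum, 100);
    }

    // Eligible only with at least one anchor AND a total of 70 or more.
    static boolean eligible(Set<WeightedSignalSketch> detected) {
        boolean hasAnchor = detected.stream().anyMatch(s -> s.anchor);
        return hasAnchor && score(detected) >= REDLINE_SCORE;
    }
}
```

With this sketch, IMPOSSIBLE_TRAVEL alone scores 45 and is not eligible, all five corroborating signals together score 55 and are not eligible (no anchor), and IMPOSSIBLE_TRAVEL with SENSITIVE_SURFACE and BASELINE_UNCERTAIN scores exactly 70 and is eligible.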

Risk score bands

The aggregated score on a 0–100 scale is classified into four bands. Early analysis kicks in once the request enters the REDLINE band and an anchor signal is present.

Band | Score range | Meaning | Triggers early analysis
LOW | 0 – 29 | Normal | No
MEDIUM | 30 – 49 | Mild anomaly | No
HIGH | 50 – 69 | Worth watching | No (existing path handles it)
REDLINE | 70 – 100 | Inspect now | Yes, provided an anchor signal is present

Other thresholds. highRiskScore = 50, mediumRiskScore = 30, lowBaselineConfidenceThreshold = 0.35. When a user's normal-behaviour baseline confidence is below 0.35, the corroborating signal BASELINE_UNCERTAIN is added automatically.
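
The band boundaries and the baseline threshold above map directly onto a few comparisons. The following is a minimal sketch; the thresholds are taken from this section, while the names RiskBandSketch and BaselineCheckSketch are hypothetical.

```java
// Band boundaries follow the table above; the enum name is illustrative.
enum RiskBandSketch {
    LOW, MEDIUM, HIGH, REDLINE;

    static final int MEDIUM_RISK_SCORE = 30;
    static final int HIGH_RISK_SCORE = 50;
    static final int REDLINE_SCORE = 70;

    // Classifies a 0-100 score into one of the four bands.
    static RiskBandSketch of(int score) {
        if (score >= REDLINE_SCORE) return REDLINE;
        if (score >= HIGH_RISK_SCORE) return HIGH;
        if (score >= MEDIUM_RISK_SCORE) return MEDIUM;
        return LOW;
    }
}

final class BaselineCheckSketch {
    static final double LOW_BASELINE_CONFIDENCE_THRESHOLD = 0.35;

    // True when BASELINE_UNCERTAIN should be added: confidence strictly below 0.35.
    static boolean baselineUncertain(double baselineConfidence) {
        return baselineConfidence < LOW_BASELINE_CONFIDENCE_THRESHOLD;
    }
}
```

A score of 69 still falls in HIGH and 70 is the first REDLINE value; reaching REDLINE alone is not enough, because the anchor condition is checked separately.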

Assessment record structure

The scorer output is stored in an immutable record — HcadPreProtectablePromotionAssessment — and projected into both HCADContext.additionalAttributes and HTTP request attributes. This record is the single source of truth for every downstream explanation, audit entry, and verifier replay.

score: Integer from 0 to 100, computed from the weights of the detected signals
band: One of LOW · MEDIUM · HIGH · REDLINE
eligible: Whether the request qualifies for early analysis (true/false)
anchorSignals: Set of detected anchor signal enums
corroboratingSignals: Set of detected corroborating signal enums
reasonCodes: Array of reason codes. Used for operator explanations
summary: One-line summary string
evaluationVersion: Scorer version. Enables reproducible comparison across model updates
rawSignalSnapshot: Snapshot of raw signal values. Used by verifiers for replay
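
As an illustration of what an immutable assessment record can look like, here is a minimal sketch. The field names follow the list above, but the Java types, the validation and the name AssessmentSketch are assumptions; the actual HcadPreProtectablePromotionAssessment definition is not reproduced in this document.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Field names follow the list above; the Java types are assumptions.
record AssessmentSketch(
        int score,
        String band,
        boolean eligible,
        Set<String> anchorSignals,
        Set<String> corroboratingSignals,
        List<String> reasonCodes,
        String summary,
        String evaluationVersion,
        Map<String, Object> rawSignalSnapshot) {

    // Compact constructor: validates the range and takes defensive copies,
    // so later changes to the caller's collections cannot alter the record.
    AssessmentSketch {
        if (score < 0 || score > 100) {
            throw new IllegalArgumentException("score must be within 0..100");
        }
        anchorSignals = Set.copyOf(anchorSignals);
        corroboratingSignals = Set.copyOf(corroboratingSignals);
        reasonCodes = List.copyOf(reasonCodes);
        rawSignalSnapshot = Map.copyOf(rawSignalSnapshot);
    }
}
```

The defensive copies are what make the record safe to share between the scorer, the audit trail and a later replay: every consumer sees the same frozen values.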

Early-analysis trigger chain

Placed right after the HCAD filter, AuthenticatedPendingAnomalyTriggerFilter publishes an event only when a request passes five sequential gates. Any gate can stop the flow independently — the original request path is never affected.

1. Eligibility: Authenticated · action = PENDING_ANALYSIS · assessment present
2. Evidence: Explanation fields complete. Produces evidence report
3. Dedup: Skips if still inside cooldown (resend wait) window
4. In-flight lock: Attempts to acquire a TTL-based distributed lock
5. Publish: Emits the early-analysis event and marks cooldown

Responsible components. PendingAnomalyEligibilityGate (1), PendingAnomalyEvidenceCheckService (2), AnalysisTriggerStateRepository (3 & 4), PendingAnomalyEventTriggerService (5). Requests passing all five gates are forwarded to ZeroTrustEventPublisher.publishPreProtectableThreat().
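
The five gates can be sketched as a sequence of early returns. This is an in-memory illustration only: TriggerChainSketch and its Outcome values are hypothetical, the lock is a plain map entry rather than a TTL-based distributed lock, and the real components listed above are not modelled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TriggerChainSketch {
    enum Outcome { NOT_ELIGIBLE, EVIDENCE_INCOMPLETE, IN_COOLDOWN, LOCK_BUSY, PUBLISHED }

    private final long cooldownMillis;
    private final Map<String, Long> lastPublishedAt = new ConcurrentHashMap<>();
    private final Map<String, Boolean> inFlight = new ConcurrentHashMap<>();

    TriggerChainSketch(long cooldownMillis) {
        this.cooldownMillis = cooldownMillis;
    }

    Outcome tryPublish(String key, boolean authenticated, boolean pendingAnalysis,
                       boolean assessmentPresent, boolean evidenceComplete, long nowMillis) {
        // Gate 1: eligibility.
        if (!(authenticated && pendingAnalysis && assessmentPresent)) return Outcome.NOT_ELIGIBLE;
        // Gate 2: evidence must be complete.
        if (!evidenceComplete) return Outcome.EVIDENCE_INCOMPLETE;
        // Gate 3: dedup, skip while the cooldown window is still open.
        Long last = lastPublishedAt.get(key);
        if (last != null && nowMillis - last < cooldownMillis) return Outcome.IN_COOLDOWN;
        // Gate 4: in-flight lock (in-memory stand-in; no TTL in this sketch).
        if (inFlight.putIfAbsent(key, Boolean.TRUE) != null) return Outcome.LOCK_BUSY;
        try {
            // Gate 5: publish (omitted here) and mark the cooldown.
            lastPublishedAt.put(key, nowMillis);
            return Outcome.PUBLISHED;
        } finally {
            inFlight.remove(key);
        }
    }
}
```

Every non-PUBLISHED outcome simply ends the trigger attempt; nothing in this flow touches the original request, which matches the guarantee that the request path is never affected.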

Normal-behaviour baseline learning

Before HCAD can decide what is abnormal for a user, it has to learn what is normal. BaselineLearningService observes high-trust requests and incrementally updates two baselines — one per user, one per cohort — across IP ranges, access hours, paths, User-Agents, operating systems, and authentication methods.

Learning mechanics

Exponential Moving Average (EMA)

  • newTrust = α·current + (1-α)·old
  • Weights recent observations more
  • Absorbs gradual drift naturally
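
The EMA formula above translates into a one-line update. This is a minimal sketch; the class name EmaSketch is hypothetical, and the value of α is not stated in this document, so it is passed in as a parameter.

```java
final class EmaSketch {
    // newTrust = alpha * current + (1 - alpha) * old, as in the formula above.
    // alpha is an assumed tuning parameter in [0, 1]; its real value is not documented here.
    static double update(double old, double current, double alpha) {
        if (alpha < 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be within [0, 1]");
        }
        return alpha * current + (1.0 - alpha) * old;
    }
}
```

With α = 1 the new observation fully replaces the old value and with α = 0 the old value never changes; values in between move the baseline towards each new observation by a fraction α, which is how gradual drift gets absorbed.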

Least-Frequently-Used eviction

  • IP · path · UA · OS sets
  • Drops the least-frequent element first
  • Keeps set sizes bounded
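
A bounded set with least-frequently-used eviction can be sketched as a count map. The class LfuBoundedSetSketch and its capacity parameter are hypothetical, and ties between equally rare elements are broken arbitrarily here; how BaselineLearningService handles capacity and ties is not described in this document.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class LfuBoundedSetSketch {
    private final int capacity;
    private final Map<String, Integer> counts = new HashMap<>();

    LfuBoundedSetSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    // Records one observation. When a new element would exceed the capacity,
    // the element with the lowest observation count is dropped first.
    void observe(String value) {
        if (!counts.containsKey(value) && counts.size() >= capacity) {
            String leastFrequent =
                    Collections.min(counts.entrySet(), Map.Entry.comparingByValue()).getKey();
            counts.remove(leastFrequent);
        }
        counts.merge(value, 1, Integer::sum);
    }

    // Snapshot of the elements currently retained.
    Set<String> values() {
        return Set.copyOf(counts.keySet());
    }
}
```

With a capacity of 2, observing "a" twice, then "b", then "c" evicts "b" (seen once) and keeps "a" (seen twice) alongside the newcomer "c".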

Personal baseline

  • Activates after 10+ samples
  • Confidence tier rises gradually
  • Stored per user

Organisational baseline

  • Fallback when personal data is absent
  • Cohort-level statistics
  • Protects brand-new users
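
The fallback between the two baselines can be sketched as a single selection step. The 10-sample activation point comes from the list above; the Baseline record, its fields and the name BaselineSelectionSketch are hypothetical.

```java
final class BaselineSelectionSketch {
    // "Activates after 10+ samples": the personal baseline is used from the 10th sample on.
    static final int MIN_PERSONAL_SAMPLES = 10;

    // Illustrative shape only; the real baseline types are not shown in this document.
    record Baseline(String scope, int updateCount) {}

    // Returns the personal baseline once it has enough samples,
    // otherwise falls back to the organisational (cohort) baseline.
    static Baseline select(Baseline personal, Baseline organisational) {
        if (personal != null && personal.updateCount() >= MIN_PERSONAL_SAMPLES) {
            return personal;
        }
        return organisational;
    }
}
```

A brand-new user has no personal baseline at all, so the cohort baseline is used until enough high-trust requests have been observed.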

Event payload fields

Requests that clear all five gates are published as PRE_PROTECTABLE_REDLINE events and then consumed by the existing asynchronous LLM pipeline and audit system. The payload contract carries every judgement input, so the same structure can later be reused by external verifier scenarios.

Field | Type | Description
hcadEscalationScore | integer | Risk score between 0 and 100
hcadEscalationBand | enum | LOW · MEDIUM · HIGH · REDLINE
hcadEscalationEligible | boolean | Whether early analysis fired
hcadEscalationReasons | string[] | Reason codes for each detected signal
hcadEscalationSummary | string | One-line summary for operator alerts
hcadEscalationVersion | string | Scorer version. Ensures reproducibility
rawSignalSnapshot | object | Raw signal values for verifier replay
action | enum | Always PENDING_ANALYSIS. Enforcement action is decided by the LLM pipeline later
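
To make the payload contract concrete, the following sketch assembles the fields above into an ordered map. The field names are taken from the table; representing the payload as a Map and the helper name PayloadSketch are assumptions, since the real event class is not shown in this document.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PayloadSketch {
    // Builds a payload with the documented field names; the Map shape is illustrative.
    static Map<String, Object> build(int score, String band, boolean eligible,
                                     List<String> reasons, String summary, String version,
                                     Map<String, Object> rawSignalSnapshot) {
        Map<String, Object> payload = new LinkedHashMap<>();
        payload.put("hcadEscalationScore", score);
        payload.put("hcadEscalationBand", band);
        payload.put("hcadEscalationEligible", eligible);
        payload.put("hcadEscalationReasons", List.copyOf(reasons));
        payload.put("hcadEscalationSummary", summary);
        payload.put("hcadEscalationVersion", version);
        payload.put("rawSignalSnapshot", Map.copyOf(rawSignalSnapshot));
        // The action is always PENDING_ANALYSIS; enforcement is decided later.
        payload.put("action", "PENDING_ANALYSIS");
        return payload;
    }
}
```

Keeping the action fixed at PENDING_ANALYSIS in the builder reflects that HCAD only requests analysis and never chooses an enforcement action itself.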

Infrastructure modes & wiring

HCAD's session metadata, request counters, and device records live in a storage layer that swaps implementations based on the infrastructure mode. Both implementations satisfy the same HCADDataStore contract, so application code is unaffected.

STANDALONE

Default mode for development and testing. All state lives in-process.

InMemoryHCADDataStore
  • Backed by ConcurrentHashMap
  • 5-minute request window held in a TreeMap
  • Zero external dependencies
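
A 5-minute request window backed by a TreeMap can be sketched as follows. This is an assumption about how such a window might work, not the InMemoryHCADDataStore source; the class name RequestWindowSketch and the millisecond timestamps are hypothetical.

```java
import java.util.TreeMap;

final class RequestWindowSketch {
    private static final long WINDOW_MILLIS = 5 * 60 * 1000L;

    // timestamp in milliseconds -> number of requests recorded at that instant
    private final TreeMap<Long, Integer> hits = new TreeMap<>();

    // Records one request and returns how many requests fall inside the last 5 minutes.
    synchronized int recordAndCount(long nowMillis) {
        hits.merge(nowMillis, 1, Integer::sum);
        // Drop every entry that is 5 minutes old or older.
        hits.headMap(nowMillis - WINDOW_MILLIS, true).clear();
        return hits.values().stream().mapToInt(Integer::intValue).sum();
    }
}
```

Because a TreeMap keeps its keys sorted, expired entries always sit at the head of the map and can be removed with a single headMap view. One window per user could be kept in a ConcurrentHashMap keyed by userId.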

DISTRIBUTED

Multi-instance deployments. State is shared through Redis.

RedisHCADDataStore
  • Hash · Set · Sorted Set structures
  • Session TTL 24 h · device TTL 30 d
  • Consistent judgement across all instances

Wiring points

HCADFilterConfigurer: Inserts HCADFilter before Spring Security's AuthorizationFilter (configurer order = HIGHEST_PRECEDENCE + 115)
PendingAnomalyTriggerConfigurer: Places the trigger filter immediately after HCADFilter
CoreHCADAutoConfiguration: Registers the scorer, data store and service beans
IdentitySecurityCoreAutoConfiguration: Integrates the HCAD path with the rest of the identity filter chain