contexa-common contexa-core autonomous

Prompt Engineering

Contexa turns every authenticated HTTP request — whether from a human or an agent — into a structured prompt an LLM can reason about consistently. Getting there takes six stages: collect → process → harden → standardise → compose → calibrate. This page follows the flow end to end.

Six-stage pipeline

A prompt is never built on the spot. Six stages collect, organise and verify every fact around the request before anything reaches the LLM. Each stage lives in its own package and the output of one stage is the input of the next.

  1. Collect (common/security): three evidence stamps
  2. Process (autonomous/context): session, work and role-scope profiles
  3. Harden (context/hardener): injection defence, format enforcement
  4. Standardise (CanonicalSecurityContext): 22 fields + 7 profiles in one record
  5. Compose (tiered/prompt): 17 section plans
  6. Calibrate (guardrail · calibration): autonomy guardrail and learned calibration

Design principle. Every stage fails open: a missing output never blocks the next stage. If the evidence for a section is absent, that section is omitted from the prompt and the LLM is told directly not to draw strong conclusions about that area.
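
The chaining and fail-open behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Contexa's implementation: the `Stage` record, the `run` method and the `missing:` marker key are all hypothetical names, and the context is simplified to a string map.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: chained stages where a failed stage never blocks the next one. */
final class FailOpenPipelineSketch {

    /** A stage maps the accumulated context to a new context (illustrative type). */
    record Stage(String name, UnaryOperator<Map<String, String>> step) {}

    static Map<String, String> run(List<Stage> stages, Map<String, String> input) {
        Map<String, String> context = new LinkedHashMap<>(input);
        for (Stage stage : stages) {
            try {
                // Each stage works on a copy, so a failure cannot corrupt earlier output.
                Map<String, String> next = stage.step().apply(new LinkedHashMap<>(context));
                if (next != null) {
                    context = next; // output of this stage is the input of the next
                }
                // A null output keeps the previous context and continues (fail open).
            } catch (RuntimeException e) {
                // Record the gap so a later stage could tell the LLM what is missing.
                context.put("missing:" + stage.name(), e.getClass().getSimpleName());
            }
        }
        return context;
    }
}
```

In this sketch a stage that throws leaves only a `missing:<stage>` marker behind, which is one way the compose stage could learn that a section has to be omitted.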

Stage 1 · Collect

Once a request enters the Spring Security filter chain, an AuthBridge implementation pulls the principal out of headers, session attributes or request attributes. The extracted evidence is packaged into three kinds of evidence stamps. Those three stamps are the raw input for every later stage.

Authentication stamp (AuthenticationStamp)
  • principalType · subject class
  • authenticationType
  • authenticationAssurance
  • mfaCompleted
  • authenticationSource
  • sessionId · authenticationTime
Authorization stamp (AuthorizationStamp)
  • effect (ALLOW · DENY · UNKNOWN)
  • privileged action flag
  • policyId · policyVersion
  • decisionSource
  • effectiveRoles · effectiveAuthorities
Delegation stamp (DelegationStamp)
  • subjectId · the human delegator
  • agentId · the acting principal
  • objectiveId · objectiveFamily
  • allowedOperations
  • allowedResources
  • containmentOnly · expiresAt
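
As a data-structure sketch, the three stamps could be modelled as immutable records. The field names follow the lists above; the field types, the `Effect` enum name, the `privileged` flag name and the `isExpired` helper are assumptions made for illustration and may not match the real classes.

```java
import java.time.Instant;
import java.util.Set;

/** Hypothetical shapes for the three evidence stamps; types are assumed, not confirmed. */
final class EvidenceStampsSketch {

    enum Effect { ALLOW, DENY, UNKNOWN }

    record AuthenticationStamp(String principalType, String authenticationType,
            String authenticationAssurance, boolean mfaCompleted,
            String authenticationSource, String sessionId, Instant authenticationTime) {}

    record AuthorizationStamp(Effect effect, boolean privileged, String policyId,
            String policyVersion, String decisionSource,
            Set<String> effectiveRoles, Set<String> effectiveAuthorities) {}

    record DelegationStamp(String subjectId, String agentId, String objectiveId,
            String objectiveFamily, Set<String> allowedOperations,
            Set<String> allowedResources, boolean containmentOnly, Instant expiresAt) {

        /** Illustrative helper: a delegation counts as expired at or after expiresAt. */
        boolean isExpired(Instant now) {
            return expiresAt != null && !now.isBefore(expiresAt);
        }
    }
}
```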

Five evidence quality tiers

Not all evidence carries the same weight. A signal the customer sent explicitly is the strongest; a platform-observed structural signal comes next; a fallback derived from runtime continuity ranks third. Heuristic hints and unavailable evidence sit below all three. BridgeSemanticBoundaryPolicy tags every collected field with one of the tiers below.

  1. EXPLICIT_CUSTOMER_SIGNAL (Top): Signal the customer sent explicitly in headers or parameters
  2. STRUCTURAL_DISCOVERY_ONLY (High): Signal the platform auto-discovered from structural patterns
  3. DERIVED_RUNTIME_FALLBACK (Medium): Signal inferred from runtime continuity as a fallback
  4. HEURISTIC_HINT_ONLY (Low): Hint only. Semantic conclusions are not allowed
  5. UNAVAILABLE (None): No evidence for this area
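
One way to picture the tiers is an enum whose declaration order encodes strength. The constant names come from the list above; the `strongerThan` and `allowsSemanticConclusion` helpers are hypothetical, and the cut-off (only the top three tiers support semantic conclusions) is inferred from the tier descriptions rather than confirmed.

```java
/** Hypothetical sketch of the five evidence quality tiers, strongest first. */
final class EvidenceTierSketch {

    enum EvidenceQualityTier {
        EXPLICIT_CUSTOMER_SIGNAL,
        STRUCTURAL_DISCOVERY_ONLY,
        DERIVED_RUNTIME_FALLBACK,
        HEURISTIC_HINT_ONLY,
        UNAVAILABLE;

        /** Earlier constants are stronger, so a smaller ordinal wins. */
        boolean strongerThan(EvidenceQualityTier other) {
            return compareTo(other) < 0;
        }

        /** Assumption: hints and missing evidence never justify a semantic conclusion. */
        boolean allowsSemanticConclusion() {
            return compareTo(HEURISTIC_HINT_ONLY) < 0;
        }
    }
}
```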

Stage 2 · Process

Raw evidence alone is not enough for the LLM to decide whether a user is acting differently from usual. Stage 2 expands it through three families of components that turn primitive facts into observable patterns.

Collectors
  • SessionNarrativeCollector — in-session action sequence
  • ProtectableWorkProfileCollector — usual work profile
  • RoleScopeCollector — current role scope snapshot
Enrichers
  • AuthenticationContextProvider — auth strength
  • DelegationContextProvider — delegation scope
  • PeerCohortContextProvider — cohort outliers
  • FrictionContextProvider — MFA & approval history
  • OrganizationContextProvider — org hierarchy
  • ReasoningMemoryContextProvider — prior cases
Inference
  • ObjectiveDriftEvaluator — agent objective drift
  • ObservedScopeInferenceService — observed scope
  • ContextCoverageEvaluator — coverage level
Agent-specific inference. Unlike humans, agents must act only within their allowed objective. ObjectiveDriftEvaluator checks in real time whether the request's resource and action family fall inside the allowedOperations and allowedResources declared by the delegation stamp.
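
The containment check can be illustrated with a small sketch. It assumes allowedOperations and allowedResources are plain string sets matched exactly; the real ObjectiveDriftEvaluator may use families, patterns or hierarchies. The `Drift` enum and `evaluate` method are hypothetical names.

```java
import java.util.Set;

/** Hypothetical sketch of an objective-drift check against a delegation stamp. */
final class ObjectiveDriftSketch {

    enum Drift { WITHIN_OBJECTIVE, OPERATION_DRIFT, RESOURCE_DRIFT, OPERATION_AND_RESOURCE_DRIFT }

    /** Assumption: exact-match membership; real matching rules are not shown in the source. */
    static Drift evaluate(Set<String> allowedOperations, Set<String> allowedResources,
                          String actionFamily, String resource) {
        boolean operationAllowed = allowedOperations.contains(actionFamily);
        boolean resourceAllowed = allowedResources.contains(resource);
        if (operationAllowed && resourceAllowed) {
            return Drift.WITHIN_OBJECTIVE;
        }
        if (!operationAllowed && !resourceAllowed) {
            return Drift.OPERATION_AND_RESOURCE_DRIFT;
        }
        return operationAllowed ? Drift.RESOURCE_DRIFT : Drift.OPERATION_DRIFT;
    }
}
```

For example, an agent delegated to READ `reports` that attempts to DELETE `reports` would be classified as OPERATION_DRIFT in this sketch.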

Stage 3 · Harden

Prompt injection, field spoofing and signal forgery are the top threats for any LLM-based security system. CanonicalSecurityContextHardener puts every field through field-specific normalisation before it ever reaches the model.

Input validation
  1. Replace null fields with safe defaults
  2. Trim strings and cap length
  3. Normalise enum values
  4. Validate time and coordinate ranges
  5. Normalise language / country codes to ISO
  6. Decode payloads and filter to 80%+ printable chars
Profile hardening
  1. Preserve session narrative burst flags
  2. Drop negative values from the work profile
  3. Deduplicate role scope while keeping order
  4. Pin approval-lineage order in friction
  5. Clean reasoning-memory guardrails
  6. Normalise trust profile decision states

Stage 4 · Standardise

After stage 3, every piece of evidence is consolidated into a single standard context model, CanonicalSecurityContext. The current implementation is a Lombok-backed class; the hardener normalises its fields and fills required defaults before the compose stage receives it. Twenty-two fields and seven profiles sit in well-known places, so the compose stage always knows where each field lives.

Subject

  • userId
  • organizationId
  • tenantId
  • principalType
  • roleSet
  • authoritySet

Session

  • sessionId
  • clientIp
  • userAgent
  • authenticationType
  • mfaVerified
  • concurrentSessions

Delegation

  • delegated
  • agentId
  • objectiveId
  • allowedOperations
  • allowedResources
  • objectiveDrift

Authorization

  • effectiveRoles
  • effectivePermissions
  • scopeTags
  • authorizationEffect
  • policyId
  • policyVersion

Intent

  • botUserAgent
  • impossibleTravel
  • intentMissingReferer

Location · IP

  • country · city
  • latitude · longitude
  • IP band · ASN

Resource

  • resourceType
  • businessLabel
  • sensitivity
  • actionFamily

Seven profiles

  • SessionNarrative
  • Work
  • RoleScope
  • PeerCohort
  • Friction
  • ReasoningMemory
  • ObservedScope

Four coverage levels

ContextCoverageEvaluator grades how complete the record is by counting filled fields and available profiles. A low grade automatically injects a warning into the prompt telling the LLM not to draw strong conclusions from "usual work patterns".

  • BUSINESS_AWARE: Identity, session, scope, resource and observed patterns all present
  • SCOPE_AWARE: Identity and scope only; observed profiles missing
  • IDENTITY_AWARE: Identity only; scope and profiles missing
  • ENVIRONMENT_ONLY: Only environmental signals (IP, time) available
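
The grading logic can be approximated with a simple decision ladder. This sketch reduces the field and profile counts to three booleans, which is a deliberate simplification; the `grade` and `needsPatternWarning` methods are hypothetical, and treating every level below BUSINESS_AWARE as "low" is an assumption.

```java
/** Hypothetical sketch of coverage grading; the real evaluator counts fields and profiles. */
final class CoverageSketch {

    enum CoverageLevel { BUSINESS_AWARE, SCOPE_AWARE, IDENTITY_AWARE, ENVIRONMENT_ONLY }

    static CoverageLevel grade(boolean hasIdentity, boolean hasScope, boolean hasObservedProfiles) {
        if (!hasIdentity) {
            return CoverageLevel.ENVIRONMENT_ONLY;
        }
        if (!hasScope) {
            return CoverageLevel.IDENTITY_AWARE;
        }
        if (!hasObservedProfiles) {
            return CoverageLevel.SCOPE_AWARE;
        }
        return CoverageLevel.BUSINESS_AWARE;
    }

    /** Assumption: without observed profiles, claims about "usual work patterns" need a warning. */
    static boolean needsPatternWarning(CoverageLevel level) {
        return level != CoverageLevel.BUSINESS_AWARE;
    }
}
```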

Stage 5 · Compose

The standardised context now becomes a prompt. SecurityDecisionPromptSections orchestrates the work and renders 2 system sections plus 15 user sections, for 17 section plans in total. When a field is missing, the builder for that section simply emits nothing.

System side and user side

System message

Tells the LLM its role, the policy, and the expected output format. Stable across requests.

  1. Security policy instructions
  2. Format and length constraints
  3. JSON schema enforcement
  4. Governance version stamp
User message

Lists the evidence for the current request, section by section. Changes per request.

  1. Current event
  2. Canonical context and coverage
  3. Identity and authority
  4. Device, location, session
  5. Delegation, friction, behaviour profile
  6. Threat knowledge, organisational learning, and explicit missing knowledge

17 section plans

  1. SYSTEM_INSTRUCTION (system): Policy & format guidance
  2. DECISION_CONTRACT (system): Enforces the JSON schema
  3. CURRENT_REQUEST_AND_EVENT (user): Current request & payload
  4. BRIDGE_AND_COVERAGE (user): Bridge resolution & coverage
  5. IDENTITY_AND_ROLE (user): Identity & authority set
  6. RESOURCE_AND_ACTION (user): Resource & action semantics
  7. RoleScope (user): Role scope & expected families
  8. OBSERVED_AND_PERSONAL_WORK_PATTERN (user): Personal & organisational norm
  9. SUPPORTING_LEARNING_CONTEXT (user): Reference learning & supporting evidence
  10. DEVICE_CONTEXT (user): OS & browser fingerprint
  11. LOCATION_CONTEXT (user): Country · city · IP band
  12. INTENT_SIGNAL_CONTEXT (user): Request intent & referer signals
  13. SESSION_NARRATIVE (user): Session state & MFA
  14. DELEGATED_OBJECTIVE (user): Delegation scope & drift
  15. FRICTION_AND_APPROVAL (user): MFA & approval history
  16. EXPLICIT_MISSING_KNOWLEDGE (user): Missing knowledge & trust limits
  17. THREAT_LEARNING_AND_MEMORY (user): Threat intel & org learning

Conditional inclusion. Most context builders first query CanonicalContextFieldPolicy.has*(), while the missing-knowledge section decides whether to render from the coverage report and trust profiles. If a field is absent, the whole section is dropped, so the prompt always carries only "honest" information. PromptGovernanceDescriptor stamps the model and version for reproducibility.
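
Conditional inclusion can be sketched with section builders that return an empty Optional when their evidence is absent. The `SectionPlan` record, the `field` helper and the bracketed section header format are all hypothetical; only the section names are taken from the list above.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical sketch: sections without evidence are dropped from the prompt. */
final class SectionComposerSketch {

    /** A plan pairs a section name with a builder that may produce nothing. */
    record SectionPlan(String name, Function<Map<String, String>, Optional<String>> builder) {}

    /** Renders "label: value" only when the context actually holds a non-blank value. */
    static Optional<String> field(Map<String, String> context, String label, String key) {
        String value = context.get(key);
        return (value == null || value.isBlank())
                ? Optional.empty()
                : Optional.of(label + ": " + value);
    }

    /** Joins the sections that rendered; empty builders leave no trace in the prompt. */
    static String compose(List<SectionPlan> plans, Map<String, String> context) {
        return plans.stream()
                .map(plan -> plan.builder().apply(context)
                        .map(body -> "[" + plan.name() + "]\n" + body))
                .flatMap(Optional::stream)
                .collect(Collectors.joining("\n\n"));
    }
}
```

With a context that holds a country but no user agent, a LOCATION_CONTEXT plan renders and a DEVICE_CONTEXT plan is silently skipped.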

EXPLICIT_MISSING_KNOWLEDGE

EXPLICIT_MISSING_KNOWLEDGE is a required (P0) quality-indicator section that keeps the LLM from mistaking evidence gaps for certainty. SecurityContextQualityUserSectionBuilder calls PromptContextComposer.composeMissingKnowledgeSection() and renders the section only when coverage gaps or trust-profile evidence cautions exist. When no gap signal exists, the section stays empty because there is no missing knowledge to declare.

  • Coverage gaps: missingCriticalFacts, remediationHints, or confidenceWarnings create the missing-knowledge section.
  • Trust limits: ContextEvidenceLimitation, ContextTrustLimitation, ContextTrustWarning, ContextFieldCoverage, and ContextFieldLimitation are surfaced as explicit items.
  • False-positive control: stale AUTHORIZATION_EFFECT missing-context warnings are suppressed once the authorization effect has been resolved.
  • Baseline support: sparse personal or organisational baselines add BaselineGapSupport to remind the model that missing evidence is not proof of either risk or legitimacy.
  • Compression preservation: compact budgets prioritise BaselineGapSupport, ConfidenceWarning, ContextEvidenceLimitation, ContextTrustLimitation, and ContextTrustWarning. If the budget still overflows, summarised or omitted details are recorded in the compression ledger.
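
Two of the rules above, rendering only when a gap exists and suppressing a stale AUTHORIZATION_EFFECT warning, can be sketched as below. The `render` method, its parameters and the output format are hypothetical; the sketch treats each missing fact as a plain string key, which the real composer may not do.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical sketch of the missing-knowledge rendering rules. */
final class MissingKnowledgeSketch {

    static Optional<String> render(List<String> missingCriticalFacts,
                                   boolean authorizationEffectResolved) {
        // Drop the stale warning once the authorization effect is known.
        List<String> kept = missingCriticalFacts.stream()
                .filter(fact -> !(authorizationEffectResolved && fact.equals("AUTHORIZATION_EFFECT")))
                .collect(Collectors.toList());
        if (kept.isEmpty()) {
            return Optional.empty(); // no gap signal, so no section
        }
        return Optional.of("Missing knowledge: " + String.join(", ", kept));
    }
}
```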

Stage 6 · Calibrate

Getting a response back from the LLM is not the end. Two safety layers adjust the result in sequence: the autonomy guardrail and the runtime calibration.

Autonomy guardrail

If the LLM's confidence is below a threshold or the output deviates from the schema, PromptConfidenceGuardrail forces the final action up to a safer tier. The proposed action and the enforced action are stored separately to keep the audit trail intact.

ALLOW (LLM proposal) → CHALLENGE (guardrail steps in) → ESCALATE (final enforcement)
As confidence drops, the enforced action steps up one safer tier at a time.
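
A rough sketch of the step-up behaviour follows. The thresholds 0.7 and 0.4, the ordering of the `Action` enum, and the choice to send schema violations straight to ESCALATE are assumptions for illustration; PromptConfidenceGuardrail's actual rules are not documented here. The sketch does keep the proposed and enforced actions separate, as the audit trail requires.

```java
/** Hypothetical sketch of a confidence guardrail that only ever moves to a safer tier. */
final class GuardrailSketch {

    /** Assumed ordering: later constants are safer. */
    enum Action { ALLOW, CHALLENGE, ESCALATE, BLOCK }

    static final double CHALLENGE_BELOW = 0.7; // assumed threshold
    static final double ESCALATE_BELOW = 0.4;  // assumed threshold

    /** Proposed and enforced actions are kept side by side for auditing. */
    record Outcome(Action proposed, Action enforced, boolean constraintApplied) {}

    static Outcome enforce(Action proposed, double confidence, boolean schemaValid) {
        Action floor = Action.ALLOW;
        if (!schemaValid || confidence < ESCALATE_BELOW) {
            floor = Action.ESCALATE;
        } else if (confidence < CHALLENGE_BELOW) {
            floor = Action.CHALLENGE;
        }
        // Never relax: take whichever of the proposal and the floor is safer.
        Action enforced = proposed.compareTo(floor) >= 0 ? proposed : floor;
        return new Outcome(proposed, enforced, enforced != proposed);
    }
}
```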

Runtime calibration

Decisions the guardrail did not touch are then adjusted by SecurityDecisionCalibrationService using learned profiles, which apply confidence adjustments and action biases based on past observations.

  • Scenario classification — group the current request with past similar scenarios (for example, unfamiliar location + unusual resource)
  • Profile selection — pick only profiles that have enough samples and operator review
  • Action bias — one of INCREASE_CHALLENGE · DECREASE_CHALLENGE · NONE
  • BLOCK is immutable — a BLOCK decision is never changed by a bias (safety first)
  • Guardrail wins — if the guardrail already intervened, calibration is skipped
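
The rules above can be condensed into a small sketch. The three bias names come from the list; the `MIN_SAMPLES` value, the reduced `Action` enum and the mapping from each bias to a concrete transition are assumptions, since the source does not say how SecurityDecisionCalibrationService applies a bias.

```java
/** Hypothetical sketch of runtime calibration with its two hard rules. */
final class CalibrationSketch {

    enum Action { ALLOW, CHALLENGE, BLOCK }

    enum ActionBias { INCREASE_CHALLENGE, DECREASE_CHALLENGE, NONE }

    static final int MIN_SAMPLES = 50; // assumed minimum, not from the source

    /** Profile selection: enough samples and an operator review are both required. */
    static boolean eligible(int sampleCount, boolean operatorReviewed) {
        return sampleCount >= MIN_SAMPLES && operatorReviewed;
    }

    static Action calibrate(Action action, ActionBias bias, boolean guardrailIntervened) {
        // Guardrail wins, and BLOCK is immutable.
        if (guardrailIntervened || action == Action.BLOCK) {
            return action;
        }
        return switch (bias) {
            case INCREASE_CHALLENGE -> action == Action.ALLOW ? Action.CHALLENGE : action;
            case DECREASE_CHALLENGE -> action == Action.CHALLENGE ? Action.ALLOW : action;
            case NONE -> action;
        };
    }
}
```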

The final decision record

What comes out the other end is the SecurityDecision. It is not just "ALLOW" or "BLOCK". It is an auditable record that carries the LLM's raw judgement, the policy intervention, and the learning calibration — all three.

SecurityDecision fields

  • action: The LLM's first-pass proposal
  • autonomousAction: Final enforced action after guardrail and calibration
  • llmAuditRiskScore / Confidence: Raw LLM scores; audit only, never enforced
  • autonomyConstraintApplied: Whether the guardrail fired and why
  • calibrationApplied: Whether runtime calibration ran and which profile
  • processingLayer: 1 = fast tier · 2 = expert tier (escalation)

How prompts differ for humans and agents

The pipeline is the same, but the centre of gravity shifts depending on who the subject is. For a human, the key axis is "deviation from the usual pattern". For an agent, it is "staying within the allowed objective".

Human
  • Subject type: HUMAN
  • Delegation stamp: Usually absent
  • Key profiles: Work · PeerCohort · Friction
  • Key sections: IdentityAuthority · BehaviorProfile
  • Decision axis: Deviation from usual behaviour
  • MFA signal: Verified at login time
Agent
  • Subject type: AGENT · SERVICE_ACCOUNT
  • Delegation stamp: Required. delegated = true + agentId
  • Key profiles: ObservedScope · RoleScope
  • Key sections: Delegation · ObjectiveDrift
  • Decision axis: Drift from the allowed objective
  • MFA signal: Verified when the delegation token was issued