Model Providers
The model provider system manages LLM models across multiple backends. The DynamicModelRegistry auto-discovers Spring AI ChatModel beans and custom ModelProvider implementations, while the DynamicModelSelectionStrategy resolves the runtime model for each request from explicit model hints, tier mapping, availability, and configured fallbacks.
Overview
Contexa supports multiple LLM providers simultaneously. At startup, the DynamicModelRegistry discovers all available models through three mechanisms:
- ModelProvider beans — custom provider implementations registered in the Spring context.
- Spring AI ChatModel beans — auto-detected ChatModel beans, typically created from contexa.llm.chat.ollama.*, spring.ai.anthropic.*, or spring.ai.openai.* configuration.
- TieredLLMProperties — model IDs from the configured layer-1 and layer-2 hierarchy.
The registry automatically infers the provider from the model class name (Ollama, Anthropic, OpenAI, Gemini, Mistral, Azure, Bedrock, HuggingFace) and performs health checks at initialization.
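The class-name inference can be pictured as a case-insensitive substring match over known provider markers. This is a sketch under assumptions: the `ProviderInference` class, its marker ordering, and the `"unknown"` fallback are illustrative, not the registry's actual internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of class-name-based provider inference; the actual
// registry's matching rules may differ.
class ProviderInference {

    // Ordered markers: "azure" must precede "openai" so a class like
    // AzureOpenAiChatModel resolves to azure rather than openai.
    private static final Map<String, String> MARKERS = new LinkedHashMap<>();
    static {
        MARKERS.put("ollama", "ollama");
        MARKERS.put("anthropic", "anthropic");
        MARKERS.put("azure", "azure");
        MARKERS.put("openai", "openai");
        MARKERS.put("gemini", "gemini");
        MARKERS.put("mistral", "mistral");
        MARKERS.put("bedrock", "bedrock");
        MARKERS.put("huggingface", "huggingface");
    }

    /** Infers a provider id from a ChatModel implementation class name. */
    static String inferProvider(String className) {
        String lower = className.toLowerCase();
        for (Map.Entry<String, String> entry : MARKERS.entrySet()) {
            if (lower.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "unknown";
    }
}
```

Ordering the markers matters because some Spring AI class names contain more than one provider keyword.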
ModelProvider
The interface for custom model provider implementations. Implement this to add support for providers not covered by Spring AI auto-configuration.
public interface ModelProvider
Implementations return a ChatModel instance for the given descriptor; the config map provides runtime overrides.

ModelDescriptor
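A custom provider might look like the following. This is a minimal, self-contained sketch: the real ModelProvider method names and signatures are not shown in this document, so `SimpleModelProvider` and `SimpleChatModel` are illustrative stand-ins for the actual Spring AI and Contexa types.

```java
import java.util.List;
import java.util.Map;

// Placeholder for Spring AI's ChatModel so the sketch is self-contained.
interface SimpleChatModel {
    String call(String prompt);
}

// Illustrative stand-in for the ModelProvider contract; the real interface's
// method names and signatures may differ.
interface SimpleModelProvider {
    String providerName();
    List<String> modelIds();
    SimpleChatModel createChatModel(String modelId, Map<String, Object> config);
}

// A hypothetical provider wrapping an in-house model.
class EchoModelProvider implements SimpleModelProvider {
    @Override public String providerName() { return "echo"; }
    @Override public List<String> modelIds() { return List.of("echo-1"); }

    @Override
    public SimpleChatModel createChatModel(String modelId, Map<String, Object> config) {
        // The config map supplies runtime overrides, as described above.
        String prefix = (String) config.getOrDefault("prefix", "echo: ");
        return prompt -> prefix + prompt;
    }
}
```

Registering such an implementation as a Spring bean is what lets the DynamicModelRegistry discover it at startup.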
Describes a model's identity, capabilities, default options, and current status.
@Data @Builder
public class ModelDescriptor
| Property | Type | Description |
|---|---|---|
| modelId | String | Unique model identifier (e.g., "llama3.1:8b", "claude-3-opus"). |
| displayName | String | Human-readable model name. |
| provider | String | Provider name (ollama, anthropic, openai). |
| tier | Integer | The configured runtime tier assigned to the model (1 or 2); null if unassigned. |
| version | String | Model version string. |
| capabilities | ModelCapabilities | What the model supports (streaming, tool calling, multimodal, context window, output budget). |
| options | ModelOptions | Default sampling options (temperature, topP, topK, repetitionPenalty). |
| status | ModelStatus | AVAILABLE or UNAVAILABLE. |
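The descriptor's shape can be approximated as a record. This is a standalone sketch only: the real class uses Lombok's @Data/@Builder, and the capabilities and options fields are omitted here to keep the example self-contained.

```java
// Mirrors the status column above.
enum ModelStatus { AVAILABLE, UNAVAILABLE }

// Standalone approximation of ModelDescriptor's shape (not the real class).
record ModelDescriptorSketch(
        String modelId,      // e.g. "llama3.1:8b"
        String displayName,
        String provider,     // e.g. "ollama"
        Integer tier,        // 1 or 2; null if unassigned
        String version,
        ModelStatus status) {
}

class DescriptorDemo {
    static ModelDescriptorSketch sample() {
        return new ModelDescriptorSketch(
                "llama3.1:8b", "Llama 3.1 8B", "ollama", 2, "3.1",
                ModelStatus.AVAILABLE);
    }
}
```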
ModelCapabilities
| Field | Type | Default |
|---|---|---|
| streaming | boolean | true |
| toolCalling | boolean | false |
| functionCalling | boolean | false |
| multiModal | boolean | false |
| maxTokens | int | 4096 |
| contextWindow | int | 4096 |
| maxOutputTokens | int | 4096 |
ModelOptions
| Field | Type | Default |
|---|---|---|
| temperature | Double | 0.7 |
| topP | Double | 0.9 |
| topK | Integer | null |
| repetitionPenalty | Double | 1.0 |
DynamicModelRegistry
Central registry that discovers, manages, and provides access to all LLM models. Auto-initializes at application startup.
public class DynamicModelRegistry
The registry:
- Resolves the ChatModel for a given model ID, creating and caching the instance if not already loaded; throws ModelSelectionException if the ID is not found.
- Returns AVAILABLE model descriptors, filtered by provider name.
- Calls shutdown() on registered providers during application shutdown.

ModelSelectionStrategy
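The create-and-cache behavior can be sketched with a ConcurrentHashMap. The names below (`ChatModelCache`, `getModel`) are illustrative, not the registry's actual API, and a plain Object stands in for ChatModel so the sketch is self-contained.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of the registry's lazy create-and-cache behavior.
class ChatModelCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final Set<String> knownIds;                // discovered model IDs
    private final Function<String, Object> factory;    // e.g. delegates to a ModelProvider

    ChatModelCache(Set<String> knownIds, Function<String, Object> factory) {
        this.knownIds = knownIds;
        this.factory = factory;
    }

    /** Returns the cached instance, creating it on first access. */
    Object getModel(String modelId) {
        if (!knownIds.contains(modelId)) {
            // The real registry throws ModelSelectionException here.
            throw new IllegalArgumentException("Unknown model: " + modelId);
        }
        return cache.computeIfAbsent(modelId, factory);
    }
}
```

computeIfAbsent gives the thread-safe "create once, reuse afterwards" semantics the registry needs when multiple requests ask for the same model concurrently.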
Interface for model selection logic. The DynamicModelSelectionStrategy is the default implementation.
public interface ModelSelectionStrategy
DynamicModelSelectionStrategy
The default selection strategy uses an explicit-model-first fallback chain:
public class DynamicModelSelectionStrategy implements ModelSelectionStrategy
- Explicit model request — ExecutionContext.preferredModel or metadata keys such as requestedModelId, preferredModel, runtimeModelId, and modelId are tried first.
- Tier resolution — If no explicit model is present, the strategy resolves a semantic tier from analysisLevel, then securityTaskType, then ExecutionContext.tier. Values above 1 are normalized to the configured layer 2.
- Primary/backup lookup — The strategy uses TieredLLMProperties to try the primary model for the resolved layer, then its backup model.
- Primary ChatModel fallback — If tier resolution fails, the auto-configured primary ChatModel bean is used as the last fallback.
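The fallback chain above can be sketched as plain Java. This is a simplified illustration: the `tier` parameter stands for the already-resolved semantic tier (the real strategy derives it from analysisLevel, securityTaskType, and ExecutionContext.tier), and availability checks are reduced to null checks.

```java
import java.util.Map;

// Sketch of the explicit-model-first fallback chain. Key names follow the
// steps above; the real strategy reads ExecutionContext and TieredLLMProperties.
class SelectionSketch {

    static final String[] METADATA_KEYS =
            { "requestedModelId", "preferredModel", "runtimeModelId", "modelId" };

    static String select(String preferredModel, Map<String, String> metadata,
                         Integer tier, Map<Integer, String> primaryByLayer,
                         Map<Integer, String> backupByLayer, String primaryFallback) {
        // 1. Explicit model request wins.
        if (preferredModel != null) {
            return preferredModel;
        }
        for (String key : METADATA_KEYS) {
            if (metadata.containsKey(key)) {
                return metadata.get(key);
            }
        }
        // 2. Tier resolution: values above 1 normalize to configured layer 2.
        if (tier != null) {
            int layer = tier <= 1 ? 1 : 2;
            // 3. Primary model for the layer, then its backup.
            String model = primaryByLayer.get(layer);
            if (model == null) {
                model = backupByLayer.get(layer);
            }
            if (model != null) {
                return model;
            }
        }
        // 4. Last resort: the auto-configured primary ChatModel.
        return primaryFallback;
    }
}
```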
Configuration
Tiered Model Hierarchy
```yaml
spring:
  ai:
    security:
      layer1:
        model: qwen2.5:14b
        backup:
          model: qwen2.5:7b
      layer2:
        model: exaone3.5:latest
        backup:
          model: llama3.1:8b
    tiered:
      layer1:
        timeout-ms: 30000
      layer2:
        timeout-ms: 60000

contexa:
  llm:
    chat:
      ollama:
        base-url: http://localhost:11434
        model: qwen2.5:14b
```
Selection order: explicit model request → analysisLevel → securityTaskType → explicit tier → primary ChatModel fallback. Only layers 1 and 2 are configured at runtime, so higher semantic tiers are normalized to layer 2.