contexa-core

Model Providers

The model provider system manages LLM models across multiple backends. The DynamicModelRegistry auto-discovers Spring AI ChatModel beans and custom ModelProvider implementations, while the DynamicModelSelectionStrategy resolves the runtime model for each request from explicit model hints, tier mapping, availability, and configured fallbacks.

Overview

Contexa supports multiple LLM providers simultaneously. At startup, the DynamicModelRegistry discovers all available models through three mechanisms:

  1. ModelProvider beans — custom provider implementations registered in the Spring context.
  2. Spring AI ChatModel beans — auto-detected ChatModel beans, typically created from contexa.llm.chat.ollama.*, spring.ai.anthropic.*, or spring.ai.openai.* configuration.
  3. TieredLLMProperties — model IDs from the configured layer-1 and layer-2 hierarchy.

The registry automatically infers the provider from the model class name (Ollama, Anthropic, OpenAI, Gemini, Mistral, Azure, Bedrock, HuggingFace) and performs health checks at initialization.
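The class-name-based inference can be sketched as a simple substring match. The mapping below is an assumption for illustration, not the registry's exact rules; note that an Azure check must precede the OpenAI check because Spring AI's Azure model class name contains both:

```java
class ProviderInference {
    // Sketch of the class-name-based provider inference described above;
    // the substring ordering and the "unknown" fallback are assumptions.
    static String inferProvider(String chatModelClassName) {
        String n = chatModelClassName.toLowerCase();
        if (n.contains("ollama")) return "ollama";
        if (n.contains("anthropic")) return "anthropic";
        if (n.contains("azure")) return "azure";   // before "openai": AzureOpenAiChatModel
        if (n.contains("openai")) return "openai";
        if (n.contains("gemini")) return "gemini";
        if (n.contains("mistral")) return "mistral";
        if (n.contains("bedrock")) return "bedrock";
        if (n.contains("huggingface")) return "huggingface";
        return "unknown";
    }
}
```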

ModelProvider

The interface for custom model provider implementations. Implement this to add support for providers not covered by Spring AI auto-configuration.

public interface ModelProvider
getProviderName() String
Returns the unique provider identifier (e.g., "ollama", "anthropic", "openai").
getDescription() String
Returns a human-readable description of the provider implementation.
getAvailableModels() List<ModelDescriptor>
Returns all models currently exposed by this provider.
getModelDescriptor(String modelId) ModelDescriptor
Returns the descriptor for a specific model ID when the provider can resolve it.
createModel(ModelDescriptor descriptor, Map<String, Object> config) ChatModel
Creates a ChatModel instance for the given descriptor. The config map provides runtime overrides.
supportsModel(String modelId) boolean
Returns whether this provider can serve the given model ID.
checkHealth(String modelId) HealthStatus
Performs a health check on the specified model and returns the result with response time metrics.
initialize(Map<String, Object> config) void
Initializes the provider with the given configuration. Called once during registry startup.
shutdown() void
Releases provider resources during registry shutdown.
isReady() boolean
Reports whether the provider is initialized and ready for model access or health checks.
refreshModels() void
Refreshes the provider-side model cache before the registry re-registers descriptors.
getMetrics() Map<String, Object>
Returns provider-level metrics exposed by the implementation.
getPriority() int
Returns the provider priority. Lower values are preferred when multiple providers support the same model. Default: 100.
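A minimal sketch of a custom provider following the contract above. LocalEchoProvider and its single hard-coded model are hypothetical, a local record stands in for the real ModelDescriptor so the sketch stays self-contained, and the createModel, checkHealth, refreshModels, and getMetrics methods are omitted:

```java
import java.util.List;
import java.util.Map;

class LocalEchoProvider {
    // Stand-in for the real ModelDescriptor, trimmed to two fields.
    record Descriptor(String modelId, String provider) {}

    private boolean ready = false;
    private final List<Descriptor> models =
            List.of(new Descriptor("echo:latest", "local-echo"));

    public String getProviderName() { return "local-echo"; }
    public String getDescription() { return "Toy provider for illustration"; }
    public void initialize(Map<String, Object> config) { ready = true; }
    public boolean isReady() { return ready; }
    public List<Descriptor> getAvailableModels() { return models; }
    public boolean supportsModel(String modelId) {
        return models.stream().anyMatch(d -> d.modelId().equals(modelId));
    }
    public int getPriority() { return 100; } // default per the interface contract
    public void shutdown() { ready = false; }
}
```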

ModelDescriptor

Describes a model's identity, capabilities, default options, and current status.

@Data @Builder
public class ModelDescriptor
Property      Type               Description
modelId       String             Unique model identifier (e.g., "llama3.1:8b", "claude-3-opus").
displayName   String             Human-readable model name.
provider      String             Provider name (ollama, anthropic, openai).
tier          Integer            The configured runtime tier assigned to the model (1 or 2). Null if unassigned.
version       String             Model version string.
capabilities  ModelCapabilities  What the model supports (streaming, tool calling, multimodal, context window, output budget).
options       ModelOptions       Default sampling options (temperature, topP, topK, repetitionPenalty).
status        ModelStatus        AVAILABLE or UNAVAILABLE.

ModelCapabilities

Field            Type     Default
streaming        boolean  true
toolCalling      boolean  false
functionCalling  boolean  false
multiModal       boolean  false
maxTokens        int      4096
contextWindow    int      4096
maxOutputTokens  int      4096

ModelOptions

Field              Type     Default
temperature        Double   0.7
topP               Double   0.9
topK               Integer  null
repetitionPenalty  Double   1.0
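The descriptor shape and its defaults can be sketched with plain records. The real classes are Lombok @Data/@Builder beans with more fields, so everything below is a trimmed stand-in populated with the default values from the tables above:

```java
class DescriptorSketch {
    // Stand-ins mirroring a subset of ModelDescriptor / ModelCapabilities / ModelOptions.
    record Capabilities(boolean streaming, boolean toolCalling,
                        boolean multiModal, int contextWindow, int maxOutputTokens) {}
    record Options(Double temperature, Double topP, Integer topK, Double repetitionPenalty) {}
    record Descriptor(String modelId, String displayName, String provider,
                      Integer tier, Capabilities capabilities, Options options) {}

    // A layer-2 Ollama model filled with the table defaults above.
    static Descriptor example() {
        return new Descriptor("llama3.1:8b", "Llama 3.1 8B", "ollama", 2,
                new Capabilities(true, false, false, 4096, 4096),
                new Options(0.7, 0.9, null, 1.0));
    }
}
```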

DynamicModelRegistry

Central registry that discovers, manages, and provides access to all LLM models. Auto-initializes at application startup.

public class DynamicModelRegistry
getModel(String modelId) ChatModel
Returns the ChatModel for the given ID. Creates and caches the instance if not already loaded. Throws ModelSelectionException if not found.
getAllModels() Collection<ModelDescriptor>
Returns all registered model descriptors.
getDescriptor(String modelId) ModelDescriptor
Returns the registered descriptor for a model ID without instantiating a new ChatModel.
getModelsByProvider(String provider) List<ModelDescriptor>
Returns AVAILABLE model descriptors filtered by provider name.
registerModel(ModelDescriptor descriptor) void
Registers or merges a model descriptor. Configuration-defined tiers take precedence over provider-defined tiers.
refreshModels() void
Asks all providers to refresh their model lists and registers any newly discovered models.
updateModelStatus(String modelId, ModelStatus status) void
Updates the availability status of a registered model.
shutdown() void
Invokes shutdown() on registered providers during application shutdown.
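The create-and-cache behavior of getModel can be illustrated with a toy registry. A String stands in for the real ChatModel instance, and IllegalStateException stands in for ModelSelectionException; both substitutions are assumptions to keep the sketch self-contained:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RegistrySketch {
    // Mimics getModel's lazy create-and-cache semantics described above.
    private final Map<String, String> cache = new HashMap<>();
    private final Set<String> registered = new HashSet<>(Set.of("llama3.1:8b"));

    public String getModel(String modelId) {
        if (!registered.contains(modelId))
            throw new IllegalStateException("model not found: " + modelId);
        // First call creates the instance; later calls return the cached one.
        return cache.computeIfAbsent(modelId, id -> "chat-model:" + id);
    }

    public void registerModel(String modelId) { registered.add(modelId); }
}
```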

ModelSelectionStrategy

Interface for model selection logic. The DynamicModelSelectionStrategy is the default implementation.

public interface ModelSelectionStrategy
selectModel(ExecutionContext context) ChatModel
Selects the best model for the given execution context. Returns null if no model is available.
getSupportedModels() Set<String>
Returns the set of all model IDs available for selection.
isModelAvailable(String modelName) boolean
Checks whether a specific model is currently available.

DynamicModelSelectionStrategy

The default selection strategy uses an explicit-model-first fallback chain:

public class DynamicModelSelectionStrategy implements ModelSelectionStrategy
  1. Explicit model request — ExecutionContext.preferredModel or metadata keys such as requestedModelId, preferredModel, runtimeModelId, and modelId are tried first.
  2. Tier resolution — If no explicit model is present, the strategy resolves a semantic tier from analysisLevel, then securityTaskType, then ExecutionContext.tier. Values above 1 are normalized to configured layer 2.
  3. Primary/backup lookup — The strategy uses TieredLLMProperties to try the primary model for the resolved layer and then its backup model.
  4. Primary ChatModel fallback — If tier resolution fails, the auto-configured primary ChatModel bean is used as the last fallback.
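The four-step chain above can be sketched as follows. The layer map mirrors the TieredLLMProperties configuration shown later, the availability set is illustrative, and the "primary-chat-model" string stands in for the auto-configured primary ChatModel bean:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class SelectionSketch {
    // Hypothetical layer -> [primary, backup] mapping and availability set.
    static final Map<Integer, List<String>> LAYERS = Map.of(
            1, List.of("qwen2.5:14b", "qwen2.5:7b"),
            2, List.of("exaone3.5:latest", "llama3.1:8b"));
    static final Set<String> AVAILABLE = Set.of("qwen2.5:7b", "exaone3.5:latest");

    static String select(String explicitModel, Integer tier) {
        // 1. Explicit model request wins when that model is available.
        if (explicitModel != null && AVAILABLE.contains(explicitModel)) return explicitModel;
        if (tier != null) {
            // 2. Tier resolution: values above 1 normalize to layer 2.
            int layer = Math.min(Math.max(tier, 1), 2);
            // 3. Primary model first, then its backup, for the resolved layer.
            for (String id : LAYERS.get(layer))
                if (AVAILABLE.contains(id)) return id;
        }
        // 4. Last fallback: the auto-configured primary ChatModel bean.
        return "primary-chat-model";
    }
}
```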

Configuration

Tiered Model Hierarchy

YAML
spring:
  ai:
    security:
      layer1:
        model: qwen2.5:14b
        backup:
          model: qwen2.5:7b
      layer2:
        model: exaone3.5:latest
        backup:
          model: llama3.1:8b
      tiered:
        layer1:
          timeout-ms: 30000
        layer2:
          timeout-ms: 60000

contexa:
  llm:
    chat:
      ollama:
        base-url: http://localhost:11434
        model: qwen2.5:14b

Selection order: explicit model request → analysisLevel → securityTaskType → explicit tier → primary ChatModel fallback. The runtime only has configured layers 1 and 2, so higher semantic tiers are normalized to layer 2.