
[Preview] v1.81.14 - New Gateway Level Guardrails & Compliance Playground

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Deploy this version

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.81.14.rc.1
```

Key Highlights


Guardrail Garden

AI Platform Admins can now browse built-in and partner guardrails from the Guardrail Garden. Guardrails are organized by use case (blocking financial advice, filtering insults, detecting competitor mentions, and more), so you can find the right one and deploy it in a few clicks.

Guardrail Garden

3 New Built-in Guardrails

This release brings 3 new built-in guardrails that run directly on the gateway. This is great for AI Gateway Admins who need low-latency, zero-cost guardrails.

  • Denied Financial Advice - detects requests for personalized financial advice, investment recommendations, or financial planning
  • Denied Insults - detects insults, name-calling, and personal attacks directed at the chatbot, staff, or other people
  • Competitor Name Blocker - detects mentions of competitor brands in responses

These guardrails are built for production and achieved 100% precision and recall on our benchmarks.
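
Built-in guardrails are attached through the proxy config like any other guardrail. Below is a minimal sketch of the shape that config takes; the guardrail identifier shown is an assumption, so check the guardrails docs for the exact names shipped in this release.

```yaml
# proxy_config.yaml - minimal sketch of attaching a built-in guardrail.
# "denied_financial_advice" is an assumed identifier; see the guardrails
# docs for the exact names of the new built-in guardrails.
guardrails:
  - guardrail_name: "block-financial-advice"
    litellm_params:
      guardrail: denied_financial_advice   # assumed identifier
      mode: "pre_call"                     # run before the request reaches the LLM
```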

Store Model in DB Settings via UI

Previously, the store_model_in_db setting could only be configured in proxy_config.yaml under general_settings, requiring a proxy restart to take effect. Now you can enable or disable this setting directly from the Admin UI without any restarts. This is especially useful for cloud deployments where you don't have direct access to config files or want to avoid downtime. Enable store_model_in_db to move model definitions from your YAML into the database, reducing config complexity, improving scalability, and enabling dynamic model management across multiple proxy instances.

Store model in DB Setting
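
For reference, this is the config-file equivalent of the new UI toggle - the flag that previously required a restart:

```yaml
# proxy_config.yaml - config-file form of the same setting.
general_settings:
  store_model_in_db: true   # keep model definitions in the database instead of YAML
```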

Eval results

We benchmarked our new built-in guardrails against labeled datasets before shipping. You can see the results for Denied Financial Advice (207 cases) and Denied Insults (299 cases):

| Guardrail | Precision | Recall | F1 | Latency p50 | Cost/req |
|---|---|---|---|---|---|
| Denied Financial Advice | 100% | 100% | 100% | <0.1ms | $0 |
| Denied Insults | 100% | 100% | 100% | <0.1ms | $0 |

100% precision means zero false positives: no legitimate messages were incorrectly blocked. 100% recall means zero false negatives: every message that should have been blocked was caught.

Compliance Playground

The Compliance Playground lets you test any guardrail against our pre-built eval datasets or your own custom datasets, so you can see precision, recall, and false positive rate before rolling it out to production.

Compliance Playground



New Providers and Endpoints

New Providers (1 new provider)

| Provider | Supported LiteLLM Endpoints | Description |
|---|---|---|
| IBM watsonx.ai | /rerank | Rerank support for IBM watsonx.ai models |
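
If you want to route rerank calls through the gateway, a watsonx.ai rerank model can be added to model_list like any other deployment. A minimal sketch, assuming a placeholder model ID and the usual WATSONX_* environment variables for credentials:

```yaml
# proxy_config.yaml - hypothetical watsonx.ai rerank deployment.
# The model ID below is a placeholder; credentials are assumed to come
# from the standard WATSONX_* environment variables.
model_list:
  - model_name: watsonx-rerank
    litellm_params:
      model: watsonx/my-rerank-model-id   # placeholder - use your watsonx.ai rerank model ID
```

Clients would then call the proxy's /rerank endpoint with model set to watsonx-rerank.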

New LLM API Endpoints (1 new endpoint)

| Endpoint | Method | Description | Documentation |
|---|---|---|---|
| /v1/evals | POST/GET | OpenAI-compatible Evals API for model evaluation | Docs |

New Models / Updated Models

New Model Support (13 new models)

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Anthropic | claude-sonnet-4-6 | 200K | $3.00 | $15.00 | Reasoning, computer use, prompt caching, vision, PDF |
| Vertex AI | vertex_ai/claude-opus-4-6@default | 1M | $5.00 | $25.00 | Reasoning, computer use, prompt caching |
| Google Gemini | gemini/gemini-3.1-pro-preview | 1M | $2.00 | $12.00 | Audio, video, images, PDF |
| Google Gemini | gemini/gemini-3.1-pro-preview-customtools | 1M | $2.00 | $12.00 | Custom tools |
| GitHub Copilot | github_copilot/gpt-5.3-codex | 128K | - | - | Responses API, function calling, vision |
| GitHub Copilot | github_copilot/claude-opus-4.6-fast | 128K | - | - | Chat completions, function calling, vision |
| Mistral | mistral/devstral-small-latest | 256K | $0.10 | $0.30 | Function calling, response schema |
| Mistral | mistral/devstral-latest | 256K | $0.40 | $2.00 | Function calling, response schema |
| Mistral | mistral/devstral-medium-latest | 256K | $0.40 | $2.00 | Function calling, response schema |
| OpenRouter | openrouter/minimax/minimax-m2.5 | 196K | $0.30 | $1.10 | Function calling, reasoning, prompt caching |
| Fireworks AI | fireworks_ai/accounts/fireworks/models/glm-4p7 | - | - | - | Chat completions |
| Fireworks AI | fireworks_ai/accounts/fireworks/models/minimax-m2p1 | - | - | - | Chat completions |
| Fireworks AI | fireworks_ai/accounts/fireworks/models/kimi-k2p5 | - | - | - | Chat completions |

Features

  • Anthropic

    • Day 0 support for Claude Sonnet 4.6 with reasoning, computer use, and 200K context - PR #21401
    • Add Claude Sonnet 4.6 pricing - PR #21395
    • Add day 0 feature support for Claude Sonnet 4.6 (streaming, function calling, vision) - PR #21448
    • Add reasoning effort and extended thinking support for Sonnet 4.6 - PR #21598
    • Fix empty system messages in translate_system_message - PR #21630
    • Sanitize Anthropic messages for multi-turn compatibility - PR #21464
    • Map websearch tool from /v1/messages to /chat/completions - PR #21465
    • Forward reasoning field as reasoning_content in delta streaming - PR #21468
    • Add server-side compaction translation from OpenAI to Anthropic format - PR #21555
  • AWS Bedrock

    • Native structured outputs API support (outputConfig.textFormat) - PR #21222
    • Support nova/ and nova-2/ spec prefixes for custom imported models - PR #21359
    • Broaden Nova 2 model detection to support all nova-2-* variants - PR #21358
    • Clamp thinking.budget_tokens to minimum 1024 - PR #21306
    • Fix parallel_tool_calls mapping for Bedrock Converse - PR #21659
  • Google Gemini / Vertex AI

    • Day 0 support for gemini-3.1-pro-preview - PR #21568
    • Fix _map_reasoning_effort_to_thinking_level for all Gemini 3 family models - PR #21654
    • Add reasoning support via config for Gemini models - PR #21663
  • Databricks

    • Add Databricks to supported providers for response schema - PR #21368
    • Native Responses API support for Databricks GPT models - PR #21460
  • GitHub Copilot

    • Add github_copilot/gpt-5.3-codex and github_copilot/claude-opus-4.6-fast models - PR #21316
    • Fix unsupported params for ChatGPT Codex - PR #21209
    • Allow GitHub model aliases to reuse upstream model metadata - PR #21497
  • Mistral

    • Add devstral-2512 model aliases (devstral-small-latest, devstral-latest, devstral-medium-latest) - PR #21372
  • IBM watsonx.ai

  • xAI

    • Fix usage object in xAI responses - PR #21559
  • Dashscope

    • Remove list-to-str transformation that caused incorrect request formatting - PR #21547
  • hosted_vllm

    • Convert thinking blocks to content blocks for multi-turn conversations - PR #21557
  • OCI / Oracle

  • AU Anthropic

    • Fix au.anthropic.claude-opus-4-6-v1 model ID - PR #20731
  • General

    • Add routing based on reasoning support: skip deployments that don't support reasoning when thinking params are present - PR #21302
    • Add stop as supported param for OpenAI and Azure - PR #21539
    • Add store and other missing params to OPENAI_CHAT_COMPLETION_PARAMS - PR #21195, PR #21360
    • Preserve provider_specific_fields from proxy responses - PR #21220
    • Add default usage data configuration - PR #21550

Bug Fixes


LLM API Endpoints

Features

  • Responses API

    • Return finish_reason='tool_calls' when response contains function_call items - PR #19745
    • Eliminate per-chunk thread spawning in async streaming path for significantly better throughput - PR #21709
  • Evals API

    • Add support for OpenAI Evals API - PR #21375
  • Batch API

    • Add file deletion criteria with batch references - PR #21456
    • Misc bug fixes for managed batches - PR #21157
  • Pass-Through Endpoints

    • Add method-based routing for passthrough endpoints - PR #21543
    • Preserve and forward OAuth Authorization headers through proxy layer - PR #19912
  • Websearch / Tool Calling

    • Add DuckDuckGo as a search tool - PR #21467
    • Fix pre_call_deployment_hook not triggering via proxy router for websearch - PR #21433
  • General

    • Exclude tool params for models without function calling support - PR #21244
    • Add store param to OpenAI chat completion params - PR #21195
    • Add reasoning support via config for per-model reasoning configuration - PR #21663

Bugs

  • General
    • Fix api_base resolution error for models with multiple potential endpoints - PR #21658
    • Fix session grouping broken for dict rows from query_raw - PR #21435

Management Endpoints / UI

Features

  • Access Groups

    • Add Access Group Selector to Create and Edit flow for Keys/Teams - PR #21234
  • Virtual Keys

    • Fix virtual key grace period from env/UI - PR #20321
    • Fix key expiry default duration - PR #21362
    • Key Last Active Tracking: see when a key was last used - PR #21545
    • Fix /v1/models returning wildcard instead of expanded models for BYOK team keys - PR #21408
    • Return failed_tokens in delete_verification_tokens response - PR #21609
  • Models + Endpoints

    • Add Model Settings Modal to Models & Endpoints page - PR #21516
    • Allow store_model_in_db to be set via database (not just config) - PR #21511
    • Fix input_cost_per_token masked/hidden in Model Info UI - PR #21723
    • Fix credentials for UI-created models in batch file uploads - PR #21502
    • Resolve credentials for UI-created models - PR #21502
  • Teams

    • Allow team members to view entire team usage - PR #21537
    • Fix service account visibility for team members - PR #21627
    • Organization Info page: show member email, AntD tabs, reusable MemberTable - PR #21745
  • Usage / Spend Logs

    • Allow filtering Usage by User - PR #21351
    • Inject Credential Name as Tag for Usage Page filtering - PR #21715
    • Prefix credential tags and update Tag usage banner - PR #21739
    • Show retry count for requests in Logs view - PR #21704
    • Fix Aggregated Daily Activity Endpoint performance - PR #21613
  • SSO / Auth

    • Fix SSO PKCE support in multi-pod Kubernetes deployments - PR #20314
    • Preserve SSO role regardless of role_mappings config - PR #21503
  • Proxy CLI / Master Key

    • Fix master key rotation Prisma validation errors - PR #21330
    • Handle missing DATABASE_URL in append_query_params - PR #21239
  • Project Management

    • Add Project Management APIs for organizing resources - PR #21078
  • UI Improvements

    • Content Filters: help edit/view categories and 1-click add with pagination - PR #21223
    • Playground: test fallbacks with UI - PR #21007
    • Add forward_client_headers_to_llm_api toggle to general settings - PR #21776
    • Fix is_premium() debug log spam on every request - PR #20841

Bugs

  • Spend Logs: Fix cost calculation - PR #21152
  • Logs: Fix table not updating and pagination issues - PR #21708
  • Fix /get_image ignoring UI_LOGO_PATH when cached_logo.jpg exists - PR #21637
  • Fix duplicate URL in tagsSpendLogsCall query string - PR #20909
  • Preserve key_alias and team_id metadata in /user/daily/activity/aggregated after key deletion or regeneration - PR #20684
  • Uncomment response_model in user_info endpoint - PR #17430
  • Allow internal_user_viewer to access RAG endpoints; restrict ingest to existing vector stores - PR #21508
  • Suppress warning for litellm-dashboard team in agent permission handler - PR #21721

AI Integrations

Logging

  • DataDog

    • Add team tag to logs, metrics, and cost management - PR #21449
  • Prometheus

    • Fix double-counting of litellm_proxy_total_requests_metric - PR #21159
    • Guard against None metadata in Prometheus metrics - PR #21489
    • Add ASGI middleware for improved Prometheus metrics collection - PR #20434
  • Langfuse

    • Improve Langfuse test isolation (multiple stability fixes) - PR #21214
  • General

    • Fix cost to 0 for cached responses in logging - PR #21816
    • Improve streaming proxy throughput by fixing middleware and logging bottlenecks - PR #21501
    • Reduce proxy overhead for large base64 payloads - PR #21594
    • Close streaming connections to prevent connection pool exhaustion - PR #21213

Guardrails

  • Guardrail Garden

    • Launch Guardrail Garden: a marketplace for pre-built guardrails deployable in one click - PR #21732
    • Redesign guardrail creation form with vertical stepper UI - PR #21727
    • Add guardrail jump link in log detail view - PR #21437
    • Guardrail tracing UI: show policy, detection method, and match details - PR #21349
  • AI Policy Templates

  • Compliance Checker

    • Add compliance checker endpoints + UI panel - PR #21432
    • CSV dataset upload to compliance playground for batch testing - PR #21526
  • Built-in Guardrails

    • Competitor name blocker: blocks by name, handles streaming, supports name variations, and splits pre/post call - PR #21719, PR #21533
    • Topic blocker with both keyword and embedding-based implementations - PR #21713
    • Insults content filter - PR #21729
    • MCP Security guardrail to block unregistered MCP servers - PR #21429
  • Generic Guardrails

    • Add configurable fallback to handle generic guardrail endpoint connection failures - PR #21245
  • Presidio

    • Fix Presidio controls configuration - PR #21798
  • LakeraAI

    • Avoid KeyError on missing LAKERA_API_KEY during initialization - PR #21422

Auto Routing

  • Complexity-based auto routing: a new router strategy that scores requests across 7 dimensions (token count, code presence, reasoning markers, technical terms, etc.) and routes to the appropriate model tier, with no embeddings or API calls required - PR #21789, Docs
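
Routing strategies are selected via router_settings in the proxy config. The sketch below only shows the general shape; the strategy name used here is a placeholder, so check the linked docs and PR #21789 for the exact value and any model-tier settings it expects.

```yaml
# proxy_config.yaml - rough sketch only.
# "complexity-based-routing" is a placeholder strategy name; see the
# auto-routing docs for the real key and its configuration options.
router_settings:
  routing_strategy: "complexity-based-routing"
```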

Prompt Management

  • Prompt Management API
    • New API to interact with prompt management integrations without requiring a PR - PR #17800, PR #17946
    • Fix prompt registry configuration issues - PR #21402

Spend Tracking, Budgets and Rate Limiting

  • Fix Bedrock service_tier cost propagation: costs from service-tier responses now correctly flow through to spend tracking - PR #21172
  • Fix cost for cached responses: cached responses now correctly log $0 cost instead of re-billing - PR #21816
  • Aggregate daily activity endpoint performance: faster queries for /user/daily/activity/aggregated - PR #21613
  • Preserve key_alias and team_id metadata in /user/daily/activity/aggregated after key deletion or regeneration - PR #20684
  • Inject Credential Name as Tag for granular usage page filtering by credential - PR #21715

MCP Gateway

  • OpenAPI-to-MCP: convert any OpenAPI spec to an MCP server via API or UI - PR #21575, PR #21662
  • MCP User Permissions: fine-grained permissions for end users on MCP servers - PR #21462
  • MCP Security Guardrail: block calls to unregistered MCP servers - PR #21429
  • Fix StreamableHTTPSessionManager: revert to stateless mode to prevent session state issues - PR #21323
  • Fix Bedrock AgentCore Accept header: add required Accept header for AgentCore MCP server requests - PR #21551

Performance / Loadbalancing / Reliability improvements

Logging & callback overhead

  • Move async/sync callback separation from per-request to callback registration time: ~30% speedup for callback-heavy deployments - PR #20354
  • Skip Pydantic Usage round-trip in logging payload: reduces serialization overhead per request - PR #21003
  • Skip duplicate get_standard_logging_object_payload calls for non-streaming requests - PR #20440
  • Reuse LiteLLM_Params object across the request lifecycle - PR #20593
  • Optimize add_litellm_data_to_request hot path - PR #20526
  • Optimize model_dump_with_preserved_fields - PR #20882
  • Pre-compute OpenAI client init params at module load instead of per-request - PR #20789
  • Reduce proxy overhead for large base64 payloads - PR #21594
  • Improve streaming proxy throughput by fixing middleware and logging bottlenecks - PR #21501
  • Eliminate per-chunk thread spawning in Responses API async streaming - PR #21709

Cost calculation

  • Optimize completion_cost() with early-exit and caching - PR #20448
  • Cost calculator: reduce repeated lookups and dict copies - PR #20541

Router & load balancing

  • Remove quadratic deployment scan in usage-based routing v2 - PR #21211
  • Avoid O(n²) membership scans in team deployment filter - PR #21210
  • Avoid O(n) alias scan for non-alias get_model_list lookups - PR #21136
  • Increase default LRU cache size to reduce multi-model cache thrash - PR #21139
  • Cache get_model_access_groups() no-args result on Router - PR #20374
  • Deployment affinity routing callback: route to the same deployment for a session - PR #19143
  • Session-ID-based routing: use session_id for consistent routing within a session - PR #21763

Connection management & reliability

  • Fix Redis connection pool reliability: prevent connection exhaustion under load - PR #21717
  • Fix Prisma connection self-heal for auth and runtime reconnection (reverted, will be re-introduced with fixes) - PR #21706
  • Make PodLockManager.release_lock atomic compare-and-delete - PR #21226

Database Changes

Schema Updates

| Table | Change Type | Description | PR |
|---|---|---|---|
| LiteLLM_DeletedVerificationToken | New Column | Added project_id column | PR #21587 |
| LiteLLM_ProjectTable | New Table | Project management for organizing resources | PR #21078 |
| LiteLLM_VerificationToken | New Column | Added last_active timestamp for key activity tracking | PR #21545 |
| LiteLLM_ManagedVectorStoreTable | Migration | Make vector store migration idempotent | PR #21325 |

Documentation Updates

  • Add OpenAI Agents SDK with LiteLLM guide - PR #21311
  • Access Groups documentation - PR #21236
  • Anthropic beta headers documentation - PR #21320
  • Latency overhead troubleshooting guide - PR #21600, PR #21603
  • Add rollback safety check guide - PR #21743
  • Incident report: vLLM Embeddings broken by encoding_format parameter - PR #21474
  • Incident report: Claude Code beta headers - PR #21485
  • Mark v1.81.12 as stable - PR #21809

New Contributors

  • @mjkam made their first contribution in PR #21306
  • @saneroen made their first contribution in PR #21243
  • @vincentkoc made their first contribution in PR #21239
  • @felixti made their first contribution in PR #19745
  • @anttttti made their first contribution in PR #20731
  • @ndgigliotti made their first contribution in PR #21222
  • @iamadamreed made their first contribution in PR #19912
  • @sahukanishka made their first contribution in PR #21220
  • @namabile made their first contribution in PR #21195
  • @stronk7 made their first contribution in PR #21372
  • @ZeroAurora made their first contribution in PR #21547
  • @SolitudePy made their first contribution in PR #21497
  • @SherifWaly made their first contribution in PR #21557
  • @dkindlund made their first contribution in PR #21633
  • @cagojeiger made their first contribution in PR #21664

Full Changelog

v1.81.12.rc.1...v1.81.14.rc.1