Update AI docs for retired hosted models (#49486)

## Summary

- Update hosted model and context-window tables in docs/src/ai/models.md to remove retired models and list current replacements.
- Add a dated Recent Model Retirements section mapping each retired model to its replacement.
- Update AI docs examples and references in agent-settings.md, inline-assistant.md, agent-panel.md, and llm-providers.md to use current model names.
- Remove stale OpenAI model references in llm-providers.md that no longer align with currently offered hosted models.

## Validation

- ./script/prettier
- ./script/check-todos

## Suggested .rules additions

- N/A

Release Notes:

- N/A
2026-04-18 07:47:53 +00:00 · 2026-02-18 10:36:38 -06:00
parent 66f7aea166
commit 71ffaeb817
10 changed files with 113 additions and 53 deletions
@@ -187,8 +187,8 @@ jobs:
             - Fix
             - Validation
             - Potentially Related Issues (High/Medium/Low from LINKED_ISSUES.md)
             - Release Notes
             - Reviewer Checklist
             - Release Notes (final section; format as Release Notes:, then a blank line, then one bullet like - N/A)
          Constraints:
          - Do not merge or auto-approve.
@@ -251,9 +251,34 @@ jobs:
              printf "Automated draft crash-fix pipeline output for %s.\n\nNo PR_BODY.md was generated by the agent; please review commit and linked artifacts manually.\n" "$CRASH_ID" > "$BODY_FILE"
            fi
-            if ! python3 -c 'import re, sys; body = open(sys.argv[1], encoding="utf-8").read(); raise SystemExit(0 if re.search(r"Release Notes:\r?\n\s+-", body) else 1)' "$BODY_FILE"; then
-              printf '\nRelease Notes:\n\n- N/A\n' >> "$BODY_FILE"
-            fi
+            python3 -c '
+            import re
+            import sys
            path = sys.argv[1]
            body = open(path, encoding="utf-8").read()
            pattern = re.compile(r"(^|\n)Release Notes:\r?\n(?:\r?\n)*(?P<bullets>(?:\s*-\s+.*(?:\r?\n|$))+)", re.MULTILINE)
            match = pattern.search(body)
            if match:
                bullets = [
                    re.sub(r"^\s*", "", bullet)
                    for bullet in re.findall(r"^\s*-\s+.*$", match.group("bullets"), re.MULTILINE)
                ]
                if not bullets:
                    bullets = ["- N/A"]
                section = "Release Notes:\n\n" + "\n".join(bullets)
                body_without_release_notes = (body[: match.start()] + body[match.end() :]).rstrip()
                if body_without_release_notes:
                    normalized_body = f"{body_without_release_notes}\n\n{section}\n"
                else:
                    normalized_body = f"{section}\n"
            else:
                normalized_body = body.rstrip() + "\n\nRelease Notes:\n\n- N/A\n"
            with open(path, "w", encoding="utf-8") as file:
                file.write(normalized_body)
            ' "$BODY_FILE"
            EXISTING_PR="$(gh pr list --head "$BRANCH" --json number --jq '.[0].number')"
            if [ -n "$EXISTING_PR" ]; then
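The inline Python added in the hunk above is easier to follow, and to test, as a standalone function. The following is a minimal sketch that ports the same logic; the function name `normalize_release_notes` is hypothetical and does not appear in the commit. It moves an existing `Release Notes:` section to the end of the body, strips leading indentation from its bullets, and appends a default `- N/A` section when none is found.

```python
import re

# Same pattern as the workflow step: the heading, optional blank lines,
# then one or more "- ..." bullet lines captured as "bullets".
RELEASE_NOTES = re.compile(
    r"(^|\n)Release Notes:\r?\n(?:\r?\n)*(?P<bullets>(?:\s*-\s+.*(?:\r?\n|$))+)",
    re.MULTILINE,
)


def normalize_release_notes(body: str) -> str:
    """Return body with a single, final, canonically formatted Release Notes section."""
    match = RELEASE_NOTES.search(body)
    if not match:
        # No section at all: append the default one.
        return body.rstrip() + "\n\nRelease Notes:\n\n- N/A\n"

    # Collect the bullet lines and drop any leading indentation.
    bullets = [
        re.sub(r"^\s*", "", bullet)
        for bullet in re.findall(r"^\s*-\s+.*$", match.group("bullets"), re.MULTILINE)
    ]
    if not bullets:
        bullets = ["- N/A"]
    section = "Release Notes:\n\n" + "\n".join(bullets)

    # Cut the section out of its original position and re-append it last.
    rest = (body[: match.start()] + body[match.end():]).rstrip()
    return f"{rest}\n\n{section}\n" if rest else f"{section}\n"


if __name__ == "__main__":
    print(normalize_release_notes("Summary\n\nRelease Notes:\n- N/A\n\nMore text\n"))
```

Running the function on an already normalized body leaves it unchanged, so the workflow step can be applied repeatedly to the same file.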
@@ -145,9 +145,17 @@ When an agent opens or updates a pull request, it must:
 - Avoid conventional commit prefixes in PR titles (`fix:`, `feat:`, `docs:`, etc.).
 - Avoid trailing punctuation in PR titles.
 - Optionally prefix the title with a crate name when one crate is the clear scope (for example, `git_ui: Add history view`).
- Include a `Release Notes:` section in the PR body with one bullet:
+- Include a `Release Notes:` section as the final section in the PR body.
 - Use one bullet under `Release Notes:`:
  - `- Added ...`, `- Fixed ...`, or `- Improved ...` for user-facing changes, or
-  - `- N/A` for non-user-facing changes.
+  - `- N/A` for docs-only and other non-user-facing changes.
 - Format release notes exactly with a blank line after the heading, for example:
 ```
 Release Notes:

 - N/A
 ```
 # Crash Investigation
@@ -156,3 +156,14 @@ Zed doesn't index your project like IntelliJ does. You open a folder and start w
 ```
 While some users might miss indexing, Zed's approach is actually better because it's faster.
 ```
 ## Pull request hygiene
 - Include a `Release Notes:` section as the final section in the PR body.
 - For docs-only PRs, use exactly:
 ```
 Release Notes:

 - N/A
 ```
@@ -134,7 +134,7 @@ You can also do this at any time with an ongoing thread via the "Agent Options"
 After you've configured your LLM providers—either via [a custom API key](./llm-providers.md) or through [Zed's hosted models](./models.md)—you can switch between their models by clicking on the model selector on the message editor or by using the {#kb agent::ToggleModelSelector} keybinding.
-> The same model can be offered via multiple providers - for example, Claude Sonnet 4 is available via Zed Pro, OpenRouter, Anthropic directly, and more.
+> The same model can be offered via multiple providers - for example, Claude Sonnet 4.5 is available via Zed Pro, OpenRouter, Anthropic directly, and more.
 > Make sure you've selected the correct model **_provider_** for the model you'd like to use, delineated by the logo to the left of the model in the model selector.
 ### Favoriting Models
@@ -11,7 +11,7 @@ Settings for Zed's Agent Panel, including model selection, UI preferences, and t
 ### Default Model {#default-model}
-If you're using [Zed's hosted LLM service](./subscription.md), it sets `claude-sonnet-4` as the default model for agentic work (agent panel, inline assistant) and `gpt-5-nano` as the default "fast" model (thread summarization, git commit messages). If you're not subscribed or want to change these defaults, you can manually edit the `default_model` object in your settings:
+If you're using [Zed's hosted LLM service](./subscription.md), it sets `claude-sonnet-4-5` as the default model for agentic work (agent panel, inline assistant) and `gpt-5-nano` as the default "fast" model (thread summarization, git commit messages). If you're not subscribed or want to change these defaults, you can manually edit the `default_model` object in your settings:
 ```json [settings]
 {
@@ -37,7 +37,7 @@ You can assign distinct and specific models for the following AI-powered feature
  "agent": {
    "default_model": {
      "provider": "zed.dev",
-      "model": "claude-sonnet-4"
+      "model": "claude-sonnet-4-5"
    },
    "inline_assistant_model": {
      "provider": "anthropic",
@@ -68,7 +68,7 @@ Here's how you can customize your settings file ([how to edit](../configuring-ze
  "agent": {
    "default_model": {
      "provider": "zed.dev",
-      "model": "claude-sonnet-4"
+      "model": "claude-sonnet-4-5"
    },
    "inline_alternatives": [
      {
@@ -85,14 +85,14 @@ When multiple models are configured, you'll see in the Inline Assistant UI butto
 The models you specify here are always used in _addition_ to your [default model](#default-model).
 For example, the following configuration will generate three outputs for every assist.
-One with Claude Sonnet 4 (the default model), another with GPT-5-mini, and another one with Gemini 2.5 Flash.
+One with Claude Sonnet 4.5 (the default model), another with GPT-5-mini, and another one with Gemini 3 Flash.
 ```json [settings]
 {
  "agent": {
    "default_model": {
      "provider": "zed.dev",
-      "model": "claude-sonnet-4"
+      "model": "claude-sonnet-4-5"
    },
    "inline_alternatives": [
      {
@@ -101,7 +101,7 @@ One with Claude Sonnet 4 (the default model), another with GPT-5-mini, and anoth
      },
      {
        "provider": "zed.dev",
-        "model": "gemini-2.5-flash"
+        "model": "gemini-3-flash"
      }
    ]
  }
@@ -128,7 +128,7 @@ Specify a custom temperature for a provider and/or model:
      // To set parameters for a specific provider and model:
      {
        "provider": "zed.dev",
-        "model": "claude-sonnet-4",
+        "model": "claude-sonnet-4-5",
        "temperature": 1.0
      }
    ]
@@ -54,7 +54,7 @@ Here's how you can customize your settings file ([how to edit](../configuring-ze
  "agent": {
    "default_model": {
      "provider": "zed.dev",
-      "model": "claude-sonnet-4"
+      "model": "claude-sonnet-4-5"
    },
    "inline_alternatives": [
      {
@@ -71,14 +71,14 @@ When multiple models are configured, you'll see in the Inline Assistant UI butto
 The models you specify here are always used in _addition_ to your [default model](#default-model).
 For example, the following configuration will generate three outputs for every assist.
-One with Claude Sonnet 4 (the default model), another with GPT-5-mini, and another one with Gemini 2.5 Flash.
+One with Claude Sonnet 4.5 (the default model), another with GPT-5-mini, and another one with Gemini 3 Flash.
 ```json [settings]
 {
  "agent": {
    "default_model": {
      "provider": "zed.dev",
-      "model": "claude-sonnet-4"
+      "model": "claude-sonnet-4-5"
    },
    "inline_alternatives": [
      {
@@ -87,7 +87,7 @@ One with Claude Sonnet 4 (the default model), another with GPT-5-mini, and anoth
      },
      {
        "provider": "zed.dev",
-        "model": "gemini-2.5-flash"
+        "model": "gemini-3-flash"
      }
    ]
  }
@@ -304,8 +304,8 @@ Here is an example of a custom Google AI model you could add to your Zed setting
    "google": {
      "available_models": [
        {
-          "name": "gemini-2.5-flash-preview-05-20",
+          "name": "gemini-3-flash-preview",
-          "display_name": "Gemini 2.5 Flash (Thinking)",
+          "display_name": "Gemini 3 Flash (Thinking)",
          "max_tokens": 1000000,
          "mode": {
            "type": "thinking",
@@ -508,7 +508,7 @@ Zed will also use the `OPENAI_API_KEY` environment variable if it's defined.
 #### Custom Models {#openai-custom-models}
-The Zed agent comes pre-configured to use the latest version for common models (GPT-5, GPT-5 mini, o4-mini, GPT-4.1, and others).
+The Zed agent comes pre-configured to use the latest version for common OpenAI models (GPT-5.2, GPT-5 mini, GPT-5.2 Codex, and others).
 To use alternate models, perhaps a preview release, or if you wish to control the request parameters, you can do so by adding the following to your Zed settings file ([how to edit](../configuring-zed.md#settings-files)):
 ```json [settings]
@@ -517,20 +517,20 @@ To use alternate models, perhaps a preview release, or if you wish to control th
    "openai": {
      "available_models": [
        {
-          "name": "gpt-5",
+          "name": "gpt-5.2",
-          "display_name": "gpt-5 high",
+          "display_name": "gpt-5.2 high",
          "reasoning_effort": "high",
          "max_tokens": 272000,
          "max_completion_tokens": 20000
        },
        {
-          "name": "gpt-4o-2024-08-06",
+          "name": "gpt-5-nano",
-          "display_name": "GPT 4o Summer 2024",
+          "display_name": "GPT-5 Nano",
-          "max_tokens": 128000
+          "max_tokens": 400000
        },
        {
-          "name": "gpt-5-codex",
+          "name": "gpt-5.2-codex",
-          "display_name": "GPT-5 Codex",
+          "display_name": "GPT-5.2 Codex",
          "max_tokens": 128000,
          "capabilities": {
            "chat_completions": false
@@ -544,9 +544,9 @@ To use alternate models, perhaps a preview release, or if you wish to control th
 You must provide the model's context window in the `max_tokens` parameter; this can be found in the [OpenAI model documentation](https://platform.openai.com/docs/models).
-OpenAI `o1` and `o`-class models should set `max_completion_tokens` as well to avoid incurring high reasoning token costs.
+For reasoning-focused models, set `max_completion_tokens` as well to avoid incurring high reasoning token costs.
-If a model does not support the `/chat/completions` endpoint (for example `gpt-5-codex`), disable it by setting `capabilities.chat_completions` to `false`. Zed will use the Responses endpoint instead.
+If a model does not support the `/chat/completions` endpoint (for example `gpt-5.2-codex`), disable it by setting `capabilities.chat_completions` to `false`. Zed will use the Responses endpoint instead.
 Custom models will be listed in the model dropdown in the Agent Panel.
@@ -1,6 +1,6 @@
 ---
 title: AI Models and Pricing - Zed
-description: AI models available via Zed Pro including Claude, GPT-5, Gemini, and Grok. Pricing, context windows, and tool call support.
+description: AI models available via Zed Pro including Claude, GPT-5.2, Gemini, and Grok. Pricing, context windows, and tool call support.
 ---
 # Models
@@ -13,19 +13,15 @@ Zed's plans offer hosted versions of major LLMs with higher rate limits than dir
 |                        | Anthropic | Output              | $25.00                       | $27.50                  |
 |                        | Anthropic | Input - Cache Write | $6.25                        | $6.875                  |
 |                        | Anthropic | Input - Cache Read  | $0.50                        | $0.55                   |
-| Claude Opus 4.1        | Anthropic | Input               | $15.00                       | $16.50                  |
+| Claude Opus 4.6        | Anthropic | Input               | $5.00                        | $5.50                   |
-|                        | Anthropic | Output              | $75.00                       | $82.50                  |
+|                        | Anthropic | Output              | $25.00                       | $27.50                  |
-|                        | Anthropic | Input - Cache Write | $18.75                       | $20.625                 |
+|                        | Anthropic | Input - Cache Write | $6.25                        | $6.875                  |
-|                        | Anthropic | Input - Cache Read  | $1.50                        | $1.65                   |
+|                        | Anthropic | Input - Cache Read  | $0.50                        | $0.55                   |
 | Claude Sonnet 4.5      | Anthropic | Input               | $3.00                        | $3.30                   |
 |                        | Anthropic | Output              | $15.00                       | $16.50                  |
 |                        | Anthropic | Input - Cache Write | $3.75                        | $4.125                  |
 |                        | Anthropic | Input - Cache Read  | $0.30                        | $0.33                   |
-| Claude Sonnet 4        | Anthropic | Input               | $3.00                        | $3.30                   |
+| Claude Sonnet 4.6      | Anthropic | Input               | $3.00                        | $3.30                   |
 |                        | Anthropic | Output              | $15.00                       | $16.50                  |
 |                        | Anthropic | Input - Cache Write | $3.75                        | $4.125                  |
 |                        | Anthropic | Input - Cache Read  | $0.30                        | $0.33                   |
 | Claude Sonnet 3.7      | Anthropic | Input               | $3.00                        | $3.30                   |
 |                        | Anthropic | Output              | $15.00                       | $16.50                  |
 |                        | Anthropic | Input - Cache Write | $3.75                        | $4.125                  |
 |                        | Anthropic | Input - Cache Read  | $0.30                        | $0.33                   |
@@ -33,7 +29,10 @@ Zed's plans offer hosted versions of major LLMs with higher rate limits than dir
 |                        | Anthropic | Output              | $5.00                        | $5.50                   |
 |                        | Anthropic | Input - Cache Write | $1.25                        | $1.375                  |
 |                        | Anthropic | Input - Cache Read  | $0.10                        | $0.11                   |
-| GPT-5                  | OpenAI    | Input               | $1.25                        | $1.375                  |
+| GPT-5.2                | OpenAI    | Input               | $1.25                        | $1.375                  |
 |                        | OpenAI    | Output              | $10.00                       | $11.00                  |
 |                        | OpenAI    | Cached Input        | $0.125                       | $0.1375                 |
 | GPT-5.2 Codex          | OpenAI    | Input               | $1.25                        | $1.375                  |
 |                        | OpenAI    | Output              | $10.00                       | $11.00                  |
 |                        | OpenAI    | Cached Input        | $0.125                       | $0.1375                 |
 | GPT-5 mini             | OpenAI    | Input               | $0.25                        | $0.275                  |
@@ -42,11 +41,9 @@ Zed's plans offer hosted versions of major LLMs with higher rate limits than dir
 | GPT-5 nano             | OpenAI    | Input               | $0.05                        | $0.055                  |
 |                        | OpenAI    | Output              | $0.40                        | $0.44                   |
 |                        | OpenAI    | Cached Input        | $0.005                       | $0.0055                 |
-| Gemini 3.0 Pro         | Google    | Input               | $2.00                        | $2.20                   |
+| Gemini 3 Pro           | Google    | Input               | $2.00                        | $2.20                   |
 |                        | Google    | Output              | $12.00                       | $13.20                  |
-| Gemini 2.5 Pro         | Google    | Input               | $1.25                        | $1.375                  |
+| Gemini 3 Flash         | Google    | Input               | $0.30                        | $0.33                   |
 |                        | Google    | Output              | $10.00                       | $11.00                  |
 | Gemini 2.5 Flash       | Google    | Input               | $0.30                        | $0.33                   |
 |                        | Google    | Output              | $2.50                        | $2.75                   |
 | Grok 4                 | X.ai      | Input               | $3.00                        | $3.30                   |
 |                        | X.ai      | Output              | $15.00                       | $16.5                   |
@@ -61,6 +58,17 @@ Zed's plans offer hosted versions of major LLMs with higher rate limits than dir
 |                        | X.ai      | Output              | $1.50                        | $1.65                   |
 |                        | X.ai      | Cached Input        | $0.02                        | $0.022                  |
 ## Recent Model Retirements
 As of February 19, 2026, Zed Pro serves newer model versions in place of the retired models below:
 - Claude Opus 4.1 → Claude Opus 4.5 or Claude Opus 4.6
 - Claude Sonnet 4 → Claude Sonnet 4.5 or Claude Sonnet 4.6
 - Claude Sonnet 3.7 (retired Feb 19) → Claude Sonnet 4.5 or Claude Sonnet 4.6
 - GPT-5.1 and GPT-5 → GPT-5.2 or GPT-5.2 Codex
 - Gemini 2.5 Pro → Gemini 3 Pro
 - Gemini 2.5 Flash → Gemini 3 Flash
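For readers checking their own settings against this list, the sketch below walks an already parsed settings object and reports `zed.dev` model entries that use a retired identifier. The mapping contains only the two hosted identifiers this diff actually rewrites in the docs examples (`claude-sonnet-4` and `gemini-2.5-flash`); identifiers for the other retired models are not shown in the diff and are deliberately left out. The function name and the path format are hypothetical.

```python
from typing import Any, Iterator

# Retired hosted model IDs and the replacements used in this diff's examples.
# Assumption: other retired models have IDs not shown here; extend as needed.
RETIRED_TO_REPLACEMENT = {
    "claude-sonnet-4": "claude-sonnet-4-5",
    "gemini-2.5-flash": "gemini-3-flash",
}


def find_retired_models(node: Any, path: str = "") -> Iterator[tuple[str, str, str]]:
    """Yield (path, retired_id, replacement_id) for each retired zed.dev model entry."""
    if isinstance(node, dict):
        model = node.get("model")
        if (
            node.get("provider") == "zed.dev"
            and isinstance(model, str)
            and model in RETIRED_TO_REPLACEMENT
        ):
            yield path or "<root>", model, RETIRED_TO_REPLACEMENT[model]
        # Recurse into nested objects such as "agent" or "default_model".
        for key, value in node.items():
            yield from find_retired_models(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        # Recurse into arrays such as "inline_alternatives".
        for index, value in enumerate(node):
            yield from find_retired_models(value, f"{path}[{index}]")


if __name__ == "__main__":
    settings = {
        "agent": {
            "default_model": {"provider": "zed.dev", "model": "claude-sonnet-4"},
            "inline_alternatives": [
                {"provider": "zed.dev", "model": "gpt-5-mini"},
                {"provider": "zed.dev", "model": "gemini-2.5-flash"},
            ],
        }
    }
    for where, old, new in find_retired_models(settings):
        print(f"{where}: {old} -> {new}")
```

The sketch takes a plain dictionary rather than a file path because Zed settings files may contain comments, which the standard `json` module does not accept.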
 ## Usage {#usage}
 Any usage of a Zed-hosted model will be billed at the Zed Price (rightmost column above). See [Plans and Usage](./plans-and-usage.md) for details on Zed's plans and limits for use of hosted models.
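As a rough illustration of billing at the Zed Price, the sketch below estimates the cost of one request from its token counts. It assumes the table's prices are quoted per one million tokens (the table header is not visible in this hunk, so treat that unit as an assumption) and it ignores the cache read and cache write rates. The prices are the Claude Sonnet 4.5 Zed Price values from the table above; the function name is hypothetical.

```python
from decimal import Decimal

# Assumption: the pricing table quotes dollars per one million tokens.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)

# Zed Price column for Claude Sonnet 4.5, copied from the table above.
SONNET_4_5_ZED_PRICE = {"input": Decimal("3.30"), "output": Decimal("16.50")}


def estimate_cost(input_tokens: int, output_tokens: int, prices: dict) -> Decimal:
    """Estimate the dollar cost of one request, ignoring cached-input pricing."""
    total = (
        Decimal(input_tokens) * prices["input"]
        + Decimal(output_tokens) * prices["output"]
    )
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 200k input tokens and 10k output tokens on Claude Sonnet 4.5.
    print(estimate_cost(200_000, 10_000, SONNET_4_5_ZED_PRICE))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding error.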
@@ -74,18 +82,18 @@ A context window is the maximum span of text and code an LLM can consider at onc
 | Model             | Provider  | Zed-Hosted Context Window |
 | ----------------- | --------- | ------------------------- |
 | Claude Opus 4.5   | Anthropic | 200k                      |
-| Claude Opus 4.1   | Anthropic | 200k                      |
+| Claude Opus 4.6   | Anthropic | 200k                      |
-| Claude Sonnet 4   | Anthropic | 200k                      |
+| Claude Sonnet 4.5 | Anthropic | 200k                      |
-| Claude Sonnet 3.7 | Anthropic | 200k                      |
+| Claude Sonnet 4.6 | Anthropic | 200k                      |
 | Claude Haiku 4.5  | Anthropic | 200k                      |
-| GPT-5             | OpenAI    | 400k                      |
+| GPT-5.2           | OpenAI    | 400k                      |
 | GPT-5.2 Codex     | OpenAI    | 400k                      |
 | GPT-5 mini        | OpenAI    | 400k                      |
 | GPT-5 nano        | OpenAI    | 400k                      |
-| Gemini 2.5 Pro    | Google    | 200k                      |
+| Gemini 3 Pro      | Google    | 200k                      |
-| Gemini 2.5 Flash  | Google    | 200k                      |
+| Gemini 3 Flash    | Google    | 200k                      |
 | Gemini 3.0 Pro    | Google    | 200k                      |
-> Context window limits for hosted Sonnet 4 and Gemini 2.5 Pro/Flash may increase in future releases.
+> Context window limits for hosted Sonnet 4.5/4.6 and Gemini 3 Pro/Flash may increase in future releases.
 Each Agent thread and text thread in Zed maintains its own context window.
 The more prompts, attached files, and responses included in a session, the larger the context window grows.
@@ -301,6 +301,13 @@ EOF
 echo "$MANIFEST" | jq -r '.suggestions[] | "- [PR #\(.pr)](\(.file)): \(.title)"' >> "$PR_BODY_FILE"
 cat >> "$PR_BODY_FILE" << 'EOF'
 Release Notes:

 - N/A
 EOF
 git add docs/
 git commit -m "docs: auto-apply preview release suggestions
@@ -156,6 +156,7 @@ Required workflow:
   - Validation
   - Potentially Related Issues (High/Medium/Low from LINKED_ISSUES.md)
   - Reviewer Checklist
   - Release Notes (final section, formatted as "Release Notes:", blank line, then one bullet like "- N/A" when not user-facing)
 Constraints:
 - Keep changes narrowly scoped to this crash.