flutter-docker-image/openspec/changes/archive/2026-05-20-p2-release-windows-image/design.md
Eligio Mariño 668ec5041a docs: archive p2-release-windows-image (#455)
- Archives `p2-release-windows-image` to
`openspec/changes/archive/2026-05-20-p2-release-windows-image/`.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:05:11 +02:00


Context

release.yml is the publishing workflow for tagged releases. It currently has five jobs, all Android-scoped: release_android, update_description, record_image, set_bootstrap_image, create_github_release. The Windows Dockerfile and the windows.yml PR-test workflow exist but are not connected to release. PR #339's windows.yml even has a commented-out Push to Docker Hub step (lines 84-88) hinting at this gap.

Constraints:

  • Docker Buildx and docker/build-push-action cache features do not work for Windows containers (already noted in windows.yml's comment "Docker Buildx is not supported for Windows containers"). The release job must use docker build + docker push directly, like windows.yml does.
  • The gx-managed pinning regime (spec actions-version-tracking) requires every new uses: to be SHA-pinned via .github/gx.toml. The actions themselves (actions/checkout, docker/metadata-action, docker/login-action, actions/github-script) are already pinned by release_android, so no new top-level [actions] entry is needed. However, [actions.overrides]."docker/metadata-action" is keyed per (workflow, job, step) and gains one new entry for the release_windows step, pinned at ~5.10.0 for parity with release_android (the ~5.7.0 entry for windows.yml::test_windows is unrelated and stays).
  • The three-registry fan-out (Docker Hub + GHCR + Quay) is not negotiable — it's the pre-existing distribution promise from release_android.
  • windows-2025 runner cost: each release run adds 30 to 60 minutes of windows-2025 runner time. The repository already pays for this in the PR-test workflow per p1, so this is incremental cost on tag pushes only.

Goals / Non-Goals

Goals:

  • A single tag push publishes both flutter-android:X.Y.Z and flutter-windows:X.Y.Z to Docker Hub, GHCR, and Quay.
  • The Windows release path is operationally identical to the Android release path from a user's perspective: same registries, same tag scheme, same OCI labels.
  • Failure isolation between architectures: Windows runner flake does not block Android publishing.
  • workflow_dispatch continues to allow manual re-runs without re-tagging.

Non-Goals:

  • Updating Docker Hub description for the Windows image (peter-evans/dockerhub-description). The current update_description job is Android-scoped against readme.md. Adding a Windows variant requires a separate readme and a separate Docker Hub repo description; out of scope.
  • Recording Windows image vulnerabilities in Docker Scout (record_image job). Scout's Windows-base-image coverage is limited and the existing windows.yml already commented out the Scout block (line 65 onwards) for that reason. If/when Scout becomes useful for Windows, add it in a follow-up.
  • Updating the FLUTTER_VERSION repo variable from a Windows job (set_bootstrap_image). That variable bootstraps test_gradle against the Android image; Windows has no analogous bootstrap need today.
  • Including the Windows image in the GitHub Release notes generated by create_github_release. The release notes are tag-scoped, not image-scoped, so they continue to apply.
  • Multi-arch manifests (linux/amd64 + windows/amd64 under one tag). Windows containers cannot share a manifest list with Linux containers in practice; users pull the right image for their host. Status quo.

Decisions

Decision: Add a single release_windows job, not a matrix on release_android

A strategy.matrix over [android, windows] would conflict on runs-on (ubuntu-24.04 vs windows-2025), dockerfile (android.Dockerfile vs windows.Dockerfile), target (android vs flutter), and the clean-runner-disk step (Linux-only). The result would be a matrix mostly composed of conditionals — less readable than two parallel jobs sharing the metadata pattern. Two-job design wins for clarity.

Alternatives considered:

  • Reusable workflow with the registry login + metadata + push pattern factored out. Rejected: only two callers (Android and Windows), and the steps differ enough (Buildx vs. plain docker build) that the reusable surface would have ~5 inputs to model two callers. Not worth it.
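
For illustration only, the rejected matrix shape might look roughly like the sketch below. The runner, Dockerfile, and target values come from the paragraph above; the step names, the `echo` placeholders, and the `actions/checkout@v4` ref are assumptions (the repository pins actions by SHA through gx). It is not the real release.yml.

```yaml
# Hypothetical sketch of the REJECTED matrix design, not the adopted one.
jobs:
  release:
    strategy:
      matrix:
        include:
          - platform: android
            runner: ubuntu-24.04
            dockerfile: android.Dockerfile
            target: android
          - platform: windows
            runner: windows-2025
            dockerfile: windows.Dockerfile
            target: flutter
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4 # placeholder ref; the repo pins by SHA
      # Almost every remaining step needs a per-platform condition.
      - name: Clean runner disk (Linux only)
        if: matrix.platform == 'android'
        run: echo "free disk space here"
      - name: Build and push with Buildx (Linux only)
        if: matrix.platform == 'android'
        run: echo "docker/build-push-action would run here"
      - name: Build and push with plain docker (Windows only)
        if: matrix.platform == 'windows'
        run: echo "docker build and docker push would run here"
```

Only the checkout step is shared in this sketch, which is the readability problem the two-job design avoids.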

Decision: No Buildx on the Windows job; use plain docker build + per-registry docker push

docker/build-push-action does not support Windows containers: the action runs through buildx/BuildKit, and Windows-container support there has been an open feature request since 2020 (https://github.com/docker/build-push-action/issues/18). BuildKit's experimental WCOW worker (since v0.13.0, March 2024) requires manual containerd 1.7.7+, CNI, and admin-elevated setup that is not available out of the box on hosted windows-2025 runners. Third-party alternatives (e.g., mr-smithers-excellent/docker-build-push@v6) work but consume docker/metadata-action outputs in a comma-delimited format that conflicts with OCI label values containing commas, and add a non-Docker-OSS dependency to the gx-managed supply chain.

The Windows job therefore uses plain docker build (multiple -t flags from ${{ steps.metadata.outputs.tags }}, multiple --label flags from ${{ steps.metadata.outputs.labels }}) followed by docker push per tag. This matches what windows.yml::test_windows already does on the same runner image. GHA cache (type=gha) is Linux-only and is not used here regardless.
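
A minimal sketch of how the metadata outputs could be turned into flags, assuming the step runs under `shell: bash` (Git Bash on the hosted Windows runner). The `example-owner` image names, the `type=ref,event=tag` tag rule, the Dockerfile path at the repository root, and the `@v5` ref are assumptions for illustration; the real job may differ.

```yaml
# Hypothetical step fragment for the release_windows job.
- name: Compute tags and labels
  id: metadata
  uses: docker/metadata-action@v5 # placeholder ref; the repo pins by SHA
  with:
    images: |
      example-owner/flutter-windows
      ghcr.io/example-owner/flutter-windows
      quay.io/example-owner/flutter-windows
    tags: |
      type=ref,event=tag

- name: Build and push Windows image
  shell: bash
  env:
    # Both outputs are newline-delimited lists.
    TAGS: ${{ steps.metadata.outputs.tags }}
    LABELS: ${{ steps.metadata.outputs.labels }}
  run: |
    args=()
    # One -t flag per tag.
    while IFS= read -r tag; do
      if [ -n "$tag" ]; then args+=(-t "$tag"); fi
    done <<< "$TAGS"
    # One --label flag per label; quoting keeps commas and spaces intact.
    while IFS= read -r label; do
      if [ -n "$label" ]; then args+=(--label "$label"); fi
    done <<< "$LABELS"

    docker build --file windows.Dockerfile --target flutter "${args[@]}" .

    # Push each tag separately; every tag carries its registry prefix.
    while IFS= read -r tag; do
      if [ -n "$tag" ]; then docker push "$tag"; fi
    done <<< "$TAGS"
```

Passing each label as its own quoted argument sidesteps the comma-delimiter conflict described above. With `shell: bash`, GitHub Actions runs the script with `-e`, so the first failing `docker push` stops the step.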

Decision: Three registry logins, all reusing existing secrets

The job runs docker/login-action three times (Docker Hub, GHCR, Quay) using the same secrets as release_android: DOCKER_HUB_USERNAME / DOCKER_HUB_TOKEN for Docker Hub, github.actor / github.token for GHCR, QUAY_USERNAME / QUAY_ROBOT_TOKEN for Quay. No new secret rotation is needed.
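
The three logins could be sketched as follows. The secret names are the ones listed above; the step names and the `@v3` ref are assumptions (the repository pins actions by SHA through gx), and `ghcr.io` and `quay.io` are the standard registry hosts for GHCR and Quay.

```yaml
# Hypothetical step fragment for the release_windows job.
- name: Log in to Docker Hub
  uses: docker/login-action@v3 # placeholder ref; the repo pins by SHA
  with:
    # No registry input: the action defaults to Docker Hub.
    username: ${{ secrets.DOCKER_HUB_USERNAME }}
    password: ${{ secrets.DOCKER_HUB_TOKEN }}

- name: Log in to GHCR
  uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ github.token }}

- name: Log in to Quay
  uses: docker/login-action@v3
  with:
    registry: quay.io
    username: ${{ secrets.QUAY_USERNAME }}
    password: ${{ secrets.QUAY_ROBOT_TOKEN }}
```

Pushing to GHCR with `github.token` normally also requires `packages: write` in the job's `permissions`; whether release.yml already grants that at workflow level is not stated here.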

Decision: No needs: dependency between release_android and release_windows

Independent jobs run in parallel. If Windows build fails, Android still publishes. The workflow run as a whole reports failure, but the Android image is live. This matches how multi-arch open-source distributions usually behave: don't hold back one platform on another's flake.

Alternative considered:

  • release_windows: needs: release_android so Android validates first. Rejected: if Android fails, the tag is half-cut anyway and Windows would only matter as a postmortem; in the common case (both pass) it just adds 30 to 60 minutes to the wall-clock of the Windows publish.
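
As a sketch of the chosen shape, the two jobs are siblings with no `needs:` edge between them. The job names and runner labels come from this document; the `echo` steps are placeholders, not the real build steps.

```yaml
# Hypothetical skeleton: two independent jobs that start in parallel.
jobs:
  release_android:
    runs-on: ubuntu-24.04
    steps:
      - run: echo "Android build and push"

  release_windows:
    # Deliberately no `needs: release_android`, so a failure in either job
    # does not cancel or delay the other.
    runs-on: windows-2025
    steps:
      - run: echo "Windows build and push"
```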

Decision: Image build target is flutter, not a new release-only target

windows.Dockerfile already has the flutter stage as its production stage and test as the test stage. The release job builds --target flutter. windows.yml (PR test) builds --target test. This split is correct and matches android.Dockerfile.

Risks / Trade-offs

  • [Risk] Windows runner is flaky and tags ship with no Windows image. → Mitigation: workflow_dispatch is the supported Windows-only recovery path. The if: github.event_name == 'push' guard on release_android makes workflow_dispatch re-run only release_windows, so recovery does not re-publish Android, re-push the Docker Hub readme, or re-attempt gh release create (which would fail on an already-existing release). Android recovery remains fix-forward + re-tag, consistent with the established 35-run history of release.yml. The release notes are tag-scoped (Android already publishes), so a delayed Windows publish is observable but not a failed release.
  • [Risk] Quay or GHCR push fails after Docker Hub push succeeds. → Mitigation: docker push per-registry is idempotent; a re-run of the release_windows job pushes any missing tags. Document that the job is safe to re-run.
  • [Risk] windows-2025 minutes cost grows by 30 to 60 min per tag. → Acceptable: tag cadence is roughly monthly (flutter-version-update PRs land on stable bumps). Annualized cost is bounded.
  • [Trade-off] No Docker Hub description update for the Windows image. → Acceptable: users discover the Windows variant via the GitHub README, which already documents both. A separate Docker Hub repo (flutter-windows) will appear bare for now.
  • [Trade-off] No Scout scan / SARIF upload for Windows. → Acceptable: the Scout coverage gap on Windows base images is well-known. Code-scanning dashboard remains Android-only until Scout matures.
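
The recovery path in the first risk above relies on the event guard. A minimal sketch, assuming the downstream Android-side jobs are wired to release_android through `needs:` (that wiring is an assumption, as are the `echo` placeholders and the runner of update_description):

```yaml
# Hypothetical skeleton showing the guard and its effect on dependents.
jobs:
  release_android:
    # Runs on tag pushes only; skipped on workflow_dispatch.
    if: github.event_name == 'push'
    runs-on: ubuntu-24.04
    steps:
      - run: echo "Android build and push"

  update_description:
    # A job whose needed job was skipped is itself skipped by default.
    needs: release_android
    runs-on: ubuntu-24.04
    steps:
      - run: echo "update Docker Hub description"

  release_windows:
    # No guard: runs on both push and workflow_dispatch.
    runs-on: windows-2025
    steps:
      - run: echo "Windows build and push"
```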

Automated Test Strategy

  • Pre-merge verification (the only level that matters here): the release.yml workflow itself does not run on pull_request; it runs on push: tags: * and workflow_dispatch. Therefore this change cannot be verified by a normal PR check. The verification path is:
    1. Land the PR with the new release_windows job and the if: github.event_name == 'push' guard on release_android.
    2. Use workflow_dispatch against an existing tag (e.g., the most recent stable Flutter tag) to trigger a one-shot run before the next stable bump. Expect release_windows to execute and the five Android-side jobs (release_android, update_description, record_image, set_bootstrap_image, create_github_release) to all be reported as skipped. A green workflow run is the success criterion.
    3. Confirm the three pushed images exist via docker manifest inspect.
  • Post-merge ongoing verification: every monthly tag exercises the path. The flutter-version-update PR pipeline (spec: flutter-version-update) already gates that the Android image is healthy before tagging; once p1-fix-windows-ci-tests is in, the Windows image is also gated by its test_windows PR check on the upgrade PR. So the release-time signal is "PR was green and merged → tag was cut → release builds against a verified image."
  • No new test infrastructure: the change is GitHub Actions YAML. The metadata-action and login-action are battle-tested in release_android.
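
The triggers and the post-run check from the verification path could be sketched like this. The `on:` block mirrors the triggers named above; the `verify_windows_image` job, the `example-owner` image names, and the use of `github.ref_name` as the image tag are hypothetical, and the design itself treats this check as a manual step.

```yaml
# Hypothetical sketch, not part of the proposed change.
on:
  push:
    tags:
      - "*"
  workflow_dispatch:

jobs:
  verify_windows_image:
    runs-on: ubuntu-24.04
    steps:
      - name: Confirm the tag exists in all three registries
        shell: bash
        env:
          # Resolves to the tag name when the run targets a tag ref.
          VERSION: ${{ github.ref_name }}
        run: |
          for image in \
            "example-owner/flutter-windows" \
            "ghcr.io/example-owner/flutter-windows" \
            "quay.io/example-owner/flutter-windows"; do
            # Fails the step if the manifest cannot be fetched.
            docker manifest inspect "${image}:${VERSION}" > /dev/null
          done
```

`docker manifest inspect` only reads registry metadata, so it can run from a Linux host even though the image is a Windows container image.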

Observability

  • Failure surface: the release_windows job appears as its own check on the workflow run. Failures show in the GitHub Actions UI exactly like release_android failures do today.
  • Per-registry push errors: docker push errors are emitted to stdout by the Docker daemon and end up in the workflow log under the push step. No additional logging is needed.
  • Image digest visible after push: each docker push prints the resulting digest. Copying that digest into the run summary is a nice-to-have but not required; docker manifest inspect from any host gives the same answer post-hoc.
  • No silent failures possible: each docker push is its own step (or sub-command) and propagates its exit code. The job step fails on any non-zero exit.
  • Maintainer dashboard: gh run list --workflow=release.yml --limit 5 shows the most recent releases with per-job status; this is the existing observability surface and continues to work.
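
If the nice-to-have digest summary is ever wanted, a step along these lines might do it. It assumes a preceding step with `id: metadata` and that the tags were already pushed from the same runner, so the local image records carry repo digests; the step name and heading text are made up for illustration.

```yaml
# Hypothetical optional step for the release_windows job.
- name: Record pushed digests in the run summary
  shell: bash
  env:
    TAGS: ${{ steps.metadata.outputs.tags }}
  run: |
    {
      echo "### Windows image digests"
      while IFS= read -r tag; do
        if [ -n "$tag" ]; then
          # RepoDigests is filled in locally once the image has been pushed.
          digests="$(docker image inspect --format '{{join .RepoDigests ", "}}' "$tag")"
          echo "- ${tag}: ${digests}"
        fi
      done <<< "$TAGS"
    } >> "$GITHUB_STEP_SUMMARY"
```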

Migration Plan

  1. Land p1-fix-windows-ci-tests first so the Windows image is actually verified per-PR.
  2. Open a PR adding the release_windows job to release.yml. The PR's pull_request checks do not exercise release.yml (it runs on push: tags); land it on the strength of YAML review + pinned-action diff.
  3. After merge, manually run release.yml via workflow_dispatch against the most recent stable tag to validate the new job end-to-end before the next monthly upgrade PR cycle. If something is wrong, hotfix in a follow-up PR rather than waiting for a real release.
  4. The next flutter-version-update PR that lands and is tagged exercises the path automatically.
  5. Rollback strategy: if a regression appears (e.g., Windows registry credentials are wrong), revert the PR. After the revert, tag pushes publish only Android until the revert is itself reverted.

Open Questions

  • Should the update_description job on Docker Hub gain a parallel update_description_windows step pointing at a readme-windows.md? Out of scope here — split if/when there's a Windows-specific readme worth maintaining.