Context
release.yml is the publishing workflow for tagged releases. It currently has five jobs, all Android-scoped: release_android, update_description, record_image, set_bootstrap_image, create_github_release. The Windows Dockerfile and the windows.yml PR-test workflow exist but are not connected to release. PR #339's windows.yml even has a commented-out Push to Docker Hub step (lines 84-88) hinting at this gap.
Constraints:
- Docker Buildx and docker/build-push-action cache features do not work for Windows containers (already noted in windows.yml's comment "Docker Buildx is not supported for Windows containers"). The release job must use docker build + docker push directly, like windows.yml does.
- The gx-managed pinning regime (spec actions-version-tracking) requires every new uses: to be SHA-pinned via .github/gx.toml. The actions themselves (actions/checkout, docker/metadata-action, docker/login-action, actions/github-script) are already pinned by release_android, so no new top-level [actions] entry is needed. However, [actions.overrides]."docker/metadata-action" is keyed per (workflow, job, step) and gains one new entry for the release_windows step, pinned at ~5.10.0 for parity with release_android (the ~5.7.0 entry for windows.yml::test_windows is unrelated and stays).
- The three-registry fan-out (Docker Hub + GHCR + Quay) is not negotiable: it is the pre-existing distribution promise from release_android.
- windows-2025 runner cost: each release run adds 30–60 windows-2025 runner minutes. The repository already pays for this in the PR-test workflow per p1, so this is incremental cost on tag pushes only.
Goals / Non-Goals
Goals:
- A single tag push publishes both flutter-android:X.Y.Z and flutter-windows:X.Y.Z to Docker Hub, GHCR, and Quay.
- The Windows release path is operationally identical to the Android release path from a user's perspective: same registries, same tag scheme, same OCI labels.
- Failure isolation between architectures: Windows runner flake does not block Android publishing.
- workflow_dispatch continues to allow manual re-runs without re-tagging.
Non-Goals:
- Updating Docker Hub description for the Windows image (peter-evans/dockerhub-description). The current update_description job is Android-scoped against readme.md. Adding a Windows variant requires a separate readme and a separate Docker Hub repo description; out of scope.
- Recording Windows image vulnerabilities in Docker Scout (record_image job). Scout's Windows-base-image coverage is limited and the existing windows.yml already commented out the Scout block (line 65 onwards) for that reason. If/when Scout becomes useful for Windows, add it in a follow-up.
- Updating the FLUTTER_VERSION repo variable from a Windows job (set_bootstrap_image). That variable bootstraps test_gradle against the Android image; Windows has no analogous bootstrap need today.
- Including the Windows image in the GitHub Release notes generated by create_github_release. The release notes are tag-scoped, not image-scoped, so they continue to apply.
- Multi-arch manifests (linux/amd64 + windows/amd64 under one tag). Windows containers cannot share a manifest list with Linux containers in practice; users pull the right image for their host. Status quo.
Decisions
Decision: Add a single release_windows job, not a matrix on release_android
A strategy.matrix over [android, windows] would conflict on runs-on (ubuntu-24.04 vs windows-2025), dockerfile (android.Dockerfile vs windows.Dockerfile), target (android vs flutter), and the clean-runner-disk step (Linux-only). The result would be a matrix mostly composed of conditionals — less readable than two parallel jobs sharing the metadata pattern. Two-job design wins for clarity.
Alternatives considered:
- Reusable workflow with the registry login + metadata + push pattern factored out. Rejected: only two callers (Android and Windows), and the steps differ enough (Buildx vs. plain docker build) that the reusable surface would have ~5 inputs to model two callers. Not worth it.
Decision: No Buildx on the Windows job; use plain docker build + per-registry docker push
docker/build-push-action does not support Windows containers: the action runs through buildx/buildkit, and Windows-container support there has been an open feature request since 2020 (https://github.com/docker/build-push-action/issues/18). BuildKit's experimental WCOW worker (since v0.13.0, March 2024) requires manual containerd 1.7.7+, CNI, and admin-elevated setup that is not available out of the box on hosted windows-2025 runners. Third-party alternatives (e.g., mr-smithers-excellent/docker-build-push@v6) work but consume docker/metadata-action outputs in a comma-delimited format that conflicts with OCI label values containing commas, and add a non-Docker-OSS dependency to the gx-managed supply chain.
The Windows job therefore uses plain docker build (multiple -t flags from ${{ steps.metadata.outputs.tags }}, multiple --label flags from ${{ steps.metadata.outputs.labels }}) followed by docker push per tag. This matches what windows.yml::test_windows already does on the same runner image. GHA cache (type=gha) is Linux-only and is not used here regardless.
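The build-and-push sequence described above can be sketched as a small shell script. This is a minimal sketch, not the workflow's actual step. It assumes the step runs with shell: bash on the runner, that the metadata outputs arrive as newline-delimited strings (the docker/metadata-action default), and that the build context is the repository root. The DRY_RUN switch, the run helper, and the example image names are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical switch: dry-run by default so the sketch prints the commands
# instead of needing a Docker daemon. Set DRY_RUN=0 on a real runner.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1: newline-delimited image references (the metadata step's "tags" output)
# $2: newline-delimited key=value pairs (the metadata step's "labels" output)
build_and_push() {
  tags="$1"
  labels="$2"

  # Collect arguments in "$@" so label values containing spaces or commas
  # stay intact as single arguments.
  set -- build --file windows.Dockerfile --target flutter
  while IFS= read -r tag; do
    if [ -n "$tag" ]; then set -- "$@" -t "$tag"; fi
  done <<EOF
$tags
EOF
  while IFS= read -r label; do
    if [ -n "$label" ]; then set -- "$@" --label "$label"; fi
  done <<EOF
$labels
EOF

  # One build that carries every tag and label (context "." is an assumption).
  run docker "$@" .

  # One push per tag, so each registry fails or succeeds on its own.
  while IFS= read -r tag; do
    if [ -n "$tag" ]; then run docker push "$tag"; fi
  done <<EOF
$tags
EOF
}

# Example invocation with placeholder image names.
build_and_push \
"docker.io/example/flutter-windows:1.2.3
ghcr.io/example/flutter-windows:1.2.3" \
"org.opencontainers.image.title=flutter-windows"
```

In a workflow step the two arguments would presumably come from environment variables populated from steps.metadata.outputs.tags and steps.metadata.outputs.labels. Passing each label as its own argument is what sidesteps the comma-delimiter problem noted for the third-party action.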
Decision: Three registry logins, all reusing existing secrets
The job runs docker/login-action three times (Docker Hub, GHCR, Quay) using the same secrets as release_android: DOCKER_HUB_USERNAME / DOCKER_HUB_TOKEN for Docker Hub, github.actor / github.token for GHCR, QUAY_USERNAME / QUAY_ROBOT_TOKEN for Quay. No new secret rotation is needed.
Decision: No needs: dependency between release_android and release_windows
Independent jobs run in parallel. If the Windows build fails, Android still publishes. The workflow run as a whole reports failure, but the Android image is live. This matches how multi-arch open-source distributions usually behave: don't hold back one platform on another's flake.
Alternative considered:
- release_windows: needs: release_android so Android validates first. Rejected: if Android fails, the tag is half-cut anyway and Windows would only matter as a postmortem; in the common case (both pass) it just adds 30–60 minutes to the wall-clock of the Windows publish.
Decision: Image build target is flutter, not a new release-only target
windows.Dockerfile already has the flutter stage as its production stage and test as the test stage. The release job builds --target flutter. windows.yml (PR test) builds --target test. This split is correct and matches android.Dockerfile.
Risks / Trade-offs
- [Risk] Windows runner is flaky and tags ship with no Windows image. → Mitigation: workflow_dispatch is the supported Windows-only recovery path. The if: github.event_name == 'push' guard on release_android makes workflow_dispatch re-run only release_windows, so recovery does not re-publish Android, re-push the Docker Hub readme, or re-attempt gh release create (which would fail on an already-existing release). Android recovery remains fix-forward + re-tag, consistent with the established 35-run history of release.yml. The release notes are tag-scoped (Android already publishes), so a delayed Windows publish is observable but not a failed release.
- [Risk] Quay or GHCR push fails after Docker Hub push succeeds. → Mitigation: docker push per-registry is idempotent; a re-run of the release_windows job pushes any missing tags. Document that the job is safe to re-run.
- [Risk] windows-2025 minutes cost grows by 30–60 min per tag. → Acceptable: tag cadence is roughly monthly (flutter-version-update PRs land on stable bumps). Annualized cost is bounded.
- [Trade-off] No Docker Hub description update for the Windows image. → Acceptable: users discover the Windows variant via the GitHub README, which already documents both. A separate Docker Hub repo (flutter-windows) will appear bare for now.
- [Trade-off] No Scout scan / SARIF upload for Windows. → Acceptable: the Scout coverage gap on Windows base images is well-known. Code-scanning dashboard remains Android-only until Scout matures.
Automated Test Strategy
- Pre-merge verification (the only level that matters here): the release.yml workflow itself does not run on pull_request; it runs on push: tags: * and workflow_dispatch. Therefore this change cannot be verified by a normal PR check. The verification path is:
  - Land the PR with the new release_windows job and the if: github.event_name == 'push' guard on release_android.
  - Use workflow_dispatch against an existing tag (e.g., the most recent stable Flutter tag) to trigger a one-shot run before the next stable bump. Expect release_windows to execute and the five Android-side jobs (release_android, update_description, record_image, set_bootstrap_image, create_github_release) to all be reported as skipped. A green workflow run is the success criterion.
  - Confirm the three pushed images exist via docker manifest inspect.
- Post-merge ongoing verification: every monthly tag exercises the path. The flutter-version-update PR pipeline (spec: flutter-version-update) already gates that the Android image is healthy before tagging; once p1-fix-windows-ci-tests is in, the Windows image is also gated by its test_windows PR check on the upgrade PR. So the release-time signal is "PR was green and merged → tag was cut → release builds against a verified image."
- No new test infrastructure: the change is GitHub Actions YAML. The metadata-action and login-action are battle-tested in release_android.
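The last step of the verification path (confirming that the three pushed images exist) could look roughly like the following. It is a sketch under assumptions: the namespace and version are placeholders, the same namespace is assumed on all three registries, and the image_refs and verify_images helpers are hypothetical names. Only docker manifest inspect is an actual Docker CLI command.

```shell
#!/bin/sh
set -eu

# Prints one flutter-windows image reference per registry.
# $1: registry namespace (placeholder, assumed identical on every registry)
# $2: version tag, e.g. 1.2.3
image_refs() {
  for registry in docker.io ghcr.io quay.io; do
    printf '%s/%s/flutter-windows:%s\n' "$registry" "$1" "$2"
  done
}

# Checks every reference and returns non-zero if any manifest is missing.
# A private repository would need a prior docker login for this to succeed.
verify_images() {
  missing=0
  for ref in $(image_refs "$1" "$2"); do
    if docker manifest inspect "$ref" >/dev/null 2>&1; then
      echo "ok      $ref"
    else
      echo "MISSING $ref"
      missing=1
    fi
  done
  return "$missing"
}
```

A maintainer would call it as verify_images example 1.2.3 from any host with the Docker CLI. Because the check only reads manifests, it works from a Linux machine even though the images are Windows containers.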
Observability
- Failure surface: the release_windows job appears as its own check on the workflow run. Failures show in the GitHub Actions UI exactly like release_android failures do today.
- Per-registry push errors: docker push errors are printed by the Docker CLI and end up in the workflow log under the push step. No additional logging is needed.
- Image digest visible after push: each docker push prints the resulting digest. Copying that digest into the run summary is a nice-to-have but not required; docker manifest inspect from any host gives the same answer post-hoc.
- No silent failures possible: each docker push is its own step (or sub-command) and propagates its exit code. The job step fails on any non-zero exit.
- Maintainer dashboard: gh run list --workflow=release.yml --limit 5 shows the most recent release runs with their overall status, and gh run view on a given run breaks it down per job; this is the existing observability surface and continues to work.
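The optional nice-to-have of copying the pushed digest into the run summary could be sketched as below. It assumes the closing line that docker push prints (tag, digest, size) and uses the GITHUB_STEP_SUMMARY file that GitHub Actions provides to each step. The extract_digest and record_digest helpers are hypothetical names.

```shell
#!/bin/sh
set -eu

# Reads docker push output on stdin and prints the last digest it reports.
# Relies on the closing line, e.g. "1.2.3: digest: sha256:<hex> size: 1234".
extract_digest() {
  sed -n 's/^.*digest: \(sha256:[0-9a-f]\{64\}\).*$/\1/p' | tail -n 1
}

# Appends one Markdown table row for the image to the job summary.
# Outside GitHub Actions GITHUB_STEP_SUMMARY is unset, so the row goes to stdout.
# $1: the image reference that was pushed
record_digest() {
  ref="$1"
  digest="$(extract_digest)"
  printf '| %s | %s |\n' "$ref" "${digest:-unknown}" >> "${GITHUB_STEP_SUMMARY:-/dev/stdout}"
}
```

A push step might pipe into it, for example docker push "$ref" | record_digest "$ref". A pipeline reports the exit status of its last command, so the step should run with pipefail enabled (which is how GitHub Actions invokes an explicit shell: bash); otherwise a failed push would be masked and the no-silent-failures property above would no longer hold.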
Migration Plan
- Land p1-fix-windows-ci-tests first so the Windows image is actually verified per-PR.
- Open a PR adding the release_windows job to release.yml. The PR's pull_request checks do not exercise release.yml (it runs on push: tags); land it on the strength of YAML review + pinned-action diff.
- After merge, manually run release.yml via workflow_dispatch against the most recent stable tag to validate the new job end-to-end before the next monthly upgrade PR cycle. If something is wrong, hotfix in a follow-up PR rather than waiting for a real release.
- The next flutter-version-update PR that lands and is tagged exercises the path automatically.
- Rollback strategy: if a regression appears (e.g., Windows registry credentials are wrong), revert the PR. After the revert, tag pushes publish only Android until the revert is itself reverted.
Open Questions
- Should the update_description job on Docker Hub gain a parallel update_description_windows step pointing at a readme-windows.md? Out of scope here; split if/when there's a Windows-specific readme worth maintaining.