chore: archive p3, implement p4-cleanup, propose p6 (#457)

Bundles three openspec changes that converged on this branch:

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Eligio Mariño
2026-05-23 14:50:52 +02:00
committed by GitHub
parent 4eb40b1786
commit 5bbeeb282a
17 changed files with 545 additions and 11 deletions
+90
@@ -0,0 +1,90 @@
name: Cleanup PR image tag
on:
  pull_request:
    types: [closed]
  delete:
permissions:
  contents: read
concurrency:
  group: cleanup-pr-image-${{ github.event.pull_request.number || github.event.ref }}
  cancel-in-progress: false
jobs:
  cleanup:
    # On `delete` events, only act on branch deletions (not tag deletions).
    # `pull_request: closed` is always in scope.
    if: ${{ github.event_name == 'pull_request' || github.event.ref_type == 'branch' }}
    runs-on: ubuntu-24.04
    permissions:
      packages: write
      contents: read
    env:
      PACKAGE_NAME: flutter-android
    steps:
      - name: Compute target tag
        id: tag
        env:
          EVENT_NAME: ${{ github.event_name }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REF_NAME: ${{ github.event.ref }}
        run: |
          set -euo pipefail
          if [[ "$EVENT_NAME" == "pull_request" ]]; then
            target="pr-${PR_NUMBER}"
          else
            target="branch-${REF_NAME//\//-}"
          fi
          # Defense in depth: fail closed if the computed tag does not match
          # the documented handoff-tag regex. Release tags (`<version>`) and
          # `buildcache` MUST be unreachable from this code path.
          if [[ ! "$target" =~ ^pr-[0-9]+$ ]] && [[ ! "$target" =~ ^branch-[A-Za-z0-9._-]+$ ]]; then
            echo "::error::computed tag '$target' does not match handoff-tag regex; refusing to proceed"
            exit 1
          fi
          echo "target=$target" >> "$GITHUB_OUTPUT"
          echo "Computed target tag: $target"
      - name: Resolve and delete GHCR version
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          TARGET_TAG: ${{ steps.tag.outputs.target }}
        run: |
          set -euo pipefail
          # The package is user-owned (gmeligio); /user/packages/... is the
          # authenticated-user endpoint. The workflow GITHUB_TOKEN can delete
          # because the package -> repo Actions Access role is Admin.
          version_id=$(gh api \
            "/user/packages/container/${PACKAGE_NAME}/versions" \
            --paginate \
            --jq ".[] | select(.metadata.container.tags[]? == \"${TARGET_TAG}\") | .id" \
            | head -n 1)
          if [[ -z "$version_id" ]]; then
            echo "tag not found, nothing to delete (tag='${TARGET_TAG}')"
            exit 0
          fi
          echo "Deleting version_id=${version_id} tag=${TARGET_TAG}"
          status=$(curl -sS -o /tmp/delete-body -w '%{http_code}' \
            -X DELETE \
            -H "Authorization: Bearer ${GH_TOKEN}" \
            -H "Accept: application/vnd.github+json" \
            -H "X-GitHub-Api-Version: 2022-11-28" \
            "https://api.github.com/user/packages/container/${PACKAGE_NAME}/versions/${version_id}")
          case "$status" in
            204)
              echo "Deleted ${TARGET_TAG} (version_id=${version_id})"
              ;;
            404)
              echo "Version ${version_id} already gone (404); treating as idempotent success"
              ;;
            *)
              echo "::error::DELETE returned HTTP ${status} for version_id=${version_id}"
              cat /tmp/delete-body
              exit 1
              ;;
          esac
@@ -56,8 +56,8 @@
## 8. Validate end-to-end
- [ ] 8.1 Push the branch; the `windows.yml` PR check builds the image with the new build args and runs the tightened Pester suite. Confirm green.
- [ ] 8.2 Run `update_version.yml` via `workflow_dispatch` (which runs unconditionally on the dispatch branch); confirm a draft PR is opened with a non-empty `windows` block diff. (If Flutter is not changing, the dispatch will be a no-op; force a dispatch on a test branch where `flutter_version.json` has been hand-edited.)
- [x] 8.1 Push the branch; the `windows.yml` PR check builds the image with the new build args and runs the tightened Pester suite. Confirm green.
- [x] 8.2 Run `update_version.yml` via `workflow_dispatch` (which runs unconditionally on the dispatch branch); confirm a draft PR is opened with a non-empty `windows` block diff. (If Flutter is not changing, the dispatch will be a no-op; force a dispatch on a test branch where `flutter_version.json` has been hand-edited.)
## 9. Documentation
@@ -66,4 +66,4 @@
## 10. Archive
- [ ] 10.1 After merge and confirmation that the next scheduled `update_version.yml` produces a sensible Windows-bumping PR, archive this change so the `windows-version-tracking` spec is promoted to `openspec/specs/` and the `flutter-version-update` delta is applied to the existing spec there.
- [x] 10.1 After merge and confirmation that the next scheduled `update_version.yml` produces a sensible Windows-bumping PR, archive this change so the `windows-version-tracking` spec is promoted to `openspec/specs/` and the `flutter-version-update` delta is applied to the existing spec there.
@@ -11,7 +11,7 @@ This change adds a small workflow that deletes the right tag when the PR closes
- New workflow `.github/workflows/cleanup_pr_image.yml` triggered on:
- `pull_request: { types: [closed] }` — deletes `pr-<N>` when the PR closes (merged or not).
- `delete:` (with `ref_type == 'branch'`) — deletes `branch-<branch-with-/-→--->` when a branch is deleted.
- Uses `gh api -X DELETE /user/packages/container/flutter-android/versions/<id>` after resolving the version id from the tag, OR the simpler `actions/delete-package-versions` action if a vetted pinned version is available.
- Uses `gh api -X DELETE /user/packages/container/flutter-android/versions/<id>` after resolving the version id from the tag.
- Runs on `ubuntu-24.04`, single job, < 30 s wall-clock.
- Permissions: `packages: write` plus `contents: read`; nothing else.
- Idempotent: a missing tag is a no-op success, not a failure.
@@ -33,4 +33,5 @@ _None._
- **Behavioral change**: GHCR no longer accumulates `pr-*` and `branch-*` tags on `flutter-android`. The release tags (`<flutter-version>`) are untouched — the cleanup matches only the documented temporary-tag patterns.
- **Risk**: a buggy match pattern could delete the release tag. Mitigation: the workflow refuses to proceed unless the computed tag matches `^pr-[0-9]+$` or `^branch-[A-Za-z0-9._-]+$`; the spec scenario asserts this explicitly. Defense in depth: the workflow logs the version-id and tag name before delete; a maintainer reviewing the workflow log can spot a wrong delete.
- **Depends on**: p2 (or co-merged). Has no value without the tags p2 creates.
- **Permission model**: Relies on the package's *Manage Actions access* granting `gmeligio/flutter-docker-image` the **Admin** role — already configured (verified 2026-05-23). With Admin, the workflow `GITHUB_TOKEN` with `packages: write` can DELETE versions; without it, the same call would 403. No PAT needed.
- **Out of scope**: cleanup of fork-PR artifacts (p2 already sets `retention-days: 1`, GitHub auto-deletes). Cleanup of the `buildcache` tag (p1's `mode=max` overwrites in place; no accumulation).
@@ -1,23 +1,22 @@
## 1. Add the cleanup workflow
- [ ] 1.1 Create `.github/workflows/cleanup_pr_image.yml` with:
- [x] 1.1 Create `.github/workflows/cleanup_pr_image.yml` with:
- `on.pull_request.types: [closed]`
- `on.delete:` (then job-level `if: github.event.ref_type == 'branch'`)
- `permissions: { packages: write, contents: read }`
- `concurrency: cleanup-pr-image-${{ github.event.pull_request.number || github.event.ref }}` with `cancel-in-progress: false` (a second close event for the same PR is a no-op).
- [ ] 1.2 Compute the target tag in a shell step:
- [x] 1.2 Compute the target tag in a shell step:
- `pull_request` event → `pr-${{ github.event.pull_request.number }}`
- `delete` event → `branch-${{ github.event.ref }}` with every `/` replaced by `-`
- Assert the tag matches `^pr-[0-9]+$` or `^branch-[A-Za-z0-9._-]+$`. Refuse to proceed if it doesn't — satisfies spec scenario "Cleanup never targets a non-handoff tag".
- [ ] 1.3 Resolve the GHCR package version id: `gh api /orgs/${{ github.repository_owner }}/packages/container/flutter-android/versions --paginate --jq '.[] | select(.metadata.container.tags[]? == "<tag>") | .id'`. Handle the user-vs-org path: try `/orgs/<owner>/...` first, fall back to `/users/<owner>/...` on 404.
- [ ] 1.4 Delete: `gh api -X DELETE /orgs/${{ github.repository_owner }}/packages/container/flutter-android/versions/<id>` (or user variant). On `404`, log and exit 0 (idempotent).
- [x] 1.3 Resolve the GHCR package version id: `gh api /user/packages/container/flutter-android/versions --paginate --jq '.[] | select(.metadata.container.tags[]? == "<tag>") | .id'`. The `/user/...` (authenticated-user) endpoint is used because the package is user-owned (`gmeligio`); the workflow's `GITHUB_TOKEN` can resolve and delete via this path because the package → repo Actions Access role is Admin (verified live on 2026-05-23 by deleting version 865726171 / `pr-453`, HTTP 204).
- [x] 1.4 Delete: `gh api -X DELETE /user/packages/container/flutter-android/versions/<id>`. On `404`, log and exit 0 (idempotent — tag already gone, either from a prior cleanup run or from a fork PR that never produced a tag).
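The tag computation and fail-closed check from task 1.2 can be sketched as a standalone shell function. This is a minimal sketch, not the workflow's actual code: the function names `is_handoff_tag` and `compute_target_tag` are hypothetical, and the `case` patterns are a portable stand-in for the two regexes named above.

```bash
# Hypothetical helpers illustrating task 1.2; names are not from the workflow.

# Succeeds only for pr-<digits> or branch-<[A-Za-z0-9._-]+>.
is_handoff_tag() {
  local tag="$1" rest
  case "$tag" in
    pr-*)
      rest="${tag#pr-}"
      case "$rest" in ''|*[!0-9]*) return 1 ;; esac ;;
    branch-*)
      rest="${tag#branch-}"
      case "$rest" in ''|*[!A-Za-z0-9._-]*) return 1 ;; esac ;;
    *) return 1 ;;
  esac
  return 0
}

# Usage: compute_target_tag <event_name> <pr_number> <ref_name>
# Prints the tag on success; prints an error annotation and fails otherwise.
compute_target_tag() {
  local event_name="$1" pr_number="$2" ref_name="$3" target
  if [ "$event_name" = "pull_request" ]; then
    target="pr-${pr_number}"
  else
    # Replace every "/" in the branch name with "-".
    target="branch-$(printf '%s' "$ref_name" | tr '/' '-')"
  fi
  if ! is_handoff_tag "$target"; then
    echo "::error::computed tag '$target' does not match handoff-tag regex" >&2
    return 1
  fi
  printf '%s\n' "$target"
}

compute_target_tag pull_request 453 ''        # prints pr-453
compute_target_tag delete '' 'feature/foo'    # prints branch-feature-foo
```

A branch name containing a character outside `[A-Za-z0-9._-]` after the slash substitution (a space, for example) makes the function fail instead of producing a tag, which is the fail-closed behavior the task asks for.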
## 2. Verify on a real PR before merge
- [ ] 2.1 Open a non-fork PR (so p2 produces a `pr-N` tag), close it, confirm `pr-N` is removed from GHCR within 60 s. Repeat with a merge-close.
- [ ] 2.2 Trigger a `workflow_dispatch` on a feature branch (so p2 produces `branch-<name>`), then delete the branch. Confirm the tag is removed.
- [ ] 2.3 Close a PR that never produced a tag (fork PR — p2 used the artifact path). Confirm the workflow runs, logs "tag not found, nothing to delete", and exits 0.
- [ ] 2.4 Manually create a tag matching `^pr-9999$` then close PR #1 (whose tag doesn't exist). Confirm only `pr-1` is targeted (not `pr-9999`) — satisfies spec scenario "Cleanup never targets a non-handoff tag".
## 3. Post-merge closure check
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-05-23
@@ -0,0 +1,141 @@
## Context
Live `gh api /user/packages/container/flutter-android/versions --paginate` (2026-05-23) returns 886 versions: 836 untagged manifests, ~45 tagged (`pr-*`, `branch-*`, `<flutter-version>`, `buildcache`), and the rest are tagged release sub-manifests. The 94% untagged rate is generated by three sources:
1. **PR re-runs** — every `docker push ghcr.io/.../flutter-android:pr-<N>` overwrites the tag; the previously-tagged manifest survives as orphan bytes.
2. **Release re-tags** — when a version bump moves the `<flutter-version>` tag, the prior manifest is detached.
3. **Buildcache churn**: `mode=max` GHCR cache layers rotate; old layer manifests are detached.
p4-cleanup-pr-image-tags deletes the *current* `pr-N` manifest on PR close — it has no view of the orphans that preceded it. A different lifecycle is needed: not event-driven (no GH event corresponds to "manifest became orphan"), but **time-window-driven** (sweep what's been orphan-and-quiescent for long enough).
## Goals / Non-Goals
**Goals:**
- Drop the steady-state version count to a bounded, predictable number (≈ tagged count + < 7-day-old orphans).
- Run unattended on a schedule; no per-PR overhead.
- Zero risk to tagged versions — release, `pr-N`, `branch-X`, and `buildcache` SHALL be untouched.
- Preserve safety for in-flight `docker pull` consumers — an orphan that became untagged 30 seconds ago might still be the target of a Scout-comparison job running right now.
- Reuse the API path and permission model p4 already validated.
**Non-Goals:**
- Pruning tagged versions (release lifecycle is separate; PR-tag lifecycle is p4's domain).
- Cross-package pruning (Windows image has its own release flow and is not yet covered).
- Manifest-list awareness — this repo ships single-platform images today; if multi-arch is added later, the design needs revisiting (see Risks).
- One-shot backfill of the 836 existing orphans inside this change. The first scheduled run *will* drop the count; that's fine. Optional `workflow_dispatch` with `dry_run: false` provides the manual trigger.
## Decisions
### D1. Schedule-driven sweep, not event-driven
**Decision**: `on.schedule: '0 4 * * 0'` (weekly, Sunday 04:00 UTC) plus `on.workflow_dispatch:` with a `dry_run` input defaulting to `true`.
**Alternatives considered**:
- *Trigger on every successful `build.yml` run.* Rejected — couples cleanup to build cadence, adds a 30-60s tax to every PR, and a transient build failure would skip cleanup.
- *Daily schedule.* Marginal benefit (orphans don't compound that fast post-p4); 7× the workflow runs.
- *On-demand only (`workflow_dispatch`).* Rejected — relies on a human remembering; defeats the point.
**Rationale**: A weekly sweep is enough to keep the version count bounded once the initial backfill clears. The retention window (D2) is the safety dial, not the schedule frequency.
### D2. Retention window: 7 days, configurable via env
**Decision**: Only versions with `created_at` older than `RETENTION_DAYS` (default 7) are eligible. Implemented as a `jq` filter:
```bash
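# Retention window in days; defaults to 7 when the env var is unset.
RETENTION_DAYS="${RETENTION_DAYS:-7}"
NOW=$(date -u +%s)
# Versions created before this epoch second are past the retention window.
CUTOFF=$((NOW - RETENTION_DAYS * 86400))
# Emit the ids of untagged versions older than the cutoff.
gh api /user/packages/container/flutter-android/versions --paginate \
  --jq ".[] | select(.metadata.container.tags == [] and (.created_at | fromdateiso8601) < $CUTOFF) | .id"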
```
**Alternatives considered**:
- *No window* (delete all untagged immediately). Rejected — risks deleting a manifest still being pulled by a parallel-running Scout/test job that just got untagged seconds ago.
- *24 hours.* Probably safe given GitHub-hosted runners cap each job at 6h of wall-clock, but 7 days gives margin for any future long-running consumer (manual `docker pull`, Renovate workflow re-runs, etc.).
- *Configurable via repo variable.* Possible follow-up; for now, a single number in the workflow YAML is enough — change requires a PR which is good visibility.
**Rationale**: 7 days >> max job wall-clock on GitHub-hosted runners (6h) >> typical PR feedback loop (< 1h). The width of the safety margin is what matters, not its exact value.
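The cutoff arithmetic can be checked in isolation with fixed timestamps. The following is a minimal sketch under stated assumptions: `cutoff_epoch` and `is_past_retention` are hypothetical helper names, and the epoch values are made-up inputs chosen only to make the arithmetic easy to follow.

```bash
# Hypothetical helpers; not part of the workflow described here.

# cutoff_epoch <now_epoch> <retention_days>
# Prints the epoch second before which a version counts as old enough.
cutoff_epoch() {
  echo $(( $1 - $2 * 86400 ))
}

# is_past_retention <created_epoch> <cutoff_epoch>
# Succeeds when the version was created strictly before the cutoff.
is_past_retention() {
  [ "$1" -lt "$2" ]
}

now=1000000000                       # illustrative "now", not a real run time
cutoff=$(cutoff_epoch "$now" 7)      # 1000000000 - 604800 = 999395200

# Created 8 days before "now": eligible.
is_past_retention $(( now - 8 * 86400 )) "$cutoff" && echo "8d old: eligible"
# Created 1 hour before "now": kept.
is_past_retention $(( now - 3600 )) "$cutoff" || echo "1h old: kept"
```

Passing "now" in as an argument keeps the comparison deterministic, which makes the boundary easy to test without touching the registry.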
### D3. Filter expressed in `jq` with positive assertions only
**Decision**: The filter is `tags == [] AND created_at < cutoff`. No negative match against the protected tag list. The empty-tags check is the safety invariant.
**Alternatives considered**:
- *Negative regex against `^pr-|^branch-|^\d+\.\d+\.\d+|^buildcache$`.* Rejected — a future tag pattern not in the regex would be silently deleted. Positive "must be untagged" is fail-closed.
- *Two-pass: list, then verify each candidate is still untagged before deleting.* Cheap-ish, but the race window is already covered by the retention cutoff.
**Rationale**: Positive invariants in safety-critical code are easier to audit. "Delete only if `tags == []`" is unambiguously safe.
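The difference between the two filter styles shows up as soon as a tag appears that nobody anticipated. The sketch below is illustrative only: both function names are hypothetical, the tag `nightly-2026` is an invented example of a future pattern, and the shell globs are a rough approximation of the rejected regex rather than an exact translation.

```bash
# Hypothetical comparison of the two filter styles from D3.
# Each function takes the version's tags as one space-separated string
# (OCI tags cannot contain spaces or glob characters, so splitting is safe).

# Positive invariant: a candidate must have no tags at all.
eligible_positive() {
  [ -z "$1" ]
}

# Rejected alternative: deletable unless some tag matches a known pattern.
eligible_negative() {
  local tag
  for tag in $1; do
    case "$tag" in
      pr-*|branch-*|buildcache|[0-9]*.[0-9]*.[0-9]*) return 1 ;;
    esac
  done
  return 0
}

# An unanticipated future tag stays protected under the positive invariant...
eligible_positive "nightly-2026" || echo "positive: nightly-2026 kept"
# ...but slips through the negative pattern list and would be deleted.
eligible_negative "nightly-2026" && echo "negative: nightly-2026 deletable"
```

Both filters agree on untagged versions and on the currently known tags; they diverge only on tags outside the list, which is exactly the case the decision is guarding against.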
### D4. Per-page enumeration, no rate-limit concern
**Decision**: `--paginate` walks all pages (~9 pages today at `per_page=100`; the REST default of 30 per page would take ~30). Each DELETE is one API call. No throttling logic.
**Alternatives considered**:
- *Batched DELETE.* The REST API has no batch endpoint for package versions. GitHub's own `actions/delete-package-versions` action chunks at 100 per run as a soft limit; rolling our own batching would hit the same ceiling.
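- *Explicit `sleep` between DELETEs.* GitHub's primary rate limit for `GITHUB_TOKEN` is 1,000 requests/hour per repository (the 5,000/hour figure applies to personal access tokens). Pruning all 836 orphans in one go uses about 84% of that quota, so the one-time backfill still fits in a single run and the weekly steady state uses far less. No throttling needed.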
**Rationale**: Simple loop, no extra moving parts.
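The request budget behind this decision is simple integer arithmetic: one call per listing page plus one call per DELETE. The sketch below assumes nothing about the real quota; `api_calls_needed` and `quota_percent` are hypothetical helpers, and the hourly quota is passed in as a parameter so it can be set to whatever GitHub documents for the token in use.

```bash
# Hypothetical budget helpers; not part of the workflow.

# api_calls_needed <total_versions> <per_page> <deletes>
# Listing pages (rounded up) plus one request per deletion.
api_calls_needed() {
  local total="$1" per_page="$2" deletes="$3"
  echo $(( (total + per_page - 1) / per_page + deletes ))
}

# quota_percent <calls> <hourly_quota>
# Share of the quota consumed, rounded to the nearest whole percent.
quota_percent() {
  local calls="$1" quota="$2"
  echo $(( (calls * 100 + quota / 2) / quota ))
}

# Figures from the 2026-05-23 probe: 886 versions, 836 of them untagged.
api_calls_needed 886 100 836   # prints 845 (9 pages + 836 deletes)
api_calls_needed 886 30 836    # prints 866 (30 pages + 836 deletes)
quota_percent 845 1000         # prints 85
quota_percent 845 5000         # prints 17
```

The page size barely moves the total because deletions dominate; the quota assumed for the token is what decides how much headroom the one-time backfill has.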
### D5. Reuse p4's permission model — no PAT
**Decision**: `permissions: { packages: write, contents: read }` with the workflow `GITHUB_TOKEN`. Empirically validated: package → repo Actions Access role is **Admin**, which makes DELETE work for the workflow token (verified by deleting version 865726171 / `pr-453` on 2026-05-23 — HTTP 204).
**Alternatives considered**:
- *PAT with `delete:packages`.* Adds credential rotation surface for no functional gain.
**Rationale**: Same path p4 uses; one less thing to manage.
### D6. `dry_run` default differs by trigger
**Decision**:
- `on.schedule:` → `dry_run = false` (the scheduled run actually deletes).
- `on.workflow_dispatch:` → input `dry_run` defaults to `true` (a maintainer poking the workflow gets a preview; opting into deletion requires flipping the input).
**Alternatives considered**:
- *Always dry-run by default; require manual confirmation.* Rejected — defeats the unattended-schedule premise.
- *Dual-job: dry-run prints, then sleeps 1h, then deletes.* Over-engineered; the retention window already provides the safety margin.
**Rationale**: Different defaults match different intents. Scheduled = "do the thing"; manual = "show me first".
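One way to resolve the effective `dry_run` value is a small function keyed on the event name. This is a sketch, not the planned implementation: `resolve_dry_run` is a hypothetical name, and falling back to preview mode for an unrecognized trigger is an assumption added here, not something the decision specifies.

```bash
# Hypothetical helper for D6.
# Usage: resolve_dry_run <event_name> [dispatch_input]
resolve_dry_run() {
  local event_name="$1" input="${2:-}"
  case "$event_name" in
    schedule)
      # Scheduled runs always delete.
      echo false ;;
    workflow_dispatch)
      # Only an explicit "false" opts in to deletion; anything else previews.
      if [ "$input" = "false" ]; then echo false; else echo true; fi ;;
    *)
      # Assumption: an unexpected trigger falls back to preview mode.
      echo true ;;
  esac
}

resolve_dry_run schedule                   # prints false
resolve_dry_run workflow_dispatch          # prints true
resolve_dry_run workflow_dispatch false    # prints false
```

Treating every value except the literal `false` as a preview keeps a typo in the dispatch input from turning into a destructive run.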
## Automated Test Strategy
- **Unit-style**: none — the workflow is bash + `gh` calls; no library code.
- **Integration / smoke**: a `workflow_dispatch` run with `dry_run: true` after the workflow lands. Expected: ≥ 800 candidate ids printed (matches the 836 measured 2026-05-23), zero DELETE calls in the log.
- **Production canary**: the first scheduled run (next Sunday after merge) is observed live. Expected: version count drops from ~885 (post-p4) to ~50, tagged versions unchanged. The workflow log lists every deleted id with its (empty) tag list for audit.
- **Regression**: post-merge, every workflow run logs the pre-/post-sweep version counts. A spike in post-sweep tagged-count would indicate a filter bug — alert-worthy.
## Observability
- **Per-deletion log line**: `INFO: deleting <id> created=<iso> tags=[] sha=<digest>`. Retained in the run log for 90 days under GitHub's default log retention (configurable per repository).
- **Summary line**: `INFO: pruned N untagged versions, M remain (T tagged kept)`.
- **Tagged-count invariant check**: before the loop, the workflow records `tagged_count_before` (the number of versions with non-empty tags); after the loop, it recounts and asserts `tagged_count_after == tagged_count_before`. Mismatch → `::error::` annotation + non-zero exit. This is the loud failure path for the worst-case bug (accidental tagged delete).
- **No alerting integration** — the workflow run failure shows in the repo's *Actions* tab. Adequate for a weekly job.
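The tagged-count invariant reduces to comparing two integers and failing loudly on a mismatch. A minimal sketch follows; the function name `assert_tagged_count_unchanged` and the wording of the success line are assumptions, while the error message follows the format given in the task list for this change.

```bash
# Hypothetical helper for the tagged-count invariant.
# Usage: assert_tagged_count_unchanged <before> <after>
assert_tagged_count_unchanged() {
  local before="$1" after="$2"
  if [ "$before" -ne "$after" ]; then
    # GitHub Actions renders ::error:: lines as annotations on the run.
    echo "::error::tagged_count changed: before=${before} after=${after}"
    return 1
  fi
  echo "INFO: tagged_count unchanged (${before})"
}

assert_tagged_count_unchanged 50 50   # prints the INFO line, returns 0
```

In a workflow step running under `set -e`, the non-zero return on a mismatch is enough to fail the job, so no separate `exit` call is needed at the call site.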
## Risks / Trade-offs
| Risk | Mitigation |
|------|------------|
| Filter regression deletes a tagged version | Positive `tags == []` invariant; pre/post tagged-count assertion; per-id log for forensics |
| Multi-arch future breaks orphan semantics (manifest-list children look untagged but are referenced) | Out-of-scope today; documented in *Non-Goals*; if multi-arch lands, this workflow MUST be extended to skip versions referenced by manifest lists before any DELETE |
| Retention too aggressive: deletes a manifest still being pulled | 7-day window is 28× the 6h per-job cap on GitHub-hosted runners (168h / 6h); configurable upward if a real consumer pattern emerges |
| GHCR API change (e.g., pagination format) silently breaks enumeration | Workflow logs the candidate count before deleting; a sudden drop to 0 is visible in the post-sweep summary |
| Scheduled run collides with a release publication | `concurrency:` group `prune-ghcr-untagged` with `cancel-in-progress: false`; release workflow holds a different group; no shared mutation surface beyond the registry, which serializes per-tag anyway |
## Migration Plan
1. Merge p4 first (reduces ongoing orphan creation rate).
2. Merge this change; do not wait for the scheduled run — trigger `workflow_dispatch` with `dry_run: true` to validate the candidate list looks right.
3. Trigger `workflow_dispatch` with `dry_run: false` to clear the existing 836 orphans in one pass.
4. Let the weekly schedule maintain steady state.
Rollback: delete the workflow file. The 836-orphan backlog will rebuild slowly (only as new PRs re-run); no urgency.
## Open Questions
_None._ The empirical probe on 2026-05-23 (deleted `pr-453`, version 865726171, HTTP 204) confirmed the permission model. The retention window is a tunable knob, not a question.
@@ -0,0 +1,30 @@
## Why
`ghcr.io/<owner>/flutter-android` carries 836 untagged manifest versions out of 886 total (~94%). They accumulate every time a `pr-N` tag moves on re-run, a release re-tag detaches the prior manifest, or a buildcache layer is replaced. p4 (cleanup-pr-image-tags) only deletes the tagged handoff on PR close/branch delete — it does not address the orphan manifests left behind. Without a separate sweep, GHCR storage debt grows monotonically per PR re-run, slowing the package UI and accruing storage cost.
## What Changes
- New workflow `.github/workflows/prune_ghcr_untagged.yml` on `schedule:` (cron, weekly) and `workflow_dispatch:` for manual runs.
- Enumerates `gh api /user/packages/container/flutter-android/versions --paginate` and filters to entries where `metadata.container.tags == []` AND `created_at` is older than a retention window (default 7 days).
- Deletes each match via `gh api -X DELETE /user/packages/container/flutter-android/versions/<id>`, reusing the path/permission p4 already validated (package → repo Actions Access role is Admin).
- Dry-run mode (default on `workflow_dispatch` with input `dry_run: true`) prints the candidate list without deleting; the scheduled run executes the delete.
- Idempotent: a missing version is a no-op success.
## Capabilities
### New Capabilities
- `ci-image-orphan-pruning`: defines when untagged manifest versions on `ghcr.io/<owner>/flutter-android` SHALL be pruned, what tags SHALL be preserved, the retention-window contract that protects in-flight `docker pull` consumers, and the safety guarantees that prevent deletion of tagged versions.
### Modified Capabilities
_None._ Distinct from `ci-image-tag-lifecycle` (p4): that capability owns *tagged* handoff cleanup on PR-close / branch-delete events; this one owns *untagged* manifest pruning on a schedule.
## Impact
- **Affected files**: new `.github/workflows/prune_ghcr_untagged.yml`. No edits to existing workflows.
- **Behavioral change**: GHCR version count on `flutter-android` drops from ~886 to ~50 on first scheduled run; thereafter holds steady near the count of currently-tagged versions plus orphans from the past 7 days.
- **Risk**: a buggy filter could delete a tagged version (release, `pr-N`, `branch-X`, or `buildcache`). Mitigation: the filter SHALL be a positive assertion (`tags == []`) AND a date check; the workflow logs every candidate id + tag list before deleting; a maintainer reviewing the log can spot a wrong delete. Defense in depth: the spec scenarios assert the tagged-protection invariant explicitly.
- **Risk**: deleting an orphan manifest that is still part of a manifest list (multi-platform release) would break the parent. Mitigation: this repo ships single-platform images today; manifest-list semantics are out of scope. If multi-platform is added later (e.g., arm64), the workflow MUST be extended to walk the manifest tree before pruning.
- **Depends on**: p4-cleanup-pr-image-tags is not a hard dependency, but p4 deletes the tagged `pr-N` version outright on PR close (the DELETE removes the version rather than merely untagging it), so closed-PR images never join the orphan pool. Land p4 first to converge the steady-state count faster.
- **Out of scope**: Windows package (`flutter-windows-server` or similar — separate release flow), cross-package pruning, deleting tagged-but-superseded release versions (those have their own lifecycle).
@@ -0,0 +1,120 @@
## ADDED Requirements
### Requirement: Scheduled sweep prunes untagged versions older than the retention window
A workflow SHALL run on a recurring schedule (no less frequent than weekly) and delete every `ghcr.io/<owner>/flutter-android` package version whose tag list is empty AND whose `created_at` is older than the retention window (default 7 days).
The experience context is the maintainer auditing GHCR storage — they expect the package's version count to converge to roughly the count of currently-tagged versions plus a thin layer of < 7-day-old orphans, not to grow monotonically with PR re-runs.
#### Scenario: Weekly run prunes orphans older than the window
- **GIVEN** the package has 800 untagged versions all created more than 7 days ago, and 50 tagged versions (`pr-*`, `branch-*`, release tags, `buildcache`)
- **WHEN** the scheduled workflow runs
- **THEN** all 800 untagged versions are deleted
- **AND** the 50 tagged versions remain untouched
- **AND** the workflow log contains a per-deletion line `INFO: deleting <id> ... tags=[]` for each deleted version
- **AND** the workflow log contains a summary line naming the deleted count and the remaining count
#### Scenario: Recent orphans are preserved
- **GIVEN** an untagged version created 1 hour ago (orphan from a just-re-run PR build)
- **WHEN** the scheduled workflow runs
- **THEN** that version is NOT deleted
- **AND** the workflow log does not list it as a candidate
### Requirement: Pruning never targets a tagged version
The workflow SHALL filter candidates by the positive invariant `metadata.container.tags == []`. A version with any non-empty tag list — release tag (`<flutter-version>`), handoff tag (`pr-<N>`, `branch-<X>`), buildcache tag, or any future tag — SHALL be unreachable from the delete code path.
The experience context is the maintainer auditing the workflow before merging — they need confidence that this scheduled, unattended job cannot delete a release tag, a `pr-<N>` tag belonging to an open PR, or the `buildcache` tag.
#### Scenario: Release tag is never considered for deletion
- **GIVEN** a release version `3.41.9` (tagged) created more than 7 days ago
- **WHEN** the workflow runs
- **THEN** the `3.41.9` version is not in the candidate list
- **AND** it is not deleted
#### Scenario: Open-PR handoff tag is never considered for deletion
- **GIVEN** PR #500 is open and `pr-500` was last pushed 10 days ago (still tagged on the current manifest)
- **WHEN** the workflow runs
- **THEN** the `pr-500`-tagged version is not in the candidate list
#### Scenario: Buildcache tag is preserved
- **GIVEN** the `buildcache` tag points at a version created more than 7 days ago
- **WHEN** the workflow runs
- **THEN** the tagged buildcache version is preserved
- **AND** older untagged buildcache layer manifests are eligible for deletion
#### Scenario: Tagged-count invariant guards against filter bugs
- **GIVEN** any successful workflow run
- **WHEN** the workflow finishes
- **THEN** the count of tagged versions after pruning equals the count of tagged versions before pruning
- **AND** if the invariant fails, the workflow exits non-zero with a `::error::` annotation naming the count delta
### Requirement: Manual trigger defaults to dry-run
The workflow SHALL expose a `workflow_dispatch` trigger with an input `dry_run` whose default is `true`. When `dry_run` is true, the workflow SHALL enumerate candidates and log them but SHALL NOT issue any DELETE request.
The experience context is the maintainer poking the workflow ad-hoc — they expect to see the candidate list before anything destructive happens, and to opt in to deletion explicitly by flipping the input.
#### Scenario: workflow_dispatch with default input previews only
- **GIVEN** a maintainer triggers the workflow via the *Actions* UI without changing inputs
- **WHEN** the workflow runs
- **THEN** the workflow logs every candidate version id with its `created_at` and (empty) tag list
- **AND** the workflow log contains zero `DELETE` lines
- **AND** no version is removed from GHCR
#### Scenario: workflow_dispatch with dry_run=false deletes
- **GIVEN** a maintainer triggers the workflow with `dry_run: false`
- **WHEN** the workflow runs
- **THEN** every candidate version is deleted (same effect as a scheduled run)
#### Scenario: Scheduled trigger deletes by default
- **GIVEN** the cron schedule fires
- **WHEN** the workflow runs
- **THEN** every candidate version is deleted without requiring any input override
### Requirement: Pruning is idempotent
A pruning run that finds zero candidates SHALL exit 0. A DELETE that returns 404 (version already deleted by a prior race) SHALL be logged and treated as success.
The experience context is the on-call maintainer reading workflow logs — they expect a no-op run on a clean registry to look identical to a successful run, not to fail.
#### Scenario: Clean registry produces a no-op success
- **GIVEN** every untagged version on the package is < 7 days old
- **WHEN** the workflow runs
- **THEN** the workflow logs "0 candidates" and exits 0
#### Scenario: Concurrent delete is not an error
- **GIVEN** a candidate version id was deleted by a prior run between enumeration and DELETE
- **WHEN** this run issues DELETE for that id
- **THEN** the 404 response is logged and the workflow continues with the next candidate
- **AND** the workflow exits 0 if no other failures occurred
### Requirement: Pruning workflow runs with minimum privilege
The workflow SHALL declare `permissions: { packages: write, contents: read }` and SHALL NOT request any other permission. It SHALL use the workflow `GITHUB_TOKEN` — no personal access token or external secret.
The experience context is the security reviewer auditing the unattended scheduled job — they need to confirm the workflow cannot reach beyond the package and cannot publish secrets.
#### Scenario: Workflow does not run on push or pull_request
- **GIVEN** a push to main or a pull_request open
- **WHEN** GitHub fires workflow events
- **THEN** this workflow does not appear in the run list for those events
#### Scenario: Workflow uses GITHUB_TOKEN with packages:write
- **GIVEN** the workflow file
- **WHEN** a maintainer reads its `permissions:` block
- **THEN** only `packages: write` and `contents: read` are requested
- **AND** no `secrets.*` other than `GITHUB_TOKEN` is referenced
@@ -0,0 +1,35 @@
## 1. Add the pruning workflow
- [ ] 1.1 Create `.github/workflows/prune_ghcr_untagged.yml` with:
- `on.schedule: [{ cron: '0 4 * * 0' }]` (Sundays 04:00 UTC).
- `on.workflow_dispatch.inputs.dry_run: { type: boolean, default: true }`.
- `permissions: { packages: write, contents: read }`.
- `concurrency: { group: prune-ghcr-untagged, cancel-in-progress: false }`.
- [ ] 1.2 Resolve effective `dry_run` for the run:
- `schedule` event → `dry_run=false`.
- `workflow_dispatch` event → use the input (default true).
- [ ] 1.3 Compute the cutoff timestamp from `RETENTION_DAYS` (env, default `7`): `CUTOFF=$(($(date -u +%s) - RETENTION_DAYS * 86400))`.
- [ ] 1.4 Capture `tagged_count_before` by listing all versions where `metadata.container.tags != []`.
- [ ] 1.5 Enumerate candidates: `gh api /user/packages/container/flutter-android/versions --paginate --jq ".[] | select(.metadata.container.tags == [] and (.created_at | fromdateiso8601) < $CUTOFF) | .id"`.
- [ ] 1.6 For each candidate id, log `INFO: deleting <id> created=<iso> tags=[] sha=<digest>`. When `dry_run=false`, issue `gh api -X DELETE /user/packages/container/flutter-android/versions/<id>` and treat HTTP 404 as success (log and continue).
- [ ] 1.7 After the loop, capture `tagged_count_after` and `untagged_count_after`. Assert `tagged_count_after == tagged_count_before` — on mismatch, emit `::error::tagged_count changed: before=$before after=$after` and exit 1.
- [ ] 1.8 Print summary: `INFO: pruned N versions (dry_run=<bool>), M untagged remain, K tagged kept`.
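Tasks 1.6 to 1.8 describe a loop that can be exercised without touching GHCR if the DELETE call is injected. The sketch below makes several assumptions: `prune_ids` and `fake_delete` are hypothetical names, the delete function is a stand-in that must print an HTTP status code, and the log lines are shortened versions of the formats listed above (no `created=` or `sha=` fields).

```bash
# Hypothetical loop for tasks 1.6-1.8; the real DELETE call is injected.
# Usage: prune_ids <dry_run> <delete_fn>   (candidate ids on stdin, one per line)
prune_ids() {
  local dry_run="$1" delete_fn="$2" id status pruned=0
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    echo "INFO: deleting ${id} tags=[]"
    # In dry-run mode, log the candidate and issue no request.
    if [ "$dry_run" = "true" ]; then continue; fi
    # </dev/null keeps the delete command from consuming the id list.
    status=$("$delete_fn" "$id" </dev/null)
    case "$status" in
      204) pruned=$((pruned + 1)) ;;
      404) echo "INFO: ${id} already gone (404), continuing" ;;
      *)   echo "::error::DELETE returned HTTP ${status} for ${id}"; return 1 ;;
    esac
  done
  echo "INFO: pruned ${pruned} versions (dry_run=${dry_run})"
}

# Stand-in for the real call: id 2 was already removed by someone else.
fake_delete() {
  case "$1" in 2) echo 404 ;; *) echo 204 ;; esac
}

printf '1\n2\n3\n' | prune_ids false fake_delete
# last line printed: INFO: pruned 2 versions (dry_run=false)
```

Injecting the delete command is a design choice for this sketch only; it lets the 404 path from task 3.2 be checked with a stub instead of racing a manual delete against a live run.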
## 2. Validate on the live registry
- [ ] 2.1 Trigger `workflow_dispatch` with `dry_run: true`. Expect: candidate count > 800 (matches the 836 measured 2026-05-23 minus any deleted by p4 in the interim); zero DELETE calls in the log; `tagged_count_before == tagged_count_after` trivially holds.
- [ ] 2.2 Spot-check 3 candidate ids from the log against `gh api /user/packages/container/flutter-android/versions/<id>`. Confirm each has `metadata.container.tags == []` and `created_at` older than 7 days.
- [ ] 2.3 Trigger `workflow_dispatch` with `dry_run: false`. Expect: ~ same N deletions as the dry-run count; tagged-count invariant holds; final summary line names the new totals.
- [ ] 2.4 Verify the protected set: `gh api /user/packages/container/flutter-android/versions --paginate --jq '[.[] | select(.metadata.container.tags != [])] | length'` returns the same number as before the run.
- [ ] 2.5 Verify specific tags survived: `gh api /user/packages/container/flutter-android/versions --paginate --jq '.[] | select(.metadata.container.tags[]? == "buildcache" or .metadata.container.tags[]? == "3.41.9")'` returns non-empty for each.
## 3. Verify idempotence and edge cases
- [ ] 3.1 Re-run `workflow_dispatch` with `dry_run: false` immediately after step 2.3. Expect: 0 candidates (or only versions that became eligible in the last few seconds), workflow exits 0, summary line shows "pruned 0".
- [ ] 3.2 Simulate a race in which a candidate id is deleted between enumeration and the workflow's DELETE call: on a feature branch, add a deliberate sleep to the workflow between enumeration and the DELETE loop, then issue a manual `gh api -X DELETE` for one candidate while the workflow sleeps. Confirm the workflow logs the 404 and continues. Revert the sleep.
- [ ] 3.3 Confirm that recent orphans are preserved: before the run, count untagged versions whose `created_at` falls within the last 7 days; run the workflow; count again. The two counts should match.
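
The idempotence property exercised in 3.1 and 3.2 rests on treating HTTP 404 as success. A minimal sketch of that loop follows, assuming an injected `deleteVersion` function that performs the DELETE call and returns the HTTP status code. Both `pruneCandidates` and `deleteVersion` are hypothetical names, and the loop is synchronous only for brevity.

```javascript
// Hypothetical sketch: `deleteVersion(id)` is assumed to issue the DELETE
// request and return the resulting HTTP status code.
function pruneCandidates(candidateIds, deleteVersion, log = console.log) {
  let deleted = 0;
  let alreadyGone = 0;
  for (const id of candidateIds) {
    const status = deleteVersion(id);
    if (status === 404) {
      // The version vanished between enumeration and DELETE: not an error.
      log(`INFO: version ${id} already gone (404), continuing`);
      alreadyGone += 1;
    } else if (status >= 200 && status < 300) {
      deleted += 1;
    } else {
      // Any other status is unexpected and should stop the run.
      throw new Error(`DELETE for version ${id} failed with HTTP ${status}`);
    }
  }
  return { deleted, alreadyGone };
}
```

Injecting the delete function keeps the race scenario testable without touching the live registry.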
## 4. Wait one schedule cycle
- [ ] 4.1 After the first natural Sunday-04:00 cron firing, inspect the run log. Confirm: schedule event → `dry_run=false` was applied, deletions occurred, tagged-count invariant held, and the summary line totals agree with the deletions logged.
- [ ] 4.2 Add a one-line note in `docs/contributing.md` (or equivalent) describing the weekly prune, in case a maintainer sees a version disappear and wants to understand why.
+14 -2
View File
@@ -27,9 +27,9 @@ The experience context is the CI engineer who watches this repository for upgrad
### Requirement: Upgrade PR contains a coherent, validated `version.json`
When the workflow opens an upgrade PR, the included `config/version.json` SHALL satisfy `cue vet config/schema.cue -d '#Version'` and SHALL contain the Android `buildTools.version` listed for that exact Flutter tag in `engine/src/flutter/tools/android_sdk/packages.txt` upstream.
When the workflow opens an upgrade PR, the included `config/version.json` SHALL satisfy `cue vet config/schema.cue -d '#Version'` and SHALL contain the Android `buildTools.version` listed for that exact Flutter tag in `engine/src/flutter/tools/android_sdk/packages.txt` upstream. The same `version.json` SHALL also contain a `windows.git.version` equal to the latest non-prerelease tag at `https://api.github.com/repos/git-for-windows/git/releases/latest` (with any `.windows.N` suffix stripped) and the VS BuildTools component versions sourced from the deterministic source documented in `p3-windows-version-schema`'s design.
The experience context is the CI engineer reviewing or merging the upgrade PR — they observe that downstream image builds will not silently regress on Android tooling.
The experience context is the CI engineer reviewing or merging the upgrade PR — they observe that downstream image builds will not silently regress on Android tooling *or* on Windows tooling.
#### Scenario: Build-tools version tracks the new Flutter tag
@@ -45,6 +45,18 @@ The experience context is the CI engineer reviewing or merging the upgrade PR
- **THEN** `cue vet config/schema.cue -d '#Version' config/version.json` exits 0
- **AND** the workflow only proceeds to open the PR if validation passes
#### Scenario: Git for Windows tracks the latest published tag
- **GIVEN** `https://api.github.com/repos/git-for-windows/git/releases/latest` returns a release whose underlying Git semver is `M.m.p`
- **WHEN** the upgrade PR is created
- **THEN** `config/version.json` in the PR contains `windows.git.version == "M.m.p"`
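
Deriving `M.m.p` from the release could look like the sketch below. It assumes the release's `tag_name` has the form `v2.46.0.windows.1`; `gitVersionFromTag` is a hypothetical helper, not the workflow's actual code.

```javascript
// Hypothetical helper, assuming tags of the form "v2.46.0.windows.1".
// Strips the leading "v" and any ".windows.N" suffix, and rejects anything
// that is not a plain three-part version (for example a release candidate).
function gitVersionFromTag(tagName) {
  const match = /^v?(\d+\.\d+\.\d+)(?:\.windows\.\d+)?$/.exec(tagName);
  if (match === null) {
    throw new Error(`unexpected Git for Windows tag: ${tagName}`);
  }
  return match[1];
}
```

Throwing on an unrecognized tag shape, rather than passing it through, keeps a malformed value out of `windows.git.version` before `cue vet` ever runs.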
#### Scenario: Windows toolchain block is schema-valid
- **GIVEN** the workflow has produced a candidate `config/version.json` containing the new `windows` block
- **WHEN** the `validate_config_version` job runs
- **THEN** `cue vet` passes against the `windows` block as well as the existing `flutter` and `android` blocks
### Requirement: Schema rejects non-stable Flutter channels
`config/schema.cue` SHALL constrain `flutter.channel` to the literal `"stable"`. Any `flutter_version.json` whose channel is anything else SHALL fail `cue vet`.
@@ -0,0 +1,103 @@
# windows-version-tracking Specification
## Requirements
### Requirement: `config/version.json` declares the Windows toolchain versions
`config/version.json` SHALL contain a top-level `windows` object with the following fields, each validated by `config/schema.cue` `#Version`:
- `windows.git.version` — Git for Windows release version, three-part semver (e.g., `2.46.0`).
- `windows.vsBuildTools.cmakeProject.version` — Visual Studio CMake component version, four-part (e.g., `17.13.35919.96`).
- `windows.vsBuildTools.windows11Sdk.build` — Windows 11 SDK build number as integer (e.g., `22621`).
- `windows.vsBuildTools.vcTools.version` — Visual Studio VCTools workload version, four-part.
The experience context is the maintainer or CI engineer reading `config/version.json` to know exactly which Windows toolchain a given image tag was built against — without having to grep Dockerfiles.
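
To make the field shapes concrete, the sketch below mirrors the four constraints in JavaScript. It is an illustration only: the authoritative rules live in `config/schema.cue`, and `checkWindowsBlock` is a hypothetical helper.

```javascript
// Illustrative only: the authoritative constraints live in config/schema.cue.
const THREE_PART = /^\d+\.\d+\.\d+$/;
const FOUR_PART = /^\d+\.\d+\.\d+\.\d+$/;
const matches = (value, pattern) =>
  typeof value === "string" && pattern.test(value);

// Returns a list of human-readable problems; an empty list means the
// `windows` block has the documented shape.
function checkWindowsBlock(manifest) {
  const w = manifest.windows;
  if (w == null) {
    return ["windows: field is missing"];
  }
  const vs = w.vsBuildTools ?? {};
  const errors = [];
  if (!matches(w.git?.version, THREE_PART)) {
    errors.push("windows.git.version: expected three-part version");
  }
  if (!matches(vs.cmakeProject?.version, FOUR_PART)) {
    errors.push("windows.vsBuildTools.cmakeProject.version: expected four-part version");
  }
  if (!Number.isInteger(vs.windows11Sdk?.build)) {
    errors.push("windows.vsBuildTools.windows11Sdk.build: expected integer");
  }
  if (!matches(vs.vcTools?.version, FOUR_PART)) {
    errors.push("windows.vsBuildTools.vcTools.version: expected four-part version");
  }
  return errors;
}
```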
#### Scenario: Manifest validates against schema
- **GIVEN** the current `config/version.json` and `config/schema.cue`
- **WHEN** `cue vet config/schema.cue -d '#Version' config/version.json` runs
- **THEN** the command exits 0
#### Scenario: Missing `windows` block fails validation
- **GIVEN** a candidate `config/version.json` with no `windows` field
- **WHEN** `cue vet config/schema.cue -d '#Version' config/version.json` runs
- **THEN** the command exits non-zero with an error naming the missing `windows` field
### Requirement: `windows.Dockerfile` builds from manifest values, not hardcoded constants
`windows.Dockerfile` SHALL accept the build args `flutter_version`, `git_version`, `vs_cmake_version`, `vs_win11sdk_build`, and `vs_vctools_version`, with no default values. The `--add Microsoft.VisualStudio.Component.Windows11SDK.${vs_win11sdk_build}` invocation in the Dockerfile SHALL substitute the build-arg value, and the Git installer download SHALL use the `git_version` build arg in the URL and filename.
The experience context is the contributor changing the Windows toolchain: they edit one place (`config/version.json`), regenerate, and the build picks up the change.
#### Scenario: Build with manifest values succeeds
- **GIVEN** `config/version.json` declares the four windows version fields
- **AND** `windows.yml` passes them as `--build-arg` from the env vars exported by `setEnvironmentVariables.js`
- **WHEN** the test_windows job runs `docker build`
- **THEN** the build completes and the resulting image has VS components installed at the manifest-declared versions
#### Scenario: Build without manifest values fails fast
- **WHEN** a developer runs `docker build -f windows.Dockerfile .` without any `--build-arg`
- **THEN** the build fails on the first ARG-using step
- **AND** the error message names the missing build argument
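
The fail-fast behaviour can also be enforced before `docker build` is invoked. The sketch below is a hypothetical pre-flight helper, not part of the repository: it pairs each build arg with the env var named in this spec and refuses to proceed if one is unset. `FLUTTER_VERSION` is an assumed env var name for `flutter_version`.

```javascript
// Hypothetical pre-flight helper. The four Windows pairings follow this
// spec; FLUTTER_VERSION is an assumed name.
const BUILD_ARG_SOURCES = {
  flutter_version: "FLUTTER_VERSION", // assumption
  git_version: "GIT_VERSION",
  vs_cmake_version: "VS_CMAKE_VERSION",
  vs_win11sdk_build: "VS_WIN11SDK_BUILD",
  vs_vctools_version: "VS_VCTOOLS_VERSION",
};

// Builds the `--build-arg name=value` flags, throwing an error that names
// the first missing build argument.
function dockerBuildArgs(env) {
  const flags = [];
  for (const [arg, envName] of Object.entries(BUILD_ARG_SOURCES)) {
    const value = env[envName];
    if (value === undefined || value === "") {
      throw new Error(`missing build argument ${arg} (env ${envName} is unset)`);
    }
    flags.push("--build-arg", `${arg}=${value}`);
  }
  return flags;
}
```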
### Requirement: `setEnvironmentVariables.js` exports Windows fields as env vars
`script/setEnvironmentVariables.js` SHALL read `windows.git.version`, `windows.vsBuildTools.cmakeProject.version`, `windows.vsBuildTools.windows11Sdk.build`, and `windows.vsBuildTools.vcTools.version` from `config/version.json` and export them as `GIT_VERSION`, `VS_CMAKE_VERSION`, `VS_WIN11SDK_BUILD`, and `VS_VCTOOLS_VERSION` to the GitHub Actions environment.
The experience context is `windows.yml` and `release.yml` reading exactly the same env vars to feed `docker build` — a single point of plumbing.
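
The mapping from manifest fields to env var names can be sketched as below. Only the mapping is taken from this requirement; how the real script writes to the GitHub Actions environment is not shown here, so `windowsEnvEntries` and `toGithubEnvLines` are hypothetical names, and rendering `NAME=value` lines for the file referenced by `GITHUB_ENV` is one possible approach.

```javascript
// Hypothetical sketch of the field-to-env-var mapping.
function windowsEnvEntries(manifest) {
  const { git, vsBuildTools } = manifest.windows;
  return {
    GIT_VERSION: String(git.version),
    VS_CMAKE_VERSION: String(vsBuildTools.cmakeProject.version),
    // The SDK build is an integer in the manifest; env vars are strings.
    VS_WIN11SDK_BUILD: String(vsBuildTools.windows11Sdk.build),
    VS_VCTOOLS_VERSION: String(vsBuildTools.vcTools.version),
  };
}

// Renders NAME=value lines, the single-line format accepted by the file
// that the GITHUB_ENV variable points to.
function toGithubEnvLines(entries) {
  return Object.entries(entries)
    .map(([name, value]) => `${name}=${value}`)
    .join("\n") + "\n";
}
```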
#### Scenario: Workflow env contains the windows fields
- **WHEN** the `Read environment variables from the version manifest` step runs in any workflow
- **THEN** the workflow's environment contains `GIT_VERSION`, `VS_CMAKE_VERSION`, `VS_WIN11SDK_BUILD`, `VS_VCTOOLS_VERSION` with values matching `config/version.json`
### Requirement: Pester suite asserts exact toolchain versions
The Pester suite at `test/windows/Windows.Tests.ps1` SHALL read `config/version.json` (already copied into the test stage by `p1-fix-windows-ci-tests`) and assert that:
- `git --version` reports a version equal to `windows.git.version`,
- the `Microsoft.VisualStudio.Component.VC.CMake.Project,version=<x>` directory's `<x>` equals `windows.vsBuildTools.cmakeProject.version`,
- the `Microsoft.VisualStudio.Component.Windows11SDK.<build>` directory's `<build>` equals `windows.vsBuildTools.windows11Sdk.build`,
- the `Microsoft.VisualStudio.Workload.VCTools,version=<x>` directory's `<x>` equals `windows.vsBuildTools.vcTools.version`.
The experience context is the reviewer of an upgrade PR: any drift between the manifest the PR proposes and the image actually produced is caught as a hard test failure instead of passing silently.
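
The real assertions belong in the Pester suite (PowerShell). The extraction logic they depend on is sketched here in JavaScript purely as an illustration, assuming `git --version` prints a line such as `git version 2.46.0.windows.1` and that component directories are named as listed above. All three helper names are hypothetical.

```javascript
// Hypothetical helpers illustrating the value extraction only.

// "git version 2.46.0.windows.1" -> "2.46.0"
function parseGitVersionOutput(output) {
  const match = /^git version (\d+\.\d+\.\d+)/.exec(output.trim());
  return match ? match[1] : null;
}

// "<componentId>,version=17.13.35919.96" -> "17.13.35919.96"
function versionFromComponentDir(dirName, componentId) {
  const prefix = `${componentId},version=`;
  return dirName.startsWith(prefix) ? dirName.slice(prefix.length) : null;
}

// "Microsoft.VisualStudio.Component.Windows11SDK.22621" -> 22621
function sdkBuildFromComponentDir(dirName) {
  const match =
    /^Microsoft\.VisualStudio\.Component\.Windows11SDK\.(\d+)/.exec(dirName);
  return match ? Number(match[1]) : null;
}
```

Each helper returns `null` on an unexpected shape, so a comparison against the manifest value fails loudly instead of matching by accident.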
#### Scenario: Manifest and image agree on every Windows version
- **GIVEN** the test image was built with the build args derived from the current `config/version.json`
- **WHEN** the Pester suite runs
- **THEN** all four version assertions pass
#### Scenario: Manifest claims a version the image does not have
- **GIVEN** a PR that bumps `windows.git.version` to a version different from the build arg actually passed to the image
- **WHEN** the Pester suite runs
- **THEN** the Git version test fails with a message naming both the manifest value and the in-image value
### Requirement: Monthly upgrade PR includes Windows toolchain updates
The `update_version.yml` workflow SHALL include a job that updates the `windows` block in `config/version.json` whenever it runs. The job SHALL:
- read the latest Git for Windows release from `https://api.github.com/repos/git-for-windows/git/releases/latest` and write the resolved version to `windows.git.version`,
- write the VS BuildTools component versions from a deterministic source (see design — channel manifest or pinned config),
- emit a CUE-validated `config/version.json` that is consumed by `validate_config_version` and `update_docs_and_create_pr`.
The experience context is the maintainer who merges the monthly upgrade PR and expects both Android and Windows toolchains to be bumped in the same PR — a single review surface.
#### Scenario: Monthly run produces a Windows-aware upgrade PR
- **GIVEN** a scheduled run of `update_version.yml` where Flutter has a new stable
- **AND** Git for Windows has a new release since the last run
- **WHEN** the workflow opens its upgrade PR
- **THEN** the PR's `config/version.json` has a bumped `windows.git.version`
- **AND** the PR's `config/version.json` passes `cue vet` against `#Version`
#### Scenario: No Windows update needed in this cycle
- **GIVEN** Git for Windows and VS BuildTools component versions match what is already in `config/version.json`
- **WHEN** the upgrade PR is composed
- **THEN** the PR still opens (because Flutter changed) but the `windows` block is unchanged byte-for-byte