DMehaffy 19fef31e08 chore(examples): migration performance benchmark harness + mariadb/sqlite + anti-pattern schemas (#26036)
* chore(examples): add mariadb + sqlite + podman support to complex

Extends the complex example's DB tooling to cover all Strapi-supported
database dialects and both container runtimes, as groundwork for a
migration performance benchmark harness:

- New compose.js runtime shim auto-detects podman compose / podman-compose
  / docker compose / docker-compose and the matching container CLI; all
  existing db-* scripts now go through it so podman-only environments
  work without installing docker
- New db-mariadb.js mirrors db-mysql.js using mariadb-dump / mariadb CLIs
  and adds a mariadb:11 service on port 3307 to docker-compose.dev.yml
- New db-sqlite.js handles file-based snapshot/restore/wipe/check via
  fs.copy / better-sqlite3
- db-utils.js falls back to `<runtime> ps --filter name=` for container
  lookup since podman-compose doesn't support `ps -q`
- develop-with-db.js and the v4 templates (develop-with-db.js,
  seed-with-db.js) handle mariadb + sqlite (sqlite skips compose)
- setup-v4-project.js includes better-sqlite3 in v4 deps, database.js
  template covers all 4 clients, and compose.js is copied into the
  v4 scaffold scripts dir (dep of db-utils.js)

All four DBs smoke-tested locally against podman: start/check/snapshot/
restore/wipe cycle works for mariadb; cp-based snapshot cycle works
for sqlite.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore(examples): add migration perf benchmark harness

Three new scripts enable per-migration timing and baseline-vs-candidate
comparison reports for v4→v5 migrations in the complex example:

- bench-hook.js: Node --require preload that intercepts require('umzug')
  and subscribes to Umzug's native `migrating`/`migrated` events for
  sub-ms timing. Captures every migration that runs (including dynamically
  registered ones like discard-drafts and EE-only release migrations)
  without hardcoding names. Dumps to a JSON file on process exit; self-
  disables when STRAPI_BENCH_HOOK_OUTPUT is unset.
- bench.js: orchestrator with `run`, `seed`, and `suite` subcommands.
  `run` restores a snapshot, spawns Strapi in migrate-then-exit mode
  with the hook preload, collects row counts, and writes a result JSON
  with baseline/candidate attribution, env capture (node, CPU, memory,
  DB version, host type), and config (multiplier, seed/hook modes).
  `seed` wipes the DB, runs the v4 seed via seed-with-db.js, then
  snapshots. First iteration supports --strapi-source=local only;
  experimental/pinned are stubbed with a clear error.
- bench-compare.js: takes N labels and emits both a clipboard-friendly
  markdown report (stdout + results/compare-*.md) and a self-contained
  HTML report (results/compare-*.html) with inline SVG bar charts,
  per-DB grid, sortable tables, collapsible raw JSON, and a light/dark
  adaptive theme via prefers-color-scheme. No CDN deps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore(examples): bench harness smoke-test fixes

Fixes discovered during the end-to-end smoke test on the existing 6
content types at multiplier=1:

- bench-hook.js: switch from subclass-based wrapping to in-place
  Umzug.prototype.up patching. The subclass approach replaced the module
  export at require time, but Node's module cache hands out the original
  class on subsequent requires, so listeners weren't attached on all
  instances. In-place prototype patching works for every instance
  regardless of how Umzug was imported.
- bench-hook.js: flush incrementally after each recorded migration.
  Strapi's shutdown path can bypass process.on('exit') handlers under
  some conditions (signal or explicit exit from deep inside), causing
  fully-collected timing data to be lost. Writing after each recording
  makes the benchmark resilient to any exit path.
- bench.js: compile TypeScript configs via @strapi/typescript-utils
  before createStrapi().load(). The examples/complex project has .ts
  config files; the Strapi CLI compiles them to dist/ before boot but
  our direct node -e loader skipped this, producing
  "db.config.connection undefined" failures.
- bench.js: propagate STRAPI_BENCH_HOOK_DEBUG to the Strapi child so
  debug output is visible when tracing hook behavior.
- bench-compare.js: rework the SVG chart. Dynamic label column sized
  to the longest migration name (up to 420px), 80px reserved on the
  right for value labels so they never clip, inlined monospace font
  (SVG text doesn't reliably inherit CSS variables from the surrounding
  stylesheet), and `dominant-baseline="middle"` for proper vertical
  centering.

Verified: full pipeline (setup:v4 → seed → snapshot → bench:run →
bench:compare) works against postgres at multiplier=1. Ran a baseline
vs cherry-picked PR #25988 comparison — captured all 7 v4→v5 migrations,
produced both markdown and HTML reports with correct test-setup
attribution and delta coloring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(examples): run ANALYZE before db:check to get fresh row counts

pg_stat_user_tables.n_live_tup and information_schema.tables.table_rows
are approximate and can lag behind reality by minutes or hours depending
on autovacuum / ANALYZE cadence. For a benchmark harness that publishes
row-count numbers in its reports, stale counts are misleading.

Trigger a refresh via ANALYZE (postgres) / ANALYZE TABLE per-table
(mysql/mariadb) before each db:check invocation. Best-effort on the
mysql/mariadb side — fall through to stale stats if ANALYZE fails rather
than error the whole command.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(examples): add hc-m2m-source/target anti-pattern schemas

First anti-pattern schema pair for migration benchmark stress-testing.
A high-cardinality many-to-many relation that forces the v4→v5
discard-drafts migration's copyRelationTableRows code path to span
multiple chunks (>1000 rows) — the same scenario PR #25988's caching
fixes target.

- src/api/hc-m2m-source: collection type with DP and a manyToMany
  relation to hc-m2m-target (owning side)
- src/api/hc-m2m-target: collection type with DP and the inverse
  manyToMany back to source
- setup-v4-project.js: include both in the v4 scaffold CONTENT_TYPES
- seed-v4.js: seedHcM2m() method that creates sources + targets and
  fans out 10 targets-per-source via the M2M relation. BASE counts at
  m=1 are tiny (15 pub + 5 draft per side) but at m=100 produce ~2K
  sources × ~2K targets × 10 = 20K join rows, crossing the 1000-row
  chunk boundary multiple times

Intentionally NOT a realistic content-type design — this is a
stress-test fixture. See the description in schema.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(examples): render multiplier x db matrix in bench-compare

Rework bench-compare to index results by (label, multiplier, dbEngine)
triples pulled from each result JSON's own fields, rather than parsing
labels out of filenames. Lets the same canonical baseline/candidate
label span any number of (multiplier, db) combinations and produces:

- A speedup matrix at the top: rows = multipliers, cols = databases,
  cells = "baseline -> candidate (delta%)". Missing cells render as
  "-" so partial data still produces a useful report.
- A data-availability matrix listing what ran vs what's still missing.
- Per-(db, multiplier) detail sections as collapsible details in
  HTML, all expanded in markdown.

Also:
- New flag syntax: --baseline <label> / --candidate <label>, with
  positional args kept for backward compat.
- Legacy labels that embedded the multiplier (e.g. "baseline-m100")
  are normalized to their base form ("baseline"), letting older
  result files keep working.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(examples): force TCP for mysql/mariadb CLI in containers

The mysql/mariadb CLI tools default to connecting via unix socket at
/var/run/mysqld/mysqld.sock, which isn't populated in the official
mysql:8 / mariadb:11 container images. Every invocation (check,
snapshot, restore, wipe, readiness probe, version probe) needs an
explicit -h 127.0.0.1 to force TCP via the container's loopback.

Without this fix, bench:seed and bench:run error out with
"Can't connect to local MySQL server through socket" on anything
requiring the CLI inside the container (pg_stat-style row-count
queries, snapshot restore, etc.).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* enhancement(examples): parallelize entity creation in seed-v4

Replace sequential for-loops of `await entityService.create(...)` with
a `concurrentMap(count, concurrency, taskFn)` helper that runs N tasks
in flight at once. At SEED_CONCURRENCY=5 (default), a seed that was
strictly serial now fans out into 5 parallel creates.

Concurrency chosen conservatively: Strapi v4's default knex pool is
`{min: 2, max: 10}`, and entity-heavy creates (components + DZs +
localizations) can use multiple connections per call. 5 keeps us well
under the pool ceiling. Tune via `SEED_CONCURRENCY=<n>` env var if
you've also raised the pool max.

Applied to: seedBasic, seedBasicDp, updateComponentRelations,
seedBasicDpI18n, seedRelation, seedRelationDp, seedRelationDpI18n,
seedHcM2m (all entity-creation loops plus their follow-up
self-reference update loops).

Not yet done: incremental seeding (restore previous snapshot + seed
delta) — a separate optimization tracked as a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(examples): update complex README for new bench tooling + DBs

README was documenting just the original 6-type, postgres+mysql
workflow. Updated to cover everything this branch adds:

- 8 content types (added hc-m2m-source/target anti-patterns)
- 4 supported databases (added mariadb + sqlite)
- Container runtime auto-detection (podman compose / podman-compose /
  docker compose / docker-compose) with STRAPI_BENCH_RUNTIME override
- Benchmark harness workflow (bench:seed / bench:run / bench:compare /
  bench:suite) for reviewing migration-performance PRs
- SEED_CONCURRENCY, STRAPI_BENCH_HOOK_OUTPUT, STRAPI_BENCH_HOOK_DEBUG,
  and the existing port-override env vars
- MariaDB port default 3307 to avoid colliding with MySQL on 3306

Also collapsed the redundant per-DB command sections (postgres and
mysql both had identical copy-pasted blocks) into a single
'yarn db:<op>:<db>' table since the commands are symmetric across
all four dialects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(examples): align better-sqlite3 version with monorepo convention

I picked `11.3.0` arbitrarily. Every other example and tests/app-template
use `12.8.0`, and the root yarn.lock already resolves that version.
Without alignment CI's `yarn install --immutable` fails with 'lockfile
would have been modified', cascading every subsequent job (build, pretty,
commitlint, aggregate_test_result) to red.

Bumping to `12.8.0` to match, regenerating yarn.lock.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: throw instead of return to fail fast

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ben Irvin <ben@innerdvations.com>
2026-04-27 18:28:08 +02:00

Complex Example Project

This project contains complex Strapi schemas for testing migrations between Strapi v4 and v5, plus a benchmark harness for measuring the performance of those migrations.

Content Types

The project includes 8 content types covering the feature space v4→v5 migrations touch.

Baseline feature combinations

  • basic — no draft/publish, no i18n
  • basic-dp — draft/publish
  • basic-dp-i18n — draft/publish + i18n
  • relation — relations + morphs + components + DZ
  • relation-dp — + draft/publish
  • relation-dp-i18n — + i18n

Anti-pattern stress schemas

Intentionally unrealistic; each targets a specific migration code path.

  • hc-m2m-source / hc-m2m-target — high-cardinality many-to-many. At --multiplier 100 the seed creates ~2K sources and ~2K targets; with a fanout of 10 targets per source that yields 20K+ join rows, crossing the 1000-row chunk boundary in copyRelationTableRows many times.
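
The row-count estimate above can be reproduced with a back-of-the-envelope sketch. This is not code from seed-v4.js: the constant and function names are hypothetical, and the base of 20 entries per side at multiplier 1 (15 published + 5 drafts) is an assumption.

```javascript
// Back-of-the-envelope sketch; names are hypothetical, not taken from seed-v4.js.
const BASE_PER_SIDE = 20; // assumed: 15 published + 5 draft entries at multiplier 1
const FANOUT = 10; // targets linked per source
const CHUNK_SIZE = 1000; // chunk size the text attributes to copyRelationTableRows

function estimateJoinRows(multiplier) {
  const sources = BASE_PER_SIDE * multiplier;
  const joinRows = sources * FANOUT;
  return { sources, joinRows, chunks: Math.ceil(joinRows / CHUNK_SIZE) };
}

console.log(estimateJoinRows(100));
```

At multiplier 1 the same calculation stays within a single chunk, which is why the stress path only shows up at larger multipliers.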

Supported databases

  • PostgreSQL 16 — via podman/docker container on ${POSTGRES_PORT:-5432}
  • MySQL 8 — via container on ${MYSQL_PORT:-3306}
  • MariaDB 11 — via container on ${MARIADB_PORT:-3307}
  • SQLite — file-based at ../complex-v4/.tmp/data.db (override with SQLITE_DATABASE_FILENAME)

Container runtime is auto-detected in this order: podman compose → podman-compose → docker compose → docker-compose. Override with STRAPI_BENCH_RUNTIME=podman|docker on mixed-install hosts.
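
A minimal sketch of how such ordered detection could work is shown below. It is not the actual compose.js: the names `CANDIDATES`, `probe`, and `detectRuntime` are hypothetical, and probing with the `version` subcommand is an assumption about how availability might be checked.

```javascript
const { spawnSync } = require('node:child_process');

// Candidate compose commands in the documented detection order.
const CANDIDATES = [
  { runtime: 'podman', compose: ['podman', 'compose'] },
  { runtime: 'podman', compose: ['podman-compose'] },
  { runtime: 'docker', compose: ['docker', 'compose'] },
  { runtime: 'docker', compose: ['docker-compose'] },
];

// Assumed probe: a command counts as available if `<cmd> version` exits 0.
// spawnSync does not throw for a missing binary; status is null in that case.
function probe(cmd) {
  const [bin, ...args] = cmd;
  const res = spawnSync(bin, [...args, 'version'], { stdio: 'ignore' });
  return res.status === 0;
}

// Returns the first available candidate, or null if none is installed.
// STRAPI_BENCH_RUNTIME narrows the candidates to one runtime family.
function detectRuntime(env = process.env, isAvailable = probe) {
  const forced = env.STRAPI_BENCH_RUNTIME;
  const pool = forced ? CANDIDATES.filter((c) => c.runtime === forced) : CANDIDATES;
  return pool.find((c) => isAvailable(c.compose)) ?? null;
}
```

Passing the probe as a parameter keeps the ordering logic testable without any container tooling installed.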

Migration Testing Workflow

This project includes tools for testing migrations between Strapi v4 and v5 by creating an isolated v4 project and managing database snapshots. The complex example ships its own docker-compose.dev.yml so the database containers are independent of the monorepo root.

Setup

  1. Create/Update the external v4 project:

    yarn setup:v4
    

    This creates a Strapi v4 project outside the monorepo (default: a sibling directory named complex-v4). You can override the location via V4_OUTSIDE_DIR.

  2. Install v4 deps (one-time):

    cd <path-printed-by-setup>
    yarn install
    
  3. Configure the v4 project (only if you need custom DB creds):

    cp .env.example .env
    # Edit .env as needed
    
  4. Start the v4 project:

    yarn develop:postgres    # or :mysql, :mariadb, :sqlite
    

Database Management

The same per-command pattern applies to postgres, mysql, mariadb, and sqlite:

yarn db:start:<db>                 # start the DB container (no-op for sqlite)
yarn db:stop:<db>                  # stop the DB container (no-op for sqlite)
yarn db:snapshot:<db> <name>       # snapshot current DB state
yarn db:restore:<db> <name>        # restore DB from a named snapshot
yarn db:wipe:<db>                  # drop + recreate (clean slate)
yarn db:check:<db>                 # print table row counts (runs ANALYZE first for fresh stats)
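
The statistics refresh that db:check performs can be sketched roughly as follows. The helper name `analyzeStatements` is hypothetical and the real scripts may build these statements differently; treating sqlite as needing no refresh is an assumption.

```javascript
// Hypothetical helper: SQL to refresh table statistics before row counts are read.
function analyzeStatements(db, tables = []) {
  switch (db) {
    case 'postgres':
      // A bare ANALYZE refreshes statistics for every table in the current database.
      return ['ANALYZE;'];
    case 'mysql':
    case 'mariadb':
      // These dialects need one statement per table.
      return tables.map((t) => `ANALYZE TABLE \`${t}\`;`);
    case 'sqlite':
      // Assumption: no statistics refresh is needed for the sqlite check.
      return [];
    default:
      throw new Error(`Unsupported database: ${db}`);
  }
}

console.log(analyzeStatements('mariadb', ['articles', 'authors']));
```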

Snapshots live in snapshots/ and are gitignored:

  • PostgreSQL: snapshots/postgres-<name>.sql
  • MySQL: snapshots/mysql-<name>.sql
  • MariaDB: snapshots/mariadb-<name>.sql
  • SQLite: snapshots/sqlite-<name>.db (raw file copy; fast)

Typical Migration Testing Workflow

  1. Set up the v4 project (if not already done):

    yarn setup:v4
    
  2. Wipe the database (ensures v4 format, no v5 schema):

    yarn db:wipe:postgres
    
  3. Start v4 project (in separate terminal, use the path printed by setup):

    cd <path-printed-by-setup>
    yarn develop:postgres
    

    (v4 will automatically start its database if needed)

  4. Seed test data in the v4 project:

    yarn seed
    
  5. Create snapshot:

    cd examples/complex
    yarn db:snapshot:postgres mybackup
    
  6. Stop v4 server (Ctrl+C in v4 terminal)

  7. Start v5 server with the same database:

    yarn develop:postgres
    

    Migrations will run automatically on startup.

  8. Validate migration (no HTTP server needed):

    yarn test:migration
    
  9. Test and fix bugs as needed

  10. Restore snapshot to reset database:

    yarn db:restore:postgres mybackup
    
  11. Repeat from step 7 to test fixes

Note: The database container stays running even after stopping Strapi, so you can inspect the database or run multiple tests without restarting the container. The complex example uses its own Compose project name (strapi_complex) so it does not collide with other containers.

Migration performance benchmark

For reviewing PRs that touch v4→v5 migration code, this project ships a benchmark harness that captures per-migration timings and produces baseline-vs-candidate reports across any combination of databases and multipliers.

Quick start

# One-time setup
yarn setup:v4
cd ../../complex-v4 && yarn install && cd -

# Seed data (one snapshot per DB × multiplier, kept in snapshots/)
yarn bench:seed --db postgres --multiplier 100

# Capture baseline — on develop (or whatever you're comparing against)
yarn bench:run --db postgres --multiplier 100 --label baseline

# Capture candidate — git checkout or cherry-pick the PR, rebuild, then:
yarn workspace @strapi/database run build
yarn workspace @strapi/core run build
yarn bench:run --db postgres --multiplier 100 --label pr-xxxxx

# Generate matrix comparison report
yarn bench:compare --baseline baseline --candidate pr-xxxxx

Reports land in results/:

  • compare-<timestamp>.md — clipboard-ready markdown, also echoed to stdout
  • compare-<timestamp>.html — self-contained single-file HTML with inline SVG charts, sortable tables, and light/dark theme support via prefers-color-scheme

Bench subcommands

  • yarn bench:seed --db <db> --multiplier <n> — wipe + boot v4 + seed + snapshot. One-time per (db, multiplier). Runtime scales with multiplier; at m=100 expect ~8-10 min per DB depending on hardware.
  • yarn bench:run --db <db> --multiplier <n> --label <label> — restore snapshot + spawn Strapi v5 in migrate-then-exit mode + capture per-migration timings via a Node --require preload that subscribes to Umzug's native migrating/migrated events. Emits a result JSON to results/<db>-<label>-<timestamp>.json. Typically ~15s to several minutes depending on dataset size.
  • yarn bench:compare --baseline <label> --candidate <label> — render a multiplier × database matrix plus per-cell per-migration breakdowns, to both markdown and self-contained HTML. Accepts partial data (missing cells render as "-").
  • yarn bench:suite --multiplier <n> [--dbs postgres,mysql,mariadb,sqlite] — chained bench:run across DBs for a given multiplier. Runs under whatever Strapi version is currently checked out; label via --label.
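
The timing preload used by bench:run can be illustrated with a minimal sketch of prototype patching plus event subscription. This is not bench-hook.js: `installTimingHook` and `FakeUmzug` are hypothetical names, the stand-in class replaces Umzug so the example runs without dependencies, and the `{ name }` event payload shape is an assumption.

```javascript
const { EventEmitter } = require('node:events');
const { performance } = require('node:perf_hooks');

// Patch `up` on the prototype so every instance is covered, no matter
// which require() call handed out the class.
function installTimingHook(UmzugClass, onRecord) {
  const originalUp = UmzugClass.prototype.up;
  const hooked = new WeakSet(); // instances that already have listeners

  UmzugClass.prototype.up = function patchedUp(...args) {
    if (!hooked.has(this)) {
      hooked.add(this);
      const startedAt = new Map();
      this.on('migrating', ({ name }) => startedAt.set(name, performance.now()));
      this.on('migrated', ({ name }) => {
        onRecord({ name, durationMs: performance.now() - startedAt.get(name) });
      });
    }
    return originalUp.apply(this, args);
  };
}

// Stand-in for Umzug so the sketch runs without dependencies (hypothetical).
class FakeUmzug extends EventEmitter {
  constructor(names) {
    super();
    this.names = names;
  }

  up() {
    for (const name of this.names) {
      this.emit('migrating', { name });
      this.emit('migrated', { name });
    }
    return Promise.resolve();
  }
}

const records = [];
installTimingHook(FakeUmzug, (record) => records.push(record));
new FakeUmzug(['001-first', '002-second']).up();
console.log(records.map((r) => r.name));
```

In a real preload the `onRecord` callback would be the natural place to write the collected timings to the file named by STRAPI_BENCH_HOOK_OUTPUT after each migration, so data survives any exit path.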

Workflow for reviewing a migration-perf PR

  1. On develop, seed once per (db, multiplier) you want data for.
  2. Run baselines: yarn bench:run --db <db> --multiplier <n> --label baseline.
  3. Cherry-pick the PR's commits (or gh pr checkout), rebuild @strapi/database and @strapi/core.
  4. Run candidates with the same (db, multiplier) combinations, --label pr-xxxxx.
  5. Reset cherry-pick + rebuild.
  6. yarn bench:compare --baseline baseline --candidate pr-xxxxx — paste the markdown into a PR comment; attach the zipped HTML as an upload (GitHub comments don't render .html directly).
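
The multiplier × database matrix from step 6 can be sketched as below. This is an illustration rather than the bench-compare.js implementation: `buildMatrix` and `normalizeLabel` are hypothetical names, `totalMs` is an assumed field, and only `label`, `multiplier`, and `dbEngine` are taken from the description of the result JSON.

```javascript
// Strip a legacy multiplier suffix: "baseline-m100" -> "baseline".
function normalizeLabel(label) {
  return label.replace(/-m\d+$/, '');
}

// results: [{ label, multiplier, dbEngine, totalMs }] (totalMs is an assumed field)
function buildMatrix(results, baseline, candidate) {
  const byKey = new Map();
  for (const r of results) {
    byKey.set(`${normalizeLabel(r.label)}|${r.multiplier}|${r.dbEngine}`, r.totalMs);
  }
  const multipliers = [...new Set(results.map((r) => r.multiplier))].sort((a, b) => a - b);
  const dbs = [...new Set(results.map((r) => r.dbEngine))].sort();

  // Rows are multipliers, columns are databases; missing cells render as "-".
  return multipliers.map((m) => ({
    multiplier: m,
    cells: dbs.map((db) => {
      const b = byKey.get(`${baseline}|${m}|${db}`);
      const c = byKey.get(`${candidate}|${m}|${db}`);
      if (b === undefined || c === undefined) return '-';
      const delta = ((c - b) / b) * 100;
      return `${b}ms -> ${c}ms (${delta >= 0 ? '+' : ''}${delta.toFixed(1)}%)`;
    }),
  }));
}

const sample = [
  { label: 'baseline', multiplier: 100, dbEngine: 'postgres', totalMs: 200 },
  { label: 'pr-1', multiplier: 100, dbEngine: 'postgres', totalMs: 150 },
  { label: 'baseline-m100', multiplier: 100, dbEngine: 'mysql', totalMs: 400 },
];
console.log(JSON.stringify(buildMatrix(sample, 'baseline', 'pr-1')));
```

Keying on fields inside each result, rather than on filenames, is what lets one label span several (db, multiplier) combinations.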

Snapshots are reused across bench:run invocations — you only re-seed when the schema itself changes.

Benchmark-specific env vars

  • STRAPI_BENCH_HOOK_OUTPUT=<path> — enables the timing preload (set automatically by bench.js, exposed for debugging). The hook self-disables when this isn't set, so the --require can safely live in other dev configs.
  • STRAPI_BENCH_HOOK_DEBUG=1 — verbose preload output (migration attach/record events to stderr).
  • STRAPI_BENCH_RUNTIME=podman|docker — override the auto-detected container runtime.
  • SEED_CONCURRENCY=<n> — how many entity-creation tasks run in parallel during bench:seed / seed. Default 5, which stays under Strapi v4's default knex pool of {min: 2, max: 10}. Tune up only if you've also raised the pool max.
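
A bounded-concurrency helper in the spirit of SEED_CONCURRENCY might look like the following sketch. The signature `concurrentMap(count, concurrency, taskFn)` follows the commit description; the body is an assumption about one reasonable way to implement it, not the code in seed-v4.js.

```javascript
// Runs taskFn(0), taskFn(1), ..., taskFn(count - 1) with at most
// `concurrency` tasks in flight, and returns results in index order.
async function concurrentMap(count, concurrency, taskFn) {
  const results = new Array(count);
  let next = 0;

  // Each worker pulls the next unclaimed index until none remain.
  async function worker() {
    while (next < count) {
      const index = next++;
      results[index] = await taskFn(index);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, count));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Falls back to 5 when the variable is unset or not a number.
const seedConcurrency = Number(process.env.SEED_CONCURRENCY) || 5;
```

Because each worker awaits its own task before claiming another index, the number of simultaneous creates never exceeds the configured limit, which is what keeps the seed under the connection pool ceiling.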

Development Commands

Simplified Database Commands

The easiest way to start Strapi with a specific database:

yarn develop:postgres     # PostgreSQL container + Strapi dev server
yarn develop:mysql        # MySQL container + Strapi dev server
yarn develop:mariadb      # MariaDB container + Strapi dev server
yarn develop:sqlite       # SQLite file (no container) + Strapi dev server

These commands:

  • Automatically start the database container if it's not already running (no-op for sqlite)
  • Configure Strapi to use the specified database (no manual config needed)
  • Start the Strapi development server
  • Keep the database container running when you press Ctrl+C (only Strapi stops)

Note: Default ports:

  • PostgreSQL: 5432 (override with POSTGRES_PORT)
  • MySQL: 3306 (override with MYSQL_PORT)
  • MariaDB: 3307 (override with MARIADB_PORT)

Set the override env var if you have a local DB already bound to the default port:

POSTGRES_PORT=5433 yarn develop:postgres
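
The port override behaves like a shell `${VAR:-default}` expansion. A small sketch of the same resolution in Node follows; `resolvePort` is a hypothetical helper, not a function from the project's scripts, and the validation it performs is an added assumption.

```javascript
const DEFAULT_PORTS = { postgres: 5432, mysql: 3306, mariadb: 3307 };
const PORT_ENV = { postgres: 'POSTGRES_PORT', mysql: 'MYSQL_PORT', mariadb: 'MARIADB_PORT' };

// Returns the override from the environment, or the documented default.
// Like ${VAR:-default}, an empty value is treated the same as unset.
function resolvePort(db, env = process.env) {
  if (!Object.hasOwn(DEFAULT_PORTS, db)) {
    throw new Error(`No port for database: ${db}`);
  }
  const raw = env[PORT_ENV[db]];
  if (raw === undefined || raw === '') return DEFAULT_PORTS[db];
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid ${PORT_ENV[db]}: ${raw}`);
  }
  return port;
}

console.log(resolvePort('postgres', { POSTGRES_PORT: '5433' }));
```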

Standard Strapi Commands

  • yarn develop — Start development server (defaults to PostgreSQL; requires a running DB)
  • yarn build — Build for production
  • yarn start — Start production server
  • yarn strapi — Run Strapi CLI commands

V5 Seeding (Large Dataset)

Use the v5 seeder in this project to generate a large dataset for homepage perf testing:

yarn seed:v5

You can scale the volume with a multiplier:

yarn seed:v5 -- --multiplier 20

Or:

SEED_MULTIPLIER=20 yarn seed:v5
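
The two ways of passing the multiplier could be reconciled as in this sketch. `resolveMultiplier` is a hypothetical helper; the precedence (flag over environment variable) and the default of 1 are assumptions, since the seeder's actual behavior is not described here.

```javascript
// Assumed precedence: --multiplier flag, then SEED_MULTIPLIER, then 1.
function resolveMultiplier(argv = process.argv.slice(2), env = process.env) {
  const flagIndex = argv.indexOf('--multiplier');
  if (flagIndex === -1 && env.SEED_MULTIPLIER === undefined) return 1;

  const raw = flagIndex !== -1 ? argv[flagIndex + 1] : env.SEED_MULTIPLIER;
  const value = Number(raw);
  // Rejects a missing value after the flag as well as non-numeric input.
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`Invalid multiplier: ${raw}`);
  }
  return value;
}

console.log(resolveMultiplier(['--multiplier', '20'], {}));
```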