Both were defensive against rare edge cases the reviewer flagged but
which don't justify the complexity:
- $log clone: protects against tag pollution on the per-request Log
if reportError fires AND the request later errors. http.php's
request-end handler overwrites the core fields (namespace, message,
action, etc.) anyway; only addTag/addExtra accumulate. Aligns with
Embeddings/Text/Create.php precedent which mutates $log directly.
- sdks in-lock re-read: closes a sequential-acquire stale-read race on
the sdks append-list. The race exists but the impact is bounded —
one SDK registration delayed until the next request from that SDK
fires. Self-healing on retry. The codebase already accepts this
exact race for auths, oAuthProviders, services, identities, sessions,
factors, etc. Special-casing this one site is precision the analytics
use case doesn't need.
The sdks attribute is an append-only list, not idempotent. With the
read happening outside the lock, two sequential acquirers could each
read the same stale list and overwrite each other's appends.
Now the lock body re-reads the keys document and re-derives the sdks
array from the fresh state. Skip-on-contention still drops the update
when the lock is held, but a same-SDK retry on the next request picks
the registration up.
Bounded loss only affects the rare 'first-seen' SDK request that
happens to land while the lock is held; sequential traffic from the
same SDK (or any later request from any SDK) re-attempts and writes.
Co-authored-by: greptile-apps[bot]
Cloud production runs four separate single-master+replica Dragonfly
deployments (cache, queue-dragonfly, queue-usage, pubsub-dragonfly),
not sharded Redis Cluster topology — confirmed by deploy/cloud/values
+ environments/production/*.values.yaml (Dragonfly Operator with
replicas=2 = 1 primary + 1 read replica), and by the dev DSN scheme
'redis://' (not 'redis-cluster://').
So a standard \Redis client suffices for the direct redis resource
(timelimit, Lock). Cloud just needs to pass _APP_REDIS_HOST/PORT/USER/
PASS through to the appwrite container — handled in the cloud PR's
docker-compose.yml change.
This reverts the resource to its original pre-PR shape. The
utopia-php/lock cluster-support PR (utopia-php/lock#1) stays open at
upstream as a future-ready option if cloud ever moves to actual
Redis Cluster mode.
Extracts the lock-key format and the lock+auth-skip+sparse-update pattern
into Appwrite\Locking\Lock with three methods:
- set(collection, id, attribute=accessedAt, value=null) — throttled
single-attribute write
- run(collection, id, fn) — generic skip-on-contention
- runOrFail(collection, id, fn) — block-then-409 for the deferred
lost-update follow-up
Migrates the 4 call sites (router projects accessedAt + 3 in shared/api)
off the raw $distributedLock callable. Raw factories stay as escape
hatches for non-platform key shapes.
The previous shape required every caller to thread `log: $log, logger: $logger`
as named args into each `distributedLock(...)` invocation, plus inject `log`
and `logger` into the surrounding action just to forward them to the lock.
Across 21 call sites this added ~100 LOC of pure plumbing.
The cause: the lock factory was registered on the global container in
`app/init/resources.php`, where per-request resources like `log` aren't
visible. That forced the factory to expose its inner closure with optional
`?Log $log = null, ?Logger $logger = null` params, which every caller had
to satisfy.
Move the lock factory + its `lockErrorReporter`/`lockTargetOf` helpers from
the global container to the per-request container (`resources/request.php`),
and add `'log'` + `'logger'` to the factory's dep list. The factory closure
now runs per-request and closes over the per-request `Log`/`Logger`. Inner
closure returned to callers no longer needs the optional params, and call
sites drop the named args entirely.
Knock-on cleanup:
- Drop `->inject('log')`, `->inject('logger')`, the corresponding action
params, and `use Utopia\Logger\{Log,Logger}` imports from 19 endpoint
files where they were only there for the lock
- Drop the same plumbing from `app/controllers/shared/api.php` (3 lock call
sites)
- Drop just the Logger plumbing from `app/controllers/general.php` (router
function + 3 callbacks); `Log` is kept because it's used elsewhere in
that file
- Net 120 LOC removed across 23 files
No behavior change: the lock factories still produce the same closures
(skip-on-contention `distributedLock`, blocking-with-409 `distributedLockOrFail`).
The static lockErrorReporter rate limiter (1 push per 60s per
`(action, target)` bucket) continues to work — it lives on a closure-static
in the helper, which is independent of where the helper is constructed.
Verified end-to-end: testConcurrentTogglesAllPersist passes 4/5 (the cold-
start race flake is the same one we've consistently seen and is orthogonal
to lock changes).
Every request that arrives via a custom-domain rule (router path) reads
the project's `accessedAt` timestamp and, if the throttle window
(`APP_PROJECT_ACCESS`) has elapsed, writes a fresh value. With concurrent
traffic across multiple pods, this is a per-row hot RMW that loses
updates silently — the surviving timestamp depends on which pod's write
landed last.
Wrap the read-modify-write in `distributedLock('lock:platform:projects:{id}')`
(skip-on-contention variant). Every concurrent pod would write the same
throttled value, so losing the race is correct: the winner's update covers
ours.
Wires `distributedLock` and `?Logger` through:
- `router()` function signature (app/controllers/general.php:70)
- the three Http::init / Http::get callbacks that invoke router():
`*` catch-all (init), `/robots.txt`, `/humans.txt`
Two related cloud-only RMW sites (`teams.accessedAt`,
`projects.mcpAccessedAt`) live in `appwrite-labs/cloud` and need a
follow-up PR there. They depend on this branch reaching 1.9.x so the
`distributedLock` DI resource is available downstream.
Lock backend errors (Redis/Dragonfly unreachable) and release errors
(TTL expired or backend dropped while held) were previously visible only
in the lock.attempts counter and Console::warning lines. They now also
push a structured Log entry through the configured logger adapter, so
operators using Sentry/Raygun/AppSignal/LogOwl get first-class events
for these specific failure modes.
Pattern matches Embeddings/Text/Create.php exactly:
- Action injects 'log' (per-request Log object) and 'logger'
(?Logger, nullable when _APP_LOGGING_CONFIG unset).
- Helper mutates the per-request $log instead of constructing a
fresh one — preserves the per-request context Embeddings expects.
- Same field set: namespace='http', server, version, type,
setMessage, setAction, setEnvironment, addTag('code', ...),
addExtra('file' / 'line' / 'trace').
- Defensive try/catch around addLog() so logging failures don't
break fail-open.
Lock-specific tags added for slicing in Sentry:
- lock.target — collection name (projects, keys, users, ...).
Bounded set, safe for high-cardinality stores.
- lock.key_pattern — full key with the trailing document ID
stripped (lock:platform:projects:* not lock:platform:projects:abc).
Prevents unbounded log cardinality from per-document IDs.
Rate limiting via per-pod static buckets, 60s window per
(action, target) combo. During a 5-minute Dragonfly outage, a fleet
of N pods produces at most N events/min, well within Sentry's dedup
tolerance. Static state is per-Swoole-worker; coroutines may race
on the bucket boundary but the worst case is one duplicate report.
Type level set to Log::TYPE_WARNING (not ERROR): fail-open means the
request still succeeds, so this is degraded operation, not a failed
request.
Deliberately NOT reported to Sentry:
- 409 GENERAL_RESOURCE_LOCKED (normal user-facing concurrency)
- skip-on-contention events (idempotent fan-out by design)
- acquire retry conflicts (internal loop)
- destructor cleanups (have an expected baseline rate; the
lock.attempts counter aggregates them better than Sentry would)
Factory signature change: distributedLock and distributedLockOrFail
now accept ?Log and ?Logger as optional named args at call time
(rather than capturing Logger at factory-build time). The factory
closure runs once at boot but the per-request Log resource is
fresh per request — capturing at boot would have given stale state.
Existing call sites threaded log: $log, logger: $logger. Sites that
don't (workers, CLI tasks) get null and just log to Console as
before.
Adds two DI factories and wires them where coordination is needed:
- distributedLock — skip on contention, void return. For idempotent
fan-out where N pods doing the same write is wasteful but losing
the race is correct.
- distributedLockOrFail — blocking acquire (3s default) then throws
GENERAL_RESOURCE_LOCKED (HTTP 409) on contention. For
read-modify-write on shared mutable state where a silent skip
would drop a user's change.
Both factories: _APP_LOCKING_ENABLED kill switch (set 'disabled' for
fail-open), fail-open on Redis-unreachable, and a lock.attempts
telemetry counter sliced by outcome and target collection.
Wired sites:
- shared/api.php × 3 (distributedLock): keys.accessedAt + sdks,
projects.accessedAt, users.accessedAt. Reduces redundant writes
and cache-purge fan-out under request bursts on the same project.
- Project/Services/Update.php × 1 (distributedLockOrFail): the
services map toggle. Re-reads inside the lock so the baseline
reflects concurrent updates. Two simultaneous toggles to
different services no longer lose one of them.
Lock key namespace: lock:platform:{collection}:{id}.
Dep: premtsd-code/lock pinned to a specific commit as a development
preview. Migration to utopia-php/lock is a follow-up once that
package is published.
Per review feedback on the PHPStan cleanup, the two `if
($executionsRetentionCount > 0 && ENABLE_EXECUTIONS_LIMIT_ON_ROUTE)`
blocks in `app/controllers/general.php` and
`src/Appwrite/Platform/Modules/Functions/Http/Executions/Create.php`
were load-bearing feature flags, not dead code. Removing them silently
dropped the ability to turn the cleanup on later.
Changes:
- Convert `ENABLE_EXECUTIONS_LIMIT_ON_ROUTE` from
`const ... = false;` to a `define()` backed by the new
`_APP_EXECUTIONS_LIMIT_ON_ROUTE` env var (defaults to `disabled`).
PHPStan can no longer fold the `&&` away since the value is now
runtime-resolved, so the guarded blocks are live again.
- Restore the `/* cleanup */` block in the `router()` helper in
`app/controllers/general.php`.
- Restore the two cleanup blocks in `Functions/Http/Executions/Create.php`
(one on the async-scheduled return path, one on the sync-response
path), and re-add the `DeleteEvent $queueForDeletes` /
`int $executionsRetentionCount` injections plus the
`Appwrite\Event\Delete` import.
Runtime behavior is identical to main (flag off by default); operators
can now flip it via env without a code change.
Three follow-ups from CI that the level-4 pass got wrong:
1. `account.php` / `users.php`: `Document::find()` returns `mixed`
(specifically `Document|false` in practice), not `Document`. The
earlier `@var Document $oldTarget` docblocks were lies, and the
runtime `instanceof Document` guards were load-bearing — removing
them caused `Call to a member function isEmpty() on false` 500s
on the `PATCH /v1/users/:id/email` and `/phone` endpoints (and the
analogous `/v1/account/email`, `/v1/account/phone` flows). Dropped
the misleading `@var` docblocks and restored
`$oldTarget instanceof Document && !$oldTarget->isEmpty()`.
2. `Installer/Runtime/Config::setEnabledDatabases()` is a boundary
that actually takes arbitrary user/compose input — not a trusted
`string[]`. The `is_string($v)` filter was covering for that, and
`ConfigTest::testSetEnabledDatabasesFiltersInvalid` explicitly
asserts it. Widened the PHPDoc to `array<mixed>` and restored
`is_string($v) && $v !== ''` in the filter.
3. `OAuth2/Apple::getAppSecret()` wrapped `json_decode` in a
`try/catch (\Throwable)` — but `json_decode` without
`JSON_THROW_ON_ERROR` returns `null` on failure, it doesn't throw.
PHP 8.3's PHPStan flagged the catch as dead (PHP 8.5 didn't, which
is why it slipped through locally). Replaced with
`if (!\is_array($secret)) throw`, which preserves the original
"invalid secret" guard.
Raises `phpstan.neon` level from 3 to 4 and fixes the 549 new errors
that level 4 surfaces across 157 files. Fixes are root-cause — no
`@phpstan-ignore`, no `@var` casts, no baseline entries, no widened
types. A handful of latent bugs were fixed along the way:
- `app/controllers/general.php`: path-traversal guard was negating
`\substr(...)` before the strict comparison (`!\substr(...) === $base`
was always `false === $base`). Rewritten as `\substr(...) !== $base`.
- `src/Appwrite/Platform/Modules/Databases/Http/Databases/Logs/XList.php`
and `.../TablesDB/Logs/XList.php`: were importing the raw Matomo
`DeviceDetector` (whose `getDevice()` returns `?int`) but treating the
result as an array with `deviceName/deviceBrand/deviceModel` keys.
Swapped to `Appwrite\Detector\Detector`, matching the wrapper already
used a few lines below for `$os`/`$client`.
- `src/Appwrite/Platform/Modules/Functions/Workers/Builds.php`: a match
key was checking `$resourceKey === 'functions'` when `$resourceKey`
is `'functionId'|'siteId'` — always false. Switched to the intended
`$resource->getCollection() === 'functions'` check.
- `src/Appwrite/OpenSSL/OpenSSL.php`: `encrypt()` return type tightened
to `string|false` to match `openssl_encrypt`; this lets callers'
`=== false` error handling remain meaningful.
- `app/controllers/api/messaging.php`: removed a dead
`array_key_exists('from', [])` branch in the Msg91 provider (empty
array literal; branch was unreachable).
Large cleanup categories across the 549 fixes:
- Removed redundant `?? default` on array offsets and expressions that
PHPStan now knows are non-nullable.
- Removed unreachable statements (mostly `return;` after `throw` or
`markTestSkipped()`).
- Removed redundant `is_array`/`is_string`/`is_bool`/`instanceof` checks
on already-narrowed types.
- Added `default =>` arms (or throwing arms) to non-exhaustive matches
on `string`/`mixed` input.
- Removed dead `$document === false` branches where method return types
were tightened to non-nullable `Document`.
- Removed unused properties (`$version` on Etsy/Zoom OAuth2, `$paths` on
Installer State, `$source` on MigrationsWorker, `$account2` on two
GraphQL auth tests), unused traits (`ApiVectorsDB`, `DatabaseFixture`),
and an unused `cleanupStaleExecutions` task method.
- Replaced `assertTrue(true)` and redundant `assertIsArray`/`assertIsString`/
`assertNotNull` assertions with `addToAssertionCount(1)` or
`assertNotEmpty` where the runtime type was already known.
Cache write hook now checks HTTP status code before writing to prevent
failed AVIF (or any other) conversions from poisoning the cache.
Bumps utopia-php/image to 0.8.5 which fixes AVIF/HEIC output by using
native Imagick instead of the deprecated magick convert shell command.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>