Cloud production runs four separate single-master+replica Dragonfly
deployments (cache, queue-dragonfly, queue-usage, pubsub-dragonfly),
not a sharded Redis Cluster topology. This is confirmed by
deploy/cloud/values + environments/production/*.values.yaml (Dragonfly
Operator with replicas=2, i.e. 1 primary + 1 read replica) and by the
dev DSN scheme 'redis://' (not 'redis-cluster://').
So a standard \Redis client suffices for the direct redis resource
(timelimit, Lock). Cloud just needs to pass _APP_REDIS_HOST/PORT/USER/
PASS through to the appwrite container — handled in the cloud PR's
docker-compose.yml change.
This reverts the resource to its original pre-PR shape. The
utopia-php/lock cluster-support PR (utopia-php/lock#1) stays open
upstream as a future-ready option if cloud ever moves to actual
Redis Cluster mode.
Per manager request, lock keys are now prefixed with the project's
internal id (sequence) so that:
- Locks are partitioned by project, which allows Redis Cluster slot
affinity if/when sharded.
- Cross-project requests can't compete on the same key for
collection-scoped resources.
- Telemetry (counter + Sentry tags) carries 'project' alongside
'target', so dashboards can filter contention by project.
Key shapes:
set: lock:platform:{project}:{collection}:{id}:{attribute}
run/orFail: lock:platform:{project}:{collection}:{id}
withKey: raw (caller-provided)
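The key shapes above can be sketched as a small builder. This is an illustrative Python sketch, not the PHP Lock class; the function name and parameters are hypothetical and only mirror the documented formats.

```python
from typing import Optional


def platform_lock_key(project: str, collection: str, doc_id: str,
                      attribute: Optional[str] = None) -> str:
    """Build a project-scoped lock key (hypothetical helper)."""
    parts = ["lock", "platform", project, collection, doc_id]
    if attribute is not None:
        # set() locks a single attribute so unrelated bumps on the same
        # document do not compete with each other.
        parts.append(attribute)
    return ":".join(parts)


# set-style key, then run/orFail-style key, for the same document
print(platform_lock_key("42", "projects", "abc", "accessedAt"))
print(platform_lock_key("42", "projects", "abc"))
```

The first call yields lock:platform:42:projects:abc:accessedAt and the second lock:platform:42:projects:abc; withKey bypasses this builder entirely.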
Lock now requires a project document at construction. All existing
call sites (4 in CE + 2 in cloud) run inside Http::init()-resolved
request scope where the project document is set, so no migration is
needed. Workers/CLI without project context can use withKey directly.
The dedicated \Redis DI resource (used by timelimit and the new Lock
class) was reading _APP_REDIS_HOST/PORT/PASS exclusively. Cloud
deployments configure cache via _APP_CONNECTIONS_CACHE URI form
(e.g. cache=redis://dragonfly:6379) and don't pass the legacy
_APP_REDIS_* vars to the appwrite container locally, so timelimit and
Lock both fail to connect everywhere except production, where Helm
separately injects the legacy vars.
The resource now prefers _APP_CONNECTIONS_CACHE when set (matching the
cache pool backend) and falls back to _APP_REDIS_* for CE-style configs.
No new env
vars introduced; both timelimit and Lock work in CE, cloud-local, and
cloud-production without compose changes.
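The preference order described above might look roughly like the following. This is a Python sketch under assumptions: the function name is hypothetical, the comma-separated "name=dsn" list format and the 6379 default port are assumed, and the real resource is PHP.

```python
from urllib.parse import urlparse


def resolve_redis_target(env: dict) -> dict:
    """Prefer the _APP_CONNECTIONS_CACHE DSN, else the legacy _APP_REDIS_* vars."""
    connections = env.get("_APP_CONNECTIONS_CACHE", "")
    if connections:
        # Assumed format: "name=dsn[,name=dsn...]"; use the first entry.
        first = connections.split(",")[0]
        _, _, dsn = first.partition("=")
        parsed = urlparse(dsn or first)
        return {
            "host": parsed.hostname or "",
            "port": parsed.port or 6379,  # assumed default port
            "user": parsed.username or "",
            "pass": parsed.password or "",
        }
    # CE-style fallback: legacy discrete variables
    return {
        "host": env.get("_APP_REDIS_HOST", ""),
        "port": int(env.get("_APP_REDIS_PORT", "6379")),
        "user": env.get("_APP_REDIS_USER", ""),
        "pass": env.get("_APP_REDIS_PASS", ""),
    }


print(resolve_redis_target({"_APP_CONNECTIONS_CACHE": "cache=redis://dragonfly:6379"}))
```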
Lock now uses Utopia\Lock\Distributed directly and owns the full
acquire/release/telemetry/error-reporting/fail-open/kill-switch logic
that previously lived in two inline DI factory closures.
Adds withKey($key, $fn, $ttl, $orFail, $waitTimeout) as a generic
escape hatch for non-platform key shapes (cache, queue, edge) and
unusual TTL/timeout requirements.
Per-attribute lock keys for set() so that an accessedAt bump and a
mcpAccessedAt bump on the same projects:{id} document don't compete.
Whole-document operations (run, runOrFail) keep document-level keys.
Removes the standalone distributedLock and distributedLockOrFail DI
factories — Lock is the single API.
request.php shrinks ~150 LOC; Lock.php grows to ~190 LOC.
Extracts the lock-key format and the lock+auth-skip+sparse-update pattern
into Appwrite\Locking\Lock with three methods:
- set(collection, id, attribute=accessedAt, value=null) — throttled
single-attribute write
- run(collection, id, fn) — generic skip-on-contention
- runOrFail(collection, id, fn) — block-then-409 for the deferred
lost-update follow-up
Migrates the 4 call sites (router projects accessedAt + 3 in shared/api)
off the raw $distributedLock callable. Raw factories stay as escape
hatches for non-platform key shapes.
The previous shape required every caller to thread `log: $log, logger: $logger`
as named args into each `distributedLock(...)` invocation, plus inject `log`
and `logger` into the surrounding action just to forward them to the lock.
Across 21 call sites this added ~100 LOC of pure plumbing.
The cause: the lock factory was registered on the global container in
`app/init/resources.php`, where per-request resources like `log` aren't
visible. That forced the factory to expose its inner closure with optional
`?Log $log = null, ?Logger $logger = null` params, which every caller had
to satisfy.
Move the lock factory + its `lockErrorReporter`/`lockTargetOf` helpers from
the global container to the per-request container (`resources/request.php`),
and add `'log'` + `'logger'` to the factory's dep list. The factory closure
now runs per-request and closes over the per-request `Log`/`Logger`. The
inner closure returned to callers no longer needs the optional params, and
call sites drop the named args entirely.
Knock-on cleanup:
- Drop `->inject('log')`, `->inject('logger')`, the corresponding action
params, and `use Utopia\Logger\{Log,Logger}` imports from 19 endpoint
files where they were only there for the lock
- Drop the same plumbing from `app/controllers/shared/api.php` (3 lock call
sites)
- Drop just the Logger plumbing from `app/controllers/general.php` (router
function + 3 callbacks); `Log` is kept because it's used elsewhere in
that file
- Net 120 LOC removed across 23 files
No behavior change: the lock factories still produce the same closures
(skip-on-contention `distributedLock`, blocking-with-409 `distributedLockOrFail`).
The static lockErrorReporter rate limiter (1 push per 60s per
`(action, target)` bucket) continues to work — it lives on a closure-static
in the helper, which is independent of where the helper is constructed.
Verified end-to-end: testConcurrentTogglesAllPersist passes 4/5 (the cold-
start race flake is the same one we've consistently seen and is orthogonal
to lock changes).
`utopia-php/lock` v0.2.0 was published this week and provides the same
Redis SET-NX-EX + Lua-compare-and-delete primitive we built locally as
`premtsd-code/lock`. Drop the dev-preview package in favor of the
official Utopia PHP library.
- composer: replace `premtsd-code/lock` with `utopia-php/lock` 0.2.*
(still via VCS — not on Packagist yet)
- resources.php: rewire both factory variants
  - `Lock + Adapter\Redis` → `Distributed`
  - `acquire()` → `tryAcquire()` for skip variant
  - `acquire(blocking: true, waitTimeout)` → `acquire($waitTimeout)` for
    OrFail variant
  - `LockAcquireException` → `\RedisException`
  - `(int) $ttl` cast, since utopia-php/lock takes seconds as int
- docker-compose: thread `_APP_LOCKING_ENABLED` into the appwrite
service environment so the kill switch documented in
`app/config/variables.php` is actually usable from `.env`
Verified end-to-end on local stack:
- positive case (locking enabled): 5/5 testConcurrentTogglesAllPersist
pass, lock keys observed in `redis-cli MONITOR` with concurrent SET
NX contention
- negative case (locking disabled): 1/3 detect lost updates as before
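For readers unfamiliar with the SET-NX-EX plus compare-and-delete primitive mentioned above, here is a minimal Python sketch. It uses a hypothetical in-memory store in place of Redis so it runs without a server; it is not the utopia-php/lock implementation, and all names are illustrative.

```python
import secrets
import time


class InMemoryStore:
    """Hypothetical stand-in for Redis: key -> (token, expires_at)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}

    def set_nx_ex(self, key, token, ttl):
        """Like SET key token NX EX ttl: succeed only if absent or expired."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False
        self._data[key] = (token, now + ttl)
        return True

    def compare_and_delete(self, key, token):
        """Like the Lua release script: delete only if we still own the key."""
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock() or entry[0] != token:
            return False
        del self._data[key]
        return True


def try_acquire(store, key, ttl):
    """Return an ownership token on success, None on contention."""
    token = secrets.token_hex(16)
    return token if store.set_nx_ex(key, token, ttl) else None


def release(store, key, token):
    """Return False if the lock expired or was taken over in the meantime."""
    return store.compare_and_delete(key, token)
```

The random token is what makes release safe: a holder whose TTL expired cannot delete a lock that another caller has since acquired.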
Remove query param fallback for impersonateEmail and impersonatePhone
to avoid PII exposure in server logs, browser history, and Referer
headers. Only impersonateUserId (an opaque internal ID) is safe to
pass via URL query param.
Allow impersonation to be specified via URL query params
(?impersonateUserId, ?impersonateEmail, ?impersonatePhone) as a
fallback to the existing headers, enabling Console to embed
impersonation in direct file/image URLs where headers cannot be set.
Lock backend errors (Redis/Dragonfly unreachable) and release errors
(TTL expired or backend dropped while held) were previously visible only
in the lock.attempts counter and Console::warning lines. They now also
push a structured Log entry through the configured logger adapter, so
operators using Sentry/Raygun/AppSignal/LogOwl get first-class events
for these specific failure modes.
Pattern matches Embeddings/Text/Create.php exactly:
- Action injects 'log' (per-request Log object) and 'logger'
(?Logger, nullable when _APP_LOGGING_CONFIG unset).
- Helper mutates the per-request $log instead of constructing a
fresh one — preserves the per-request context Embeddings expects.
- Same field set: namespace='http', server, version, type,
setMessage, setAction, setEnvironment, addTag('code', ...),
addExtra('file' / 'line' / 'trace').
- Defensive try/catch around addLog() so logging failures don't
break fail-open.
Lock-specific tags added for slicing in Sentry:
- lock.target: collection name (projects, keys, users, ...).
Bounded set, so safe as a tag in cardinality-sensitive stores.
- lock.key_pattern — full key with the trailing document ID
stripped (lock:platform:projects:* not lock:platform:projects:abc).
Prevents unbounded log cardinality from per-document IDs.
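The lock.key_pattern tag described above amounts to replacing the last key segment with a wildcard. A minimal Python sketch, assuming the document ID is always the final colon-separated segment (the function name is hypothetical):

```python
def key_pattern(key: str) -> str:
    """Replace the trailing segment (the document ID) with '*'."""
    head, sep, _ = key.rpartition(":")
    # Keys without any ':' are returned unchanged.
    return f"{head}:*" if sep else key


print(key_pattern("lock:platform:projects:abc"))
```

Every per-document key for a collection collapses to the same tag value, which keeps the number of distinct tag values bounded by the number of collections.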
Rate limiting uses in-process static buckets with a 60s window per
(action, target) combo. During a 5-minute Dragonfly outage, a fleet
of N pods produces at most N events/min per bucket per worker, well
within Sentry's dedup tolerance. Static state is per-Swoole-worker, so
the bound scales with workers per pod; coroutines may race
on the bucket boundary but the worst case is one duplicate report.
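The bucketed rate limit can be sketched as follows. This is an illustrative Python version with hypothetical names; the actual helper keeps its state in a PHP closure-static, and the injectable clock here exists only to make the sketch testable.

```python
import time


class ReportLimiter:
    """Allow at most one report per window for each (action, target) bucket."""

    def __init__(self, window: float = 60.0, clock=time.monotonic):
        self._window = window
        self._clock = clock
        self._last = {}  # (action, target) -> time of last allowed report

    def allow(self, action: str, target: str) -> bool:
        now = self._clock()
        bucket = (action, target)
        last = self._last.get(bucket)
        if last is not None and now - last < self._window:
            return False  # still inside the window: suppress the report
        self._last[bucket] = now
        return True
```

Two coroutines that both read the bucket before either writes it would each be allowed through, which is the single duplicate report noted above.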
Type level set to Log::TYPE_WARNING (not ERROR): fail-open means the
request still succeeds, so this is degraded operation, not a failed
request.
Deliberately NOT reported to Sentry:
- 409 GENERAL_RESOURCE_LOCKED (normal user-facing concurrency)
- skip-on-contention events (idempotent fan-out by design)
- acquire retry conflicts (internal loop)
- destructor cleanups (have an expected baseline rate; the
lock.attempts counter aggregates them better than Sentry would)
Factory signature change: distributedLock and distributedLockOrFail
now accept ?Log and ?Logger as optional named args at call time
(rather than capturing Logger at factory-build time). The factory
closure runs once at boot but the per-request Log resource is
fresh per request — capturing at boot would have given stale state.
Existing call sites now thread log: $log, logger: $logger. Sites that
don't (workers, CLI tasks) get null and just log to Console as
before.
Three P1 issues flagged on the initial commit:
1. Lock key in updateProjectService used "platform:project:{id}" —
missing the "lock:" namespace prefix and using singular "project"
instead of the conventional plural collection name. The factory's
`lockTargetOf` extracts segment [2] as the telemetry target, so
the broken key was emitting the project ID itself as the target
attribute (cardinality blowup, broken dashboards). Fixed to
"lock:platform:projects:{id}" matching the convention used in
shared/api.php.
2. The 409 contention exception embedded the raw Redis lock key in
its user-facing message, leaking internal collection names and
the locking namespace to API clients. Removed the custom message
so the catalog default ("The requested resource is currently
being modified...") is used. Telemetry already carries the
target collection for operator-side observability.
3. _APP_LOCKING_ENABLED variable doc had `introduction: '1.10.0'`
on a 1.9.x-targeted PR. Corrected to '1.9.3' (next 1.9.x patch).
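The telemetry effect in issue 1 can be illustrated by taking segment [2] of each key. This is a Python sketch of the described behavior, not the PHP `lockTargetOf` helper; the "unknown" fallback is an assumption.

```python
def lock_target_of(key: str) -> str:
    """Return colon-separated segment [2], used as the telemetry target."""
    parts = key.split(":")
    return parts[2] if len(parts) > 2 else "unknown"  # fallback is assumed


# Broken key: segment [2] is the document ID (unbounded cardinality)
print(lock_target_of("platform:project:abc123"))
# Fixed key: segment [2] is the collection name (bounded set)
print(lock_target_of("lock:platform:projects:abc123"))
```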
- Add APP_LIMIT_UPLOAD_CHUNK_SIZE constant (5MB) matching official SDKs
- Replace dynamic chunk calculation with fixed 5MB chunk math in all upload endpoints
- Remove -1 last-chunk sentinel that broke when last chunk arrived first
- Fix duplicate-retry guards: return existing resource instead of erroring for chunked uploads
- Add out-of-order e2e tests for Storage, Functions, and Sites
- Upgrade utopia-php/storage to 2.0.0 for device-level out-of-order assembly support
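With a fixed chunk size, the total chunk count and a chunk's position can be derived from byte offsets alone, independent of arrival order. A minimal Python sketch, assuming 5MB means 5 * 1024 * 1024 bytes, 1-based chunk numbering, and that the start offset comes from the request's Content-Range; the names are hypothetical.

```python
CHUNK_SIZE = 5 * 1024 * 1024  # assumed byte value of the 5MB constant


def total_chunks(file_size: int) -> int:
    """Number of fixed-size chunks needed for a file (ceiling division)."""
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE


def chunk_number(range_start: int) -> int:
    """1-based chunk number for a chunk starting at byte range_start."""
    return range_start // CHUNK_SIZE + 1


size = 12 * 1024 * 1024
print(total_chunks(size), chunk_number(10 * 1024 * 1024))
```

Because both values depend only on the file size and the start offset, no sentinel is needed to recognize the final chunk when it arrives first.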
Adds two DI factories and wires them where coordination is needed:
- distributedLock — skip on contention, void return. For idempotent
fan-out where N pods doing the same write is wasteful but losing
the race is correct.
- distributedLockOrFail — blocking acquire (3s default) then throws
GENERAL_RESOURCE_LOCKED (HTTP 409) on contention. For
read-modify-write on shared mutable state where a silent skip
would drop a user's change.
Both factories: _APP_LOCKING_ENABLED kill switch (set 'disabled' for
fail-open), fail-open on Redis-unreachable, and a lock.attempts
telemetry counter sliced by outcome and target collection.
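The difference between the two variants can be sketched with a local lock standing in for the Redis-backed one. This is an illustrative Python sketch: `threading.Lock` replaces the distributed lock, the exception class is a stand-in for GENERAL_RESOURCE_LOCKED, and the boolean return from the skip variant is added for demonstration only (the PHP factory returns void).

```python
import threading


class ResourceLocked(Exception):
    """Stand-in for GENERAL_RESOURCE_LOCKED (HTTP 409)."""


def run_or_skip(lock: threading.Lock, fn) -> bool:
    """Skip-on-contention: run fn only if the lock is free right now."""
    if not lock.acquire(blocking=False):
        return False  # another holder is already doing the same write
    try:
        fn()
        return True
    finally:
        lock.release()


def run_or_fail(lock: threading.Lock, fn, wait_timeout: float = 3.0):
    """Block up to wait_timeout seconds, then raise instead of skipping."""
    if not lock.acquire(timeout=wait_timeout):
        raise ResourceLocked("resource is currently being modified")
    try:
        return fn()
    finally:
        lock.release()
```

The skip variant suits idempotent writes such as accessedAt bumps, while the fail variant surfaces contention to the caller so that a read-modify-write is never silently dropped.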
Wired sites:
- shared/api.php × 3 (distributedLock): keys.accessedAt + sdks,
projects.accessedAt, users.accessedAt. Reduces redundant writes
and cache-purge fan-out under request bursts on the same project.
- Project/Services/Update.php × 1 (distributedLockOrFail): the
services map toggle. Re-reads inside the lock so the baseline
reflects concurrent updates. Two simultaneous toggles to
different services no longer lose one of them.
Lock key namespace: lock:platform:{collection}:{id}.
Dep: premtsd-code/lock pinned to a specific commit as a development
preview. Migration to utopia-php/lock is a follow-up once that
package is published.