Caddy does not set X-Forwarded-Host by default. Without it, MAS cannot
verify the request host when building OAuth2 redirect URIs, causing the
login "Continue" button to do nothing.
Added header_up Host and X-Forwarded-Host to all MAS reverse_proxy
blocks in both local and production Caddyfile generation.
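For illustration, a minimal sketch of the kind of block the generator could emit. The helper name emit_mas_proxy_block and the upstream mas:8080 are assumptions made for this example, not taken from deploy.sh; {host} is Caddy's placeholder for the request host.

```shell
# Hypothetical helper: print a reverse_proxy block that forwards the
# original request host to MAS. The upstream address is an assumption.
emit_mas_proxy_block() {
    local upstream="$1"
    # Unquoted heredoc: ${upstream} is expanded, {host} is left for Caddy
    cat <<EOF
reverse_proxy ${upstream} {
    header_up Host {host}
    header_up X-Forwarded-Host {host}
}
EOF
}

emit_mas_proxy_block "mas:8080"
```

Keeping the block in one helper means the local and production generators cannot drift apart on these two headers.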
Fixes #16
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Caddy admin API: bind to localhost:2019 instead of 0.0.0.0:2019 (local + production)
- Production Caddyfile: block /_synapse/admin* with 403 (not needed publicly)
- homeserver.yaml: explicitly set allow_public_rooms_without_auth/over_federation to false
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace handle_path /account/* with handle /account/* in both local and
production Caddyfile templates. handle_path was stripping the /account/
prefix before proxying to MAS, causing MAS to serve the root landing page
instead of the account management SPA — making it impossible to edit
account data.
- Fix Caddy CA cert extraction: use sudo cp/chmod (caddy/data/ is root-owned
via Docker) and replace the one-shot sleep check with a retry loop (24×5s)
that triggers curl on each iteration to prompt Caddy to generate the cert.
Previously the cert copy silently failed, leaving MAS in a crash loop.
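The retry loop described above could look roughly like this. It is a sketch only: the helper name retry and the commented usage line are hypothetical, and the actual curl target and certificate paths used by deploy.sh are not reproduced here.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 if all fail.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still to come
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# 24 attempts, 5 s apart, as in the commit above. try_copy_ca_cert is a
# placeholder for "curl the site, then sudo cp/chmod the cert":
# retry 24 5 try_copy_ca_cert || echo "Caddy CA cert not available" >&2
```

Returning a non-zero status when every attempt fails lets the caller stop with a clear error instead of continuing with a missing certificate.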
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `adminapi` resource to MAS HTTP listener in deploy.sh and quickstart.sh.
Without it, MAS never served /api/admin/v1/... causing Element Admin to always
throw TypeError: Failed to fetch. Updated docs and added regression test.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
deploy.sh now generates appservices/doublepuppet.yaml with fresh random
tokens on every run and registers it in homeserver.yaml under
app_service_config_files. The user regex is scoped to SERVER_NAME so it
works correctly in both TLD and subdomain identity modes.
test_deploy.sh: add 7 assertions per scenario (14 total) verifying the
file exists, tokens are present, and homeserver.yaml references it.
All 66 tests pass.
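A rough sketch of how such a registration file can be generated. The function names, token length, and exact field layout are assumptions for illustration (the layout follows the usual double-puppet appservice shape); deploy.sh may differ in detail.

```shell
# Hypothetical helper: 32 random bytes rendered as 64 hex characters.
random_token() {
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: write a double-puppet registration scoped to one
# server name. Arguments: <server_name> <output_file>.
write_doublepuppet() {
    local server_name="$1" out="$2"
    local escaped
    # Escape dots so the server name matches literally inside the regex
    escaped=$(printf '%s' "$server_name" | sed 's/\./\\./g')
    cat > "$out" <<EOF
id: doublepuppet
url: null
as_token: $(random_token)
hs_token: $(random_token)
sender_localpart: $(random_token)
rate_limited: false
namespaces:
  users:
    - regex: '@.*:${escaped}'
      exclusive: false
EOF
}

# Example: write_doublepuppet "example.com" appservices/doublepuppet.yaml
```

Because the regex is built from SERVER_NAME rather than the Matrix hostname, the same code covers both identity modes.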
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- deploy.sh: add SERVER_NAME prompt so users can choose @user:example.com
(TLD) vs @user:matrix.example.com (subdomain); wire SERVER_NAME through
.env, MAS config, Element config, Synapse init, and both Caddyfiles
- deploy.sh: add identity-domain well-known delegation block to local and
production Caddyfiles when SERVER_NAME != MATRIX_DOMAIN
- deploy.sh: remove -it flag from synapse docker run (non-interactive);
fix synapse/data ownership (uid 991) around homeserver.yaml modifications
- test_deploy.sh: new integration test suite — two scenarios (TLD + subdomain),
config-file assertions, live endpoint checks, automatic teardown; 52/52 passing
- .gitlab-ci.yml: new CI pipeline with full (25 min) and config-only (12 min) jobs
- .gitignore: add caddy/Caddyfile (now generated); remove both Caddyfiles from tracking
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All image references in docker-compose.yml and docker-compose.local.yml
are replaced with ${IMAGE_VAR:-default} env var syntax, so compose files
work standalone without .env while deploy.sh writes resolved image paths.
deploy.sh gains two new prompts:
- Custom registry prefix (prepended to all image names)
- Hardened images from dhi.io for Redis/PostgreSQL/Caddy (takes priority
over custom registry for those three)
Compared to PR #10: interactive prompts instead of hardcoded vars,
no sed-based compose file mutation, all 14 images covered (PR #10 missed
element-admin, element-call, lk-jwt-service, and all 3 bridges), and
standalone compose usage is preserved via :-default fallbacks.
SETUP.md and README.md document the feature including a note on
pull-through cache registries (Harbor/Artifactory/Nexus) that require
the full docker.io/ path prefix in image names.
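As a sketch of the two mechanisms involved: the :-default fallback behaves the same in shell as in Compose interpolation, and a registry prefix is simply prepended to the image name. The helper name with_registry_prefix, the variable REDIS_IMAGE, and the registry and tag shown are illustrative assumptions, not the names used in deploy.sh.

```shell
# Hypothetical helper: prepend an optional registry prefix to an image name.
with_registry_prefix() {
    local prefix="$1" image="$2"
    if [ -n "$prefix" ]; then
        # Drop a trailing slash so the two parts join with exactly one
        printf '%s/%s\n' "${prefix%/}" "$image"
    else
        printf '%s\n' "$image"
    fi
}

# Same fallback rule as the compose files: an unset or empty variable
# yields the default. REDIS_IMAGE and the tag are illustrative only.
REDIS_IMAGE="${REDIS_IMAGE:-redis:7}"
with_registry_prefix "registry.example.com" "$REDIS_IMAGE"
```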
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Stripped-down alternative to deploy.sh aimed at users with little
hosting experience. Asks three questions (domain, Let's Encrypt email,
Element Call yes/no), generates all secrets and configs automatically,
and starts the full stack with a single command.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously Element Call used call.element.io for the frontend JS while
routing media through a local LiveKit SFU. This adds the self-hosted
Element Call frontend so no requests go to Element's CDN at all.
Changes:
- docker-compose.yml: add element-call service (ghcr.io/element-hq/element-call)
under the element-call profile, port 8083:8080. No config required — the app
reads the Matrix homeserver from URL params passed by Element Web, and gets
the livekit_service_url from the homeserver's .well-known/matrix/client.
- caddy/Caddyfile: add call.example.test:443 virtual host for local dev.
- deploy.sh: add CALL_DOMAIN variable (call.example.test / call.<domain>),
prompt for call subdomain in production mode, write CALL_DOMAIN to .env,
update element_call.url from call.element.io to the self-hosted CALL_DOMAIN
in all three places (element/config.json generator and both Caddyfile
inline JSON strings), add Caddy blocks for CALL_DOMAIN in local and
production Caddyfile generation, add CALL_DOMAIN to /etc/hosts hint,
update summary output.
- README.md: rewrite to reflect actual project state — deploy.sh workflow,
all optional components, correct architecture diagram, bridge setup, ports.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Production deployment revealed two bugs and one documentation gap:
1. element-admin container port changed to 8080 in recent image versions
(was 80). Update docker-compose.yml port mapping 8091:80 → 8091:8080,
and Caddy reverse_proxy targets in deploy.sh and caddy/Caddyfile.
2. element-admin requires SERVER_NAME, OIDC_CLIENT_ID, OIDC_ISSUER env vars
to function. Add them to the docker-compose.yml service definition using
the stack's existing MATRIX_DOMAIN, AUTH_DOMAIN variables and the
corrected MAS client ID 01ADMN00000000000000000000.
3. Document Caddy inline-JSON single-line requirement: if a `respond` body
containing JSON is manually edited and an editor wraps the line, Caddy
refuses to start with "invalid control character in string". Add warning
comments to both affected respond blocks in caddy/Caddyfile.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
setup-bridges.sh: complete rewrite and bug fixes:
- Fix YAML ordering bug: bare app_service_config_files: key was added then
immediately removed before entries were appended, leaving orphaned list
items and breaking Synapse YAML parsing. Rewrite as atomic teardown+rebuild.
- Fix appservice section accumulation: remove old comment line on each run
so comments do not pile up across re-runs.
- Fix bridge listen address: megabridges default to hostname 127.0.0.1,
which prevents Synapse (different container) from pinging back. Add sed
to set hostname to 0.0.0.0 for both WhatsApp and Signal.
- Fix missing chmod: registration.yaml files created by bridge containers
are root:root 600; Synapse cannot read them. Add chmod 644 after wait loop.
- Fix sed ordering: all removals now run before the fresh section is written.
- Conditionalize Telegram on TELEGRAM_API_ID/HASH presence in .env.
- Fix WhatsApp/Signal sed patterns: correct 4-space megabridge indentation,
add missing homeserver address update (example.localhost→synapse:8008),
replace double_puppet placeholder directly instead of inserting new block,
remove broken encryption range sed (allow_key_sharing field not present).
Tested: WhatsApp and Signal bridges start cleanly, connect to Synapse,
register bot users, migrate databases, and reach UNCONFIGURED state
(waiting for user logins) without restart loops.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
homeserver.yaml contains a commented-out '# app_service_config_files:' line
(from the template). The unanchored grep matched it, causing the script to
skip adding the real key — resulting in appservice list entries being appended
directly into the preceding rc_delayed_event_mgmt block (invalid YAML).
Fix: anchor grep with ^ so only an actual key at column 0 matches.
Discovered during local integration test of setup-bridges.sh.
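The difference between the two patterns can be sketched as follows. The helper name has_appservice_key is hypothetical; the anchored pattern is the fix described above.

```shell
# Returns 0 only when a real top-level key exists. A commented-out
# '# app_service_config_files:' line does not start at column 0 with
# the key name, so the ^ anchor rejects it.
has_appservice_key() {
    grep -q '^app_service_config_files:' "$1"
}

# Typical use (file path is illustrative):
# if ! has_appservice_key synapse/data/homeserver.yaml; then
#     printf 'app_service_config_files:\n' >> synapse/data/homeserver.yaml
# fi
```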
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Synapse continuously retries HTTP transactions to any appservice with a
non-null URL. Setting url: null (not url: "") tells Synapse this appservice
has no HTTP endpoint, stopping the retry loop.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix invalid MAS admin client ID: 01ADMIN000000000000000000 → 01ADMN00000000000000000000
(was 25 chars and contained 'I', both invalid for Crockford base32/ULID)
- Add Element Call (LiveKit SFU) as an optional feature prompted in deploy.sh
- Adds livekit + lk-jwt-service services with element-call Docker Compose profile
- Generates livekit/livekit.yaml at deploy time (excluded from git, contains secrets)
- Conditionally adds MSC3266/4222/4140 to Synapse experimental_features
- Conditionally adds element_call block to Element Web config and rtc_foci to well-known
- Adds Caddy routing for /livekit/jwt and /livekit/sfu under RTC domain
- Fix Synapse MAS config: migrate from removed experimental_features.msc3861 to
stable matrix_authentication_service: block (Synapse 1.137+)
- Fix Docker Compose v5 path resolution: add --project-directory . for local mode
- Fix cleanup to use correct compose project scope
Tested locally: core stack (no Element Call) and full stack (with Element Call) both pass.
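The client ID rule from the first item above can be checked mechanically. This is a sketch with a hypothetical helper name; it tests only length and alphabet, not the further ULID constraint on the leading timestamp character.

```shell
# Checks that an ID is exactly 26 characters drawn from Crockford base32:
# digits and uppercase letters, excluding I, L, O and U.
is_valid_ulid() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[0-9A-HJKMNP-TV-Z]{26}$'
}

is_valid_ulid "01ADMN00000000000000000000" && echo "new ID accepted"
is_valid_ulid "01ADMIN000000000000000000" || echo "old ID rejected"
```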
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
official generate command
Fixes #4
Replace manual homeserver.yaml template copying with the official Synapse
docker generate command. This resolves permission errors and missing
configuration files that prevented Synapse from starting.
Changes:
- Step 4: Use docker generate command instead of template copying
- Step 9a: Remove synapse/data from mkdir (created by generate command)
- Update all path references from synapse/config to synapse/data
- Update backup section to reflect correct directory structure
- Fix incorrect line number reference for database section
The generate command creates homeserver.yaml with correct permissions
(UID 991) and generates required signing keys automatically.
Addresses three breaking changes in MAS v1.8.0:
1. Database separation (wlphi/ess-docker-compose#1)
- Update mas-config.yaml to use dedicated 'mas' database
- Database is created by postgres/init/01-init-databases.sql
2. Signing key format (wlphi/ess-docker-compose#2)
- Replace hex string keys with EC private keys
- Update key generation: openssl ecparam -name prime256v1 -genkey
- Add reference to official MAS signing keys documentation
- Update templates and docs to reflect new key format
3. CLI syntax change (wlphi/ess-docker-compose#3)
- Update register-user command to use positional username
- Old: --username admin
- New: admin (positional argument)
Problem:
- WhatsApp bridge cannot respond to messages in encrypted Matrix rooms
- Root cause: Synapse 1.144.0 has incomplete MSC4190 implementation with MAS
- Bridge encryption (MSC4190) is incompatible with MAS authentication
Solution:
- Implement double puppet appservices for better message attribution
- Disable bridge encryption (use unencrypted Matrix rooms)
- Configure all bridges with proper encryption flags for future compatibility
Changes:
- Add templates/doublepuppet.yaml: Appservice template for double puppet
- Update setup-bridges.sh: Automated setup with double puppet + encryption config
- Update BRIDGE_SETUP_GUIDE.md: Comprehensive double puppet setup guide
- Update templates/homeserver.yaml: Add MSC flags and appservice section
- Update docker-compose files: Mount appservices directory in Synapse
Technical details:
- Double puppet allows bridges to send as actual user (not bot)
- Encryption disabled: allow: false, msc4190: false
- Future-ready: MSC flags configured for when Synapse fixes compatibility
- All changes tested and validated with simulated bridge configs
Users can now run ./setup-bridges.sh to automatically configure bridges
with working message attribution in unencrypted rooms.
Changes:
- Add clear warning at top: MAS is used, encryption is disabled
- Update all bridge config examples to show encryption: allow: false
- Remove misleading MSC4190/MSC3202 instructions for registration.yaml
- Make it clear this is a MAS-first setup where encryption won't work
This ensures users understand upfront that:
- MAS is required for this deployment
- Bridge encryption is incompatible with MAS
- All configs reflect the non-encrypted bridge setup
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major improvements:
- Add Element Admin service for user/room management
- Upgrade to stable MAS integration (Synapse 1.136+)
- Add QR code login support (MSC4108) with rendezvous endpoints
- Fix CORS header duplication by stripping backend headers
- Document known MAS + encrypted bridge compatibility issue
- Add bridge encryption configuration examples (MSC3202/MSC4190)
- Update deploy.sh to support Element Admin domain configuration
- Remove stale bridge registrations during deployment
Breaking changes:
- MSC3861 experimental config replaced with stable matrix_authentication_service
- Synapse OIDC client no longer needed in MAS config
Known issues:
- Encrypted bridges incompatible with MAS (appservice login not supported)
- QR code login may fail with NotImplementedError on some setups
- Workaround: Disable encryption in bridge configs until MAS adds appservice support
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Change 'expose' to 'ports' for Synapse (8008, 8448)
- Change 'expose' to 'ports' for MAS (8080, 8081)
- Change 'expose' to 'ports' for Element (80->8090)
- Allows external reverse proxies to reach services
- Fixes 502 errors in multi-machine deployments
Without this, ports are only accessible within Docker network
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Only remove postgres/data (fixes password mismatch)
- Preserve synapse/data/homeserver.yaml (keeps mail config, etc.)
- Preserve mas/config (keeps CLIENT_SECRET for Authelia)
- Stop containers before cleanup to prevent database locks
- Remove only regenerable data (mas/data, certs, bridges)
Fixes Issue #9 without destroying user customizations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove all -f docker-compose.local.yml flags
- Use default docker-compose.yml instead
- Fixes 'no such file or directory' error after compose file restructuring
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Removed Prompts:
- Matrix server IP (3 prompts removed)
- Authelia server IP
- Let's Encrypt email
- "Is this correct?" confirmation
Why Removed:
- Multi-machine deployments access services via domains, not IPs
- Caddy runs on separate machine with its own Let's Encrypt config
- User copies generated Authelia configs to separate servers
- IP/email values were only used in Caddyfile template generation
- MULTI_MACHINE_CONFIG_SNIPPETS.md provides the actual deployment guide
Changes:
1. Set placeholder values automatically:
- MATRIX_SERVER_IP=10.0.1.10
- AUTHELIA_SERVER_IP=10.0.1.20
- LETSENCRYPT_EMAIL=admin@{domain}
2. Simplified configuration summary (no IPs shown)
3. Added helpful note pointing to MULTI_MACHINE_CONFIG_SNIPPETS.md
4. Updated .env comments to indicate these are placeholders
Result:
- Faster deployment: 12 prompts → 7 prompts
- Domain prompts preserved (user requested)
- Still generates Caddyfile template with placeholder IPs
- User updates actual IPs manually on Caddy server
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changes:
- Renamed docker-compose.production.yml → docker-compose.yml (main config)
- Moved unused compose files to compose-variants/ folder:
- docker-compose.local.yml → compose-variants/
- docker-compose.authelia.yml → compose-variants/
- docker-compose.caddy.yml → compose-variants/
- docker-compose.yml (old) → compose-variants/docker-compose.old.yml
- Added compose-variants/README.md explaining the variants
Benefits:
- Default command now works: docker compose up -d (no -f flag needed)
- Cleaner project root directory
- Clear separation between active config and variants
- Multi-machine deployment is the default mode
Updated Documentation:
- MULTI_MACHINE_CONFIG_SNIPPETS.md: Removed -f flags from all commands
- README.md: Updated deploy commands to use simplified syntax
- All commands now use: docker compose up -d
Deployment Modes (from docker-compose.yml):
1. Multi-machine (default):
docker compose up -d
→ Starts: Synapse, MAS, Element, PostgreSQL only
2. Single-machine with Authelia:
docker compose --profile single-machine --profile authelia up -d
→ Starts everything including Caddy and Authelia
3. Single-machine without Authelia:
docker compose --profile single-machine up -d
→ Starts everything with Caddy, no Authelia
This makes the default behavior match the multi-machine architecture
where Caddy and Authelia run on separate servers.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Production deployments now use HTTPS domain for Authelia discovery URL
instead of internal Docker service name. This ensures OAuth2 validation
works correctly with Let's Encrypt certificates and multi-machine setups.
Changes:
- Production: discovery_url uses https://${AUTHELIA_DOMAIN}
- Local: discovery_url uses http://authelia:9091 (avoids self-signed cert issues)
Technical Details:
- OAuth2 spec requires consistent issuer and discovery URLs
- Multi-machine deployments need public HTTPS domains, not internal service names
- Local deployments continue using internal HTTP for container-to-container communication
Validated:
- Logic tested with production variables ✓
- Logic tested with local variables ✓
- Generated configs verified ✓
Fixes OAuth issuer URL consistency for production with separate Authelia servers.
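The selection logic can be sketched as below. The function name and the mode strings "production" and "local" are assumptions for this example; the two URLs are the ones listed under Changes.

```shell
# Pick the Authelia discovery URL for a deployment mode.
# Arguments: <mode> <authelia_domain>. Mode names are illustrative.
authelia_discovery_url() {
    local mode="$1" authelia_domain="$2"
    if [ "$mode" = "production" ]; then
        # Public HTTPS domain, reachable from separate machines
        printf 'https://%s\n' "$authelia_domain"
    else
        # Internal HTTP between containers, avoids self-signed cert issues
        printf 'http://authelia:9091\n'
    fi
}

authelia_discovery_url "production" "auth.example.com"
```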
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>