Fix mismatched app.db usage across services by resolving the settings DB from CWA_APP_DB_PATH or CALIBRE_DBPATH, and refuse non-app.db file paths.
Align ingest pipeline and Calibre init helpers to the same app.db resolver to avoid stale/wrong DB usage (e.g., accidentally using cwa.db).
Add clear error logging when the auto_send_enabled column migration fails so permission/lock/path issues surface instead of silently breaking startup.
Introduce a lightweight app.db healthcheck at startup to flag missing, non-writable, or locked DBs early.
Skip PRAGMA quick_check when NETWORK_SHARE_MODE=true to avoid network-share stalls or spurious lock errors.
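A minimal sketch of how a shared resolver and startup healthcheck along these lines might look. The function names `resolve_app_db_path` and `check_app_db`, the `/config` default directory, and the rule that a value without a file extension is treated as a directory are assumptions for illustration, not the project's actual code.

```python
import os
import sqlite3


def resolve_app_db_path(env=None, default_dir="/config"):
    """Resolve app.db from CWA_APP_DB_PATH or CALIBRE_DBPATH (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("CWA_APP_DB_PATH") or env.get("CALIBRE_DBPATH") or default_dir
    # Assumption: a directory-like value means "app.db inside this directory".
    if os.path.isdir(raw) or not os.path.splitext(raw)[1]:
        return os.path.join(raw, "app.db")
    # A file path must point at app.db itself, never e.g. cwa.db.
    if os.path.basename(raw) != "app.db":
        raise ValueError(f"Refusing non-app.db settings path: {raw}")
    return raw


def check_app_db(path, network_share_mode=False):
    """Return a list of problems; an empty list means the DB looks healthy."""
    if not os.path.isfile(path):
        return ["missing"]
    problems = []
    if not os.access(path, os.W_OK):
        problems.append("not writable")
    if network_share_mode:
        # Skip PRAGMA quick_check on network shares to avoid stalls.
        return problems
    try:
        con = sqlite3.connect(path, timeout=5)
        try:
            row = con.execute("PRAGMA quick_check").fetchone()
            if row is None or row[0] != "ok":
                problems.append("quick_check failed")
        finally:
            con.close()
    except sqlite3.Error as exc:
        problems.append(f"locked or unreadable: {exc}")
    return problems
```

Returning a list of problems rather than raising lets startup log every issue at once and decide separately whether to abort.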
Fixes EPUB corruption caused by unsafe UTF‑8 decoding and DOM reserialization. Introduces encoding detection/preservation with UTF‑16 only honored when a BOM or the XML declaration declares it, logs low‑confidence decodes, and writes text using per‑file target encodings. HTML charset updates now preserve existing http‑equiv tags, and safe mode removes stray `<img>` tags without full reserialization. Adds `kindle_epub_fixer_aggressive` (default off) to the settings UI and schema so riskier transforms are opt‑in.
Backups used shutil.copy2, preserving original file timestamps. The auto-zipper only zips files with today’s mtime, so many backups were skipped. Switch to shutil.copy and reset mtime to now so daily zips include new backups.
Fixes #260
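A short sketch of the backup change described above, assuming a hypothetical helper named `backup_file`. `shutil.copy` copies file data and permission bits but not timestamps, and `os.utime(path, None)` sets the access and modification times to the current time.

```python
import os
import shutil


def backup_file(src, backup_dir):
    """Copy src into backup_dir and stamp the copy with the current time."""
    os.makedirs(backup_dir, exist_ok=True)
    # shutil.copy (unlike copy2) does not carry over the source timestamps.
    dst = shutil.copy(src, backup_dir)
    # Explicitly reset atime/mtime to now so a "today's mtime" filter matches.
    os.utime(dst, None)
    return dst
```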
Fix archived book count mismatch by deleting archived_book rows when a book is deleted.
Add TaskCleanArchivedBooks to purge stale archived references safely in batches.
Schedule cleanup via CWA settings (default 03:00 local) and expose schedule controls in CWA Settings UI.
Add new cwa_settings defaults/schema fields for archived cleanup timing.
[bug] Deleting An Archived Book Doesn't Remove Archived Book Entry From app.db's archived_book table
Fixes #8243
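The batched purge could be sketched roughly as follows. This is not the TaskCleanArchivedBooks implementation: the function name, the `book_id` column, and the idea of passing in the set of book IDs that still exist in the library are assumptions made for the example.

```python
import sqlite3


def purge_stale_archived(con, valid_book_ids, batch_size=500):
    """Delete archived_book rows whose book_id is not in valid_book_ids.

    Returns the number of rows removed. Table and column names are assumed.
    """
    valid = set(valid_book_ids)
    rows = con.execute("SELECT DISTINCT book_id FROM archived_book").fetchall()
    stale = [book_id for (book_id,) in rows if book_id not in valid]
    removed = 0
    for start in range(0, len(stale), batch_size):
        batch = stale[start:start + batch_size]
        placeholders = ",".join("?" * len(batch))
        cur = con.execute(
            f"DELETE FROM archived_book WHERE book_id IN ({placeholders})", batch
        )
        removed += cur.rowcount
        # Commit per batch so the write lock is held only briefly.
        con.commit()
    return removed
```

Keeping each batch small bounds the number of bound parameters per statement and limits how long other writers are blocked.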
The Kindle EPUB fixer now detects and removes duplicate XML declarations in META-INF/container.xml, validating the result before writing. This normalizes malformed EPUBs during ingest to reduce reader failures and potential Send‑to‑Kindle E999 ingestion errors.
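As an illustration only, duplicate declarations could be removed and the result validated along these lines. The regex and the name `dedupe_xml_declarations` are assumptions, not the fixer's actual code.

```python
import re
import xml.dom.minidom

# Matches an XML declaration plus trailing whitespace; the \s after "xml"
# keeps processing instructions such as <?xml-stylesheet ...?> untouched.
XML_DECL_RE = re.compile(r"<\?xml\s[^>]*\?>\s*")


def dedupe_xml_declarations(text):
    """Keep the first XML declaration, drop later ones, validate the result."""
    matches = list(XML_DECL_RE.finditer(text))
    if len(matches) < 2:
        return text
    first = matches[0]
    head = text[:first.end()]
    tail = XML_DECL_RE.sub("", text[first.end():])
    fixed = head + tail
    # Raises xml.parsers.expat.ExpatError if the document is still malformed,
    # so callers can skip writing an invalid container.xml.
    xml.dom.minidom.parseString(fixed)
    return fixed
```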
Problems:
- Sidecar manifest files were being treated as ingest targets, causing premature deletion.
- Ignored/temporary ingest artifacts could be deleted too early when readiness checks timed out.
- Stale temp cleanup was hardcoded, not user-configurable, and required restarts to change behavior.
Solutions:
- Filtered manifest files in the ingest watcher and added processor guards to skip them.
- Added skip-delete handling for ignored/temporary files on readiness timeout to preserve artifacts.
- Implemented robust stale temp cleanup with age and interval settings.
- Persisted cleanup settings in the CWA database with sane defaults and validation.
- Exposed new cleanup controls in the settings UI and made the ingest service read live values from the database instead of environment variables.
Other changes:
- Centralized integer parsing and defaulting logic for the new settings.
- Added clear UI descriptions and bounds for the new cleanup options.
- Improved observability with explicit log messages for skip-delete behavior and cleanup timing.
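The centralized integer parsing mentioned above might look something like this sketch. The helper name `parse_int_setting` and the clamp-to-bounds behavior are assumptions; the actual code may reject out-of-range values instead.

```python
def parse_int_setting(value, default, minimum=None, maximum=None):
    """Parse value as an int, falling back to default and clamping to bounds."""
    try:
        number = int(str(value).strip())
    except (TypeError, ValueError):
        # None, empty strings, and non-numeric input all fall back.
        return default
    if minimum is not None:
        number = max(number, minimum)
    if maximum is not None:
        number = min(number, maximum)
    return number
```

A single helper like this keeps the database defaults, the settings form, and the ingest service in agreement on what counts as a valid value.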
The metadata-change-detector polling watcher would re-process the same JSON log when calibredb/enforcement fails because the log was left in place. This change deletes failed logs and records a failure entry (“auto -log (failed)”) so the job is not retried indefinitely and admins can see failures in CWA stats.
[bug] Subprocess error in cover enforcer results in infinite loop with polling watcher
Fixes #829
Cause: stats dropdown used activity table user_name and stale user_ids, leaving many unmatched entries shown as “Unknown”.
Fix: resolve names against app users by user_id, group all unmatched IDs under a single “Unknown User”, and allow list-based user filters in stats queries (with tests).
[bug] Many unknown users in stats
Fixes #942
The ingest worker never initialized CPS config, so config_use_google_drive stayed false and GDrive sync was skipped. Load minimal settings from app.db on startup so sync runs when enabled.
[bug] GDrive not syncing
Fixes #933.
Resolves issue where CWA hangs when editing metadata and cover
simultaneously, particularly on network shares or when NETWORK_SHARE_MODE
is enabled.
Changes:
- Increase SQLite connection timeout from 30s to 60s throughout
cover_enforcer.py to handle slower network share operations
- Add retry logic with exponential backoff (3 attempts) to calibredb
export, specifically handling "database is locked" errors
- Replace os.system() with subprocess.run() for ebook-polish operations:
* Better error handling and logging
* 120s timeout to prevent indefinite hangs
* Capture stderr/stdout for debugging
- Add 0.5s delay before ebook-polish execution to ensure file buffers
are flushed and locks released
- Explicit database connection closure before file modification operations
All changes occur in background service; no UI impact.
Fixes #904
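A rough sketch of the retry pattern described in the changes above, using `subprocess.run()` with a timeout and exponential backoff on "database is locked". The function name `run_with_retry` and the base delay are assumptions; only the attempt count (3) and the 120s timeout come from the description.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=1.0, timeout=120):
    """Run cmd, retrying with exponential backoff when the DB is locked.

    subprocess.TimeoutExpired propagates to the caller if cmd hangs.
    """
    for attempt in range(attempts):
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
        if result.returncode == 0:
            return result
        locked = "database is locked" in result.stderr.lower()
        if not locked or attempt == attempts - 1:
            raise RuntimeError(
                f"{cmd[0]} failed after {attempt + 1} attempt(s): "
                f"{result.stderr.strip()}"
            )
        # Back off 1x, 2x, 4x ... of base_delay before the next attempt.
        time.sleep(base_delay * 2 ** attempt)
```

Passing the command as a list avoids the shell quoting problems that come with `os.system()`, and the captured stderr is available for logging.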
- Reorder process() execution: fix_encoding() now runs BEFORE fix_book_language()
to clean malformed XML declarations before parsing attempts
- Add try/except wrapper to fix_book_language() for graceful handling of
non-standard EPUB structures (e.g., duplicate XML declarations)
- Replace minidom.toxml() with regex-based updates in fix_book_language() to
preserve XML attribute order (prevents breaking Amazon's strict parser)
- Add strip_embedded_fonts() to remove TTF/OTF/WOFF fonts and @font-face CSS
declarations, plus manifest entries (Kindle doesn't support custom fonts reliably)
- Add remove_javascript() to strip <script> tags (not supported on Kindle)
- Add validate_images() to check format/size compatibility and warn about issues
- Add validate_css() for non-invasive syntax validation with Kindle compatibility warnings
- Add strip_amazon_identifiers() function (implemented but not called by default
per user preference to preserve complete metadata)
- Extend fix_encoding() malformed XML detection to cover OPF/XML/NCX files,
not just HTML/XHTML
Tested with both working and failing EPUBs - execution order fix allows proper
language validation on files with initially malformed XML, and all new fixes
are non-destructive with proper error handling.
Resolves #920
Amazon's Send-to-Kindle service was rejecting some EPUBs with E999 errors due
to malformed XML declarations containing excessive whitespace. Example:
<?xml version="1.0" encoding="UTF-8"?>
^^
Double space breaks Amazon's strict XML parser
Root cause: fix_encoding() only processed HTML/XHTML files and missed OPF/XML/NCX
files entirely. Additionally, the validation regex incorrectly accepted malformed
declarations with multiple consecutive spaces.
Changes to kindle_epub_fixer.py:
- Extended fix_encoding() to process OPF, XML, and NCX files (not just HTML/XHTML)
- Added malformed XML declaration detection pattern (2+ consecutive spaces)
- Normalizes malformed declarations to proper single-space format
- Uses elif logic to ensure malformed fix runs before missing declaration check
- Preserves all existing behavior for properly formatted files
Fixes issue where EPUBs passed Python's lenient minidom parser but were rejected
by Amazon's strict validation.
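For illustration, normalizing a declaration to single-space form could be done as below. This sketch collapses any run of whitespace inside the declaration, which is broader than the "2+ consecutive spaces" pattern described above; the regex and function name are assumptions.

```python
import re

# Captures the pseudo-attributes of the first XML declaration.
XML_DECL_RE = re.compile(r"<\?xml\s+(.*?)\s*\?>", re.DOTALL)


def normalize_xml_declaration(text):
    """Rewrite the first XML declaration with single spaces between attributes."""
    def _fix(match):
        # split() with no arguments collapses every whitespace run.
        attrs = " ".join(match.group(1).split())
        return f"<?xml {attrs}?>"

    return XML_DECL_RE.sub(_fix, text, count=1)
```

Text without a declaration, or with an already well-formed one, passes through unchanged.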
Amazon's Send-to-Kindle service was rejecting EPUBs with E999 errors when
language metadata didn't match content. Root cause: CWA was silently forcing
ALL invalid/unsupported language tags to 'en', causing German books with
"Unknown" tags to be mislabeled as English.
Changes to kindle_epub_fixer.py:
- Rewrote fix_book_language() with proper validation logic:
* Validate format with regex before processing (prevents false positives)
* Extract 2-character base code to match Amazon's behavior (de-DE → de)
* Handle case-insensitive tags (en-us/EN-US → en)
* Preserve valid language codes instead of forcing to English
- Added _detect_language_from_metadata() fallback:
* Queries Calibre's metadata.db when EPUB tag is invalid
* Extracts book_id from file path to find correct language
- Performance & code quality improvements:
* Moved regex compilation to module-level constant (LANGUAGE_TAG_PATTERN)
* Removed duplicate 'nb' entry from allowed_languages list
* Cleaned up redundant imports scattered in functions
* Added detailed comments explaining Amazon's 2-char limitation
Fixes #920
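The validation flow might be sketched as follows. Only the constant name `LANGUAGE_TAG_PATTERN` comes from the description above; the regex itself, the shortened allowed-language set, and the `fallback` parameter (standing in for a value looked up from metadata.db) are assumptions.

```python
import re

# Assumed shape: a 2-letter primary subtag plus optional subtags (de, de-DE, en_us).
LANGUAGE_TAG_PATTERN = re.compile(r"^[A-Za-z]{2}(?:[-_][A-Za-z0-9]{2,8})*$")

# Illustrative subset only; the real allowed list is longer.
ALLOWED_LANGUAGES = {"en", "de", "fr", "es", "it", "nb"}


def normalize_language(tag, fallback=None):
    """Return the 2-character base code for a valid tag, else the fallback."""
    cleaned = (tag or "").strip()
    if LANGUAGE_TAG_PATTERN.match(cleaned):
        base = re.split(r"[-_]", cleaned)[0].lower()
        if base in ALLOWED_LANGUAGES:
            return base
    # Invalid or unsupported tag: use the metadata-derived value if any,
    # rather than silently forcing "en".
    return fallback
```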
Major Features:
- Auto-duplicate resolution: 6 strategies (newest, oldest, merge, highest_quality_format, most_metadata, largest_file_size)
- Automatic cancellation of pending tasks and scheduled jobs when books are deleted by resolution
- Settings UI for enabling/configuring auto-resolution with cooldown periods
- Enhanced Duplicates Manager UI with clickable book covers, titles, and Edit/Archive buttons
Performance Fixes:
- Fixed critical application hang: Pass pre-scanned duplicate groups to auto_resolve_duplicates() to avoid expensive re-scan
- Fixed deadlock in cancel_tasks_for_book(): Access queue/dequeued directly instead of using .tasks property to prevent recursive lock
- Optimized incremental scan to include last scanned book (>= instead of >)
Implementation Details:
- cps/duplicates.py: auto_resolve_duplicates() with dry-run preview, backup, deletion, and audit logging
- cps/tasks/duplicate_scan.py: Pass found_duplicate_groups to resolution, added comprehensive debug logging
- cps/services/worker.py: cancel_tasks_for_book() method with deadlock prevention
- scripts/cwa_db.py: scheduled_cancel_for_book() to cancel pending auto-send/scheduled jobs
- cps/templates/duplicates.html: Fixed blueprint endpoints, added clickable UI elements
- cps/templates/cwa_settings.html: Uncommented and fixed auto-resolution settings section
Bug Fixes:
- Fixed template crash from wrong blueprint endpoint ('editbook' vs 'edit-book')
- Fixed settings page overwriting format lists with duplicate_auto_resolve_cooldown_minutes
- Fixed permission errors by bypassing user context check for automatic deletions
- Fixed SQL query debugging output for hybrid prefilter
Resolves SQL syntax error: near '>>': syntax error
The merge conflict marker '>>>>>>> origin/main' at line 95 was causing:
- cwa-update-notification-service failures
- translation-notification-service failures
- ingest_processor crashes
- Failure of all services that depend on CWA_DB initialization
This was missed during the merge conflict resolution in commit 646544b.
Improvements to the split library implementation in generate_book_checksums.py:
- Rename books_path() to get_books_path() for better naming convention
- Add NULL-safe handling for config_calibre_split_dir database field
- Check config_calibre_split flag before using split path (not just path existence)
- Replace sys.exit() with graceful warnings and fallback to library_path
- Remove lambda in favor of explicit, readable conditional logic
- Enhance logging to clearly indicate when split library mode is active
- Improve docstrings and inline comments for clarity
Add comprehensive test coverage (177 lines, 4 test cases):
- test_split_library_with_separate_paths - core split library functionality
- test_books_path_falls_back_to_library_path - invalid path handling
- test_books_path_with_none_value - default behavior without --books-path
- test_split_library_with_multiple_formats - multi-format books in split mode
All tests follow existing patterns using subprocess execution, helper functions,
and tmp_path isolation for consistency with the test suite.
- Add NULL-safe handling for the config_calibre_split_dir setting
- Check config_calibre_split flag before using split path
- Replace sys.exit() with graceful fallback and warnings on DB errors
- Remove lambda in favor of explicit conditional logic
- Add clear logging to indicate split library mode vs normal mode
- Improve docstrings to document split library behavior
Added 4 comprehensive test cases (177 lines) following existing patterns:
- test_split_library_with_separate_paths: Core split library functionality
- test_books_path_falls_back_to_library_path: Invalid path handling
- test_books_path_with_none_value: Normal mode operation
- test_split_library_with_multiple_formats: Multi-format split library
All tests use existing helpers, subprocess execution, and tmp_path
isolation for consistency with the test suite.
default hybrid detection + SQL prefilter; updated defaults in schema
duplicate scans now run as background tasks with progress + cancel
debounced after-import scans and cron-based scheduled scans
settings UI updated for cron, defaults, and explanations + next run display
duplicates page shows progress bar + next scheduled run
task queue fixes + better error/confirmation messaging
cron validation on save and ISO task date formatting
Resolves #892
- Add retry logic with 3 attempts and 0.5s delays in read_log() method
- Handle FileNotFoundError, JSONDecodeError, and file existence checks gracefully
- Return None instead of crashing when log files are missing or invalid
- Update main() to exit gracefully when log file is unavailable
- Update check_for_other_logs() to skip invalid entries and continue processing
- Prevent metadata-change-detector service crashes from timing issues
This prevents the server from crashing when the metadata-change-detector
service detects a log file that gets deleted or is not yet fully written
before cover_enforcer.py attempts to read it. The service now handles
these race conditions gracefully and continues operating without requiring
manual container restarts.
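A minimal sketch of the retrying reader, written as a standalone function rather than the actual `read_log()` method. The signature is an assumption; the 3 attempts, 0.5s delay, and `None` return come from the description above.

```python
import json
import time


def read_log(path, attempts=3, delay=0.5):
    """Return the parsed JSON log, or None if it stays missing or invalid."""
    for attempt in range(attempts):
        try:
            with open(path, "r", encoding="utf-8") as handle:
                return json.load(handle)
        except (FileNotFoundError, json.JSONDecodeError):
            # The file may have been deleted or may still be mid-write;
            # wait briefly and retry unless this was the last attempt.
            if attempt < attempts - 1:
                time.sleep(delay)
    return None
```

Callers then check for `None` and exit or skip the entry instead of crashing.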
- Add _validate_book_exists() to check book ID against metadata.db before add_format
- Enhance add_format_to_book() with upstream validation and improved error reporting
- Preserve failed manifests as .cwa.failed.json for debugging
- Add book existence check in editbooks.py before creating upload manifest
- Properly handle invalid book IDs by backing up to failed/ with clear error messages
- Added `get_internal_api_url` helper to dynamically construct localhost URLs based on cert/key presence.
- Updated `ingest_processor.py`, `cwa_functions.py`, and `tasks/ops.py` to use the new helper.
- Added unit tests for internal API URL generation.
- Updated Dockerfile HEALTHCHECK to fall back to HTTPS if HTTP fails.
- Upgraded external links to HTTPS in db.py (ISFDB) and kobo.py (Rakuten/Help).
[bug] DB fails to update when adding new book if using TLS/SSL
Fixes #818
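One way the `get_internal_api_url` helper could work is sketched below. Only the helper's name and the cert/key check come from the description; the signature, the 8083 default port, and the use of 127.0.0.1 are assumptions.

```python
import os


def get_internal_api_url(path, port=8083, certfile=None, keyfile=None):
    """Build a localhost URL, switching to https when both cert and key exist."""
    use_tls = bool(
        certfile
        and keyfile
        and os.path.isfile(certfile)
        and os.path.isfile(keyfile)
    )
    scheme = "https" if use_tls else "http"
    return f"{scheme}://127.0.0.1:{port}/{path.lstrip('/')}"
```

For example, `get_internal_api_url("/some/endpoint")` with no certificate configured yields an `http://` URL, while supplying paths to an existing cert and key yields `https://`. A client calling the HTTPS form on localhost would typically also need to relax hostname verification, which is outside this sketch.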