Calibre-Web-Automated

Author	SHA1	Message	Date
crocodilestick	d11a1c45b7	Updated translations. Major Update to 4.0.0/4.0.1 messes up translations Fixes #960	2026-01-30 20:58:44 +01:00
crocodilestick	e6b0c38cb5	Update copyright years in all relevant files to 2026 for Calibre-Web contributors and Calibre-Web Automated contributors	2026-01-29 23:56:43 +01:00
crocodilestick	4a5774fead	Harden app.db resolution and startup checks for auto-send migration issues Fix mismatched app.db usage across services by resolving the settings DB from CWA_APP_DB_PATH or CALIBRE_DBPATH, and refuse non-app.db file paths. Align ingest pipeline and Calibre init helpers to the same app.db resolver to avoid stale/wrong DB usage (e.g., accidentally using cwa.db). Add clear error logging when the auto_send_enabled column migration fails so permission/lock/path issues surface instead of silently breaking startup. Introduce a lightweight app.db healthcheck at startup to flag missing, non-writable, or locked DBs early. Skip PRAGMA quick_check when NETWORK_SHARE_MODE=true to avoid network-share stalls or spurious lock errors.	2026-01-29 23:53:37 +01:00
crocodilestick	32156431df	Make EPUB fixer safer by default + add aggressive toggle Fixes issues caused by unsafe UTF‑8 decoding and DOM reserialization that could corrupt EPUBs. Introduces encoding detection/preservation with UTF‑16 only honored when BOM/XML declares it, logs low‑confidence decodes, and writes text using per‑file target encodings. HTML charset updates now preserve existing http‑equiv tags, and safe mode removes stray `<img>` tags without full reserialization. Adds `kindle_epub_fixer_aggressive` (default off) to settings UI and schema so riskier transforms are opt‑in.	2026-01-29 23:13:22 +01:00
crocodilestick	9b2d59948c	Fix auto-zip by updating backup mtime Backups used shutil.copy2, preserving original file timestamps. The auto-zipper only zips files with today’s mtime, so many backups were skipped. Switch to shutil.copy and reset mtime to now so daily zips include new backups. Fixes #260	2026-01-29 22:21:03 +01:00
crocodilestick	be6cb19cb3	Fix archived_book cleanup and add scheduled maintenance Fix archived book count mismatch by deleting archived_book rows when a book is deleted. Add TaskCleanArchivedBooks to purge stale archived references safely in batches. Schedule cleanup via CWA settings (default 03:00 local) and expose schedule controls in CWA Settings UI. Add new cwa_settings defaults/schema fields for archived cleanup timing. [bug] Deleting An Archived Book Doesn't Remove Archived Book Entry From app.db's archived_book table Fixes #8243	2026-01-29 21:14:38 +01:00
crocodilestick	e3cea8cbaa	fix(fixer): repair duplicate XML declarations in container.xml The Kindle EPUB fixer now detects and removes duplicate XML declarations in META-INF/container.xml, validating the result before writing. This normalizes malformed EPUBs during ingest to reduce reader failures and potential Send‑to‑Kindle E999 ingestion errors.	2026-01-29 20:21:39 +01:00
crocodilestick	8a889050ab	cwa: harden ingest flow, manifest handling, and stale temp cleanup Problems: - Sidecar manifest files were being treated as ingest targets, causing premature deletion. - Ignored/temporary ingest artifacts could be deleted too early when readiness checks timed out. - Stale temp cleanup was hardcoded, not user-configurable, and required restarts to change behavior. Solutions: - Filtered manifest files in the ingest watcher and added processor guards to skip them. - Added skip-delete handling for ignored/temporary files on readiness timeout to preserve artifacts. Implemented robust stale temp cleanup with age and interval settings. - Persisted cleanup settings in the CWA database with sane defaults and validation. - Exposed new cleanup controls in the settings UI and made the ingest service read live values from the database instead of environment variables. Other changes: - Centralized integer parsing and defaulting logic for the new settings. - Added clear UI descriptions and bounds for the new cleanup options. - Improved observability with explicit log messages for skip-delete behavior and cleanup timing.	2026-01-29 20:06:57 +01:00
crocodilestick	3a1af44d62	fix(cover-enforcer): avoid polling loop on failed logs The metadata-change-detector polling watcher would re-process the same JSON log when calibredb/enforcement fails because the log was left in place. This change deletes failed logs and records a failure entry (“auto -log (failed)”) so the job is not retried indefinitely and admins can see failures in CWA stats. [bug] Subprocess error in cover enforcer results in infinite loop with polling watcher Fixes #829	2026-01-29 18:16:17 +01:00
crocodilestick	b3c2230e3e	fix(stats): resolve Unknown user dropdown entries Cause: stats dropdown used activity table user_name and stale user_ids, leaving many unmatched entries shown as “Unknown”. Fix: resolve names against app users by user_id, group all unmatched IDs under a single “Unknown User”, and allow list-based user filters in stats queries (with tests). [bug] Many unknown users in stats Fixes #942	2026-01-29 18:00:57 +01:00
crocodilestick	fe60df7bf6	Fixed duplicate detection notifications not displaying reliably immediately after the ingest of multiple books	2026-01-29 13:55:34 +01:00
crocodilestick	8d280c246f	fix(ingest): load GDrive settings for sync The ingest worker never initialized CPS config, so config_use_google_drive stayed false and GDrive sync was skipped. Load minimal settings from app.db on startup so sync runs when enabled. [bug] GDrive not syncing Fixes #933.	2026-01-28 23:12:12 +01:00
crocodilestick	bf3b0d42a8	Rolled back problematic new epub fixer functions to come back to later	2026-01-28 22:21:26 +01:00
crocodilestick	fa2f2f70c2	Fix database locking and race conditions in cover_enforcer Resolves issue where CWA hangs when editing metadata and cover simultaneously, particularly on network shares or when NETWORK_SHARE_MODE is enabled. Changes: - Increase SQLite connection timeout from 30s to 60s throughout cover_enforcer.py to handle slower network share operations - Add retry logic with exponential backoff (3 attempts) to calibredb export, specifically handling "database is locked" errors - Replace os.system() with subprocess.run() for ebook-polish operations: * Better error handling and logging * 120s timeout to prevent indefinite hangs * Capture stderr/stdout for debugging - Add 0.5s delay before ebook-polish execution to ensure file buffers are flushed and locks released - Explicit database connection closure before file modification operations All changes occur in background service; no UI impact. Fixes #904	2026-01-26 17:50:43 +01:00
crocodilestick	984647e00c	fix(epub-fixer): improve Kindle compatibility and fix XML parsing order - Reorder process() execution: fix_encoding() now runs BEFORE fix_book_language() to clean malformed XML declarations before parsing attempts - Add try/except wrapper to fix_book_language() for graceful handling of non-standard EPUB structures (e.g., duplicate XML declarations) - Replace minidom.toxml() with regex-based updates in fix_book_language() to preserve XML attribute order (prevents breaking Amazon's strict parser) - Add strip_embedded_fonts() to remove TTF/OTF/WOFF fonts and @font-face CSS declarations, plus manifest entries (Kindle doesn't support custom fonts reliably) - Add remove_javascript() to strip <script> tags (not supported on Kindle) - Add validate_images() to check format/size compatibility and warn about issues - Add validate_css() for non-invasive syntax validation with Kindle compatibility warnings - Add strip_amazon_identifiers() function (implemented but not called by default per user preference to preserve complete metadata) - Extend fix_encoding() malformed XML detection to cover OPF/XML/NCX files, not just HTML/XHTML Tested with both working and failing EPUBs - execution order fix allows proper language validation on files with initially malformed XML, and all new fixes are non-destructive with proper error handling. Resolves #920	2026-01-26 16:43:12 +01:00
crocodilestick	aea35e27e8	Fix malformed XML declarations in OPF/XML files causing Amazon E999 rejections Amazon's Send-to-Kindle service was rejecting some EPUBs with E999 errors due to malformed XML declarations containing excessive whitespace. Example: <?xml version="1.0" encoding="UTF-8"?> ^^ Double space breaks Amazon's strict XML parser Root cause: fix_encoding() only processed HTML/XHTML files and missed OPF/XML/NCX files entirely. Additionally, the validation regex incorrectly accepted malformed declarations with multiple consecutive spaces. Changes to kindle_epub_fixer.py: - Extended fix_encoding() to process OPF, XML, and NCX files (not just HTML/XHTML) - Added malformed XML declaration detection pattern (2+ consecutive spaces) - Normalizes malformed declarations to proper single-space format - Uses elif logic to ensure malformed fix runs before missing declaration check - Preserves all existing behavior for properly formatted files Fixes issue where EPUBs passed Python's lenient minidom parser but were rejected by Amazon's strict validation.	2026-01-26 14:36:31 +01:00
crocodilestick	0746472a8f	Fix EPUB language tag handling to prevent Amazon Kindle E999 rejections (#920) Amazon's Send-to-Kindle service was rejecting EPUBs with E999 errors when language metadata didn't match content. Root cause: CWA was silently forcing ALL invalid/unsupported language tags to 'en', causing German books with "Unknown" tags to be mislabeled as English. Changes to kindle_epub_fixer.py: - Rewrote fix_book_language() with proper validation logic: * Validate format with regex before processing (prevents false positives) * Extract 2-character base code to match Amazon's behavior (de-DE → de) * Handle case-insensitive tags (en-us/EN-US → en) * Preserve valid language codes instead of forcing to English - Added _detect_language_from_metadata() fallback: * Queries Calibre's metadata.db when EPUB tag is invalid * Extracts book_id from file path to find correct language - Performance & code quality improvements: * Moved regex compilation to module-level constant (LANGUAGE_TAG_PATTERN) * Removed duplicate 'nb' entry from allowed_languages list * Cleaned up redundant imports scattered in functions * Added detailed comments explaining Amazon's 2-char limitation Fixes #920	2026-01-26 14:21:35 +01:00
crocodilestick	e3db8b2152	feat: Add auto-duplicate resolution with task cancellation and fix critical deadlocks Major Features: - Auto-duplicate resolution: 6 strategies (newest, oldest, merge, highest_quality_format, most_metadata, largest_file_size) - Automatic cancellation of pending tasks and scheduled jobs when books deleted by resolution - Settings UI for enabling/configuring auto-resolution with cooldown periods - Enhanced Duplicates Manager UI with clickable book covers, titles, and Edit/Archive buttons Performance Fixes: - Fixed critical application hang: Pass pre-scanned duplicate groups to auto_resolve_duplicates() to avoid expensive re-scan - Fixed deadlock in cancel_tasks_for_book(): Access queue/dequeued directly instead of using .tasks property to prevent recursive lock - Optimized incremental scan to include last scanned book (>= instead of >) Implementation Details: - cps/duplicates.py: auto_resolve_duplicates() with dry-run preview, backup, deletion, and audit logging - cps/tasks/duplicate_scan.py: Pass found_duplicate_groups to resolution, added comprehensive debug logging - cps/services/worker.py: cancel_tasks_for_book() method with deadlock prevention - scripts/cwa_db.py: scheduled_cancel_for_book() to cancel pending auto-send/scheduled jobs - cps/templates/duplicates.html: Fixed blueprint endpoints, added clickable UI elements - cps/templates/cwa_settings.html: Uncommented and fixed auto-resolution settings section Bug Fixes: - Fixed template crash from wrong blueprint endpoint ('editbook' vs 'edit-book') - Fixed settings page overwriting format lists with duplicate_auto_resolve_cooldown_minutes - Fixed permission errors by bypassing user context check for automatic deletions - Fixed SQL query debugging output for hybrid prefilter	2026-01-25 01:24:04 +01:00
crocodilestick	9571f89a62	[bug] Can't save CWA Settings on latest DEV-335 due to issues with duplicate detection settings Fixes #903	2026-01-24 22:55:52 +01:00
crocodilestick	52fb64455e	Fix: Remove leftover git merge conflict marker from cwa_schema.sql Resolves SQL syntax error: near '>>': syntax error The merge conflict marker '>>>>>>> origin/main' at line 95 was causing: - cwa-update-notification-service failures - translation-notification-service failures - ingest_processor crashes - All services depending on CWA_DB initialization to fail This was missed during the merge conflict resolution in commit `646544b`.	2026-01-24 21:37:37 +01:00
crocodilestick	646544b820	Merge main into auto-hardcover-id - sync with latest changes and resolve conflicts	2026-01-24 21:25:37 +01:00
CrocodileStick	c4ac747797	Merge branch 'main' into checksum-split-awareness	2026-01-23 23:04:12 +01:00
crocodilestick	d67c1db641	refactor: improve split library error handling and add test coverage Improvements to the split library implementation in generate_book_checksums.py: - Rename books_path() to get_books_path() for better naming convention - Add NULL-safe handling for config_calibre_split_dir database field - Check config_calibre_split flag before using split path (not just path existence) - Replace sys.exit() with graceful warnings and fallback to library_path - Remove lambda in favor of explicit, readable conditional logic - Enhance logging to clearly indicate when split library mode is active - Improve docstrings and inline comments for clarity Add comprehensive test coverage (177 lines, 4 test cases): - test_split_library_with_separate_paths - core split library functionality - test_books_path_falls_back_to_library_path - invalid path handling - test_books_path_with_none_value - default behavior without --books-path - test_split_library_with_multiple_formats - multi-format books in split mode All tests follow existing patterns using subprocess execution, helper functions, and tmp_path isolation for consistency with the test suite.	2026-01-23 22:49:41 +01:00
crocodilestick	1011dd509a	Phase 3: incremental duplicate scans, debounced scheduling, and metadata-safe title normalization	2026-01-15 16:48:50 +01:00
crocodilestick	689d223195	feat(duplicates): finalize Phase 2 scanning with hybrid detection, background tasking, cron scheduling, and UI feedback default hybrid detection + SQL prefilter; updated defaults in schema duplicate scans now run as background tasks with progress + cancel debounced after-import scans and cron-based scheduled scans settings UI updated for cron, defaults, and explanations + next run display duplicates page shows progress bar + next scheduled run task queue fixes + better error/confirmation messaging cron validation on save and ISO task date formatting	2026-01-15 14:07:24 +01:00
crocodilestick	ce3c452dca	Implemented performant & more reliable SQL query + python hybrid system	2026-01-15 11:51:33 +01:00
crocodilestick	1491589bbf	Added SQL dupe search as fallback for python dupe search as I couldn't get it to work as reliably. Disabled for now, might come back to in the future	2026-01-14 18:57:58 +01:00
crocodilestick	63f2cbbecc	Added V1 of auto-resolving duplicate detection system	2026-01-14 17:07:16 +01:00
crocodilestick	897fe0ec34	fix(cover-enforcer): Add robust error handling for race conditions in metadata log processing Resolves #892 - Add retry logic with 3 attempts and 0.5s delays in read_log() method - Handle FileNotFoundError, JSONDecodeError, and file existence checks gracefully - Return None instead of crashing when log files are missing or invalid - Update main() to exit gracefully when log file is unavailable - Update check_for_other_logs() to skip invalid entries and continue processing - Prevent metadata-change-detector service crashes from timing issues This prevents the server from crashing when the metadata-change-detector service detects a log file that gets deleted or is not yet fully written before cover_enforcer.py attempts to read it. The service now handles these race conditions gracefully and continues operating without requiring manual container restarts.	2026-01-13 16:33:40 +01:00
Seth Milliken	f575d3258e	fix: add `--books-path` argument to `generate_book_checksums.py` Enables `generate_book_checksums` to use `config_calibre_split_dir` setting.	2026-01-09 12:22:55 -08:00
crocodilestick	7f7948cd41	Fixed comment in DB schemas causing errors	2026-01-09 14:49:15 +01:00
crocodilestick	a4cff97a7c	Added a notification system for duplicates that can be disabled in the CWA settings panel	2026-01-08 19:08:07 +01:00
crocodilestick	378b2facff	Implement Hardcover Auto-Fetch feature - Add confidence scoring algorithm with Levenshtein distance and Jaccard similarity - Create background worker (TaskAutoHardcoverID) with rate limiting and exponential backoff - Add database schema: hardcover_match_queue and hardcover_auto_fetch_stats tables - Implement comprehensive scheduling system (10 options: 15min intervals through monthly) - Build settings UI with token validation and dynamic schedule selectors - Add manual trigger button in admin panel - Create review queue UI with gradient status hero cards - Integrate stats dashboard with auto-matched, pending review, and manually reviewed counts - Add text similarity utilities (normalized_levenshtein_similarity, author_list_similarity) - Enhance Hardcover provider with calculate_confidence_score method - Extend MetaRecord with confidence_score and match_reason fields - Add IntervalTrigger support to background scheduler	2026-01-06 00:20:40 +01:00
crocodilestick	70d2e3a6c9	Fix #866: Validate book ID before adding format to prevent duplicates - Add _validate_book_exists() to check book ID against metadata.db before add_format - Enhance add_format_to_book() with upstream validation and improved error reporting - Preserve failed manifests as .cwa.failed.json for debugging - Add book existence check in editbooks.py before creating upload manifest - Properly handle invalid book IDs by backing up to failed/ with clear error messages	2026-01-05 12:40:23 +01:00
CrocodileStick	509ada7a59	Merge pull request #863 from navels/fix/ingest-metadata-autosend Fix auto-send and auto-metadata fetching	2026-01-03 14:05:39 +01:00
crocodilestick	bb9d932706	Added stats tracking for magic shelves	2026-01-03 02:20:28 +01:00
Lee Nave	0f3c2efa5c	Improve ingest metadata and auto-send	2026-01-02 21:16:15 +00:00
crocodilestick	a62ad9e122	Fixed issues with the User Activity stats	2025-12-31 00:47:52 +01:00
crocodilestick	01229a750c	Added Rating Statistics, Top Enforced Books & Import Source Flows to Library Stats tab	2025-12-30 21:22:53 +01:00
crocodilestick	1d502a291b	Added API Usage stats tab	2025-12-30 20:02:02 +01:00
crocodilestick	78ec7d35d8	Added Average Session Duration, Search Success Rate & Shelf activity stats to User Activity tab	2025-12-30 17:49:44 +01:00
crocodilestick	52f1f5a2e4	Added Largest Series, Publication Year Distribution & Most Fixed Books sections to Library Stats tab	2025-12-30 16:13:12 +01:00
crocodilestick	b97506153b	Added tab for library stats	2025-12-30 15:29:09 +01:00
crocodilestick	bcf604017f	Added stat sections for Book Discovery Methods, Device Breakdown & Failed Auth Attempts	2025-12-30 14:04:28 +01:00
crocodilestick	65eb2dcc68	Added Peak Usage Hours heat map, reading velocity & format preferences by user	2025-12-30 13:52:23 +01:00
crocodilestick	880aff67e9	Added user specific filtering to user stats dashboard	2025-12-30 13:05:37 +01:00
crocodilestick	0e199f5fc5	Top Users usernames now link to user edit page	2025-12-30 12:31:42 +01:00
crocodilestick	4288ee2bb8	added user selectable time period functionality	2025-12-30 11:15:22 +01:00
crocodilestick	1949261110	Version 1 User Activity Stats page	2025-12-29 16:31:14 +01:00
crocodilestick	d2fe1f1657	- Fixes #818: Internal API calls (ingest, library conversion, etc.) now respect SSL configuration instead of forcing HTTP. - Added `get_internal_api_url` helper to dynamically construct localhost URLs based on cert/key presence. - Updated `ingest_processor.py`, `cwa_functions.py`, and `tasks/ops.py` to use the new helper. - Added unit tests for internal API URL generation. - Updated Dockerfile HEALTHCHECK to fall back to HTTPS if HTTP fails. - Upgraded external links to HTTPS in db.py (ISFDB) and kobo.py (Rakuten/Help). [bug] DB fails to update when adding new book if using TLS/SSL Fixes #818	2025-12-08 22:15:30 +01:00
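
Several of the commits above (aea35e27e8, 984647e00c, e3cea8cbaa) concern malformed XML declarations that Amazon's Send-to-Kindle parser rejects with E999. As a rough illustration of the whitespace normalization described in aea35e27e8, here is a minimal sketch. It is not the code from kindle_epub_fixer.py: the names `XML_DECLARATION`, `MULTI_SPACE`, and `normalize_xml_declaration` are hypothetical, and the patterns are an assumption about what "2+ consecutive spaces" detection might look like.

```python
import re

# Hypothetical patterns, not taken from the repository.
# Matches an XML declaration such as <?xml version="1.0"  encoding="UTF-8"?>
XML_DECLARATION = re.compile(r"<\?xml\s[^?]*\?>")
# Matches a run of two or more spaces or tabs.
MULTI_SPACE = re.compile(r"[ \t]{2,}")


def normalize_xml_declaration(text: str) -> str:
    """Collapse runs of 2+ spaces/tabs inside the first XML declaration.

    Whitespace in the rest of the document is left untouched.
    """
    def _collapse(match: re.Match) -> str:
        return MULTI_SPACE.sub(" ", match.group(0))

    # count=1 limits the rewrite to the first declaration found.
    return XML_DECLARATION.sub(_collapse, text, count=1)


if __name__ == "__main__":
    malformed = '<?xml version="1.0"  encoding="UTF-8"?>\n<package/>'
    print(normalize_xml_declaration(malformed))
```

Restricting the substitution to the declaration itself keeps the change non-destructive for the document body, which matches the commit's stated goal of preserving behavior for properly formatted files.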


254 Commits