Calibre-Web-Automated

Author	SHA1	Message	Date
crocodilestick	32156431df	Make EPUB fixer safer by default + add aggressive toggle Fixes caused by unsafe UTF‑8 decoding and DOM reserialization that could corrupt EPUBs. Introduces encoding detection/preservation with UTF‑16 only honored when BOM/XML declares it, logs low‑confidence decodes, and writes text using per‑file target encodings. HTML charset updates now preserve existing http‑equiv tags, and safe mode removes stray `<img>` tags without full reserialization. Adds `kindle_epub_fixer_aggressive` (default off) to settings UI and schema so riskier transforms are opt‑in.	2026-01-29 23:13:22 +01:00
crocodilestick	be6cb19cb3	Fix archived_book cleanup and add scheduled maintenance Fix archived book count mismatch by deleting archived_book rows when a book is deleted. Add TaskCleanArchivedBooks to purge stale archived references safely in batches. Schedule cleanup via CWA settings (default 03:00 local) and expose schedule controls in CWA Settings UI. Add new cwa_settings defaults/schema fields for archived cleanup timing. [bug] Deleting An Archived Book Doesn't Remove Archived Book Entry From app.db's archived_book table Fixes #8243	2026-01-29 21:14:38 +01:00
crocodilestick	8a889050ab	cwa: harden ingest flow, manifest handling, and stale temp cleanup Problems: - Sidecar manifest files were being treated as ingest targets, causing premature deletion. - Ignored/temporary ingest artifacts could be deleted too early when readiness checks timed out. - Stale temp cleanup was hardcoded, not user-configurable, and required restarts to change behavior. Solutions: - Filtered manifest files in the ingest watcher and added processor guards to skip them. - Added skip-delete handling for ignored/temporary files on readiness timeout to preserve artifacts. Implemented robust stale temp cleanup with age and interval settings. - Persisted cleanup settings in the CWA database with sane defaults and validation. - Exposed new cleanup controls in the settings UI and made the ingest service read live values from the database instead of environment variables. Other changes: - Centralized integer parsing and defaulting logic for the new settings. - Added clear UI descriptions and bounds for the new cleanup options. - Improved observability with explicit log messages for skip-delete behavior and cleanup timing.	2026-01-29 20:06:57 +01:00
crocodilestick	fe60df7bf6	Fixed duplicate detection notifications not displaying reliably immediately after the ingest of multiple books	2026-01-29 13:55:34 +01:00
crocodilestick	e3db8b2152	feat: Add auto-duplicate resolution with task cancellation and fix critical deadlocks Major Features: - Auto-duplicate resolution: 6 strategies (newest, oldest, merge, highest_quality_format, most_metadata, largest_file_size) - Automatic cancellation of pending tasks and scheduled jobs when books deleted by resolution - Settings UI for enabling/configuring auto-resolution with cooldown periods - Enhanced Duplicates Manager UI with clickable book covers, titles, and Edit/Archive buttons Performance Fixes: - Fixed critical application hang: Pass pre-scanned duplicate groups to auto_resolve_duplicates() to avoid expensive re-scan - Fixed deadlock in cancel_tasks_for_book(): Access queue/dequeued directly instead of using .tasks property to prevent recursive lock - Optimized incremental scan to include last scanned book (>= instead of >) Implementation Details: - cps/duplicates.py: auto_resolve_duplicates() with dry-run preview, backup, deletion, and audit logging - cps/tasks/duplicate_scan.py: Pass found_duplicate_groups to resolution, added comprehensive debug logging - cps/services/worker.py: cancel_tasks_for_book() method with deadlock prevention - scripts/cwa_db.py: scheduled_cancel_for_book() to cancel pending auto-send/scheduled jobs - cps/templates/duplicates.html: Fixed blueprint endpoints, added clickable UI elements - cps/templates/cwa_settings.html: Uncommented and fixed auto-resolution settings section Bug Fixes: - Fixed template crash from wrong blueprint endpoint ('editbook' vs 'edit-book') - Fixed settings page overwriting format lists with duplicate_auto_resolve_cooldown_minutes - Fixed permission errors by bypassing user context check for automatic deletions - Fixed SQL query debugging output for hybrid prefilter	2026-01-25 01:24:04 +01:00
crocodilestick	52fb64455e	Fix: Remove leftover git merge conflict marker from cwa_schema.sql Resolves SQL syntax error: near '>>': syntax error The merge conflict marker '>>>>>>> origin/main' at line 95 was causing: - cwa-update-notification-service failures - translation-notification-service failures - ingest_processor crashes - All services depending on CWA_DB initialization to fail This was missed during the merge conflict resolution in commit `646544b`.	2026-01-24 21:37:37 +01:00
crocodilestick	646544b820	Merge main into auto-hardcover-id - sync with latest changes and resolve conflicts	2026-01-24 21:25:37 +01:00
crocodilestick	1011dd509a	Phase 3: incremental duplicate scans, debounced scheduling, and metadata-safe title normalization	2026-01-15 16:48:50 +01:00
crocodilestick	689d223195	feat(duplicates): finalize Phase 2 scanning with hybrid detection, background tasking, cron scheduling, and UI feedback default hybrid detection + SQL prefilter; updated defaults in schema duplicate scans now run as background tasks with progress + cancel debounced after-import scans and cron-based scheduled scans settings UI updated for cron, defaults, and explanations + next run display duplicates page shows progress bar + next scheduled run task queue fixes + better error/confirmation messaging cron validation on save and ISO task date formatting	2026-01-15 14:07:24 +01:00
crocodilestick	ce3c452dca	Implemented performant & more reliable SQL query + python hybrid system	2026-01-15 11:51:33 +01:00
crocodilestick	1491589bbf	Added SQL dupe search as fallback for python dupe search as I couldn't get it to work as reliably. Disabled for now, might come back to in the future	2026-01-14 18:57:58 +01:00
crocodilestick	63f2cbbecc	Added V1 of auto-resolving duplicate detection system	2026-01-14 17:07:16 +01:00
crocodilestick	a4cff97a7c	Added a notification system for duplicates that can be disabled in the cwa settings panel	2026-01-08 19:08:07 +01:00
crocodilestick	378b2facff	Implement Hardcover Auto-Fetch feature - Add confidence scoring algorithm with Levenshtein distance and Jaccard similarity - Create background worker (TaskAutoHardcoverID) with rate limiting and exponential backoff - Add database schema: hardcover_match_queue and hardcover_auto_fetch_stats tables - Implement comprehensive scheduling system (10 options: 15min intervals through monthly) - Build settings UI with token validation and dynamic schedule selectors - Add manual trigger button in admin panel - Create review queue UI with gradient status hero cards - Integrate stats dashboard with auto-matched, pending review, and manually reviewed counts - Add text similarity utilities (normalized_levenshtein_similarity, author_list_similarity) - Enhance Hardcover provider with calculate_confidence_score method - Extend MetaRecord with confidence_score and match_reason fields - Add IntervalTrigger support to background scheduler	2026-01-06 00:20:40 +01:00
crocodilestick	1949261110	Version 1 User Activity Stats page	2025-12-29 16:31:14 +01:00
crocodilestick	123f105fe7	feat(ui): Add Bootstrap toast notifications to scheduled jobs UI Enhance user feedback with toast notifications: tasks.html: - Add showNotification() helper for consistent toast display - Enhanced cancelScheduled() with success/error feedback cwa_convert_library.html & cwa_epub_fixer.html: - Convert schedule links to AJAX buttons - Add scheduleConvertLibrary(5\|15) and scheduleEpubFixer(5\|15) functions - Show success/error notifications for all scheduling actions - Maintain existing manual trigger functionality All notifications: - Positioned top-right with 4-second auto-dismiss - Use Bootstrap's alert-success/alert-danger styling - Provide clear action confirmation to users	2025-11-17 14:39:55 +01:00
crocodilestick	7d3f411da4	feat: implement global enablement for metadata providers with UI controls	2025-09-12 23:14:18 +02:00
crocodilestick	ebd68c3091	Added metadata_providers_enabled to schema to resolve conflict with PR #632	2025-09-12 23:07:10 +02:00
crocodilestick	42e7aabe5f	feat: Add retained formats functionality for auto-conversion Implements ability to keep original book formats after conversion to target format. Users can now select which formats to retain via CWA settings UI. Features: - New auto_convert_retained_formats setting with checkbox grid UI - Automatic conflict prevention (target format always retained) - Database migration support for backward compatibility - Enhanced ingest processor with robust format addition logic Credit to @angelicadvocate for original implementation concept in PR #284. Fixes edge cases including race conditions, UI state handling, and iteration safety.	2025-09-12 21:34:32 +02:00
crocodilestick	7c58906e53	feat: Implement configurable duplicate detection system (#604 ) Fix issue where books in different languages were incorrectly grouped as duplicates by implementing a comprehensive configurable duplicate detection system. Key Changes: Database Schema: - Add 6 new duplicate detection settings to cwa_schema.sql: - duplicate_detection_title/author/language (default: enabled) - duplicate_detection_series/publisher/format (default: disabled) Frontend UI: - Add "CWA Duplicate Detection Criteria" section to cwa_settings.html - Implement checkbox grid for configuring detection criteria - Include explanatory text and validation warnings Core Logic Rewrite: - Replace hardcoded (title, author) matching with configurable criteria - Support dynamic key generation based on selected metadata fields - Add comprehensive error handling and edge case coverage Robustness Improvements: - Handle missing/null metadata gracefully with fallback values - Add safety checks for empty collections and corrupt data - Include CWA database connection error handling - Performance warnings for large libraries (50k+ books) Issue Resolution: - Books in different languages no longer considered duplicates (language included by default) - Users can now fully customize duplicate detection criteria - Maintains backward compatibility with existing duplicate manager - Comprehensive error handling prevents crashes on edge cases Technical Details: - Follows established CWA settings patterns for seamless integration - Boolean settings automatically handled by existing backend logic - Added datetime import for timestamp sorting fallbacks - Extensive null/empty validation throughout duplicate detection pipeline	2025-09-12 16:26:22 +02:00
crocodilestick	8402e3225e	Added the ability to select which fields can be overwritten by the automatic metadata fetching service, for both the smart and verbatim modes	2025-09-05 15:46:42 +02:00
crocodilestick	3e3d2e5e4d	feat: Implement auto-send and enhanced auto-metadata fetch systems ## Major Features Added ### 📧 Auto-Send System - Automatically emails newly ingested books to users' eReaders - Configurable delay (1-60 minutes) to allow for processing - Supports multiple formats (EPUB, MOBI, AZW3, KEPUB, PDF) - Integrates with existing Calibre-Web email configuration - Respects user preferences and access controls ### 🏷️ Auto-Metadata Fetch System Enhancements - Enhanced metadata fetching with multiple provider support - Added smart metadata application mode with intelligent criteria - Moved control from user-level to admin-only configuration - Implemented provider hierarchy with drag-and-drop interface - Added quality-based metadata replacement logic ## Database Schema Changes ### CWA Settings (scripts/cwa_schema.sql) - Added auto_metadata_smart_application SMALLINT DEFAULT 0 - Enables intelligent vs direct metadata replacement modes ## User Interface Updates ### Admin Interface (cps/templates/cwa_settings.html) - Added smart metadata application toggle with detailed tooltip - Enhanced provider hierarchy management ### User Interface (cps/templates/user_edit.html) - Removed auto_metadata_fetch controls (now admin-only) - Cleaned up user profile interface ## Smart Metadata Application Logic ### Direct Replacement Mode (Default) - Takes metadata from preferred provider exactly as provided - Complete replacement of existing metadata - Philosophy: "Just take the metadata as is" ### Smart Application Mode (Optional) - Intelligent criteria for metadata replacement: * Titles: Only replace if longer/more descriptive * Descriptions: Only replace if longer/more detailed * Publishers: Only replace if current field is empty * Covers: Only replace if higher resolution * Authors: Always update for consistency * Tags/Series: Always add for discoverability ## Technical Implementation ### Metadata Helper (cps/metadata_helper.py) - Enhanced _apply_metadata_to_book() with smart application logic - Updated fetch_and_apply_metadata() for admin-only control - Integrated CWA_DB settings checking for both modes ### Ingest Processor (scripts/ingest_processor.py) - Removed user-based metadata checking - Streamlined to use admin settings only - Improved processing pipeline integration ### Form Processing (cps/cwa_functions.py) - Auto-detection of boolean settings from schema - Automatic handling of auto_metadata_smart_application ## Provider System Enhancements - Google Books, Internet Archive, DNB, ComicVine, Douban support - Priority-based searching with first-success-wins logic - Quality criteria evaluation for metadata selection - Configurable provider hierarchy with drag-and-drop interface ## Documentation ### Wiki Pages Created - Auto-Send-System.md: Comprehensive user and admin guide - Auto-Metadata-Fetch-System.md: Detailed configuration and usage - Enhanced with relevant emojis for improved readability - Covers troubleshooting, best practices, and technical details ## Integration & Compatibility - Maintains backward compatibility with existing email settings - Integrates seamlessly with auto-convert and ingest systems - Respects existing access controls and user permissions - No breaking changes to existing functionality ## Testing Notes - Database schema updates will apply automatically on app startup - Settings form processing handles new boolean field automatically - Metadata fetching now controlled entirely by admin settings - User interface cleaned of deprecated metadata controls This implementation provides a complete automated book delivery and metadata enhancement system while maintaining the principle of admin-controlled automation and user-friendly operation.	2025-09-04 18:22:54 +02:00
crocodilestick	955320d648	Added ability to change ingest timeout duration in CWA Settings	2025-09-02 15:37:38 +02:00
crocodilestick	4dfbcba34d	Reverted unwanted changes	2025-08-09 23:42:20 +02:00
crocodilestick	0c891d259f	REFACTOR - Moved audiobook.py, auto_library.py, auto_zip.py, convert_library.py, cover_enforcer.py, cwa_db.py, cwa_functions.py, cwa_schema.sql, ingest_processor.py & kindle_epub_fixer.py to the cps dir, changing dependencies as necessary	2025-08-09 20:21:58 +02:00
crocodilestick	91f727529d	Added "sexy-background-blur" option to cwa-settings to give mobile dark mode users the same cover incorporated background blur effect they have on desktop (can be disabled in cwa settings)	2025-08-09 11:56:30 +02:00
crocodilestick	f9a22ea94d	Added a system that prompts users running CWA in a non-English language with missing translations to help complete the translations for that language. These notifications can be disabled in the CWA settings panel	2025-08-06 16:02:42 +02:00
have-a-boy	76947e5ad3	Add configurable setting for automerge param New setting is stored in CWA Settings as 'auto_ingest_automerge' and can be set to 'ignore', 'overwrite' or 'new_record' (which is the default and should match previous behaviour). Descriptions for each value should broadly match the calibredb docs.	2025-07-09 18:04:36 +02:00
crocodilestick	47e8cf0d66	Major Changes - - kindle_epub_fixer.py has been completely rewritten to more closely replicate the function of the original JS tool written by innocenat - Auto backup of files processed by kindle_epub_fixer is now added, enabled by default and able to be turned on or off by the user in the settings panel - Entries for files that have been processed by kindle_epub_fixer are now automatically added to cwa.db in a new table called epub_fixes. These entries are now also available to view in the CWA Stats page accessible via the Admin Panel - The CWA Stats page has been rearranged to have the stats for the functions users care most about at the top and a bug was fixed in the see more pages where the title for all pages was for the enforcement statistics regardless of what was being shown - Creation of the /config/processed_books archive folders as well as the setting of their permissions is now the sole responsibility of the CWA Auto Zipper service - Major improvements to exception handling in both convert_library and ingest_processor when handling files to improve performance, reliability as well as making it much easier to diagnose user errors - The default CWA settings are now pulled from the schema instead of being hardcoded into the CWA_DB class - Plus general refactoring and tidying up of the codebase	2025-01-05 21:26:32 +01:00
crocodilestick	1a63a51c1f	NEW FEATURE - Added Kindle-EPUB-Fixer - Originally developed by [innocenat](https://github.com/innocenat/kindle-epub-fix), this tool corrects the following potential issues for every EPUB processed by CWA: - Fixes UTF-8 encoding problem by adding UTF-8 declaration if no encoding is specified - Fixes hyperlink problem (which results in Amazon rejecting the EPUB) when the NCX table of contents links to `<body>` with ID hash. - Detects invalid and/or missing language tag in metadata, and prompts user to select new language. - Removes stray `<img>` tags with no source field. - This ensures maximum compatibility for each EPUB file with the Amazon Send-to-Kindle service and for those who don't use Amazon devices, has the side benefit of cleaning up your lower quality files - This feature is on by default and is able to be toggled on and off by the user in the CWA Settings panel Minor Changes: - All CWA python scripts now conform to the snake_case naming convention - Minor refactoring of ingest_processor script	2024-12-11 13:59:53 +00:00
crocodilestick	5bbd15af62	Finished work on making Cover & Metadata Enforcement service compatible with multiple formats and the presence of multiple formats for each book. Also rearranged CWA Settings page to make the layout a bit more logical, etc. Added end format to conversion history DB as well as original format. Added lock file for Cover & Metadata Enforcement service	2024-11-22 16:03:45 +00:00
crocodilestick	fc949d281c	Made major changes to the CWA Cover & Metadata Enforcement service. Like the ingest service, it now also supports multiple formats (currently limited to EPUB & AZW3 due to limitations of the calibre ebook-polish function) and can also now be disabled in the CWA Settings panel. The majority of the necessary work has been done to achieve these goals but these changes are currently untested	2024-11-21 17:29:53 +00:00
crocodilestick	06a5e0fd96	IN PROGRESS - Testing additional ingest functionality (User definable output formats & behaviour)	2024-11-18 11:35:38 +00:00
crocodilestick	fc188873a4	- Fixed numerous issues with the cwa_db that were resulting in the db occasionally becoming locked for some users - CWA Settings page checkbox labels are now clickable and the target format is unable to be selected to be ignored - Users can now also set certain formats to be ignored by the auto importer	2024-11-15 11:47:20 +00:00
crocodilestick	d6b2caa7f8	IN PROGRESS - Continuing work on CWA Settings system for upcoming features	2024-11-14 15:50:21 +00:00
crocodilestick	3befb24e95	Fixed errors in schema, the adding of new settings to existing databases and fixed jinja errors in cwa_settings.html	2024-11-11 15:52:42 +00:00
crocodilestick	e6ae408268	Working on adding support to give the user more control over auto-conversion, formats, etc.	2024-11-11 13:43:26 +00:00
crocodilestick	3b1da04a16	Implementing user toggleable CWA settings for update notifications and auto-backup behaviour during ingest	2024-09-25 09:43:12 +00:00

38 Commits
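
Commit 7c58906e53 above describes replacing hardcoded (title, author) matching with a grouping key built from whichever metadata fields are enabled. The following is only a minimal sketch of that idea, not the project's actual code: the names DEFAULT_CRITERIA, duplicate_key and find_duplicates, the dict-based book records, and the empty-string fallback are all assumptions made for illustration. The defaults mirror the commit message (title, author and language on; series, publisher and format off).

```python
from collections import defaultdict

# Hypothetical defaults mirroring the commit message: title, author and
# language enabled; series, publisher and format disabled.
DEFAULT_CRITERIA = {
    "title": True, "author": True, "language": True,
    "series": False, "publisher": False, "format": False,
}

def duplicate_key(book, criteria=DEFAULT_CRITERIA):
    """Build a grouping key from the enabled metadata fields only."""
    parts = []
    for field, enabled in criteria.items():
        if enabled:
            # Assumed fallback: missing or null metadata becomes "".
            value = book.get(field) or ""
            parts.append(str(value).strip().lower())
    return tuple(parts)

def find_duplicates(books, criteria=DEFAULT_CRITERIA):
    """Group books by key and keep only groups with more than one book."""
    groups = defaultdict(list)
    for book in books:
        groups[duplicate_key(book, criteria)].append(book)
    return [group for group in groups.values() if len(group) > 1]

# Illustrative records (made up for this sketch).
books = [
    {"id": 1, "title": "Dune", "author": "Frank Herbert", "language": "eng"},
    {"id": 2, "title": "dune ", "author": "Frank Herbert", "language": "eng"},
    {"id": 3, "title": "Dune", "author": "Frank Herbert", "language": "deu"},
]
dupes = find_duplicates(books)
```

With language included in the key, only books 1 and 2 are grouped; passing `dict(DEFAULT_CRITERIA, language=False)` would group all three, which is the behaviour the commit message says was being corrected.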