Progress Syncing Module
Syncs reading progress across e-readers using file checksums for book identification.
Quick Start
from cps.progress_syncing import calculate_and_store_checksum, get_latest_checksum
# Generate and store checksum
checksum = calculate_and_store_checksum(
    book_id=123,
    book_format='EPUB',
    file_path='/calibre-library/Author/Book/book.epub'
)
# Returns: '9e107d9d372bb6826bd81d3542a419d6'
# Retrieve for sync lookup
checksum = get_latest_checksum(book_id=123, book_format='EPUB')
Structure
cps/progress_syncing/
├── models.py            # Database schema
├── checksums/
│   ├── koreader.py      # Partial MD5 algorithm
│   └── manager.py       # Storage and retrieval
└── protocols/
    └── kosync.py        # KOSync protocol (KOReader)
Checksum Algorithm
KOReader partialMD5: Samples 1024 bytes at each of 11 offsets (0, 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 256M, 1G) instead of hashing the entire file, then returns the MD5 of those samples as a 32-character hex string.
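The sampling scheme described above can be sketched as follows. This is a minimal illustration, not the code in checksums/koreader.py: the function name `partial_md5_sketch` is hypothetical, the offsets are taken from the list in this document, and stopping at the first offset past the end of the file is an assumption. Treat the module's own `calculate_koreader_partial_md5` as authoritative when digests must match what a device computes.

```python
import hashlib

SAMPLE_SIZE = 1024
# Offsets as listed in this document: 0, then 4K, 16K, ..., 1G (11 in total).
OFFSETS = [0] + [1024 << (2 * i) for i in range(1, 11)]


def partial_md5_sketch(filepath):
    """Hash 1024-byte samples at fixed offsets; return hex digest or None."""
    try:
        with open(filepath, "rb") as f:
            md5 = hashlib.md5()
            for offset in OFFSETS:
                f.seek(offset)
                sample = f.read(SAMPLE_SIZE)
                if not sample:
                    break  # assumption: stop once an offset is past end of file
                md5.update(sample)
            return md5.hexdigest()
    except OSError:
        # Mirrors the "checksum or None" contract shown in the API section.
        return None
```

Because at most 11 KiB is read, the cost stays roughly constant regardless of file size, which is the point of sampling rather than hashing the whole book.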
Auto-generated on:
- Book import
- Metadata enforcement
- EPUB fixes
- Format conversion
- OPDS download with embedded metadata
- Container startup (backfills existing books)
History tracking: All checksums are stored indefinitely. When a file changes, its new checksum is added alongside the old ones rather than replacing them, so a device can still sync even if it holds an older version of the file.
Database Schema
CREATE TABLE book_format_checksums (
    id INTEGER PRIMARY KEY,
    book INTEGER REFERENCES books(id),
    format TEXT,        -- 'EPUB', 'PDF', etc.
    checksum TEXT(32),  -- MD5 hex
    version TEXT,       -- 'koreader'
    created TIMESTAMP
);
CREATE INDEX idx_checksum_lookup ON book_format_checksums(checksum, format);
The table is stored in metadata.db. Queries order by created DESC so that the most recent checksum is preferred.
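To illustrate how the schema and the created DESC ordering work together, here is a small sketch against an in-memory SQLite database. The helper names `latest_checksum` and `book_for_checksum` are hypothetical and the sample rows are made up; the foreign key to books is left out only because this demo has no books table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book_format_checksums (
    id INTEGER PRIMARY KEY,
    book INTEGER,
    format TEXT,
    checksum TEXT(32),
    version TEXT,
    created TIMESTAMP
);
CREATE INDEX idx_checksum_lookup ON book_format_checksums(checksum, format);
""")

# Two checksums for the same book: the file changed between them.
conn.executemany(
    "INSERT INTO book_format_checksums (book, format, checksum, version, created) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (123, "EPUB", "a" * 32, "koreader", "2025-11-01T09:00:00"),
        (123, "EPUB", "b" * 32, "koreader", "2025-11-09T10:30:00"),
    ],
)


def latest_checksum(conn, book_id, book_format):
    """Return the newest checksum for a book/format, or None."""
    row = conn.execute(
        "SELECT checksum FROM book_format_checksums "
        "WHERE book = ? AND format = ? ORDER BY created DESC LIMIT 1",
        (book_id, book_format),
    ).fetchone()
    return row[0] if row else None


def book_for_checksum(conn, checksum):
    """Resolve any stored checksum (old or new) back to its book."""
    return conn.execute(
        "SELECT book, format, version FROM book_format_checksums "
        "WHERE checksum = ? ORDER BY created DESC LIMIT 1",
        (checksum,),
    ).fetchone()


newest = latest_checksum(conn, 123, "EPUB")
older_match = book_for_checksum(conn, "a" * 32)
```

Here `newest` is the later of the two checksums, while the older checksum still resolves to the same book, which is what lets a device with a stale file keep syncing.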
API
Checksum Management
from cps.progress_syncing import (
    calculate_koreader_partial_md5,
    calculate_and_store_checksum,
    get_latest_checksum,
    get_checksum_history,
)
# Calculate without storing
checksum = calculate_koreader_partial_md5('/path/to/book.epub')
# => '9e107d9d372bb6826bd81d3542a419d6' or None
# Calculate and store
checksum = calculate_and_store_checksum(
    book_id=123,
    book_format='EPUB',
    file_path='/path/to/book.epub'
)
# Get latest
checksum = get_latest_checksum(book_id=123, book_format='EPUB')
# Get history (newest first)
history = get_checksum_history(book_id=123, book_format='EPUB')
# => [('abc...', 'koreader', '2025-11-09T10:30:00'), ...]
Book Lookup
from cps.progress_syncing.protocols.kosync import get_book_by_checksum
result = get_book_by_checksum('9e107d9d372bb6826bd81d3542a419d6')
if result:
    # book_format avoids shadowing the built-in format()
    book_id, book_format, title, path, version = result
KOSync Protocol
HTTP Basic Auth endpoints for KOReader devices:
- GET /kosync - Plugin download
- GET /kosync/users/auth - Authentication
- GET /kosync/syncs/progress/<checksum> - Get progress
- PUT /kosync/syncs/progress - Update progress
Progress is stored in the kosync_progress table (app.db). Each update also writes to the ReadBook and KoboReadingState tables, which keeps the web UI and Kobo device sync up to date.
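From the client side, a request to these endpoints might be built as sketched below. The requests are only constructed, not sent. The server address is a placeholder, the helper names are hypothetical, and the `progress` and `percentage` body fields are assumptions about the payload; only the `document` field and the use of HTTP Basic Auth come from this document.

```python
import base64
import json
import urllib.request

BASE_URL = "http://cwa.example.local:8083"  # hypothetical server address


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_get_progress(checksum, username, password):
    """GET /kosync/syncs/progress/<checksum>"""
    req = urllib.request.Request(
        f"{BASE_URL}/kosync/syncs/progress/{checksum}", method="GET"
    )
    req.add_header("Authorization", basic_auth_header(username, password))
    return req


def build_put_progress(checksum, progress, percentage, username, password):
    """PUT /kosync/syncs/progress with a JSON body."""
    body = json.dumps(
        {
            "document": checksum,      # partial MD5 identifying the file
            "progress": progress,      # assumed field: reader position
            "percentage": percentage,  # assumed field: fraction read
        }
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/kosync/syncs/progress", data=body, method="PUT"
    )
    req.add_header("Authorization", basic_auth_header(username, password))
    req.add_header("Content-Type", "application/json")
    return req
```

Passing a built request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.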
Testing
# Unit tests - Algorithm, storage, database
pytest tests/unit/test_progress_syncing_*.py -v
# Integration tests - Full sync workflow
pytest tests/integration/test_progress_syncing_*.py -v
pytest tests/integration/test_kosync_*.py -v
Extending
New Sync Protocol
# protocols/new_protocol.py
from flask import Blueprint, request
from .kosync import get_book_by_checksum
new_protocol = Blueprint('new_protocol', __name__)
@new_protocol.route('/new_protocol/sync', methods=['PUT'])
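def sync():
    checksum = request.json['document']
    result = get_book_by_checksum(checksum)
    if not result:
        # No book in the library matches this checksum (placeholder response)
        return {'error': 'unknown document'}, 404
    book_id, book_format, title, path, version = result
    # Implement sync logic
    return {'document': checksum}, 200  # placeholder response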
Register in cps/main.py:
from .progress_syncing.protocols.new_protocol import new_protocol
app.register_blueprint(new_protocol)
New Checksum Algorithm
# checksums/new_algorithm.py
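def calculate_new_algorithm(filepath: str) -> str:
    # Implementation: compute and return the checksum string for filepath
    raise NotImplementedError
# Store alongside existing
from .manager import store_checksum
new_checksum = calculate_new_algorithm('/path/to/book.epub')
store_checksum(book_id=123, book_format='EPUB',
               checksum=new_checksum, version='algorithm_v2')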
The database supports multiple algorithm versions per book without schema changes: each row records the algorithm that produced it in the version column.
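As a rough illustration of how rows from several algorithms can coexist for one book, the sketch below stores three checksums in an in-memory SQLite table and picks the newest one per version. The helper `latest_per_version` and the sample values are hypothetical, and the table is reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book_format_checksums ("
    "id INTEGER PRIMARY KEY, book INTEGER, format TEXT, "
    "checksum TEXT, version TEXT, created TIMESTAMP)"
)
conn.executemany(
    "INSERT INTO book_format_checksums (book, format, checksum, version, created) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (123, "EPUB", "old-koreader", "koreader", "2025-11-01T09:00:00"),
        (123, "EPUB", "new-koreader", "koreader", "2025-11-09T10:30:00"),
        (123, "EPUB", "v2-digest", "algorithm_v2", "2025-11-05T12:00:00"),
    ],
)


def latest_per_version(conn, book_id, book_format):
    """Map each algorithm version to its most recent checksum."""
    result = {}
    rows = conn.execute(
        "SELECT version, checksum FROM book_format_checksums "
        "WHERE book = ? AND format = ? ORDER BY created DESC",
        (book_id, book_format),
    )
    for version, checksum in rows:
        # Rows arrive newest first, so the first one seen per version wins.
        result.setdefault(version, checksum)
    return result


versions = latest_per_version(conn, 123, "EPUB")
```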
References
- KOReader partialMD5
- KOSync protocol
- CWA plugin: koreader/plugins/cwasync.koplugin/