2.0.20.1364 RC9

submod: OCR update eng data
update guessit to d96859d056864b8956cbeb8c8f5bb6875d270e39
2017-05-24 21:47:51 +02:00 · 2017-05-24 21:41:51 +02:00 · 2017-05-24 21:40:12 +02:00 · 2017-05-24 18:03:53 +02:00 · 2017-05-24 18:02:23 +02:00 · 2017-05-24 16:24:23 +02:00
105 changed files with 25128 additions and 735 deletions
@@ -1,3 +1,110 @@
+2.0.19.1337 RC8
+- napiprojekt: fixed: couldn't convert MicroDVD to SRT on certain occasions
+- core: when normalize to UTF-8 is enabled, also store the subtitle in UTF-8 encoding in the internal storage
+- core: add more encodings for western/eastern/northern europe
+- submod: OCR: update dictionaries from SubtitleEdit
+- submod: common: be smarter about uppercase i's in words that should have lowercase L's
+- submod: fix unopened/unclosed font style tags after modification
+- core: re-enable OMDB support
+- core: update guessit for better matching
+- core: fix SearchAllRecentlyMissing (was broken since RC3)
+
+
+2.0.19.1299 RC7
+- submod: offset mods now get merged internally when applied multiple times (to avoid errors and increase performance)
+- submod: improve performance
+- submod: core mods (OCR, common, remove_HI) now are always applied in a fixed order internally, regardless of the order they were added in
+- submod: CM_spaces_in_numbers: don't break up ellipses (30... 29... 28...)
+- submod: CM_spaces_in_numbers: don't fix countdown numbers (30, 29, 28)
+- submod: remove_HI: make bracket removal more aggressive
+- submod: remove_HI: be less aggressive when removing text-before-colon
+- submod: remove_HI: remove all-uppercase-before-sentence (THIS IS ALL UPPERCASE And here starts a sentence -> And here starts a sentence)
+- submod: fix all character ranges to include non-ASCII characters
+- add new README for 2.0
+
+
+2.0.19.1267 RC6
+- core: add new SZ subtitle storage format
+  - smaller data files and less cumbersome
+  - it will auto-migrate when old data is accessed; to speed this up, use "Trigger subtitle storage migration (expensive)" in the advanced menu
+- core: performance optimizations
+- addic7ed: when release group matches, assume the format matches, too (leftover change from RC5)
+- submod: fix patterns for beginlines/endlines
+- submod: add our own dictionaries to OCR fixes (english)
+- submod: hearing impaired: also remove full-caps with punctuation inside
+- submod: correctly handle partiallines
+- submod: in numbers with spaces (incorrect), also allow for some punctuation (,.:')
+
+
+2.0.18.1245 RC5
+- core: add more debug info
+- core: fix subtitle modifications (was broken in RC4, created non-usable subtitles)
+- submod: add ANSI colors
+- menu/submod: add color mod menu
+- submod: exclusive mods now are mutually exclusive and get cleaned on duplicate
+- menu/core: naming
+
+For everyone who runs RC4: your subtitles are broken. Go to the advanced menu and trigger `Re-Apply mods of all stored subtitles` to fix them.
+
+
+2.0.17.1234 RC4
+- core: backport provider-download-retry implementation
+- core: implement custom user agent (for OpenSubtitles)
+- core/menu: correct handling of media with multiple files
+- core: fix SearchAllRecentlyMissing; also wait 5 seconds between searches
+- core: SearchAllRecentlyMissing: honor physical ignores
+- submod: pattern fixes
+- submod: better unicode handling
+- submod: add color mod (only automatic for now)
+
+
+2.0.15.1216 RC3
+- core: fixes
+- scheduler: revert some of the aggressive changes in RC2
+- submod: be smarter about WholeLine matches
+
+
+2.0.15.1209 RC2
+- core: fixes
+- core: submod-common: fix multiple dots at start of line
+- core/menu: add subtitle modification debug setting
+- core/menu: when manually listing available subtitles in menu, display those with wrong FPS also (opensubtitles), because you can fix them later
+- core/menu: advanced-menu: add apply-all-default-mods menu item; add re-apply all mods menu item
+- core: always look for currently (not-) existing subtitles when called; hopefully fixes #276
+- scheduler/menu: be faster; also launch scheduled tasks in threads, not just manually launched ones
+- core: don't delete subtitles with .custom or .embedded in their filenames when running auto cleanup, if the correct media file exists
+- menu: add back-to-previous menu items
+
+
+2.0.12.1180 RC1
+- core: update subliminal to version 2
+- core: update all dependencies
+- core: add new providers: legendastv (pt-BR), napiprojekt (pl), shooter (cn), subscenter (heb)
+- core: rewrote all subliminal patches for version 2
+- menu: add icons for menu items; update main channel icon
+- core: use SSL again for opensubtitles
+- core: improved matching due to subliminal 2 (and SZ custom) tvdb/omdb refiners
+- menu: add "Get my logs" function to the advanced menu, which zips up all necessary logs suitable for posting in the forums
+- core: on non-windows systems, utilize a file-based cache database for provider media lists and subliminal refiner results
+- core: add manual and automatic subtitle modification framework (fix common OCR issues, remove hearing impaired etc.)
+- menu: add subtitle modifications (subtitle content fixes, offset-based shifting, framerate conversion)
+- menu: add recently played menu
+- improve almost everything Sub-Zero did in 1.4 :)
+
+
+1.4.27.973
+- core: ignore "obfuscated" and "scrambled" tags in filenames when searching for subtitles
+- core: exotic embedded subtitles are now also considered when searching (and when the option is enabled); fixes #264
+
+
+1.4.27.967
+- core: remember the last 10 played items; only consider on_playback for "playing" state within the first 60 seconds of an item
+
+
+1.4.27.965
+- core: on_playback activity bugfixes
+
+
 1.4.27.957
 - core: correctly fall back to the next best subtitle if the current one couldn't be downloaded; hopefully fixes #231
 - core: add "Scan: which external subtitles should be picked up?"-setting
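The RC7 changelog entry says offset mods are merged internally when applied multiple times. A minimal sketch of that idea follows. It assumes, hypothetically, that a mod is either a plain string identifier or an `("offset", milliseconds)` tuple; neither that representation nor the name `merge_offsets` comes from Sub-Zero itself.

```python
def merge_offsets(mods):
    """Collapse all ("offset", ms) entries into one, keeping other mods in order.

    Hypothetical representation: a mod is either a plain string identifier
    or a tuple ("offset", milliseconds).
    """
    merged = []
    total_ms = 0
    offset_slot = None  # index where the single merged offset will live

    for mod in mods:
        if isinstance(mod, tuple) and mod[0] == "offset":
            total_ms += mod[1]
            if offset_slot is None:
                offset_slot = len(merged)
                merged.append(None)  # placeholder, filled in after the loop
        else:
            merged.append(mod)

    if offset_slot is not None:
        if total_ms == 0:
            # Offsets cancelled out, so no shift needs to be applied at all.
            del merged[offset_slot]
        else:
            merged[offset_slot] = ("offset", total_ms)
    return merged
```

Applying one summed shift instead of several consecutive ones means the subtitle timings are rewritten only once, which matches the stated goals of avoiding errors and improving performance.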
@@ -24,12 +24,11 @@ import support
 import interface
 sys.modules["interface"] = interface

-from subliminal.cli import MutexLock
 from subzero.constants import OS_PLEX_USERAGENT, PERSONAL_MEDIA_IDENTIFIER
 from interface.menu import *
 from support.plex_media import media_to_videos, get_media_item_ids, scan_videos
 from support.subtitlehelpers import get_subtitles_from_metadata
-from support.storage import whack_missing_parts, save_subtitles, get_subtitle_storage
+from support.storage import whack_missing_parts, save_subtitles
 from support.items import is_ignored
 from support.config import config
 from support.lib import get_intent
@@ -43,13 +42,7 @@ def Start():
    HTTP.CacheTime = 0
    HTTP.Headers['User-agent'] = OS_PLEX_USERAGENT

-    try:
-        subliminal.region.configure('dogpile.cache.dbm', expiration_time=datetime.timedelta(days=30),
-                                    arguments={'filename': os.path.join(config.data_items_path, 'subzero.dbm'),
-                                               'lock_factory': MutexLock})
-    except:
-        Log.Warn("Not using file based cache!")
-        subliminal.region.configure('dogpile.cache.memory')
+    config.init_cache()

    # clear expired intents
    intent = get_intent()
@@ -191,6 +184,9 @@ class SubZeroAgent(object):
            config.init_subliminal_patches()
            videos = media_to_videos(media, kind=self.agent_type)

+            # find local media
+            update_local_media(metadata, media, media_type=self.agent_type)
+
            # media ignored?
            use_any_parts = False
            for video in videos:
@@ -211,9 +207,6 @@ class SubZeroAgent(object):

            set_refresh_menu_state(media, media_type=self.agent_type)

-            # find local media
-            update_local_media(metadata, media, media_type=self.agent_type)
-
            # scanned_video_part_map = {subliminal.Video: plex_part, ...}
            scanned_video_part_map = scan_videos(videos, kind=self.agent_type)

@@ -18,3 +18,6 @@ sys.modules["interface.refresh_item"] = refresh_item

 import item_details
 sys.modules["interface.item_details"] = item_details
+
+import sub_mod
+sys.modules["interface.modification"] = sub_mod
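The hunk above registers `sub_mod` under the dotted name `interface.modification` by writing to `sys.modules` directly. A small stand-alone sketch of that aliasing technique, using stub modules created with `types.ModuleType` (the stubs and the `GREETING` attribute are illustrative assumptions, not part of the plugin):

```python
import importlib
import sys
import types

# Hypothetical stand-ins for the real `interface` and `sub_mod` modules.
interface = types.ModuleType("interface")
sub_mod = types.ModuleType("sub_mod")
sub_mod.GREETING = "hello from sub_mod"

# Register the parent first, then the same module object under a dotted alias.
sys.modules["interface"] = interface
sys.modules["interface.modification"] = sub_mod

# The import machinery consults sys.modules first, so the alias resolves
# to the very same module object without touching the filesystem.
aliased = importlib.import_module("interface.modification")
```

Because both names point at one object, state set through either name is visible through the other.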
@@ -3,19 +3,23 @@ import datetime
 import StringIO
 import glob
 import os
+import traceback
 import urlparse

 from zipfile import ZipFile, ZIP_DEFLATED

+from babelfish import Language
+
 from subzero.lib.io import FileIO
 from subzero.constants import PREFIX, PLUGIN_IDENTIFIER
-from menu_helpers import SubFolderObjectContainer, debounce, set_refresh_menu_state, ZipObject
+from menu_helpers import SubFolderObjectContainer, debounce, set_refresh_menu_state, ZipObject, ObjectContainer
 from main import fatality
 from support.helpers import timestamp, pad_title
 from support.config import config
 from support.lib import Plex
-from support.storage import reset_storage, log_storage
+from support.storage import reset_storage, log_storage, get_subtitle_storage
 from support.scheduler import scheduler
+from support.items import set_mods_for_part, get_item_kind_from_rating_key


@route(PREFIX + '/advanced')
@@ -49,6 +53,18 @@ def AdvancedMenu(randomize=None, header=None, message=None):
        key=Callback(TriggerStorageMaintenance, randomize=timestamp()),
        title=pad_title("Trigger subtitle storage maintenance"),
    ))
+    oc.add(DirectoryObject(
+        key=Callback(TriggerStorageMigration, randomize=timestamp()),
+        title=pad_title("Trigger subtitle storage migration (expensive)"),
+    ))
+    oc.add(DirectoryObject(
+        key=Callback(ApplyDefaultMods, randomize=timestamp()),
+        title=pad_title("Apply configured default subtitle mods to all (active) stored subtitles"),
+    ))
+    oc.add(DirectoryObject(
+        key=Callback(ReApplyMods, randomize=timestamp()),
+        title=pad_title("Re-Apply mods of all stored subtitles"),
+    ))
    oc.add(DirectoryObject(
        key=Callback(LogStorage, key="tasks", randomize=timestamp()),
        title=pad_title("Log the plugin's scheduled tasks state storage"),
@@ -92,6 +108,7 @@ def Restart():


@route(PREFIX + '/storage/reset', sure=bool)
+@debounce
 def ResetStorage(key, randomize=None, sure=False):
    if not sure:
        oc = SubFolderObjectContainer(no_history=True, title1="Reset subtitle storage", title2="Are you sure?")
@@ -127,6 +144,7 @@ def LogStorage(key, randomize=None):


@route(PREFIX + '/triggerbetter')
+@debounce
 def TriggerBetterSubtitles(randomize=None):
    scheduler.dispatch_task("FindBetterSubtitles")
    return AdvancedMenu(
@@ -137,6 +155,7 @@ def TriggerBetterSubtitles(randomize=None):


@route(PREFIX + '/triggermaintenance')
+@debounce
 def TriggerStorageMaintenance(randomize=None):
    scheduler.dispatch_task("SubtitleStorageMaintenance")
    return AdvancedMenu(
@@ -146,27 +165,111 @@ def TriggerStorageMaintenance(randomize=None):
    )


+@route(PREFIX + '/triggerstoragemigration')
+@debounce
+def TriggerStorageMigration(randomize=None):
+    scheduler.dispatch_task("MigrateSubtitleStorage")
+    return AdvancedMenu(
+        randomize=timestamp(),
+        header='Success',
+        message='MigrateSubtitleStorage triggered'
+    )
+
+
+def apply_default_mods(reapply_current=False):
+    storage = get_subtitle_storage()
+    subs_applied = 0
+    for fn in storage.get_all_files():
+        data = storage.load(None, filename=fn)
+        if data:
+            video_id = data.video_id
+            item_type = get_item_kind_from_rating_key(video_id)
+            if not item_type:
+                continue
+
+            for part_id, part in data.parts.iteritems():
+                for lang, subs in part.iteritems():
+                    current_sub = subs.get("current")
+                    if not current_sub:
+                        continue
+                    sub = subs[current_sub]
+
+                    if not sub.content:
+                        continue
+
+                    current_mods = sub.mods or []
+                    if not reapply_current:
+                        add_mods = list(set(config.default_mods).difference(set(current_mods)))
+                        if not add_mods:
+                            continue
+                    else:
+                        if not current_mods:
+                            continue
+                        add_mods = []
+
+                    try:
+                        set_mods_for_part(video_id, part_id, Language.fromietf(lang), item_type, add_mods, mode="add")
+                    except:
+                        Log.Error("Couldn't set mods for %s:%s: %s", video_id, part_id, traceback.format_exc())
+                        continue
+
+                    subs_applied += 1
+    Log.Debug("Applied mods to %i items" % subs_applied)
+
+
+@route(PREFIX + '/applydefaultmods')
+@debounce
+def ApplyDefaultMods(randomize=None):
+    Thread.CreateTimer(1.0, apply_default_mods)
+    return AdvancedMenu(
+        randomize=timestamp(),
+        header='Success',
+        message='This may take some time ...'
+    )
+
+
+@route(PREFIX + '/reapplyallmods')
+@debounce
+def ReApplyMods(randomize=None):
+    Thread.CreateTimer(1.0, apply_default_mods, reapply_current=True)
+    return AdvancedMenu(
+        randomize=timestamp(),
+        header='Success',
+        message='This may take some time ...'
+    )
+
+
@route(PREFIX + '/get_logs_link')
 def GetLogsLink():
+    if not config.plex_token:
+        oc = ObjectContainer(title2="Download Logs", no_cache=True, no_history=True,
+                             header="Sorry, feature unavailable",
+                             message="Universal Plex token not available")
+        return oc
+
    # try getting the link base via the request in context, first, otherwise use the public ip
    req_headers = Core.sandbox.context.request.headers
+    get_external_ip = True
+    link_base = ""

    if "Origin" in req_headers:
        link_base = req_headers["Origin"]
        Log.Debug("Using origin-based link_base")
+        get_external_ip = False

    elif "Referer" in req_headers:
        parsed = urlparse.urlparse(req_headers["Referer"])
        link_base = "%s://%s:%s" % (parsed.scheme, parsed.hostname, parsed.port)
        Log.Debug("Using referer-based link_base")
+        get_external_ip = False

-    else:
+    if get_external_ip or "plex.tv" in link_base:
        ip = Core.networking.http_request("http://www.plexapp.com/ip.php", cacheTime=7200).content.strip()
        link_base = "https://%s:32400" % ip
        Log.Debug("Using ip-based fallback link_base")

-    logs_link = "%s%s?X-Plex-Token=%s" % (link_base, PREFIX + '/logs', config.universal_plex_token)
-    oc = ObjectContainer(title2="Download Logs", no_cache=True, no_history=True,
+    logs_link = "%s%s?X-Plex-Token=%s" % (link_base, PREFIX + '/logs', config.plex_token)
+    oc = ObjectContainer(title2=logs_link, no_cache=True, no_history=True,
                         header="Copy this link and open this in your browser, please",
                         message=logs_link)
    return oc
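The fallback order in `GetLogsLink` (Origin header, then Referer, then the public IP when neither is usable or the base points at plex.tv) can be sketched as a pure function. This is an illustration in Python 3 using `urllib.parse`, whereas the plugin code in the diff uses the Python 2 `urlparse` module. The function name `resolve_link_base` and the `fetch_external_ip` callable are assumptions made here. Unlike the diff, the sketch leaves the port out when the Referer carries none; that is a choice of this example, not something the commit does.

```python
from urllib.parse import urlparse


def resolve_link_base(headers, fetch_external_ip):
    """Return the base URL for the logs link, following the order used above.

    `headers` is a plain dict of request headers; `fetch_external_ip` is a
    hypothetical zero-argument callable returning the public IP as a string.
    """
    link_base = ""
    need_external_ip = True

    if "Origin" in headers:
        link_base = headers["Origin"]
        need_external_ip = False
    elif "Referer" in headers:
        parsed = urlparse(headers["Referer"])
        link_base = "%s://%s" % (parsed.scheme, parsed.hostname or "")
        # Only append the port when the URL actually carries one.
        if parsed.port is not None:
            link_base += ":%d" % parsed.port
        need_external_ip = False

    # Fall back to the public IP when no header was usable
    # or when the derived base points at plex.tv.
    if need_external_ip or "plex.tv" in link_base:
        link_base = "https://%s:32400" % fetch_external_ip()

    return link_base
```

Passing the IP lookup in as a callable keeps the function free of network access, so the header handling can be checked in isolation.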
@@ -189,6 +292,7 @@ def DownloadLogs():


@route(PREFIX + '/invalidatecache')
+@debounce
 def InvalidateCache(randomize=None):
    from subliminal.cache import region
    region.invalidate()
@@ -1,23 +1,19 @@
 # coding=utf-8
 import os
-import traceback

-from babelfish import Language
-
-from subzero.constants import PREFIX
+from sub_mod import SubtitleModificationsMenu
 from menu_helpers import debounce, SubFolderObjectContainer, default_thumb, add_ignore_options, get_item_task_data, \
    set_refresh_menu_state
+
 from refresh_item import RefreshItem
+from subzero.constants import PREFIX
+from support.config import config
 from support.helpers import timestamp, cast_bool, df, get_language
 from support.items import get_item_kind_from_rating_key, get_item, get_current_sub
-from support.plex_media import get_plex_metadata, scan_videos
 from support.lib import Plex
-from support.storage import get_subtitle_storage, save_subtitles
-from support.config import config
+from support.plex_media import get_plex_metadata, scan_videos, PMSMediaProxy
 from support.scheduler import scheduler
-
-from subliminal_patch import PatchedSubtitle as Subtitle
-from subzero.modification import registry as mod_registry
+from support.storage import get_subtitle_storage


@route(PREFIX + '/item/{rating_key}/actions')
@@ -41,6 +37,29 @@ def ItemDetailsMenu(rating_key, title=None, base_title=None, item_title=None, ra
    timeout = 30

    oc = SubFolderObjectContainer(title2=title, replace_parent=True)
+
+    # add back to season for episode
+    if current_kind == "episode":
+        from interface.menu import MetadataMenu
+        show = get_item(item.show.rating_key)
+        season = get_item(item.season.rating_key)
+
+        oc.add(DirectoryObject(
+            key=Callback(MetadataMenu, rating_key=season.rating_key, title=season.title, base_title=show.title,
+                         previous_item_type="show", previous_rating_key=show.rating_key,
+                         display_items=True, randomize=timestamp()),
+            title=u"< Back to %s" % season.title,
+            summary="Back to %s > %s" % (show.title, season.title),
+            thumb=season.thumb or default_thumb
+        ))
+
+    oc.add(DirectoryObject(
+        key=Callback(UpdateLocalMedia, rating_key=rating_key, title=title, item_title=item_title, base_title=base_title,
+                     randomize=timestamp()),
+        title=u"Find local subtitles (doesn't refresh metadata)",
+        summary="Searches for locally available subtitles",
+        thumb=item.thumb or default_thumb
+    ))
    oc.add(DirectoryObject(
        key=Callback(RefreshItem, rating_key=rating_key, item_title=item_title, randomize=timestamp(),
                     timeout=timeout * 1000),
@@ -51,7 +70,7 @@ def ItemDetailsMenu(rating_key, title=None, base_title=None, item_title=None, ra
    oc.add(DirectoryObject(
        key=Callback(RefreshItem, rating_key=rating_key, item_title=item_title, force=True, randomize=timestamp(),
                     timeout=timeout * 1000),
-        title=u"Auto-search: %s" % item_title,
+        title=u"Force-find subtitles: %s" % item_title,
        summary="Issues a forced refresh, ignoring known subtitles and searching for new ones",
        thumb=item.thumb or default_thumb
    ))
@@ -63,52 +82,76 @@ def ItemDetailsMenu(rating_key, title=None, base_title=None, item_title=None, ra
    # get the plex item
    plex_item = list(Plex["library"].metadata(rating_key))[0]

-    # get current media info for that item
-    media = plex_item.media
-
    # look for subtitles for all available media parts and all of their languages
-    for part in media.parts:
-        filename = os.path.basename(part.file)
-        part_id = str(part.id)
+    has_multiple_parts = len(plex_item.media) > 1
+    part_index = 0
+    for media in plex_item.media:
+        for part in media.parts:
+            filename = os.path.basename(part.file)
+            if not os.path.exists(part.file):
+                continue

-        # iterate through all configured languages
-        for lang in config.lang_list:
-            lang_a2 = lang.alpha2
-            # ietf lang?
-            if cast_bool(Prefs["subtitles.language.ietf"]) and "-" in lang_a2:
-                lang_a2 = lang_a2.split("-")[0]
+            part_id = str(part.id)
+            part_index += 1

-            # get corresponding stored subtitle data for that media part (physical media item), for language
-            current_sub = stored_subs.get_any(part_id, lang_a2)
-            current_sub_id = None
-            current_sub_provider_name = None
+            # iterate through all configured languages
+            for lang in config.lang_list:
+                lang_a2 = lang.alpha2
+                # ietf lang?
+                if cast_bool(Prefs["subtitles.language.ietf"]) and "-" in lang_a2:
+                    lang_a2 = lang_a2.split("-")[0]

-            summary = u"No current subtitle in storage"
-            current_score = None
-            if current_sub:
-                current_sub_id = current_sub.id
-                current_sub_provider_name = current_sub.provider_name
-                current_score = current_sub.score
+                # get corresponding stored subtitle data for that media part (physical media item), for language
+                current_sub = stored_subs.get_any(part_id, lang_a2)
+                current_sub_id = None
+                current_sub_provider_name = None

-                summary = u"Current subtitle: %s (added: %s, %s), Language: %s, Score: %i, Storage: %s" % \
-                          (current_sub.provider_name, df(current_sub.date_added), current_sub.mode_verbose, lang,
-                           current_sub.score, current_sub.storage_type)
+                part_index_addon = ""
+                part_summary_addon = ""
+                if has_multiple_parts:
+                    part_index_addon = u"File %s: " % part_index
+                    part_summary_addon = "%s " % filename

-            oc.add(DirectoryObject(
-                key=Callback(SubtitleOptionsMenu, rating_key=rating_key, part_id=part_id, title=title,
-                             item_title=item_title, language=lang, language_name=lang.name, current_id=current_sub_id,
-                             item_type=plex_item.type, filename=filename, current_data=summary,
-                             randomize=timestamp(), current_provider=current_sub_provider_name,
-                             current_score=current_score),
-                title=u"Actions for %s subtitle" % lang.name,
-                summary=summary
-            ))
+                summary = u"%sNo current subtitle in storage" % part_summary_addon
+                current_score = None
+                if current_sub:
+                    current_sub_id = current_sub.id
+                    current_sub_provider_name = current_sub.provider_name
+                    current_score = current_sub.score
+
+                    summary = u"%sCurrent subtitle: %s (added: %s, %s), Language: %s, Score: %i, Storage: %s" % \
+                              (part_summary_addon, current_sub.provider_name, df(current_sub.date_added),
+                               current_sub.mode_verbose, lang, current_sub.score, current_sub.storage_type)
+
+                oc.add(DirectoryObject(
+                    key=Callback(SubtitleOptionsMenu, rating_key=rating_key, part_id=part_id, title=title,
+                                 item_title=item_title, language=lang, language_name=lang.name, current_id=current_sub_id,
+                                 item_type=plex_item.type, filename=filename, current_data=summary,
+                                 randomize=timestamp(), current_provider=current_sub_provider_name,
+                                 current_score=current_score),
+                    title=u"%sActions for %s subtitle" % (part_index_addon, lang.name),
+                    summary=summary
+                ))

    add_ignore_options(oc, "videos", title=item_title, rating_key=rating_key, callback_menu=IgnoreMenu)

    return oc


+@route(PREFIX + '/item/update_local_media/{rating_key}', force=bool)
+@debounce
+def UpdateLocalMedia(**kwargs):
+    from support.localmedia import find_subtitles
+    rating_key = kwargs["rating_key"]
+    parts = PMSMediaProxy(rating_key).get_all_parts()
+    for part in parts:
+        find_subtitles(part)
+
+    kwargs.pop("randomize")
+
+    return ItemDetailsMenu(**kwargs)
+
+
@route(PREFIX + '/item/current_sub/{rating_key}/{part_id}', force=bool)
@debounce
 def SubtitleOptionsMenu(**kwargs):
@@ -123,7 +166,7 @@ def SubtitleOptionsMenu(**kwargs):
    oc.add(DirectoryObject(
        key=Callback(ItemDetailsMenu, rating_key=kwargs["rating_key"], item_title=kwargs["item_title"],
                     title=kwargs["title"], randomize=timestamp()),
-        title=u"Back to: %s" % kwargs["title"],
+        title=u"< Back to %s" % kwargs["title"],
        summary=kwargs["current_data"],
        thumb=default_thumb
    ))
@@ -141,69 +184,6 @@ def SubtitleOptionsMenu(**kwargs):
    return oc


-@route(PREFIX + '/item/sub_mods/{rating_key}/{part_id}', force=bool)
-@debounce
-def SubtitleModificationsMenu(**kwargs):
-    rating_key = kwargs["rating_key"]
-    part_id = kwargs["part_id"]
-    language = kwargs["language"]
-    current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
-    kwargs.pop("randomize")
-
-    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
-    for identifier, mod in mod_registry.mods.iteritems():
-        oc.add(DirectoryObject(
-            key=Callback(SubtitleApplyMod, mod_identifier=identifier, randomize=timestamp(), **kwargs),
-            title=mod.description
-        ))
-
-    oc.add(DirectoryObject(
-        key=Callback(SubtitleApplyMod, mod_identifier=None, randomize=timestamp(), **kwargs),
-        title="Restore original version",
-        summary=u"Currently applied mods: %s" % (", ".join(current_sub.mods) if current_sub.mods else "none")
-    ))
-
-    return oc
-
-
-@route(PREFIX + '/item/sub_add_mod/{rating_key}/{part_id}/{mod_identifier}', force=bool)
-@debounce
-def SubtitleApplyMod(mod_identifier=None, **kwargs):
-    if mod_identifier is not None and mod_identifier not in mod_registry.mods:
-        raise NotImplementedError
-
-    rating_key = kwargs["rating_key"]
-    part_id = kwargs["part_id"]
-    lang_a2 = kwargs["language"]
-    item_type = kwargs["item_type"]
-
-    language = Language.fromietf(lang_a2)
-
-    current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
-    current_sub.add_mod(mod_identifier)
-
-    storage.save(stored_subs)
-    metadata = get_plex_metadata(rating_key, part_id, item_type)
-    scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
-    video, plex_part = scanned_parts.items()[0]
-
-    subtitle = Subtitle(language, mods=current_sub.mods)
-    subtitle.content = current_sub.content
-    subtitle.plex_media_fps = plex_part.fps
-    subtitle.page_link = "modify subtitles with: %s" % (", ".join(current_sub.mods) if current_sub.mods else "none")
-    subtitle.language = language
-
-    try:
-        save_subtitles(scanned_parts, {video: [subtitle]}, mode="m", bare_save=True)
-        Log.Debug("Modified %s subtitle for: %s:%s with: %s", language.name, rating_key, part_id,
-                  ", ".join(current_sub.mods) if current_sub.mods else "none")
-    except:
-        Log.Error("Something went wrong when modifying subtitle: %s", traceback.format_exc())
-
-    kwargs.pop("randomize")
-    return SubtitleModificationsMenu(randomize=timestamp(), **kwargs)
-
-
@route(PREFIX + '/item/search/{rating_key}/{part_id}', force=bool)
@debounce
 def ListAvailableSubsForItemMenu(rating_key=None, part_id=None, title=None, item_title=None, filename=None,
@@ -223,7 +203,7 @@ def ListAvailableSubsForItemMenu(rating_key=None, part_id=None, title=None, item
    oc = SubFolderObjectContainer(title2=unicode(title), replace_parent=True)
    oc.add(DirectoryObject(
        key=Callback(ItemDetailsMenu, rating_key=rating_key, item_title=item_title, title=title, randomize=timestamp()),
-        title=u"Back to: %s" % title,
+        title=u"< Back to %s" % title,
        summary=current_data,
        thumb=default_thumb
    ))
@@ -269,11 +249,15 @@ def ListAvailableSubsForItemMenu(rating_key=None, part_id=None, title=None, item
        return oc

    for subtitle in search_results:
+        wrong_fps_addon = ""
+        if subtitle.wrong_fps:
+            wrong_fps_addon = " (wrong FPS, sub: %s, media: %s)" % (subtitle.fps, plex_part.fps)
+
        oc.add(DirectoryObject(
            key=Callback(TriggerDownloadSubtitle, rating_key=rating_key, randomize=timestamp(), item_title=item_title,
                         subtitle_id=str(subtitle.id), language=language),
-            title=u"%s: %s, score: %s" % ("Available" if current_id != subtitle.id else "Current",
-                                          subtitle.provider_name, subtitle.score),
+            title=u"%s: %s, score: %s%s" % ("Available" if current_id != subtitle.id else "Current",
+                                            subtitle.provider_name, subtitle.score, wrong_fps_addon),
            summary=u"Release: %s, Matches: %s" % (subtitle.release_info, ", ".join(subtitle.matches)),
            thumb=default_thumb
        ))
@@ -2,11 +2,13 @@

 from subzero.constants import PREFIX, TITLE, ART
 from support.config import config
-from support.helpers import pad_title, timestamp, df
+from support.helpers import pad_title, timestamp, df, get_plex_item_display_title
 from support.scheduler import scheduler
 from support.ignore import ignore_list
-from support.items import get_item_thumb, get_on_deck_items, get_all_items, get_items_info
-from menu_helpers import main_icon, debounce, SubFolderObjectContainer, default_thumb, dig_tree, add_ignore_options
+from support.items import get_item_thumb, get_on_deck_items, get_all_items, get_items_info, get_item, \
+    get_item_kind_from_item
+from menu_helpers import main_icon, debounce, SubFolderObjectContainer, default_thumb, dig_tree, add_ignore_options,\
+    ObjectContainer
 from item_details import ItemDetailsMenu


@@ -69,16 +71,24 @@ def fatality(randomize=None, force_title=None, header=None, message=None, only_r

        oc.add(DirectoryObject(
            key=Callback(OnDeckMenu),
-            title="On Deck items",
+            title="On-deck items",
            summary="Shows the current on deck items and allows you to individually (force-) refresh their metadata/"
                    "subtitles.",
            thumb=R("icon-ondeck.jpg")
        ))
+        if "last_played_items" in Dict and Dict["last_played_items"]:
+            oc.add(DirectoryObject(
+                key=Callback(RecentlyPlayedMenu),
+                title=pad_title("Recently played items"),
+                summary="Shows the %i recently played items and allows you to individually (force-) refresh their "
+                        "metadata/subtitles." % config.store_recently_played_amount,
+                thumb=R("icon-played.jpg")
+            ))
        oc.add(DirectoryObject(
            key=Callback(RecentlyAddedMenu),
-            title="Recently Added items",
+            title="Recently-added items",
            summary="Shows the recently added items per section.",
-            thumb=R("icon-recent.jpg")
+            thumb=R("icon-added.jpg")
        ))
        oc.add(DirectoryObject(
            key=Callback(RecentMissingSubtitlesMenu, randomize=timestamp()),
@@ -168,6 +178,31 @@ def OnDeckMenu(message=None):
    return mergedItemsMenu(title="Items On Deck", base_title="Items On Deck", itemGetter=get_on_deck_items)


+@route(PREFIX + '/recently_played')
+def RecentlyPlayedMenu():
+    base_title = "Recently Played"
+    oc = SubFolderObjectContainer(title2=base_title, replace_parent=True)
+
+    for item in [get_item(rating_key) for rating_key in Dict["last_played_items"]]:
+        kind = get_item_kind_from_item(item)
+        if kind not in ("episode", "movie"):
+            continue
+
+        if kind == "episode":
+            item_title = get_plex_item_display_title(item, "show", parent=item.season, section_title=None,
+                                                     parent_title=item.show.title)
+        else:
+            item_title = get_plex_item_display_title(item, kind, section_title=None)
+
+        oc.add(DirectoryObject(
+            title=item_title,
+            key=Callback(ItemDetailsMenu, title=base_title + " > " + item.title, item_title=item.title,
+                         rating_key=item.rating_key)
+        ))
+
+    return oc
+
+
@route(PREFIX + '/recently_added')
 def RecentlyAddedMenu(message=None):
    """
@@ -215,8 +250,6 @@ def RecentMissingSubtitlesMenu(force=False, randomize=None):
                thumb=get_item_thumb(item) or default_thumb
            ))

-        scheduler.clear_task_data("MissingSubtitles")
-
    return oc


@@ -1,5 +1,8 @@
 # coding=utf-8
+import locale
 import logging
+import os
+
 import logger

 from item_details import ItemDetailsMenu
@@ -11,10 +14,10 @@ from advanced import DispatchRestart
 from subzero.constants import ART, PREFIX, DEPENDENCY_MODULE_NAMES
 from support.scheduler import scheduler
 from support.config import config
-from support.helpers import timestamp,  df
+from support.helpers import timestamp, df
 from support.ignore import ignore_list
 from support.items import get_all_items, get_items_info, \
-    get_item_kind_from_rating_key
+    get_item_kind_from_rating_key, get_item

 # init GUI
 ObjectContainer.art = R(ART)
@@ -53,7 +56,7 @@ def FirstLetterMetadataMenu(rating_key, key, title=None, base_title=None, displa

@route(PREFIX + '/section/contents', display_items=bool)
 def MetadataMenu(rating_key, title=None, base_title=None, display_items=False, previous_item_type=None,
-                 previous_rating_key=None):
+                 previous_rating_key=None, randomize=None):
    """
    displays the contents of a section based on whether it has a deeper tree or not (movies->movie (item) list; series->series list)
    :param rating_key:
@@ -72,6 +75,22 @@ def MetadataMenu(rating_key, title=None, base_title=None, display_items=False, p
    current_kind = get_item_kind_from_rating_key(rating_key)

    if display_items:
+        timeout = 30
+
+        # add back to series for season
+        if current_kind == "season":
+            timeout = 360
+
+            show = get_item(previous_rating_key)
+            oc.add(DirectoryObject(
+                key=Callback(MetadataMenu, rating_key=show.rating_key, title=show.title, base_title=show.section.title,
+                             previous_item_type="section", display_items=True, randomize=timestamp()),
+                title=u"< Back to %s" % show.title,
+                thumb=show.thumb or default_thumb
+            ))
+        elif current_kind == "series":
+            timeout = 1800
+
        items = get_all_items(key="children", value=rating_key, base="library/metadata")
        kind, deeper = get_items_info(items)
        dig_tree(oc, items, MetadataMenu,
@@ -81,12 +100,6 @@ def MetadataMenu(rating_key, title=None, base_title=None, display_items=False, p
        if should_display_ignore(items, previous=previous_item_type):
            add_ignore_options(oc, "series", title=item_title, rating_key=rating_key, callback_menu=IgnoreMenu)

-        timeout = 30
-        if current_kind == "season":
-            timeout = 360
-        elif current_kind == "series":
-            timeout = 1800
-
        # add refresh
        oc.add(DirectoryObject(
            key=Callback(RefreshItem, rating_key=rating_key, item_title=title, refresh_kind=current_kind,
@@ -147,7 +160,6 @@ def RefreshMissing(randomize=None):
@route(PREFIX + '/ValidatePrefs', enforce_route=True)
 def ValidatePrefs():
    Core.log.setLevel(logging.DEBUG)
-    Log.Debug("Validate Prefs called.")

    # cache the channel state
    update_dict = False
@@ -182,9 +194,51 @@ def ValidatePrefs():
        Core.log.removeHandler(logger.console_handler)
        Log.Debug("Stop logging to console")

+    Log.Debug("Validate Prefs called.")
+
+    # SZ config debug
+    Log.Debug("--- SZ Config-Debug ---")
+    for attr in [
+            "app_support_path", "data_path", "data_items_path", "enable_agent",
+            "enable_channel", "permissions_ok", "missing_permissions", "fs_encoding", "enforce_encoding",
+            "subtitle_destination_folder"]:
+        Log.Debug("config.%s: %s", attr, getattr(config, attr))
+
+    for attr in ["plugin_log_path", "server_log_path"]:
+        value = getattr(config, attr)
+        access = os.access(value, os.R_OK)
+        if Core.runtime.os == "Windows":
+            try:
+                f = open(value, "r")
+                f.read(1)
+                f.close()
+            except (IOError, OSError):
+                access = False
+
+        Log.Debug("config.%s: %s (accessible: %s)", attr, value, access)
+
+    for attr in [
+            "subtitles.save.filesystem", ]:
+        Log.Debug("Pref.%s: %s", attr, Prefs[attr])
+
+    # fixme: check existence of and OS access to logs
+    Log.Debug("Platform: %s", Core.runtime.platform)
+    Log.Debug("OS: %s", Core.runtime.os)
+    Log.Debug("----- Environment -----")
+    for key, value in os.environ.iteritems():
+        if key.startswith("PLEX") or key.startswith("SZ_"):
+            if "TOKEN" in key:
+                outval = "xxxxxxxxxxxxxxxxxxx"
+
+            else:
+                outval = value
+            Log.Debug("%s: %s", key, outval)
+    Log.Debug("Locale: %s", locale.getdefaultlocale())
+    Log.Debug("-----------------------")
+
    Log.Debug("Setting log-level to %s", Prefs["log_level"])
    logger.register_logging_handler(DEPENDENCY_MODULE_NAMES, level=Prefs["log_level"])
    Core.log.setLevel(logging.getLevelName(Prefs["log_level"]))
+    os.environ['U1pfT01EQl9LRVk'] = '789CF30DAC2C8B0AF433F5C9AD34290A712DF30D7135F12D0FB3E502006FDE081E'

    return
-
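The environment dump added to `ValidatePrefs` logs only `PLEX*` and `SZ_*` variables and masks any whose name contains `TOKEN`. Below is a minimal standalone sketch of that filter. The helper name `masked_env` and the plain-dict input are assumptions for illustration and are not part of the plugin. It uses `items()` so it runs on Python 3 as well as Python 2.

```python
def masked_env(environ, prefixes=("PLEX", "SZ_"), mask="xxxxxxxxxxxxxxxxxxx"):
    """Return the PLEX*/SZ_* entries of environ with token values masked.

    `masked_env` is a hypothetical helper; the plugin inlines this loop.
    """
    result = {}
    for key, value in environ.items():
        # str.startswith accepts a tuple of prefixes
        if not key.startswith(prefixes):
            continue
        # never write secrets to the log: mask anything that looks like a token
        result[key] = mask if "TOKEN" in key else value
    return result


if __name__ == "__main__":
    sample = {"PLEXTOKEN": "secret", "SZ_USER_AGENT": "Sub-Zero/2.x.x.x", "HOME": "/root"}
    for key, value in sorted(masked_env(sample).items()):
        print("%s: %s" % (key, value))
```

Keeping the mask a fixed-length string, as the diff does, avoids leaking the token length into the log.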
@@ -43,8 +43,8 @@ def add_ignore_options(oc, kind, callback_menu=None, title=None, rating_key=None

    oc.add(DirectoryObject(
        key=Callback(callback_menu, kind=use_kind, rating_key=rating_key, title=title),
-        title=u"%s %s \"%s\" %s the ignore list" % (
-            "Remove" if in_list else "Add", ignore_list.verbose(kind) if add_kind else "", unicode(title), "from" if in_list else "to")
+        title=u"%s %s \"%s\"" % (
+            "Un-Ignore" if in_list else "Ignore", ignore_list.verbose(kind) if add_kind else "", unicode(title))
    )
    )

@@ -157,7 +157,12 @@ def debounce(func):
                return ObjectContainer()
            else:
                Dict["menu_history"][key] = datetime.datetime.now() + datetime.timedelta(days=1)
-                Dict.Save()
+                try:
+                    Dict.Save()
+                except TypeError:
+                    Log.Error("Can't save menu history for: %r", key)
+                    del Dict["menu_history"][key]
+
        return func(*args, **kwargs)

    return wrap
@@ -0,0 +1,251 @@
+# coding=utf-8
+
+import traceback
+import types
+
+from babelfish import Language
+
+from menu_helpers import debounce, SubFolderObjectContainer, default_thumb
+from subzero.modification import registry as mod_registry, SubtitleModifications
+from subzero.constants import PREFIX
+from support.plex_media import get_plex_metadata, scan_videos
+from support.helpers import timestamp, pad_title
+from support.items import get_current_sub, set_mods_for_part
+
+
+@route(PREFIX + '/item/sub_mods/{rating_key}/{part_id}', force=bool)
+@debounce
+def SubtitleModificationsMenu(**kwargs):
+    rating_key = kwargs["rating_key"]
+    part_id = kwargs["part_id"]
+    language = kwargs["language"]
+    current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
+    kwargs.pop("randomize")
+
+    current_mods = current_sub.mods or []
+
+    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
+
+    from interface.item_details import SubtitleOptionsMenu
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleOptionsMenu, randomize=timestamp(), **kwargs),
+        title=u"< Back to subtitle options for: %s" % kwargs["title"],
+        summary=kwargs["current_data"],
+        thumb=default_thumb
+    ))
+
+    for identifier, mod in mod_registry.mods.iteritems():
+        if mod.advanced:
+            continue
+
+        if mod.exclusive and identifier in current_mods:
+            continue
+
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleSetMods, mods=identifier, mode="add", randomize=timestamp(), **kwargs),
+            title=pad_title(mod.description), summary=mod.long_description or ""
+        ))
+
+    fps_mod = SubtitleModifications.get_mod_class("change_FPS")
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleFPSModMenu, randomize=timestamp(), **kwargs),
+        title=pad_title(fps_mod.description), summary=fps_mod.long_description or ""
+    ))
+
+    shift_mod = SubtitleModifications.get_mod_class("shift_offset")
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleShiftModUnitMenu, randomize=timestamp(), **kwargs),
+        title=pad_title(shift_mod.description), summary=shift_mod.long_description or ""
+    ))
+
+    color_mod = SubtitleModifications.get_mod_class("color")
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleColorModMenu, randomize=timestamp(), **kwargs),
+        title=pad_title(color_mod.description), summary=color_mod.long_description or ""
+    ))
+
+    if current_mods:
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleSetMods, mods=None, mode="remove_last", randomize=timestamp(), **kwargs),
+            title=pad_title("Remove last applied mod (%s)" % current_mods[-1]),
+            summary=u"Currently applied mods: %s" % (", ".join(current_mods) if current_mods else "none")
+        ))
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleListMods, randomize=timestamp(), **kwargs),
+            title=pad_title("Manage applied mods"),
+            summary=u"Currently applied mods: %s" % (", ".join(current_mods))
+        ))
+
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleSetMods, mods=None, mode="clear", randomize=timestamp(), **kwargs),
+        title=pad_title("Restore original version"),
+        summary=u"Currently applied mods: %s" % (", ".join(current_mods) if current_mods else "none")
+    ))
+
+    return oc
+
+
+@route(PREFIX + '/item/sub_mod_fps/{rating_key}/{part_id}', force=bool)
+def SubtitleFPSModMenu(**kwargs):
+    rating_key = kwargs["rating_key"]
+    part_id = kwargs["part_id"]
+    item_type = kwargs["item_type"]
+
+    kwargs.pop("randomize")
+
+    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
+
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
+        title="< Back to subtitle modification menu"
+    ))
+
+    metadata = get_plex_metadata(rating_key, part_id, item_type)
+    scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
+    video, plex_part = scanned_parts.items()[0]
+
+    target_fps = plex_part.fps
+
+    for fps in ["23.976", "24.000", "25.000", "29.970", "30.000", "50.000", "59.940", "60.000"]:
+        if float(fps) == float(target_fps):
+            continue
+
+        if float(fps) > float(target_fps):
+            indicator = "subs constantly getting faster"
+        else:
+            indicator = "subs constantly getting slower"
+
+        mod_ident = SubtitleModifications.get_mod_signature("change_FPS", **{"from": fps, "to": target_fps})
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleSetMods, mods=mod_ident, mode="add", randomize=timestamp(), **kwargs),
+            title="%s fps -> %s fps (%s)" % (fps, target_fps, indicator)
+        ))
+
+    return oc
+
+
+POSSIBLE_UNITS = (("ms", "milliseconds"), ("s", "seconds"), ("m", "minutes"), ("h", "hours"))
+POSSIBLE_UNITS_D = dict(POSSIBLE_UNITS)
+
+
+@route(PREFIX + '/item/sub_mod_shift_unit/{rating_key}/{part_id}', force=bool)
+def SubtitleShiftModUnitMenu(**kwargs):
+    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
+
+    kwargs.pop("randomize")
+
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
+        title="< Back to subtitle modifications"
+    ))
+
+    for unit, title in POSSIBLE_UNITS:
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleShiftModMenu, unit=unit, randomize=timestamp(), **kwargs),
+            title="Adjust by %s" % title
+        ))
+
+    return oc
+
+
+@route(PREFIX + '/item/sub_mod_shift/{rating_key}/{part_id}/{unit}', force=bool)
+def SubtitleShiftModMenu(unit=None, **kwargs):
+    if unit not in POSSIBLE_UNITS_D:
+        raise NotImplementedError
+
+    kwargs.pop("randomize")
+
+    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
+
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleShiftModUnitMenu, randomize=timestamp(), **kwargs),
+        title="< Back to unit selection"
+    ))
+
+    rng = []
+    if unit == "h":
+        rng = range(-10, 11)
+    elif unit in ("m", "s"):
+        rng = range(-15, 16)
+    elif unit == "ms":
+        rng = range(-900, 1000, 100)
+
+    for i in rng:
+        if i == 0:
+            continue
+
+        mod_ident = SubtitleModifications.get_mod_signature("shift_offset", **{unit: i})
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleSetMods, mods=mod_ident, mode="add", randomize=timestamp(), **kwargs),
+            title="%s %s" % (("%s" if i < 0 else "+%s") % i, unit)
+        ))
+
+    return oc
+
+
+@route(PREFIX + '/item/sub_mod_colors/{rating_key}/{part_id}', force=bool)
+def SubtitleColorModMenu(**kwargs):
+    kwargs.pop("randomize")
+
+    color_mod = SubtitleModifications.get_mod_class("color")
+
+    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
+
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
+        title="< Back to subtitle modification menu"
+    ))
+
+    for color, code in color_mod.colors.iteritems():
+        mod_ident = SubtitleModifications.get_mod_signature("color", **{"name": color})
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleSetMods, mods=mod_ident, mode="add", randomize=timestamp(), **kwargs),
+            title="%s (%s)" % (color, code)
+        ))
+
+    return oc
+
+
+@route(PREFIX + '/item/sub_set_mods/{rating_key}/{part_id}/{mods}/{mode}', force=bool)
+@debounce
+def SubtitleSetMods(mods=None, mode=None, **kwargs):
+    if not isinstance(mods, types.ListType) and mods:
+        mods = [mods]
+
+    rating_key = kwargs["rating_key"]
+    part_id = kwargs["part_id"]
+    lang_a2 = kwargs["language"]
+    item_type = kwargs["item_type"]
+
+    language = Language.fromietf(lang_a2)
+
+    set_mods_for_part(rating_key, part_id, language, item_type, mods, mode=mode)
+
+    kwargs.pop("randomize")
+    return SubtitleModificationsMenu(randomize=timestamp(), **kwargs)
+
+
+@route(PREFIX + '/item/sub_list_mods/{rating_key}/{part_id}', force=bool)
+@debounce
+def SubtitleListMods(**kwargs):
+    rating_key = kwargs["rating_key"]
+    part_id = kwargs["part_id"]
+    language = kwargs["language"]
+    current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
+
+    kwargs.pop("randomize")
+
+    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
+
+    oc.add(DirectoryObject(
+        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
+        title="< Back to subtitle modifications"
+    ))
+
+    for identifier in current_sub.mods:
+        oc.add(DirectoryObject(
+            key=Callback(SubtitleSetMods, mods=identifier, mode="remove", randomize=timestamp(), **kwargs),
+            title="Remove: %s" % identifier
+        ))
+
+    return oc
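`SubtitleShiftModMenu` offers one entry per non-zero offset in a per-unit range and labels positive values with an explicit `+`. The sketch below isolates that logic so it can be checked outside Plex. `shift_offsets` and `offset_label` are hypothetical names, and the sketch assumes a symmetric -15..+15 range for minutes and seconds.

```python
def shift_offsets(unit):
    """Return the non-zero offsets offered for a unit, smallest first.

    Hypothetical helper; the ranges mirror the menu, with minutes and
    seconds assumed to be symmetric around zero.
    """
    ranges = {
        "h": range(-10, 11),
        "m": range(-15, 16),
        "s": range(-15, 16),
        "ms": range(-900, 1000, 100),
    }
    if unit not in ranges:
        raise NotImplementedError(unit)
    # an offset of zero would be a no-op mod, so it is skipped
    return [i for i in ranges[unit] if i != 0]


def offset_label(i, unit):
    """Format an offset like the menu titles, with an explicit plus sign."""
    return "%s %s" % (("%s" if i < 0 else "+%s") % i, unit)


if __name__ == "__main__":
    for unit in ("h", "m", "s", "ms"):
        offsets = shift_offsets(unit)
        print(unit, offset_label(offsets[0], unit), "..", offset_label(offsets[-1], unit))
```

Because Python's `range` excludes its stop value, a symmetric menu needs a stop of `n + 1` (or `n + step` for the millisecond range).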
@@ -18,7 +18,7 @@ sys.modules["support.plex_media"] = plex_media

 import localmedia

-sys.modules["subzero.localmedia"] = localmedia
+sys.modules["support.localmedia"] = localmedia

 import subtitlehelpers

@@ -11,9 +11,9 @@ class PlexActivityManager(object):
    def start(self):
        activity_sources_enabled = None

-        if config.universal_plex_token:
+        if config.plex_token:
            from plex import Plex
-            Plex.configuration.defaults.authentication(config.universal_plex_token)
+            Plex.configuration.defaults.authentication(config.plex_token)
            activity_sources_enabled = ["websocket"]
            Activity.on('websocket.playing', self.on_playing)

@@ -27,9 +27,6 @@ class PlexActivityManager(object):

    @throttle(5, instance_method=True)
    def on_playing(self, info):
-        if not config.use_activities:
-            return
-
        # ignore non-playing states and anything too far in
        if info["state"] != "playing" or info["viewOffset"] > 60000:
            return
@@ -41,13 +38,22 @@ class PlexActivityManager(object):
            return

        rating_key = info["ratingKey"]
-        if rating_key not in Dict["last_played_items"]:
-            # new playing; store last 10 recently played items
+        if rating_key in Dict["last_played_items"] and rating_key != Dict["last_played_items"][0]:
+            # shift last played
+            Dict["last_played_items"].insert(0,
+                                             Dict["last_played_items"].pop(Dict["last_played_items"].index(rating_key)))
+            Dict.Save()
+
+        elif rating_key not in Dict["last_played_items"]:
+            # new playing; store last X recently played items
            Dict["last_played_items"].insert(0, rating_key)
-            Dict["last_played_items"] = Dict["last_played_items"][:10]
+            Dict["last_played_items"] = Dict["last_played_items"][:config.store_recently_played_amount]

            Dict.Save()

+            if not config.react_to_activities:
+                return
+
            debug_msg = "Started playing %s. Refreshing it." % rating_key

            key_to_refresh = None
@@ -108,4 +114,5 @@ class PlexActivityManager(object):
                                if ep.index == 1:
                                    return ep

+
 activity = PlexActivityManager()
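The `on_playing` change turns `Dict["last_played_items"]` into a most-recently-used list: a known key moves to the front, a new key is inserted at the front, and the list is trimmed to `config.store_recently_played_amount`. Here is a minimal sketch of that behaviour on a plain list. The function name `touch_recently_played` and the in-place trimming are assumptions for illustration; the plugin works on its persistent `Dict` and calls `Dict.Save()` afterwards.

```python
def touch_recently_played(items, rating_key, limit=20):
    """Move rating_key to the front of items in place.

    Returns True if the key was not in the list before (the case in which
    the plugin goes on to consider a refresh), False otherwise.
    Hypothetical helper, not part of the plugin.
    """
    if rating_key in items:
        if items[0] != rating_key:
            # already known: shift it to the front, keep everything else in order
            items.insert(0, items.pop(items.index(rating_key)))
        return False

    # new item: put it first and drop whatever falls off the end
    items.insert(0, rating_key)
    del items[limit:]
    return True


if __name__ == "__main__":
    played = ["101", "102", "103"]
    print(touch_recently_played(played, "103", limit=3), played)
    print(touch_recently_played(played, "104", limit=3), played)
```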
@@ -9,6 +9,7 @@ import datetime
 import subliminal
 import subliminal_patch
 from babelfish import Language
+from subliminal.cli import MutexLock
 from subzero.lib.io import FileIO, get_viable_encoding
 from subzero.constants import PLUGIN_NAME, PLUGIN_IDENTIFIER, MOVIE, SHOW
 from lib import Plex
@@ -45,6 +46,7 @@ class Config(object):
    data_path = None
    data_items_path = None
    universal_plex_token = None
+    plex_token = None
    is_development = False

    enable_channel = True
@@ -68,6 +70,9 @@ class Config(object):
    sections = None
    enabled_sections = None
    remove_hi = False
+    fix_ocr = False
+    fix_common = False
+    colors = ""
    enforce_encoding = False
    chmod = None
    forced_only = False
@@ -75,8 +80,12 @@ class Config(object):
    treat_und_as_first = False
    ext_match_strictness = False
    default_mods = None
-    use_activities = False
+    debug_mods = False
+    react_to_activities = False
    activity_mode = None
+    subtitles_save_to = None
+
+    store_recently_played_amount = 20

    initialized = False

@@ -91,6 +100,9 @@ class Config(object):
        self.data_path = getattr(Data, "_core").storage.data_path
        self.data_items_path = os.path.join(self.data_path, "DataItems")
        self.universal_plex_token = self.get_universal_plex_token()
+        self.plex_token = os.environ.get("PLEXTOKEN", self.universal_plex_token)
+
+        os.environ["SZ_USER_AGENT"] = self.get_user_agent()

        self.set_plugin_mode()
        self.set_plugin_lock()
@@ -98,6 +110,7 @@ class Config(object):

        self.lang_list = self.get_lang_list()
        self.subtitle_destination_folder = self.get_subtitle_destination_folder()
+        self.forced_only = cast_bool(Prefs["subtitles.only_foreign"])
        self.providers = self.get_providers()
        self.provider_settings = self.get_provider_settings()
        self.max_recent_items_per_library = int_or_default(Prefs["scheduler.max_recent_items_per_library"], 2000)
@@ -109,15 +122,37 @@ class Config(object):
        self.permissions_ok = self.check_permissions()
        self.notify_executable = self.check_notify_executable()
        self.remove_hi = cast_bool(Prefs['subtitles.remove_hi'])
+        self.fix_ocr = cast_bool(Prefs['subtitles.fix_ocr'])
+        self.fix_common = cast_bool(Prefs['subtitles.fix_common'])
+        self.colors = Prefs['subtitles.colors'] if Prefs['subtitles.colors'] != "don't change" else None
        self.enforce_encoding = cast_bool(Prefs['subtitles.enforce_encoding'])
+
+        os.environ["SZ_ENFORCE_ENCODING"] = str(self.enforce_encoding)
+
        self.chmod = self.check_chmod()
-        self.forced_only = cast_bool(Prefs["subtitles.only_foreign"])
        self.exotic_ext = cast_bool(Prefs["subtitles.scan.exotic_ext"])
        self.treat_und_as_first = cast_bool(Prefs["subtitles.language.treat_und_as_first"])
        self.ext_match_strictness = self.determine_ext_sub_strictness()
        self.default_mods = self.get_default_mods()
+        self.debug_mods = cast_bool(Prefs['log_debug_mods'])
+        self.subtitles_save_to = Prefs['subtitles.save.filesystem']
        self.initialized = True

+    def init_cache(self):
+        use_fallback_cache = True
+        if Core.runtime.os != "Windows":
+            try:
+                subliminal.region.configure('dogpile.cache.dbm', expiration_time=datetime.timedelta(days=30),
+                                            arguments={'filename': os.path.join(self.data_items_path, 'subzero.dbm'),
+                                                       'lock_factory': MutexLock})
+                use_fallback_cache = False
+            except Exception:
+                pass
+
+        if use_fallback_cache:
+            Log.Warn("Not using file based cache!")
+            subliminal.region.configure('dogpile.cache.memory')
+
    def set_log_paths(self):
        # find log handler
        for handler in Core.log.handlers:
@@ -142,7 +177,9 @@ class Config(object):
            except:
                Log.Warn("Couldn't determine Plex Token")
        else:
-            Log("Did NOT find Preferences file - please check logfile and hierarchy. Aborting!")
+            Log("Did NOT find Preferences file - most likely Windows OS. Otherwise please check logfile and hierarchy.")
+
+        # fixme: windows

    def set_plugin_mode(self):
        if Prefs["plugin_mode"] == "only agent":
@@ -217,11 +254,17 @@ class Config(object):
        return all_permissions_ok

    def get_version(self):
+        return self.get_bare_version() + ("" if not self.is_development else " DEV")
+
+    def get_bare_version(self):
        result = VERSION_RE.search(self.plugin_info)
-        add = "" if not self.is_development else " DEV"

        if result:
-            return result.group(1) + add
+            return result.group(1)
+        return "2.x.x.x"
+
+    def get_user_agent(self):
+        return "Sub-Zero/%s" % (self.get_bare_version() + ("" if not self.is_development else "-dev"))

    def get_dev_mode(self):
        dev = DEV_RE.search(self.plugin_info)
@@ -347,10 +390,13 @@ class Config(object):
                     }

        # ditch non-forced-subtitles-reporting providers
-        if cast_bool(Prefs['subtitles.only_foreign']):
+        if self.forced_only:
            providers["addic7ed"] = False
            providers["tvsubtitles"] = False
            providers["legendastv"] = False
+            providers["napiprojekt"] = False
+            providers["shooter"] = False
+            providers["subscenter"] = False

        return filter(lambda prov: providers[prov], providers)

@@ -412,16 +458,22 @@ class Config(object):
        mods = []
        if self.remove_hi:
            mods.append("remove_HI")
+        if self.fix_ocr:
+            mods.append("OCR_fixes")
+        if self.fix_common:
+            mods.append("common")
+        if self.colors:
+            mods.append("color(name=%s)" % self.colors)

        return mods

    def set_activity_modes(self):
        val = Prefs["activity.on_playback"]
        if val == "never":
-            self.use_activities = False
+            self.react_to_activities = False
            return

-        self.use_activities = True
+        self.react_to_activities = True
        if val == "current media item":
            self.activity_mode = "refresh"
        elif val == "hybrid: current item or next episode":
@@ -9,15 +9,24 @@ import time
 import re
 import platform
 import subprocess
-
-from bs4 import UnicodeDammit
-
+import sys
 import chardet

+from bs4 import UnicodeDammit
 from babelfish import Language
-
 from subzero.analytics import track_event

+mswindows = (sys.platform == "win32")
+if mswindows:
+    from subprocess import list2cmdline
+    quote_args = list2cmdline
+else:
+    # POSIX
+    from pipes import quote
+
+    def quote_args(seq):
+        return ' '.join(quote(arg) for arg in seq)
+
 # Unicode control characters can appear in ID3v2 tags but are not legal in XML.
 RE_UNICODE_CONTROL = u'([\u0000-\u0008\u000b-\u000c\u000e-\u001f\ufffe-\uffff])' + \
                     u'|' + \
@@ -30,7 +39,7 @@ RE_UNICODE_CONTROL = u'([\u0000-\u0008\u000b-\u000c\u000e-\u001f\ufffe-\uffff])'


 def cast_bool(value):
-    return str(value) in ("true", "True")
+    return str(value).strip() in ("true", "True")


 # A platform independent way to split paths which might come in with different separators.
@@ -110,9 +119,9 @@ def str_pad(s, length, align='left', pad_char=' ', trim=False):
        raise ValueError("Unknown align type, expected either 'left' or 'right'")


-def pad_title(value):
+def pad_title(value, width=49):
    """Pad a title to 30 characters to force the 'details' view."""
-    return str_pad(value, 49, pad_char=' ')
+    return str_pad(value, width, pad_char=' ')


 def get_plex_item_display_title(item, kind, parent=None, parent_title=None, section_title=None,
@@ -236,13 +245,13 @@ def get_item_hints(data):
    :param data: video item dict of media_to_videos 
    :return: 
    """
-    hints = {"title": data["title"], "type": "movie"}
+    hints = {"title": data["original_title"] or data["title"], "type": "movie"}
    if data["type"] == "episode":
        hints.update(
            {
                "type": "episode",
                "episode_title": data["title"],
-                "title": data["series"],
+                "title": data["original_title"] or data["series"],
            }
        )
    return hints
@@ -273,9 +282,21 @@ def notify_executable(exe_info, videos, subtitles, storage):
            prepared_arguments = [arg % prepared_data for arg in arguments]

            Log.Debug(u"Calling %s with arguments: %s" % (exe, prepared_arguments))
+            env = os.environ
+            if not mswindows:
+                env_path = {"PATH": os.pathsep.join(
+                                        [
+                                            "/usr/local/bin",
+                                            "/usr/bin",
+                                            os.environ.get("PATH", "")
+                                        ]
+                                    )
+                            }
+                env = dict(os.environ, **env_path)
+
            try:
-                output = subprocess.check_output(subprocess.list2cmdline([exe] + prepared_arguments),
-                                                 stderr=subprocess.STDOUT, shell=True)
+                output = subprocess.check_output(quote_args([exe] + prepared_arguments),
+                                                 stderr=subprocess.STDOUT, shell=True, env=env)
            except subprocess.CalledProcessError:
                Log.Error(u"Calling %s failed: %s" % (exe, traceback.format_exc()))
            else:
@@ -303,3 +324,7 @@ def dispatch_track_usage(*args, **kwargs):

 def get_language(lang_short):
    return Language.fromietf(lang_short)
+
+
+class PartUnknownException(Exception):
+    pass
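The `quote_args` helper introduced in this file builds the command string passed to `subprocess.check_output(..., shell=True)`: `list2cmdline` on Windows, POSIX shell quoting elsewhere. The sketch below shows the same idea in a form that should also run on Python 3, where `shlex.quote` replaces `pipes.quote`. The import fallback is an assumption added for illustration; the plugin itself targets Python 2 only.

```python
import subprocess
import sys

try:
    from shlex import quote  # Python 3.3+
except ImportError:
    from pipes import quote  # Python 2, as used by the plugin


def quote_args(seq):
    """Join an argument list into one shell-safe command string."""
    if sys.platform == "win32":
        # follows the MS C runtime quoting rules
        return subprocess.list2cmdline(seq)
    # POSIX: quote every argument individually, then join with spaces
    return " ".join(quote(arg) for arg in seq)


if __name__ == "__main__":
    print(quote_args(["notify.sh", "My Show - S01E01.mkv", "en"]))
```

Quoting per platform matters here because `list2cmdline` produces double-quoted arguments, which a POSIX shell still subjects to `$` and backtick expansion.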
@@ -2,12 +2,15 @@

 import logging
 import re
+import traceback
 import types
 import os
 from ignore import ignore_list
-from helpers import is_recent, get_plex_item_display_title, query_plex
+from helpers import is_recent, get_plex_item_display_title, query_plex, PartUnknownException
 from lib import Plex, get_intent
 from config import config, IGNORE_FN
+from subliminal_patch.subtitle import ModifiedSubtitle
+from subzero.modification import registry as mod_registry, SubtitleModifications

 logger = logging.getLogger(__name__)

@@ -40,11 +43,11 @@ PLEX_API_TYPE_MAP = {

 def get_item_kind_from_rating_key(key):
    item = get_item(key)
-    return PLEX_API_TYPE_MAP[get_item_kind(item)]
+    return PLEX_API_TYPE_MAP.get(get_item_kind(item))


 def get_item_kind_from_item(item):
-    return PLEX_API_TYPE_MAP[get_item_kind(item)]
+    return PLEX_API_TYPE_MAP.get(get_item_kind(item))


 def get_item_thumb(item):
@@ -164,14 +167,17 @@ def get_recent_items():
        "X-Plex-Container-Size": "%s" % config.max_recent_items_per_library
    }

-    episode_re = re.compile(ur'ratingKey="(?P<key>\d+)"'
+    episode_re = re.compile(ur'(?su)ratingKey="(?P<key>\d+)"'
                            ur'.+?grandparentRatingKey="(?P<parent_key>\d+)"'
                            ur'.+?title="(?P<title>.*?)"'
                            ur'.+?grandparentTitle="(?P<parent_title>.*?)"'
                            ur'.+?index="(?P<episode>\d+?)"'
-                            ur'.+?parentIndex="(?P<season>\d+?)".+?addedAt="(?P<added>\d+)"')
-    movie_re = re.compile(ur'ratingKey="(?P<key>\d+)".+?title="(?P<title>.*?)".+?addedAt="(?P<added>\d+)"')
-    available_keys = ("key", "title", "parent_key", "parent_title", "season", "episode", "added")
+                            ur'.+?parentIndex="(?P<season>\d+?)".+?addedAt="(?P<added>\d+)"'
+                            ur'.+?<Part.+? file="(?P<filename>[^"]+?)"')
+    movie_re = re.compile(ur'(?su)ratingKey="(?P<key>\d+)".+?title="(?P<title>.*?)'
+                          ur'".+?addedAt="(?P<added>\d+)"'
+                          ur'.+?<Part.+? file="(?P<filename>[^"]+?)"')
+    available_keys = ("key", "title", "parent_key", "parent_title", "season", "episode", "added", "filename")
    recent = []

    for section in Plex["library"].sections():
@@ -182,8 +188,10 @@ def get_recent_items():
            continue

        use_args = args.copy()
+        plex_item_type = "Movie"
        if section.type == "show":
            use_args["type"] = "4"
+            plex_item_type = "Episode"

        url = "http://127.0.0.1:32400/library/sections/%s/all" % int(section.key)
        response = query_plex(url, use_args)
@@ -198,6 +206,10 @@ def get_recent_items():
            if data["key"] in ignore_list.videos:
                Log.Debug(u"Skipping item: %s" % data["title"])
                continue
+            if is_physically_ignored(data["filename"], plex_item_type):
+                Log.Debug(u"Skipping item: %s" % data["title"])
+                continue
+
            if is_recent(int(data["added"])):
                recent.append((int(data["added"]), section.type, section.title, data["key"]))

@@ -242,6 +254,16 @@ def is_ignored(rating_key, item=None):
        return True

    # physical/path ignore
+    if config.ignore_sz_files or config.ignore_paths:
+        for media in item.media:
+            for part in media.parts:
+                if is_physically_ignored(part.file, kind):
+                    return True
+
+    return False
+
+
+def is_physically_ignored(fn, kind):
    if config.ignore_sz_files or config.ignore_paths:
        # normally check current item folder and the library
        check_ignore_paths = [".", "../"]
@@ -249,18 +271,15 @@ def is_ignored(rating_key, item=None):
            # series/episode, we've got a season folder here, also
            check_ignore_paths.append("../../")

-        for part in item.media.parts:
-            if config.ignore_paths and config.is_path_ignored(part.file):
-                Log.Debug("Item %s's path is manually ignored" % rating_key)
-                return True
+        if config.ignore_paths and config.is_path_ignored(fn):
+            Log.Debug("Item %s's path is manually ignored" % fn)
+            return True

-            if config.ignore_sz_files:
-                for sub_path in check_ignore_paths:
-                    if config.is_physically_ignored(os.path.abspath(os.path.join(os.path.dirname(part.file), sub_path))):
-                        Log.Debug("An ignore file exists in either the items or its parent folders")
-                        return True
-
-    return False
+        if config.ignore_sz_files:
+            for sub_path in check_ignore_paths:
+                if config.is_physically_ignored(os.path.normpath(os.path.join(os.path.dirname(fn), sub_path))):
+                    Log.Debug("An ignore file exists in either the item's folder or one of its parent folders")
+                    return True


 def refresh_item(rating_key, force=False, timeout=8000, refresh_kind=None, parent_rating_key=None):
@@ -292,4 +311,65 @@ def get_current_sub(rating_key, part_id, language):
    subtitle_storage = get_subtitle_storage()
    stored_subs = subtitle_storage.load_or_new(item)
    current_sub = stored_subs.get_any(part_id, language)
-    return current_sub, stored_subs, subtitle_storage
+    return current_sub, stored_subs, subtitle_storage
+
+
+def set_mods_for_part(rating_key, part_id, language, item_type, mods, mode="add"):
+    from support.plex_media import get_plex_metadata, scan_videos
+    from support.storage import save_subtitles
+
+    current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
+    if mode == "add":
+        for mod in mods:
+            identifier, args = SubtitleModifications.parse_identifier(mod)
+            mod_class = SubtitleModifications.get_mod_class(identifier)
+
+            if identifier not in mod_registry.mods_available:
+                raise NotImplementedError("Mod unknown or not registered")
+
+            # clean exclusive mods
+            if mod_class.exclusive and current_sub.mods:
+                for current_mod in current_sub.mods[:]:
+                    if current_mod.startswith(identifier):
+                        current_sub.mods.remove(current_mod)
+                        Log.Info("Removing superseded mod %s" % current_mod)
+
+            current_sub.add_mod(mod)
+    elif mode == "clear":
+        current_sub.add_mod(None)
+    elif mode == "remove":
+        for mod in mods:
+            current_sub.mods.remove(mod)
+
+    elif mode == "remove_last":
+        if current_sub.mods:
+            current_sub.mods.pop()
+    else:
+        raise NotImplementedError("Wrong mode given")
+    storage.save(stored_subs)
+
+    try:
+        metadata = get_plex_metadata(rating_key, part_id, item_type)
+    except PartUnknownException:
+        return
+
+    scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
+    video, plex_part = scanned_parts.items()[0]
+
+    subtitle = ModifiedSubtitle(language, mods=current_sub.mods)
+    subtitle.content = current_sub.content
+    if current_sub.encoding:
+        # thanks plex
+        setattr(subtitle, "_guessed_encoding", current_sub.encoding)
+
+    subtitle.plex_media_fps = plex_part.fps
+    subtitle.page_link = "modify subtitles with: %s" % (", ".join(current_sub.mods) if current_sub.mods else "none")
+    subtitle.language = language
+    subtitle.id = current_sub.id
+
+    try:
+        save_subtitles(scanned_parts, {video: [subtitle]}, mode="m", bare_save=True)
+        Log.Debug("Modified %s subtitle for: %s:%s with: %s", language.name, rating_key, part_id,
+                  ", ".join(current_sub.mods) if current_sub.mods else "none")
+    except Exception:
+        Log.Error("Something went wrong when modifying subtitle: %s", traceback.format_exc())
@@ -108,7 +108,8 @@ def find_subtitles(part):
                    if ext.lower()[1:] in config.SUBTITLE_EXTS:
                        # get fn without forced/default/normal tag
                        split_tag = root.rsplit(".", 1)
-                        if len(split_tag) > 1 and split_tag[1].lower() in ['forced', 'normal', 'default']:
+                        if len(split_tag) > 1 and split_tag[1].lower() in ['forced', 'normal', 'default', 'embedded',
+                                                                           'custom']:
                            root = split_tag[0]

                        # get associated media file name without language
@@ -160,9 +161,8 @@ def find_subtitles(part):
        # determine whether to pick up the subtitle based on our match strictness
        elif not filename_matches_part:
            if sz_config.ext_match_strictness == "strict" or (
-                    sz_config.ext_match_strictness == "loose" and not filename_contains_part):
-
-                #Log.Debug("%s doesn't match %s, skipping" % (helpers.unicodize(local_filename),
+                            sz_config.ext_match_strictness == "loose" and not filename_contains_part):
+                # Log.Debug("%s doesn't match %s, skipping" % (helpers.unicodize(local_filename),
                #                                             helpers.unicodize(part_basename)))
                continue

@@ -1,5 +1,6 @@
 # coding=utf-8
 import traceback
+import time

 from support.config import config
 from support.helpers import get_plex_item_display_title, cast_bool
@@ -8,8 +9,6 @@ from support.lib import Plex


 def item_discover_missing_subs(rating_key, kind="show", added_at=None, section_title=None, internal=False, external=True, languages=()):
-    existing_subs = {"internal": [], "external": [], "count": 0}
-
    item_id = int(rating_key)
    item = get_item(rating_key)

@@ -18,36 +17,41 @@ def item_discover_missing_subs(rating_key, kind="show", added_at=None, section_t
    else:
        item_title = get_plex_item_display_title(item, kind, section_title=section_title)

-    video = item.media
+    missing = set()
+    languages_set = set(languages)
+    for media in item.media:
+        existing_subs = {"internal": [], "external": [], "count": 0}
+        for part in media.parts:
+            for stream in part.streams:
+                if stream.stream_type == 3:
+                    if stream.index:
+                        key = "internal"
+                    else:
+                        key = "external"

-    for part in video.parts:
-        for stream in part.streams:
-            if stream.stream_type == 3:
-                if stream.index:
-                    key = "internal"
-                else:
-                    key = "external"
+                    existing_subs[key].append(Locale.Language.Match(stream.language_code or ""))
+                    existing_subs["count"] = existing_subs["count"] + 1

-                existing_subs[key].append(Locale.Language.Match(stream.language_code or ""))
-                existing_subs["count"] = existing_subs["count"] + 1
+        missing_from_part = set(languages_set)
+        if existing_subs["count"]:
+            existing_flat = set((existing_subs["internal"] if internal else []) + (existing_subs["external"] if external else []))
+            if languages_set.issubset(existing_flat) or (len(existing_flat) >= 1 and Prefs['subtitles.only_one']):
+                # all subs found
+                #Log.Info(u"All subtitles exist for '%s'", item_title)
+                continue

-    missing = languages
-    if existing_subs["count"]:
-        existing_flat = (existing_subs["internal"] if internal else []) + (existing_subs["external"] if external else [])
-        languages_set = set(languages)
-        if languages_set.issubset(existing_flat) or (len(existing_flat) >= 1 and Prefs['subtitles.only_one']):
-            # all subs found
-            Log.Info(u"All subtitles exist for '%s'", item_title)
-            return
+            missing_from_part = languages_set - existing_flat

-        missing = languages_set - set(existing_flat)
-        Log.Info(u"Subs still missing for '%s': %s", item_title, missing)
+        if missing_from_part:
+            Log.Info(u"Subs still missing for '%s' (%s: %s): %s", item_title, rating_key, media.id,
+                     missing_from_part)
+            missing.update(missing_from_part)

    if missing:
        return added_at, item_id, item_title, item, missing


-def items_get_all_missing_subs(items):
+def items_get_all_missing_subs(items, sleep_after_request=False):
    missing = []
    for added_at, kind, section_title, key in items:
        try:
@@ -65,6 +69,8 @@ def items_get_all_missing_subs(items):
                missing.append(state)
        except:
            Log.Error("Something went wrong when getting the state of item %s: %s", key, traceback.format_exc())
+        if sleep_after_request:
+            time.sleep(sleep_after_request)
    return missing


@@ -1,15 +1,14 @@
 # coding=utf-8

 import os
+from urllib2 import URLError

 import helpers
-
+from config import config
 from items import get_item
 from lib import get_intent, Plex
-from config import config
 from subzero.video import parse_video

-
 def get_metadata_dict(item, part, add):
    data = {
        "item": item,
@@ -22,6 +21,54 @@ def get_metadata_dict(item, part, add):
    return data


+imdb_guid_identifier = "com.plexapp.agents.imdb://"
+tvdb_guid_identifier = "com.plexapp.agents.thetvdb://"
+
+
+def get_plexapi_stream_info(plex_item, part_id=None):
+    d = {"stream": {}}
+    data = d["stream"]
+
+    # find current part
+    current_part = None
+    current_media = None
+    for media in plex_item.media:
+        for part in media.parts:
+            if not part_id or str(part.id) == part_id:
+                current_part = part
+                current_media = media
+                break
+        if current_part:
+            break
+
+    if not current_part:
+        return d
+
+    data["video_codec"] = current_media.video_codec
+    data["audio_codec"] = current_media.audio_codec.upper()
+
+    if data["audio_codec"] == "DCA":
+        data["audio_codec"] = "DTS"
+
+    if current_media.audio_channels == 8:
+        data["audio_channels"] = "7.1"
+
+    elif current_media.audio_channels == 6:
+        data["audio_channels"] = "5.1"
+    else:
+        data["audio_channels"] = "%s.0" % str(current_media.audio_channels)
+
+    # iter streams
+    for stream in current_part.streams:
+        if stream.stream_type == 1:
+            # video stream
+            data["resolution"] = "%s%s" % (current_media.video_resolution,
+                                           "i" if stream.scan_type != "progressive" else "p")
+            break
+
+    return d
+
+
 def media_to_videos(media, kind="series"):
    """
    iterates through media and returns the associated parts (videos)
@@ -31,36 +78,61 @@ def media_to_videos(media, kind="series"):
    """
    videos = []

+    # this is a Show or a Movie object
+    plex_item = get_item(media.id)
+    year = plex_item.year
+    original_title = plex_item.title_original
+
    if kind == "series":
        for season in media.seasons:
            season_object = media.seasons[season]
            for episode in media.seasons[season].episodes:
                ep = media.seasons[season].episodes[episode]

+                tvdb_id = None
+                series_tvdb_id = None
+                if tvdb_guid_identifier in ep.guid:
+                    tvdb_id = ep.guid[len(tvdb_guid_identifier):].split("?")[0]
+                    series_tvdb_id = tvdb_id.split("/")[0]
+
                # get plex item via API for additional metadata
                plex_episode = get_item(ep.id)
+                stream_info = get_plexapi_stream_info(plex_episode)

                for item in media.seasons[season].episodes[episode].items:
                    for part in item.parts:
                        videos.append(
                            get_metadata_dict(plex_episode, part,
-                                              {"plex_part": part, "type": "episode", "title": ep.title,
-                                               "series": media.title, "id": ep.id,
-                                               "series_id": media.id, "season_id": season_object.id,
-                                               "episode": plex_episode.index, "season": plex_episode.season.index,
-                                               "section": plex_episode.section.title
-                                               })
+                                              dict(stream_info, **{"plex_part": part, "type": "episode",
+                                                                    "title": ep.title,
+                                                                    "series": media.title, "id": ep.id, "year": year,
+                                                                    "series_id": media.id,
+                                                                    "season_id": season_object.id,
+                                                                    "imdb_id": None, "series_tvdb_id": series_tvdb_id,
+                                                                    "tvdb_id": tvdb_id,
+                                                                    "original_title": original_title,
+                                                                    "episode": plex_episode.index,
+                                                                    "season": plex_episode.season.index,
+                                                                    "section": plex_episode.section.title
+                                                                    })
+                                              )
                        )
    else:
-        plex_item = get_item(media.id)
+        stream_info = get_plexapi_stream_info(plex_item)
+        imdb_id = None
+        if imdb_guid_identifier in media.guid:
+            imdb_id = media.guid[len(imdb_guid_identifier):].split("?")[0]
        for item in media.items:
            for part in item.parts:
                videos.append(
-                    get_metadata_dict(plex_item, part, {"plex_part": part, "type": "movie",
-                                                        "title": media.title, "id": media.id,
-                                                        "series_id": None,
-                                                        "season_id": None,
-                                                        "section": plex_item.section.title})
+                    get_metadata_dict(plex_item, part, dict(stream_info, **{"plex_part": part, "type": "movie",
+                                                                             "title": media.title, "id": media.id,
+                                                                             "series_id": None, "year": year,
+                                                                             "season_id": None, "imdb_id": imdb_id,
+                                                                             "original_title": original_title,
+                                                                             "series_tvdb_id": None, "tvdb_id": None,
+                                                                             "section": plex_item.section.title})
+                                      )
                )
    return videos

@@ -92,10 +164,10 @@ def get_media_item_ids(media, kind="series"):
    return ids


-def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):
+def scan_video(pms_video_info, ignore_all=False, hints=None, rating_key=None):
    """
    returns a subliminal/guessit-refined parsed video
-    :param plex_part: 
+    :param pms_video_info: 
    :param ignore_all: 
    :param hints: 
    :param rating_key: 
@@ -104,6 +176,8 @@ def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):
    embedded_subtitles = not ignore_all and Prefs['subtitles.scan.embedded']
    external_subtitles = not ignore_all and Prefs['subtitles.scan.external']

+    plex_part = pms_video_info["plex_part"]
+
    if ignore_all:
        Log.Debug("Force refresh intended.")

@@ -111,7 +185,10 @@ def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):
        plex_part.file, external_subtitles, embedded_subtitles))

    known_embedded = []
-    parts = list(Plex["library"].metadata(rating_key))[0].media.parts
+    parts = []
+    for media in list(Plex["library"].metadata(rating_key))[0].media:
+        parts += media.parts
+
    plexpy_part = None
    for part in parts:
        if int(part.id) == int(plex_part.id):
@@ -139,7 +216,7 @@ def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):

    try:
        # get basic video info scan (filename)
-        video = parse_video(plex_part.file, hints, external_subtitles=external_subtitles,
+        video = parse_video(plex_part.file, pms_video_info, hints, external_subtitles=external_subtitles,
                            embedded_subtitles=embedded_subtitles, known_embedded=known_embedded,
                            forced_only=config.forced_only, video_fps=plex_part.fps)

@@ -165,7 +242,7 @@ def scan_videos(videos, kind="series", ignore_all=False):

        hints = helpers.get_item_hints(video)
        video["plex_part"].fps = get_stream_fps(video["plex_part"].streams)
-        scanned_video = scan_video(video["plex_part"], ignore_all=force_refresh or ignore_all, hints=hints,
+        scanned_video = scan_video(video, ignore_all=force_refresh or ignore_all, hints=hints,
                                   rating_key=video["id"])

        if not scanned_video:
@@ -179,49 +256,78 @@ def scan_videos(videos, kind="series", ignore_all=False):
    return ret


-class PartUnknownException(Exception):
-    pass
-
-
 def get_plex_metadata(rating_key, part_id, item_type):
    """
    uses the Plex 3rd party API accessor to get metadata information

-    :param rating_key:
+    :param rating_key: movie or episode
    :param part_id:
    :param item_type:
    :return:
    """

-    plex_item = list(Plex["library"].metadata(rating_key))[0]
+    try:
+        plex_item = list(Plex["library"].metadata(rating_key))[0]
+    except URLError:
+        return None

    # find current part
    current_part = None
-    for part in plex_item.media.parts:
-        if str(part.id) == part_id:
-            current_part = part
+    for media in plex_item.media:
+        for part in media.parts:
+            if str(part.id) == part_id:
+                current_part = part

    if not current_part:
-        raise PartUnknownException("Part unknown")
+        raise helpers.PartUnknownException("Part unknown")
+
+    stream_info = get_plexapi_stream_info(plex_item, part_id)

    # get normalized metadata
+    # fixme: duplicated logic of media_to_videos
    if item_type == "episode":
+        show = list(Plex["library"].metadata(plex_item.show.rating_key))[0]
+        year = show.year
+        tvdb_id = None
+        series_tvdb_id = None
+        original_title = show.title_original
+        if tvdb_guid_identifier in plex_item.guid:
+            tvdb_id = plex_item.guid[len(tvdb_guid_identifier):].split("?")[0]
+            series_tvdb_id = tvdb_id.split("/")[0]
        metadata = get_metadata_dict(plex_item, current_part,
-                                     {"plex_part": current_part, "type": "episode", "title": plex_item.title,
-                                      "series": plex_item.show.title, "id": plex_item.rating_key,
-                                      "series_id": plex_item.show.rating_key,
-                                      "season_id": plex_item.season.rating_key,
-                                      "season": plex_item.season.index,
-                                      "episode": plex_item.index
-                                      })
+                                     dict(stream_info,
+                                          **{"plex_part": current_part, "type": "episode", "title": plex_item.title,
+                                             "series": plex_item.show.title, "id": plex_item.rating_key,
+                                             "series_id": plex_item.show.rating_key,
+                                             "season_id": plex_item.season.rating_key,
+                                             "imdb_id": None,
+                                             "year": year,
+                                             "tvdb_id": tvdb_id,
+                                             "series_tvdb_id": series_tvdb_id,
+                                             "original_title": original_title,
+                                             "season": plex_item.season.index,
+                                             "episode": plex_item.index
+                                             })
+                                     )
    else:
-        metadata = get_metadata_dict(plex_item, current_part, {"plex_part": current_part, "type": "movie",
-                                                               "title": plex_item.title, "id": plex_item.rating_key,
-                                                               "series_id": None,
-                                                               "season_id": None,
-                                                               "season": None,
-                                                               "episode": None,
-                                                               "section": plex_item.section.title})
+        imdb_id = None
+        original_title = plex_item.title_original
+        if imdb_guid_identifier in plex_item.guid:
+            imdb_id = plex_item.guid[len(imdb_guid_identifier):].split("?")[0]
+        metadata = get_metadata_dict(plex_item, current_part,
+                                     dict(stream_info, **{"plex_part": current_part, "type": "movie",
+                                                           "title": plex_item.title, "id": plex_item.rating_key,
+                                                           "series_id": None,
+                                                           "season_id": None,
+                                                           "imdb_id": imdb_id,
+                                                           "year": plex_item.year,
+                                                           "tvdb_id": None,
+                                                           "series_tvdb_id": None,
+                                                           "original_title": original_title,
+                                                           "season": None,
+                                                           "episode": None,
+                                                           "section": plex_item.section.title})
+                                     )
    return metadata


@@ -257,3 +363,24 @@ class PMSMediaProxy(object):
                break

            m = m.children[0]
+
+    def get_all_parts(self):
+        """
+        walk the mediatree until the first media item is found and collect all of its parts
+
+        :return: list of parts; empty if no media item was found
+        """
+        m = self.mediatree
+        parts = []
+        while 1:
+            if m.items:
+                media_item = m.items[0]
+                for part in media_item.parts:
+                    parts.append(part)
+                break
+
+            if not m.children:
+                break
+
+            m = m.children[0]
+        return parts
@@ -168,6 +168,7 @@ class DefaultScheduler(object):
                for args, kwargs in queue:
                    Log.Debug("Dispatching single task: %s, %s", args, kwargs)
                    Thread.Create(self.run_task, True, *args, **kwargs)
+                    Thread.Sleep(5.0)

            # scheduled tasks
            for name, info in self.tasks.iteritems():
@@ -185,9 +186,13 @@ class DefaultScheduler(object):
                    continue

                if not task.last_run or (task.last_run + datetime.timedelta(**{frequency_key: frequency_num}) <= now):
+                    # fixme: scheduled tasks run synchronously. is this the best idea?
+                    #Thread.Create(self.run_task, True, name)
+                    #Thread.Sleep(5.0)
                    self.run_task(name)
+                    Thread.Sleep(5.0)

-            Thread.Sleep(5.0)
+            Thread.Sleep(1)


 scheduler = DefaultScheduler()
@@ -137,7 +137,8 @@ def save_subtitles_to_file(subtitles):
                os.makedirs(fld)
        subliminal.save_subtitles(video, video_subtitles, directory=fld, single=cast_bool(Prefs['subtitles.only_one']),
                                      encode_with=force_utf8 if config.enforce_encoding else None,
-                                      chmod=config.chmod, forced_tag=config.forced_only, path_decoder=force_unicode)
+                                      chmod=config.chmod, forced_tag=config.forced_only, path_decoder=force_unicode,
+                                      debug_mods=config.debug_mods)
    return True


@@ -145,7 +146,8 @@ def save_subtitles_to_metadata(videos, subtitles):
    for video, video_subtitles in subtitles.items():
        mediaPart = videos[video]
        for subtitle in video_subtitles:
-            content = force_utf8(subtitle.text) if config.enforce_encoding else subtitle.content
+            content = force_utf8(subtitle.get_modified_text(debug=config.debug_mods)) if config.enforce_encoding else \
+                subtitle.get_modified_content(debug=config.debug_mods)

            if not isinstance(mediaPart, Framework.api.agentkit.MediaPart):
                # we're being handed a Plex.py model instance here, not an internal PMS MediaPart object.
@@ -204,6 +206,8 @@ def save_subtitles(scanned_video_part_map, downloaded_subtitles, mode="a", bare_
    if not bare_save and save_successful and config.notify_executable:
        notify_executable(config.notify_executable, scanned_video_part_map, downloaded_subtitles, storage)

-    if not bare_save:
+    if not bare_save and save_successful:
        store_subtitle_info(scanned_video_part_map, downloaded_subtitles, storage, mode=mode)

+    return save_successful
+
@@ -129,9 +129,8 @@ class DefaultSubtitleHelper(SubtitleHelper):
                default = '1'

        # Attempt to extract the language from the filename (e.g. Avatar (2009).eng)
-        language = ""
-
-        # IETF support thanks to https://github.com/hpsbranco/LocalMedia.bundle/commit/4fad9aefedece78a1fa96401304351347f644369
+        # IETF support thanks to
+        # https://github.com/hpsbranco/LocalMedia.bundle/commit/4fad9aefedece78a1fa96401304351347f644369
        language = Locale.Language.Match(match_ietf_language(file))

        # skip non-SRT if wanted
@@ -194,7 +193,10 @@ def get_subtitles_from_metadata(part):
 def force_utf8(content):
    a = UnicodeDammit(content)

-    Log.Debug("detected encoding: %s (None: most likely already successfully decoded)" % a.original_encoding)
+    if a.original_encoding:
+        Log.Debug("detected encoding: %s" % a.original_encoding)
+    else:
+        Log.Debug("detected encoding: unicode (already decoded)")

    # easy way out - already utf-8
    if a.original_encoding and a.original_encoding == "utf-8":
@@ -4,6 +4,7 @@ import datetime
 import time
 import operator
 import traceback
+from urllib2 import URLError

 from subliminal_patch.score import compute_score
 from subliminal_patch.core import download_subtitles
@@ -16,8 +17,8 @@ from storage import save_subtitles, whack_missing_parts, get_subtitle_storage
 from support.config import config
 from support.items import get_recent_items, is_ignored, get_item
 from support.lib import Plex
-from support.helpers import track_usage, get_title_for_video_metadata, cast_bool
-from support.plex_media import scan_videos, get_plex_metadata, PartUnknownException
+from support.helpers import track_usage, get_title_for_video_metadata, cast_bool, PartUnknownException
+from support.plex_media import scan_videos, get_plex_metadata


 class Task(object):
@@ -80,14 +81,16 @@ class Task(object):
        return

    def run(self):
+        Log.Info(u"Task: running: %s", self.name)
        self.time_start = datetime.datetime.now()

    def post_run(self, data_holder):
        self.running = False
        self.last_run = datetime.datetime.now()
-        if self.time_start:
+        if self.time_start and self.last_run:
            self.last_run_time = self.last_run - self.time_start
        self.time_start = None
+        Log.Info(u"Task: ran: %s", self.name)


 class SearchAllRecentlyAddedMissing(Task):
@@ -122,7 +125,7 @@ class SearchAllRecentlyAddedMissing(Task):
    def prepare(self, *args, **kwargs):
        self.items_done = []
        recent_items = get_recent_items()
-        missing = items_get_all_missing_subs(recent_items)
+        missing = items_get_all_missing_subs(recent_items, sleep_after_request=0.2)
        ids = set([id for added_at, id, title, item, missing_languages in missing if not is_ignored(id, item=item)])
        self.items_searching = missing
        self.items_searching_ids = ids
@@ -138,14 +141,19 @@ class SearchAllRecentlyAddedMissing(Task):

        for added_at, item_id, title, item, missing_languages in self.items_searching:
            Log.Debug(u"Task: %s, triggering refresh for %s (%s)", self.name, title, item_id)
-            refresh_item(item_id)
+            try:
+                refresh_item(item_id)
+            except URLError:
+                # timeout
+                pass
            search_started = datetime.datetime.now()
            tries = 1
            while 1:
                if item_id in self.items_done:
                    items_done_count += 1
-                    Log.Debug(u"Task: %s, item %s done", self.name, item_id)
                    self.percentage = int(items_done_count * 100 / missing_count)
+                    Log.Debug(u"Task: %s, item %s done (%s%%, %s/%s)", self.name, item_id, self.percentage,
+                              items_done_count, missing_count)
                    break

                # item considered stalled after self.stall_time seconds passed after last refresh
@@ -158,14 +166,18 @@ class SearchAllRecentlyAddedMissing(Task):
                    Log.Debug(u"Task: %s, item stalled for %s seconds: %s, retrying", self.name, self.stall_time,
                              item_id)
                    tries += 1
-                    refresh_item(item_id)
+                    try:
+                        refresh_item(item_id)
+                    except URLError:
+                        pass
                    search_started = datetime.datetime.now()
                    time.sleep(1)
                time.sleep(0.1)
            # we can't hammer the PMS, otherwise requests will be stalled
-            time.sleep(1)
+            time.sleep(5)

-        Log.Debug("Task: %s, done. Failed items: %s", self.name, self.items_failed)
+        Log.Debug("Task: %s, done (%s%%, %s/%s). Failed items: %s", self.name, self.percentage,
+                  items_done_count, missing_count, self.items_failed)
        self.running = False

    def post_run(self, task_data):
@@ -179,13 +191,11 @@ class SearchAllRecentlyAddedMissing(Task):


 class SubtitleListingMixin(object):
-    def list_subtitles(self, rating_key, item_type, part_id, language):
+    def list_subtitles(self, rating_key, item_type, part_id, language, skip_wrong_fps=True):
        metadata = get_plex_metadata(rating_key, part_id, item_type)

-        if item_type == "episode":
-            min_score = 240
-        else:
-            min_score = 60
+        if not metadata:
+            return

        scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
        if not scanned_parts:
@@ -195,9 +205,21 @@ class SubtitleListingMixin(object):
        video, plex_part = scanned_parts.items()[0]
        config.init_subliminal_patches()

+        provider_settings = config.provider_settings.copy()
+        if not skip_wrong_fps:
+            # copy the nested dict as well, so the shared config.provider_settings stays untouched
+            provider_settings["opensubtitles"] = dict(provider_settings["opensubtitles"], skip_wrong_fps=False)
+
+        if item_type == "episode":
+            min_score = 240
+            if video.is_special:
+                min_score = 180
+        else:
+            min_score = 60
+
        available_subs = list_all_subtitles(scanned_parts, {Language.fromietf(language)},
                                            providers=config.providers,
-                                            provider_configs=config.provider_settings,
+                                            provider_configs=provider_settings,
                                            pool_class=config.provider_pool)

        use_hearing_impaired = Prefs['subtitles.search.hearingImpaired'] in ("prefer", "force HI")
@@ -248,7 +270,7 @@ class DownloadSubtitleMixin(object):
        if subtitle.content:
            try:
                whack_missing_parts(scanned_parts)
-                save_subtitles(scanned_parts, {video: [subtitle]}, mode=mode)
+                save_subtitles(scanned_parts, {video: [subtitle]}, mode=mode, mods=config.default_mods)
                Log.Debug("Manually downloaded subtitle for: %s", rating_key)
                download_successful = True
                refresh_item(rating_key)
@@ -291,7 +313,13 @@ class AvailableSubsForItem(SubtitleListingMixin, Task):
        super(AvailableSubsForItem, self).run()
        self.running = True
        track_usage("Subtitle", "manual", "list", 1)
-        self.data = self.list_subtitles(self.rating_key, self.item_type, self.part_id, self.language)
+        subs = self.list_subtitles(self.rating_key, self.item_type, self.part_id, self.language, skip_wrong_fps=False)
+        if not subs:
+            self.data = None
+            return
+
+        # we can't have nasty unpicklable stuff like ZipFile, BytesIO etc in self.data
+        self.data = [s.make_picklable() for s in subs]

    def post_run(self, task_data):
        super(AvailableSubsForItem, self).post_run(task_data)
@@ -362,13 +390,26 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
                return

        now = datetime.datetime.now()
+        min_score_series = int(Prefs["subtitles.search.minimumTVScore2"].strip())
+        min_score_movies = int(Prefs["subtitles.search.minimumMovieScore2"].strip())
+        overwrite_manually_modified = cast_bool(
+            Prefs["scheduler.tasks.FindBetterSubtitles.overwrite_manually_modified"])
+        overwrite_manually_selected = cast_bool(
+            Prefs["scheduler.tasks.FindBetterSubtitles.overwrite_manually_selected"])

        subtitle_storage = get_subtitle_storage()
        recent_subs = subtitle_storage.load_recent_files(age_days=max_search_days)
+        viable_item_count = 0

        for fn, stored_subs in recent_subs.iteritems():
            video_id = stored_subs.video_id
-            cutoff = self.series_cutoff if stored_subs.item_type == "episode" else self.movies_cutoff
+
+            if stored_subs.item_type == "episode":
+                cutoff = self.series_cutoff
+                min_score = min_score_series
+            else:
+                cutoff = self.movies_cutoff
+                min_score = min_score_movies

            # don't search for better subtitles until at least 30 minutes have passed
            if stored_subs.added_at + datetime.timedelta(minutes=30) > now:
@@ -379,6 +420,7 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
            if stored_subs.added_at + datetime.timedelta(days=max_search_days) <= now:
                continue

+            viable_item_count += 1
            ditch_parts = []

            # look through all stored subtitle data
@@ -398,14 +440,20 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):

                    # late cutoff met? skip
                    if current_score >= cutoff:
-                        Log.Debug(u"Skipping finding better subs, cutoff met (current: %s, cutoff: %s): %s",
-                                  current_score, cutoff, stored_subs.title)
+                        Log.Debug(u"Skipping finding better subs, cutoff met (current: %s, cutoff: %s): %s (%s)",
+                                  current_score, cutoff, stored_subs.title, video_id)
                        continue

                    # got manual subtitle but don't want to touch those?
-                    if current_mode == "m" and \
-                            not cast_bool(Prefs["scheduler.tasks.FindBetterSubtitles.overwrite_manually_selected"]):
-                        Log.Debug(u"Skipping finding better subs, had manual: %s", stored_subs.title)
+                    if current_mode == "m" and not overwrite_manually_selected:
+                        Log.Debug(u"Skipping finding better subs, had manual: %s (%s)", stored_subs.title, video_id)
+                        continue
+
+                    # subtitle modifications different from default
+                    if not overwrite_manually_modified and current.mods \
+                            and set(current.mods).difference(set(config.default_mods)):
+                        Log.Debug(u"Skipping finding better subs, it has manual modifications: %s (%s)",
+                                  stored_subs.title, video_id)
                        continue

                    try:
@@ -420,7 +468,7 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
                        better_downloaded = False
                        better_tried_download = 0
                        for sub in subs:
-                            if sub.score > current_score:
+                            if sub.score > current_score and sub.score > min_score:
                                Log.Debug("Better subtitle found for %s, downloading", video_id)
                                better_tried_download += 1
                                ret = self.download_subtitle(sub, video_id, mode="b")
@@ -444,8 +492,13 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
                        pass
                subtitle_storage.save(stored_subs)

+            time.sleep(1)
+
        if better_found:
-            Log.Debug("Task: %s, done. Better subtitles found for %s items", self.name, better_found)
+            Log.Debug("Task: %s, done. Better subtitles found for %s/%s items", self.name, better_found,
+                      viable_item_count)
+        else:
+            Log.Debug("Task: %s, done. No better subtitles found for %s items", self.name, viable_item_count)


 class SubtitleStorageMaintenance(Task):
@@ -465,9 +518,27 @@ class SubtitleStorageMaintenance(Task):
            Log.Info("Nothing to do")


+class MigrateSubtitleStorage(Task):
+    periodic = False
+    frequency = None
+
+    def run(self):
+        super(MigrateSubtitleStorage, self).run()
+        self.running = True
+        Log.Info("Running subtitle storage migration")
+        storage = get_subtitle_storage()
+        for fn in storage.get_all_files():
+            if fn.endswith(".json.gz"):
+                continue
+            Log.Debug("Migrating %s", fn)
+            storage.load(None, fn)
+
+
 scheduler.register(SearchAllRecentlyAddedMissing)
 scheduler.register(AvailableSubsForItem)
 scheduler.register(DownloadSubtitleForItem)
 scheduler.register(MissingSubtitles)
 scheduler.register(FindBetterSubtitles)
 scheduler.register(SubtitleStorageMaintenance)
+scheduler.register(MigrateSubtitleStorage)
+
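A note on the `skip_wrong_fps` override in `list_subtitles` above: the following is a minimal sketch, not the plugin's code, of why overriding a nested provider option needs the inner dict copied as well. It assumes `config.provider_settings` is a plain dict of per-provider dicts; `make_settings`, `override_shallow` and `override_safe` are hypothetical names used only for illustration.

```python
def make_settings():
    # Hypothetical stand-in for config.provider_settings (dict of per-provider dicts).
    return {"opensubtitles": {"skip_wrong_fps": True}}


def override_shallow(settings):
    # dict.copy() copies only the outer dict; the inner dict is still shared,
    # so this assignment also changes the caller's settings.
    local = settings.copy()
    local["opensubtitles"]["skip_wrong_fps"] = False
    return local


def override_safe(settings):
    # Rebuild the inner dict before changing it, leaving the original untouched.
    local = settings.copy()
    local["opensubtitles"] = dict(local["opensubtitles"], skip_wrong_fps=False)
    return local


shared_a = make_settings()
local_a = override_shallow(shared_a)

shared_b = make_settings()
local_b = override_safe(shared_b)

print(shared_a["opensubtitles"]["skip_wrong_fps"])  # changed by the shallow variant
print(shared_b["opensubtitles"]["skip_wrong_fps"])  # unchanged by the safe variant
```

With the shallow variant, a manual listing that disables the FPS check would leave it disabled for later automatic searches in the same process, which is presumably not intended.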
@@ -258,13 +258,14 @@
      "35",
      "30",
      "25",
+      "21",
      "20",
      "15",
      "10",
      "5",
      "0"
    ],
-    "default": "25"
+    "default": "21"
  },
  {
    "id": "provider.addic7ed.use_random_agents",
@@ -332,7 +333,7 @@
  },
  {
    "id": "providers.multithreading",
-    "label": "Search enabled providers simuntaneously (multithreading)",
+    "label": "Search enabled providers simultaneously (multithreading)",
    "type": "bool",
    "default": "true"
  },
@@ -356,7 +357,7 @@
  },
  {
    "id": "subtitles.scan.exotic_ext",
-    "label": "Scan: include \"exotic\" external subtitle formats (anything else than .srt/.ssa/.ass)",
+    "label": "Scan: include \"exotic\" subtitle formats (anything else than .srt/.ssa/.ass; embedded or external)",
    "type": "bool",
    "default": "false"
  },
@@ -381,7 +382,7 @@
    "id": "subtitles.search.minimumMovieScore2",
    "label": "Minimum score for movies (min: 60, def/sane: 69, min-ideal: 82; see http://v.ht/szscores)",
    "type": "text",
-    "default": "69"
+    "default": "60"
  },
  {
    "id": "subtitles.search.hearingImpaired",
@@ -399,14 +400,51 @@
    "id": "subtitles.remove_hi",
    "label": "Remove Hearing Impaired tags from downloaded subtitles",
    "type": "bool",
+    "default": "false"
+  },
+  {
+    "id": "subtitles.fix_common",
+    "label": "Fix common whitespace/punctuation issues in subtitles",
+    "type": "bool",
+    "default": "true"
+  },
+  {
+    "id": "subtitles.fix_ocr",
+    "label": "Fix common OCR errors in downloaded subtitles",
+    "type": "bool",
    "default": "true"
  },
  {
    "id": "subtitles.enforce_encoding",
-    "label": "Normalize subtitle encoding to UTF-8",
+    "label": "Normalize subtitle encoding to UTF-8 (highly recommended!)",
    "type": "bool",
    "default": "true"
  },
+  {
+    "id": "subtitles.colors",
+    "label": "Change colors of subtitles to",
+    "type": "enum",
+    "values": [
+      "don't change",
+      "white",
+      "light-grey",
+      "red",
+      "green",
+      "yellow",
+      "blue",
+      "magenta",
+      "cyan",
+      "black",
+      "dark-red",
+      "dark-green",
+      "dark-yellow",
+      "dark-blue",
+      "dark-magenta",
+      "dark-cyan",
+      "dark-grey"
+    ],
+    "default": "don't change"
+  },
  {
    "id": "subtitles.save.filesystem",
    "label": "Store subtitles next to media files (instead of metadata)",
@@ -498,7 +536,7 @@
    "id": "scheduler.max_recent_items_per_library",
    "label": "Scheduler: Recent items to consider per library",
    "type": "text",
-    "default": "500"
+    "default": "1000"
  },
  {
    "id": "scheduler.tasks.FindBetterSubtitles.frequency",
@@ -524,6 +562,12 @@
    "type": "bool",
    "default": "true"
  },
+  {
+    "id": "scheduler.tasks.FindBetterSubtitles.overwrite_manually_modified",
+    "label": "Scheduler: Overwrite subtitles with non-default subtitle modifications when better found",
+    "type": "bool",
+    "default": "false"
+  },
  {
    "id": "history_size",
    "label": "History: amount of items to store historical data for",
@@ -599,7 +643,7 @@
  },
  {
    "id": "notify_executable",
-    "label": "Call this executable upon successful subtitle download",
+    "label": "Call this executable upon successful subtitle download (see Wiki for details)",
    "type": "text",
    "default": ""
  },
@@ -622,6 +666,12 @@
    ],
    "default": "WARNING"
  },
+  {
+    "id": "log_debug_mods",
+    "label": "Log subtitle modification (debug)",
+    "type": "bool",
+    "default": "false"
+  },
  {
    "id": "log_console",
    "label": "Log to console (for development/debugging)",
@@ -9,11 +9,11 @@
        <key>CFBundleInfoDictionaryVersion</key>
        <string>6.0</string>
        <key>CFBundleShortVersionString</key>
-        <string>2.0.0</string>
+        <string>2.0.20</string>
        <key>CFBundleSignature</key>
        <string>????</string>
        <key>CFBundleVersion</key>
-        <string>2.0.0.10</string>
+        <string>2.0.20.1364</string>
        <key>PlexFrameworkVersion</key>
        <string>2</string>
        <key>PlexPluginClass</key>
@@ -32,7 +32,7 @@

 &lt;h1&gt;Sub-Zero for Plex&lt;/h1&gt;&lt;i&gt;Subtitles done right&lt;/i&gt;

-Version 2.0.0.10 DEV
+Version 2.0.20.1364 RC9

 Originally based on @bramwalet's awesome &lt;a href=&quot;https://github.com/bramwalet/Subliminal.bundle&quot;&gt;Subliminal.bundle&lt;/a&gt;

@@ -369,7 +369,8 @@ class Chapter(object):
        if chapterdisplays:
            string = chapterdisplays[0].get('ChapString')
            language = chapterdisplays[0].get('ChapLanguage')
-        return cls(start, hidden, enabled, end, string, language)
+            return cls(start, hidden, enabled, end, string, language)
+        return cls(start, hidden, enabled, end)

    def __repr__(self):
        return '<%s [%s, enabled=%s]>' % (self.__class__.__name__, self.start, self.enabled)
@@ -168,9 +168,13 @@ def parse(stream, specs, size=None, ignore_element_types=None, ignore_element_na
    while size is None or stream.tell() - start < size:
        try:
            element = parse_element(stream, specs)
+            if not element or not hasattr(element, "type"):
+                stream.seek(getattr(element, "size", 0), 1)
+                continue
+
            if element.type is None:
-                logger.error('Element with id 0x%x is not in the specs' % element_id)
-                stream.seek(element_size, 1)
+                logger.error('Element with id 0x%x is not in the specs' % element.id)
+                stream.seek(element.size, 1)
                continue
            elif element.type in ignore_element_types or element.name in ignore_element_names:
                logger.info('%s %s %s ignored', element.__class__.__name__, element.name, element.type)
@@ -39,12 +39,13 @@ def audio_codec():
    rebulk.defaults(name="audio_codec", conflict_solver=audio_codec_priority)

    rebulk.regex("MP3", "LAME", r"LAME(?:\d)+-?(?:\d)+", value="MP3")
-    rebulk.regex("Dolby", "DolbyDigital", "Dolby-Digital", "DDP?", value="DolbyDigital")
+    rebulk.regex("Dolby", "DolbyDigital", "Dolby-Digital", "DD", value="DolbyDigital")
    rebulk.regex("DolbyAtmos", "Dolby-Atmos", "Atmos", value="DolbyAtmos")
-    rebulk.regex("AAC", value="AAC")
+    rebulk.string("AAC", value="AAC")
    rebulk.regex("AC3D?", value="AC3")
-    rebulk.regex("Flac", value="FLAC")
-    rebulk.regex("DTS", value="DTS")
+    rebulk.string('EAC3', 'DDP', 'DD+', value="EAC3")
+    rebulk.string("Flac", value="FLAC")
+    rebulk.string("DTS", value="DTS")
    rebulk.regex("True-?HD", value="TrueHD")

    rebulk.defaults(name="audio_profile")
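The hunk above moves `DDP` and `DD+` out of the `DolbyDigital` regex and registers them as plain strings mapping to `EAC3`. The sketch below uses only the standard `re` module (it does not call rebulk, whose matching internals are not shown in this diff) to illustrate why a literal `DD+` cannot simply be handed over as a regular expression. `regex_matches` is a hypothetical helper.

```python
import re


def regex_matches(pattern, text):
    """Return True if the whole text matches the regular expression."""
    return re.fullmatch(pattern, text) is not None


# As a regex, "DD+" means "D followed by one or more D", not a literal plus sign.
plus_as_regex = regex_matches("DD+", "DD+")
plus_repeats_d = regex_matches("DD+", "DDD")

# Escaping the pattern (or matching it as a plain string) treats "+" literally.
plus_escaped = regex_matches(re.escape("DD+"), "DD+")

# The old pattern "DDP?" accepted both "DD" and "DDP", so DDP was reported as DolbyDigital.
old_pattern_dd = regex_matches("DDP?", "DD")
old_pattern_ddp = regex_matches("DDP?", "DDP")

print(plus_as_regex, plus_repeats_d, plus_escaped, old_pattern_dd, old_pattern_ddp)
```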
@@ -34,15 +34,17 @@ def container():
              'ogv', 'qt', 'ra', 'ram', 'rm', 'ts', 'wav', 'webm', 'wma', 'wmv',
              'iso', 'vob']
    torrent = ['torrent']
+    nzb = ['nzb']

    rebulk.regex(r'\.'+build_or_pattern(subtitles)+'$', exts=subtitles, tags=['extension', 'subtitle'])
    rebulk.regex(r'\.'+build_or_pattern(info)+'$', exts=info, tags=['extension', 'info'])
    rebulk.regex(r'\.'+build_or_pattern(videos)+'$', exts=videos, tags=['extension', 'video'])
    rebulk.regex(r'\.'+build_or_pattern(torrent)+'$', exts=torrent, tags=['extension', 'torrent'])
+    rebulk.regex(r'\.'+build_or_pattern(nzb)+'$', exts=nzb, tags=['extension', 'nzb'])

    rebulk.defaults(name='container',
                    validator=seps_surround,
-                    formatter=lambda s: s.upper(),
+                    formatter=lambda s: s.lower(),
                    conflict_solver=lambda match, other: match
                    if other.name in ['format',
                                      'video_codec'] or other.name == 'container' and 'extension' in other.tags
@@ -51,5 +53,6 @@ def container():
    rebulk.string(*[sub for sub in subtitles if sub not in ['sub']], tags=['subtitle'])
    rebulk.string(*videos, tags=['video'])
    rebulk.string(*torrent, tags=['torrent'])
+    rebulk.string(*nzb, tags=['nzb'])

    return rebulk
@@ -5,7 +5,7 @@ Episode title
 """
 from collections import defaultdict

-from rebulk import Rebulk, Rule, AppendMatch, RenameMatch, POST_PROCESS
+from rebulk import Rebulk, Rule, AppendMatch, RemoveMatch, RenameMatch, POST_PROCESS

 from ..common import seps, title_seps
 from ..common.formatters import cleanup
@@ -19,8 +19,12 @@ def episode_title():
    :return: Created Rebulk object
    :rtype: Rebulk
    """
-    rebulk = Rebulk().rules(EpisodeTitleFromPosition,
-                            AlternativeTitleReplace,
+    previous_names = ('episode', 'episode_details', 'episode_count',
+                      'season', 'season_count', 'date', 'title', 'year')
+
+    rebulk = Rebulk().rules(RemoveConflictsWithEpisodeTitle(previous_names),
+                            EpisodeTitleFromPosition(previous_names),
+                            AlternativeTitleReplace(previous_names),
                            TitleToEpisodeTitle,
                            Filepart3EpisodeTitle,
                            Filepart2EpisodeTitle,
@@ -28,6 +32,62 @@ def episode_title():
    return rebulk


+class RemoveConflictsWithEpisodeTitle(Rule):
+    """
+    Remove conflicting matches that might lead to wrong episode_title parsing.
+    """
+
+    priority = 64
+    consequence = RemoveMatch
+
+    def __init__(self, previous_names):
+        super(RemoveConflictsWithEpisodeTitle, self).__init__()
+        self.previous_names = previous_names
+        self.next_names = ('streaming_service', 'screen_size', 'format',
+                           'video_codec', 'audio_codec', 'other', 'container')
+        self.affected_if_holes_after = ('part', )
+        self.affected_names = ('part', 'year')
+
+    def when(self, matches, context):
+        to_remove = []
+        for filepart in matches.markers.named('path'):
+            for match in matches.range(filepart.start, filepart.end,
+                                       predicate=lambda m: m.name in self.affected_names):
+                before = matches.previous(match, index=0,
+                                          predicate=lambda m, fp=filepart: not m.private and m.start >= fp.start)
+                if not before or before.name not in self.previous_names:
+                    continue
+
+                after = matches.next(match, index=0,
+                                     predicate=lambda m, fp=filepart: not m.private and m.end <= fp.end)
+                if not after or after.name not in self.next_names:
+                    continue
+
+                group = matches.markers.at_match(match, predicate=lambda m: m.name == 'group', index=0)
+
+                def has_value_in_same_group(current_match, current_group=group):
+                    """Return true if current match has value and belongs to the current group."""
+                    return current_match.value.strip(seps) and (
+                        current_group == matches.markers.at_match(current_match,
+                                                                  predicate=lambda mm: mm.name == 'group', index=0)
+                    )
+
+                holes_before = matches.holes(before.end, match.start, predicate=has_value_in_same_group)
+                holes_after = matches.holes(match.end, after.start, predicate=has_value_in_same_group)
+
+                if not holes_before and not holes_after:
+                    continue
+
+                if match.name in self.affected_if_holes_after and not holes_after:
+                    continue
+
+                to_remove.append(match)
+                if match.parent:
+                    to_remove.append(match.parent)
+
+        return to_remove
+
+
 class TitleToEpisodeTitle(Rule):
    """
    If multiple different title are found, convert the one following episode number to episode_title.
@@ -65,12 +125,14 @@ class EpisodeTitleFromPosition(TitleBaseRule):
    """
    dependency = TitleToEpisodeTitle

+    def __init__(self, previous_names):
+        super(EpisodeTitleFromPosition, self).__init__('episode_title', ['title'])
+        self.previous_names = previous_names
+
    def hole_filter(self, hole, matches):
        episode = matches.previous(hole,
                                   lambda previous: any(name in previous.names
-                                                        for name in ['episode', 'episode_details',
-                                                                     'episode_count', 'season', 'season_count',
-                                                                     'date', 'title', 'year']),
+                                                        for name in self.previous_names),
                                   0)

        crc32 = matches.named('crc32')
@@ -88,9 +150,6 @@ class EpisodeTitleFromPosition(TitleBaseRule):
            return False
        return super(EpisodeTitleFromPosition, self).should_remove(match, matches, filepart, hole, context)

-    def __init__(self):
-        super(EpisodeTitleFromPosition, self).__init__('episode_title', ['title'])
-
    def when(self, matches, context):
        if matches.named('episode_title'):
            return
@@ -104,6 +163,10 @@ class AlternativeTitleReplace(Rule):
    dependency = EpisodeTitleFromPosition
    consequence = RenameMatch

+    def __init__(self, previous_names):
+        super(AlternativeTitleReplace, self).__init__()
+        self.previous_names = previous_names
+
    def when(self, matches, context):
        if matches.named('episode_title'):
            return
@@ -115,10 +178,7 @@ class AlternativeTitleReplace(Rule):
            if main_title:
                episode = matches.previous(main_title,
                                           lambda previous: any(name in previous.names
-                                                                for name in ['episode', 'episode_details',
-                                                                             'episode_count', 'season',
-                                                                             'season_count',
-                                                                             'date', 'title', 'year']),
+                                                                for name in self.previous_names),
                                           0)

                crc32 = matches.named('crc32')
@@ -231,14 +231,16 @@ def episodes():
                 formatter={'season': int, 'other': lambda match: 'Complete'})

    # 12, 13
-    rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int}) \
+    rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int},
+                 disabled=lambda context: context.get('type') == 'movie') \
        .defaults(validator=None) \
        .regex(r'(?P<episode>\d{2})') \
        .regex(r'v(?P<version>\d+)').repeater('?') \
        .regex(r'(?P<episodeSeparator>[x-])(?P<episode>\d{2})').repeater('*')

    # 012, 013
-    rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int}) \
+    rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int},
+                 disabled=lambda context: context.get('type') == 'movie') \
        .defaults(validator=None) \
        .regex(r'0(?P<episode>\d{1,2})') \
        .regex(r'v(?P<version>\d+)').repeater('?') \
@@ -246,7 +248,8 @@ def episodes():

    # 112, 113
    rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int},
-                 disabled=lambda context: not context.get('episode_prefer_number', False)) \
+                 disabled=lambda context: (not context.get('episode_prefer_number', False) or
+                                           context.get('type') == 'movie')) \
        .defaults(validator=None) \
        .regex(r'(?P<episode>\d{3,4})') \
        .regex(r'v(?P<version>\d+)').repeater('?') \
@@ -287,7 +290,8 @@ def episodes():
    rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode', 'weak-duplicate'],
                 formatter={'season': int, 'episode': int, 'version': int},
                 conflict_solver=lambda match, other: match if other.name == 'year' else '__default__',
-                 disabled=lambda context: context.get('episode_prefer_number', False)) \
+                 disabled=lambda context: (context.get('episode_prefer_number', False) or
+                                           context.get('type') == 'movie')) \
        .defaults(validator=None) \
        .regex(r'(?P<season>\d{1,2})(?P<episode>\d{2})') \
        .regex(r'v(?P<version>\d+)').repeater('?') \
@@ -460,8 +464,21 @@ class RemoveWeakIfMovie(Rule):
        return context.get('type') != 'episode'

    def when(self, matches, context):
-        if matches.named('year'):
-            return matches.tagged('weak-movie')
+        to_remove = []
+        to_ignore = set()
+        remove = False
+        for filepart in matches.markers.named('path'):
+            year = matches.range(filepart.start, filepart.end, predicate=lambda m: m.name == 'year', index=0)
+            if year:
+                remove = True
+                next_match = matches.next(year, predicate=lambda m, fp=filepart: m.private and m.end <= fp.end, index=0)
+                if next_match and not matches.at_match(next_match, predicate=lambda m: m.name == 'year'):
+                    to_ignore.add(next_match.initiator)
+
+        if remove:
+            to_remove.extend(matches.tagged('weak-movie', predicate=lambda m: m.initiator not in to_ignore))
+
+        return to_remove


 class RemoveWeakIfSxxExx(Rule):
@@ -39,8 +39,7 @@ COMMON_WORDS_STRICT = frozenset(['brazil'])

 UNDETERMINED = babelfish.Language('und')

-SYN = {('und', None): ['unknown', 'inconnu', 'unk'],
-       ('ell', None): ['gr', 'greek'],
+SYN = {('ell', None): ['gr', 'greek'],
       ('spa', None): ['esp', 'español', 'espanol'],
       ('fra', None): ['français', 'vf', 'vff', 'vfi', 'vfq'],
       ('swe', None): ['se'],
@@ -85,6 +85,7 @@ class ValidateWebsitePrefix(Rule):
    """
    Validate website prefixes
    """
+    priority = 64
    consequence = RemoveMatch

    def when(self, matches, context):
@@ -1814,7 +1814,7 @@
  format: HDTV
  video_codec: h264
  audio_codec: AAC
-  container: MP4
+  container: mp4
  release_group: k3n
  type: episode

@@ -1885,7 +1885,7 @@

 ? Breaking.Bad.S01E01.2008.BluRay.VC1.1080P.5.1.WMV-NOVO
 : audio_channels: '5.1'
-  container: WMV
+  container: wmv
  episode: 1
  format: BluRay
  release_group: NOVO
@@ -1922,9 +1922,7 @@

 ? Fear.The.Walking.Dead.S02E01.HDTV.x264.AAC.MP4-k3n.mp4
 : audio_codec: AAC
-  container:
-  - MP4
-  - mp4
+  container: mp4
  episode: 1
  format: HDTV
  mimetype: video/mp4
@@ -2242,7 +2240,7 @@
  screen_size: 1080p
  streaming_service: Amazon Prime
  format: WEBRip
-  audio_codec: DolbyDigital
+  audio_codec: EAC3
  audio_channels: '5.1'
  video_codec: h264
  type: episode
@@ -2692,7 +2690,7 @@
  screen_size: 4K
  streaming_service: Amazon Prime
  format: WEBRip
-  audio_codec: DolbyDigital
+  audio_codec: EAC3
  audio_channels: '5.1'
  video_codec: h264
  release_group: Group
@@ -3311,7 +3309,7 @@
  screen_size: 720p
  format: WEBRip
  video_codec: h264
-  container: MKV
+  container: mkv
  audio_codec: AC3
  audio_channels: '5.1'
  release_group: Ehhhh
@@ -3846,3 +3844,113 @@
  release_group: 0SEC [GloDLS]
  container: mkv
  type: episode
+
+? Anthony.Bourdain.Parts.Unknown.S09E01.Los.Angeles.720p.HDTV.x264-MiNDTHEGAP
+: title: Anthony Bourdain Parts Unknown
+  season: 9
+  episode: 1
+  episode_title: Los Angeles
+  screen_size: 720p
+  format: HDTV
+  video_codec: h264
+  release_group: MiNDTHEGAP
+  type: episode
+
+? -feud.s01e05.and.the.winner.is.(the.oscars.of.1963).720p.amzn.webrip.dd5.1.x264-casstudio.mkv
+: year: 1963
+
+? feud.s01e05.and.the.winner.is.(the.oscars.of.1963).720p.amzn.webrip.dd5.1.x264-casstudio.mkv
+: title: feud
+  season: 1
+  episode: 5
+  episode_title: and the winner is
+  screen_size: 720p
+  streaming_service: Amazon Prime
+  format: WEBRip
+  audio_codec: DolbyDigital
+  audio_channels: '5.1'
+  video_codec: h264
+  release_group: casstudio
+  container: mkv
+  type: episode
+
+? Adventure.Time.S08E16.Elements.Part.1.Skyhooks.720p.WEB-DL.AAC2.0.H.264-RTN.mkv
+: title: Adventure Time
+  season: 8
+  episode: 16
+  season: 8
+  episode: 16
+  episode_title: Elements Part 1 Skyhooks
+  screen_size: 720p
+  format: WEB-DL
+  audio_codec: AAC
+  audio_channels: '2.0'
+  video_codec: h264
+  release_group: RTN
+  container: mkv
+  type: episode
+
+? D:\TV\SITCOMS (CLASSIC)\That '70s Show\Season 07\That '70s Show - S07E22 - 2000 Light Years from Home.mkv
+: title: That '70s Show
+  season: 7
+  episode: 22
+  episode_title: 2000 Light Years from Home
+  other: Classic
+  container: mkv
+  mimetype: video/x-matroska
+  type: episode
+
+? Show.Name.S02E01.Super.Title.720p.WEB-DL.DD5.1.H.264-ABC.nzb
+: title: Show Name
+  season: 2
+  episode: 1
+  episode_title: Super Title
+  screen_size: 720p
+  format: WEB-DL
+  audio_codec: DolbyDigital
+  audio_channels: '5.1'
+  video_codec: h264
+  release_group: ABC
+  container: nzb
+  type: episode
+
+? "[SGKK] Bleach 312v1 [720p/mkv]-Group.mkv"
+: title: Bleach
+  season: 3
+  episode: 12
+  version: 1
+  screen_size: 720p
+  release_group: Group
+  container: mkv
+  type: episode
+
+? The.Expanse.S02E08.720p.WEBRip.x264.EAC3-KiNGS.mkv
+: title: The Expanse
+  season: 2
+  episode: 8
+  screen_size: 720p
+  format: WEBRip
+  video_codec: h264
+  audio_codec: EAC3
+  release_group: KiNGS
+  container: mkv
+  type: episode
+
+? Series_name.2005.211.episode.title.avi
+: title: Series name
+  year: 2005
+  season: 2
+  episode: 11
+  episode_title: episode title
+  container: avi
+  type: episode
+
+? the.flash.2014.208.hdtv-lol[ettv].mkv
+: title: the flash
+  year: 2014
+  season: 2
+  episode: 8
+  format: HDTV
+  release_group: lol[ettv]
+  container: mkv
+  type: episode
@@ -644,7 +644,7 @@
    - Timsit
    - Lindon
  screen_size: 1080p
-  container: MKV
+  container: mkv
  format: HDTV

 ? some.movie.720p.bluray.x264-mind
@@ -1082,3 +1082,18 @@
  format: BluRay
  screen_size: 1080p
  type: movie
+
+? 10 Cloverfield Lane.[Blu-Ray 1080p].[MULTI]
+: options: --type movie
+  title: 10 Cloverfield Lane
+  format: BluRay
+  screen_size: 1080p
+  language: Multiple languages
+  type: movie
+
+? 007.Spectre.[HDTC.MD].[TRUEFRENCH]
+: options: --type movie
+  title: 007 Spectre
+  format: HDTC
+  language: French
+  type: movie
@@ -10,10 +10,14 @@

 ? +DolbyDigital
 ? +DD
-? +DDP
 ? +Dolby Digital
 : audio_codec: DolbyDigital

+? +DDP
+? +DD+
+? +EAC3
+: audio_codec: EAC3
+
 ? +DolbyAtmos
 ? +Dolby Atmos
 ? +Atmos
@@ -146,7 +146,7 @@
 ? Show.Name.-.Season.1.to.3.-.Mp4.1080p
 ? Show.Name.-.Season.1~3.-.Mp4.1080p
 ? Show.Name.-.Saison.1.a.3.-.Mp4.1080p
-: container: MP4
+: container: mp4
  screen_size: 1080p
  season:
  - 1
@@ -761,14 +761,15 @@
  type: episode
  video_codec: h264

+# Episode title is indeed 'October 8, 2014'
+# https://thetvdb.com/?tab=episode&seriesid=82483&seasonid=569935&id=4997362&lid=7
 ? The Soup - 11x41 - October 8, 2014.mp4
 : container: mp4
  episode: 41
-  episode_title: October 8
+  episode_title: October 8, 2014
  season: 11
  title: The Soup
  type: episode
-  year: 2014

 ? Red.Rock.S02E59.WEB-DLx264-JIVE
 : episode: 59
@@ -0,0 +1,24 @@
+
+from .utils import hashodict, NoNumpyException, NoPandasException, get_scalar_repr, encode_scalars_inplace
+from .comment import strip_comment_line_with_symbol, strip_comments
+from .encoders import TricksEncoder, json_date_time_encode, class_instance_encode, json_complex_encode, \
+	numeric_types_encode, ClassInstanceEncoder, json_set_encode, pandas_encode, nopandas_encode, \
+	numpy_encode, NumpyEncoder, nonumpy_encode, NoNumpyEncoder
+from .decoders import DuplicateJsonKeyException, TricksPairHook, json_date_time_hook, json_complex_hook, \
+	numeric_types_hook, ClassInstanceHook, json_set_hook, pandas_hook, nopandas_hook, json_numpy_obj_hook, \
+	json_nonumpy_obj_hook
+from .nonp import dumps, dump, loads, load
+
+
+try:
+	# find_module takes just as long as importing, so no optimization possible
+	import numpy
+except ImportError:
+	NUMPY_MODE = False
+	# from .nonp import dumps, dump, loads, load, nonumpy_encode as numpy_encode, json_nonumpy_obj_hook as json_numpy_obj_hook
+else:
+	NUMPY_MODE = True
+	# from .np import dumps, dump, loads, load, numpy_encode, NumpyEncoder, json_numpy_obj_hook
+	# from .np_utils import encode_scalars_inplace
+
+
@@ -0,0 +1,29 @@
+
+from re import findall
+
+
+def strip_comment_line_with_symbol(line, start):
+	parts = line.split(start)
+	counts = [len(findall(r'(?:^|[^"\\]|(?:\\\\|\\")+)(")', part)) for part in parts]
+	total = 0
+	for nr, count in enumerate(counts):
+		total += count
+		if total % 2 == 0:
+			return start.join(parts[:nr+1]).rstrip()
+	else:
+		return line.rstrip()
+
+
+def strip_comments(string, comment_symbols=frozenset(('#', '//'))):
+	"""
+	:param string: A string containing json with comments started by comment_symbols.
+	:param comment_symbols: Iterable of symbols that start a line comment (default # or //).
+	:return: The string with the comments removed.
+	"""
+	lines = string.splitlines()
+	for k in range(len(lines)):
+		for symbol in comment_symbols:
+			lines[k] = strip_comment_line_with_symbol(lines[k], start=symbol)
+	return '\n'.join(lines)
+
+
@@ -0,0 +1,248 @@
+
+from datetime import datetime, date, time, timedelta
+from fractions import Fraction
+from importlib import import_module
+from collections import OrderedDict
+from decimal import Decimal
+from logging import warning
+from json_tricks import NoPandasException, NoNumpyException
+
+
+class DuplicateJsonKeyException(Exception):
+	""" Trying to load a json map which contains duplicate keys, but allow_duplicates is False """
+
+
+class TricksPairHook(object):
+	"""
+	Hook that converts json maps to the appropriate python type (dict or OrderedDict)
+	and then runs any number of hooks on the individual maps.
+	"""
+	def __init__(self, ordered=True, obj_pairs_hooks=None, allow_duplicates=True):
+		"""
+		:param ordered: True if maps should retain their ordering.
+		:param obj_pairs_hooks: An iterable of hooks to apply to elements.
+		"""
+		self.map_type = OrderedDict
+		if not ordered:
+			self.map_type = dict
+		self.obj_pairs_hooks = []
+		if obj_pairs_hooks:
+			self.obj_pairs_hooks = list(obj_pairs_hooks)
+		self.allow_duplicates = allow_duplicates
+
+	def __call__(self, pairs):
+		if not self.allow_duplicates:
+			known = set()
+			for key, value in pairs:
+				if key in known:
+					raise DuplicateJsonKeyException(('Trying to load a json map which contains a' +
+						' duplicate key "{0:}" (but allow_duplicates is False)').format(key))
+				known.add(key)
+		map = self.map_type(pairs)
+		for hook in self.obj_pairs_hooks:
+			map = hook(map)
+		return map
+
+
+def json_date_time_hook(dct):
+	"""
+	Convert an encoded date, time, datetime or timedelta back to its python representation, including optional timezone.
+
+	:param dct: (dict) json encoded date, time, datetime or timedelta
+	:return: (date/time/datetime/timedelta obj) python representation of the above
+	"""
+	def get_tz(dct):
+		if not 'tzinfo' in dct:
+			return None
+		try:
+			import pytz
+		except ImportError as err:
+			raise ImportError(('Tried to load a json object which has a timezone-aware (date)time. '
+				'However, `pytz` could not be imported, so the object could not be loaded. '
+				'Error: {0:}').format(str(err)))
+		return pytz.timezone(dct['tzinfo'])
+
+	if isinstance(dct, dict):
+		if '__date__' in dct:
+			return date(year=dct.get('year', 0), month=dct.get('month', 0), day=dct.get('day', 0))
+		elif '__time__' in dct:
+			tzinfo = get_tz(dct)
+			return time(hour=dct.get('hour', 0), minute=dct.get('minute', 0), second=dct.get('second', 0),
+				microsecond=dct.get('microsecond', 0), tzinfo=tzinfo)
+		elif '__datetime__' in dct:
+			tzinfo = get_tz(dct)
+			return datetime(year=dct.get('year', 0), month=dct.get('month', 0), day=dct.get('day', 0),
+				hour=dct.get('hour', 0), minute=dct.get('minute', 0), second=dct.get('second', 0),
+				microsecond=dct.get('microsecond', 0), tzinfo=tzinfo)
+		elif '__timedelta__' in dct:
+			return timedelta(days=dct.get('days', 0), seconds=dct.get('seconds', 0),
+				microseconds=dct.get('microseconds', 0))
+	return dct
+
+
+def json_complex_hook(dct):
+	"""
+	Convert an encoded complex number back to its python representation.
+
+	:param dct: (dict) json encoded complex number (__complex__)
+	:return: python complex number
+	"""
+	if isinstance(dct, dict):
+		if '__complex__' in dct:
+			parts = dct['__complex__']
+			assert len(parts) == 2
+			return parts[0] + parts[1] * 1j
+	return dct
+
+
+def numeric_types_hook(dct):
+	if isinstance(dct, dict):
+		if '__decimal__' in dct:
+			return Decimal(dct['__decimal__'])
+		if '__fraction__' in dct:
+			return Fraction(numerator=dct['numerator'], denominator=dct['denominator'])
+	return dct
+
+
+class ClassInstanceHook(object):
+	"""
+	This hook tries to convert json encoded by class_instance_encode back to its original instance.
+	It only works if the environment is the same, e.g. the class is similarly importable and hasn't changed.
+	"""
+	def __init__(self, cls_lookup_map=None):
+		self.cls_lookup_map = cls_lookup_map or {}
+
+	def __call__(self, dct):
+		if isinstance(dct, dict) and '__instance_type__' in dct:
+			mod, name = dct['__instance_type__']
+			attrs = dct['attributes']
+			if mod is None:
+				try:
+					Cls = getattr((__import__('__main__')), name)
+				except (ImportError, AttributeError) as err:
+					if not name in self.cls_lookup_map:
+						raise ImportError(('class {0:s} seems to have been exported from the main file, which means '
+							'it has no module/import path set; you need to provide cls_lookup_map which maps names '
+							'to classes').format(name))
+					Cls = self.cls_lookup_map[name]
+			else:
+				imp_err = None
+				try:
+					module = import_module('{0:}'.format(mod))
+				except ImportError as err:
+					imp_err = ('encountered import error "{0:}" while importing "{1:}" to decode a json file; perhaps '
+						'it was encoded in a different environment where {1:}.{2:} was available').format(err, mod, name)
+				else:
+					if not hasattr(module, name):
+						imp_err = 'imported "{0:}" but could not find "{1:}" inside while decoding a json file (found {2:})'.format(
+							module, name, ', '.join(attr for attr in dir(module) if not attr.startswith('_')))
+					Cls = getattr(module, name, None)
+				if imp_err:
+					if name in self.cls_lookup_map:
+						Cls = self.cls_lookup_map[name]
+					else:
+						raise ImportError(imp_err)
+			try:
+				obj = Cls.__new__(Cls)
+			except TypeError:
+				raise TypeError(('problem while decoding instance of "{0:s}"; this instance has a special '
+					'__new__ method and can\'t be restored').format(name))
+			if hasattr(obj, '__json_decode__'):
+				obj.__json_decode__(**attrs)
+			else:
+				obj.__dict__ = dict(attrs)
+			return obj
+		return dct
+
+
+def json_set_hook(dct):
+	"""
+	Convert an encoded set back to its python representation.
+	"""
+	if isinstance(dct, dict):
+		if '__set__' in dct:
+			return set((tuple(item) if isinstance(item, list) else item) for item in dct['__set__'])
+	return dct
+
+
+def pandas_hook(dct):
+	if '__pandas_dataframe__' in dct or '__pandas_series__' in dct:
+		# todo: this is experimental
+		if not getattr(pandas_hook, '_warned', False):
+			pandas_hook._warned = True
+			warning('Pandas loading support in json-tricks is experimental and may change in future versions.')
+	if '__pandas_dataframe__' in dct:
+		try:
+			from pandas import DataFrame
+		except ImportError:
+			raise NoPandasException('Trying to decode a map which appears to represent a pandas data structure, but pandas appears not to be installed.')
+		from numpy import dtype, array
+		meta = dct.pop('__pandas_dataframe__')
+		indx = dct.pop('index') if 'index' in dct else None
+		dtypes = dict((colname, dtype(tp)) for colname, tp in zip(meta['column_order'], meta['types']))
+		data = OrderedDict()
+		for name, col in dct.items():
+			data[name] = array(col, dtype=dtypes[name])
+		return DataFrame(
+			data=data,
+			index=indx,
+			columns=meta['column_order'],
+			# mixed `dtypes` argument not supported, so use dict of numpy arrays
+		)
+	elif '__pandas_series__' in dct:
+		from pandas import Series
+		from numpy import dtype, array
+		meta = dct.pop('__pandas_series__')
+		indx = dct.pop('index') if 'index' in dct else None
+		return Series(
+			data=dct['data'],
+			index=indx,
+			name=meta['name'],
+			dtype=dtype(meta['type']),
+		)
+	return dct
+
+
+def nopandas_hook(dct):
+	if isinstance(dct, dict) and ('__pandas_dataframe__' in dct or '__pandas_series__' in dct):
+		raise NoPandasException(('Trying to decode a map which appears to represent a pandas '
+			'data structure, but pandas support is not enabled, perhaps it is not installed.'))
+	return dct
+
+
+def json_numpy_obj_hook(dct):
+	"""
+	Replace any numpy arrays previously encoded by NumpyEncoder to their proper
+	shape, data type and data.
+
+	:param dct: (dict) json encoded ndarray
+	:return: (ndarray) if input was an encoded ndarray
+	"""
+	if isinstance(dct, dict) and '__ndarray__' in dct:
+		try:
+			from numpy import asarray
+			import numpy as nptypes
+		except ImportError:
+			raise NoNumpyException('Trying to decode a map which appears to represent a numpy '
+				'array, but numpy appears not to be installed.')
+		order = 'A'
+		if 'Corder' in dct:
+			order = 'C' if dct['Corder'] else 'F'
+		if dct['shape']:
+			return asarray(dct['__ndarray__'], dtype=dct['dtype'], order=order)
+		else:
+			dtype = getattr(nptypes, dct['dtype'])
+			return dtype(dct['__ndarray__'])
+	return dct
+
+
+def json_nonumpy_obj_hook(dct):
+	"""
+	This hook has no effect except to check if you're trying to decode numpy arrays without support, and give you a useful message.
+	"""
+	if isinstance(dct, dict) and '__ndarray__' in dct:
+		raise NoNumpyException(('Trying to decode a map which appears to represent a numpy array, '
+			'but numpy support is not enabled, perhaps it is not installed.'))
+	return dct
+
+
@@ -0,0 +1,311 @@
+
+from datetime import datetime, date, time, timedelta
+from fractions import Fraction
+from logging import warning
+from json import JSONEncoder
+from sys import version
+from decimal import Decimal
+from .utils import hashodict, call_with_optional_kwargs, NoPandasException, NoNumpyException
+
+
+class TricksEncoder(JSONEncoder):
+	"""
+	Encoder that runs any number of encoder functions or instances on
+	the objects that are being encoded.
+
+	Each encoder should make any appropriate changes and return an object,
+	changed or not. This will be passed to the other encoders.
+	"""
+	def __init__(self, obj_encoders=None, silence_typeerror=False, primitives=False, **json_kwargs):
+		"""
+		:param obj_encoders: An iterable of functions or encoder instances to try.
+		:param silence_typeerror: If set to True, ignore the TypeErrors that Encoder instances throw (default False).
+		"""
+		self.obj_encoders = []
+		if obj_encoders:
+			self.obj_encoders = list(obj_encoders)
+		self.silence_typeerror = silence_typeerror
+		self.primitives = primitives
+		super(TricksEncoder, self).__init__(**json_kwargs)
+
+	def default(self, obj, *args, **kwargs):
+		"""
+		This is the method of JSONEncoders that is called for each object; it calls
+		all the encoders with the previous one's output used as input.
+
+		It works for Encoder instances, but they are expected not to throw
+		`TypeError` for unrecognized types (the super method does that by default).
+
+		It never calls the `super` method so if there are non-primitive types
+		left at the end, you'll get an encoding error.
+		"""
+		prev_id = id(obj)
+		for encoder in self.obj_encoders:
+			if hasattr(encoder, 'default'):
+				#todo: write test for this scenario (maybe ClassInstanceEncoder?)
+				try:
+					obj = call_with_optional_kwargs(encoder.default, obj, primitives=self.primitives)
+				except TypeError as err:
+					if not self.silence_typeerror:
+						raise
+			elif hasattr(encoder, '__call__'):
+				obj = call_with_optional_kwargs(encoder, obj, primitives=self.primitives)
+			else:
+				raise TypeError('`obj_encoder` {0:} does not have `default` method and is not callable'.format(encoder))
+		if id(obj) == prev_id:
+			#todo: test
+			raise TypeError('Object of type {0:} could not be encoded by {1:} using encoders [{2:s}]'.format(
+				type(obj), self.__class__.__name__, ', '.join(str(encoder) for encoder in self.obj_encoders)))
+		return obj
+
+
+def json_date_time_encode(obj, primitives=False):
+	"""
+	Encode a date, time, datetime or timedelta to a string or a json dictionary, including optional timezone.
+
+	:param obj: date/time/datetime/timedelta obj
+	:return: (dict) json primitives representation of date, time, datetime or timedelta
+	"""
+	if primitives and isinstance(obj, (date, time, datetime)):
+		return obj.isoformat()
+	if isinstance(obj, datetime):
+		dct = hashodict([('__datetime__', None), ('year', obj.year), ('month', obj.month),
+			('day', obj.day), ('hour', obj.hour), ('minute', obj.minute),
+			('second', obj.second), ('microsecond', obj.microsecond)])
+		if obj.tzinfo:
+			dct['tzinfo'] = obj.tzinfo.zone
+	elif isinstance(obj, date):
+		dct = hashodict([('__date__', None), ('year', obj.year), ('month', obj.month), ('day', obj.day)])
+	elif isinstance(obj, time):
+		dct = hashodict([('__time__', None), ('hour', obj.hour), ('minute', obj.minute),
+			('second', obj.second), ('microsecond', obj.microsecond)])
+		if obj.tzinfo:
+			dct['tzinfo'] = obj.tzinfo.zone
+	elif isinstance(obj, timedelta):
+		if primitives:
+			return obj.total_seconds()
+		else:
+			dct = hashodict([('__timedelta__', None), ('days', obj.days), ('seconds', obj.seconds),
+				('microseconds', obj.microseconds)])
+	else:
+		return obj
+	for key, val in tuple(dct.items()):
+		if not key.startswith('__') and not val:
+			del dct[key]
+	return dct
+
+
+def class_instance_encode(obj, primitives=False):
+	"""
+	Encodes a class instance to json. Note that it can only be recovered if the environment allows the class to be
+	imported in the same way.
+	"""
+	if isinstance(obj, list) or isinstance(obj, dict):
+		return obj
+	if hasattr(obj, '__class__') and hasattr(obj, '__dict__'):
+		if not hasattr(obj, '__new__'):
+			raise TypeError('class "{0:s}" does not have a __new__ method; '.format(obj.__class__) +
+				('perhaps it is an old-style class not derived from `object`; add `object` as a base class to encode it.'
+					if (version[:2] == '2.') else 'this should not happen in Python3'))
+		try:
+			obj.__new__(obj.__class__)
+		except TypeError:
+			raise TypeError(('instance "{0:}" of class "{1:}" cannot be encoded because its __new__ method '
+				'cannot be called, perhaps it requires extra parameters').format(obj, obj.__class__))
+		mod = obj.__class__.__module__
+		if mod == '__main__':
+			mod = None
+			warning(('class {0:} seems to have been defined in the main file; unfortunately this means'
+				' that its module/import path is unknown, so you might have to provide cls_lookup_map when '
+				'decoding').format(obj.__class__))
+		name = obj.__class__.__name__
+		if hasattr(obj, '__json_encode__'):
+			attrs = obj.__json_encode__()
+		else:
+			attrs = hashodict(obj.__dict__.items())
+		if primitives:
+			return attrs
+		else:
+			return hashodict((('__instance_type__', (mod, name)), ('attributes', attrs)))
+	return obj
+
+
+def json_complex_encode(obj, primitives=False):
+	"""
+	Encode a complex number as a json dictionary of its real and imaginary parts.
+
+	:param obj: complex number, e.g. `2+1j`
+	:return: (dict) json primitives representation of `obj`
+	"""
+	if isinstance(obj, complex):
+		if primitives:
+			return [obj.real, obj.imag]
+		else:
+			return hashodict(__complex__=[obj.real, obj.imag])
+	return obj
+
+
+def numeric_types_encode(obj, primitives=False):
+	"""
+	Encode Decimal and Fraction.
+	
+	:param primitives: Encode decimals and fractions as standard floats. You may lose precision. If you do this, you may need to enable `allow_nan` (decimals always allow NaNs but floats do not).
+	"""
+	if isinstance(obj, Decimal):
+		if primitives:
+			return float(obj)
+		else:
+			return {
+				'__decimal__': str(obj.canonical()),
+			}
+	if isinstance(obj, Fraction):
+		if primitives:
+			return float(obj)
+		else:
+			return hashodict((
+				('__fraction__', True),
+				('numerator', obj.numerator),
+				('denominator', obj.denominator),
+			))
+	return obj
+
+
+class ClassInstanceEncoder(JSONEncoder):
+	"""
+	See `class_instance_encode`.
+	"""
+	# Not covered in tests since `class_instance_encode` is recommended way.
+	def __init__(self, obj, encode_cls_instances=True, **kwargs):
+		self.encode_cls_instances = encode_cls_instances
+		super(ClassInstanceEncoder, self).__init__(obj, **kwargs)
+
+	def default(self, obj, *args, **kwargs):
+		if self.encode_cls_instances:
+			obj = class_instance_encode(obj)
+		return super(ClassInstanceEncoder, self).default(obj, *args, **kwargs)
+
+
+def json_set_encode(obj, primitives=False):
+	"""
+	Encode python sets as dictionary with key __set__ and a list of the values.
+
+	Try to sort the set to get a consistent json representation, use arbitrary order if the data is not ordinal.
+	"""
+	if isinstance(obj, set):
+		try:
+			repr = sorted(obj)
+		except Exception:
+			repr = list(obj)
+		if primitives:
+			return repr
+		else:
+			return hashodict(__set__=repr)
+	return obj
+
+
+def pandas_encode(obj, primitives=False):
+	from pandas import DataFrame, Series
+	if isinstance(obj, (DataFrame, Series)):
+		#todo: this is experimental
+		if not getattr(pandas_encode, '_warned', False):
+			pandas_encode._warned = True
+			warning('Pandas dumping support in json-tricks is experimental and may change in future versions.')
+	if isinstance(obj, DataFrame):
+		repr = hashodict()
+		if not primitives:
+			repr['__pandas_dataframe__'] = hashodict((
+				('column_order', tuple(obj.columns.values)),
+				('types', tuple(str(dt) for dt in obj.dtypes)),
+			))
+		repr['index'] = tuple(obj.index.values)
+		for k, name in enumerate(obj.columns.values):
+			repr[name] = tuple(obj.iloc[:, k].values)
+		return repr
+	if isinstance(obj, Series):
+		repr = hashodict()
+		if not primitives:
+			repr['__pandas_series__'] = hashodict((
+				('name', str(obj.name)),
+				('type', str(obj.dtype)),
+			))
+		repr['index'] = tuple(obj.index.values)
+		repr['data'] = tuple(obj.values)
+		return repr
+	return obj
+
+
+def nopandas_encode(obj):
+	if ('DataFrame' in getattr(obj.__class__, '__name__', '') or 'Series' in getattr(obj.__class__, '__name__', '')) \
+			and 'pandas.' in getattr(obj.__class__, '__module__', ''):
+		raise NoPandasException(('Trying to encode an object of type {0:} which appears to be '
+			'a pandas data structure, but pandas support is not enabled, perhaps it is not installed.').format(type(obj)))
+	return obj
+
+
+def numpy_encode(obj, primitives=False):
+	"""
+	Encodes numpy `ndarray`s as lists with meta data.
+	
+	Encodes numpy scalar types as Python equivalents. Special encoding is not possible,
+	because int64 (in py2) and float64 (in py2 and py3) are subclasses of primitives,
+	which never reach the encoder.
+	
+	:param primitives: If True, arrays are serialized as (nested) lists without meta info.
+	"""
+	from numpy import ndarray, generic
+	if isinstance(obj, ndarray):
+		if primitives:
+			return obj.tolist()
+		else:
+			dct = hashodict((
+				('__ndarray__', obj.tolist()),
+				('dtype', str(obj.dtype)),
+				('shape', obj.shape),
+			))
+			if len(obj.shape) > 1:
+				dct['Corder'] = obj.flags['C_CONTIGUOUS']
+			return dct
+	elif isinstance(obj, generic):
+		if NumpyEncoder.SHOW_SCALAR_WARNING:
+			NumpyEncoder.SHOW_SCALAR_WARNING = False
+			warning('json-tricks: numpy scalar serialization is experimental and may work differently in future versions')
+		return obj.item()
+	return obj
+
+
+class NumpyEncoder(ClassInstanceEncoder):
+	"""
+	JSON encoder for numpy arrays.
+	"""
+	SHOW_SCALAR_WARNING = True  # show a warning that numpy scalar serialization is experimental
+	
+	def default(self, obj, *args, **kwargs):
+		"""
+		If input object is a ndarray it will be converted into a dict holding
+		data type, shape and the data. The object can be restored using json_numpy_obj_hook.
+		"""
+		warning('`NumpyEncoder` is deprecated, use `numpy_encode`')  #todo
+		obj = numpy_encode(obj)
+		return super(NumpyEncoder, self).default(obj, *args, **kwargs)
+
+
+def nonumpy_encode(obj):
+	"""
+	Raises an error for numpy arrays.
+	"""
+	if 'ndarray' in getattr(obj.__class__, '__name__', '') and 'numpy.' in getattr(obj.__class__, '__module__', ''):
+		raise NoNumpyException(('Trying to encode an object of type {0:} which appears to be '
+			'a numpy array, but numpy support is not enabled, perhaps it is not installed.').format(type(obj)))
+	return obj
+
+
+class NoNumpyEncoder(JSONEncoder):
+	"""
+	See `nonumpy_encode`.
+	"""
+	def default(self, obj, *args, **kwargs):
+		warning('`NoNumpyEncoder` is deprecated, use `nonumpy_encode`')  #todo
+		obj = nonumpy_encode(obj)
+		return super(NoNumpyEncoder, self).default(obj, *args, **kwargs)
+
+
@@ -0,0 +1,207 @@
+
+from gzip import GzipFile
+from io import BytesIO
+from json import loads as json_loads
+from os import fsync
+from sys import exc_info, version
+from .utils import NoNumpyException  # keep 'unused' imports
+from .comment import strip_comment_line_with_symbol, strip_comments  # keep 'unused' imports
+from .encoders import TricksEncoder, json_date_time_encode, class_instance_encode, ClassInstanceEncoder, \
+	json_complex_encode, json_set_encode, numeric_types_encode, numpy_encode, nonumpy_encode, NoNumpyEncoder, \
+	nopandas_encode, pandas_encode  # keep 'unused' imports
+from .decoders import DuplicateJsonKeyException, TricksPairHook, json_date_time_hook, ClassInstanceHook, \
+	json_complex_hook, json_set_hook, numeric_types_hook, json_numpy_obj_hook, json_nonumpy_obj_hook, \
+	nopandas_hook, pandas_hook  # keep 'unused' imports
+from json import JSONEncoder
+
+
+is_py3 = (version[:2] == '3.')
+str_type = str if is_py3 else (basestring, unicode,)
+ENCODING = 'UTF-8'
+
+
+_cih_instance = ClassInstanceHook()
+DEFAULT_ENCODERS = [json_date_time_encode, class_instance_encode, json_complex_encode, json_set_encode, numeric_types_encode,]
+DEFAULT_HOOKS = [json_date_time_hook, _cih_instance, json_complex_hook, json_set_hook, numeric_types_hook,]
+
+try:
+	import numpy
+except ImportError:
+	DEFAULT_ENCODERS = [nonumpy_encode,] + DEFAULT_ENCODERS
+	DEFAULT_HOOKS = [json_nonumpy_obj_hook,] + DEFAULT_HOOKS
+else:
+	# numpy encode needs to be before complex
+	DEFAULT_ENCODERS = [numpy_encode,] + DEFAULT_ENCODERS
+	DEFAULT_HOOKS = [json_numpy_obj_hook,] + DEFAULT_HOOKS
+
+try:
+	import pandas
+except ImportError:
+	DEFAULT_ENCODERS = [nopandas_encode,] + DEFAULT_ENCODERS
+	DEFAULT_HOOKS = [nopandas_hook,] + DEFAULT_HOOKS
+else:
+	DEFAULT_ENCODERS = [pandas_encode,] + DEFAULT_ENCODERS
+	DEFAULT_HOOKS = [pandas_hook,] + DEFAULT_HOOKS
+
+
+DEFAULT_NONP_ENCODERS = [nonumpy_encode,] + DEFAULT_ENCODERS    # DEPRECATED
+DEFAULT_NONP_HOOKS = [json_nonumpy_obj_hook,] + DEFAULT_HOOKS   # DEPRECATED
+
+
+def dumps(obj, sort_keys=None, cls=TricksEncoder, obj_encoders=DEFAULT_ENCODERS, extra_obj_encoders=(),
+		primitives=False, compression=None, allow_nan=False, conv_str_byte=False, **jsonkwargs):
+	"""
+	Convert a nested data structure to a json string.
+
+	:param obj: The Python object to convert.
+	:param sort_keys: Keep this False if you want order to be preserved.
+	:param cls: The json encoder class to use, defaults to TricksEncoder, which applies each of the `obj_encoders` in turn.
+	:param obj_encoders: Iterable of encoders to use to convert arbitrary objects into json-able primitives.
+	:param extra_obj_encoders: Like `obj_encoders` but on top of them: use this to add encoders without replacing defaults. Since v3.5 these happen before default encoders.
+	:param allow_nan: Allow NaN and Infinity values, which is a (useful) violation of the JSON standard (default False).
+	:param conv_str_byte: Try to automatically convert between strings and bytes (assuming utf-8) (default False).
+	:return: The string containing the json-encoded version of obj.
+
+	Other arguments are passed on to `cls`. Note that `sort_keys` should be false if you want to preserve order.
+	"""
+	if not hasattr(extra_obj_encoders, '__iter__'):
+		raise TypeError('`extra_obj_encoders` should be a tuple in `json_tricks.dump(s)`')
+	encoders = tuple(extra_obj_encoders) + tuple(obj_encoders)
+	txt = cls(sort_keys=sort_keys, obj_encoders=encoders, allow_nan=allow_nan,
+		primitives=primitives, **jsonkwargs).encode(obj)
+	if not is_py3 and isinstance(txt, str):
+		txt = unicode(txt, ENCODING)
+	if not compression:
+		return txt
+	if compression is True:
+		compression = 5
+	txt = txt.encode(ENCODING)
+	sh = BytesIO()
+	with GzipFile(mode='wb', fileobj=sh, compresslevel=compression) as zh:
+		zh.write(txt)
+	gzstring = sh.getvalue()
+	return gzstring
+
+
+def dump(obj, fp, sort_keys=None, cls=TricksEncoder, obj_encoders=DEFAULT_ENCODERS, extra_obj_encoders=(),
+		 primitives=False, compression=None, force_flush=False, allow_nan=False, conv_str_byte=False, **jsonkwargs):
+	"""
+	Convert a nested data structure to a json string and write it to a file.
+
+	:param fp: File handle or path to write to.
+	:param compression: The gzip compression level, or None for no compression.
+	:param force_flush: If True, flush the file handle used and, where possible, also sync it in the operating system (default False).
+	
+	The other arguments are identical to `dumps`.
+	"""
+	txt = dumps(obj, sort_keys=sort_keys, cls=cls, obj_encoders=obj_encoders, extra_obj_encoders=extra_obj_encoders,
+		primitives=primitives, compression=compression, allow_nan=allow_nan, conv_str_byte=conv_str_byte, **jsonkwargs)
+	if isinstance(fp, str_type):
+		fh = open(fp, 'wb+')
+	else:
+		fh = fp
+		if conv_str_byte:
+			try:
+				fh.write(b'')
+			except TypeError:
+				pass
+				# if not isinstance(txt, str_type):
+				# 	# Cannot write bytes, so must be in text mode, but we didn't get a text
+				# 	if not compression:
+				# 		txt = txt.decode(ENCODING)
+			else:
+				try:
+					fh.write(u'')
+				except TypeError:
+					if isinstance(txt, str_type):
+						txt = txt.encode(ENCODING)
+	try:
+		if 'b' not in getattr(fh, 'mode', 'b?') and not isinstance(txt, str_type) and compression:
+			raise IOError('If compression is enabled, the file must be opened in binary mode.')
+		try:
+			fh.write(txt)
+		except TypeError as err:
+			err.args = (err.args[0] + '. A possible reason is that the file is not opened in binary mode; '
+				'be sure to set file mode to something like "wb".',)
+			raise
+	finally:
+		if force_flush:
+			fh.flush()
+			try:
+				if fh.fileno() is not None:
+					fsync(fh.fileno())
+			except (ValueError,):
+				pass
+		if isinstance(fp, str_type):
+			fh.close()
+	return txt
+
+
+def loads(string, preserve_order=True, ignore_comments=True, decompression=None, obj_pairs_hooks=DEFAULT_HOOKS,
+		extra_obj_pairs_hooks=(), cls_lookup_map=None, allow_duplicates=True, conv_str_byte=False, **jsonkwargs):
+	"""
+	Restore a nested data structure from a json string.
+
+	:param string: The string containing a json encoded data structure.
+	:param decode_cls_instances: True to attempt to decode class instances (requires the environment to be similar to the encoding one).
+	:param preserve_order: Whether to preserve order by using OrderedDicts or not.
+	:param ignore_comments: Remove comments (starting with # or //).
+	:param decompression: True to use gzip decompression, False to use raw data, None to automatically determine (default). Assumes utf-8 encoding!
+	:param obj_pairs_hooks: A list of dictionary hooks to apply.
+	:param extra_obj_pairs_hooks: Like `obj_pairs_hooks` but on top of them: use this to add hooks without replacing defaults. Since v3.5 these happen before default hooks.
+	:param cls_lookup_map: If set to a dict, for example ``globals()``, then classes encoded from __main__ are looked up in this dict.
+	:param allow_duplicates: If set to False, an error will be raised when loading a json-map that contains duplicate keys.
+	:param parse_float: A function to parse strings to floats (e.g. Decimal). There is also `parse_int`.
+	:param conv_str_byte: Try to automatically convert between strings and bytes (assuming utf-8) (default False).
+	:return: The decoded data structure.
+
+	Other arguments are passed on to `json.loads`.
+	"""
+	if not hasattr(extra_obj_pairs_hooks, '__iter__'):
+		raise TypeError('`extra_obj_pairs_hooks` should be a tuple in `json_tricks.load(s)`')
+	if decompression is None:
+		decompression = string[:2] == b'\x1f\x8b'
+	if decompression:
+		with GzipFile(fileobj=BytesIO(string), mode='rb') as zh:
+			string = zh.read()
+			string = string.decode(ENCODING)
+	if not isinstance(string, str_type):
+		if conv_str_byte:
+			string = string.decode(ENCODING)
+		else:
+			raise TypeError(('Cannot automatically decode object of type "{0:}" in `json_tricks.load(s)` since '
+				'the encoding is not known. You should instead decode the bytes to a string and pass that '
+				'string to `load(s)`, for example bytevar.decode("utf-8") if utf-8 is the encoding.').format(type(string)))
+	if ignore_comments:
+		string = strip_comments(string)
+	obj_pairs_hooks = tuple(obj_pairs_hooks)
+	_cih_instance.cls_lookup_map = cls_lookup_map or {}
+	hooks = tuple(extra_obj_pairs_hooks) + obj_pairs_hooks
+	hook = TricksPairHook(ordered=preserve_order, obj_pairs_hooks=hooks, allow_duplicates=allow_duplicates)
+	return json_loads(string, object_pairs_hook=hook, **jsonkwargs)
+
+
+def load(fp, preserve_order=True, ignore_comments=True, decompression=None, obj_pairs_hooks=DEFAULT_HOOKS,
+		extra_obj_pairs_hooks=(), cls_lookup_map=None, allow_duplicates=True, conv_str_byte=False, **jsonkwargs):
+	"""
+	Read json from a file and restore the nested data structure it encodes.
+
+	:param fp: File handle or path to load from.
+
+	The other arguments are identical to loads.
+	"""
+	try:
+		if isinstance(fp, str_type):
+			with open(fp, 'rb') as fh:
+				string = fh.read()
+		else:
+			string = fp.read()
+	except UnicodeDecodeError as err:
+		# todo: not covered in tests, is it relevant?
+		raise Exception('There was a problem decoding the file content. A possible reason is that the file is not ' +
+			'opened in binary mode; be sure to set file mode to something like "rb".').with_traceback(exc_info()[2])
+	return loads(string, preserve_order=preserve_order, ignore_comments=ignore_comments, decompression=decompression,
+		obj_pairs_hooks=obj_pairs_hooks, extra_obj_pairs_hooks=extra_obj_pairs_hooks, cls_lookup_map=cls_lookup_map,
+		allow_duplicates=allow_duplicates, conv_str_byte=conv_str_byte, **jsonkwargs)
+
+
@@ -0,0 +1,28 @@
+
+"""
+This file exists for backward compatibility reasons.
+"""
+
+from logging import warning
+from .nonp import NoNumpyException, DEFAULT_ENCODERS, DEFAULT_HOOKS, dumps, dump, loads, load  # keep 'unused' imports
+from .utils import hashodict, NoPandasException
+from .comment import strip_comment_line_with_symbol, strip_comments  # keep 'unused' imports
+from .encoders import TricksEncoder, json_date_time_encode, class_instance_encode, ClassInstanceEncoder, \
+	numpy_encode, NumpyEncoder # keep 'unused' imports
+from .decoders import DuplicateJsonKeyException, TricksPairHook, json_date_time_hook, ClassInstanceHook, \
+	json_complex_hook, json_set_hook, json_numpy_obj_hook  # keep 'unused' imports
+
+try:
+	import numpy
+except ImportError:
+	raise NoNumpyException('Could not load numpy, maybe it is not installed? If you do not want to use numpy encoding '
+		'or decoding, you can import the functions from json_tricks.nonp instead, which do not need numpy.')
+
+
+# todo: warning('`json_tricks.np` is deprecated, you can import directly from `json_tricks`')
+
+
+DEFAULT_NP_ENCODERS = [numpy_encode,] + DEFAULT_ENCODERS    # DEPRECATED
+DEFAULT_NP_HOOKS = [json_numpy_obj_hook,] + DEFAULT_HOOKS   # DEPRECATED
+
+
@@ -0,0 +1,15 @@
+
+"""
+This file exists for backward compatibility reasons.
+"""
+
+from .utils import hashodict, get_scalar_repr, encode_scalars_inplace
+from .nonp import NoNumpyException
+from . import np
+
+# try:
+# 	from numpy import generic, complex64, complex128
+# except ImportError:
+# 	raise NoNumpyException('Could not load numpy, maybe it is not installed?')
+
+
@@ -0,0 +1,81 @@
+
+from collections import OrderedDict
+
+
+class hashodict(OrderedDict):
+	"""
+	This dictionary is hashable. It should NOT be mutated, or all kinds of weird
+	bugs may appear. This is not enforced, though; it is only used for encoding.
+	"""
+	def __hash__(self):
+		return hash(frozenset(self.items()))
+
+
+try:
+	from inspect import signature
+except ImportError:
+	try:
+		from inspect import getfullargspec
+	except ImportError:
+		from inspect import getargspec
+		def get_arg_names(callable):
+			argspec = getargspec(callable)
+			return set(argspec.args)
+	else:
+		# todo: this is not covered in test cases (py 3.3+ uses `signature`, py 2 uses `getargspec`); consider removing it
+		def get_arg_names(callable):
+			argspec = getfullargspec(callable)
+			return set(argspec.args) | set(argspec.kwonlyargs)
+else:
+	def get_arg_names(callable):
+		sig = signature(callable)
+		return set(sig.parameters.keys())
+
+
+def call_with_optional_kwargs(callable, *args, **optional_kwargs):
+	accepted_kwargs = get_arg_names(callable)
+	use_kwargs = {}
+	for key, val in optional_kwargs.items():
+		if key in accepted_kwargs:
+			use_kwargs[key] = val
+	return callable(*args, **use_kwargs)
+
+
+class NoNumpyException(Exception):
+	""" Trying to use numpy features, but numpy cannot be found. """
+
+
+class NoPandasException(Exception):
+	""" Trying to use pandas features, but pandas cannot be found. """
+
+
+def get_scalar_repr(npscalar):
+	return hashodict((
+		('__ndarray__', npscalar.item()),
+		('dtype', str(npscalar.dtype)),
+		('shape', ()),
+	))
+
+
+def encode_scalars_inplace(obj):
+	"""
+	Searches a data structure of lists, tuples and dicts for numpy scalars
+	and replaces them by their dictionary representation, which can be loaded
+	by json-tricks. This happens in-place (the object is changed, use a copy).
+	"""
+	from numpy import generic, complex64, complex128
+	if isinstance(obj, (generic, complex64, complex128)):
+		return get_scalar_repr(obj)
+	if isinstance(obj, dict):
+		for key, val in tuple(obj.items()):
+			obj[key] = encode_scalars_inplace(val)
+		return obj
+	if isinstance(obj, list):
+		for k, val in enumerate(obj):
+			obj[k] = encode_scalars_inplace(val)
+		return obj
+	if isinstance(obj, (tuple, set)):
+		return type(obj)(encode_scalars_inplace(val) for val in obj)
+	return obj
+
+
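The keyword filtering done by `call_with_optional_kwargs` above can be illustrated with a minimal Python 3 sketch. It assumes `inspect.signature` is available (the first branch of the import fallback); `filter_kwargs_call` and `encode_date` are hypothetical names used only for illustration and are not part of the added file.

```python
from inspect import signature


def get_arg_names(func):
    # Names of all parameters the callable declares.
    return set(signature(func).parameters.keys())


def filter_kwargs_call(func, *args, **optional_kwargs):
    # Pass on only those keyword arguments that the callable accepts.
    accepted = get_arg_names(func)
    use_kwargs = {k: v for k, v in optional_kwargs.items() if k in accepted}
    return func(*args, **use_kwargs)


def encode_date(obj, primitives=False):
    # Hypothetical encoder: it knows `primitives` but not `is_changed`.
    return (obj, primitives)


# `is_changed` is silently dropped because encode_date does not declare it.
result = filter_kwargs_call(encode_date, '2017-05-24', primitives=True, is_changed=False)
```

One consequence of matching on declared names: a callable that only declares `**kwargs` exposes the single name `kwargs`, so unrelated optional keywords are still dropped rather than forwarded.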
@@ -23,6 +23,17 @@ class Media(Descriptor):
    bitrate = Property(type=int)
    duration = Property(type=int)

+    #@classmethod
+    #def from_node(cls, client, node):
+    #    return cls.construct(client, cls.helpers.find(node, 'Media'), child=True)
+
    @classmethod
    def from_node(cls, client, node):
-        return cls.construct(client, cls.helpers.find(node, 'Media'), child=True)
+        items = []
+
+        for media_node in cls.helpers.findall(node, 'Media'):
+            _, obj = Media.construct(client, media_node, child=True)
+
+            items.append(obj)
+
+        return [], items
@@ -1,27 +1,27 @@
 # addic7ed
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; ProviderPool(providers=['addic7ed'], provider_configs={'addic7ed': {'use_random_agents': True}})['addic7ed'].query('Game of Thrones', 2)"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; SZProviderPool(providers=['addic7ed'], provider_configs={'addic7ed': {'use_random_agents': True}})['addic7ed'].query('Game of Thrones', 2)"

 # opensubtitles
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; ProviderPool(providers=['opensubtitles'], )['opensubtitles'].query([Language('eng')], query='Game of Thrones', season=2, episode=1, tag='Game.of.Thrones.S06E01.The.Red.Woman.720p.WEB-DL.DD5.1.H.264-NTB.mkv', use_tag_search=True)"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subzero.video import parse_video; SZProviderPool(providers=['opensubtitles'], )['opensubtitles'].list_subtitles(parse_video('FULL_PATH', {}, {'type': 'episode'}), languages=[Language('eng')])"

 # podnapisi
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; ProviderPool(providers=['podnapisi'], )['podnapisi'].query([Language('eng')], 'Game of Thrones', season=2, episode=1)"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; SZProviderPool(providers=['podnapisi'], )['podnapisi'].query([Language('eng')], 'Game of Thrones', season=2, episode=1)"

 # tvsubtitles
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; ProviderPool(providers=['tvsubtitles'], )['tvsubtitles'].query('Game of Thrones', 2, 1)"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; SZProviderPool(providers=['tvsubtitles'], )['tvsubtitles'].query('Game of Thrones', 2, 1)"

 # napiprojekt:list
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; from subliminal.core import scan_video; print ProviderPool(providers=['napiprojekt'], )['napiprojekt'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('pol')])"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subliminal.core import scan_video; print SZProviderPool(providers=['napiprojekt'], )['napiprojekt'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('pol')])"

 # napiprojekt:download
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import PatchedProviderPool; from subliminal import download_best_subtitles; from babelfish import Language; from subliminal.core import scan_video; subs = download_best_subtitles([scan_video('FULL_PATH')], languages={Language('eng')}, providers=['napiprojekt'], ); print subs.values()[0][0].is_valid()"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from subliminal_patch.score import compute_score; from subliminal import download_best_subtitles; from babelfish import Language; from subliminal.core import scan_video; subs = download_best_subtitles([scan_video('FULL_PATH')], languages={Language('eng')}, providers=['napiprojekt'], pool_class=SZProviderPool, compute_score=compute_score); print subs.values()[0][0].is_valid()"


 # shooter:list
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; from subliminal.core import scan_video; print ProviderPool(providers=['shooter'], )['shooter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('zho')])"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subliminal.core import scan_video; print SZProviderPool(providers=['shooter'], )['shooter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('zho')])"

 # subscenter:list
-python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; from subliminal.core import scan_video; print ProviderPool(providers=['subscenter'], )['subscenter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('heb')])"
+python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subliminal.core import scan_video; print SZProviderPool(providers=['subscenter'], )['subscenter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('heb')])"


 # refining
@@ -9,12 +9,6 @@ from .providers import Provider
 from .http import RetryingSession
 subliminal.subtitle.Subtitle = PatchedSubtitle

-try:
-    subliminal.provider_manager.register('napiprojekt = subliminal.providers.napiprojekt:NapiProjektProvider',)
-except ValueError:
-    # already registered
-    pass
-
 # add our patched base classes
 for name in ("Addic7ed", "Podnapisi", "TVsubtitles", "OpenSubtitles", "LegendasTV", "NapiProjekt", "Shooter",
             "SubsCenter"):
@@ -28,13 +22,18 @@ for name in ("Addic7ed", "Podnapisi", "TVsubtitles", "OpenSubtitles", "LegendasT
 from .core import scan_video, search_external_subtitles, list_all_subtitles, save_subtitles, refine
 from .score import compute_score
 from .extensions import provider_manager
+from .video import Video

 # patch subliminal's core functions
 subliminal.scan_video = subliminal.core.scan_video = scan_video
 subliminal.core.search_external_subtitles = search_external_subtitles
 subliminal.save_subtitles = subliminal.core.save_subtitles = save_subtitles
 subliminal.refine = subliminal.core.refine = refine
+subliminal.video.Video = subliminal.Video = Video
+subliminal.video.Episode.__bases__ = (Video,)
+subliminal.video.Movie.__bases__ = (Video,)

 # add our own list_all_subtitles
 subliminal.list_all_subtitles = subliminal.core.list_all_subtitles = list_all_subtitles
-subliminal.provider_manager = subliminal.core.provider_manager = provider_manager
+subliminal.provider_manager = subliminal.core.provider_manager = subliminal.extensions.provider_manager = \
+    provider_manager
@@ -102,14 +102,18 @@ class SZProviderPool(ProviderPool):
            try:
                self[subtitle.provider_name].download_subtitle(subtitle)
                break
-            except (requests.Timeout, socket.timeout):
-                logger.error('Provider %r timed out', subtitle.provider_name)
-            except ProviderError:
-                logger.error('Unexpected error in provider %r, Traceback: %s', subtitle.provider_name,
-                             traceback.format_exc())
+            except (requests.ConnectionError,
+                    requests.exceptions.ProxyError,
+                    requests.exceptions.SSLError,
+                    requests.Timeout,
+                    socket.timeout):
+                logger.error('Provider %r connection error', subtitle.provider_name)
+
            except:
                logger.exception('Unexpected error in provider %r, Traceback: %s', subtitle.provider_name,
                                 traceback.format_exc())
+                self.discarded_providers.add(subtitle.provider_name)
+                return False

            if tries == DOWNLOAD_TRIES:
                self.discarded_providers.add(subtitle.provider_name)
@@ -121,6 +125,10 @@ class SZProviderPool(ProviderPool):
                         subtitle.provider_name, DOWNLOAD_RETRY_SLEEP)
            time.sleep(DOWNLOAD_RETRY_SLEEP)

+        if os.environ.get("SZ_ENFORCE_ENCODING", "False") == "True":
+            logger.info("Enforcing encoding of %s from %s to %s", subtitle, subtitle.guess_encoding(), "utf-8")
+            subtitle.set_encoding("utf-8")
+
        # check subtitle validity
        if not subtitle.is_valid():
            logger.error('Invalid subtitle')
@@ -192,7 +200,8 @@ class SZProviderPool(ProviderPool):
                continue

            # bail out if hearing_impaired was wrong
-            if "hearing_impaired" not in matches and hearing_impaired in ("force HI", "force non-HI"):
+            if subtitle.hearing_impaired_verifiable and "hearing_impaired" not in matches and \
+                            hearing_impaired in ("force HI", "force non-HI"):
                logger.debug('%r: Skipping subtitle with score %d because hearing-impaired set to %s', subtitle,
                             score, hearing_impaired)
                continue
@@ -460,7 +469,7 @@ def get_subtitle_path(video_path, language=None, extension='.srt', forced_tag=Fa


 def save_subtitles(video, subtitles, single=False, directory=None, encoding=None, encode_with=None, chmod=None,
-                   forced_tag=False, path_decoder=None):
+                   forced_tag=False, path_decoder=None, debug_mods=False):
    """Save subtitles on filesystem.

    Subtitles are saved in the order of the list. If a subtitle with a language has already been saved, other subtitles
@@ -515,7 +524,8 @@ def save_subtitles(video, subtitles, single=False, directory=None, encoding=None

        # save normalized subtitle if encoder or no encoding is given
        if has_encoder or encoding is None:
-            content = encode_with(subtitle.get_modified_text()) if has_encoder else subtitle.get_modified_content()
+            content = encode_with(subtitle.get_modified_text(debug=debug_mods)) if has_encoder else \
+                subtitle.get_modified_content(debug=debug_mods)
            with io.open(subtitle_path, 'wb') as f:
                f.write(content)

@@ -3,7 +3,7 @@ import subliminal
 import babelfish
 from subliminal.extensions import RegistrableExtensionManager

-provider_manager = RegistrableExtensionManager('subliminal.providers', [
+provider_manager = RegistrableExtensionManager('subliminal_patch.providers', [
    'addic7ed = subliminal_patch.providers.addic7ed:Addic7edProvider',
    'legendastv = subliminal_patch.providers.legendastv:LegendasTVProvider',
    'opensubtitles = subliminal_patch.providers.opensubtitles:OpenSubtitlesProvider',
@@ -19,4 +19,5 @@ provider_manager = RegistrableExtensionManager('subliminal.providers', [
 babelfish.language_converters.unregister('addic7ed = subliminal.converters.addic7ed:Addic7edConverter')
 babelfish.language_converters.register('addic7ed = subliminal_patch.language:PatchedAddic7edConverter')
 subliminal.refiner_manager.register('sz_metadata = subliminal_patch.refiners.metadata:refine')
+subliminal.refiner_manager.register('sz_omdb = subliminal_patch.refiners.omdb:refine')

@@ -4,7 +4,8 @@ from xmlrpclib import SafeTransport
 import certifi
 import ssl
 import os
-from requests import Session
+import socket
+from requests import Session, exceptions
 from retry.api import retry_call

 pem_file = os.path.normpath(os.path.join(os.path.dirname(os.path.realpath(__file__)), "..", certifi.where()))
@@ -23,7 +24,14 @@ class RetryingSession(Session):
        self.verify = pem_file

    def retry_method(self, method, *args, **kwargs):
-        return retry_call(getattr(super(RetryingSession, self), method), fargs=args, fkwargs=kwargs, tries=3, delay=1)
+        return retry_call(getattr(super(RetryingSession, self), method), fargs=args, fkwargs=kwargs, tries=3, delay=5,
+                          exceptions=(exceptions.ConnectionError,
+                                      exceptions.ProxyError,
+                                      exceptions.SSLError,
+                                      exceptions.Timeout,
+                                      exceptions.ConnectTimeout,
+                                      exceptions.ReadTimeout,
+                                      socket.timeout))

    def get(self, *args, **kwargs):
        return self.retry_method("get", *args, **kwargs)
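The change above restricts retries to a fixed tuple of network-related exceptions. A minimal, stdlib-only Python 3 sketch of that idea follows; it is not the implementation of `retry_call` from the `retry` package, and `retry_on` and `flaky` are hypothetical names used only for illustration.

```python
import time


def retry_on(func, exceptions, tries=3, delay=0.0):
    # Call func(); retry only if it raises one of `exceptions`.
    for attempt in range(1, tries + 1):
        try:
            return func()
        except exceptions:
            if attempt == tries:
                # Out of attempts: let the last error propagate.
                raise
            time.sleep(delay)


calls = []


def flaky():
    # Hypothetical operation: fails twice with a timeout, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError('simulated timeout')
    return 'ok'


outcome = retry_on(flaky, (ConnectionError, TimeoutError), tries=3, delay=0)
```

Any exception outside the tuple propagates on the first attempt, so programming errors are not masked by repeated retries.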
@@ -5,3 +5,6 @@ from subliminal.providers import Provider as _Provider

 class Provider(_Provider):
    hash_verifiable = False
+    hearing_impaired_verifiable = False
+    skip_wrong_fps = True
+
@@ -17,6 +17,8 @@ series_year_re = re.compile(r'^(?P<series>[ \w\'.:(),&!?-]+?)(?: \((?P<year>\d{4


 class Addic7edSubtitle(_Addic7edSubtitle):
+    hearing_impaired_verifiable = True
+
    def __init__(self, language, hearing_impaired, page_link, series, season, episode, title, year, version,
                 download_link):
        super(Addic7edSubtitle, self).__init__(language, hearing_impaired, page_link, series, season, episode,
@@ -28,6 +30,10 @@ class Addic7edSubtitle(_Addic7edSubtitle):
        if not subliminal.score.episode_scores.get("addic7ed_boost"):
            return matches

+        # if the release group matches, the format is most likely correct, as well
+        if "release_group" in matches:
+            matches.add("format")
+
        if {"series", "season", "episode", "year"}.issubset(matches) and "format" in matches:
            matches.add("addic7ed_boost")
            logger.info("Boosting Addic7ed subtitle by %s" % subliminal.score.episode_scores.get("addic7ed_boost"))
@@ -40,6 +46,7 @@ class Addic7edSubtitle(_Addic7edSubtitle):

 class Addic7edProvider(_Addic7edProvider):
    USE_ADDICTED_RANDOM_AGENTS = False
+    hearing_impaired_verifiable = True
    subtitle_class = Addic7edSubtitle

    def __init__(self, username=None, password=None, use_random_agents=False):
@@ -13,6 +13,10 @@ class LegendasTVSubtitle(_LegendasTVSubtitle):
        self.release_info = archive.name
        self.page_link = archive.link

+    def make_picklable(self):
+        self.archive.content = None
+        return self
+

 class LegendasTVProvider(_LegendasTVProvider):
    subtitle_class = LegendasTVSubtitle
@@ -3,6 +3,7 @@
 import re
 import time
 import logging
+import traceback

 logger = logging.getLogger(__name__)

@@ -33,10 +34,11 @@ class ProviderRetryMixin(object):
        while i <= amount:
            try:
                return f()
-            except exc, e:
+            except exc:
+                formatted_exc = traceback.format_exc()
                i += 1
                if i == amount:
                    raise

-            logger.debug(u"Retrying %s, try: %i/%i, exception: %s" % (self.__class__.__name__, i, amount, e))
+            logger.debug(u"Retrying %s, try: %i/%i, exception: %s" % (self.__class__.__name__, i, amount, formatted_exc))
            time.sleep(retry_timeout)
@@ -1,14 +1,50 @@
 # coding=utf-8
+import logging

 from subliminal.providers.napiprojekt import NapiProjektProvider as _NapiProjektProvider, \
-    NapiProjektSubtitle as _NapiProjektSubtitle
+    NapiProjektSubtitle as _NapiProjektSubtitle, get_subhash
+
+logger = logging.getLogger(__name__)


 class NapiProjektSubtitle(_NapiProjektSubtitle):
-    def __init__(self, language, hash):
+    def __init__(self, language, hash, fps):
        super(NapiProjektSubtitle, self).__init__(language, hash)
        self.release_info = hash
+        self.plex_media_fps = float(fps)
+
+    def __repr__(self):
+        return '<%s %r [%s]>' % (
+            self.__class__.__name__, self.release_info, self.language)


 class NapiProjektProvider(_NapiProjektProvider):
    subtitle_class = NapiProjektSubtitle
+
+    def query(self, language, hash, fps):
+        params = {
+            'v': 'dreambox',
+            'kolejka': 'false',
+            'nick': '',
+            'pass': '',
+            'napios': 'Linux',
+            'l': language.alpha2.upper(),
+            'f': hash,
+            't': get_subhash(hash)}
+        logger.info('Searching subtitle %r', params)
+        response = self.session.get(self.server_url, params=params, timeout=10)
+        response.raise_for_status()
+
+        # handle subtitles not found and errors
+        if response.content[:4] == b'NPc0':
+            logger.debug('No subtitles found')
+            return None
+
+        subtitle = self.subtitle_class(language, hash, fps)
+        subtitle.content = response.content
+        logger.debug('Found subtitle %r', subtitle)
+
+        return subtitle
+
+    def list_subtitles(self, video, languages):
+        return [s for s in [self.query(l, video.hashes['napiprojekt'], video.fps) for l in languages] if s is not None]
@@ -16,10 +16,11 @@ logger = logging.getLogger(__name__)

 class OpenSubtitlesSubtitle(_OpenSubtitlesSubtitle):
    hash_verifiable = True
+    hearing_impaired_verifiable = True

    def __init__(self, language, hearing_impaired, page_link, subtitle_id, matched_by, movie_kind, hash, movie_name,
                 movie_release_name, movie_year, movie_imdb_id, series_season, series_episode, query_parameters,
-                 filename, encoding, fps):
+                 filename, encoding, fps, skip_wrong_fps=True):
        super(OpenSubtitlesSubtitle, self).__init__(language, hearing_impaired, page_link, subtitle_id,
                                                    matched_by, movie_kind, hash,
                                                    movie_name, movie_release_name, movie_year, movie_imdb_id,
@@ -27,6 +28,8 @@ class OpenSubtitlesSubtitle(_OpenSubtitlesSubtitle):
        self.query_parameters = query_parameters or {}
        self.fps = fps
        self.release_info = movie_release_name
+        self.wrong_fps = False
+        self.skip_wrong_fps = skip_wrong_fps

    def get_matches(self, video, hearing_impaired=False):
        matches = super(OpenSubtitlesSubtitle, self).get_matches(video)
@@ -39,9 +42,14 @@ class OpenSubtitlesSubtitle(_OpenSubtitlesSubtitle):

        # video has fps info, sub also, and sub's fps is greater than 0
        if video.fps and sub_fps and (video.fps != self.fps):
-            logger.debug("Wrong FPS (expected: %s, got: %s, lowering score massively)", video.fps, self.fps)
-            # fixme: may be too harsh
-            return set()
+            self.wrong_fps = True
+
+            if self.skip_wrong_fps:
+                logger.debug("Wrong FPS (expected: %s, got: %s, lowering score massively)", video.fps, self.fps)
+                # fixme: may be too harsh
+                return set()
+            else:
+                logger.debug("Wrong FPS (expected: %s, got: %s, continuing)", video.fps, self.fps)

        # matched by tag?
        if self.matched_by == "tag":
@@ -57,8 +65,10 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
    only_foreign = True
    subtitle_class = OpenSubtitlesSubtitle
    hash_verifiable = True
+    hearing_impaired_verifiable = True
+    skip_wrong_fps = True

-    def __init__(self, username=None, password=None, use_tag_search=False, only_foreign=False):
+    def __init__(self, username=None, password=None, use_tag_search=False, only_foreign=False, skip_wrong_fps=True):
        if username is not None and password is None or username is None and password is not None:
            raise ConfigurationError('Username and password must be specified')

@@ -66,6 +76,7 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
        self.password = password or ''
        self.use_tag_search = use_tag_search
        self.only_foreign = only_foreign
+        self.skip_wrong_fps = skip_wrong_fps

        if use_tag_search:
            logger.info("Using tag/exact filename search")
@@ -81,7 +92,7 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
        # fixme: retry on SSLError
        response = self.retry(
            lambda: checked(
-                self.server.LogIn(self.username, self.password, 'eng', 'subliminal v%s' % __short_version__)
+                self.server.LogIn(self.username, self.password, 'eng', os.environ.get("SZ_USER_AGENT", "Sub-Zero/2"))
            )
        )
        self.token = response['token']
@@ -101,6 +112,12 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
            query = video.series
            season = video.season
            episode = video.episode
+
+            if video.is_special:
+                season = None
+                episode = None
+                query = u"%s %s" % (video.series, video.title)
+                logger.info("%s: Searching for special: %r", self.__class__, query)
        # elif ('opensubtitles' not in video.hashes or not video.size) and not video.imdb_id:
        #    query = video.name.split(os.sep)[-1]
        else:
@@ -176,7 +193,7 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
                                           movie_kind,
                                           hash, movie_name, movie_release_name, movie_year, movie_imdb_id,
                                           series_season, series_episode, query_parameters, filename, encoding,
-                                           movie_fps)
+                                           movie_fps, skip_wrong_fps=self.skip_wrong_fps)
            logger.debug('Found subtitle %r by %s', subtitle, matched_by)
            subtitles.append(subtitle)

@@ -22,6 +22,7 @@ logger = logging.getLogger(__name__)

 class PodnapisiSubtitle(_PodnapisiSubtitle):
    provider_name = 'podnapisi'
+    hearing_impaired_verifiable = True

    def __init__(self, language, hearing_impaired, page_link, pid, releases, title, season=None, episode=None,
                 year=None):
@@ -33,6 +34,7 @@ class PodnapisiSubtitle(_PodnapisiSubtitle):
 class PodnapisiProvider(_PodnapisiProvider):
    only_foreign = False
    subtitle_class = PodnapisiSubtitle
+    hearing_impaired_verifiable = True

    def __init__(self, only_foreign=False):
        self.only_foreign = only_foreign
@@ -43,6 +45,10 @@ class PodnapisiProvider(_PodnapisiProvider):
        super(PodnapisiProvider, self).__init__()

    def list_subtitles(self, video, languages):
+        if video.is_special:
+            logger.info("%s can't search for specials right now, skipping", self)
+            return []
+
        if isinstance(video, Episode):
            return [s for l in languages for s in self.query(l, video.series, season=video.season,
                                                             episode=video.episode, year=video.year,
@@ -5,6 +5,8 @@ from subliminal.providers.subscenter import SubsCenterProvider as _SubsCenterPro


 class SubsCenterSubtitle(_SubsCenterSubtitle):
+    hearing_impaired_verifiable = True
+
    def __init__(self, language, hearing_impaired, page_link, series, season, episode, title, subtitle_id, subtitle_key,
                 subtitle_version, downloaded, releases):
        super(SubsCenterSubtitle, self).__init__(language, hearing_impaired, page_link, series, season, episode, title,
@@ -20,3 +22,4 @@ class SubsCenterSubtitle(_SubsCenterSubtitle):

 class SubsCenterProvider(_SubsCenterProvider):
    subtitle_class = SubsCenterSubtitle
+    hearing_impaired_verifiable = True
@@ -0,0 +1,67 @@
+# coding=utf-8
+import os
+import subliminal
+import base64
+import zlib
+from subliminal import __short_version__
+from subliminal.refiners.omdb import OMDBClient, refine
+
+
+class SZOMDBClient(OMDBClient):
+    def __init__(self, version=1, session=None, headers=None, timeout=10):
+        super(SZOMDBClient, self).__init__(version=version, session=session, headers=headers, timeout=timeout)
+
+    def get_params(self, params):
+        self.session.params['apikey'] = \
+            zlib.decompress(base64.b16decode(os.environ['U1pfT01EQl9LRVk']))\
+            .decode('cm90MTM=\n'.decode("base64")) \
+            .decode('YmFzZTY0\n'.decode("base64")).split("x")[0]
+        return dict(self.session.params, **params)
+
+    def get(self, id=None, title=None, type=None, year=None, plot='short', tomatoes=False):
+        # build the params
+        params = {}
+        if id:
+            params['i'] = id
+        if title:
+            params['t'] = title
+        if not params:
+            raise ValueError('At least id or title is required')
+        params['type'] = type
+        params['y'] = year
+        params['plot'] = plot
+        params['tomatoes'] = tomatoes
+
+        # perform the request
+        r = self.session.get(self.base_url, params=self.get_params(params))
+        r.raise_for_status()
+
+        # get the response as json
+        j = r.json()
+
+        # check response status
+        if j['Response'] == 'False':
+            return None
+
+        return j
+
+    def search(self, title, type=None, year=None, page=1):
+        # build the params
+        params = {'s': title, 'type': type, 'y': year, 'page': page}
+
+        # perform the request
+        r = self.session.get(self.base_url, params=self.get_params(params))
+        r.raise_for_status()
+
+        # get the response as json
+        j = r.json()
+
+        # check response status
+        if j['Response'] == 'False':
+            return None
+
+        return j
+
+
+omdb_client = SZOMDBClient(headers={'User-Agent': 'Subliminal/%s' % __short_version__})
+subliminal.refiners.omdb.omdb_client = omdb_client
@@ -45,16 +45,18 @@ def compute_score(matches, subtitle, video, hearing_impaired=None):
            # hash is error-prone, try to fix that
            hash_valid_if = episode_hash_valid_if if is_episode else movie_hash_valid_if

-            if hash_valid_if <= set(matches):
-                # series, season and episode matched, hash is valid
-                logger.debug('%r: Using valid hash, as %s are correct (%r) and (%r)', subtitle, hash_valid_if, matches,
-                             video)
-                matches &= {'hash', 'hearing_impaired'}
-            else:
-                # no match, invalidate hash
-                logger.debug('%r: Ignoring hash as other matches are wrong (missing: %r) and (%r)', subtitle,
-                             hash_valid_if - matches, video)
-                matches -= {"hash"}
+            # don't validate hashes of specials, as season and episode tend to be wrong
+            if is_movie or not video.is_special:
+                if hash_valid_if <= set(matches):
+                    # series, season and episode matched, hash is valid
+                    logger.debug('%r: Using valid hash, as %s are correct (%r) and (%r)', subtitle, hash_valid_if, matches,
+                                 video)
+                    matches &= {'hash'}
+                else:
+                    # no match, invalidate hash
+                    logger.debug('%r: Ignoring hash as other matches are wrong (missing: %r) and (%r)', subtitle,
+                                 hash_valid_if - matches, video)
+                    matches -= {"hash"}
    elif 'hash' in matches:
        logger.debug('%r: Hash not verifiable for this provider. Keeping it', subtitle)

@@ -75,6 +77,13 @@ def compute_score(matches, subtitle, video, hearing_impaired=None):
        if 'series_tvdb_id' in matches:
            logger.debug('Adding series_tvdb_id match equivalents')
            matches |= {'series', 'year'}
+
+        # specials
+        if video.is_special and 'title' in matches and 'series' in matches \
+                and 'year' in matches:
+            logger.debug('Adding special title match equivalent')
+            matches |= {'season', 'episode'}
+
    elif is_movie:
        if 'imdb_id' in matches:
            logger.debug('Adding imdb_id match equivalents')
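
The specials rule added to compute_score in the hunk above can be looked at in isolation. The following is a minimal sketch, not the commit's code: the helper name add_special_equivalents is hypothetical, and it assumes matches is a plain set of match names.

```python
def add_special_equivalents(matches, is_special):
    """Return a copy of `matches` with season/episode implied for specials.

    Hypothetical helper mirroring the rule in the hunk above: a special whose
    title, series and year all matched is treated as if season and episode
    had matched as well.
    """
    result = set(matches)
    if is_special and {'title', 'series', 'year'} <= result:
        result |= {'season', 'episode'}
    return result


# a special with title, series and year matched gains season and episode
special_matches = add_special_equivalents({'title', 'series', 'year'}, True)
# a regular episode is left untouched
regular_matches = add_special_equivalents({'title', 'series', 'year'}, False)
```

Returning a copy instead of mutating the input is a choice made for the sketch only; the commit updates matches in place.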
@@ -2,13 +2,18 @@


 import logging
+import traceback
+
+import re

 import chardet
 import pysrt
 import pysubs2
 from bs4 import UnicodeDammit
-from subliminal import Subtitle
+from pysubs2 import SSAStyle
+from pysubs2.subrip import ms_to_timestamp, parse_tags
 from subzero.modification import SubtitleModifications
+from subliminal import Subtitle

 logger = logging.getLogger(__name__)

@@ -18,8 +23,13 @@ class PatchedSubtitle(Subtitle):
    release_info = None
    matches = None
    hash_verifiable = False
+    hearing_impaired_verifiable = False
    mods = None
    plex_media_fps = None
+    skip_wrong_fps = False
+    wrong_fps = False
+
+    _guessed_encoding = None

    def __init__(self, language, hearing_impaired=False, page_link=None, encoding=None, mods=None):
        super(PatchedSubtitle, self).__init__(language, hearing_impaired=hearing_impaired, page_link=page_link,
@@ -30,6 +40,21 @@ class PatchedSubtitle(Subtitle):
        return '<%s %r [%s]>' % (
            self.__class__.__name__, self.page_link, self.language)

+    def make_picklable(self):
+        """
+        Some subtitle instances might have unpicklable objects stored; clean them up here.
+        :return: self
+        """
+        return self
+
+    def set_encoding(self, encoding):
+        if encoding == self.guess_encoding():
+            return
+
+        unicontent = self.text
+        self.content = unicontent.encode(encoding)
+        self._guessed_encoding = encoding
+
    def guess_encoding(self):
        """Guess encoding using the language, falling back on chardet.

@@ -37,11 +62,17 @@ class PatchedSubtitle(Subtitle):
        :rtype: str

        """
+        if self._guessed_encoding:
+            logger.info('Encoding already guessed: %s', self._guessed_encoding)
+            return self._guessed_encoding
+
        logger.info('Guessing encoding for language %s', self.language.alpha3)

        encodings = ['utf-8']

        # add language-specific encodings
+        # http://scratchpad.wikia.com/wiki/Character_Encoding_Recommendation_for_Languages
+
        if self.language.alpha3 == 'zho':
            encodings.extend(['gb18030', 'big5'])
        elif self.language.alpha3 == 'jpn':
@@ -67,15 +98,15 @@ class PatchedSubtitle(Subtitle):
        elif self.language.alpha3 in ('pol', 'cze', 'ces', 'slk', 'slo', 'slv', 'hun', 'bos', 'hbs', 'hrv', 'rsb',
                                      'ron', 'rum', 'sqi', 'alb'):
            # Eastern European Group 1
-            encodings.append('windows-1250')
+            encodings.extend(['iso-8859-2', 'windows-1250'])

-        # Bulgarian, Serbian and Macedonian
-        elif self.language.alpha3 in ('bul', 'srp', 'mkd', 'mac'):
+        # Bulgarian, Serbian, Macedonian, Ukrainian and Russian
+        elif self.language.alpha3 in ('bul', 'srp', 'mkd', 'mac', 'rus', 'ukr'):
            # Eastern European Group 2
-            encodings.append('windows-1251')
+            encodings.extend(['iso-8859-5', 'windows-1251'])
        else:
-            # Western European (windows-1252)
-            encodings.append('latin-1')
+            # Western European (windows-1252) / Northern European
+            encodings.extend(['iso-8859-15', 'iso-8859-9', 'iso-8859-4', 'iso-8859-1', 'latin-1'])

        # try to decode
        logger.debug('Trying encodings %r', encodings)
@@ -86,6 +117,7 @@ class PatchedSubtitle(Subtitle):
                pass
            else:
                logger.info('Guessed encoding %s', encoding)
+                self._guessed_encoding = encoding
                return encoding

        logger.warning('Could not guess encoding from language')
@@ -102,9 +134,11 @@ class PatchedSubtitle(Subtitle):
            Log.Debug("bs4 detected encoding: %s" % a.original_encoding)

            if a.original_encoding:
+                self._guessed_encoding = a.original_encoding
                return a.original_encoding
            raise ValueError(u"Couldn't guess the proper encoding for %s" % self)

+        self._guessed_encoding = encoding
        return encoding

    def is_valid(self):
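
The trial-decode approach used by guess_encoding in the hunks above can be sketched on its own. This is a simplified Python 3 illustration under stated assumptions: candidate_encodings and the standalone guess_encoding function are hypothetical names, the language groups are abbreviated, and the chardet/bs4 fallback and the _guessed_encoding cache are left out.

```python
def candidate_encodings(alpha3):
    """Return encodings to try for a language code, UTF-8 first.

    Abbreviated version of the language groups in the diff above.
    """
    encodings = ['utf-8']
    if alpha3 in ('bul', 'srp', 'mkd', 'mac', 'rus', 'ukr'):
        encodings.extend(['iso-8859-5', 'windows-1251'])
    elif alpha3 in ('pol', 'ces', 'slk', 'hun'):
        encodings.extend(['iso-8859-2', 'windows-1250'])
    else:
        encodings.extend(['iso-8859-15', 'iso-8859-1'])
    return encodings


def guess_encoding(content, alpha3):
    """Return the first candidate encoding that decodes `content` (bytes)."""
    for encoding in candidate_encodings(alpha3):
        try:
            content.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None
```

A single-byte codec that defines all 256 byte values never raises on decode, so the order of the candidates decides which encoding is reported when UTF-8 fails.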
@@ -114,50 +148,95 @@ class PatchedSubtitle(Subtitle):
        :rtype: bool

        """
-        if not self.text:
+        text = self.text
+        if not text:
            return False

        # valid srt
        try:
-            pysrt.from_string(self.text, error_handling=pysrt.ERROR_RAISE)
-        except Exception, e:
-            logger.error("PySRT-parsing failed: %s, trying pysubs2", e)
+            pysrt.from_string(text, error_handling=pysrt.ERROR_RAISE)
+        except Exception:
+            logger.error("PySRT-parsing failed, trying pysubs2")
        else:
            return True

        # something else, try to return srt
        try:
            logger.debug("Trying parsing with PySubs2")
-            subs = pysubs2.SSAFile.from_string(self.text)
-            self.content = subs.to_string("srt")
+            try:
+                # in case of microdvd, try parsing the fps from the subtitle
+                subs = pysubs2.SSAFile.from_string(text)
+                if subs.format == "microdvd":
+                    logger.info("Got FPS from MicroDVD subtitle: %s", subs.fps)
+            except pysubs2.UnknownFPSError:
+                # if parsing failed, suggest our media file's fps
+                subs = pysubs2.SSAFile.from_string(text, fps=self.plex_media_fps)
+                if subs.format == "microdvd":
+                    logger.info("No FPS info in subtitle. Using our own media FPS for the MicroDVD subtitle: %s",
+                                subs.fps)
+
+            unicontent = self.pysubs2_to_unicode(subs)
+            self.content = unicontent.encode(self.guess_encoding())
        except:
-            logger.exception("Couldn't convert subtitle %s to .srt format", self)
+            logger.exception("Couldn't convert subtitle %s to .srt format: %s", self, traceback.format_exc())
            return False

        return True

-    def get_modified_content(self):
+    @classmethod
+    def pysubs2_to_unicode(cls, sub):
+        def prepare_text(text, style):
+            body = []
+            for fragment, sty in parse_tags(text, style, sub.styles):
+                fragment = fragment.replace(ur"\h", u" ")
+                fragment = fragment.replace(ur"\n", u"\n")
+                fragment = fragment.replace(ur"\N", u"\n")
+                if sty.italic: fragment = u"<i>%s</i>" % fragment
+                if sty.underline: fragment = u"<u>%s</u>" % fragment
+                if sty.strikeout: fragment = u"<s>%s</s>" % fragment
+                body.append(fragment)
+
+            return re.sub(u"\n+", u"\n", u"".join(body).strip())
+
+        visible_lines = (line for line in sub if not line.is_comment)
+
+        out = []
+
+        for i, line in enumerate(visible_lines, 1):
+            start = ms_to_timestamp(line.start)
+            end = ms_to_timestamp(line.end)
+            text = prepare_text(line.text, sub.styles.get(line.style, SSAStyle.DEFAULT_STYLE))
+
+            out.append(u"%d\n" % i)
+            out.append(u"%s --> %s\n" % (start, end))
+            out.append(u"%s%s" % (text, "\n\n"))
+
+        return u"".join(out)
+
+    def get_modified_content(self, debug=False):
        """
-        :param language: 
-        :param fps: 
        :return: string 
        """
        if not self.mods:
            return self.content

        encoding = self.guess_encoding()
-
-        submods = SubtitleModifications()
-        submods.load(content=self.text, fps=self.plex_media_fps)
+        submods = SubtitleModifications(debug=debug)
+        submods.load(content=self.text, language=self.language)
        submods.modify(*self.mods)
-        return submods.to_string("srt", encoding=encoding).encode(encoding=encoding)

-    def get_modified_text(self):
+        return self.pysubs2_to_unicode(submods.f).encode(encoding=encoding)
+
+    def get_modified_text(self, debug=False):
        """
-        :param language: 
-        :param fps: 
        :return: unicode 
        """
-        content = self.get_modified_content()
+        content = self.get_modified_content(debug=debug)
+        if not content:
+            return
        encoding = self.guess_encoding()
        return content.decode(encoding=encoding)
+
+
+class ModifiedSubtitle(PatchedSubtitle):
+    id = None
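
The SRT layout that pysubs2_to_unicode builds in the hunk above (index line, timestamp line, text, blank line) can be sketched without pysubs2. This is a minimal Python 3 illustration: ms_to_srt_timestamp is a hypothetical stand-in for the ms_to_timestamp helper the commit imports, cues_to_srt is a hypothetical name, and style tag handling is omitted.

```python
def ms_to_srt_timestamp(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = max(0, int(ms))
    hours, rest = divmod(ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    seconds, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, millis)


def cues_to_srt(cues):
    """Render (start_ms, end_ms, text) tuples as one SRT string."""
    out = []
    for index, (start, end, text) in enumerate(cues, 1):
        out.append("%d\n" % index)
        out.append("%s --> %s\n" % (ms_to_srt_timestamp(start),
                                     ms_to_srt_timestamp(end)))
        out.append("%s\n\n" % text.strip())
    return "".join(out)


srt = cues_to_srt([(1000, 2500, "Hello"), (3661001, 3662000, "Bye")])
```

Building the text as unicode first and encoding it afterwards keeps the encoding decision in one place, which appears to be the reason the commit encodes the result with guess_encoding() after conversion.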
@@ -0,0 +1,7 @@
+# coding=utf-8
+
+from subliminal.video import Video as Video_
+
+
+class Video(Video_):
+    is_special = False
@@ -1,7 +1,10 @@
 # coding=utf-8

-import sys
 import logging
+import sys
+import codecs
+
+from babelfish import Language

 logger = logging.getLogger(__name__)

@@ -14,7 +17,18 @@ if debug:
    logging.basicConfig(level=logging.DEBUG)

 submod = SubMod(debug=debug)
-submod.load(fn)
-submod.modify("remove_HI")
+submod.load(fn, language=Language.fromietf("eng"), encoding="utf-8")
+submod.modify("remove_HI", "OCR_fixes", "common", "OCR_fixes", "shift_offset(s=20)", "OCR_fixes", "color(color=#FF0000)", "shift_offset(s=-5, ms=-350)")
+
+#srt = submod.to_unicode()
+#print submod.f.to_string("srt", encoding="utf-8")
+#print repr(srt)
+#f = codecs.open("testout.srt", "w+", encoding="latin-1")
+#f.write(srt)
+#f.close()
+#print submod.f.to_string("srt")
+#submod.modify("OCR_fixes")
+#submod.modify("change_FPS(from=24,to=25)")
+#submod.modify("common")

 #print submod.f.to_string("srt")
@@ -3,6 +3,7 @@
 import datetime
 import logging
 import traceback
+import types

 from constants import mode_map

@@ -71,9 +72,10 @@ class SubtitleHistory(object):
            self.history_items = storage.LoadObject("subtitle_history") or []
        except:
            logger.error("Failed to load history storage: %s" % traceback.format_exc())
+        if not isinstance(self.history_items, types.ListType):
+            self.history_items = []

    def add(self, item_title, rating_key, section_title=None, subtitle=None, mode="a", time=None):
-        # create copy
        items = self.history_items
        item = SubtitleHistoryItem(item_title, rating_key, section_title=section_title, subtitle=subtitle, mode=mode, time=time)

@@ -1,246 +0,0 @@
-# coding=utf-8
-
-import re
-import traceback
-from collections import OrderedDict
-
-import pysubs2
-import logging
-
-logger = logging.getLogger(__name__)
-
-
-class SubtitleModifications(object):
-    debug = False
-
-    def __init__(self, debug=False):
-        self.debug = debug
-
-    def load(self, fn=None, content=None, fps=None):
-        """
-        
-        :param fn:  filename
-        :param content: unicode 
-        :param fps: 
-        :return: 
-        """
-        try:
-            if fn:
-                self.f = pysubs2.load(fn, fps=fps)
-            elif content:
-                self.f = pysubs2.SSAFile.from_string(content, fps=fps)
-        except (IOError,
-                UnicodeDecodeError,
-                pysubs2.exceptions.UnknownFPSError,
-                pysubs2.exceptions.UnknownFormatIdentifierError,
-                pysubs2.exceptions.FormatAutodetectionError):
-            if fn:
-                logger.exception("Couldn't load subtitle: %s: %s", fn, traceback.format_exc())
-            elif content:
-                logger.exception("Couldn't load subtitle: %s", traceback.format_exc())
-
-    def modify(self, *mods):
-        new_f = []
-        for line in self.f:
-            applied_mods = []
-            for identifier in mods:
-                if identifier in registry.mods:
-                    mod = registry.mods[identifier]
-
-                    # don't bother reapplying exclusive mods multiple times
-                    if mod.exclusive and identifier in applied_mods:
-                        continue
-
-                    new_content = mod.modify(line.text, debug=self.debug)
-                    if not new_content:
-                        if self.debug:
-                            logger.debug("%s: deleting %s", identifier, line)
-                        continue
-
-                    line.text = new_content
-                    new_f.append(line)
-                    applied_mods.append(identifier)
-
-        self.f.events = new_f
-
-    def to_string(self, format="srt", encoding="utf-8"):
-        return self.f.to_string(format, encoding=encoding)
-
-    def save(self, fn):
-        self.f.save(fn)
-
-
-SubMod = SubtitleModifications
-
-
-class SubtitleModRegistry(object):
-    mods = None
-    mods_available = None
-
-    def __init__(self):
-        self.mods = OrderedDict()
-        self.mods_available = []
-
-    def register(self, mod):
-        self.mods[mod.identifier] = mod
-        self.mods_available.append(mod.identifier)
-
-registry = SubtitleModRegistry()
-
-
-class Processor(object):
-    """
-    Processor base class
-    """
-    name = None
-
-    def __init__(self, name=None):
-        self.name = name
-
-    @property
-    def info(self):
-        return self.name
-
-    def process(self, content):
-        return content
-
-    def __repr__(self):
-        return "Processor <%s %s>" % (self.__class__.__name__, self.info)
-
-    def __str__(self):
-        return repr(self)
-
-    def __unicode__(self):
-        return unicode(repr(self))
-
-
-class StringProcessor(Processor):
-    """
-    String replacement processor base
-    """
-
-    def __init__(self, search, replace, name=None):
-        super(StringProcessor, self).__init__(name=name)
-        self.search = search
-        self.replace = replace
-
-    def process(self, content):
-        return content.replace(self.search, self.replace)
-
-
-class ReProcessor(Processor):
-    """
-    Regex processor
-    """
-    pattern = None
-    replace_with = None
-
-    def __init__(self, pattern, replace_with, name=None):
-        super(ReProcessor, self).__init__(name=name)
-        self.pattern = pattern
-        self.replace_with = replace_with
-
-    def process(self, content, debug=False):
-        return self.pattern.sub(self.replace_with, content)
-
-
-class NReProcessor(ReProcessor):
-    """
-    Single line regex processor
-    """
-
-    def process(self, content, debug=False):
-        lines = []
-        for line in content.split(r"\N"):
-            a = super(NReProcessor, self).process(line, debug=debug)
-            if not a:
-                continue
-            lines.append(a)
-        return r"\N".join(lines)
-
-
-class SubtitleModification(object):
-    identifier = None
-    description = None
-    exclusive = False
-    pre_processors = []
-    processors = []
-    post_processors = []
-
-    @classmethod
-    def _process(cls, content, processors, debug=False):
-        if not content:
-            return
-
-        new_content = content
-        for processor in processors:
-            old_content = new_content
-            new_content = processor.process(new_content, debug=debug)
-            if not new_content:
-                if debug:
-                    logger.debug("Processor returned empty line: %s", processor)
-                break
-            if debug:
-                if old_content == new_content:
-                    continue
-                logger.debug("%s: %s -> %s", processor, old_content, new_content)
-        return new_content
-
-    @classmethod
-    def pre_process(cls, content, debug=False):
-        return cls._process(content, cls.pre_processors, debug=debug)
-
-    @classmethod
-    def process(cls, content, debug=False):
-        return cls._process(content, cls.processors, debug=debug)
-
-    @classmethod
-    def post_process(cls, content, debug=False):
-        return cls._process(content, cls.post_processors, debug=debug)
-
-    @classmethod
-    def modify(cls, content, debug=False):
-        new_content = content
-        for method in ("pre_process", "process", "post_process"):
-            new_content = getattr(cls, method)(new_content, debug=debug)
-
-        return new_content
-
-
-class SubtitleTextModification(SubtitleModification):
-    post_processors = [
-        # empty tag
-        ReProcessor(re.compile(r'({\\\w+1})[\s.,-_!?]+({\\\w+0})'), "", name="empty_tag"),
-
-        # empty line (needed?)
-        NReProcessor(re.compile(r'^\s+$'), "", name="empty_line"),
-
-        # empty dash line (needed?)
-        NReProcessor(re.compile(r'(^[\s]*[\-]+[\s]*)$'), "", name="empty_dash_line"),
-
-        # clean whitespace at start and end
-        ReProcessor(re.compile(r'^\s*([^\s]+)\s*$'), r"\1", name="surrounding_whitespace"),
-    ]
-
-
-class HearingImpaired(SubtitleTextModification):
-    identifier = "remove_HI"
-    description = "Remove Hearing Impaired tags"
-    exclusive = True
-
-    processors = [
-        # brackets
-        NReProcessor(re.compile(r'(?sux)[([].+[)\]]'), "", name="HI_brackets"),
-
-        # text before colon (and possible dash in front)
-        NReProcessor(re.compile(r'(?u)(^[A-z\-]+[\w\s]*:[^0-9{2}][\s]*)'), "", name="HI_before_colon"),
-
-        # all caps line (at least 3 chars)
-        NReProcessor(re.compile(r'(?u)(^[A-Z]{3,}$)'), "", name="HI_all_caps"),
-
-        # dash in front
-        NReProcessor(re.compile(r'(?u)^\s*-\s*'), "", name="HI_starting_dash"),
-    ]
-
-
-registry.register(HearingImpaired)
@@ -0,0 +1,5 @@
+# coding=utf-8
+
+from registry import registry
+from mods import hearing_impaired, ocr_fixes, fps, offset, common, color
+from main import SubtitleModifications, SubMod
@@ -0,0 +1,3 @@
+# coding=utf-8
+
+from data import data
@@ -0,0 +1,98 @@
+# coding=utf-8
+
+import re
+import os
+import pprint
+from collections import OrderedDict
+
+from bs4 import BeautifulSoup
+
+TEMPLATE = """\
+import re
+from collections import OrderedDict
+data = """
+
+TEMPLATE_END = """\
+
+for lang, grps in data.iteritems():
+    for grp in grps.iterkeys():
+        if data[lang][grp]["pattern"]:
+            data[lang][grp]["pattern"] = re.compile(data[lang][grp]["pattern"])
+"""
+
+
+SZ_FIX_DATA = {
+    "eng": {
+        "PartialWordsAlways": {
+            u"°x°": u"%",
+            u"compiete": u"complete",
+            u"Âs": u"'s",
+            u"ÃÂs": u"'s",
+            u"a/ion": u"ation",
+            u"at/on": u"ation",
+            u"l/an": u"lian",
+        },
+        "WholeWords": {
+            u"I'11": u"I'll",
+            u"Tun": u"Run",
+            u"pan'": u"part",
+            u"al'": u"at",
+            u"a re": u"are",
+            u"wail'": u"wait",
+            u"he)'": u"hey",
+            u"He)'": u"Hey",
+            u"Yea h": u"Yeah",
+            u"yea h": u"yeah",
+            u"h is": u"his",
+            u" 're ": u"'re ",
+            u"LAst": u"Last",
+        }
+    }
+}
+
+if __name__ == "__main__":
+    cur_dir = os.path.dirname(os.path.realpath(__file__))
+    xml_dir = os.path.join(cur_dir, "xml")
+    file_list = os.listdir(xml_dir)
+
+    data = {}
+
+    for fn in file_list:
+        if fn.endswith("_OCRFixReplaceList.xml"):
+            lang = fn.split("_")[0]
+            soup = BeautifulSoup(open(os.path.join(xml_dir, fn)), "xml")
+
+            fetch_data = (
+                    # group, item_name, pattern
+                    ("WholeLines", "Line", None),
+                    ("WholeWords", "Word", lambda d: (ur"(?um)\b(?:" + u"|".join([re.escape(k) for k in d.keys()])
+                                                      + ur')\b') if d else None),
+                    ("PartialWordsAlways", "WordPart", None),
+                    ("PartialLines", "LinePart", lambda d: (ur"(?um)(?:(?<=\s)|(?<=^)|(?<=\b))(?:" +
+                                                            u"|".join([re.escape(k) for k in d.keys()]) +
+                                                            ur")(?:(?=\s)|(?=$)|(?=\b))") if d else None),
+                    ("BeginLines", "Beginning", lambda d: (ur"(?um)^(?:"+u"|".join([re.escape(k) for k in d.keys()])
+                                                           + ur')') if d else None),
+                    ("EndLines", "Ending", lambda d: (ur"(?um)(?:" + u"|".join([re.escape(k) for k in d.keys()]) +
+                                                      ur")$") if d else None,),
+            )
+
+            data[lang] = dict((grp, {"data": OrderedDict(), "pattern": None}) for grp, item_name, pattern in fetch_data)
+
+            for grp, item_name, pattern in fetch_data:
+                for grp_data in soup.find_all(grp):
+                    for line in grp_data.find_all(item_name):
+                        data[lang][grp]["data"][line["from"]] = line["to"]
+
+                # add our own dictionaries
+                if lang in SZ_FIX_DATA and grp in SZ_FIX_DATA[lang]:
+                    data[lang][grp]["data"].update(SZ_FIX_DATA[lang][grp])
+
+                if pattern:
+                    data[lang][grp]["pattern"] = pattern(data[lang][grp]["data"])
+
+    f = open(os.path.join(cur_dir, "data.py"), "w+")
+    f.write(TEMPLATE)
+    f.write(pprint.pformat(data, width=1))
+    f.write(TEMPLATE_END)
+    f.close()
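
The WholeWords pattern built by the generator script above joins all escaped dictionary keys into one alternation between word boundaries. The sketch below shows that construction in Python 3 and one possible way to apply it. The names build_whole_words_pattern and apply_replacements are hypothetical, and the replacement step is an assumption, since this part of the diff only shows how the pattern is built, not how the OCR mod uses it.

```python
import re


def build_whole_words_pattern(replacements):
    """Compile one regex matching any key of `replacements` as a whole word.

    Returns None for an empty dictionary, like the lambdas in the script.
    """
    if not replacements:
        return None
    alternation = "|".join(re.escape(key) for key in replacements)
    return re.compile(r"(?um)\b(?:" + alternation + r")\b")


def apply_replacements(text, replacements):
    """Replace every whole-word match with its dictionary value (assumed usage)."""
    pattern = build_whole_words_pattern(replacements)
    if pattern is None:
        return text
    return pattern.sub(lambda match: replacements[match.group(0)], text)


fixed = apply_replacements("Tun, I'11 go", {"I'11": "I'll", "Tun": "Run"})
```

Compiling a single alternation means each line is scanned once per group instead of once per dictionary entry, which matters for dictionaries with several hundred entries such as the Danish list that follows.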
@@ -0,0 +1,10 @@
+# coding=utf-8
+
+from babelfish import Language
+from data import data
+
+#for lang, data in data.iteritems():
+#    print Language.fromietf(lang).alpha2
+
+for find, rep in data["dan"].iteritems():
+    print find, rep
@@ -0,0 +1,638 @@
+<OCRFixReplaceList>
+  <WholeWords>
+    <Word from="Haner" to="Han er" />
+    <Word from="JaveL" to="Javel" />
+    <Word from="Pa//e" to="Palle" />
+    <Word from="bffte" to="bitte" />
+    <Word from="Utro//gt" to="Utroligt" />
+    <Word from="Kommerdu" to="Kommer du" />
+    <Word from="smi/er" to="smiler" />
+    <Word from="/eg" to="leg" />
+    <Word from="harvinger" to="har vinger" />
+    <Word from="/et" to="let" />
+    <Word from="erjeres" to="er jeres" />
+    <Word from="hardet" to="har det" />
+    <Word from="tænktjer" to="tænkt jer" />
+    <Word from="erjo" to="er jo" />
+    <Word from="sti/" to="stil" />
+    <Word from="Iappe" to="lappe" />
+    <Word from="Beklagelç" to="Beklager," />
+    <Word from="vardet" to="var det" />
+    <Word from="afden" to="af den" />
+    <Word from="snupperjeg" to="snupper jeg" />
+    <Word from="ikkejeg" to="ikke jeg" />
+    <Word from="bliverjeg" to="bliver jeg" />
+    <Word from="hartravit" to="har travlt" />
+    <Word from="pandekagef/ag" to="pandekageflag" />
+    <Word from="Stormvarsell" to="Stormvarsel!" />
+    <Word from="stormvejn" to="stormvejr." />
+    <Word from="morgenkomp/et" to="morgenkomplet" />
+    <Word from="/yv" to="lyv" />
+    <Word from="varjo" to="var jo" />
+    <Word from="/eger" to="leger" />
+    <Word from="harjeg" to="har jeg" />
+    <Word from="havdejeg" to="havde jeg" />
+    <Word from="hvorjeg" to="hvor jeg" />
+    <Word from="nårjeg" to="når jeg" />
+    <Word from="gårvi" to="går vi" />
+    <Word from="atjeg" to="at jeg" />
+    <Word from="isine" to="i sine" />
+    <Word from="fårjeg" to="får jeg" />
+    <Word from="kærtighed" to="kærlighed" />
+    <Word from="skullejeg" to="skulle jeg" />
+    <Word from="laest" to="læst" />
+    <Word from="laese" to="læse" />
+    <Word from="gørjeg" to="gør jeg" />
+    <Word from="gørvi" to="gør vi" />
+    <Word from="angrerjo" to="angrer jo" />
+    <Word from="Hvergang" to="Hver gang" />
+    <Word from="erder" to="er der" />
+    <Word from="villetilgive" to="ville tilgive" />
+    <Word from="ﬁeme" to="fjeme" />
+    <Word from="genopståri" to="genopstår i" />
+    <Word from="svigtejer" to="svigte jer" />
+    <Word from="kommernu" to="kommer nu" />
+    <Word from="nårman" to="når man" />
+    <Word from="erfire" to="er fire" />
+    <Word from="Hvorforﬁnderdu" to="Hvorfor ﬁnder du" />
+    <Word from="undertigt" to="underligt" />
+    <Word from="itroen" to="i troen" />
+    <Word from="erløgnt" to="er løgn!" />
+    <Word from="gørden" to="gør den" />
+    <Word from="forhelvede" to="for helvede" />
+    <Word from="hjpe" to="hjælpe" />
+    <Word from="togeti" to="toget i" />
+    <Word from="Måjeg" to="Må jeg" />
+    <Word from="savnerjer" to="savner jer" />
+    <Word from="erjeg" to="er jeg" />
+    <Word from="vaere" to="være" />
+    <Word from="geme" to="gerne" />
+    <Word from="trorpå" to="tror på" />
+    <Word from="forham" to="for ham" />
+    <Word from="afham" to="af ham" />
+    <Word from="harjo" to="har jo" />
+    <Word from="ovemaﬁet" to="overnattet" />
+    <Word from="begaeﬁighed" to="begærlighed" />
+    <Word from="sy’g" to="syg" />
+    <Word from="Imensjeg" to="Imens jeg" />
+    <Word from="bliverdu" to="bliver du" />
+    <Word from="ﬁser" to="fiser" />
+    <Word from="manipuierer" to="manipulerer" />
+    <Word from="forjeg" to="for jeg" />
+    <Word from="iivgivendefor" to="livgivende for" />
+    <Word from="formig" to="for mig" />
+    <Word from="Hardu" to="Har du" />
+    <Word from="fornold" to="forhold" />
+    <Word from="defrelste" to="de frelste" />
+    <Word from="Såjeg" to="Så jeg" />
+    <Word from="varjeg" to="var jeg" />
+    <Word from="gørved" to="gør ved" />
+    <Word from="kalderjeg" to="kalder jeg" />
+    <Word from="ﬂytte" to="flytte" />
+    <Word from="handlerdet" to="handler det" />
+    <Word from="trorjeg" to="tror jeg" />
+    <Word from="ﬂytter" to="flytter" />
+    <Word from="soverjeg" to="sover jeg" />
+    <Word from="ﬁnderud" to="ﬁnder ud" />
+    <Word from="naboerpå" to="naboer på" />
+    <Word from="ervildt" to="er vildt" />
+    <Word from="væreher" to="være her" />
+    <Word from="hyggerjer" to="hygger jer" />
+    <Word from="borjo" to="bor jo" />
+    <Word from="kommerikke" to="kommer ikke" />
+    <Word from="folkynde" to="forkynde" />
+    <Word from="farglad" to="far glad" />
+    <Word from="misterjeg" to="mister jeg" />
+    <Word from="ﬁnt" to="fint" />
+    <Word from="Harl" to="Har I" />
+    <Word from="bedejer" to="bede jer" />
+    <Word from="synesjeg" to="synes jeg" />
+    <Word from="vartil" to="var til" />
+    <Word from="eren" to="er en" />
+    <Word from="\Al" to="Vil" />
+    <Word from="\A" to="Vi" />
+    <Word from="fjeme" to="fjerne" />
+    <Word from="Iigefyldt" to="lige fyldt" />
+    <Word from="ertil" to="er til" />
+    <Word from="faﬁigt" to="farligt" />
+    <Word from="ﬁnder" to="finder" />
+    <Word from="ﬁndes" to="findes" />
+    <Word from="irettesaeﬁelse" to="irettesættelse" />
+    <Word from="ermed" to="er med" />
+    <Word from="èn" to="én" />
+    <Word from="gikjoi" to="gik jo i" />
+    <Word from="Hvisjeg" to="Hvis jeg" />
+    <Word from="ovemaﬁer" to="overnatter" />
+    <Word from="hoident" to="holdent" />
+    <Word from="\Adne" to="Vidne" />
+    <Word from="fori" to="for i" />
+    <Word from="vei" to="vel" />
+    <Word from="savnerjerjo" to="savner jer jo" />
+    <Word from="elskerjer" to="elsker jer" />
+    <Word from="harløjet" to="har løjet" />
+    <Word from="eri" to="er i" />
+    <Word from="ﬁende" to="fjende" />
+    <Word from="derjo" to="der jo" />
+    <Word from="sigerjo" to="siger jo" />
+    <Word from="menerjeg" to="mener jeg" />
+    <Word from="Harjeg" to="Har jeg" />
+    <Word from="sigerjeg" to="siger jeg" />
+    <Word from="splitterjeg" to="splitter jeg" />
+    <Word from="erjournalist" to="er journalist" />
+    <Word from="erjoumalist" to="er journalist" />
+    <Word from="Forjeg" to="For jeg" />
+    <Word from="gârjeg" to="går jeg" />
+    <Word from="Nârjeg" to="Når jeg" />
+    <Word from="aﬂlom" to="afkom" />
+    <Word from="farerjo" to="farer jo" />
+    <Word from="tagerjeg" to="tager jeg" />
+    <Word from="Virkerjeg" to="Virker jeg" />
+    <Word from="morerjer" to="morer jer" />
+    <Word from="kommerjo" to="kommer jo" />
+    <Word from="istand" to="i stand" />
+    <Word from="bøm" to="børn" />
+    <Word from="frygterjeg" to="frygter jeg" />
+    <Word from="kommerjeg" to="kommer jeg" />
+    <Word from="eriournalistelev" to="er journalistelev" />
+    <Word from="harfat" to="har fat" />
+    <Word from="fårﬁngre" to="får ﬁngre" />
+    <Word from="slârjeg" to="slår jeg" />
+    <Word from="bam" to="barn" />
+    <Word from="erjournalistelev" to="er journalistelev" />
+    <Word from="politietjo" to="politiet jo" />
+    <Word from="elskerjo" to="elsker jo" />
+    <Word from="vari" to="var i" />
+    <Word from="fornemmerjeres" to="fornemmer jeres" />
+    <Word from="udklækketl" to="udklækket!" />
+    <Word from="í" to="i" />
+    <Word from="nyi" to="ny i" />
+    <Word from="Iumijelse" to="fornøjelse" />
+    <Word from="vures" to="vores" />
+    <Word from="I/Vashíngtan" to="Washington" />
+    <Word from="opleverjeg" to="oplever jeg" />
+    <Word from="PANTELÃNER" to="PANTELÅNER" />
+    <Word from="Gudmurgen" to="Godmorgen" />
+    <Word from="SKYDEVÃBEN" to="SKYDEVÅBEN" />
+    <Word from="PÃLIDELIG" to="PÅLIDELIG" />
+    <Word from="avertalte" to="overtalte" />
+    <Word from="Omsíder" to="Omsider" />
+    <Word from="lurtebåd" to="lortebåd" />
+    <Word from="Telrslning" to="Tekstning" />
+    <Word from="miUø" to="miljø" />
+    <Word from="gåri" to="går i" />
+    <Word from="Fan/el" to="Farvel" />
+    <Word from="abeﬁæs" to="abefjæs" />
+    <Word from="hartalt" to="har talt" />
+    <Word from="\Årkelig" to="Virkelig" />
+    <Word from="beklagerjeg" to="beklager jeg" />
+    <Word from="Nårjeg" to="Når jeg" />
+    <Word from="rnaend" to="mænd" />
+    <Word from="vaskebjorn" to="vaskebjørn" />
+    <Word from="Ivil" to="I vil" />
+    <Word from="besog" to="besøg" />
+    <Word from="Vaer" to="Vær" />
+    <Word from="Undersogte" to="Undersøgte" />
+    <Word from="modte" to="mødte" />
+    <Word from="toj" to="tøj" />
+    <Word from="fodt" to="født" />
+    <Word from="gore" to="gøre" />
+    <Word from="provede" to="prøvede" />
+    <Word from="forste" to="første" />
+    <Word from="igang" to="i gang" />
+    <Word from="ligenu" to="lige nu" />
+    <Word from="clet" to="det" />
+    <Word from="Strombell" to="Strombel!" />
+    <Word from="tmvlt" to="travlt" />
+    <Word from="studererjournalistik" to="studerer journalistik" />
+    <Word from="inforrnererjeg" to="informerer jeg" />
+    <Word from="omkﬁng" to="omkring" />
+    <Word from="tilAsgård" to="til Asgård" />
+    <Word from="Kederjeg" to="Keder jeg" />
+    <Word from="jaettetamp" to="jættetamp" />
+    <Word from="erjer" to="er jer" />
+    <Word from="atjulehygge" to="at julehygge" />
+    <Word from="Ueneste" to="tjeneste" />
+    <Word from="foltsaetter" to="fortsætter" />
+    <Word from="A/ice" to="Alice" />
+    <Word from="tvivlerjeg" to="tvivler jeg" />
+    <Word from="henterjer" to="henter jer" />
+    <Word from="forstårjeg" to="forstår jeg" />
+    <Word from="hvisjeg" to="hvis jeg" />
+    <Word from="/ært" to="lært" />
+    <Word from="vfgtrgt" to="vigtigt" />
+    <Word from="hurtigtjeg" to="hurtigt jeg" />
+    <Word from="kenderjo" to="kender jo" />
+    <Word from="seiv" to="selv" />
+    <Word from="/ægehuset" to="lægehuset" />
+    <Word from="herjo" to="her jo" />
+    <Word from="stolerjeg" to="stoler jeg" />
+    <Word from="digi" to="dig i" />
+    <Word from="taberi" to="taber i" />
+    <Word from="slårjeres" to="slår jeres" />
+    <Word from="laere" to="lære" />
+    <Word from="trænerwushu" to="træner wushu" />
+    <Word from="efterjeg" to="efter jeg" />
+    <Word from="eﬁer" to="efter" />
+    <Word from="dui" to="du i" />
+    <Word from="aﬁen" to="aften" />
+    <Word from="bliveri" to="bliver i" />
+    <Word from="acceptererjer" to="accepterer jer" />
+    <Word from="drikkerjo" to="drikker jo" />
+    <Word from="ﬁanjin" to="Tianjin" />
+    <Word from="erlænge" to="er længe" />
+    <Word from="erikke" to="er ikke" />
+    <Word from="medjer" to="med jer" />
+    <Word from="Tmykke" to="Tillykke" />
+    <Word from="'ﬁanjins" to="Tianjins" />
+    <Word from="Mesteri" to="Mester i" />
+    <Word from="sagdetil" to="sagde til" />
+    <Word from="indei" to="inde i" />
+    <Word from="oﬁe" to="ofte" />
+    <Word from="'ﬁlgiv" to="Tilgiv" />
+    <Word from="Lfår" to="I får" />
+    <Word from="viserjer" to="viser jer" />
+    <Word from="Rejsjerblot" to="Rejs jer blot" />
+    <Word from="'ﬁllad" to="Tillad" />
+    <Word from="iiiieﬁnger" to="lillefinger" />
+    <Word from="VILOMFATTE" to="VIL OMFATTE" />
+    <Word from="moﬁo" to="motto" />
+    <Word from="gørjer" to="gør jer" />
+    <Word from="gifi" to="gift" />
+    <Word from="hardu" to="har du" />
+    <Word from="giﬁ" to="gift" />
+    <Word from="Iaeggerjeg" to="lægger jeg" />
+    <Word from="iet" to="i et" />
+    <Word from="sv/yte" to="svigte" />
+    <Word from="ti/" to="til" />
+    <Word from="Wdal" to="Vidal" />
+    <Word from="ﬁået" to="fået" />
+    <Word from="Hvo/for" to="Hvorfor" />
+    <Word from="hellerikke" to="heller ikke" />
+    <Word from="Wlle" to="Ville" />
+    <Word from="dr/ver" to="driver" />
+    <Word from="V\ﬂlliam" to="William" />
+    <Word from="V\ﬂlliams" to="Williams" />
+    <Word from="Vkﬁlliam" to="William" />
+    <Word from="vådejakke" to="våde jakke" />
+    <Word from="kæﬂl" to="kæft!" />
+    <Word from="sagdejeg" to="sagde jeg" />
+    <Word from="oven/ejet" to="overvejet" />
+    <Word from="karameisauce" to="karamelsauce" />
+    <Word from="Lfølgejødisk" to="Ifølge jødisk" />
+    <Word from="blevjo" to="blev jo" />
+    <Word from="asiateri" to="asiater i" />
+    <Word from="erV\ﬂlliam" to="er William" />
+    <Word from="lidtﬂov" to="lidt flov" />
+    <Word from="sagdejo" to="sagde jo" />
+    <Word from="erlige" to="er lige" />
+    <Word from="Vtﬁlliam" to="William" />
+    <Word from="WﬁII" to="Will" />
+    <Word from="aﬂdarede" to="afklarede" />
+    <Word from="hjæiperjeg" to="hjælper jeg" />
+    <Word from="laderjeg" to="lader jeg" />
+    <Word from="Hândledsbeskyttere" to="Håndledsbeskyttere" />
+    <Word from="Lsabels" to="Isabels" />
+    <Word from="Gørjeg" to="Gør jeg" />
+    <Word from="mâjeg" to="må jeg" />
+    <Word from="ogjeg" to="og jeg" />
+    <Word from="gjordejeg" to="gjorde jeg" />
+    <Word from="villejeg" to="ville jeg" />
+    <Word from="Vlﬂlliams" to="Williams" />
+    <Word from="Dajeg" to="Da jeg" />
+    <Word from="iorden" to="i orden" />
+    <Word from="fandtjeg" to="fandt jeg" />
+    <Word from="Tilykke" to="Tillykke" />
+    <Word from="kørerjer" to="kører jer" />
+    <Word from="gøfjeg" to="gør jeg" />
+    <Word from="Selvflgelig" to="Selvfølgelig" />
+    <Word from="fdder" to="fadder" />
+    <Word from="bnfaldt" to="bønfaldt" />
+    <Word from="t\/ehovedede" to="tvehovedede" />
+    <Word from="EIler" to="Eller" />
+    <Word from="ringerjeg" to="ringer jeg" />
+    <Word from="blevvæk" to="blev væk" />
+    <Word from="stárjeg" to="står jeg" />
+    <Word from="varforbi" to="var forbi" />
+    <Word from="harfortalt" to="har fortalt" />
+    <Word from="iflere" to="i flere" />
+    <Word from="tørjeg" to="tør jeg" />
+    <Word from="kunnejeg" to="kunne jeg" />
+    <Word from="má" to="må" />
+    <Word from="hartænkt" to="har tænkt" />
+    <Word from="Fárjeg" to="Får jeg" />
+    <Word from="afdelingervar" to="afdelinger var" />
+    <Word from="0rd" to="ord" />
+    <Word from="pástá" to="påstå" />
+    <Word from="gráharet" to="gråharet" />
+    <Word from="varforbløffende" to="var forbløffende" />
+    <Word from="holdtjeg" to="holdt jeg" />
+    <Word from="hængerjo" to="hænger jo" />
+    <Word from="fikjeg" to="fik jeg" />
+    <Word from="fár" to="får" />
+    <Word from="Hvorforfølerjeg" to="Hvorfor føler jeg" />
+    <Word from="harfeber" to="har feber" />
+    <Word from="ándssvagt" to="åndssvagt" />
+    <Word from="0g" to="Og" />
+    <Word from="vartre" to="var tre" />
+    <Word from="abner" to="åbner" />
+    <Word from="garjeg" to="går jeg" />
+    <Word from="sertil" to="ser til" />
+    <Word from="hvorfin" to="hvor fin" />
+    <Word from="harfri" to="har fri" />
+    <Word from="forstarjeg" to="forstår jeg" />
+    <Word from="Sä" to="Så" />
+    <Word from="hvorfint" to="hvor fint" />
+    <Word from="mærkerjeg" to="mærker jeg" />
+    <Word from="ogsa" to="også" />
+    <Word from="nárjeg" to="når jeg" />
+    <Word from="Jasá" to="Jaså" />
+    <Word from="bándoptager" to="båndoptager" />
+    <Word from="bedárende" to="bedårende" />
+    <Word from="sá" to="så" />
+    <Word from="nár" to="når" />
+    <Word from="kunnejo" to="kunne jo" />
+    <Word from="Brammertil" to="Brammer til" />
+    <Word from="serjeg" to="ser jeg" />
+    <Word from="gikjeg" to="gik jeg" />
+    <Word from="udholderjeg" to="udholder jeg" />
+    <Word from="máneder" to="måneder" />
+    <Word from="vartræt" to="var træt" />
+    <Word from="dárligt" to="dårligt" />
+    <Word from="klaretjer" to="klaret jer" />
+    <Word from="pavirkelig" to="påvirkelig" />
+    <Word from="spekulererjeg" to="spekulerer jeg" />
+    <Word from="forsøgerjeg" to="forsøger jeg" />
+    <Word from="huskerjeg" to="husker jeg" />
+    <Word from="ifavnen" to="i favnen" />
+    <Word from="skullejo" to="skulle jo" />
+    <Word from="vartung" to="var tung" />
+    <Word from="varfuldstændig" to="var fuldstændig" />
+    <Word from="Paskedag" to="Påskedag" />
+    <Word from="turi" to="tur i" />
+    <Word from="spillerschumanns" to="spiller Schumanns" />
+    <Word from="forstárjeg" to="forstår jeg" />
+    <Word from="istedet" to="i stedet" />
+    <Word from="nárfrem" to="når frem" />
+    <Word from="habertrods" to="håber trods" />
+    <Word from="forførste" to="for første" />
+    <Word from="varto" to="var to" />
+    <Word from="overtil" to="over til" />
+    <Word from="forfem" to="for fem" />
+    <Word from="holdtjo" to="holdt jo" />
+    <Word from="passerjo" to="passer jo" />
+    <Word from="ellerto" to="eller to" />
+    <Word from="hartrods" to="har trods" />
+    <Word from="harfuldstændig" to="har fuldstændig" />
+    <Word from="gårjeg" to="går jeg" />
+    <Word from="giderjeg" to="gider jeg" />
+    <Word from="forjer" to="for jer" />
+    <Word from="erindrerjeg" to="erindrer jeg" />
+    <Word from="tænkerjeg" to="tænker jeg" />
+    <Word from="GAEt" to="GÅET" />
+    <Word from="hørerjo" to="hører jo" />
+    <Word from="forladerjeg" to="forlader jeg" />
+    <Word from="kosterjo" to="koster jo" />
+    <Word from="fortællerjeg" to="fortæller jeg" />
+    <Word from="Forstyrrerjeg" to="Forstyrrer jeg" />
+    <Word from="tjekkerjeg" to="tjekker jeg" />
+    <Word from="erjurist" to="er jurist" />
+    <Word from="tlLBUD" to="TILBUD" />
+    <Word from="serjo" to="ser jo" />
+    <Word from="bederjeg" to="beder jeg" />
+    <Word from="bilderjeg" to="bilder jeg" />
+    <Word from="ULVEtlME" to="ULVETIME" />
+    <Word from="skærerjo" to="skærer jo" />
+    <Word from="afjer" to="af jer" />
+    <Word from="ordnerjeg" to="ordner jeg" />
+    <Word from="giverjeg" to="giver jeg" />
+    <Word from="rejservi" to="rejser vi" />
+    <Word from="fangerjeg" to="fanger jeg" />
+    <Word from="erjaloux" to="er jaloux" />
+    <Word from="glemmerjeg" to="glemmer jeg" />
+    <Word from="Behøverjeg" to="Behøver jeg" />
+    <Word from="harvi" to="har vi" />
+    <Word from="ertyndere" to="er tyndere" />
+    <Word from="fårtordenvejr" to="får tordenvejr" />
+    <Word from="varfærdig" to="var færdig" />
+    <Word from="hørerfor" to="hører for" />
+    <Word from="varvel" to="var vel" />
+    <Word from="erforbi" to="er forbi" />
+    <Word from="AIle" to="Alle" />
+    <Word from="læserjo" to="læser jo" />
+    <Word from="Edgarer" to="Edgar er" />
+    <Word from="hartaget" to="har taget" />
+    <Word from="derer" to="der er" />
+    <Word from="stikkerfrem" to="stikker frem" />
+    <Word from="haraldrig" to="har aldrig" />
+    <Word from="ellerfar" to="eller far" />
+    <Word from="erat" to="er at" />
+    <Word from="turtil" to="tur til" />
+    <Word from="erfærdig" to="er færdig" />
+    <Word from="følerjeg" to="føler jeg" />
+    <Word from="jerfra" to="jer fra" />
+    <Word from="eralt" to="er alt" />
+    <Word from="harfaktisk" to="har faktisk" />
+    <Word from="harfundet" to="har fundet" />
+    <Word from="harvendt" to="har vendt" />
+    <Word from="Kunstneraf" to="Kunstner af" />
+    <Word from="ervel" to="er vel" />
+    <Word from="ståransigt" to="står ansigt" />
+    <Word from="Erjeg" to="Er jeg" />
+    <Word from="venterjeg" to="venter jeg" />
+    <Word from="Hvorvar" to="Hvor var" />
+    <Word from="varfint" to="var fint" />
+    <Word from="ervarmt" to="er varmt" />
+    <Word from="gårfint" to="går fint" />
+    <Word from="flyverforbi" to="flyver forbi" />
+    <Word from="Dervar" to="Der var" />
+    <Word from="dervar" to="der var" />
+    <Word from="meneråndeligt" to="mener åndeligt" />
+    <Word from="forat" to="for at" />
+    <Word from="herovertil" to="herover til" />
+    <Word from="soverfor" to="sover for" />
+    <Word from="begyndtejeg" to="begyndte jeg" />
+    <Word from="vendertilbage" to="vender tilbage" />
+    <Word from="erforfærdelig" to="er forfærdelig" />
+    <Word from="gøraltid" to="gør altid" />
+    <Word from="ertilbage" to="er tilbage" />
+    <Word from="harværet" to="har været" />
+    <Word from="bagoverellertil" to="bagover eller til" />
+    <Word from="hertaler" to="her taler" />
+    <Word from="vågnerjeg" to="vågner jeg" />
+    <Word from="vartomt" to="var tomt" />
+    <Word from="gårfrem" to="går frem" />
+    <Word from="talertil" to="taler til" />
+    <Word from="ertryg" to="er tryg" />
+    <Word from="ansigtervendes" to="ansigter vendes" />
+    <Word from="hervirkeligt" to="her virkeligt" />
+    <Word from="herer" to="her er" />
+    <Word from="drømmerjo" to="drømmer jo" />
+    <Word from="erfuldkommen" to="er fuldkommen" />
+    <Word from="hveren" to="hver en" />
+    <Word from="erfej" to="er fej" />
+    <Word from="datterforgæves" to="datter forgæves" />
+    <Word from="forsøgerjo" to="forsøger jo" />
+    <Word from="ertom" to="er tom" />
+    <Word from="vareftermiddag" to="var eftermiddag" />
+    <Word from="vartom" to="var tom" />
+    <Word from="angerellerforventninger" to="anger eller forventninger" />
+    <Word from="kørtejeg" to="kørte jeg" />
+    <Word from="Hvorforfortæller" to="Hvorfor fortæller" />
+    <Word from="gårtil" to="går til" />
+    <Word from="ringerefter" to="ringer efter" />
+    <Word from="søgertilflugt" to="søger tilflugt" />
+    <Word from="ertvunget" to="er tvunget" />
+    <Word from="megetjeg" to="meget jeg" />
+    <Word from="varikke" to="var ikke" />
+    <Word from="Derermange" to="Der er mange" />
+    <Word from="dervilhindre" to="der vil hindre" />
+    <Word from="erså" to="er så" />
+    <Word from="DetforstårLeggodt" to="Det forstår jeg godt" />
+    <Word from="ergodt" to="er godt" />
+    <Word from="vorventen" to="vor venten" />
+    <Word from="tagerfejl" to="tager fejl" />
+    <Word from="ellerer" to="eller er" />
+    <Word from="laverjeg" to="laver jeg" />
+    <Word from="0mgang" to="omgang" />
+    <Word from="afstár" to="afstår" />
+    <Word from="pá" to="på" />
+    <Word from="rejserjeg" to="rejser jeg" />
+    <Word from="ellertage" to="eller tage" />
+    <Word from="takkerjeg" to="takker jeg" />
+    <Word from="ertilfældigvis" to="er tilfældigvis" />
+    <Word from="fremstar" to="fremstår" />
+    <Word from="ertæt" to="er tæt" />
+    <Word from="ijeres" to="i jeres" />
+    <Word from="Sagdejeg" to="Sagde jeg" />
+    <Word from="overi" to="over i" />
+    <Word from="plukkerjordbær" to="plukker jordbær" />
+    <Word from="klarerjeg" to="klarer jeg" />
+    <Word from="jerfire" to="jer fire" />
+    <Word from="tábeligste" to="tåbeligste" />
+    <Word from="sigertvillingerne" to="siger tvillingerne" />
+    <Word from="erfaktisk" to="er faktisk" />
+    <Word from="gár" to="går" />
+    <Word from="harvasket" to="har vasket" />
+    <Word from="harplukketjordbærtil" to="har plukket jordbær til" />
+    <Word from="plukketjordbær" to="plukket jordbær" />
+    <Word from="klaverfirehændigt" to="klaver firehændigt" />
+    <Word from="erjævnaldrende" to="er jævnaldrende" />
+    <Word from="tierjeg" to="tier jeg" />
+    <Word from="Hvorerden" to="Hvor er den" />
+    <Word from="0veraltjeg" to="overalt jeg" />
+    <Word from="gårpå" to="går på" />
+    <Word from="finderjeg" to="finder jeg" />
+    <Word from="serhans" to="ser hans" />
+    <Word from="tiderbliver" to="tider bliver" />
+    <Word from="ellertrist" to="eller trist" />
+    <Word from="forstårjeres" to="forstår jeres" />
+    <Word from="Hvorsjælen" to="Hvor sjælen" />
+    <Word from="finderro" to="finder ro" />
+    <Word from="sidderjeg" to="sidder jeg" />
+    <Word from="tagerjo" to="tager jo" />
+    <Word from="efterjeres" to="efter jeres" />
+    <Word from="10O" to="100" />
+    <Word from="besluttedejeg" to="besluttede jeg" />
+    <Word from="varsket" to="var sket" />
+    <Word from="uadskillige" to="uadskillelige" />
+    <Word from="harjetlag" to="har jetlag" />
+    <Word from="lkke" to="Ikke" />
+    <Word from="lntet" to="Intet" />
+    <Word from="afslørerjeg" to="afslører jeg" />
+    <Word from="måjeg" to="må jeg" />
+    <Word from="Vl" to="VI" />
+    <Word from="atbygge" to="at bygge" />
+    <Word from="detmakabre" to="det makabre" />
+    <Word from="vilikke" to="vil ikke" />
+    <Word from="talsmandbekræfter" to="talsmand bekræfter" />
+    <Word from="vedatrenovere" to="ved at renovere" />
+    <Word from="forsøgeratforstå" to="forsøger at forstå" />
+    <Word from="ersket" to="er sket" />
+    <Word from="morderpå" to="morder på" />
+    <Word from="frifodiRosewood" to="fri fod i Rosewood" />
+    <Word from="holdtpressemøde" to="holdt pressemøde" />
+    <Word from="lngen" to="Ingen" />
+    <Word from="lND" to="IND" />
+    <Word from="henterjeg" to="henter jeg" />
+    <Word from="lsabel" to="Isabel" />
+    <Word from="lsabels" to="Isabels" />
+    <Word from="vinderjo" to="vinder jo" />
+    <Word from="rødmerjo" to="rødmer jo" />
+    <Word from="etjakkesæt" to="et jakkesæt" />
+    <Word from="glæderjeg" to="glæder jeg" />
+    <Word from="lgen" to="Igen" />
+    <Word from="lsær" to="Især" />
+    <Word from="iparken" to="i parken" />
+    <Word from="nårl" to="når I" />
+    <Word from="tilA1" to="til A1" />
+    <Word from="FBl" to="FBI" />
+    <Word from="viljo" to="vil jo" />
+    <Word from="detpå" to="det på" />
+    <Word from="KIar" to="Klar" />
+    <Word from="PIan" to="Plan" />
+    <Word from="EIIer" to="Eller" />
+    <Word from="FIot" to="Flot" />
+    <Word from="AIIe" to="Alle" />
+    <Word from="AIt" to="Alt" />
+    <Word from="KIap" to="Klap" />
+    <Word from="PIaza" to="Plaza" />
+    <Word from="SIap" to="Slap" />
+    <Word from="Iå" to="lå" />
+    <Word from="BIing" to="Bling" />
+    <Word from="GIade" to="Glade" />
+    <Word from="Iejrbålssange" to="lejrbålssange" />
+    <Word from="bedtjer" to="bedt jer" />
+    <Word from="hørerjeg" to="hører jeg" />
+    <Word from="Fårjeg" to="Får jeg" />
+    <Word from="fikJames" to="fik James" />
+    <Word from="atsnakke" to="at snakke" />
+    <Word from="varkun" to="var kun" />
+    <Word from="retterjeg" to="retter jeg" />
+    <Word from="ernormale" to="er normale" />
+    <Word from="viljeg" to="vil jeg" />
+    <Word from="Sætjer" to="Sæt jer" />
+    <Word from="udsatham" to="udsat ham" />
+  </WholeWords>
+  <PartialWordsAlways>
+    <WordPart from="¤" to="o" />
+    <WordPart from="IVI" to="M" />
+    <WordPart from="lVI" to="M" />
+    <WordPart from="IVl" to="M" />
+    <WordPart from="lVl" to="M" />
+  </PartialWordsAlways>
+  <PartialWords>
+    <!-- Used to check words that are not in the spelling dictionary -->
+    <!-- A replacement is accepted only if the resulting word(s) exist in the spelling dictionary -->
+    <WordPart from="ﬁ" to="fi" />
+    <WordPart from="ﬂ" to="fl" />
+    <WordPart from="/" to="l" />
+    <WordPart from="vv" to="w" />
+    <WordPart from="m" to="rn" />
+    <WordPart from="l" to="i" />
+    <WordPart from="€" to="e" />
+    <WordPart from="I" to="l" />
+    <WordPart from="c" to="o" />
+    <WordPart from="i" to="t" />
+    <WordPart from="cc" to="oo" />
+    <WordPart from="ii" to="tt" />
+    <WordPart from="n/" to="ry" />
+    <WordPart from="ae" to="æ" />
+    <!-- "f " will be two words -->
+    <WordPart from="f" to="f " />
+    <WordPart from="c" to="e" />
+    <WordPart from="o" to="e" />
+    <WordPart from="I" to="t" />
+    <WordPart from="n" to="o" />
+    <WordPart from="s" to="e" />
+    <WordPart from="\A" to="Vi" />
+    <WordPart from="n/" to="rv" />
+    <WordPart from="Ã" to="Å" />
+    <WordPart from="í" to="i" />
+  </PartialWords>
+  <PartialLines />
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions />
+</OCRFixReplaceList>
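The PartialWords comments in the list above describe a dictionary-gated fix: a substitution is tried only on a word the spelling dictionary does not know, and it is kept only if the result is known. A minimal Python sketch of that idea follows. The function name `fix_unknown_word`, the tiny `DICTIONARY` set, and the choice to replace every occurrence at once are assumptions made for illustration; the actual SubtitleEdit/Sub-Zero logic may try occurrences individually or order candidates differently.

```python
# Hypothetical sketch; not the project's real implementation.

# A few (from, to) pairs in the same order as the <WordPart> entries.
PARTS = [
    ("/", "l"),
    ("vv", "w"),
    ("m", "rn"),
    ("l", "i"),
    ("f", "f "),  # "f " splits the word into two words
]

# Stand-in for a real spelling dictionary (assumption).
DICTIONARY = {"selv", "af", "jer", "efter"}


def fix_unknown_word(word, parts, dictionary):
    """Return a corrected word, or the input unchanged if no rule helps."""
    if word in dictionary:
        return word  # known words are never touched
    for src, dst in parts:
        if src not in word:
            continue
        candidate = word.replace(src, dst)
        # Accept only if every resulting word is in the dictionary.
        if all(token in dictionary for token in candidate.split()):
            return candidate
    return word


if __name__ == "__main__":
    for sample in ("se/v", "afjer", "selv", "qqq"):
        print(sample, "->", fix_unknown_word(sample, PARTS, DICTIONARY))
```

Gating on the dictionary is what makes very broad rules such as `l` to `i` or `c` to `o` safe to list: they can only fire when the original token is already unrecognised and the candidate is a real word.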
@@ -0,0 +1,270 @@
+<OCRFixReplaceList>
+  <WholeWords>
+    <Word from="@immatriculation" to="d'immatriculation" />
+    <Word from="acquer" to="acquér" />
+    <Word from="acteurjoue" to="acteur joue" />
+    <Word from="aerien" to="aérien" />
+    <Word from="agreable" to="agréable" />
+    <Word from="aientjamais" to="aient jamais" />
+    <Word from="AII" to="All" />
+    <Word from="aitjamais" to="ait jamais" />
+    <Word from="aitjus" to="ait jus" />
+    <Word from="alle" to="allé" />
+    <Word from="alles" to="allés" />
+    <Word from="appele" to="appelé" />
+    <Word from="apres" to="après" />
+    <Word from="aujourdhui" to="aujourd'hui" />
+    <Word from="aupres" to="auprès" />
+    <Word from="beaute" to="beauté" />
+    <Word from="cabossee" to="cabossée" />
+    <Word from="carj'" to="car j'" />
+    <Word from="Carj'" to="Car j'" />
+    <Word from="carla" to="car la" />
+    <Word from="CEdipe" to="Œdipe" />
+    <Word from="Cest" to="C'est" />
+    <Word from="c'etaient" to="c'étaient" />
+    <Word from="Cétaient" to="C'étaient" />
+    <Word from="c'etait" to="c'était" />
+    <Word from="C'etait" to="C'était" />
+    <Word from="Cétait" to="C'était" />
+    <Word from="choregraphiee" to="chorégraphiée" />
+    <Word from="cinema" to="cinéma" />
+    <Word from="cl'AIcatraz" to="d'Alcatraz" />
+    <Word from="cles" to="clés" />
+    <Word from="cœurjoie" to="cœur-joie" />
+    <Word from="completer" to="compléter" />
+    <Word from="costumiere" to="costumière" />
+    <Word from="cree" to="créé" />
+    <Word from="daccord" to="d'accord" />
+    <Word from="d'AIbert" to="d'Albert" />
+    <Word from="d'AIdous" to="d'Aldous" />
+    <Word from="d'AIec" to="d'Alec" />
+    <Word from="danniversaire" to="d'anniversaire" />
+    <Word from="d'Arra'bida" to="d'Arrabida" />
+    <Word from="d'autodérision" to="d'auto-dérision" />
+    <Word from="dautres" to="d'autres" />
+    <Word from="debattait" to="débattait" />
+    <Word from="decor" to="décor" />
+    <Word from="decorateurs" to="décorateurs" />
+    <Word from="decors" to="décors" />
+    <Word from="defi" to="défi" />
+    <Word from="dejà" to="déjà" />
+    <Word from="déjàm" to="déjà..." />
+    <Word from="dejeunait" to="déjeunait" />
+    <Word from="dengager" to="d'engager" />
+    <Word from="déquipement" to="d'équipement" />
+    <Word from="dérnièré" to="dernière" />
+    <Word from="Desole" to="Désolé" />
+    <Word from="dessayage" to="d'essayage" />
+    <Word from="dessence" to="d'essence" />
+    <Word from="détaient" to="c'étaient" />
+    <Word from="detail" to="détail" />
+    <Word from="dexcellents" to="d'excellents" />
+    <Word from="dexpérience" to="d'expérience" />
+    <Word from="dexpériences" to="d'expériences" />
+    <Word from="d'héro'l'ne" to="d'héroïne" />
+    <Word from="d'idees" to="d'idées" />
+    <Word from="d'intensite" to="d'intensité" />
+    <Word from="dontj" to="dont j" />
+    <Word from="doublaitAlfo" to="doublait Alfo" />
+    <Word from="DrNo" to="Dr No" />
+    <Word from="e'" to="é" />
+    <Word from="ecrit" to="écrit" />
+    <Word from="elegant" to="élégant" />
+    <Word from="Ellé" to="Elle" />
+    <Word from="én" to="en" />
+    <Word from="equipe" to="équipe" />
+    <Word from="erjus" to="er jus" />
+    <Word from="estjamais" to="est jamais" />
+    <Word from="ét" to="et" />
+    <Word from="etaient" to="étaient" />
+    <Word from="etait" to="était" />
+    <Word from="ete" to="été" />
+    <Word from="etiez" to="étiez" />
+    <Word from="etj'" to="et j'" />
+    <Word from="Etj'" to="Et j'" />
+    <Word from="etje" to="et je" />
+    <Word from="Etje" to="Et je" />
+    <Word from="EtsouvenL" to="Et souvent" />
+    <Word from="eviter" to="éviter" />
+    <Word from="Fabsence" to="l'absence" />
+    <Word from="fadapter" to="t'adapter" />
+    <Word from="fadore" to="j'adore" />
+    <Word from="Fâge" to="l'âge" />
+    <Word from="Fagent" to="l'agent" />
+    <Word from="faiessayé" to="j'ai essayé" />
+    <Word from="Failure" to="l'allure" />
+    <Word from="Fambiance" to="l'ambiance" />
+    <Word from="Famener" to="l'amener" />
+    <Word from="Fanniversaire" to="l'anniversaire" />
+    <Word from="Fapparence" to="l'apparence" />
+    <Word from="Fapres" to="l'après" />
+    <Word from="Faprès" to="l'après" />
+    <Word from="Farmée" to="l'armée" />
+    <Word from="Farrière" to="l'arrière" />
+    <Word from="Farrivée" to="l'arrivée" />
+    <Word from="Fascenseur" to="l'ascenseur" />
+    <Word from="Fascension" to="l'ascension" />
+    <Word from="Fassaut" to="l'assaut" />
+    <Word from="Fassomme" to="l'assomme" />
+    <Word from="Fatmosphère" to="l'atmosphère" />
+    <Word from="Fattention" to="l'attention" />
+    <Word from="Favalanche" to="l'avalanche" />
+    <Word from="Féclairage" to="l'éclairage" />
+    <Word from="Fécran" to="l'écran" />
+    <Word from="Fémotion" to="l'émotion" />
+    <Word from="Femplacement" to="l'emplacement" />
+    <Word from="Fendroit" to="l'endroit" />
+    <Word from="Fenseigne" to="l'enseigne" />
+    <Word from="Fensemble" to="l'ensemble" />
+    <Word from="Fentouraient" to="l'entouraient" />
+    <Word from="Fentrée" to="l'entrée" />
+    <Word from="Fépaisseur" to="l'épaisseur" />
+    <Word from="Fépoque" to="l'époque" />
+    <Word from="Féquipe" to="l'équipe" />
+    <Word from="Fespace" to="l'espace" />
+    <Word from="fespérais" to="j'espérais" />
+    <Word from="Fespère" to="l'espère" />
+    <Word from="Festhétique" to="l'esthétique" />
+    <Word from="Fetranger" to="l'étranger" />
+    <Word from="Févasion" to="l'évasion" />
+    <Word from="Févoque" to="l'évoque" />
+    <Word from="Fexpérience" to="l'expérience" />
+    <Word from="Fexplique" to="l'explique" />
+    <Word from="Fexplosion" to="l'explosion" />
+    <Word from="Fextérieur" to="l'extérieur" />
+    <Word from="Fhabituelle" to="l'habituelle" />
+    <Word from="Fhélicoptère" to="l'hélicoptère" />
+    <Word from="Fhéliport" to="l'héliport" />
+    <Word from="Fhélistation" to="l'hélistation" />
+    <Word from="Fhonneur" to="l'honneur" />
+    <Word from="Fhorloge" to="l'horloge" />
+    <Word from="Fidée" to="l'idée" />
+    <Word from="Fimage" to="l'image" />
+    <Word from="Fimportance" to="l'importance" />
+    <Word from="Fimpression" to="l'impression" />
+    <Word from="Finfluence" to="l'influence" />
+    <Word from="Finscription" to="l'inscription" />
+    <Word from="Fintérieur" to="l'intérieur" />
+    <Word from="Fintrigue" to="l'intrigue" />
+    <Word from="Fobjectif" to="l'objectif" />
+    <Word from="Foccasion" to="l'occasion" />
+    <Word from="Fordre" to="l'ordre" />
+    <Word from="Forigine" to="l'origine" />
+    <Word from="frêre" to="frère" />
+    <Word from="gaylns" to="gaijins" />
+    <Word from="general" to="général" />
+    <Word from="hawaïennel" to="hawaïenne" />
+    <Word from="hawa'l'en" to="hawaïen" />
+    <Word from="Ia" to="la" />
+    <Word from="Ià" to="là" />
+    <Word from="Iaryngotomie" to="laryngotomie" />
+    <Word from="idee" to="idée" />
+    <Word from="idees" to="idées" />
+    <Word from="Ie" to="le" />
+    <Word from="Ies" to="les" />
+    <Word from="Iester" to="Lester" />
+    <Word from="II" to="Il" />
+    <Word from="Iimit" to="limit" />
+    <Word from="IIs" to="Ils" />
+    <Word from="immediatement" to="immédiatement" />
+    <Word from="insufflee" to="insufflée" />
+    <Word from="integrer" to="intégrer" />
+    <Word from="interessante" to="intéressante" />
+    <Word from="Iogions" to="logions" />
+    <Word from="Iorsqu" to="lorsqu" />
+    <Word from="isee" to="isée" />
+    <Word from="Iumiere" to="lumière" />
+    <Word from="Iynchage" to="lynchage" />
+    <Word from="J'espere" to="J'espère" />
+    <Word from="Jessaie" to="J'essaie" />
+    <Word from="j'etais" to="j'étais" />
+    <Word from="J'etais" to="J'étais" />
+    <Word from="latéralémént" to="latéralement" />
+    <Word from="lci" to="Ici" />
+    <Word from="Lci" to="Ici" />
+    <Word from="lé-" to="là-" />
+    <Word from="lepidopteres" to="lépidoptères" />
+    <Word from="litteraire" to="littéraire" />
+    <Word from="ll" to="il" />
+    <Word from="Ll" to="Il" />
+    <Word from="lls" to="ils" />
+    <Word from="Lls" to="Ils" />
+    <Word from="maintenanu" to="maintenant" />
+    <Word from="maniere" to="manière" />
+    <Word from="mariee" to="mariée" />
+    <Word from="Mayer/ing" to="Mayerling" />
+    <Word from="meilleurjour" to="meilleur jour" />
+    <Word from="melange" to="mélange" />
+    <Word from="n'avaiént" to="n'avaient" />
+    <Word from="n'etait" to="n'était" />
+    <Word from="oitjamais" to="oit jamais" />
+    <Word from="oitjus" to="oit jus" />
+    <Word from="ontete" to="ont été" />
+    <Word from="operateur" to="opérateur" />
+    <Word from="ouvérté" to="ouverte" />
+    <Word from="Pépreuve" to="l'épreuve" />
+    <Word from="pere" to="père" />
+    <Word from="plateforme" to="plate-forme" />
+    <Word from="pourjouer" to="pour jouer" />
+    <Word from="precipice" to="précipice" />
+    <Word from="preferes" to="préférés" />
+    <Word from="premierjour" to="premier jour" />
+    <Word from="presenter" to="présenter" />
+    <Word from="prevu" to="prévu" />
+    <Word from="prevue" to="prévue" />
+    <Word from="propriete" to="propriété" />
+    <Word from="protègeraient" to="protégeraient" />
+    <Word from="qué" to="que" />
+    <Word from="qwangoissé" to="qu'angoissé" />
+    <Word from="realisateur" to="réalisateur" />
+    <Word from="reception" to="réception" />
+    <Word from="reévalu" to="réévalu" />
+    <Word from="repute" to="réputé" />
+    <Word from="reussi" to="réussi" />
+    <Word from="s'arrétait" to="s'arrêtait" />
+    <Word from="s'ave'rer" to="s'avérer" />
+    <Word from="scenario" to="scénario" />
+    <Word from="scene" to="scène" />
+    <Word from="scenes" to="scènes" />
+    <Word from="seances" to="séances" />
+    <Word from="sequence" to="séquence" />
+    <Word from="sﬂécrasa" to="s'écrasa" />
+    <Word from="speciale" to="spéciale" />
+    <Word from="Supen" to="Super" />
+    <Word from="torturee" to="torturée" />
+    <Word from="Uadmirable" to="L'admirable" />
+    <Word from="Uensemblier" to="L'ensemblier" />
+    <Word from="Uexplosion" to="L'explosion" />
+    <Word from="Uouvre" to="L'ouvre" />
+    <Word from="Vaise" to="l'aise" />
+    <Word from="vecu" to="vécu" />
+    <Word from="vehicules" to="véhicules" />
+    <Word from="Ÿappréciais" to="J'appréciais" />
+    <Word from="Ÿespère" to="J'espère" />
+    <Word from="ÿétrangle" to="s'étrangle" />
+  </WholeWords>
+  <PartialWordsAlways />
+  <PartialWords />
+  <PartialLines>
+    <LinePart from=" I'" to=" l'" />
+    <LinePart from=" |'" to=" l'" />
+  </PartialLines>
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines>
+    <Line from="&quot;D'ac:c:ord.&quot;" to="&quot;D'accord.&quot;" />
+    <Line from="“i QUÎ gagne, qui perd," to="ni qui gagne, qui perd," />
+    <Line from="L'ac:c:ent est mis &#xD;&#xA; &#xD;&#xA; sur son trajet jusqu'en Suisse." to="L'accent est mis &#xD;&#xA; &#xD;&#xA; sur son trajet jusqu'en Suisse." />
+    <Line from="C'est la plus gentille chose &#xD;&#xA; &#xD;&#xA; qu'Hitchc:oc:k m'ait jamais dite." to="C'est la plus gentille chose &#xD;&#xA; &#xD;&#xA; qu'Hitchcock m'ait jamais dite." />
+    <Line from="Tout le monde, en revanche, qualifie &#xD;&#xA; &#xD;&#xA; Goldfinger d'aventu re structurée," to="Tout le monde, en revanche, qualifie &#xD;&#xA; &#xD;&#xA; Goldfinger d'aventure structurée," />
+    <Line from="et le film Shadow of a man &#xD;&#xA; &#xD;&#xA; a lancé sa carrière au cinéma." to="et le film &lt;i&gt;Shadow of a man&lt;/i&gt; &#xD;&#xA; &#xD;&#xA; a lancé sa carrière au cinéma." />
+    <Line from="En 1948, Young est passé à la réalisation &#xD;&#xA; &#xD;&#xA; avec One night with you." to="En 1948, Young est passé à la réalisation &#xD;&#xA; &#xD;&#xA; avec &lt;i&gt;One night with you&lt;/i&gt;." />
+    <Line from="Il a construit tous ces véhicules &#xD;&#xA; &#xD;&#xA; à C)c:ala, en Floride." to="Il a construit tous ces véhicules &#xD;&#xA; &#xD;&#xA; à Ocala, en Floride." />
+    <Line from="Tokyo Pop et A Taxing Woman? Return." to="Tokyo Pop et A Taxing Woman's Return." />
+    <Line from="Peter H u nt." to="Peter Hunt." />
+    <Line from="&quot;C'est bien mieux dans Peau. &#xD;&#xA; &#xD;&#xA; On peut sﬂéclabousser, faire du bruit.&quot;" to="&quot;C'est bien mieux dans l'eau. &#xD;&#xA; &#xD;&#xA; On peut s'éclabousser, faire du bruit.&quot;" />
+  </WholeLines>
+  <RegularExpressions />
+</OCRFixReplaceList>
@@ -0,0 +1,25 @@
+<OCRFixReplaceList>
+  <WholeWords />
+  <PartialWordsAlways />
+  <PartialWords />
+  <PartialLines />
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions>
+    <!-- uppercase I / lowercase l fixes -->
+    <RegEx find="([\x41-\x5a\x61-\x7a\xc1-\xfc])II" replaceWith="$1ll" />
+    <RegEx find="II([\x61-\x7a\xe1-\xfc])" replaceWith="ll$1" />
+    <RegEx find="([\x61-\x7a\xe1-\xfc])I" replaceWith="$1l" />
+    <RegEx find="([\x20])I([^aeou\x41-\x5a\xc1-\xdc])" replaceWith="$1l$2" />
+    <RegEx find="\bl([bcdfghjklmnpqrstvwxz])" replaceWith="I$1" />
+    <RegEx find="([\x41-\x5a\xc1-\xdc])I([\x61-\x7a\xe1-\xfc])" replaceWith="$1l$2" />
+    <RegEx find="([\x61-\x7a\xe1-\xfc][\-])I([\x61-\x7a\xe1-\xfc])" replaceWith="$1l$2" />
+    <RegEx find="([\x41-\x5a\xc1-\xdc])I([\-][\x41-\x5a\xc1-\xdc][\x61-\x7a\xe1-\xfc])" replaceWith="$1l$2" />
+    <RegEx find="\b([AEÜÓ])I([^\x41-\x5a\xc1-\xdc])" replaceWith="$1l$2" />
+    <RegEx find="\bI([aáeéiíoóöuúüy\xf5\xfb])" replaceWith="l$1" />
+    <RegEx find="\b(?:II|ll)" replaceWith="Il" />
+    <RegEx find="([\xf5\xfb])I" replaceWith="$1l" />
+  </RegularExpressions>
+</OCRFixReplaceList>
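The RegularExpressions rules in the list above use .NET-style replacement strings (`$1`, `$2`). As a rough sketch of how such a list could be loaded and applied from Python, the example below converts `$n` to Python's `\g<n>` and runs the rules in document order. The function names `load_regex_rules` and `apply_rules` are hypothetical, the sample embeds only two of the rules, and running rules strictly in file order is an assumption rather than confirmed behaviour.

```python
import re
import xml.etree.ElementTree as ET

# Two rules copied from the list above, embedded so the sketch is self-contained.
SAMPLE = r"""<OCRFixReplaceList>
  <RegularExpressions>
    <RegEx find="([\x61-\x7a\xe1-\xfc])I" replaceWith="$1l" />
    <RegEx find="\b(?:II|ll)" replaceWith="Il" />
  </RegularExpressions>
</OCRFixReplaceList>"""


def load_regex_rules(xml_text):
    """Parse <RegEx> nodes into (compiled pattern, Python replacement) pairs."""
    root = ET.fromstring(xml_text)
    rules = []
    for node in root.iter("RegEx"):
        pattern = re.compile(node.get("find"))
        # Translate .NET "$1" group references to Python "\g<1>".
        replacement = re.sub(r"\$(\d+)", r"\\g<\1>", node.get("replaceWith"))
        rules.append((pattern, replacement))
    return rules


def apply_rules(text, rules):
    """Apply every rule once, in the order they appear in the file."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    rules = load_regex_rules(SAMPLE)
    print(apply_rules("haIlo IIona", rules))
```

Using `\g<1>` rather than `\1` avoids ambiguity when a group reference is followed directly by a literal character, as in `$1l`.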
@@ -0,0 +1,24 @@
+<OCRFixReplaceList>
+  <WholeWords>
+    <Word from="ls" to="Is" />
+    <Word from="ln" to="In" />
+    <Word from="lk" to="Ik" />
+    <Word from="ledereen" to="Iedereen" />
+    <Word from="ledere" to="Iedere" />
+    <Word from="lemand" to="Iemand" />
+  </WholeWords>
+  <PartialWordsAlways />
+  <PartialWords />
+  <PartialLines />
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions>
+    <RegEx find="\blk(?=\p{Ll}{2})" replaceWith="Ik" />
+    <RegEx find="\bln(?=\p{Ll}{2})" replaceWith="In" />
+    <RegEx find="\bls(?=\p{Ll}{2})" replaceWith="Is" />
+    <RegEx find="\beIk" replaceWith="elk" />
+    <RegEx find="\bler(land|se|s|)\b" replaceWith="Ier$1" />
+  </RegularExpressions>
+</OCRFixReplaceList>
@@ -0,0 +1,43 @@
+<OCRFixReplaceList>
+  <WholeWords />
+  <PartialWordsAlways />
+  <PartialWords>
+    <!-- Used to check words that are not in the dictionary -->
+    <!-- If the resulting word(s) exist in the spelling dictionary, the replacement is accepted -->
+    <WordPart from="¤" to="o" />
+    <WordPart from="ﬁ" to="fi" />
+    <WordPart from="ﬂ" to="fl" />
+    <WordPart from="/" to="l" />
+    <WordPart from="vv" to="w" />
+    <WordPart from="IVI" to="M" />
+    <WordPart from="lVI" to="M" />
+    <WordPart from="IVl" to="M" />
+    <WordPart from="lVl" to="M" />
+    <WordPart from="m" to="rn" />
+    <WordPart from="l" to="i" />
+    <WordPart from="€" to="e" />
+    <WordPart from="I" to="l" />
+    <WordPart from="c" to="o" />
+    <WordPart from="i" to="t" />
+    <WordPart from="cc" to="oo" />
+    <WordPart from="ii" to="tt" />
+    <WordPart from="n/" to="ry" />
+    <WordPart from="ae" to="æ" />
+    <!-- "f " will be two words -->
+    <WordPart from="f" to="f " />
+    <WordPart from="c" to="e" />
+    <WordPart from="I" to="t" />
+    <WordPart from="n" to="o" />
+    <WordPart from="s" to="e" />
+    <WordPart from="\A" to="Vi" />
+    <WordPart from="n/" to="rv" />
+    <WordPart from="Ã" to="Å" />
+    <WordPart from="í" to="i" />
+  </PartialWords>
+  <PartialLines />
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions />
+</OCRFixReplaceList>
@@ -0,0 +1,508 @@
+<OCRFixReplaceList>
+  <WholeWords>
+    <Word from="abitual" to="habitual" />
+    <Word from="àcerca" to="acerca" />
+    <Word from="acessor" to="assessor" />
+    <Word from="acólico" to="acólito" />
+    <Word from="açoreano" to="açoriano" />
+    <Word from="actuacao" to="actuação" />
+    <Word from="acucar" to="açúcar" />
+    <Word from="açucar" to="açúcar" />
+    <Word from="advinhar" to="adivinhar" />
+    <Word from="africa" to="África" />
+    <Word from="ajuisar" to="ajuizar" />
+    <Word from="album" to="álbum" />
+    <Word from="alcoolémia" to="alcoolemia" />
+    <Word from="aldião" to="aldeão" />
+    <Word from="algerino" to="argelino" />
+    <Word from="ameixeal" to="ameixial" />
+    <Word from="amiaça" to="ameaça" />
+    <Word from="analizar" to="analisar" />
+    <Word from="andáste" to="andaste" />
+    <Word from="anemona" to="anémona" />
+    <Word from="antartico" to="antárctico" />
+    <Word from="antártico" to="antárctico" />
+    <Word from="antepôr" to="antepor" />
+    <Word from="apárte" to="aparte" />
+    <Word from="apiadeiro" to="apeadeiro" />
+    <Word from="apiar" to="apear" />
+    <Word from="apreciacao" to="apreciação" />
+    <Word from="arctico" to="árctico" />
+    <Word from="arrazar" to="arrasar" />
+    <Word from="ártico" to="árctico" />
+    <Word from="artifice" to="artífice" />
+    <Word from="artifícial" to="artificial" />
+    <Word from="ascenção" to="ascensão" />
+    <!-- <Word from="assucar" to="açúcar" /> assucar is an existing word in the dictionary -->
+    <Word from="assúcar" to="açúcar" />
+    <Word from="aste" to="haste" />
+    <Word from="asterístico" to="asterisco" />
+    <Word from="averção" to="aversão" />
+    <Word from="avizar" to="avisar" />
+    <Word from="avulsso" to="avulso" />
+    <Word from="baínha" to="bainha" />
+    <Word from="banca-rota" to="bancarrota" />
+    <Word from="bandeija" to="bandeja" />
+    <Word from="bébé" to="bebé" />
+    <Word from="beige" to="bege" />
+    <Word from="benção" to="bênção" />
+    <Word from="beneficiência" to="beneficência" />
+    <Word from="beneficiente" to="beneficente" />
+    <Word from="benvinda" to="bem-vinda" />
+    <Word from="benvindo" to="bem-vindo" />
+    <Word from="boasvindas" to="boas-vindas" />
+    <Word from="borborinho" to="burburinho" />
+    <Word from="Brazil" to="Brasil" />
+    <Word from="bussula" to="bússola" />
+    <Word from="cabo-verdeano" to="cabo-verdiano" />
+    <Word from="caimbras" to="cãibras" />
+    <Word from="calcáreo" to="calcário" />
+    <Word from="calsado" to="calçado" />
+    <Word from="calvíce" to="calvície" />
+    <Word from="camoneano" to="camoniano" />
+    <Word from="campião" to="campeão" />
+    <Word from="cançacos" to="cansaços" />
+    <Word from="caracter" to="carácter" />
+    <Word from="caractéres" to="caracteres" />
+    <Word from="catequeze" to="catequese" />
+    <Word from="catequisador" to="catequizador" />
+    <Word from="catequisar" to="catequizar" />
+    <Word from="chícara" to="xícara" />
+    <Word from="ciclano" to="sicrano" />
+    <Word from="cicrano" to="sicrano" />
+    <Word from="cidadães" to="cidadãos" />
+    <Word from="cidadões" to="cidadãos" />
+    <Word from="cincoenta" to="cinquenta" />
+    <Word from="cinseiro" to="cinzeiro" />
+    <Word from="cinsero" to="sincero" />
+    <Word from="citacoes" to="citações" />
+    <Word from="coalizão" to="colisão" />
+    <Word from="côdia" to="côdea" />
+    <Word from="combóio" to="comboio" />
+    <Word from="compôr" to="compor" />
+    <Word from="concerteza" to="com certeza" />
+    <Word from="constituia" to="constituía" />
+    <Word from="constituíu" to="constituiu" />
+    <Word from="contato" to="contacto" />
+    <Word from="contensão" to="contenção" />
+    <Word from="contribuicoes" to="contribuições" />
+    <Word from="côr" to="cor" />
+    <Word from="corassão" to="coração" />
+    <Word from="corçario" to="corsário" />
+    <Word from="corçário" to="corsário" />
+    <Word from="cornprimidosinbo" to="comprimidozinho" />
+    <!-- <Word from="cota-parte" to="quota-parte" /> cota-parte is an existing word in the dictionary -->
+    <Word from="crâneo" to="crânio" />
+    <Word from="dE" to="de" />
+    <Word from="defenição" to="definição" />
+    <Word from="defenido" to="definido" />
+    <Word from="defenir" to="definir" />
+    <Word from="deficite" to="défice" />
+    <Word from="degladiar" to="digladiar" />
+    <Word from="deiche" to="deixe" />
+    <Word from="desinteria" to="disenteria" />
+    <Word from="despendio" to="dispêndio" />
+    <Word from="despêndio" to="dispêndio" />
+    <Word from="desplicência" to="displicência" />
+    <Word from="dificulidade" to="dificuldade" />
+    <Word from="dispender" to="despender" />
+    <Word from="dispendio" to="dispêndio" />
+    <Word from="distribuido" to="distribuído" />
+    <Word from="druída" to="druida" />
+    <Word from="écrã" to="ecrã" />
+    <Word from="ecran" to="ecrã" />
+    <Word from="écran" to="ecrã" />
+    <Word from="êle" to="ele" />
+    <Word from="elice" to="hélice" />
+    <Word from="élice" to="hélice" />
+    <Word from="emiratos" to="emirados" />
+    <Word from="engolis-te" to="engoliste" />
+    <Word from="engulir" to="engolir" />
+    <Word from="enguliste" to="engoliste" />
+    <Word from="entertido" to="entretido" />
+    <Word from="entitular" to="intitular" />
+    <Word from="entreterimento" to="entretenimento" />
+    <Word from="entreti-me" to="entretive-me" />
+    <Word from="envólucro" to="invólucro" />
+    <Word from="erói" to="herói" />
+    <Word from="escluir" to="excluir" />
+    <Word from="esclusão" to="exclusão" />
+    <Word from="escrivões" to="escrivães" />
+    <Word from="esqueiro" to="isqueiro" />
+    <Word from="esquesito" to="esquisito" />
+    <Word from="estacoes" to="estações" />
+    <Word from="esteje" to="esteja" />
+    <Word from="excavação" to="escavação" />
+    <Word from="excavar" to="escavar" />
+    <Word from="exdrúxula" to="esdrúxula" />
+    <Word from="exdrúxulas" to="esdrúxulas" />
+    <Word from="exitar" to="hesitar" />
+    <Word from="explicacoes" to="explicações" />
+    <Word from="exquisito" to="esquisito" />
+    <Word from="extende" to="estende" />
+    <Word from="extender" to="estender" />
+    <Word from="fàcilmenfe" to="facilmente" />
+    <Word from="fàcilmente" to="facilmente" />
+    <Word from="fariam-lhe" to="far-lhe-iam" />
+    <Word from="FARMÁClAS" to="FARMÁCIAS" />
+    <Word from="farmecêutico" to="farmacêutico" />
+    <Word from="fassa" to="faça" />
+    <Word from="fébre" to="febre" />
+    <Word from="fecula" to="fécula" />
+    <Word from="fémea" to="fêmea" />
+    <Word from="femenino" to="feminino" />
+    <Word from="femininismo" to="feminismo" />
+    <Word from="físiologista" to="fisiologista" />
+    <Word from="fizémos" to="fizemos" />
+    <Word from="fizes-te" to="fizeste" />
+    <Word from="flôr" to="flor" />
+    <Word from="forão" to="foram" />
+    <Word from="formalisar" to="formalizar" />
+    <Word from="fôro" to="foro" />
+    <Word from="fos-te" to="foste" />
+    <Word from="fragância" to="fragrância" />
+    <Word from="françês" to="francês" />
+    <Word from="frasqutnho" to="frasquinho" />
+    <Word from="frustado" to="frustrado" />
+    <Word from="furá" to="furar" />
+    <Word from="gaz" to="gás" />
+    <Word from="gáz" to="gás" />
+    <Word from="geito" to="jeito" />
+    <Word from="geneceu" to="gineceu" />
+    <Word from="geropiga" to="jeropiga" />
+    <Word from="glicémia" to="glicemia" />
+    <Word from="gorgeta" to="gorjeta" />
+    <Word from="grangear" to="granjear" />
+    <Word from="guizar" to="guisar" />
+    <Word from="hectar" to="hectare" />
+    <Word from="herméticamente" to="hermeticamente" />
+    <Word from="hernia" to="hérnia" />
+    <Word from="higiéne" to="higiene" />
+    <Word from="hilariedade" to="hilaridade" />
+    <Word from="hiperacídez" to="hiperacidez" />
+    <Word from="hontem" to="ontem" />
+    <Word from="igiene" to="higiene" />
+    <Word from="igienico" to="higiénico" />
+    <Word from="igiénico" to="higiénico" />
+    <Word from="igreija" to="igreja" />
+    <Word from="iguasu" to="iguaçu" />
+    <Word from="ilacção" to="ilação" />
+    <Word from="imbigo" to="umbigo" />
+    <Word from="impecilho" to="empecilho" />
+    <Word from="íncas" to="incas" />
+    <Word from="incêsto" to="incesto" />
+    <Word from="inclusivé" to="inclusive" />
+    <Word from="incômodos" to="incómodos" />
+    <Word from="incontestávelmente" to="incontestavelmente" />
+    <Word from="incontestàvelmente" to="incontestavelmente" />
+    <Word from="indespensáveis" to="indispensáveis" />
+    <Word from="indespensável" to="indispensável" />
+    <Word from="India" to="Índia" />
+    <Word from="indiguinação" to="indignação" />
+    <Word from="indiguinado" to="indignado" />
+    <Word from="indiguinar" to="indignar" />
+    <Word from="inflacção" to="inflação" />
+    <Word from="ingreja" to="igreja" />
+    <Word from="INSCRICOES" to="INSCRIÇÕES" />
+    <Word from="intensão" to="intenção" />
+    <Word from="intertido" to="entretido" />
+    <Word from="intoxica" to="Intoxica" />
+    <Word from="intrega" to="entrega" />
+    <Word from="inverosímel" to="inverosímil" />
+    <Word from="iorgute" to="iogurte" />
+    <Word from="ipopótamo" to="hipopótamo" />
+    <Word from="ipsilon" to="ípsilon" />
+    <Word from="ipslon" to="ípsilon" />
+    <Word from="isquesito" to="esquisito" />
+    <Word from="juíz" to="juiz" />
+    <Word from="juiza" to="juíza" />
+    <Word from="júniores" to="juniores" />
+    <Word from="justanzente" to="justamente" />
+    <Word from="juz" to="jus" />
+    <Word from="kilo" to="quilo" />
+    <Word from="laboratório-porque" to="laboratório porque" />
+    <Word from="ladravaz" to="ladrava" />
+    <Word from="lamentàvelmente" to="lamentavelmente" />
+    <Word from="lampeão" to="lampião" />
+    <Word from="largartixa" to="lagartixa" />
+    <Word from="largarto" to="lagarto" />
+    <Word from="lêm" to="lêem" />
+    <Word from="leucémia" to="leucemia" />
+    <Word from="licensa" to="licença" />
+    <Word from="linguísta" to="linguista" />
+    <Word from="lisongear" to="lisonjear" />
+    <Word from="logista" to="lojista" />
+    <Word from="maçajar" to="massajar" />
+    <Word from="Macfadden-o" to="Macfadden o" />
+    <Word from="mae" to="mãe" />
+    <Word from="magestade" to="majestade" />
+    <Word from="mãgua" to="mágoa" />
+    <Word from="mangerico" to="manjerico" />
+    <Word from="mangerona" to="manjerona" />
+    <Word from="manteem-se" to="mantêm-se" />
+    <Word from="mantega" to="manteiga" />
+    <Word from="mantem-se" to="mantém-se" />
+    <Word from="massiço" to="maciço" />
+    <Word from="massisso" to="maciço" />
+    <Word from="médica-Rio" to="médica Rio" />
+    <Word from="menistro" to="ministro" />
+    <Word from="merciaria" to="mercearia" />
+    <Word from="metrelhadora" to="metralhadora" />
+    <Word from="miscegenação" to="miscigenação" />
+    <Word from="misogenia" to="misoginia" />
+    <Word from="misogeno" to="misógino" />
+    <Word from="misógeno" to="misógino" />
+    <Word from="mº" to="º" />
+    <Word from="môlho" to="molho" />
+    <Word from="monumentânea" to="momentânea" />
+    <Word from="mortandela" to="mortadela" />
+    <Word from="morteIa" to="mortela" />
+    <Word from="muinto" to="muito" />
+    <Word from="nasaias" to="nasais" />
+    <Word from="nêle" to="nele" />
+    <Word from="nest" to="neste" />
+    <Word from="Nivea" to="Nívea" />
+    <Word from="nonagessimo" to="nonagésimo" />
+    <Word from="nonagéssimo" to="nonagésimo" />
+    <Word from="nornal" to="normal" />
+    <Word from="notàvelmente" to="notavelmente" />
+    <Word from="obcessão" to="obsessão" />
+    <Word from="obesidae" to="obesidade" />
+    <Word from="óbviamente" to="obviamente" />
+    <Word from="òbviamente" to="obviamente" />
+    <Word from="ofecina" to="oficina" />
+    <Word from="oje" to="hoje" />
+    <Word from="omem" to="homem" />
+    <Word from="opcoes" to="opções" />
+    <Word from="opóbrio" to="opróbrio" />
+    <Word from="opróbio" to="opróbrio" />
+    <Word from="orfão" to="órfão" />
+    <Word from="organigrama" to="organograma" />
+    <Word from="organisar" to="organizar" />
+    <Word from="orgão" to="órgão" />
+    <Word from="orta" to="horta" />
+    <Word from="ótima" to="óptima" />
+    <Word from="ótimos" to="óptimos" />
+    <Word from="paralização" to="paralisação" />
+    <Word from="paralizado" to="paralisado" />
+    <Word from="paralizar" to="paralisar" />
+    <Word from="paráste" to="paraste" />
+    <Word from="Pátria" to="pátria" />
+    <Word from="paúl" to="Paul" />
+    <Word from="pecalço" to="percalço" />
+    <Word from="pêga" to="pega" />
+    <Word from="periodo" to="período" />
+    <Word from="pertubar" to="perturbar" />
+    <Word from="perú" to="peru" />
+    <Word from="piqueno" to="pequeno" />
+    <Word from="pirinéus" to="Pirenéus" />
+    <Word from="poblema" to="problema" />
+    <Word from="pobrema" to="problema" />
+    <Word from="poden" to="podem" />
+    <Word from="poder-mos" to="pudermos" />
+    <Word from="ponteagudo" to="pontiagudo" />
+    <Word from="pontuacoes" to="pontuações" />
+    <Word from="prazeiroso" to="prazeroso" />
+    <Word from="precaridade" to="precariedade" />
+    <Word from="precizar" to="precisar" />
+    <Word from="preserverança" to="perseverança" />
+    <Word from="previlégio" to="privilégio" />
+    <Word from="primária-que" to="primária que" />
+    <Word from="priúdo" to="período" />
+    <Word from="probalidade" to="probabilidade" />
+    <Word from="progreso" to="progresso" />
+    <Word from="proibído" to="proibido" />
+    <Word from="proíbido" to="proibido" />
+    <Word from="própia" to="própria" />
+    <Word from="propiedade" to="propriedade" />
+    <Word from="propio" to="próprio" />
+    <Word from="própio" to="próprio" />
+    <Word from="provocacoes" to="provocações" />
+    <Word from="prsença" to="presença" />
+    <Word from="prustituta" to="prostituta" />
+    <Word from="pudérmos" to="pudermos" />
+    <Word from="púlico" to="público" />
+    <Word from="pús" to="pus" />
+    <Word from="pusémos" to="pusemos" />
+    <Word from="quadricomia" to="quadricromia" />
+    <Word from="quadriplicado" to="quadruplicado" />
+    <Word from="quaisqueres" to="quaisquer" />
+    <Word from="quer-a" to="quere-a" />
+    <Word from="quere-se" to="quer-se" />
+    <Word from="quer-o" to="quere-o" />
+    <Word from="químco" to="químico" />
+    <Word from="quises-te" to="quiseste" />
+    <Word from="quizer" to="quiser" />
+    <Word from="quizeram" to="quiseram" />
+    <Word from="quizesse" to="quisesse" />
+    <Word from="quizessem" to="quisessem" />
+    <Word from="raínha" to="rainha" />
+    <Word from="raíz" to="raiz" />
+    <Word from="raizes" to="raízes" />
+    <Word from="ratato" to="retrato" />
+    <Word from="raúl" to="raul" />
+    <Word from="razar" to="rasar" />
+    <Word from="rectaguarda" to="retaguarda" />
+    <Word from="rédia" to="rédea" />
+    <Word from="reestabelecer" to="restabelecer" />
+    <Word from="refeicoes" to="refeições" />
+    <Word from="refêrencia" to="referência" />
+    <Word from="regeitar" to="rejeitar" />
+    <Word from="regurjitar" to="regurgitar" />
+    <Word from="reinvidicação" to="reivindicação" />
+    <Word from="reinvidicar" to="reivindicar" />
+    <Word from="requer-a" to="requere-a" />
+    <Word from="requere-se" to="requer-se" />
+    <Word from="requer-o" to="requere-o" />
+    <Word from="requesito" to="requisito" />
+    <Word from="requisicoes" to="requisições" />
+    <Word from="RESIDENCIA" to="RESIDÊNCIA" />
+    <Word from="respiraçáo" to="respiração" />
+    <Word from="restablecer" to="restabelecer" />
+    <Word from="réstea" to="réstia" />
+    <Word from="ruborisar" to="ruborizar" />
+    <Word from="rúbrica" to="rubrica" />
+    <Word from="sàdia" to="sadia" />
+    <Word from="saiem" to="saem" />
+    <Word from="salchicha" to="salsicha" />
+    <Word from="salchichas" to="salsichas" />
+    <Word from="saloice" to="saloiice" />
+    <Word from="salvé" to="salve" />
+    <Word from="salve-raínha" to="salve-rainha" />
+    <Word from="salvé-rainha" to="salve-rainha" />
+    <Word from="salvé-raínha" to="salve-rainha" />
+    <Word from="sao" to="são" />
+    <Word from="sargeta" to="sarjeta" />
+    <Word from="seções" to="secções" />
+    <Word from="seija" to="seja" />
+    <Word from="seissentos" to="seiscentos" />
+    <Word from="seje" to="seja" />
+    <Word from="semiar" to="semear" />
+    <Word from="séniores" to="seniores" />
+    <Word from="sensibilidadc" to="sensibilidade" />
+    <Word from="sensívelmente" to="sensivelmente" />
+    <Word from="setessentos" to="setecentos" />
+    <Word from="siclano" to="sicrano" />
+    <Word from="Sifilis" to="Sífilis" />
+    <Word from="sifílis" to="sífilis" />
+    <Word from="sinão" to="senão" />
+    <Word from="sinmtoma" to="sintoma" />
+    <Word from="sintéticamente" to="sinteticamente" />
+    <Word from="sintetisa" to="sintetiza" />
+    <Word from="SÓ" to="só" />
+    <Word from="sôfra" to="sofra" />
+    <Word from="sôfregamente" to="sofregamente" />
+    <Word from="somáste" to="somaste" />
+    <Word from="sombracelha" to="sobrancelha" />
+    <Word from="sombrancelha" to="sobrancelha" />
+    <Word from="sombrancelhas" to="sobrancelhas" />
+    <Word from="suavisar" to="suavizar" />
+    <Word from="substituido" to="substituído" />
+    <Word from="suburbio" to="subúrbio" />
+    <!-- <Word from="sues" to="seus" /> sues is a valid word: "Cuidado, não sues muito." -->
+    <Word from="suI" to="sul" />
+    <Word from="Suiça" to="Suíça" />
+    <Word from="suiças" to="suíças" />
+    <Word from="suiço" to="suíço" />
+    <Word from="suiços" to="suíços" />
+    <Word from="supôr" to="supor" />
+    <Word from="tabeliões" to="tabeliães" />
+    <Word from="taínha" to="tainha" />
+    <Word from="tava" to="estava" />
+    <Word from="têem" to="têm" />
+    <Word from="telemovel" to="telemóvel" />
+    <Word from="telémovel" to="telemóvel" />
+    <Word from="terminacoes" to="terminações" />
+    <Word from="toráxico" to="torácico" />
+    <Word from="tou" to="estou" />
+    <Word from="transpôr" to="transpor" />
+    <Word from="trasnporte" to="transporte" />
+    <Word from="tumors" to="tumores" />
+    <Word from="úmida" to="húmida" />
+    <Word from="umidade" to="humidade" />
+    <Word from="vai-vem" to="vaivém" />
+    <Word from="vegilância" to="vigilância" />
+    <Word from="vegilante" to="vigilante" />
+    <Word from="ventoínha" to="ventoinha" />
+    <Word from="verosímel" to="verosímil" />
+    <Word from="video" to="vídeo" />
+    <Word from="virus" to="vírus" />
+    <Word from="visiense" to="viseense" />
+    <Word from="voçe" to="você" />
+    <Word from="voçê" to="você" />
+    <Word from="vôo" to="voo" />
+    <Word from="xadrês" to="xadrez" />
+    <Word from="xafariz" to="chafariz" />
+    <Word from="xéxé" to="xexé" />
+    <Word from="xilindró" to="chilindró" />
+    <Word from="zaíre" to="Zaire" />
+    <Word from="zepelin" to="zepelim" />
+    <Word from="zig-zag" to="ziguezague" />
+    <Word from="zoô" to="zoo" />
+    <Word from="zôo" to="zoo" />
+    <Word from="zuar" to="zoar" />
+    <Word from="zum-zum" to="zunzum" />
+  </WholeWords>
+  <PartialWordsAlways />
+  <PartialWords />
+  <PartialLines>
+    <LinePart from="IN 6-E" to="N 6 E" />
+    <LinePart from="in tegrar-se" to="integrar-se" />
+    <LinePart from="in teresse" to="interesse" />
+    <LinePart from="in testinos" to="intestinos" />
+    <LinePart from="indica ção" to="indicação" />
+    <LinePart from="inte tino" to="intestino" />
+    <LinePart from="intes tinos" to="intestinos" />
+    <LinePart from="L da" to="Lda" />
+    <LinePart from="mal estar" to="mal-estar" />
+    <LinePart from="mastiga çáo" to="mastigação" />
+    <LinePart from="médi cas" to="médicas" />
+    <LinePart from="mineo rais" to="minerais" />
+    <LinePart from="mola res" to="molares" />
+    <LinePart from="movi mentos" to="movimentos" />
+    <LinePart from="movimen to" to="movimento" />
+    <LinePart from="N 5-Estendido" to="Nº 5 Estendido" />
+    <LinePart from="oxigé nio" to="oxigénio" />
+    <LinePart from="pod mos" to="podemos" />
+    <LinePart from="poder-se ia" to="poder-se-ia" />
+    <LinePart from="pos sibilidade" to="possibilidade" />
+    <LinePart from="possibi lidades" to="possibilidades" />
+    <LinePart from="pro duto" to="produto" />
+    <LinePart from="procu rar" to="procurar" />
+    <LinePart from="Q u e" to="Que" />
+    <LinePart from="qualifi cam" to="qualificam" />
+    <LinePart from="R egião" to="Região" />
+    <LinePart from="unsuficien temente" to="insuficientemente" />
+  </PartialLines>
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions>
+    <!-- <RegEx find="\bi\b" replaceWith="I" /> just an example - do not use this regex -->
+    <RegEx find="([0-9]) +º" replaceWith="$1º" />
+    <RegEx find="\Bcao\b" replaceWith="ção" />
+    <RegEx find="\Bcoes\b" replaceWith="ções" />
+    <!-- <RegEx find="\Bccao\b" replaceWith="cção" /> redundant: the rule above already covers this case -->
+    <!-- <RegEx find="\Bccoes\b" replaceWith="cções" /> redundant: the rule above already covers this case -->
+    <RegEx find="\b(m|M)ae\b" replaceWith="$1ãe" />
+    <RegEx find="\Bdmnis\B" replaceWith="dminis" />
+    <RegEx find="\Blcól\B" replaceWith="lcoól" />
+    <RegEx find="\b(t|T)a[nm]b[eé]m\b" replaceWith="$1ambém" />
+    <RegEx find="\bzeppeli[mn]\b" replaceWith="zepelim" />
+    <RegEx find="\b(s|S)ufe?ciente\b" replaceWith="$1uficiente" />
+    <RegEx find="\b(n|N)ao\b" replaceWith="$1ão" />
+    <RegEx find="\b(B|b)elem\b" replaceWith="$1elém" />
+    <RegEx find="\b(s|S)u[íi]sso(s)?\b" replaceWith="$1uíço$2" />
+    <RegEx find="\b(s|S)u[íi]ssa(s)?\b" replaceWith="$1uíça$2" />
+    <RegEx find="\b(p|P)rivelig[ie]\p{Ll}d" replaceWith="$1rivilegiad" />
+    <RegEx find="\bpud(?:és|e-)se\b" replaceWith="pudesse" />
+    <RegEx find="\biquilíbr(?:e|i)o\b" replaceWith="equilíbrio" />
+    <RegEx find="\b(c|C)orregi\B" replaceWith="$1orrigi" />
+    <RegEx find="(?&lt;=A|a)ssociacao" replaceWith="ssociação" />
+    <RegEx find="(?&lt;=N|n)inguem" replaceWith="inguém" />
+    <RegEx find="(?&lt;=g|G)rat(?:uí|úi)to" replaceWith="ratuito" />
+    <RegEx find="(?&lt;=d|D)esiquilíbr[ei]o" replaceWith="esequilíbrio" />
+    <RegEx find="\b[kK]il(ogramas?|ómetros?)" replaceWith="quil$1" />
+  </RegularExpressions>
+</OCRFixReplaceList>
@@ -0,0 +1,257 @@
+<OCRFixReplaceList>
+  <WholeWords>
+    <Word from="НЄЙ" to="НЕЙ" />
+    <Word from="ОРГЗНИЗМОБ" to="ОРГАНИЗМОВ" />
+    <Word from="Чї0" to="ЧТО" />
+    <Word from="НЭ" to="НА" />
+    <Word from="СОСЄДНЮЮ" to="СОСЕДНЮЮ" />
+    <Word from="ПЛЗНЄТУ" to="ПЛАНЕТУ" />
+    <Word from="ЗЗГЭДОК" to="ЗАГАДОК" />
+    <Word from="СОТВОРЄНИЯ" to="СОТВОРЕНИЯ" />
+    <Word from="МИРЭ" to="МИРА" />
+    <Word from="ПОЯБЛЄНИЯ" to="ПОЯВЛЕНИЯ" />
+    <Word from="ЗЄМЛЄ" to="ЗЕМЛЕ" />
+    <Word from="ЄЩЄ" to="ЕЩЁ" />
+    <Word from="ТЄМНЬІХ" to="ТЕМНЫХ" />
+    <Word from="СЄРЬЄЗНЬІМ" to="СЕРЬЕЗНЫМ" />
+    <Word from="ПОШІІ0" to="ПОШЛО" />
+    <Word from="Пр0ИЗ0ШЄЛ" to="ПРОИЗОШЕЛ" />
+    <Word from="СЄКРЄТЭМИ" to="СЕКРЕТАМИ" />
+    <Word from="МЭТЄРИЗЛЬІ" to="МАТЕРИАЛЫ" />
+    <Word from="ПЯТЄН" to="ПЯТЕН" />
+    <Word from="ПЛаНЄїЄ" to="ПЛАНЕТЕ" />
+    <Word from="КЗТЭКЛИЗМ" to="КАТАКЛИЗМ" />
+    <Word from="ОКЗЗЗЛСЯ" to="ОКАЗАЛСЯ" />
+    <Word from="ДЭЛЬШЕ" to="ДАЛЬШЕ" />
+    <Word from="ТВК" to="ТАК" />
+    <Word from="ПЛЗНЄТЗ" to="ПЛАНЕТА" />
+    <Word from="ЧЄГО" to="ЧЕГО" />
+    <Word from="УЗНЭТЬ" to="УЗНАТЬ" />
+    <Word from="ПЛЭНЄТЄ" to="ПЛАНЕТЕ" />
+    <Word from="НЄМ" to="НЕМ" />
+    <Word from="БОЗМОЖНЗ" to="ВОЗМОЖНА" />
+    <Word from="СОБЄРШЄННО" to="СОВЕРШЕННО" />
+    <Word from="ИНЭЧЄ" to="ИНАЧЕ" />
+    <Word from="БСЄ" to="ВСЕ" />
+    <Word from="НЕДОСТЗТКИ" to="НЕДОСТАТКИ" />
+    <Word from="НОВЬІЄ" to="НОВЫЕ" />
+    <Word from="ВЄЛИКОЛЄПНЭЯ" to="ВЕЛИКОЛЕПНАЯ" />
+    <Word from="ОСТЭІІОСЬ" to="ОСТАЛОСЬ" />
+    <Word from="НЗЛИЧИЄ" to="НАЛИЧИЕ" />
+    <Word from="бЫ" to="бы" />
+    <Word from="ПРОЦВЕТВТЬ" to="ПРОЦВЕТАТЬ" />
+    <Word from="КЗК" to="КАК" />
+    <Word from="ВОДЗ" to="ВОДА" />
+    <Word from="НЗШЕЛ" to="НАШЕЛ" />
+    <Word from="НЄ" to="НЕ" />
+    <Word from="ТОЖЄ" to="ТОЖЕ" />
+    <Word from="ВУЛКЭНИЧЄСКОЙ" to="ВУЛКАНИЧЕСКОЙ" />
+    <Word from="ЭКТИБНОСТИ" to="АКТИВНОСТИ" />
+    <Word from="ПОЯВИЛЗСЬ" to="ПОЯВИЛАСЬ" />
+    <Word from="НОВЗЯ" to="НОВАЯ" />
+    <Word from="СТРЭТЄГИЯ" to="СТРАТЕГИЯ" />
+    <Word from="УСПЄШН0" to="УСПЕШНО" />
+    <Word from="ПОСЗДКУ" to="ПОСАДКУ" />
+    <Word from="ГОТОБЫ" to="ГОТОВЫ" />
+    <Word from="НЗЧЗТЬ" to="НАЧАТЬ" />
+    <Word from="ОХОТЭ" to="ОХОТА" />
+    <Word from="ПРИЗНЗКЗМИ" to="ПРИЗНАКАМИ" />
+    <Word from="Пр0ШЛОМ" to="ПРОШЛОМ" />
+    <Word from="НЭСТОЯЩЄМ" to="НАСТОЯЩЕМ" />
+    <Word from="ПУСТОТЗХ" to="ПУСТОТАХ" />
+    <Word from="БЛЗЖНОЙ" to="ВЛАЖНОЙ" />
+    <Word from="ПОЧБЄ" to="ПОЧВЕ" />
+    <Word from="МЬІ" to="МЫ" />
+    <Word from="СЄЙЧЗС" to="СЕЙЧАС" />
+    <Word from="ЄСЛИ" to="ЕСЛИ" />
+    <Word from="ЗЗТРОНЕМ" to="ЗАТРОНЕМ" />
+    <Word from="ОПЗСЗЄМСЯ" to="ОПАСАЕМСЯ" />
+    <Word from="СИЛЬН0" to="СИЛЬНО" />
+    <Word from="ОТЛИЧЗЄТСЯ" to="ОТЛИЧАЕТСЯ" />
+    <Word from="РЭНЬШЄ" to="РАНЬШЕ" />
+    <Word from="НЗЗЬІВЗЮТ" to="НАЗЫВАЮТ" />
+    <Word from="ТЄКЛ3" to="ТЕКЛА" />
+    <Word from="ОСЗДОЧНЫМИ" to="ОСАДОЧНЫМИ" />
+    <Word from="ПОСТЄПЄНН0" to="ПОСТЕПЕННО" />
+    <Word from="ИСПЭРЯЛЗСЬ" to="ИСПАРЯЛАСЬ" />
+    <Word from="ЄОЛЬШОЄ" to="БОЛЬШОЕ" />
+    <Word from="КОЛИЧЄСТБО" to="КОЛИЧЕСТВО" />
+    <Word from="ГЄМЗТИТЕ" to="ГЕМАТИТА" />
+    <Word from="ПОЛУЧЭЄТ" to="ПОЛУЧАЕТ" />
+    <Word from="НЄДОСТЗЧН0" to="НЕДОСТАТОЧНО" />
+    <Word from="ПИТЭНИЯ" to="ПИТАНИЯ" />
+    <Word from="ПОКЗ" to="ПОКА" />
+    <Word from="БЬІХОДИЛИ" to="ВЫХОДИЛИ" />
+    <Word from="ЗЄМІІЄ" to="ЗЕМЛЕ" />
+    <Word from="ВЄСЬІИЗ" to="ВЕСЬМА" />
+    <Word from="ЗЄМЛИ" to="ЗЕМЛИ" />
+    <Word from="бЬІЛО" to="БЫЛО" />
+    <Word from="КИЗНИ" to="ЖИЗНИ" />
+    <Word from="СТЗНОВИЛЗСЬ" to="СТАНОВИЛАСЬ" />
+    <Word from="СОЛЄНЄЄ" to="СОЛЁНЕЕ" />
+    <Word from="МЭГНИТНЫМ" to="МАГНИТНЫМ" />
+    <Word from="ЧТОбЬІ" to="ЧТОБЫ" />
+    <Word from="СОЗДЕТЬ" to="СОЗДАТЬ" />
+    <Word from="МЗГНИТНОЄ" to="МАГНИТНОЕ" />
+    <Word from="КЭЖУТСЯ" to="КАЖУТСЯ" />
+    <Word from="ОЗНЗЧЗЄТ" to="ОЗНАЧАЕТ" />
+    <Word from="МОГЛЗ" to="МОГЛА" />
+    <Word from="ИМЄТЬ" to="ИМЕТЬ" />
+    <Word from="КОСМОСЭ" to="КОСМОСА" />
+    <Word from="СОЛНЄЧНЗЯ" to="СОЛНЕЧНАЯ" />
+    <Word from="СИСТЄМЗ" to="СИСТЕМА" />
+    <Word from="ПОСІІУЖИЛО" to="ПОСЛУЖИЛО" />
+    <Word from="МЗГНИТНОГО" to="МАГНИТНОГО" />
+    <Word from="ПЛВНЄТЫ" to="ПЛАНЕТЫ" />
+    <Word from="ЛОКЗЛЬНЬІХ" to="ЛОКАЛЬНЫХ" />
+    <Word from="ПОЛЄЙ" to="ПОЛЕЙ" />
+    <Word from="КЗЖУТСЯ" to="КАЖУТСЯ" />
+    <Word from="КЗКОГО" to="КАКОГО" />
+    <Word from="СТРЗШНОГО" to="СТРАШНОГО" />
+    <Word from="СТОЛКНОЕЄНИЯ" to="СТОЛКНОВЕНИЯ" />
+    <Word from="МЕСТЗМИ" to="МЕСТАМИ" />
+    <Word from="СДЄЛЗТЬ" to="СДЕЛАТЬ" />
+    <Word from="СТЗЛО" to="СТАЛО" />
+    <Word from="МЭГНИТНОГО" to="МАГНИТНОГО" />
+    <Word from="ЗЗКЛЮЧЗВШЄЙСЯ" to="ЗАКЛЮЧАВШЕЙСЯ" />
+    <Word from="ЄГО" to="ЕГО" />
+    <Word from="ЯДРЄ" to="ЯДРЕ" />
+    <Word from="НЗ" to="НА" />
+    <Word from="ИСЧЄЗЛ3" to="ИСЧЕЗЛА" />
+    <Word from="СЧИТЗЮ" to="СЧИТАЮ" />
+    <Word from="ШЭНСЫ" to="ШАНСЫ" />
+    <Word from="ИНЗЧЄ" to="ИНАЧЕ" />
+    <Word from="СТЗЛ" to="СТАЛ" />
+    <Word from="ТРЗТИТЬ" to="ТРАТИТЬ" />
+    <Word from="НЗПРЗВЛЯЄТСЯ" to="НАПРАВЛЯЕТСЯ" />
+    <Word from="ОБЛЭСТИ" to="ОБЛАСТИ" />
+    <Word from="ЯВЛЯІОТСЯ" to="ЯВЛЯЮТСЯ" />
+    <Word from="ГЛЭВНОЙ" to="ГЛАВНОЙ" />
+    <Word from="ДОКЗЗЗТЄЛЬСТВ" to="ДОКАЗАТЕЛЬСТВ" />
+    <Word from="КИСЛОТЭМИ" to="КИСЛОТАМИ" />
+    <Word from="ОНЭ" to="ОНА" />
+    <Word from="ПРЗКТИЧЄСКИ" to="ПРАКТИЧЕСКИ" />
+    <Word from="ЛЄСУ" to="ЛЕСУ" />
+    <Word from="УСЛОБИЯМ" to="УСЛОВИЯМ" />
+    <Word from="СПЗСТИСЬ" to="СПАСТИСЬ" />
+    <Word from="РЗЗВИВЗЮЩИЄСЯ" to="РАЗВИВАЮЩИЕСЯ" />
+    <Word from="ШЭПКИ" to="ШАПКИ" />
+    <Word from="ЗНЗЄМ" to="ЗНАЕМ" />
+    <Word from="СООИРЭЄМСЯ" to="СОБИРАЕМСЯ" />
+    <Word from="БЫЯСНИТЬ" to="ВЫЯСНИТЬ" />
+    <Word from="СЗМ" to="САМ" />
+    <Word from="РЗСПОЗНЗТЬ" to="РАСПОЗНАТЬ" />
+    <Word from="УЗНЗТЬ" to="УЗНАТЬ" />
+    <Word from="КЭЖЄТСЯ" to="КАЖЕТСЯ" />
+    <Word from="ОРЄИТЗЛЬНЬІЄ" to="ОРБИТАЛЬНЫЕ" />
+    <Word from="ЛЄТЭТЄЛЬНЬІЄ" to="ЛЕТАТЕЛЬНЫЕ" />
+    <Word from="ЗППЗРЕТЬІ" to="АППАРАТЫ" />
+    <Word from="ЖЄ" to="ЖЕ" />
+    <Word from="ТЗКЗЯ" to="ТАКАЯ" />
+    <Word from="МЗЛЄНЬКЗЯ" to="МАЛЕНЬКАЯ" />
+    <Word from="ПЛЭНЄТЗ" to="ПЛАНЕТА" />
+    <Word from="СПЗІІЬКО" to="СТОЛЬКО" />
+    <Word from="бЬІЛ3" to="БЫЛА" />
+    <Word from="ЁЕСЧИСЛЄННОЄ" to="БЕСЧИСЛЕННОЕ" />
+    <Word from="МЗГНИїНЬІХ" to="МАГНИТНЫХ" />
+    <Word from="ПОСТраД3Л" to="ПОСТРАДАЛ" />
+    <Word from="ДЗЖЄ" to="ДАЖЕ" />
+    <Word from="РЗЗНЬІМИ" to="РАЗНЫМИ" />
+    <Word from="СУЩЄСТБОВЭНИЄ" to="СУЩЕСТВОВАНИЕ" />
+    <Word from="ПЛаНЄїЬІ" to="ПЛАНЕТЫ" />
+    <Word from="ПОДВЄРГЛЗСЬ" to="ПОДВЕРГЛАСЬ" />
+    <Word from="ОПЗСІ-ІОСТИ" to="ОПАСНОСТИ" />
+    <Word from="ПЛЗНЄТЄ" to="ПЛАНЕТЕ" />
+    <Word from="Н0" to="НО" />
+    <Word from="бЬІ" to="БЫ" />
+    <Word from="ОТДЗЛЄННЫЄ" to="ОТДАЛЁННЫЕ" />
+    <Word from="ПОЛЯРНЬІЄ" to="ПОЛЯРНЫЕ" />
+    <Word from="ЦЄЛЬІ-О" to="ЦЕЛЬЮ" />
+    <Word from="ПЄЩЄРЗХ" to="ПЕЩЕРАХ" />
+    <Word from="НЗПОЛНЄННЬІХ" to="НАПОЛНЕННЫХ" />
+    <Word from="ИСПЗРЄНИЯМИ" to="ИСПАРЕНИЯМИ" />
+    <Word from="МИНИЗТЮРНЬІЄ" to="МИНИАТЮРНЫЕ" />
+    <Word from="ТЭКЗЯ" to="ТАКАЯ" />
+    <Word from="ПрИСП0СОбИТЬСЯ" to="ПРИСПОСОБИТЬСЯ" />
+    <Word from="НЄОЄХОДИМЬІЄ" to="НЕОБХОДИМЫЕ" />
+    <Word from="ОРГВНИЧЄСКИЄ" to="ОРГАНИЧЕСКИЕ" />
+    <Word from="МЗРСИЗНСКИЄ" to="МАРСИАНСКИЕ" />
+    <Word from="МЄСТЄ" to="МЕСТЕ" />
+    <Word from="І\/ІАККЕЙШ" to="МАККЕЙН" />
+    <Word from="НЗХОДЯЩИЄСЯ" to="НАХОДЯЩИЕСЯ" />
+    <Word from="НЄЗКТИВНОМ" to="НЕАКТИВНОМ" />
+    <Word from="ЗЭСНЯТЬ" to="ЗАСНЯТЬ" />
+    <Word from="ОРГЗНИЗМЬІ" to="ОРГАНИЗМЫ" />
+    <Word from="ВЗЕИМОДЄЙСТВОВЕТЬ" to="ВЗАИМОДЕЙСТВОВАТЬ" />
+    <Word from="ПУТЄШЄСТБИЄ" to="ПУТЕШЕСТВИЕ" />
+    <Word from="ПуСїЬІННЫХ" to="ПУСТЫННЫХ" />
+    <Word from="ТЗКИХ" to="ТАКИХ" />
+    <Word from="ПЄРЄТЗСКИВЗЄМ" to="ПЕРЕТАСКИВАЕМ" />
+    <Word from="ЧТ0" to="ЧТО" />
+    <Word from="ВЄСЬМЗ" to="ВЕСЬМА" />
+    <Word from="ПОЛОСЗМИ" to="ПОЛОСАМИ" />
+    <Word from="ОрїЭНИЗМЬІ" to="ОРГАНИЗМЫ" />
+    <Word from="ОЁЛЗСТИ" to="ОБЛАСТИ" />
+    <Word from="ЯБЛЯЮТСЯ" to="ЯВЛЯЮТСЯ" />
+    <Word from="ЦЄЛЬЮ" to="ЦЕЛЬЮ" />
+    <Word from="ПОИСКОБ" to="ПОИСКОВ" />
+    <Word from="ДОКЗЗЗТЄІІЬСТВ" to="ДОКАЗАТЕЛЬСТВ" />
+    <Word from="МОЖЄТ" to="МОЖЕТ" />
+    <Word from="НЭХОДИТЬСЯ" to="НАХОДИТЬСЯ" />
+    <Word from="ОЧЄНЬ" to="ОЧЕНЬ" />
+    <Word from="СРЗВНИТЬ" to="СРАВНИТЬ" />
+    <Word from="ОЄНЗРУЖИЛ" to="ОБНАРУЖИЛ" />
+    <Word from="ЛЬДЗ" to="ЛЬДА" />
+    <Word from="ПОТЄПЛЄНИЄІИ" to="ПОТЕПЛЕНИЕМ" />
+    <Word from="ПОХОЛОДЗНИЄБД" to="ПОХОЛОДАНИЕМ" />
+    <Word from="КЭК" to="КАК" />
+    <Word from="ТЄЛО" to="ТЕЛО" />
+    <Word from="бОЛЬШЄ" to="БОЛЬШЕ" />
+    <Word from="НЭКЛОНЯЄТСЯ" to="НАКЛОНЯЕТСЯ" />
+    <Word from="СОІІНЦУ" to="СОЛНЦУ" />
+    <Word from="СТ3бИЛИЗИрОБЗТЬ" to="СТАБИЛИЗИРОВАТЬ" />
+    <Word from="СТЭБИЛЬНЭ" to="СТАБИЛЬНА" />
+    <Word from="МИЛІІИОНОВ" to="МИЛЛИОНОВ" />
+    <Word from="НЗЗЭД" to="НАЗАД" />
+    <Word from="ТЄПЛ0" to="ТЕПЛО" />
+    <Word from="ПОІІЯРНЫХ" to="ПОЛЯРНЫХ" />
+    <Word from="СОІІЕНЫМИ" to="СОЛЕНЫМИ" />
+    <Word from="КЕКИМИ" to="КАКИМИ" />
+    <Word from="кислютнюсггь" to="кислотность" />
+    <Word from="ТЗМ" to="ТАМ" />
+    <Word from="ОРГЗНИЗМЫ" to="ОРГАНИЗМЫ" />
+    <Word from="СУЩЄСТВОВЄТЬ" to="СУЩЕСТВОВАТЬ" />
+    <Word from="ВНИМЗНИЄ" to="ВНИМАНИЕ" />
+    <Word from="СДЄЛЗЄТ" to="СДЕЛАЕТ" />
+    <Word from="ПОЗНЭКОМИТЬСЯ" to="ПОЗНАКОМИТЬСЯ" />
+    <Word from="НЭШИМ" to="НАШИМ" />
+    <Word from="ДОКЗЗЭТЄЛЬСТБО" to="ДОКАЗАТЕЛЬСТВО" />
+    <Word from="ЩЗЗЩЄНИЯ" to="ВРАЩЕНИЯ" />
+    <Word from="бЬІЛ0" to="БЫЛО" />
+    <Word from="ОЄЛЕСТЯХ" to="ОБЛАСТЯХ" />
+    <Word from="бЬІЛИ" to="БЫЛИ" />
+    <Word from="РЭЗМЬІШЛЯІІИ" to="РАЗМЫШЛЯЛИ" />
+    <Word from="КОЛИЧЄСТБЄ" to="КОЛИЧЕСТВЕ" />
+    <Word from="ЩЄІІОЧНЫЄ" to="ЩЕЛОЧНЫЕ" />
+    <Word from="НЄКОТЩЗЬІЄ" to="НЕКОТОРЫЕ" />
+    <Word from="ПрИБІ1ЕКуї" to="ПРИВЛЕКУТ" />
+    <Word from="НЗЗЬІВЭЄМЫЄ" to="НАЗЫВАЕМЫЕ" />
+    <Word from="Чї06Ы" to="ЧТОБЫ" />
+  </WholeWords>
+  <PartialWordsAlways />
+  <PartialWords>
+    <WordPart from="Є" to="Е" />
+    <WordPart from="ЬІ" to="Ы" />
+    <WordPart from="КЗ" to="КА" />
+    <WordPart from="ЛЗ" to="ЛА" />
+    <WordPart from="НЗ" to="НА" />
+    <WordPart from="ШЗ" to="ША" />
+    <WordPart from="І\/І" to="М" />
+  </PartialWords>
+  <PartialLines />
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions />
+</OCRFixReplaceList>
@@ -0,0 +1,946 @@
+<OCRFixReplaceList>
+  <WholeWords>
+    <!-- Simple abbreviations -->
+    <Word from="KBs" to="kB" />
+    <Word from="Vd" to="Ud" />
+    <Word from="N°" to="N.°" />
+    <Word from="n°" to="n.°" />
+    <Word from="nro." to="n.°" />
+    <Word from="Nro." to="N.°" />
+    <!-- Basic spelling -->
+    <Word from="aca" to="acá" />
+    <Word from="actuas" to="actúas" />
+    <Word from="actues" to="actúes" />
+    <Word from="adios" to="adiós" />
+    <Word from="agarrenla" to="agárrenla" />
+    <Word from="agarrenlo" to="agárrenlo" />
+    <Word from="agarrandose" to="agarrándose" />
+    <Word from="algun" to="algún" />
+    <Word from="alli" to="allí" />
+    <Word from="alla" to="allá" />
+    <Word from="alejate" to="aléjate" />
+    <Word from="ahi" to="ahí" />
+    <Word from="angel" to="ángel" />
+    <Word from="angeles" to="ángeles" />
+    <Word from="apagala" to="apágala" />
+    <Word from="aqui" to="aquí" />
+    <Word from="asi" to="así" />
+    <Word from="bahia" to="bahía" />
+    <Word from="busqueda" to="búsqueda" />
+    <Word from="busquedas" to="búsquedas" />
+    <Word from="callate" to="cállate" />
+    <Word from="carcel" to="cárcel" />
+    <Word from="camara" to="cámara" />
+    <Word from="caido" to="caído" />
+    <Word from="cabron" to="cabrón" />
+    <Word from="camion" to="camión" />
+    <Word from="codigo" to="código" />
+    <Word from="codigos" to="códigos" />
+    <Word from="comence" to="comencé" />
+    <Word from="comprate" to="cómprate" />
+    <Word from="consegui" to="conseguí" />
+    <Word from="confias" to="confías" />
+    <Word from="convertira" to="convertirá" />
+    <Word from="corazon" to="corazón" />
+    <Word from="crei" to="creí" />
+    <Word from="creia" to="creía" />
+    <Word from="creido" to="creído" />
+    <Word from="creiste" to="creíste" />
+    <Word from="cubrenos" to="cúbrenos" />
+    <Word from="comio" to="comió" />
+    <Word from="dara" to="dará" />
+    <Word from="dia" to="día" />
+    <Word from="dias" to="días" />
+    <Word from="debio" to="debió" />
+    <Word from="demelo" to="démelo" />
+    <Word from="dimelo" to="dímelo" />
+    <Word from="denoslo" to="dénoslo" />
+    <Word from="deselo" to="déselo" />
+    <Word from="decia" to="decía" />
+    <Word from="decian" to="decían" />
+    <Word from="detras" to="detrás" />
+    <Word from="deberia" to="debería" />
+    <Word from="deberas" to="deberás" />
+    <Word from="deberias" to="deberías" />
+    <Word from="deberian" to="deberían" />
+    <Word from="deberiamos" to="deberíamos" />
+    <Word from="dejame" to="déjame" />
+    <Word from="dejate" to="déjate" />
+    <Word from="dejalo" to="déjalo" />
+    <Word from="dejarian" to="dejarían" />
+    <Word from="damela" to="dámela" />
+    <Word from="despues" to="después" />
+    <Word from="diciendome" to="diciéndome" />
+    <Word from="dificil" to="difícil" />
+    <Word from="dificiles" to="difíciles" />
+    <Word from="disculpate" to="discúlpate" />
+    <Word from="dolares" to="dólares" />
+    <Word from="hechar" to="echar" />
+    <Word from="examenes" to="exámenes" />
+    <Word from="empezo" to="empezó" />
+    <Word from="empujon" to="empujón" />
+    <Word from="empujalo" to="empújalo" />
+    <Word from="escondanme" to="escóndanme" />
+    <Word from="esperame" to="espérame" />
+    <Word from="estara" to="estará" />
+    <Word from="estare" to="estaré" />
+    <Word from="estaria" to="estaría" />
+    <Word from="estan" to="están" />
+    <Word from="estaran" to="estarán" />
+    <Word from="estabamos" to="estábamos" />
+    <Word from="estuvieramos" to="estuviéramos" />
+    <Word from="exito" to="éxito" />
+    <Word from="facil" to="fácil" />
+    <Word from="fiscalia" to="fiscalía" />
+    <Word from="fragil" to="frágil" />
+    <Word from="fragiles" to="frágiles" />
+    <Word from="frances" to="francés" />
+    <Word from="gustaria" to="gustaría" />
+    <Word from="habia" to="había" />
+    <Word from="habias" to="habías" />
+    <Word from="habian" to="habían" />
+    <Word from="habrian" to="habrían" />
+    <Word from="habrias" to="habrías" />
+    <Word from="hagalo" to="hágalo" />
+    <Word from="haria" to="haría" />
+    <Word from="increible" to="increíble" />
+    <Word from="incredulo" to="incrédulo" />
+    <Word from="intentalo" to="inténtalo" />
+    <Word from="ire" to="iré" />
+    <Word from="jovenes" to="jóvenes" />
+    <Word from="ladron" to="ladrón" />
+    <Word from="linea" to="línea" />
+    <Word from="llamame" to="llámame" />
+    <Word from="llevalo" to="llévalo" />
+    <Word from="mama" to="mamá" />
+    <Word from="maricon" to="maricón" />
+    <Word from="mayoria" to="mayoría" />
+    <Word from="metodo" to="método" />
+    <Word from="metodos" to="métodos" />
+    <Word from="mio" to="mío" />
+    <Word from="mostro" to="mostró" />
+    <Word from="morira" to="morirá" />
+    <Word from="muevete" to="muévete" />
+    <Word from="murio" to="murió" />
+    <Word from="numero" to="número" />
+    <Word from="numeros" to="números" />
+    <Word from="ningun" to="ningún" />
+    <Word from="oido" to="oído" />
+    <Word from="oidos" to="oídos" />
+    <Word from="oimos" to="oímos" />
+    <Word from="oiste" to="oíste" />
+    <Word from="pasale" to="pásale" />
+    <Word from="pasame" to="pásame" />
+    <Word from="paraiso" to="paraíso" />
+    <Word from="parate" to="párate" />
+    <Word from="pense" to="pensé" />
+    <Word from="peluqueria" to="peluquería" />
+    <Word from="platano" to="plátano" />
+    <Word from="plastico" to="plástico" />
+    <Word from="plasticos" to="plásticos" />
+    <Word from="policia" to="policía" />
+    <Word from="policias" to="policías" />
+    <Word from="poster" to="póster" />
+    <Word from="podia" to="podía" />
+    <Word from="podias" to="podías" />
+    <Word from="podria" to="podría" />
+    <Word from="podrian" to="podrían" />
+    <Word from="podrias" to="podrías" />
+    <Word from="podriamos" to="podríamos" />
+    <Word from="prometio" to="prometió" />
+    <Word from="proposito" to="propósito" />
+    <Word from="pideselo" to="pídeselo" />
+    <Word from="ponganse" to="pónganse" />
+    <Word from="prometeme" to="prométeme" />
+    <Word from="publico" to="público" />
+    <Word from="publicos" to="públicos" />
+    <Word from="publicamente" to="públicamente" />
+    <Word from="quedate" to="quédate" />
+    <Word from="queria" to="quería" />
+    <Word from="querrias" to="querrías" />
+    <Word from="querian" to="querían" />
+    <Word from="rapido" to="rápido" />
+    <Word from="rapidamente" to="rápidamente" />
+    <Word from="razon" to="razón" />
+    <Word from="rehusen" to="rehúsen" />
+    <Word from="rie" to="ríe" />
+    <Word from="rias" to="rías" />
+    <Word from="rindete" to="ríndete" />
+    <Word from="sacame" to="sácame" />
+    <Word from="sentian" to="sentían" />
+    <Word from="sientate" to="siéntate" />
+    <Word from="sera" to="será" />
+    <Word from="soplon" to="soplón" />
+    <Word from="sueltalo" to="suéltalo" />
+    <Word from="tambien" to="también" />
+    <Word from="teoria" to="teoría" />
+    <Word from="tendra" to="tendrá" />
+    <Word from="telefono" to="teléfono" />
+    <Word from="tipica" to="típica" />
+    <Word from="todavia" to="todavía" />
+    <Word from="tomalo" to="tómalo" />
+    <Word from="tonterias" to="tonterías" />
+    <Word from="torci" to="torcí" />
+    <Word from="traelos" to="tráelos" />
+    <Word from="traiganlo" to="tráiganlo" />
+    <Word from="traiganlos" to="tráiganlos" />
+    <Word from="trio" to="trío" />
+    <Word from="tuvieramos" to="tuviéramos" />
+    <Word from="union" to="unión" />
+    <Word from="ultimo" to="último" />
+    <Word from="ultima" to="última" />
+    <Word from="ultimos" to="últimos" />
+    <Word from="ultimas" to="últimas" />
+    <Word from="unica" to="única" />
+    <Word from="unico" to="único" />
+    <Word from="vamonos" to="vámonos" />
+    <Word from="vayanse" to="váyanse" />
+    <Word from="victima" to="víctima" />
+    <Word from="vivira" to="vivirá" />
+    <Word from="volvio" to="volvió" />
+    <Word from="volvia" to="volvía" />
+    <Word from="volvian" to="volvían" />
+    <!-- Most common words ending in eír/oír -->
+    <Word from="reir" to="reír" />
+    <Word from="freir" to="freír" />
+    <Word from="sonreir" to="sonreír" />
+    <Word from="hazmerreir" to="hazmerreír" />
+    <Word from="oir" to="oír" />
+    <Word from="oirlo" to="oírlo" />
+    <Word from="oirte" to="oírte" />
+    <Word from="oirse" to="oírse" />
+    <Word from="oirme" to="oírme" />
+    <Word from="oirle" to="oírle" />
+    <Word from="oirla" to="oírla" />
+    <Word from="oirles" to="oírles" />
+    <Word from="oirnos" to="oírnos" />
+    <Word from="oirlas" to="oírlas" />
+    <!-- Words that take no accent mark -->
+    <Word from="bién" to="bien" />
+    <Word from="crímen" to="crimen" />
+    <Word from="fué" to="fue" />
+    <Word from="fuí" to="fui" />
+    <Word from="quiéres" to="quieres" />
+    <Word from="tí" to="ti" />
+    <Word from="dí" to="di" />
+    <Word from="vá" to="va" />
+    <Word from="vé" to="ve" />
+    <Word from="ví" to="vi" />
+    <Word from="vió" to="vio" />
+    <Word from="ó" to="o" />
+    <Word from="clón" to="clon" />
+    <Word from="dió" to="dio" />
+    <Word from="guión" to="guion" />
+    <Word from="dón" to="don" />
+    <Word from="fé" to="fe" />
+    <Word from="áquel" to="aquel" />
+    <!-- Words where the diacritical accent can be omitted -->
+    <Word from="éste" to="este" />
+    <Word from="ésta" to="esta" />
+    <Word from="éstos" to="estos" />
+    <Word from="éstas" to="estas" />
+    <Word from="ése" to="ese" />
+    <Word from="ésa" to="esa" />
+    <Word from="ésos" to="esos" />
+    <Word from="ésas" to="esas" />
+    <Word from="sólo" to="solo" />
+    <!-- Errors unrelated to accent marks -->
+    <Word from="coktel" to="cóctel" />
+    <Word from="cocktel" to="cóctel" />
+    <Word from="conciente" to="consciente" />
+    <Word from="comenzé" to="comencé" />
+    <Word from="desilucionarte" to="desilusionarte" />
+    <Word from="dijieron" to="dijeron" />
+    <Word from="empezé" to="empecé" />
+    <Word from="hize" to="hice" />
+    <Word from="ilucionarte" to="ilusionarte" />
+    <Word from="inconciente" to="inconsciente" />
+    <Word from="quize" to="quise" />
+    <Word from="quizo" to="quiso" />
+    <Word from="verguenza" to="vergüenza" />
+    <!-- Errors in proper names or country names -->
+    <Word from="Nuñez" to="Núñez" />
+    <Word from="Ivan" to="Iván" />
+    <Word from="Japon" to="Japón" />
+    <Word from="Monica" to="Mónica" />
+    <Word from="Maria" to="María" />
+    <Word from="Jose" to="José" />
+    <Word from="Ramon" to="Ramón" />
+    <Word from="Garcia" to="García" />
+    <Word from="Gonzalez" to="González" />
+    <Word from="Jesus" to="Jesús" />
+    <Word from="Alvarez" to="Álvarez" />
+    <Word from="Damian" to="Damián" />
+    <Word from="Rene" to="René" />
+    <Word from="Nicolas" to="Nicolás" />
+    <Word from="Jonas" to="Jonás" />
+    <Word from="Lopez" to="López" />
+    <Word from="Hernandez" to="Hernández" />
+    <Word from="Bermudez" to="Bermúdez" />
+    <Word from="Fernandez" to="Fernández" />
+    <Word from="Suarez" to="Suárez" />
+    <Word from="Sofia" to="Sofía" />
+    <Word from="Seneca" to="Séneca" />
+    <Word from="Tokyo" to="Tokio" />
+    <Word from="Canada" to="Canadá" />
+    <Word from="Paris" to="París" />
+    <Word from="Turquia" to="Turquía" />
+    <Word from="Mexico" to="México" />
+    <Word from="Mejico" to="México" />
+    <Word from="Matias" to="Matías" />
+    <Word from="Valentin" to="Valentín" />
+    <Word from="mejicano" to="mexicano" />
+    <Word from="mejicanos" to="mexicanos" />
+    <Word from="mejicana" to="mexicana" />
+    <Word from="mejicanas" to="mexicanas" />
+    <!-- Created by SE -->
+    <Word from="io" to="lo" />
+    <Word from="ia" to="la" />
+    <Word from="ie" to="le" />
+    <Word from="Io" to="lo" />
+    <Word from="Ia" to="la" />
+    <Word from="AI" to="Al" />
+    <Word from="Ie" to="le" />
+    <Word from="EI" to="El" />
+    <Word from="subaﬂuente" to="subafluente" />
+    <Word from="aﬂójalo" to="aflójalo" />
+    <Word from="Aﬂójalo" to="Aflójalo" />
+    <Word from="perdi" to="perdí" />
+    <Word from="Podria" to="Podría" />
+    <Word from="confia" to="confía" />
+    <Word from="pasaria" to="pasaría" />
+    <Word from="Podias" to="Podías" />
+    <Word from="responsabke" to="responsable" />
+    <Word from="Todavia" to="Todavía" />
+    <Word from="envien" to="envíen" />
+    <Word from="Queria" to="Quería" />
+    <Word from="tio" to="tío" />
+    <Word from="traido" to="traído" />
+    <Word from="Asi" to="Así" />
+    <Word from="elegi" to="elegí" />
+    <Word from="habria" to="habría" />
+    <Word from="encantaria" to="encantaría" />
+    <Word from="leido" to="leído" />
+    <Word from="conocias" to="conocías" />
+    <Word from="harias" to="harías" />
+    <Word from="Aqui" to="Aquí" />
+    <Word from="decidi" to="decidí" />
+    <Word from="mia" to="mía" />
+    <Word from="Crei" to="Creí" />
+    <Word from="podiamos" to="podíamos" />
+    <Word from="avisame" to="avísame" />
+    <Word from="debia" to="debía" />
+    <Word from="pensarias" to="pensarías" />
+    <Word from="reuniamos" to="reuníamos" />
+    <Word from="POÏ" to="por" />
+    <Word from="vendria" to="vendría" />
+    <Word from="caida" to="caída" />
+    <Word from="venian" to="venían" />
+    <Word from="compañias" to="compañías" />
+    <Word from="leiste" to="leíste" />
+    <Word from="Leiste" to="Leíste" />
+    <Word from="fiaria" to="fiaría" />
+    <Word from="Hungria" to="Hungría" />
+    <Word from="fotografia" to="fotografía" />
+    <Word from="cafeteria" to="cafetería" />
+    <Word from="Digame" to="Dígame" />
+    <Word from="debias" to="debías" />
+    <Word from="tendria" to="tendría" />
+    <Word from="CÏGO" to="creo" />
+    <Word from="anteg" to="antes" />
+    <Word from="SóIo" to="Solo" />
+    <Word from="Ilamándola" to="llamándola" />
+    <Word from="Cáﬂaté" to="Cállate" />
+    <Word from="Ilamaste" to="llamaste" />
+    <Word from="daria" to="daría" />
+    <Word from="Iargaba" to="largaba" />
+    <Word from="Yati" to="Y a ti" />
+    <Word from="querias" to="querías" />
+    <Word from="Iimpiarlo" to="limpiarlo" />
+    <Word from="Iargado" to="largado" />
+    <Word from="galeria" to="galería" />
+    <Word from="Bartomeu" to="Bertomeu" />
+    <Word from="Iocalizarlo" to="localizarlo" />
+    <Word from="Ilámame" to="llámame" />
+  </WholeWords>
+  <PartialWordsAlways />
+  <PartialWords />
+  <PartialLines>
+    <!-- Miscellaneous -->
+    <LinePart from="de gratis" to="gratis" />
+    <LinePart from="si quiera" to="siquiera" />
+    <LinePart from="Cada una de los" to="Cada uno de los" />
+    <LinePart from="Cada uno de las" to="Cada una de las" />
+    <!-- Incorrect use of haber / a ver -->
+    <LinePart from="haber que" to="a ver qué" />
+    <LinePart from="haber qué" to="a ver qué" />
+    <LinePart from="Haber si" to="A ver si" />
+    <!-- Exclamatory or interrogative pronouns, part 1 -->
+    <LinePart from=" que hora" to=" qué hora" />
+    <LinePart from="yo que se" to="yo qué sé" />
+    <LinePart from="Yo que se" to="Yo qué sé" />
+    <!-- Accents at the end of exclamations -->
+    <LinePart from=" tu!" to=" tú!" />
+    <LinePart from=" si!" to=" sí!" />
+    <LinePart from=" mi!" to=" mí!" />
+    <LinePart from=" el!" to=" él!" />
+    <!-- Accents at the end of questions -->
+    <LinePart from=" tu?" to=" tú?" />
+    <LinePart from=" si?" to=" sí?" />
+    <LinePart from=" mi?" to=" mí?" />
+    <LinePart from=" el?" to=" él?" />
+    <LinePart from=" aun?" to=" aún?" />
+    <LinePart from=" mas?" to=" más?" />
+    <LinePart from=" que?" to=" qué?" />
+    <LinePart from=" paso?" to=" pasó?" />
+    <LinePart from=" cuando?" to=" cuándo?" />
+    <LinePart from=" cuanto?" to=" cuánto?" />
+    <LinePart from=" cuanta?" to=" cuánta?" />
+    <LinePart from=" cuantas?" to=" cuántas?" />
+    <LinePart from=" cuantos?" to=" cuántos?" />
+    <LinePart from=" donde?" to=" dónde?" />
+    <LinePart from=" quien?" to=" quién?" />
+    <LinePart from=" como?" to=" cómo?" />
+    <LinePart from=" adonde?" to=" adónde?" />
+    <LinePart from=" cual?" to=" cuál?" />
+    <!-- Accents in complete questions -->
+    <LinePart from="¿Si?" to="¿Sí?" />
+    <LinePart from="¿esta bien?" to="¿está bien?" />
+    <!-- Sentences that are both interrogative and exclamatory -->
+    <LinePart from="¿Pero qué haces?" to="¡¿Pero qué haces?!" />
+    <LinePart from="¿pero qué haces?" to="¡¿pero qué haces?!" />
+    <LinePart from="¿Es que no me has escuchado?" to="¡¿Es que no me has escuchado?!" />
+    <LinePart from="¿es que no me has escuchado?" to="¡¿es que no me has escuchado?!" />
+    <!-- Accents at the start of questions, lowercase -->
+    <LinePart from="¿aun" to="¿aún" />
+    <LinePart from="¿tu " to="¿tú " />
+    <LinePart from="¿que " to="¿qué " />
+    <LinePart from="¿sabes que" to="¿sabes qué" />
+    <LinePart from="¿sabes adonde" to="¿sabes adónde" />
+    <LinePart from="¿sabes cual" to="¿sabes cuál" />
+    <LinePart from="¿sabes quien" to="¿sabes quién" />
+    <LinePart from="¿sabes como" to="¿sabes cómo" />
+    <LinePart from="¿sabes cuan" to="¿sabes cuán" />
+    <LinePart from="¿sabes cuanto" to="¿sabes cuánto" />
+    <LinePart from="¿sabes cuanta" to="¿sabes cuánta" />
+    <LinePart from="¿sabes cuantos" to="¿sabes cuántos" />
+    <LinePart from="¿sabes cuantas" to="¿sabes cuántas" />
+    <LinePart from="¿sabes cuando" to="¿sabes cuándo" />
+    <LinePart from="¿sabes donde" to="¿sabes dónde" />
+    <LinePart from="¿sabe que" to="¿sabe qué" />
+    <LinePart from="¿sabe adonde" to="¿sabe adónde" />
+    <LinePart from="¿sabe cual" to="¿sabe cuál" />
+    <LinePart from="¿sabe quien" to="¿sabe quién" />
+    <LinePart from="¿sabe como" to="¿sabe cómo" />
+    <LinePart from="¿sabe cuan" to="¿sabe cuán" />
+    <LinePart from="¿sabe cuanto" to="¿sabe cuánto" />
+    <LinePart from="¿sabe cuanta" to="¿sabe cuánta" />
+    <LinePart from="¿sabe cuantos" to="¿sabe cuántos" />
+    <LinePart from="¿sabe cuantas" to="¿sabe cuántas" />
+    <LinePart from="¿sabe cuando" to="¿sabe cuándo" />
+    <LinePart from="¿sabe donde" to="¿sabe dónde" />
+    <LinePart from="¿saben que" to="¿saben qué" />
+    <LinePart from="¿saben adonde" to="¿saben adónde" />
+    <LinePart from="¿saben cual" to="¿saben cuál" />
+    <LinePart from="¿saben quien" to="¿saben quién" />
+    <LinePart from="¿saben como" to="¿saben cómo" />
+    <LinePart from="¿saben cuan" to="¿saben cuán" />
+    <LinePart from="¿saben cuanto" to="¿saben cuánto" />
+    <LinePart from="¿saben cuanta" to="¿saben cuánta" />
+    <LinePart from="¿saben cuantos" to="¿saben cuántos" />
+    <LinePart from="¿saben cuantas" to="¿saben cuántas" />
+    <LinePart from="¿saben cuando" to="¿saben cuándo" />
+    <LinePart from="¿saben donde" to="¿saben dónde" />
+    <LinePart from="¿de que" to="¿de qué" />
+    <LinePart from="¿de donde" to="¿de dónde" />
+    <LinePart from="¿de cual" to="¿de cuál" />
+    <LinePart from="¿de quien" to="¿de quién" />
+    <LinePart from="¿de cuanto" to="¿de cuánto" />
+    <LinePart from="¿de cuanta" to="¿de cuánta" />
+    <LinePart from="¿de cuantos" to="¿de cuántos" />
+    <LinePart from="¿de cuantas" to="¿de cuántas" />
+    <LinePart from="¿de cuando" to="¿de cuándo" />
+    <LinePart from="¿sobre que" to="¿sobre qué" />
+    <LinePart from="¿como " to="¿cómo " />
+    <LinePart from="¿cual " to="¿cuál " />
+    <LinePart from="¿en cual" to="¿en cuál" />
+    <LinePart from="¿cuando" to="¿cuándo" />
+    <LinePart from="¿hasta cual" to="¿hasta cuál" />
+    <LinePart from="¿hasta quien" to="¿hasta quién" />
+    <LinePart from="¿hasta cuanto" to="¿hasta cuánto" />
+    <LinePart from="¿hasta cuantas" to="¿hasta cuántas" />
+    <LinePart from="¿hasta cuantos" to="¿hasta cuántos" />
+    <LinePart from="¿hasta cuando" to="¿hasta cuándo" />
+    <LinePart from="¿hasta donde" to="¿hasta dónde" />
+    <LinePart from="¿hasta que" to="¿hasta qué" />
+    <LinePart from="¿hasta adonde" to="¿hasta adónde" />
+    <LinePart from="¿desde que" to="¿desde qué" />
+    <LinePart from="¿desde cuando" to="¿desde cuándo" />
+    <LinePart from="¿desde quien" to="¿desde quién" />
+    <LinePart from="¿desde donde" to="¿desde dónde" />
+    <LinePart from="¿cuanto" to="¿cuánto" />
+    <LinePart from="¿cuantos" to="¿cuántos" />
+    <LinePart from="¿donde" to="¿dónde" />
+    <LinePart from="¿adonde" to="¿adónde" />
+    <LinePart from="¿con que" to="¿con qué" />
+    <LinePart from="¿con cual" to="¿con cuál" />
+    <LinePart from="¿con quien" to="¿con quién" />
+    <LinePart from="¿con cuantos" to="¿con cuántos" />
+    <LinePart from="¿con cuantas" to="¿con cuántas" />
+    <LinePart from="¿con cuanta" to="¿con cuánta" />
+    <LinePart from="¿con cuanto" to="¿con cuánto" />
+    <LinePart from="¿para donde" to="¿para dónde" />
+    <LinePart from="¿para adonde" to="¿para adónde" />
+    <LinePart from="¿para cuando" to="¿para cuándo" />
+    <LinePart from="¿para que" to="¿para qué" />
+    <LinePart from="¿para quien" to="¿para quién" />
+    <LinePart from="¿para cuanto" to="¿para cuánto" />
+    <LinePart from="¿para cuanta" to="¿para cuánta" />
+    <LinePart from="¿para cuantos" to="¿para cuántos" />
+    <LinePart from="¿para cuantas" to="¿para cuántas" />
+    <LinePart from="¿a donde" to="¿a dónde" />
+    <LinePart from="¿a que" to="¿a qué" />
+    <LinePart from="¿a cual" to="¿a cuál" />
+    <LinePart from="¿a quien" to="¿a quién" />
+    <LinePart from="¿a como" to="¿a cómo" />
+    <LinePart from="¿a cuanto" to="¿a cuánto" />
+    <LinePart from="¿a cuanta" to="¿a cuánta" />
+    <LinePart from="¿a cuantos" to="¿a cuántos" />
+    <LinePart from="¿a cuantas" to="¿a cuántas" />
+    <LinePart from="¿por que" to="¿por qué" />
+    <LinePart from="¿por cual" to="¿por cuál" />
+    <LinePart from="¿por quien" to="¿por quién" />
+    <LinePart from="¿por cuanto" to="¿por cuánto" />
+    <LinePart from="¿por cuanta" to="¿por cuánta" />
+    <LinePart from="¿por cuantos" to="¿por cuántos" />
+    <LinePart from="¿por cuantas" to="¿por cuántas" />
+    <LinePart from="¿por donde" to="¿por dónde" />
+    <LinePart from="¿porque" to="¿por qué" />
+    <LinePart from="¿porqué" to="¿por qué" />
+    <LinePart from="¿y que" to="¿y qué" />
+    <LinePart from="¿y como" to="¿y cómo" />
+    <LinePart from="¿y cuando" to="¿y cuándo" />
+    <LinePart from="¿y cual" to="¿y cuál" />
+    <LinePart from="¿y quien" to="¿y quién" />
+    <LinePart from="¿y cuanto" to="¿y cuánto" />
+    <LinePart from="¿y cuanta" to="¿y cuánta" />
+    <LinePart from="¿y cuantos" to="¿y cuántos" />
+    <LinePart from="¿y cuantas" to="¿y cuántas" />
+    <LinePart from="¿y donde" to="¿y dónde" />
+    <LinePart from="¿y adonde" to="¿y adónde" />
+    <LinePart from="¿quien " to="¿quién " />
+    <LinePart from="¿esta " to="¿está " />
+    <LinePart from="¿estas " to="¿estás " />
+    <!-- Accents at the start of questions, uppercase -->
+    <LinePart from="¿Aun" to="¿Aún" />
+    <LinePart from="¿Que " to="¿Qué " />
+    <LinePart from="¿Sabes que" to="¿Sabes qué" />
+    <LinePart from="¿Sabes adonde" to="¿Sabes adónde" />
+    <LinePart from="¿Sabes cual" to="¿Sabes cuál" />
+    <LinePart from="¿Sabes quien" to="¿Sabes quién" />
+    <LinePart from="¿Sabes como" to="¿Sabes cómo" />
+    <LinePart from="¿Sabes cuan" to="¿Sabes cuán" />
+    <LinePart from="¿Sabes cuanto" to="¿Sabes cuánto" />
+    <LinePart from="¿Sabes cuanta" to="¿Sabes cuánta" />
+    <LinePart from="¿Sabes cuantos" to="¿Sabes cuántos" />
+    <LinePart from="¿Sabes cuantas" to="¿Sabes cuántas" />
+    <LinePart from="¿Sabes cuando" to="¿Sabes cuándo" />
+    <LinePart from="¿Sabes donde" to="¿Sabes dónde" />
+    <LinePart from="¿Sabe que" to="¿Sabe qué" />
+    <LinePart from="¿Sabe adonde" to="¿Sabe adónde" />
+    <LinePart from="¿Sabe cual" to="¿Sabe cuál" />
+    <LinePart from="¿Sabe quien" to="¿Sabe quién" />
+    <LinePart from="¿Sabe como" to="¿Sabe cómo" />
+    <LinePart from="¿Sabe cuan" to="¿Sabe cuán" />
+    <LinePart from="¿Sabe cuanto" to="¿Sabe cuánto" />
+    <LinePart from="¿Sabe cuanta" to="¿Sabe cuánta" />
+    <LinePart from="¿Sabe cuantos" to="¿Sabe cuántos" />
+    <LinePart from="¿Sabe cuantas" to="¿Sabe cuántas" />
+    <LinePart from="¿Sabe cuando" to="¿Sabe cuándo" />
+    <LinePart from="¿Sabe donde" to="¿Sabe dónde" />
+    <LinePart from="¿Saben que" to="¿Saben qué" />
+    <LinePart from="¿Saben adonde" to="¿Saben adónde" />
+    <LinePart from="¿Saben cual" to="¿Saben cuál" />
+    <LinePart from="¿Saben quien" to="¿Saben quién" />
+    <LinePart from="¿Saben como" to="¿Saben cómo" />
+    <LinePart from="¿Saben cuan" to="¿Saben cuán" />
+    <LinePart from="¿Saben cuanto" to="¿Saben cuánto" />
+    <LinePart from="¿Saben cuanta" to="¿Saben cuánta" />
+    <LinePart from="¿Saben cuantos" to="¿Saben cuántos" />
+    <LinePart from="¿Saben cuantas" to="¿Saben cuántas" />
+    <LinePart from="¿Saben cuando" to="¿Saben cuándo" />
+    <LinePart from="¿Saben donde" to="¿Saben dónde" />
+    <LinePart from="¿De que" to="¿De qué" />
+    <LinePart from="¿De donde" to="¿De dónde" />
+    <LinePart from="¿De cual" to="¿De cuál" />
+    <LinePart from="¿De quien" to="¿De quién" />
+    <LinePart from="¿De cuanto" to="¿De cuánto" />
+    <LinePart from="¿De cuanta" to="¿De cuánta" />
+    <LinePart from="¿De cuantos" to="¿De cuántos" />
+    <LinePart from="¿De cuantas" to="¿De cuántas" />
+    <LinePart from="¿De cuando" to="¿De cuándo" />
+    <LinePart from="¿Desde que" to="¿Desde qué" />
+    <LinePart from="¿Desde cuando" to="¿Desde cuándo" />
+    <LinePart from="¿Desde quien" to="¿Desde quién" />
+    <LinePart from="¿Desde donde" to="¿Desde dónde" />
+    <LinePart from="¿Sobre que" to="¿Sobre qué" />
+    <LinePart from="¿Como " to="¿Cómo " />
+    <LinePart from="¿Cual " to="¿Cuál " />
+    <LinePart from="¿En cual" to="¿En cuál" />
+    <LinePart from="¿Cuando" to="¿Cuándo" />
+    <LinePart from="¿Hasta cual" to="¿Hasta cuál" />
+    <LinePart from="¿Hasta quien" to="¿Hasta quién" />
+    <LinePart from="¿Hasta cuanto" to="¿Hasta cuánto" />
+    <LinePart from="¿Hasta cuantas" to="¿Hasta cuántas" />
+    <LinePart from="¿Hasta cuantos" to="¿Hasta cuántos" />
+    <LinePart from="¿Hasta cuando" to="¿Hasta cuándo" />
+    <LinePart from="¿Hasta donde" to="¿Hasta dónde" />
+    <LinePart from="¿Hasta que" to="¿Hasta qué" />
+    <LinePart from="¿Hasta adonde" to="¿Hasta adónde" />
+    <LinePart from="¿Cuanto" to="¿Cuánto" />
+    <LinePart from="¿Cuantos" to="¿Cuántos" />
+    <LinePart from="¿Donde" to="¿Dónde" />
+    <LinePart from="¿Adonde" to="¿Adónde" />
+    <LinePart from="¿Con que" to="¿Con qué" />
+    <LinePart from="¿Con cual" to="¿Con cuál" />
+    <LinePart from="¿Con quien" to="¿Con quién" />
+    <LinePart from="¿Con cuantos" to="¿Con cuántos" />
+    <LinePart from="¿Con cuantas" to="¿Con cuántas" />
+    <LinePart from="¿Con cuanta" to="¿Con cuánta" />
+    <LinePart from="¿Con cuanto" to="¿Con cuánto" />
+    <LinePart from="¿Para donde" to="¿Para dónde" />
+    <LinePart from="¿Para adonde" to="¿Para adónde" />
+    <LinePart from="¿Para cuando" to="¿Para cuándo" />
+    <LinePart from="¿Para que" to="¿Para qué" />
+    <LinePart from="¿Para quien" to="¿Para quién" />
+    <LinePart from="¿Para cuanto" to="¿Para cuánto" />
+    <LinePart from="¿Para cuanta" to="¿Para cuánta" />
+    <LinePart from="¿Para cuantos" to="¿Para cuántos" />
+    <LinePart from="¿Para cuantas" to="¿Para cuántas" />
+    <LinePart from="¿A donde" to="¿A dónde" />
+    <LinePart from="¿A que" to="¿A qué" />
+    <LinePart from="¿A cual" to="¿A cuál" />
+    <LinePart from="¿A quien" to="¿A quién" />
+    <LinePart from="¿A como" to="¿A cómo" />
+    <LinePart from="¿A cuanto" to="¿A cuánto" />
+    <LinePart from="¿A cuanta" to="¿A cuánta" />
+    <LinePart from="¿A cuantos" to="¿A cuántos" />
+    <LinePart from="¿A cuantas" to="¿A cuántas" />
+    <LinePart from="¿Por que" to="¿Por qué" />
+    <LinePart from="¿Por cual" to="¿Por cuál" />
+    <LinePart from="¿Por quien" to="¿Por quién" />
+    <LinePart from="¿Por cuanto" to="¿Por cuánto" />
+    <LinePart from="¿Por cuanta" to="¿Por cuánta" />
+    <LinePart from="¿Por cuantos" to="¿Por cuántos" />
+    <LinePart from="¿Por cuantas" to="¿Por cuántas" />
+    <LinePart from="¿Por donde" to="¿Por dónde" />
+    <LinePart from="¿Porque" to="¿Por qué" />
+    <LinePart from="¿Porqué" to="¿Por qué" />
+    <LinePart from="¿Y que" to="¿Y qué" />
+    <LinePart from="¿Y como" to="¿Y cómo" />
+    <LinePart from="¿Y cuando" to="¿Y cuándo" />
+    <LinePart from="¿Y cual" to="¿Y cuál" />
+    <LinePart from="¿Y quien" to="¿Y quién" />
+    <LinePart from="¿Y cuanto" to="¿Y cuánto" />
+    <LinePart from="¿Y cuanta" to="¿Y cuánta" />
+    <LinePart from="¿Y cuantos" to="¿Y cuántos" />
+    <LinePart from="¿Y cuantas" to="¿Y cuántas" />
+    <LinePart from="¿Y donde" to="¿Y dónde" />
+    <LinePart from="¿Y adonde" to="¿Y adónde" />
+    <LinePart from="¿Quien " to="¿Quién " />
+    <LinePart from="¿Esta " to="¿Está " />
+    <!-- Diacritical accent in indirect interrogative or exclamatory sentences -->
+    <LinePart from="el porque" to="el porqué" />
+    <LinePart from="su porque" to="su porqué" />
+    <LinePart from="los porques" to="los porqués" />
+    <!-- aún -->
+    <LinePart from="aun," to="aún," />
+    <LinePart from="aun no" to="aún no" />
+    <!-- dé -->
+    <LinePart from=" de y " to=" dé y " />
+    <LinePart from=" nos de " to=" nos dé " />
+    <!-- tú -->
+    <LinePart from=" tu ya " to=" tú ya " />
+    <LinePart from="Tu ya " to="Tú ya " />
+    <!-- Specific cases before a comma -->
+    <LinePart from=" de, " to=" dé, " />
+    <LinePart from=" mi, " to=" mí, " />
+    <LinePart from=" tu, " to=" tú, " />
+    <LinePart from=" el, " to=" él, " />
+    <LinePart from=" te, " to=" té, " />
+    <LinePart from=" mas, " to=" más, " />
+    <LinePart from=" quien, " to=" quién, " />
+    <LinePart from=" cual," to=" cuál," />
+    <LinePart from="porque, " to="porqué, " />
+    <LinePart from="cuanto, " to="cuánto, " />
+    <LinePart from="cuando, " to="cuándo, " />
+    <!-- sé -->
+    <LinePart from=" se," to=" sé," />
+    <LinePart from="se donde" to="sé dónde" />
+    <LinePart from="se cuando" to="sé cuándo" />
+    <LinePart from="se adonde" to="sé adónde" />
+    <LinePart from="se como" to="sé cómo" />
+    <LinePart from="se cual" to="sé cuál" />
+    <LinePart from="se quien" to="sé quién" />
+    <LinePart from="se cuanto" to="sé cuánto" />
+    <LinePart from="se cuanta" to="sé cuánta" />
+    <LinePart from="se cuantos" to="sé cuántos" />
+    <LinePart from="se cuantas" to="sé cuántas" />
+    <LinePart from="se cuan" to="sé cuán" />
+    <!-- si/sí -->
+    <LinePart from=" el si " to=" el sí " />
+    <LinePart from="si mismo" to="sí mismo" />
+    <LinePart from="si misma" to="sí misma" />
+    <!-- "l" instead of "i" errors in specific cases -->
+    <LinePart from=" llegal" to=" ilegal" />
+    <LinePart from=" lluminar" to=" iluminar" />
+    <LinePart from="sllbato" to="silbato" />
+    <LinePart from="sllenclo" to="silencio" />
+    <LinePart from="clemencla" to="clemencia" />
+    <LinePart from="socledad" to="sociedad" />
+    <LinePart from="tlene" to="tiene" />
+    <LinePart from="tlempo" to="tiempo" />
+    <LinePart from="equlvocaba" to="equivocaba" />
+    <LinePart from="qulnce" to="quince" />
+    <LinePart from="comlen" to="comien" />
+    <LinePart from="historl" to="histori" />
+    <LinePart from="misterl" to="misteri" />
+    <LinePart from="vivencl" to="vivenci" />
+  </PartialLines>
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines>
+    <Ending from=".»." to="»." />
+  </EndLines>
+  <WholeLines>
+    <!-- All lines -->
+    <Line from="No" to="No." />
+  </WholeLines>
+  <RegularExpressions>
+    <!-- Compound abbreviations -->
+    <RegEx find="\b[Ss](r|ra|rta)\b\.?" replaceWith="S$1." />
+    <RegEx find="\b[Dd](r|ra)\b\.?" replaceWith="D$1." />
+    <RegEx find="\b[Uu](d|ds)\b\.?" replaceWith="U$1." />
+    <RegEx find="(\d)(\s){0,1}([Aa])(\.){0,1}([Mm])(\.){0,1}(\W){0,1}" replaceWith="$1 a. m.$7" />
+    <RegEx find="(\d)(\s){0,1}([Pp])(\.){0,1}([Mm])(\.){0,1}(\W){0,1}" replaceWith="$1 p. m.$7" />
+    <RegEx find="(\d)(\s){0,1}(h)(s\b|r\b|rs\b){0,1}(\.){0,1}(\W){0,1}" replaceWith="$1 $3$6" />
+    <RegEx find="(\d)(\s){0,1}([Kk])(m\b|ms\b)(\.){0,1}(\W){0,1}" replaceWith="$1 km$6" />
+    <RegEx find="(\d)(\s){0,1}(s)(g\b|eg\b){0,1}(\.){0,1}(\W){0,1}" replaceWith="$1 s$6" />
+    <RegEx find="(\d)(\s){0,1}([Kk])(g\b|gs\b)(\.){0,1}(\W){0,1}" replaceWith="$1 kg$6" />
+    <RegEx find="(\d)(\s){0,1}(m)(t\b|ts\b){0,1}(\.){0,1}(\W){0,1}" replaceWith="$1 m$6" />
+    <RegEx find="(\d)KBs(\W){0,1}" replaceWith="$1 kB$2" />
+    <RegEx find="([Nn])°(\s){0,1}(\d)" replaceWith="$1.° $3" />
+    <RegEx find="([Nn])ro(\.){0,1}(\s){0,1}(\d)" replaceWith="$1.° $4" />
+    <!-- Inverted punctuation marks -->
+    <RegEx find="\?¿(\W|\w)" replaceWith="? ¿$1" />
+    <RegEx find="\!¡(\W|\w)" replaceWith="! ¡$1" />
+    <RegEx find="\?¿¿(\W|\w)" replaceWith="? ¿$1" />
+    <RegEx find="\!¡¡(\W|\w)" replaceWith="! ¡$1" />
+    <!-- Start of line -->
+    <RegEx find="^_(\s)" replaceWith="-$1" />
+    <RegEx find="^_(\w)" replaceWith="- $1" />
+    <!-- Quotation mark usage per the RAE and Wikipedia recommendations -->
+    <RegEx find="(«[^“«»]+)«" replaceWith="$1“" />
+    <RegEx find="(“[^«»”]+)»" replaceWith="$1”" />
+    <RegEx find="`" replaceWith="‘" />
+    <RegEx find="´" replaceWith="’" />
+    <RegEx find="([\wá-ú])(\.)(«|»)" replaceWith="$1»." />
+    <RegEx find="«(\?)" replaceWith="»?" />
+    <RegEx find="«(\!)" replaceWith="»!" />
+    <RegEx find="«\s" replaceWith="» " />
+    <RegEx find="«(\))" replaceWith="»)" />
+    <RegEx find="(\?)«" replaceWith="?»" />
+    <RegEx find="(\!)«" replaceWith="!»" />
+    <RegEx find="«(,)" replaceWith="»," />
+    <RegEx find="«(;)" replaceWith="»;" />
+    <RegEx find="«(:)" replaceWith="»:" />
+    <RegEx find="(¿)»" replaceWith="¿«" />
+    <RegEx find="(¡)»" replaceWith="¡«" />
+    <!-- Quotation mark usage (ANSI) per the RAE recommendation («\x22» is the «"» character) -->
+    <RegEx find="([\wá-ú])([\.,]) ?[\x22»]" replaceWith="$1»$2" />
+    <RegEx find="([\wá-ú])\?[\x22»](\s|$)" replaceWith="$1?».$2" />
+    <RegEx find="^(\.\.\.)(\s){0,1}\x22" replaceWith="$1«" />
+    <RegEx find="«\x22" replaceWith="«" />
+    <RegEx find="\x22»" replaceWith="»" />
+    <RegEx find="^\x22{2,}" replaceWith="«" />
+    <RegEx find="\x22{2,}$" replaceWith="»" />
+    <RegEx find="\x22\r" replaceWith="»" />
+    <RegEx find="^\x22" replaceWith="«" />
+    <RegEx find="\x22$" replaceWith="»." />
+    <RegEx find="([\wá-ú])\.[\x22»]" replaceWith="$1»." />
+    <RegEx find="\s\x22" replaceWith=" «" />
+    <RegEx find="\x22\s" replaceWith="» " />
+    <RegEx find="\x22(,)" replaceWith="»," />
+    <RegEx find="\x22(\.)" replaceWith="»." />
+    <RegEx find="\x22(;)" replaceWith="»;" />
+    <RegEx find="\x22(:)" replaceWith="»:" />
+    <RegEx find="(\!)\x22" replaceWith="!»" />
+    <RegEx find="\x22(\!)" replaceWith="»!" />
+    <RegEx find="(\?)\x22" replaceWith="?»" />
+    <RegEx find="\x22(\?)" replaceWith="»?" />
+    <RegEx find="\x22(¿)" replaceWith="«¿" />
+    <RegEx find="(¿)\x22" replaceWith="¿«" />
+    <RegEx find="\x22(¡)" replaceWith="«¡" />
+    <RegEx find="(¡)\x22" replaceWith="¡«" />
+    <RegEx find="\x22(\))" replaceWith="»)" />
+    <RegEx find="(\))\x22" replaceWith=")»" />
+    <RegEx find="(\()\x22" replaceWith="(«" />
+    <!-- Quotation mark usage (Unicode) per the RAE recommendation («\u0022» is the «"» character) -->
+    <RegEx find="^(\.\.\.)(\s){0,1}\u0022" replaceWith="$1«" />
+    <RegEx find="^\u0022{2,}" replaceWith="«" />
+    <RegEx find="\u0022{2,}$" replaceWith="»" />
+    <RegEx find="\u0022\r" replaceWith="»" />
+    <RegEx find="^\u0022" replaceWith="«" />
+    <RegEx find="\u0022$" replaceWith="»" />
+    <RegEx find="(\w)(\.)\u0022" replaceWith="$1»." />
+    <RegEx find="\s\u0022" replaceWith=" «" />
+    <RegEx find="\u0022\s" replaceWith="» " />
+    <RegEx find="\u0022(,)" replaceWith="»," />
+    <RegEx find="\u0022(\.)" replaceWith="»." />
+    <RegEx find="\u0022(;)" replaceWith="»;" />
+    <RegEx find="\u0022(:)" replaceWith="»:" />
+    <RegEx find="(\!)\u0022" replaceWith="!»" />
+    <RegEx find="\u0022(\!)" replaceWith="»!" />
+    <RegEx find="(\?)\u0022" replaceWith="?»" />
+    <RegEx find="\u0022(\?)" replaceWith="»?" />
+    <RegEx find="\u0022(¿)" replaceWith="«¿" />
+    <RegEx find="(¿)\u0022" replaceWith="¿«" />
+    <RegEx find="\u0022(¡)" replaceWith="«¡" />
+    <RegEx find="(¡)\u0022" replaceWith="¡«" />
+    <RegEx find="\u0022(\))" replaceWith="»)" />
+    <RegEx find="(\))\u0022" replaceWith=")»" />
+    <RegEx find="(\()\u0022" replaceWith="(«" />
+    <!-- Numbers -->
+    <RegEx find="([0-9])\.([0-9])\b" replaceWith="$1,$2" />
+    <RegEx find="(^|\s|[¡¿«])([0-9])(,|\.)?([0-9]{3})\b" replaceWith="$1$2$4" />
+    <RegEx find="(\d)\s(?=\d{2}\b)" replaceWith="$1-" />
+    <!-- "1 :", "2 :"... "n :" to "n:" -->
+    <RegEx find="(\d) ([:;])" replaceWith="$1$2" />
+    <!-- Fix commas and periods, e.g. «, ,» to «,» and «,,,» or similar to «...» -->
+    <RegEx find="(\.\.\.+)$" replaceWith="..." />
+    <RegEx find="(, ,+)$" replaceWith="," />
+    <RegEx find="(,\s),+\s" replaceWith="$1" />
+    <RegEx find="(\.\.\.),$" replaceWith="$1" />
+    <RegEx find="([\wá-ú])(\.\.)$" replaceWith="$1." />
+    <!-- Unnecessary periods (supplement) -->
+    <RegEx find="([\w\W]\.{3})([¡¿])" replaceWith="$1 $2" />
+    <RegEx find="(\w)\.\.(\s)" replaceWith="$1.$2" />
+    <RegEx find="([\wá-ú\x22»])\.([\?\!])" replaceWith="$1$2" />
+    <RegEx find="([\:\;])\." replaceWith="$1" />
+    <RegEx find="\.([\:\;])" replaceWith="$1" />
+    <RegEx find="\:+" replaceWith=":" />
+    <!-- ción/sión endings -->
+    <RegEx find="([sc]i)o(n)\b" replaceWith="$1ó$2" />
+    <RegEx find="([SC]I)O(N)\b" replaceWith="$1Ó$2" />
+    <!-- "i" instead of "l" in «clón» endings -->
+    <RegEx find="clón\b" replaceWith="ción" />
+    <!-- "si" instead of "sl" -->
+    <RegEx find="\b([Ss])(l)\b" replaceWith="$1i" />
+    <!-- Fixes e.g. raclones, perforaclones, opclones, etc. -->
+    <RegEx find="([Rr]ac)l(o)" replaceWith="$1i$2" />
+    <RegEx find="([Oo]pc)l(o)" replaceWith="$1i$2" />
+    <!-- Fixes e.g. tenldo, víctlmas, olvldarlo, legítlmo, etc. -->
+    <RegEx find="([BbCcDdFfHhMmNnRrSsTtVv])l([bcdhmnrstv])" replaceWith="$1i$2" />
+    <!-- Fixes ripping errors where capital «O» is confused with zero «0» and vice versa -->
+    <RegEx find="(\d)O" replaceWith="$1 0" />
+    <RegEx find="(\d)[,\.]O" replaceWith="$1.0" />
+    <RegEx find="([A-Z])0" replaceWith="$1O" />
+    <RegEx find="\b0([A-Za-z])" replaceWith="O$1" />
+    <!-- Music symbols -->
+    <RegEx find="[♪♫☺☹♥©☮☯Σ∞≡⇒π#](\r\n)[♪♫☺☹♥©☮☯Σ∞≡⇒π#]" replaceWith="$1" />
+    <!-- Diacritical accent before a period -->
+    <RegEx find="(\s)([dst])e\.(\s|$)" replaceWith="$1$2é.$3" />
+    <RegEx find="(\s)mi\.(\s|$)" replaceWith="$1mí.$2" />
+    <RegEx find="(\s)el\.(\s|$)" replaceWith="$1él.$2" />
+    <RegEx find="(\s)tu\.(\s|$)" replaceWith="$1tú.$2" />
+    <RegEx find="(\s)si\.(\s|$)" replaceWith="$1sí.$2" />
+    <RegEx find="(\s)aun\.(\s|$)" replaceWith="$1aún.$2" />
+    <RegEx find="(\s)mas\.(\s|$)" replaceWith="$1más.$2" />
+    <RegEx find="(\s)quien\.(\s|$)" replaceWith="$1quién.$2" />
+    <RegEx find="(\s)cual\.(\s|$)" replaceWith="$1cuál.$2" />
+    <RegEx find="(\s)que\.(\s|$)" replaceWith="$1qué.$2" />
+    <RegEx find="(\s)porque\.(\s|$)" replaceWith="$1porqué.$2" />
+    <RegEx find="(\s)cuanto\.(\s|$)" replaceWith="$1cuánto.$2" />
+    <RegEx find="(\s)cuando\.(\s|$)" replaceWith="$1cuándo.$2" />
+    <!-- Prefixes; compound words (simple) -->
+    <RegEx find="(\b[Ee]x|\b[Ss]uper|\b[Aa]nti|\b[Pp]os|\b[Pp]re|\b[Pp]ro|\b[Vv]ice)[\s\x2D]([a-zá-ú]{3,20})(\b)" replaceWith="$1$2" />
+    <!-- Prefixes; compound words (numbers) -->
+    <RegEx find="(\b[Ss]ub|\b[Ss]uper)[\s\x2D](\d{2})(\b)" replaceWith="$1-$2$3" />
+    <!-- Prefixes; compound words (uppercase) -->
+    <RegEx find="(\b[Aa]nti|\b[Mm]ini|\b[Pp]os|\b[Pp]ro)\s([A-Z]{1,10})([A-Z][a-zá-ú]){0,10}(\b)" replaceWith="$1-$2$3" />
+    <!-- Capitalization after a colon -->
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(a)" replaceWith="$1A" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(b)" replaceWith="$1B" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(c)" replaceWith="$1C" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(d)" replaceWith="$1D" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(e)" replaceWith="$1E" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(f)" replaceWith="$1F" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(g)" replaceWith="$1G" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(h)" replaceWith="$1H" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(i)" replaceWith="$1I" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(j)" replaceWith="$1J" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(k)" replaceWith="$1K" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(l)" replaceWith="$1L" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(m)" replaceWith="$1M" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(n)" replaceWith="$1N" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(ñ)" replaceWith="$1Ñ" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(o)" replaceWith="$1O" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(p)" replaceWith="$1P" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(q)" replaceWith="$1Q" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(r)" replaceWith="$1R" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(s)" replaceWith="$1S" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(t)" replaceWith="$1T" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(u)" replaceWith="$1U" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(v)" replaceWith="$1V" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(w)" replaceWith="$1W" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(x)" replaceWith="$1X" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(y)" replaceWith="$1Y" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(z)" replaceWith="$1Z" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(á)" replaceWith="$1Á" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(é)" replaceWith="$1É" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(í)" replaceWith="$1Í" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(ó)" replaceWith="$1Ó" />
+    <RegEx find="([\wá-ú]:\s[«\x22]?)(ú)" replaceWith="$1Ú" />
+    <!-- Correct comma usage -->
+    <RegEx find="(\b[Pp]ero),(\s)([¡¿])" replaceWith="$1$2$3" />
+    <RegEx find="(\b[Aa]unque),(\s|$)" replaceWith="$1$2" />
+    <!-- Vocatives -->
+    <RegEx find="(\bHola|\bBueno|\bBien|\bVen|\bVen acá|\besto|\bBuenos días|\bFeliz cumpleaños|\bsiento)\s([A-Z][a-zá-ú]{3,12}\b|seño(r|ra|rita)\b|hij(o|a) mío\b|amig(o|a)\b)" replaceWith="$1, $2" />
+    <!-- «aún» when it is a synonym of «incluso» or «hasta» (the accent is dropped) -->
+    <RegEx find="(\W|^)(\b[Aa])ú(n)(\s)(así\b|cuando\b|los\b|las\b|negar(te|se)\b)" replaceWith="$1$2u$3$4$5" />
+    <RegEx find="(\b[Nn]i)(\s)(a)ú(n)(\W|$)" replaceWith="$1$2$3u$4$5" />
+    <!-- «sí» -->
+    <RegEx find="\b([Ss])i(:|;|\.)" replaceWith="$1í$2" />
+    <!-- «sé» -->
+    <RegEx find="(\b[Ll]o|\b[Ll]a|\b[Ll]e)(\s)se(\W|$)" replaceWith="$1$2sé$3" />
+    <RegEx find="[Ss]e\s(dónde\b|cuándo\b|adónde\b|cómo\b|cuál\b|quién\b|cuánto\b|cuánta\b|cuántos\b|cuántas\b|cuán\b)" replaceWith="sé $1" />
+    <!-- «té» -->
+    <RegEx find="\b([Tt])e\s(verde\b|negro\b|perla\b|de manzanilla\b|de lim[óo]n\b|de jazm[íi]n\b)" replaceWith="$1é $2" />
+    <!-- Apostrophe -->
+    <RegEx find="(\b[A-Z][a-zá-ú]{3,12})\s(’|')(\d\d(\s|$))" replaceWith="$1 $3" />
+    <RegEx find="(\b[A-Z]{2,5})(’|')(s)" replaceWith="$1$3" />
+    <RegEx find="(\b\d{1,2})(’|')(\d{2})\s(s|m)(\W|$)" replaceWith="$1,$3 $4$5" />
+    <RegEx find="(\b\d{1,2})(’|')(\d{2})\s(h)(\W|$)" replaceWith="$1:$3 $4$5" />
+    <!-- Percentage (must be preceded by a space) -->
+    <RegEx find="(\b\d{1,3})%(\W)" replaceWith="$1 %$2" />
+    <!-- Haz/has -->
+    <RegEx find="(\b)([Hh])as\s(la\b|lo\b|clic\b)(\W)" replaceWith="$1$2az $3$4" />
+    <RegEx find="(\b)([Hh])az\s(de\b)(\W)" replaceWith="$1$2as $3$4" />
+    <RegEx find="(\b)([Hh])as(le\b|nos\b|me\b)(\W)" replaceWith="$1$2az$3$4" />
+    <!-- Remove italics around 3 or fewer characters -->
+    <RegEx find="\x3ci\x3e(.{1,3})\x3c\/i\x3e" replaceWith="$1" />
+    <!-- Miscellaneous -->
+    <RegEx find="(\b[Cc]erca|\b[Ee]ncima|\b[Dd]ebajo|\b[Dd]etrás|\b[Dd]elante)(\s)mío" replaceWith="$1 de mí" />
+    <RegEx find="(\b[Cc]erca|\b[Ee]ncima|\b[Dd]ebajo|\b[Dd]etrás|\b[Dd]elante)(\s)tuyo" replaceWith="$1 de ti" />
+    <!-- Period before «¿» and «¡» -->
+    <RegEx find="([\wá-ú»])\s(?=(¿|¡)[A-ZÁ-Ú])" replaceWith="$1. " />
+    <!-- Space after the dash -->
+    <RegEx find="(^|\n)(-)([^\s])" replaceWith="$1$2 $3" />
+    <!-- Period before the dash -->
+    <RegEx find="([^\.\?\!]) - " replaceWith="$1. - " />
+    <!-- Endings in «ólogo», «ílogo» and «álogo» -->
+    <RegEx find="\Bo(log[ao]s?\b)" replaceWith="ó$1" />
+    <RegEx find="\Ba(log[ao]s?\b)" replaceWith="á$1" />
+    <RegEx find="\Bi(log[ao]s?\b)" replaceWith="í$1" />
+  </RegularExpressions>
+</OCRFixReplaceList>
@@ -0,0 +1,234 @@
+<!-- Credit goes to: MilanRS [http://www.prijevodi-online.org] -->
+<OCRFixReplaceList>
+  <WholeWords>
+    <Word from="ču" to="ću" />
+    <Word from="češ" to="ćeš" />
+    <Word from="če" to="će" />
+    <Word from="ćš" to="ćeš" />
+    <Word from="ćmo" to="ćemo" />
+    <Word from="ćte" to="ćete" />
+    <Word from="čemo" to="ćemo" />
+    <Word from="čete" to="ćete" />
+    <Word from="djete" to="dijete" />
+    <Word from="Hey" to="Hej" />
+    <Word from="hey" to="hej" />
+    <Word from="htjeo" to="htio" />
+    <Word from="Hočeš" to="Hoćeš" />
+    <Word from="hočeš" to="hoćeš" />
+    <Word from="iči" to="ići" />
+    <Word from="jel" to="je l'" />
+    <Word from="Jel" to="Je l'" />
+    <Word from="nedaj" to="ne daj" />
+    <Word from="Rješit" to="Riješit" />
+    <Word from="smjeo" to="smio" />
+    <Word from="uopče" to="uopće" />
+    <Word from="valda" to="valjda" />
+    <Word from="želila" to="željela" />
+  </WholeWords>
+  <PartialWordsAlways />
+  <PartialWords>
+    <WordPart from="¤" to="o" />
+    <WordPart from="vv" to="w" />
+    <WordPart from="IVI" to="M" />
+    <WordPart from="lVI" to="M" />
+    <WordPart from="IVl" to="M" />
+    <WordPart from="lVl" to="M" />
+  </PartialWords>
+  <PartialLines>
+    <LinePart from="bi smo" to="bismo" />
+    <LinePart from="dali je" to="da li je" />
+    <LinePart from="dali si" to="da li si" />
+    <LinePart from="Dali si" to="Da li si" />
+    <LinePart from="Jel sam ti" to="Jesam li ti" />
+    <LinePart from="Jel si" to="Jesi li" />
+    <LinePart from="Jel' si" to="Jesi li" />
+    <LinePart from="Je I'" to="Jesi li" />
+    <LinePart from="Jel si to" to="Jesi li to" />
+    <LinePart from="Jel' si to" to="Da li si to" />
+    <LinePart from="jel si to" to="da li si to" />
+    <LinePart from="jel' si to" to="jesi li to" />
+    <LinePart from="Jel si ti" to="Da li si ti" />
+    <LinePart from="Jel' si ti" to="Da li si ti" />
+    <LinePart from="jel si ti" to="da li si ti" />
+    <LinePart from="jel' si ti" to="da li si ti" />
+    <LinePart from="jel ste " to="jeste li " />
+    <LinePart from="Jel ste" to="Jeste li" />
+    <LinePart from="jel' ste " to="jeste li " />
+    <LinePart from="Jel' ste " to="Jeste li " />
+    <LinePart from="Jel su " to="Jesu li " />
+    <LinePart from="Jel da " to="Zar ne" />
+    <LinePart from="jel da " to="zar ne" />
+    <LinePart from="jel'da " to="zar ne" />
+    <LinePart from="Jeli sve " to="Je li sve " />
+    <LinePart from="Jeli on " to="Je li on " />
+    <LinePart from="Jeli ti " to="Je li ti " />
+    <LinePart from="jeli ti " to="je li ti " />
+    <LinePart from="Jeli to " to="Je li to " />
+    <LinePart from="Nebrini" to="Ne brini" />
+    <LinePart from="nedaj" to="ne daj" />
+    <LinePart from="ne ću" to="neću" />
+    <LinePart from="Nemogu" to="Ne mogu" />
+    <LinePart from="nemogu" to="ne mogu" />
+    <LinePart from="Nemoraš" to="Ne moraš" />
+    <LinePart from="od kako" to="otkako" />
+    <LinePart from="Si dobro" to="Jesi li dobro" />
+    <LinePart from="Svo vreme" to="Sve vrijeme" />
+    <LinePart from="Svo vrijeme" to="Sve vrijeme" />
+    <LinePart from="Cijelo vrijeme" to="Sve vrijeme" />
+  </PartialLines>
+  <PartialLinesAlways />
+  <BeginLines />
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions>
+    <RegEx find="đž" replaceWith="dž" />
+    <RegEx find="ajsmiješnij" replaceWith="ajsmješnij" />
+    <RegEx find="boži[čć]([aeiu]|em|ima)?\b" replaceWith="Božić$1" />
+    <RegEx find=" g-dine\.$" replaceWith=" gospodine." />
+    <RegEx find=" g-dine +(?=[A-ZČĐŠŽ])" replaceWith=" g. " />
+    <RegEx find="([gG])dine? +(?=[A-ZČĐŠŽ])" replaceWith="$1. " />
+    <RegEx find="([gG])-đo +(?=[A-ZČĐŠŽ])" replaceWith="$1gđo " />
+    <RegEx find="gdina +(?=[A-ZČĐŠŽ])" replaceWith="g. " />
+    <RegEx find=" gosp +" replaceWith=" g. " />
+    <RegEx find="Jel si sigur" replaceWith="Jesi li sigur" />
+    <RegEx find="Jel' si sigur" replaceWith="Jesi li sigur" />
+    <RegEx find="\b([jJ])el\?" replaceWith="$1e l'?" />
+    <RegEx find="\bJel'" replaceWith="Je l'" />
+    <RegEx find="([kK]alib(?:ar|r[aeui]))\. *([0-9])" replaceWith="$1 .$2" />
+    <RegEx find="([mM])jenjati" replaceWith="$1ijenjati" />
+    <RegEx find="([mM])oguč" replaceWith="$1oguć" />
+    <RegEx find="\b([nN])ebih?" replaceWith="$1e bi" />
+    <RegEx find="\b([nN])eč([ue]š?|emo|ete)\b" replaceWith="$1eć$2" />
+    <RegEx find="\b([nN])emože(mo|š|te)?\b" replaceWith="$1e može$2" />
+    <RegEx find="\b([nN])ezna([šm]o?|t[ei]|ju|jući|vši)?\b" replaceWith="$1e zna$2" />
+    <RegEx find="najcijenjen" replaceWith="najcjenjen" />
+    <RegEx find="N[jJ]u Jork" replaceWith="Njujork" />
+    <RegEx find="([oO])d([kp])" replaceWith="$1t$2" />
+    <RegEx find="([oO])ružij([aeu])" replaceWith="$1ružj$2" />
+    <RegEx find="([oO])sječa" replaceWith="$1sjeća" />
+    <RegEx find="([pPdD])onje([lt])" replaceWith="$1onije$2" />
+    <RegEx find="([pP])objedi([mšto])" replaceWith="$1obijedi$2" />
+    <RegEx find="redamnom" replaceWith="reda mnom" />
+    <RegEx find="redpostav" replaceWith="retpostav" />
+    <RegEx find="([pP])rimjeti" replaceWith="$1rimijeti" />
+    <RegEx find="([pP])romjeni([mštol])" replaceWith="$1romijeni$2" />
+    <RegEx find="([rR])azumijeć" replaceWith="$1azumjeć" />
+    <RegEx find="rascjepljen" replaceWith="rascijepljen" />
+    <RegEx find="redhodn" replaceWith="rethodn" />
+    <RegEx find="rimjenjen" replaceWith="rimijenjen" />
+    <RegEx find="([^d])rješit" replaceWith="$1riješit" />
+    <RegEx find="([sSzZ])amnom" replaceWith="$1a mnom" />
+    <RegEx find="([sS])lijede[čć]([aeiu]|e[mg])" replaceWith="$1ljedeć$2" />
+    <RegEx find="([sS])mješno" replaceWith="$1miješno" />
+    <RegEx find="([uU])mijesto" replaceWith="$1mjesto" />
+    <RegEx find="([uU])spijeh" replaceWith="$1spjeh" />
+    <RegEx find="([uU])spiješ(an|n[aeiou]|no[mgj])" replaceWith="$1spješ$2" />
+    <RegEx find="([uU])vjek" replaceWith="$1vijek" />
+    <RegEx find="\b([vV])eč([aeiou])" replaceWith="$1eć$2" />
+    <RegEx find="([zZ])ahtijeva" replaceWith="$1ahtjeva" />
+    <RegEx find="([zZ])ahtjeva([ojlmšt])" replaceWith="$1ahtijeva$2" />
+    <RegEx find="([ks]ao)\.:" replaceWith="$1:" />
+    <RegEx find="(?&lt;=[a-zčđšž])Ij(?=[a-zčđšž])" replaceWith="lj" />
+    <RegEx find="(?&lt;=[^A-ZČĐŠŽa-zčđšž])Iju(?=bav|d|t)" replaceWith="lju" />
+    <!-- When there is a space between the tags </i> <i> -->
+    <!-- <RegEx find="(&gt;) +(&lt;)" replaceWith="$1$2" /> -->
+    <!-- ',"' to '",' -->
+    <RegEx find="(?&lt;=\w),&quot;(?=\s|$)" replaceWith="&quot;," />
+    <RegEx find=",\.{3}|\.{3},|\.{2} \." replaceWith="..." />
+    <!-- "1 :", "2 :"... "n :" to "n:" -->
+    <RegEx find="([0-9]) +: +(\D)" replaceWith="$1: $2" />
+    <!-- Two or more consecutive "," to "..." -->
+    <RegEx find=",{2,}" replaceWith="..." />
+    <!-- Two or more consecutive "-" to "..." -->
+    <RegEx find="-{2,}" replaceWith="..." />
+    <RegEx find="([^().])\.{2}([^().:])" replaceWith="$1...$2" />
+    <!-- Thousands and decimal separators: 1,499,000.00 -> 1.499.000,00 -->
+    <RegEx find="([0-9]{3})\.([0-9]{2}[^0-9])" replaceWith="$1,$2" />
+    <RegEx find="([0-9]),([0-9]{3}\D)" replaceWith="$1.$2" />
+    <!-- Apostrophes -->
+    <RegEx find="´´" replaceWith="&quot;" />
+    <!-- <RegEx find="[´`]" replaceWith="'" /> -->
+    <!-- <RegEx find="[“”]" replaceWith="&quot;" /> -->
+    <RegEx find="''" replaceWith="&quot;" />
+    <!-- Two or more consecutive '"' to one '"' -->
+    <RegEx find="&quot;{2,}" replaceWith="&quot;" />
+    <!-- Fix zero and capital 'o' ripping mistakes -->
+    <RegEx find="(?&lt;=[0-9]\.?)O" replaceWith="0" />
+    <RegEx find="\b0(?=[A-ZČĐŠŽa-zčđšž])" replaceWith="O" />
+    <!-- Remove the dash at the start of line 1 (also when there are two lines) -->
+    <RegEx find="\A- ?([A-ZČĐŠŽa-zčđšž0-9„'&quot;]|\.{3})" replaceWith="$1" />
+    <RegEx find="\A(&lt;[ibu]&gt;)- ?" replaceWith="$1" />
+    <RegEx find="  - " replaceWith=" -" />
+    <!-- Remove the space after the dash at the start of line 2 -->
+    <RegEx find="(?&lt;=\n(&lt;[ibu]&gt;)?)- (?=[A-ZČĐŠŽčš0-9„'&quot;&lt;])" replaceWith="-" />
+    <!-- Fix the dash when it is in the middle of the first line -->
+    <RegEx find="([.!?&quot;&gt;]) - ([A-ZČĐŠŽčš'&quot;&lt;])" replaceWith="$1 -$2" />
+    <!-- Closing tag followed by a space after the dash -->
+    <RegEx find="(&gt;) - ([A-ZČĐŠŽčš„'&quot;])" replaceWith="$1 -$2" />
+    <!-- Closing tag followed by a dash and a space -->
+    <RegEx find="(&gt;)- ([A-ZČĐŠŽčš„'&quot;])" replaceWith="$1-$2" />
+    <!-- Parenthesis followed by a dash and a space -->
+    <RegEx find="\(- ([A-ZČĐŠŽčš„'&quot;])" replaceWith="(-$1" />
+    <!-- Smart space after dot -->
+    <!-- Except when the last letter is t (the word kolt) -->
+    <RegEx find="(?&lt;=[a-su-zá-úñä-ü])\.(?=[^\s\n().:?!*^“”'&quot;&lt;])" replaceWith=". " />
+    <!-- Caliber notation, e.g. "Colt .45" -->
+    <!-- For this to work, i.e. for this space to be allowed, uncheck "Razmaci ispred tačke" (spaces before period) -->
+    <RegEx find="t\.(?=[0-9]{2})" replaceWith="t ." />
+    <!-- Joey(j)a -->
+    <RegEx find="(?&lt;=\b[A-Z][a-z])eyj(?=[a-z])" replaceWith="ey" />
+    <!-- Fixes spacing around commas -->
+    <RegEx find="(?&lt;=[A-ZČĐŠŽa-zčđšžá-úñä-ü&quot;]),(?=[^\s(),?!“&lt;])" replaceWith=", " />
+    <RegEx find=" +,(?=[A-ZČĐŠŽa-zčđšž])" replaceWith=", " />
+    <RegEx find=" +, +" replaceWith=", " />
+    <RegEx find=" +,$" replaceWith="," />
+    <RegEx find="([?!])-" replaceWith="$1 -" />
+    <!-- Space after last of some consecutive dots (eg. "...") -->
+    <RegEx find="(?&lt;=[a-zčđšž])(\.{3}|!)(?=[a-zčđšž])" replaceWith="$1 " />
+    <!-- Delete space after "..." that is at the beginning of the line. You may delete this line if you don't like it -->
+    <!-- <RegEx find="^\.{3} +" replaceWith="..." /> -->
+    <!-- Changes "text ... text" to "text... text" -->
+    <RegEx find="(?&lt;=[A-ZČĐŠŽa-zčđšž]) +\.{3} +" replaceWith="... " />
+    <RegEx find="(?&lt;=\S)\. +&quot;" replaceWith=".&quot;" />
+    <RegEx find="&quot; +\." replaceWith="&quot;." />
+    <RegEx find="(?&lt;=\S\.{3}) +&quot;(?=\s|$)" replaceWith="&quot;" />
+    <RegEx find=" +\.{3}$" replaceWith="..." />
+    <RegEx find="(?&lt;=[a-zčđšž])(?: +\.{3}|\.{2}$)" replaceWith="..." />
+    <!-- Space before a parenthesis -->
+    <RegEx find="(?&lt;=[A-ZČĐŠŽa-zčđšž])\(" replaceWith=" (" />
+    <!-- Space after a question mark -->
+    <RegEx find="\?(?=[A-ZČĐŠŽčš])" replaceWith="? " />
+    <RegEx find="(?&lt;=^|&gt;)\.{3} +(?=[A-ZČĐŠŽčš])" replaceWith="..." />
+    <!-- Removes ... when the line starts with "... -->
+    <RegEx find="^&quot;\.{3} +" replaceWith="&quot;" />
+    <RegEx find="(?&lt;=[0-9])\$" replaceWith=" $$" />
+    <!-- ti š -> t š by Strider -->
+    <!-- Replace every "**ti šu*" with "**t šu*" and "**ti še*" with "**t še*" -->
+    <!-- <RegEx find="([a-z])ti (š+[eu])" replaceWith="$1t $2" /> -->
+    <!-- <RegEx find="([A-Za-z])ti( |\r?\n)(š[eu])" replaceWith="$1t$2$3" /> -->
+    <!-- <RegEx find="(?i)\b(ni)t (š[eu])" replaceWith="$1ti $2" /> -->
+    <!-- <RegEx find="\. +Mr. " replaceWith=". G. " /> -->
+    <!-- <RegEx find="\. +Mrs. " replaceWith=". Gđa " /> -->
+    <!-- <RegEx find="\. +Miss " replaceWith=". Gđica " /> -->
+    <!-- <RegEx find=", +Mrs. " replaceWith=", gđo " /> -->
+    <!-- <RegEx find=", +Miss " replaceWith=", gđice " /> -->
+    <!-- Space after <i> and after .. -->
+    <RegEx find="^(&lt;[ibu]&gt;) +" replaceWith="$1" />
+    <RegEx find="^\.{2} +" replaceWith="..." />
+    <!-- Space between ? and "</i> -->
+    <RegEx find="([.?!]) +(&quot;&lt;)" replaceWith="$1$2" />
+    <!-- No space in Npr.: -->
+    <RegEx find="(?&lt;=[Nn]pr\.) *: *" replaceWith=": " />
+    <RegEx find="\. ," replaceWith=".," />
+    <RegEx find="([?!])\." replaceWith="$1" />
+    <!-- So as not to break signatures containing ..:: -->
+    <RegEx find="\.{3}::" replaceWith="..::" />
+    <RegEx find="::\.{3}" replaceWith="::.." />
+    <RegEx find="\.{2} +::" replaceWith="..::" />
+    <!-- Abbreviations without spaces -->
+    <RegEx find="d\. o\.o\." replaceWith="d.o.o." />
+    <!-- When a line starts with ... followed by a lowercase letter -->
+    <!-- <RegEx find="^\.{3}([a-zčđšž&quot;&lt;])" replaceWith="$1" /> -->
+    <!-- <RegEx find=" +([.?!])" replaceWith="$1" /> -->
+  </RegularExpressions>
+</OCRFixReplaceList>
@@ -0,0 +1,405 @@
+<OCRFixReplaceList>
+  <WholeWords>
+    <Word from="lârt" to="lärt" />
+    <Word from="hedervårda" to="hedervärda" />
+    <Word from="stormâstare" to="stormästare" />
+    <Word from="Avfârd" to="Avfärd" />
+    <Word from="tâlten" to="tälten" />
+    <Word from="ârjag" to="är jag" />
+    <Word from="ärjag" to="är jag" />
+    <Word from="jâmlikar" to="jämlikar" />
+    <Word from="Riskakoﬂ" to="Riskakor" />
+    <Word from="Karamellen/" to="Karamellen" />
+    <Word from="Lngenüng" to="Ingenting" />
+    <Word from="ärju" to="är ju" />
+    <Word from="Sá" to="Så" />
+    <Word from="närjag" to="när jag" />
+    <Word from="alltjag" to="allt jag" />
+    <Word from="görjag" to="gör jag" />
+    <Word from="trorjag" to="tror jag" />
+    <Word from="varju" to="var ju" />
+    <Word from="görju" to="gör ju" />
+    <Word from="kanju" to="kan ju" />
+    <Word from="blirjag" to="blir jag" />
+    <Word from="sägerjag" to="säger jag" />
+    <Word from="behållerjag" to="behåller jag" />
+    <Word from="prøblem" to="problem" />
+    <Word from="räddadeju" to="räddade ju" />
+    <Word from="honøm" to="honom" />
+    <Word from="Ln" to="In" />
+    <Word from="svårﬂörtad" to="svårflörtad" />
+    <Word from="øch" to="och" />
+    <Word from="ﬂörtar" to="flörtar" />
+    <Word from="kännerjag" to="känner jag" />
+    <Word from="ﬂickan" to="flickan" />
+    <Word from="snø" to="snö" />
+    <Word from="gerju" to="ger ju" />
+    <Word from="køntakter" to="kontakter" />
+    <Word from="ølycka" to="olycka" />
+    <Word from="nølla" to="nolla" />
+    <Word from="sinnenajublar" to="sinnena jublar" />
+    <Word from="ijobbet" to="i jobbet" />
+    <Word from="Fårjag" to="Får jag" />
+    <Word from="Ar" to="Är" />
+    <Word from="liggerju" to="ligger ju" />
+    <Word from="um" to="om" />
+    <Word from="lbland" to="Ibland" />
+    <Word from="skjuterjag" to="skjuter jag" />
+    <Word from="Vaddå" to="Vad då" />
+    <Word from="pratarjämt" to="pratar jämt" />
+    <Word from="harju" to="har ju" />
+    <Word from="sitterjag" to="sitter jag" />
+    <Word from="häﬂa" to="härja" />
+    <Word from="sﬁäl" to="stjäl" />
+    <Word from="FÖU" to="Följ" />
+    <Word from="varförjag" to="varför jag" />
+    <Word from="sﬁärna" to="stjärna" />
+    <Word from="böﬂar" to="börjar" />
+    <Word from="böﬂan" to="början" />
+    <Word from="stäri" to="står" />
+    <Word from="pä" to="på" />
+    <Word from="harjag" to="har jag" />
+    <Word from="attjag" to="att jag" />
+    <Word from="Verkarjag" to="Verkar jag" />
+    <Word from="Kännerjag" to="Känner jag" />
+    <Word from="därjag" to="där jag" />
+    <Word from="tuﬁ" to="tuff" />
+    <Word from="lurarjag" to="lurar jag" />
+    <Word from="varjättebra" to="var jättebra" />
+    <Word from="allvan" to="allvar" />
+    <Word from="dethär" to="det här" />
+    <Word from="vaﬂe" to="varje" />
+    <Word from="FöUer" to="Följer" />
+    <Word from="personalmötetl" to="personalmötet!" />
+    <Word from="harjust" to="har just" />
+    <Word from="ärjätteduktig" to="är jätteduktig" />
+    <Word from="därja" to="där ja" />
+    <Word from="lngenüng" to="Ingenting" />
+    <Word from="iluften" to="i luften" />
+    <Word from="ösen" to="öser" />
+    <Word from="tvâ" to="två" />
+    <Word from="Uejerna" to="Tjejerna" />
+    <Word from="hån*" to="hårt" />
+    <Word from="Ärjag" to="Är jag" />
+    <Word from="keL" to="Okej" />
+    <Word from="Förjag" to="För jag" />
+    <Word from="varjättekul" to="var jättekul" />
+    <Word from="kämpan" to="kämpar" />
+    <Word from="mycketjobb" to="mycket jobb" />
+    <Word from="Uus" to="ljus" />
+    <Word from="serjag" to="ser jag" />
+    <Word from="vetjag" to="vet jag" />
+    <Word from="fårjag" to="får jag" />
+    <Word from="hurjag" to="hur jag" />
+    <Word from="försökerjag" to="försöker jag" />
+    <Word from="tánagel" to="tånagel" />
+    <Word from="vaüe" to="varje" />
+    <Word from="Uudet" to="ljudet" />
+    <Word from="amhopa" to="allihopa" />
+    <Word from="Väü" to="Välj" />
+    <Word from="gäri" to="går" />
+    <Word from="rödüus" to="rödljus" />
+    <Word from="Uuset" to="ljuset" />
+    <Word from="Ridàn" to="Ridån" />
+    <Word from="viüa" to="vilja" />
+    <Word from="gåri" to="går i" />
+    <Word from="Hurdå" to="Hur då" />
+    <Word from="inter\/juar" to="intervjuar" />
+    <Word from="menarjag" to="menar jag" />
+    <Word from="spyrjag" to="spyr jag" />
+    <Word from="briüera" to="briljera" />
+    <Word from="Närjag" to="När jag" />
+    <Word from="ner\/ös" to="nervös" />
+    <Word from="ilivets" to="i livets" />
+    <Word from="nägot" to="något" />
+    <Word from="pà" to="på" />
+    <Word from="Lnnan" to="Innan" />
+    <Word from="Uf" to="Ut" />
+    <Word from="lnnan" to="Innan" />
+    <Word from="Dàren" to="Dåren" />
+    <Word from="Fàrjag" to="Får jag" />
+    <Word from="VadärdetdäL" to="Vad är det där" />
+    <Word from="smàtjuv" to="småtjuv" />
+    <Word from="tàgrånare" to="tågrånare" />
+    <Word from="ditàt" to="ditåt" />
+    <Word from="sä" to="så" />
+    <Word from="vàrdslösa" to="vårdslösa" />
+    <Word from="nàn" to="nån" />
+    <Word from="kommerjag" to="kommer jag" />
+    <Word from="ärjättebra" to="är jättebra" />
+    <Word from="ärjävligt" to="är jävligt" />
+    <Word from="àkerjag" to="åker jag" />
+    <Word from="ellerjapaner" to="eller japaner" />
+    <Word from="attjaga" to="att jaga" />
+    <Word from="eften" to="efter" />
+    <Word from="hästan" to="hästar" />
+    <Word from="Lntensivare" to="Intensivare" />
+    <Word from="fràgarjag" to="frågar jag" />
+    <Word from="pen/ers" to="pervers" />
+    <Word from="ràbarkade" to="råbarkade" />
+    <Word from="styrkon" to="styrkor" />
+    <Word from="Difåf" to="Ditåt" />
+    <Word from="händen" to="händer" />
+    <Word from="föﬁa" to="följa" />
+    <Word from="Idioten/" to="Idioter!" />
+    <Word from="Varförjagade" to="Varför jagade" />
+    <Word from="därförjag" to="därför jag" />
+    <Word from="forjag" to="for jag" />
+    <Word from="Iivsgladje" to="livsglädje" />
+    <Word from="narjag" to="när jag" />
+    <Word from="sajag" to="sa jag" />
+    <Word from="genastja" to="genast ja" />
+    <Word from="rockumentàren" to="rockumentären" />
+    <Word from="turne" to="turné" />
+    <Word from="fickjag" to="fick jag" />
+    <Word from="sager" to="säger" />
+    <Word from="Ijushårig" to="ljushårig" />
+    <Word from="tradgårdsolycka" to="trädgårdsolycka" />
+    <Word from="kvavdes" to="kvävdes" />
+    <Word from="dàrja" to="där ja" />
+    <Word from="hedersgaster" to="hedersgäster" />
+    <Word from="Nar" to="När" />
+    <Word from="smakiösa" to="smaklösa" />
+    <Word from="lan" to="Ian" />
+    <Word from="Lan" to="Ian" />
+    <Word from="eri" to="er i" />
+    <Word from="universitetsamne" to="universitetsämne" />
+    <Word from="garna" to="gärna" />
+    <Word from="ar" to="är" />
+    <Word from="baltdjur" to="bältdjur" />
+    <Word from="varjag" to="var jag" />
+    <Word from="àr" to="är" />
+    <Word from="förförstàrkare" to="förförstärkare" />
+    <Word from="arjattespeciell" to="är jättespeciell" />
+    <Word from="hàrgår" to="här går" />
+    <Word from="Ia" to="la" />
+    <Word from="Iimousinen" to="limousinen" />
+    <Word from="krickettra" to="kricketträ" />
+    <Word from="hårdrockvàrlden" to="hårdrockvärlden" />
+    <Word from="tràbit" to="träbit" />
+    <Word from="Mellanvastern" to="Mellanvästern" />
+    <Word from="arju" to="är ju" />
+    <Word from="turnen" to="turnén" />
+    <Word from="kanns" to="känns" />
+    <Word from="battre" to="bättre" />
+    <Word from="vàrldsturne" to="världsturné" />
+    <Word from="dar" to="där" />
+    <Word from="sjàlvantànder" to="självantänder" />
+    <Word from="jattelange" to="jättelänge" />
+    <Word from="berattade" to="berättade" />
+    <Word from="Sä" to="Så" />
+    <Word from="vandpunkten" to="vändpunkten" />
+    <Word from="Nàrjag" to="När jag" />
+    <Word from="lasa" to="läsa" />
+    <Word from="skitlàskigt" to="skitläskigt" />
+    <Word from="sambandsvàg" to="sambandsväg" />
+    <Word from="valdigt" to="väldigt" />
+    <Word from="Stamgaﬁel" to="Stämgaffel" />
+    <Word from="àrjag" to="är jag" />
+    <Word from="tajming" to="tajmning" />
+    <Word from="utgäng" to="utgång" />
+    <Word from="Hàråt" to="Häråt" />
+    <Word from="hàråt" to="häråt" />
+    <Word from="anvander" to="använder" />
+    <Word from="harjobbat" to="har jobbat" />
+    <Word from="imageide" to="imageidé" />
+    <Word from="klaﬁen" to="klaffen" />
+    <Word from="sjalv" to="själv" />
+    <Word from="dvarg" to="dvärg" />
+    <Word from="detjag" to="det jag" />
+    <Word from="dvargarna" to="dvärgarna" />
+    <Word from="fantasivàrld" to="fantasivärld" />
+    <Word from="ﬁolliga" to="Fjolliga" />
+    <Word from="mandoiinstràngar" to="mandolinsträngar" />
+    <Word from="mittjobb" to="mitt jobb" />
+    <Word from="Skajag" to="Ska jag" />
+    <Word from="landari" to="landar i" />
+    <Word from="gang" to="gäng" />
+    <Word from="Detjag" to="Det jag" />
+    <Word from="Narmre" to="Närmre" />
+    <Word from="Iåtjavelni" to="låtjäveln" />
+    <Word from="Hållerjag" to="Håller jag" />
+    <Word from="visionarer" to="visionärer" />
+    <Word from="Tülvad" to="Till vad" />
+    <Word from="militàrbas" to="militärbas" />
+    <Word from="jattegiada" to="jätteglada" />
+    <Word from="Fastjag" to="Fast jag" />
+    <Word from="såjag" to="så jag" />
+    <Word from="rockvarlden" to="rockvärlden" />
+    <Word from="saknarjag" to="saknar jag" />
+    <Word from="allafall" to="alla fall" />
+    <Word from="ﬁanta" to="fjanta" />
+    <Word from="Kràma" to="Kräma" />
+    <Word from="stammer" to="stämmer" />
+    <Word from="budbàrare" to="budbärare" />
+    <Word from="Iivsfiiosofi" to="livsfilosofi" />
+    <Word from="förjämnan" to="för jämnan" />
+    <Word from="gillarjag" to="gillar jag" />
+    <Word from="Iarvat" to="larvat" />
+    <Word from="klararjag" to="klarar jag" />
+    <Word from="hattaﬁ'àr" to="hattaffär" />
+    <Word from="Dà" to="Då" />
+    <Word from="uppﬁnna" to="uppfinna" />
+    <Word from="Ràttfåglar" to="Råttfåglar" />
+    <Word from="Sväüboda" to="Sväljboda" />
+    <Word from="Påböﬂar" to="Påbörjar" />
+    <Word from="slutarju" to="slutar ju" />
+    <Word from="niﬁskebuüken" to="i fiskebutiken" />
+    <Word from="härjäkeln" to="här jäkeln" />
+    <Word from="Hßppa" to="Hoppa" />
+    <Word from="förstörds" to="förstördes" />
+    <Word from="varjättegoda" to="var jättegoda" />
+    <Word from="Kor\/" to="Korv" />
+    <Word from="brüléel" to="brülée!" />
+    <Word from="Hei" to="Hej" />
+    <Word from="älskarjordgubbsglass" to="älskar jordgubbsglass" />
+    <Word from="Snöbom" to="Snöboll" />
+    <Word from="SnöboH" to="Snöboll" />
+    <Word from="Snöbol" to="Snöboll" />
+    <Word from="snöboH" to="snöboll" />
+    <Word from="Läggerpå" to="Lägger på" />
+    <Word from="lngeﬂ" to="Inget!" />
+    <Word from="Sägerjättesmarta" to="Säger jättesmarta" />
+    <Word from="dopplen/äderradar" to="dopplerväderradar" />
+    <Word from="säkertjättefin" to="säkert jättefin" />
+    <Word from="ärjättefin" to="är jättefin" />
+    <Word from="verkarju" to="verkar ju" />
+    <Word from="blirju" to="blir ju" />
+    <Word from="kor\/" to="korv" />
+    <Word from="naturkatastroﬁ" to="naturkatastrof!" />
+    <Word from="stickerjag" to="sticker jag" />
+    <Word from="jättebuﬁé" to="jättebuffé" />
+    <Word from="beﬁnner" to="befinner" />
+    <Word from="Spﬂng" to="Spring" />
+    <Word from="trecﬁe" to="tredje" />
+    <Word from="ryckerjag" to="rycker jag" />
+    <Word from="skullejag" to="skulle jag" />
+    <Word from="vetju" to="vet ju" />
+    <Word from="aﬂjag" to="att jag" />
+    <Word from="ﬂnns" to="finns" />
+    <Word from="ärlång" to="är lång" />
+    <Word from="kåra" to="kära" />
+    <Word from="ärﬁna" to="är fina" />
+    <Word from="äri" to="är i" />
+    <Word from="hörden" to="hör den" />
+    <Word from="ättjäg" to="att jag" />
+    <Word from="gär" to="går" />
+    <Word from="föri" to="för i" />
+    <Word from="Hurvisste" to="Hur visste" />
+    <Word from="ﬁck" to="fick" />
+    <Word from="ﬁnns" to="finns" />
+    <Word from="ﬁn" to="fin" />
+    <Word from="Fa" to="Bra." />
+    <Word from="bori" to="bor i" />
+    <Word from="fiendeplanl" to="fiendeplan!" />
+    <Word from="iförnamn" to="i förnamn" />
+    <Word from="detju" to="det ju" />
+    <Word from="Nüd" to="Niki" />
+    <Word from="hatarjag" to="hatar jag" />
+    <Word from="Klararjag" to="Klarar jag" />
+    <Word from="detaﬁer" to="detaljer" />
+    <Word from="vä/" to="väl" />
+    <Word from="smakarju" to="smakar ju" />
+    <Word from="Teacheﬂ" to="Teacher!" />
+    <Word from="imorse" to="i morse" />
+    <Word from="drickerjag" to="dricker jag" />
+    <Word from="ståri" to="står i" />
+    <Word from="Harjag" to="Har jag" />
+    <Word from="Talarjag" to="Talar jag" />
+    <Word from="undrarjag" to="undrar jag" />
+    <Word from="ålderjag" to="ålder jag" />
+    <Word from="vaﬁe" to="varje" />
+    <Word from="förfalskningl" to="förfalskning!" />
+    <Word from="Viﬁiiiam" to="William" />
+    <Word from="V\ﬁlliams" to="Williams" />
+    <Word from="attjobba" to="att jobba" />
+    <Word from="intei" to="inte i" />
+    <Word from="närV\ﬁlliam" to="när William" />
+    <Word from="V\ﬁlliam" to="William" />
+    <Word from="Eﬁersom" to="Eftersom" />
+    <Word from="Vlﬁlliam" to="William" />
+    <Word from="Iängejag" to="länge jag" />
+    <Word from="'ﬁdigare" to="Tidigare" />
+    <Word from="börjadei" to="började i" />
+    <Word from="merjust" to="mer just" />
+    <Word from="eﬁeråt" to="efteråt" />
+    <Word from="gjordejag" to="gjorde jag" />
+    <Word from="hadeju" to="hade ju" />
+    <Word from="gårvi" to="går vi" />
+    <Word from="köperjag" to="köper jag" />
+    <Word from="Måstejag" to="Måste jag" />
+    <Word from="kännerju" to="känner ju" />
+    <Word from="ﬂn" to="fin" />
+    <Word from="treviig" to="trevlig" />
+    <Word from="Grattisl" to="Grattis!" />
+    <Word from="kande" to="kände" />
+    <Word from="'llden" to="Tiden" />
+    <Word from="sakjag" to="sak jag" />
+    <Word from="klartjag" to="klart jag" />
+    <Word from="häﬁigt" to="häftigt" />
+    <Word from="Iämnarjag" to="lämnar jag" />
+    <Word from="gickju" to="gick ju" />
+    <Word from="skajag" to="ska jag" />
+    <Word from="Görjag" to="Gör jag" />
+    <Word from="måstejag" to="måste jag" />
+    <Word from="gra\/iditet" to="graviditet" />
+    <Word from="hittadqdin" to="hittade din" />
+    <Word from="ärjobbigt" to="är jobbigt" />
+    <Word from="Overdrivet" to="Överdrivet" />
+    <Word from="hOgtidlig" to="högtidlig" />
+    <Word from="Overtyga" to="Övertyga" />
+    <Word from="SKILSMASSA" to="SKILSMÄSSA" />
+    <Word from="brukarju" to="brukar ju" />
+    <Word from="lsabel" to="Isabel" />
+    <Word from="kundejag" to="kunde jag" />
+    <Word from="ärläget" to="är läget" />
+    <Word from="blirinte" to="blir inte" />
+    <Word from="l'm" to="I'm" />
+    <Word from="lt's" to="It's" />
+    <Word from="ijakt" to="i jakt" />
+    <Word from="avjordens" to="av jordens" />
+  </WholeWords>
+  <PartialWordsAlways />
+  <PartialWords>
+    <!-- Will be used to check words not in dictionary -->
+    <!-- If new word(s) exists in spelling dictionary, it(they) is accepted -->
+    <WordPart from="¤" to="o" />
+    <WordPart from="ﬁ" to="fi" />
+    <WordPart from="â" to="ä" />
+    <WordPart from="/" to="l" />
+    <WordPart from="vv" to="w" />
+    <WordPart from="IVI" to="M" />
+    <WordPart from="lVI" to="M" />
+    <WordPart from="IVl" to="M" />
+    <WordPart from="lVl" to="M" />
+    <WordPart from="m" to="rn" />
+    <WordPart from="l" to="i" />
+    <WordPart from="€" to="e" />
+    <WordPart from="I" to="l" />
+    <WordPart from="c" to="o" />
+    <WordPart from="i" to="t" />
+    <WordPart from="cc" to="oo" />
+    <WordPart from="ii" to="tt" />
+    <WordPart from="n/" to="ry" />
+    <WordPart from="ae" to="æ" />
+    <!-- "f " will be two words -->
+    <WordPart from="f" to="f " />
+    <WordPart from="c" to="e" />
+    <WordPart from="o" to="e" />
+    <WordPart from="I" to="t" />
+    <WordPart from="n" to="o" />
+    <WordPart from="s" to="e" />
+    <WordPart from="å" to="ä" />
+    <WordPart from="à" to="å" />
+    <WordPart from="n/" to="rv" />
+  </PartialWords>
+  <PartialLines />
+  <PartialLinesAlways />
+  <BeginLines>
+    <Beginning from="Ln " to="In " />
+    <Beginning from="U ppfattat" to="Uppfattat" />
+  </BeginLines>
+  <EndLines />
+  <WholeLines />
+  <RegularExpressions />
+</OCRFixReplaceList>
@@ -0,0 +1,238 @@
+# coding=utf-8
+
+import traceback
+
+import pysubs2
+import logging
+import time
+
+from mods import EMPTY_TAG_PROCESSOR
+from registry import registry
+
+logger = logging.getLogger(__name__)
+
+
+class SubtitleModifications(object):
+    debug = False
+    language = None
+    initialized_mods = {}
+
+    font_style_tag_start = u"{\\"
+
+    def __init__(self, debug=False):
+        self.debug = debug
+        self.initialized_mods = {}
+
+    def load(self, fn=None, content=None, language=None, encoding="utf-8"):
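+        """
+        Load a subtitle from a file or from an already decoded string.
+        :param encoding: used to decode the file when fn is given; ignored when content is given
+        :param language: babelfish.Language of the subtitle
+        :param fn: filename of the subtitle to load
+        :param content: subtitle content as unicode
+        :return: None; the parsed subtitle is stored in self.f
+        """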
+        self.language = language
+        self.initialized_mods = {}
+        try:
+            if fn:
+                self.f = pysubs2.load(fn, encoding=encoding)
+            elif content:
+                self.f = pysubs2.SSAFile.from_string(content)
+        except (IOError,
+                UnicodeDecodeError,
+                pysubs2.exceptions.UnknownFPSError,
+                pysubs2.exceptions.UnknownFormatIdentifierError,
+                pysubs2.exceptions.FormatAutodetectionError):
+            if fn:
+                logger.exception("Couldn't load subtitle: %s: %s", fn, traceback.format_exc())
+            elif content:
+                logger.exception("Couldn't load subtitle: %s", traceback.format_exc())
+
+    @classmethod
+    def parse_identifier(cls, identifier):
+        # simple identifier
+        if identifier in registry.mods:
+            return identifier, {}
+
+        # identifier with params; identifier(param=value)
+        split_args = identifier[identifier.find("(")+1:-1].split(",")
+        args = dict((key, value) for key, value in [sub.split("=") for sub in split_args])
+        return identifier[:identifier.find("(")], args
+
+    @classmethod
+    def get_mod_class(cls, identifier):
+        identifier, args = cls.parse_identifier(identifier)
+        return registry.mods[identifier]
+
+    @classmethod
+    def get_mod_signature(cls, identifier, **kwargs):
+        return cls.get_mod_class(identifier).get_signature(**kwargs)
+
+    def prepare_mods(self, *mods):
+        parsed_mods = [SubtitleModifications.parse_identifier(mod) for mod in mods]
+        final_mods = {}
+        line_mods = []
+        non_line_mods = []
+
+        for identifier, args in parsed_mods:
+            if identifier not in registry.mods:
+                logger.error("Mod %s not loaded", identifier)
+                continue
+
+            mod_cls = registry.mods[identifier]
+            # exclusive mod, kill old, use newest
+            if identifier in final_mods and mod_cls.exclusive:
+                final_mods.pop(identifier)
+
+            # merge args of duplicate mods if possible
+            elif identifier in final_mods and mod_cls.args_mergeable:
+                final_mods[identifier] = mod_cls.merge_args(final_mods[identifier], args)
+                continue
+            final_mods[identifier] = args
+
+        # separate all mods into line and non-line mods
+        for identifier, args in final_mods.iteritems():
+            mod_cls = registry.mods[identifier]
+            if mod_cls.modifies_whole_file:
+                non_line_mods.append((identifier, args))
+            else:
+                line_mods.append((mod_cls.order, identifier, args))
+
+            # initialize the mods
+            if identifier not in self.initialized_mods:
+                self.initialized_mods[identifier] = mod_cls(self)
+
+        return line_mods, non_line_mods
+
+    def modify(self, *mods):
+        new_entries = []
+        start = time.time()
+        line_mods, non_line_mods = self.prepare_mods(*mods)
+
+        # apply file mods
+        if non_line_mods:
+            non_line_mods_start = time.time()
+            self.apply_non_line_mods(non_line_mods)
+
+            if self.debug:
+                logger.debug("Non-Line mods took %ss", time.time() - non_line_mods_start)
+
+        # sort line mods
+        line_mods.sort(key=lambda x: (x is None, x))
+
+        # apply line mods
+        if line_mods:
+            line_mods_start = time.time()
+            self.apply_line_mods(new_entries, line_mods)
+            self.f.events = new_entries
+
+            if self.debug:
+                logger.debug("Line mods took %ss", time.time() - line_mods_start)
+
+        if self.debug:
+            logger.debug("Subtitle Modification took %ss", time.time() - start)
+
+    def apply_non_line_mods(self, mods):
+        for identifier, args in mods:
+            mod = self.initialized_mods[identifier]
+            mod.modify(None, debug=self.debug, parent=self, **args)
+
+    def apply_line_mods(self, new_entries, mods):
+        for entry in self.f:
+            applied_mods = []
+            lines = []
+
+            line_count = 0
+            start_tags = []
+            end_tags = []
+            for line in entry.text.split(ur"\N"):
+                # don't bother the mods with surrounding tags
+                old_line = line
+                line = line.strip()
+                skip_line = False
+                line_count += 1
+
+                # clean {\X0} tags before processing
+                # fixme: handle nested tags?
+                start_tag = u""
+                end_tag = u""
+                if line.startswith(self.font_style_tag_start):
+                    start_tag = line[:5]
+                    line = line[5:]
+                if line[-5:-3] == self.font_style_tag_start:
+                    end_tag = line[-5:]
+                    line = line[:-5]
+
+                for order, identifier, args in mods:
+                    mod = self.initialized_mods[identifier]
+
+                    line = mod.modify(line.strip(), debug=self.debug, parent=self, **args)
+                    if not line:
+                        if self.debug:
+                            logger.debug(u"%s: %r -> ''", identifier, old_line)
+                        skip_line = True
+                        break
+
+                    applied_mods.append(identifier)
+
+                if skip_line:
+                    continue
+
+                if start_tag:
+                    start_tags.append(start_tag)
+
+                if end_tag:
+                    end_tags.append(end_tag)
+
+                # append new line and clean possibly newly added empty tags
+                cleaned_line = EMPTY_TAG_PROCESSOR.process(start_tag + line + end_tag, debug=self.debug).strip()
+                if cleaned_line:
+                    # we may have a single closing tag, if so, try appending it to the previous line
+                    if len(cleaned_line) == 5 and cleaned_line.startswith("{\\") and cleaned_line.endswith("0}"):
+                        if lines:
+                            prev_line = lines.pop()
+                            lines.append(prev_line + cleaned_line)
+                            continue
+
+                    lines.append(cleaned_line)
+                else:
+                    if self.debug:
+                        logger.debug(u"Ditching now empty line (%r)", line)
+
+            if not lines:
+                # don't bother logging when the entry only had one line
+                if self.debug and line_count > 1:
+                    logger.debug(u"%r -> ''", entry.text)
+                continue
+
+            new_text = ur"\N".join(lines)
+
+            # poor man's approach to avoid open tags
+            add_start_tags = []
+            add_end_tags = []
+            # default to the modified text; only the tag fix below may override it
+            entry.text = new_text
+            if len(start_tags) != len(end_tags):
+                for tag in start_tags:
+                    end_tag = tag.replace("1", "0")
+                    if end_tag not in end_tags and new_text.count(tag) > new_text.count(end_tag):
+                        add_end_tags.append(end_tag)
+                for tag in end_tags:
+                    start_tag = tag.replace("0", "1")
+                    if start_tag not in start_tags and new_text.count(tag) > new_text.count(start_tag):
+                        add_start_tags.append(start_tag)
+
+                if add_end_tags or add_start_tags:
+                    entry.text = u"".join(add_start_tags) + new_text + u"".join(add_end_tags)
+                    if self.debug:
+                        logger.debug(u"Fixing tags: %s (%r -> %r)", str(add_start_tags+add_end_tags), new_text,
+                                     entry.text)
+
+            new_entries.append(entry)
+
+SubMod = SubtitleModifications
+
+
+
+
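A minimal, standalone sketch of the `identifier(param=value)` format that `parse_identifier` handles above. It is written for Python 3 and is not the project's code: the `known_mods` set stands in for `registry.mods`, and the error handling for malformed identifiers is an assumption added for illustration.

```python
def parse_identifier(identifier, known_mods):
    """Split 'name(key=value,...)' into (name, args); values stay strings."""
    # plain identifier without parameters
    if identifier in known_mods:
        return identifier, {}

    paren = identifier.find("(")
    if paren == -1 or not identifier.endswith(")"):
        # assumption: reject anything that is neither known nor parameterized
        raise ValueError("unknown mod identifier: %r" % identifier)

    name = identifier[:paren]
    args = {}
    for pair in identifier[paren + 1:-1].split(","):
        if not pair:
            continue
        # split on the first "=" only, so values may contain "="
        key, _, value = pair.partition("=")
        args[key.strip()] = value.strip()
    return name, args


# stand-in for registry.mods, using identifiers that appear in this commit
KNOWN = {"common", "remove_HI", "color", "change_FPS"}

print(parse_identifier("common", KNOWN))
print(parse_identifier("color(name=red)", KNOWN))
print(parse_identifier("change_FPS(from=25,to=23.976)", KNOWN))
```

All parameter values come back as strings, which is why a mod such as `change_FPS` converts them with `float()` itself.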
@@ -0,0 +1,94 @@
+# coding=utf-8
+import re
+import logging
+
+from subzero.modification.processors.re_processor import ReProcessor, NReProcessor
+
+logger = logging.getLogger(__name__)
+
+
+class SubtitleModification(object):
+    identifier = None
+    description = None
+    long_description = None
+    exclusive = False
+    advanced = False  # has parameters
+    args_mergeable = False
+    order = None
+    modifies_whole_file = False  # operates on the whole file, not individual entries
+    pre_processors = []
+    processors = []
+    post_processors = []
+
+    def __init__(self, parent):
+        return
+
+    def _process(self, content, processors, debug=False, parent=None, **kwargs):
+        if not content:
+            return
+
+        # processors may be a list or a callable
+        #if callable(processors):
+        #    _processors = processors()
+        #else:
+        #    _processors = processors
+        _processors = processors
+
+        new_content = content
+        for processor in _processors:
+            old_content = new_content
+            new_content = processor.process(new_content, debug=debug)
+            if not new_content:
+                if debug:
+                    logger.debug("Processor returned empty line: %s", processor)
+                break
+            if debug:
+                if old_content == new_content:
+                    continue
+                logger.debug("%s: %s -> %s", processor, repr(old_content), repr(new_content))
+        return new_content
+
+    def pre_process(self, content, debug=False, parent=None, **kwargs):
+        return self._process(content, self.pre_processors, debug=debug, parent=parent, **kwargs)
+
+    def process(self, content, debug=False, parent=None, **kwargs):
+        return self._process(content, self.processors, debug=debug, parent=parent, **kwargs)
+
+    def post_process(self, content, debug=False, parent=None, **kwargs):
+        return self._process(content, self.post_processors, debug=debug, parent=parent, **kwargs)
+
+    def modify(self, content, debug=False, parent=None, **kwargs):
+        if not content:
+            return
+
+        new_content = content
+        for method in ("pre_process", "process", "post_process"):
+            if not new_content:
+                return
+            new_content = getattr(self, method)(new_content, debug=debug, parent=parent, **kwargs)
+
+        return new_content
+
+    @classmethod
+    def get_signature(cls, **kwargs):
+        string_args = ",".join(["%s=%s" % (key, value) for key, value in kwargs.iteritems()])
+        return "%s(%s)" % (cls.identifier, string_args)
+
+    @classmethod
+    def merge_args(cls, args1, args2):
+        raise NotImplementedError
+
+
+class SubtitleTextModification(SubtitleModification):
+    pass
+
+
+EMPTY_TAG_PROCESSOR = ReProcessor(re.compile(r'({\\\w1})[\s.,\-_!?]*({\\\w0})'), "", name="empty_tag")
+
+empty_line_post_processors = [
+    # empty tag
+    EMPTY_TAG_PROCESSOR,
+
+    # empty line (needed?)
+    NReProcessor(re.compile(r'^[\s-]+$'), "", name="empty_line"),
+]
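A note on hyphens in the character class of an empty-tag pattern: inside `[...]`, a hyphen between two characters defines a range, so `,-_` covers every code point from `,` to `_`, which includes digits and uppercase letters. The sketch below compares that spelling with an escaped hyphen, using plain `re` instead of the project's `ReProcessor` (whose exact behavior is not shown in this diff). The helper name `strip_empty_tags` is hypothetical.

```python
import re

# Unescaped hyphen: ",-_" is a range from "," (0x2C) to "_" (0x5F).
RANGE_CLASS = re.compile(r'({\\\w1})[\s.,-_!?]*({\\\w0})')
# Escaped hyphen: only whitespace and the listed punctuation match.
LITERAL_CLASS = re.compile(r'({\\\w1})[\s.,\-_!?]*({\\\w0})')


def strip_empty_tags(pattern, text):
    """Remove tag pairs whose content matches the pattern's character class."""
    return pattern.sub("", text)


tagged = "{\\i1}HELLO 42{\\i0}"
print(repr(strip_empty_tags(RANGE_CLASS, tagged)))
print(repr(strip_empty_tags(LITERAL_CLASS, tagged)))
print(repr(strip_empty_tags(LITERAL_CLASS, "{\\i1} - {\\i0}")))
```

With the range spelling, an italic block that holds only uppercase text and digits is treated as empty and removed. With the escaped hyphen, only tag pairs around whitespace and punctuation are dropped.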
@@ -0,0 +1,51 @@
+# coding=utf-8
+
+import logging
+from collections import OrderedDict
+
+from subzero.modification.mods import SubtitleModification
+from subzero.modification import registry
+
+logger = logging.getLogger(__name__)
+
+
+COLOR_MAP = OrderedDict([
+    ("white", "#FFFFFF"),
+    ("light-grey", "#C0C0C0"),
+    ("red", "#FF0000"),
+    ("green", "#00FF00"),
+    ("yellow", "#FFFF00"),
+    ("blue", "#0000FF"),
+    ("magenta", "#FF00FF"),
+    ("cyan", "#00FFFF"),
+    ("black", "#000000"),
+    ("dark-red", "#800000"),
+    ("dark-green", "#008000"),
+    ("dark-yellow", "#808000"),
+    ("dark-blue", "#000080"),
+    ("dark-magenta", "#800080"),
+    ("dark-cyan", "#008080"),
+    ("dark-grey", "#808080"),
+])
+
+
+class Color(SubtitleModification):
+    identifier = "color"
+    description = "Change the color of the subtitle"
+    exclusive = True
+    advanced = True
+
+    colors = COLOR_MAP
+
+    long_description = """\
+    Adds the requested color to every line of the subtitle. Support depends on player.
+    """
+
+    def modify(self, content, debug=False, parent=None, **kwargs):
+        color = self.colors.get(kwargs.get("name"))
+        if color:
+            return u'<font color="%s">%s</font>' % (color, content)
+        return content
+
+
+registry.register(Color)
@@ -0,0 +1,72 @@
+# coding=utf-8
+
+import re
+
+from subzero.modification.mods import SubtitleTextModification, empty_line_post_processors
+from subzero.modification.processors.string_processor import StringProcessor
+from subzero.modification.processors.re_processor import NReProcessor
+from subzero.modification import registry
+
+
+class CommonFixes(SubtitleTextModification):
+    identifier = "common"
+    description = "Basic common fixes"
+    exclusive = True
+    order = 40
+
+    long_description = """\
+    Fix common whitespace/punctuation issues in subtitles
+    """
+
+    processors = [
+        # -- = ...
+        StringProcessor("-- ", '... ', name="CM_doubledash"),
+
+        # '' = "
+        StringProcessor("''", '"', name="CM_double_apostrophe"),
+
+        # remove leading ...
+        NReProcessor(re.compile(r'(?u)^\.\.\.[\s]*'), "", name="CM_leading_ellipsis"),
+
+        # no space after ellipsis
+        NReProcessor(re.compile(r'(?u)\.\.\.(?![\s.,!?\'"])(?!$)'), "... ", name="CM_ellipsis_no_space"),
+
+        # multiple spaces
+        NReProcessor(re.compile(r'(?u)[\s]{2,}'), " ", name="CM_multiple_spaces"),
+
+        # no space after starting dash
+        NReProcessor(re.compile(r'(?u)^-(?![\s-])'), "- ", name="CM_dash_space"),
+
+        # remove starting spaced dots (not matching ellipses)
+        NReProcessor(re.compile(r'(?u)^(?!\s?(\.\s\.\s\.)|(\s?\.{3}))[\s.]*'), "", name="CM_starting_spacedots"),
+
+        # space missing before doublequote
+        # ReProcessor(re.compile(r'(?u)(?<!^)(?<![\s(\["])("[^"]+")'), r' \1', name="CM_space_before_dblquote"),
+
+        # space missing after doublequote
+        # ReProcessor(re.compile(r'(?u)("[^"\s][^"]+")([^\s.,!?)\]]+)'), r"\1 \2", name="CM_space_after_dblquote"),
+
+        # space before ending doublequote?
+
+        # remove >>
+        NReProcessor(re.compile(r'(?u)^\s?>>\s*'), "", name="CM_leading_crocodiles"),
+
+        # replace uppercase I with lowercase L in words
+        NReProcessor(re.compile(ur'(?u)([A-Za-zÀ-ž][a-zà-ž]+)(I+)'),
+                     lambda match: ur'%s%s' % (match.group(1), "l"*len(match.group(2))), name="CM_uppercase_i_in_word"),
+
+        # fix spaces in numbers; allows for the punctuation ,.:' (a comma is only fixed if it follows the
+        # space, as it may be a countdown otherwise); doesn't break up ellipses
+        # fixme: maybe check whether it's a countdown (second part smaller than the first), otherwise handle default?
+        NReProcessor(re.compile(r'(?u)([0-9]+[0-9.:\']*(?<!\.\.))\s+((?!\.\.)[0-9,.:\']*[0-9]+)'), r"\1\2",
+                     name="CM_spaces_in_numbers"),
+
+        # uppercase after dot
+        NReProcessor(re.compile(ur'(?u)((?:[^.\s])+\.\s+)([a-zà-ž])'),
+                     lambda match: ur'%s%s' % (match.group(1), match.group(2).upper()), name="CM_uppercase_after_dot"),
+    ]
+
+    post_processors = empty_line_post_processors
+
+
+registry.register(CommonFixes)
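A small Python 3 sketch of the idea behind `CM_uppercase_i_in_word`: an uppercase `I` that follows lowercase letters inside a word is most likely a misread lowercase `l`. It uses `re.sub` directly rather than `NReProcessor`, and the function name `fix_uppercase_i` is hypothetical.

```python
import re

# a letter, then one or more lowercase letters, then a run of uppercase I's
UPPER_I_IN_WORD = re.compile(r'([A-Za-zÀ-ž][a-zà-ž]+)(I+)')


def fix_uppercase_i(text):
    """Replace each run of in-word uppercase I's with as many lowercase l's."""
    return UPPER_I_IN_WORD.sub(
        lambda m: m.group(1) + "l" * len(m.group(2)), text)


print(fix_uppercase_i("I reaIIy wiIl"))
```

A standalone `I` is left alone because it has no lowercase letters in front of it. Names with an internal capital I after lowercase letters (for example `McIntyre`) would also be rewritten, so this rule is a heuristic and not a safe transformation for every text.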
@@ -0,0 +1,27 @@
+# coding=utf-8
+
+import logging
+
+from subzero.modification.mods import SubtitleModification
+from subzero.modification import registry
+
+logger = logging.getLogger(__name__)
+
+
+class ChangeFPS(SubtitleModification):
+    identifier = "change_FPS"
+    description = "Change the FPS of the subtitle"
+    exclusive = True
+    advanced = True
+    modifies_whole_file = True
+
+    long_description = """\
+    Re-syncs the subtitle to the framerate of the current media file. 
+    """
+
+    def modify(self, content, debug=False, parent=None, **kwargs):
+        fps_from = kwargs.get("from")
+        fps_to = kwargs.get("to")
+        parent.f.transform_framerate(float(fps_from), float(fps_to))
+
+registry.register(ChangeFPS)
@@ -0,0 +1,48 @@
+# coding=utf-8
+import re
+
+from subzero.modification.mods import SubtitleTextModification, empty_line_post_processors
+from subzero.modification.processors.re_processor import NReProcessor
+from subzero.modification import registry
+
+
+class HearingImpaired(SubtitleTextModification):
+    identifier = "remove_HI"
+    description = "Remove Hearing Impaired tags"
+    exclusive = True
+    order = 10
+
+    long_description = """\
+    Removes tags, text and characters from subtitles that are meant for hearing impaired people
+    """
+
+    processors = [
+        # brackets (only remove if at least 3 consecutive uppercase chars in brackets)
+        NReProcessor(re.compile(ur'(?sux)[([].+(?=[A-ZÀ-Ž]{3,}).+[)\]]'), "", name="HI_brackets"),
+
+        # text before colon (and possible dash in front), max 11 chars after the first whitespace (if any)
+        # NReProcessor(re.compile(r'(?u)(^[A-z\-\'"_]+[\w\s]{0,11}:[^0-9{2}][\s]*)'), "", name="HI_before_colon"),
+
+        # text before colon (at least 4 consecutive uppercase chars)
+        NReProcessor(re.compile(ur'(?u)(^(?=.*[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\s]+:\s*)'), "", name="HI_before_colon"),
+
+        # text in brackets at start, after optional dash, before colon or at end of line
+        # fixme: may be too aggressive
+        NReProcessor(re.compile(ur'(?um)(^-?\s?[([][A-zÀ-ž-_\s]{3,}[)\]](?:(?=$)|:\s*))'), "",
+                     name="HI_brackets_special"),
+
+        # all caps line (at least 4 consecutive uppercase chars)
+        NReProcessor(re.compile(ur'(?u)(^(?=.*[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\s]+$)'), "", name="HI_all_caps"),
+
+        # dash in front
+        # NReProcessor(re.compile(r'(?u)^\s*-\s*'), "", name="HI_starting_dash"),
+
+        # all caps at start before new sentence
+        NReProcessor(re.compile(ur'(?u)^(?=[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\s]+\s([A-ZÀ-Ž][a-zà-ž].+)'), r"\1",
+                     name="HI_starting_upper_then_sentence"),
+    ]
+
+    post_processors = empty_line_post_processors
+
+
+registry.register(HearingImpaired)
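Two of the `remove_HI` patterns are easy to try on their own. This is a sketch under assumptions, not the plugin's code path: the expressions are copied from the diff (written as `u"..."` literals instead of `ur'...'` so they also parse on Python 3) and applied to single lines with `re.sub`. The constant names are hypothetical.

```python
# coding=utf-8
import re

# copied from HI_all_caps and HI_starting_upper_then_sentence
HI_ALL_CAPS = re.compile(u"(?u)(^(?=.*[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\\s]+$)")
HI_STARTING_UPPER = re.compile(
    u"(?u)^(?=[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\\s]+\\s([A-ZÀ-Ž][a-zà-ž].+)")

# a line that is uppercase throughout is removed completely
print(repr(HI_ALL_CAPS.sub(u"", u"FULL UPPERCASE LINE HERE")))

# a regular line is left alone
print(repr(HI_ALL_CAPS.sub(u"", u"and some text")))

# an uppercase prefix is dropped, the sentence after it is kept
print(HI_STARTING_UPPER.sub(u"\\1", u"MUSIC PLAYS What is that sound?!"))
```

Lines emptied this way are presumably cleaned up afterwards by `empty_line_post_processors`.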
@@ -0,0 +1,48 @@
+# coding=utf-8
+import logging
+
+from subzero.modification.mods import SubtitleTextModification
+from subzero.modification.processors.string_processor import MultipleLineProcessor, WholeLineProcessor
+from subzero.modification.processors.re_processor import MultipleWordReProcessor
+from subzero.modification import registry
+from subzero.modification.dictionaries.data import data as OCR_fix_data
+
+logger = logging.getLogger(__name__)
+
+
+class FixOCR(SubtitleTextModification):
+    identifier = "OCR_fixes"
+    description = "Fix common OCR issues"
+    exclusive = True
+    order = 20
+    data_dict = None
+
+    long_description = """\
+    Fix issues that happen when a subtitle gets converted from bitmap to text through OCR
+    """
+
+    def __init__(self, parent):
+        super(FixOCR, self).__init__(parent)
+        data_dict = OCR_fix_data.get(parent.language.alpha3t)
+        if not data_dict:
+            logger.debug("No SnR-data available for language %s", parent.language)
+            return
+
+        self.data_dict = data_dict
+        self.processors = self.get_processors()
+
+    def get_processors(self):
+        if not self.data_dict:
+            return []
+
+        return [
+            WholeLineProcessor(self.data_dict["WholeLines"], name="SE_replace_line"),
+            MultipleWordReProcessor(self.data_dict["WholeWords"], name="SE_replace_word"),
+            MultipleWordReProcessor(self.data_dict["BeginLines"], name="SE_replace_beginline"),
+            MultipleWordReProcessor(self.data_dict["EndLines"], name="SE_replace_endline"),
+            MultipleWordReProcessor(self.data_dict["PartialLines"], name="SE_replace_partialline"),
+            MultipleLineProcessor(self.data_dict["PartialWordsAlways"], name="SE_replace_partialwordsalways")
+        ]
+
+
+registry.register(FixOCR)
@@ -0,0 +1,40 @@
+# coding=utf-8
+
+import logging
+
+from subzero.modification.mods import SubtitleModification
+from subzero.modification import registry
+
+logger = logging.getLogger(__name__)
+
+
+class ShiftOffset(SubtitleModification):
+    identifier = "shift_offset"
+    description = "Change the timing of the subtitle"
+    exclusive = False
+    advanced = True
+    args_mergeable = True
+    modifies_whole_file = True
+
+    long_description = """\
+    Adds or subtracts a certain amount of time from the whole subtitle to match your media
+    """
+
+    @classmethod
+    def merge_args(cls, args1, args2):
+        new_args = dict((key, int(value)) for key, value in args1.iteritems())
+
+        for key, value in args2.iteritems():
+            if key in new_args:
+                new_args[key] += int(value)
+            else:
+                new_args[key] = int(value)
+
+        return new_args
+
+    def modify(self, content, debug=False, parent=None, **kwargs):
+        parent.f.shift(h=int(kwargs.get("h", 0)), m=int(kwargs.get("m", 0)), s=int(kwargs.get("s", 0)),
+                       ms=int(kwargs.get("ms", 0)))
+
+
+registry.register(ShiftOffset)
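How `merge_args` collapses two offset mods into one can be shown with a standalone copy. A minimal sketch: the function below mirrors the classmethod but uses `.items()` so it runs on Python 2 and 3, and the argument values are made up for illustration.

```python
def merge_args(args1, args2):
    # standalone copy of ShiftOffset.merge_args; values arrive as strings
    new_args = dict((key, int(value)) for key, value in args1.items())

    for key, value in args2.items():
        # sum matching units, adopt units only present in the second mod
        new_args[key] = new_args.get(key, 0) + int(value)

    return new_args


# a negative offset followed by a positive one becomes a single shift
merged = merge_args({"s": "-2"}, {"s": "5", "ms": "300"})
print(merged)
```

Merging first means the subtitle is shifted once by the net amount instead of being moved back and forth.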
@@ -0,0 +1,29 @@
+# coding=utf-8
+
+
+class Processor(object):
+    """
+    Processor base class
+    """
+    name = None
+    parent = None
+
+    def __init__(self, name=None, parent=None):
+        self.name = name
+        self.parent = parent
+
+    @property
+    def info(self):
+        return self.name
+
+    def process(self, content, debug=False):
+        return content
+
+    def __repr__(self):
+        return "Processor <%s %s>" % (self.__class__.__name__, self.info)
+
+    def __str__(self):
+        return repr(self)
+
+    def __unicode__(self):
+        return unicode(repr(self))
@@ -0,0 +1,48 @@
+# coding=utf-8
+import re
+import logging
+
+from subzero.modification.processors import Processor
+
+logger = logging.getLogger(__name__)
+
+
+class ReProcessor(Processor):
+    """
+    Regex processor
+    """
+    pattern = None
+    replace_with = None
+
+    def __init__(self, pattern, replace_with, name=None):
+        super(ReProcessor, self).__init__(name=name)
+        self.pattern = pattern
+        self.replace_with = replace_with
+
+    def process(self, content, debug=False):
+        return self.pattern.sub(self.replace_with, content)
+
+
+class NReProcessor(ReProcessor):
+    pass
+
+
+class MultipleWordReProcessor(ReProcessor):
+    """
+    Expects a dictionary in the form of:
+    dict = {
+        "data": {"old_value": "new_value"},
+        "pattern": compiled re object that matches data.keys()
+    }
+    replaces found key in pattern with the corresponding value in data
+    """
+    def __init__(self, snr_dict, name=None, parent=None):
+        super(ReProcessor, self).__init__(name=name)
+        self.snr_dict = snr_dict
+
+    def process(self, content, debug=False):
+        if not self.snr_dict["data"]:
+            return content
+
+        return self.snr_dict["pattern"].sub(lambda x: self.snr_dict["data"][x.group(0)], content)
+
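The dictionary shape that `MultipleWordReProcessor` expects can be illustrated with a small hand-built example. This is a sketch only: how the shipped OCR dictionaries build their `pattern` is not shown in this diff, so the word-boundary alternation below is an assumption, and `replace_words` is a hypothetical stand-in for `process`.

```python
# coding=utf-8
import re

data = {u"miIe": u"mile", u"weII": u"well"}

# assumption: one alternation over the escaped keys, bounded by \b
pattern = re.compile(
    u"(?u)\\b(?:%s)\\b" % u"|".join(re.escape(key) for key in sorted(data)))

snr_dict = {"data": data, "pattern": pattern}


def replace_words(content):
    # same logic as MultipleWordReProcessor.process
    if not snr_dict["data"]:
        return content
    return snr_dict["pattern"].sub(lambda m: snr_dict["data"][m.group(0)], content)


print(replace_words(u"weII it's half a miIe underground"))
```

Because the replacement is looked up by the matched text, `pattern` must only ever match strings that are keys of `data`; anything else raises a `KeyError`.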
@@ -0,0 +1,84 @@
+# coding=utf-8
+
+import logging
+
+from subzero.modification.processors import Processor
+
+logger = logging.getLogger(__name__)
+
+
+class StringProcessor(Processor):
+    """
+    String replacement processor base
+    """
+
+    def __init__(self, search, replace, name=None, parent=None):
+        super(StringProcessor, self).__init__(name=name)
+        self.search = search
+        self.replace = replace
+
+    def process(self, content, debug=False):
+        return content.replace(self.search, self.replace)
+
+
+class MultipleLineProcessor(Processor):
+    """
+    replaces stuff in whole lines
+    
+    takes a search/replace dict as first argument
+    Expects a dictionary in the form of:
+    dict = {
+        "data": {"old_value": "new_value"}
+    }
+    """
+    def __init__(self, snr_dict, name=None, parent=None):
+        super(MultipleLineProcessor, self).__init__(name=name)
+        self.snr_dict = snr_dict
+
+    def process(self, content, debug=False):
+        if not self.snr_dict["data"]:
+            return content
+
+        for key, value in self.snr_dict["data"].iteritems():
+            if debug and key in content:
+                logger.debug(u"Replacing '%s' with '%s' in '%s'", key, value, content)
+
+            content = content.replace(key, value)
+
+        return content
+
+
+class WholeLineProcessor(MultipleLineProcessor):
+    def process(self, content, debug=False):
+        if not self.snr_dict["data"]:
+            return content
+        content = content.strip()
+
+        for key, value in self.snr_dict["data"].iteritems():
+            if content == key:
+                if debug:
+                    logger.debug(u"Replacing '%s' with '%s'", key, value)
+
+                content = value
+                break
+
+        return content
+
+
+class MultipleWordProcessor(MultipleLineProcessor):
+    """
+    replaces words
+
+    takes a search/replace dict as first argument
+    Expects a dictionary in the form of:
+    dict = {
+        "data": {"old_value": "new_value"}
+    }
+    """
+    def process(self, content, debug=False):
+        words = content.split(u" ")
+        new_words = []
+        for word in words:
+            new_words.append(self.snr_dict["data"].get(word, word))
+
+        return u" ".join(new_words)
@@ -0,0 +1,17 @@
+# coding=utf-8
+from collections import OrderedDict
+
+
+class SubtitleModRegistry(object):
+    mods = None
+    mods_available = None
+
+    def __init__(self):
+        self.mods = OrderedDict()
+        self.mods_available = []
+
+    def register(self, mod):
+        self.mods[mod.identifier] = mod
+        self.mods_available.append(mod.identifier)
+
+registry = SubtitleModRegistry()
@@ -4,13 +4,23 @@ import hashlib
 import os
 import logging
 import traceback
+import gzip
+
+from babelfish import Language
+
+from json_tricks.nonp import loads, dumps
+

 from constants import mode_map
+from subliminal_patch.subtitle import ModifiedSubtitle

 logger = logging.getLogger(__name__)


 class StoredSubtitle(object):
+    """
+    legacy class used for PMS LoadObject/SaveObject
+    """
    score = None
    storage_type = None
    hash = None
@@ -46,8 +56,59 @@ class StoredSubtitle(object):
        return mode_map.get(self.mode, "Unknown")


+class JSONStoredSubtitle(object):
+    score = None
+    storage_type = None
+    hash = None
+    provider_name = None
+    id = None
+    date_added = None
+    mode = "a"  # auto/manual/auto-better (a/m/b)
+    content = None
+    mods = None
+    encoding = None
+
+    def initialize(self, score, storage_type, hash, provider_name, id, date_added=None, mode="a", content=None,
+                 mods=None, encoding=None):
+        self.score = int(score)
+        self.storage_type = storage_type
+        self.hash = hash
+        self.provider_name = provider_name
+        self.id = id
+        self.date_added = date_added or datetime.datetime.now()
+        self.mode = mode
+        self.content = content
+        self.mods = mods or []
+        self.encoding = encoding
+
+    def add_mod(self, identifier):
+        self.mods = self.mods or []
+        if identifier is None:
+            self.mods = []
+            return
+
+        self.mods.append(identifier)
+
+    @property
+    def mode_verbose(self):
+        return mode_map.get(self.mode, "Unknown")
+
+    def serialize(self):
+        if self.content:
+            # content is always stored in unicode (gets converted to string with escaped unicode chars by json)
+            self.content = self.content.decode(self.encoding)
+        return self.__dict__
+
+    def deserialize(self, data):
+        if data["content"]:
+            # content is always present in encoded form
+            data["content"] = data["content"].encode(data["encoding"])
+        self.initialize(**data)
+
+
 class StoredVideoSubtitles(object):
    """
+    legacy class
    manages stored subtitles for video_id per media_part/language combination
    """
    video_id = None  # rating_key
@@ -112,12 +173,136 @@ class StoredVideoSubtitles(object):
        return str(self.video_id)


+class JSONStoredVideoSubtitles(object):
+    """
+    manages stored subtitles for video_id per media_part/language combination
+    """
+    video_id = None  # rating_key
+    title = None
+    parts = None
+    version = None
+    item_type = None  # movie / episode
+    added_at = None
+
+    def initialize(self, plex_item, version=None):
+        self.video_id = str(plex_item.rating_key)
+
+        self.title = plex_item.title
+        self.parts = {}
+        self.version = version
+        self.item_type = plex_item.type
+        self.added_at = datetime.datetime.fromtimestamp(plex_item.added_at)
+
+    def deserialize(self, data):
+        parts = data.pop("parts")
+        self.parts = {}
+        self.__dict__.update(data)
+
+        if parts:
+            for part_id, part in parts.iteritems():
+                self.parts[part_id] = {}
+                for language, sub_data in part.iteritems():
+                    self.parts[part_id][language] = {}
+
+                    for sub_key, subtitle_data in sub_data.iteritems():
+                        if sub_key == "current":
+                            if not isinstance(subtitle_data, tuple):
+                                subtitle_data = tuple(subtitle_data.split("__"))
+                            self.parts[part_id][language]["current"] = subtitle_data
+                        else:
+                            sub = JSONStoredSubtitle()
+
+                            # legacy subtitle storage instance
+                            if isinstance(subtitle_data, StoredSubtitle):
+                                subtitle_data = subtitle_data.__dict__
+
+                            sub.initialize(**subtitle_data)
+                            if not isinstance(sub_key, tuple):
+                                sub_key = tuple(sub_key.split("__"))
+
+                            self.parts[part_id][language][sub_key] = sub
+
+    def serialize(self):
+        data = {"parts": {}}
+        for key, value in self.__dict__.iteritems():
+            if key != "parts":
+                data[key] = value
+
+        for part_id, part in self.parts.iteritems():
+            data["parts"][part_id] = {}
+            for language, sub_data in part.iteritems():
+                data["parts"][part_id][language] = {}
+
+                for sub_key, stored_subtitle in sub_data.iteritems():
+                    if sub_key == "current":
+                        data["parts"][part_id][language]["current"] = "__".join(stored_subtitle)
+                    else:
+                        # migrate missing encoding data
+                        if stored_subtitle.content and not stored_subtitle.encoding:
+                            # correctly serialize the content
+                            lang = Language.fromietf(language)
+                            subtitle = ModifiedSubtitle(lang)
+                            subtitle.content = stored_subtitle.content
+                            stored_subtitle.encoding = subtitle.guess_encoding()
+
+                        data["parts"][part_id][language]["__".join(sub_key)] = stored_subtitle.serialize()
+
+        return data
+
+    def add(self, part_id, lang, subtitle, storage_type, date_added=None, mode="a"):
+        part_id = str(part_id)
+        part = self.parts.get(part_id)
+        if not part:
+            self.parts[part_id] = {}
+            part = self.parts[part_id]
+
+        subs = part.get(lang)
+        if not subs:
+            part[lang] = {}
+            subs = part[lang]
+
+        sub_key = self.get_sub_key(subtitle.provider_name, subtitle.id)
+        subs[sub_key] = JSONStoredSubtitle()
+        subs[sub_key].initialize(subtitle.score, storage_type, hashlib.md5(subtitle.content).hexdigest(),
+                                 subtitle.provider_name, subtitle.id, date_added=date_added, mode=mode,
+                                 content=subtitle.content, mods=subtitle.mods, encoding=subtitle.guess_encoding())
+        subs["current"] = sub_key
+
+        return True
+
+    def get_any(self, part_id, lang):
+        part_id = str(part_id)
+        part = self.parts.get(part_id)
+        if not part:
+            return
+
+        subs = part.get(lang)
+        if not subs:
+            return
+
+        if "current" in subs and subs["current"]:
+            return subs.get(subs["current"])
+
+    def get_sub_key(self, provider_name, id):
+        return provider_name, str(id)
+
+    def __repr__(self):
+        return unicode(self)
+
+    def __unicode__(self):
+        return u"%s (%s)" % (self.title, self.video_id)
+
+    def __str__(self):
+        return str(self.video_id)
+
+
 class StoredSubtitlesManager(object):
    """
    manages the storage and retrieval of StoredVideoSubtitles instances for a given video_id
    """
    storage = None
    version = 2
+    extension = ".json.gz"

    def __init__(self, storage, plexapi_item_getter):
        self.storage = storage
@@ -130,6 +315,11 @@ class StoredSubtitlesManager(object):
    def dataitems_path(self):
        return os.path.join(getattr(self.storage, "_core").storage.data_path, "DataItems")

+    def get_json_data_path(self, bare_fn):
+        if not bare_fn.endswith(self.extension):
+            return os.path.join(self.dataitems_path, "%s%s" % (bare_fn, self.extension))
+        return os.path.join(self.dataitems_path, bare_fn)
+
    def get_all_files(self):
        return [fn for fn in os.listdir(self.dataitems_path) if fn.startswith("subs_")]

@@ -156,10 +346,13 @@ class StoredSubtitlesManager(object):
    def delete_missing_files(self):
        deleted = []
        for fn in self.get_all_files():
-            video_id = os.path.basename(fn).split("subs_")[1]
+            video_id = os.path.basename(fn).split(".")[0].split("subs_")[1]
            item = self.get_item(video_id)
            if not item:
-                self.delete(fn)
+                if fn.endswith(".json.gz"):
+                    self.delete(self.get_json_data_path(fn))
+                else:
+                    self.legacy_delete(fn)
                deleted.append(video_id)
        return deleted

@@ -172,13 +365,47 @@ class StoredSubtitlesManager(object):
        subs_for_video.version = 2
        return True

+    def migrate_legacy_data(self, from_fn, to_fn):
+        try:
+            subs_for_video = self.storage.LoadObject(from_fn)
+        except:
+            logger.error("Failed to load item \"%s\": %s" % (from_fn, traceback.format_exc()))
+
+            # delete
+            return
+
+        if not subs_for_video or not hasattr(subs_for_video, "version"):
+            return self.legacy_delete(from_fn)
+
+        # migrate to our new json format
+        new_subs_for_video = JSONStoredVideoSubtitles()
+        new_subs_for_video.deserialize(subs_for_video.__dict__)
+        self.save(new_subs_for_video)
+
+        self.legacy_delete(from_fn)
+
+        return new_subs_for_video
+
    def load(self, video_id=None, filename=None):
        subs_for_video = None
-        fn = self.get_storage_filename(video_id) if video_id else filename
-        try:
-            subs_for_video = self.storage.LoadObject(fn)
-        except:
-            logger.error("Failed to load item %s: %s" % (fn, traceback.format_exc()))
+        bare_fn = self.get_storage_filename(video_id) if video_id else filename
+        json_path = self.get_json_data_path(bare_fn)
+        if os.path.exists(json_path):
+            # new style data
+            subs_for_video = JSONStoredVideoSubtitles()
+            try:
+                with gzip.open(json_path, 'rb') as f:
+                    s = f.read()
+
+                data = loads(s)
+            except:
+                logger.error("Couldn't load JSON data for %s", bare_fn)
+                return
+
+            subs_for_video.deserialize(data)
+
+        elif not bare_fn.endswith(".json.gz") and os.path.exists(os.path.join(self.dataitems_path, bare_fn)):
+            subs_for_video = self.migrate_legacy_data(bare_fn, json_path)

        if not subs_for_video:
            return
@@ -196,7 +423,7 @@ class StoredSubtitlesManager(object):
                    success = getattr(self, mig_func)(subs_for_video)
                    if success is False:
                        logger.error("Couldn't migrate %s, removing data", subs_for_video.video_id)
-                        self.delete(fn)
+                        self.delete(json_path)
                        break

            if cur_ver > old_ver and success:
@@ -210,18 +437,29 @@ class StoredSubtitlesManager(object):
    def load_or_new(self, plex_item):
        subs_for_video = self.load(plex_item.rating_key)
        if not subs_for_video:
-            subs_for_video = StoredVideoSubtitles(plex_item, version=self.version)
+            subs_for_video = JSONStoredVideoSubtitles()
+            subs_for_video.initialize(plex_item, version=self.version)
            self.save(subs_for_video)
        return subs_for_video

    def save(self, subs_for_video):
+        data = subs_for_video.serialize()
+        fn = self.get_json_data_path(self.get_storage_filename(subs_for_video.video_id))
+        json_data = dumps(data)
+        with gzip.open(fn, "wb", compresslevel=6) as f:
+            f.write(json_data)
+
+    def delete(self, filename):
+        os.remove(filename)
+
+    def legacy_save(self, subs_for_video):
        fn = self.get_storage_filename(subs_for_video.video_id)
        try:
            self.storage.SaveObject(fn, subs_for_video)
        except:
            logger.error("Failed to save item %s: %s" % (fn, traceback.format_exc()))

-    def delete(self, filename):
+    def legacy_delete(self, filename):
        try:
            self.storage.Remove(filename)
        except:
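The new storage format boils down to JSON written through gzip, with tuple keys flattened to `provider__id` strings. The round trip can be sketched as follows, assuming the standard library `json` module as a stand-in for `json_tricks.nonp` and using made-up ids; `save` and `load` here are simplified, hypothetical versions of the manager methods.

```python
import gzip
import json
import os
import tempfile


def save(path, data):
    # same compression level as StoredSubtitlesManager.save
    with gzip.open(path, "wb", compresslevel=6) as f:
        f.write(json.dumps(data).encode("utf-8"))


def load(path):
    with gzip.open(path, "rb") as f:
        return json.loads(f.read().decode("utf-8"))


# tuple keys cannot be JSON object keys, hence the "__" join on serialize
sub_key = ("opensubtitles", "12345")
data = {"video_id": "42", "parts": {"1": {"en": {"current": "__".join(sub_key)}}}}

path = os.path.join(tempfile.mkdtemp(), "subs_42.json.gz")
save(path, data)

restored = load(path)
# ...and the split on deserialize restores the tuple
current = tuple(restored["parts"]["1"]["en"]["current"].split("__"))
print(current)
```

A provider name or id that itself contains `__` would not survive this split, which is worth keeping in mind when adding providers.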
@@ -10,8 +10,8 @@ from subliminal_patch import scan_video, refine, search_external_subtitles
 logger = logging.getLogger(__name__)


-def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, known_embedded=None, forced_only=False,
-                video_fps=None, dry_run=False):
+def parse_video(fn, video_info, hints, external_subtitles=False, embedded_subtitles=False, known_embedded=None,
+                forced_only=False, video_fps=None, dry_run=False):

    logger.debug("Parsing video: %s, hints: %s", os.path.basename(fn), hints)
    video = scan_video(fn, hints=hints, dont_use_actual_file=dry_run)
@@ -19,29 +19,58 @@ def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, k
    # refiners

    refine_kwargs = {
-        "episode_refiners": ('sz_metadata', 'tvdb', 'omdb'),
-        "movie_refiners": ('sz_metadata', 'omdb',),
+        "episode_refiners": ('tvdb', 'sz_omdb'),
+        "movie_refiners": ('sz_omdb',),
        "embedded_subtitles": False,
    }

+    # our own metadata refiner :)
+    if "stream" in video_info:
+        for key, value in video_info["stream"].iteritems():
+            if hasattr(video, key) and not getattr(video, key):
+                logger.info(u"Adding stream %s info: %s", key, value)
+                setattr(video, key, value)
+
+    plex_title = video_info.get("original_title", video_info.get("title"))
+    if hints["type"] == "episode":
+        plex_title = video_info.get("original_title", video_info.get("series"))
+
+    if not video.year:
+        video.year = video_info.get("year")
+
    refine(video, **refine_kwargs)

+    if not video.imdb_id:
+        video.imdb_id = video_info.get("imdb_id")
+        if video.imdb_id:
+            logger.info(u"Adding PMS imdb_id info: %s", video.imdb_id)
+
+    if hints["type"] == "episode":
+        if not video.series_tvdb_id:
+            logger.info(u"Adding PMS series_tvdb_id info: %s", video_info.get("series_tvdb_id"))
+            video.series_tvdb_id = video_info.get("series_tvdb_id")
+
+        if not video.tvdb_id:
+            logger.info(u"Adding PMS tvdb_id info: %s", video_info.get("tvdb_id"))
+            video.tvdb_id = video_info.get("tvdb_id")
+
    # re-refine with plex's known data?
    refine_with_plex = False

    # episode but wasn't able to match title
-    if hints["type"] == "episode" and not video.series_tvdb_id and not video.tvdb_id and not video.series_imdb_id \
-            and video.series != hints["title"]:
-        logger.info(u"Re-refining with series title: '%s' instead of '%s'", hints["title"], video.series)
-        video.series = hints["title"]
-        refine_with_plex = True
+    if plex_title:
+        if hints["type"] == "episode" and not video.series_tvdb_id and not video.tvdb_id and not video.series_imdb_id \
+                and video.series != plex_title:
+            logger.info(u"Re-refining with series title: '%s' instead of '%s'", plex_title, video.series)
+            video.series = plex_title
+            refine_with_plex = True

-    # movie
-    elif hints["type"] == "movie" and not video.imdb_id and video.title != hints["title"]:
        # movie
-        logger.info(u"Re-refining with series title: '%s' instead of '%s'", hints["title"], video.title)
-        video.title = hints["title"]
-        refine_with_plex = True
+        elif hints["type"] == "movie" and not video.imdb_id and video.title != plex_title:
+            # movie
+            logger.info(u"Re-refining with movie title: '%s' instead of '%s'", plex_title, video.title)
+            video.title = plex_title
+            refine_with_plex = True

    # title not matched? try plex title hint
    if refine_with_plex:
@@ -60,7 +89,6 @@ def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, k
        )

    # add video fps info
-    # fixme: still needed?
    video.fps = video_fps

    # add known embedded subtitles
@@ -77,4 +105,13 @@ def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, k
            logger.debug('Found embedded subtitle %r', embedded_subtitle_languages)
            video.subtitle_languages.update(embedded_subtitle_languages)

+    # guess special
+    if hints["type"] == "episode":
+        if video.season == 0 or video.episode == 0:
+            video.is_special = True
+        else:
+            # check parent folder name
+            if os.path.dirname(fn).split(os.path.sep)[-1].lower() in ("specials", "season 00"):
+                video.is_special = True
+
    return video
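The specials heuristic at the end of `parse_video` can be pulled out into a pure function for testing. A minimal sketch: `guess_special` is a hypothetical name, and the paths are invented examples.

```python
import os


def guess_special(fn, season, episode):
    # season 0 or episode 0 marks a special outright
    if season == 0 or episode == 0:
        return True

    # otherwise fall back to the name of the parent folder
    parent = os.path.dirname(fn).split(os.path.sep)[-1].lower()
    return parent in ("specials", "season 00")


in_specials = os.path.join("tv", "Show", "Specials", "Show - Extra.mkv")
in_season = os.path.join("tv", "Show", "Season 01", "Show - S01E05.mkv")

print(guess_special(in_specials, 1, 5))
print(guess_special(in_season, 1, 5))
print(guess_special(in_season, 0, 3))
```

Only the immediate parent folder is inspected, so a special nested one level deeper would not be detected by this check.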
@@ -4,37 +4,43 @@

 2
 00:00:10,759 --> 00:00:12,678
-ROSE: So what is it?
-What's wrong?
+ROSE: (Help us. Please. . .help us.)
+What's "wrong"? over 9, 000!

 3
 00:00:12,679 --> 00:00:16,097
-I don't know. Some kind
+I don't know. Some kind of wrong "1 00" number
 of signal, drawing the Tardis off course.

 4
 00:00:16,099 --> 00:00:17,224
-Where are we?
+this is a"subtitle" test "with a"text before colons and "peter"following: Where are we?."

 5
 00:00:17,225 --> 00:00:19,684
-Earth. Utah, North America.
+"less text before colons: Earth. Utah, North America."
+MUSIC PLAYS What is that sound?!
+ls it?
+take them balls it

 6
 00:00:19,686 --> 00:00:21,103
-About half a mile underground.
+Ithinkyou're About half a miIe underground. ls it
+Don't fix this countdown: 81, 80, 79, 78
+But fix this: 81 ,00

 7
 00:00:21,103 --> 00:00:23,603
-And when are we?
+<i>(laughing): lrn gonna And when are we? (chuckles)
+lrn gonna And when are we?</i>

 8
 00:00:24,274 --> 00:00:26,649
-2012.
+...2012.  weII it's 1 2:00 o'clock

 9
 00:00:26,650 --> 00:00:29,370
-God, that's so close. I should be 26!
+(BIG BROTHER THEME MUSIC)

 10
 00:00:30,612 --> 00:00:33,112
@@ -43,32 +49,34 @@ God, that's so close. I should be 26!
 11
 00:00:33,658 --> 00:00:34,783
 (WHOO
-SHING) geil
+SHING) >>geil

 12
 00:00:34,783 --> 00:00:36,826
-Blimey.
+-- Blimey.

 13
 00:00:36,828 --> 00:00:39,328
-ROSE: Like a great big museum.
+ROSE: Like a "great...big museum".

 14
 00:00:40,414 --> 00:00:42,914
-DOCTOR: An alien museum.
+DOCTOR's MOM: ''An alien museum".

 15
 00:00:43,542 --> 00:00:46,042
-Someone's got a hobby.
+Someone's got a   hobby.

 16
 00:00:46,378 --> 00:00:49,048
-They must've spent a fortune on this.
+FULL UPPERCASE LINE HERE
+and some text
+- (chuckles)

 17
 00:00:49,631 --> 00:00:51,924
-AGUGU
-pepipi
+<i>AGUGU
+pepipi</i>

 18
 00:00:51,926 --> 00:00:55,304
@@ -263,12 +271,13 @@ Is it talking?

 60
 00:03:45,641 --> 00:03:48,141
-(DRILLING)
+<u>This will end up with an open end tag
+<i>(DRILLING)</i></u>

 61
 00:03:53,233 --> 00:03:56,151
- Not exactly talking, no.
- Then what's it doing?
+- (REMOVE ME <s> PLEASE)
+- Then <i>what's</i> it doing?</s>

 62
 00:03:56,151 --> 00:03:57,235
Author	SHA1	Message	Date
panni	2006ebb244	2.0.20.1364 RC9	2017-05-24 21:47:51 +02:00
panni	58c852cdba	submod: OCR update eng data	2017-05-24 21:41:51 +02:00
panni	9e77a8e304	update guessit to d96859d056864b8956cbeb8c8f5bb6875d270e39	2017-05-24 21:40:12 +02:00
panni	e9817f1e0d	bump version	2017-05-24 18:03:53 +02:00
panni	123dde7b8f	don't verify hashes of specials	2017-05-24 18:02:23 +02:00
panni	c1b84eabdb	improve specials support (opensubtitles), mostly for manual listing	2017-05-24 16:24:23 +02:00
panni	c7ececde77	add doc	2017-05-23 23:06:12 +02:00
panni	6f305d636e	make legandastv subtitle picklable for availablesubsforitem	2017-05-23 22:37:00 +02:00
panni	d25990895c	something's wrong with the menu history key here; add error debug	2017-05-23 22:12:22 +02:00
panni	d406ced759	bump version	2017-05-23 18:08:28 +02:00
panni	b858b56120	add hearing_impaired_verifiable per provider/subtitle and only bail out on force-non-hi if necessary; #289	2017-05-23 18:07:56 +02:00
panni	c94fe81dbf	bump dev version	2017-05-23 17:54:49 +02:00
panni	a67bbebb84	Merge remote-tracking branch 'origin/develop-2.0' into develop-2.0	2017-05-23 13:00:38 +02:00
panni	cf577c81e1	submod: OCR fixes: compile new dictionaries	2017-05-23 13:00:27 +02:00
panni	ad236be02c	submod: OCR fixes: and more.	2017-05-23 12:59:37 +02:00
panni	3412e379d6	submod: better unopened/unclosed font tag handling	2017-05-23 12:49:46 +02:00
panni	95f240ab07	submod: HI: HI_before_colon broke font style tags	2017-05-23 12:17:54 +02:00
panni	0c8ae3f45b	submod: update eng OCR fix data	2017-05-23 11:58:08 +02:00
pannal	fe87944049	Update README.md	2017-05-22 05:48:50 +02:00
pannal	d7918b1714	Update README.md	2017-05-22 05:16:07 +02:00
panni	c147c29756	add wiki notice to notify_executable pref	2017-05-22 02:33:18 +02:00
panni	5a4a50bc9d	add note about enforce_encoding	2017-05-22 02:30:49 +02:00
panni	55ea4009c9	rename exotic_ext prefs to reflect its current function	2017-05-22 02:28:59 +02:00
panni	536fd7dfe4	bump dev version	2017-05-22 02:13:12 +02:00
panni	a1f6568b84	only use the first video stream #270	2017-05-22 02:11:35 +02:00
panni	6a9112f03c	add more known info about the media file/streams; resolves #270	2017-05-22 02:10:26 +02:00
panni	89b4305ccb	don't query plex item twice in case of movies	2017-05-22 01:26:26 +02:00
panni	e2756e85b7	2.0.19.1337 RC8	2017-05-21 15:52:37 +02:00
panni	0f7bc36e86	add fixme	2017-05-21 15:50:12 +02:00
panni	5e20032976	fix findbetter	2017-05-21 15:40:37 +02:00
panni	c7dbac05a9	update guessit to 8d56c9f	2017-05-21 15:35:06 +02:00
panni	a0a5adb807	remove info log	2017-05-21 06:19:41 +02:00
panni	ac6a43f6e5	re-up recently to 2 weeks and 1000 items	2017-05-21 06:13:59 +02:00
panni	91f57da735	fix findallrecentlymissing	2017-05-21 06:13:29 +02:00
panni	488ac604f9	better debug info for findbettersubtitles	2017-05-21 04:09:33 +02:00
panni	70ab3e456f	add missing info to hints and video_info	2017-05-21 04:00:36 +02:00
panni	d0017d2ab8	fix	2017-05-21 03:41:44 +02:00
panni	9633abc09e	ditch OMDB refiner support for now. all needed info comes from the PMS	2017-05-21 01:49:51 +02:00
panni	8f608acc71	submod: OCR update data	2017-05-20 22:31:28 +02:00
panni	dbce582bdf	submod: skip empty line post processors when not needed	2017-05-20 22:24:29 +02:00
panni	62f03bcf11	submod: fix not opened/closed font tags after modification	2017-05-20 16:20:27 +02:00
panni	530eb9ef66	adapt 1.4 readme	2017-05-20 05:22:39 +02:00
panni	497a94e3a5	submod: update dictionaries from SE	2017-05-20 04:07:40 +02:00
panni	e17082d27e	task allrecentlymissing: fix logging	2017-05-20 02:17:41 +02:00
panni	2eefb8e225	fixes; lower default recently added to 1 week	2017-05-20 01:26:38 +02:00
panni	5d9b1a1810	don't re-guess encoding when saving modified subtitle	2017-05-20 00:42:38 +02:00
panni	f274e76253	submod: simplify	2017-05-20 00:35:34 +02:00
panni	3bfef7f67b	submod: break mods.modify up to make it smaller	2017-05-20 00:34:09 +02:00
panni	5d6651e00e	submod: HI: remove obsolete fixme	2017-05-19 23:33:40 +02:00
panni	f0ed0b7c41	submod: common: move CM_double_apostrophe further up the chain	2017-05-19 23:29:49 +02:00
panni	0d4bf7b6b3	submod: common: CM_uppercase_i_in_word: support "WeII" as well	2017-05-19 23:23:55 +02:00
panni	a5c7c656e6	set get my logs link as title2 also	2017-05-19 23:15:18 +02:00
panni	fb3a937c81	submod: add performance debug	2017-05-19 23:11:24 +02:00
panni	e50820abd0	submod: common: fix CM_uppercase_i_in_word	2017-05-19 23:03:17 +02:00
panni	083084136c	don't fall back to utf-8, we should be good here	2017-05-19 22:57:20 +02:00
panni	0188b81220	clarify	2017-05-19 22:55:09 +02:00
panni	c7468dbfb5	submod: OCR add more eng data	2017-05-19 22:53:19 +02:00
panni	d92ba7125e	in case of microdvd, try guessing the fps from the file, else suggest the FPS from our media file. add docs	2017-05-19 22:52:05 +02:00
panni	050d5dd063	add config.enforce_encoding to debug log	2017-05-19 21:54:37 +02:00
panni	a860c57bd1	when force-utf8 is enabled, also store subtitle content in utf-8	2017-05-19 21:52:19 +02:00
panni	1b0b189c16	add more encodings for western, eastern and northern europe	2017-05-19 18:51:52 +02:00
panni	7d2b3d6663	add our pysubs2 to_unicode encoder to PatchedSubtitle; add iso-8859-2 for polish;	2017-05-19 18:42:31 +02:00
panni	2899d68973	add fps to napiprojekt subtitle for when it can't be guessed from the MicroDVD format contents	2017-05-19 18:28:14 +02:00
panni	0cc8238b1a	don't trigger text conversion more than once in is_valid	2017-05-19 17:55:05 +02:00
panni	f277751d86	don't blerg all of the subtitle content into stdout; log the traceback for pysubs2	2017-05-19 17:51:58 +02:00
panni	74d63a9144	2.0.19.1299 RC7	2017-05-19 14:51:22 +02:00
panni	07f7b4e7fb	add fixme	2017-05-19 14:42:58 +02:00
panni	92fda093f7	submod: CM_spaces_in_numbers: don't break up ellipses	2017-05-19 14:38:33 +02:00
panni	714751d2d8	submod: merge mergeable mods; skip duplicate exclusive mods early; make offset args mergeable to avoid nasty stuff like negative offset first, then positive	2017-05-19 14:29:59 +02:00
panni	2c949192b2	submod: improve processing performance by adding some shortcuts	2017-05-19 14:08:36 +02:00
panni	c0e3c6a0eb	submod: improve processing performance by feeding line mods already cleaned-up lines	2017-05-19 13:43:30 +02:00
panni	764484f735	submod: add fixed order to line mods	2017-05-19 03:29:05 +02:00
panni	208bd4fcb2	reset last order change	2017-05-19 03:28:44 +02:00
panni	ba53a5fa93	add more stuff to test.srt	2017-05-18 13:55:50 +02:00
panni	4d40da5661	submod: common: leading crocodile can also have a space in front	2017-05-18 13:49:36 +02:00
panni	4ab157e2a1	submod: re_processor: clean font style tags before processing the line	2017-05-18 13:47:36 +02:00
panni	dbf64d2a2b	submod: HI: make bracket detection more aggressive	2017-05-18 13:44:55 +02:00
panni	03d4ee3482	submod: HI: add HI_starting_upper_then_sentence	2017-05-18 13:17:43 +02:00
panni	959a061380	submod: set default order	2017-05-18 13:17:20 +02:00
panni	f5432dfb9e	submod: OCR: more eng default fixes	2017-05-18 13:16:59 +02:00
panni	fb494a911d	fix character ranges	2017-05-17 20:15:45 +02:00
panni	bc9dec659c	submod: update uppercase after dot to be less greedy	2017-05-17 20:12:20 +02:00
panni	b68cc3f61e	submod: use À-Ž instead of A-Z for patterns	2017-05-17 20:00:38 +02:00
panni	0db80add2c	submod: common: fit non-uppercase after dot	2017-05-17 19:56:41 +02:00
panni	2a67632497	update OCR fix data	2017-05-17 19:17:04 +02:00
panni	5260b28c15	submod: HI: be less aggressive on removing text-before-colon	2017-05-17 19:14:26 +02:00
panni	4d365cba22	submod: don't fix countdown numbers	2017-05-17 19:02:42 +02:00
panni	8174a8efc3	submod fixes english: Âs='s	2017-05-17 19:01:07 +02:00
panni	a5d8df35b6	more stuff for the readme	2017-05-17 18:50:06 +02:00
panni	0ad429ffaa	add automation	2017-05-17 15:23:05 +02:00
panni	3108572387	move changelog for now	2017-05-17 15:17:28 +02:00
panni	98a406ff9e	revert, preformatted looks better	2017-05-17 15:16:51 +02:00
panni	9257550e56	update readme for mods	2017-05-17 15:15:56 +02:00
panni	ef19ed0a26	update readme for mods	2017-05-17 15:13:43 +02:00
panni	80daa8560d	first version of the 2.0 readme	2017-05-17 15:06:53 +02:00
panni	797cc16a91	add cleanline processor; remove Mr->Mr. as it's valid in the UK	2017-05-17 14:26:19 +02:00
panni	771e0464d7	update OCR fixes	2017-05-17 13:51:48 +02:00
panni	715e9c0015	2.0.19.1267 RC6	2017-05-16 18:10:59 +02:00
panni	d13a0c4fb3	submod: allow for more punctuation in spaced numbers; add more english OCR fixes	2017-05-16 17:55:49 +02:00
panni	2bb0517264	correctly handle partiallines	2017-05-16 17:46:56 +02:00
panni	ac174673ef	fix major whoopsie in item details	2017-05-16 14:22:31 +02:00
panni	dacab5ece7	enzyme: fix logging; skip element without type	2017-05-16 14:22:22 +02:00
panni	69a5ef6f18	common fixes: test for leading ellipsis earlier to skip unnecessary CM_ellipsis_no_space	2017-05-16 14:15:19 +02:00
panni	47be8eef62	HI: improve all caps line matching (allow some punctuation)	2017-05-16 14:13:41 +02:00
panni	fe7760e779	color mod: return the original line if color not found	2017-05-16 13:50:05 +02:00
panni	18dddaf0a1	add our own dictionaries to submod fixes	2017-05-16 13:44:23 +02:00
panni	b32066e6f8	don't bother listing nonexistent parts in item details menu	2017-05-16 13:37:02 +02:00
panni	eca378c09e	submod: fix patterns for beginlines/endlines	2017-05-16 13:34:28 +02:00
panni	2c3e4173f4	only append extension to jsonpath if necessary; bail out correctly	2017-05-16 13:00:59 +02:00
panni	488a65055b	cache guessed encoding and don't re-guess every time	2017-05-15 18:47:52 +02:00
panni	cb94f0c2c6	remove invalid comment	2017-05-15 18:09:31 +02:00
panni	8dc4cf8d63	subtitle history: don't fail on old dict data	2017-05-15 18:07:39 +02:00
panni	82ec5e0d5e	only store subtitle info if save was successful	2017-05-15 18:02:33 +02:00
panni	91cebd2902	store encoding of subtitle in storage; store unicode version; add migration task	2017-05-15 18:00:51 +02:00
panni	cecee18d8e	implement new json/gzip based subtitle storage format; auto-migrate legacy data	2017-05-15 17:01:20 +02:00
panni	2b1ea2eb6f	add json_tricks 3.9.0	2017-05-15 16:11:28 +02:00
panni	bc67b380e5	Merge remote-tracking branch 'origin/develop-2.0' into develop-2.0	2017-05-14 02:53:36 +02:00
panni	b7b784f442	clarify not found preferences.xml	2017-05-14 02:53:24 +02:00
pannal	6889effbb6	Update README.md	2017-05-14 02:44:43 +02:00
panni	ae7865ecb8	2.0.18.1245 RC5	2017-05-14 02:31:25 +02:00
panni	83c9d4887b	rename Auto-search to Force-find	2017-05-14 02:26:31 +02:00
panni	75da4dab70	clear up already decoded debug info	2017-05-14 02:25:14 +02:00
panni	07fccf9b52	shift_offset should be non-exclusive	2017-05-14 02:15:20 +02:00
panni	6cfafd60ef	add full color range; add color submod menu	2017-05-14 02:13:12 +02:00
panni	b24bd740c2	fix stupidity. add newline to subtitle line index	2017-05-14 01:36:38 +02:00
panni	6c81ee7b3a	addic7ed: format also matches if release group was correct	2017-05-14 01:33:48 +02:00
panni	cd00194819	add more debug	2017-05-14 01:24:19 +02:00
panni	0eda52e3b2	update readme	2017-05-13 16:47:29 +02:00
panni	56de3b5658	again	2017-05-13 15:00:37 +02:00
panni	b8f31fc36f	forgot version	2017-05-13 15:00:31 +02:00
panni	7354110d2f	pre-release 2.0.15.1234 RC4	2017-05-13 14:59:15 +02:00
panni	c08335b5a8	fail miserably when last-resort utf-8 encoding fails also	2017-05-13 14:49:43 +02:00
panni	f4d9a3c65c	add color mod; add to_unicode to submod	2017-05-13 06:32:40 +02:00
panni	174b73a5cb	doc	2017-05-13 04:55:45 +02:00
panni	5df5123682	simplify data patterns	2017-05-13 04:32:29 +02:00
panni	1aef828fcd	debug mods with repr; (um) = (?um)	2017-05-13 04:11:04 +02:00
panni	6401183eff	increase searchallrecentlymissing wait to 5 seconds per request	2017-05-13 02:13:17 +02:00
panni	82757a2f0c	apply correct path to env on non-windows	2017-05-13 02:05:15 +02:00
panni	736386bc31	try mitigating #27	2017-05-13 01:45:32 +02:00
panni	922bed81fa	resolve #256	2017-05-13 01:34:20 +02:00
panni	708e8c5b14	also print SZ environment variables	2017-05-13 01:26:17 +02:00
panni	1e02082472	don't fail on metadata query timeout	2017-05-13 01:20:10 +02:00
panni	9599bcb70f	searchallrecentlymissing: don't error on timeout; don't fail on no current mods	2017-05-13 01:17:48 +02:00
panni	dad8460574	correctly handle multiple media files with multiple parts; honor physical ignore in missing subtitles	2017-05-12 18:23:53 +02:00
panni	021d12963f	update provider test; add custom repr for napiprojektsubtitle	2017-05-12 16:30:24 +02:00
panni	e5599650ac	implement custom user agent (for OS)	2017-05-12 15:29:44 +02:00
panni	22a1eff98e	backport provider download retry behaviour	2017-05-12 01:28:33 +02:00
panni	2e05eb91ca	also discard provider	2017-05-12 01:18:43 +02:00
panni	031e035a50	2.0.15.1216 RC3	2017-05-08 17:56:25 +02:00
panni	02374575bc	add missing thread.sleep	2017-05-08 17:54:57 +02:00
panni	adef9e1014	only retry on specific RequestExceptions	2017-05-08 17:51:04 +02:00
panni	5bb3f15332	only retry on RequestException	2017-05-08 17:46:44 +02:00
panni	089e0d5d6c	use WholeLineProcessor for WholeLines	2017-05-08 17:40:20 +02:00
panni	513bc2ae8b	use correct sys.modules path; add non-refreshing local subtitle search	2017-05-08 06:01:14 +02:00
panni	8a1c61ac22	2.0.15.1209 RC2	2017-05-08 05:34:32 +02:00
panni	3e1910a28b	2.0.15.1209 RC2	2017-05-08 04:07:24 +02:00
panni	b5e5341436	add generic back options in sub menus	2017-05-08 03:59:53 +02:00
panni	223ef16583	add back menu items for season/episodes	2017-05-08 03:40:07 +02:00
panni	114312e1e5	rename leeway to sleep_after_request	2017-05-08 02:30:36 +02:00
panni	1a49159b64	by default don't download better subtitles for manually modified ones	2017-05-08 02:22:47 +02:00
panni	d0ee9badb2	don't cleanup matching custom or embedded tag	2017-05-08 02:08:34 +02:00
panni	b9116c30ed	debounce crucial items in advanced menu	2017-05-08 02:03:22 +02:00
panni	d7e6436d8d	stagger less	2017-05-08 01:41:40 +02:00
panni	c039172880	stagger thread creation on scheduled and manual (GUI) triggered tasks; react faster on requested task run	2017-05-08 01:39:34 +02:00
panni	bd5da47370	adjust leeway to 0.2s	2017-05-08 01:29:17 +02:00
panni	e9aabe0a5e	spawn scheduled tasks in separate threads	2017-05-08 01:26:59 +02:00
panni	f3f09dbb9d	stagger SearchAllRecentlyAddedMissing	2017-05-08 01:26:33 +02:00
panni	3cc8a98f67	stagger FindBetter by 1 second per item	2017-05-08 01:07:28 +02:00
panni	31e923c080	reduce submod shift minute range from -59/60 to -15/15	2017-05-07 22:39:49 +02:00
panni	39b3b4a0c2	move update_local_media before ignore list checking	2017-05-07 22:21:24 +02:00
panni	8470daa20f	more debug info when loading stored sub info; delete invalid sub info when loading; don't fail apply_default_mods on invalid sub info	2017-05-07 06:17:03 +02:00
panni	e852137baf	rename titles for on-deck and recently added items menu items	2017-05-07 05:32:48 +02:00
panni	753c46d9fd	move PartUnknownException to helpers; add items.set_mods_for_part; add ApplyDefaultMods and ReApplyMods to advanced menu	2017-05-07 05:32:23 +02:00
panni	e06ca730a2	make amount of stored recently played items dynamic	2017-05-07 05:31:02 +02:00
panni	f84e84b17b	allow wrong subtitle FPS when manually listing subtitles	2017-05-07 05:16:12 +02:00
panni	4f927b272b	log no better subtitles found	2017-05-07 04:41:36 +02:00
panni	662e1a93a9	store last 20 played items; shift last played item accordingly if already in last played list	2017-05-07 03:40:41 +02:00
panni	e25a043457	return save_successful on save_subtitles	2017-05-07 02:47:06 +02:00
panni	b32f923513	add subtitle modification debug setting; also apply mods on metadata-stored subtitles	2017-05-07 02:45:12 +02:00
panni	ad8898266e	mod: common: fix starting space dots	2017-05-07 02:22:37 +02:00
panni	51e87bdda5	don't crash the menu when no mods are applied on the current subtitle	2017-05-06 18:07:53 +02:00
panni	f88677b0f6	fix common fixes description	2017-05-06 18:04:20 +02:00
panni	fc71ec0250	remove unnecessary debounces	2017-05-06 18:00:40 +02:00
panni	ca6089c220	Pre-Release 2.0.12.1180 RC1	2017-05-06 17:49:58 +02:00
panni	7cc051fd90	set default movie score to lowest (60)	2017-05-06 17:43:38 +02:00
panni	5b01fda526	adapt forced_only for new providers (disable them)	2017-05-06 17:37:31 +02:00
panni	585f6b8a4d	rename config.use_activities to react_to_activities and act accordingly	2017-05-06 17:29:11 +02:00
panni	81aeba0874	use added icon instead of recent icon for recently added menu	2017-05-06 17:24:05 +02:00
panni	d9133e2793	add recently played menu	2017-05-06 17:22:33 +02:00
panni	9ef740ae1f	remove_HI: less aggressive bracket content matching	2017-05-06 16:53:32 +02:00
panni	e54fe71e93	reduce addicted default boost to 21	2017-05-06 16:46:54 +02:00
panni	9df878b8e3	add common fixes as default; remove debug print	2017-05-06 16:46:22 +02:00
panni	1a59c267c1	remove doublequote processors, doesn't seem possible	2017-05-06 16:42:07 +02:00
panni	f8a07d983b	fix typo resolves #274	2017-05-06 15:28:40 +02:00
panni	1f1847f246	change doublequote regexes	2017-05-06 06:48:52 +02:00
panni	a32dfd6b37	add common fixes	2017-05-06 06:14:58 +02:00
panni	b1cce92e04	use positive lookahead for HI all caps line detection	2017-05-06 01:35:43 +02:00
panni	fdf32439c9	don't remove dash-in-front on hearing impaired; skip empty lines properly	2017-05-06 01:26:17 +02:00
panni	fc2208f9e5	bump version	2017-05-05 19:32:12 +02:00
panni	1a4eb366bb	add helping indicator to FPS mod; add 30fps	2017-05-05 19:31:43 +02:00
panni	b89c64a2c2	add modification management menu	2017-05-05 19:19:34 +02:00
panni	68e8f6e753	don't remove HI by default	2017-05-05 19:11:43 +02:00
panni	f15cc4cb3c	add offset shifter submod	2017-05-05 19:10:32 +02:00
panni	903273e3ef	add advanced submods; add global (non-line) submods; test implementation of ChangeFPS mod	2017-05-05 15:39:18 +02:00
panni	1c9b744d31	move subtitle modification menu to separate file	2017-05-05 14:58:19 +02:00
panni	7c0fb29886	fix init_cache whoopsie	2017-05-05 14:58:06 +02:00
panni	2505a7510c	enzyme: incorporate 0.4.2 fixes	2017-05-05 14:44:59 +02:00
panni	0a66db40a2	fix findbetter	2017-05-05 14:30:49 +02:00
panni	6c68893979	add mod.long_description; add remove_last action to subtitle modification menu	2017-05-04 20:10:35 +02:00
panni	c512eab0b6	testcommit	2017-05-04 20:00:12 +02:00
panni	3cedd4bd0f	try getting plex token from environment by default	2017-05-04 19:33:05 +02:00
panni	0759c5e4c6	add environment debug	2017-05-04 19:31:07 +02:00
panni	ad6cf4be79	move config debug to better position; verify readability of log files	2017-05-04 19:15:38 +02:00
panni	23c3899fb2	add fixme	2017-05-04 14:30:25 +02:00
panni	1a6515a660	add platform and os to config debug	2017-05-04 14:20:29 +02:00
panni	58815a7650	use external ip fallback when logs were requested from plex.tv	2017-05-04 14:16:10 +02:00
panni	c15ec9fefc	disable get_logs when universal plex token is None	2017-05-04 13:49:02 +02:00
panni	0e18d59680	2.0.0.12	2017-05-03 23:12:42 +02:00
panni	2d88efa5b4	add doc	2017-05-03 23:12:26 +02:00
panni	b3da7572f3	add PartialWordsAlways to OCR_fixes	2017-05-03 23:11:02 +02:00
panni	099ec4e85d	remove debug print; add doc	2017-05-03 23:04:25 +02:00
panni	ff88a15c61	reset initialized mods after load	2017-05-03 22:59:47 +02:00
panni	839791b0fa	add OCR fixes as default; fix little whoopsie in SubtitleModifications.modify	2017-05-03 22:52:33 +02:00
panni	159a533731	add precompiled patterns to data dict; add more parsed data; add OCR fixes finally	2017-05-03 22:44:54 +02:00
panni	fb5835baa4	separate ocr fix data further into line, word, partial	2017-05-03 15:19:52 +02:00
panni	a3f05cd597	separate partial and full replace data	2017-05-03 15:16:22 +02:00
panni	f3af1672f6	use memory cache on windows for now; add config debug logging	2017-05-03 13:33:29 +02:00
panni	c984c9849b	only add better subtitle if its score is higher than the minimum configured	2017-05-02 21:37:40 +02:00
panni	e28d264125	language conversion test	2017-05-02 19:22:57 +02:00
panni	7166ab9502	use default mods in tasks as well	2017-05-02 18:47:58 +02:00
panni	ab242c2ecb	add current find/replace data	2017-05-02 18:43:45 +02:00
panni	6f829dd4c7	move xmls to xml/; add make_data and test_data script;	2017-05-02 18:43:35 +02:00
panni	3e0602cdf0	add OCRFixReplaceList dictionaries of SubtitleEdit; commit 4f43a84c354d53251614fe6fa4c1b9df92839f57; add second test srt	2017-05-02 18:03:17 +02:00
panni	67cdebfb67	make subtitle modifications a subpackage of subzero	2017-05-02 18:01:46 +02:00
panni	0f87973742	modify test.srt to accommodate special chars in text-before-colon; handle special chars in HI_before_colon better	2017-05-02 17:42:39 +02:00
panni	92317f7730	add task run info logging	2017-05-01 05:37:38 +02:00
panni	ce936c2553	add task debug	2017-05-01 05:37:09 +02:00