Compare commits
237 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| 2006ebb244 | |||
| 58c852cdba | |||
| 9e77a8e304 | |||
| e9817f1e0d | |||
| 123dde7b8f | |||
| c1b84eabdb | |||
| c7ececde77 | |||
| 6f305d636e | |||
| d25990895c | |||
| d406ced759 | |||
| b858b56120 | |||
| c94fe81dbf | |||
| a67bbebb84 | |||
| cf577c81e1 | |||
| ad236be02c | |||
| 3412e379d6 | |||
| 95f240ab07 | |||
| 0c8ae3f45b | |||
| fe87944049 | |||
| d7918b1714 | |||
| c147c29756 | |||
| 5a4a50bc9d | |||
| 55ea4009c9 | |||
| 536fd7dfe4 | |||
| a1f6568b84 | |||
| 6a9112f03c | |||
| 89b4305ccb | |||
| e2756e85b7 | |||
| 0f7bc36e86 | |||
| 5e20032976 | |||
| c7dbac05a9 | |||
| a0a5adb807 | |||
| ac6a43f6e5 | |||
| 91f57da735 | |||
| 488ac604f9 | |||
| 70ab3e456f | |||
| d0017d2ab8 | |||
| 9633abc09e | |||
| 8f608acc71 | |||
| dbce582bdf | |||
| 62f03bcf11 | |||
| 530eb9ef66 | |||
| 497a94e3a5 | |||
| e17082d27e | |||
| 2eefb8e225 | |||
| 5d9b1a1810 | |||
| f274e76253 | |||
| 3bfef7f67b | |||
| 5d6651e00e | |||
| f0ed0b7c41 | |||
| 0d4bf7b6b3 | |||
| a5c7c656e6 | |||
| fb3a937c81 | |||
| e50820abd0 | |||
| 083084136c | |||
| 0188b81220 | |||
| c7468dbfb5 | |||
| d92ba7125e | |||
| 050d5dd063 | |||
| a860c57bd1 | |||
| 1b0b189c16 | |||
| 7d2b3d6663 | |||
| 2899d68973 | |||
| 0cc8238b1a | |||
| f277751d86 | |||
| 74d63a9144 | |||
| 07f7b4e7fb | |||
| 92fda093f7 | |||
| 714751d2d8 | |||
| 2c949192b2 | |||
| c0e3c6a0eb | |||
| 764484f735 | |||
| 208bd4fcb2 | |||
| ba53a5fa93 | |||
| 4d40da5661 | |||
| 4ab157e2a1 | |||
| dbf64d2a2b | |||
| 03d4ee3482 | |||
| 959a061380 | |||
| f5432dfb9e | |||
| fb494a911d | |||
| bc9dec659c | |||
| b68cc3f61e | |||
| 0db80add2c | |||
| 2a67632497 | |||
| 5260b28c15 | |||
| 4d365cba22 | |||
| 8174a8efc3 | |||
| a5d8df35b6 | |||
| 0ad429ffaa | |||
| 3108572387 | |||
| 98a406ff9e | |||
| 9257550e56 | |||
| ef19ed0a26 | |||
| 80daa8560d | |||
| 797cc16a91 | |||
| 771e0464d7 | |||
| 715e9c0015 | |||
| d13a0c4fb3 | |||
| 2bb0517264 | |||
| ac174673ef | |||
| dacab5ece7 | |||
| 69a5ef6f18 | |||
| 47be8eef62 | |||
| fe7760e779 | |||
| 18dddaf0a1 | |||
| b32066e6f8 | |||
| eca378c09e | |||
| 2c3e4173f4 | |||
| 488a65055b | |||
| cb94f0c2c6 | |||
| 8dc4cf8d63 | |||
| 82ec5e0d5e | |||
| 91cebd2902 | |||
| cecee18d8e | |||
| 2b1ea2eb6f | |||
| bc67b380e5 | |||
| b7b784f442 | |||
| 6889effbb6 | |||
| ae7865ecb8 | |||
| 83c9d4887b | |||
| 75da4dab70 | |||
| 07fccf9b52 | |||
| 6cfafd60ef | |||
| b24bd740c2 | |||
| 6c81ee7b3a | |||
| cd00194819 | |||
| 0eda52e3b2 | |||
| 56de3b5658 | |||
| b8f31fc36f | |||
| 7354110d2f | |||
| c08335b5a8 | |||
| f4d9a3c65c | |||
| 174b73a5cb | |||
| 5df5123682 | |||
| 1aef828fcd | |||
| 6401183eff | |||
| 82757a2f0c | |||
| 736386bc31 | |||
| 922bed81fa | |||
| 708e8c5b14 | |||
| 1e02082472 | |||
| 9599bcb70f | |||
| dad8460574 | |||
| 021d12963f | |||
| e5599650ac | |||
| 22a1eff98e | |||
| 2e05eb91ca | |||
| 031e035a50 | |||
| 02374575bc | |||
| adef9e1014 | |||
| 5bb3f15332 | |||
| 089e0d5d6c | |||
| 513bc2ae8b | |||
| 8a1c61ac22 | |||
| 3e1910a28b | |||
| b5e5341436 | |||
| 223ef16583 | |||
| 114312e1e5 | |||
| 1a49159b64 | |||
| d0ee9badb2 | |||
| b9116c30ed | |||
| d7e6436d8d | |||
| c039172880 | |||
| bd5da47370 | |||
| e9aabe0a5e | |||
| f3f09dbb9d | |||
| 3cc8a98f67 | |||
| 31e923c080 | |||
| 39b3b4a0c2 | |||
| 8470daa20f | |||
| e852137baf | |||
| 753c46d9fd | |||
| e06ca730a2 | |||
| f84e84b17b | |||
| 4f927b272b | |||
| 662e1a93a9 | |||
| e25a043457 | |||
| b32f923513 | |||
| ad8898266e | |||
| 51e87bdda5 | |||
| f88677b0f6 | |||
| fc71ec0250 | |||
| ca6089c220 | |||
| 7cc051fd90 | |||
| 5b01fda526 | |||
| 585f6b8a4d | |||
| 81aeba0874 | |||
| d9133e2793 | |||
| 9ef740ae1f | |||
| e54fe71e93 | |||
| 9df878b8e3 | |||
| 1a59c267c1 | |||
| f8a07d983b | |||
| 1f1847f246 | |||
| a32dfd6b37 | |||
| b1cce92e04 | |||
| fdf32439c9 | |||
| fc2208f9e5 | |||
| 1a4eb366bb | |||
| b89c64a2c2 | |||
| 68e8f6e753 | |||
| f15cc4cb3c | |||
| 903273e3ef | |||
| 1c9b744d31 | |||
| 7c0fb29886 | |||
| 2505a7510c | |||
| 0a66db40a2 | |||
| 6c68893979 | |||
| c512eab0b6 | |||
| 3cedd4bd0f | |||
| 0759c5e4c6 | |||
| ad6cf4be79 | |||
| 23c3899fb2 | |||
| 1a6515a660 | |||
| 58815a7650 | |||
| c15ec9fefc | |||
| 0e18d59680 | |||
| 2d88efa5b4 | |||
| b3da7572f3 | |||
| 099ec4e85d | |||
| ff88a15c61 | |||
| 839791b0fa | |||
| 159a533731 | |||
| fb5835baa4 | |||
| a3f05cd597 | |||
| f3af1672f6 | |||
| c984c9849b | |||
| e28d264125 | |||
| 7166ab9502 | |||
| ab242c2ecb | |||
| 6f829dd4c7 | |||
| 3e0602cdf0 | |||
| 67cdebfb67 | |||
| 0f87973742 | |||
| 92317f7730 | |||
| ce936c2553 |
+107
@@ -1,3 +1,110 @@
2.0.19.1337 RC8
- napiprojekt: fixed: couldn't convert MicroDVD to SRT on certain occasions
- core: when normalize to UTF-8 is enabled, also store the subtitle in UTF-8 encoding in the internal storage
- core: add more encodings for Western/Eastern/Northern Europe
- submod: OCR: update dictionaries from SubtitleEdit
- submod: common: be smarter about uppercase I's in words that should have lowercase L's
- submod: fix unopened/unclosed font style tags after modification
- core: re-enable OMDB support
- core: update guessit for better matching
- core: fix SearchAllRecentlyMissing (was broken since RC3)
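The uppercase-I entry in the list above can be pictured roughly as follows. This is a minimal sketch and not Sub-Zero's actual pattern: the function name `fix_uppercase_i` and the regular expression are assumptions, and the rule only covers a run of capital `I` that directly follows a lowercase letter.

```python
import re

# Hypothetical rule: capital "I" characters that directly follow a lowercase
# letter inside a word are most likely misread lowercase "l" characters.
_MISREAD_I = re.compile(r"(?<=[a-z])I+")


def fix_uppercase_i(text):
    """Replace each misread capital I with a lowercase l."""
    return _MISREAD_I.sub(lambda match: "l" * len(match.group()), text)


print(fix_uppercase_i("He feIl off the waII"))
```

Standalone words such as `I` or `It` and all-caps words are left alone because the lookbehind requires a preceding lowercase letter.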

2.0.19.1299 RC7
- submod: offset mods now get merged internally when applied multiple times (to avoid errors and increase performance)
- submod: improve performance
- submod: core mods (OCR, common, remove_HI) are now always applied in a fixed order internally, regardless of the order in which they were added
- submod: CM_spaces_in_numbers: don't break up ellipses (30... 29... 28...)
- submod: CM_spaces_in_numbers: don't fix countdown numbers (30, 29, 28)
- submod: remove_HI: make bracket removal more aggressive
- submod: remove_HI: be less aggressive when removing text-before-colon
- submod: remove_HI: remove all-uppercase-before-sentence (THIS IS ALL UPPERCASE And here starts a sentence -> And here starts a sentence)
- submod: fix all character ranges to include non-ASCII characters
- add new README for 2.0
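The two submod entries about merged offsets and fixed core-mod order can be sketched as one normalization step. Everything here is an assumption for illustration: the helper `normalize_mods`, the textual form `offset(ms=N)` for an offset mod, and the placement of the merged offset at the end are hypothetical, not the plugin's real representation.

```python
import re

# Core mods named in the changelog, in an assumed fixed order.
CORE_ORDER = ["OCR", "common", "remove_HI"]
# Hypothetical textual form of an offset mod, e.g. "offset(ms=-250)".
_OFFSET = re.compile(r"^offset\(ms=(-?\d+)\)$")


def normalize_mods(mods):
    """Order core mods deterministically and merge all offset mods into one."""
    core = [mod for mod in CORE_ORDER if mod in mods]
    total_ms = 0
    others = []
    for mod in mods:
        match = _OFFSET.match(mod)
        if match:
            # Offsets are additive, so repeated shifts collapse into one.
            total_ms += int(match.group(1))
        elif mod not in CORE_ORDER and mod not in others:
            others.append(mod)
    merged = ["offset(ms=%d)" % total_ms] if total_ms else []
    return core + others + merged


print(normalize_mods(["remove_HI", "offset(ms=500)", "OCR", "offset(ms=-200)"]))
```

Merging first means the subtitle is shifted once instead of once per stored offset, which is one plausible reading of "avoid errors and increase performance".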

2.0.19.1267 RC6
- core: add new SZ subtitle storage format
  - smaller data files and less cumbersome
  - it will auto-migrate when old data is accessed; to speed this up, use "Trigger subtitle storage migration (expensive)" in the advanced menu
- core: performance optimizations
- addic7ed: when the release group matches, assume the format matches, too (leftover change from RC5)
- submod: fix patterns for beginlines/endlines
- submod: add our own dictionaries to OCR fixes (English)
- submod: hearing impaired: also remove full-caps with punctuation inside
- submod: correctly handle partiallines
- submod: in numbers with spaces (incorrect), also allow for some punctuation (,.:')
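The "auto-migrate when old data is accessed" behaviour of the new storage format is a lazy migration. The sketch below shows the general idea only: the file layout (`<key>.json` as the old format, `<key>.json.gz` as the new one) and the function `load_subtitle_data` are hypothetical and say nothing about how the SZ format is actually stored.

```python
import gzip
import json
import os


def load_subtitle_data(directory, key):
    """Load stored data, converting an old-format file on first access."""
    new_path = os.path.join(directory, key + ".json.gz")  # assumed new format
    old_path = os.path.join(directory, key + ".json")     # assumed old format

    if os.path.exists(new_path):
        with gzip.open(new_path, "rt", encoding="utf-8") as handle:
            return json.load(handle)

    if os.path.exists(old_path):
        with open(old_path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
        # Write the new format first, then drop the old file.
        with gzip.open(new_path, "wt", encoding="utf-8") as handle:
            json.dump(data, handle)
        os.remove(old_path)
        return data

    return None
```

A bulk migration task, like the one the advanced menu entry triggers, would amount to calling such a loader for every stored key up front instead of waiting for each item to be accessed.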

2.0.18.1245 RC5
- core: add more debug info
- core: fix subtitle modifications (was broken in RC4, created non-usable subtitles)
- submod: add ANSI colors
- menu/submod: add color mod menu
- submod: exclusive mods are now mutually exclusive and get cleaned on duplicate
- menu/core: naming

For everyone who runs RC4: your subtitles are broken. Go to the advanced menu and trigger `Re-Apply mods of all stored subtitles` to fix them.

2.0.17.1234 RC4
- core: backport provider-download-retry implementation
- core: implement custom user agent (for OpenSubtitles)
- core/menu: correct handling of media with multiple files
- core: fix SearchAllRecentlyMissing; also wait 5 seconds between searches
- core: SearchAllRecentlyMissing: honor physical ignores
- submod: pattern fixes
- submod: better unicode handling
- submod: add color mod (automatic only for now)
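The retry and "wait 5 seconds" entries above describe a familiar pattern: try again after a pause instead of failing on the first error. A minimal sketch, assuming a hypothetical helper `download_with_retry`; the retry count of 3 is an assumption, and only the 5-second pause comes from the changelog.

```python
import time


def download_with_retry(download, retries=3, delay=5.0, sleep=time.sleep):
    """Call download() until it succeeds or the retries are used up."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return download()
        except Exception as error:
            last_error = error
            # Pause between attempts, but not after the final one.
            if attempt < retries:
                sleep(delay)
    raise last_error
```

Passing `sleep` as a parameter keeps the helper testable without actually waiting.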

2.0.15.1216 RC3
- core: fixes
- scheduler: revert some of the aggressive changes in RC2
- submod: be smarter about WholeLine matches

2.0.15.1209 RC2
- core: fixes
- core: submod-common: fix multiple dots at start of line
- core/menu: add subtitle modification debug setting
- core/menu: when manually listing available subtitles in the menu, also display those with the wrong FPS (opensubtitles), because you can fix them later
- core/menu: advanced-menu: add apply-all-default-mods menu item; add re-apply all mods menu item
- core: always look for currently (not-) existing subtitles when called; hopefully fixes #276
- scheduler/menu: be faster; also launch scheduled tasks in threads, not just manually launched ones
- core: don't delete subtitles with .custom or .embedded in their filenames when running auto cleanup, if the correct media file exists
- menu: add back-to-previous menu items
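The auto cleanup exception for `.custom` and `.embedded` subtitles could look roughly like this. It is a sketch under stated assumptions: the helper `is_protected_subtitle`, the naming scheme `<media stem>.custom.<lang>.srt`, and the extension list are hypothetical and may not match what the plugin checks.

```python
import os

PROTECTED_MARKERS = (".custom", ".embedded")
# Assumed list of media extensions; the real plugin may use a different set.
VIDEO_EXTENSIONS = (".mkv", ".mp4", ".avi")


def is_protected_subtitle(subtitle_path, existing_names):
    """Return True if auto cleanup should leave this subtitle alone."""
    name = os.path.basename(subtitle_path)
    lowered = name.lower()
    for marker in PROTECTED_MARKERS:
        position = lowered.find(marker)
        if position > 0:
            # Everything before the marker is taken as the media file stem.
            stem = name[:position]
            return any(stem + ext in existing_names for ext in VIDEO_EXTENSIONS)
    return False
```

`existing_names` is meant to be the set of file names in the subtitle's folder, so the subtitle is only protected while its media file is still there.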

2.0.12.1180 RC1
- core: update subliminal to version 2
- core: update all dependencies
- core: add new providers: legendastv (pt-BR), napiprojekt (pl), shooter (cn), subscenter (heb)
- core: rewrite all subliminal patches for version 2
- menu: add icons for menu items; update main channel icon
- core: use SSL again for opensubtitles
- core: improved matching due to subliminal 2 (and SZ custom) tvdb/omdb refiners
- menu: add "Get my logs" function to the advanced menu, which zips up all necessary logs suitable for posting in the forums
- core: on non-Windows systems, utilize a file-based cache database for provider media lists and subliminal refiner results
- core: add manual and automatic subtitle modification framework (fix common OCR issues, remove hearing impaired etc.)
- menu: add subtitle modifications (subtitle content fixes, offset-based shifting, framerate conversion)
- menu: add recently played menu
- improve almost everything Sub-Zero did in 1.4 :)

1.4.27.973
- core: ignore "obfuscated" and "scrambled" tags in filenames when searching for subtitles
- core: exotic embedded subtitles are now also considered when searching (and when the option is enabled); fixes #264
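Ignoring the "obfuscated" and "scrambled" tags amounts to cleaning the release name before it is parsed. A minimal sketch, assuming a hypothetical helper `strip_noise_tags`; the plugin may handle this differently, for example inside its filename parser.

```python
import os
import re

# A tag only counts when it is delimited by the usual release-name separators.
_NOISE_TAGS = re.compile(r"[-._ ](?:obfuscated|scrambled)(?=[-._ ]|$)", re.IGNORECASE)


def strip_noise_tags(filename):
    """Remove obfuscated/scrambled tags from a filename, keeping the extension."""
    stem, extension = os.path.splitext(filename)
    return _NOISE_TAGS.sub("", stem) + extension


print(strip_noise_tags("Show.S01E02.720p.HDTV-GRP-Obfuscated.mkv"))
```

Requiring a separator in front of the tag keeps ordinary title words such as "Unscrambled" intact.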

1.4.27.967
- core: remember the last 10 played items; only consider on_playback for "playing" state within the first 60 seconds of an item
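The entry above combines a bounded history with a time window. The sketch below is one way to express that; the class `RecentlyPlayed`, its method names, and the choice to skip items that are still remembered are assumptions, while the limits of 10 items and 60 seconds come from the changelog.

```python
from collections import deque

MAX_REMEMBERED = 10           # "remember the last 10 played items"
PLAYBACK_WINDOW_SECONDS = 60  # "within the first 60 seconds of an item"


class RecentlyPlayed(object):
    def __init__(self):
        # A deque with maxlen silently drops the oldest entry when full.
        self._items = deque(maxlen=MAX_REMEMBERED)

    def should_handle(self, rating_key, state, view_offset_seconds):
        """Return True once per remembered item, early in playback only."""
        if state != "playing" or view_offset_seconds > PLAYBACK_WINDOW_SECONDS:
            return False
        if rating_key in self._items:
            return False
        self._items.append(rating_key)
        return True

    def items(self):
        return list(self._items)
```

Once an item falls out of the 10-entry history it is treated as new again the next time it is played.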

1.4.27.965
- core: on_playback activity bugfixes


1.4.27.957
- core: correctly fall back to the next best subtitle if the current one couldn't be downloaded; hopefully fixes #231
- core: add "Scan: which external subtitles should be picked up?"-setting
@@ -24,12 +24,11 @@ import support
 import interface
 sys.modules["interface"] = interface
 
-from subliminal.cli import MutexLock
 from subzero.constants import OS_PLEX_USERAGENT, PERSONAL_MEDIA_IDENTIFIER
 from interface.menu import *
 from support.plex_media import media_to_videos, get_media_item_ids, scan_videos
 from support.subtitlehelpers import get_subtitles_from_metadata
-from support.storage import whack_missing_parts, save_subtitles, get_subtitle_storage
+from support.storage import whack_missing_parts, save_subtitles
 from support.items import is_ignored
 from support.config import config
 from support.lib import get_intent
@@ -43,13 +42,7 @@ def Start():
     HTTP.CacheTime = 0
     HTTP.Headers['User-agent'] = OS_PLEX_USERAGENT
 
-    try:
-        subliminal.region.configure('dogpile.cache.dbm', expiration_time=datetime.timedelta(days=30),
-                                    arguments={'filename': os.path.join(config.data_items_path, 'subzero.dbm'),
-                                               'lock_factory': MutexLock})
-    except:
-        Log.Warn("Not using file based cache!")
-        subliminal.region.configure('dogpile.cache.memory')
+    config.init_cache()
 
     # clear expired intents
     intent = get_intent()
@@ -191,6 +184,9 @@ class SubZeroAgent(object):
|
||||
config.init_subliminal_patches()
|
||||
videos = media_to_videos(media, kind=self.agent_type)
|
||||
|
||||
# find local media
|
||||
update_local_media(metadata, media, media_type=self.agent_type)
|
||||
|
||||
# media ignored?
|
||||
use_any_parts = False
|
||||
for video in videos:
|
||||
@@ -211,9 +207,6 @@ class SubZeroAgent(object):
|
||||
|
||||
set_refresh_menu_state(media, media_type=self.agent_type)
|
||||
|
||||
# find local media
|
||||
update_local_media(metadata, media, media_type=self.agent_type)
|
||||
|
||||
# scanned_video_part_map = {subliminal.Video: plex_part, ...}
|
||||
scanned_video_part_map = scan_videos(videos, kind=self.agent_type)
|
||||
|
||||
|
||||
@@ -18,3 +18,6 @@ sys.modules["interface.refresh_item"] = refresh_item
|
||||
|
||||
import item_details
|
||||
sys.modules["interface.item_details"] = item_details
|
||||
|
||||
import sub_mod
|
||||
sys.modules["interface.modification"] = sub_mod
|
||||
|
||||
@@ -3,19 +3,23 @@ import datetime
|
||||
import StringIO
|
||||
import glob
|
||||
import os
|
||||
import traceback
|
||||
import urlparse
|
||||
|
||||
from zipfile import ZipFile, ZIP_DEFLATED
|
||||
|
||||
from babelfish import Language
|
||||
|
||||
from subzero.lib.io import FileIO
|
||||
from subzero.constants import PREFIX, PLUGIN_IDENTIFIER
|
||||
from menu_helpers import SubFolderObjectContainer, debounce, set_refresh_menu_state, ZipObject
|
||||
from menu_helpers import SubFolderObjectContainer, debounce, set_refresh_menu_state, ZipObject, ObjectContainer
|
||||
from main import fatality
|
||||
from support.helpers import timestamp, pad_title
|
||||
from support.config import config
|
||||
from support.lib import Plex
|
||||
from support.storage import reset_storage, log_storage
|
||||
from support.storage import reset_storage, log_storage, get_subtitle_storage
|
||||
from support.scheduler import scheduler
|
||||
from support.items import set_mods_for_part, get_item_kind_from_rating_key
|
||||
|
||||
|
||||
@route(PREFIX + '/advanced')
|
||||
@@ -49,6 +53,18 @@ def AdvancedMenu(randomize=None, header=None, message=None):
         key=Callback(TriggerStorageMaintenance, randomize=timestamp()),
         title=pad_title("Trigger subtitle storage maintenance"),
     ))
+    oc.add(DirectoryObject(
+        key=Callback(TriggerStorageMigration, randomize=timestamp()),
+        title=pad_title("Trigger subtitle storage migration (expensive)"),
+    ))
+    oc.add(DirectoryObject(
+        key=Callback(ApplyDefaultMods, randomize=timestamp()),
+        title=pad_title("Apply configured default subtitle mods to all (active) stored subtitles"),
+    ))
+    oc.add(DirectoryObject(
+        key=Callback(ReApplyMods, randomize=timestamp()),
+        title=pad_title("Re-Apply mods of all stored subtitles"),
+    ))
     oc.add(DirectoryObject(
         key=Callback(LogStorage, key="tasks", randomize=timestamp()),
         title=pad_title("Log the plugin's scheduled tasks state storage"),
@@ -92,6 +108,7 @@ def Restart():
|
||||
|
||||
|
||||
@route(PREFIX + '/storage/reset', sure=bool)
|
||||
@debounce
|
||||
def ResetStorage(key, randomize=None, sure=False):
|
||||
if not sure:
|
||||
oc = SubFolderObjectContainer(no_history=True, title1="Reset subtitle storage", title2="Are you sure?")
|
||||
@@ -127,6 +144,7 @@ def LogStorage(key, randomize=None):
|
||||
|
||||
|
||||
@route(PREFIX + '/triggerbetter')
|
||||
@debounce
|
||||
def TriggerBetterSubtitles(randomize=None):
|
||||
scheduler.dispatch_task("FindBetterSubtitles")
|
||||
return AdvancedMenu(
|
||||
@@ -137,6 +155,7 @@ def TriggerBetterSubtitles(randomize=None):
|
||||
|
||||
|
||||
@route(PREFIX + '/triggermaintenance')
|
||||
@debounce
|
||||
def TriggerStorageMaintenance(randomize=None):
|
||||
scheduler.dispatch_task("SubtitleStorageMaintenance")
|
||||
return AdvancedMenu(
|
||||
@@ -146,27 +165,111 @@ def TriggerStorageMaintenance(randomize=None):
     )
 
 
+@route(PREFIX + '/triggerstoragemigration')
+@debounce
+def TriggerStorageMigration(randomize=None):
+    scheduler.dispatch_task("MigrateSubtitleStorage")
+    return AdvancedMenu(
+        randomize=timestamp(),
+        header='Success',
+        message='MigrateSubtitleStorage triggered'
+    )
+
+
+def apply_default_mods(reapply_current=False):
+    storage = get_subtitle_storage()
+    subs_applied = 0
+    for fn in storage.get_all_files():
+        data = storage.load(None, filename=fn)
+        if data:
+            video_id = data.video_id
+            item_type = get_item_kind_from_rating_key(video_id)
+            if not item_type:
+                continue
+
+            for part_id, part in data.parts.iteritems():
+                for lang, subs in part.iteritems():
+                    current_sub = subs.get("current")
+                    if not current_sub:
+                        continue
+                    sub = subs[current_sub]
+
+                    if not sub.content:
+                        continue
+
+                    current_mods = sub.mods or []
+                    if not reapply_current:
+                        add_mods = list(set(config.default_mods).difference(set(current_mods)))
+                        if not add_mods:
+                            continue
+                    else:
+                        if not current_mods:
+                            continue
+                        add_mods = []
+
+                    try:
+                        set_mods_for_part(video_id, part_id, Language.fromietf(lang), item_type, add_mods, mode="add")
+                    except:
+                        Log.Error("Couldn't set mods for %s:%s: %s", video_id, part_id, traceback.format_exc())
+                        continue
+
+                    subs_applied += 1
+    Log.Debug("Applied mods to %i items" % subs_applied)
+
+
+@route(PREFIX + '/applydefaultmods')
+@debounce
+def ApplyDefaultMods(randomize=None):
+    Thread.CreateTimer(1.0, apply_default_mods)
+    return AdvancedMenu(
+        randomize=timestamp(),
+        header='Success',
+        message='This may take some time ...'
+    )
+
+
+@route(PREFIX + '/reapplyallmods')
+@debounce
+def ReApplyMods(randomize=None):
+    Thread.CreateTimer(1.0, apply_default_mods, reapply_current=True)
+    return AdvancedMenu(
+        randomize=timestamp(),
+        header='Success',
+        message='This may take some time ...'
+    )
+
+
 @route(PREFIX + '/get_logs_link')
 def GetLogsLink():
+    if not config.plex_token:
+        oc = ObjectContainer(title2="Download Logs", no_cache=True, no_history=True,
+                             header="Sorry, feature unavailable",
+                             message="Universal Plex token not available")
+        return oc
+
     # try getting the link base via the request in context, first, otherwise use the public ip
     req_headers = Core.sandbox.context.request.headers
+    get_external_ip = True
+    link_base = ""
 
     if "Origin" in req_headers:
         link_base = req_headers["Origin"]
         Log.Debug("Using origin-based link_base")
+        get_external_ip = False
 
     elif "Referer" in req_headers:
         parsed = urlparse.urlparse(req_headers["Referer"])
         link_base = "%s://%s:%s" % (parsed.scheme, parsed.hostname, parsed.port)
         Log.Debug("Using referer-based link_base")
+        get_external_ip = False
 
-    else:
+    if get_external_ip or "plex.tv" in link_base:
         ip = Core.networking.http_request("http://www.plexapp.com/ip.php", cacheTime=7200).content.strip()
         link_base = "https://%s:32400" % ip
         Log.Debug("Using ip-based fallback link_base")
 
-    logs_link = "%s%s?X-Plex-Token=%s" % (link_base, PREFIX + '/logs', config.universal_plex_token)
-    oc = ObjectContainer(title2="Download Logs", no_cache=True, no_history=True,
+    logs_link = "%s%s?X-Plex-Token=%s" % (link_base, PREFIX + '/logs', config.plex_token)
+    oc = ObjectContainer(title2=logs_link, no_cache=True, no_history=True,
                          header="Copy this link and open this in your browser, please",
                          message=logs_link)
     return oc
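One detail of the Referer branch in `GetLogsLink`: `parsed.port` is `None` when the URL carries no explicit port, so the formatted base would end in `:None`. The sketch below shows one way to guard against that. It is not part of the commit: the helper `link_base_from_headers` is hypothetical, it uses Python 3's `urllib.parse` where the plugin uses the Python 2 `urlparse` module, and it leaves out the `plex.tv` check and the external IP lookup (the IP is passed in).

```python
from urllib.parse import urlparse


def link_base_from_headers(headers, fallback_ip, default_port=32400):
    """Pick a link base from Origin, then Referer, then a fallback IP."""
    if "Origin" in headers:
        return headers["Origin"]

    if "Referer" in headers:
        parsed = urlparse(headers["Referer"])
        # Only append the port when the Referer URL actually names one.
        if parsed.port:
            return "%s://%s:%s" % (parsed.scheme, parsed.hostname, parsed.port)
        return "%s://%s" % (parsed.scheme, parsed.hostname)

    return "https://%s:%s" % (fallback_ip, default_port)


print(link_base_from_headers({"Referer": "http://192.168.1.5:32400/web/index.html"}, "203.0.113.7"))
```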
|
||||
@@ -189,6 +292,7 @@ def DownloadLogs():
|
||||
|
||||
|
||||
@route(PREFIX + '/invalidatecache')
|
||||
@debounce
|
||||
def InvalidateCache(randomize=None):
|
||||
from subliminal.cache import region
|
||||
region.invalidate()
|
||||
|
||||
@@ -1,23 +1,19 @@
|
||||
# coding=utf-8
|
||||
import os
|
||||
import traceback
|
||||
|
||||
from babelfish import Language
|
||||
|
||||
from subzero.constants import PREFIX
|
||||
from sub_mod import SubtitleModificationsMenu
|
||||
from menu_helpers import debounce, SubFolderObjectContainer, default_thumb, add_ignore_options, get_item_task_data, \
|
||||
set_refresh_menu_state
|
||||
|
||||
from refresh_item import RefreshItem
|
||||
from subzero.constants import PREFIX
|
||||
from support.config import config
|
||||
from support.helpers import timestamp, cast_bool, df, get_language
|
||||
from support.items import get_item_kind_from_rating_key, get_item, get_current_sub
|
||||
from support.plex_media import get_plex_metadata, scan_videos
|
||||
from support.lib import Plex
|
||||
from support.storage import get_subtitle_storage, save_subtitles
|
||||
from support.config import config
|
||||
from support.plex_media import get_plex_metadata, scan_videos, PMSMediaProxy
|
||||
from support.scheduler import scheduler
|
||||
|
||||
from subliminal_patch import PatchedSubtitle as Subtitle
|
||||
from subzero.modification import registry as mod_registry
|
||||
from support.storage import get_subtitle_storage
|
||||
|
||||
|
||||
@route(PREFIX + '/item/{rating_key}/actions')
|
||||
@@ -41,6 +37,29 @@ def ItemDetailsMenu(rating_key, title=None, base_title=None, item_title=None, ra
|
||||
timeout = 30
|
||||
|
||||
oc = SubFolderObjectContainer(title2=title, replace_parent=True)
|
||||
|
||||
# add back to season for episode
|
||||
if current_kind == "episode":
|
||||
from interface.menu import MetadataMenu
|
||||
show = get_item(item.show.rating_key)
|
||||
season = get_item(item.season.rating_key)
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(MetadataMenu, rating_key=season.rating_key, title=season.title, base_title=show.title,
|
||||
previous_item_type="show", previous_rating_key=show.rating_key,
|
||||
display_items=True, randomize=timestamp()),
|
||||
title=u"< Back to %s" % season.title,
|
||||
summary="Back to %s > %s" % (show.title, season.title),
|
||||
thumb=season.thumb or default_thumb
|
||||
))
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(UpdateLocalMedia, rating_key=rating_key, title=title, item_title=item_title, base_title=base_title,
|
||||
randomize=timestamp()),
|
||||
title=u"Find local subtitles (doesn't refresh metadata)",
|
||||
summary="Searches for locally available subtitles",
|
||||
thumb=item.thumb or default_thumb
|
||||
))
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(RefreshItem, rating_key=rating_key, item_title=item_title, randomize=timestamp(),
|
||||
timeout=timeout * 1000),
|
||||
@@ -51,7 +70,7 @@ def ItemDetailsMenu(rating_key, title=None, base_title=None, item_title=None, ra
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(RefreshItem, rating_key=rating_key, item_title=item_title, force=True, randomize=timestamp(),
|
||||
timeout=timeout * 1000),
|
||||
title=u"Auto-search: %s" % item_title,
|
||||
title=u"Force-find subtitles: %s" % item_title,
|
||||
summary="Issues a forced refresh, ignoring known subtitles and searching for new ones",
|
||||
thumb=item.thumb or default_thumb
|
||||
))
|
||||
@@ -63,52 +82,76 @@ def ItemDetailsMenu(rating_key, title=None, base_title=None, item_title=None, ra
|
||||
# get the plex item
|
||||
plex_item = list(Plex["library"].metadata(rating_key))[0]
|
||||
|
||||
# get current media info for that item
|
||||
media = plex_item.media
|
||||
|
||||
# look for subtitles for all available media parts and all of their languages
|
||||
for part in media.parts:
|
||||
filename = os.path.basename(part.file)
|
||||
part_id = str(part.id)
|
||||
has_multiple_parts = len(plex_item.media) > 1
|
||||
part_index = 0
|
||||
for media in plex_item.media:
|
||||
for part in media.parts:
|
||||
filename = os.path.basename(part.file)
|
||||
if not os.path.exists(part.file):
|
||||
continue
|
||||
|
||||
# iterate through all configured languages
|
||||
for lang in config.lang_list:
|
||||
lang_a2 = lang.alpha2
|
||||
# ietf lang?
|
||||
if cast_bool(Prefs["subtitles.language.ietf"]) and "-" in lang_a2:
|
||||
lang_a2 = lang_a2.split("-")[0]
|
||||
part_id = str(part.id)
|
||||
part_index += 1
|
||||
|
||||
# get corresponding stored subtitle data for that media part (physical media item), for language
|
||||
current_sub = stored_subs.get_any(part_id, lang_a2)
|
||||
current_sub_id = None
|
||||
current_sub_provider_name = None
|
||||
# iterate through all configured languages
|
||||
for lang in config.lang_list:
|
||||
lang_a2 = lang.alpha2
|
||||
# ietf lang?
|
||||
if cast_bool(Prefs["subtitles.language.ietf"]) and "-" in lang_a2:
|
||||
lang_a2 = lang_a2.split("-")[0]
|
||||
|
||||
summary = u"No current subtitle in storage"
|
||||
current_score = None
|
||||
if current_sub:
|
||||
current_sub_id = current_sub.id
|
||||
current_sub_provider_name = current_sub.provider_name
|
||||
current_score = current_sub.score
|
||||
# get corresponding stored subtitle data for that media part (physical media item), for language
|
||||
current_sub = stored_subs.get_any(part_id, lang_a2)
|
||||
current_sub_id = None
|
||||
current_sub_provider_name = None
|
||||
|
||||
summary = u"Current subtitle: %s (added: %s, %s), Language: %s, Score: %i, Storage: %s" % \
|
||||
(current_sub.provider_name, df(current_sub.date_added), current_sub.mode_verbose, lang,
|
||||
current_sub.score, current_sub.storage_type)
|
||||
part_index_addon = ""
|
||||
part_summary_addon = ""
|
||||
if has_multiple_parts:
|
||||
part_index_addon = u"File %s: " % part_index
|
||||
part_summary_addon = "%s " % filename
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(SubtitleOptionsMenu, rating_key=rating_key, part_id=part_id, title=title,
|
||||
item_title=item_title, language=lang, language_name=lang.name, current_id=current_sub_id,
|
||||
item_type=plex_item.type, filename=filename, current_data=summary,
|
||||
randomize=timestamp(), current_provider=current_sub_provider_name,
|
||||
current_score=current_score),
|
||||
title=u"Actions for %s subtitle" % lang.name,
|
||||
summary=summary
|
||||
))
|
||||
summary = u"%sNo current subtitle in storage" % part_summary_addon
|
||||
current_score = None
|
||||
if current_sub:
|
||||
current_sub_id = current_sub.id
|
||||
current_sub_provider_name = current_sub.provider_name
|
||||
current_score = current_sub.score
|
||||
|
||||
summary = u"%sCurrent subtitle: %s (added: %s, %s), Language: %s, Score: %i, Storage: %s" % \
|
||||
(part_summary_addon, current_sub.provider_name, df(current_sub.date_added),
|
||||
current_sub.mode_verbose, lang, current_sub.score, current_sub.storage_type)
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(SubtitleOptionsMenu, rating_key=rating_key, part_id=part_id, title=title,
|
||||
item_title=item_title, language=lang, language_name=lang.name, current_id=current_sub_id,
|
||||
item_type=plex_item.type, filename=filename, current_data=summary,
|
||||
randomize=timestamp(), current_provider=current_sub_provider_name,
|
||||
current_score=current_score),
|
||||
title=u"%sActions for %s subtitle" % (part_index_addon, lang.name),
|
||||
summary=summary
|
||||
))
|
||||
|
||||
add_ignore_options(oc, "videos", title=item_title, rating_key=rating_key, callback_menu=IgnoreMenu)
|
||||
|
||||
return oc
|
||||
|
||||
|
||||
@route(PREFIX + '/item/update_local_media/{rating_key}', force=bool)
|
||||
@debounce
|
||||
def UpdateLocalMedia(**kwargs):
|
||||
from support.localmedia import find_subtitles
|
||||
rating_key = kwargs["rating_key"]
|
||||
parts = PMSMediaProxy(rating_key).get_all_parts()
|
||||
for part in parts:
|
||||
find_subtitles(part)
|
||||
|
||||
kwargs.pop("randomize")
|
||||
|
||||
return ItemDetailsMenu(**kwargs)
|
||||
|
||||
|
||||
@route(PREFIX + '/item/current_sub/{rating_key}/{part_id}', force=bool)
|
||||
@debounce
|
||||
def SubtitleOptionsMenu(**kwargs):
|
||||
@@ -123,7 +166,7 @@ def SubtitleOptionsMenu(**kwargs):
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(ItemDetailsMenu, rating_key=kwargs["rating_key"], item_title=kwargs["item_title"],
|
||||
title=kwargs["title"], randomize=timestamp()),
|
||||
title=u"Back to: %s" % kwargs["title"],
|
||||
title=u"< Back to %s" % kwargs["title"],
|
||||
summary=kwargs["current_data"],
|
||||
thumb=default_thumb
|
||||
))
|
||||
@@ -141,69 +184,6 @@ def SubtitleOptionsMenu(**kwargs):
|
||||
return oc
|
||||
|
||||
|
||||
@route(PREFIX + '/item/sub_mods/{rating_key}/{part_id}', force=bool)
|
||||
@debounce
|
||||
def SubtitleModificationsMenu(**kwargs):
|
||||
rating_key = kwargs["rating_key"]
|
||||
part_id = kwargs["part_id"]
|
||||
language = kwargs["language"]
|
||||
current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
|
||||
kwargs.pop("randomize")
|
||||
|
||||
oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)
|
||||
for identifier, mod in mod_registry.mods.iteritems():
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(SubtitleApplyMod, mod_identifier=identifier, randomize=timestamp(), **kwargs),
|
||||
title=mod.description
|
||||
))
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(SubtitleApplyMod, mod_identifier=None, randomize=timestamp(), **kwargs),
|
||||
title="Restore original version",
|
||||
summary=u"Currently applied mods: %s" % (", ".join(current_sub.mods) if current_sub.mods else "none")
|
||||
))
|
||||
|
||||
return oc
|
||||
|
||||
|
||||
@route(PREFIX + '/item/sub_add_mod/{rating_key}/{part_id}/{mod_identifier}', force=bool)
|
||||
@debounce
|
||||
def SubtitleApplyMod(mod_identifier=None, **kwargs):
|
||||
if mod_identifier is not None and mod_identifier not in mod_registry.mods:
|
||||
raise NotImplementedError
|
||||
|
||||
rating_key = kwargs["rating_key"]
|
||||
part_id = kwargs["part_id"]
|
||||
lang_a2 = kwargs["language"]
|
||||
item_type = kwargs["item_type"]
|
||||
|
||||
language = Language.fromietf(lang_a2)
|
||||
|
||||
current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
|
||||
current_sub.add_mod(mod_identifier)
|
||||
|
||||
storage.save(stored_subs)
|
||||
metadata = get_plex_metadata(rating_key, part_id, item_type)
|
||||
scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
|
||||
video, plex_part = scanned_parts.items()[0]
|
||||
|
||||
subtitle = Subtitle(language, mods=current_sub.mods)
|
||||
subtitle.content = current_sub.content
|
||||
subtitle.plex_media_fps = plex_part.fps
|
||||
subtitle.page_link = "modify subtitles with: %s" % (", ".join(current_sub.mods) if current_sub.mods else "none")
|
||||
subtitle.language = language
|
||||
|
||||
try:
|
||||
save_subtitles(scanned_parts, {video: [subtitle]}, mode="m", bare_save=True)
|
||||
Log.Debug("Modified %s subtitle for: %s:%s with: %s", language.name, rating_key, part_id,
|
||||
", ".join(current_sub.mods) if current_sub.mods else "none")
|
||||
except:
|
||||
Log.Error("Something went wrong when modifying subtitle: %s", traceback.format_exc())
|
||||
|
||||
kwargs.pop("randomize")
|
||||
return SubtitleModificationsMenu(randomize=timestamp(), **kwargs)
|
||||
|
||||
|
||||
@route(PREFIX + '/item/search/{rating_key}/{part_id}', force=bool)
|
||||
@debounce
|
||||
def ListAvailableSubsForItemMenu(rating_key=None, part_id=None, title=None, item_title=None, filename=None,
|
||||
@@ -223,7 +203,7 @@ def ListAvailableSubsForItemMenu(rating_key=None, part_id=None, title=None, item
|
||||
oc = SubFolderObjectContainer(title2=unicode(title), replace_parent=True)
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(ItemDetailsMenu, rating_key=rating_key, item_title=item_title, title=title, randomize=timestamp()),
|
||||
title=u"Back to: %s" % title,
|
||||
title=u"< Back to %s" % title,
|
||||
summary=current_data,
|
||||
thumb=default_thumb
|
||||
))
|
||||
@@ -269,11 +249,15 @@ def ListAvailableSubsForItemMenu(rating_key=None, part_id=None, title=None, item
|
||||
return oc
|
||||
|
||||
for subtitle in search_results:
|
||||
wrong_fps_addon = ""
|
||||
if subtitle.wrong_fps:
|
||||
wrong_fps_addon = " (wrong FPS, sub: %s, media: %s)" % (subtitle.fps, plex_part.fps)
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(TriggerDownloadSubtitle, rating_key=rating_key, randomize=timestamp(), item_title=item_title,
|
||||
subtitle_id=str(subtitle.id), language=language),
|
||||
title=u"%s: %s, score: %s" % ("Available" if current_id != subtitle.id else "Current",
|
||||
subtitle.provider_name, subtitle.score),
|
||||
title=u"%s: %s, score: %s%s" % ("Available" if current_id != subtitle.id else "Current",
|
||||
subtitle.provider_name, subtitle.score, wrong_fps_addon),
|
||||
summary=u"Release: %s, Matches: %s" % (subtitle.release_info, ", ".join(subtitle.matches)),
|
||||
thumb=default_thumb
|
||||
))
|
||||
|
||||
@@ -2,11 +2,13 @@
|
||||
|
||||
from subzero.constants import PREFIX, TITLE, ART
|
||||
from support.config import config
|
||||
from support.helpers import pad_title, timestamp, df
|
||||
from support.helpers import pad_title, timestamp, df, get_plex_item_display_title
|
||||
from support.scheduler import scheduler
|
||||
from support.ignore import ignore_list
|
||||
from support.items import get_item_thumb, get_on_deck_items, get_all_items, get_items_info
|
||||
from menu_helpers import main_icon, debounce, SubFolderObjectContainer, default_thumb, dig_tree, add_ignore_options
|
||||
from support.items import get_item_thumb, get_on_deck_items, get_all_items, get_items_info, get_item, \
|
||||
get_item_kind_from_item
|
||||
from menu_helpers import main_icon, debounce, SubFolderObjectContainer, default_thumb, dig_tree, add_ignore_options,\
|
||||
ObjectContainer
|
||||
from item_details import ItemDetailsMenu
|
||||
|
||||
|
||||
@@ -69,16 +71,24 @@ def fatality(randomize=None, force_title=None, header=None, message=None, only_r
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(OnDeckMenu),
|
||||
title="On Deck items",
|
||||
title="On-deck items",
|
||||
summary="Shows the current on deck items and allows you to individually (force-) refresh their metadata/"
|
||||
"subtitles.",
|
||||
thumb=R("icon-ondeck.jpg")
|
||||
))
|
||||
if "last_played_items" in Dict and Dict["last_played_items"]:
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(RecentlyPlayedMenu),
|
||||
title=pad_title("Recently played items"),
|
||||
summary="Shows the %i recently played items and allows you to individually (force-) refresh their "
|
||||
"metadata/subtitles." % config.store_recently_played_amount,
|
||||
thumb=R("icon-played.jpg")
|
||||
))
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(RecentlyAddedMenu),
|
||||
title="Recently Added items",
|
||||
title="Recently-added items",
|
||||
summary="Shows the recently added items per section.",
|
||||
thumb=R("icon-recent.jpg")
|
||||
thumb=R("icon-added.jpg")
|
||||
))
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(RecentMissingSubtitlesMenu, randomize=timestamp()),
|
||||
@@ -168,6 +178,31 @@ def OnDeckMenu(message=None):
|
||||
return mergedItemsMenu(title="Items On Deck", base_title="Items On Deck", itemGetter=get_on_deck_items)
|
||||
|
||||
|
||||
@route(PREFIX + '/recently_played')
def RecentlyPlayedMenu():
    base_title = "Recently Played"
    oc = SubFolderObjectContainer(title2=base_title, replace_parent=True)

    for item in [get_item(rating_key) for rating_key in Dict["last_played_items"]]:
        kind = get_item_kind_from_item(item)
        if kind not in ("episode", "movie"):
            continue

        if kind == "episode":
            item_title = get_plex_item_display_title(item, "show", parent=item.season, section_title=None,
                                                     parent_title=item.show.title)
        else:
            item_title = get_plex_item_display_title(item, kind, section_title=None)

        oc.add(DirectoryObject(
            title=item_title,
            key=Callback(ItemDetailsMenu, title=base_title + " > " + item.title, item_title=item.title,
                         rating_key=item.rating_key)
        ))

    return oc
|
||||
|
||||
|
||||
@route(PREFIX + '/recently_added')
|
||||
def RecentlyAddedMenu(message=None):
|
||||
"""
|
||||
@@ -215,8 +250,6 @@ def RecentMissingSubtitlesMenu(force=False, randomize=None):
|
||||
thumb=get_item_thumb(item) or default_thumb
|
||||
))
|
||||
|
||||
scheduler.clear_task_data("MissingSubtitles")
|
||||
|
||||
return oc
|
||||
|
||||
|
||||
|
||||
@@ -1,5 +1,8 @@
|
||||
# coding=utf-8
|
||||
import locale
|
||||
import logging
|
||||
import os
|
||||
|
||||
import logger
|
||||
|
||||
from item_details import ItemDetailsMenu
|
||||
@@ -11,10 +14,10 @@ from advanced import DispatchRestart
|
||||
from subzero.constants import ART, PREFIX, DEPENDENCY_MODULE_NAMES
|
||||
from support.scheduler import scheduler
|
||||
from support.config import config
|
||||
from support.helpers import timestamp, df
|
||||
from support.helpers import timestamp, df
|
||||
from support.ignore import ignore_list
|
||||
from support.items import get_all_items, get_items_info, \
|
||||
get_item_kind_from_rating_key
|
||||
get_item_kind_from_rating_key, get_item
|
||||
|
||||
# init GUI
|
||||
ObjectContainer.art = R(ART)
|
||||
@@ -53,7 +56,7 @@ def FirstLetterMetadataMenu(rating_key, key, title=None, base_title=None, displa
|
||||
|
||||
@route(PREFIX + '/section/contents', display_items=bool)
|
||||
def MetadataMenu(rating_key, title=None, base_title=None, display_items=False, previous_item_type=None,
|
||||
previous_rating_key=None):
|
||||
previous_rating_key=None, randomize=None):
|
||||
"""
|
||||
displays the contents of a section based on whether it has a deeper tree or not (movies->movie (item) list; series->series list)
|
||||
:param rating_key:
|
||||
@@ -72,6 +75,22 @@ def MetadataMenu(rating_key, title=None, base_title=None, display_items=False, p
|
||||
current_kind = get_item_kind_from_rating_key(rating_key)
|
||||
|
||||
if display_items:
|
||||
timeout = 30
|
||||
|
||||
# add back to series for season
|
||||
if current_kind == "season":
|
||||
timeout = 360
|
||||
|
||||
show = get_item(previous_rating_key)
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(MetadataMenu, rating_key=show.rating_key, title=show.title, base_title=show.section.title,
|
||||
previous_item_type="section", display_items=True, randomize=timestamp()),
|
||||
title=u"< Back to %s" % show.title,
|
||||
thumb=show.thumb or default_thumb
|
||||
))
|
||||
elif current_kind == "series":
|
||||
timeout = 1800
|
||||
|
||||
items = get_all_items(key="children", value=rating_key, base="library/metadata")
|
||||
kind, deeper = get_items_info(items)
|
||||
dig_tree(oc, items, MetadataMenu,
|
||||
@@ -81,12 +100,6 @@ def MetadataMenu(rating_key, title=None, base_title=None, display_items=False, p
|
||||
if should_display_ignore(items, previous=previous_item_type):
|
||||
add_ignore_options(oc, "series", title=item_title, rating_key=rating_key, callback_menu=IgnoreMenu)
|
||||
|
||||
timeout = 30
|
||||
if current_kind == "season":
|
||||
timeout = 360
|
||||
elif current_kind == "series":
|
||||
timeout = 1800
|
||||
|
||||
# add refresh
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(RefreshItem, rating_key=rating_key, item_title=title, refresh_kind=current_kind,
|
||||
@@ -147,7 +160,6 @@ def RefreshMissing(randomize=None):
|
||||
@route(PREFIX + '/ValidatePrefs', enforce_route=True)
|
||||
def ValidatePrefs():
|
||||
Core.log.setLevel(logging.DEBUG)
|
||||
Log.Debug("Validate Prefs called.")
|
||||
|
||||
# cache the channel state
|
||||
update_dict = False
|
||||
@@ -182,9 +194,51 @@ def ValidatePrefs():
|
||||
Core.log.removeHandler(logger.console_handler)
|
||||
Log.Debug("Stop logging to console")
|
||||
|
||||
    Log.Debug("Validate Prefs called.")

    # SZ config debug
    Log.Debug("--- SZ Config-Debug ---")
    for attr in [
            "app_support_path", "data_path", "data_items_path", "enable_agent",
            "enable_channel", "permissions_ok", "missing_permissions", "fs_encoding", "enforce_encoding",
            "subtitle_destination_folder"]:
        Log.Debug("config.%s: %s", attr, getattr(config, attr))

    for attr in ["plugin_log_path", "server_log_path"]:
        value = getattr(config, attr)
        access = os.access(value, os.R_OK)
        if Core.runtime.os == "Windows":
            try:
                f = open(value, "r")
                f.read(1)
                f.close()
            except:
                access = False

        Log.Debug("config.%s: %s (accessible: %s)", attr, value, access)

    for attr in [
            "subtitles.save.filesystem", ]:
        Log.Debug("Pref.%s: %s", attr, Prefs[attr])

    # fixme: check existence of and os access of logs
    Log.Debug("Platform: %s", Core.runtime.platform)
    Log.Debug("OS: %s", Core.runtime.os)
    Log.Debug("----- Environment -----")
    for key, value in os.environ.iteritems():
        if key.startswith("PLEX") or key.startswith("SZ_"):
            if "TOKEN" in key:
                outval = "xxxxxxxxxxxxxxxxxxx"

            else:
                outval = value
            Log.Debug("%s: %s", key, outval)
    Log.Debug("Locale: %s", locale.getdefaultlocale())
    Log.Debug("-----------------------")

    Log.Debug("Setting log-level to %s", Prefs["log_level"])
    logger.register_logging_handler(DEPENDENCY_MODULE_NAMES, level=Prefs["log_level"])
    Core.log.setLevel(logging.getLevelName(Prefs["log_level"]))
    os.environ['U1pfT01EQl9LRVk'] = '789CF30DAC2C8B0AF433F5C9AD34290A712DF30D7135F12D0FB3E502006FDE081E'

    return
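Review note: the environment dump above masks any variable whose name contains `TOKEN` before logging it. The following is a minimal standalone sketch of that filtering, written for Python 3 rather than the plugin's Python 2 runtime. The helper name `masked_environment` and the `MASK` constant are hypothetical and not part of this change set.

```python
MASK = "xxxx"  # hypothetical placeholder; the diff uses a longer run of "x"


def masked_environment(environ, prefixes=("PLEX", "SZ_")):
    """Return only the PLEX*/SZ_* variables, hiding values of token-like keys."""
    result = {}
    for key, value in environ.items():
        # str.startswith accepts a tuple of prefixes
        if key.startswith(prefixes):
            result[key] = MASK if "TOKEN" in key else value
    return result


sample = {"PLEXTOKEN": "secret", "SZ_USER_AGENT": "Sub-Zero/2.x", "HOME": "/root"}
print(masked_environment(sample))
```

Returning a dictionary instead of logging directly keeps the masking rule easy to check in isolation.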
|
||||
|
||||
|
||||
@@ -43,8 +43,8 @@ def add_ignore_options(oc, kind, callback_menu=None, title=None, rating_key=None
|
||||
|
||||
oc.add(DirectoryObject(
|
||||
key=Callback(callback_menu, kind=use_kind, rating_key=rating_key, title=title),
|
||||
title=u"%s %s \"%s\" %s the ignore list" % (
|
||||
"Remove" if in_list else "Add", ignore_list.verbose(kind) if add_kind else "", unicode(title), "from" if in_list else "to")
|
||||
title=u"%s %s \"%s\"" % (
|
||||
"Un-Ignore" if in_list else "Ignore", ignore_list.verbose(kind) if add_kind else "", unicode(title))
|
||||
)
|
||||
)
|
||||
|
||||
@@ -157,7 +157,12 @@ def debounce(func):
|
||||
return ObjectContainer()
|
||||
else:
|
||||
Dict["menu_history"][key] = datetime.datetime.now() + datetime.timedelta(days=1)
|
||||
Dict.Save()
|
||||
try:
|
||||
Dict.Save()
|
||||
except TypeError:
|
||||
Log.Error("Can't save menu history for: %r", key)
|
||||
del Dict["menu_history"][key]
|
||||
|
||||
return func(*args, **kwargs)
|
||||
|
||||
return wrap
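Review note: the `debounce` hunk above wraps `Dict.Save()` in a `try`/`except TypeError` and removes the entry that could not be saved. A rough standalone illustration of that roll-back pattern follows. It assumes `json.dumps` as a stand-in for the Plex `Dict.Save()` call, and the `remember` and `save` helpers are hypothetical.

```python
import json


def save(history):
    # Stand-in for Dict.Save(): raises TypeError for values it cannot serialize.
    json.dumps(history)


def remember(history, key, value):
    """Store value under key; undo the change if the store cannot be saved."""
    history[key] = value
    try:
        save(history)
    except TypeError:
        del history[key]  # keep the stored state consistent with the last good save
        return False
    return True


history = {}
print(remember(history, "ok", 1))          # saved
print(remember(history, "bad", object()))  # rolled back
print(history)
```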
|
||||
|
||||
@@ -0,0 +1,251 @@
|
||||
# coding=utf-8

import traceback
import types

from babelfish import Language

from menu_helpers import debounce, SubFolderObjectContainer, default_thumb
from subzero.modification import registry as mod_registry, SubtitleModifications
from subzero.constants import PREFIX
from support.plex_media import get_plex_metadata, scan_videos
from support.helpers import timestamp, pad_title
from support.items import get_current_sub, set_mods_for_part


@route(PREFIX + '/item/sub_mods/{rating_key}/{part_id}', force=bool)
@debounce
def SubtitleModificationsMenu(**kwargs):
    rating_key = kwargs["rating_key"]
    part_id = kwargs["part_id"]
    language = kwargs["language"]
    current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
    kwargs.pop("randomize")

    current_mods = current_sub.mods or []

    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)

    from interface.item_details import SubtitleOptionsMenu
    oc.add(DirectoryObject(
        key=Callback(SubtitleOptionsMenu, randomize=timestamp(), **kwargs),
        title=u"< Back to subtitle options for: %s" % kwargs["title"],
        summary=kwargs["current_data"],
        thumb=default_thumb
    ))

    for identifier, mod in mod_registry.mods.iteritems():
        if mod.advanced:
            continue

        if mod.exclusive and identifier in current_mods:
            continue

        oc.add(DirectoryObject(
            key=Callback(SubtitleSetMods, mods=identifier, mode="add", randomize=timestamp(), **kwargs),
            title=pad_title(mod.description), summary=mod.long_description or ""
        ))

    fps_mod = SubtitleModifications.get_mod_class("change_FPS")
    oc.add(DirectoryObject(
        key=Callback(SubtitleFPSModMenu, randomize=timestamp(), **kwargs),
        title=pad_title(fps_mod.description), summary=fps_mod.long_description or ""
    ))

    shift_mod = SubtitleModifications.get_mod_class("shift_offset")
    oc.add(DirectoryObject(
        key=Callback(SubtitleShiftModUnitMenu, randomize=timestamp(), **kwargs),
        title=pad_title(shift_mod.description), summary=shift_mod.long_description or ""
    ))

    color_mod = SubtitleModifications.get_mod_class("color")
    oc.add(DirectoryObject(
        key=Callback(SubtitleColorModMenu, randomize=timestamp(), **kwargs),
        title=pad_title(color_mod.description), summary=color_mod.long_description or ""
    ))

    if current_mods:
        oc.add(DirectoryObject(
            key=Callback(SubtitleSetMods, mods=None, mode="remove_last", randomize=timestamp(), **kwargs),
            title=pad_title("Remove last applied mod (%s)" % current_mods[-1]),
            summary=u"Currently applied mods: %s" % (", ".join(current_mods) if current_mods else "none")
        ))
        oc.add(DirectoryObject(
            key=Callback(SubtitleListMods, randomize=timestamp(), **kwargs),
            title=pad_title("Manage applied mods"),
            summary=u"Currently applied mods: %s" % (", ".join(current_mods))
        ))

    oc.add(DirectoryObject(
        key=Callback(SubtitleSetMods, mods=None, mode="clear", randomize=timestamp(), **kwargs),
        title=pad_title("Restore original version"),
        summary=u"Currently applied mods: %s" % (", ".join(current_mods) if current_mods else "none")
    ))

    return oc


@route(PREFIX + '/item/sub_mod_fps/{rating_key}/{part_id}', force=bool)
def SubtitleFPSModMenu(**kwargs):
    rating_key = kwargs["rating_key"]
    part_id = kwargs["part_id"]
    item_type = kwargs["item_type"]

    kwargs.pop("randomize")

    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)

    oc.add(DirectoryObject(
        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
        title="< Back to subtitle modification menu"
    ))

    metadata = get_plex_metadata(rating_key, part_id, item_type)
    scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
    video, plex_part = scanned_parts.items()[0]

    target_fps = plex_part.fps

    for fps in ["23.976", "24.000", "25.000", "29.970", "30.000", "50.000", "59.940", "60.000"]:
        if float(fps) == float(target_fps):
            continue

        if float(fps) > float(target_fps):
            indicator = "subs constantly getting faster"
        else:
            indicator = "subs constantly getting slower"

        mod_ident = SubtitleModifications.get_mod_signature("change_FPS", **{"from": fps, "to": target_fps})
        oc.add(DirectoryObject(
            key=Callback(SubtitleSetMods, mods=mod_ident, mode="add", randomize=timestamp(), **kwargs),
            title="%s fps -> %s fps (%s)" % (fps, target_fps, indicator)
        ))

    return oc


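Review note: `SubtitleFPSModMenu` offers one entry per known frame rate, skips the rate that equals the media's own, and labels each entry by whether the source rate is above or below the target. The sketch below reproduces only that selection logic without any Plex objects. `KNOWN_FPS` and `fps_options` are hypothetical names, and the code targets Python 3.

```python
KNOWN_FPS = ["23.976", "24.000", "25.000", "29.970", "30.000", "50.000", "59.940", "60.000"]


def fps_options(target_fps):
    """List (title, indicator) pairs for every source FPS that differs from the target."""
    options = []
    for fps in KNOWN_FPS:
        if float(fps) == float(target_fps):
            continue  # converting a rate to itself is pointless
        indicator = "faster" if float(fps) > float(target_fps) else "slower"
        options.append(("%s fps -> %s fps" % (fps, target_fps), indicator))
    return options


for title, indicator in fps_options("25.000"):
    print(title, "(subs constantly getting %s)" % indicator)
```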
POSSIBLE_UNITS = (("ms", "milliseconds"), ("s", "seconds"), ("m", "minutes"), ("h", "hours"))
POSSIBLE_UNITS_D = dict(POSSIBLE_UNITS)


@route(PREFIX + '/item/sub_mod_shift_unit/{rating_key}/{part_id}', force=bool)
def SubtitleShiftModUnitMenu(**kwargs):
    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)

    kwargs.pop("randomize")

    oc.add(DirectoryObject(
        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
        title="< Back to subtitle modifications"
    ))

    for unit, title in POSSIBLE_UNITS:
        oc.add(DirectoryObject(
            key=Callback(SubtitleShiftModMenu, unit=unit, randomize=timestamp(), **kwargs),
            title="Adjust by %s" % title
        ))

    return oc


@route(PREFIX + '/item/sub_mod_shift/{rating_key}/{part_id}/{unit}', force=bool)
def SubtitleShiftModMenu(unit=None, **kwargs):
    if unit not in POSSIBLE_UNITS_D:
        raise NotImplementedError

    kwargs.pop("randomize")

    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)

    oc.add(DirectoryObject(
        key=Callback(SubtitleShiftModUnitMenu, randomize=timestamp(), **kwargs),
        title="< Back to unit selection"
    ))

    rng = []
    if unit == "h":
        rng = range(-10, 11)
    elif unit in ("m", "s"):
        rng = range(-15, 15)
    elif unit == "ms":
        rng = range(-900, 1000, 100)

    for i in rng:
        if i == 0:
            continue

        mod_ident = SubtitleModifications.get_mod_signature("shift_offset", **{unit: i})
        oc.add(DirectoryObject(
            key=Callback(SubtitleSetMods, mods=mod_ident, mode="add", randomize=timestamp(), **kwargs),
            title="%s %s" % (("%s" if i < 0 else "+%s") % i, unit)
        ))

    return oc


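Review note: the offsets offered by `SubtitleShiftModMenu` depend on the unit, and `range` excludes its stop value. The sketch below copies the three ranges from the function above and prints the resulting menu titles, which makes the bounds visible: with `range(-15, 15)` the minute and second menus run from -15 to +14. Whether that asymmetry is intended is not stated in the diff. The helper name `shift_labels` is hypothetical.

```python
def shift_labels(unit):
    """Return the menu titles produced for one unit, using the ranges from the diff."""
    if unit == "h":
        rng = range(-10, 11)
    elif unit in ("m", "s"):
        rng = range(-15, 15)
    elif unit == "ms":
        rng = range(-900, 1000, 100)
    else:
        raise NotImplementedError(unit)
    # Zero is skipped; positive values get an explicit plus sign.
    return ["%s %s" % (("%s" if i < 0 else "+%s") % i, unit) for i in rng if i != 0]


for unit in ("h", "m", "ms"):
    labels = shift_labels(unit)
    print(unit, len(labels), labels[0], labels[-1])
```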
@route(PREFIX + '/item/sub_mod_colors/{rating_key}/{part_id}', force=bool)
def SubtitleColorModMenu(**kwargs):
    kwargs.pop("randomize")

    color_mod = SubtitleModifications.get_mod_class("color")

    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)

    oc.add(DirectoryObject(
        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
        title="< Back to subtitle modification menu"
    ))

    for color, code in color_mod.colors.iteritems():
        mod_ident = SubtitleModifications.get_mod_signature("color", **{"name": color})
        oc.add(DirectoryObject(
            key=Callback(SubtitleSetMods, mods=mod_ident, mode="add", randomize=timestamp(), **kwargs),
            title="%s (%s)" % (color, code)
        ))

    return oc


@route(PREFIX + '/item/sub_set_mods/{rating_key}/{part_id}/{mods}/{mode}', force=bool)
@debounce
def SubtitleSetMods(mods=None, mode=None, **kwargs):
    if not isinstance(mods, types.ListType) and mods:
        mods = [mods]

    rating_key = kwargs["rating_key"]
    part_id = kwargs["part_id"]
    lang_a2 = kwargs["language"]
    item_type = kwargs["item_type"]

    language = Language.fromietf(lang_a2)

    set_mods_for_part(rating_key, part_id, language, item_type, mods, mode=mode)

    kwargs.pop("randomize")
    return SubtitleModificationsMenu(randomize=timestamp(), **kwargs)


@route(PREFIX + '/item/sub_list_mods/{rating_key}/{part_id}', force=bool)
@debounce
def SubtitleListMods(**kwargs):
    rating_key = kwargs["rating_key"]
    part_id = kwargs["part_id"]
    language = kwargs["language"]
    current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)

    kwargs.pop("randomize")

    oc = SubFolderObjectContainer(title2=kwargs["title"], replace_parent=True)

    oc.add(DirectoryObject(
        key=Callback(SubtitleModificationsMenu, randomize=timestamp(), **kwargs),
        title="< Back to subtitle modifications"
    ))

    for identifier in current_sub.mods:
        oc.add(DirectoryObject(
            key=Callback(SubtitleSetMods, mods=identifier, mode="remove", randomize=timestamp(), **kwargs),
            title="Remove: %s" % identifier
        ))

    return oc
@@ -18,7 +18,7 @@ sys.modules["support.plex_media"] = plex_media
|
||||
|
||||
import localmedia
|
||||
|
||||
sys.modules["subzero.localmedia"] = localmedia
|
||||
sys.modules["support.localmedia"] = localmedia
|
||||
|
||||
import subtitlehelpers
|
||||
|
||||
|
||||
@@ -11,9 +11,9 @@ class PlexActivityManager(object):
|
||||
def start(self):
|
||||
activity_sources_enabled = None
|
||||
|
||||
if config.universal_plex_token:
|
||||
if config.plex_token:
|
||||
from plex import Plex
|
||||
Plex.configuration.defaults.authentication(config.universal_plex_token)
|
||||
Plex.configuration.defaults.authentication(config.plex_token)
|
||||
activity_sources_enabled = ["websocket"]
|
||||
Activity.on('websocket.playing', self.on_playing)
|
||||
|
||||
@@ -27,9 +27,6 @@ class PlexActivityManager(object):
|
||||
|
||||
@throttle(5, instance_method=True)
|
||||
def on_playing(self, info):
|
||||
if not config.use_activities:
|
||||
return
|
||||
|
||||
# ignore non-playing states and anything too far in
|
||||
if info["state"] != "playing" or info["viewOffset"] > 60000:
|
||||
return
|
||||
@@ -41,13 +38,22 @@ class PlexActivityManager(object):
|
||||
return
|
||||
|
||||
rating_key = info["ratingKey"]
|
||||
if rating_key not in Dict["last_played_items"]:
|
||||
# new playing; store last 10 recently played items
|
||||
if rating_key in Dict["last_played_items"] and rating_key != Dict["last_played_items"][0]:
|
||||
# shift last played
|
||||
Dict["last_played_items"].insert(0,
|
||||
Dict["last_played_items"].pop(Dict["last_played_items"].index(rating_key)))
|
||||
Dict.Save()
|
||||
|
||||
elif rating_key not in Dict["last_played_items"]:
|
||||
# new playing; store last X recently played items
|
||||
Dict["last_played_items"].insert(0, rating_key)
|
||||
Dict["last_played_items"] = Dict["last_played_items"][:10]
|
||||
Dict["last_played_items"] = Dict["last_played_items"][:config.store_recently_played_amount]
|
||||
|
||||
Dict.Save()
|
||||
|
||||
if not config.react_to_activities:
|
||||
return
|
||||
|
||||
debug_msg = "Started playing %s. Refreshing it." % rating_key
|
||||
|
||||
key_to_refresh = None
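Review note: the `on_playing` hunk above keeps `Dict["last_played_items"]` ordered most-recent-first, moves an already known item back to the front, and trims the list to `config.store_recently_played_amount`. A minimal standalone sketch of the same bookkeeping on a plain list follows. `remember_played` is a hypothetical helper, not code from this change set, and it returns a new list instead of mutating the Plex `Dict`.

```python
def remember_played(history, rating_key, limit=20):
    """Return a most-recent-first history with rating_key at the front, capped at limit."""
    if history and history[0] == rating_key:
        return list(history)  # already the most recent entry, nothing to do
    remaining = [key for key in history if key != rating_key]
    return ([rating_key] + remaining)[:limit]


history = []
for key in ["a", "b", "c", "b"]:
    history = remember_played(history, key, limit=3)
print(history)
```

Rebuilding the list on every call costs little at this size and avoids the separate "shift" and "insert" branches.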
|
||||
@@ -108,4 +114,5 @@ class PlexActivityManager(object):
|
||||
if ep.index == 1:
|
||||
return ep
|
||||
|
||||
|
||||
activity = PlexActivityManager()
|
||||
|
||||
@@ -9,6 +9,7 @@ import datetime
|
||||
import subliminal
|
||||
import subliminal_patch
|
||||
from babelfish import Language
|
||||
from subliminal.cli import MutexLock
|
||||
from subzero.lib.io import FileIO, get_viable_encoding
|
||||
from subzero.constants import PLUGIN_NAME, PLUGIN_IDENTIFIER, MOVIE, SHOW
|
||||
from lib import Plex
|
||||
@@ -45,6 +46,7 @@ class Config(object):
|
||||
data_path = None
|
||||
data_items_path = None
|
||||
universal_plex_token = None
|
||||
plex_token = None
|
||||
is_development = False
|
||||
|
||||
enable_channel = True
|
||||
@@ -68,6 +70,9 @@ class Config(object):
|
||||
sections = None
|
||||
enabled_sections = None
|
||||
remove_hi = False
|
||||
fix_ocr = False
|
||||
fix_common = False
|
||||
colors = ""
|
||||
enforce_encoding = False
|
||||
chmod = None
|
||||
forced_only = False
|
||||
@@ -75,8 +80,12 @@ class Config(object):
|
||||
treat_und_as_first = False
|
||||
ext_match_strictness = False
|
||||
default_mods = None
|
||||
use_activities = False
|
||||
debug_mods = False
|
||||
react_to_activities = False
|
||||
activity_mode = None
|
||||
subtitles_save_to = None
|
||||
|
||||
store_recently_played_amount = 20
|
||||
|
||||
initialized = False
|
||||
|
||||
@@ -91,6 +100,9 @@ class Config(object):
|
||||
self.data_path = getattr(Data, "_core").storage.data_path
|
||||
self.data_items_path = os.path.join(self.data_path, "DataItems")
|
||||
self.universal_plex_token = self.get_universal_plex_token()
|
||||
self.plex_token = os.environ.get("PLEXTOKEN", self.universal_plex_token)
|
||||
|
||||
os.environ["SZ_USER_AGENT"] = self.get_user_agent()
|
||||
|
||||
self.set_plugin_mode()
|
||||
self.set_plugin_lock()
|
||||
@@ -98,6 +110,7 @@ class Config(object):
|
||||
|
||||
self.lang_list = self.get_lang_list()
|
||||
self.subtitle_destination_folder = self.get_subtitle_destination_folder()
|
||||
self.forced_only = cast_bool(Prefs["subtitles.only_foreign"])
|
||||
self.providers = self.get_providers()
|
||||
self.provider_settings = self.get_provider_settings()
|
||||
self.max_recent_items_per_library = int_or_default(Prefs["scheduler.max_recent_items_per_library"], 2000)
|
||||
@@ -109,15 +122,37 @@ class Config(object):
|
||||
self.permissions_ok = self.check_permissions()
|
||||
self.notify_executable = self.check_notify_executable()
|
||||
self.remove_hi = cast_bool(Prefs['subtitles.remove_hi'])
|
||||
self.fix_ocr = cast_bool(Prefs['subtitles.fix_ocr'])
|
||||
self.fix_common = cast_bool(Prefs['subtitles.fix_common'])
|
||||
self.colors = Prefs['subtitles.colors'] if Prefs['subtitles.colors'] != "don't change" else None
|
||||
self.enforce_encoding = cast_bool(Prefs['subtitles.enforce_encoding'])
|
||||
|
||||
os.environ["SZ_ENFORCE_ENCODING"] = str(self.enforce_encoding)
|
||||
|
||||
self.chmod = self.check_chmod()
|
||||
self.forced_only = cast_bool(Prefs["subtitles.only_foreign"])
|
||||
self.exotic_ext = cast_bool(Prefs["subtitles.scan.exotic_ext"])
|
||||
self.treat_und_as_first = cast_bool(Prefs["subtitles.language.treat_und_as_first"])
|
||||
self.ext_match_strictness = self.determine_ext_sub_strictness()
|
||||
self.default_mods = self.get_default_mods()
|
||||
self.debug_mods = cast_bool(Prefs['log_debug_mods'])
|
||||
self.subtitles_save_to = Prefs['subtitles.save.filesystem']
|
||||
self.initialized = True
|
||||
|
||||
    def init_cache(self):
        use_fallback_cache = True
        if Core.runtime.os != "Windows":
            try:
                subliminal.region.configure('dogpile.cache.dbm', expiration_time=datetime.timedelta(days=30),
                                            arguments={'filename': os.path.join(config.data_items_path, 'subzero.dbm'),
                                                       'lock_factory': MutexLock})
                use_fallback_cache = False
            except:
                pass

        if use_fallback_cache:
            Log.Warn("Not using file based cache!")
            subliminal.region.configure('dogpile.cache.memory')
|
||||
|
||||
def set_log_paths(self):
|
||||
# find log handler
|
||||
for handler in Core.log.handlers:
|
||||
@@ -142,7 +177,9 @@ class Config(object):
|
||||
except:
|
||||
Log.Warn("Couldn't determine Plex Token")
|
||||
else:
|
||||
Log("Did NOT find Preferences file - please check logfile and hierarchy. Aborting!")
|
||||
Log("Did NOT find Preferences file - most likely Windows OS. Otherwise please check logfile and hierarchy.")
|
||||
|
||||
# fixme: windows
|
||||
|
||||
def set_plugin_mode(self):
|
||||
if Prefs["plugin_mode"] == "only agent":
|
||||
@@ -217,11 +254,17 @@ class Config(object):
|
||||
return all_permissions_ok
|
||||
|
||||
def get_version(self):
|
||||
return self.get_bare_version() + ("" if not self.is_development else " DEV")
|
||||
|
||||
def get_bare_version(self):
|
||||
result = VERSION_RE.search(self.plugin_info)
|
||||
add = "" if not self.is_development else " DEV"
|
||||
|
||||
if result:
|
||||
return result.group(1) + add
|
||||
return result.group(1)
|
||||
return "2.x.x.x"
|
||||
|
||||
def get_user_agent(self):
|
||||
return "Sub-Zero/%s" % (self.get_bare_version() + ("" if not self.is_development else "-dev"))
|
||||
|
||||
def get_dev_mode(self):
|
||||
dev = DEV_RE.search(self.plugin_info)
|
||||
@@ -347,10 +390,13 @@ class Config(object):
|
||||
}
|
||||
|
||||
# ditch non-forced-subtitles-reporting providers
|
||||
if cast_bool(Prefs['subtitles.only_foreign']):
|
||||
if self.forced_only:
|
||||
providers["addic7ed"] = False
|
||||
providers["tvsubtitles"] = False
|
||||
providers["legendastv"] = False
|
||||
providers["napiprojekt"] = False
|
||||
providers["shooter"] = False
|
||||
providers["subscenter"] = False
|
||||
|
||||
return filter(lambda prov: providers[prov], providers)
|
||||
|
||||
@@ -412,16 +458,22 @@ class Config(object):
|
||||
mods = []
|
||||
if self.remove_hi:
|
||||
mods.append("remove_HI")
|
||||
if self.fix_ocr:
|
||||
mods.append("OCR_fixes")
|
||||
if self.fix_common:
|
||||
mods.append("common")
|
||||
if self.colors:
|
||||
mods.append("color(name=%s)" % self.colors)
|
||||
|
||||
return mods
|
||||
|
||||
def set_activity_modes(self):
|
||||
val = Prefs["activity.on_playback"]
|
||||
if val == "never":
|
||||
self.use_activities = False
|
||||
self.react_to_activities = False
|
||||
return
|
||||
|
||||
self.use_activities = True
|
||||
self.react_to_activities = True
|
||||
if val == "current media item":
|
||||
self.activity_mode = "refresh"
|
||||
elif val == "hybrid: current item or next episode":
|
||||
|
||||
@@ -9,15 +9,24 @@ import time
|
||||
import re
|
||||
import platform
|
||||
import subprocess
|
||||
|
||||
from bs4 import UnicodeDammit
|
||||
|
||||
import sys
|
||||
import chardet
|
||||
|
||||
from bs4 import UnicodeDammit
|
||||
from babelfish import Language
|
||||
|
||||
from subzero.analytics import track_event
|
||||
|
||||
mswindows = (sys.platform == "win32")
if mswindows:
    from subprocess import list2cmdline
    quote_args = list2cmdline
else:
    # POSIX
    from pipes import quote

    def quote_args(seq):
        return ' '.join(quote(arg) for arg in seq)
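Review note: `quote_args` selects a quoting routine per platform because the command is later run with `shell=True`. The sketch below shows what each branch produces for an argument containing spaces. It is a Python 3 adaptation: `shlex.quote` replaces `pipes.quote`, and the `windows` parameter is a hypothetical addition so both branches can be exercised on any platform.

```python
import subprocess
import sys

try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2, as used in the diff


def quote_args(seq, windows=(sys.platform == "win32")):
    """Join arguments into one shell-safe command string."""
    if windows:
        return subprocess.list2cmdline(seq)
    return " ".join(quote(arg) for arg in seq)


args = ["notify.sh", "My Movie (2017).mkv"]
print(quote_args(args, windows=False))
print(quote_args(args, windows=True))
```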
|
||||
|
||||
# Unicode control characters can appear in ID3v2 tags but are not legal in XML.
|
||||
RE_UNICODE_CONTROL = u'([\u0000-\u0008\u000b-\u000c\u000e-\u001f\ufffe-\uffff])' + \
|
||||
u'|' + \
|
||||
@@ -30,7 +39,7 @@ RE_UNICODE_CONTROL = u'([\u0000-\u0008\u000b-\u000c\u000e-\u001f\ufffe-\uffff])'
|
||||
|
||||
|
||||
def cast_bool(value):
|
||||
return str(value) in ("true", "True")
|
||||
return str(value).strip() in ("true", "True")
|
||||
|
||||
|
||||
# A platform independent way to split paths which might come in with different separators.
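Review note: the `cast_bool` change adds `.strip()` so that preference values with stray whitespace still count as true. A short check of the old and new behaviour, runnable on its own (the `_old` and `_new` suffixes are only for this comparison):

```python
def cast_bool_old(value):
    return str(value) in ("true", "True")


def cast_bool_new(value):
    # Surrounding whitespace and newlines no longer defeat the comparison.
    return str(value).strip() in ("true", "True")


for raw in (True, "true", " true\n", "1", None):
    print(repr(raw), cast_bool_old(raw), cast_bool_new(raw))
```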
|
||||
@@ -110,9 +119,9 @@ def str_pad(s, length, align='left', pad_char=' ', trim=False):
|
||||
raise ValueError("Unknown align type, expected either 'left' or 'right'")
|
||||
|
||||
|
||||
def pad_title(value):
|
||||
def pad_title(value, width=49):
|
||||
"""Pad a title to `width` characters (default: 49) to force the 'details' view."""
|
||||
return str_pad(value, 49, pad_char=' ')
|
||||
return str_pad(value, width, pad_char=' ')
|
||||
|
||||
|
||||
def get_plex_item_display_title(item, kind, parent=None, parent_title=None, section_title=None,
|
||||
@@ -236,13 +245,13 @@ def get_item_hints(data):
|
||||
:param data: video item dict of media_to_videos
|
||||
:return:
|
||||
"""
|
||||
hints = {"title": data["title"], "type": "movie"}
|
||||
hints = {"title": data["original_title"] or data["title"], "type": "movie"}
|
||||
if data["type"] == "episode":
|
||||
hints.update(
|
||||
{
|
||||
"type": "episode",
|
||||
"episode_title": data["title"],
|
||||
"title": data["series"],
|
||||
"title": data["original_title"] or data["series"],
|
||||
}
|
||||
)
|
||||
return hints
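Review note: with this hunk, `get_item_hints` prefers `original_title` and falls back to the localized title (or series name) when it is empty. The sketch below restates the new version of the function from the diff and runs it on two hypothetical metadata dictionaries. The sample values are placeholders, not real library data.

```python
def get_item_hints(data):
    """Build title hints, preferring the original title when one is present."""
    hints = {"title": data["original_title"] or data["title"], "type": "movie"}
    if data["type"] == "episode":
        hints.update(
            {
                "type": "episode",
                "episode_title": data["title"],
                "title": data["original_title"] or data["series"],
            }
        )
    return hints


movie = {"type": "movie", "title": "Localized Title", "original_title": None}
episode = {"type": "episode", "title": "Episode Name", "series": "Localized Show",
           "original_title": "Original Show"}
print(get_item_hints(movie))
print(get_item_hints(episode))
```

Note that the `or` fallback also applies to an empty string, and that a dictionary without an `original_title` key would raise `KeyError`.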
|
||||
@@ -273,9 +282,21 @@ def notify_executable(exe_info, videos, subtitles, storage):
|
||||
prepared_arguments = [arg % prepared_data for arg in arguments]
|
||||
|
||||
Log.Debug(u"Calling %s with arguments: %s" % (exe, prepared_arguments))
|
||||
env = os.environ
|
||||
if not mswindows:
|
||||
env_path = {"PATH": os.pathsep.join(
|
||||
[
|
||||
"/usr/local/bin",
|
||||
"/usr/bin",
|
||||
os.environ.get("PATH", "")
|
||||
]
|
||||
)
|
||||
}
|
||||
env = dict(os.environ, **env_path)
|
||||
|
||||
try:
|
||||
output = subprocess.check_output(subprocess.list2cmdline([exe] + prepared_arguments),
|
||||
stderr=subprocess.STDOUT, shell=True)
|
||||
output = subprocess.check_output(quote_args([exe] + prepared_arguments),
|
||||
stderr=subprocess.STDOUT, shell=True, env=env)
|
||||
except subprocess.CalledProcessError:
|
||||
Log.Error(u"Calling %s failed: %s" % (exe, traceback.format_exc()))
|
||||
else:
|
||||
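The `notify_executable` hunk above prepends `/usr/local/bin` and `/usr/bin` to `PATH` on non-Windows systems and passes the result to `subprocess.check_output` via `env`. The sketch below isolates just the environment-building step. `build_env` and its parameters are hypothetical names introduced for illustration; the plugin builds the dict inline.

```python
import os


def build_env(extra_dirs=("/usr/local/bin", "/usr/bin"), base_env=None):
    """Return a copy of the environment with extra_dirs prepended to PATH.

    Hypothetical helper: mirrors the inline logic of the hunk above.
    """
    # Copy first so the caller's mapping (or os.environ) is never mutated.
    base = dict(os.environ if base_env is None else base_env)
    # Extra directories go first so they win lookups; the old PATH is kept last.
    path = os.pathsep.join(list(extra_dirs) + [base.get("PATH", "")])
    return dict(base, PATH=path)
```

Building a new dict rather than assigning into `os.environ` keeps the change local to the child process.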
@@ -303,3 +324,7 @@ def dispatch_track_usage(*args, **kwargs):
|
||||
|
||||
def get_language(lang_short):
|
||||
return Language.fromietf(lang_short)
|
||||
|
||||
|
||||
class PartUnknownException(Exception):
|
||||
pass
|
||||
@@ -2,12 +2,15 @@
|
||||
|
||||
import logging
|
||||
import re
|
||||
import traceback
|
||||
import types
|
||||
import os
|
||||
from ignore import ignore_list
|
||||
from helpers import is_recent, get_plex_item_display_title, query_plex
|
||||
from helpers import is_recent, get_plex_item_display_title, query_plex, PartUnknownException
|
||||
from lib import Plex, get_intent
|
||||
from config import config, IGNORE_FN
|
||||
from subliminal_patch.subtitle import ModifiedSubtitle
|
||||
from subzero.modification import registry as mod_registry, SubtitleModifications
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -40,11 +43,11 @@ PLEX_API_TYPE_MAP = {
|
||||
|
||||
def get_item_kind_from_rating_key(key):
|
||||
item = get_item(key)
|
||||
return PLEX_API_TYPE_MAP[get_item_kind(item)]
|
||||
return PLEX_API_TYPE_MAP.get(get_item_kind(item))
|
||||
|
||||
|
||||
def get_item_kind_from_item(item):
|
||||
return PLEX_API_TYPE_MAP[get_item_kind(item)]
|
||||
return PLEX_API_TYPE_MAP.get(get_item_kind(item))
|
||||
|
||||
|
||||
def get_item_thumb(item):
|
||||
@@ -164,14 +167,17 @@ def get_recent_items():
|
||||
"X-Plex-Container-Size": "%s" % config.max_recent_items_per_library
|
||||
}
|
||||
|
||||
episode_re = re.compile(ur'ratingKey="(?P<key>\d+)"'
|
||||
episode_re = re.compile(ur'(?su)ratingKey="(?P<key>\d+)"'
|
||||
ur'.+?grandparentRatingKey="(?P<parent_key>\d+)"'
|
||||
ur'.+?title="(?P<title>.*?)"'
|
||||
ur'.+?grandparentTitle="(?P<parent_title>.*?)"'
|
||||
ur'.+?index="(?P<episode>\d+?)"'
|
||||
ur'.+?parentIndex="(?P<season>\d+?)".+?addedAt="(?P<added>\d+)"')
|
||||
movie_re = re.compile(ur'ratingKey="(?P<key>\d+)".+?title="(?P<title>.*?)".+?addedAt="(?P<added>\d+)"')
|
||||
available_keys = ("key", "title", "parent_key", "parent_title", "season", "episode", "added")
|
||||
ur'.+?parentIndex="(?P<season>\d+?)".+?addedAt="(?P<added>\d+)"'
|
||||
ur'.+?<Part.+? file="(?P<filename>[^"]+?)"')
|
||||
movie_re = re.compile(ur'(?su)ratingKey="(?P<key>\d+)".+?title="(?P<title>.*?)'
|
||||
ur'".+?addedAt="(?P<added>\d+)"'
|
||||
ur'.+?<Part.+? file="(?P<filename>[^"]+?)"')
|
||||
available_keys = ("key", "title", "parent_key", "parent_title", "season", "episode", "added", "filename")
|
||||
recent = []
|
||||
|
||||
for section in Plex["library"].sections():
|
||||
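The `get_recent_items` hunk above adds the inline `(?su)` flags and a trailing `<Part ... file="...">` group to both patterns. The `s` flag matters because the `<Part>` element usually sits on a later line than the attributes matched first, and `.` does not cross newlines by default. The sketch below shows this for the movie pattern in Python 3 syntax (no `ur''` prefix). The sample XML is an assumed, simplified response layout, not captured server output.

```python
import re

# Same structure as the movie pattern in the hunk above; (?s) lets "." match newlines.
movie_re_sketch = re.compile(
    r'(?s)ratingKey="(?P<key>\d+)".+?title="(?P<title>.*?)"'
    r'.+?addedAt="(?P<added>\d+)"'
    r'.+?<Part.+? file="(?P<filename>[^"]+?)"'
)

# Assumed, simplified response: the Part element is on its own line.
SAMPLE = '''<Video ratingKey="12" title="Example" addedAt="1500000000">
<Media id="1">
<Part id="2" file="/media/Example (2017)/Example.mkv" />
</Media>
</Video>'''


def parse_movies(xml_text):
    """Return one dict of named groups per matched item."""
    return [m.groupdict() for m in movie_re_sketch.finditer(xml_text)]
```

Without the `s` flag the same pattern finds nothing in this sample, because the match would have to span a line break before reaching `<Part`.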
@@ -182,8 +188,10 @@ def get_recent_items():
|
||||
continue
|
||||
|
||||
use_args = args.copy()
|
||||
plex_item_type = "Movie"
|
||||
if section.type == "show":
|
||||
use_args["type"] = "4"
|
||||
plex_item_type = "Episode"
|
||||
|
||||
url = "http://127.0.0.1:32400/library/sections/%s/all" % int(section.key)
|
||||
response = query_plex(url, use_args)
|
||||
@@ -198,6 +206,10 @@ def get_recent_items():
|
||||
if data["key"] in ignore_list.videos:
|
||||
Log.Debug(u"Skipping item: %s" % data["title"])
|
||||
continue
|
||||
if is_physically_ignored(data["filename"], plex_item_type):
|
||||
Log.Debug(u"Skipping item: %s" % data["title"])
|
||||
continue
|
||||
|
||||
if is_recent(int(data["added"])):
|
||||
recent.append((int(data["added"]), section.type, section.title, data["key"]))
|
||||
|
||||
@@ -242,6 +254,16 @@ def is_ignored(rating_key, item=None):
|
||||
return True
|
||||
|
||||
# physical/path ignore
|
||||
if config.ignore_sz_files or config.ignore_paths:
|
||||
for media in item.media:
|
||||
for part in media.parts:
|
||||
if is_physically_ignored(part.file, kind):
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
|
||||
def is_physically_ignored(fn, kind):
|
||||
if config.ignore_sz_files or config.ignore_paths:
|
||||
# normally check current item folder and the library
|
||||
check_ignore_paths = [".", "../"]
|
||||
@@ -249,18 +271,15 @@ def is_ignored(rating_key, item=None):
|
||||
# series/episode, we've got a season folder here, also
|
||||
check_ignore_paths.append("../../")
|
||||
|
||||
for part in item.media.parts:
|
||||
if config.ignore_paths and config.is_path_ignored(part.file):
|
||||
Log.Debug("Item %s's path is manually ignored" % rating_key)
|
||||
return True
|
||||
if config.ignore_paths and config.is_path_ignored(fn):
|
||||
Log.Debug("Item %s's path is manually ignored" % fn)
|
||||
return True
|
||||
|
||||
if config.ignore_sz_files:
|
||||
for sub_path in check_ignore_paths:
|
||||
if config.is_physically_ignored(os.path.abspath(os.path.join(os.path.dirname(part.file), sub_path))):
|
||||
Log.Debug("An ignore file exists in either the items or its parent folders")
|
||||
return True
|
||||
|
||||
return False
|
||||
if config.ignore_sz_files:
|
||||
for sub_path in check_ignore_paths:
|
||||
if config.is_physically_ignored(os.path.normpath(os.path.join(os.path.dirname(fn), sub_path))):
|
||||
Log.Debug("An ignore file exists in either the items or its parent folders")
|
||||
return True
|
||||
|
||||
|
||||
def refresh_item(rating_key, force=False, timeout=8000, refresh_kind=None, parent_rating_key=None):
|
||||
@@ -292,4 +311,65 @@ def get_current_sub(rating_key, part_id, language):
|
||||
subtitle_storage = get_subtitle_storage()
|
||||
stored_subs = subtitle_storage.load_or_new(item)
|
||||
current_sub = stored_subs.get_any(part_id, language)
|
||||
return current_sub, stored_subs, subtitle_storage
|
||||
return current_sub, stored_subs, subtitle_storage
|
||||
|
||||
|
||||
def set_mods_for_part(rating_key, part_id, language, item_type, mods, mode="add"):
|
||||
from support.plex_media import get_plex_metadata, scan_videos
|
||||
from support.storage import save_subtitles
|
||||
|
||||
current_sub, stored_subs, storage = get_current_sub(rating_key, part_id, language)
|
||||
if mode == "add":
|
||||
for mod in mods:
|
||||
identifier, args = SubtitleModifications.parse_identifier(mod)
|
||||
mod_class = SubtitleModifications.get_mod_class(identifier)
|
||||
|
||||
if identifier not in mod_registry.mods_available:
|
||||
raise NotImplementedError("Mod unknown or not registered")
|
||||
|
||||
# clean exclusive mods
|
||||
if mod_class.exclusive and current_sub.mods:
|
||||
for current_mod in current_sub.mods[:]:
|
||||
if current_mod.startswith(identifier):
|
||||
current_sub.mods.remove(current_mod)
|
||||
Log.Info("Removing superseded mod %s" % current_mod)
|
||||
|
||||
current_sub.add_mod(mod)
|
||||
elif mode == "clear":
|
||||
current_sub.add_mod(None)
|
||||
elif mode == "remove":
|
||||
for mod in mods:
|
||||
current_sub.mods.remove(mod)
|
||||
|
||||
elif mode == "remove_last":
|
||||
if current_sub.mods:
|
||||
current_sub.mods.pop()
|
||||
else:
|
||||
raise NotImplementedError("Wrong mode given")
|
||||
storage.save(stored_subs)
|
||||
|
||||
try:
|
||||
metadata = get_plex_metadata(rating_key, part_id, item_type)
|
||||
except PartUnknownException:
|
||||
return
|
||||
|
||||
scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
|
||||
video, plex_part = scanned_parts.items()[0]
|
||||
|
||||
subtitle = ModifiedSubtitle(language, mods=current_sub.mods)
|
||||
subtitle.content = current_sub.content
|
||||
if current_sub.encoding:
|
||||
# thanks plex
|
||||
setattr(subtitle, "_guessed_encoding", current_sub.encoding)
|
||||
|
||||
subtitle.plex_media_fps = plex_part.fps
|
||||
subtitle.page_link = "modify subtitles with: %s" % (", ".join(current_sub.mods) if current_sub.mods else "none")
|
||||
subtitle.language = language
|
||||
subtitle.id = current_sub.id
|
||||
|
||||
try:
|
||||
save_subtitles(scanned_parts, {video: [subtitle]}, mode="m", bare_save=True)
|
||||
Log.Debug("Modified %s subtitle for: %s:%s with: %s", language.name, rating_key, part_id,
|
||||
", ".join(current_sub.mods) if current_sub.mods else "none")
|
||||
except:
|
||||
Log.Error("Something went wrong when modifying subtitle: %s", traceback.format_exc())
|
||||
|
||||
@@ -108,7 +108,8 @@ def find_subtitles(part):
|
||||
if ext.lower()[1:] in config.SUBTITLE_EXTS:
|
||||
# get fn without forced/default/normal tag
|
||||
split_tag = root.rsplit(".", 1)
|
||||
if len(split_tag) > 1 and split_tag[1].lower() in ['forced', 'normal', 'default']:
|
||||
if len(split_tag) > 1 and split_tag[1].lower() in ['forced', 'normal', 'default', 'embedded',
|
||||
'custom']:
|
||||
root = split_tag[0]
|
||||
|
||||
# get associated media file name without language
|
||||
@@ -160,9 +161,8 @@ def find_subtitles(part):
|
||||
# determine whether to pick up the subtitle based on our match strictness
|
||||
elif not filename_matches_part:
|
||||
if sz_config.ext_match_strictness == "strict" or (
|
||||
sz_config.ext_match_strictness == "loose" and not filename_contains_part):
|
||||
|
||||
#Log.Debug("%s doesn't match %s, skipping" % (helpers.unicodize(local_filename),
|
||||
sz_config.ext_match_strictness == "loose" and not filename_contains_part):
|
||||
# Log.Debug("%s doesn't match %s, skipping" % (helpers.unicodize(local_filename),
|
||||
# helpers.unicodize(part_basename)))
|
||||
continue
|
||||
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
# coding=utf-8
|
||||
import traceback
|
||||
import time
|
||||
|
||||
from support.config import config
|
||||
from support.helpers import get_plex_item_display_title, cast_bool
|
||||
@@ -8,8 +9,6 @@ from support.lib import Plex
|
||||
|
||||
|
||||
def item_discover_missing_subs(rating_key, kind="show", added_at=None, section_title=None, internal=False, external=True, languages=()):
|
||||
existing_subs = {"internal": [], "external": [], "count": 0}
|
||||
|
||||
item_id = int(rating_key)
|
||||
item = get_item(rating_key)
|
||||
|
||||
@@ -18,36 +17,41 @@ def item_discover_missing_subs(rating_key, kind="show", added_at=None, section_t
|
||||
else:
|
||||
item_title = get_plex_item_display_title(item, kind, section_title=section_title)
|
||||
|
||||
video = item.media
|
||||
missing = set()
|
||||
languages_set = set(languages)
|
||||
for media in item.media:
|
||||
existing_subs = {"internal": [], "external": [], "count": 0}
|
||||
for part in media.parts:
|
||||
for stream in part.streams:
|
||||
if stream.stream_type == 3:
|
||||
if stream.index:
|
||||
key = "internal"
|
||||
else:
|
||||
key = "external"
|
||||
|
||||
for part in video.parts:
|
||||
for stream in part.streams:
|
||||
if stream.stream_type == 3:
|
||||
if stream.index:
|
||||
key = "internal"
|
||||
else:
|
||||
key = "external"
|
||||
existing_subs[key].append(Locale.Language.Match(stream.language_code or ""))
|
||||
existing_subs["count"] = existing_subs["count"] + 1
|
||||
|
||||
existing_subs[key].append(Locale.Language.Match(stream.language_code or ""))
|
||||
existing_subs["count"] = existing_subs["count"] + 1
|
||||
missing_from_part = set(languages_set)
|
||||
if existing_subs["count"]:
|
||||
existing_flat = set((existing_subs["internal"] if internal else []) + (existing_subs["external"] if external else []))
|
||||
if languages_set.issubset(existing_flat) or (len(existing_flat) >= 1 and Prefs['subtitles.only_one']):
|
||||
# all subs found
|
||||
#Log.Info(u"All subtitles exist for '%s'", item_title)
|
||||
continue
|
||||
|
||||
missing = languages
|
||||
if existing_subs["count"]:
|
||||
existing_flat = (existing_subs["internal"] if internal else []) + (existing_subs["external"] if external else [])
|
||||
languages_set = set(languages)
|
||||
if languages_set.issubset(existing_flat) or (len(existing_flat) >= 1 and Prefs['subtitles.only_one']):
|
||||
# all subs found
|
||||
Log.Info(u"All subtitles exist for '%s'", item_title)
|
||||
return
|
||||
missing_from_part = languages_set - existing_flat
|
||||
|
||||
missing = languages_set - set(existing_flat)
|
||||
Log.Info(u"Subs still missing for '%s': %s", item_title, missing)
|
||||
if missing_from_part:
|
||||
Log.Info(u"Subs still missing for '%s' (%s: %s): %s", item_title, rating_key, media.id,
|
||||
missing_from_part)
|
||||
missing.update(missing_from_part)
|
||||
|
||||
if missing:
|
||||
return added_at, item_id, item_title, item, missing
|
||||
|
||||
|
||||
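The rewritten `item_discover_missing_subs` above no longer returns on the first complete media entry. It computes the missing languages per media entry and merges them into one set. A reduced sketch of that set logic follows; `missing_languages` is a hypothetical helper, and plain strings stand in for the language objects the plugin uses.

```python
def missing_languages(existing_per_media, wanted, only_one=False):
    """Union of wanted languages that are missing from any media entry.

    existing_per_media: one iterable of existing subtitle languages per media entry.
    only_one: mirrors the 'subtitles.only_one' preference from the hunk above.
    """
    wanted = set(wanted)
    missing = set()
    for existing in existing_per_media:
        existing = set(existing)
        # A media entry is complete if it has every wanted language, or if a
        # single subtitle is enough and at least one exists.
        if wanted.issubset(existing) or (only_one and existing):
            continue
        missing.update(wanted - existing)
    return missing
```

For example, with two media entries where only the second has German subtitles, `missing_languages([{"en"}, {"en", "de"}], {"en", "de"})` reports German as still missing for the item.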
def items_get_all_missing_subs(items):
|
||||
def items_get_all_missing_subs(items, sleep_after_request=False):
|
||||
missing = []
|
||||
for added_at, kind, section_title, key in items:
|
||||
try:
|
||||
@@ -65,6 +69,8 @@ def items_get_all_missing_subs(items):
|
||||
missing.append(state)
|
||||
except:
|
||||
Log.Error("Something went wrong when getting the state of item %s: %s", key, traceback.format_exc())
|
||||
if sleep_after_request:
|
||||
time.sleep(sleep_after_request)
|
||||
return missing
|
||||
|
||||
|
||||
|
||||
@@ -1,15 +1,14 @@
|
||||
# coding=utf-8
|
||||
|
||||
import os
|
||||
from urllib2 import URLError
|
||||
|
||||
import helpers
|
||||
|
||||
from config import config
|
||||
from items import get_item
|
||||
from lib import get_intent, Plex
|
||||
from config import config
|
||||
from subzero.video import parse_video
|
||||
|
||||
|
||||
def get_metadata_dict(item, part, add):
|
||||
data = {
|
||||
"item": item,
|
||||
@@ -22,6 +21,54 @@ def get_metadata_dict(item, part, add):
|
||||
return data
|
||||
|
||||
|
||||
imdb_guid_identifier = "com.plexapp.agents.imdb://"
|
||||
tvdb_guid_identifier = "com.plexapp.agents.thetvdb://"
|
||||
|
||||
|
||||
def get_plexapi_stream_info(plex_item, part_id=None):
|
||||
d = {"stream": {}}
|
||||
data = d["stream"]
|
||||
|
||||
# find current part
|
||||
current_part = None
|
||||
current_media = None
|
||||
for media in plex_item.media:
|
||||
for part in media.parts:
|
||||
if not part_id or str(part.id) == part_id:
|
||||
current_part = part
|
||||
current_media = media
|
||||
break
|
||||
if current_part:
|
||||
break
|
||||
|
||||
if not current_part:
|
||||
return d
|
||||
|
||||
data["video_codec"] = current_media.video_codec
|
||||
data["audio_codec"] = current_media.audio_codec.upper()
|
||||
|
||||
if data["audio_codec"] == "DCA":
|
||||
data["audio_codec"] = "DTS"
|
||||
|
||||
if current_media.audio_channels == 8:
|
||||
data["audio_channels"] = "7.1"
|
||||
|
||||
elif current_media.audio_channels == 6:
|
||||
data["audio_channels"] = "5.1"
|
||||
else:
|
||||
data["audio_channels"] = "%s.0" % str(current_media.audio_channels)
|
||||
|
||||
# iter streams
|
||||
for stream in current_part.streams:
|
||||
if stream.stream_type == 1:
|
||||
# video stream
|
||||
data["resolution"] = "%s%s" % (current_media.video_resolution,
|
||||
"i" if stream.scan_type != "progressive" else "p")
|
||||
break
|
||||
|
||||
return d
|
||||
|
||||
|
||||
def media_to_videos(media, kind="series"):
|
||||
"""
|
||||
iterates through media and returns the associated parts (videos)
|
||||
@@ -31,36 +78,61 @@ def media_to_videos(media, kind="series"):
|
||||
"""
|
||||
videos = []
|
||||
|
||||
# this is a Show or a Movie object
|
||||
plex_item = get_item(media.id)
|
||||
year = plex_item.year
|
||||
original_title = plex_item.title_original
|
||||
|
||||
if kind == "series":
|
||||
for season in media.seasons:
|
||||
season_object = media.seasons[season]
|
||||
for episode in media.seasons[season].episodes:
|
||||
ep = media.seasons[season].episodes[episode]
|
||||
|
||||
tvdb_id = None
|
||||
series_tvdb_id = None
|
||||
if tvdb_guid_identifier in ep.guid:
|
||||
tvdb_id = ep.guid[len(tvdb_guid_identifier):].split("?")[0]
|
||||
series_tvdb_id = tvdb_id.split("/")[0]
|
||||
|
||||
# get plex item via API for additional metadata
|
||||
plex_episode = get_item(ep.id)
|
||||
stream_info = get_plexapi_stream_info(plex_episode)
|
||||
|
||||
for item in media.seasons[season].episodes[episode].items:
|
||||
for part in item.parts:
|
||||
videos.append(
|
||||
get_metadata_dict(plex_episode, part,
|
||||
{"plex_part": part, "type": "episode", "title": ep.title,
|
||||
"series": media.title, "id": ep.id,
|
||||
"series_id": media.id, "season_id": season_object.id,
|
||||
"episode": plex_episode.index, "season": plex_episode.season.index,
|
||||
"section": plex_episode.section.title
|
||||
})
|
||||
dict(stream_info, **{"plex_part": part, "type": "episode",
|
||||
"title": ep.title,
|
||||
"series": media.title, "id": ep.id, "year": year,
|
||||
"series_id": media.id,
|
||||
"season_id": season_object.id,
|
||||
"imdb_id": None, "series_tvdb_id": series_tvdb_id,
|
||||
"tvdb_id": tvdb_id,
|
||||
"original_title": original_title,
|
||||
"episode": plex_episode.index,
|
||||
"season": plex_episode.season.index,
|
||||
"section": plex_episode.section.title
|
||||
})
|
||||
)
|
||||
)
|
||||
else:
|
||||
plex_item = get_item(media.id)
|
||||
stream_info = get_plexapi_stream_info(plex_item)
|
||||
imdb_id = None
|
||||
if imdb_guid_identifier in media.guid:
|
||||
imdb_id = media.guid[len(imdb_guid_identifier):].split("?")[0]
|
||||
for item in media.items:
|
||||
for part in item.parts:
|
||||
videos.append(
|
||||
get_metadata_dict(plex_item, part, {"plex_part": part, "type": "movie",
|
||||
"title": media.title, "id": media.id,
|
||||
"series_id": None,
|
||||
"season_id": None,
|
||||
"section": plex_item.section.title})
|
||||
get_metadata_dict(plex_item, part, dict(stream_info, **{"plex_part": part, "type": "movie",
|
||||
"title": media.title, "id": media.id,
|
||||
"series_id": None, "year": year,
|
||||
"season_id": None, "imdb_id": imdb_id,
|
||||
"original_title": original_title,
|
||||
"series_tvdb_id": None, "tvdb_id": None,
|
||||
"section": plex_item.section.title})
|
||||
)
|
||||
)
|
||||
return videos
|
||||
|
||||
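The `media_to_videos` hunk above derives `tvdb_id` and `series_tvdb_id` from the episode GUID by cutting off the agent prefix and the query string. A standalone sketch of that parsing step is shown below. `parse_tvdb_guid` is a hypothetical name, and the example GUID layout (`series/season/episode?lang=..`) is an assumption about what the agent emits.

```python
TVDB_GUID_PREFIX = "com.plexapp.agents.thetvdb://"


def parse_tvdb_guid(guid):
    """Return (tvdb_id, series_tvdb_id), or (None, None) for other agents."""
    if TVDB_GUID_PREFIX not in guid:
        return None, None
    # Drop the agent prefix, then everything from the query string on.
    tvdb_id = guid[len(TVDB_GUID_PREFIX):].split("?")[0]
    # The first path component is taken as the series id, as in the hunk above.
    return tvdb_id, tvdb_id.split("/")[0]
```

Note that, as in the diff, `tvdb_id` keeps the full path remainder and only `series_tvdb_id` is reduced to the first component.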
@@ -92,10 +164,10 @@ def get_media_item_ids(media, kind="series"):
|
||||
return ids
|
||||
|
||||
|
||||
def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):
|
||||
def scan_video(pms_video_info, ignore_all=False, hints=None, rating_key=None):
|
||||
"""
|
||||
returns a subliminal/guessit-refined parsed video
|
||||
:param plex_part:
|
||||
:param pms_video_info:
|
||||
:param ignore_all:
|
||||
:param hints:
|
||||
:param rating_key:
|
||||
@@ -104,6 +176,8 @@ def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):
|
||||
embedded_subtitles = not ignore_all and Prefs['subtitles.scan.embedded']
|
||||
external_subtitles = not ignore_all and Prefs['subtitles.scan.external']
|
||||
|
||||
plex_part = pms_video_info["plex_part"]
|
||||
|
||||
if ignore_all:
|
||||
Log.Debug("Force refresh intended.")
|
||||
|
||||
@@ -111,7 +185,10 @@ def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):
|
||||
plex_part.file, external_subtitles, embedded_subtitles))
|
||||
|
||||
known_embedded = []
|
||||
parts = list(Plex["library"].metadata(rating_key))[0].media.parts
|
||||
parts = []
|
||||
for media in list(Plex["library"].metadata(rating_key))[0].media:
|
||||
parts += media.parts
|
||||
|
||||
plexpy_part = None
|
||||
for part in parts:
|
||||
if int(part.id) == int(plex_part.id):
|
||||
@@ -139,7 +216,7 @@ def scan_video(plex_part, ignore_all=False, hints=None, rating_key=None):
|
||||
|
||||
try:
|
||||
# get basic video info scan (filename)
|
||||
video = parse_video(plex_part.file, hints, external_subtitles=external_subtitles,
|
||||
video = parse_video(plex_part.file, pms_video_info, hints, external_subtitles=external_subtitles,
|
||||
embedded_subtitles=embedded_subtitles, known_embedded=known_embedded,
|
||||
forced_only=config.forced_only, video_fps=plex_part.fps)
|
||||
|
||||
@@ -165,7 +242,7 @@ def scan_videos(videos, kind="series", ignore_all=False):
|
||||
|
||||
hints = helpers.get_item_hints(video)
|
||||
video["plex_part"].fps = get_stream_fps(video["plex_part"].streams)
|
||||
scanned_video = scan_video(video["plex_part"], ignore_all=force_refresh or ignore_all, hints=hints,
|
||||
scanned_video = scan_video(video, ignore_all=force_refresh or ignore_all, hints=hints,
|
||||
rating_key=video["id"])
|
||||
|
||||
if not scanned_video:
|
||||
@@ -179,49 +256,78 @@ def scan_videos(videos, kind="series", ignore_all=False):
|
||||
return ret
|
||||
|
||||
|
||||
class PartUnknownException(Exception):
|
||||
pass
|
||||
|
||||
|
||||
def get_plex_metadata(rating_key, part_id, item_type):
|
||||
"""
|
||||
uses the Plex 3rd party API accessor to get metadata information
|
||||
|
||||
:param rating_key:
|
||||
:param rating_key: movie or episode
|
||||
:param part_id:
|
||||
:param item_type:
|
||||
:return:
|
||||
"""
|
||||
|
||||
plex_item = list(Plex["library"].metadata(rating_key))[0]
|
||||
try:
|
||||
plex_item = list(Plex["library"].metadata(rating_key))[0]
|
||||
except URLError:
|
||||
return None
|
||||
|
||||
# find current part
|
||||
current_part = None
|
||||
for part in plex_item.media.parts:
|
||||
if str(part.id) == part_id:
|
||||
current_part = part
|
||||
for media in plex_item.media:
|
||||
for part in media.parts:
|
||||
if str(part.id) == part_id:
|
||||
current_part = part
|
||||
|
||||
if not current_part:
|
||||
raise PartUnknownException("Part unknown")
|
||||
raise helpers.PartUnknownException("Part unknown")
|
||||
|
||||
stream_info = get_plexapi_stream_info(plex_item, part_id)
|
||||
|
||||
# get normalized metadata
|
||||
# fixme: duplicated logic of media_to_videos
|
||||
if item_type == "episode":
|
||||
show = list(Plex["library"].metadata(plex_item.show.rating_key))[0]
|
||||
year = show.year
|
||||
tvdb_id = None
|
||||
series_tvdb_id = None
|
||||
original_title = show.title_original
|
||||
if tvdb_guid_identifier in plex_item.guid:
|
||||
tvdb_id = plex_item.guid[len(tvdb_guid_identifier):].split("?")[0]
|
||||
series_tvdb_id = tvdb_id.split("/")[0]
|
||||
metadata = get_metadata_dict(plex_item, current_part,
|
||||
{"plex_part": current_part, "type": "episode", "title": plex_item.title,
|
||||
"series": plex_item.show.title, "id": plex_item.rating_key,
|
||||
"series_id": plex_item.show.rating_key,
|
||||
"season_id": plex_item.season.rating_key,
|
||||
"season": plex_item.season.index,
|
||||
"episode": plex_item.index
|
||||
})
|
||||
dict(stream_info,
|
||||
**{"plex_part": current_part, "type": "episode", "title": plex_item.title,
|
||||
"series": plex_item.show.title, "id": plex_item.rating_key,
|
||||
"series_id": plex_item.show.rating_key,
|
||||
"season_id": plex_item.season.rating_key,
|
||||
"imdb_id": None,
|
||||
"year": year,
|
||||
"tvdb_id": tvdb_id,
|
||||
"series_tvdb_id": series_tvdb_id,
|
||||
"original_title": original_title,
|
||||
"season": plex_item.season.index,
|
||||
"episode": plex_item.index
|
||||
})
|
||||
)
|
||||
else:
|
||||
metadata = get_metadata_dict(plex_item, current_part, {"plex_part": current_part, "type": "movie",
|
||||
"title": plex_item.title, "id": plex_item.rating_key,
|
||||
"series_id": None,
|
||||
"season_id": None,
|
||||
"season": None,
|
||||
"episode": None,
|
||||
"section": plex_item.section.title})
|
||||
imdb_id = None
|
||||
original_title = plex_item.title_original
|
||||
if imdb_guid_identifier in plex_item.guid:
|
||||
imdb_id = plex_item.guid[len(imdb_guid_identifier):].split("?")[0]
|
||||
metadata = get_metadata_dict(plex_item, current_part,
|
||||
dict(stream_info, **{"plex_part": current_part, "type": "movie",
|
||||
"title": plex_item.title, "id": plex_item.rating_key,
|
||||
"series_id": None,
|
||||
"season_id": None,
|
||||
"imdb_id": imdb_id,
|
||||
"year": plex_item.year,
|
||||
"tvdb_id": None,
|
||||
"series_tvdb_id": None,
|
||||
"original_title": original_title,
|
||||
"season": None,
|
||||
"episode": None,
|
||||
"section": plex_item.section.title})
|
||||
)
|
||||
return metadata
|
||||
|
||||
|
||||
@@ -257,3 +363,24 @@ class PMSMediaProxy(object):
|
||||
break
|
||||
|
||||
m = m.children[0]
|
||||
|
||||
def get_all_parts(self):
|
||||
"""
|
||||
walk the mediatree until the first media item is found and return all of its parts
|
||||
:return:
|
||||
"""
|
||||
m = self.mediatree
|
||||
parts = []
|
||||
while 1:
|
||||
if m.items:
|
||||
media_item = m.items[0]
|
||||
for part in media_item.parts:
|
||||
parts.append(part)
|
||||
break
|
||||
|
||||
if not m.children:
|
||||
break
|
||||
|
||||
m = m.children[0]
|
||||
return parts
|
||||
|
||||
@@ -168,6 +168,7 @@ class DefaultScheduler(object):
|
||||
for args, kwargs in queue:
|
||||
Log.Debug("Dispatching single task: %s, %s", args, kwargs)
|
||||
Thread.Create(self.run_task, True, *args, **kwargs)
|
||||
Thread.Sleep(5.0)
|
||||
|
||||
# scheduled tasks
|
||||
for name, info in self.tasks.iteritems():
|
||||
@@ -185,9 +186,13 @@ class DefaultScheduler(object):
|
||||
continue
|
||||
|
||||
if not task.last_run or (task.last_run + datetime.timedelta(**{frequency_key: frequency_num}) <= now):
|
||||
# fixme: scheduled tasks run synchronously. is this the best idea?
|
||||
#Thread.Create(self.run_task, True, name)
|
||||
#Thread.Sleep(5.0)
|
||||
self.run_task(name)
|
||||
Thread.Sleep(5.0)
|
||||
|
||||
Thread.Sleep(5.0)
|
||||
Thread.Sleep(1)
|
||||
|
||||
|
||||
scheduler = DefaultScheduler()
|
||||
|
||||
@@ -137,7 +137,8 @@ def save_subtitles_to_file(subtitles):
|
||||
os.makedirs(fld)
|
||||
subliminal.save_subtitles(video, video_subtitles, directory=fld, single=cast_bool(Prefs['subtitles.only_one']),
|
||||
encode_with=force_utf8 if config.enforce_encoding else None,
|
||||
chmod=config.chmod, forced_tag=config.forced_only, path_decoder=force_unicode)
|
||||
chmod=config.chmod, forced_tag=config.forced_only, path_decoder=force_unicode,
|
||||
debug_mods=config.debug_mods)
|
||||
return True
|
||||
|
||||
|
||||
@@ -145,7 +146,8 @@ def save_subtitles_to_metadata(videos, subtitles):
|
||||
for video, video_subtitles in subtitles.items():
|
||||
mediaPart = videos[video]
|
||||
for subtitle in video_subtitles:
|
||||
content = force_utf8(subtitle.text) if config.enforce_encoding else subtitle.content
|
||||
content = force_utf8(subtitle.get_modified_text(debug=config.debug_mods)) if config.enforce_encoding else \
|
||||
subtitle.get_modified_content(debug=config.debug_mods)
|
||||
|
||||
if not isinstance(mediaPart, Framework.api.agentkit.MediaPart):
|
||||
# we're being handed a Plex.py model instance here, not an internal PMS MediaPart object.
|
||||
@@ -204,6 +206,8 @@ def save_subtitles(scanned_video_part_map, downloaded_subtitles, mode="a", bare_
|
||||
if not bare_save and save_successful and config.notify_executable:
|
||||
notify_executable(config.notify_executable, scanned_video_part_map, downloaded_subtitles, storage)
|
||||
|
||||
if not bare_save:
|
||||
if not bare_save and save_successful:
|
||||
store_subtitle_info(scanned_video_part_map, downloaded_subtitles, storage, mode=mode)
|
||||
|
||||
return save_successful
|
||||
|
||||
|
||||
@@ -129,9 +129,8 @@ class DefaultSubtitleHelper(SubtitleHelper):
|
||||
default = '1'
|
||||
|
||||
# Attempt to extract the language from the filename (e.g. Avatar (2009).eng)
|
||||
language = ""
|
||||
|
||||
# IETF support thanks to https://github.com/hpsbranco/LocalMedia.bundle/commit/4fad9aefedece78a1fa96401304351347f644369
|
||||
# IETF support thanks to
|
||||
# https://github.com/hpsbranco/LocalMedia.bundle/commit/4fad9aefedece78a1fa96401304351347f644369
|
||||
language = Locale.Language.Match(match_ietf_language(file))
|
||||
|
||||
# skip non-SRT if wanted
|
||||
@@ -194,7 +193,10 @@ def get_subtitles_from_metadata(part):
|
||||
def force_utf8(content):
|
||||
a = UnicodeDammit(content)
|
||||
|
||||
Log.Debug("detected encoding: %s (None: most likely already successfully decoded)" % a.original_encoding)
|
||||
if a.original_encoding:
|
||||
Log.Debug("detected encoding: %s (None: most likely already successfully decoded)" % a.original_encoding)
|
||||
else:
|
||||
Log.Debug("detected encoding: unicode (already decoded)")
|
||||
|
||||
# easy way out - already utf-8
|
||||
if a.original_encoding and a.original_encoding == "utf-8":
|
||||
|
||||
@@ -4,6 +4,7 @@ import datetime
|
||||
import time
|
||||
import operator
|
||||
import traceback
|
||||
from urllib2 import URLError
|
||||
|
||||
from subliminal_patch.score import compute_score
|
||||
from subliminal_patch.core import download_subtitles
|
||||
@@ -16,8 +17,8 @@ from storage import save_subtitles, whack_missing_parts, get_subtitle_storage
|
||||
from support.config import config
|
||||
from support.items import get_recent_items, is_ignored, get_item
|
||||
from support.lib import Plex
|
||||
from support.helpers import track_usage, get_title_for_video_metadata, cast_bool
|
||||
from support.plex_media import scan_videos, get_plex_metadata, PartUnknownException
|
||||
from support.helpers import track_usage, get_title_for_video_metadata, cast_bool, PartUnknownException
|
||||
from support.plex_media import scan_videos, get_plex_metadata
|
||||
|
||||
|
||||
class Task(object):
|
||||
@@ -80,14 +81,16 @@ class Task(object):
|
||||
return
|
||||
|
||||
def run(self):
|
||||
Log.Info(u"Task: running: %s", self.name)
|
||||
self.time_start = datetime.datetime.now()
|
||||
|
||||
def post_run(self, data_holder):
|
||||
self.running = False
|
||||
self.last_run = datetime.datetime.now()
|
||||
if self.time_start:
|
||||
if self.time_start and self.last_run:
|
||||
self.last_run_time = self.last_run - self.time_start
|
||||
self.time_start = None
|
||||
Log.Info(u"Task: ran: %s", self.name)
|
||||
|
||||
|
||||
class SearchAllRecentlyAddedMissing(Task):
|
||||
@@ -122,7 +125,7 @@ class SearchAllRecentlyAddedMissing(Task):
|
||||
def prepare(self, *args, **kwargs):
|
||||
self.items_done = []
|
||||
recent_items = get_recent_items()
|
||||
missing = items_get_all_missing_subs(recent_items)
|
||||
missing = items_get_all_missing_subs(recent_items, sleep_after_request=0.2)
|
||||
ids = set([id for added_at, id, title, item, missing_languages in missing if not is_ignored(id, item=item)])
|
||||
self.items_searching = missing
|
||||
self.items_searching_ids = ids
|
||||
@@ -138,14 +141,19 @@ class SearchAllRecentlyAddedMissing(Task):
|
||||
|
||||
for added_at, item_id, title, item, missing_languages in self.items_searching:
|
||||
Log.Debug(u"Task: %s, triggering refresh for %s (%s)", self.name, title, item_id)
|
||||
refresh_item(item_id)
|
||||
try:
|
||||
refresh_item(item_id)
|
||||
except URLError:
|
||||
# timeout
|
||||
pass
|
||||
search_started = datetime.datetime.now()
|
||||
tries = 1
|
||||
while 1:
|
||||
if item_id in self.items_done:
|
||||
items_done_count += 1
|
||||
Log.Debug(u"Task: %s, item %s done", self.name, item_id)
|
||||
self.percentage = int(items_done_count * 100 / missing_count)
|
||||
Log.Debug(u"Task: %s, item %s done (%s%%, %s/%s)", self.name, item_id, self.percentage,
|
||||
items_done_count, missing_count)
|
||||
break
|
||||
|
||||
# an item is considered stalled once self.stall_time seconds have passed since the last refresh
|
||||
@@ -158,14 +166,18 @@ class SearchAllRecentlyAddedMissing(Task):
|
||||
Log.Debug(u"Task: %s, item stalled for %s seconds: %s, retrying", self.name, self.stall_time,
|
||||
item_id)
|
||||
tries += 1
|
||||
refresh_item(item_id)
|
||||
try:
|
||||
refresh_item(item_id)
|
||||
except URLError:
|
||||
pass
|
||||
search_started = datetime.datetime.now()
|
||||
time.sleep(1)
|
||||
time.sleep(0.1)
|
||||
# we can't hammer the PMS, otherwise requests will be stalled
|
||||
time.sleep(1)
|
||||
time.sleep(5)
|
||||
|
||||
Log.Debug("Task: %s, done. Failed items: %s", self.name, self.items_failed)
|
||||
Log.Debug("Task: %s, done (%s%%, %s/%s). Failed items: %s", self.name, self.percentage,
|
||||
items_done_count, missing_count, self.items_failed)
|
||||
self.running = False
|
||||
|
||||
def post_run(self, task_data):
|
||||
@@ -179,13 +191,11 @@ class SearchAllRecentlyAddedMissing(Task):
|
||||
|
||||
|
||||
class SubtitleListingMixin(object):
|
||||
def list_subtitles(self, rating_key, item_type, part_id, language):
|
||||
def list_subtitles(self, rating_key, item_type, part_id, language, skip_wrong_fps=True):
|
||||
metadata = get_plex_metadata(rating_key, part_id, item_type)
|
||||
|
||||
if item_type == "episode":
|
||||
min_score = 240
|
||||
else:
|
||||
min_score = 60
|
||||
if not metadata:
|
||||
return
|
||||
|
||||
scanned_parts = scan_videos([metadata], kind="series" if item_type == "episode" else "movie", ignore_all=True)
|
||||
if not scanned_parts:
|
||||
@@ -195,9 +205,21 @@ class SubtitleListingMixin(object):
|
||||
video, plex_part = scanned_parts.items()[0]
|
||||
config.init_subliminal_patches()
|
||||
|
||||
provider_settings = config.provider_settings.copy()
|
||||
if not skip_wrong_fps:
|
||||
provider_settings = config.provider_settings.copy()
|
||||
provider_settings["opensubtitles"]["skip_wrong_fps"] = False
|
||||
|
||||
if item_type == "episode":
|
||||
min_score = 240
|
||||
if video.is_special:
|
||||
min_score = 180
|
||||
else:
|
||||
min_score = 60
|
||||
|
||||
available_subs = list_all_subtitles(scanned_parts, {Language.fromietf(language)},
|
||||
providers=config.providers,
|
||||
provider_configs=config.provider_settings,
|
||||
provider_configs=provider_settings,
|
||||
pool_class=config.provider_pool)
|
||||
|
||||
use_hearing_impaired = Prefs['subtitles.search.hearingImpaired'] in ("prefer", "force HI")
|
||||
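`list_subtitles` takes a `.copy()` of `config.provider_settings` before switching off `skip_wrong_fps` for OpenSubtitles. If `provider_settings` is a plain dict of per-provider dicts (an assumption, since the diff does not show its type), `dict.copy()` is shallow and the nested write would still reach the shared configuration. The sketch below shows the difference with made-up settings data.

```python
import copy

# Hypothetical stand-in for config.provider_settings: a dict of per-provider dicts.
shared = {
    "opensubtitles": {"skip_wrong_fps": True},
    "addic7ed": {"use_random_agents": False},
}

# Shallow copy: the outer dict is new, the inner dicts are the same objects.
shallow = shared.copy()
shallow["opensubtitles"]["skip_wrong_fps"] = False
leaked = shared["opensubtitles"]["skip_wrong_fps"]  # the shared config changed too

# Reset, then use a deep copy so the override stays local to this call.
shared["opensubtitles"]["skip_wrong_fps"] = True
isolated = copy.deepcopy(shared)
isolated["opensubtitles"]["skip_wrong_fps"] = False
kept = shared["opensubtitles"]["skip_wrong_fps"]  # the shared config is untouched
```

Whether this matters in the plugin depends on how `provider_settings` is built. If it is rebuilt on every access, the shallow copy is harmless.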
@@ -248,7 +270,7 @@ class DownloadSubtitleMixin(object):
|
||||
if subtitle.content:
|
||||
try:
|
||||
whack_missing_parts(scanned_parts)
|
||||
save_subtitles(scanned_parts, {video: [subtitle]}, mode=mode)
|
||||
save_subtitles(scanned_parts, {video: [subtitle]}, mode=mode, mods=config.default_mods)
|
||||
Log.Debug("Manually downloaded subtitle for: %s", rating_key)
|
||||
download_successful = True
|
||||
refresh_item(rating_key)
|
||||
@@ -291,7 +313,13 @@ class AvailableSubsForItem(SubtitleListingMixin, Task):
|
||||
super(AvailableSubsForItem, self).run()
|
||||
self.running = True
|
||||
track_usage("Subtitle", "manual", "list", 1)
|
||||
self.data = self.list_subtitles(self.rating_key, self.item_type, self.part_id, self.language)
|
||||
subs = self.list_subtitles(self.rating_key, self.item_type, self.part_id, self.language, skip_wrong_fps=False)
|
||||
if not subs:
|
||||
self.data = None
|
||||
return
|
||||
|
||||
# we can't have nasty unpicklable stuff like ZipFile, BytesIO etc in self.data
|
||||
self.data = [s.make_picklable() for s in subs]
|
||||
|
||||
def post_run(self, task_data):
|
||||
super(AvailableSubsForItem, self).post_run(task_data)
|
||||
@@ -362,13 +390,26 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
|
||||
return
|
||||
|
||||
now = datetime.datetime.now()
|
||||
min_score_series = int(Prefs["subtitles.search.minimumTVScore2"].strip())
|
||||
min_score_movies = int(Prefs["subtitles.search.minimumMovieScore2"].strip())
|
||||
overwrite_manually_modified = cast_bool(
|
||||
Prefs["scheduler.tasks.FindBetterSubtitles.overwrite_manually_modified"])
|
||||
overwrite_manually_selected = cast_bool(
|
||||
Prefs["scheduler.tasks.FindBetterSubtitles.overwrite_manually_selected"])
|
||||
|
||||
subtitle_storage = get_subtitle_storage()
|
||||
recent_subs = subtitle_storage.load_recent_files(age_days=max_search_days)
|
||||
viable_item_count = 0
|
||||
|
||||
for fn, stored_subs in recent_subs.iteritems():
|
||||
video_id = stored_subs.video_id
|
||||
cutoff = self.series_cutoff if stored_subs.item_type == "episode" else self.movies_cutoff
|
||||
|
||||
if stored_subs.item_type == "episode":
|
||||
cutoff = self.series_cutoff
|
||||
min_score = min_score_series
|
||||
else:
|
||||
cutoff = self.movies_cutoff
|
||||
min_score = min_score_movies
|
||||
|
||||
# don't search for better subtitles until at least 30 minutes have passed
|
||||
if stored_subs.added_at + datetime.timedelta(minutes=30) > now:
|
||||
@@ -379,6 +420,7 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
|
||||
if stored_subs.added_at + datetime.timedelta(days=max_search_days) <= now:
|
||||
continue
|
||||
|
||||
viable_item_count += 1
|
||||
ditch_parts = []
|
||||
|
||||
# look through all stored subtitle data
|
||||
@@ -398,14 +440,20 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
|
||||
|
||||
# late cutoff met? skip
|
||||
if current_score >= cutoff:
|
||||
Log.Debug(u"Skipping finding better subs, cutoff met (current: %s, cutoff: %s): %s",
|
||||
current_score, cutoff, stored_subs.title)
|
||||
Log.Debug(u"Skipping finding better subs, cutoff met (current: %s, cutoff: %s): %s (%s)",
|
||||
current_score, cutoff, stored_subs.title, video_id)
|
||||
continue
|
||||
|
||||
# got manual subtitle but don't want to touch those?
|
||||
if current_mode == "m" and \
|
||||
not cast_bool(Prefs["scheduler.tasks.FindBetterSubtitles.overwrite_manually_selected"]):
|
||||
Log.Debug(u"Skipping finding better subs, had manual: %s", stored_subs.title)
|
||||
if current_mode == "m" and not overwrite_manually_selected:
|
||||
Log.Debug(u"Skipping finding better subs, had manual: %s (%s)", stored_subs.title, video_id)
|
||||
continue
|
||||
|
||||
# subtitle modifications different from default
|
||||
if not overwrite_manually_modified and current.mods \
|
||||
and set(current.mods).difference(set(config.default_mods)):
|
||||
Log.Debug(u"Skipping finding better subs, it has manual modifications: %s (%s)",
|
||||
stored_subs.title, video_id)
|
||||
continue
|
||||
|
||||
try:
|
||||
@@ -420,7 +468,7 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
|
||||
better_downloaded = False
|
||||
better_tried_download = 0
|
||||
for sub in subs:
|
||||
if sub.score > current_score:
|
||||
if sub.score > current_score and sub.score > min_score:
|
||||
Log.Debug("Better subtitle found for %s, downloading", video_id)
|
||||
better_tried_download += 1
|
||||
ret = self.download_subtitle(sub, video_id, mode="b")
|
||||
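The `FindBetterSubtitles` hunks above add two things: a third skip rule (subtitles carrying non-default modifications) and a minimum-score check on candidates. A compact sketch of both decisions follows, in Python 3. `StoredSub`, `skip_reason`, `better_candidates` and the mod names used in the tests are hypothetical; only the comparisons are taken from the diff.

```python
from collections import namedtuple

# Hypothetical, simplified record of the currently stored subtitle.
StoredSub = namedtuple("StoredSub", "score mode mods")


def skip_reason(current, cutoff, default_mods,
                overwrite_manually_selected=False,
                overwrite_manually_modified=False):
    """Return why the stored subtitle should be left alone, or None to search."""
    if current.score >= cutoff:
        return "cutoff met"
    if current.mode == "m" and not overwrite_manually_selected:
        return "had manual"
    if (not overwrite_manually_modified and current.mods
            and set(current.mods).difference(default_mods)):
        return "manual modifications"
    return None


def better_candidates(scores, current_score, min_score):
    """Keep only scores that beat both the current subtitle and the minimum score."""
    return [s for s in scores if s > current_score and s > min_score]
```

Both comparisons in `better_candidates` are strict, as in the diff, so a candidate that only equals `min_score` is not downloaded.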
@@ -444,8 +492,13 @@ class FindBetterSubtitles(DownloadSubtitleMixin, SubtitleListingMixin, Task):
|
||||
pass
|
||||
subtitle_storage.save(stored_subs)
|
||||
|
||||
time.sleep(1)
|
||||
|
||||
if better_found:
|
||||
Log.Debug("Task: %s, done. Better subtitles found for %s items", self.name, better_found)
|
||||
Log.Debug("Task: %s, done. Better subtitles found for %s/%s items", self.name, better_found,
|
||||
viable_item_count)
|
||||
else:
|
||||
Log.Debug("Task: %s, done. No better subtitles found for %s items", self.name, viable_item_count)
|
||||
|
||||
|
||||
class SubtitleStorageMaintenance(Task):
|
||||
@@ -465,9 +518,27 @@ class SubtitleStorageMaintenance(Task):
|
||||
Log.Info("Nothing to do")
|
||||
|
||||
|
||||
class MigrateSubtitleStorage(Task):
    periodic = False
    frequency = None

    def run(self):
        super(MigrateSubtitleStorage, self).run()
        self.running = True
        Log.Info("Running subtitle storage migration")
        storage = get_subtitle_storage()
        for fn in storage.get_all_files():
            if fn.endswith(".json.gz"):
                continue
            Log.Debug("Migrating %s", fn)
            storage.load(None, fn)
|
||||
|
||||
|
||||
scheduler.register(SearchAllRecentlyAddedMissing)
|
||||
scheduler.register(AvailableSubsForItem)
|
||||
scheduler.register(DownloadSubtitleForItem)
|
||||
scheduler.register(MissingSubtitles)
|
||||
scheduler.register(FindBetterSubtitles)
|
||||
scheduler.register(SubtitleStorageMaintenance)
|
||||
scheduler.register(MigrateSubtitleStorage)
|
||||
|
||||
|
||||
@@ -258,13 +258,14 @@
|
||||
"35",
|
||||
"30",
|
||||
"25",
|
||||
"21",
|
||||
"20",
|
||||
"15",
|
||||
"10",
|
||||
"5",
|
||||
"0"
|
||||
],
|
||||
"default": "25"
|
||||
"default": "21"
|
||||
},
|
||||
{
|
||||
"id": "provider.addic7ed.use_random_agents",
|
||||
@@ -332,7 +333,7 @@
|
||||
},
|
||||
{
|
||||
"id": "providers.multithreading",
|
||||
"label": "Search enabled providers simuntaneously (multithreading)",
|
||||
"label": "Search enabled providers simultaneously (multithreading)",
|
||||
"type": "bool",
|
||||
"default": "true"
|
||||
},
|
||||
@@ -356,7 +357,7 @@
|
||||
},
|
||||
{
|
||||
"id": "subtitles.scan.exotic_ext",
|
||||
"label": "Scan: include \"exotic\" external subtitle formats (anything else than .srt/.ssa/.ass)",
|
||||
"label": "Scan: include \"exotic\" subtitle formats (anything else than .srt/.ssa/.ass; embedded or external)",
|
||||
"type": "bool",
|
||||
"default": "false"
|
||||
},
|
||||
@@ -381,7 +382,7 @@
|
||||
"id": "subtitles.search.minimumMovieScore2",
|
||||
"label": "Minimum score for movies (min: 60, def/sane: 69, min-ideal: 82; see http://v.ht/szscores)",
|
||||
"type": "text",
|
||||
"default": "69"
|
||||
"default": "60"
|
||||
},
|
||||
{
|
||||
"id": "subtitles.search.hearingImpaired",
|
||||
@@ -399,14 +400,51 @@
|
||||
"id": "subtitles.remove_hi",
|
||||
"label": "Remove Hearing Impaired tags from downloaded subtitles",
|
||||
"type": "bool",
|
||||
"default": "false"
|
||||
},
|
||||
{
|
||||
"id": "subtitles.fix_common",
|
||||
"label": "Fix common whitespace/punctuation issues in subtitles",
|
||||
"type": "bool",
|
||||
"default": "true"
|
||||
},
|
||||
{
|
||||
"id": "subtitles.fix_ocr",
|
||||
"label": "Fix common OCR errors in downloaded subtitles",
|
||||
"type": "bool",
|
||||
"default": "true"
|
||||
},
|
||||
{
|
||||
"id": "subtitles.enforce_encoding",
|
||||
"label": "Normalize subtitle encoding to UTF-8",
|
||||
"label": "Normalize subtitle encoding to UTF-8 (highly recommended!)",
|
||||
"type": "bool",
|
||||
"default": "true"
|
||||
},
|
||||
{
|
||||
"id": "subtitles.colors",
|
||||
"label": "Change colors of subtitles to",
|
||||
"type": "enum",
|
||||
"values": [
|
||||
"don't change",
|
||||
"white",
|
||||
"light-grey",
|
||||
"red",
|
||||
"green",
|
||||
"yellow",
|
||||
"blue",
|
||||
"magenta",
|
||||
"cyan",
|
||||
"black",
|
||||
"dark-red",
|
||||
"dark-green",
|
||||
"dark-yellow",
|
||||
"dark-blue",
|
||||
"dark-magenta",
|
||||
"dark-cyan",
|
||||
"dark-grey"
|
||||
],
|
||||
"default": "don't change"
|
||||
},
|
||||
{
|
||||
"id": "subtitles.save.filesystem",
|
||||
"label": "Store subtitles next to media files (instead of metadata)",
|
||||
@@ -498,7 +536,7 @@
|
||||
"id": "scheduler.max_recent_items_per_library",
|
||||
"label": "Scheduler: Recent items to consider per library",
|
||||
"type": "text",
|
||||
"default": "500"
|
||||
"default": "1000"
|
||||
},
|
||||
{
|
||||
"id": "scheduler.tasks.FindBetterSubtitles.frequency",
|
||||
@@ -524,6 +562,12 @@
|
||||
"type": "bool",
|
||||
"default": "true"
|
||||
},
|
||||
{
|
||||
"id": "scheduler.tasks.FindBetterSubtitles.overwrite_manually_modified",
|
||||
"label": "Scheduler: Overwrite subtitles with non-default subtitle modifications when better found",
|
||||
"type": "bool",
|
||||
"default": "false"
|
||||
},
|
||||
{
|
||||
"id": "history_size",
|
||||
"label": "History: amount of items to store historical data for",
|
||||
@@ -599,7 +643,7 @@
|
||||
},
|
||||
{
|
||||
"id": "notify_executable",
|
||||
"label": "Call this executable upon successful subtitle download",
|
||||
"label": "Call this executable upon successful subtitle download (see Wiki for details)",
|
||||
"type": "text",
|
||||
"default": ""
|
||||
},
|
||||
@@ -622,6 +666,12 @@
|
||||
],
|
||||
"default": "WARNING"
|
||||
},
|
||||
{
|
||||
"id": "log_debug_mods",
|
||||
"label": "Log subtitle modification (debug)",
|
||||
"type": "bool",
|
||||
"default": "false"
|
||||
},
|
||||
{
|
||||
"id": "log_console",
|
||||
"label": "Log to console (for development/debugging)",
|
||||
|
||||
+3
-3
@@ -9,11 +9,11 @@
|
||||
<key>CFBundleInfoDictionaryVersion</key>
|
||||
<string>6.0</string>
|
||||
<key>CFBundleShortVersionString</key>
|
||||
<string>2.0.0</string>
|
||||
<string>2.0.20</string>
|
||||
<key>CFBundleSignature</key>
|
||||
<string>????</string>
|
||||
<key>CFBundleVersion</key>
|
||||
<string>2.0.0.10</string>
|
||||
<string>2.0.20.1364</string>
|
||||
<key>PlexFrameworkVersion</key>
|
||||
<string>2</string>
|
||||
<key>PlexPluginClass</key>
|
||||
@@ -32,7 +32,7 @@
|
||||
|
||||
<h1>Sub-Zero for Plex</h1><i>Subtitles done right</i>
|
||||
|
||||
Version 2.0.0.10 DEV
|
||||
Version 2.0.20.1364 RC9
|
||||
|
||||
Originally based on @bramwalet's awesome <a href="https://github.com/bramwalet/Subliminal.bundle">Subliminal.bundle</a>
|
||||
|
||||
|
||||
@@ -369,7 +369,8 @@ class Chapter(object):
|
||||
if chapterdisplays:
|
||||
string = chapterdisplays[0].get('ChapString')
|
||||
language = chapterdisplays[0].get('ChapLanguage')
|
||||
return cls(start, hidden, enabled, end, string, language)
|
||||
return cls(start, hidden, enabled, end, string, language)
|
||||
return cls(start, hidden, enabled, end)
|
||||
|
||||
def __repr__(self):
|
||||
return '<%s [%s, enabled=%s]>' % (self.__class__.__name__, self.start, self.enabled)
|
||||
|
||||
@@ -168,9 +168,13 @@ def parse(stream, specs, size=None, ignore_element_types=None, ignore_element_na
|
||||
while size is None or stream.tell() - start < size:
|
||||
try:
|
||||
element = parse_element(stream, specs)
|
||||
if not element or not hasattr(element, "type"):
|
||||
stream.seek(element.size, 1)
|
||||
continue
|
||||
|
||||
if element.type is None:
|
||||
logger.error('Element with id 0x%x is not in the specs' % element_id)
|
||||
stream.seek(element_size, 1)
|
||||
logger.error('Element with id 0x%x is not in the specs' % element.id)
|
||||
stream.seek(element.size, 1)
|
||||
continue
|
||||
elif element.type in ignore_element_types or element.name in ignore_element_names:
|
||||
logger.info('%s %s %s ignored', element.__class__.__name__, element.name, element.type)
|
||||
|
||||
@@ -39,12 +39,13 @@ def audio_codec():
|
||||
rebulk.defaults(name="audio_codec", conflict_solver=audio_codec_priority)
|
||||
|
||||
rebulk.regex("MP3", "LAME", r"LAME(?:\d)+-?(?:\d)+", value="MP3")
|
||||
rebulk.regex("Dolby", "DolbyDigital", "Dolby-Digital", "DDP?", value="DolbyDigital")
|
||||
rebulk.regex("Dolby", "DolbyDigital", "Dolby-Digital", "DD", value="DolbyDigital")
|
||||
rebulk.regex("DolbyAtmos", "Dolby-Atmos", "Atmos", value="DolbyAtmos")
|
||||
rebulk.regex("AAC", value="AAC")
|
||||
rebulk.string("AAC", value="AAC")
|
||||
rebulk.regex("AC3D?", value="AC3")
|
||||
rebulk.regex("Flac", value="FLAC")
|
||||
rebulk.regex("DTS", value="DTS")
|
||||
rebulk.string('EAC3', 'DDP', 'DD+', value="EAC3")
|
||||
rebulk.string("Flac", value="FLAC")
|
||||
rebulk.string("DTS", value="DTS")
|
||||
rebulk.regex("True-?HD", value="TrueHD")
|
||||
|
||||
rebulk.defaults(name="audio_profile")
|
||||
|
||||
@@ -34,15 +34,17 @@ def container():
|
||||
'ogv', 'qt', 'ra', 'ram', 'rm', 'ts', 'wav', 'webm', 'wma', 'wmv',
|
||||
'iso', 'vob']
|
||||
torrent = ['torrent']
|
||||
nzb = ['nzb']
|
||||
|
||||
rebulk.regex(r'\.'+build_or_pattern(subtitles)+'$', exts=subtitles, tags=['extension', 'subtitle'])
|
||||
rebulk.regex(r'\.'+build_or_pattern(info)+'$', exts=info, tags=['extension', 'info'])
|
||||
rebulk.regex(r'\.'+build_or_pattern(videos)+'$', exts=videos, tags=['extension', 'video'])
|
||||
rebulk.regex(r'\.'+build_or_pattern(torrent)+'$', exts=torrent, tags=['extension', 'torrent'])
|
||||
rebulk.regex(r'\.'+build_or_pattern(nzb)+'$', exts=nzb, tags=['extension', 'nzb'])
|
||||
|
||||
rebulk.defaults(name='container',
|
||||
validator=seps_surround,
|
||||
formatter=lambda s: s.upper(),
|
||||
formatter=lambda s: s.lower(),
|
||||
conflict_solver=lambda match, other: match
|
||||
if other.name in ['format',
|
||||
'video_codec'] or other.name == 'container' and 'extension' in other.tags
|
||||
@@ -51,5 +53,6 @@ def container():
|
||||
rebulk.string(*[sub for sub in subtitles if sub not in ['sub']], tags=['subtitle'])
|
||||
rebulk.string(*videos, tags=['video'])
|
||||
rebulk.string(*torrent, tags=['torrent'])
|
||||
rebulk.string(*nzb, tags=['nzb'])
|
||||
|
||||
return rebulk
|
||||
|
||||
@@ -5,7 +5,7 @@ Episode title
|
||||
"""
|
||||
from collections import defaultdict
|
||||
|
||||
from rebulk import Rebulk, Rule, AppendMatch, RenameMatch, POST_PROCESS
|
||||
from rebulk import Rebulk, Rule, AppendMatch, RemoveMatch, RenameMatch, POST_PROCESS
|
||||
|
||||
from ..common import seps, title_seps
|
||||
from ..common.formatters import cleanup
|
||||
@@ -19,8 +19,12 @@ def episode_title():
|
||||
:return: Created Rebulk object
|
||||
:rtype: Rebulk
|
||||
"""
|
||||
rebulk = Rebulk().rules(EpisodeTitleFromPosition,
|
||||
AlternativeTitleReplace,
|
||||
previous_names = ('episode', 'episode_details', 'episode_count',
|
||||
'season', 'season_count', 'date', 'title', 'year')
|
||||
|
||||
rebulk = Rebulk().rules(RemoveConflictsWithEpisodeTitle(previous_names),
|
||||
EpisodeTitleFromPosition(previous_names),
|
||||
AlternativeTitleReplace(previous_names),
|
||||
TitleToEpisodeTitle,
|
||||
Filepart3EpisodeTitle,
|
||||
Filepart2EpisodeTitle,
|
||||
@@ -28,6 +32,62 @@ def episode_title():
|
||||
return rebulk
|
||||
|
||||
|
||||
class RemoveConflictsWithEpisodeTitle(Rule):
    """
    Remove conflicting matches that might lead to wrong episode_title parsing.
    """

    priority = 64
    consequence = RemoveMatch

    def __init__(self, previous_names):
        super(RemoveConflictsWithEpisodeTitle, self).__init__()
        self.previous_names = previous_names
        self.next_names = ('streaming_service', 'screen_size', 'format',
                           'video_codec', 'audio_codec', 'other', 'container')
        self.affected_if_holes_after = ('part', )
        self.affected_names = ('part', 'year')

    def when(self, matches, context):
        to_remove = []
        for filepart in matches.markers.named('path'):
            for match in matches.range(filepart.start, filepart.end,
                                       predicate=lambda m: m.name in self.affected_names):
                before = matches.previous(match, index=0,
                                          predicate=lambda m, fp=filepart: not m.private and m.start >= fp.start)
                if not before or before.name not in self.previous_names:
                    continue

                after = matches.next(match, index=0,
                                     predicate=lambda m, fp=filepart: not m.private and m.end <= fp.end)
                if not after or after.name not in self.next_names:
                    continue

                group = matches.markers.at_match(match, predicate=lambda m: m.name == 'group', index=0)

                def has_value_in_same_group(current_match, current_group=group):
                    """Return true if current match has value and belongs to the current group."""
                    return current_match.value.strip(seps) and (
                        current_group == matches.markers.at_match(current_match,
                                                                  predicate=lambda mm: mm.name == 'group', index=0)
                    )

                holes_before = matches.holes(before.end, match.start, predicate=has_value_in_same_group)
                holes_after = matches.holes(match.end, after.start, predicate=has_value_in_same_group)

                if not holes_before and not holes_after:
                    continue

                if match.name in self.affected_if_holes_after and not holes_after:
                    continue

                to_remove.append(match)
                if match.parent:
                    to_remove.append(match.parent)

        return to_remove
|
||||
|
||||
|
||||
class TitleToEpisodeTitle(Rule):
|
||||
"""
|
||||
If multiple different title are found, convert the one following episode number to episode_title.
|
||||
@@ -65,12 +125,14 @@ class EpisodeTitleFromPosition(TitleBaseRule):
|
||||
"""
|
||||
dependency = TitleToEpisodeTitle
|
||||
|
||||
def __init__(self, previous_names):
|
||||
super(EpisodeTitleFromPosition, self).__init__('episode_title', ['title'])
|
||||
self.previous_names = previous_names
|
||||
|
||||
def hole_filter(self, hole, matches):
|
||||
episode = matches.previous(hole,
|
||||
lambda previous: any(name in previous.names
|
||||
for name in ['episode', 'episode_details',
|
||||
'episode_count', 'season', 'season_count',
|
||||
'date', 'title', 'year']),
|
||||
for name in self.previous_names),
|
||||
0)
|
||||
|
||||
crc32 = matches.named('crc32')
|
||||
@@ -88,9 +150,6 @@ class EpisodeTitleFromPosition(TitleBaseRule):
|
||||
return False
|
||||
return super(EpisodeTitleFromPosition, self).should_remove(match, matches, filepart, hole, context)
|
||||
|
||||
def __init__(self):
|
||||
super(EpisodeTitleFromPosition, self).__init__('episode_title', ['title'])
|
||||
|
||||
def when(self, matches, context):
|
||||
if matches.named('episode_title'):
|
||||
return
|
||||
@@ -104,6 +163,10 @@ class AlternativeTitleReplace(Rule):
|
||||
dependency = EpisodeTitleFromPosition
|
||||
consequence = RenameMatch
|
||||
|
||||
def __init__(self, previous_names):
|
||||
super(AlternativeTitleReplace, self).__init__()
|
||||
self.previous_names = previous_names
|
||||
|
||||
def when(self, matches, context):
|
||||
if matches.named('episode_title'):
|
||||
return
|
||||
@@ -115,10 +178,7 @@ class AlternativeTitleReplace(Rule):
|
||||
if main_title:
|
||||
episode = matches.previous(main_title,
|
||||
lambda previous: any(name in previous.names
|
||||
for name in ['episode', 'episode_details',
|
||||
'episode_count', 'season',
|
||||
'season_count',
|
||||
'date', 'title', 'year']),
|
||||
for name in self.previous_names),
|
||||
0)
|
||||
|
||||
crc32 = matches.named('crc32')
|
||||
|
||||
@@ -231,14 +231,16 @@ def episodes():
|
||||
formatter={'season': int, 'other': lambda match: 'Complete'})
|
||||
|
||||
# 12, 13
|
||||
rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int}) \
|
||||
rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int},
|
||||
disabled=lambda context: context.get('type') == 'movie') \
|
||||
.defaults(validator=None) \
|
||||
.regex(r'(?P<episode>\d{2})') \
|
||||
.regex(r'v(?P<version>\d+)').repeater('?') \
|
||||
.regex(r'(?P<episodeSeparator>[x-])(?P<episode>\d{2})').repeater('*')
|
||||
|
||||
# 012, 013
|
||||
rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int}) \
|
||||
rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int},
|
||||
disabled=lambda context: context.get('type') == 'movie') \
|
||||
.defaults(validator=None) \
|
||||
.regex(r'0(?P<episode>\d{1,2})') \
|
||||
.regex(r'v(?P<version>\d+)').repeater('?') \
|
||||
@@ -246,7 +248,8 @@ def episodes():
|
||||
|
||||
# 112, 113
|
||||
rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode'], formatter={'episode': int, 'version': int},
|
||||
disabled=lambda context: not context.get('episode_prefer_number', False)) \
|
||||
disabled=lambda context: (not context.get('episode_prefer_number', False) or
|
||||
context.get('type') == 'movie')) \
|
||||
.defaults(validator=None) \
|
||||
.regex(r'(?P<episode>\d{3,4})') \
|
||||
.regex(r'v(?P<version>\d+)').repeater('?') \
|
||||
@@ -287,7 +290,8 @@ def episodes():
|
||||
rebulk.chain(tags=['bonus-conflict', 'weak-movie', 'weak-episode', 'weak-duplicate'],
|
||||
formatter={'season': int, 'episode': int, 'version': int},
|
||||
conflict_solver=lambda match, other: match if other.name == 'year' else '__default__',
|
||||
disabled=lambda context: context.get('episode_prefer_number', False)) \
|
||||
disabled=lambda context: (context.get('episode_prefer_number', False) or
|
||||
context.get('type') == 'movie')) \
|
||||
.defaults(validator=None) \
|
||||
.regex(r'(?P<season>\d{1,2})(?P<episode>\d{2})') \
|
||||
.regex(r'v(?P<version>\d+)').repeater('?') \
|
||||
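The `episodes()` hunks above extend the `disabled=` callbacks of the weak numeric chains so that none of them runs when the caller forces `type` to `movie`. The sketch below restates the three predicates as named functions to make the combinations easier to read. The function names are hypothetical; the diff uses inline lambdas.

```python
def weak_two_digit_disabled(context):
    # "12, 13" and "012, 013" chains: switched off when type is forced to movie.
    return context.get('type') == 'movie'


def weak_three_digit_disabled(context):
    # "112, 113" chain: only active with episode_prefer_number, never for movies.
    return (not context.get('episode_prefer_number', False) or
            context.get('type') == 'movie')


def weak_season_episode_disabled(context):
    # Season+episode digits chain (e.g. 211 read as season 2, episode 11):
    # active by default, off with episode_prefer_number or for movies.
    return (context.get('episode_prefer_number', False) or
            context.get('type') == 'movie')
```

With an empty context, only the three-digit absolute-number chain is disabled, which is the behavior before and after the change. The new `type == 'movie'` clause only adds cases where a chain is switched off.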
@@ -460,8 +464,21 @@ class RemoveWeakIfMovie(Rule):
|
||||
return context.get('type') != 'episode'
|
||||
|
||||
def when(self, matches, context):
|
||||
if matches.named('year'):
|
||||
return matches.tagged('weak-movie')
|
||||
to_remove = []
|
||||
to_ignore = set()
|
||||
remove = False
|
||||
for filepart in matches.markers.named('path'):
|
||||
year = matches.range(filepart.start, filepart.end, predicate=lambda m: m.name == 'year', index=0)
|
||||
if year:
|
||||
remove = True
|
||||
next_match = matches.next(year, predicate=lambda m, fp=filepart: m.private and m.end <= fp.end, index=0)
|
||||
if next_match and not matches.at_match(next_match, predicate=lambda m: m.name == 'year'):
|
||||
to_ignore.add(next_match.initiator)
|
||||
|
||||
if remove:
|
||||
to_remove.extend(matches.tagged('weak-movie', predicate=lambda m: m.initiator not in to_ignore))
|
||||
|
||||
return to_remove
|
||||
|
||||
|
||||
class RemoveWeakIfSxxExx(Rule):
|
||||
|
||||
@@ -39,8 +39,7 @@ COMMON_WORDS_STRICT = frozenset(['brazil'])
|
||||
|
||||
UNDETERMINED = babelfish.Language('und')
|
||||
|
||||
SYN = {('und', None): ['unknown', 'inconnu', 'unk'],
|
||||
('ell', None): ['gr', 'greek'],
|
||||
SYN = {('ell', None): ['gr', 'greek'],
|
||||
('spa', None): ['esp', 'español', 'espanol'],
|
||||
('fra', None): ['français', 'vf', 'vff', 'vfi', 'vfq'],
|
||||
('swe', None): ['se'],
|
||||
|
||||
@@ -85,6 +85,7 @@ class ValidateWebsitePrefix(Rule):
|
||||
"""
|
||||
Validate website prefixes
|
||||
"""
|
||||
priority = 64
|
||||
consequence = RemoveMatch
|
||||
|
||||
def when(self, matches, context):
|
||||
|
||||
@@ -1814,7 +1814,7 @@
|
||||
format: HDTV
|
||||
video_codec: h264
|
||||
audio_codec: AAC
|
||||
container: MP4
|
||||
container: mp4
|
||||
release_group: k3n
|
||||
type: episode
|
||||
|
||||
@@ -1885,7 +1885,7 @@
|
||||
|
||||
? Breaking.Bad.S01E01.2008.BluRay.VC1.1080P.5.1.WMV-NOVO
|
||||
: audio_channels: '5.1'
|
||||
container: WMV
|
||||
container: wmv
|
||||
episode: 1
|
||||
format: BluRay
|
||||
release_group: NOVO
|
||||
@@ -1922,9 +1922,7 @@
|
||||
|
||||
? Fear.The.Walking.Dead.S02E01.HDTV.x264.AAC.MP4-k3n.mp4
|
||||
: audio_codec: AAC
|
||||
container:
|
||||
- MP4
|
||||
- mp4
|
||||
container: mp4
|
||||
episode: 1
|
||||
format: HDTV
|
||||
mimetype: video/mp4
|
||||
@@ -2242,7 +2240,7 @@
|
||||
screen_size: 1080p
|
||||
streaming_service: Amazon Prime
|
||||
format: WEBRip
|
||||
audio_codec: DolbyDigital
|
||||
audio_codec: EAC3
|
||||
audio_channels: '5.1'
|
||||
video_codec: h264
|
||||
type: episode
|
||||
@@ -2692,7 +2690,7 @@
|
||||
screen_size: 4K
|
||||
streaming_service: Amazon Prime
|
||||
format: WEBRip
|
||||
audio_codec: DolbyDigital
|
||||
audio_codec: EAC3
|
||||
audio_channels: '5.1'
|
||||
video_codec: h264
|
||||
release_group: Group
|
||||
@@ -3311,7 +3309,7 @@
|
||||
screen_size: 720p
|
||||
format: WEBRip
|
||||
video_codec: h264
|
||||
container: MKV
|
||||
container: mkv
|
||||
audio_codec: AC3
|
||||
audio_channels: '5.1'
|
||||
release_group: Ehhhh
|
||||
@@ -3846,3 +3844,113 @@
|
||||
release_group: 0SEC [GloDLS]
|
||||
container: mkv
|
||||
type: episode
|
||||
|
||||
? Anthony.Bourdain.Parts.Unknown.S09E01.Los.Angeles.720p.HDTV.x264-MiNDTHEGAP
: title: Anthony Bourdain Parts Unknown
  season: 9
  episode: 1
  episode_title: Los Angeles
  screen_size: 720p
  format: HDTV
  video_codec: h264
  release_group: MiNDTHEGAP
  type: episode

? -feud.s01e05.and.the.winner.is.(the.oscars.of.1963).720p.amzn.webrip.dd5.1.x264-casstudio.mkv
: year: 1963

? feud.s01e05.and.the.winner.is.(the.oscars.of.1963).720p.amzn.webrip.dd5.1.x264-casstudio.mkv
: title: feud
  season: 1
  episode: 5
  episode_title: and the winner is
  screen_size: 720p
  streaming_service: Amazon Prime
  format: WEBRip
  audio_codec: DolbyDigital
  audio_channels: '5.1'
  video_codec: h264
  release_group: casstudio
  container: mkv
  type: episode

? Adventure.Time.S08E16.Elements.Part.1.Skyhooks.720p.WEB-DL.AAC2.0.H.264-RTN.mkv
: title: Adventure Time
  season: 8
  episode: 16
  season: 8
  episode: 16
  episode_title: Elements Part 1 Skyhooks
  screen_size: 720p
  format: WEB-DL
  audio_codec: AAC
  audio_channels: '2.0'
  video_codec: h264
  release_group: RTN
  container: mkv
  type: episode

? D:\TV\SITCOMS (CLASSIC)\That '70s Show\Season 07\That '70s Show - S07E22 - 2000 Light Years from Home.mkv
: title: That '70s Show
  season: 7
  episode: 22
  episode_title: 2000 Light Years from Home
  other: Classic
  container: mkv
  mimetype: video/x-matroska
  type: episode

? Show.Name.S02E01.Super.Title.720p.WEB-DL.DD5.1.H.264-ABC.nzb
: title: Show Name
  season: 2
  episode: 1
  episode_title: Super Title
  screen_size: 720p
  format: WEB-DL
  audio_codec: DolbyDigital
  audio_channels: '5.1'
  video_codec: h264
  release_group: ABC
  container: nzb
  type: episode

? "[SGKK] Bleach 312v1 [720p/mkv]-Group.mkv"
: title: Bleach
  season: 3
  episode: 12
  version: 1
  screen_size: 720p
  release_group: Group
  container: mkv
  type: episode

? The.Expanse.S02E08.720p.WEBRip.x264.EAC3-KiNGS.mkv
: title: The Expanse
  season: 2
  episode: 8
  screen_size: 720p
  format: WEBRip
  video_codec: h264
  audio_codec: EAC3
  release_group: KiNGS
  container: mkv
  type: episode

? Series_name.2005.211.episode.title.avi
: title: Series name
  year: 2005
  season: 2
  episode: 11
  episode_title: episode title
  container: avi
  type: episode

? the.flash.2014.208.hdtv-lol[ettv].mkv
: title: the flash
  year: 2014
  season: 2
  episode: 8
  format: HDTV
  release_group: lol[ettv]
  container: mkv
  type: episode
|
||||
|
||||
@@ -644,7 +644,7 @@
|
||||
- Timsit
|
||||
- Lindon
|
||||
screen_size: 1080p
|
||||
container: MKV
|
||||
container: mkv
|
||||
format: HDTV
|
||||
|
||||
? some.movie.720p.bluray.x264-mind
|
||||
@@ -1082,3 +1082,18 @@
|
||||
format: BluRay
|
||||
screen_size: 1080p
|
||||
type: movie
|
||||
|
||||
? 10 Cloverfield Lane.[Blu-Ray 1080p].[MULTI]
|
||||
: options: --type movie
|
||||
title: 10 Cloverfield Lane
|
||||
format: BluRay
|
||||
screen_size: 1080p
|
||||
language: Multiple languages
|
||||
type: movie
|
||||
|
||||
? 007.Spectre.[HDTC.MD].[TRUEFRENCH]
|
||||
: options: --type movie
|
||||
title: 007 Spectre
|
||||
format: HDTC
|
||||
language: French
|
||||
type: movie
|
||||
|
||||
@@ -10,10 +10,14 @@
|
||||
|
||||
? +DolbyDigital
|
||||
? +DD
|
||||
? +DDP
|
||||
? +Dolby Digital
|
||||
: audio_codec: DolbyDigital
|
||||
|
||||
? +DDP
|
||||
? +DD+
|
||||
? +EAC3
|
||||
: audio_codec: EAC3
|
||||
|
||||
? +DolbyAtmos
|
||||
? +Dolby Atmos
|
||||
? +Atmos
|
||||
|
||||
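The audio codec change moves `DDP` (and adds `DD+` and `EAC3`) out of the `DolbyDigital` pattern into a separate `EAC3` value, which is what the test expectations above encode. Below is a standalone Python 3 sketch of that token-to-codec mapping using plain `re`. It does not use rebulk, and `classify_audio_codec` and `_AUDIO_CODECS` are hypothetical names.

```python
import re

# Each alias list is anchored, so "DDP" and "DD+" can only match the EAC3 entry
# and are not swallowed by the shorter "DD" alias.
_AUDIO_CODECS = [
    ("EAC3", re.compile(r"^(?:EAC3|DDP|DD\+)$", re.IGNORECASE)),
    ("DolbyDigital", re.compile(r"^(?:Dolby[- ]?Digital|Dolby|DD)$", re.IGNORECASE)),
    ("DolbyAtmos", re.compile(r"^(?:Dolby[- ]?Atmos|Atmos)$", re.IGNORECASE)),
]


def classify_audio_codec(token):
    """Map a single release-name token to a codec label, or None if unknown."""
    for label, pattern in _AUDIO_CODECS:
        if pattern.match(token):
            return label
    return None
```

The real rules match inside full release names and resolve conflicts with other properties. This sketch covers only the alias grouping.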
@@ -146,7 +146,7 @@
|
||||
? Show.Name.-.Season.1.to.3.-.Mp4.1080p
|
||||
? Show.Name.-.Season.1~3.-.Mp4.1080p
|
||||
? Show.Name.-.Saison.1.a.3.-.Mp4.1080p
|
||||
: container: MP4
|
||||
: container: mp4
|
||||
screen_size: 1080p
|
||||
season:
|
||||
- 1
|
||||
|
||||
@@ -761,14 +761,15 @@
|
||||
type: episode
|
||||
video_codec: h264
|
||||
|
||||
# Episode title is indeed 'October 8, 2014'
|
||||
# https://thetvdb.com/?tab=episode&seriesid=82483&seasonid=569935&id=4997362&lid=7
|
||||
? The Soup - 11x41 - October 8, 2014.mp4
|
||||
: container: mp4
|
||||
episode: 41
|
||||
episode_title: October 8
|
||||
episode_title: October 8, 2014
|
||||
season: 11
|
||||
title: The Soup
|
||||
type: episode
|
||||
year: 2014
|
||||
|
||||
? Red.Rock.S02E59.WEB-DLx264-JIVE
|
||||
: episode: 59
|
||||
|
||||
@@ -0,0 +1,24 @@
|
||||
|
||||
from .utils import hashodict, NoNumpyException, NoPandasException, get_scalar_repr, encode_scalars_inplace
from .comment import strip_comment_line_with_symbol, strip_comments
from .encoders import TricksEncoder, json_date_time_encode, class_instance_encode, json_complex_encode, \
    numeric_types_encode, ClassInstanceEncoder, json_set_encode, pandas_encode, nopandas_encode, \
    numpy_encode, NumpyEncoder, nonumpy_encode, NoNumpyEncoder
from .decoders import DuplicateJsonKeyException, TricksPairHook, json_date_time_hook, json_complex_hook, \
    numeric_types_hook, ClassInstanceHook, json_set_hook, pandas_hook, nopandas_hook, json_numpy_obj_hook, \
    json_nonumpy_obj_hook
from .nonp import dumps, dump, loads, load


try:
    # find_module takes just as long as importing, so no optimization possible
    import numpy
except ImportError:
    NUMPY_MODE = False
    # from .nonp import dumps, dump, loads, load, nonumpy_encode as numpy_encode, json_nonumpy_obj_hook as json_numpy_obj_hook
else:
    NUMPY_MODE = True
    # from .np import dumps, dump, loads, load, numpy_encode, NumpyEncoder, json_numpy_obj_hook
    # from .np_utils import encode_scalars_inplace
|
||||
|
||||
|
||||
@@ -0,0 +1,29 @@
|
||||
|
||||
from re import findall
|
||||
|
||||
|
||||
def strip_comment_line_with_symbol(line, start):
|
||||
parts = line.split(start)
|
||||
counts = [len(findall(r'(?:^|[^"\\]|(?:\\\\|\\")+)(")', part)) for part in parts]
|
||||
total = 0
|
||||
for nr, count in enumerate(counts):
|
||||
total += count
|
||||
if total % 2 == 0:
|
||||
return start.join(parts[:nr+1]).rstrip()
|
||||
else:
|
||||
return line.rstrip()
|
||||
|
||||
|
||||
def strip_comments(string, comment_symbols=frozenset(('#', '//'))):
|
||||
"""
|
||||
:param string: A string containing json with comments started by comment_symbols.
|
||||
:param comment_symbols: Iterable of symbols that start a line comment (default # or //).
|
||||
:return: The string with the comments removed.
|
||||
"""
|
||||
lines = string.splitlines()
|
||||
for k in range(len(lines)):
|
||||
for symbol in comment_symbols:
|
||||
lines[k] = strip_comment_line_with_symbol(lines[k], start=symbol)
|
||||
return '\n'.join(lines)
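The quote-counting idea behind `strip_comment_line_with_symbol` can be tried in isolation. The sketch below restates the two functions above with their indentation restored and runs them on a small input; the sample text and the names `text`, `cleaned` and `data` are made up for illustration, and the default symbols are given as a tuple here only to make the iteration order fixed.

```python
import json
from re import findall


def strip_comment_line_with_symbol(line, start):
    # Split on the comment symbol, then count unescaped double quotes per part.
    parts = line.split(start)
    counts = [len(findall(r'(?:^|[^"\\]|(?:\\\\|\\")+)(")', part)) for part in parts]
    total = 0
    for nr, count in enumerate(counts):
        total += count
        # An even running total means the next symbol sits outside any string.
        if total % 2 == 0:
            return start.join(parts[:nr + 1]).rstrip()
    return line.rstrip()


def strip_comments(string, comment_symbols=('#', '//')):
    lines = string.splitlines()
    for k in range(len(lines)):
        for symbol in comment_symbols:
            lines[k] = strip_comment_line_with_symbol(lines[k], start=symbol)
    return '\n'.join(lines)


# Hypothetical input: both comment symbols also occur inside string values.
text = '{\n  "name": "a#b",  # trailing comment\n  "url": "http://x"  // another\n}'
cleaned = strip_comments(text)
data = json.loads(cleaned)
print(data)
```

Symbols inside a string literal are kept because the running quote count is odd at that point; only a symbol that follows an even number of quotes starts a comment.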
|
||||
|
||||
|
||||
@@ -0,0 +1,248 @@
|
||||
|
||||
from datetime import datetime, date, time, timedelta
|
||||
from fractions import Fraction
|
||||
from importlib import import_module
|
||||
from collections import OrderedDict
|
||||
from decimal import Decimal
|
||||
from logging import warning
|
||||
from json_tricks import NoPandasException, NoNumpyException
|
||||
|
||||
|
||||
class DuplicateJsonKeyException(Exception):
|
||||
""" Trying to load a json map which contains duplicate keys, but allow_duplicates is False """
|
||||
|
||||
|
||||
class TricksPairHook(object):
|
||||
"""
|
||||
Hook that converts json maps to the appropriate python type (dict or OrderedDict)
|
||||
and then runs any number of hooks on the individual maps.
|
||||
"""
|
||||
def __init__(self, ordered=True, obj_pairs_hooks=None, allow_duplicates=True):
|
||||
"""
|
||||
:param ordered: True if maps should retain their ordering.
|
||||
:param obj_pairs_hooks: An iterable of hooks to apply to elements.
|
||||
"""
|
||||
self.map_type = OrderedDict
|
||||
if not ordered:
|
||||
self.map_type = dict
|
||||
self.obj_pairs_hooks = []
|
||||
if obj_pairs_hooks:
|
||||
self.obj_pairs_hooks = list(obj_pairs_hooks)
|
||||
self.allow_duplicates = allow_duplicates
|
||||
|
||||
def __call__(self, pairs):
|
||||
if not self.allow_duplicates:
|
||||
known = set()
|
||||
for key, value in pairs:
|
||||
if key in known:
|
||||
raise DuplicateJsonKeyException(('Trying to load a json map which contains a' +
|
||||
' duplicate key "{0:}" (but allow_duplicates is False)').format(key))
|
||||
known.add(key)
|
||||
map = self.map_type(pairs)
|
||||
for hook in self.obj_pairs_hooks:
|
||||
map = hook(map)
|
||||
return map
|
||||
|
||||
|
||||
def json_date_time_hook(dct):
|
||||
"""
|
||||
Convert an encoded date, time, datetime or timedelta to its python representation, including optional timezone.
|
||||
|
||||
:param dct: (dict) json encoded date, time, datetime or timedelta
|
||||
:return: (date/time/datetime/timedelta obj) python representation of the above
|
||||
"""
|
||||
def get_tz(dct):
|
||||
if 'tzinfo' not in dct:
|
||||
return None
|
||||
try:
|
||||
import pytz
|
||||
except ImportError as err:
|
||||
raise ImportError(('Tried to load a json object which has a timezone-aware (date)time. '
|
||||
'However, `pytz` could not be imported, so the object could not be loaded. '
|
||||
'Error: {0:}').format(str(err)))
|
||||
return pytz.timezone(dct['tzinfo'])
|
||||
|
||||
if isinstance(dct, dict):
|
||||
if '__date__' in dct:
|
||||
return date(year=dct.get('year', 0), month=dct.get('month', 0), day=dct.get('day', 0))
|
||||
elif '__time__' in dct:
|
||||
tzinfo = get_tz(dct)
|
||||
return time(hour=dct.get('hour', 0), minute=dct.get('minute', 0), second=dct.get('second', 0),
|
||||
microsecond=dct.get('microsecond', 0), tzinfo=tzinfo)
|
||||
elif '__datetime__' in dct:
|
||||
tzinfo = get_tz(dct)
|
||||
return datetime(year=dct.get('year', 0), month=dct.get('month', 0), day=dct.get('day', 0),
|
||||
hour=dct.get('hour', 0), minute=dct.get('minute', 0), second=dct.get('second', 0),
|
||||
microsecond=dct.get('microsecond', 0), tzinfo=tzinfo)
|
||||
elif '__timedelta__' in dct:
|
||||
return timedelta(days=dct.get('days', 0), seconds=dct.get('seconds', 0),
|
||||
microseconds=dct.get('microseconds', 0))
|
||||
return dct
|
||||
|
||||
|
||||
def json_complex_hook(dct):
|
||||
"""
|
||||
Convert an encoded complex number to its python representation.
|
||||
|
||||
:param dct: (dict) json encoded complex number (__complex__)
|
||||
:return: python complex number
|
||||
"""
|
||||
if isinstance(dct, dict):
|
||||
if '__complex__' in dct:
|
||||
parts = dct['__complex__']
|
||||
assert len(parts) == 2
|
||||
return parts[0] + parts[1] * 1j
|
||||
return dct
|
||||
|
||||
|
||||
def numeric_types_hook(dct):
|
||||
if isinstance(dct, dict):
|
||||
if '__decimal__' in dct:
|
||||
return Decimal(dct['__decimal__'])
|
||||
if '__fraction__' in dct:
|
||||
return Fraction(numerator=dct['numerator'], denominator=dct['denominator'])
|
||||
return dct
|
||||
|
||||
|
||||
class ClassInstanceHook(object):
|
||||
"""
|
||||
This hook tries to convert json encoded by class_instance_encode back to its original instance.
|
||||
It only works if the environment is the same, e.g. the class is similarly importable and hasn't changed.
|
||||
"""
|
||||
def __init__(self, cls_lookup_map=None):
|
||||
self.cls_lookup_map = cls_lookup_map or {}
|
||||
|
||||
def __call__(self, dct):
|
||||
if isinstance(dct, dict) and '__instance_type__' in dct:
|
||||
mod, name = dct['__instance_type__']
|
||||
attrs = dct['attributes']
|
||||
if mod is None:
|
||||
try:
|
||||
Cls = getattr((__import__('__main__')), name)
|
||||
except (ImportError, AttributeError) as err:
|
||||
if name not in self.cls_lookup_map:
|
||||
raise ImportError(('class {0:s} seems to have been exported from the main file, which means '
|
||||
'it has no module/import path set; you need to provide cls_lookup_map which maps names '
|
||||
'to classes').format(name))
|
||||
Cls = self.cls_lookup_map[name]
|
||||
else:
|
||||
imp_err = None
|
||||
try:
|
||||
module = import_module('{0:}'.format(mod))
|
||||
except ImportError as err:
|
||||
imp_err = ('encountered import error "{0:}" while importing "{1:}" to decode a json file; perhaps '
|
||||
'it was encoded in a different environment where {1:}.{2:} was available').format(err, mod, name)
|
||||
else:
|
||||
if not hasattr(module, name):
|
||||
imp_err = 'imported "{0:}" but could not find "{1:}" inside while decoding a json file (found {2:})'.format(
|
||||
module, name, ', '.join(attr for attr in dir(module) if not attr.startswith('_')))
|
||||
Cls = getattr(module, name, None)  # None when missing, so the cls_lookup_map fallback below can run
|
||||
if imp_err:
|
||||
if name in self.cls_lookup_map:
|
||||
Cls = self.cls_lookup_map[name]
|
||||
else:
|
||||
raise ImportError(imp_err)
|
||||
try:
|
||||
obj = Cls.__new__(Cls)
|
||||
except TypeError:
|
||||
raise TypeError(('problem while decoding instance of "{0:s}"; this instance has a special '
|
||||
'__new__ method and can\'t be restored').format(name))
|
||||
if hasattr(obj, '__json_decode__'):
|
||||
obj.__json_decode__(**attrs)
|
||||
else:
|
||||
obj.__dict__ = dict(attrs)
|
||||
return obj
|
||||
return dct
|
||||
|
||||
|
||||
def json_set_hook(dct):
|
||||
"""
|
||||
Convert an encoded set to its python representation.
|
||||
"""
|
||||
if isinstance(dct, dict):
|
||||
if '__set__' in dct:
|
||||
return set((tuple(item) if isinstance(item, list) else item) for item in dct['__set__'])
|
||||
return dct
|
||||
|
||||
|
||||
def pandas_hook(dct):
|
||||
if '__pandas_dataframe__' in dct or '__pandas_series__' in dct:
|
||||
# todo: this is experimental
|
||||
if not getattr(pandas_hook, '_warned', False):
|
||||
pandas_hook._warned = True
|
||||
warning('Pandas loading support in json-tricks is experimental and may change in future versions.')
|
||||
if '__pandas_dataframe__' in dct:
|
||||
try:
|
||||
from pandas import DataFrame
|
||||
except ImportError:
|
||||
raise NoPandasException('Trying to decode a map which appears to represent a pandas data structure, but pandas appears not to be installed.')
|
||||
from numpy import dtype, array
|
||||
meta = dct.pop('__pandas_dataframe__')
|
||||
indx = dct.pop('index') if 'index' in dct else None
|
||||
dtypes = dict((colname, dtype(tp)) for colname, tp in zip(meta['column_order'], meta['types']))
|
||||
data = OrderedDict()
|
||||
for name, col in dct.items():
|
||||
data[name] = array(col, dtype=dtypes[name])
|
||||
return DataFrame(
|
||||
data=data,
|
||||
index=indx,
|
||||
columns=meta['column_order'],
|
||||
# mixed `dtypes` argument not supported, so use dict of numpy arrays
|
||||
)
|
||||
elif '__pandas_series__' in dct:
|
||||
from pandas import Series
|
||||
from numpy import dtype, array
|
||||
meta = dct.pop('__pandas_series__')
|
||||
indx = dct.pop('index') if 'index' in dct else None
|
||||
return Series(
|
||||
data=dct['data'],
|
||||
index=indx,
|
||||
name=meta['name'],
|
||||
dtype=dtype(meta['type']),
|
||||
)
|
||||
return dct
|
||||
|
||||
|
||||
def nopandas_hook(dct):
|
||||
if isinstance(dct, dict) and ('__pandas_dataframe__' in dct or '__pandas_series__' in dct):
|
||||
raise NoPandasException(('Trying to decode a map which appears to represent a pandas '
|
||||
'data structure, but pandas support is not enabled, perhaps it is not installed.'))
|
||||
return dct
|
||||
|
||||
|
||||
def json_numpy_obj_hook(dct):
|
||||
"""
|
||||
Replace any numpy arrays previously encoded by NumpyEncoder to their proper
|
||||
shape, data type and data.
|
||||
|
||||
:param dct: (dict) json encoded ndarray
|
||||
:return: (ndarray) if input was an encoded ndarray
|
||||
"""
|
||||
if isinstance(dct, dict) and '__ndarray__' in dct:
|
||||
try:
|
||||
from numpy import asarray
|
||||
import numpy as nptypes
|
||||
except ImportError:
|
||||
raise NoNumpyException('Trying to decode a map which appears to represent a numpy '
|
||||
'array, but numpy appears not to be installed.')
|
||||
order = 'A'
|
||||
if 'Corder' in dct:
|
||||
order = 'C' if dct['Corder'] else 'F'
|
||||
if dct['shape']:
|
||||
return asarray(dct['__ndarray__'], dtype=dct['dtype'], order=order)
|
||||
else:
|
||||
dtype = getattr(nptypes, dct['dtype'])
|
||||
return dtype(dct['__ndarray__'])
|
||||
return dct
|
||||
|
||||
|
||||
def json_nonumpy_obj_hook(dct):
|
||||
"""
|
||||
This hook has no effect except to check if you're trying to decode numpy arrays without support, and give you a useful message.
|
||||
"""
|
||||
if isinstance(dct, dict) and '__ndarray__' in dct:
|
||||
raise NoNumpyException(('Trying to decode a map which appears to represent a numpy array, '
|
||||
'but numpy support is not enabled, perhaps it is not installed.'))
|
||||
return dct
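The hooks above all follow one pattern: recognise a marker key, return the rebuilt object, otherwise pass the map through unchanged. The sketch below shows that pattern with the standard `json` module only; `make_pair_hook`, `DuplicateKeyError` and the sample strings are hypothetical names for illustration, not part of json_tricks.

```python
import json
from collections import OrderedDict


class DuplicateKeyError(Exception):
    """Raised when a json map repeats a key and duplicates are not allowed."""


def complex_hook(dct):
    # The isinstance check matters: an earlier hook may already have
    # replaced the map with a non-dict object.
    if isinstance(dct, dict) and '__complex__' in dct:
        real, imag = dct['__complex__']
        return real + imag * 1j
    return dct


def set_hook(dct):
    if isinstance(dct, dict) and '__set__' in dct:
        return set(dct['__set__'])
    return dct


def make_pair_hook(hooks, allow_duplicates=True):
    def pair_hook(pairs):
        if not allow_duplicates:
            known = set()
            for key, _ in pairs:
                if key in known:
                    raise DuplicateKeyError('duplicate key "{0}"'.format(key))
                known.add(key)
        result = OrderedDict(pairs)
        # Each hook receives the previous hook's output.
        for hook in hooks:
            result = hook(result)
        return result
    return pair_hook


hook = make_pair_hook([complex_hook, set_hook], allow_duplicates=False)
decoded = json.loads(
    '{"z": {"__complex__": [2.0, 1.0]}, "s": {"__set__": [1, 2, 3]}}',
    object_pairs_hook=hook,
)
print(decoded['z'], decoded['s'])
```

Because `json.loads` calls `object_pairs_hook` for the innermost maps first, the outer map already contains the rebuilt complex number and set when its own hooks run.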
|
||||
|
||||
|
||||
@@ -0,0 +1,311 @@
|
||||
|
||||
from datetime import datetime, date, time, timedelta
|
||||
from fractions import Fraction
|
||||
from logging import warning
|
||||
from json import JSONEncoder
|
||||
from sys import version
|
||||
from decimal import Decimal
|
||||
from .utils import hashodict, call_with_optional_kwargs, NoPandasException, NoNumpyException
|
||||
|
||||
|
||||
class TricksEncoder(JSONEncoder):
|
||||
"""
|
||||
Encoder that runs any number of encoder functions or instances on
|
||||
the objects that are being encoded.
|
||||
|
||||
Each encoder should make any appropriate changes and return an object,
|
||||
changed or not. This will be passed to the other encoders.
|
||||
"""
|
||||
def __init__(self, obj_encoders=None, silence_typeerror=False, primitives=False, **json_kwargs):
|
||||
"""
|
||||
:param obj_encoders: An iterable of functions or encoder instances to try.
|
||||
:param silence_typeerror: If set to True, ignore the TypeErrors that Encoder instances throw (default False).
|
||||
"""
|
||||
self.obj_encoders = []
|
||||
if obj_encoders:
|
||||
self.obj_encoders = list(obj_encoders)
|
||||
self.silence_typeerror = silence_typeerror
|
||||
self.primitives = primitives
|
||||
super(TricksEncoder, self).__init__(**json_kwargs)
|
||||
|
||||
def default(self, obj, *args, **kwargs):
|
||||
"""
|
||||
This is the method of JSONEncoders that is called for each object; it calls
|
||||
all the encoders with the previous one's output used as input.
|
||||
|
||||
It works for Encoder instances, but they are expected not to throw
|
||||
`TypeError` for unrecognized types (the super method does that by default).
|
||||
|
||||
It never calls the `super` method so if there are non-primitive types
|
||||
left at the end, you'll get an encoding error.
|
||||
"""
|
||||
prev_id = id(obj)
|
||||
for encoder in self.obj_encoders:
|
||||
if hasattr(encoder, 'default'):
|
||||
#todo: write test for this scenario (maybe ClassInstanceEncoder?)
|
||||
try:
|
||||
obj = call_with_optional_kwargs(encoder.default, obj, primitives=self.primitives)
|
||||
except TypeError as err:
|
||||
if not self.silence_typeerror:
|
||||
raise
|
||||
elif hasattr(encoder, '__call__'):
|
||||
obj = call_with_optional_kwargs(encoder, obj, primitives=self.primitives)
|
||||
else:
|
||||
raise TypeError('`obj_encoder` {0:} does not have `default` method and is not callable'.format(encoder))
|
||||
if id(obj) == prev_id:
|
||||
#todo: test
|
||||
raise TypeError('Object of type {0:} could not be encoded by {1:} using encoders [{2:s}]'.format(
|
||||
type(obj), self.__class__.__name__, ', '.join(str(encoder) for encoder in self.obj_encoders)))
|
||||
return obj
|
||||
|
||||
|
||||
def json_date_time_encode(obj, primitives=False):
|
||||
"""
|
||||
Encode a date, time, datetime or timedelta to a string or a json dictionary, including optional timezone.
|
||||
|
||||
:param obj: date/time/datetime/timedelta obj
|
||||
:return: (dict) json primitives representation of date, time, datetime or timedelta
|
||||
"""
|
||||
if primitives and isinstance(obj, (date, time, datetime)):
|
||||
return obj.isoformat()
|
||||
if isinstance(obj, datetime):
|
||||
dct = hashodict([('__datetime__', None), ('year', obj.year), ('month', obj.month),
|
||||
('day', obj.day), ('hour', obj.hour), ('minute', obj.minute),
|
||||
('second', obj.second), ('microsecond', obj.microsecond)])
|
||||
if obj.tzinfo:
|
||||
dct['tzinfo'] = obj.tzinfo.zone
|
||||
elif isinstance(obj, date):
|
||||
dct = hashodict([('__date__', None), ('year', obj.year), ('month', obj.month), ('day', obj.day)])
|
||||
elif isinstance(obj, time):
|
||||
dct = hashodict([('__time__', None), ('hour', obj.hour), ('minute', obj.minute),
|
||||
('second', obj.second), ('microsecond', obj.microsecond)])
|
||||
if obj.tzinfo:
|
||||
dct['tzinfo'] = obj.tzinfo.zone
|
||||
elif isinstance(obj, timedelta):
|
||||
if primitives:
|
||||
return obj.total_seconds()
|
||||
else:
|
||||
dct = hashodict([('__timedelta__', None), ('days', obj.days), ('seconds', obj.seconds),
|
||||
('microseconds', obj.microseconds)])
|
||||
else:
|
||||
return obj
|
||||
for key, val in tuple(dct.items()):
|
||||
if not key.startswith('__') and not val:
|
||||
del dct[key]
|
||||
return dct
|
||||
|
||||
|
||||
def class_instance_encode(obj, primitives=False):
|
||||
"""
|
||||
Encodes a class instance to json. Note that it can only be recovered if the environment allows the class to be
|
||||
imported in the same way.
|
||||
"""
|
||||
if isinstance(obj, list) or isinstance(obj, dict):
|
||||
return obj
|
||||
if hasattr(obj, '__class__') and hasattr(obj, '__dict__'):
|
||||
if not hasattr(obj, '__new__'):
|
||||
raise TypeError('class "{0:s}" does not have a __new__ method; '.format(obj.__class__) +
|
||||
('perhaps it is an old-style class not derived from `object`; add `object` as a base class to encode it.'
|
||||
if (version[:2] == '2.') else 'this should not happen in Python3'))
|
||||
try:
|
||||
obj.__new__(obj.__class__)
|
||||
except TypeError:
|
||||
raise TypeError(('instance "{0:}" of class "{1:}" cannot be encoded because its __new__ method '
|
||||
'cannot be called, perhaps it requires extra parameters').format(obj, obj.__class__))
|
||||
mod = obj.__class__.__module__
|
||||
if mod == '__main__':
|
||||
mod = None
|
||||
warning(('class {0:} seems to have been defined in the main file; unfortunately this means'
|
||||
' that its module/import path is unknown, so you might have to provide cls_lookup_map when '
|
||||
'decoding').format(obj.__class__))
|
||||
name = obj.__class__.__name__
|
||||
if hasattr(obj, '__json_encode__'):
|
||||
attrs = obj.__json_encode__()
|
||||
else:
|
||||
attrs = hashodict(obj.__dict__.items())
|
||||
if primitives:
|
||||
return attrs
|
||||
else:
|
||||
return hashodict((('__instance_type__', (mod, name)), ('attributes', attrs)))
|
||||
return obj
|
||||
|
||||
|
||||
def json_complex_encode(obj, primitives=False):
|
||||
"""
|
||||
Encode a complex number as a json dictionary of its real and imaginary parts.
|
||||
|
||||
:param obj: complex number, e.g. `2+1j`
|
||||
:return: (dict) json primitives representation of `obj`
|
||||
"""
|
||||
if isinstance(obj, complex):
|
||||
if primitives:
|
||||
return [obj.real, obj.imag]
|
||||
else:
|
||||
return hashodict(__complex__=[obj.real, obj.imag])
|
||||
return obj
|
||||
|
||||
|
||||
def numeric_types_encode(obj, primitives=False):
|
||||
"""
|
||||
Encode Decimal and Fraction.
|
||||
|
||||
:param primitives: Encode decimals and fractions as standard floats. You may lose precision. If you do this, you may need to enable `allow_nan` (decimals always allow NaNs but floats do not).
|
||||
"""
|
||||
if isinstance(obj, Decimal):
|
||||
if primitives:
|
||||
return float(obj)
|
||||
else:
|
||||
return {
|
||||
'__decimal__': str(obj.canonical()),
|
||||
}
|
||||
if isinstance(obj, Fraction):
|
||||
if primitives:
|
||||
return float(obj)
|
||||
else:
|
||||
return hashodict((
|
||||
('__fraction__', True),
|
||||
('numerator', obj.numerator),
|
||||
('denominator', obj.denominator),
|
||||
))
|
||||
return obj
|
||||
|
||||
|
||||
class ClassInstanceEncoder(JSONEncoder):
|
||||
"""
|
||||
See `class_instance_encode`.
|
||||
"""
|
||||
# Not covered in tests since `class_instance_encode` is recommended way.
|
||||
def __init__(self, obj, encode_cls_instances=True, **kwargs):
|
||||
self.encode_cls_instances = encode_cls_instances
|
||||
super(ClassInstanceEncoder, self).__init__(obj, **kwargs)
|
||||
|
||||
def default(self, obj, *args, **kwargs):
|
||||
if self.encode_cls_instances:
|
||||
obj = class_instance_encode(obj)
|
||||
return super(ClassInstanceEncoder, self).default(obj, *args, **kwargs)
|
||||
|
||||
|
||||
def json_set_encode(obj, primitives=False):
|
||||
"""
|
||||
Encode python sets as dictionary with key __set__ and a list of the values.
|
||||
|
||||
Try to sort the set to get a consistent json representation, use arbitrary order if the data is not ordinal.
|
||||
"""
|
||||
if isinstance(obj, set):
|
||||
try:
|
||||
repr = sorted(obj)
|
||||
except Exception:
|
||||
repr = list(obj)
|
||||
if primitives:
|
||||
return repr
|
||||
else:
|
||||
return hashodict(__set__=repr)
|
||||
return obj
|
||||
|
||||
|
||||
def pandas_encode(obj, primitives=False):
|
||||
from pandas import DataFrame, Series
|
||||
if isinstance(obj, (DataFrame, Series)):
|
||||
#todo: this is experimental
|
||||
if not getattr(pandas_encode, '_warned', False):
|
||||
pandas_encode._warned = True
|
||||
warning('Pandas dumping support in json-tricks is experimental and may change in future versions.')
|
||||
if isinstance(obj, DataFrame):
|
||||
repr = hashodict()
|
||||
if not primitives:
|
||||
repr['__pandas_dataframe__'] = hashodict((
|
||||
('column_order', tuple(obj.columns.values)),
|
||||
('types', tuple(str(dt) for dt in obj.dtypes)),
|
||||
))
|
||||
repr['index'] = tuple(obj.index.values)
|
||||
for k, name in enumerate(obj.columns.values):
|
||||
repr[name] = tuple(obj.iloc[:, k].values)
|
||||
return repr
|
||||
if isinstance(obj, Series):
|
||||
repr = hashodict()
|
||||
if not primitives:
|
||||
repr['__pandas_series__'] = hashodict((
|
||||
('name', str(obj.name)),
|
||||
('type', str(obj.dtype)),
|
||||
))
|
||||
repr['index'] = tuple(obj.index.values)
|
||||
repr['data'] = tuple(obj.values)
|
||||
return repr
|
||||
return obj
|
||||
|
||||
|
||||
def nopandas_encode(obj):
|
||||
if ('DataFrame' in getattr(obj.__class__, '__name__', '') or 'Series' in getattr(obj.__class__, '__name__', '')) \
|
||||
and 'pandas.' in getattr(obj.__class__, '__module__', ''):
|
||||
raise NoPandasException(('Trying to encode an object of type {0:} which appears to be '
|
||||
'a pandas data structure, but pandas support is not enabled, perhaps it is not installed.').format(type(obj)))
|
||||
return obj
|
||||
|
||||
|
||||
def numpy_encode(obj, primitives=False):
|
||||
"""
|
||||
Encodes numpy `ndarray`s as lists with meta data.
|
||||
|
||||
Encodes numpy scalar types as Python equivalents. Special encoding is not possible,
|
||||
because int64 (in py2) and float64 (in py2 and py3) are subclasses of primitives,
|
||||
which never reach the encoder.
|
||||
|
||||
:param primitives: If True, arrays are serialized as (nested) lists without meta info.
|
||||
"""
|
||||
from numpy import ndarray, generic
|
||||
if isinstance(obj, ndarray):
|
||||
if primitives:
|
||||
return obj.tolist()
|
||||
else:
|
||||
dct = hashodict((
|
||||
('__ndarray__', obj.tolist()),
|
||||
('dtype', str(obj.dtype)),
|
||||
('shape', obj.shape),
|
||||
))
|
||||
if len(obj.shape) > 1:
|
||||
dct['Corder'] = obj.flags['C_CONTIGUOUS']
|
||||
return dct
|
||||
elif isinstance(obj, generic):
|
||||
if NumpyEncoder.SHOW_SCALAR_WARNING:
|
||||
NumpyEncoder.SHOW_SCALAR_WARNING = False
|
||||
warning('json-tricks: numpy scalar serialization is experimental and may work differently in future versions')
|
||||
return obj.item()
|
||||
return obj
|
||||
|
||||
|
||||
class NumpyEncoder(ClassInstanceEncoder):
|
||||
"""
|
||||
JSON encoder for numpy arrays.
|
||||
"""
|
||||
SHOW_SCALAR_WARNING = True # show a warning that numpy scalar serialization is experimental
|
||||
|
||||
def default(self, obj, *args, **kwargs):
|
||||
"""
|
||||
If input object is a ndarray it will be converted into a dict holding
|
||||
data type, shape and the data. The object can be restored using json_numpy_obj_hook.
|
||||
"""
|
||||
warning('`NumpyEncoder` is deprecated, use `numpy_encode`') #todo
|
||||
obj = numpy_encode(obj)
|
||||
return super(NumpyEncoder, self).default(obj, *args, **kwargs)
|
||||
|
||||
|
||||
def nonumpy_encode(obj):
|
||||
"""
|
||||
Raises an error for numpy arrays.
|
||||
"""
|
||||
if 'ndarray' in getattr(obj.__class__, '__name__', '') and 'numpy.' in getattr(obj.__class__, '__module__', ''):
|
||||
raise NoNumpyException(('Trying to encode an object of type {0:} which appears to be '
|
||||
'a numpy array, but numpy support is not enabled, perhaps it is not installed.').format(type(obj)))
|
||||
return obj
|
||||
|
||||
|
||||
class NoNumpyEncoder(JSONEncoder):
|
||||
"""
|
||||
See `nonumpy_encode`.
|
||||
"""
|
||||
def default(self, obj, *args, **kwargs):
|
||||
warning('`NoNumpyEncoder` is deprecated, use `nonumpy_encode`') #todo
|
||||
obj = nonumpy_encode(obj)
|
||||
return super(NoNumpyEncoder, self).default(obj, *args, **kwargs)
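`json_date_time_encode` above drops every falsy field before returning, and `json_date_time_hook` restores them through `dct.get(..., 0)`. A minimal sketch of that round trip follows, assuming a plain `OrderedDict` in place of `hashodict` and no timezone handling; `encode_datetime` and `decode_datetime` are illustrative names, not library functions.

```python
from collections import OrderedDict
from datetime import datetime


def encode_datetime(obj):
    dct = OrderedDict([
        ('__datetime__', None), ('year', obj.year), ('month', obj.month),
        ('day', obj.day), ('hour', obj.hour), ('minute', obj.minute),
        ('second', obj.second), ('microsecond', obj.microsecond),
    ])
    # Drop zero-valued fields to keep the json compact; marker keys stay.
    for key, val in tuple(dct.items()):
        if not key.startswith('__') and not val:
            del dct[key]
    return dct


def decode_datetime(dct):
    if isinstance(dct, dict) and '__datetime__' in dct:
        # Missing fields were zero when encoded, so 0 is the right default.
        return datetime(
            year=dct.get('year', 0), month=dct.get('month', 0), day=dct.get('day', 0),
            hour=dct.get('hour', 0), minute=dct.get('minute', 0),
            second=dct.get('second', 0), microsecond=dct.get('microsecond', 0),
        )
    return dct


moment = datetime(2016, 3, 1, 12, 0, 0)
encoded = encode_datetime(moment)
restored = decode_datetime(encoded)
print(list(encoded.keys()))
```

Only `year`, `month`, `day` and `hour` survive for this value; the decoder fills the rest back in.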
|
||||
|
||||
|
||||
@@ -0,0 +1,207 @@
|
||||
|
||||
from gzip import GzipFile
|
||||
from io import BytesIO
|
||||
from json import loads as json_loads
|
||||
from os import fsync
|
||||
from sys import exc_info, version
|
||||
from .utils import NoNumpyException # keep 'unused' imports
|
||||
from .comment import strip_comment_line_with_symbol, strip_comments # keep 'unused' imports
|
||||
from .encoders import TricksEncoder, json_date_time_encode, class_instance_encode, ClassInstanceEncoder, \
|
||||
json_complex_encode, json_set_encode, numeric_types_encode, numpy_encode, nonumpy_encode, NoNumpyEncoder, \
|
||||
nopandas_encode, pandas_encode # keep 'unused' imports
|
||||
from .decoders import DuplicateJsonKeyException, TricksPairHook, json_date_time_hook, ClassInstanceHook, \
|
||||
json_complex_hook, json_set_hook, numeric_types_hook, json_numpy_obj_hook, json_nonumpy_obj_hook, \
|
||||
nopandas_hook, pandas_hook # keep 'unused' imports
|
||||
from json import JSONEncoder
|
||||
|
||||
|
||||
is_py3 = (version[:2] == '3.')
|
||||
str_type = str if is_py3 else (basestring, unicode,)
|
||||
ENCODING = 'UTF-8'
|
||||
|
||||
|
||||
_cih_instance = ClassInstanceHook()
|
||||
DEFAULT_ENCODERS = [json_date_time_encode, class_instance_encode, json_complex_encode, json_set_encode, numeric_types_encode,]
|
||||
DEFAULT_HOOKS = [json_date_time_hook, _cih_instance, json_complex_hook, json_set_hook, numeric_types_hook,]
|
||||
|
||||
try:
|
||||
import numpy
|
||||
except ImportError:
|
||||
DEFAULT_ENCODERS = [nonumpy_encode,] + DEFAULT_ENCODERS
|
||||
DEFAULT_HOOKS = [json_nonumpy_obj_hook,] + DEFAULT_HOOKS
|
||||
else:
|
||||
# numpy encode needs to be before complex
|
||||
DEFAULT_ENCODERS = [numpy_encode,] + DEFAULT_ENCODERS
|
||||
DEFAULT_HOOKS = [json_numpy_obj_hook,] + DEFAULT_HOOKS
|
||||
|
||||
try:
|
||||
import pandas
|
||||
except ImportError:
|
||||
DEFAULT_ENCODERS = [nopandas_encode,] + DEFAULT_ENCODERS
|
||||
DEFAULT_HOOKS = [nopandas_hook,] + DEFAULT_HOOKS
|
||||
else:
|
||||
DEFAULT_ENCODERS = [pandas_encode,] + DEFAULT_ENCODERS
|
||||
DEFAULT_HOOKS = [pandas_hook,] + DEFAULT_HOOKS
|
||||
|
||||
|
||||
DEFAULT_NONP_ENCODERS = [nonumpy_encode,] + DEFAULT_ENCODERS # DEPRECATED
|
||||
DEFAULT_NONP_HOOKS = [json_nonumpy_obj_hook,] + DEFAULT_HOOKS # DEPRECATED
|
||||
|
||||
|
||||
def dumps(obj, sort_keys=None, cls=TricksEncoder, obj_encoders=DEFAULT_ENCODERS, extra_obj_encoders=(),
|
||||
primitives=False, compression=None, allow_nan=False, conv_str_byte=False, **jsonkwargs):
|
||||
"""
|
||||
Convert a nested data structure to a json string.
|
||||
|
||||
:param obj: The Python object to convert.
|
||||
:param sort_keys: Keep this False if you want order to be preserved.
|
||||
:param cls: The json encoder class to use, defaults to TricksEncoder, which applies the `obj_encoders` in order.
|
||||
:param obj_encoders: Iterable of encoders to use to convert arbitrary objects into json-able primitives.
|
||||
:param extra_obj_encoders: Like `obj_encoders` but on top of them: use this to add encoders without replacing defaults. Since v3.5 these happen before default encoders.
|
||||
:param allow_nan: Allow NaN and Infinity values, which is a (useful) violation of the JSON standard (default False).
|
||||
:param conv_str_byte: Try to automatically convert between strings and bytes (assuming utf-8) (default False).
|
||||
:return: The string containing the json-encoded version of obj.
|
||||
|
||||
Other arguments are passed on to `cls`. Note that `sort_keys` should be false if you want to preserve order.
|
||||
"""
|
||||
if not hasattr(extra_obj_encoders, '__iter__'):
|
||||
raise TypeError('`extra_obj_encoders` should be a tuple in `json_tricks.dump(s)`')
|
||||
encoders = tuple(extra_obj_encoders) + tuple(obj_encoders)
|
||||
txt = cls(sort_keys=sort_keys, obj_encoders=encoders, allow_nan=allow_nan,
|
||||
primitives=primitives, **jsonkwargs).encode(obj)
|
||||
if not is_py3 and isinstance(txt, str):
|
||||
txt = unicode(txt, ENCODING)
|
||||
if not compression:
|
||||
return txt
|
||||
if compression is True:
|
||||
compression = 5
|
||||
txt = txt.encode(ENCODING)
|
||||
sh = BytesIO()
|
||||
with GzipFile(mode='wb', fileobj=sh, compresslevel=compression) as zh:
|
||||
zh.write(txt)
|
||||
gzstring = sh.getvalue()
|
||||
return gzstring
|
||||
|
||||
|
||||
def dump(obj, fp, sort_keys=None, cls=TricksEncoder, obj_encoders=DEFAULT_ENCODERS, extra_obj_encoders=(),
|
||||
primitives=False, compression=None, force_flush=False, allow_nan=False, conv_str_byte=False, **jsonkwargs):
|
||||
"""
|
||||
Convert a nested data structure to a json string and write it to a file.
|
||||
|
||||
:param fp: File handle or path to write to.
|
||||
:param compression: The gzip compression level, or None for no compression.
|
||||
:param force_flush: If True, flush the file handle used, when possible also in the operating system (default False).
|
||||
|
||||
The other arguments are identical to `dumps`.
|
||||
"""
|
||||
txt = dumps(obj, sort_keys=sort_keys, cls=cls, obj_encoders=obj_encoders, extra_obj_encoders=extra_obj_encoders,
|
||||
primitives=primitives, compression=compression, allow_nan=allow_nan, conv_str_byte=conv_str_byte, **jsonkwargs)
|
||||
if isinstance(fp, str_type):
|
||||
fh = open(fp, 'wb+')
|
||||
else:
|
||||
fh = fp
|
||||
if conv_str_byte:
|
||||
try:
|
||||
fh.write(b'')
|
||||
except TypeError:
|
||||
pass
|
||||
# if not isinstance(txt, str_type):
|
||||
# # Cannot write bytes, so must be in text mode, but we didn't get a text
|
||||
# if not compression:
|
||||
# txt = txt.decode(ENCODING)
|
||||
else:
|
||||
try:
|
||||
fh.write(u'')
|
||||
except TypeError:
|
||||
if isinstance(txt, str_type):
|
||||
txt = txt.encode(ENCODING)
|
||||
try:
|
||||
if 'b' not in getattr(fh, 'mode', 'b?') and not isinstance(txt, str_type) and compression:
|
||||
raise IOError('If compression is enabled, the file must be opened in binary mode.')
|
||||
try:
|
||||
fh.write(txt)
|
||||
except TypeError as err:
|
||||
err.args = (err.args[0] + '. A possible reason is that the file is not opened in binary mode; '
|
||||
'be sure to set file mode to something like "wb".',)
|
||||
raise
|
||||
finally:
|
||||
if force_flush:
|
||||
fh.flush()
|
||||
try:
|
||||
if fh.fileno() is not None:
|
||||
fsync(fh.fileno())
|
||||
except (ValueError,):
|
||||
pass
|
||||
if isinstance(fp, str_type):
|
||||
fh.close()
|
||||
return txt
|
||||
|
||||
|
||||
def loads(string, preserve_order=True, ignore_comments=True, decompression=None, obj_pairs_hooks=DEFAULT_HOOKS,
|
||||
extra_obj_pairs_hooks=(), cls_lookup_map=None, allow_duplicates=True, conv_str_byte=False, **jsonkwargs):
|
||||
"""
|
||||
Convert a json string to a nested data structure.
|
||||
|
||||
:param string: The string containing a json encoded data structure.
|
||||
:param decode_cls_instances: True to attempt to decode class instances (requires the environment to be similar to the encoding one).
|
||||
:param preserve_order: Whether to preserve order by using OrderedDicts or not.
|
||||
:param ignore_comments: Remove comments (starting with # or //).
|
||||
:param decompression: True to use gzip decompression, False to use raw data, None to automatically determine (default). Assumes utf-8 encoding!
|
||||
:param obj_pairs_hooks: A list of dictionary hooks to apply.
|
||||
:param extra_obj_pairs_hooks: Like `obj_pairs_hooks` but on top of them: use this to add hooks without replacing defaults. Since v3.5 these happen before default hooks.
|
||||
:param cls_lookup_map: If set to a dict, for example ``globals()``, then classes encoded from __main__ are looked up in this dict.
|
||||
:param allow_duplicates: If set to False, an error will be raised when loading a json-map that contains duplicate keys.
|
||||
:param parse_float: A function to parse strings to floats (e.g. Decimal). There is also `parse_int`.
|
||||
:param conv_str_byte: Try to automatically convert between strings and bytes (assuming utf-8) (default False).
|
||||
:return: The decoded data structure.
|
||||
|
||||
Other arguments are passed on to json_func.
|
||||
"""
|
||||
if not hasattr(extra_obj_pairs_hooks, '__iter__'):
|
||||
raise TypeError('`extra_obj_pairs_hooks` should be a tuple in `json_tricks.load(s)`')
|
||||
if decompression is None:
|
||||
decompression = string[:2] == b'\x1f\x8b'
|
||||
if decompression:
|
||||
with GzipFile(fileobj=BytesIO(string), mode='rb') as zh:
|
||||
string = zh.read()
|
||||
string = string.decode(ENCODING)
|
||||
if not isinstance(string, str_type):
|
||||
if conv_str_byte:
|
||||
string = string.decode(ENCODING)
|
||||
else:
|
||||
raise TypeError(('Cannot automatically decode object of type "{0:}" in `json_tricks.load(s)` since '
|
||||
'the encoding is not known. You should instead decode the bytes to a string and pass that '
|
||||
'string to `load(s)`, for example bytevar.decode("utf-8") if utf-8 is the encoding.').format(type(string)))
|
||||
if ignore_comments:
|
||||
string = strip_comments(string)
|
||||
obj_pairs_hooks = tuple(obj_pairs_hooks)
|
||||
_cih_instance.cls_lookup_map = cls_lookup_map or {}
|
||||
hooks = tuple(extra_obj_pairs_hooks) + obj_pairs_hooks
|
||||
hook = TricksPairHook(ordered=preserve_order, obj_pairs_hooks=hooks, allow_duplicates=allow_duplicates)
|
||||
return json_loads(string, object_pairs_hook=hook, **jsonkwargs)
|
||||
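The automatic decompression step in `loads` relies on the two-byte gzip magic number. A minimal standalone sketch of that check, using only the standard library, is given here; `maybe_gunzip` and `GZIP_MAGIC` are hypothetical names for illustration and are not part of json_tricks.

```python
import gzip
import json
from io import BytesIO

# First two bytes of every gzip stream.
GZIP_MAGIC = b'\x1f\x8b'


def maybe_gunzip(data, encoding='utf-8'):
    """Return text from `data`, decompressing first if it looks like gzip.

    Hypothetical helper for illustration; it assumes utf-8 unless told otherwise.
    """
    if isinstance(data, bytes) and data[:2] == GZIP_MAGIC:
        with gzip.GzipFile(fileobj=BytesIO(data), mode='rb') as zh:
            data = zh.read()
    if isinstance(data, bytes):
        data = data.decode(encoding)
    return data


raw = json.dumps({'a': 1})
packed = gzip.compress(raw.encode('utf-8'))

# Both the compressed bytes and the plain string decode to the same structure.
from_packed = json.loads(maybe_gunzip(packed))
from_plain = json.loads(maybe_gunzip(raw))
```

Plain strings pass through unchanged, so the same entry point can serve compressed and uncompressed input.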
|
||||
|
||||
def load(fp, preserve_order=True, ignore_comments=True, decompression=None, obj_pairs_hooks=DEFAULT_HOOKS,
|
||||
extra_obj_pairs_hooks=(), cls_lookup_map=None, allow_duplicates=True, conv_str_byte=False, **jsonkwargs):
|
||||
"""
|
||||
Convert the json content of a file to a nested data structure.
|
||||
|
||||
:param fp: File handle or path to load from.
|
||||
|
||||
The other arguments are identical to loads.
|
||||
"""
|
||||
try:
|
||||
if isinstance(fp, str_type):
|
||||
with open(fp, 'rb') as fh:
|
||||
string = fh.read()
|
||||
else:
|
||||
string = fp.read()
|
||||
except UnicodeDecodeError as err:
|
||||
# todo: not covered in tests, is it relevant?
|
||||
raise Exception('There was a problem decoding the file content. A possible reason is that the file is not ' +
|
||||
'opened in binary mode; be sure to set file mode to something like "rb".').with_traceback(exc_info()[2])
|
||||
return loads(string, preserve_order=preserve_order, ignore_comments=ignore_comments, decompression=decompression,
|
||||
obj_pairs_hooks=obj_pairs_hooks, extra_obj_pairs_hooks=extra_obj_pairs_hooks, cls_lookup_map=cls_lookup_map,
|
||||
allow_duplicates=allow_duplicates, conv_str_byte=conv_str_byte, **jsonkwargs)
|
||||
|
||||
|
||||
@@ -0,0 +1,28 @@
|
||||
|
||||
"""
|
||||
This file exists for backward compatibility reasons.
|
||||
"""
|
||||
|
||||
from logging import warning
|
||||
from .nonp import NoNumpyException, DEFAULT_ENCODERS, DEFAULT_HOOKS, dumps, dump, loads, load # keep 'unused' imports
|
||||
from .utils import hashodict, NoPandasException
|
||||
from .comment import strip_comment_line_with_symbol, strip_comments # keep 'unused' imports
|
||||
from .encoders import TricksEncoder, json_date_time_encode, class_instance_encode, ClassInstanceEncoder, \
|
||||
numpy_encode, NumpyEncoder # keep 'unused' imports
|
||||
from .decoders import DuplicateJsonKeyException, TricksPairHook, json_date_time_hook, ClassInstanceHook, \
|
||||
json_complex_hook, json_set_hook, json_numpy_obj_hook # keep 'unused' imports
|
||||
|
||||
try:
|
||||
import numpy
|
||||
except ImportError:
|
||||
raise NoNumpyException('Could not load numpy, maybe it is not installed? If you do not want to use numpy encoding '
|
||||
'or decoding, you can import the functions from json_tricks.nonp instead, which do not need numpy.')
|
||||
|
||||
|
||||
# todo: warning('`json_tricks.np` is deprecated, you can import directly from `json_tricks`')
|
||||
|
||||
|
||||
DEFAULT_NP_ENCODERS = [numpy_encode,] + DEFAULT_ENCODERS # DEPRECATED
|
||||
DEFAULT_NP_HOOKS = [json_numpy_obj_hook,] + DEFAULT_HOOKS # DEPRECATED
|
||||
|
||||
|
||||
@@ -0,0 +1,15 @@
|
||||
|
||||
"""
|
||||
This file exists for backward compatibility reasons.
|
||||
"""
|
||||
|
||||
from .utils import hashodict, get_scalar_repr, encode_scalars_inplace
|
||||
from .nonp import NoNumpyException
|
||||
from . import np
|
||||
|
||||
# try:
|
||||
# from numpy import generic, complex64, complex128
|
||||
# except ImportError:
|
||||
# raise NoNumpyException('Could not load numpy, maybe it is not installed?')
|
||||
|
||||
|
||||
@@ -0,0 +1,81 @@
|
||||
|
||||
from collections import OrderedDict
|
||||
|
||||
|
||||
class hashodict(OrderedDict):
|
||||
"""
|
||||
This dictionary is hashable. It should NOT be mutated, or all kinds of weird
|
||||
bugs may appear. This is not enforced though, it's only used for encoding.
|
||||
"""
|
||||
def __hash__(self):
|
||||
return hash(frozenset(self.items()))
|
||||
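The docstring's warning about mutation follows from how the hash is computed. The sketch below restates the same idea under a hypothetical class name, `HashableOrderedDict`, to show two consequences: the hash ignores insertion order (it is built from a `frozenset`), while `OrderedDict` equality does not, and all values must themselves be hashable.

```python
from collections import OrderedDict


class HashableOrderedDict(OrderedDict):
    """Illustrative copy of the pattern above; not the library class itself."""

    def __hash__(self):
        # frozenset discards ordering, so the hash is order-independent.
        return hash(frozenset(self.items()))


a = HashableOrderedDict([('x', 1), ('y', 2)])
b = HashableOrderedDict([('y', 2), ('x', 1)])

same_hash = hash(a) == hash(b)   # equal hashes despite different order
equal = a == b                   # OrderedDict comparison is order-sensitive
lookup = {a: 'cached'}           # usable as a dict key
```

Mutating `a` after it has been used as a key would change its hash and make the entry unreachable, which is the kind of bug the docstring warns about.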
|
||||
|
||||
try:
|
||||
from inspect import signature
|
||||
except ImportError:
|
||||
try:
|
||||
from inspect import getfullargspec
|
||||
except ImportError:
|
||||
from inspect import getargspec
|
||||
def get_arg_names(callable):
|
||||
argspec = getargspec(callable)
|
||||
return set(argspec.args)
|
||||
else:
|
||||
#todo: this is not covered in test case (py 3+ uses `signature`, py2 `getfullargspec`); consider removing it
|
||||
def get_arg_names(callable):
|
||||
argspec = getfullargspec(callable)
|
||||
return set(argspec.args) | set(argspec.kwonlyargs)
|
||||
else:
|
||||
def get_arg_names(callable):
|
||||
sig = signature(callable)
|
||||
return set(sig.parameters.keys())
|
||||
|
||||
|
||||
def call_with_optional_kwargs(callable, *args, **optional_kwargs):
|
||||
accepted_kwargs = get_arg_names(callable)
|
||||
use_kwargs = {}
|
||||
for key, val in optional_kwargs.items():
|
||||
if key in accepted_kwargs:
|
||||
use_kwargs[key] = val
|
||||
return callable(*args, **use_kwargs)
|
||||
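The keyword filtering in `call_with_optional_kwargs` can be sketched in isolation as follows, assuming Python 3 where `inspect.signature` is available. The names `call_with_supported_kwargs` and `encode` are hypothetical and exist only for this illustration.

```python
from inspect import signature


def call_with_supported_kwargs(func, *args, **optional_kwargs):
    """Call `func`, passing only the keyword arguments it declares."""
    accepted = set(signature(func).parameters)
    use_kwargs = {key: val for key, val in optional_kwargs.items() if key in accepted}
    return func(*args, **use_kwargs)


def encode(obj, primitives=False):
    # Stand-in encoder that accepts `primitives` but not `is_changed`.
    return (obj, primitives)


# `is_changed` is silently dropped because `encode` does not declare it.
result = call_with_supported_kwargs(encode, 42, primitives=True, is_changed=False)
```

This lets callers offer newer optional arguments to hooks without breaking hooks written against an older signature.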
|
||||
|
||||
class NoNumpyException(Exception):
|
||||
""" Trying to use numpy features, but numpy cannot be found. """
|
||||
|
||||
|
||||
class NoPandasException(Exception):
|
||||
""" Trying to use pandas features, but pandas cannot be found. """
|
||||
|
||||
|
||||
def get_scalar_repr(npscalar):
|
||||
return hashodict((
|
||||
('__ndarray__', npscalar.item()),
|
||||
('dtype', str(npscalar.dtype)),
|
||||
('shape', ()),
|
||||
))
|
||||
|
||||
|
||||
def encode_scalars_inplace(obj):
|
||||
"""
|
||||
Searches a data structure of lists, tuples and dicts for numpy scalars
|
||||
and replaces them by their dictionary representation, which can be loaded
|
||||
by json-tricks. This happens in-place (the object is changed, use a copy).
|
||||
"""
|
||||
from numpy import generic, complex64, complex128
|
||||
if isinstance(obj, (generic, complex64, complex128)):
|
||||
return get_scalar_repr(obj)
|
||||
if isinstance(obj, dict):
|
||||
for key, val in tuple(obj.items()):
|
||||
obj[key] = encode_scalars_inplace(val)
|
||||
return obj
|
||||
if isinstance(obj, list):
|
||||
for k, val in enumerate(obj):
|
||||
obj[k] = encode_scalars_inplace(val)
|
||||
return obj
|
||||
if isinstance(obj, (tuple, set)):
|
||||
return type(obj)(encode_scalars_inplace(val) for val in obj)
|
||||
return obj
|
||||
|
||||
|
||||
@@ -23,6 +23,17 @@ class Media(Descriptor):
|
||||
bitrate = Property(type=int)
|
||||
duration = Property(type=int)
|
||||
|
||||
#@classmethod
|
||||
#def from_node(cls, client, node):
|
||||
# return cls.construct(client, cls.helpers.find(node, 'Media'), child=True)
|
||||
|
||||
@classmethod
|
||||
def from_node(cls, client, node):
|
||||
return cls.construct(client, cls.helpers.find(node, 'Media'), child=True)
|
||||
items = []
|
||||
|
||||
for genre in cls.helpers.findall(node, 'Media'):
|
||||
_, obj = Media.construct(client, genre, child=True)
|
||||
|
||||
items.append(obj)
|
||||
|
||||
return [], items
|
||||
|
||||
@@ -1,27 +1,27 @@
|
||||
# addic7ed
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; ProviderPool(providers=['addic7ed'], provider_configs={'addic7ed': {'use_random_agents': True}})['addic7ed'].query('Game of Thrones', 2)"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; SZProviderPool(providers=['addic7ed'], provider_configs={'addic7ed': {'use_random_agents': True}})['addic7ed'].query('Game of Thrones', 2)"
|
||||
|
||||
# opensubtitles
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; ProviderPool(providers=['opensubtitles'], )['opensubtitles'].query([Language('eng')], query='Game of Thrones', season=2, episode=1, tag='Game.of.Thrones.S06E01.The.Red.Woman.720p.WEB-DL.DD5.1.H.264-NTB.mkv', use_tag_search=True)"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subzero.video import parse_video; SZProviderPool(providers=['opensubtitles'], )['opensubtitles'].list_subtitles(parse_video('FULL_PATH', {}, {'type': 'episode'}), languages=[Language('eng')])"
|
||||
|
||||
# podnapisi
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; ProviderPool(providers=['podnapisi'], )['podnapisi'].query([Language('eng')], 'Game of Thrones', season=2, episode=1)"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; SZProviderPool(providers=['podnapisi'], )['podnapisi'].query([Language('eng')], 'Game of Thrones', season=2, episode=1)"
|
||||
|
||||
# tvsubtitles
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; ProviderPool(providers=['tvsubtitles'], )['tvsubtitles'].query('Game of Thrones', 2, 1)"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; SZProviderPool(providers=['tvsubtitles'], )['tvsubtitles'].query('Game of Thrones', 2, 1)"
|
||||
|
||||
# napiprojekt:list
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; from subliminal.core import scan_video; print ProviderPool(providers=['napiprojekt'], )['napiprojekt'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('pol')])"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subliminal.core import scan_video; print SZProviderPool(providers=['napiprojekt'], )['napiprojekt'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('pol')])"
|
||||
|
||||
# napiprojekt:download
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import PatchedProviderPool; from subliminal import download_best_subtitles; from babelfish import Language; from subliminal.core import scan_video; subs = download_best_subtitles([scan_video('FULL_PATH')], languages={Language('eng')}, providers=['napiprojekt'], ); print subs.values()[0][0].is_valid()"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from subliminal_patch.score import compute_score; from subliminal import download_best_subtitles; from babelfish import Language; from subliminal.core import scan_video; subs = download_best_subtitles([scan_video('FULL_PATH')], languages={Language('eng')}, providers=['napiprojekt'], pool_class=SZProviderPool, compute_score=compute_score); print subs.values()[0][0].is_valid()"
|
||||
|
||||
|
||||
# shooter:list
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; from subliminal.core import scan_video; print ProviderPool(providers=['shooter'], )['shooter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('zho')])"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subliminal.core import scan_video; print SZProviderPool(providers=['shooter'], )['shooter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('zho')])"
|
||||
|
||||
# subscenter:list
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal import ProviderPool; from babelfish import Language; from subliminal.core import scan_video; print ProviderPool(providers=['subscenter'], )['subscenter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('heb')])"
|
||||
python -c "import logging; logging.basicConfig(level=logging.DEBUG); logging.getLogger('rebulk').setLevel(logging.WARNING); import subliminal_patch, subliminal; subliminal.region.configure('dogpile.cache.memory'); from subliminal_patch.core import SZProviderPool; from babelfish import Language; from subliminal.core import scan_video; print SZProviderPool(providers=['subscenter'], )['subscenter'].list_subtitles(scan_video('FULL_PATH'), languages=[Language('heb')])"
|
||||
|
||||
|
||||
# refining
|
||||
|
||||
@@ -9,12 +9,6 @@ from .providers import Provider
|
||||
from .http import RetryingSession
|
||||
subliminal.subtitle.Subtitle = PatchedSubtitle
|
||||
|
||||
try:
|
||||
subliminal.provider_manager.register('napiprojekt = subliminal.providers.napiprojekt:NapiProjektProvider',)
|
||||
except ValueError:
|
||||
# already registered
|
||||
pass
|
||||
|
||||
# add our patched base classes
|
||||
for name in ("Addic7ed", "Podnapisi", "TVsubtitles", "OpenSubtitles", "LegendasTV", "NapiProjekt", "Shooter",
|
||||
"SubsCenter"):
|
||||
@@ -28,13 +22,18 @@ for name in ("Addic7ed", "Podnapisi", "TVsubtitles", "OpenSubtitles", "LegendasT
|
||||
from .core import scan_video, search_external_subtitles, list_all_subtitles, save_subtitles, refine
|
||||
from .score import compute_score
|
||||
from .extensions import provider_manager
|
||||
from .video import Video
|
||||
|
||||
# patch subliminal's core functions
|
||||
subliminal.scan_video = subliminal.core.scan_video = scan_video
|
||||
subliminal.core.search_external_subtitles = search_external_subtitles
|
||||
subliminal.save_subtitles = subliminal.core.save_subtitles = save_subtitles
|
||||
subliminal.refine = subliminal.core.refine = refine
|
||||
subliminal.video.Video = subliminal.Video = Video
|
||||
subliminal.video.Episode.__bases__ = (Video,)
|
||||
subliminal.video.Movie.__bases__ = (Video,)
|
||||
|
||||
# add our own list_all_subtitles
|
||||
subliminal.list_all_subtitles = subliminal.core.list_all_subtitles = list_all_subtitles
|
||||
subliminal.provider_manager = subliminal.core.provider_manager = provider_manager
|
||||
subliminal.provider_manager = subliminal.core.provider_manager = subliminal.extensions.provider_manager = \
|
||||
provider_manager
|
||||
|
||||
@@ -102,14 +102,18 @@ class SZProviderPool(ProviderPool):
|
||||
try:
|
||||
self[subtitle.provider_name].download_subtitle(subtitle)
|
||||
break
|
||||
except (requests.Timeout, socket.timeout):
|
||||
logger.error('Provider %r timed out', subtitle.provider_name)
|
||||
except ProviderError:
|
||||
logger.error('Unexpected error in provider %r, Traceback: %s', subtitle.provider_name,
|
||||
traceback.format_exc())
|
||||
except (requests.ConnectionError,
|
||||
requests.exceptions.ProxyError,
|
||||
requests.exceptions.SSLError,
|
||||
requests.Timeout,
|
||||
socket.timeout):
|
||||
logger.error('Provider %r connection error', subtitle.provider_name)
|
||||
|
||||
except:
|
||||
logger.exception('Unexpected error in provider %r, Traceback: %s', subtitle.provider_name,
|
||||
traceback.format_exc())
|
||||
self.discarded_providers.add(subtitle.provider_name)
|
||||
return False
|
||||
|
||||
if tries == DOWNLOAD_TRIES:
|
||||
self.discarded_providers.add(subtitle.provider_name)
|
||||
@@ -121,6 +125,10 @@ class SZProviderPool(ProviderPool):
|
||||
subtitle.provider_name, DOWNLOAD_RETRY_SLEEP)
|
||||
time.sleep(DOWNLOAD_RETRY_SLEEP)
|
||||
|
||||
if os.environ.get("SZ_ENFORCE_ENCODING", "False") == "True":
|
||||
logger.info("Enforcing encoding of %s from %s to %s", subtitle, subtitle.guess_encoding(), "utf-8")
|
||||
subtitle.set_encoding("utf-8")
|
||||
|
||||
# check subtitle validity
|
||||
if not subtitle.is_valid():
|
||||
logger.error('Invalid subtitle')
|
||||
@@ -192,7 +200,8 @@ class SZProviderPool(ProviderPool):
|
||||
continue
|
||||
|
||||
# bail out if hearing_impaired was wrong
|
||||
if "hearing_impaired" not in matches and hearing_impaired in ("force HI", "force non-HI"):
|
||||
if subtitle.hearing_impaired_verifiable and "hearing_impaired" not in matches and \
|
||||
hearing_impaired in ("force HI", "force non-HI"):
|
||||
logger.debug('%r: Skipping subtitle with score %d because hearing-impaired set to %s', subtitle,
|
||||
score, hearing_impaired)
|
||||
continue
|
||||
@@ -460,7 +469,7 @@ def get_subtitle_path(video_path, language=None, extension='.srt', forced_tag=Fa
|
||||
|
||||
|
||||
def save_subtitles(video, subtitles, single=False, directory=None, encoding=None, encode_with=None, chmod=None,
|
||||
forced_tag=False, path_decoder=None):
|
||||
forced_tag=False, path_decoder=None, debug_mods=False):
|
||||
"""Save subtitles on filesystem.
|
||||
|
||||
Subtitles are saved in the order of the list. If a subtitle with a language has already been saved, other subtitles
|
||||
@@ -515,7 +524,8 @@ def save_subtitles(video, subtitles, single=False, directory=None, encoding=None
|
||||
|
||||
# save normalized subtitle if encoder or no encoding is given
|
||||
if has_encoder or encoding is None:
|
||||
content = encode_with(subtitle.get_modified_text()) if has_encoder else subtitle.get_modified_content()
|
||||
content = encode_with(subtitle.get_modified_text(debug=debug_mods)) if has_encoder else \
|
||||
subtitle.get_modified_content(debug=debug_mods)
|
||||
with io.open(subtitle_path, 'wb') as f:
|
||||
f.write(content)
|
||||
|
||||
|
||||
@@ -3,7 +3,7 @@ import subliminal
|
||||
import babelfish
|
||||
from subliminal.extensions import RegistrableExtensionManager
|
||||
|
||||
provider_manager = RegistrableExtensionManager('subliminal.providers', [
|
||||
provider_manager = RegistrableExtensionManager('subliminal_patch.providers', [
|
||||
'addic7ed = subliminal_patch.providers.addic7ed:Addic7edProvider',
|
||||
'legendastv = subliminal_patch.providers.legendastv:LegendasTVProvider',
|
||||
'opensubtitles = subliminal_patch.providers.opensubtitles:OpenSubtitlesProvider',
|
||||
@@ -19,4 +19,5 @@ provider_manager = RegistrableExtensionManager('subliminal.providers', [
|
||||
babelfish.language_converters.unregister('addic7ed = subliminal.converters.addic7ed:Addic7edConverter')
|
||||
babelfish.language_converters.register('addic7ed = subliminal_patch.language:PatchedAddic7edConverter')
|
||||
subliminal.refiner_manager.register('sz_metadata = subliminal_patch.refiners.metadata:refine')
|
||||
subliminal.refiner_manager.register('sz_omdb = subliminal_patch.refiners.omdb:refine')
|
||||
|
||||
|
||||
@@ -4,7 +4,8 @@ from xmlrpclib import SafeTransport
|
||||
import certifi
|
||||
import ssl
|
||||
import os
|
||||
from requests import Session
|
||||
import socket
|
||||
from requests import Session, exceptions
|
||||
from retry.api import retry_call
|
||||
|
||||
pem_file = os.path.normpath(os.path.join(os.path.dirname(os.path.realpath(__file__)), "..", certifi.where()))
|
||||
@@ -23,7 +24,14 @@ class RetryingSession(Session):
|
||||
self.verify = pem_file
|
||||
|
||||
def retry_method(self, method, *args, **kwargs):
|
||||
return retry_call(getattr(super(RetryingSession, self), method), fargs=args, fkwargs=kwargs, tries=3, delay=1)
|
||||
return retry_call(getattr(super(RetryingSession, self), method), fargs=args, fkwargs=kwargs, tries=3, delay=5,
|
||||
exceptions=(exceptions.ConnectionError,
|
||||
exceptions.ProxyError,
|
||||
exceptions.SSLError,
|
||||
exceptions.Timeout,
|
||||
exceptions.ConnectTimeout,
|
||||
exceptions.ReadTimeout,
|
||||
socket.timeout))
|
||||
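The change to `retry_method` narrows retries to network-type errors instead of retrying on everything. A rough standard-library sketch of that behaviour is shown here; `retry_call_sketch` and `flaky` are hypothetical stand-ins and do not reproduce the `retry` package's implementation.

```python
import time


def retry_call_sketch(func, fargs=(), fkwargs=None, exceptions=Exception, tries=3, delay=0.0):
    """Call `func` up to `tries` times, retrying only on `exceptions`."""
    fkwargs = fkwargs or {}
    for attempt in range(1, tries + 1):
        try:
            return func(*fargs, **fkwargs)
        except exceptions:
            if attempt == tries:
                raise  # out of attempts: surface the last error
            time.sleep(delay)


calls = []


def flaky():
    # Fails twice with a retryable error, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError('temporary failure')
    return 'ok'


outcome = retry_call_sketch(flaky, exceptions=(ConnectionError, TimeoutError), tries=3, delay=0)
```

Errors outside the listed tuple, such as a `ValueError`, propagate on the first attempt rather than being retried.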
|
||||
def get(self, *args, **kwargs):
|
||||
return self.retry_method("get", *args, **kwargs)
|
||||
|
||||
@@ -5,3 +5,6 @@ from subliminal.providers import Provider as _Provider
|
||||
|
||||
class Provider(_Provider):
|
||||
hash_verifiable = False
|
||||
hearing_impaired_verifiable = False
|
||||
skip_wrong_fps = True
|
||||
|
||||
|
||||
@@ -17,6 +17,8 @@ series_year_re = re.compile(r'^(?P<series>[ \w\'.:(),&!?-]+?)(?: \((?P<year>\d{4
|
||||
|
||||
|
||||
class Addic7edSubtitle(_Addic7edSubtitle):
|
||||
hearing_impaired_verifiable = True
|
||||
|
||||
def __init__(self, language, hearing_impaired, page_link, series, season, episode, title, year, version,
|
||||
download_link):
|
||||
super(Addic7edSubtitle, self).__init__(language, hearing_impaired, page_link, series, season, episode,
|
||||
@@ -28,6 +30,10 @@ class Addic7edSubtitle(_Addic7edSubtitle):
|
||||
if not subliminal.score.episode_scores.get("addic7ed_boost"):
|
||||
return matches
|
||||
|
||||
# if the release group matches, the format is most likely correct, as well
|
||||
if "release_group" in matches:
|
||||
matches.add("format")
|
||||
|
||||
if {"series", "season", "episode", "year"}.issubset(matches) and "format" in matches:
|
||||
matches.add("addic7ed_boost")
|
||||
logger.info("Boosting Addic7ed subtitle by %s" % subliminal.score.episode_scores.get("addic7ed_boost"))
|
||||
@@ -40,6 +46,7 @@ class Addic7edSubtitle(_Addic7edSubtitle):
|
||||
|
||||
class Addic7edProvider(_Addic7edProvider):
|
||||
USE_ADDICTED_RANDOM_AGENTS = False
|
||||
hearing_impaired_verifiable = True
|
||||
subtitle_class = Addic7edSubtitle
|
||||
|
||||
def __init__(self, username=None, password=None, use_random_agents=False):
|
||||
|
||||
@@ -13,6 +13,10 @@ class LegendasTVSubtitle(_LegendasTVSubtitle):
|
||||
self.release_info = archive.name
|
||||
self.page_link = archive.link
|
||||
|
||||
def make_picklable(self):
|
||||
self.archive.content = None
|
||||
return self
|
||||
|
||||
|
||||
class LegendasTVProvider(_LegendasTVProvider):
|
||||
subtitle_class = LegendasTVSubtitle
|
||||
|
||||
@@ -3,6 +3,7 @@
|
||||
import re
|
||||
import time
|
||||
import logging
|
||||
import traceback
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -33,10 +34,11 @@ class ProviderRetryMixin(object):
|
||||
while i <= amount:
|
||||
try:
|
||||
return f()
|
||||
except exc, e:
|
||||
except exc:
|
||||
formatted_exc = traceback.format_exc()
|
||||
i += 1
|
||||
if i == amount:
|
||||
raise
|
||||
|
||||
logger.debug(u"Retrying %s, try: %i/%i, exception: %s" % (self.__class__.__name__, i, amount, e))
|
||||
logger.debug(u"Retrying %s, try: %i/%i, exception: %s" % (self.__class__.__name__, i, amount, formatted_exc))
|
||||
time.sleep(retry_timeout)
|
||||
|
||||
@@ -1,14 +1,50 @@
|
||||
# coding=utf-8
|
||||
import logging
|
||||
|
||||
from subliminal.providers.napiprojekt import NapiProjektProvider as _NapiProjektProvider, \
|
||||
NapiProjektSubtitle as _NapiProjektSubtitle
|
||||
NapiProjektSubtitle as _NapiProjektSubtitle, get_subhash
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class NapiProjektSubtitle(_NapiProjektSubtitle):
|
||||
def __init__(self, language, hash):
|
||||
def __init__(self, language, hash, fps):
|
||||
super(NapiProjektSubtitle, self).__init__(language, hash)
|
||||
self.release_info = hash
|
||||
self.plex_media_fps = float(fps)
|
||||
|
||||
def __repr__(self):
|
||||
return '<%s %r [%s]>' % (
|
||||
self.__class__.__name__, self.release_info, self.language)
|
||||
|
||||
|
||||
class NapiProjektProvider(_NapiProjektProvider):
|
||||
subtitle_class = NapiProjektSubtitle
|
||||
|
||||
def query(self, language, hash, fps):
|
||||
params = {
|
||||
'v': 'dreambox',
|
||||
'kolejka': 'false',
|
||||
'nick': '',
|
||||
'pass': '',
|
||||
'napios': 'Linux',
|
||||
'l': language.alpha2.upper(),
|
||||
'f': hash,
|
||||
't': get_subhash(hash)}
|
||||
logger.info('Searching subtitle %r', params)
|
||||
response = self.session.get(self.server_url, params=params, timeout=10)
|
||||
response.raise_for_status()
|
||||
|
||||
# handle subtitles not found and errors
|
||||
if response.content[:4] == b'NPc0':
|
||||
logger.debug('No subtitles found')
|
||||
return None
|
||||
|
||||
subtitle = self.subtitle_class(language, hash, fps)
|
||||
subtitle.content = response.content
|
||||
logger.debug('Found subtitle %r', subtitle)
|
||||
|
||||
return subtitle
|
||||
|
||||
def list_subtitles(self, video, languages):
|
||||
return [s for s in [self.query(l, video.hashes['napiprojekt'], video.fps) for l in languages] if s is not None]
|
||||
@@ -16,10 +16,11 @@ logger = logging.getLogger(__name__)
|
||||
|
||||
class OpenSubtitlesSubtitle(_OpenSubtitlesSubtitle):
|
||||
hash_verifiable = True
|
||||
hearing_impaired_verifiable = True
|
||||
|
||||
def __init__(self, language, hearing_impaired, page_link, subtitle_id, matched_by, movie_kind, hash, movie_name,
|
||||
movie_release_name, movie_year, movie_imdb_id, series_season, series_episode, query_parameters,
|
||||
filename, encoding, fps):
|
||||
filename, encoding, fps, skip_wrong_fps=True):
|
||||
super(OpenSubtitlesSubtitle, self).__init__(language, hearing_impaired, page_link, subtitle_id,
|
||||
matched_by, movie_kind, hash,
|
||||
movie_name, movie_release_name, movie_year, movie_imdb_id,
|
||||
@@ -27,6 +28,8 @@ class OpenSubtitlesSubtitle(_OpenSubtitlesSubtitle):
|
||||
self.query_parameters = query_parameters or {}
|
||||
self.fps = fps
|
||||
self.release_info = movie_release_name
|
||||
self.wrong_fps = False
|
||||
self.skip_wrong_fps = skip_wrong_fps
|
||||
|
||||
def get_matches(self, video, hearing_impaired=False):
|
||||
matches = super(OpenSubtitlesSubtitle, self).get_matches(video)
|
||||
@@ -39,9 +42,14 @@ class OpenSubtitlesSubtitle(_OpenSubtitlesSubtitle):
|
||||
|
||||
# video has fps info, sub also, and sub's fps is greater than 0
|
||||
if video.fps and sub_fps and (video.fps != self.fps):
|
||||
logger.debug("Wrong FPS (expected: %s, got: %s, lowering score massively)", video.fps, self.fps)
|
||||
# fixme: may be too harsh
|
||||
return set()
|
||||
self.wrong_fps = True
|
||||
|
||||
if self.skip_wrong_fps:
|
||||
logger.debug("Wrong FPS (expected: %s, got: %s, lowering score massively)", video.fps, self.fps)
|
||||
# fixme: may be too harsh
|
||||
return set()
|
||||
else:
|
||||
logger.debug("Wrong FPS (expected: %s, got: %s, continuing)", video.fps, self.fps)
|
||||
|
||||
# matched by tag?
|
||||
if self.matched_by == "tag":
|
||||
@@ -57,8 +65,10 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
|
||||
only_foreign = True
|
||||
subtitle_class = OpenSubtitlesSubtitle
|
||||
hash_verifiable = True
|
||||
hearing_impaired_verifiable = True
|
||||
skip_wrong_fps = True
|
||||
|
||||
def __init__(self, username=None, password=None, use_tag_search=False, only_foreign=False):
|
||||
def __init__(self, username=None, password=None, use_tag_search=False, only_foreign=False, skip_wrong_fps=True):
|
||||
if username is not None and password is None or username is None and password is not None:
|
||||
raise ConfigurationError('Username and password must be specified')
|
||||
|
||||
@@ -66,6 +76,7 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
|
||||
self.password = password or ''
|
||||
self.use_tag_search = use_tag_search
|
||||
self.only_foreign = only_foreign
|
||||
self.skip_wrong_fps = skip_wrong_fps
|
||||
|
||||
if use_tag_search:
|
||||
logger.info("Using tag/exact filename search")
|
||||
@@ -81,7 +92,7 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
|
||||
# fixme: retry on SSLError
|
||||
response = self.retry(
|
||||
lambda: checked(
|
||||
self.server.LogIn(self.username, self.password, 'eng', 'subliminal v%s' % __short_version__)
|
||||
self.server.LogIn(self.username, self.password, 'eng', os.environ.get("SZ_USER_AGENT", "Sub-Zero/2"))
|
||||
)
|
||||
)
|
||||
self.token = response['token']
|
||||
@@ -101,6 +112,12 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
|
||||
query = video.series
|
||||
season = video.season
|
||||
episode = video.episode
|
||||
|
||||
if video.is_special:
|
||||
season = None
|
||||
episode = None
|
||||
query = u"%s %s" % (video.series, video.title)
|
||||
logger.info("%s: Searching for special: %r", self.__class__, query)
|
||||
# elif ('opensubtitles' not in video.hashes or not video.size) and not video.imdb_id:
|
||||
# query = video.name.split(os.sep)[-1]
|
||||
else:
|
||||
@@ -176,7 +193,7 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
|
||||
movie_kind,
|
||||
hash, movie_name, movie_release_name, movie_year, movie_imdb_id,
|
||||
series_season, series_episode, query_parameters, filename, encoding,
|
||||
movie_fps)
|
||||
movie_fps, skip_wrong_fps=self.skip_wrong_fps)
|
||||
logger.debug('Found subtitle %r by %s', subtitle, matched_by)
|
||||
subtitles.append(subtitle)
|
||||
|
||||
|
||||
@@ -22,6 +22,7 @@ logger = logging.getLogger(__name__)
|
||||
|
||||
class PodnapisiSubtitle(_PodnapisiSubtitle):
|
||||
provider_name = 'podnapisi'
|
||||
hearing_impaired_verifiable = True
|
||||
|
||||
def __init__(self, language, hearing_impaired, page_link, pid, releases, title, season=None, episode=None,
|
||||
year=None):
|
||||
@@ -33,6 +34,7 @@ class PodnapisiSubtitle(_PodnapisiSubtitle):
|
||||
class PodnapisiProvider(_PodnapisiProvider):
|
||||
only_foreign = False
|
||||
subtitle_class = PodnapisiSubtitle
|
||||
hearing_impaired_verifiable = True
|
||||
|
||||
def __init__(self, only_foreign=False):
|
||||
self.only_foreign = only_foreign
|
||||
@@ -43,6 +45,10 @@ class PodnapisiProvider(_PodnapisiProvider):
|
||||
super(PodnapisiProvider, self).__init__()
|
||||
|
||||
def list_subtitles(self, video, languages):
|
||||
if video.is_special:
|
||||
logger.info("%s can't search for specials right now, skipping", self)
|
||||
return []
|
||||
|
||||
if isinstance(video, Episode):
|
||||
return [s for l in languages for s in self.query(l, video.series, season=video.season,
|
||||
episode=video.episode, year=video.year,
|
||||
|
||||
@@ -5,6 +5,8 @@ from subliminal.providers.subscenter import SubsCenterProvider as _SubsCenterPro
|
||||
|
||||
|
||||
class SubsCenterSubtitle(_SubsCenterSubtitle):
|
||||
hearing_impaired_verifiable = True
|
||||
|
||||
def __init__(self, language, hearing_impaired, page_link, series, season, episode, title, subtitle_id, subtitle_key,
|
||||
subtitle_version, downloaded, releases):
|
||||
super(SubsCenterSubtitle, self).__init__(language, hearing_impaired, page_link, series, season, episode, title,
|
||||
@@ -20,3 +22,4 @@ class SubsCenterSubtitle(_SubsCenterSubtitle):
|
||||
|
||||
class SubsCenterProvider(_SubsCenterProvider):
|
||||
subtitle_class = SubsCenterSubtitle
|
||||
hearing_impaired_verifiable = True
|
||||
|
||||
@@ -0,0 +1,67 @@
|
||||
# coding=utf-8
|
||||
import os
|
||||
import subliminal
|
||||
import base64
|
||||
import zlib
|
||||
from subliminal import __short_version__
|
||||
from subliminal.refiners.omdb import OMDBClient, refine
|
||||
|
||||
|
||||
class SZOMDBClient(OMDBClient):
|
||||
def __init__(self, version=1, session=None, headers=None, timeout=10):
|
||||
super(SZOMDBClient, self).__init__(version=version, session=session, headers=headers, timeout=timeout)
|
||||
|
||||
def get_params(self, params):
|
||||
self.session.params['apikey'] = \
|
||||
zlib.decompress(base64.b16decode(os.environ['U1pfT01EQl9LRVk']))\
|
||||
.decode('cm90MTM=\n'.decode("base64")) \
|
||||
.decode('YmFzZTY0\n'.decode("base64")).split("x")[0]
|
||||
return dict(self.session.params, **params)
|
||||
|
||||
def get(self, id=None, title=None, type=None, year=None, plot='short', tomatoes=False):
|
||||
# build the params
|
||||
params = {}
|
||||
if id:
|
||||
params['i'] = id
|
||||
if title:
|
||||
params['t'] = title
|
||||
if not params:
|
||||
raise ValueError('At least id or title is required')
|
||||
params['type'] = type
|
||||
params['y'] = year
|
||||
params['plot'] = plot
|
||||
params['tomatoes'] = tomatoes
|
||||
|
||||
# perform the request
|
||||
r = self.session.get(self.base_url, params=self.get_params(params))
|
||||
r.raise_for_status()
|
||||
|
||||
# get the response as json
|
||||
j = r.json()
|
||||
|
||||
# check response status
|
||||
if j['Response'] == 'False':
|
||||
return None
|
||||
|
||||
return j
|
||||
|
||||
def search(self, title, type=None, year=None, page=1):
|
||||
# build the params
|
||||
params = {'s': title, 'type': type, 'y': year, 'page': page}
|
||||
|
||||
# perform the request
|
||||
r = self.session.get(self.base_url, params=self.get_params(params))
|
||||
r.raise_for_status()
|
||||
|
||||
# get the response as json
|
||||
j = r.json()
|
||||
|
||||
# check response status
|
||||
if j['Response'] == 'False':
|
||||
return None
|
||||
|
||||
return j
|
||||
|
||||
|
||||
omdb_client = SZOMDBClient(headers={'User-Agent': 'Subliminal/%s' % __short_version__})
|
||||
subliminal.refiners.omdb.omdb_client = omdb_client
|
||||
@@ -45,16 +45,18 @@ def compute_score(matches, subtitle, video, hearing_impaired=None):
|
||||
# hash is error-prone, try to fix that
|
||||
hash_valid_if = episode_hash_valid_if if is_episode else movie_hash_valid_if
|
||||
|
||||
if hash_valid_if <= set(matches):
|
||||
# series, season and episode matched, hash is valid
|
||||
logger.debug('%r: Using valid hash, as %s are correct (%r) and (%r)', subtitle, hash_valid_if, matches,
|
||||
video)
|
||||
matches &= {'hash', 'hearing_impaired'}
|
||||
else:
|
||||
# no match, invalidate hash
|
||||
logger.debug('%r: Ignoring hash as other matches are wrong (missing: %r) and (%r)', subtitle,
|
||||
hash_valid_if - matches, video)
|
||||
matches -= {"hash"}
|
||||
# don't validate hashes of specials, as season and episode tend to be wrong
|
||||
if is_movie or not video.is_special:
|
||||
if hash_valid_if <= set(matches):
|
||||
# series, season and episode matched, hash is valid
|
||||
logger.debug('%r: Using valid hash, as %s are correct (%r) and (%r)', subtitle, hash_valid_if, matches,
|
||||
video)
|
||||
matches &= {'hash'}
|
||||
else:
|
||||
# no match, invalidate hash
|
||||
logger.debug('%r: Ignoring hash as other matches are wrong (missing: %r) and (%r)', subtitle,
|
||||
hash_valid_if - matches, video)
|
||||
matches -= {"hash"}
|
||||
elif 'hash' in matches:
|
||||
logger.debug('%r: Hash not verifiable for this provider. Keeping it', subtitle)
|
||||
|
||||
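The hash check in the hunk above is a set comparison: the hash is trusted only when every required match is also present, and specials skip the check. A minimal sketch of that idea follows. The `validate_hash` helper and the contents of the `required` set are hypothetical, since the diff does not show what `episode_hash_valid_if` contains.

```python
def validate_hash(matches, required, is_special=False):
    """Return a new match set with the 'hash' entry kept or dropped.

    Hypothetical helper for illustration; not part of the patched module.
    """
    matches = set(matches)
    if "hash" not in matches:
        return matches
    if is_special:
        # Specials are not validated, because season and episode tend to be wrong.
        return matches
    if required <= matches:
        # Every required match is present: trust the hash and let it
        # stand in for all other matches.
        return matches & {"hash"}
    # A required match is missing: drop the hash, keep everything else.
    return matches - {"hash"}


required = {"series", "season", "episode"}  # assumed contents, for illustration
print(validate_hash({"hash", "series", "season", "episode"}, required))
print(validate_hash({"hash", "series"}, required))
```

The subset test (`<=`) keeps the rule declarative: changing which matches are needed to trust a hash only means editing the `required` set.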
@@ -75,6 +77,13 @@ def compute_score(matches, subtitle, video, hearing_impaired=None):
|
||||
if 'series_tvdb_id' in matches:
|
||||
logger.debug('Adding series_tvdb_id match equivalents')
|
||||
matches |= {'series', 'year'}
|
||||
|
||||
# specials
|
||||
if video.is_special and 'title' in matches and 'series' in matches \
|
||||
and 'year' in matches:
|
||||
logger.debug('Adding special title match equivalent')
|
||||
matches |= {'season', 'episode'}
|
||||
|
||||
elif is_movie:
|
||||
if 'imdb_id' in matches:
|
||||
logger.debug('Adding imdb_id match equivalents')
|
||||
|
||||
@@ -2,13 +2,18 @@
|
||||
|
||||
|
||||
import logging
|
||||
import traceback
|
||||
|
||||
import re
|
||||
|
||||
import chardet
|
||||
import pysrt
|
||||
import pysubs2
|
||||
from bs4 import UnicodeDammit
|
||||
from subliminal import Subtitle
|
||||
from pysubs2 import SSAStyle
|
||||
from pysubs2.subrip import ms_to_timestamp, parse_tags
|
||||
from subzero.modification import SubtitleModifications
|
||||
from subliminal import Subtitle
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -18,8 +23,13 @@ class PatchedSubtitle(Subtitle):
|
||||
release_info = None
|
||||
matches = None
|
||||
hash_verifiable = False
|
||||
hearing_impaired_verifiable = False
|
||||
mods = None
|
||||
plex_media_fps = None
|
||||
skip_wrong_fps = False
|
||||
wrong_fps = False
|
||||
|
||||
_guessed_encoding = None
|
||||
|
||||
def __init__(self, language, hearing_impaired=False, page_link=None, encoding=None, mods=None):
|
||||
super(PatchedSubtitle, self).__init__(language, hearing_impaired=hearing_impaired, page_link=page_link,
|
||||
@@ -30,6 +40,21 @@ class PatchedSubtitle(Subtitle):
|
||||
return '<%s %r [%s]>' % (
|
||||
self.__class__.__name__, self.page_link, self.language)
|
||||
|
||||
def make_picklable(self):
|
||||
"""
|
||||
some subtitle instances might have unpicklable objects stored; clean them up here
|
||||
:return: self
|
||||
"""
|
||||
return self
|
||||
|
||||
def set_encoding(self, encoding):
|
||||
if encoding == self.guess_encoding():
|
||||
return
|
||||
|
||||
unicontent = self.text
|
||||
self.content = unicontent.encode(encoding)
|
||||
self._guessed_encoding = encoding
|
||||
|
||||
def guess_encoding(self):
|
||||
"""Guess encoding using the language, falling back on chardet.
|
||||
|
||||
@@ -37,11 +62,17 @@ class PatchedSubtitle(Subtitle):
|
||||
:rtype: str
|
||||
|
||||
"""
|
||||
if self._guessed_encoding:
|
||||
logger.info('Encoding already guessed: %s', self._guessed_encoding)
|
||||
return self._guessed_encoding
|
||||
|
||||
logger.info('Guessing encoding for language %s', self.language.alpha3)
|
||||
|
||||
encodings = ['utf-8']
|
||||
|
||||
# add language-specific encodings
|
||||
# http://scratchpad.wikia.com/wiki/Character_Encoding_Recommendation_for_Languages
|
||||
|
||||
if self.language.alpha3 == 'zho':
|
||||
encodings.extend(['gb18030', 'big5'])
|
||||
elif self.language.alpha3 == 'jpn':
|
||||
@@ -67,15 +98,15 @@ class PatchedSubtitle(Subtitle):
|
||||
elif self.language.alpha3 in ('pol', 'cze', 'ces', 'slk', 'slo', 'slv', 'hun', 'bos', 'hbs', 'hrv', 'rsb',
|
||||
'ron', 'rum', 'sqi', 'alb'):
|
||||
# Eastern European Group 1
|
||||
encodings.append('windows-1250')
|
||||
encodings.extend(['iso-8859-2', 'windows-1250'])
|
||||
|
||||
# Bulgarian, Serbian and Macedonian
|
||||
elif self.language.alpha3 in ('bul', 'srp', 'mkd', 'mac'):
|
||||
# Bulgarian, Serbian and Macedonian, Ukrainian and Russian
|
||||
elif self.language.alpha3 in ('bul', 'srp', 'mkd', 'mac', 'rus', 'ukr'):
|
||||
# Eastern European Group 2
|
||||
encodings.append('windows-1251')
|
||||
encodings.extend(['iso-8859-5', 'windows-1251'])
|
||||
else:
|
||||
# Western European (windows-1252)
|
||||
encodings.append('latin-1')
|
||||
# Western European (windows-1252) / Northern European
|
||||
encodings.extend(['iso-8859-15', 'iso-8859-9', 'iso-8859-4', 'iso-8859-1', 'latin-1'])
|
||||
|
||||
# try to decode
|
||||
logger.debug('Trying encodings %r', encodings)
|
||||
@@ -86,6 +117,7 @@ class PatchedSubtitle(Subtitle):
|
||||
pass
|
||||
else:
|
||||
logger.info('Guessed encoding %s', encoding)
|
||||
self._guessed_encoding = encoding
|
||||
return encoding
|
||||
|
||||
logger.warning('Could not guess encoding from language')
|
||||
@@ -102,9 +134,11 @@ class PatchedSubtitle(Subtitle):
|
||||
Log.Debug("bs4 detected encoding: %s" % a.original_encoding)
|
||||
|
||||
if a.original_encoding:
|
||||
self._guessed_encoding = a.original_encoding
|
||||
return a.original_encoding
|
||||
raise ValueError(u"Couldn't guess the proper encoding for %s" % self)
|
||||
|
||||
self._guessed_encoding = encoding
|
||||
return encoding
|
||||
|
||||
def is_valid(self):
|
||||
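The `guess_encoding` changes above build a list of candidate encodings from the language, return the first one that decodes cleanly, and cache the result. A rough standalone sketch of the candidate-list part, assuming a hypothetical free function `guess_encoding_for` and a shortened language table (the chardet and bs4 fallbacks from the diff are left out):

```python
def guess_encoding_for(content, alpha3):
    """Return the first candidate encoding that decodes `content`, or None.

    Hypothetical helper; `content` is bytes, `alpha3` an ISO 639 code.
    """
    candidates = ["utf-8"]
    if alpha3 in ("bul", "srp", "mkd", "mac", "rus", "ukr"):
        candidates.extend(["iso-8859-5", "windows-1251"])
    elif alpha3 in ("pol", "ces", "slk", "hun"):
        candidates.extend(["iso-8859-2", "windows-1250"])
    else:
        candidates.extend(["iso-8859-15", "latin-1"])

    for encoding in candidates:
        try:
            content.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


sample = u"\u041f\u0440\u0438\u0432\u0435\u0442"  # Cyrillic sample text
print(guess_encoding_for(sample.encode("utf-8"), "rus"))
print(guess_encoding_for(sample.encode("windows-1251"), "rus"))
```

Single-byte codecs accept most byte sequences, so the first single-byte candidate usually wins whenever UTF-8 fails. The order of the list therefore matters more than its length.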
@@ -114,50 +148,95 @@ class PatchedSubtitle(Subtitle):
|
||||
:rtype: bool
|
||||
|
||||
"""
|
||||
if not self.text:
|
||||
text = self.text
|
||||
if not text:
|
||||
return False
|
||||
|
||||
# valid srt
|
||||
try:
|
||||
pysrt.from_string(self.text, error_handling=pysrt.ERROR_RAISE)
|
||||
except Exception, e:
|
||||
logger.error("PySRT-parsing failed: %s, trying pysubs2", e)
|
||||
pysrt.from_string(text, error_handling=pysrt.ERROR_RAISE)
|
||||
except Exception:
|
||||
logger.error("PySRT-parsing failed, trying pysubs2")
|
||||
else:
|
||||
return True
|
||||
|
||||
# something else, try to return srt
|
||||
try:
|
||||
logger.debug("Trying parsing with PySubs2")
|
||||
subs = pysubs2.SSAFile.from_string(self.text)
|
||||
self.content = subs.to_string("srt")
|
||||
try:
|
||||
# in case of microdvd, try parsing the fps from the subtitle
|
||||
subs = pysubs2.SSAFile.from_string(text)
|
||||
if subs.format == "microdvd":
|
||||
logger.info("Got FPS from MicroDVD subtitle: %s", subs.fps)
|
||||
except pysubs2.UnknownFPSError:
|
||||
# if parsing failed, suggest our media file's fps
|
||||
subs = pysubs2.SSAFile.from_string(text, fps=self.plex_media_fps)
|
||||
if subs.format == "microdvd":
|
||||
logger.info("No FPS info in subtitle. Using our own media FPS for the MicroDVD subtitle: %s",
|
||||
subs.fps)
|
||||
|
||||
unicontent = self.pysubs2_to_unicode(subs)
|
||||
self.content = unicontent.encode(self.guess_encoding())
|
||||
except:
|
||||
logger.exception("Couldn't convert subtitle %s to .srt format", self)
|
||||
logger.exception("Couldn't convert subtitle %s to .srt format: %s", self, traceback.format_exc())
|
||||
return False
|
||||
|
||||
return True
|
||||
|
||||
def get_modified_content(self):
|
||||
@classmethod
|
||||
def pysubs2_to_unicode(cls, sub):
|
||||
def prepare_text(text, style):
|
||||
body = []
|
||||
for fragment, sty in parse_tags(text, style, sub.styles):
|
||||
fragment = fragment.replace(ur"\h", u" ")
|
||||
fragment = fragment.replace(ur"\n", u"\n")
|
||||
fragment = fragment.replace(ur"\N", u"\n")
|
||||
if sty.italic: fragment = u"<i>%s</i>" % fragment
|
||||
if sty.underline: fragment = u"<u>%s</u>" % fragment
|
||||
if sty.strikeout: fragment = u"<s>%s</s>" % fragment
|
||||
body.append(fragment)
|
||||
|
||||
return re.sub(u"\n+", u"\n", u"".join(body).strip())
|
||||
|
||||
visible_lines = (line for line in sub if not line.is_comment)
|
||||
|
||||
out = []
|
||||
|
||||
for i, line in enumerate(visible_lines, 1):
|
||||
start = ms_to_timestamp(line.start)
|
||||
end = ms_to_timestamp(line.end)
|
||||
text = prepare_text(line.text, sub.styles.get(line.style, SSAStyle.DEFAULT_STYLE))
|
||||
|
||||
out.append(u"%d\n" % i)
|
||||
out.append(u"%s --> %s\n" % (start, end))
|
||||
out.append(u"%s%s" % (text, "\n\n"))
|
||||
|
||||
return u"".join(out)
|
||||
|
||||
def get_modified_content(self, debug=False):
|
||||
"""
|
||||
:param language:
|
||||
:param fps:
|
||||
:return: string
|
||||
"""
|
||||
if not self.mods:
|
||||
return self.content
|
||||
|
||||
encoding = self.guess_encoding()
|
||||
|
||||
submods = SubtitleModifications()
|
||||
submods.load(content=self.text, fps=self.plex_media_fps)
|
||||
submods = SubtitleModifications(debug=debug)
|
||||
submods.load(content=self.text, language=self.language)
|
||||
submods.modify(*self.mods)
|
||||
return submods.to_string("srt", encoding=encoding).encode(encoding=encoding)
|
||||
|
||||
def get_modified_text(self):
|
||||
return self.pysubs2_to_unicode(submods.f).encode(encoding=encoding)
|
||||
|
||||
def get_modified_text(self, debug=False):
|
||||
"""
|
||||
:param language:
|
||||
:param fps:
|
||||
:return: unicode
|
||||
"""
|
||||
content = self.get_modified_content()
|
||||
content = self.get_modified_content(debug=debug)
|
||||
if not content:
|
||||
return
|
||||
encoding = self.guess_encoding()
|
||||
return content.decode(encoding=encoding)
|
||||
|
||||
|
||||
class ModifiedSubtitle(PatchedSubtitle):
|
||||
id = None
|
||||
|
||||
@@ -0,0 +1,7 @@
|
||||
# coding=utf-8
|
||||
|
||||
from subliminal.video import Video as Video_
|
||||
|
||||
|
||||
class Video(Video_):
|
||||
is_special = False
|
||||
@@ -1,7 +1,10 @@
|
||||
# coding=utf-8
|
||||
|
||||
import sys
|
||||
import logging
|
||||
import sys
|
||||
import codecs
|
||||
|
||||
from babelfish import Language
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -14,7 +17,18 @@ if debug:
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
|
||||
submod = SubMod(debug=debug)
|
||||
submod.load(fn)
|
||||
submod.modify("remove_HI")
|
||||
submod.load(fn, language=Language.fromietf("eng"), encoding="utf-8")
|
||||
submod.modify("remove_HI", "OCR_fixes", "common", "OCR_fixes", "shift_offset(s=20)", "OCR_fixes", "color(color=#FF0000)", "shift_offset(s=-5, ms=-350)")
|
||||
|
||||
#srt = submod.to_unicode()
|
||||
#print submod.f.to_string("srt", encoding="utf-8")
|
||||
#print repr(srt)
|
||||
#f = codecs.open("testout.srt", "w+", encoding="latin-1")
|
||||
#f.write(srt)
|
||||
#f.close()
|
||||
#print submod.f.to_string("srt")
|
||||
#submod.modify("OCR_fixes")
|
||||
#submod.modify("change_FPS(from=24,to=25)")
|
||||
#submod.modify("common")
|
||||
|
||||
#print submod.f.to_string("srt")
|
||||
|
||||
@@ -3,6 +3,7 @@
|
||||
import datetime
|
||||
import logging
|
||||
import traceback
|
||||
import types
|
||||
|
||||
from constants import mode_map
|
||||
|
||||
@@ -71,9 +72,10 @@ class SubtitleHistory(object):
|
||||
self.history_items = storage.LoadObject("subtitle_history") or []
|
||||
except:
|
||||
logger.error("Failed to load history storage: %s" % traceback.format_exc())
|
||||
if not isinstance(self.history_items, types.ListType):
|
||||
self.history_items = []
|
||||
|
||||
def add(self, item_title, rating_key, section_title=None, subtitle=None, mode="a", time=None):
|
||||
# create copy
|
||||
items = self.history_items
|
||||
item = SubtitleHistoryItem(item_title, rating_key, section_title=section_title, subtitle=subtitle, mode=mode, time=time)
|
||||
|
||||
|
||||
@@ -1,246 +0,0 @@
|
||||
# coding=utf-8
|
||||
|
||||
import re
|
||||
import traceback
|
||||
from collections import OrderedDict
|
||||
|
||||
import pysubs2
|
||||
import logging
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class SubtitleModifications(object):
|
||||
debug = False
|
||||
|
||||
def __init__(self, debug=False):
|
||||
self.debug = debug
|
||||
|
||||
def load(self, fn=None, content=None, fps=None):
|
||||
"""
|
||||
|
||||
:param fn: filename
|
||||
:param content: unicode
|
||||
:param fps:
|
||||
:return:
|
||||
"""
|
||||
try:
|
||||
if fn:
|
||||
self.f = pysubs2.load(fn, fps=fps)
|
||||
elif content:
|
||||
self.f = pysubs2.SSAFile.from_string(content, fps=fps)
|
||||
except (IOError,
|
||||
UnicodeDecodeError,
|
||||
pysubs2.exceptions.UnknownFPSError,
|
||||
pysubs2.exceptions.UnknownFormatIdentifierError,
|
||||
pysubs2.exceptions.FormatAutodetectionError):
|
||||
if fn:
|
||||
logger.exception("Couldn't load subtitle: %s: %s", fn, traceback.format_exc())
|
||||
elif content:
|
||||
logger.exception("Couldn't load subtitle: %s", traceback.format_exc())
|
||||
|
||||
def modify(self, *mods):
|
||||
new_f = []
|
||||
for line in self.f:
|
||||
applied_mods = []
|
||||
for identifier in mods:
|
||||
if identifier in registry.mods:
|
||||
mod = registry.mods[identifier]
|
||||
|
||||
# don't bother reapplying exclusive mods multiple times
|
||||
if mod.exclusive and identifier in applied_mods:
|
||||
continue
|
||||
|
||||
new_content = mod.modify(line.text, debug=self.debug)
|
||||
if not new_content:
|
||||
if self.debug:
|
||||
logger.debug("%s: deleting %s", identifier, line)
|
||||
continue
|
||||
|
||||
line.text = new_content
|
||||
new_f.append(line)
|
||||
applied_mods.append(identifier)
|
||||
|
||||
self.f.events = new_f
|
||||
|
||||
def to_string(self, format="srt", encoding="utf-8"):
|
||||
return self.f.to_string(format, encoding=encoding)
|
||||
|
||||
def save(self, fn):
|
||||
self.f.save(fn)
|
||||
|
||||
|
||||
SubMod = SubtitleModifications
|
||||
|
||||
|
||||
class SubtitleModRegistry(object):
|
||||
mods = None
|
||||
mods_available = None
|
||||
|
||||
def __init__(self):
|
||||
self.mods = OrderedDict()
|
||||
self.mods_available = []
|
||||
|
||||
def register(self, mod):
|
||||
self.mods[mod.identifier] = mod
|
||||
self.mods_available.append(mod.identifier)
|
||||
|
||||
registry = SubtitleModRegistry()
|
||||
|
||||
|
||||
class Processor(object):
|
||||
"""
|
||||
Processor base class
|
||||
"""
|
||||
name = None
|
||||
|
||||
def __init__(self, name=None):
|
||||
self.name = name
|
||||
|
||||
@property
|
||||
def info(self):
|
||||
return self.name
|
||||
|
||||
def process(self, content):
|
||||
return content
|
||||
|
||||
def __repr__(self):
|
||||
return "Processor <%s %s>" % (self.__class__.__name__, self.info)
|
||||
|
||||
def __str__(self):
|
||||
return repr(self)
|
||||
|
||||
def __unicode__(self):
|
||||
return unicode(repr(self))
|
||||
|
||||
|
||||
class StringProcessor(Processor):
|
||||
"""
|
||||
String replacement processor base
|
||||
"""
|
||||
|
||||
def __init__(self, search, replace, name=None):
|
||||
super(StringProcessor, self).__init__(name=name)
|
||||
self.search = search
|
||||
self.replace = replace
|
||||
|
||||
def process(self, content):
|
||||
return content.replace(self.search, self.replace)
|
||||
|
||||
|
||||
class ReProcessor(Processor):
|
||||
"""
|
||||
Regex processor
|
||||
"""
|
||||
pattern = None
|
||||
replace_with = None
|
||||
|
||||
def __init__(self, pattern, replace_with, name=None):
|
||||
super(ReProcessor, self).__init__(name=name)
|
||||
self.pattern = pattern
|
||||
self.replace_with = replace_with
|
||||
|
||||
def process(self, content, debug=False):
|
||||
return self.pattern.sub(self.replace_with, content)
|
||||
|
||||
|
||||
class NReProcessor(ReProcessor):
|
||||
"""
|
||||
Single line regex processor
|
||||
"""
|
||||
|
||||
def process(self, content, debug=False):
|
||||
lines = []
|
||||
for line in content.split(r"\N"):
|
||||
a = super(NReProcessor, self).process(line, debug=debug)
|
||||
if not a:
|
||||
continue
|
||||
lines.append(a)
|
||||
return r"\N".join(lines)
|
||||
|
||||
|
||||
class SubtitleModification(object):
|
||||
identifier = None
|
||||
description = None
|
||||
exclusive = False
|
||||
pre_processors = []
|
||||
processors = []
|
||||
post_processors = []
|
||||
|
||||
@classmethod
|
||||
def _process(cls, content, processors, debug=False):
|
||||
if not content:
|
||||
return
|
||||
|
||||
new_content = content
|
||||
for processor in processors:
|
||||
old_content = new_content
|
||||
new_content = processor.process(new_content, debug=debug)
|
||||
if not new_content:
|
||||
if debug:
|
||||
logger.debug("Processor returned empty line: %s", processor)
|
||||
break
|
||||
if debug:
|
||||
if old_content == new_content:
|
||||
continue
|
||||
logger.debug("%s: %s -> %s", processor, old_content, new_content)
|
||||
return new_content
|
||||
|
||||
@classmethod
|
||||
def pre_process(cls, content, debug=False):
|
||||
return cls._process(content, cls.pre_processors, debug=debug)
|
||||
|
||||
@classmethod
|
||||
def process(cls, content, debug=False):
|
||||
return cls._process(content, cls.processors, debug=debug)
|
||||
|
||||
@classmethod
|
||||
def post_process(cls, content, debug=False):
|
||||
return cls._process(content, cls.post_processors, debug=debug)
|
||||
|
||||
@classmethod
|
||||
def modify(cls, content, debug=False):
|
||||
new_content = content
|
||||
for method in ("pre_process", "process", "post_process"):
|
||||
new_content = getattr(cls, method)(new_content, debug=debug)
|
||||
|
||||
return new_content
|
||||
|
||||
|
||||
class SubtitleTextModification(SubtitleModification):
|
||||
post_processors = [
|
||||
# empty tag
|
||||
ReProcessor(re.compile(r'({\\\w+1})[\s.,-_!?]+({\\\w+0})'), "", name="empty_tag"),
|
||||
|
||||
# empty line (needed?)
|
||||
NReProcessor(re.compile(r'^\s+$'), "", name="empty_line"),
|
||||
|
||||
# empty dash line (needed?)
|
||||
NReProcessor(re.compile(r'(^[\s]*[\-]+[\s]*)$'), "", name="empty_dash_line"),
|
||||
|
||||
# clean whitespace at start and end
|
||||
ReProcessor(re.compile(r'^\s*([^\s]+)\s*$'), r"\1", name="surrounding_whitespace"),
|
||||
]
|
||||
|
||||
|
||||
class HearingImpaired(SubtitleTextModification):
|
||||
identifier = "remove_HI"
|
||||
description = "Remove Hearing Impaired tags"
|
||||
exclusive = True
|
||||
|
||||
processors = [
|
||||
# brackets
|
||||
NReProcessor(re.compile(r'(?sux)[([].+[)\]]'), "", name="HI_brackets"),
|
||||
|
||||
# text before colon (and possible dash in front)
|
||||
NReProcessor(re.compile(r'(?u)(^[A-z\-]+[\w\s]*:[^0-9{2}][\s]*)'), "", name="HI_before_colon"),
|
||||
|
||||
# all caps line (at least 3 chars)
|
||||
NReProcessor(re.compile(r'(?u)(^[A-Z]{3,}$)'), "", name="HI_all_caps"),
|
||||
|
||||
# dash in front
|
||||
NReProcessor(re.compile(r'(?u)^\s*-\s*'), "", name="HI_starting_dash"),
|
||||
]
|
||||
|
||||
|
||||
registry.register(HearingImpaired)
|
||||
@@ -0,0 +1,5 @@
|
||||
# coding=utf-8
|
||||
|
||||
from registry import registry
|
||||
from mods import hearing_impaired, ocr_fixes, fps, offset, common, color
|
||||
from main import SubtitleModifications, SubMod
|
||||
@@ -0,0 +1,3 @@
|
||||
# coding=utf-8
|
||||
|
||||
from data import data
|
||||
File diff suppressed because one or more lines are too long
@@ -0,0 +1,98 @@
|
||||
# coding=utf-8
|
||||
|
||||
import re
|
||||
import os
|
||||
import pprint
|
||||
from collections import OrderedDict
|
||||
|
||||
from bs4 import BeautifulSoup
|
||||
|
||||
TEMPLATE = """\
|
||||
import re
|
||||
from collections import OrderedDict
|
||||
data = """
|
||||
|
||||
TEMPLATE_END = """\
|
||||
|
||||
for lang, grps in data.iteritems():
|
||||
for grp in grps.iterkeys():
|
||||
if data[lang][grp]["pattern"]:
|
||||
data[lang][grp]["pattern"] = re.compile(data[lang][grp]["pattern"])
|
||||
"""
|
||||
|
||||
|
||||
SZ_FIX_DATA = {
|
||||
"eng": {
|
||||
"PartialWordsAlways": {
|
||||
u"°x°": u"%",
|
||||
u"compiete": u"complete",
|
||||
u"Âs": u"'s",
|
||||
u"ÃÂs": u"'s",
|
||||
u"a/ion": u"ation",
|
||||
u"at/on": u"ation",
|
||||
u"l/an": u"lian",
|
||||
},
|
||||
"WholeWords": {
|
||||
u"I'11": u"I'll",
|
||||
u"Tun": u"Run",
|
||||
u"pan'": u"part",
|
||||
u"al'": u"at",
|
||||
u"a re": u"are",
|
||||
u"wail'": u"wait",
|
||||
u"he)'": u"hey",
|
||||
u"He)'": u"Hey",
|
||||
u"Yea h": u"Yeah",
|
||||
u"yea h": u"yeah",
|
||||
u"h is": u"his",
|
||||
u" 're ": u"'re ",
|
||||
u"LAst": u"Last",
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if __name__ == "__main__":
|
||||
cur_dir = os.path.dirname(os.path.realpath(__file__))
|
||||
xml_dir = os.path.join(cur_dir, "xml")
|
||||
file_list = os.listdir(xml_dir)
|
||||
|
||||
data = {}
|
||||
|
||||
for fn in file_list:
|
||||
if fn.endswith("_OCRFixReplaceList.xml"):
|
||||
lang = fn.split("_")[0]
|
||||
soup = BeautifulSoup(open(os.path.join(xml_dir, fn)), "xml")
|
||||
|
||||
fetch_data = (
|
||||
# group, item_name, pattern
|
||||
("WholeLines", "Line", None),
|
||||
("WholeWords", "Word", lambda d: (ur"(?um)\b(?:" + u"|".join([re.escape(k) for k in d.keys()])
|
||||
+ ur')\b') if d else None),
|
||||
("PartialWordsAlways", "WordPart", None),
|
||||
("PartialLines", "LinePart", lambda d: (ur"(?um)(?:(?<=\s)|(?<=^)|(?<=\b))(?:" +
|
||||
u"|".join([re.escape(k) for k in d.keys()]) +
|
||||
ur")(?:(?=\s)|(?=$)|(?=\b))") if d else None),
|
||||
("BeginLines", "Beginning", lambda d: (ur"(?um)^(?:"+u"|".join([re.escape(k) for k in d.keys()])
|
||||
+ ur')') if d else None),
|
||||
("EndLines", "Ending", lambda d: (ur"(?um)(?:" + u"|".join([re.escape(k) for k in d.keys()]) +
|
||||
ur")$") if d else None,),
|
||||
)
|
||||
|
||||
data[lang] = dict((grp, {"data": OrderedDict(), "pattern": None}) for grp, item_name, pattern in fetch_data)
|
||||
|
||||
for grp, item_name, pattern in fetch_data:
|
||||
for grp_data in soup.find_all(grp):
|
||||
for line in grp_data.find_all(item_name):
|
||||
data[lang][grp]["data"][line["from"]] = line["to"]
|
||||
|
||||
# add our own dictionaries
|
||||
if lang in SZ_FIX_DATA and grp in SZ_FIX_DATA[lang]:
|
||||
data[lang][grp]["data"].update(SZ_FIX_DATA[lang][grp])
|
||||
|
||||
if pattern:
|
||||
data[lang][grp]["pattern"] = pattern(data[lang][grp]["data"])
|
||||
|
||||
f = open(os.path.join(cur_dir, "data.py"), "w+")
|
||||
f.write(TEMPLATE)
|
||||
f.write(pprint.pformat(data, width=1))
|
||||
f.write(TEMPLATE_END)
|
||||
f.close()
|
||||
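The generator script above joins every dictionary key into one alternation pattern per group, for example `\b(?:...)\b` for `WholeWords`. The diff does not show how those patterns are consumed, so the following is only a sketch of one plausible use: a hypothetical `fix_whole_words` helper that looks each matched word up in the replacement dictionary.

```python
import re


def build_whole_word_pattern(replacements):
    """Compile one regex matching any key as a whole word, or None if empty."""
    if not replacements:
        return None
    alternatives = "|".join(re.escape(key) for key in replacements)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.MULTILINE)


def fix_whole_words(text, replacements):
    """Replace every whole-word occurrence of a key with its mapped value."""
    pattern = build_whole_word_pattern(replacements)
    if pattern is None:
        return text
    # The matched text is itself the dictionary key, so a lookup suffices.
    return pattern.sub(lambda match: replacements[match.group(0)], text)


fixes = {"compiete": "complete", "Tun": "Run"}  # taken from SZ_FIX_DATA above
print(fix_whole_words("Tun! It is compiete. Tuna is fine.", fixes))
```

Compiling one combined pattern means each line is scanned once, however many entries the dictionary holds.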
@@ -0,0 +1,10 @@
|
||||
# coding=utf-8
|
||||
|
||||
from babelfish import Language
|
||||
from data import data
|
||||
|
||||
#for lang, data in data.iteritems():
|
||||
# print Language.fromietf(lang).alpha2
|
||||
|
||||
for find, rep in data["dan"].iteritems():
|
||||
print find, rep
|
||||
@@ -0,0 +1,638 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<Word from="Haner" to="Han er" />
|
||||
<Word from="JaveL" to="Javel" />
|
||||
<Word from="Pa//e" to="Palle" />
|
||||
<Word from="bffte" to="bitte" />
|
||||
<Word from="Utro//gt" to="Utroligt" />
|
||||
<Word from="Kommerdu" to="Kommer du" />
|
||||
<Word from="smi/er" to="smiler" />
|
||||
<Word from="/eg" to="leg" />
|
||||
<Word from="harvinger" to="har vinger" />
|
||||
<Word from="/et" to="let" />
|
||||
<Word from="erjeres" to="er jeres" />
|
||||
<Word from="hardet" to="har det" />
|
||||
<Word from="tænktjer" to="tænkt jer" />
|
||||
<Word from="erjo" to="er jo" />
|
||||
<Word from="sti/" to="stil" />
|
||||
<Word from="Iappe" to="lappe" />
|
||||
<Word from="Beklagelç" to="Beklager," />
|
||||
<Word from="vardet" to="var det" />
|
||||
<Word from="afden" to="af den" />
|
||||
<Word from="snupperjeg" to="snupper jeg" />
|
||||
<Word from="ikkejeg" to="ikke jeg" />
|
||||
<Word from="bliverjeg" to="bliver jeg" />
|
||||
<Word from="hartravit" to="har travlt" />
|
||||
<Word from="pandekagef/ag" to="pandekageflag" />
|
||||
<Word from="Stormvarsell" to="Stormvarsel!" />
|
||||
<Word from="stormvejn" to="stormvejr." />
|
||||
<Word from="morgenkomp/et" to="morgenkomplet" />
|
||||
<Word from="/yv" to="lyv" />
|
||||
<Word from="varjo" to="var jo" />
|
||||
<Word from="/eger" to="leger" />
|
||||
<Word from="harjeg" to="har jeg" />
|
||||
<Word from="havdejeg" to="havde jeg" />
|
||||
<Word from="hvorjeg" to="hvor jeg" />
|
||||
<Word from="nårjeg" to="når jeg" />
|
||||
<Word from="gårvi" to="går vi" />
|
||||
<Word from="atjeg" to="at jeg" />
|
||||
<Word from="isine" to="i sine" />
|
||||
<Word from="fårjeg" to="får jeg" />
|
||||
<Word from="kærtighed" to="kærlighed" />
|
||||
<Word from="skullejeg" to="skulle jeg" />
|
||||
<Word from="laest" to="læst" />
|
||||
<Word from="laese" to="læse" />
|
||||
<Word from="gørjeg" to="gør jeg" />
|
||||
<Word from="gørvi" to="gør vi" />
|
||||
<Word from="angrerjo" to="angrer jo" />
|
||||
<Word from="Hvergang" to="Hver gang" />
|
||||
<Word from="erder" to="er der" />
|
||||
<Word from="villetilgive" to="ville tilgive" />
|
||||
<Word from="fieme" to="fjeme" />
|
||||
<Word from="genopståri" to="genopstår i" />
|
||||
<Word from="svigtejer" to="svigte jer" />
|
||||
<Word from="kommernu" to="kommer nu" />
|
||||
<Word from="nårman" to="når man" />
|
||||
<Word from="erfire" to="er fire" />
|
||||
<Word from="Hvorforfinderdu" to="Hvorfor finder du" />
|
||||
<Word from="undertigt" to="underligt" />
|
||||
<Word from="itroen" to="i troen" />
|
||||
<Word from="erløgnt" to="er løgn!" />
|
||||
<Word from="gørden" to="gør den" />
|
||||
<Word from="forhelvede" to="for helvede" />
|
||||
<Word from="hjpe" to="hjælpe" />
|
||||
<Word from="togeti" to="toget i" />
|
||||
<Word from="Måjeg" to="Må jeg" />
|
||||
<Word from="savnerjer" to="savner jer" />
|
||||
<Word from="erjeg" to="er jeg" />
|
||||
<Word from="vaere" to="være" />
|
||||
<Word from="geme" to="gerne" />
|
||||
<Word from="trorpå" to="tror på" />
|
||||
<Word from="forham" to="for ham" />
|
||||
<Word from="afham" to="af ham" />
|
||||
<Word from="harjo" to="har jo" />
|
||||
<Word from="ovemafiet" to="overnattet" />
|
||||
<Word from="begaefiighed" to="begærlighed" />
|
||||
<Word from="sy’g" to="syg" />
|
||||
<Word from="Imensjeg" to="Imens jeg" />
|
||||
<Word from="bliverdu" to="bliver du" />
|
||||
<Word from="fiser" to="fiser" />
|
||||
<Word from="manipuierer" to="manipulerer" />
|
||||
<Word from="forjeg" to="for jeg" />
|
||||
<Word from="iivgivendefor" to="livgivende for" />
|
||||
<Word from="formig" to="for mig" />
|
||||
<Word from="Hardu" to="Har du" />
|
||||
<Word from="fornold" to="forhold" />
|
||||
<Word from="defrelste" to="de frelste" />
|
||||
<Word from="Såjeg" to="Så jeg" />
|
||||
<Word from="varjeg" to="var jeg" />
|
||||
<Word from="gørved" to="gør ved" />
|
||||
<Word from="kalderjeg" to="kalder jeg" />
|
||||
<Word from="flytte" to="flytte" />
|
||||
<Word from="handlerdet" to="handler det" />
|
||||
<Word from="trorjeg" to="tror jeg" />
|
||||
<Word from="flytter" to="flytter" />
|
||||
<Word from="soverjeg" to="sover jeg" />
|
||||
<Word from="finderud" to="finder ud" />
|
||||
<Word from="naboerpå" to="naboer på" />
|
||||
<Word from="ervildt" to="er vildt" />
|
||||
<Word from="væreher" to="være her" />
|
||||
<Word from="hyggerjer" to="hygger jer" />
|
||||
<Word from="borjo" to="bor jo" />
|
||||
<Word from="kommerikke" to="kommer ikke" />
|
||||
<Word from="folkynde" to="forkynde" />
|
||||
<Word from="farglad" to="far glad" />
|
||||
<Word from="misterjeg" to="mister jeg" />
|
||||
<Word from="fint" to="fint" />
|
||||
<Word from="Harl" to="Har I" />
|
||||
<Word from="bedejer" to="bede jer" />
|
||||
<Word from="synesjeg" to="synes jeg" />
|
||||
<Word from="vartil" to="var til" />
|
||||
<Word from="eren" to="er en" />
|
||||
<Word from="\Al" to="Vil" />
|
||||
<Word from="\A" to="Vi" />
|
||||
<Word from="fjeme" to="fjerne" />
|
||||
<Word from="Iigefyldt" to="lige fyldt" />
|
||||
<Word from="ertil" to="er til" />
|
||||
<Word from="fafiigt" to="farligt" />
|
||||
<Word from="finder" to="finder" />
|
||||
<Word from="findes" to="findes" />
|
||||
<Word from="irettesaefielse" to="irettesættelse" />
|
||||
<Word from="ermed" to="er med" />
|
||||
<Word from="èn" to="én" />
|
||||
<Word from="gikjoi" to="gik jo i" />
|
||||
<Word from="Hvisjeg" to="Hvis jeg" />
|
||||
<Word from="ovemafier" to="overnatter" />
|
||||
<Word from="hoident" to="holdent" />
|
||||
<Word from="\Adne" to="Vidne" />
|
||||
<Word from="fori" to="for i" />
|
||||
<Word from="vei" to="vel" />
|
||||
<Word from="savnerjerjo" to="savner jer jo" />
|
||||
<Word from="elskerjer" to="elsker jer" />
|
||||
<Word from="harløjet" to="har løjet" />
|
||||
<Word from="eri" to="er i" />
|
||||
<Word from="fiende" to="fjende" />
|
||||
<Word from="derjo" to="der jo" />
|
||||
<Word from="sigerjo" to="siger jo" />
|
||||
<Word from="menerjeg" to="mener jeg" />
|
||||
<Word from="Harjeg" to="Har jeg" />
|
||||
<Word from="sigerjeg" to="siger jeg" />
|
||||
<Word from="splitterjeg" to="splitter jeg" />
|
||||
<Word from="erjournalist" to="er journalist" />
|
||||
<Word from="erjoumalist" to="er journalist" />
|
||||
<Word from="Forjeg" to="For jeg" />
|
||||
<Word from="gârjeg" to="går jeg" />
|
||||
<Word from="Nârjeg" to="Når jeg" />
|
||||
<Word from="afllom" to="afkom" />
|
||||
<Word from="farerjo" to="farer jo" />
|
||||
<Word from="tagerjeg" to="tager jeg" />
|
||||
<Word from="Virkerjeg" to="Virker jeg" />
|
||||
<Word from="morerjer" to="morer jer" />
|
||||
<Word from="kommerjo" to="kommer jo" />
|
||||
<Word from="istand" to="i stand" />
|
||||
<Word from="bøm" to="børn" />
|
||||
<Word from="frygterjeg" to="frygter jeg" />
|
||||
<Word from="kommerjeg" to="kommer jeg" />
|
||||
<Word from="eriournalistelev" to="er journalistelev" />
|
||||
<Word from="harfat" to="har fat" />
|
||||
<Word from="fårfingre" to="får fingre" />
|
||||
<Word from="slârjeg" to="slår jeg" />
|
||||
<Word from="bam" to="barn" />
|
||||
<Word from="erjournalistelev" to="er journalistelev" />
|
||||
<Word from="politietjo" to="politiet jo" />
|
||||
<Word from="elskerjo" to="elsker jo" />
|
||||
<Word from="vari" to="var i" />
|
||||
<Word from="fornemmerjeres" to="fornemmer jeres" />
|
||||
<Word from="udklækketl" to="udklækket!" />
|
||||
<Word from="í" to="i" />
|
||||
<Word from="nyi" to="ny i" />
|
||||
<Word from="Iumijelse" to="fornøjelse" />
|
||||
<Word from="vures" to="vores" />
|
||||
<Word from="I/Vashíngtan" to="Washington" />
|
||||
<Word from="opleverjeg" to="oplever jeg" />
|
||||
<Word from="PANTELÃNER" to="PANTELÅNER" />
|
||||
<Word from="Gudmurgen" to="Godmorgen" />
|
||||
<Word from="SKYDEVÃBEN" to="SKYDEVÅBEN" />
|
||||
<Word from="PÃLIDELIG" to="PÅLIDELIG" />
|
||||
<Word from="avertalte" to="overtalte" />
|
||||
<Word from="Omsíder" to="Omsider" />
|
||||
<Word from="lurtebåd" to="lortebåd" />
|
||||
<Word from="Telrslning" to="Tekstning" />
|
||||
<Word from="miUø" to="miljø" />
|
||||
<Word from="gåri" to="går i" />
|
||||
<Word from="Fan/el" to="Farvel" />
|
||||
<Word from="abefiæs" to="abefjæs" />
|
||||
<Word from="hartalt" to="har talt" />
|
||||
<Word from="\Årkelig" to="Virkelig" />
|
||||
<Word from="beklagerjeg" to="beklager jeg" />
|
||||
<Word from="Nårjeg" to="Når jeg" />
|
||||
<Word from="rnaend" to="mænd" />
|
||||
<Word from="vaskebjorn" to="vaskebjørn" />
|
||||
<Word from="Ivil" to="I vil" />
|
||||
<Word from="besog" to="besøg" />
|
||||
<Word from="Vaer" to="Vær" />
|
||||
<Word from="Undersogte" to="Undersøgte" />
|
||||
<Word from="modte" to="mødte" />
|
||||
<Word from="toj" to="tøj" />
|
||||
<Word from="fodt" to="født" />
|
||||
<Word from="gore" to="gøre" />
|
||||
<Word from="provede" to="prøvede" />
|
||||
<Word from="forste" to="første" />
|
||||
<Word from="igang" to="i gang" />
|
||||
<Word from="ligenu" to="lige nu" />
|
||||
<Word from="clet" to="det" />
|
||||
<Word from="Strombell" to="Strombel!" />
|
||||
<Word from="tmvlt" to="travlt" />
|
||||
<Word from="studererjournalistik" to="studerer journalistik" />
|
||||
<Word from="inforrnererjeg" to="informerer jeg" />
|
||||
<Word from="omkfing" to="omkring" />
|
||||
<Word from="tilAsgård" to="til Asgård" />
|
||||
<Word from="Kederjeg" to="Keder jeg" />
|
||||
<Word from="jaettetamp" to="jættetamp" />
|
||||
<Word from="erjer" to="er jer" />
|
||||
<Word from="atjulehygge" to="at julehygge" />
|
||||
<Word from="Ueneste" to="tjeneste" />
|
||||
<Word from="foltsaetter" to="fortsætter" />
|
||||
<Word from="A/ice" to="Alice" />
|
||||
<Word from="tvivlerjeg" to="tvivler jeg" />
|
||||
<Word from="henterjer" to="henter jer" />
|
||||
<Word from="forstårjeg" to="forstår jeg" />
|
||||
<Word from="hvisjeg" to="hvis jeg" />
|
||||
<Word from="/ært" to="lært" />
|
||||
<Word from="vfgtrgt" to="vigtigt" />
|
||||
<Word from="hurtigtjeg" to="hurtigt jeg" />
|
||||
<Word from="kenderjo" to="kender jo" />
|
||||
<Word from="seiv" to="selv" />
|
||||
<Word from="/ægehuset" to="lægehuset" />
|
||||
<Word from="herjo" to="her jo" />
|
||||
<Word from="stolerjeg" to="stoler jeg" />
|
||||
<Word from="digi" to="dig i" />
|
||||
<Word from="taberi" to="taber i" />
|
||||
<Word from="slårjeres" to="slår jeres" />
|
||||
<Word from="laere" to="lære" />
|
||||
<Word from="trænerwushu" to="træner wushu" />
|
||||
<Word from="efterjeg" to="efter jeg" />
|
||||
<Word from="efier" to="efter" />
|
||||
<Word from="dui" to="du i" />
|
||||
<Word from="afien" to="aften" />
|
||||
<Word from="bliveri" to="bliver i" />
|
||||
<Word from="acceptererjer" to="accepterer jer" />
|
||||
<Word from="drikkerjo" to="drikker jo" />
|
||||
<Word from="fianjin" to="Tianjin" />
|
||||
<Word from="erlænge" to="er længe" />
|
||||
<Word from="erikke" to="er ikke" />
|
||||
<Word from="medjer" to="med jer" />
|
||||
<Word from="Tmykke" to="Tillykke" />
|
||||
<Word from="'fianjins" to="Tianjins" />
|
||||
<Word from="Mesteri" to="Mester i" />
|
||||
<Word from="sagdetil" to="sagde til" />
|
||||
<Word from="indei" to="inde i" />
|
||||
<Word from="ofie" to="ofte" />
|
||||
<Word from="'filgiv" to="Tilgiv" />
|
||||
<Word from="Lfår" to="I får" />
|
||||
<Word from="viserjer" to="viser jer" />
|
||||
<Word from="Rejsjerblot" to="Rejs jer blot" />
|
||||
<Word from="'fillad" to="Tillad" />
|
||||
<Word from="iiiiefinger" to="lillefinger" />
|
||||
<Word from="VILOMFATTE" to="VIL OMFATTE" />
|
||||
<Word from="mofio" to="motto" />
|
||||
<Word from="gørjer" to="gør jer" />
|
||||
<Word from="gifi" to="gift" />
|
||||
<Word from="hardu" to="har du" />
|
||||
<Word from="gifi" to="gift" />
|
||||
<Word from="Iaeggerjeg" to="lægger jeg" />
|
||||
<Word from="iet" to="i et" />
|
||||
<Word from="sv/yte" to="svigte" />
|
||||
<Word from="ti/" to="til" />
|
||||
<Word from="Wdal" to="Vidal" />
|
||||
<Word from="fiået" to="fået" />
|
||||
<Word from="Hvo/for" to="Hvorfor" />
|
||||
<Word from="hellerikke" to="heller ikke" />
|
||||
<Word from="Wlle" to="Ville" />
|
||||
<Word from="dr/ver" to="driver" />
|
||||
<Word from="V\fllliam" to="William" />
|
||||
<Word from="V\fllliams" to="Williams" />
|
||||
<Word from="Vkfilliam" to="William" />
|
||||
<Word from="vådejakke" to="våde jakke" />
|
||||
<Word from="kæfll" to="kæft!" />
|
||||
<Word from="sagdejeg" to="sagde jeg" />
|
||||
<Word from="oven/ejet" to="overvejet" />
|
||||
<Word from="karameisauce" to="karamelsauce" />
|
||||
<Word from="Lfølgejødisk" to="Ifølge jødisk" />
|
||||
<Word from="blevjo" to="blev jo" />
|
||||
<Word from="asiateri" to="asiater i" />
|
||||
<Word from="erV\fllliam" to="er William" />
|
||||
<Word from="lidtflov" to="lidt flov" />
|
||||
<Word from="sagdejo" to="sagde jo" />
|
||||
<Word from="erlige" to="er lige" />
|
||||
<Word from="Vtfilliam" to="William" />
|
||||
<Word from="WfiII" to="Will" />
|
||||
<Word from="afldarede" to="afklarede" />
|
||||
<Word from="hjæiperjeg" to="hjælper jeg" />
|
||||
<Word from="laderjeg" to="lader jeg" />
|
||||
<Word from="Hândledsbeskyttere" to="Håndledsbeskyttere" />
|
||||
<Word from="Lsabels" to="Isabels" />
|
||||
<Word from="Gørjeg" to="Gør jeg" />
|
||||
<Word from="mâjeg" to="må jeg" />
|
||||
<Word from="ogjeg" to="og jeg" />
|
||||
<Word from="gjordejeg" to="gjorde jeg" />
|
||||
<Word from="villejeg" to="ville jeg" />
|
||||
<Word from="Vlfllliams" to="Williams" />
|
||||
<Word from="Dajeg" to="Da jeg" />
|
||||
<Word from="iorden" to="i orden" />
|
||||
<Word from="fandtjeg" to="fandt jeg" />
|
||||
<Word from="Tilykke" to="Tillykke" />
|
||||
<Word from="kørerjer" to="kører jer" />
|
||||
<Word from="gøfjeg" to="gør jeg" />
|
||||
<Word from="Selvflgelig" to="Selvfølgelig" />
|
||||
<Word from="fdder" to="fadder" />
|
||||
<Word from="bnfaldt" to="bønfaldt" />
|
||||
<Word from="t\/ehovedede" to="tvehovedede" />
|
||||
<Word from="EIler" to="Eller" />
|
||||
<Word from="ringerjeg" to="ringer jeg" />
|
||||
<Word from="blevvæk" to="blev væk" />
|
||||
<Word from="stárjeg" to="står jeg" />
|
||||
<Word from="varforbi" to="var forbi" />
|
||||
<Word from="harfortalt" to="har fortalt" />
|
||||
<Word from="iflere" to="i flere" />
|
||||
<Word from="tørjeg" to="tør jeg" />
|
||||
<Word from="kunnejeg" to="kunne jeg" />
|
||||
<Word from="má" to="må" />
|
||||
<Word from="hartænkt" to="har tænkt" />
|
||||
<Word from="Fárjeg" to="Får jeg" />
|
||||
<Word from="afdelingervar" to="afdelinger var" />
|
||||
<Word from="0rd" to="ord" />
|
||||
<Word from="pástá" to="påstå" />
|
||||
<Word from="gráharet" to="gråharet" />
|
||||
<Word from="varforbløffende" to="var forbløffende" />
|
||||
<Word from="holdtjeg" to="holdt jeg" />
|
||||
<Word from="hængerjo" to="hænger jo" />
|
||||
<Word from="fikjeg" to="fik jeg" />
|
||||
<Word from="fár" to="får" />
|
||||
<Word from="Hvorforfølerjeg" to="Hvorfor føler jeg" />
|
||||
<Word from="harfeber" to="har feber" />
|
||||
<Word from="ándssvagt" to="åndssvagt" />
|
||||
<Word from="0g" to="Og" />
|
||||
<Word from="vartre" to="var tre" />
|
||||
<Word from="abner" to="åbner" />
|
||||
<Word from="garjeg" to="går jeg" />
|
||||
<Word from="sertil" to="ser til" />
|
||||
<Word from="hvorfin" to="hvor fin" />
|
||||
<Word from="harfri" to="har fri" />
|
||||
<Word from="forstarjeg" to="forstår jeg" />
|
||||
<Word from="Sä" to="Så" />
|
||||
<Word from="hvorfint" to="hvor fint" />
|
||||
<Word from="mærkerjeg" to="mærker jeg" />
|
||||
<Word from="ogsa" to="også" />
|
||||
<Word from="nárjeg" to="når jeg" />
|
||||
<Word from="Jasá" to="Jaså" />
|
||||
<Word from="bándoptager" to="båndoptager" />
|
||||
<Word from="bedárende" to="bedårende" />
|
||||
<Word from="sá" to="så" />
|
||||
<Word from="nár" to="når" />
|
||||
<Word from="kunnejo" to="kunne jo" />
|
||||
<Word from="Brammertil" to="Brammer til" />
|
||||
<Word from="serjeg" to="ser jeg" />
|
||||
<Word from="gikjeg" to="gik jeg" />
|
||||
<Word from="udholderjeg" to="udholder jeg" />
|
||||
<Word from="máneder" to="måneder" />
|
||||
<Word from="vartræt" to="var træt" />
|
||||
<Word from="dárligt" to="dårligt" />
|
||||
<Word from="klaretjer" to="klaret jer" />
|
||||
<Word from="pavirkelig" to="påvirkelig" />
|
||||
<Word from="spekulererjeg" to="spekulerer jeg" />
|
||||
<Word from="forsøgerjeg" to="forsøger jeg" />
|
||||
<Word from="huskerjeg" to="husker jeg" />
|
||||
<Word from="ifavnen" to="i favnen" />
|
||||
<Word from="skullejo" to="skulle jo" />
|
||||
<Word from="vartung" to="var tung" />
|
||||
<Word from="varfuldstændig" to="var fuldstændig" />
|
||||
<Word from="Paskedag" to="Påskedag" />
|
||||
<Word from="turi" to="tur i" />
|
||||
<Word from="spillerschumanns" to="spiller Schumanns" />
|
||||
<Word from="forstárjeg" to="forstår jeg" />
|
||||
<Word from="istedet" to="i stedet" />
|
||||
<Word from="nárfrem" to="når frem" />
|
||||
<Word from="habertrods" to="håber trods" />
|
||||
<Word from="forførste" to="for første" />
|
||||
<Word from="varto" to="var to" />
|
||||
<Word from="overtil" to="over til" />
|
||||
<Word from="forfem" to="for fem" />
|
||||
<Word from="holdtjo" to="holdt jo" />
|
||||
<Word from="passerjo" to="passer jo" />
|
||||
<Word from="ellerto" to="eller to" />
|
||||
<Word from="hartrods" to="har trods" />
|
||||
<Word from="harfuldstændig" to="har fuldstændig" />
|
||||
<Word from="gårjeg" to="går jeg" />
|
||||
<Word from="giderjeg" to="gider jeg" />
|
||||
<Word from="forjer" to="for jer" />
|
||||
<Word from="erindrerjeg" to="erindrer jeg" />
|
||||
<Word from="tænkerjeg" to="tænker jeg" />
|
||||
<Word from="GAEt" to="GÅET" />
|
||||
<Word from="hørerjo" to="hører jo" />
|
||||
<Word from="forladerjeg" to="forlader jeg" />
|
||||
<Word from="kosterjo" to="koster jo" />
|
||||
<Word from="fortællerjeg" to="fortæller jeg" />
|
||||
<Word from="Forstyrrerjeg" to="Forstyrrer jeg" />
|
||||
<Word from="tjekkerjeg" to="tjekker jeg" />
|
||||
<Word from="erjurist" to="er jurist" />
|
||||
<Word from="tlLBUD" to="TILBUD" />
|
||||
<Word from="serjo" to="ser jo" />
|
||||
<Word from="bederjeg" to="beder jeg" />
|
||||
<Word from="bilderjeg" to="bilder jeg" />
|
||||
<Word from="ULVEtlME" to="ULVETIME" />
|
||||
<Word from="skærerjo" to="skærer jo" />
|
||||
<Word from="afjer" to="af jer" />
|
||||
<Word from="ordnerjeg" to="ordner jeg" />
|
||||
<Word from="giverjeg" to="giver jeg" />
|
||||
<Word from="rejservi" to="rejser vi" />
|
||||
<Word from="fangerjeg" to="fanger jeg" />
|
||||
<Word from="erjaloux" to="er jaloux" />
|
||||
<Word from="glemmerjeg" to="glemmer jeg" />
|
||||
<Word from="Behøverjeg" to="Behøver jeg" />
|
||||
<Word from="harvi" to="har vi" />
|
||||
<Word from="ertyndere" to="er tyndere" />
|
||||
<Word from="fårtordenvejr" to="får tordenvejr" />
|
||||
<Word from="varfærdig" to="var færdig" />
|
||||
<Word from="hørerfor" to="hører for" />
|
||||
<Word from="varvel" to="var vel" />
|
||||
<Word from="erforbi" to="er forbi" />
|
||||
<Word from="AIle" to="Alle" />
|
||||
<Word from="læserjo" to="læser jo" />
|
||||
<Word from="Edgarer" to="Edgar er" />
|
||||
<Word from="hartaget" to="har taget" />
|
||||
<Word from="derer" to="der er" />
|
||||
<Word from="stikkerfrem" to="stikker frem" />
|
||||
<Word from="haraldrig" to="har aldrig" />
|
||||
<Word from="ellerfar" to="eller far" />
|
||||
<Word from="erat" to="er at" />
|
||||
<Word from="turtil" to="tur til" />
|
||||
<Word from="erfærdig" to="er færdig" />
|
||||
<Word from="følerjeg" to="føler jeg" />
|
||||
<Word from="jerfra" to="jer fra" />
|
||||
<Word from="eralt" to="er alt" />
|
||||
<Word from="harfaktisk" to="har faktisk" />
|
||||
<Word from="harfundet" to="har fundet" />
|
||||
<Word from="harvendt" to="har vendt" />
|
||||
<Word from="Kunstneraf" to="Kunstner af" />
|
||||
<Word from="ervel" to="er vel" />
|
||||
<Word from="ståransigt" to="står ansigt" />
|
||||
<Word from="Erjeg" to="Er jeg" />
|
||||
<Word from="venterjeg" to="venter jeg" />
|
||||
<Word from="Hvorvar" to="Hvor var" />
|
||||
<Word from="varfint" to="var fint" />
|
||||
<Word from="ervarmt" to="er varmt" />
|
||||
<Word from="gårfint" to="går fint" />
|
||||
<Word from="flyverforbi" to="flyver forbi" />
|
||||
<Word from="Dervar" to="Der var" />
|
||||
<Word from="dervar" to="der var" />
|
||||
<Word from="meneråndeligt" to="mener åndeligt" />
|
||||
<Word from="forat" to="for at" />
|
||||
<Word from="herovertil" to="herover til" />
|
||||
<Word from="soverfor" to="sover for" />
|
||||
<Word from="begyndtejeg" to="begyndte jeg" />
|
||||
<Word from="vendertilbage" to="vender tilbage" />
|
||||
<Word from="erforfærdelig" to="er forfærdelig" />
|
||||
<Word from="gøraltid" to="gør altid" />
|
||||
<Word from="ertilbage" to="er tilbage" />
|
||||
<Word from="harværet" to="har været" />
|
||||
<Word from="bagoverellertil" to="bagover eller til" />
|
||||
<Word from="hertaler" to="her taler" />
|
||||
<Word from="vågnerjeg" to="vågner jeg" />
|
||||
<Word from="vartomt" to="var tomt" />
|
||||
<Word from="gårfrem" to="går frem" />
|
||||
<Word from="talertil" to="taler til" />
|
||||
<Word from="ertryg" to="er tryg" />
|
||||
<Word from="ansigtervendes" to="ansigter vendes" />
|
||||
<Word from="hervirkeligt" to="her virkeligt" />
|
||||
<Word from="herer" to="her er" />
|
||||
<Word from="drømmerjo" to="drømmer jo" />
|
||||
<Word from="erfuldkommen" to="er fuldkommen" />
|
||||
<Word from="hveren" to="hver en" />
|
||||
<Word from="erfej" to="er fej" />
|
||||
<Word from="datterforgæves" to="datter forgæves" />
|
||||
<Word from="forsøgerjo" to="forsøger jo" />
|
||||
<Word from="ertom" to="er tom" />
|
||||
<Word from="vareftermiddag" to="var eftermiddag" />
|
||||
<Word from="vartom" to="var tom" />
|
||||
<Word from="angerellerforventninger" to="anger eller forventninger" />
|
||||
<Word from="kørtejeg" to="kørte jeg" />
|
||||
<Word from="Hvorforfortæller" to="Hvorfor fortæller" />
|
||||
<Word from="gårtil" to="går til" />
|
||||
<Word from="ringerefter" to="ringer efter" />
|
||||
<Word from="søgertilflugt" to="søger tilflugt" />
|
||||
<Word from="ertvunget" to="er tvunget" />
|
||||
<Word from="megetjeg" to="meget jeg" />
|
||||
<Word from="varikke" to="var ikke" />
|
||||
<Word from="Derermange" to="Der er mange" />
|
||||
<Word from="dervilhindre" to="der vil hindre" />
|
||||
<Word from="erså" to="er så" />
|
||||
<Word from="DetforstårLeggodt" to="Det forstår jeg godt" />
|
||||
<Word from="ergodt" to="er godt" />
|
||||
<Word from="vorventen" to="vor venten" />
|
||||
<Word from="tagerfejl" to="tager fejl" />
|
||||
<Word from="ellerer" to="eller er" />
|
||||
<Word from="laverjeg" to="laver jeg" />
|
||||
<Word from="0mgang" to="omgang" />
|
||||
<Word from="afstár" to="afstår" />
|
||||
<Word from="pá" to="på" />
|
||||
<Word from="rejserjeg" to="rejser jeg" />
|
||||
<Word from="ellertage" to="eller tage" />
|
||||
<Word from="takkerjeg" to="takker jeg" />
|
||||
<Word from="ertilfældigvis" to="er tilfældigvis" />
|
||||
<Word from="fremstar" to="fremstår" />
|
||||
<Word from="ertæt" to="er tæt" />
|
||||
<Word from="ijeres" to="i jeres" />
|
||||
<Word from="Sagdejeg" to="Sagde jeg" />
|
||||
<Word from="overi" to="over i" />
|
||||
<Word from="plukkerjordbær" to="plukker jordbær" />
|
||||
<Word from="klarerjeg" to="klarer jeg" />
|
||||
<Word from="jerfire" to="jer fire" />
|
||||
<Word from="tábeligste" to="tåbeligste" />
|
||||
<Word from="sigertvillingerne" to="siger tvillingerne" />
|
||||
<Word from="erfaktisk" to="er faktisk" />
|
||||
<Word from="gár" to="går" />
|
||||
<Word from="harvasket" to="har vasket" />
|
||||
<Word from="harplukketjordbærtil" to="har plukket jordbær til" />
|
||||
<Word from="plukketjordbær" to="plukket jordbær" />
|
||||
<Word from="klaverfirehændigt" to="klaver firehændigt" />
|
||||
<Word from="erjævnaldrende" to="er jævnaldrende" />
|
||||
<Word from="tierjeg" to="tier jeg" />
|
||||
<Word from="Hvorerden" to="Hvor er den" />
|
||||
<Word from="0veraltjeg" to="overalt jeg" />
|
||||
<Word from="gårpå" to="går på" />
|
||||
<Word from="finderjeg" to="finder jeg" />
|
||||
<Word from="serhans" to="ser hans" />
|
||||
<Word from="tiderbliver" to="tider bliver" />
|
||||
<Word from="ellertrist" to="eller trist" />
|
||||
<Word from="forstårjeres" to="forstår jeres" />
|
||||
<Word from="Hvorsjælen" to="Hvor sjælen" />
|
||||
<Word from="finderro" to="finder ro" />
|
||||
<Word from="sidderjeg" to="sidder jeg" />
|
||||
<Word from="tagerjo" to="tager jo" />
|
||||
<Word from="efterjeres" to="efter jeres" />
|
||||
<Word from="10O" to="100" />
|
||||
<Word from="besluttedejeg" to="besluttede jeg" />
|
||||
<Word from="varsket" to="var sket" />
|
||||
<Word from="uadskillige" to="uadskillelige" />
|
||||
<Word from="harjetlag" to="har jetlag" />
|
||||
<Word from="lkke" to="Ikke" />
|
||||
<Word from="lntet" to="Intet" />
|
||||
<Word from="afslørerjeg" to="afslører jeg" />
|
||||
<Word from="måjeg" to="må jeg" />
|
||||
<Word from="Vl" to="VI" />
|
||||
<Word from="atbygge" to="at bygge" />
|
||||
<Word from="detmakabre" to="det makabre" />
|
||||
<Word from="vilikke" to="vil ikke" />
|
||||
<Word from="talsmandbekræfter" to="talsmand bekræfter" />
|
||||
<Word from="vedatrenovere" to="ved at renovere" />
|
||||
<Word from="forsøgeratforstå" to="forsøger at forstå" />
|
||||
<Word from="ersket" to="er sket" />
|
||||
<Word from="morderpå" to="morder på" />
|
||||
<Word from="frifodiRosewood" to="fri fod i Rosewood" />
|
||||
<Word from="holdtpressemøde" to="holdt pressemøde" />
|
||||
<Word from="lngen" to="Ingen" />
|
||||
<Word from="lND" to="IND" />
|
||||
<Word from="henterjeg" to="henter jeg" />
|
||||
<Word from="lsabel" to="Isabel" />
|
||||
<Word from="lsabels" to="Isabels" />
|
||||
<Word from="vinderjo" to="vinder jo" />
|
||||
<Word from="rødmerjo" to="rødmer jo" />
|
||||
<Word from="etjakkesæt" to="et jakkesæt" />
|
||||
<Word from="glæderjeg" to="glæder jeg" />
|
||||
<Word from="lgen" to="Igen" />
|
||||
<Word from="lsær" to="Især" />
|
||||
<Word from="iparken" to="i parken" />
|
||||
<Word from="nårl" to="når I" />
|
||||
<Word from="tilA1" to="til A1" />
|
||||
<Word from="FBl" to="FBI" />
|
||||
<Word from="viljo" to="vil jo" />
|
||||
<Word from="detpå" to="det på" />
|
||||
<Word from="KIar" to="Klar" />
|
||||
<Word from="PIan" to="Plan" />
|
||||
<Word from="EIIer" to="Eller" />
|
||||
<Word from="FIot" to="Flot" />
|
||||
<Word from="AIIe" to="Alle" />
|
||||
<Word from="AIt" to="Alt" />
|
||||
<Word from="KIap" to="Klap" />
|
||||
<Word from="PIaza" to="Plaza" />
|
||||
<Word from="SIap" to="Slap" />
|
||||
<Word from="Iå" to="lå" />
|
||||
<Word from="BIing" to="Bling" />
|
||||
<Word from="GIade" to="Glade" />
|
||||
<Word from="Iejrbålssange" to="lejrbålssange" />
|
||||
<Word from="bedtjer" to="bedt jer" />
|
||||
<Word from="hørerjeg" to="hører jeg" />
|
||||
<Word from="Fårjeg" to="Får jeg" />
|
||||
<Word from="fikJames" to="fik James" />
|
||||
<Word from="atsnakke" to="at snakke" />
|
||||
<Word from="varkun" to="var kun" />
|
||||
<Word from="retterjeg" to="retter jeg" />
|
||||
<Word from="ernormale" to="er normale" />
|
||||
<Word from="viljeg" to="vil jeg" />
|
||||
<Word from="Sætjer" to="Sæt jer" />
|
||||
<Word from="udsatham" to="udsat ham" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways>
|
||||
<WordPart from="¤" to="o" />
|
||||
<WordPart from="IVI" to="M" />
|
||||
<WordPart from="lVI" to="M" />
|
||||
<WordPart from="IVl" to="M" />
|
||||
<WordPart from="lVl" to="M" />
|
||||
</PartialWordsAlways>
|
||||
<PartialWords>
|
||||
<!-- Will be used to check words not in dictionary -->
|
||||
<!-- If the resulting word(s) exist in the spelling dictionary, they are accepted -->
|
||||
<WordPart from="fi" to="fi" />
|
||||
<WordPart from="fl" to="fl" />
|
||||
<WordPart from="/" to="l" />
|
||||
<WordPart from="vv" to="w" />
|
||||
<WordPart from="m" to="rn" />
|
||||
<WordPart from="l" to="i" />
|
||||
<WordPart from="€" to="e" />
|
||||
<WordPart from="I" to="l" />
|
||||
<WordPart from="c" to="o" />
|
||||
<WordPart from="i" to="t" />
|
||||
<WordPart from="cc" to="oo" />
|
||||
<WordPart from="ii" to="tt" />
|
||||
<WordPart from="n/" to="ry" />
|
||||
<WordPart from="ae" to="æ" />
|
||||
<!-- "f " will be two words -->
|
||||
<WordPart from="f" to="f " />
|
||||
<WordPart from="c" to="e" />
|
||||
<WordPart from="o" to="e" />
|
||||
<WordPart from="I" to="t" />
|
||||
<WordPart from="n" to="o" />
|
||||
<WordPart from="s" to="e" />
|
||||
<WordPart from="\A" to="Vi" />
|
||||
<WordPart from="n/" to="rv" />
|
||||
<WordPart from="Ã" to="Å" />
|
||||
<WordPart from="í" to="i" />
|
||||
</PartialWords>
|
||||
<PartialLines />
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions />
|
||||
</OCRFixReplaceList>
|
||||
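The comments in the `PartialWords` section above describe a dictionary-gated replacement: a partial rule is tried only on a word that is not in the dictionary, and the result is accepted only if every resulting word is. The following is a minimal sketch of that idea in Python, not the consuming application's actual implementation. The function names `load_rules` and `fix_word`, the inline `SAMPLE` (a few entries copied from the list above), the order in which the sections are applied, and the choice to replace every occurrence of a part at once are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny excerpt in the same shape as the replace list above (illustrative only).
SAMPLE = """<OCRFixReplaceList>
  <WholeWords>
    <Word from="varjeg" to="var jeg" />
  </WholeWords>
  <PartialWordsAlways>
    <WordPart from="lVI" to="M" />
  </PartialWordsAlways>
  <PartialWords>
    <WordPart from="ae" to="æ" />
    <WordPart from="f" to="f " />
  </PartialWords>
</OCRFixReplaceList>"""


def load_rules(xml_text):
    """Parse the three rule sections into plain Python structures."""
    root = ET.fromstring(xml_text)

    def pairs(section, tag):
        node = root.find(section)
        if node is None:
            return []
        return [(e.get("from"), e.get("to")) for e in node.findall(tag)]

    return {
        "whole": dict(pairs("WholeWords", "Word")),
        "always": pairs("PartialWordsAlways", "WordPart"),
        "partial": pairs("PartialWords", "WordPart"),
    }


def fix_word(word, rules, dictionary):
    """Apply the rules to one word (hypothetical ordering, see lead-in)."""
    # Whole-word rules are unconditional exact matches.
    if word in rules["whole"]:
        return rules["whole"][word]
    # "Always" parts are replaced without consulting the dictionary.
    for old, new in rules["always"]:
        word = word.replace(old, new)
    # Words that are already known are left alone.
    if word in dictionary:
        return word
    # Dictionary-gated parts: accept a candidate only if every
    # resulting word is known ("f " splits one word into two).
    for old, new in rules["partial"]:
        if old in word:
            candidate = word.replace(old, new)
            if all(part in dictionary for part in candidate.split()):
                return candidate
    return word


rules = load_rules(SAMPLE)
known = {"lære", "af", "jer"}
print(fix_word("varjeg", rules, known))  # var jeg
print(fix_word("laere", rules, known))   # lære
print(fix_word("afjer", rules, known))   # af jer
print(fix_word("afjxr", rules, known))   # afjxr (candidate rejected)
```

The dictionary check is what makes the very broad entries in `PartialWords` (for example `l` to `i`) tolerable: without it they would rewrite correct words as well.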
+6865
File diff suppressed because it is too large
+2341
File diff suppressed because it is too large
+1032
File diff suppressed because it is too large
+270
@@ -0,0 +1,270 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<Word from="@immatriculation" to="d'immatriculation" />
|
||||
<Word from="acquer" to="acquér" />
|
||||
<Word from="acteurjoue" to="acteur joue" />
|
||||
<Word from="aerien" to="aérien" />
|
||||
<Word from="agreable" to="agréable" />
|
||||
<Word from="aientjamais" to="aient jamais" />
|
||||
<Word from="AII" to="All" />
|
||||
<Word from="aitjamais" to="ait jamais" />
|
||||
<Word from="aitjus" to="ait jus" />
|
||||
<Word from="alle" to="allé" />
|
||||
<Word from="alles" to="allés" />
|
||||
<Word from="appele" to="appelé" />
|
||||
<Word from="apres" to="après" />
|
||||
<Word from="aujourdhui" to="aujourd'hui" />
|
||||
<Word from="aupres" to="auprès" />
|
||||
<Word from="beaute" to="beauté" />
|
||||
<Word from="cabossee" to="cabossée" />
|
||||
<Word from="carj'" to="car j'" />
|
||||
<Word from="Carj'" to="Car j'" />
|
||||
<Word from="carla" to="car la" />
|
||||
<Word from="CEdipe" to="Œdipe" />
|
||||
<Word from="Cest" to="C'est" />
|
||||
<Word from="c'etaient" to="c'étaient" />
|
||||
<Word from="Cétaient" to="C'étaient" />
|
||||
<Word from="c'etait" to="c'était" />
|
||||
<Word from="C'etait" to="C'était" />
|
||||
<Word from="Cétait" to="C'était" />
|
||||
<Word from="choregraphiee" to="chorégraphiée" />
|
||||
<Word from="cinema" to="cinéma" />
|
||||
<Word from="cl'AIcatraz" to="d'Alcatraz" />
|
||||
<Word from="cles" to="clés" />
|
||||
<Word from="cœurjoie" to="cœur-joie" />
|
||||
<Word from="completer" to="compléter" />
|
||||
<Word from="costumiere" to="costumière" />
|
||||
<Word from="cree" to="créé" />
|
||||
<Word from="daccord" to="d'accord" />
|
||||
<Word from="d'AIbert" to="d'Albert" />
|
||||
<Word from="d'AIdous" to="d'Aldous" />
|
||||
<Word from="d'AIec" to="d'Alec" />
|
||||
<Word from="danniversaire" to="d'anniversaire" />
|
||||
<Word from="d'Arra'bida" to="d'Arrabida" />
|
||||
<Word from="d'autodérision" to="d'auto-dérision" />
|
||||
<Word from="dautres" to="d'autres" />
|
||||
<Word from="debattait" to="débattait" />
|
||||
<Word from="decor" to="décor" />
|
||||
<Word from="decorateurs" to="décorateurs" />
|
||||
<Word from="decors" to="décors" />
|
||||
<Word from="defi" to="défi" />
|
||||
<Word from="dejà" to="déjà" />
|
||||
<Word from="déjàm" to="déjà..." />
|
||||
<Word from="dejeunait" to="déjeunait" />
|
||||
<Word from="dengager" to="d'engager" />
|
||||
<Word from="déquipement" to="d'équipement" />
|
||||
<Word from="dérnièré" to="dernière" />
|
||||
<Word from="Desole" to="Désolé" />
|
||||
<Word from="dessayage" to="d'essayage" />
|
||||
<Word from="dessence" to="d'essence" />
|
||||
<Word from="détaient" to="c'étaient" />
|
||||
<Word from="detail" to="détail" />
|
||||
<Word from="dexcellents" to="d'excellents" />
|
||||
<Word from="dexpérience" to="d'expérience" />
|
||||
<Word from="dexpériences" to="d'expériences" />
|
||||
<Word from="d'héro'l'ne" to="d'héroïne" />
|
||||
<Word from="d'idees" to="d'idées" />
|
||||
<Word from="d'intensite" to="d'intensité" />
|
||||
<Word from="dontj" to="dont j" />
|
||||
<Word from="doublaitAlfo" to="doublait Alfo" />
|
||||
<Word from="DrNo" to="Dr No" />
|
||||
<Word from="e'" to="é" />
|
||||
<Word from="ecrit" to="écrit" />
|
||||
<Word from="elegant" to="élégant" />
|
||||
<Word from="Ellé" to="Elle" />
|
||||
<Word from="én" to="en" />
|
||||
<Word from="equipe" to="équipe" />
|
||||
<Word from="erjus" to="er jus" />
|
||||
<Word from="estjamais" to="est jamais" />
|
||||
<Word from="ét" to="et" />
|
||||
<Word from="etaient" to="étaient" />
|
||||
<Word from="etait" to="était" />
|
||||
<Word from="ete" to="été" />
|
||||
<Word from="etiez" to="étiez" />
|
||||
<Word from="etj'" to="et j'" />
|
||||
<Word from="Etj'" to="Et j'" />
|
||||
<Word from="etje" to="et je" />
|
||||
<Word from="Etje" to="Et je" />
|
||||
<Word from="EtsouvenL" to="Et souvent" />
|
||||
<Word from="eviter" to="éviter" />
|
||||
<Word from="Fabsence" to="l'absence" />
|
||||
<Word from="fadapter" to="t'adapter" />
|
||||
<Word from="fadore" to="j'adore" />
|
||||
<Word from="Fâge" to="l'âge" />
|
||||
<Word from="Fagent" to="l'agent" />
|
||||
<Word from="faiessayé" to="j'ai essayé" />
|
||||
<Word from="Failure" to="l'allure" />
|
||||
<Word from="Fambiance" to="l'ambiance" />
|
||||
<Word from="Famener" to="l'amener" />
|
||||
<Word from="Fanniversaire" to="l'anniversaire" />
|
||||
<Word from="Fapparence" to="l'apparence" />
|
||||
<Word from="Fapres" to="l'après" />
|
||||
<Word from="Faprès" to="l'après" />
|
||||
<Word from="Farmée" to="l'armée" />
|
||||
<Word from="Farrière" to="l'arrière" />
|
||||
<Word from="Farrivée" to="l'arrivée" />
|
||||
<Word from="Fascenseur" to="l'ascenseur" />
|
||||
<Word from="Fascension" to="l'ascension" />
|
||||
<Word from="Fassaut" to="l'assaut" />
|
||||
<Word from="Fassomme" to="l'assomme" />
|
||||
<Word from="Fatmosphère" to="l'atmosphère" />
|
||||
<Word from="Fattention" to="l'attention" />
|
||||
<Word from="Favalanche" to="l'avalanche" />
|
||||
<Word from="Féclairage" to="l'éclairage" />
|
||||
<Word from="Fécran" to="l'écran" />
|
||||
<Word from="Fémotion" to="l'émotion" />
|
||||
<Word from="Femplacement" to="l'emplacement" />
|
||||
<Word from="Fendroit" to="l'endroit" />
|
||||
<Word from="Fenseigne" to="l'enseigne" />
|
||||
<Word from="Fensemble" to="l'ensemble" />
|
||||
<Word from="Fentouraient" to="l'entouraient" />
|
||||
<Word from="Fentrée" to="l'entrée" />
|
||||
<Word from="Fépaisseur" to="l'épaisseur" />
|
||||
<Word from="Fépoque" to="l'époque" />
|
||||
<Word from="Féquipe" to="l'équipe" />
|
||||
<Word from="Fespace" to="l'espace" />
|
||||
<Word from="fespérais" to="j'espérais" />
|
||||
<Word from="Fespère" to="l'espère" />
|
||||
<Word from="Festhétique" to="l'esthétique" />
|
||||
<Word from="Fetranger" to="l'étranger" />
|
||||
<Word from="Févasion" to="l'évasion" />
|
||||
<Word from="Févoque" to="l'évoque" />
|
||||
<Word from="Fexpérience" to="l'expérience" />
|
||||
<Word from="Fexplique" to="l'explique" />
|
||||
<Word from="Fexplosion" to="l'explosion" />
|
||||
<Word from="Fextérieur" to="l'extérieur" />
|
||||
<Word from="Fhabituelle" to="l'habituelle" />
|
||||
<Word from="Fhélicoptère" to="l'hélicoptère" />
|
||||
<Word from="Fhéliport" to="l'héliport" />
|
||||
<Word from="Fhélistation" to="l'hélistation" />
|
||||
<Word from="Fhonneur" to="l'honneur" />
|
||||
<Word from="Fhorloge" to="l'horloge" />
|
||||
<Word from="Fidée" to="l'idée" />
|
||||
<Word from="Fimage" to="l'image" />
|
||||
<Word from="Fimportance" to="l'importance" />
|
||||
<Word from="Fimpression" to="l'impression" />
|
||||
<Word from="Finfluence" to="l'influence" />
|
||||
<Word from="Finscription" to="l'inscription" />
|
||||
<Word from="Fintérieur" to="l'intérieur" />
|
||||
<Word from="Fintrigue" to="l'intrigue" />
|
||||
<Word from="Fobjectif" to="l'objectif" />
|
||||
<Word from="Foccasion" to="l'occasion" />
|
||||
<Word from="Fordre" to="l'ordre" />
|
||||
<Word from="Forigine" to="l'origine" />
|
||||
<Word from="frêre" to="frère" />
|
||||
<Word from="gaylns" to="gaijins" />
|
||||
<Word from="general" to="général" />
|
||||
<Word from="hawaïennel" to="hawaïenne" />
|
||||
<Word from="hawa'l'en" to="hawaïen" />
|
||||
<Word from="Ia" to="la" />
|
||||
<Word from="Ià" to="là" />
|
||||
<Word from="Iaryngotomie" to="laryngotomie" />
|
||||
<Word from="idee" to="idée" />
|
||||
<Word from="idees" to="idées" />
|
||||
<Word from="Ie" to="le" />
|
||||
<Word from="Ies" to="les" />
|
||||
<Word from="Iester" to="Lester" />
|
||||
<Word from="II" to="Il" />
|
||||
<Word from="Iimit" to="limit" />
|
||||
<Word from="IIs" to="Ils" />
|
||||
<Word from="immediatement" to="immédiatement" />
|
||||
<Word from="insufflee" to="insufflée" />
|
||||
<Word from="integrer" to="intégrer" />
|
||||
<Word from="interessante" to="intéressante" />
|
||||
<Word from="Iogions" to="logions" />
|
||||
<Word from="Iorsqu" to="lorsqu" />
|
||||
<Word from="isee" to="isée" />
|
||||
<Word from="Iumiere" to="lumière" />
|
||||
<Word from="Iynchage" to="lynchage" />
|
||||
<Word from="J'espere" to="J'espère" />
|
||||
<Word from="Jessaie" to="J'essaie" />
|
||||
<Word from="j'etais" to="j'étais" />
|
||||
<Word from="J'etais" to="J'étais" />
|
||||
<Word from="latéralémént" to="latéralement" />
|
||||
<Word from="lci" to="Ici" />
|
||||
<Word from="Lci" to="Ici" />
|
||||
<Word from="lé-" to="là-" />
|
||||
<Word from="lepidopteres" to="lépidoptères" />
|
||||
<Word from="litteraire" to="littéraire" />
|
||||
<Word from="ll" to="il" />
|
||||
<Word from="Ll" to="Il" />
|
||||
<Word from="lls" to="ils" />
|
||||
<Word from="Lls" to="Ils" />
|
||||
<Word from="maintenanu" to="maintenant" />
|
||||
<Word from="maniere" to="manière" />
|
||||
<Word from="mariee" to="mariée" />
|
||||
<Word from="Mayer/ing" to="Mayerling" />
|
||||
<Word from="meilleurjour" to="meilleur jour" />
|
||||
<Word from="melange" to="mélange" />
|
||||
<Word from="n'avaiént" to="n'avaient" />
|
||||
<Word from="n'etait" to="n'était" />
|
||||
<Word from="oitjamais" to="oit jamais" />
|
||||
<Word from="oitjus" to="oit jus" />
|
||||
<Word from="ontete" to="ont été" />
|
||||
<Word from="operateur" to="opérateur" />
|
||||
<Word from="ouvérté" to="ouverte" />
|
||||
<Word from="Pépreuve" to="l'épreuve" />
|
||||
<Word from="pere" to="père" />
|
||||
<Word from="plateforme" to="plate-forme" />
|
||||
<Word from="pourjouer" to="pour jouer" />
|
||||
<Word from="precipice" to="précipice" />
|
||||
<Word from="preferes" to="préférés" />
|
||||
<Word from="premierjour" to="premier jour" />
|
||||
<Word from="presenter" to="présenter" />
|
||||
<Word from="prevu" to="prévu" />
|
||||
<Word from="prevue" to="prévue" />
|
||||
<Word from="propriete" to="propriété" />
|
||||
<Word from="protègeraient" to="protégeraient" />
|
||||
<Word from="qué" to="que" />
|
||||
<Word from="qwangoissé" to="qu'angoissé" />
|
||||
<Word from="realisateur" to="réalisateur" />
|
||||
<Word from="reception" to="réception" />
|
||||
<Word from="reévalu" to="réévalu" />
|
||||
<Word from="repute" to="réputé" />
|
||||
<Word from="reussi" to="réussi" />
|
||||
<Word from="s'arrétait" to="s'arrêtait" />
|
||||
<Word from="s'ave'rer" to="s'avérer" />
|
||||
<Word from="scenario" to="scénario" />
|
||||
<Word from="scene" to="scène" />
|
||||
<Word from="scenes" to="scènes" />
|
||||
<Word from="seances" to="séances" />
|
||||
<Word from="sequence" to="séquence" />
|
||||
<Word from="sflécrasa" to="s'écrasa" />
|
||||
<Word from="speciale" to="spéciale" />
|
||||
<Word from="Supen" to="Super" />
|
||||
<Word from="torturee" to="torturée" />
|
||||
<Word from="Uadmirable" to="L'admirable" />
|
||||
<Word from="Uensemblier" to="L'ensemblier" />
|
||||
<Word from="Uexplosion" to="L'explosion" />
|
||||
<Word from="Uouvre" to="L'ouvre" />
|
||||
<Word from="Vaise" to="l'aise" />
|
||||
<Word from="vecu" to="vécu" />
|
||||
<Word from="vehicules" to="véhicules" />
|
||||
<Word from="Ÿappréciais" to="J'appréciais" />
|
||||
<Word from="Ÿespère" to="J'espère" />
|
||||
<Word from="ÿétrangle" to="s'étrangle" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways />
|
||||
<PartialWords />
|
||||
<PartialLines>
|
||||
<LinePart from=" I'" to=" l'" />
|
||||
<LinePart from=" |'" to=" l'" />
|
||||
</PartialLines>
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines>
|
||||
<Line from="&quot;D'ac:c:ord.&quot;" to="&quot;D'accord.&quot;" />
|
||||
<Line from="“i QUÎ gagne, qui perd," to="ni qui gagne, qui perd," />
|
||||
<Line from="L'ac:c:ent est mis 
 
 sur son trajet jusqu'en Suisse." to="L'accent est mis 
 
 sur son trajet jusqu'en Suisse." />
|
||||
<Line from="C'est la plus gentille chose 
 
 qu'Hitchc:oc:k m'ait jamais dite." to="C'est la plus gentille chose 
 
 qu'Hitchcock m'ait jamais dite." />
|
||||
<Line from="Tout le monde, en revanche, qualifie 
 
 Goldfinger d'aventu re structurée," to="Tout le monde, en revanche, qualifie 
 
 Goldfinger d'aventure structurée," />
|
||||
<Line from="et le film Shadow of a man 
 
 a lancé sa carrière au cinéma." to="et le film <i>Shadow of a man</i> 
 
 a lancé sa carrière au cinéma." />
|
||||
<Line from="En 1948, Young est passé à la réalisation 
 
 avec One night with you." to="En 1948, Young est passé à la réalisation 
 
 avec <i>One night with you</i>." />
|
||||
<Line from="Il a construit tous ces véhicules 
 
 à C)c:ala, en Floride." to="Il a construit tous ces véhicules 
 
 à Ocala, en Floride." />
|
||||
<Line from="Tokyo Pop et A Taxing Woman? Return." to="Tokyo Pop et A Taxing Woman's Return." />
|
||||
<Line from="Peter H u nt." to="Peter Hunt." />
|
||||
<Line from="&quot;C'est bien mieux dans Peau. 
 
 On peut sfléclabousser, faire du bruit.&quot;" to="&quot;C'est bien mieux dans l'eau. 
 
 On peut s'éclabousser, faire du bruit.&quot;" />
|
||||
</WholeLines>
|
||||
<RegularExpressions />
|
||||
</OCRFixReplaceList>
|
||||
+2273
File diff suppressed because it is too large
+1442
File diff suppressed because it is too large
+25
@@ -0,0 +1,25 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords />
|
||||
<PartialWordsAlways />
|
||||
<PartialWords />
|
||||
<PartialLines />
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions>
|
||||
<!-- uppercase I / lowercase l fixes -->
|
||||
<RegEx find="([\x41-\x5a\x61-\x7a\xc1-\xfc])II" replaceWith="$1ll" />
|
||||
<RegEx find="II([\x61-\x7a\xe1-\xfc])" replaceWith="ll$1" />
|
||||
<RegEx find="([\x61-\x7a\xe1-\xfc])I" replaceWith="$1l" />
|
||||
<RegEx find="([\x20])I([^aeou\x41-\x5a\xc1-\xdc])" replaceWith="$1l$2" />
|
||||
<RegEx find="\bl([bcdfghjklmnpqrstvwxz])" replaceWith="I$1" />
|
||||
<RegEx find="([\x41-\x5a\xc1-\xdc])I([\x61-\x7a\xe1-\xfc])" replaceWith="$1l$2" />
|
||||
<RegEx find="([\x61-\x7a\xe1-\xfc][\-])I([\x61-\x7a\xe1-\xfc])" replaceWith="$1l$2" />
|
||||
<RegEx find="([\x41-\x5a\xc1-\xdc])I([\-][\x41-\x5a\xc1-\xdc][\x61-\x7a\xe1-\xfc])" replaceWith="$1l$2" />
|
||||
<RegEx find="\b([AEÜÓ])I([^\x41-\x5a\xc1-\xdc])" replaceWith="$1l$2" />
|
||||
<RegEx find="\bI([aáeéiíoóöuúüy\xf5\xfb])" replaceWith="l$1" />
|
||||
<RegEx find="\b(?:II|ll)" replaceWith="Il" />
|
||||
<RegEx find="([\xf5\xfb])I" replaceWith="$1l" />
|
||||
</RegularExpressions>
|
||||
</OCRFixReplaceList>
|
||||
+24
@@ -0,0 +1,24 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<Word from="ls" to="Is" />
|
||||
<Word from="ln" to="In" />
|
||||
<Word from="lk" to="Ik" />
|
||||
<Word from="ledereen" to="Iedereen" />
|
||||
<Word from="ledere" to="Iedere" />
|
||||
<Word from="lemand" to="Iemand" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways />
|
||||
<PartialWords />
|
||||
<PartialLines />
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions>
|
||||
<RegEx find="\blk(?=\p{Ll}{2})" replaceWith="Ik" />
|
||||
<RegEx find="\bln(?=\p{Ll}{2})" replaceWith="In" />
|
||||
<RegEx find="\bls(?=\p{Ll}{2})" replaceWith="Is" />
|
||||
<RegEx find="\beIk" replaceWith="elk" />
|
||||
<RegEx find="\bler(land|se|s|)\b" replaceWith="Ier$1" />
|
||||
</RegularExpressions>
|
||||
</OCRFixReplaceList>
|
||||
+43
@@ -0,0 +1,43 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords />
|
||||
<PartialWordsAlways />
|
||||
<PartialWords>
|
||||
<!-- Used to check words that are not in the dictionary -->
|
||||
<!-- If the new word(s) exist in the spelling dictionary, they are accepted -->
|
||||
<WordPart from="¤" to="o" />
|
||||
<WordPart from="ﬁ" to="fi" />
|
||||
<WordPart from="ﬂ" to="fl" />
|
||||
<WordPart from="/" to="l" />
|
||||
<WordPart from="vv" to="w" />
|
||||
<WordPart from="IVI" to="M" />
|
||||
<WordPart from="lVI" to="M" />
|
||||
<WordPart from="IVl" to="M" />
|
||||
<WordPart from="lVl" to="M" />
|
||||
<WordPart from="m" to="rn" />
|
||||
<WordPart from="l" to="i" />
|
||||
<WordPart from="€" to="e" />
|
||||
<WordPart from="I" to="l" />
|
||||
<WordPart from="c" to="o" />
|
||||
<WordPart from="i" to="t" />
|
||||
<WordPart from="cc" to="oo" />
|
||||
<WordPart from="ii" to="tt" />
|
||||
<WordPart from="n/" to="ry" />
|
||||
<WordPart from="ae" to="æ" />
|
||||
<!-- "f " will be two words -->
|
||||
<WordPart from="f" to="f " />
|
||||
<WordPart from="c" to="e" />
|
||||
<WordPart from="I" to="t" />
|
||||
<WordPart from="n" to="o" />
|
||||
<WordPart from="s" to="e" />
|
||||
<WordPart from="\A" to="Vi" />
|
||||
<WordPart from="n/" to="rv" />
|
||||
<WordPart from="Ã" to="Å" />
|
||||
<WordPart from="í" to="i" />
|
||||
</PartialWords>
|
||||
<PartialLines />
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions />
|
||||
</OCRFixReplaceList>
|
||||
+508
@@ -0,0 +1,508 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<Word from="abitual" to="habitual" />
|
||||
<Word from="àcerca" to="acerca" />
|
||||
<Word from="acessor" to="assessor" />
|
||||
<Word from="acólico" to="acólito" />
|
||||
<Word from="açoreano" to="açoriano" />
|
||||
<Word from="actuacao" to="actuação" />
|
||||
<Word from="acucar" to="açúcar" />
|
||||
<Word from="açucar" to="açúcar" />
|
||||
<Word from="advinhar" to="adivinhar" />
|
||||
<Word from="africa" to="África" />
|
||||
<Word from="ajuisar" to="ajuizar" />
|
||||
<Word from="album" to="álbum" />
|
||||
<Word from="alcoolémia" to="alcoolemia" />
|
||||
<Word from="aldião" to="aldeão" />
|
||||
<Word from="algerino" to="argelino" />
|
||||
<Word from="ameixeal" to="ameixial" />
|
||||
<Word from="amiaça" to="ameaça" />
|
||||
<Word from="analizar" to="analisar" />
|
||||
<Word from="andáste" to="andaste" />
|
||||
<Word from="anemona" to="anémona" />
|
||||
<Word from="antartico" to="antárctico" />
|
||||
<Word from="antártico" to="antárctico" />
|
||||
<Word from="antepôr" to="antepor" />
|
||||
<Word from="apárte" to="aparte" />
|
||||
<Word from="apiadeiro" to="apeadeiro" />
|
||||
<Word from="apiar" to="apear" />
|
||||
<Word from="apreciacao" to="apreciação" />
|
||||
<Word from="arctico" to="árctico" />
|
||||
<Word from="arrazar" to="arrasar" />
|
||||
<Word from="ártico" to="árctico" />
|
||||
<Word from="artifice" to="artífice" />
|
||||
<Word from="artifícial" to="artificial" />
|
||||
<Word from="ascenção" to="ascensão" />
|
||||
<!-- <Word from="assucar" to="açúcar" /> "assucar" is a word that exists in the dictionary -->
|
||||
<Word from="assúcar" to="açúcar" />
|
||||
<Word from="aste" to="haste" />
|
||||
<Word from="asterístico" to="asterisco" />
|
||||
<Word from="averção" to="aversão" />
|
||||
<Word from="avizar" to="avisar" />
|
||||
<Word from="avulsso" to="avulso" />
|
||||
<Word from="baínha" to="bainha" />
|
||||
<Word from="banca-rota" to="bancarrota" />
|
||||
<Word from="bandeija" to="bandeja" />
|
||||
<Word from="bébé" to="bebé" />
|
||||
<Word from="beige" to="bege" />
|
||||
<Word from="benção" to="bênção" />
|
||||
<Word from="beneficiência" to="beneficência" />
|
||||
<Word from="beneficiente" to="beneficente" />
|
||||
<Word from="benvinda" to="bem-vinda" />
|
||||
<Word from="benvindo" to="bem-vindo" />
|
||||
<Word from="boasvindas" to="boas-vindas" />
|
||||
<Word from="borborinho" to="burburinho" />
|
||||
<Word from="Brazil" to="Brasil" />
|
||||
<Word from="bussula" to="bússola" />
|
||||
<Word from="cabo-verdeano" to="cabo-verdiano" />
|
||||
<Word from="caimbras" to="cãibras" />
|
||||
<Word from="calcáreo" to="calcário" />
|
||||
<Word from="calsado" to="calçado" />
|
||||
<Word from="calvíce" to="calvície" />
|
||||
<Word from="camoneano" to="camoniano" />
|
||||
<Word from="campião" to="campeão" />
|
||||
<Word from="cançacos" to="cansaços" />
|
||||
<Word from="caracter" to="carácter" />
|
||||
<Word from="caractéres" to="caracteres" />
|
||||
<Word from="catequeze" to="catequese" />
|
||||
<Word from="catequisador" to="catequizador" />
|
||||
<Word from="catequisar" to="catequizar" />
|
||||
<Word from="chícara" to="xícara" />
|
||||
<Word from="ciclano" to="sicrano" />
|
||||
<Word from="cicrano" to="sicrano" />
|
||||
<Word from="cidadães" to="cidadãos" />
|
||||
<Word from="cidadões" to="cidadãos" />
|
||||
<Word from="cincoenta" to="cinquenta" />
|
||||
<Word from="cinseiro" to="cinzeiro" />
|
||||
<Word from="cinsero" to="sincero" />
|
||||
<Word from="citacoes" to="citações" />
|
||||
<Word from="coalizão" to="colisão" />
|
||||
<Word from="côdia" to="côdea" />
|
||||
<Word from="combóio" to="comboio" />
|
||||
<Word from="compôr" to="compor" />
|
||||
<Word from="concerteza" to="com certeza" />
|
||||
<Word from="constituia" to="constituía" />
|
||||
<Word from="constituíu" to="constituiu" />
|
||||
<Word from="contato" to="contacto" />
|
||||
<Word from="contensão" to="contenção" />
|
||||
<Word from="contribuicoes" to="contribuições" />
|
||||
<Word from="côr" to="cor" />
|
||||
<Word from="corassão" to="coração" />
|
||||
<Word from="corçario" to="corsário" />
|
||||
<Word from="corçário" to="corsário" />
|
||||
<Word from="cornprimidosinbo" to="comprimidozinho" />
|
||||
<!-- <Word from="cota-parte" to="quota-parte" /> "cota-parte" is a word that exists in the dictionary -->
|
||||
<Word from="crâneo" to="crânio" />
|
||||
<Word from="dE" to="de" />
|
||||
<Word from="defenição" to="definição" />
|
||||
<Word from="defenido" to="definido" />
|
||||
<Word from="defenir" to="definir" />
|
||||
<Word from="deficite" to="défice" />
|
||||
<Word from="degladiar" to="digladiar" />
|
||||
<Word from="deiche" to="deixe" />
|
||||
<Word from="desinteria" to="disenteria" />
|
||||
<Word from="despendio" to="dispêndio" />
|
||||
<Word from="despêndio" to="dispêndio" />
|
||||
<Word from="desplicência" to="displicência" />
|
||||
<Word from="dificulidade" to="dificuldade" />
|
||||
<Word from="dispender" to="despender" />
|
||||
<Word from="dispendio" to="dispêndio" />
|
||||
<Word from="distribuido" to="distribuído" />
|
||||
<Word from="druída" to="druida" />
|
||||
<Word from="écrã" to="ecrã" />
|
||||
<Word from="ecran" to="ecrã" />
|
||||
<Word from="écran" to="ecrã" />
|
||||
<Word from="êle" to="ele" />
|
||||
<Word from="elice" to="hélice" />
|
||||
<Word from="élice" to="hélice" />
|
||||
<Word from="emiratos" to="emirados" />
|
||||
<Word from="engolis-te" to="engoliste" />
|
||||
<Word from="engulir" to="engolir" />
|
||||
<Word from="enguliste" to="engoliste" />
|
||||
<Word from="entertido" to="entretido" />
|
||||
<Word from="entitular" to="intitular" />
|
||||
<Word from="entreterimento" to="entretenimento" />
|
||||
<Word from="entreti-me" to="entretive-me" />
|
||||
<Word from="envólucro" to="invólucro" />
|
||||
<Word from="erói" to="herói" />
|
||||
<Word from="escluir" to="excluir" />
|
||||
<Word from="esclusão" to="exclusão" />
|
||||
<Word from="escrivões" to="escrivães" />
|
||||
<Word from="esqueiro" to="isqueiro" />
|
||||
<Word from="esquesito" to="esquisito" />
|
||||
<Word from="estacoes" to="estações" />
|
||||
<Word from="esteje" to="esteja" />
|
||||
<Word from="excavação" to="escavação" />
|
||||
<Word from="excavar" to="escavar" />
|
||||
<Word from="exdrúxula" to="esdrúxula" />
|
||||
<Word from="exdrúxulas" to="esdrúxulas" />
|
||||
<Word from="exitar" to="hesitar" />
|
||||
<Word from="explicacoes" to="explicações" />
|
||||
<Word from="exquisito" to="esquisito" />
|
||||
<Word from="extende" to="estende" />
|
||||
<Word from="extender" to="estender" />
|
||||
<Word from="fàcilmenfe" to="facilmente" />
|
||||
<Word from="fàcilmente" to="facilmente" />
|
||||
<Word from="fariam-lhe" to="far-lhe-iam" />
|
||||
<Word from="FARMÁClAS" to="FARMÁCIAS" />
|
||||
<Word from="farmecêutico" to="farmacêutico" />
|
||||
<Word from="fassa" to="faça" />
|
||||
<Word from="fébre" to="febre" />
|
||||
<Word from="fecula" to="fécula" />
|
||||
<Word from="fémea" to="fêmea" />
|
||||
<Word from="femenino" to="feminino" />
|
||||
<Word from="femininismo" to="feminismo" />
|
||||
<Word from="físiologista" to="fisiologista" />
|
||||
<Word from="fizémos" to="fizemos" />
|
||||
<Word from="fizes-te" to="fizeste" />
|
||||
<Word from="flôr" to="flor" />
|
||||
<Word from="forão" to="foram" />
|
||||
<Word from="formalisar" to="formalizar" />
|
||||
<Word from="fôro" to="foro" />
|
||||
<Word from="fos-te" to="foste" />
|
||||
<Word from="fragância" to="fragrância" />
|
||||
<Word from="françês" to="francês" />
|
||||
<Word from="frasqutnho" to="frasquinho" />
|
||||
<Word from="frustado" to="frustrado" />
|
||||
<Word from="furá" to="furar" />
|
||||
<Word from="gaz" to="gás" />
|
||||
<Word from="gáz" to="gás" />
|
||||
<Word from="geito" to="jeito" />
|
||||
<Word from="geneceu" to="gineceu" />
|
||||
<Word from="geropiga" to="jeropiga" />
|
||||
<Word from="glicémia" to="glicemia" />
|
||||
<Word from="gorgeta" to="gorjeta" />
|
||||
<Word from="grangear" to="granjear" />
|
||||
<Word from="guizar" to="guisar" />
|
||||
<Word from="hectar" to="hectare" />
|
||||
<Word from="herméticamente" to="hermeticamente" />
|
||||
<Word from="hernia" to="hérnia" />
|
||||
<Word from="higiéne" to="higiene" />
|
||||
<Word from="hilariedade" to="hilaridade" />
|
||||
<Word from="hiperacídez" to="hiperacidez" />
|
||||
<Word from="hontem" to="ontem" />
|
||||
<Word from="igiene" to="higiene" />
|
||||
<Word from="igienico" to="higiénico" />
|
||||
<Word from="igiénico" to="higiénico" />
|
||||
<Word from="igreija" to="igreja" />
|
||||
<Word from="iguasu" to="iguaçu" />
|
||||
<Word from="ilacção" to="ilação" />
|
||||
<Word from="imbigo" to="umbigo" />
|
||||
<Word from="impecilho" to="empecilho" />
|
||||
<Word from="íncas" to="incas" />
|
||||
<Word from="incêsto" to="incesto" />
|
||||
<Word from="inclusivé" to="inclusive" />
|
||||
<Word from="incômodos" to="incómodos" />
|
||||
<Word from="incontestávelmente" to="incontestavelmente" />
|
||||
<Word from="incontestàvelmente" to="incontestavelmente" />
|
||||
<Word from="indespensáveis" to="indispensáveis" />
|
||||
<Word from="indespensável" to="indispensável" />
|
||||
<Word from="India" to="Índia" />
|
||||
<Word from="indiguinação" to="indignação" />
|
||||
<Word from="indiguinado" to="indignado" />
|
||||
<Word from="indiguinar" to="indignar" />
|
||||
<Word from="inflacção" to="inflação" />
|
||||
<Word from="ingreja" to="igreja" />
|
||||
<Word from="INSCRICOES" to="INSCRIÇÕES" />
|
||||
<Word from="intensão" to="intenção" />
|
||||
<Word from="intertido" to="entretido" />
|
||||
<Word from="intoxica" to="Intoxica" />
|
||||
<Word from="intrega" to="entrega" />
|
||||
<Word from="inverosímel" to="inverosímil" />
|
||||
<Word from="iorgute" to="iogurte" />
|
||||
<Word from="ipopótamo" to="hipopótamo" />
|
||||
<Word from="ipsilon" to="ípsilon" />
|
||||
<Word from="ipslon" to="ípsilon" />
|
||||
<Word from="isquesito" to="esquisito" />
|
||||
<Word from="juíz" to="juiz" />
|
||||
<Word from="juiza" to="juíza" />
|
||||
<Word from="júniores" to="juniores" />
|
||||
<Word from="justanzente" to="justamente" />
|
||||
<Word from="juz" to="jus" />
|
||||
<Word from="kilo" to="quilo" />
|
||||
<Word from="laboratório-porque" to="laboratório porque" />
|
||||
<Word from="ladravaz" to="ladrava" />
|
||||
<Word from="lamentàvelmente" to="lamentavelmente" />
|
||||
<Word from="lampeão" to="lampião" />
|
||||
<Word from="largartixa" to="lagartixa" />
|
||||
<Word from="largarto" to="lagarto" />
|
||||
<Word from="lêm" to="lêem" />
|
||||
<Word from="leucémia" to="leucemia" />
|
||||
<Word from="licensa" to="licença" />
|
||||
<Word from="linguísta" to="linguista" />
|
||||
<Word from="lisongear" to="lisonjear" />
|
||||
<Word from="logista" to="lojista" />
|
||||
<Word from="maçajar" to="massajar" />
|
||||
<Word from="Macfadden-o" to="Macfadden o" />
|
||||
<Word from="mae" to="mãe" />
|
||||
<Word from="magestade" to="majestade" />
|
||||
<Word from="mãgua" to="mágoa" />
|
||||
<Word from="mangerico" to="manjerico" />
|
||||
<Word from="mangerona" to="manjerona" />
|
||||
<Word from="manteem-se" to="mantêm-se" />
|
||||
<Word from="mantega" to="manteiga" />
|
||||
<Word from="mantem-se" to="mantém-se" />
|
||||
<Word from="massiço" to="maciço" />
|
||||
<Word from="massisso" to="maciço" />
|
||||
<Word from="médica-Rio" to="médica Rio" />
|
||||
<Word from="menistro" to="ministro" />
|
||||
<Word from="merciaria" to="mercearia" />
|
||||
<Word from="metrelhadora" to="metralhadora" />
|
||||
<Word from="miscegenação" to="miscigenação" />
|
||||
<Word from="misogenia" to="misoginia" />
|
||||
<Word from="misogeno" to="misógino" />
|
||||
<Word from="misógeno" to="misógino" />
|
||||
<Word from="mº" to="º" />
|
||||
<Word from="môlho" to="molho" />
|
||||
<Word from="monumentânea" to="momentânea" />
|
||||
<Word from="mortandela" to="mortadela" />
|
||||
<Word from="morteIa" to="mortela" />
|
||||
<Word from="muinto" to="muito" />
|
||||
<Word from="nasaias" to="nasais" />
|
||||
<Word from="nêle" to="nele" />
|
||||
<Word from="nest" to="neste" />
|
||||
<Word from="Nivea" to="Nívea" />
|
||||
<Word from="nonagessimo" to="nonagésimo" />
|
||||
<Word from="nonagéssimo" to="nonagésimo" />
|
||||
<Word from="nornal" to="normal" />
|
||||
<Word from="notàvelmente" to="notavelmente" />
|
||||
<Word from="obcessão" to="obsessão" />
|
||||
<Word from="obesidae" to="obesidade" />
|
||||
<Word from="óbviamente" to="obviamente" />
|
||||
<Word from="òbviamente" to="obviamente" />
|
||||
<Word from="ofecina" to="oficina" />
|
||||
<Word from="oje" to="hoje" />
|
||||
<Word from="omem" to="homem" />
|
||||
<Word from="opcoes" to="opções" />
|
||||
<Word from="opóbrio" to="opróbrio" />
|
||||
<Word from="opróbio" to="opróbrio" />
|
||||
<Word from="orfão" to="órfão" />
|
||||
<Word from="organigrama" to="organograma" />
|
||||
<Word from="organisar" to="organizar" />
|
||||
<Word from="orgão" to="órgão" />
|
||||
<Word from="orta" to="horta" />
|
||||
<Word from="ótima" to="óptima" />
|
||||
<Word from="ótimos" to="óptimos" />
|
||||
<Word from="paralização" to="paralisação" />
|
||||
<Word from="paralizado" to="paralisado" />
|
||||
<Word from="paralizar" to="paralisar" />
|
||||
<Word from="paráste" to="paraste" />
|
||||
<Word from="Pátria" to="pátria" />
|
||||
<Word from="paúl" to="Paul" />
|
||||
<Word from="pecalço" to="percalço" />
|
||||
<Word from="pêga" to="pega" />
|
||||
<Word from="periodo" to="período" />
|
||||
<Word from="pertubar" to="perturbar" />
|
||||
<Word from="perú" to="peru" />
|
||||
<Word from="piqueno" to="pequeno" />
|
||||
<Word from="pirinéus" to="Pirenéus" />
|
||||
<Word from="poblema" to="problema" />
|
||||
<Word from="pobrema" to="problema" />
|
||||
<Word from="poden" to="podem" />
|
||||
<Word from="poder-mos" to="pudermos" />
|
||||
<Word from="ponteagudo" to="pontiagudo" />
|
||||
<Word from="pontuacoes" to="pontuações" />
|
||||
<Word from="prazeiroso" to="prazeroso" />
|
||||
<Word from="precaridade" to="precariedade" />
|
||||
<Word from="precizar" to="precisar" />
|
||||
<Word from="preserverança" to="perseverança" />
|
||||
<Word from="previlégio" to="privilégio" />
|
||||
<Word from="primária-que" to="primária que" />
|
||||
<Word from="priúdo" to="período" />
|
||||
<Word from="probalidade" to="probabilidade" />
|
||||
<Word from="progreso" to="progresso" />
|
||||
<Word from="proibído" to="proibido" />
|
||||
<Word from="proíbido" to="proibido" />
|
||||
<Word from="própia" to="própria" />
|
||||
<Word from="propiedade" to="propriedade" />
|
||||
<Word from="propio" to="próprio" />
|
||||
<Word from="própio" to="próprio" />
|
||||
<Word from="provocacoes" to="provocações" />
|
||||
<Word from="prsença" to="presença" />
|
||||
<Word from="prustituta" to="prostituta" />
|
||||
<Word from="pudérmos" to="pudermos" />
|
||||
<Word from="púlico" to="público" />
|
||||
<Word from="pús" to="pus" />
|
||||
<Word from="pusémos" to="pusemos" />
|
||||
<Word from="quadricomia" to="quadricromia" />
|
||||
<Word from="quadriplicado" to="quadruplicado" />
|
||||
<Word from="quaisqueres" to="quaisquer" />
|
||||
<Word from="quer-a" to="quere-a" />
|
||||
<Word from="quere-se" to="quer-se" />
|
||||
<Word from="quer-o" to="quere-o" />
|
||||
<Word from="químco" to="químico" />
|
||||
<Word from="quises-te" to="quiseste" />
|
||||
<Word from="quizer" to="quiser" />
|
||||
<Word from="quizeram" to="quiseram" />
|
||||
<Word from="quizesse" to="quisesse" />
|
||||
<Word from="quizessem" to="quisessem" />
|
||||
<Word from="raínha" to="rainha" />
|
||||
<Word from="raíz" to="raiz" />
|
||||
<Word from="raizes" to="raízes" />
|
||||
<Word from="ratato" to="retrato" />
|
||||
<Word from="raúl" to="raul" />
|
||||
<Word from="razar" to="rasar" />
|
||||
<Word from="rectaguarda" to="retaguarda" />
|
||||
<Word from="rédia" to="rédea" />
|
||||
<Word from="reestabelecer" to="restabelecer" />
|
||||
<Word from="refeicoes" to="refeições" />
|
||||
<Word from="refêrencia" to="referência" />
|
||||
<Word from="regeitar" to="rejeitar" />
|
||||
<Word from="regurjitar" to="regurgitar" />
|
||||
<Word from="reinvidicação" to="reivindicação" />
|
||||
<Word from="reinvidicar" to="reivindicar" />
|
||||
<Word from="requer-a" to="requere-a" />
|
||||
<Word from="requere-se" to="requer-se" />
|
||||
<Word from="requer-o" to="requere-o" />
|
||||
<Word from="requesito" to="requisito" />
|
||||
<Word from="requisicoes" to="requisições" />
|
||||
<Word from="RESIDENCIA" to="RESIDÊNCIA" />
|
||||
<Word from="respiraçáo" to="respiração" />
|
||||
<Word from="restablecer" to="restabelecer" />
|
||||
<Word from="réstea" to="réstia" />
|
||||
<Word from="ruborisar" to="ruborizar" />
|
||||
<Word from="rúbrica" to="rubrica" />
|
||||
<Word from="sàdia" to="sadia" />
|
||||
<Word from="saiem" to="saem" />
|
||||
<Word from="salchicha" to="salsicha" />
|
||||
<Word from="salchichas" to="salsichas" />
|
||||
<Word from="saloice" to="saloiice" />
|
||||
<Word from="salvé" to="salve" />
|
||||
<Word from="salve-raínha" to="salve-rainha" />
|
||||
<Word from="salvé-rainha" to="salve-rainha" />
|
||||
<Word from="salvé-raínha" to="salve-rainha" />
|
||||
<Word from="sao" to="são" />
|
||||
<Word from="sargeta" to="sarjeta" />
|
||||
<Word from="seções" to="secções" />
|
||||
<Word from="seija" to="seja" />
|
||||
<Word from="seissentos" to="seiscentos" />
|
||||
<Word from="seje" to="seja" />
|
||||
<Word from="semiar" to="semear" />
|
||||
<Word from="séniores" to="seniores" />
|
||||
<Word from="sensibilidadc" to="sensibilidade" />
|
||||
<Word from="sensívelmente" to="sensivelmente" />
|
||||
<Word from="setessentos" to="setecentos" />
|
||||
<Word from="siclano" to="sicrano" />
|
||||
<Word from="Sifilis" to="Sífilis" />
|
||||
<Word from="sifílis" to="sífilis" />
|
||||
<Word from="sinão" to="senão" />
|
||||
<Word from="sinmtoma" to="sintoma" />
|
||||
<Word from="sintéticamente" to="sinteticamente" />
|
||||
<Word from="sintetisa" to="sintetiza" />
|
||||
<Word from="SÓ" to="só" />
|
||||
<Word from="sôfra" to="sofra" />
|
||||
<Word from="sôfregamente" to="sofregamente" />
|
||||
<Word from="somáste" to="somaste" />
|
||||
<Word from="sombracelha" to="sobrancelha" />
|
||||
<Word from="sombrancelha" to="sobrancelha" />
|
||||
<Word from="sombrancelhas" to="sobrancelhas" />
|
||||
<Word from="suavisar" to="suavizar" />
|
||||
<Word from="substituido" to="substituído" />
|
||||
<Word from="suburbio" to="subúrbio" />
|
||||
<!-- <Word from="sues" to="seus" /> "sues" is a valid word, e.g. "Cuidado, não sues muito." ("Careful, don't sweat too much.") -->
|
||||
<Word from="suI" to="sul" />
|
||||
<Word from="Suiça" to="Suíça" />
|
||||
<Word from="suiças" to="suíças" />
|
||||
<Word from="suiço" to="suíço" />
|
||||
<Word from="suiços" to="suíços" />
|
||||
<Word from="supôr" to="supor" />
|
||||
<Word from="tabeliões" to="tabeliães" />
|
||||
<Word from="taínha" to="tainha" />
|
||||
<Word from="tava" to="estava" />
|
||||
<Word from="têem" to="têm" />
|
||||
<Word from="telemovel" to="telemóvel" />
|
||||
<Word from="telémovel" to="telemóvel" />
|
||||
<Word from="terminacoes" to="terminações" />
|
||||
<Word from="toráxico" to="torácico" />
|
||||
<Word from="tou" to="estou" />
|
||||
<Word from="transpôr" to="transpor" />
|
||||
<Word from="trasnporte" to="transporte" />
|
||||
<Word from="tumors" to="tumores" />
|
||||
<Word from="úmida" to="húmida" />
|
||||
<Word from="umidade" to="unidade" />
|
||||
<Word from="vai-vem" to="vaivém" />
|
||||
<Word from="vegilância" to="vigilância" />
|
||||
<Word from="vegilante" to="vigilante" />
|
||||
<Word from="ventoínha" to="ventoinha" />
|
||||
<Word from="verosímel" to="verosímil" />
|
||||
<Word from="video" to="vídeo" />
|
||||
<Word from="virus" to="vírus" />
|
||||
<Word from="visiense" to="viseense" />
|
||||
<Word from="voçe" to="você" />
|
||||
<Word from="voçê" to="você" />
|
||||
<Word from="vôo" to="voo" />
|
||||
<Word from="xadrês" to="xadrez" />
|
||||
<Word from="xafariz" to="chafariz" />
|
||||
<Word from="xéxé" to="xexé" />
|
||||
<Word from="xilindró" to="chilindró" />
|
||||
<Word from="zaíre" to="Zaire" />
|
||||
<Word from="zepelin" to="zepelim" />
|
||||
<Word from="zig-zag" to="ziguezague" />
|
||||
<Word from="zoô" to="zoo" />
|
||||
<Word from="zôo" to="zoo" />
|
||||
<Word from="zuar" to="zoar" />
|
||||
<Word from="zum-zum" to="zunzum" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways />
|
||||
<PartialWords />
|
||||
<PartialLines>
|
||||
<LinePart from="IN 6-E" to="N 6 E" />
|
||||
<LinePart from="in tegrar-se" to="integrar-se" />
|
||||
<LinePart from="in teresse" to="interesse" />
|
||||
<LinePart from="in testinos" to="intestinos" />
|
||||
<LinePart from="indica ção" to="indicação" />
|
||||
<LinePart from="inte tino" to="intestino" />
|
||||
<LinePart from="intes tinos" to="intestinos" />
|
||||
<LinePart from="L da" to="Lda" />
|
||||
<LinePart from="mal estar" to="mal-estar" />
|
||||
<LinePart from="mastiga çáo" to="mastigação" />
|
||||
<LinePart from="médi cas" to="médicas" />
|
||||
<LinePart from="mineo rais" to="minerais" />
|
||||
<LinePart from="mola res" to="molares" />
|
||||
<LinePart from="movi mentos" to="movimentos" />
|
||||
<LinePart from="movimen to" to="movimento" />
|
||||
<LinePart from="N 5-Estendido" to="Nº 5 Estendido" />
|
||||
<LinePart from="oxigé nio" to="oxigénio" />
|
||||
<LinePart from="pod mos" to="podemos" />
|
||||
<LinePart from="poder-se ia" to="poder-se-ia" />
|
||||
<LinePart from="pos sibilidade" to="possibilidade" />
|
||||
<LinePart from="possibi lidades" to="possibilidades" />
|
||||
<LinePart from="pro duto" to="produto" />
|
||||
<LinePart from="procu rar" to="procurar" />
|
||||
<LinePart from="Q u e" to="Que" />
|
||||
<LinePart from="qualifi cam" to="qualificam" />
|
||||
<LinePart from="R egião" to="Região" />
|
||||
<LinePart from="unsuficien temente" to="insuficientemente" />
|
||||
</PartialLines>
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions>
|
||||
<!-- <RegEx find="\bi\b" replaceWith="I" /> just an example - do not use this regex -->
|
||||
<RegEx find="([0-9]) +º" replaceWith="$1º" />
|
||||
<RegEx find="\Bcao\b" replaceWith="ção" />
|
||||
<RegEx find="\Bcoes\b" replaceWith="ções" />
|
||||
<!-- <RegEx find="\Bccao\b" replaceWith="cção" /> it makes no sense to have this rule as well as the line above -->
|
||||
<!-- <RegEx find="\Bccoes\b" replaceWith="cções" /> it makes no sense to have this rule as well as the line above -->
|
||||
<RegEx find="\b(m|M)ae\b" replaceWith="$1ãe" />
|
||||
<RegEx find="\Bdmnis\B" replaceWith="dminis" />
|
||||
<RegEx find="\Blcól\B" replaceWith="lcoól" />
|
||||
<RegEx find="\b(t|T)a[nm]b[eé]m\b" replaceWith="$1ambém" />
|
||||
<RegEx find="\bzeppeli[mn]\b" replaceWith="zepelim" />
|
||||
<RegEx find="\b(s|S)ufe?ciente\b" replaceWith="$1uficiente" />
|
||||
<RegEx find="\b(n|N)ao\b" replaceWith="$1ão" />
|
||||
<RegEx find="\b(B|b)elem\b" replaceWith="$1elém" />
|
||||
<RegEx find="\b(s|S)u[íi]sso(s)?\b" replaceWith="$1uíço$2" />
|
||||
<RegEx find="\b(s|S)u[íi]ssa(s)?\b" replaceWith="$1uíça$2" />
|
||||
<RegEx find="\b(p|P)rivelig[ie]\p{Ll}d" replaceWith="$1rivilegiad" />
|
||||
<RegEx find="\bpud(?:és|e-)se\b" replaceWith="pudesse" />
|
||||
<RegEx find="\biquilíbr(?:e|i)o\b" replaceWith="equilíbrio" />
|
||||
<RegEx find="\b(c|C)orregi\B" replaceWith="$1orrigid" />
|
||||
<RegEx find="(?&lt;=A|a)ssociacao" replaceWith="ssociação" />
|
||||
<RegEx find="(?&lt;=N|n)inguem" replaceWith="inguém" />
|
||||
<RegEx find="(?&lt;=g|G)rat(?:uí|úi)to" replaceWith="ratuito" />
|
||||
<RegEx find="(?&lt;=d|D)esiquilíbr[ei]o" replaceWith="esequilíbrio" />
|
||||
<RegEx find="\b[kK]il(ogramas?|ómetros?)" replaceWith="qui$1" />
|
||||
</RegularExpressions>
|
||||
</OCRFixReplaceList>
|
||||
+257
@@ -0,0 +1,257 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<Word from="НЄЙ" to="НЕЙ" />
|
||||
<Word from="ОРГЗНИЗМОБ" to="ОРГАНИЗМА" />
|
||||
<Word from="Чї0" to="ЧТО" />
|
||||
<Word from="НЭ" to="НА" />
|
||||
<Word from="СОСЄДНЮЮ" to="СОСЕДНЮЮ" />
|
||||
<Word from="ПЛЗНЄТУ" to="ПЛАНЕТУ" />
|
||||
<Word from="ЗЗГЭДОК" to="ЗАГАДОК" />
|
||||
<Word from="СОТВОРЄНИЯ" to="СОТВОРЕНИЯ" />
|
||||
<Word from="МИРЭ" to="МИРА" />
|
||||
<Word from="ПОЯБЛЄНИЯ" to="ПОЯВЛЕНИЯ" />
|
||||
<Word from="ЗЄМЛЄ" to="ЗЕМЛЕ" />
|
||||
<Word from="ЄЩЄ" to="ЕЩЁ" />
|
||||
<Word from="ТЄМНЬІХ" to="ТЕМНЫХ" />
|
||||
<Word from="СЄРЬЄЗНЬІМ" to="СЕРЬЕЗНЫМ" />
|
||||
<Word from="ПОШІІ0" to="ПОШЛО" />
|
||||
<Word from="Пр0ИЗ0ШЄЛ" to="ПРОИЗОШЕЛ" />
|
||||
<Word from="СЄКРЄТЭМИ" to="СЕКРЕТАМИ" />
|
||||
<Word from="МЭТЄРИЗЛЬІ" to="МАТЕРИАЛЫ" />
|
||||
<Word from="ПЯТЄН" to="ПЯТЕН" />
|
||||
<Word from="ПЛаНЄїЄ" to="ПЛАНЕТЕ" />
|
||||
<Word from="КЗТЭКЛИЗМ" to="КАТАКЛИЗМ" />
|
||||
<Word from="ОКЗЗЗЛСЯ" to="ОКАЗАЛСЯ" />
|
||||
<Word from="ДЭЛЬШЕ" to="ДАЛЬШЕ" />
|
||||
<Word from="ТВК" to="ТАК" />
|
||||
<Word from="ПЛЗНЄТЗ" to="ПЛАНЕТА" />
|
||||
<Word from="ЧЄГО" to="ЧЕГО" />
|
||||
<Word from="УЗНЭТЬ" to="УЗНАТЬ" />
|
||||
<Word from="ПЛЭНЄТЄ" to="ПЛАНЕТЕ" />
|
||||
<Word from="НЄМ" to="НЕМ" />
|
||||
<Word from="БОЗМОЖНЗ" to="ВОЗМОЖНА" />
|
||||
<Word from="СОБЄРШЄННО" to="СОВЕРШЕННО" />
|
||||
<Word from="ИНЭЧЄ" to="ИНАЧЕ" />
|
||||
<Word from="БСЄ" to="ВСЕ" />
|
||||
<Word from="НЕДОСТЗТКИ" to="НЕДОСТАТКИ" />
|
||||
<Word from="НОВЬІЄ" to="НОВЫЕ" />
|
||||
<Word from="ВЄЛИКОЛЄПНЭЯ" to="ВЕЛИКОЛЕПНАЯ" />
|
||||
<Word from="ОСТЭІІОСЬ" to="ОСТАЛОСЬ" />
|
||||
<Word from="НЗЛИЧИЄ" to="НАЛИЧИЕ" />
|
||||
<Word from="бЫ" to="бы" />
|
||||
<Word from="ПРОЦВЕТВТЬ" to="ПРОЦВЕТАТЬ" />
|
||||
<Word from="КЗК" to="КАК" />
|
||||
<Word from="ВОДЗ" to="ВОДА" />
|
||||
<Word from="НЗШЕЛ" to="НАШЕЛ" />
|
||||
<Word from="НЄ" to="НЕ" />
|
||||
<Word from="ТОЖЄ" to="ТОЖЕ" />
|
||||
<Word from="ВУЛКЭНИЧЄСКОЙ" to="ВУЛКАНИЧЕСКОЙ" />
|
||||
<Word from="ЭКТИБНОСТИ" to="АКТИВНОСТИ" />
|
||||
<Word from="ПОЯВИЛЗСЬ" to="ПОЯВИЛАСЬ" />
|
||||
<Word from="НОВЗЯ" to="НОВАЯ" />
|
||||
<Word from="СТРЭТЄГИЯ" to="СТРАТЕГИЯ" />
|
||||
<Word from="УСПЄШН0" to="УСПЕШНО" />
|
||||
<Word from="ПОСЗДКУ" to="ПОСАДКУ" />
|
||||
<Word from="ГОТОБЫ" to="ГОТОВЫ" />
|
||||
<Word from="НЗЧЗТЬ" to="НАЧАТЬ" />
|
||||
<Word from="ОХОТЭ" to="ОХОТА" />
|
||||
<Word from="ПРИЗНЗКЗМИ" to="ПРИЗНАКАМИ" />
|
||||
<Word from="Пр0ШЛОМ" to="ПРОШЛОМ" />
|
||||
<Word from="НЭСТОЯЩЄМ" to="НАСТОЯЩЕМ" />
|
||||
<Word from="ПУСТОТЗХ" to="ПУСТОТАХ" />
|
||||
<Word from="БЛЗЖНОЙ" to="ВЛАЖНОЙ" />
|
||||
<Word from="ПОЧБЄ" to="ПОЧВЕ" />
|
||||
<Word from="МЬІ" to="МЫ" />
|
||||
<Word from="СЄЙЧЗС" to="СЕЙЧАС" />
|
||||
<Word from="ЄСЛИ" to="ЕСЛИ" />
|
||||
<Word from="ЗЗТРОНЕМ" to="ЗАТРОНЕМ" />
|
||||
<Word from="ОПЗСЗЄМСЯ" to="ОПАСАЕМСЯ" />
|
||||
<Word from="СИЛЬН0" to="СИЛЬНО" />
|
||||
<Word from="ОТЛИЧЗЄТСЯ" to="ОТЛИЧАЕТСЯ" />
|
||||
<Word from="РЭНЬШЄ" to="РАНЬШЕ" />
|
||||
<Word from="НЗЗЬІВЗЮТ" to="НАЗЫВАЮТ" />
|
||||
<Word from="ТЄКЛ3" to="ТЕКЛА" />
|
||||
<Word from="ОСЗДОЧНЫМИ" to="ОСАДОЧНЫМИ" />
|
||||
<Word from="ПОСТЄПЄНН0" to="ПОСТЕПЕННО" />
|
||||
<Word from="ИСПЭРЯЛЗСЬ" to="ИСПАРЯЛАСЬ" />
|
||||
<Word from="ЄОЛЬШОЄ" to="БОЛЬШОЕ" />
|
||||
<Word from="КОЛИЧЄСТБО" to="КОЛИЧЕСТВО" />
|
||||
<Word from="ГЄМЗТИТЕ" to="ГЕМАТИТА" />
|
||||
<Word from="ПОЛУЧЭЄТ" to="ПОЛУЧАЕТ" />
|
||||
<Word from="НЄДОСТЗЧН0" to="НЕДОСТАТОЧНО" />
|
||||
<Word from="ПИТЭНИЯ" to="ПИТАНИЯ" />
|
||||
<Word from="ПОКЗ" to="ПОКА" />
|
||||
<Word from="БЬІХОДИЛИ" to="ВЫХОДИЛИ" />
|
||||
<Word from="ЗЄМІІЄ" to="ЗЕМЛЕ" />
|
||||
<Word from="ВЄСЬІИЗ" to="ВЕСЬМА" />
|
||||
<Word from="ЗЄМЛИ" to="ЗЕМЛИ" />
|
||||
<Word from="бЬІЛО" to="БЫЛО" />
|
||||
<Word from="КИЗНИ" to="ЖИЗНИ" />
|
||||
<Word from="СТЗНОВИЛЗСЬ" to="СТАНОВИЛАСЬ" />
|
||||
<Word from="СОЛЄНЄЄ" to="СОЛЁНЕЕ" />
|
||||
<Word from="МЭГНИТНЫМ" to="МАГНИТНЫМ" />
|
||||
<Word from="ЧТОбЬІ" to="ЧТОБЫ" />
|
||||
<Word from="СОЗДЕТЬ" to="СОЗДАТЬ" />
|
||||
<Word from="МЗГНИТНОЄ" to="МАГНИТНОЕ" />
|
||||
<Word from="КЭЖУТСЯ" to="КАЖУТСЯ" />
|
||||
<Word from="ОЗНЗЧЗЄТ" to="ОЗНАЧАЕТ" />
|
||||
<Word from="МОГЛЗ" to="МОГЛА" />
|
||||
<Word from="ИМЄТЬ" to="ИМЕТЬ" />
|
||||
<Word from="КОСМОСЭ" to="КОСМОСА" />
|
||||
<Word from="СОЛНЄЧНЗЯ" to="СОЛНЕЧНАЯ" />
|
||||
<Word from="СИСТЄМЗ" to="СИСТЕМА" />
|
||||
<Word from="ПОСІІУЖИЛО" to="ПОСЛУЖИЛО" />
|
||||
<Word from="МЗГНИТНОГО" to="МАГНИТНОГО" />
|
||||
<Word from="ПЛВНЄТЫ" to="ПЛАНЕТЫ" />
|
||||
<Word from="ЛОКЗЛЬНЬІХ" to="ЛОКАЛЬНЫХ" />
|
||||
<Word from="ПОЛЄЙ" to="ПОЛЕЙ" />
|
||||
<Word from="КЗЖУТСЯ" to="КАЖУТСЯ" />
|
||||
<Word from="КЗКОГО" to="КАКОГО" />
|
||||
<Word from="СТРЗШНОГО" to="СТРАШНОГО" />
|
||||
<Word from="СТОЛКНОЕЄНИЯ" to="СТОЛКНОВЕНИЯ" />
|
||||
<Word from="МЕСТЗМИ" to="МЕСТАМИ" />
|
||||
<Word from="СДЄЛЗТЬ" to="СДЕЛАТЬ" />
|
||||
<Word from="СТЗЛО" to="СТАЛО" />
|
||||
<Word from="МЭГНИТНОГО" to="МАГНИТНОГО" />
|
||||
<Word from="ЗЗКЛЮЧЗВШЄЙСЯ" to="ЗАКЛЮЧАВШЕЙСЯ" />
|
||||
<Word from="ЄГО" to="ЕГО" />
|
||||
<Word from="ЯДРЄ" to="ЯДРЕ" />
|
||||
<Word from="НЗ" to="НА" />
|
||||
<Word from="ИСЧЄЗЛ3" to="ИСЧЕЗЛА" />
|
||||
<Word from="СЧИТЗЮ" to="СЧИТАЮ" />
|
||||
<Word from="ШЭНСЫ" to="ШАНСЫ" />
|
||||
<Word from="ИНЗЧЄ" to="ИНАЧЕ" />
|
||||
<Word from="СТЗЛ" to="СТАЛ" />
|
||||
<Word from="ТРЗТИТЬ" to="ТРАТИТЬ" />
|
||||
<Word from="НЗПРЗВЛЯЄТСЯ" to="НАПРАВЛЯЕТСЯ" />
|
||||
<Word from="ОБЛЭСТИ" to="ОБЛАСТИ" />
|
||||
<Word from="ЯВЛЯІОТСЯ" to="ЯВЛЯЮТСЯ" />
|
||||
<Word from="ГЛЭВНОЙ" to="ГЛАВНОЙ" />
|
||||
<Word from="ДОКЗЗЗТЄЛЬСТВ" to="ДОКАЗАТЕЛЬСТВ" />
|
||||
<Word from="КИСЛОТЭМИ" to="КИСЛОТАМИ" />
|
||||
<Word from="ОНЭ" to="ОНА" />
|
||||
<Word from="ПРЗКТИЧЄСКИ" to="ПРАКТИЧЕСКИ" />
|
||||
<Word from="ЛЄСУ" to="ЛЕСУ" />
|
||||
<Word from="УСЛОБИЯМ" to="УСЛОВИЯМ" />
|
||||
<Word from="СПЗСТИСЬ" to="СПАСТИСЬ" />
|
||||
<Word from="РЗЗВИВЗЮЩИЄСЯ" to="РАЗВИВАЮЩИЕСЯ" />
|
||||
<Word from="ШЭПКИ" to="ШАПКИ" />
|
||||
<Word from="ЗНЗЄМ" to="ЗНАЕМ" />
|
||||
<Word from="СООИРЭЄМСЯ" to="СОБИРАЕМСЯ" />
|
||||
<Word from="БЫЯСНИТЬ" to="ВЫЯСНИТЬ" />
|
||||
<Word from="СЗМ" to="САМ" />
|
||||
<Word from="РЗСПОЗНЗТЬ" to="РАСПОЗНАТЬ" />
|
||||
<Word from="УЗНЗТЬ" to="УЗНАТЬ" />
|
||||
<Word from="КЭЖЄТСЯ" to="КАЖЕТСЯ" />
|
||||
<Word from="ОРЄИТЗЛЬНЬІЄ" to="ОРБИТАЛЬНЫЕ" />
|
||||
<Word from="ЛЄТЭТЄЛЬНЬІЄ" to="ЛЕТАТЕЛЬНЫЕ" />
|
||||
<Word from="ЗППЗРЕТЬІ" to="АППАРАТЫ" />
|
||||
<Word from="ЖЄ" to="ЖЕ" />
|
||||
<Word from="ТЗКЗЯ" to="ТАКАЯ" />
|
||||
<Word from="МЗЛЄНЬКЗЯ" to="МАЛЕНЬКАЯ" />
|
||||
<Word from="ПЛЭНЄТЗ" to="ПЛАНЕТА" />
|
||||
<Word from="СПЗІІЬКО" to="СТОЛЬКО" />
|
||||
<Word from="бЬІЛ3" to="БЫЛА" />
|
||||
<Word from="ЁЕСЧИСЛЄННОЄ" to="БЕСЧИСЛЕННОЕ" />
|
||||
<Word from="МЗГНИїНЬІХ" to="МАГНИТНЫХ" />
|
||||
<Word from="ПОСТраД3Л" to="ПОСТРАДАЛ" />
|
||||
<Word from="ДЗЖЄ" to="ДАЖЕ" />
|
||||
<Word from="РЗЗНЬІМИ" to="РАЗНЫМИ" />
|
||||
<Word from="СУЩЄСТБОВЭНИЄ" to="СУЩЕСТВОВАНИЕ" />
|
||||
<Word from="ПЛаНЄїЬІ" to="ПЛАНЕТЫ" />
|
||||
<Word from="ПОДВЄРГЛЗСЬ" to="ПОДВЕРГЛАСЬ" />
|
||||
<Word from="ОПЗСІ-ІОСТИ" to="ОПАСНОСТИ" />
|
||||
<Word from="ПЛЗНЄТЄ" to="ПЛАНЕТЕ" />
|
||||
<Word from="Н0" to="НО" />
|
||||
<Word from="бЬІ" to="БЫ" />
|
||||
<Word from="ОТДЗЛЄННЫЄ" to="ОТДАЛЁННЫЕ" />
|
||||
<Word from="ПОЛЯРНЬІЄ" to="ПОЛЯРНЫЕ" />
|
||||
<Word from="ЦЄЛЬІ-О" to="ЦЕЛЬЮ" />
|
||||
<Word from="ПЄЩЄРЗХ" to="ПЕЩЕРАХ" />
|
||||
<Word from="НЗПОЛНЄННЬІХ" to="НАПОЛНЕННЫХ" />
|
||||
<Word from="ИСПЗРЄНИЯМИ" to="ИСПАРЕНИЯМИ" />
|
||||
<Word from="МИНИЗТЮРНЬІЄ" to="МИНИАТЮРНЫЕ" />
|
||||
<Word from="ТЭКЗЯ" to="ТАКАЯ" />
|
||||
<Word from="ПрИСП0СОбИТЬСЯ" to="ПРИСПОСОБИТЬСЯ" />
|
||||
<Word from="НЄОЄХОДИМЬІЄ" to="НЕОБХОДИМЫЕ" />
|
||||
<Word from="ОРГВНИЧЄСКИЄ" to="ОРГАНИЧЕСКИЕ" />
|
||||
<Word from="МЗРСИЗНСКИЄ" to="МАРСИАНСКИЕ" />
|
||||
<Word from="МЄСТЄ" to="МЕСТЕ" />
|
||||
<Word from="І\/ІАККЕЙШ" to="МАККЕЙН" />
|
||||
<Word from="НЗХОДЯЩИЄСЯ" to="НАХОДЯЩИЕСЯ" />
|
||||
<Word from="НЄЗКТИВНОМ" to="НЕАКТИВНОМ" />
|
||||
<Word from="ЗЭСНЯТЬ" to="ЗАСНЯТЬ" />
|
||||
<Word from="ОРГЗНИЗМЬІ" to="ОРГАНИЗМЫ" />
|
||||
<Word from="ВЗЕИМОДЄЙСТВОВЕТЬ" to="ВЗАИМОДЕЙСТВОВАТЬ" />
|
||||
<Word from="ПУТЄШЄСТБИЄ" to="ПУТЕШЕСТВИЕ" />
|
||||
<Word from="ПуСїЬІННЫХ" to="ПУСТЫННЫХ" />
|
||||
<Word from="ТЗКИХ" to="ТАКИХ" />
|
||||
<Word from="ПЄРЄТЗСКИВЗЄМ" to="ПЕРЕТАСКИВАЕМ" />
|
||||
<Word from="ЧТ0" to="ЧТО" />
|
||||
<Word from="ВЄСЬМЗ" to="ВЕСЬМА" />
|
||||
<Word from="ПОЛОСЗМИ" to="ПОЛОСАМИ" />
|
||||
<Word from="ОрїЭНИЗМЬІ" to="ОРГАНИЗМЫ" />
|
||||
<Word from="ОЁЛЗСТИ" to="ОБЛАСТИ" />
|
||||
<Word from="ЯБЛЯЮТСЯ" to="ЯВЛЯЮТСЯ" />
|
||||
<Word from="ЦЄЛЬЮ" to="ЦЕЛЬЮ" />
|
||||
<Word from="ПОИСКОБ" to="ПОИСКОВ" />
|
||||
<Word from="ДОКЗЗЗТЄІІЬСТВ" to="ДОКАЗАТЕЛЬСТВ" />
|
||||
<Word from="МОЖЄТ" to="МОЖЕТ" />
|
||||
<Word from="НЭХОДИТЬСЯ" to="НАХОДИТЬСЯ" />
|
||||
<Word from="ОЧЄНЬ" to="ОЧЕНЬ" />
|
||||
<Word from="СРЗВНИТЬ" to="СРАВНИТЬ" />
|
||||
<Word from="ОЄНЗРУЖИЛ" to="ОБНАРУЖИЛ" />
|
||||
<Word from="ЛЬДЗ" to="ЛЬДА" />
|
||||
<Word from="ПОТЄПЛЄНИЄІИ" to="ПОТЕПЛЕНИЕМ" />
|
||||
<Word from="ПОХОЛОДЗНИЄБД" to="ПОХОЛОДАНИЕМ" />
|
||||
<Word from="КЭК" to="КАК" />
|
||||
<Word from="ТЄЛО" to="ТЕЛО" />
|
||||
<Word from="бОЛЬШЄ" to="БОЛЬШЕ" />
|
||||
<Word from="НЭКЛОНЯЄТСЯ" to="НАКЛОНЯЕТСЯ" />
|
||||
<Word from="СОІІНЦУ" to="СОЛНЦУ" />
|
||||
<Word from="СТ3бИЛИЗИрОБЗТЬ" to="СТАБИЛИЗИРОВАТЬ" />
|
||||
<Word from="СТЭБИЛЬНЭ" to="СТАБИЛЬНА" />
|
||||
<Word from="МИЛІІИОНОВ" to="МИЛЛИОНОВ" />
|
||||
<Word from="НЗЗЭД" to="НАЗАД" />
|
||||
<Word from="ТЄПЛ0" to="ТЕПЛО" />
|
||||
<Word from="ПОІІЯРНЫХ" to="ПОЛЯРНЫХ" />
|
||||
<Word from="СОІІЕНЫМИ" to="СОЛЕНЫМИ" />
|
||||
<Word from="КЕКИМИ" to="КАКИМИ" />
|
||||
<Word from="кислютнюсггь" to="кислотность" />
|
||||
<Word from="ТЗМ" to="ТАМ" />
|
||||
<Word from="ОРГЗНИЗМЫ" to="ОРГАНИЗМЫ" />
|
||||
<Word from="СУЩЄСТВОВЄТЬ" to="СУЩЕСТВОВАТЬ" />
|
||||
<Word from="ВНИМЗНИЄ" to="ВНИМАНИЕ" />
|
||||
<Word from="СДЄЛЗЄТ" to="СДЕЛАЕТ" />
|
||||
<Word from="ПОЗНЭКОМИТЬСЯ" to="ПОЗНАКОМИТЬСЯ" />
|
||||
<Word from="НЭШИМ" to="НАШИМ" />
|
||||
<Word from="ДОКЗЗЭТЄЛЬСТБО" to="ДОКАЗАТЕЛЬСТВО" />
|
||||
<Word from="ЩЗЗЩЄНИЯ" to="ВРАЩЕНИЯ" />
|
||||
<Word from="бЬІЛ0" to="БЫЛО" />
|
||||
<Word from="ОЄЛЕСТЯХ" to="ОБЛАСТЯХ" />
|
||||
<Word from="бЬІЛИ" to="БЫЛИ" />
|
||||
<Word from="РЭЗМЬІШЛЯІІИ" to="РАЗМЫШЛЯЛИ" />
|
||||
<Word from="КОЛИЧЄСТБЄ" to="КОЛИЧЕСТВЕ" />
|
||||
<Word from="ЩЄІІОЧНЫЄ" to="ЩЕЛОЧНЫЕ" />
|
||||
<Word from="НЄКОТЩЗЬІЄ" to="НЕКОТОРЫЕ" />
|
||||
<Word from="ПрИБІ1ЕКуї" to="ПРИВЛЕКУТ" />
|
||||
<Word from="НЗЗЬІВЭЄМЫЄ" to="НАЗЫВАЕМЫЕ" />
|
||||
<Word from="Чї06Ы" to="ЧТОБЫ" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways />
|
||||
<PartialWords>
|
||||
<WordPart from="Є" to="Е" />
|
||||
<WordPart from="ЬІ" to="Ы" />
|
||||
<WordPart from="КЗ" to="КА" />
|
||||
<WordPart from="ЛЗ" to="ЛА" />
|
||||
<WordPart from="НЗ" to="НА" />
|
||||
<WordPart from="ШЗ" to="ША" />
|
||||
<WordPart from="І\/І" to="М" />
|
||||
</PartialWords>
|
||||
<PartialLines />
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions />
|
||||
</OCRFixReplaceList>
|
||||
+946
@@ -0,0 +1,946 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<!-- Simple abbreviations -->
|
||||
<Word from="KBs" to="kB" />
|
||||
<Word from="Vd" to="Ud" />
|
||||
<Word from="N°" to="N.°" />
|
||||
<Word from="n°" to="n.°" />
|
||||
<Word from="nro." to="n.°" />
|
||||
<Word from="Nro." to="N.°" />
|
||||
<!-- Basic spelling -->
|
||||
<Word from="aca" to="acá" />
|
||||
<Word from="actuas" to="actúas" />
|
||||
<Word from="actues" to="actúes" />
|
||||
<Word from="adios" to="adiós" />
|
||||
<Word from="agarrenla" to="agárrenla" />
|
||||
<Word from="agarrenlo" to="agárrenlo" />
|
||||
<Word from="agarrandose" to="agarrándose" />
|
||||
<Word from="algun" to="algún" />
|
||||
<Word from="alli" to="allí" />
|
||||
<Word from="alla" to="allá" />
|
||||
<Word from="alejate" to="aléjate" />
|
||||
<Word from="ahi" to="ahí" />
|
||||
<Word from="angel" to="ángel" />
|
||||
<Word from="angeles" to="ángeles" />
|
||||
<Word from="apagala" to="apágala" />
|
||||
<Word from="aqui" to="aquí" />
|
||||
<Word from="asi" to="así" />
|
||||
<Word from="bahia" to="bahía" />
|
||||
<Word from="busqueda" to="búsqueda" />
|
||||
<Word from="busquedas" to="búsquedas" />
|
||||
<Word from="callate" to="cállate" />
|
||||
<Word from="carcel" to="cárcel" />
|
||||
<Word from="camara" to="cámara" />
|
||||
<Word from="caido" to="caído" />
|
||||
<Word from="cabron" to="cabrón" />
|
||||
<Word from="camion" to="camión" />
|
||||
<Word from="codigo" to="código" />
|
||||
<Word from="codigos" to="códigos" />
|
||||
<Word from="comence" to="comencé" />
|
||||
<Word from="comprate" to="cómprate" />
|
||||
<Word from="consegui" to="conseguí" />
|
||||
<Word from="confias" to="confías" />
|
||||
<Word from="convertira" to="convertirá" />
|
||||
<Word from="corazon" to="corazón" />
|
||||
<Word from="crei" to="creí" />
|
||||
<Word from="creia" to="creía" />
|
||||
<Word from="creido" to="creído" />
|
||||
<Word from="creiste" to="creíste" />
|
||||
<Word from="cubrenos" to="cúbrenos" />
|
||||
<Word from="comio" to="comió" />
|
||||
<Word from="dara" to="dará" />
|
||||
<Word from="dia" to="día" />
|
||||
<Word from="dias" to="días" />
|
||||
<Word from="debio" to="debió" />
|
||||
<Word from="demelo" to="démelo" />
|
||||
<Word from="dimelo" to="dímelo" />
|
||||
<Word from="denoslo" to="dénoslo" />
|
||||
<Word from="deselo" to="déselo" />
|
||||
<Word from="decia" to="decía" />
|
||||
<Word from="decian" to="decían" />
|
||||
<Word from="detras" to="detrás" />
|
||||
<Word from="deberia" to="debería" />
|
||||
<Word from="deberas" to="deberás" />
|
||||
<Word from="deberias" to="deberías" />
|
||||
<Word from="deberian" to="deberían" />
|
||||
<Word from="deberiamos" to="deberíamos" />
|
||||
<Word from="dejame" to="déjame" />
|
||||
<Word from="dejate" to="déjate" />
|
||||
<Word from="dejalo" to="déjalo" />
|
||||
<Word from="dejarian" to="dejarían" />
|
||||
<Word from="damela" to="dámela" />
|
||||
<Word from="despues" to="después" />
|
||||
<Word from="diciendome" to="diciéndome" />
|
||||
<Word from="dificil" to="difícil" />
|
||||
<Word from="dificiles" to="difíciles" />
|
||||
<Word from="disculpate" to="discúlpate" />
|
||||
<Word from="dolares" to="dólares" />
|
||||
<Word from="hechar" to="echar" />
|
||||
<Word from="examenes" to="exámenes" />
|
||||
<Word from="empezo" to="empezó" />
|
||||
<Word from="empujon" to="empujón" />
|
||||
<Word from="empujalo" to="empújalo" />
|
||||
<Word from="escondanme" to="escóndanme" />
|
||||
<Word from="esperame" to="espérame" />
|
||||
<Word from="estara" to="estará" />
|
||||
<Word from="estare" to="estaré" />
|
||||
<Word from="estaria" to="estaría" />
|
||||
<Word from="estan" to="están" />
|
||||
<Word from="estaran" to="estarán" />
|
||||
<Word from="estabamos" to="estábamos" />
|
||||
<Word from="estuvieramos" to="estuviéramos" />
|
||||
<Word from="exito" to="éxito" />
|
||||
<Word from="facil" to="fácil" />
|
||||
<Word from="fiscalia" to="fiscalía" />
|
||||
<Word from="fragil" to="frágil" />
|
||||
<Word from="fragiles" to="frágiles" />
|
||||
<Word from="frances" to="francés" />
|
||||
<Word from="gustaria" to="gustaría" />
|
||||
<Word from="habia" to="había" />
|
||||
<Word from="habias" to="habías" />
|
||||
<Word from="habian" to="habían" />
|
||||
<Word from="habrian" to="habrían" />
|
||||
<Word from="habrias" to="habrías" />
|
||||
<Word from="hagalo" to="hágalo" />
|
||||
<Word from="haria" to="haría" />
|
||||
<Word from="increible" to="increíble" />
|
||||
<Word from="incredulo" to="incrédulo" />
|
||||
<Word from="intentalo" to="inténtalo" />
|
||||
<Word from="ire" to="iré" />
|
||||
<Word from="jovenes" to="jóvenes" />
|
||||
<Word from="ladron" to="ladrón" />
|
||||
<Word from="linea" to="línea" />
|
||||
<Word from="llamame" to="llámame" />
|
||||
<Word from="llevalo" to="llévalo" />
|
||||
<Word from="mama" to="mamá" />
|
||||
<Word from="maricon" to="maricón" />
|
||||
<Word from="mayoria" to="mayoría" />
|
||||
<Word from="metodo" to="método" />
|
||||
<Word from="metodos" to="métodos" />
|
||||
<Word from="mio" to="mío" />
|
||||
<Word from="mostro" to="mostró" />
|
||||
<Word from="morira" to="morirá" />
|
||||
<Word from="muevete" to="muévete" />
|
||||
<Word from="murio" to="murió" />
|
||||
<Word from="numero" to="número" />
|
||||
<Word from="numeros" to="números" />
|
||||
<Word from="ningun" to="ningún" />
|
||||
<Word from="oido" to="oído" />
|
||||
<Word from="oidos" to="oídos" />
|
||||
<Word from="oimos" to="oímos" />
|
||||
<Word from="oiste" to="oíste" />
|
||||
<Word from="pasale" to="pásale" />
|
||||
<Word from="pasame" to="pásame" />
|
||||
<Word from="paraiso" to="paraíso" />
|
||||
<Word from="parate" to="párate" />
|
||||
<Word from="pense" to="pensé" />
|
||||
<Word from="peluqueria" to="peluquería" />
|
||||
<Word from="platano" to="plátano" />
|
||||
<Word from="plastico" to="plástico" />
|
||||
<Word from="plasticos" to="plásticos" />
|
||||
<Word from="policia" to="policía" />
|
||||
<Word from="policias" to="policías" />
|
||||
<Word from="poster" to="póster" />
|
||||
<Word from="podia" to="podía" />
|
||||
<Word from="podias" to="podías" />
|
||||
<Word from="podria" to="podría" />
|
||||
<Word from="podrian" to="podrían" />
|
||||
<Word from="podrias" to="podrías" />
|
||||
<Word from="podriamos" to="podríamos" />
|
||||
<Word from="prometio" to="prometió" />
|
||||
<Word from="proposito" to="propósito" />
|
||||
<Word from="pideselo" to="pídeselo" />
|
||||
<Word from="ponganse" to="pónganse" />
|
||||
<Word from="prometeme" to="prométeme" />
|
||||
<Word from="publico" to="público" />
|
||||
<Word from="publicos" to="públicos" />
|
||||
<Word from="publicamente" to="públicamente" />
|
||||
<Word from="quedate" to="quédate" />
|
||||
<Word from="queria" to="quería" />
|
||||
<Word from="querrias" to="querrías" />
|
||||
<Word from="querian" to="querían" />
|
||||
<Word from="rapido" to="rápido" />
|
||||
<Word from="rapidamente" to="rápidamente" />
|
||||
<Word from="razon" to="razón" />
|
||||
<Word from="rehusen" to="rehúsen" />
|
||||
<Word from="rie" to="ríe" />
|
||||
<Word from="rias" to="rías" />
|
||||
<Word from="rindete" to="ríndete" />
|
||||
<Word from="sacame" to="sácame" />
|
||||
<Word from="sentian" to="sentían" />
|
||||
<Word from="sientate" to="siéntate" />
|
||||
<Word from="sera" to="será" />
|
||||
<Word from="soplon" to="soplón" />
|
||||
<Word from="sueltalo" to="suéltalo" />
|
||||
<Word from="tambien" to="también" />
|
||||
<Word from="teoria" to="teoría" />
|
||||
<Word from="tendra" to="tendrá" />
|
||||
<Word from="telefono" to="teléfono" />
|
||||
<Word from="tipica" to="típica" />
|
||||
<Word from="todavia" to="todavía" />
|
||||
<Word from="tomalo" to="tómalo" />
|
||||
<Word from="tonterias" to="tonterías" />
|
||||
<Word from="torci" to="torcí" />
|
||||
<Word from="traelos" to="tráelos" />
|
||||
<Word from="traiganlo" to="tráiganlo" />
|
||||
<Word from="traiganlos" to="tráiganlos" />
|
||||
<Word from="trio" to="trío" />
|
||||
<Word from="tuvieramos" to="tuviéramos" />
|
||||
<Word from="union" to="unión" />
|
||||
<Word from="ultimo" to="último" />
|
||||
<Word from="ultima" to="última" />
|
||||
<Word from="ultimos" to="últimos" />
|
||||
<Word from="ultimas" to="últimas" />
|
||||
<Word from="unica" to="única" />
|
||||
<Word from="unico" to="único" />
|
||||
<Word from="vamonos" to="vámonos" />
|
||||
<Word from="vayanse" to="váyanse" />
|
||||
<Word from="victima" to="víctima" />
|
||||
<Word from="vivira" to="vivirá" />
|
||||
<Word from="volvio" to="volvió" />
|
||||
<Word from="volvia" to="volvía" />
|
||||
<Word from="volvian" to="volvían" />
|
||||
<!-- Most common words ending in -eír/-oír -->
|
||||
<Word from="reir" to="reír" />
|
||||
<Word from="freir" to="freír" />
|
||||
<Word from="sonreir" to="sonreír" />
|
||||
<Word from="hazmerreir" to="hazmerreír" />
|
||||
<Word from="oir" to="oír" />
|
||||
<Word from="oirlo" to="oírlo" />
|
||||
<Word from="oirte" to="oírte" />
|
||||
<Word from="oirse" to="oírse" />
|
||||
<Word from="oirme" to="oírme" />
|
||||
<Word from="oirle" to="oírle" />
|
||||
<Word from="oirla" to="oírla" />
|
||||
<Word from="oirles" to="oírles" />
|
||||
<Word from="oirnos" to="oírnos" />
|
||||
<Word from="oirlas" to="oírlas" />
|
||||
<!-- Words that take no accent mark -->
|
||||
<Word from="bién" to="bien" />
|
||||
<Word from="crímen" to="crimen" />
|
||||
<Word from="fué" to="fue" />
|
||||
<Word from="fuí" to="fui" />
|
||||
<Word from="quiéres" to="quieres" />
|
||||
<Word from="tí" to="ti" />
|
||||
<Word from="dí" to="di" />
|
||||
<Word from="vá" to="va" />
|
||||
<Word from="vé" to="ve" />
|
||||
<Word from="ví" to="vi" />
|
||||
<Word from="vió" to="vio" />
|
||||
<Word from="ó" to="o" />
|
||||
<Word from="clón" to="clon" />
|
||||
<Word from="dió" to="dio" />
|
||||
<Word from="guión" to="guion" />
|
||||
<Word from="dón" to="don" />
|
||||
<Word from="fé" to="fe" />
|
||||
<Word from="áquel" to="aquel" />
|
||||
<!-- Words where the diacritical accent can be omitted -->
|
||||
<Word from="éste" to="este" />
|
||||
<Word from="ésta" to="esta" />
|
||||
<Word from="éstos" to="estos" />
|
||||
<Word from="éstas" to="estas" />
|
||||
<Word from="ése" to="ese" />
|
||||
<Word from="ésa" to="esa" />
|
||||
<Word from="ésos" to="esos" />
|
||||
<Word from="ésas" to="esas" />
|
||||
<Word from="sólo" to="solo" />
|
||||
<!-- Errors unrelated to accent marks -->
|
||||
<Word from="coktel" to="cóctel" />
|
||||
<Word from="cocktel" to="cóctel" />
|
||||
<Word from="conciente" to="consciente" />
|
||||
<Word from="comenzé" to="comencé" />
|
||||
<Word from="desilucionarte" to="desilusionarte" />
|
||||
<Word from="dijieron" to="dijeron" />
|
||||
<Word from="empezé" to="empecé" />
|
||||
<Word from="hize" to="hice" />
|
||||
<Word from="ilucionarte" to="ilusionarte" />
|
||||
<Word from="inconciente" to="inconsciente" />
|
||||
<Word from="quize" to="quise" />
|
||||
<Word from="quizo" to="quiso" />
|
||||
<Word from="verguenza" to="vergüenza" />
|
||||
<!-- Errors in proper names and country names -->
|
||||
<Word from="Nuñez" to="Núñez" />
|
||||
<Word from="Ivan" to="Iván" />
|
||||
<Word from="Japon" to="Japón" />
|
||||
<Word from="Monica" to="Mónica" />
|
||||
<Word from="Maria" to="María" />
|
||||
<Word from="Jose" to="José" />
|
||||
<Word from="Ramon" to="Ramón" />
|
||||
<Word from="Garcia" to="García" />
|
||||
<Word from="Gonzalez" to="González" />
|
||||
<Word from="Jesus" to="Jesús" />
|
||||
<Word from="Alvarez" to="Álvarez" />
|
||||
<Word from="Damian" to="Damián" />
|
||||
<Word from="Rene" to="René" />
|
||||
<Word from="Nicolas" to="Nicolás" />
|
||||
<Word from="Jonas" to="Jonás" />
|
||||
<Word from="Lopez" to="López" />
|
||||
<Word from="Hernandez" to="Hernández" />
|
||||
<Word from="Bermudez" to="Bermúdez" />
|
||||
<Word from="Fernandez" to="Fernández" />
|
||||
<Word from="Suarez" to="Suárez" />
|
||||
<Word from="Sofia" to="Sofía" />
|
||||
<Word from="Seneca" to="Séneca" />
|
||||
<Word from="Tokyo" to="Tokio" />
|
||||
<Word from="Canada" to="Canadá" />
|
||||
<Word from="Paris" to="París" />
|
||||
<Word from="Turquia" to="Turquía" />
|
||||
<Word from="Mexico" to="México" />
|
||||
<Word from="Mejico" to="México" />
|
||||
<Word from="Matias" to="Matías" />
|
||||
<Word from="Valentin" to="Valentín" />
|
||||
<Word from="mejicano" to="mexicano" />
|
||||
<Word from="mejicanos" to="mexicanos" />
|
||||
<Word from="mejicana" to="mexicana" />
|
||||
<Word from="mejicanas" to="mexicanas" />
|
||||
<!-- Created by SE -->
|
||||
<Word from="io" to="lo" />
|
||||
<Word from="ia" to="la" />
|
||||
<Word from="ie" to="le" />
|
||||
<Word from="Io" to="lo" />
|
||||
<Word from="Ia" to="la" />
|
||||
<Word from="AI" to="Al" />
|
||||
<Word from="Ie" to="le" />
|
||||
<Word from="EI" to="El" />
|
||||
<Word from="subafluente" to="subafluente" />
|
||||
<Word from="aflójalo" to="aflójalo" />
|
||||
<Word from="Aflójalo" to="Aflójalo" />
|
||||
<Word from="perdi" to="perdí" />
|
||||
<Word from="Podria" to="Podría" />
|
||||
<Word from="confia" to="confía" />
|
||||
<Word from="pasaria" to="pasaría" />
|
||||
<Word from="Podias" to="Podías" />
|
||||
<Word from="responsabke" to="responsable" />
|
||||
<Word from="Todavia" to="Todavía" />
|
||||
<Word from="envien" to="envíen" />
|
||||
<Word from="Queria" to="Quería" />
|
||||
<Word from="tio" to="tío" />
|
||||
<Word from="traido" to="traído" />
|
||||
<Word from="Asi" to="Así" />
|
||||
<Word from="elegi" to="elegí" />
|
||||
<Word from="habria" to="habría" />
|
||||
<Word from="encantaria" to="encantaría" />
|
||||
<Word from="leido" to="leído" />
|
||||
<Word from="conocias" to="conocías" />
|
||||
<Word from="harias" to="harías" />
|
||||
<Word from="Aqui" to="Aquí" />
|
||||
<Word from="decidi" to="decidí" />
|
||||
<Word from="mia" to="mía" />
|
||||
<Word from="Crei" to="Creí" />
|
||||
<Word from="podiamos" to="podíamos" />
|
||||
<Word from="avisame" to="avísame" />
|
||||
<Word from="debia" to="debía" />
|
||||
<Word from="pensarias" to="pensarías" />
|
||||
<Word from="reuniamos" to="reuníamos" />
|
||||
<Word from="POÏ" to="por" />
|
||||
<Word from="vendria" to="vendría" />
|
||||
<Word from="caida" to="caída" />
|
||||
<Word from="venian" to="venían" />
|
||||
<Word from="compañias" to="compañías" />
|
||||
<Word from="leiste" to="leíste" />
|
||||
<Word from="Leiste" to="Leíste" />
|
||||
<Word from="fiaria" to="fiaría" />
|
||||
<Word from="Hungria" to="Hungría" />
|
||||
<Word from="fotografia" to="fotografía" />
|
||||
<Word from="cafeteria" to="cafetería" />
|
||||
<Word from="Digame" to="Dígame" />
|
||||
<Word from="debias" to="debías" />
|
||||
<Word from="tendria" to="tendría" />
|
||||
<Word from="CÏGO" to="creo" />
|
||||
<Word from="anteg" to="antes" />
|
||||
<Word from="SóIo" to="Solo" />
|
||||
<Word from="Ilamándola" to="llamándola" />
|
||||
<Word from="Cáflaté" to="Cállate" />
|
||||
<Word from="Ilamaste" to="llamaste" />
|
||||
<Word from="daria" to="daría" />
|
||||
<Word from="Iargaba" to="largaba" />
|
||||
<Word from="Yati" to="Y a ti" />
|
||||
<Word from="querias" to="querías" />
|
||||
<Word from="Iimpiarlo" to="limpiarlo" />
|
||||
<Word from="Iargado" to="largado" />
|
||||
<Word from="galeria" to="galería" />
|
||||
<Word from="Bartomeu" to="Bertomeu" />
|
||||
<Word from="Iocalizarlo" to="localizarlo" />
|
||||
<Word from="Ilámame" to="llámame" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways />
|
||||
<PartialWords />
|
||||
<PartialLines>
|
||||
<!-- Miscellaneous -->
|
||||
<LinePart from="de gratis" to="gratis" />
|
||||
<LinePart from="si quiera" to="siquiera" />
|
||||
<LinePart from="Cada una de los" to="Cada uno de los" />
|
||||
<LinePart from="Cada uno de las" to="Cada una de las" />
|
||||
<!-- Incorrect use of haber / a ver -->
|
||||
<LinePart from="haber que" to="a ver qué" />
|
||||
<LinePart from="haber qué" to="a ver qué" />
|
||||
<LinePart from="Haber si" to="A ver si" />
|
||||
<!-- Exclamatory or interrogative pronouns, part 1 -->
|
||||
<LinePart from=" que hora" to=" qué hora" />
|
||||
<LinePart from="yo que se" to="yo qué sé" />
|
||||
<LinePart from="Yo que se" to="Yo qué sé" />
|
||||
<!-- Accents before a closing exclamation mark -->
|
||||
<LinePart from=" tu!" to=" tú!" />
|
||||
<LinePart from=" si!" to=" sí!" />
|
||||
<LinePart from=" mi!" to=" mí!" />
|
||||
<LinePart from=" el!" to=" él!" />
|
||||
<!-- Accents before a closing question mark -->
|
||||
<LinePart from=" tu?" to=" tú?" />
|
||||
<LinePart from=" si?" to=" sí?" />
|
||||
<LinePart from=" mi?" to=" mí?" />
|
||||
<LinePart from=" el?" to=" él?" />
|
||||
<LinePart from=" aun?" to=" aún?" />
|
||||
<LinePart from=" mas?" to=" más?" />
|
||||
<LinePart from=" que?" to=" qué?" />
|
||||
<LinePart from=" paso?" to=" pasó?" />
|
||||
<LinePart from=" cuando?" to=" cuándo?" />
|
||||
<LinePart from=" cuanto?" to=" cuánto?" />
|
||||
<LinePart from=" cuanta?" to=" cuánta?" />
|
||||
<LinePart from=" cuantas?" to=" cuántas?" />
|
||||
<LinePart from=" cuantos?" to=" cuántos?" />
|
||||
<LinePart from=" donde?" to=" dónde?" />
|
||||
<LinePart from=" quien?" to=" quién?" />
|
||||
<LinePart from=" como?" to=" cómo?" />
|
||||
<LinePart from=" adonde?" to=" adónde?" />
|
||||
<LinePart from=" cual?" to=" cuál?" />
|
||||
<!-- Accents in complete questions -->
|
||||
<LinePart from="¿Si?" to="¿Sí?" />
|
||||
<LinePart from="¿esta bien?" to="¿está bien?" />
|
||||
<!-- Sentences that are both interrogative and exclamatory -->
|
||||
<LinePart from="¿Pero qué haces?" to="¡¿Pero qué haces?!" />
|
||||
<LinePart from="¿pero qué haces?" to="¡¿pero qué haces?!" />
|
||||
<LinePart from="¿Es que no me has escuchado?" to="¡¿Es que no me has escuchado?!" />
|
||||
<LinePart from="¿es que no me has escuchado?" to="¡¿es que no me has escuchado?!" />
|
||||
<!-- Accents after an opening question mark, lowercase -->
|
||||
<LinePart from="¿aun" to="¿aún" />
|
||||
<LinePart from="¿tu " to="¿tú " />
|
||||
<LinePart from="¿que " to="¿qué " />
|
||||
<LinePart from="¿sabes que" to="¿sabes qué" />
|
||||
<LinePart from="¿sabes adonde" to="¿sabes adónde" />
|
||||
<LinePart from="¿sabes cual" to="¿sabes cuál" />
|
||||
<LinePart from="¿sabes quien" to="¿sabes quién" />
|
||||
<LinePart from="¿sabes como" to="¿sabes cómo" />
|
||||
<LinePart from="¿sabes cuan" to="¿sabes cuán" />
|
||||
<LinePart from="¿sabes cuanto" to="¿sabes cuánto" />
|
||||
<LinePart from="¿sabes cuanta" to="¿sabes cuánta" />
|
||||
<LinePart from="¿sabes cuantos" to="¿sabes cuántos" />
|
||||
<LinePart from="¿sabes cuantas" to="¿sabes cuántas" />
|
||||
<LinePart from="¿sabes cuando" to="¿sabes cuándo" />
|
||||
<LinePart from="¿sabes donde" to="¿sabes dónde" />
|
||||
<LinePart from="¿sabe que" to="¿sabe qué" />
|
||||
<LinePart from="¿sabe adonde" to="¿sabe adónde" />
|
||||
<LinePart from="¿sabe cual" to="¿sabe cuál" />
|
||||
<LinePart from="¿sabe quien" to="¿sabe quién" />
|
||||
<LinePart from="¿sabe como" to="¿sabe cómo" />
|
||||
<LinePart from="¿sabe cuan" to="¿sabe cuán" />
|
||||
<LinePart from="¿sabe cuanto" to="¿sabe cuánto" />
|
||||
<LinePart from="¿sabe cuanta" to="¿sabe cuánta" />
|
||||
<LinePart from="¿sabe cuantos" to="¿sabe cuántos" />
|
||||
<LinePart from="¿sabe cuantas" to="¿sabe cuántas" />
|
||||
<LinePart from="¿sabe cuando" to="¿sabe cuándo" />
|
||||
<LinePart from="¿sabe donde" to="¿sabe dónde" />
|
||||
<LinePart from="¿saben que" to="¿saben qué" />
|
||||
<LinePart from="¿saben adonde" to="¿saben adónde" />
|
||||
<LinePart from="¿saben cual" to="¿saben cuál" />
|
||||
<LinePart from="¿saben quien" to="¿saben quién" />
|
||||
<LinePart from="¿saben como" to="¿saben cómo" />
|
||||
<LinePart from="¿saben cuan" to="¿saben cuán" />
|
||||
<LinePart from="¿saben cuanto" to="¿saben cuánto" />
|
||||
<LinePart from="¿saben cuanta" to="¿saben cuánta" />
|
||||
<LinePart from="¿saben cuantos" to="¿saben cuántos" />
|
||||
<LinePart from="¿saben cuantas" to="¿saben cuántas" />
|
||||
<LinePart from="¿saben cuando" to="¿saben cuándo" />
|
||||
<LinePart from="¿saben donde" to="¿saben dónde" />
|
||||
<LinePart from="¿de que" to="¿de qué" />
|
||||
<LinePart from="¿de donde" to="¿de dónde" />
|
||||
<LinePart from="¿de cual" to="¿de cuál" />
|
||||
<LinePart from="¿de quien" to="¿de quién" />
|
||||
<LinePart from="¿de cuanto" to="¿de cuánto" />
|
||||
<LinePart from="¿de cuanta" to="¿de cuánta" />
|
||||
<LinePart from="¿de cuantos" to="¿de cuántos" />
|
||||
<LinePart from="¿de cuantas" to="¿de cuántas" />
|
||||
<LinePart from="¿de cuando" to="¿de cuándo" />
|
||||
<LinePart from="¿sobre que" to="¿sobre qué" />
|
||||
<LinePart from="¿como " to="¿cómo " />
|
||||
<LinePart from="¿cual " to="¿cuál " />
|
||||
<LinePart from="¿en cual" to="¿en cuál" />
|
||||
<LinePart from="¿cuando" to="¿cuándo" />
|
||||
<LinePart from="¿hasta cual" to="¿hasta cuál" />
|
||||
<LinePart from="¿hasta quien" to="¿hasta quién" />
|
||||
<LinePart from="¿hasta cuanto" to="¿hasta cuánto" />
|
||||
<LinePart from="¿hasta cuantas" to="¿hasta cuántas" />
|
||||
<LinePart from="¿hasta cuantos" to="¿hasta cuántos" />
|
||||
<LinePart from="¿hasta cuando" to="¿hasta cuándo" />
|
||||
<LinePart from="¿hasta donde" to="¿hasta dónde" />
|
||||
<LinePart from="¿hasta que" to="¿hasta qué" />
|
||||
<LinePart from="¿hasta adonde" to="¿hasta adónde" />
|
||||
<LinePart from="¿desde que" to="¿desde qué" />
|
||||
<LinePart from="¿desde cuando" to="¿desde cuándo" />
|
||||
<LinePart from="¿desde quien" to="¿desde quién" />
|
||||
<LinePart from="¿desde donde" to="¿desde dónde" />
|
||||
<LinePart from="¿cuanto" to="¿cuánto" />
|
||||
<LinePart from="¿cuantos" to="¿cuántos" />
|
||||
<LinePart from="¿donde" to="¿dónde" />
|
||||
<LinePart from="¿adonde" to="¿adónde" />
|
||||
<LinePart from="¿con que" to="¿con qué" />
|
||||
<LinePart from="¿con cual" to="¿con cuál" />
|
||||
<LinePart from="¿con quien" to="¿con quién" />
|
||||
<LinePart from="¿con cuantos" to="¿con cuántos" />
|
||||
<LinePart from="¿con cuantas" to="¿con cuántas" />
|
||||
<LinePart from="¿con cuanta" to="¿con cuánta" />
|
||||
<LinePart from="¿con cuanto" to="¿con cuánto" />
|
||||
<LinePart from="¿para donde" to="¿para dónde" />
|
||||
<LinePart from="¿para adonde" to="¿para adónde" />
|
||||
<LinePart from="¿para cuando" to="¿para cuándo" />
|
||||
<LinePart from="¿para que" to="¿para qué" />
|
||||
<LinePart from="¿para quien" to="¿para quién" />
|
||||
<LinePart from="¿para cuanto" to="¿para cuánto" />
|
||||
<LinePart from="¿para cuanta" to="¿para cuánta" />
|
||||
<LinePart from="¿para cuantos" to="¿para cuántos" />
|
||||
<LinePart from="¿para cuantas" to="¿para cuántas" />
|
||||
<LinePart from="¿a donde" to="¿a dónde" />
|
||||
<LinePart from="¿a que" to="¿a qué" />
|
||||
<LinePart from="¿a cual" to="¿a cuál" />
|
||||
<LinePart from="¿a quien" to="¿a quién" />
|
||||
<LinePart from="¿a como" to="¿a cómo" />
|
||||
<LinePart from="¿a cuanto" to="¿a cuánto" />
|
||||
<LinePart from="¿a cuanta" to="¿a cuánta" />
|
||||
<LinePart from="¿a cuantos" to="¿a cuántos" />
|
||||
<LinePart from="¿a cuantas" to="¿a cuántas" />
|
||||
<LinePart from="¿por que" to="¿por qué" />
|
||||
<LinePart from="¿por cual" to="¿por cuál" />
|
||||
<LinePart from="¿por quien" to="¿por quién" />
|
||||
<LinePart from="¿por cuanto" to="¿por cuánto" />
|
||||
<LinePart from="¿por cuanta" to="¿por cuánta" />
|
||||
<LinePart from="¿por cuantos" to="¿por cuántos" />
|
||||
<LinePart from="¿por cuantas" to="¿por cuántas" />
|
||||
<LinePart from="¿por donde" to="¿por dónde" />
|
||||
<LinePart from="¿porque" to="¿por qué" />
|
||||
<LinePart from="¿porqué" to="¿por qué" />
|
||||
<LinePart from="¿y que" to="¿y qué" />
|
||||
<LinePart from="¿y como" to="¿y cómo" />
|
||||
<LinePart from="¿y cuando" to="¿y cuándo" />
|
||||
<LinePart from="¿y cual" to="¿y cuál" />
|
||||
<LinePart from="¿y quien" to="¿y quién" />
|
||||
<LinePart from="¿y cuanto" to="¿y cuánto" />
|
||||
<LinePart from="¿y cuanta" to="¿y cuánta" />
|
||||
<LinePart from="¿y cuantos" to="¿y cuántos" />
|
||||
<LinePart from="¿y cuantas" to="¿y cuántas" />
|
||||
<LinePart from="¿y donde" to="¿y dónde" />
|
||||
<LinePart from="¿y adonde" to="¿y adónde" />
|
||||
<LinePart from="¿quien " to="¿quién " />
|
||||
<LinePart from="¿esta " to="¿está " />
|
||||
<LinePart from="¿estas " to="¿estás " />
|
||||
<!-- Accents after the opening question mark (capitalized forms) -->
|
||||
<LinePart from="¿Aun" to="¿Aún" />
|
||||
<LinePart from="¿Que " to="¿Qué " />
|
||||
<LinePart from="¿Sabes que" to="¿Sabes qué" />
|
||||
<LinePart from="¿Sabes adonde" to="¿Sabes adónde" />
|
||||
<LinePart from="¿Sabes cual" to="¿Sabes cuál" />
|
||||
<LinePart from="¿Sabes quien" to="¿Sabes quién" />
|
||||
<LinePart from="¿Sabes como" to="¿Sabes cómo" />
|
||||
<LinePart from="¿Sabes cuan" to="¿Sabes cuán" />
|
||||
<LinePart from="¿Sabes cuanto" to="¿Sabes cuánto" />
|
||||
<LinePart from="¿Sabes cuanta" to="¿Sabes cuánta" />
|
||||
<LinePart from="¿Sabes cuantos" to="¿Sabes cuántos" />
|
||||
<LinePart from="¿Sabes cuantas" to="¿Sabes cuántas" />
|
||||
<LinePart from="¿Sabes cuando" to="¿Sabes cuándo" />
|
||||
<LinePart from="¿Sabes donde" to="¿Sabes dónde" />
|
||||
<LinePart from="¿Sabe que" to="¿Sabe qué" />
|
||||
<LinePart from="¿Sabe adonde" to="¿Sabe adónde" />
|
||||
<LinePart from="¿Sabe cual" to="¿Sabe cuál" />
|
||||
<LinePart from="¿Sabe quien" to="¿Sabe quién" />
|
||||
<LinePart from="¿Sabe como" to="¿Sabe cómo" />
|
||||
<LinePart from="¿Sabe cuan" to="¿Sabe cuán" />
|
||||
<LinePart from="¿Sabe cuanto" to="¿Sabe cuánto" />
|
||||
<LinePart from="¿Sabe cuanta" to="¿Sabe cuánta" />
|
||||
<LinePart from="¿Sabe cuantos" to="¿Sabe cuántos" />
|
||||
<LinePart from="¿Sabe cuantas" to="¿Sabe cuántas" />
|
||||
<LinePart from="¿Sabe cuando" to="¿Sabe cuándo" />
|
||||
<LinePart from="¿Sabe donde" to="¿Sabe dónde" />
|
||||
<LinePart from="¿Saben que" to="¿Saben qué" />
|
||||
<LinePart from="¿Saben adonde" to="¿Saben adónde" />
|
||||
<LinePart from="¿Saben cual" to="¿Saben cuál" />
|
||||
<LinePart from="¿Saben quien" to="¿Saben quién" />
|
||||
<LinePart from="¿Saben como" to="¿Saben cómo" />
|
||||
<LinePart from="¿Saben cuan" to="¿Saben cuán" />
|
||||
<LinePart from="¿Saben cuanto" to="¿Saben cuánto" />
|
||||
<LinePart from="¿Saben cuanta" to="¿Saben cuánta" />
|
||||
<LinePart from="¿Saben cuantos" to="¿Saben cuántos" />
|
||||
<LinePart from="¿Saben cuantas" to="¿Saben cuántas" />
|
||||
<LinePart from="¿Saben cuando" to="¿Saben cuándo" />
|
||||
<LinePart from="¿Saben donde" to="¿Saben dónde" />
|
||||
<LinePart from="¿De que" to="¿De qué" />
|
||||
<LinePart from="¿De donde" to="¿De dónde" />
|
||||
<LinePart from="¿De cual" to="¿De cuál" />
|
||||
<LinePart from="¿De quien" to="¿De quién" />
|
||||
<LinePart from="¿De cuanto" to="¿De cuánto" />
|
||||
<LinePart from="¿De cuanta" to="¿De cuánta" />
|
||||
<LinePart from="¿De cuantos" to="¿De cuántos" />
|
||||
<LinePart from="¿De cuantas" to="¿De cuántas" />
|
||||
<LinePart from="¿De cuando" to="¿De cuándo" />
|
||||
<LinePart from="¿Desde que" to="¿Desde qué" />
|
||||
<LinePart from="¿Desde cuando" to="¿Desde cuándo" />
|
||||
<LinePart from="¿Desde quien" to="¿Desde quién" />
|
||||
<LinePart from="¿Desde donde" to="¿Desde dónde" />
|
||||
<LinePart from="¿Sobre que" to="¿Sobre qué" />
|
||||
<LinePart from="¿Como " to="¿Cómo " />
|
||||
<LinePart from="¿Cual " to="¿Cuál " />
|
||||
<LinePart from="¿En cual" to="¿En cuál" />
|
||||
<LinePart from="¿Cuando" to="¿Cuándo" />
|
||||
<LinePart from="¿Hasta cual" to="¿Hasta cuál" />
|
||||
<LinePart from="¿Hasta quien" to="¿Hasta quién" />
|
||||
<LinePart from="¿Hasta cuanto" to="¿Hasta cuánto" />
|
||||
<LinePart from="¿Hasta cuantas" to="¿Hasta cuántas" />
|
||||
<LinePart from="¿Hasta cuantos" to="¿Hasta cuántos" />
|
||||
<LinePart from="¿Hasta cuando" to="¿Hasta cuándo" />
|
||||
<LinePart from="¿Hasta donde" to="¿Hasta dónde" />
|
||||
<LinePart from="¿Hasta que" to="¿Hasta qué" />
|
||||
<LinePart from="¿Hasta adonde" to="¿Hasta adónde" />
|
||||
<LinePart from="¿Cuanto" to="¿Cuánto" />
|
||||
<LinePart from="¿Cuantos" to="¿Cuántos" />
|
||||
<LinePart from="¿Donde" to="¿Dónde" />
|
||||
<LinePart from="¿Adonde" to="¿Adónde" />
|
||||
<LinePart from="¿Con que" to="¿Con qué" />
|
||||
<LinePart from="¿Con cual" to="¿Con cuál" />
|
||||
<LinePart from="¿Con quien" to="¿Con quién" />
|
||||
<LinePart from="¿Con cuantos" to="¿Con cuántos" />
|
||||
<LinePart from="¿Con cuantas" to="¿Con cuántas" />
|
||||
<LinePart from="¿Con cuanta" to="¿Con cuánta" />
|
||||
<LinePart from="¿Con cuanto" to="¿Con cuánto" />
|
||||
<LinePart from="¿Para donde" to="¿Para dónde" />
|
||||
<LinePart from="¿Para adonde" to="¿Para adónde" />
|
||||
<LinePart from="¿Para cuando" to="¿Para cuándo" />
|
||||
<LinePart from="¿Para que" to="¿Para qué" />
|
||||
<LinePart from="¿Para quien" to="¿Para quién" />
|
||||
<LinePart from="¿Para cuanto" to="¿Para cuánto" />
|
||||
<LinePart from="¿Para cuanta" to="¿Para cuánta" />
|
||||
<LinePart from="¿Para cuantos" to="¿Para cuántos" />
|
||||
<LinePart from="¿Para cuantas" to="¿Para cuántas" />
|
||||
<LinePart from="¿A donde" to="¿A dónde" />
|
||||
<LinePart from="¿A que" to="¿A qué" />
|
||||
<LinePart from="¿A cual" to="¿A cuál" />
|
||||
<LinePart from="¿A quien" to="¿A quién" />
|
||||
<LinePart from="¿A como" to="¿A cómo" />
|
||||
<LinePart from="¿A cuanto" to="¿A cuánto" />
|
||||
<LinePart from="¿A cuanta" to="¿A cuánta" />
|
||||
<LinePart from="¿A cuantos" to="¿A cuántos" />
|
||||
<LinePart from="¿A cuantas" to="¿A cuántas" />
|
||||
<LinePart from="¿Por que" to="¿Por qué" />
|
||||
<LinePart from="¿Por cual" to="¿Por cuál" />
|
||||
<LinePart from="¿Por quien" to="¿Por quién" />
|
||||
<LinePart from="¿Por cuanto" to="¿Por cuánto" />
|
||||
<LinePart from="¿Por cuanta" to="¿Por cuánta" />
|
||||
<LinePart from="¿Por cuantos" to="¿Por cuántos" />
|
||||
<LinePart from="¿Por cuantas" to="¿Por cuántas" />
|
||||
<LinePart from="¿Por donde" to="¿Por dónde" />
|
||||
<LinePart from="¿Porque" to="¿Por qué" />
|
||||
<LinePart from="¿Porqué" to="¿Por qué" />
|
||||
<LinePart from="¿Y que" to="¿Y qué" />
|
||||
<LinePart from="¿Y como" to="¿Y cómo" />
|
||||
<LinePart from="¿Y cuando" to="¿Y cuándo" />
|
||||
<LinePart from="¿Y cual" to="¿Y cuál" />
|
||||
<LinePart from="¿Y quien" to="¿Y quién" />
|
||||
<LinePart from="¿Y cuanto" to="¿Y cuánto" />
|
||||
<LinePart from="¿Y cuanta" to="¿Y cuánta" />
|
||||
<LinePart from="¿Y cuantos" to="¿Y cuántos" />
|
||||
<LinePart from="¿Y cuantas" to="¿Y cuántas" />
|
||||
<LinePart from="¿Y donde" to="¿Y dónde" />
|
||||
<LinePart from="¿Y adonde" to="¿Y adónde" />
|
||||
<LinePart from="¿Quien " to="¿Quién " />
|
||||
<LinePart from="¿Esta " to="¿Está " />
|
||||
<!-- Diacritical accent in indirect interrogative or exclamatory sentences -->
|
||||
<LinePart from="el porque" to="el porqué" />
|
||||
<LinePart from="su porque" to="su porqué" />
|
||||
<LinePart from="los porques" to="los porqués" />
|
||||
<!-- aún -->
|
||||
<LinePart from="aun," to="aún," />
|
||||
<LinePart from="aun no" to="aún no" />
|
||||
<!-- dé -->
|
||||
<LinePart from=" de y " to=" dé y " />
|
||||
<LinePart from=" nos de " to=" nos dé " />
|
||||
<!-- tú -->
|
||||
<LinePart from=" tu ya " to=" tú ya " />
|
||||
<LinePart from="Tu ya " to="Tú ya " />
|
||||
<!-- Specific cases before a comma -->
|
||||
<LinePart from=" de, " to=" dé, " />
|
||||
<LinePart from=" mi, " to=" mí, " />
|
||||
<LinePart from=" tu, " to=" tú, " />
|
||||
<LinePart from=" el, " to=" él, " />
|
||||
<LinePart from=" te, " to=" té, " />
|
||||
<LinePart from=" mas, " to=" más, " />
|
||||
<LinePart from=" quien, " to=" quién, " />
|
||||
<LinePart from=" cual," to=" cuál," />
|
||||
<LinePart from="porque, " to="porqué, " />
|
||||
<LinePart from="cuanto, " to="cuánto, " />
|
||||
<LinePart from="cuando, " to="cuándo, " />
|
||||
<!-- sé -->
|
||||
<LinePart from=" se," to=" sé," />
|
||||
<LinePart from="se donde" to="sé dónde" />
|
||||
<LinePart from="se cuando" to="sé cuándo" />
|
||||
<LinePart from="se adonde" to="sé adónde" />
|
||||
<LinePart from="se como" to="sé cómo" />
|
||||
<LinePart from="se cual" to="sé cuál" />
|
||||
<LinePart from="se quien" to="sé quién" />
|
||||
<LinePart from="se cuanto" to="sé cuánto" />
|
||||
<LinePart from="se cuanta" to="sé cuánta" />
|
||||
<LinePart from="se cuantos" to="sé cuántos" />
|
||||
<LinePart from="se cuantas" to="sé cuántas" />
|
||||
<LinePart from="se cuan" to="sé cuán" />
|
||||
<!-- si/sí -->
|
||||
<LinePart from=" el si " to=" el sí " />
|
||||
<LinePart from="si mismo" to="sí mismo" />
|
||||
<LinePart from="si misma" to="sí misma" />
|
||||
<!-- "l" read instead of "i" in specific cases -->
|
||||
<LinePart from=" llegal" to=" ilegal" />
|
||||
<LinePart from=" lluminar" to=" iluminar" />
|
||||
<LinePart from="sllbato" to="silbato" />
|
||||
<LinePart from="sllenclo" to="silencio" />
|
||||
<LinePart from="clemencla" to="clemencia" />
|
||||
<LinePart from="socledad" to="sociedad" />
|
||||
<LinePart from="tlene" to="tiene" />
|
||||
<LinePart from="tlempo" to="tiempo" />
|
||||
<LinePart from="equlvocaba" to="equivocaba" />
|
||||
<LinePart from="qulnce" to="quince" />
|
||||
<LinePart from="comlen" to="comien" />
|
||||
<LinePart from="historl" to="histori" />
|
||||
<LinePart from="misterl" to="misteri" />
|
||||
<LinePart from="vivencl" to="vivenci" />
|
||||
</PartialLines>
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines>
|
||||
<Ending from=".»." to="»." />
|
||||
</EndLines>
|
||||
<WholeLines>
|
||||
<!-- All lines -->
|
||||
<Line from="No" to="No." />
|
||||
</WholeLines>
|
||||
<RegularExpressions>
|
||||
<!-- Compound abbreviations -->
|
||||
<RegEx find="\b[Ss](r|ra|rta)\b\.?" replaceWith="S$1." />
|
||||
<RegEx find="\b[Dd](r|ra)\b\.?" replaceWith="D$1." />
|
||||
<RegEx find="\b[Uu](d|ds)\b\.?" replaceWith="U$1." />
|
||||
<RegEx find="(\d)(\s){0,1}([Aa])(\.){0,1}([Mm])(\.){0,1}(\W){0,1}" replaceWith="$1 a. m.$7" />
|
||||
<RegEx find="(\d)(\s){0,1}([Pp])(\.){0,1}([Mm])(\.){0,1}(\W){0,1}" replaceWith="$1 p. m.$7" />
|
||||
<RegEx find="(\d)(\s){0,1}(h)(s\b|r\b|rs\b){0,1}(\.){0,1}(\W){0,1}" replaceWith="$1 $3$6" />
|
||||
<RegEx find="(\d)(\s){0,1}([Kk])(m\b|ms\b)(\.){0,1}(\W){0,1}" replaceWith="$1 km$6" />
|
||||
<RegEx find="(\d)(\s){0,1}(s)(g\b|eg\b){0,1}(\.){0,1}(\W){0,1}" replaceWith="$1 s$6" />
|
||||
<RegEx find="(\d)(\s){0,1}([Kk])(g\b|gs\b)(\.){0,1}(\W){0,1}" replaceWith="$1 kg$6" />
|
||||
<RegEx find="(\d)(\s){0,1}(m)(t\b|ts\b){0,1}(\.){0,1}(\W){0,1}" replaceWith="$1 m$6" />
|
||||
<RegEx find="(\d)KBs(\W){0,1}" replaceWith="$1 kB$2" />
|
||||
<RegEx find="([Nn])°(\s){0,1}(\d)" replaceWith="$1.° $3" />
|
||||
<RegEx find="([Nn])ro(\.){0,1}(\s){0,1}(\d)" replaceWith="$1.° $4" />
|
||||
<!-- Inverted punctuation marks -->
|
||||
<RegEx find="\?¿(\W|\w)" replaceWith="? ¿$1" />
|
||||
<RegEx find="\!¡(\W|\w)" replaceWith="! ¡$1" />
|
||||
<RegEx find="\?¿¿(\W|\w)" replaceWith="? ¿$1" />
|
||||
<RegEx find="\!¡¡(\W|\w)" replaceWith="! ¡$1" />
|
||||
<!-- Start of line -->
|
||||
<RegEx find="^_(\s)" replaceWith="-$1" />
|
||||
<RegEx find="^_(\w)" replaceWith="- $1" />
|
||||
<!-- Quotation marks as recommended by the RAE and Wikipedia -->
|
||||
<RegEx find="(«[^“«»]+)«" replaceWith="$1“" />
|
||||
<RegEx find="(“[^«»”]+)»" replaceWith="$1”" />
|
||||
<RegEx find="`" replaceWith="‘" />
|
||||
<RegEx find="´" replaceWith="’" />
|
||||
<RegEx find="([\wá-ú])(\.)(«|»)" replaceWith="$1»." />
|
||||
<RegEx find="«(\?)" replaceWith="»?" />
|
||||
<RegEx find="«(\!)" replaceWith="»!" />
|
||||
<RegEx find="«\s" replaceWith="» " />
|
||||
<RegEx find="«(\))" replaceWith="»)" />
|
||||
<RegEx find="(\?)«" replaceWith="?»" />
|
||||
<RegEx find="(\!)«" replaceWith="!»" />
|
||||
<RegEx find="«(,)" replaceWith="»," />
|
||||
<RegEx find="«(;)" replaceWith="»;" />
|
||||
<RegEx find="«(:)" replaceWith="»:" />
|
||||
<RegEx find="(¿)»" replaceWith="¿«" />
|
||||
<RegEx find="(¡)»" replaceWith="¡«" />
|
||||
<!-- Quotation marks (ANSI) as recommended by the RAE («\x22» is the «"» character) -->
|
||||
<RegEx find="([\wá-ú])([\.,]) ?[\x22»]" replaceWith="$1»$2" />
|
||||
<RegEx find="([\wá-ú])\?[\x22»](\s|$)" replaceWith="$1?».$2" />
|
||||
<RegEx find="^(\.\.\.)(\s){0,1}\x22" replaceWith="$1«" />
|
||||
<RegEx find="«\x22" replaceWith="«" />
|
||||
<RegEx find="\x22»" replaceWith="»" />
|
||||
<RegEx find="^\x22{2,}" replaceWith="«" />
|
||||
<RegEx find="\x22{2,}$" replaceWith="»" />
|
||||
<RegEx find="\x22\r" replaceWith="»" />
|
||||
<RegEx find="^\x22" replaceWith="«" />
|
||||
<RegEx find="\x22$" replaceWith="»." />
|
||||
<RegEx find="([\wá-ú])\.[\x22»]" replaceWith="$1»." />
|
||||
<RegEx find="\s\x22" replaceWith=" «" />
|
||||
<RegEx find="\x22\s" replaceWith="» " />
|
||||
<RegEx find="\x22(,)" replaceWith="»," />
|
||||
<RegEx find="\x22(\.)" replaceWith="»." />
|
||||
<RegEx find="\x22(;)" replaceWith="»;" />
|
||||
<RegEx find="\x22(:)" replaceWith="»:" />
|
||||
<RegEx find="(\!)\x22" replaceWith="!»" />
|
||||
<RegEx find="\x22(\!)" replaceWith="»!" />
|
||||
<RegEx find="(\?)\x22" replaceWith="?»" />
|
||||
<RegEx find="\x22(\?)" replaceWith="»?" />
|
||||
<RegEx find="\x22(¿)" replaceWith="«¿" />
|
||||
<RegEx find="(¿)\x22" replaceWith="¿«" />
|
||||
<RegEx find="\x22(¡)" replaceWith="«¡" />
|
||||
<RegEx find="(¡)\x22" replaceWith="¡«" />
|
||||
<RegEx find="\x22(\))" replaceWith="»)" />
|
||||
<RegEx find="(\))\x22" replaceWith=")»" />
|
||||
<RegEx find="(\()\x22" replaceWith="(«" />
|
||||
<!-- Quotation marks (Unicode) as recommended by the RAE («\u0022» is the «"» character) -->
|
||||
<RegEx find="^(\.\.\.)(\s){0,1}\u0022" replaceWith="$1«" />
|
||||
<RegEx find="^\u0022{2,}" replaceWith="«" />
|
||||
<RegEx find="\u0022{2,}$" replaceWith="»" />
|
||||
<RegEx find="\u0022\r" replaceWith="»" />
|
||||
<RegEx find="^\u0022" replaceWith="«" />
|
||||
<RegEx find="\u0022$" replaceWith="»" />
|
||||
<RegEx find="(\w)(\.)\u0022" replaceWith="$1»." />
|
||||
<RegEx find="\s\u0022" replaceWith=" «" />
|
||||
<RegEx find="\u0022\s" replaceWith="» " />
|
||||
<RegEx find="\u0022(,)" replaceWith="»," />
|
||||
<RegEx find="\u0022(\.)" replaceWith="»." />
|
||||
<RegEx find="\u0022(;)" replaceWith="»;" />
|
||||
<RegEx find="\u0022(:)" replaceWith="»:" />
|
||||
<RegEx find="(\!)\u0022" replaceWith="!»" />
|
||||
<RegEx find="\u0022(\!)" replaceWith="»!" />
|
||||
<RegEx find="(\?)\u0022" replaceWith="?»" />
|
||||
<RegEx find="\u0022(\?)" replaceWith="»?" />
|
||||
<RegEx find="\u0022(¿)" replaceWith="«¿" />
|
||||
<RegEx find="(¿)\u0022" replaceWith="¿«" />
|
||||
<RegEx find="\u0022(¡)" replaceWith="«¡" />
|
||||
<RegEx find="(¡)\u0022" replaceWith="¡«" />
|
||||
<RegEx find="\u0022(\))" replaceWith="»)" />
|
||||
<RegEx find="(\))\u0022" replaceWith=")»" />
|
||||
<RegEx find="(\()\u0022" replaceWith="(«" />
|
||||
<!-- Numbers -->
|
||||
<RegEx find="([0-9])\.([0-9])\b" replaceWith="$1,$2" />
|
||||
<RegEx find="(^|\s|[¡¿«])([0-9])(,|\.)?([0-9]{3})\b" replaceWith="$1$2$4" />
|
||||
<RegEx find="(\d)\s(?=\d{2}\b)" replaceWith="$1-" />
|
||||
<!-- "1 :", "2 :"... "n :" to "n:" -->
|
||||
<RegEx find="(\d) ([:;])" replaceWith="$1$2" />
|
||||
<!-- Fix commas and periods, e.g. «, ,» to «,» and «,,,» or similar to «...» -->
|
||||
<RegEx find="(\.\.\.+)$" replaceWith="..." />
|
||||
<RegEx find="(, ,+)$" replaceWith="," />
|
||||
<RegEx find="(,\s),+\s" replaceWith="$1" />
|
||||
<RegEx find="(\.\.\.),$" replaceWith="$1" />
|
||||
<RegEx find="([\wá-ú])(\.\.)$" replaceWith="$1." />
|
||||
<!-- Unnecessary periods (supplement) -->
|
||||
<RegEx find="([\w\W]\.{3})([¡¿])" replaceWith="$1 $2" />
|
||||
<RegEx find="(\w)\.\.(\s)" replaceWith="$1.$2" />
|
||||
<RegEx find="([\wá-ú\x22»])\.([\?\!])" replaceWith="$1$2" />
|
||||
<RegEx find="([\:\;])\." replaceWith="$1" />
|
||||
<RegEx find="\.([\:\;])" replaceWith="$1" />
|
||||
<RegEx find="\:+" replaceWith=":" />
|
||||
<!-- ción/sión endings -->
|
||||
<RegEx find="([sc]i)o(n)\b" replaceWith="$1ó$2" />
|
||||
<RegEx find="([SC]I)O(N)\b" replaceWith="$1Ó$2" />
|
||||
<!-- "i" instead of "l" in «clón» endings -->
|
||||
<RegEx find="clón\b" replaceWith="ción" />
|
||||
<!-- "si" instead of "sl" -->
|
||||
<RegEx find="\b([Ss])(l)\b" replaceWith="$1i" />
|
||||
<!-- Fixes e.g. raclones, perforaclones, opclones, etc. -->
|
||||
<RegEx find="([Rr]ac)l(o)" replaceWith="$1i$2" />
|
||||
<RegEx find="([Oo]pc)l(o)" replaceWith="$1i$2" />
|
||||
<!-- Fixes e.g. tenldo, víctlmas, olvldarlo, legítlmo, etc. -->
|
||||
<RegEx find="([BbCcDdFfHhMmNnRrSsTtVv])l([bcdhmnrstv])" replaceWith="$1i$2" />
|
||||
<!-- Fixes ripping errors where a capital «O» is confused with the zero «0» and vice versa -->
|
||||
<RegEx find="(\d)O" replaceWith="$1 0" />
|
||||
<RegEx find="(\d)[,\.]O" replaceWith="$1.0" />
|
||||
<RegEx find="([A-Z])0" replaceWith="$1O" />
|
||||
<RegEx find="\b0([A-Za-z])" replaceWith="O$1" />
|
||||
<!-- Music symbols -->
|
||||
<RegEx find="[♪♫☺☹♥©☮☯Σ∞≡⇒π#](\r\n)[♪♫☺☹♥©☮☯Σ∞≡⇒π#]" replaceWith="$1" />
|
||||
<!-- Diacritical accent before a period -->
|
||||
<RegEx find="(\s)([dst])e\.(\s|$)" replaceWith="$1$2é.$3" />
|
||||
<RegEx find="(\s)mi\.(\s|$)" replaceWith="$1mí.$2" />
|
||||
<RegEx find="(\s)el\.(\s|$)" replaceWith="$1él.$2" />
|
||||
<RegEx find="(\s)tu\.(\s|$)" replaceWith="$1tú.$2" />
|
||||
<RegEx find="(\s)si\.(\s|$)" replaceWith="$1sí.$2" />
|
||||
<RegEx find="(\s)aun\.(\s|$)" replaceWith="$1aún.$2" />
|
||||
<RegEx find="(\s)mas\.(\s|$)" replaceWith="$1más.$2" />
|
||||
<RegEx find="(\s)quien\.(\s|$)" replaceWith="$1quién.$2" />
|
||||
<RegEx find="(\s)cual\.(\s|$)" replaceWith="$1cuál.$2" />
|
||||
<RegEx find="(\s)que\.(\s|$)" replaceWith="$1qué.$2" />
|
||||
<RegEx find="(\s)porque\.(\s|$)" replaceWith="$1porqué.$2" />
|
||||
<RegEx find="(\s)cuanto\.(\s|$)" replaceWith="$1cuánto.$2" />
|
||||
<RegEx find="(\s)cuando\.(\s|$)" replaceWith="$1cuándo.$2" />
|
||||
<!-- Prefixes; compound words (simple) -->
|
||||
<RegEx find="(\b[Ee]x|\b[Ss]uper|\b[Aa]nti|\b[Pp]os|\b[Pp]re|\b[Pp]ro|\b[Vv]ice)[\s\x2D]([a-zá-ú]{3,20})(\b)" replaceWith="$1$2" />
|
||||
<!-- Prefixes; compound words (numbers) -->
|
||||
<RegEx find="(\b[Ss]ub|\b[Ss]uper)[\s\x2D](\d{2})(\b)" replaceWith="$1-$2$3" />
|
||||
<!-- Prefixes; compound words (capitals) -->
|
||||
<RegEx find="(\b[Aa]nti|\b[Mm]ini|\b[Pp]os|\b[Pp]ro)\s([A-Z]{1,10})([A-Z][a-zá-ú]){0,10}(\b)" replaceWith="$1-$2$3" />
|
||||
<!-- Capitalization after a colon -->
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(a)" replaceWith="$1A" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(b)" replaceWith="$1B" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(c)" replaceWith="$1C" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(d)" replaceWith="$1D" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(e)" replaceWith="$1E" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(f)" replaceWith="$1F" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(g)" replaceWith="$1G" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(h)" replaceWith="$1H" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(i)" replaceWith="$1I" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(j)" replaceWith="$1J" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(k)" replaceWith="$1K" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(l)" replaceWith="$1L" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(m)" replaceWith="$1M" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(n)" replaceWith="$1N" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(ñ)" replaceWith="$1Ñ" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(o)" replaceWith="$1O" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(p)" replaceWith="$1P" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(q)" replaceWith="$1Q" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(r)" replaceWith="$1R" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(s)" replaceWith="$1S" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(t)" replaceWith="$1T" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(u)" replaceWith="$1U" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(v)" replaceWith="$1V" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(w)" replaceWith="$1W" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(x)" replaceWith="$1X" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(y)" replaceWith="$1Y" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(z)" replaceWith="$1Z" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(á)" replaceWith="$1Á" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(é)" replaceWith="$1É" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(í)" replaceWith="$1Í" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(ó)" replaceWith="$1Ó" />
|
||||
<RegEx find="([\wá-ú]:\s[«\x22]?)(ú)" replaceWith="$1Ú" />
|
||||
<!-- Correct comma usage -->
|
||||
<RegEx find="(\b[Pp]ero),(\s)([¡¿])" replaceWith="$1$2$3" />
|
||||
<RegEx find="(\b[Aa]unque),(\s|$)" replaceWith="$1$2" />
|
||||
<!-- Vocatives -->
|
||||
<RegEx find="(\bHola|\bBueno|\bBien|\bVen|\bVen acá|\besto|\bBuenos días|\bFeliz cumpleaños|\bsiento)\s([A-Z][a-zá-ú]{3,12}\b|seño(r|ra|rita)\b|hij(o|a) mío\b|amig(o|a)\b)" replaceWith="$1, $2" />
|
||||
<!-- «aún» used as a synonym of «incluso» or «hasta» (should be «aun») -->
|
||||
<RegEx find="(\W|^)(\b[Aa])ú(n)(\s)(así\b|cuando\b|los\b|las\b|negar(te|se)\b)" replaceWith="$1$2u$3$4$5" />
|
||||
<RegEx find="(\b[Nn]i)(\s)(a)ú(n)(\W|$)" replaceWith="$1$2$3u$4$5" />
|
||||
<!-- «sí» -->
|
||||
<RegEx find="\b([Ss])i(:|;|\.)" replaceWith="$1í$2" />
|
||||
<!-- «sé» -->
|
||||
<RegEx find="(\b[Ll]o|\b[Ll]a|\b[Ll]e)(\s)se(\W|$)" replaceWith="$1$2sé$3" />
|
||||
<RegEx find="[Ss]e\s(dónde\b|cuándo\b|adónde\b|cómo\b|cuál\b|quién\b|cuánto\b|cuánta\b|cuántos\b|cuántas\b|cuán\b)" replaceWith="sé $1" />
|
||||
<!-- «té» -->
|
||||
<RegEx find="\b([Tt])e\s(verde\b|negro\b|perla\b|de manzanilla\b|de lim[óo]n\b|de jazm[íi]n\b)" replaceWith="$1é $2" />
|
||||
<!-- Apostrophe -->
|
||||
<RegEx find="(\b[A-Z][a-zá-ú]{3,12})\s(’|')(\d\d(\s|$))" replaceWith="$1 $3" />
|
||||
<RegEx find="(\b[A-Z]{2,5})(’|')(s)" replaceWith="$1$3" />
|
||||
<RegEx find="(\b\d{1,2})(’|')(\d{2})\s(s|m)(\W|$)" replaceWith="$1,$3 $4$5" />
|
||||
<RegEx find="(\b\d{1,2})(’|')(\d{2})\s(h)(\W|$)" replaceWith="$1:$3 $4$5" />
|
||||
<!-- Percentage (requires a space) -->
|
||||
<RegEx find="(\b\d{1,3})%(\W)" replaceWith="$1 %$2" />
|
||||
<!-- Haz/has -->
|
||||
<RegEx find="(\b)([Hh])as\s(la\b|lo\b|clic\b)(\W)" replaceWith="$1$2az $3$4" />
|
||||
<RegEx find="(\b)([Hh])az\s(de\b)(\W)" replaceWith="$1$2as $3$4" />
|
||||
<RegEx find="(\b)([Hh])as(le\b|nos\b|me\b)(\W)" replaceWith="$1$2az$3$4" />
|
||||
<!-- Remove italics around 3 or fewer letters -->
|
||||
<RegEx find="\x3ci\x3e(.{1,3})\x3c\/i\x3e" replaceWith="$1" />
|
||||
<!-- Miscellaneous -->
|
||||
<RegEx find="(\b[Cc]erca|\b[Ee]ncima|\b[Dd]ebajo|\b[Dd]etrás|\b[Dd]elante)(\s)mío" replaceWith="$1 de mí" />
|
||||
<RegEx find="(\b[Cc]erca|\b[Ee]ncima|\b[Dd]ebajo|\b[Dd]etrás|\b[Dd]elante)(\s)tuyo" replaceWith="$1 de ti" />
|
||||
<!-- Period before «¿» and «¡» -->
|
||||
<RegEx find="([\wá-ú»])\s(?=(¿|¡)[A-ZÁ-Ú])" replaceWith="$1. " />
|
||||
<!-- Space after the dash -->
|
||||
<RegEx find="(^|\n)(-)([^\s])" replaceWith="$1$2 $3" />
|
||||
<!-- Period before the dash -->
|
||||
<RegEx find="([^\.\?\!]) - " replaceWith="$1. - " />
|
||||
<!-- Endings in «ólogo», «ílogo» and «álogo» -->
|
||||
<RegEx find="\Bo(log[ao]s?\b)" replaceWith="ó$1" />
|
||||
<RegEx find="\Ba(log[ao]s?\b)" replaceWith="á$1" />
|
||||
<RegEx find="\Bi(log[ao]s?\b)" replaceWith="í$1" />
|
||||
</RegularExpressions>
|
||||
</OCRFixReplaceList>
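The `<RegularExpressions>` rules in the list above are applied as ordered find/replace pairs. The following is a minimal Python sketch of that idea, not the application's actual engine: it assumes the rules run in file order over each line, and `to_python_replacement` and `apply_rules` are hypothetical helpers introduced only for illustration. The patterns are written for the .NET regex dialect, so the four rules below were picked because they behave the same way in Python's `re`.

```python
import re

# A few (find, replaceWith) pairs copied from the <RegularExpressions> section above.
RULES = [
    (r"([sc]i)o(n)\b", "$1ó$2"),      # ción/sión endings
    (r"clón\b", "ción"),              # "l" read instead of "i"
    (r"(\b\d{1,3})%(\W)", "$1 %$2"),  # space before the percent sign
    (r"(\d) ([:;])", "$1$2"),         # "1 :" to "1:"
]


def to_python_replacement(replace_with: str) -> str:
    # Hypothetical helper: rewrite .NET-style "$1" group references
    # into the explicit group syntax that Python's re.sub expects.
    return re.sub(r"\$(\d+)", r"\\g<\1>", replace_with)


def apply_rules(text: str) -> str:
    # Assumption: rules are applied one after another, each on the
    # output of the previous one.
    for find, replace_with in RULES:
        text = re.sub(find, to_python_replacement(replace_with), text)
    return text


print(apply_rules("La estacion sube un 50%."))
print(apply_rules("atenclón"))
print(apply_rules("Paso 1 : listo"))
```

Because every rule sees the output of the previous one, the order of the entries matters: a broad rule placed early can change text that a later, narrower rule was meant to match.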
|
||||
+234
@@ -0,0 +1,234 @@
|
||||
<!-- Credit goes to: MilanRS [http://www.prijevodi-online.org] -->
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<Word from="ču" to="ću" />
|
||||
<Word from="češ" to="ćeš" />
|
||||
<Word from="če" to="će" />
|
||||
<Word from="ćš" to="ćeš" />
|
||||
<Word from="ćmo" to="ćemo" />
|
||||
<Word from="ćte" to="ćete" />
|
||||
<Word from="čemo" to="ćemo" />
|
||||
<Word from="čete" to="ćete" />
|
||||
<Word from="djete" to="dijete" />
|
||||
<Word from="Hey" to="Hej" />
|
||||
<Word from="hey" to="hej" />
|
||||
<Word from="htjeo" to="htio" />
|
||||
<Word from="Hočeš" to="Hoćeš" />
|
||||
<Word from="hočeš" to="hoćeš" />
|
||||
<Word from="iči" to="ići" />
|
||||
<Word from="jel" to="je l'" />
|
||||
<Word from="Jel" to="Je l'" />
|
||||
<Word from="nedaj" to="ne daj" />
|
||||
<Word from="Rješit" to="Riješit" />
|
||||
<Word from="smjeo" to="smio" />
|
||||
<Word from="uopče" to="uopće" />
|
||||
<Word from="valda" to="valjda" />
|
||||
<Word from="želila" to="željela" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways />
|
||||
<PartialWords>
|
||||
<WordPart from="¤" to="o" />
|
||||
<WordPart from="vv" to="w" />
|
||||
<WordPart from="IVI" to="M" />
|
||||
<WordPart from="lVI" to="M" />
|
||||
<WordPart from="IVl" to="M" />
|
||||
<WordPart from="lVl" to="M" />
|
||||
</PartialWords>
|
||||
<PartialLines>
|
||||
<LinePart from="bi smo" to="bismo" />
|
||||
<LinePart from="dali je" to="da li je" />
|
||||
<LinePart from="dali si" to="da li si" />
|
||||
<LinePart from="Dali si" to="Da li si" />
|
||||
<LinePart from="Jel sam ti" to="Jesam li ti" />
|
||||
<LinePart from="Jel si" to="Jesi li" />
|
||||
<LinePart from="Jel' si" to="Jesi li" />
|
||||
<LinePart from="Je I'" to="Jesi li" />
|
||||
<LinePart from="Jel si to" to="Jesi li to" />
|
||||
<LinePart from="Jel' si to" to="Da li si to" />
|
||||
<LinePart from="jel si to" to="da li si to" />
|
||||
<LinePart from="jel' si to" to="jesi li to" />
|
||||
<LinePart from="Jel si ti" to="Da li si ti" />
|
||||
<LinePart from="Jel' si ti" to="Da li si ti" />
|
||||
<LinePart from="jel si ti" to="da li si ti" />
|
||||
<LinePart from="jel' si ti" to="da li si ti" />
|
||||
<LinePart from="jel ste " to="jeste li " />
|
||||
<LinePart from="Jel ste" to="Jeste li" />
|
||||
<LinePart from="jel' ste " to="jeste li " />
|
||||
<LinePart from="Jel' ste " to="Jeste li " />
|
||||
<LinePart from="Jel su " to="Jesu li " />
|
||||
<LinePart from="Jel da " to="Zar ne " />
|
||||
<LinePart from="jel da " to="zar ne " />
|
||||
<LinePart from="jel'da " to="zar ne " />
|
||||
<LinePart from="Jeli sve " to="Je li sve " />
|
||||
<LinePart from="Jeli on " to="Je li on " />
|
||||
<LinePart from="Jeli ti " to="Je li ti " />
|
||||
<LinePart from="jeli ti " to="je li ti " />
|
||||
<LinePart from="Jeli to " to="Je li to " />
|
||||
<LinePart from="Nebrini" to="Ne brini" />
|
||||
<LinePart from="nedaj" to="ne daj" />
|
||||
<LinePart from="ne ću" to="neću" />
|
||||
<LinePart from="Nemogu" to="Ne mogu" />
|
||||
<LinePart from="nemogu" to="ne mogu" />
|
||||
<LinePart from="Nemoraš" to="Ne moraš" />
|
||||
<LinePart from="od kako" to="otkako" />
|
||||
<LinePart from="Si dobro" to="Jesi li dobro" />
|
||||
<LinePart from="Svo vreme" to="Sve vrijeme" />
|
||||
<LinePart from="Svo vrijeme" to="Sve vrijeme" />
|
||||
<LinePart from="Cijelo vrijeme" to="Sve vrijeme" />
|
||||
</PartialLines>
|
||||
<PartialLinesAlways />
|
||||
<BeginLines />
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions>
|
||||
<RegEx find="đž" replaceWith="dž" />
|
||||
<RegEx find="ajsmiješnij" replaceWith="ajsmješnij" />
|
||||
<RegEx find="boži[čć]([aeiu]|em|ima)?\b" replaceWith="Božić$1" />
|
||||
<RegEx find=" g-dine\.$" replaceWith=" gospodine." />
|
||||
<RegEx find=" g-dine +(?=[A-ZČĐŠŽ])" replaceWith=" g. " />
|
||||
<RegEx find="([gG])dine? +(?=[A-ZČĐŠŽ])" replaceWith="$1. " />
|
||||
<RegEx find="([gG])-đo +(?=[A-ZČĐŠŽ])" replaceWith="$1đo " />
|
||||
<RegEx find="gdina +(?=[A-ZČĐŠŽ])" replaceWith="g. " />
|
||||
<RegEx find=" gosp +" replaceWith=" g. " />
|
||||
<RegEx find="Jel si sigur" replaceWith="Jesi li sigur" />
|
||||
<RegEx find="Jel' si sigur" replaceWith="Jesi li sigur" />
|
||||
<RegEx find="\b([jJ])el\?" replaceWith="$1e l'?" />
|
||||
<RegEx find="\bJel'" replaceWith="Je l'" />
|
||||
<RegEx find="([kK]alib(?:ar|r[aeui]))\. *([0-9])" replaceWith="$1 .$2" />
|
||||
<RegEx find="([mM])jenjati" replaceWith="$1ijenjati" />
|
||||
<RegEx find="([mM])oguč" replaceWith="$1oguć" />
|
||||
<RegEx find="\b([nN])ebih?" replaceWith="$1e bi" />
|
||||
<RegEx find="\b([nN])eč([ue]š?|emo|ete)\b" replaceWith="$1eć$2" />
|
||||
<RegEx find="\b([nN])emože(mo|š|te)?\b" replaceWith="$1e može$2" />
|
||||
<RegEx find="\b([nN])ezna([šm]o?|t[ei]|ju|jući|vši)?\b" replaceWith="$1e zna$2" />
|
||||
<RegEx find="najcijenjen" replaceWith="najcjenjen" />
|
||||
<RegEx find="N[jJ]u Jork" replaceWith="Njujork" />
|
||||
<RegEx find="([oO])d([kp])" replaceWith="$1t$2" />
|
||||
<RegEx find="([oO])ružij([aeu])" replaceWith="$1ružj$2" />
|
||||
<RegEx find="([oO])sječa" replaceWith="$1sjeća" />
|
||||
<RegEx find="([pPdD])onje([lt])" replaceWith="$1onije$2" />
|
||||
<RegEx find="([pP])objedi([mšto])" replaceWith="$1obijedi$2" />
|
||||
<RegEx find="redamnom" replaceWith="reda mnom" />
|
||||
<RegEx find="redpostav" replaceWith="retpostav" />
|
||||
<RegEx find="([pP])rimjeti" replaceWith="$1rimijeti" />
|
||||
<RegEx find="([pP])romjeni([mštol])" replaceWith="$1romijeni$2" />
|
||||
<RegEx find="([rR])azumijeć" replaceWith="$1azumjeć" />
|
||||
<RegEx find="rascjepljen" replaceWith="rascijepljen" />
|
||||
<RegEx find="redhodn" replaceWith="rethodn" />
|
||||
<RegEx find="rimjenjen" replaceWith="rimijenjen" />
|
||||
<RegEx find="([^d])rješit" replaceWith="$1riješit" />
|
||||
<RegEx find="([sSzZ])amnom" replaceWith="$1a mnom" />
|
||||
<RegEx find="([sS])lijede[čć]([aeiu]|e[mg])" replaceWith="$1ljedeć$2" />
|
||||
<RegEx find="([sS])mješno" replaceWith="$1miješno" />
|
||||
<RegEx find="([uU])mijesto" replaceWith="$1mjesto" />
|
||||
<RegEx find="([uU])spijeh" replaceWith="$1spjeh" />
|
||||
<RegEx find="([uU])spiješ(an|n[aeiou]|no[mgj])" replaceWith="$1spješ$2" />
|
||||
<RegEx find="([uU])vjek" replaceWith="$1vijek" />
|
||||
<RegEx find="\b([vV])eč([aeiou])" replaceWith="$1eć$2" />
|
||||
<RegEx find="([zZ])ahtijeva" replaceWith="$1ahtjeva" />
|
||||
<RegEx find="([zZ])ahtjeva([ojlmšt])" replaceWith="$1ahtijeva$2" />
|
||||
<RegEx find="([ks]ao)\.:" replaceWith="$1:" />
|
||||
<RegEx find="(?<=[a-zčđšž])Ij(?=[a-zčđšž])" replaceWith="lj" />
|
||||
<RegEx find="(?<=[^A-ZČĐŠŽa-zčđšž])Iju(?=bav|d|t)" replaceWith="lju" />
|
||||
<!-- when there is a space between the tags </i> <i> -->
|
||||
<!-- <RegEx find="(>) +(<)" replaceWith="$1$2" /> -->
|
||||
<!-- ',"' to '",' -->
|
||||
<RegEx find="(?<=\w),&quot;(?=\s|$)" replaceWith="&quot;," />
|
||||
<RegEx find=",\.{3}|\.{3},|\.{2} \." replaceWith="..." />
|
||||
<!-- "1 :", "2 :"... "n :" to "n:" -->
|
||||
<RegEx find="([0-9]) +: +(\D)" replaceWith="$1: $2" />
|
||||
<!-- Two or more consecutive "," to "..." -->
|
||||
<RegEx find=",{2,}" replaceWith="..." />
|
||||
<!-- Two or more consecutive "-" to "..." -->
|
||||
<RegEx find="-{2,}" replaceWith="..." />
|
||||
<RegEx find="([^().])\.{2}([^().:])" replaceWith="$1...$2" />
|
||||
<!-- thousands and decimal separators: 1,499,000.00 -> 1.499.000,00 -->
|
||||
<RegEx find="([0-9]{3})\.([0-9]{2}[^0-9])" replaceWith="$1,$2" />
|
||||
<RegEx find="([0-9]),([0-9]{3}\D)" replaceWith="$1.$2" />
|
||||
<!-- Apostrophes -->
|
||||
<RegEx find="´´" replaceWith="&quot;" />
|
||||
<!-- <RegEx find="[´`]" replaceWith="'" /> -->
|
||||
<!-- <RegEx find="[“”]" replaceWith=""" /> -->
|
||||
<RegEx find="''" replaceWith="&quot;" />
|
||||
<!-- Two or more consecutive '"' to one '"' -->
|
||||
<RegEx find="&quot;{2,}" replaceWith="&quot;" />
|
||||
<!-- Fix zero and capital 'o' ripping mistakes -->
|
||||
<RegEx find="(?<=[0-9]\.?)O" replaceWith="0" />
|
||||
<RegEx find="\b0(?=[A-ZČĐŠŽa-zčđšž])" replaceWith="O" />
|
||||
<!-- Remove the dash at the start of the 1st line (also when there are two lines) -->
|
||||
<RegEx find="\A- ?([A-ZČĐŠŽa-zčđšž0-9„'&quot;]|\.{3})" replaceWith="$1" />
|
||||
<RegEx find="\A(<[ibu]>)- ?" replaceWith="$1" />
|
||||
<RegEx find=" - " replaceWith=" -" />
|
||||
<!-- Remove the space after the dash at the start of the 2nd line -->
|
||||
<RegEx find="(?<=\n(<[ibu]>)?)- (?=[A-ZČĐŠŽčš0-9„'&quot;<])" replaceWith="-" />
|
||||
<!-- Fix the dash when it is in the middle of the first line -->
|
||||
<RegEx find="([.!?&quot;>]) - ([A-ZČĐŠŽčš'&quot;<])" replaceWith="$1 -$2" />
|
||||
<!-- Closing tag followed by a space after the dash -->
|
||||
<RegEx find="(>) - ([A-ZČĐŠŽčš„'&quot;])" replaceWith="$1 -$2" />
|
||||
<!-- Closing tag, then dash, then space -->
|
||||
<RegEx find="(>)- ([A-ZČĐŠŽčš„'&quot;])" replaceWith="$1-$2" />
|
||||
<!-- Parenthesis, then dash, then space -->
|
||||
<RegEx find="\(- ([A-ZČĐŠŽčš„'&quot;])" replaceWith="(-$1" />
|
||||
<!-- Smart space after dot -->
|
||||
<!-- except when the last letter is t (the word "kolt") -->
|
||||
<RegEx find="(?<=[a-su-zá-úñä-ü])\.(?=[^\s\n().:?!*^“”'&quot;<])" replaceWith=". " />
|
||||
<!-- Caliber designation, e.g. "Colt .45" -->
|
||||
<!-- For this to work (so that this space is allowed), uncheck "Razmaci ispred tačke" (spaces before period) -->
|
||||
<RegEx find="t\.(?=[0-9]{2})" replaceWith="t ." />
|
||||
<!-- Joey(j)a -->
|
||||
<RegEx find="(?<=\b[A-Z][a-z])eyj(?=[a-z])" replaceWith="ey" />
|
||||
<!-- Fixes spacing around commas -->
|
||||
<RegEx find="(?<=[A-ZČĐŠŽa-zčđšžá-úñä-ü&quot;]),(?=[^\s(),?!“<])" replaceWith=", " />
|
||||
<RegEx find=" +,(?=[A-ZČĐŠŽa-zčđšž])" replaceWith=", " />
|
||||
<RegEx find=" +, +" replaceWith=", " />
|
||||
<RegEx find=" +,$" replaceWith="," />
|
||||
<RegEx find="([?!])-" replaceWith="$1 -" />
|
||||
<!-- Space after last of some consecutive dots (eg. "...") -->
|
||||
<RegEx find="(?<=[a-zčđšž])(\.{3}|!)(?=[a-zčđšž])" replaceWith="$1 " />
|
||||
<!-- Delete space after "..." that is at the beginning of the line. You may delete this line if you don't like it -->
|
||||
<!-- <RegEx find="^\.{3} +" replaceWith="..." /> -->
|
||||
<!-- changes "text ... text" to "text... text" -->
|
||||
<RegEx find="(?<=[A-ZČĐŠŽa-zčđšž]) +\.{3} +" replaceWith="... " />
|
||||
<RegEx find="(?<=\S)\. +&quot;" replaceWith=".&quot;" />
|
||||
<RegEx find="&quot; +\." replaceWith="&quot;." />
|
||||
<RegEx find="(?<=\S\.{3}) +&quot;(?=\s|$)" replaceWith="&quot;" />
|
||||
<RegEx find=" +\.{3}$" replaceWith="..." />
|
||||
<RegEx find="(?<=[a-zčđšž])(?: +\.{3}|\.{2}$)" replaceWith="..." />
|
||||
<!-- Space before an opening parenthesis -->
|
||||
<RegEx find="(?<=[A-ZČĐŠŽa-zčđšž])\(" replaceWith=" (" />
|
||||
<!-- Space after a question mark -->
|
||||
<RegEx find="\?(?=[A-ZČĐŠŽčš])" replaceWith="? " />
|
||||
<RegEx find="(?<=^|>)\.{3} +(?=[A-ZČĐŠŽčš])" replaceWith="..." />
|
||||
<!-- Removes ... when the line begins with "... -->
|
||||
<RegEx find="^&quot;\.{3} +" replaceWith="&quot;" />
|
||||
<RegEx find="(?<=[0-9])\$" replaceWith=" $$" />
|
||||
<!-- ti š -> t š by Strider -->
|
||||
<!-- Replace every "**ti šu*" with "**t šu*" and every "**ti še*" with "**t še*" -->
|
||||
<!-- <RegEx find="([a-z])ti (š+[eu])" replaceWith="$1t $2" /> -->
|
||||
<!-- <RegEx find="([A-Za-z])ti( |\r?\n)(š[eu])" replaceWith="$1t$2$3" /> -->
|
||||
<!-- <RegEx find="(?i)\b(ni)t (š[eu])" replaceWith="$1ti $2" /> -->
|
||||
<!-- <RegEx find="\. +Mr. " replaceWith=". G. " /> -->
|
||||
<!-- <RegEx find="\. +Mrs. " replaceWith=". Gđa " /> -->
|
||||
<!-- <RegEx find="\. +Miss " replaceWith=". Gđica " /> -->
|
||||
<!-- <RegEx find=", +Mrs. " replaceWith=", gđo " /> -->
|
||||
<!-- <RegEx find=", +Miss " replaceWith=", gđice " /> -->
|
||||
<!-- Space after <i> and after .. -->
|
||||
<RegEx find="^(<[ibu]>) +" replaceWith="$1" />
|
||||
<RegEx find="^\.{2} +" replaceWith="..." />
|
||||
<!-- Space between ? and "</i> -->
|
||||
<RegEx find="([.?!]) +(&quot;<)" replaceWith="$1$2" />
|
||||
<!-- No space in "Npr.:" -->
|
||||
<RegEx find="(?<=[Nn]pr\.) *: *" replaceWith=": " />
|
||||
<RegEx find="\. ," replaceWith=".," />
|
||||
<RegEx find="([?!])\." replaceWith="$1" />
|
||||
<!-- So as not to break signatures containing ..:: -->
|
||||
<RegEx find="\.{3}::" replaceWith="..::" />
|
||||
<RegEx find="::\.{3}" replaceWith="::.." />
|
||||
<RegEx find="\.{2} +::" replaceWith="..::" />
|
||||
<!-- Abbreviations without spaces -->
|
||||
<RegEx find="d\. o\.o\." replaceWith="d.o.o." />
|
||||
<!-- When a line starts with ... followed by a lowercase letter -->
|
||||
<!-- <RegEx find="^\.{3}([a-zčđšž"<])" replaceWith="$1" /> -->
|
||||
<!-- <RegEx find=" +([.?!])" replaceWith="$1" /> -->
|
||||
</RegularExpressions>
|
||||
</OCRFixReplaceList>
|
||||
+405
@@ -0,0 +1,405 @@
|
||||
<OCRFixReplaceList>
|
||||
<WholeWords>
|
||||
<Word from="lârt" to="lärt" />
|
||||
<Word from="hedervårda" to="hedervärda" />
|
||||
<Word from="stormâstare" to="stormästare" />
|
||||
<Word from="Avfârd" to="Avfärd" />
|
||||
<Word from="tâlten" to="tälten" />
|
||||
<Word from="ârjag" to="är jag" />
|
||||
<Word from="ärjag" to="är jag" />
|
||||
<Word from="jâmlikar" to="jämlikar" />
|
||||
<Word from="Riskakofl" to="Riskakor" />
|
||||
<Word from="Karamellen/" to="Karamellen" />
|
||||
<Word from="Lngenüng" to="Ingenting" />
|
||||
<Word from="ärju" to="är ju" />
|
||||
<Word from="Sá" to="Så" />
|
||||
<Word from="närjag" to="när jag" />
|
||||
<Word from="alltjag" to="allt jag" />
|
||||
<Word from="görjag" to="gör jag" />
|
||||
<Word from="trorjag" to="tror jag" />
|
||||
<Word from="varju" to="var ju" />
|
||||
<Word from="görju" to="gör ju" />
|
||||
<Word from="kanju" to="kan ju" />
|
||||
<Word from="blirjag" to="blir jag" />
|
||||
<Word from="sägerjag" to="säger jag" />
|
||||
<Word from="behållerjag" to="behåller jag" />
|
||||
<Word from="prøblem" to="problem" />
|
||||
<Word from="räddadeju" to="räddade ju" />
|
||||
<Word from="honøm" to="honom" />
|
||||
<Word from="Ln" to="In" />
|
||||
<Word from="svårflörtad" to="svårflörtad" />
|
||||
<Word from="øch" to="och" />
|
||||
<Word from="flörtar" to="flörtar" />
|
||||
<Word from="kännerjag" to="känner jag" />
|
||||
<Word from="flickan" to="flickan" />
|
||||
<Word from="snø" to="snö" />
|
||||
<Word from="gerju" to="ger ju" />
|
||||
<Word from="køntakter" to="kontakter" />
|
||||
<Word from="ølycka" to="olycka" />
|
||||
<Word from="nølla" to="nolla" />
|
||||
<Word from="sinnenajublar" to="sinnena jublar" />
|
||||
<Word from="ijobbet" to="i jobbet" />
|
||||
<Word from="Fårjag" to="Får jag" />
|
||||
<Word from="Ar" to="Är" />
|
||||
<Word from="liggerju" to="ligger ju" />
|
||||
<Word from="um" to="om" />
|
||||
<Word from="lbland" to="Ibland" />
|
||||
<Word from="skjuterjag" to="skjuter jag" />
|
||||
<Word from="Vaddå" to="Vad då" />
|
||||
<Word from="pratarjämt" to="pratar jämt" />
|
||||
<Word from="harju" to="har ju" />
|
||||
<Word from="sitterjag" to="sitter jag" />
|
||||
<Word from="häfla" to="härja" />
|
||||
<Word from="sfiäl" to="stjäl" />
|
||||
<Word from="FÖU" to="Följ" />
|
||||
<Word from="varförjag" to="varför jag" />
|
||||
<Word from="sfiärna" to="stjärna" />
|
||||
<Word from="böflar" to="börjar" />
|
||||
<Word from="böflan" to="början" />
|
||||
<Word from="stäri" to="står" />
|
||||
<Word from="pä" to="på" />
|
||||
<Word from="harjag" to="har jag" />
|
||||
<Word from="attjag" to="att jag" />
|
||||
<Word from="Verkarjag" to="Verkar jag" />
|
||||
<Word from="Kännerjag" to="Känner jag" />
|
||||
<Word from="därjag" to="där jag" />
|
||||
<Word from="tufi" to="tuff" />
|
||||
<Word from="lurarjag" to="lurar jag" />
|
||||
<Word from="varjättebra" to="var jättebra" />
|
||||
<Word from="allvan" to="allvar" />
|
||||
<Word from="dethär" to="det här" />
|
||||
<Word from="vafle" to="varje" />
|
||||
<Word from="FöUer" to="Följer" />
|
||||
<Word from="personalmötetl" to="personalmötet!" />
|
||||
<Word from="harjust" to="har just" />
|
||||
<Word from="ärjätteduktig" to="är jätteduktig" />
|
||||
<Word from="därja" to="där ja" />
|
||||
<Word from="lngenüng" to="Ingenting" />
|
||||
<Word from="iluften" to="i luften" />
|
||||
<Word from="ösen" to="öser" />
|
||||
<Word from="tvâ" to="två" />
|
||||
<Word from="Uejerna" to="Tjejerna" />
|
||||
<Word from="hån*" to="hårt" />
|
||||
<Word from="Ärjag" to="Är jag" />
|
||||
<Word from="keL" to="Okej" />
|
||||
<Word from="Förjag" to="För jag" />
|
||||
<Word from="varjättekul" to="var jättekul" />
|
||||
<Word from="kämpan" to="kämpar" />
|
||||
<Word from="mycketjobb" to="mycket jobb" />
|
||||
<Word from="Uus" to="ljus" />
|
||||
<Word from="serjag" to="ser jag" />
|
||||
<Word from="vetjag" to="vet jag" />
|
||||
<Word from="fårjag" to="får jag" />
|
||||
<Word from="hurjag" to="hur jag" />
|
||||
<Word from="försökerjag" to="försöker jag" />
|
||||
<Word from="tánagel" to="tånagel" />
|
||||
<Word from="vaüe" to="varje" />
|
||||
<Word from="Uudet" to="ljudet" />
|
||||
<Word from="amhopa" to="allihopa" />
|
||||
<Word from="Väü" to="Välj" />
|
||||
<Word from="gäri" to="går" />
|
||||
<Word from="rödüus" to="rödljus" />
|
||||
<Word from="Uuset" to="ljuset" />
|
||||
<Word from="Ridàn" to="Ridån" />
|
||||
<Word from="viüa" to="vilja" />
|
||||
<Word from="gåri" to="går i" />
|
||||
<Word from="Hurdå" to="Hur då" />
|
||||
<Word from="inter\/juar" to="intervjuar" />
|
||||
<Word from="menarjag" to="menar jag" />
|
||||
<Word from="spyrjag" to="spyr jag" />
|
||||
<Word from="briüera" to="briljera" />
|
||||
<Word from="Närjag" to="När jag" />
|
||||
<Word from="ner\/ös" to="nervös" />
|
||||
<Word from="ilivets" to="i livets" />
|
||||
<Word from="nägot" to="något" />
|
||||
<Word from="pà" to="på" />
|
||||
<Word from="Lnnan" to="Innan" />
|
||||
<Word from="Uf" to="Ut" />
|
||||
<Word from="lnnan" to="Innan" />
|
||||
<Word from="Dàren" to="Dåren" />
|
||||
<Word from="Fàrjag" to="Får jag" />
|
||||
<Word from="VadärdetdäL" to="Vad är det där" />
|
||||
<Word from="smàtjuv" to="småtjuv" />
|
||||
<Word from="tàgrånare" to="tågrånare" />
|
||||
<Word from="ditàt" to="ditåt" />
|
||||
<Word from="sä" to="så" />
|
||||
<Word from="vàrdslösa" to="vårdslösa" />
|
||||
<Word from="nàn" to="nån" />
|
||||
<Word from="kommerjag" to="kommer jag" />
|
||||
<Word from="ärjättebra" to="är jättebra" />
|
||||
<Word from="ärjävligt" to="är jävligt" />
|
||||
<Word from="àkerjag" to="åker jag" />
|
||||
<Word from="ellerjapaner" to="eller japaner" />
|
||||
<Word from="attjaga" to="att jaga" />
|
||||
<Word from="eften" to="efter" />
|
||||
<Word from="hästan" to="hästar" />
|
||||
<Word from="Lntensivare" to="Intensivare" />
|
||||
<Word from="fràgarjag" to="frågar jag" />
|
||||
<Word from="pen/ers" to="pervers" />
|
||||
<Word from="ràbarkade" to="råbarkade" />
|
||||
<Word from="styrkon" to="styrkor" />
|
||||
<Word from="Difåf" to="Ditåt" />
|
||||
<Word from="händen" to="händer" />
|
||||
<Word from="föfia" to="följa" />
|
||||
<Word from="Idioten/" to="Idioter!" />
|
||||
<Word from="Varförjagade" to="Varför jagade" />
|
||||
<Word from="därförjag" to="därför jag" />
|
||||
<Word from="forjag" to="for jag" />
|
||||
<Word from="Iivsgladje" to="livsglädje" />
|
||||
<Word from="narjag" to="när jag" />
|
||||
<Word from="sajag" to="sa jag" />
|
||||
<Word from="genastja" to="genast ja" />
|
||||
<Word from="rockumentàren" to="rockumentären" />
|
||||
<Word from="turne" to="turné" />
|
||||
<Word from="fickjag" to="fick jag" />
|
||||
<Word from="sager" to="säger" />
|
||||
<Word from="Ijushårig" to="ljushårig" />
|
||||
<Word from="tradgårdsolycka" to="trädgårdsolycka" />
|
||||
<Word from="kvavdes" to="kvävdes" />
|
||||
<Word from="dàrja" to="där ja" />
|
||||
<Word from="hedersgaster" to="hedersgäster" />
|
||||
<Word from="Nar" to="När" />
|
||||
<Word from="smakiösa" to="smaklösa" />
|
||||
<Word from="lan" to="Ian" />
|
||||
<Word from="Lan" to="Ian" />
|
||||
<Word from="eri" to="er i" />
|
||||
<Word from="universitetsamne" to="universitetsämne" />
|
||||
<Word from="garna" to="gärna" />
|
||||
<Word from="ar" to="är" />
|
||||
<Word from="baltdjur" to="bältdjur" />
|
||||
<Word from="varjag" to="var jag" />
|
||||
<Word from="àr" to="är" />
|
||||
<Word from="förförstàrkare" to="förförstärkare" />
|
||||
<Word from="arjattespeciell" to="är jättespeciell" />
|
||||
<Word from="hàrgår" to="här går" />
|
||||
<Word from="Ia" to="la" />
|
||||
<Word from="Iimousinen" to="limousinen" />
|
||||
<Word from="krickettra" to="kricketträ" />
|
||||
<Word from="hårdrockvàrlden" to="hårdrockvärlden" />
|
||||
<Word from="tràbit" to="träbit" />
|
||||
<Word from="Mellanvastern" to="Mellanvästern" />
|
||||
<Word from="arju" to="är ju" />
|
||||
<Word from="turnen" to="turnén" />
|
||||
<Word from="kanns" to="känns" />
|
||||
<Word from="battre" to="bättre" />
|
||||
<Word from="vàrldsturne" to="världsturne" />
|
||||
<Word from="dar" to="där" />
|
||||
<Word from="sjàlvantànder" to="självantänder" />
|
||||
<Word from="jattelange" to="jättelänge" />
|
||||
<Word from="berattade" to="berättade" />
|
||||
<Word from="Sä" to="Så" />
|
||||
<Word from="vandpunkten" to="vändpunkten" />
|
||||
<Word from="Nàrjag" to="När jag" />
|
||||
<Word from="lasa" to="läsa" />
|
||||
<Word from="skitlàskigt" to="skitläskigt" />
|
||||
<Word from="sambandsvàg" to="sambandsväg" />
|
||||
<Word from="valdigt" to="väldigt" />
|
||||
<Word from="Stamgafiel" to="Stämgaffel" />
|
||||
<Word from="àrjag" to="är jag" />
|
||||
<Word from="tajming" to="tajmning" />
|
||||
<Word from="utgäng" to="utgång" />
|
||||
<Word from="Hàråt" to="Häråt" />
|
||||
<Word from="hàråt" to="häråt" />
|
||||
<Word from="anvander" to="använder" />
|
||||
<Word from="harjobbat" to="har jobbat" />
|
||||
<Word from="imageide" to="imageidé" />
|
||||
<Word from="klafien" to="klaffen" />
|
||||
<Word from="sjalv" to="själv" />
|
||||
<Word from="dvarg" to="dvärg" />
|
||||
<Word from="detjag" to="det jag" />
|
||||
<Word from="dvargarna" to="dvärgarna" />
|
||||
<Word from="fantasivàrld" to="fantasivärld" />
|
||||
<Word from="fiolliga" to="Fjolliga" />
|
||||
<Word from="mandoiinstràngar" to="mandolinsträngar" />
|
||||
<Word from="mittjobb" to="mitt jobb" />
|
||||
<Word from="Skajag" to="Ska jag" />
|
||||
<Word from="landari" to="landar i" />
|
||||
<Word from="gang" to="gäng" />
|
||||
<Word from="Detjag" to="Det jag" />
|
||||
<Word from="Narmre" to="Närmre" />
|
||||
<Word from="Iåtjavelni" to="låtjäveln" />
|
||||
<Word from="Hållerjag" to="Håller jag" />
|
||||
<Word from="visionarer" to="visionärer" />
|
||||
<Word from="Tülvad" to="Till vad" />
|
||||
<Word from="militàrbas" to="militärbas" />
|
||||
<Word from="jattegiada" to="jätteglada" />
|
||||
<Word from="Fastjag" to="Fast jag" />
|
||||
<Word from="såjag" to="så jag" />
|
||||
<Word from="rockvarlden" to="rockvärlden" />
|
||||
<Word from="saknarjag" to="saknar jag" />
|
||||
<Word from="allafall" to="alla fall" />
|
||||
<Word from="fianta" to="fjanta" />
|
||||
<Word from="Kràma" to="Kräma" />
|
||||
<Word from="stammer" to="stämmer" />
|
||||
<Word from="budbàrare" to="budbärare" />
|
||||
<Word from="Iivsfiiosofi" to="livsfilosofi" />
|
||||
<Word from="förjämnan" to="för jämnan" />
|
||||
<Word from="gillarjag" to="gillar jag" />
|
||||
<Word from="Iarvat" to="larvat" />
|
||||
<Word from="klararjag" to="klarar jag" />
|
||||
<Word from="hattafi'àr" to="hattaffär" />
|
||||
<Word from="Dà" to="Då" />
|
||||
<Word from="uppfinna" to="uppfinna" />
|
||||
<Word from="Ràttfåglar" to="Råttfåglar" />
|
||||
<Word from="Sväüboda" to="Sväljboda" />
|
||||
<Word from="Påböflar" to="Påbörjar" />
|
||||
<Word from="slutarju" to="slutar ju" />
|
||||
<Word from="nifiskebuüken" to="i fiskebutiken" />
|
||||
<Word from="härjäkeln" to="här jäkeln" />
|
||||
<Word from="Hßppa" to="Hoppa" />
|
||||
<Word from="förstörds" to="förstördes" />
|
||||
<Word from="varjättegoda" to="var jättegoda" />
|
||||
<Word from="Kor\/" to="Korv" />
|
||||
<Word from="brüléel" to="brülée!" />
|
||||
<Word from="Hei" to="Hej" />
|
||||
<Word from="älskarjordgubbsglass" to="älskar jordgubbsglass" />
|
||||
<Word from="Snöbom" to="Snöboll" />
|
||||
<Word from="SnöboH" to="Snöboll" />
|
||||
<Word from="Snöbol" to="Snöboll" />
|
||||
<Word from="snöboH" to="snöboll" />
|
||||
<Word from="Läggerpå" to="Lägger på" />
|
||||
<Word from="lngefl" to="lnget!" />
|
||||
<Word from="Sägerjättesmarta" to="Säger jättesmarta" />
|
||||
<Word from="dopplen/äderradar" to="dopplerväderradar" />
|
||||
<Word from="säkertjättefin" to="säkert jättefin" />
|
||||
<Word from="ärjättefin" to="är jättefin" />
|
||||
<Word from="verkarju" to="verkar ju" />
|
||||
<Word from="blirju" to="blir ju" />
|
||||
<Word from="kor\/" to="korv" />
|
||||
<Word from="naturkatastrofi" to="naturkatastrof!" />
|
||||
<Word from="stickerjag" to="sticker jag" />
|
||||
<Word from="jättebufié" to="jättebuffé" />
|
||||
<Word from="befinner" to="befinner" />
|
||||
<Word from="Spflng" to="Spring" />
|
||||
<Word from="trecfie" to="tredje" />
|
||||
<Word from="ryckerjag" to="rycker jag" />
|
||||
<Word from="skullejag" to="skulle jag" />
|
||||
<Word from="vetju" to="vet ju" />
|
||||
<Word from="afljag" to="att jag" />
|
||||
<Word from="flnns" to="finns" />
|
||||
<Word from="ärlång" to="är lång" />
|
||||
<Word from="kåra" to="kära" />
|
||||
<Word from="ärfina" to="är fina" />
|
||||
<Word from="äri" to="är i" />
|
||||
<Word from="hörden" to="hör den" />
|
||||
<Word from="ättjäg" to="att jag" />
|
||||
<Word from="gär" to="går" />
|
||||
<Word from="föri" to="för i" />
|
||||
<Word from="Hurvisste" to="Hur visste" />
|
||||
<Word from="fick" to="fick" />
|
||||
<Word from="finns" to="finns" />
|
||||
<Word from="fin" to="fin" />
|
||||
<Word from="Fa" to="Bra." />
|
||||
<Word from="bori" to="bor i" />
|
||||
<Word from="fiendeplanl" to="fiendeplan!" />
|
||||
<Word from="iförnamn" to="i förnamn" />
|
||||
<Word from="detju" to="det ju" />
|
||||
<Word from="Nüd" to="Niki" />
|
||||
<Word from="hatarjag" to="hatar jag" />
|
||||
<Word from="Klararjag" to="Klarar jag" />
|
||||
<Word from="detafier" to="detaljer" />
|
||||
<Word from="vä/" to="väl" />
|
||||
<Word from="smakarju" to="smakar ju" />
|
||||
<Word from="Teachefl" to="Teacher!" />
|
||||
<Word from="imorse" to="i morse" />
|
||||
<Word from="drickerjag" to="dricker jag" />
|
||||
<Word from="ståri" to="står i" />
|
||||
<Word from="Harjag" to="Har jag" />
|
||||
<Word from="Talarjag" to="Talar jag" />
|
||||
<Word from="undrarjag" to="undrar jag" />
|
||||
<Word from="ålderjag" to="ålder jag" />
|
||||
<Word from="vafie" to="varje" />
|
||||
<Word from="förfalskningl" to="förfalskning!" />
|
||||
<Word from="Vifiiiiam" to="William" />
|
||||
<Word from="V\filliams" to="Williams" />
|
||||
<Word from="attjobba" to="att jobba" />
|
||||
<Word from="intei" to="inte i" />
|
||||
<Word from="närV\filliam" to="när William" />
|
||||
<Word from="V\filliam" to="William" />
|
||||
<Word from="Efiersom" to="Eftersom" />
|
||||
<Word from="Vlfilliam" to="William" />
|
||||
<Word from="Iängejag" to="länge jag" />
|
||||
<Word from="'fidigare" to="Tidigare" />
|
||||
<Word from="börjadei" to="började i" />
|
||||
<Word from="merjust" to="mer just" />
|
||||
<Word from="efieråt" to="efteråt" />
|
||||
<Word from="gjordejag" to="gjorde jag" />
|
||||
<Word from="hadeju" to="hade ju" />
|
||||
<Word from="gårvi" to="går vi" />
|
||||
<Word from="köperjag" to="köper jag" />
|
||||
<Word from="Måstejag" to="Måste jag" />
|
||||
<Word from="kännerju" to="känner ju" />
|
||||
<Word from="fln" to="fin" />
|
||||
<Word from="treviig" to="trevlig" />
|
||||
<Word from="Grattisl" to="Grattis!" />
|
||||
<Word from="kande" to="kände" />
|
||||
<Word from="'llden" to="Tiden" />
|
||||
<Word from="sakjag" to="sak jag" />
|
||||
<Word from="klartjag" to="klart jag" />
|
||||
<Word from="häfiigt" to="häftigt" />
|
||||
<Word from="Iämnarjag" to="lämnar jag" />
|
||||
<Word from="gickju" to="gick ju" />
|
||||
<Word from="skajag" to="ska jag" />
|
||||
<Word from="Görjag" to="Gör jag" />
|
||||
<Word from="måstejag" to="måste jag" />
|
||||
<Word from="gra\/iditet" to="graviditet" />
|
||||
<Word from="hittadqdin" to="hittade din" />
|
||||
<Word from="ärjobbigt" to="är jobbigt" />
|
||||
<Word from="Overdrivet" to="Överdrivet" />
|
||||
<Word from="hOgtidlig" to="högtidlig" />
|
||||
<Word from="Overtyga" to="Övertyga" />
|
||||
<Word from="SKILSMASSA" to="SKILSMÄSSA" />
|
||||
<Word from="brukarju" to="brukar ju" />
|
||||
<Word from="lsabel" to="Isabel" />
|
||||
<Word from="kundejag" to="kunde jag" />
|
||||
<Word from="ärläget" to="är läget" />
|
||||
<Word from="blirinte" to="blir inte" />
|
||||
<Word from="l'm" to="I'm" />
|
||||
<Word from="lt's" to="It's" />
|
||||
<Word from="ijakt" to="i jakt" />
|
||||
<Word from="avjordens" to="av jordens" />
|
||||
</WholeWords>
|
||||
<PartialWordsAlways />
|
||||
<PartialWords>
|
||||
<!-- Will be used to check words not in dictionary -->
|
||||
<!-- If new word(s) exists in spelling dictionary, it(they) is accepted -->
|
||||
<WordPart from="¤" to="o" />
|
||||
<WordPart from="fi" to="fi" />
|
||||
<WordPart from="â" to="ä" />
|
||||
<WordPart from="/" to="l" />
|
||||
<WordPart from="vv" to="w" />
|
||||
<WordPart from="IVI" to="M" />
|
||||
<WordPart from="lVI" to="M" />
|
||||
<WordPart from="IVl" to="M" />
|
||||
<WordPart from="lVl" to="M" />
|
||||
<WordPart from="m" to="rn" />
|
||||
<WordPart from="l" to="i" />
|
||||
<WordPart from="€" to="e" />
|
||||
<WordPart from="I" to="l" />
|
||||
<WordPart from="c" to="o" />
|
||||
<WordPart from="i" to="t" />
|
||||
<WordPart from="cc" to="oo" />
|
||||
<WordPart from="ii" to="tt" />
|
||||
<WordPart from="n/" to="ry" />
|
||||
<WordPart from="ae" to="æ" />
|
||||
<!-- "f " will be two words -->
|
||||
<WordPart from="f" to="f " />
|
||||
<WordPart from="c" to="e" />
|
||||
<WordPart from="o" to="e" />
|
||||
<WordPart from="I" to="t" />
|
||||
<WordPart from="n" to="o" />
|
||||
<WordPart from="s" to="e" />
|
||||
<WordPart from="å" to="ä" />
|
||||
<WordPart from="à" to="å" />
|
||||
<WordPart from="n/" to="rv" />
|
||||
</PartialWords>
|
||||
<PartialLines />
|
||||
<PartialLinesAlways />
|
||||
<BeginLines>
|
||||
<Beginning from="Ln " to="In " />
|
||||
<Beginning from="U ppfattat" to="Uppfattat" />
|
||||
</BeginLines>
|
||||
<EndLines />
|
||||
<WholeLines />
|
||||
<RegularExpressions />
|
||||
</OCRFixReplaceList>
|
||||
@@ -0,0 +1,238 @@
|
||||
# coding=utf-8
|
||||
|
||||
import traceback
|
||||
|
||||
import pysubs2
|
||||
import logging
|
||||
import time
|
||||
|
||||
from mods import EMPTY_TAG_PROCESSOR
|
||||
from registry import registry
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class SubtitleModifications(object):
|
||||
debug = False
|
||||
language = None
|
||||
initialized_mods = {}
|
||||
|
||||
font_style_tag_start = u"{\\"
|
||||
|
||||
def __init__(self, debug=False):
|
||||
self.debug = debug
|
||||
self.initialized_mods = {}
|
||||
|
||||
def load(self, fn=None, content=None, language=None, encoding="utf-8"):
|
||||
"""
|
||||
|
||||
:param encoding: used for decoding the content when fn is given, not used in case content is given
|
||||
:param language: babelfish.Language language of the subtitle
|
||||
:param fn: filename
|
||||
:param content: unicode
|
||||
:return:
|
||||
"""
|
||||
self.language = language
|
||||
self.initialized_mods = {}
|
||||
try:
|
||||
if fn:
|
||||
self.f = pysubs2.load(fn, encoding=encoding)
|
||||
elif content:
|
||||
self.f = pysubs2.SSAFile.from_string(content)
|
||||
except (IOError,
|
||||
UnicodeDecodeError,
|
||||
pysubs2.exceptions.UnknownFPSError,
|
||||
pysubs2.exceptions.UnknownFormatIdentifierError,
|
||||
pysubs2.exceptions.FormatAutodetectionError):
|
||||
if fn:
|
||||
logger.exception("Couldn't load subtitle: %s: %s", fn, traceback.format_exc())
|
||||
elif content:
|
||||
logger.exception("Couldn't load subtitle: %s", traceback.format_exc())
|
||||
|
||||
@classmethod
|
||||
def parse_identifier(cls, identifier):
|
||||
# simple identifier
|
||||
if identifier in registry.mods:
|
||||
return identifier, {}
|
||||
|
||||
# identifier with params; identifier(param=value)
|
||||
split_args = identifier[identifier.find("(")+1:-1].split(",")
|
||||
args = dict((key, value) for key, value in [sub.split("=") for sub in split_args])
|
||||
return identifier[:identifier.find("(")], args
|
||||
|
||||
@classmethod
|
||||
def get_mod_class(cls, identifier):
|
||||
identifier, args = cls.parse_identifier(identifier)
|
||||
return registry.mods[identifier]
|
||||
|
||||
@classmethod
|
||||
def get_mod_signature(cls, identifier, **kwargs):
|
||||
return cls.get_mod_class(identifier).get_signature(**kwargs)
|
||||
|
||||
def prepare_mods(self, *mods):
|
||||
parsed_mods = [SubtitleModifications.parse_identifier(mod) for mod in mods]
|
||||
final_mods = {}
|
||||
line_mods = []
|
||||
non_line_mods = []
|
||||
|
||||
for identifier, args in parsed_mods:
|
||||
if identifier not in registry.mods:
|
||||
logger.error("Mod %s not loaded", identifier)
|
||||
continue
|
||||
|
||||
mod_cls = registry.mods[identifier]
|
||||
# exclusive mod, kill old, use newest
|
||||
if identifier in final_mods and mod_cls.exclusive:
|
||||
final_mods.pop(identifier)
|
||||
|
||||
# merge args of duplicate mods if possible
|
||||
elif identifier in final_mods and mod_cls.args_mergeable:
|
||||
final_mods[identifier] = mod_cls.merge_args(final_mods[identifier], args)
|
||||
continue
|
||||
final_mods[identifier] = args
|
||||
|
||||
# separate all mods into line and non-line mods
|
||||
for identifier, args in final_mods.iteritems():
|
||||
mod_cls = registry.mods[identifier]
|
||||
if mod_cls.modifies_whole_file:
|
||||
non_line_mods.append((identifier, args))
|
||||
else:
|
||||
line_mods.append((mod_cls.order, identifier, args))
|
||||
|
||||
# initialize the mods
|
||||
if identifier not in self.initialized_mods:
|
||||
self.initialized_mods[identifier] = mod_cls(self)
|
||||
|
||||
return line_mods, non_line_mods
|
||||
|
||||
def modify(self, *mods):
|
||||
new_entries = []
|
||||
start = time.time()
|
||||
line_mods, non_line_mods = self.prepare_mods(*mods)
|
||||
|
||||
# apply file mods
|
||||
if non_line_mods:
|
||||
non_line_mods_start = time.time()
|
||||
self.apply_non_line_mods(non_line_mods)
|
||||
|
||||
if self.debug:
|
||||
logger.debug("Non-Line mods took %ss", time.time() - non_line_mods_start)
|
||||
|
||||
# sort line mods
|
||||
line_mods.sort(key=lambda x: (x[0] is None, x[0]))
|
||||
|
||||
# apply line mods
|
||||
if line_mods:
|
||||
line_mods_start = time.time()
|
||||
self.apply_line_mods(new_entries, line_mods)
|
||||
|
||||
if self.debug:
|
||||
logger.debug("Line mods took %ss", time.time() - line_mods_start)
|
||||
|
||||
self.f.events = new_entries
|
||||
if self.debug:
|
||||
logger.debug("Subtitle Modification took %ss", time.time() - start)
|
||||
|
||||
def apply_non_line_mods(self, mods):
|
||||
for identifier, args in mods:
|
||||
mod = self.initialized_mods[identifier]
|
||||
mod.modify(None, debug=self.debug, parent=self, **args)
|
||||
|
||||
def apply_line_mods(self, new_entries, mods):
|
||||
for entry in self.f:
|
||||
applied_mods = []
|
||||
lines = []
|
||||
|
||||
line_count = 0
|
||||
start_tags = []
|
||||
end_tags = []
|
||||
for line in entry.text.split(ur"\N"):
|
||||
# don't bother the mods with surrounding tags
|
||||
old_line = line
|
||||
line = line.strip()
|
||||
skip_line = False
|
||||
line_count += 1
|
||||
|
||||
# clean {\X0} tags before processing
|
||||
# fixme: handle nested tags?
|
||||
start_tag = u""
|
||||
end_tag = u""
|
||||
if line.startswith(self.font_style_tag_start):
|
||||
start_tag = line[:5]
|
||||
line = line[5:]
|
||||
if line[-5:-3] == self.font_style_tag_start:
|
||||
end_tag = line[-5:]
|
||||
line = line[:-5]
|
||||
|
||||
for order, identifier, args in mods:
|
||||
mod = self.initialized_mods[identifier]
|
||||
|
||||
line = mod.modify(line.strip(), debug=self.debug, parent=self, **args)
|
||||
if not line:
|
||||
if self.debug:
|
||||
logger.debug(u"%s: %r -> ''", identifier, old_line)
|
||||
skip_line = True
|
||||
break
|
||||
|
||||
applied_mods.append(identifier)
|
||||
|
||||
if skip_line:
|
||||
continue
|
||||
|
||||
if start_tag:
|
||||
start_tags.append(start_tag)
|
||||
|
||||
if end_tag:
|
||||
end_tags.append(end_tag)
|
||||
|
||||
# append new line and clean possibly newly added empty tags
|
||||
cleaned_line = EMPTY_TAG_PROCESSOR.process(start_tag + line + end_tag, debug=self.debug).strip()
|
||||
if cleaned_line:
|
||||
# we may have a single closing tag, if so, try appending it to the previous line
|
||||
if len(cleaned_line) == 5 and cleaned_line.startswith("{\\") and cleaned_line.endswith("0}"):
|
||||
if lines:
|
||||
prev_line = lines.pop()
|
||||
lines.append(prev_line + cleaned_line)
|
||||
continue
|
||||
|
||||
lines.append(cleaned_line)
|
||||
else:
|
||||
if self.debug:
|
||||
logger.debug(u"Ditching now empty line (%r -> %r)", old_line, line)
|
||||
|
||||
if not lines:
|
||||
# don't bother logging when the entry only had one line
|
||||
if self.debug and line_count > 1:
|
||||
logger.debug(u"%r -> ''", entry.text)
|
||||
continue
|
||||
|
||||
new_text = ur"\N".join(lines)
|
||||
|
||||
# poor man's approach to avoid unbalanced tags
|
||||
add_start_tags = []
|
||||
add_end_tags = []
|
||||
if len(start_tags) != len(end_tags):
|
||||
for tag in start_tags:
|
||||
end_tag = tag.replace("1", "0")
|
||||
if end_tag not in end_tags and new_text.count(tag) > new_text.count(end_tag):
|
||||
add_end_tags.append(end_tag)
|
||||
for tag in end_tags:
|
||||
start_tag = tag.replace("0", "1")
|
||||
if start_tag not in start_tags and new_text.count(tag) > new_text.count(start_tag):
|
||||
add_start_tags.append(start_tag)
|
||||
|
||||
if add_end_tags or add_start_tags:
|
||||
entry.text = u"".join(add_start_tags) + new_text + u"".join(add_end_tags)
|
||||
if self.debug:
|
||||
logger.debug(u"Fixing tags: %s (%r -> %r)", str(add_start_tags+add_end_tags), new_text,
|
||||
entry.text)
|
||||
else:
|
||||
entry.text = new_text
|
||||
|
||||
new_entries.append(entry)
|
||||
|
||||
SubMod = SubtitleModifications
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,94 @@
|
||||
# coding=utf-8
|
||||
import re
|
||||
import logging
|
||||
|
||||
from subzero.modification.processors.re_processor import ReProcessor, NReProcessor
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class SubtitleModification(object):
|
||||
identifier = None
|
||||
description = None
|
||||
long_description = None
|
||||
exclusive = False
|
||||
advanced = False # has parameters
|
||||
args_mergeable = False
|
||||
order = None
|
||||
modifies_whole_file = False # operates on the whole file, not individual entries
|
||||
pre_processors = []
|
||||
processors = []
|
||||
post_processors = []
|
||||
|
||||
def __init__(self, parent):
|
||||
return
|
||||
|
||||
def _process(self, content, processors, debug=False, parent=None, **kwargs):
|
||||
if not content:
|
||||
return
|
||||
|
||||
# processors may be a list or a callable
|
||||
#if callable(processors):
|
||||
# _processors = processors()
|
||||
#else:
|
||||
# _processors = processors
|
||||
_processors = processors
|
||||
|
||||
new_content = content
|
||||
for processor in _processors:
|
||||
old_content = new_content
|
||||
new_content = processor.process(new_content, debug=debug)
|
||||
if not new_content:
|
||||
if debug:
|
||||
logger.debug("Processor returned empty line: %s", processor)
|
||||
break
|
||||
if debug:
|
||||
if old_content == new_content:
|
||||
continue
|
||||
logger.debug("%s: %s -> %s", processor, repr(old_content), repr(new_content))
|
||||
return new_content
|
||||
|
||||
def pre_process(self, content, debug=False, parent=None, **kwargs):
|
||||
return self._process(content, self.pre_processors, debug=debug, parent=parent, **kwargs)
|
||||
|
||||
def process(self, content, debug=False, parent=None, **kwargs):
|
||||
return self._process(content, self.processors, debug=debug, parent=parent, **kwargs)
|
||||
|
||||
def post_process(self, content, debug=False, parent=None, **kwargs):
|
||||
return self._process(content, self.post_processors, debug=debug, parent=parent, **kwargs)
|
||||
|
||||
def modify(self, content, debug=False, parent=None, **kwargs):
|
||||
if not content:
|
||||
return
|
||||
|
||||
new_content = content
|
||||
for method in ("pre_process", "process", "post_process"):
|
||||
if not new_content:
|
||||
return
|
||||
new_content = getattr(self, method)(new_content, debug=debug, parent=parent, **kwargs)
|
||||
|
||||
return new_content
|
||||
|
||||
@classmethod
|
||||
def get_signature(cls, **kwargs):
|
||||
string_args = ",".join(["%s=%s" % (key, value) for key, value in kwargs.iteritems()])
|
||||
return "%s(%s)" % (cls.identifier, string_args)
|
||||
|
||||
@classmethod
|
||||
def merge_args(cls, args1, args2):
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
class SubtitleTextModification(SubtitleModification):
|
||||
pass
|
||||
|
||||
|
||||
EMPTY_TAG_PROCESSOR = ReProcessor(re.compile(r'({\\\w1})[\s.,\-_!?]*({\\\w0})'), "", name="empty_tag")
|
||||
|
||||
empty_line_post_processors = [
|
||||
# empty tag
|
||||
EMPTY_TAG_PROCESSOR,
|
||||
|
||||
# empty line (needed?)
|
||||
NReProcessor(re.compile(r'^[\s-]+$'), "", name="empty_line"),
|
||||
]
|
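The empty-tag cleanup defined above can be tried in isolation. The following is a minimal Python 3 sketch, not the project's code: `strip_empty_tags` and `EMPTY_TAG` are hypothetical names, and the hyphen inside the character class is escaped because `re` reads an unescaped `,-_` as a character range rather than as three literal characters.

```python
import re

# Hypothetical stand-in for EMPTY_TAG_PROCESSOR: an opening style tag such as {\i1},
# followed only by whitespace or punctuation, followed by a closing tag such as {\i0}.
EMPTY_TAG = re.compile(r'\{\\\w1\}[\s.,\-_!?]*\{\\\w0\}')


def strip_empty_tags(text):
    """Remove style tag pairs that no longer wrap any real text."""
    return EMPTY_TAG.sub("", text)


if __name__ == "__main__":
    print(strip_empty_tags(r"{\i1} - {\i0}Hello"))
    print(strip_empty_tags(r"{\i1}Hello{\i0}"))
    # An unescaped ",-_" inside [...] is a range that also covers digits and A-Z.
    print(bool(re.match(r"[,-_]", "A")))
```

A tag pair that still wraps text is left alone; only pairs whose content was reduced to whitespace or punctuation are dropped.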
||||
@@ -0,0 +1,51 @@
|
||||
# coding=utf-8
|
||||
|
||||
import logging
|
||||
from collections import OrderedDict
|
||||
|
||||
from subzero.modification.mods import SubtitleModification
|
||||
from subzero.modification import registry
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
COLOR_MAP = OrderedDict([
|
||||
("white", "#FFFFFF"),
|
||||
("light-grey", "#C0C0C0"),
|
||||
("red", "#FF0000"),
|
||||
("green", "#00FF00"),
|
||||
("yellow", "#FFFF00"),
|
||||
("blue", "#0000FF"),
|
||||
("magenta", "#FF00FF"),
|
||||
("cyan", "#00FFFF"),
|
||||
("black", "#000000"),
|
||||
("dark-red", "#800000"),
|
||||
("dark-green", "#008000"),
|
||||
("dark-yellow", "#808000"),
|
||||
("dark-blue", "#000080"),
|
||||
("dark-magenta", "#800080"),
|
||||
("dark-cyan", "#008080"),
|
||||
("dark-grey", "#808080"),
|
||||
])
|
||||
|
||||
|
||||
class Color(SubtitleModification):
|
||||
identifier = "color"
|
||||
description = "Change the color of the subtitle"
|
||||
exclusive = True
|
||||
advanced = True
|
||||
|
||||
colors = COLOR_MAP
|
||||
|
||||
long_description = """\
|
||||
Adds the requested color to every line of the subtitle. Support depends on player.
|
||||
"""
|
||||
|
||||
def modify(self, content, debug=False, parent=None, **kwargs):
|
||||
color = self.colors.get(kwargs.get("name"))
|
||||
if color:
|
||||
return u'<font color="%s">%s</font>' % (color, content)
|
||||
return content
|
||||
|
||||
|
||||
registry.register(Color)
|
||||
@@ -0,0 +1,72 @@
|
||||
# coding=utf-8
|
||||
|
||||
import re
|
||||
|
||||
from subzero.modification.mods import SubtitleTextModification, empty_line_post_processors
|
||||
from subzero.modification.processors.string_processor import StringProcessor
|
||||
from subzero.modification.processors.re_processor import NReProcessor
|
||||
from subzero.modification import registry
|
||||
|
||||
|
||||
class CommonFixes(SubtitleTextModification):
|
||||
identifier = "common"
|
||||
description = "Basic common fixes"
|
||||
exclusive = True
|
||||
order = 40
|
||||
|
||||
long_description = """\
|
||||
Fix common whitespace/punctuation issues in subtitles
|
||||
"""
|
||||
|
||||
processors = [
|
||||
# -- = ...
|
||||
StringProcessor("-- ", '... ', name="CM_doubledash"),
|
||||
|
||||
# '' = "
|
||||
StringProcessor("''", '"', name="CM_double_apostrophe"),
|
||||
|
||||
# remove leading ...
|
||||
NReProcessor(re.compile(r'(?u)^\.\.\.[\s]*'), "", name="CM_leading_ellipsis"),
|
||||
|
||||
# no space after ellipsis
|
||||
NReProcessor(re.compile(r'(?u)\.\.\.(?![\s.,!?\'"])(?!$)'), "... ", name="CM_ellipsis_no_space"),
|
||||
|
||||
# multiple spaces
|
||||
NReProcessor(re.compile(r'(?u)[\s]{2,}'), " ", name="CM_multiple_spaces"),
|
||||
|
||||
# no space after starting dash
|
||||
NReProcessor(re.compile(r'(?u)^-(?![\s-])'), "- ", name="CM_dash_space"),
|
||||
|
||||
# remove starting spaced dots (not matching ellipses)
|
||||
NReProcessor(re.compile(r'(?u)^(?!\s?(\.\s\.\s\.)|(\s?\.{3}))[\s.]*'), "", name="CM_starting_spacedots"),
|
||||
|
||||
# space missing before doublequote
|
||||
# ReProcessor(re.compile(r'(?u)(?<!^)(?<![\s(\["])("[^"]+")'), r' \1', name="CM_space_before_dblquote"),
|
||||
|
||||
# space missing after doublequote
|
||||
# ReProcessor(re.compile(r'(?u)("[^"\s][^"]+")([^\s.,!?)\]]+)'), r"\1 \2", name="CM_space_after_dblquote"),
|
||||
|
||||
# space before ending doublequote?
|
||||
|
||||
# remove >>
|
||||
NReProcessor(re.compile(r'(?u)^\s?>>\s*'), "", name="CM_leading_crocodiles"),
|
||||
|
||||
# replace uppercase I with lowercase L in words
|
||||
NReProcessor(re.compile(ur'(?u)([A-zÀ-ž][a-zà-ž]+)(I+)'),
|
||||
lambda match: ur'%s%s' % (match.group(1), "l"*len(match.group(2))), name="CM_uppercase_i_in_word"),
|
||||
|
||||
# fix spaces in numbers (allows for punctuation: ,.:' (comma only fixed if after space, those may be
|
||||
# countdowns otherwise); don't break up ellipses
|
||||
# fixme: maybe check whether it's a countdown (second part smaller than the first), otherwise handle default?
|
||||
NReProcessor(re.compile(r'(?u)([0-9]+[0-9.:\']*(?<!\.\.))\s+((?!\.\.)[0-9,.:\']*[0-9]+)'), r"\1\2",
|
||||
name="CM_spaces_in_numbers"),
|
||||
|
||||
# uppercase after dot
|
||||
NReProcessor(re.compile(ur'(?u)((?:[^.\s])+\.\s+)([a-zà-ž])'),
|
||||
lambda match: ur'%s%s' % (match.group(1), match.group(2).upper()), name="CM_uppercase_after_dot"),
|
||||
]
|
||||
|
||||
post_processors = empty_line_post_processors
|
||||
|
||||
|
||||
registry.register(CommonFixes)
|
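To see how a processor list like the one in `CommonFixes` acts on a single line, here is a small Python 3 sketch. It restates three of the patterns above (multiple spaces, dash without space, lowercase after a dot) without the `ur''` literals; `RULES` and `apply_rules` are hypothetical names used only for illustration, and the real classes add logging and registration on top.

```python
import re

# Three of the rules listed above, restated for Python 3.
RULES = [
    # collapse runs of whitespace
    (re.compile(r'[\s]{2,}'), " "),
    # add a space after a leading dash
    (re.compile(r'^-(?![\s-])'), "- "),
    # uppercase the first letter after "word. "
    (re.compile(r'((?:[^.\s])+\.\s+)([a-zà-ž])'),
     lambda m: m.group(1) + m.group(2).upper()),
]


def apply_rules(line, rules=RULES):
    """Apply each (pattern, replacement) pair in order, like a processor chain."""
    for pattern, replacement in rules:
        line = pattern.sub(replacement, line)
    return line


if __name__ == "__main__":
    print(apply_rules("-hello.  how are you"))
```

Order matters: each rule sees the output of the previous one, which is why the whitespace cleanup runs before the rules that depend on single spaces.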
||||
@@ -0,0 +1,27 @@
|
||||
# coding=utf-8
|
||||
|
||||
import logging
|
||||
|
||||
from subzero.modification.mods import SubtitleModification
|
||||
from subzero.modification import registry
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class ChangeFPS(SubtitleModification):
|
||||
identifier = "change_FPS"
|
||||
description = "Change the FPS of the subtitle"
|
||||
exclusive = True
|
||||
advanced = True
|
||||
modifies_whole_file = True
|
||||
|
||||
long_description = """\
|
||||
Re-syncs the subtitle to the framerate of the current media file.
|
||||
"""
|
||||
|
||||
def modify(self, content, debug=False, parent=None, **kwargs):
|
||||
fps_from = kwargs.get("from")
|
||||
fps_to = kwargs.get("to")
|
||||
parent.f.transform_framerate(float(fps_from), float(fps_to))
|
||||
|
||||
registry.register(ChangeFPS)
|
||||
@@ -0,0 +1,48 @@
|
||||
# coding=utf-8
|
||||
import re
|
||||
|
||||
from subzero.modification.mods import SubtitleTextModification, empty_line_post_processors
|
||||
from subzero.modification.processors.re_processor import NReProcessor
|
||||
from subzero.modification import registry
|
||||
|
||||
|
||||
class HearingImpaired(SubtitleTextModification):
|
||||
identifier = "remove_HI"
|
||||
description = "Remove Hearing Impaired tags"
|
||||
exclusive = True
|
||||
order = 10
|
||||
|
||||
long_description = """\
|
||||
Removes tags, text and characters from subtitles that are meant for hearing impaired people
|
||||
"""
|
||||
|
||||
processors = [
|
||||
# brackets (only remove if at least 3 consecutive uppercase chars in brackets)
|
||||
NReProcessor(re.compile(ur'(?sux)[([].+(?=[A-ZÀ-Ž]{3,}).+[)\]]'), "", name="HI_brackets"),
|
||||
|
||||
# text before colon (and possible dash in front), max 11 chars after the first whitespace (if any)
|
||||
# NReProcessor(re.compile(r'(?u)(^[A-z\-\'"_]+[\w\s]{0,11}:[^0-9{2}][\s]*)'), "", name="HI_before_colon"),
|
||||
|
||||
# text before colon (at least 4 consecutive uppercase chars)
|
||||
NReProcessor(re.compile(ur'(?u)(^(?=.*[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\s]+:\s*)'), "", name="HI_before_colon"),
|
||||
|
||||
# text in brackets at start, after optional dash, before colon or at end of line
|
||||
# fixme: may be too aggressive
|
||||
NReProcessor(re.compile(ur'(?um)(^-?\s?[([][A-zÀ-ž-_\s]{3,}[)\]](?:(?=$)|:\s*))'), "",
|
||||
name="HI_brackets_special"),
|
||||
|
||||
# all caps line (at least 4 consecutive uppercase chars)
|
||||
NReProcessor(re.compile(ur'(?u)(^(?=.*[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\s]+$)'), "", name="HI_all_caps"),
|
||||
|
||||
# dash in front
|
||||
# NReProcessor(re.compile(r'(?u)^\s*-\s*'), "", name="HI_starting_dash"),
|
||||
|
||||
# all caps at start before new sentence
|
||||
NReProcessor(re.compile(ur'(?u)^(?=[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\s]+\s([A-ZÀ-Ž][a-zà-ž].+)'), r"\1",
|
||||
name="HI_starting_upper_then_sentence"),
|
||||
]
|
||||
|
||||
post_processors = empty_line_post_processors
|
||||
|
||||
|
||||
registry.register(HearingImpaired)
|
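The speaker-label and all-caps rules above can be exercised on their own. This is a hedged Python 3 sketch: `remove_hi`, `UPPER` and the two pattern names are hypothetical, the patterns are simplified restatements of `HI_before_colon` and `HI_all_caps` with the hyphen escaped, and the real mod also runs bracket rules and the empty-line post-processors.

```python
import re

UPPER = "A-ZÀ-Ž"

# Speaker label: only uppercase letters, dashes, underscores and whitespace before
# the colon, and at least four consecutive uppercase letters somewhere in the line.
HI_BEFORE_COLON = re.compile(r'^(?=.*[%s]{4,})[%s\-_\s]+:\s*' % (UPPER, UPPER))

# Whole line in capitals, again with at least four consecutive uppercase letters.
HI_ALL_CAPS = re.compile(r'^(?=.*[%s]{4,})[%s\-_\s]+$' % (UPPER, UPPER))


def remove_hi(line):
    """Strip a leading speaker label, then drop the line if it is all caps."""
    for pattern in (HI_BEFORE_COLON, HI_ALL_CAPS):
        line = pattern.sub("", line)
    return line


if __name__ == "__main__":
    for sample in ("JOHN: Hello there", "DOOR SLAMS", "Note: this stays"):
        print(repr(remove_hi(sample)))
```

The four-letter lookahead is what keeps ordinary mixed-case text such as `Note:` from being treated as a speaker label.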
||||
@@ -0,0 +1,48 @@
|
||||
# coding=utf-8
|
||||
import logging
|
||||
|
||||
from subzero.modification.mods import SubtitleTextModification
|
||||
from subzero.modification.processors.string_processor import MultipleLineProcessor, WholeLineProcessor
|
||||
from subzero.modification.processors.re_processor import MultipleWordReProcessor
|
||||
from subzero.modification import registry
|
||||
from subzero.modification.dictionaries.data import data as OCR_fix_data
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class FixOCR(SubtitleTextModification):
|
||||
identifier = "OCR_fixes"
|
||||
description = "Fix common OCR issues"
|
||||
exclusive = True
|
||||
order = 20
|
||||
data_dict = None
|
||||
|
||||
long_description = """\
|
||||
Fix issues that happen when a subtitle gets converted from bitmap to text through OCR
|
||||
"""
|
||||
|
||||
def __init__(self, parent):
|
||||
super(FixOCR, self).__init__(parent)
|
||||
data_dict = OCR_fix_data.get(parent.language.alpha3t)
|
||||
if not data_dict:
|
||||
logger.debug("No SnR-data available for language %s", parent.language)
|
||||
return
|
||||
|
||||
self.data_dict = data_dict
|
||||
self.processors = self.get_processors()
|
||||
|
||||
def get_processors(self):
|
||||
if not self.data_dict:
|
||||
return []
|
||||
|
||||
return [
|
||||
WholeLineProcessor(self.data_dict["WholeLines"], name="SE_replace_line"),
|
||||
MultipleWordReProcessor(self.data_dict["WholeWords"], name="SE_replace_word"),
|
||||
MultipleWordReProcessor(self.data_dict["BeginLines"], name="SE_replace_beginline"),
|
||||
MultipleWordReProcessor(self.data_dict["EndLines"], name="SE_replace_endline"),
|
||||
MultipleWordReProcessor(self.data_dict["PartialLines"], name="SE_replace_partialline"),
|
||||
MultipleLineProcessor(self.data_dict["PartialWordsAlways"], name="SE_replace_partialwordsalways")
|
||||
]
|
||||
|
||||
|
||||
registry.register(FixOCR)
|
||||
@@ -0,0 +1,40 @@
|
||||
# coding=utf-8
|
||||
|
||||
import logging
|
||||
|
||||
from subzero.modification.mods import SubtitleModification
|
||||
from subzero.modification import registry
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class ShiftOffset(SubtitleModification):
|
||||
identifier = "shift_offset"
|
||||
description = "Change the timing of the subtitle"
|
||||
exclusive = False
|
||||
advanced = True
|
||||
args_mergeable = True
|
||||
modifies_whole_file = True
|
||||
|
||||
long_description = """\
|
||||
Adds or subtracts a certain amount of time from the whole subtitle to match your media
|
||||
"""
|
||||
|
||||
@classmethod
|
||||
def merge_args(cls, args1, args2):
|
||||
new_args = dict((key, int(value)) for key, value in args1.iteritems())
|
||||
|
||||
for key, value in args2.iteritems():
|
||||
if key in new_args:
|
||||
new_args[key] += int(value)
|
||||
else:
|
||||
new_args[key] = int(value)
|
||||
|
||||
return new_args
|
||||
|
||||
def modify(self, content, debug=False, parent=None, **kwargs):
|
||||
parent.f.shift(h=int(kwargs.get("h", 0)), m=int(kwargs.get("m", 0)), s=int(kwargs.get("s", 0)),
|
||||
ms=int(kwargs.get("ms", 0)))
|
||||
|
||||
|
||||
registry.register(ShiftOffset)
|
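Because `ShiftOffset` sets `args_mergeable = True`, two shift requests can be folded into one before anything is applied. A minimal Python 3 restatement of that merge, assuming the arguments arrive as strings keyed by `h`, `m`, `s` or `ms` (the function name `merge_shift_args` is hypothetical):

```python
def merge_shift_args(args1, args2):
    """Sum two sets of shift arguments key by key, converting values to int."""
    merged = dict((key, int(value)) for key, value in args1.items())
    for key, value in args2.items():
        merged[key] = merged.get(key, 0) + int(value)
    return merged


if __name__ == "__main__":
    # +1h30s followed by -10s and +500ms
    print(merge_shift_args({"h": "1", "s": "30"}, {"s": "-10", "ms": "500"}))
```

Merging first means the subtitle file is shifted once with the net offset instead of once per stored modification.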
||||
@@ -0,0 +1,29 @@
|
||||
# coding=utf-8
|
||||
|
||||
|
||||
class Processor(object):
|
||||
"""
|
||||
Processor base class
|
||||
"""
|
||||
name = None
|
||||
parent = None
|
||||
|
||||
def __init__(self, name=None, parent=None):
|
||||
self.name = name
|
||||
self.parent = parent
|
||||
|
||||
@property
|
||||
def info(self):
|
||||
return self.name
|
||||
|
||||
def process(self, content, debug=False):
|
||||
return content
|
||||
|
||||
def __repr__(self):
|
||||
return "Processor <%s %s>" % (self.__class__.__name__, self.info)
|
||||
|
||||
def __str__(self):
|
||||
return repr(self)
|
||||
|
||||
def __unicode__(self):
|
||||
return unicode(repr(self))
|
||||
@@ -0,0 +1,48 @@
|
||||
# coding=utf-8
|
||||
import re
|
||||
import logging
|
||||
|
||||
from subzero.modification.processors import Processor
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class ReProcessor(Processor):
|
||||
"""
|
||||
Regex processor
|
||||
"""
|
||||
pattern = None
|
||||
replace_with = None
|
||||
|
||||
def __init__(self, pattern, replace_with, name=None):
|
||||
super(ReProcessor, self).__init__(name=name)
|
||||
self.pattern = pattern
|
||||
self.replace_with = replace_with
|
||||
|
||||
def process(self, content, debug=False):
|
||||
return self.pattern.sub(self.replace_with, content)
|
||||
|
||||
|
||||
class NReProcessor(ReProcessor):
|
||||
pass
|
||||
|
||||
|
||||
class MultipleWordReProcessor(ReProcessor):
|
||||
"""
|
||||
Expects a dictionary in the form of:
|
||||
dict = {
|
||||
"data": {"old_value": "new_value"},
|
||||
"pattern": compiled re object that matches data.keys()
|
||||
}
|
||||
replaces found key in pattern with the corresponding value in data
|
||||
"""
|
||||
def __init__(self, snr_dict, name=None, parent=None):
|
||||
super(ReProcessor, self).__init__(name=name)
|
||||
self.snr_dict = snr_dict
|
||||
|
||||
def process(self, content, debug=False):
|
||||
if not self.snr_dict["data"]:
|
||||
return content
|
||||
|
||||
return self.snr_dict["pattern"].sub(lambda x: self.snr_dict["data"][x.group(0)], content)
|
||||
|
||||
@@ -0,0 +1,84 @@
|
||||
# coding=utf-8
|
||||
|
||||
import logging
|
||||
|
||||
from subzero.modification.processors import Processor
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class StringProcessor(Processor):
|
||||
"""
|
||||
String replacement processor base
|
||||
"""
|
||||
|
||||
def __init__(self, search, replace, name=None, parent=None):
|
||||
super(StringProcessor, self).__init__(name=name)
|
||||
self.search = search
|
||||
self.replace = replace
|
||||
|
||||
def process(self, content, debug=False):
|
||||
return content.replace(self.search, self.replace)
|
||||
|
||||
|
||||
class MultipleLineProcessor(Processor):
|
||||
"""
|
||||
replaces stuff in whole lines
|
||||
|
||||
takes a search/replace dict as first argument
|
||||
Expects a dictionary in the form of:
|
||||
dict = {
|
||||
"data": {"old_value": "new_value"}
|
||||
}
|
||||
"""
|
||||
def __init__(self, snr_dict, name=None, parent=None):
|
||||
super(MultipleLineProcessor, self).__init__(name=name)
|
||||
self.snr_dict = snr_dict
|
||||
|
||||
def process(self, content, debug=False):
|
||||
if not self.snr_dict["data"]:
|
||||
return content
|
||||
|
||||
for key, value in self.snr_dict["data"].iteritems():
|
||||
if debug and key in content:
|
||||
logger.debug(u"Replacing '%s' with '%s' in '%s'", key, value, content)
|
||||
|
||||
content = content.replace(key, value)
|
||||
|
||||
return content
|
||||
|
||||
|
||||
class WholeLineProcessor(MultipleLineProcessor):
|
||||
def process(self, content, debug=False):
|
||||
if not self.snr_dict["data"]:
|
||||
return content
|
||||
content = content.strip()
|
||||
|
||||
for key, value in self.snr_dict["data"].iteritems():
|
||||
if content == key:
|
||||
if debug:
|
||||
logger.debug(u"Replacing '%s' with '%s'", key, value)
|
||||
|
||||
content = value
|
||||
break
|
||||
|
||||
return content
|
||||
|
||||
|
||||
class MultipleWordProcessor(MultipleLineProcessor):
|
||||
"""
|
||||
replaces words
|
||||
|
||||
takes a search/replace dict as first argument
|
||||
Expects a dictionary in the form of:
|
||||
dict = {
|
||||
"data": {"old_value": "new_value"}
|
||||
}
|
||||
"""
|
||||
def process(self, content, debug=False):
|
||||
words = content.split(u" ")
|
||||
new_words = []
|
||||
for word in words:
|
||||
new_words.append(self.snr_dict["data"].get(word, word))
|
||||
|
||||
return u" ".join(new_words)
|
||||
@@ -0,0 +1,17 @@
|
||||
# coding=utf-8
|
||||
from collections import OrderedDict
|
||||
|
||||
|
||||
class SubtitleModRegistry(object):
|
||||
mods = None
|
||||
mods_available = None
|
||||
|
||||
def __init__(self):
|
||||
self.mods = OrderedDict()
|
||||
self.mods_available = []
|
||||
|
||||
def register(self, mod):
|
||||
self.mods[mod.identifier] = mod
|
||||
self.mods_available.append(mod.identifier)
|
||||
|
||||
registry = SubtitleModRegistry()
|
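A rough Python 3 sketch of how a registry like this is typically used: mods register themselves by `identifier`, and a caller looks them up and applies them in sequence. The two mods, the `run` helper and the choice to sort by the `order` attribute are assumptions made for illustration; the diff itself only shows that mods are iterated as `(order, identifier, args)` tuples.

```python
from collections import OrderedDict


class ModRegistry(object):
    """Same shape as SubtitleModRegistry: identifier -> mod class."""

    def __init__(self):
        self.mods = OrderedDict()
        self.mods_available = []

    def register(self, mod):
        self.mods[mod.identifier] = mod
        self.mods_available.append(mod.identifier)


class StripMod(object):  # hypothetical mod
    identifier = "strip"
    order = 10

    def modify(self, content):
        return content.strip()


class ShoutMod(object):  # hypothetical mod
    identifier = "shout"
    order = 20

    def modify(self, content):
        return content.upper()


def run(registry, identifiers, content):
    """Instantiate the requested mods and apply them sorted by `order` (assumed)."""
    mods = sorted((registry.mods[i]() for i in identifiers), key=lambda m: m.order)
    for mod in mods:
        content = mod.modify(content)
    return content


registry = ModRegistry()
registry.register(StripMod)
registry.register(ShoutMod)

if __name__ == "__main__":
    print(run(registry, ["shout", "strip"], "  hi "))
```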
||||
@@ -4,13 +4,23 @@ import hashlib
|
||||
import os
|
||||
import logging
|
||||
import traceback
|
||||
import gzip
|
||||
|
||||
from babelfish import Language
|
||||
|
||||
from json_tricks.nonp import loads, dumps
|
||||
|
||||
|
||||
from constants import mode_map
|
||||
from subliminal_patch.subtitle import ModifiedSubtitle
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class StoredSubtitle(object):
|
||||
"""
|
||||
legacy class used for PMS LoadObject/SaveObject
|
||||
"""
|
||||
score = None
|
||||
storage_type = None
|
||||
hash = None
|
||||
@@ -46,8 +56,59 @@ class StoredSubtitle(object):
|
||||
return mode_map.get(self.mode, "Unknown")
|
||||
|
||||
|
||||
class JSONStoredSubtitle(object):
|
||||
score = None
|
||||
storage_type = None
|
||||
hash = None
|
||||
provider_name = None
|
||||
id = None
|
||||
date_added = None
|
||||
mode = "a" # auto/manual/auto-better (a/m/b)
|
||||
content = None
|
||||
mods = None
|
||||
encoding = None
|
||||
|
||||
def initialize(self, score, storage_type, hash, provider_name, id, date_added=None, mode="a", content=None,
|
||||
mods=None, encoding=None):
|
||||
self.score = int(score)
|
||||
self.storage_type = storage_type
|
||||
self.hash = hash
|
||||
self.provider_name = provider_name
|
||||
self.id = id
|
||||
self.date_added = date_added or datetime.datetime.now()
|
||||
self.mode = mode
|
||||
self.content = content
|
||||
self.mods = mods or []
|
||||
self.encoding = encoding
|
||||
|
||||
def add_mod(self, identifier):
|
||||
self.mods = self.mods or []
|
||||
if identifier is None:
|
||||
self.mods = []
|
||||
return
|
||||
|
||||
self.mods.append(identifier)
|
||||
|
||||
@property
|
||||
def mode_verbose(self):
|
||||
return mode_map.get(self.mode, "Unknown")
|
||||
|
||||
def serialize(self):
|
||||
if self.content:
|
||||
# content is always stored in unicode (gets converted to string with escaped unicode chars by json)
|
||||
self.content = self.content.decode(self.encoding)
|
||||
return self.__dict__
|
||||
|
||||
def deserialize(self, data):
|
||||
if data["content"]:
|
||||
# content is always present in encoded form
|
||||
data["content"] = data["content"].encode(data["encoding"])
|
||||
self.initialize(**data)
|
||||
|
||||
|
||||
class StoredVideoSubtitles(object):
|
||||
"""
|
||||
legacy class
|
||||
manages stored subtitles for video_id per media_part/language combination
|
||||
"""
|
||||
video_id = None # rating_key
|
||||
@@ -112,12 +173,136 @@ class StoredVideoSubtitles(object):
|
||||
return str(self.video_id)
|
||||
|
||||
|
||||
class JSONStoredVideoSubtitles(object):
|
||||
"""
|
||||
manages stored subtitles for video_id per media_part/language combination
|
||||
"""
|
||||
video_id = None # rating_key
|
||||
title = None
|
||||
parts = None
|
||||
version = None
|
||||
item_type = None # movie / episode
|
||||
added_at = None
|
||||
|
||||
def initialize(self, plex_item, version=None):
|
||||
self.video_id = str(plex_item.rating_key)
|
||||
|
||||
self.title = plex_item.title
|
||||
self.parts = {}
|
||||
self.version = version
|
||||
self.item_type = plex_item.type
|
||||
self.added_at = datetime.datetime.fromtimestamp(plex_item.added_at)
|
||||
|
||||
def deserialize(self, data):
|
||||
parts = data.pop("parts")
|
||||
self.parts = {}
|
||||
self.__dict__.update(data)
|
||||
|
||||
if parts:
|
||||
for part_id, part in parts.iteritems():
|
||||
self.parts[part_id] = {}
|
||||
for language, sub_data in part.iteritems():
|
||||
self.parts[part_id][language] = {}
|
||||
|
||||
for sub_key, subtitle_data in sub_data.iteritems():
|
||||
if sub_key == "current":
|
||||
if not isinstance(subtitle_data, tuple):
|
||||
subtitle_data = tuple(subtitle_data.split("__"))
|
||||
self.parts[part_id][language]["current"] = subtitle_data
|
||||
else:
|
||||
sub = JSONStoredSubtitle()
|
||||
|
||||
# legacy subtitle storage instance
|
||||
if isinstance(subtitle_data, StoredSubtitle):
|
||||
subtitle_data = subtitle_data.__dict__
|
||||
|
||||
sub.initialize(**subtitle_data)
|
||||
if not isinstance(sub_key, tuple):
|
||||
sub_key = tuple(sub_key.split("__"))
|
||||
|
||||
self.parts[part_id][language][sub_key] = sub
|
||||
|
||||
def serialize(self):
|
||||
data = {"parts": {}}
|
||||
for key, value in self.__dict__.iteritems():
|
||||
if key != "parts":
|
||||
data[key] = value
|
||||
|
||||
for part_id, part in self.parts.iteritems():
|
||||
data["parts"][part_id] = {}
|
||||
for language, sub_data in part.iteritems():
|
||||
data["parts"][part_id][language] = {}
|
||||
|
||||
for sub_key, stored_subtitle in sub_data.iteritems():
|
||||
if sub_key == "current":
|
||||
data["parts"][part_id][language]["current"] = "__".join(stored_subtitle)
|
||||
else:
|
||||
# migrate missing encoding data
|
||||
if stored_subtitle.content and not stored_subtitle.encoding:
|
||||
# correctly serialize the content
|
||||
lang = Language.fromietf(language)
|
||||
subtitle = ModifiedSubtitle(lang)
|
||||
subtitle.content = stored_subtitle.content
|
||||
stored_subtitle.encoding = subtitle.guess_encoding()
|
||||
|
||||
data["parts"][part_id][language]["__".join(sub_key)] = stored_subtitle.serialize()
|
||||
|
||||
return data
|
||||
|
||||
def add(self, part_id, lang, subtitle, storage_type, date_added=None, mode="a"):
|
||||
part_id = str(part_id)
|
||||
part = self.parts.get(part_id)
|
||||
if not part:
|
||||
self.parts[part_id] = {}
|
||||
part = self.parts[part_id]
|
||||
|
||||
subs = part.get(lang)
|
||||
if not subs:
|
||||
part[lang] = {}
|
||||
subs = part[lang]
|
||||
|
||||
sub_key = self.get_sub_key(subtitle.provider_name, subtitle.id)
|
||||
subs[sub_key] = JSONStoredSubtitle()
|
||||
subs[sub_key].initialize(subtitle.score, storage_type, hashlib.md5(subtitle.content).hexdigest(),
|
||||
subtitle.provider_name, subtitle.id, date_added=date_added, mode=mode,
|
||||
content=subtitle.content, mods=subtitle.mods, encoding=subtitle.guess_encoding())
|
||||
subs["current"] = sub_key
|
||||
|
||||
return True
|
||||
|
||||
def get_any(self, part_id, lang):
|
||||
part_id = str(part_id)
|
||||
part = self.parts.get(part_id)
|
||||
if not part:
|
||||
return
|
||||
|
||||
subs = part.get(lang)
|
||||
if not subs:
|
||||
return
|
||||
|
||||
if "current" in subs and subs["current"]:
|
||||
return subs.get(subs["current"])
|
||||
|
||||
def get_sub_key(self, provider_name, id):
|
||||
return provider_name, str(id)
|
||||
|
||||
def __repr__(self):
|
||||
return unicode(self)
|
||||
|
||||
def __unicode__(self):
|
||||
return u"%s (%s)" % (self.title, self.video_id)
|
||||
|
||||
def __str__(self):
|
||||
return str(self.video_id)
|
||||
|
||||
|
||||
class StoredSubtitlesManager(object):
|
||||
"""
|
||||
manages the storage and retrieval of StoredVideoSubtitles instances for a given video_id
|
||||
"""
|
||||
storage = None
|
||||
version = 2
|
||||
extension = ".json.gz"
|
||||
|
||||
def __init__(self, storage, plexapi_item_getter):
|
||||
self.storage = storage
|
||||
@@ -130,6 +315,11 @@ class StoredSubtitlesManager(object):
|
||||
def dataitems_path(self):
|
||||
return os.path.join(getattr(self.storage, "_core").storage.data_path, "DataItems")
|
||||
|
||||
def get_json_data_path(self, bare_fn):
|
||||
if not bare_fn.endswith(self.extension):
|
||||
return os.path.join(self.dataitems_path, "%s%s" % (bare_fn, self.extension))
|
||||
return os.path.join(self.dataitems_path, bare_fn)
|
||||
|
||||
def get_all_files(self):
|
||||
return [fn for fn in os.listdir(self.dataitems_path) if fn.startswith("subs_")]
|
||||
|
||||
@@ -156,10 +346,13 @@ class StoredSubtitlesManager(object):
|
||||
def delete_missing_files(self):
|
||||
deleted = []
|
||||
for fn in self.get_all_files():
|
||||
video_id = os.path.basename(fn).split("subs_")[1]
|
||||
video_id = os.path.basename(fn).split(".")[0].split("subs_")[1]
|
||||
item = self.get_item(video_id)
|
||||
if not item:
|
||||
self.delete(fn)
|
||||
if fn.endswith(".json.gz"):
|
||||
self.delete(self.get_json_data_path(fn))
|
||||
else:
|
||||
self.legacy_delete(fn)
|
||||
deleted.append(video_id)
|
||||
return deleted
|
||||
|
||||
@@ -172,13 +365,47 @@ class StoredSubtitlesManager(object):
|
||||
subs_for_video.version = 2
|
||||
return True
|
||||
|
||||
def migrate_legacy_data(self, from_fn, to_fn):
|
||||
try:
|
||||
subs_for_video = self.storage.LoadObject(from_fn)
|
||||
except:
|
||||
logger.error("Failed to load item \"%s\": %s" % (from_fn, traceback.format_exc()))
|
||||
|
||||
# delete
|
||||
return
|
||||
|
||||
if not subs_for_video or not hasattr(subs_for_video, "version"):
|
||||
self.legacy_delete(from_fn)
|
||||
|
||||
# migrate to our new json format
|
||||
new_subs_for_video = JSONStoredVideoSubtitles()
|
||||
new_subs_for_video.deserialize(subs_for_video.__dict__)
|
||||
self.save(new_subs_for_video)
|
||||
|
||||
self.legacy_delete(from_fn)
|
||||
|
||||
return new_subs_for_video
|
||||
|
||||
def load(self, video_id=None, filename=None):
|
||||
subs_for_video = None
|
||||
fn = self.get_storage_filename(video_id) if video_id else filename
|
||||
try:
|
||||
subs_for_video = self.storage.LoadObject(fn)
|
||||
except:
|
||||
logger.error("Failed to load item %s: %s" % (fn, traceback.format_exc()))
|
||||
bare_fn = self.get_storage_filename(video_id) if video_id else filename
|
||||
json_path = self.get_json_data_path(bare_fn)
|
||||
if os.path.exists(json_path):
|
||||
# new style data
|
||||
subs_for_video = JSONStoredVideoSubtitles()
|
||||
try:
|
||||
with gzip.open(json_path, 'rb') as f:
|
||||
s = f.read()
|
||||
|
||||
data = loads(s)
|
||||
except:
|
||||
logger.error("Couldn't load JSON data for %s", bare_fn)
|
||||
return
|
||||
|
||||
subs_for_video.deserialize(data)
|
||||
|
||||
elif not bare_fn.endswith(".json.gz") and os.path.exists(os.path.join(self.dataitems_path, bare_fn)):
|
||||
subs_for_video = self.migrate_legacy_data(bare_fn, json_path)
|
||||
|
||||
if not subs_for_video:
|
||||
return
|
||||
@@ -196,7 +423,7 @@ class StoredSubtitlesManager(object):
|
||||
success = getattr(self, mig_func)(subs_for_video)
|
||||
if success is False:
|
||||
logger.error("Couldn't migrate %s, removing data", subs_for_video.video_id)
|
||||
self.delete(fn)
|
||||
self.delete(json_path)
|
||||
break
|
||||
|
||||
if cur_ver > old_ver and success:
|
||||
@@ -210,18 +437,29 @@ class StoredSubtitlesManager(object):
|
||||
def load_or_new(self, plex_item):
|
||||
subs_for_video = self.load(plex_item.rating_key)
|
||||
if not subs_for_video:
|
||||
subs_for_video = StoredVideoSubtitles(plex_item, version=self.version)
|
||||
subs_for_video = JSONStoredVideoSubtitles()
|
||||
subs_for_video.initialize(plex_item, version=self.version)
|
||||
self.save(subs_for_video)
|
||||
return subs_for_video
|
||||
|
||||
def save(self, subs_for_video):
|
||||
data = subs_for_video.serialize()
|
||||
fn = self.get_json_data_path(self.get_storage_filename(subs_for_video.video_id))
|
||||
json_data = dumps(data)
|
||||
with gzip.open(fn, "wb", compresslevel=6) as f:
|
||||
f.write(json_data)
|
||||
|
||||
def delete(self, filename):
|
||||
os.remove(filename)
|
||||
|
||||
def legacy_save(self, subs_for_video):
|
||||
fn = self.get_storage_filename(subs_for_video.video_id)
|
||||
try:
|
||||
self.storage.SaveObject(fn, subs_for_video)
|
||||
except:
|
||||
logger.error("Failed to save item %s: %s" % (fn, traceback.format_exc()))
|
||||
|
||||
def delete(self, filename):
|
||||
def legacy_delete(self, filename):
|
||||
try:
|
||||
self.storage.Remove(filename)
|
||||
except:
|
||||
|
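The storage changes above replace pickled objects with gzip-compressed JSON, and because JSON object keys must be strings, the `(provider_name, id)` tuple keys are joined with `"__"` on save and split again on load. A self-contained Python 3 sketch of that round trip, using the standard `json` module instead of `json_tricks` (an assumption made to keep the example dependency-free; all function names here are hypothetical):

```python
import gzip
import json
import os
import tempfile


def key_to_str(sub_key):
    """("provider", "id") -> "provider__id", usable as a JSON object key."""
    return "__".join(sub_key)


def str_to_key(raw):
    """Inverse of key_to_str; assumes neither part contains "__"."""
    return tuple(raw.split("__"))


def save_json_gz(path, data):
    with gzip.open(path, "wb", compresslevel=6) as f:
        f.write(json.dumps(data).encode("utf-8"))


def load_json_gz(path):
    with gzip.open(path, "rb") as f:
        return json.loads(f.read().decode("utf-8"))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "subs_1.json.gz")
    key = ("opensubtitles", "123")
    save_json_gz(path, {key_to_str(key): {"score": 5}})
    loaded = load_json_gz(path)
    print([str_to_key(k) for k in loaded])
```

One caveat of the joined-key scheme is that a provider name or id containing `"__"` would not split back into the original two-element tuple.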
||||
@@ -10,8 +10,8 @@ from subliminal_patch import scan_video, refine, search_external_subtitles
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, known_embedded=None, forced_only=False,
|
||||
video_fps=None, dry_run=False):
|
||||
def parse_video(fn, video_info, hints, external_subtitles=False, embedded_subtitles=False, known_embedded=None,
|
||||
forced_only=False, video_fps=None, dry_run=False):
|
||||
|
||||
logger.debug("Parsing video: %s, hints: %s", os.path.basename(fn), hints)
|
||||
video = scan_video(fn, hints=hints, dont_use_actual_file=dry_run)
|
||||
@@ -19,29 +19,58 @@ def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, k
|
||||
# refiners
|
||||
|
||||
refine_kwargs = {
|
||||
"episode_refiners": ('sz_metadata', 'tvdb', 'omdb'),
|
||||
"movie_refiners": ('sz_metadata', 'omdb',),
|
||||
"episode_refiners": ('tvdb', 'sz_omdb'),
|
||||
"movie_refiners": ('sz_omdb',),
|
||||
"embedded_subtitles": False,
|
||||
}
|
||||
|
||||
# our own metadata refiner :)
|
||||
if "stream" in video_info:
|
||||
for key, value in video_info["stream"].iteritems():
|
||||
if hasattr(video, key) and not getattr(video, key):
|
||||
logger.info(u"Adding stream %s info: %s", key, value)
|
||||
setattr(video, key, value)
|
||||
|
||||
plex_title = video_info.get("original_title", video_info.get("title"))
|
||||
if hints["type"] == "episode":
|
||||
plex_title = video_info.get("original_title", video_info.get("series"))
|
||||
|
||||
if not video.year:
|
||||
video.year = video_info.get("year")
|
||||
|
||||
refine(video, **refine_kwargs)
|
||||
|
||||
if not video.imdb_id:
|
||||
video.imdb_id = video_info.get("imdb_id")
|
||||
if video.imdb_id:
|
||||
logger.info(u"Adding PMS imdb_id info: %s", video.imdb_id)
|
||||
|
||||
if hints["type"] == "episode":
|
||||
if not video.series_tvdb_id:
|
||||
logger.info(u"Adding PMS series_tvdb_id info: %s", video_info.get("series_tvdb_id"))
|
||||
video.series_tvdb_id = video_info.get("series_tvdb_id")
|
||||
|
||||
if not video.tvdb_id:
|
||||
logger.info(u"Adding PMS tvdb_id info: %s", video_info.get("tvdb_id"))
|
||||
video.tvdb_id = video_info.get("tvdb_id")
|
||||
|
||||
# re-refine with plex's known data?
|
||||
refine_with_plex = False
|
||||
|
||||
# episode but wasn't able to match title
|
||||
if hints["type"] == "episode" and not video.series_tvdb_id and not video.tvdb_id and not video.series_imdb_id \
|
||||
and video.series != hints["title"]:
|
||||
logger.info(u"Re-refining with series title: '%s' instead of '%s'", hints["title"], video.series)
|
||||
video.series = hints["title"]
|
||||
refine_with_plex = True
|
||||
if plex_title:
|
||||
if hints["type"] == "episode" and not video.series_tvdb_id and not video.tvdb_id and not video.series_imdb_id \
|
||||
and video.series != plex_title:
|
||||
logger.info(u"Re-refining with series title: '%s' instead of '%s'", plex_title, video.series)
|
||||
video.series = plex_title
|
||||
refine_with_plex = True
|
||||
|
||||
# movie
|
||||
elif hints["type"] == "movie" and not video.imdb_id and video.title != hints["title"]:
|
||||
# movie
|
||||
logger.info(u"Re-refining with series title: '%s' instead of '%s'", hints["title"], video.title)
|
||||
video.title = hints["title"]
|
||||
refine_with_plex = True
|
||||
elif hints["type"] == "movie" and not video.imdb_id and video.title != plex_title:
|
||||
# movie
|
||||
logger.info(u"Re-refining with series title: '%s' instead of '%s'", plex_title, video.title)
|
||||
video.title = plex_title
|
||||
refine_with_plex = True
|
||||
|
||||
# title not matched? try plex title hint
|
||||
if refine_with_plex:
|
||||
@@ -60,7 +89,6 @@ def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, k
|
||||
)
|
||||
|
||||
# add video fps info
|
||||
# fixme: still needed?
|
||||
video.fps = video_fps
|
||||
|
||||
# add known embedded subtitles
|
||||
@@ -77,4 +105,13 @@ def parse_video(fn, hints, external_subtitles=False, embedded_subtitles=False, k
|
||||
logger.debug('Found embedded subtitle %r', embedded_subtitle_languages)
|
||||
video.subtitle_languages.update(embedded_subtitle_languages)
|
||||
|
||||
# guess special
|
||||
if hints["type"] == "episode":
|
||||
if video.season == 0 or video.episode == 0:
|
||||
video.is_special = True
|
||||
else:
|
||||
# check parent folder name
|
||||
if os.path.dirname(fn).split(os.path.sep)[-1].lower() in ("specials", "season 00"):
|
||||
video.is_special = True
|
||||
|
||||
return video
|
||||
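Note on the "guess special" block added at the end of `parse_video`: an episode is flagged as a special when its season or episode number is 0, or when its parent folder is named "specials" or "season 00" (case-insensitive). The standalone sketch below restates that rule for illustration. The function name `guess_is_special` is hypothetical, and it uses `os.path.basename` on the parent directory rather than splitting on `os.path.sep` as the diff does.

```python
import os

# Folder names treated as "specials" in the diff above (compared lowercase)
SPECIAL_FOLDER_NAMES = ("specials", "season 00")


def guess_is_special(path, season, episode):
    """Return True if the episode at `path` looks like a special.

    Hypothetical helper: season/episode are the parsed numbers and may be None.
    """
    # Season 0 or episode 0 marks a special regardless of folder layout
    if season == 0 or episode == 0:
        return True
    # Otherwise fall back to the name of the parent folder
    parent_folder = os.path.basename(os.path.dirname(path))
    return parent_folder.lower() in SPECIAL_FOLDER_NAMES
```

For example, a file under a "Specials" folder is flagged even when its parsed season number is not 0, while a regular "Season 01" file is not.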
|
||||
@@ -4,37 +4,43 @@
|
||||
|
||||
2
|
||||
00:00:10,759 --> 00:00:12,678
|
||||
ROSE: So what is it?
|
||||
What's wrong?
|
||||
ROSE: (Help us. Please. . .help us.)
|
||||
What's "wrong"? over 9, 000!
|
||||
|
||||
3
|
||||
00:00:12,679 --> 00:00:16,097
|
||||
I don't know. Some kind
|
||||
I don't know. Some kind of wrong "1 00" number
|
||||
of signal, drawing the Tardis off course.
|
||||
|
||||
4
|
||||
00:00:16,099 --> 00:00:17,224
|
||||
Where are we?
|
||||
this is a"subtitle" test "with a"text before colons and "peter"following: Where are we?."
|
||||
|
||||
5
|
||||
00:00:17,225 --> 00:00:19,684
|
||||
Earth. Utah, North America.
|
||||
"less text before colons: Earth. Utah, North America."
|
||||
MUSIC PLAYS What is that sound?!
|
||||
ls it?
|
||||
take them balls it
|
||||
|
||||
6
|
||||
00:00:19,686 --> 00:00:21,103
|
||||
About half a mile underground.
|
||||
Ithinkyou're About half a miIe underground. ls it
|
||||
Don't fix this countdown: 81, 80, 79, 78
|
||||
But fix this: 81 ,00
|
||||
|
||||
7
|
||||
00:00:21,103 --> 00:00:23,603
|
||||
And when are we?
|
||||
<i>(laughing): lrn gonna And when are we? (chuckles)
|
||||
lrn gonna And when are we?</i>
|
||||
|
||||
8
|
||||
00:00:24,274 --> 00:00:26,649
|
||||
2012.
|
||||
...2012. weII it's 1 2:00 o'clock
|
||||
|
||||
9
|
||||
00:00:26,650 --> 00:00:29,370
|
||||
God, that's so close. I should be 26!
|
||||
(BIG BROTHER THEME MUSIC)
|
||||
|
||||
10
|
||||
00:00:30,612 --> 00:00:33,112
|
||||
@@ -43,32 +49,34 @@ God, that's so close. I should be 26!
|
||||
11
|
||||
00:00:33,658 --> 00:00:34,783
|
||||
(WHOO
|
||||
SHING) geil
|
||||
SHING) >>geil
|
||||
|
||||
12
|
||||
00:00:34,783 --> 00:00:36,826
|
||||
Blimey.
|
||||
-- Blimey.
|
||||
|
||||
13
|
||||
00:00:36,828 --> 00:00:39,328
|
||||
ROSE: Like a great big museum.
|
||||
ROSE: Like a "great...big museum".
|
||||
|
||||
14
|
||||
00:00:40,414 --> 00:00:42,914
|
||||
DOCTOR: An alien museum.
|
||||
DOCTOR's MOM: ''An alien museum".
|
||||
|
||||
15
|
||||
00:00:43,542 --> 00:00:46,042
|
||||
Someone's got a hobby.
|
||||
Someone's got a hobby.
|
||||
|
||||
16
|
||||
00:00:46,378 --> 00:00:49,048
|
||||
They must've spent a fortune on this.
|
||||
FULL UPPERCASE LINE HERE
|
||||
and some text
|
||||
- (chuckles)
|
||||
|
||||
17
|
||||
00:00:49,631 --> 00:00:51,924
|
||||
AGUGU
|
||||
pepipi
|
||||
<i>AGUGU
|
||||
pepipi</i>
|
||||
|
||||
18
|
||||
00:00:51,926 --> 00:00:55,304
|
||||
@@ -263,12 +271,13 @@ Is it talking?
|
||||
|
||||
60
|
||||
00:03:45,641 --> 00:03:48,141
|
||||
(DRILLING)
|
||||
<u>This will end up with an open end tag
|
||||
<i>(DRILLING)</i></u>
|
||||
|
||||
61
|
||||
00:03:53,233 --> 00:03:56,151
|
||||
- Not exactly talking, no.
|
||||
- Then what's it doing?
|
||||
- (REMOVE ME <s> PLEASE)
|
||||
- Then <i>what's</i> it doing?</s>
|
||||
|
||||
62
|
||||
00:03:56,151 --> 00:03:57,235
|
||||
|
||||
Some files were not shown because too many files have changed in this diff.