Compare commits

46 Commits

Author SHA1 Message Date
panni 20845bbcd4 release 2.5.0.2287 2018-03-01 16:34:45 +01:00
panni 739c10ade6 submod: common: require at least one music symbol when fixing 2018-03-01 16:30:02 +01:00
panni 14ea2d72a7 Merge branch 'develop-2.1' 2018-03-01 16:19:01 +01:00
panni 4a9ea97ea1 update doc 2018-03-01 12:51:48 +01:00
panni b017a94353 update doc 2018-03-01 12:51:39 +01:00
panni 15b65dd844 core: better embedded subtitle stream language detection 2018-03-01 12:46:19 +01:00
panni 4b949dcd72 core: support mov_text for embedded subtitle extraction 2018-02-28 18:42:58 +01:00
panni 2626cf4253 core: handle nld for embedded subs 2018-02-28 18:14:59 +01:00
panni b260c8aaec config: clarify subscene being only enabled for TV shows by default 2018-02-28 11:44:35 +01:00
panni 1ece46473b bump dev 2018-02-27 17:45:56 +01:00
panni 890c3cc8b0 core: fix remove crap from filename; fixes non-matched release group in refiners 2018-02-27 15:15:25 +01:00
panni 58fb2f5ea6 bump dev 2018-02-27 12:48:40 +01:00
panni a79f3e47ba submod: OCR: fix it'sjust, isn'tjust, Iam, Ican 2018-02-27 12:37:15 +01:00
panni b3b9db9ff6 core: get subtitles from archive: remove redundant get 2018-02-27 12:33:14 +01:00
panni 9aed245241 core: get subtitles from archive: don't assume any attributes in guess 2018-02-27 12:32:28 +01:00
panni aa03fdb445 core: get subtitles from archive: don't assume an episode match 2018-02-27 12:31:18 +01:00
panni 7cb8356598 submod: HI: HI_before_colon_noncaps: also consider multiple dashes a sentence 2018-02-27 12:28:37 +01:00
panni ac347755fd submod: HI: separate text before colon into two checks; try not to break actual sentences before colon 2018-02-27 12:09:26 +01:00
panni b16cb15e88 submod: HI: fix remove music-symbol-only lines 2018-02-27 11:36:28 +01:00
panni 4989c37964 submod: HI: remove music-symbol-only lines 2018-02-27 11:30:55 +01:00
panni 06849c5814 submod: common: fix music symbols 2018-02-27 11:26:53 +01:00
panni 78b67a6f5e submod: OCR: correctly fix broken HI tag colons 2018-02-27 11:22:58 +01:00
panni acf79df4d0 bump dev 2018-02-26 16:45:04 +01:00
panni bc5a9caf63 submod: OCR: fix "Ls"="Is" 2018-02-26 14:56:48 +01:00
panni 7b34b07cdc hard error on IOError while scanning videos; warn about hard error in menu #444 2018-02-26 10:06:52 +01:00
panni 8df1a1bf17 bump dev 2018-02-23 17:03:54 +01:00
panni 1143b0f2d2 providers: opensubtitles: try re-initializing the provider on ResponseNotReady 2018-02-23 17:01:42 +01:00
panni 86883336fd providers: opensubtitles: catch ResponseNotReady 2018-02-23 16:51:47 +01:00
panni 62d77c5811 #441 #440 add scandir listdir fallback mechanism 2018-02-23 15:22:39 +01:00
panni 8397dddbbe #441 patch sys.getfilesystemencoding 2018-02-23 12:28:48 +01:00
panni 47ef94d8c3 submod: common: rename CM_underscore_only to CM_non_word_only 2018-02-18 00:39:47 +01:00
panni 8aa4a485ed reduce main icon size 2018-02-17 17:08:47 +01:00
panni cb4ef9c9ea submod: common: dash underscore empty 2018-02-17 03:51:01 +01:00
panni 2f80852a7c submod: add entry index to debug 2018-02-16 13:21:08 +01:00
panni 190a580642 submod: common: remove lines that consists only of underscores; update test.srt 2018-02-16 13:18:44 +01:00
panni 6ba85f5069 submod: common: don't break "-- addicted --" 2018-02-16 13:13:54 +01:00
pannal 707b5921fb Update README.md 2018-02-16 10:05:05 +01:00
panni 2e25e68444 refiners: drone: add http:// to base url if needed 2018-02-15 19:31:01 +01:00
pannal 034260e426 Update README.md 2018-02-15 16:59:11 +01:00
pannal b4eda8bbff Update README.md 2018-02-15 09:46:51 +01:00
panni 93a1b7fb52 back to dev 2018-02-15 09:45:53 +01:00
panni 8ef44c3520 release 2.5.0.2247 2018-02-15 09:45:27 +01:00
panni 449de57fc7 config: debug sonarr/radarr 2018-02-15 09:44:59 +01:00
panni cbe29e233d Merge remote-tracking branch 'origin/master' 2018-02-15 09:42:11 +01:00
panni bef56ff124 core: fix wrong episode matches on hash match 2018-02-15 09:34:31 +01:00
panni c1e13e520b back to dev 2018-02-14 16:31:02 +01:00
27 changed files with 340 additions and 148 deletions
+81
@@ -1,4 +1,85 @@
2.5.0.2241
- fix issue when removing crap from filenames to not accidentally remove release group #436
- fix initialization of soft ignore list after upgrade from 2.0
2.5.0.2221
- refiners: add support for retrieving original filename from
- drone derivatives: sonarr, radarr
- filebot
- symlinks
- file_info meta file lists (see wiki)
- providers: add subscene (disabled by default to not flood subscene on release)
- normal search
- season pack search if season has concluded
- core: add provider subtitle-archive/pack cache for retrieving single subtitles from previously downloaded (season-) packs (subscene)
- core/agent: massive performance improvements over 2.0
- core/agent/background-tasks: reduce memory usage to a fraction of 2.0
- core/providers: add dynamic provider throttling when certain events occur (ServiceUnavailable, too many downloads, ...), to lighten the provider-load
- core/agent/config: automatically extract embedded subtitles (and use them if no current subtitle)
- core: fix internal subtitle info storage issues
- core: always store internal subtitle information even if no subtitle was downloaded (fixes SearchAllRecentlyAddedMissing)
- core: fix internal subtitle info storage on windows (gzip handling is broken there)
- core: don't fail on missing logfile paths
- core: fix default encoding order for non-script-serbian
- core: improve logging
- core: add AsRequested to cleanup garbage names
- core: treat SDTV and HDTV the same when searching for subtitles
- core: parse_video: trust PMS season and episode numbers
- core: parse_video: add series year information from PMS if none found
- core: upgrade dependencies
- core: update subliminal to 62cdb3c
- core: add new file based cache mechanism, rendering DBM/memory backends obsolete
- core: treat 23.980 fps as 23.976 and vice-versa
- core: add HTTP proxy support for querying the providers (supports credentials)
- core: only compute file hashes for enabled providers
- core: massive speedup; refine only when needed, exit early otherwise
- core: store last modified timestamp in subtitle info storage
- core: only write to subtitle info storage if we haven't had one or any subtitle was downloaded
- core: only clean up the sub-folder if a subtitle-sub-folder has been selected, and not the parent one also
- core: support for CP437 encoded filenames in ZIP-Archives
- core: use scandir library instead of os.listdir if possible, reducing performance-impact
- core: archives: support multi-episode subtitles (partly)
- core: subtitle cleanup: add support for hi, cc, sdh secondary filename tags; don't autoclean .txt
- core: increase request timeout by three times in case a proxy is being used
- core: fix language=Unknown in Plex when "Restrict to one language"-setting is set
- core: refining: re-add old detected title as alternative title after re-refining with plex metadata's title; fixes #428
- core: implement advanced_settings.json (see advanced_settings.json.template for reference, copy to "Plug-in Support/Data/com.plexapp.agents.subzero" to use it)
- core/tasks: fix search all recently added missing (the total number of items will change in the menu while running), reduces memory usage
- core/menu: add support for extracting embedded subtitles using the builtin plex transcoder
- core/menu: skip wrong season or episode in returned subtitle results
- core/config: fix language handling if treat undefined as first language is set
- providers: remove shooter.cn
- providers: add support for zip/rar archives containing more than one subtitle file
- submod: common: remove redundant interpunction ("Hello !!!" -> "Hello!")
- submod: skip provider hashing when applying mods
- submod: correctly drop empty line (fixing broken display)
- submod: OCR: fix F'xxxxx -> Fxxxxx
- submod: HI: improve bracket matching
- submod: OCR: fix l/L instead of I more aggressively
- submod: common: fix uppercase I's in lowercase words more aggressively
- submod: HI: improve HI_before_colon
- submod: common: be more aggressive when fixing numbers; correctly space out spaced ellipses; don't break spaced ellipses; handle multiple spaces in numbers
- menu: add support for extracting embedded subtitles for a whole season
- menu: add reapply mods to current subtitle
- menu: pad titles for more submenus, resulting in detail view in PlexWeb
- menu: add subtitle selection submenu (if multiple subtitles are inside the subtitle info storage; e.g. previously downloaded ones or extracted embedded)
- menu: advanced: add skip findbettersubtitles menu item, which sets the last_run to now (for debugging purposes)
- menu: ignore: add more natural title for seasons and episodes (kills your old ignore lists!)
- config: skip provider hashing on low impact mode
- config: add limit by air date setting to consider for FindBetterSubtitles task (default: 1 year)
- advanced settings: define enabled-for media types per provider
- advanced settings: define enabled-for languages per provider
- advanced settings: add deep-clean option (clean up the subtitle-sub-folder and the parent one)
2.0.33.1871
- core: normalize line endings in subtitles to LF (\n)
- core: add subtitle storage lock to avoid race condition
+10 -3
@@ -2,10 +2,10 @@
import sys
import datetime
from subzero.sandbox import restore_builtins
from subzero.sandbox import fix_environment_stuff
module = sys.modules['__main__']
restore_builtins(module, {})
fix_environment_stuff(module, {})
globals = getattr(module, "__builtins__")["globals"]
for key, value in getattr(module, "__builtins__").iteritems():
@@ -206,7 +206,14 @@ class SubZeroAgent(object):
# scanned_video_part_map = {subliminal.Video: plex_part, ...}
providers = config.get_providers(media_type=self.agent_type)
scanned_video_part_map = scan_videos(videos, providers=providers)
try:
scanned_video_part_map = scan_videos(videos, providers=providers)
except IOError, e:
Log.Exception("Permission error, please check your folder/file permissions. Exiting.")
if cast_bool(Prefs["check_permissions"]):
config.permissions_ok = False
config.missing_permissions = e.message
return
# auto extract embedded
if config.embedded_auto_extract:
+9 -2
@@ -41,12 +41,19 @@ def fatality(randomize=None, force_title=None, header=None, message=None, only_r
return oc
if not config.permissions_ok and config.missing_permissions:
for title, path in config.missing_permissions:
if not isinstance(config.missing_permissions, list):
oc.add(DirectoryObject(
key=Callback(fatality, randomize=timestamp()),
title=pad_title("Insufficient permissions"),
summary="Insufficient permissions on library %s, folder: %s" % (title, path),
summary=config.missing_permissions,
))
else:
for title, path in config.missing_permissions:
oc.add(DirectoryObject(
key=Callback(fatality, randomize=timestamp()),
title=pad_title("Insufficient permissions"),
summary="Insufficient permissions on library %s, folder: %s" % (title, path),
))
return oc
if not config.enabled_sections:
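The hunk above lets `config.missing_permissions` be either a list of `(title, path)` pairs or a plain error message. A minimal sketch of the same branching, written as a standalone helper: the name `permission_summaries` is hypothetical and not part of the plugin, and the block targets Python 3 rather than the Python 2 used in the diff.

```python
def permission_summaries(missing_permissions):
    """Return one human-readable summary per permission problem.

    Hypothetical helper: accepts either a list of (library title, folder path)
    pairs or a single error message, mirroring the two branches in the menu.
    """
    if not missing_permissions:
        return []
    if isinstance(missing_permissions, list):
        return ["Insufficient permissions on library %s, folder: %s" % (title, path)
                for title, path in missing_permissions]
    # anything else is treated as a ready-made message, e.g. str(e) of a caught IOError
    return [str(missing_permissions)]
```

One thing that may be worth checking in the agent code: on Python 2, `e.message` is empty for an `IOError` constructed with more than one argument, in which case `str(e)` carries the readable text.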
+19 -16
@@ -332,24 +332,27 @@ def ValidatePrefs():
# debug drone
if "sonarr" in config.refiner_settings or "radarr" in config.refiner_settings:
Log.Debug("----- Connections -----")
from subliminal_patch.refiners.drone import SonarrClient, RadarrClient
for key, cls in [("sonarr", SonarrClient), ("radarr", RadarrClient)]:
if key in config.refiner_settings:
cname = key.capitalize()
try:
status = cls(**config.refiner_settings[key]).status()
except HTTPError, e:
if e.response.status_code == 401:
Log.Debug("%s: NOT WORKING - BAD API KEY", cname)
else:
try:
from subliminal_patch.refiners.drone import SonarrClient, RadarrClient
for key, cls in [("sonarr", SonarrClient), ("radarr", RadarrClient)]:
if key in config.refiner_settings:
cname = key.capitalize()
try:
status = cls(**config.refiner_settings[key]).status()
except HTTPError, e:
if e.response.status_code == 401:
Log.Debug("%s: NOT WORKING - BAD API KEY", cname)
else:
Log.Debug("%s: NOT WORKING - %s", cname, traceback.format_exc())
except:
Log.Debug("%s: NOT WORKING - %s", cname, traceback.format_exc())
except:
Log.Debug("%s: NOT WORKING - %s", cname, traceback.format_exc())
else:
if status["version"]:
Log.Debug("%s: OK - %s", cname, status["version"])
else:
Log.Debug("%s: NOT WORKING - %s", cname)
if status and status["version"]:
Log.Debug("%s: OK - %s", cname, status["version"])
else:
Log.Debug("%s: NOT WORKING - no version reported", cname)
except:
Log.Debug("Something went really wrong when evaluating Sonarr/Radarr: %s", traceback.format_exc())
# fixme: check existence of and os access of logs
Log.Debug("----- Environment -----")
+1 -1
@@ -35,7 +35,7 @@ SUBTITLE_EXTS_BASE = ['utf', 'utf8', 'utf-8', 'srt', 'smi', 'rt', 'ssa', 'aqt',
'vtt']
SUBTITLE_EXTS = SUBTITLE_EXTS_BASE + ["txt"]
TEXT_SUBTITLE_EXTS = ("srt", "ass", "ssa", "vtt")
TEXT_SUBTITLE_EXTS = ("srt", "ass", "ssa", "vtt", "mov_text")
VIDEO_EXTS = ['3g2', '3gp', 'asf', 'asx', 'avc', 'avi', 'avs', 'bivx', 'bup', 'divx', 'dv', 'dvr-ms', 'evo', 'fli',
'flv',
'm2t', 'm2ts', 'm2v', 'm4v', 'mkv', 'mov', 'mp4', 'mpeg', 'mpg', 'mts', 'nsv', 'nuv', 'ogm', 'ogv', 'tp',
+1 -1
@@ -376,7 +376,7 @@
},
{
"id": "provider.subscene.enabled",
"label": "Provider: Enable SubScene",
"label": "Provider: Enable SubScene (TV shows)",
"type": "bool",
"default": "false"
},
+2 -2
@@ -13,7 +13,7 @@
<key>CFBundleSignature</key>
<string>????</string>
<key>CFBundleVersion</key>
<string>2.5.0.2241</string>
<string>2.5.0.2287</string>
<key>PlexFrameworkVersion</key>
<string>2</string>
<key>PlexPluginClass</key>
@@ -32,7 +32,7 @@
&lt;h1&gt;Sub-Zero for Plex&lt;/h1&gt;&lt;i&gt;Subtitles done right&lt;/i&gt;
Version 2.5.0.2241
Version 2.5.0.2287
Originally based on @bramwalet's awesome &lt;a href=&quot;https://github.com/bramwalet/Subliminal.bundle&quot;&gt;Subliminal.bundle&lt;/a&gt;
@@ -10,6 +10,7 @@ import time
import operator
import itertools
from httplib import ResponseNotReady
import rarfile
import requests
@@ -18,7 +19,6 @@ from collections import defaultdict
from bs4 import UnicodeDammit
from babelfish import LanguageReverseError
from guessit.jsonutils import GuessitEncoder
from scandir import scandir
from subliminal import ProviderError, refiner_manager
from extensions import provider_registry
@@ -31,6 +31,7 @@ from subliminal.core import guessit, ProviderPool, io, is_windows_special_path,
from subliminal_patch.exceptions import TooManyRequests
from subzero.language import Language
from subzero.lib.io import scandir
logger = logging.getLogger(__name__)
@@ -43,11 +44,20 @@ DOWNLOAD_RETRY_SLEEP = 6
# fixme: this may be overkill
REMOVE_CRAP_FROM_FILENAME = re.compile(r"(?i)(?:([\s_-]+(?:obfuscated|scrambled|nzbgeek|chamele0n|buymore|xpost|postbot"
r"|asrequested)(?:\[.+\])?)|[\s_-]\w{2,}(\[.+\]))(?=\.\w+$|$)")
r"|asrequested)(?:\[.+\])?)|([\s_-]\w{2,})(\[.+\]))(?=\.\w+$|$)")
SUBTITLE_EXTENSIONS = ('.srt', '.sub', '.smi', '.txt', '.ssa', '.ass', '.mpl', '.vtt')
def remove_crap_from_fn(fn):
# in case of the second regex part, the legit release group name will be in group(2), if it's followed by [string]
# otherwise replace fully, because the first part matched
def repl(m):
return m.group(2) if len(m.groups()) == 3 else ""
return REMOVE_CRAP_FROM_FILENAME.sub(repl, fn)
class SZProviderPool(ProviderPool):
def __init__(self, providers=None, provider_configs=None, blacklist=None, throttle_callback=None,
pre_download_hook=None, post_download_hook=None, language_hook=None):
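A note on `remove_crap_from_fn`: `m.groups()` returns one entry per capture group in the pattern, so its length is 3 for every match and the `len(m.groups()) == 3` test does not tell the two alternatives apart. A possible, more explicit variant is sketched below. It reuses the pattern from this hunk and maps a missing `group(2)` to an empty string. The sample filenames in the usage note are made up, and the block targets Python 3.

```python
import re

# pattern copied from the hunk above
REMOVE_CRAP_FROM_FILENAME = re.compile(
    r"(?i)(?:([\s_-]+(?:obfuscated|scrambled|nzbgeek|chamele0n|buymore|xpost|postbot"
    r"|asrequested)(?:\[.+\])?)|([\s_-]\w{2,})(\[.+\]))(?=\.\w+$|$)")


def remove_crap_from_fn(fn):
    def repl(m):
        # group(2) holds the release group only when the second alternative
        # matched (group followed by "[string]"); otherwise it is None and
        # the whole match is dropped
        return m.group(2) or ""

    return REMOVE_CRAP_FROM_FILENAME.sub(repl, fn)
```

With this variant, `Show.S01E01.720p-GROUP[rartv].mkv` keeps `-GROUP` and loses only the bracketed tag, while `Show.S01E01.720p-GROUP-Obfuscated.mkv` loses the `-Obfuscated` suffix.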
@@ -245,6 +255,14 @@ class SZProviderPool(ProviderPool):
socket.timeout):
logger.error('Provider %r connection error', subtitle.provider_name)
except ResponseNotReady:
logger.error('Provider %r response error, reinitializing', subtitle.provider_name)
try:
self[subtitle.provider_name].terminate()
self[subtitle.provider_name].initialize()
except:
logger.error('Provider %r reinitialization error', subtitle.provider_name)
except rarfile.BadRarFile:
logger.error('Malformed RAR file from provider %r, skipping subtitle.', subtitle.provider_name)
return False
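The `ResponseNotReady` handler above terminates and re-initializes the provider so that a later request starts from a fresh connection. A rough sketch of that pattern in isolation follows. `FlakyProvider` and `download_with_reinit` are hypothetical names, the provider is a stand-in rather than the real OpenSubtitles provider, and the import uses the Python 3 module name (`httplib` on Python 2).

```python
from http.client import ResponseNotReady  # httplib.ResponseNotReady on Python 2


class FlakyProvider(object):
    """Hypothetical provider whose first download hits a stale connection."""

    def __init__(self):
        self.initialized = 0
        self.terminated = 0
        self.fail_next = True

    def initialize(self):
        self.initialized += 1

    def terminate(self):
        self.terminated += 1

    def download(self):
        if self.fail_next:
            self.fail_next = False
            raise ResponseNotReady()
        return "subtitle"


def download_with_reinit(provider):
    """Return the download result, or None after resetting a broken provider."""
    try:
        return provider.download()
    except ResponseNotReady:
        try:
            provider.terminate()
            provider.initialize()
        except Exception:
            # re-initialization itself failed; nothing more to do here
            pass
    return None
```

The sketch does not retry inside the handler; the failed download is reported and the next call benefits from the reset, which is one reading of the diff.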
@@ -319,16 +337,18 @@ class SZProviderPool(ProviderPool):
logger.error("%r: Match computation failed: %s", s, traceback.format_exc())
continue
orig_matches = matches.copy()
logger.debug('%r: Found matches %r', s, matches)
unsorted_subtitles.append(
(s, compute_score(matches, s, video, hearing_impaired=use_hearing_impaired), matches))
(s, compute_score(matches, s, video, hearing_impaired=use_hearing_impaired), matches, orig_matches))
# sort subtitles by score
scored_subtitles = sorted(unsorted_subtitles, key=operator.itemgetter(1), reverse=True)
# download best subtitles, falling back on the next on error
downloaded_subtitles = []
for subtitle, score, matches in scored_subtitles:
for subtitle, score, matches, orig_matches in scored_subtitles:
# check score
if score < min_score:
logger.info('%r: Score %d is below min_score (%d)', subtitle, score, min_score)
@@ -351,7 +371,7 @@ class SZProviderPool(ProviderPool):
score, hearing_impaired)
continue
if is_episode and not {"series", "season", "episode"}.issubset(matches):
if is_episode and not {"series", "season", "episode"}.issubset(orig_matches):
logger.debug("%r: Skipping subtitle with score %d, because it doesn't match our series/episode",
subtitle, score)
continue
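Keeping `orig_matches` matters if the scorer narrows the `matches` set in place, for example by keeping only a hash match. The sketch below assumes exactly that: `score_in_place` is a hypothetical stand-in for `compute_score`, not its real implementation.

```python
def score_in_place(matches):
    """Hypothetical scorer that mutates the set it receives."""
    if "hash" in matches:
        # in-place intersection: the caller's set is narrowed as well
        matches &= {"hash"}
    return 10 * len(matches)


def episode_check(matches):
    """Return (passes with mutated set, passes with preserved copy)."""
    orig_matches = matches.copy()  # snapshot taken before scoring
    score_in_place(matches)
    required = {"series", "season", "episode"}
    return required.issubset(matches), required.issubset(orig_matches)
```

Under this assumption, checking the mutated set would reject a subtitle that matched series, season and episode in addition to the hash, while the preserved copy accepts it.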
@@ -457,15 +477,15 @@ def scan_video(path, dont_use_actual_file=False, hints=None, providers=None, ski
# remove crap from folder names
if video_type == "episode":
if len(split_path) > 2:
split_path[-3] = REMOVE_CRAP_FROM_FILENAME.sub("", split_path[-3])
split_path[-3] = remove_crap_from_fn(split_path[-3])
else:
if len(split_path) > 1:
split_path[-2] = REMOVE_CRAP_FROM_FILENAME.sub("", split_path[-2])
split_path[-2] = remove_crap_from_fn(split_path[-2])
guess_from = os.path.join(*split_path)
# remove crap from file name
guess_from = REMOVE_CRAP_FROM_FILENAME.sub("", guess_from)
guess_from = remove_crap_from_fn(guess_from)
# guess
hints["single_value"] = True
@@ -56,8 +56,8 @@ class ProviderRetryMixin(object):
class ProviderSubtitleArchiveMixin(object):
"""
handled ZipFile and RarFile archives
needs subtitle.episode, subtitle.season, subtitle.matches and subtitle.releases to work
handles ZipFile and RarFile archives
needs subtitle.episode, subtitle.season, subtitle.matches, subtitle.releases and subtitle.asked_for_episode to work
"""
def get_subtitle_from_archive(self, subtitle, archive):
# extract subtitle's content
@@ -84,7 +84,7 @@ class ProviderSubtitleArchiveMixin(object):
# - release group matches (and we asked for one and it was matched, or it was not matched)
is_episode = subtitle.asked_for_episode
episodes = guess["episode"]
episodes = guess.get("episode")
if is_episode and episodes and not isinstance(episodes, list):
episodes = [episodes]
@@ -92,7 +92,7 @@ class ProviderSubtitleArchiveMixin(object):
(
subtitle.episode in episodes
or (subtitle.is_pack and subtitle.asked_for_episode in episodes)
) and guess["season"] == subtitle.season):
) and guess.get("season") == subtitle.season):
format_matches = True
wanted_format_but_not_found = False
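Switching from `guess["episode"]` to `guess.get("episode")` avoids a `KeyError` when the filename guess carries no episode or season information. A simplified sketch of the check follows. `episode_matches` is a hypothetical helper that leaves out the pack and release-group logic of the real method.

```python
def episode_matches(guess, wanted_season, wanted_episode):
    """Hypothetical, simplified version of the season/episode comparison."""
    episodes = guess.get("episode")  # None instead of KeyError when absent
    if episodes is None:
        return False
    if not isinstance(episodes, list):
        # single-episode guesses are normalized to a list
        episodes = [episodes]
    return wanted_episode in episodes and guess.get("season") == wanted_season
```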
@@ -94,12 +94,6 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
self.token = None
self.is_vip = is_vip
if is_vip:
self.server = self.get_server_proxy(self.vip_url)
logger.info("Using VIP server")
else:
self.server = self.get_server_proxy(self.default_url)
if use_tag_search:
logger.info("Using tag/exact filename search")
@@ -138,6 +132,12 @@ class OpenSubtitlesProvider(ProviderRetryMixin, _OpenSubtitlesProvider):
return func()
def initialize(self):
if self.is_vip:
self.server = self.get_server_proxy(self.vip_url)
logger.info("Using VIP server")
else:
self.server = self.get_server_proxy(self.default_url)
logger.info('Logging in')
token = region.get("os_token", expiration_time=3600)
@@ -5,13 +5,13 @@ import os
from guessit import guessit
from subliminal import Episode
from subliminal_patch.core import REMOVE_CRAP_FROM_FILENAME
from subliminal_patch.core import remove_crap_from_fn
logger = logging.getLogger(__name__)
def update_video(video, fn):
guess_from = REMOVE_CRAP_FROM_FILENAME.sub("", fn)
guess_from = remove_crap_from_fn(fn)
logger.debug(u"Got original filename: %s", guess_from)
@@ -9,7 +9,7 @@ import requests
from guessit import guessit
from requests.compat import urljoin, quote
from subliminal import Episode, Movie, region
from subliminal_patch.core import REMOVE_CRAP_FROM_FILENAME
from subliminal_patch.core import remove_crap_from_fn
logger = logging.getLogger(__name__)
@@ -29,6 +29,9 @@ class DroneAPIClient(object):
if not base_url.endswith("/"):
base_url += "/"
if not base_url.startswith("http"):
base_url = "http://%s" % base_url
if not base_url.endswith("api/"):
self.api_url = urljoin(base_url, "api/")
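The URL handling in `DroneAPIClient` can be sketched as a standalone function. `normalize_api_url` is a hypothetical name. The final `return base_url` branch is an assumption, since the hunk does not show what happens when the URL already ends in `api/`. The import uses the Python 3 standard library in place of `requests.compat`.

```python
from urllib.parse import urljoin  # the diff imports urljoin from requests.compat


def normalize_api_url(base_url):
    """Hypothetical helper mirroring the three checks in the hunk above."""
    if not base_url.endswith("/"):
        base_url += "/"
    if not base_url.startswith("http"):
        # no scheme given: assume plain HTTP
        base_url = "http://%s" % base_url
    if not base_url.endswith("api/"):
        return urljoin(base_url, "api/")
    # assumption: a URL already pointing at the API is used unchanged
    return base_url
```

For instance, `normalize_api_url("localhost:8989")` yields `http://localhost:8989/api/`.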
@@ -171,7 +174,7 @@ class SonarrClient(DroneAPIClient):
:return:
"""
ext = os.path.splitext(video.name)[1]
guess_from = REMOVE_CRAP_FROM_FILENAME.sub("", scene_name + ext)
guess_from = remove_crap_from_fn(scene_name + ext)
# guess
hints = {
@@ -260,7 +263,7 @@ class RadarrClient(DroneAPIClient):
:return:
"""
ext = os.path.splitext(video.name)[1]
guess_from = REMOVE_CRAP_FROM_FILENAME.sub("", scene_name + ext)
guess_from = remove_crap_from_fn(scene_name + ext)
# guess
hints = {
@@ -308,4 +311,4 @@ def refine(video, **kwargs):
client.update_video(video, os.path.splitext(additional_data["original_filepath"])[0])
if "release_group" in additional_data and not video.release_group:
video.release_group = REMOVE_CRAP_FROM_FILENAME.sub("", additional_data["release_group"])
video.release_group = remove_crap_from_fn(additional_data["release_group"])
@@ -1,13 +1,24 @@
# coding=utf-8
from babelfish.exceptions import LanguageError
from babelfish import Language as Language_
repl_map = {
"dk": "da",
"nld": "nl",
}
def language_from_stream(l):
for method in ("fromietf", "fromalpha3t", "fromalpha3b"):
try:
return getattr(Language, method)(l)
except LanguageError:
pass
raise LanguageError()
class Language(Language_):
@classmethod
def fromietf(cls, ietf):
@@ -15,3 +26,11 @@ class Language(Language_):
ietf = repl_map[ietf]
return Language_.fromietf(ietf)
@classmethod
def fromalpha3b(cls, s):
if s in repl_map:
s = repl_map[s]
return Language_.fromietf(s)
return Language_.fromalpha3b(s)
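`language_from_stream` tries several converters in turn and only fails when none of them recognizes the code. The sketch below shows that fallback chain without depending on babelfish. The lookup tables and the local `LanguageError` class are toy stand-ins for illustration only.

```python
class LanguageError(Exception):
    """Stand-in for babelfish.exceptions.LanguageError."""


# toy tables standing in for babelfish's converters (illustrative only)
IETF = {"en": "English", "nl": "Dutch"}
ALPHA3T = {"eng": "English", "nld": "Dutch"}
ALPHA3B = {"eng": "English", "dut": "Dutch"}


def _lookup(table):
    def parse(code):
        try:
            return table[code]
        except KeyError:
            raise LanguageError(code)
    return parse


# same order as in the diff: IETF, then alpha3t, then alpha3b
PARSERS = (_lookup(IETF), _lookup(ALPHA3T), _lookup(ALPHA3B))


def language_from_stream(code):
    for parse in PARSERS:
        try:
            return parse(code)
        except LanguageError:
            continue
    raise LanguageError(code)
```

This is why a stream tagged `nld` (terminological code) and one tagged `dut` (bibliographic code) can both resolve to the same language.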
@@ -1,6 +1,10 @@
# coding=utf-8
import os
import sys
from scandir import scandir as _scandir
# thanks @ plex trakt scrobbler: https://github.com/trakt/Plex-Trakt-Scrobbler/blob/master/Trakttv.bundle/Contents/Libraries/Shared/plugin/core/io.py
@@ -34,3 +38,62 @@ def get_viable_encoding():
encoding = sys.getfilesystemencoding()
return "utf-8" if not encoding or encoding.lower() not in VALID_ENCODINGS else encoding
class ScandirListdirEntryStub(object):
"""
A class which mimics the entries returned by scandir, for fallback purposes when using listdir instead.
"""
__slots__ = ('name', '_d_type', '_stat', '_lstat', '_scandir_path', '_path', '_inode')
def __init__(self, scandir_path, name, d_type, inode):
self._scandir_path = scandir_path
self.name = name
self._d_type = d_type
self._inode = inode
self._stat = None
self._lstat = None
self._path = None
@property
def path(self):
if self._path is None:
self._path = os.path.join(self._scandir_path, self.name)
return self._path
def stat(self, follow_symlinks=True):
path = self.path
if follow_symlinks and self.is_symlink():
path = os.path.realpath(path)
return os.stat(path)
def is_dir(self, follow_symlinks=True):
path = self.path
if follow_symlinks and self.is_symlink():
path = os.path.realpath(path)
return os.path.isdir(path)
def is_file(self, follow_symlinks=True):
path = self.path
if follow_symlinks and self.is_symlink():
path = os.path.realpath(path)
return os.path.isfile(path)
def is_symlink(self):
return os.path.islink(self.path)
def scandir_listdir_fallback(path):
for fn in os.listdir(path):
yield ScandirListdirEntryStub(path, fn, None, None)
def scandir(path):
try:
return _scandir(path)
# fallback for systems where sys.getfilesystemencoding() returns the "wrong" value
except UnicodeDecodeError:
return scandir_listdir_fallback(path)
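A compact sketch of the scandir-to-listdir fallback follows, using the standard library `os.scandir` of Python 3 in place of the `scandir` package. `safe_scandir_names` is a hypothetical name, and the sketch returns plain names instead of entry stubs.

```python
import os


def safe_scandir_names(path):
    """Return sorted entry names, falling back to os.listdir on decode errors."""
    try:
        # the iterator is consumed inside the try block, so errors raised
        # while iterating are caught as well as errors raised by the call
        return sorted(entry.name for entry in os.scandir(path))
    except UnicodeDecodeError:
        return sorted(os.listdir(path))
```

A design point that may be worth verifying in the wrapper above: a `try` around `return _scandir(path)` only guards the call itself. Whether the library raises at call time or during iteration is not visible in this diff.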
File diff suppressed because one or more lines are too long
@@ -34,6 +34,8 @@ SZ_FIX_DATA = {
u"lljust": u"ll just",
u" L ": u" I ",
u" l ": u" I ",
u"'sjust": u"'s just",
u"'tjust": u"'t just",
},
"WholeWords": {
u"I'11": u"I'll",
@@ -53,6 +55,9 @@ SZ_FIX_DATA = {
u" 're ": u"'re ",
u"LAst": u"Last",
u"forthis": u"for this",
u"Ls": u"Is",
u"Iam": u"I am",
u"Ican": u"I can",
},
"PartialLines": {
u"L know": u"I know",
@@ -197,13 +197,13 @@ class SubtitleModifications(object):
line = mod.modify(line.strip(), entry=entry.text, debug=self.debug, parent=self, **args)
except EmptyEntryError:
if self.debug:
logger.debug(u"%s: %r -> ''", identifier, entry.text)
logger.debug(u"%d: %s: %r -> ''", index, identifier, entry.text)
skip_entry = True
break
if not line:
if self.debug:
logger.debug(u"%s: %r -> ''", identifier, old_line)
logger.debug(u"%d: %s: %r -> ''", index, identifier, old_line)
skip_line = True
break
@@ -235,12 +235,12 @@ class SubtitleModifications(object):
lines.append(cleaned_line)
else:
if self.debug:
logger.debug(u"Ditching now empty line (%r)", line)
logger.debug(u"%d: Ditching now empty line (%r)", index, line)
if not lines:
# don't bother logging when the entry only had one line
if self.debug and line_count > 1:
logger.debug(u"%r -> ''", entry.text)
logger.debug(u"%d: %r -> ''", index, entry.text)
continue
new_text = ur"\N".join(lines)
@@ -20,7 +20,13 @@ class CommonFixes(SubtitleTextModification):
processors = [
# -- = ...
StringProcessor("-- ", '... ', name="CM_doubledash"),
NReProcessor(re.compile(r'(?u)(^-\s?-[-\s]*)(?!.+\s?-\s?-[-\s]*)'), "", name="CM_doubledash"),
# line = _/-/\s
NReProcessor(re.compile(r'(?u)(^[-_\s]*[-_\s]+[-_\s]*$)'), "", name="CM_non_word_only"),
# fix music symbols
NReProcessor(re.compile(ur'(?u)(^[*#¶\s]*[*#¶]+[*#¶\s]*$)'), u"", name="CM_music_symbols"),
# '' = "
StringProcessor("''", '"', name="CM_double_apostrophe"),
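To see what the new `CM_doubledash` and `CM_non_word_only` patterns do, they can be tried directly with `re.sub` on single lines. This is an assumption about how `NReProcessor` applies them, and the `clean_line` helper is hypothetical. The sample lines are taken from the test subtitle file changed further down.

```python
import re

# patterns copied from the hunk above
CM_DOUBLEDASH = re.compile(r'(?u)(^-\s?-[-\s]*)(?!.+\s?-\s?-[-\s]*)')
CM_NON_WORD_ONLY = re.compile(r'(?u)(^[-_\s]*[-_\s]+[-_\s]*$)')


def clean_line(line):
    # strip a leading "--" unless the line also ends in one ("-- text --")
    line = CM_DOUBLEDASH.sub("", line)
    # empty lines that consist only of dashes, underscores and whitespace
    return CM_NON_WORD_ONLY.sub("", line)
```

The negative lookahead is what keeps `-- www.Addic7ed.com --` intact while `-- Blimey.` loses its leading dashes.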
@@ -47,10 +47,19 @@ class HearingImpaired(SubtitleTextModification):
#NReProcessor(re.compile(ur'(?u)(\b|^)([\s-]*(?=[A-zÀ-ž-_0-9"\']{3,})[A-zÀ-ž-_0-9"\']+:\s*)'), "",
# name="HI_before_colon"),
# text before colon (at least 3 chars); at start or after a sentence, possibly with a dash in front
NReProcessor(re.compile(ur'(?u)(?:(?<=^)|(?<=[.\-!?"\']))'
ur'([\s-]*(?=[A-zÀ-ž-_0-9\s"\']{3,})[A-zÀ-ž-_0-9\s"\']+:\s*)(?![0-9])'), "",
name="HI_before_colon"),
# uppercase text before colon (at least 3 uppercase chars); at start or after a sentence,
# possibly with a dash in front
NReProcessor(re.compile(ur'(?u)(?:(?<=^)|(?<=[.\-!?\"\']))([\s-]*(?=[A-ZÀ-Ž]\s*[A-ZÀ-Ž]\s*[A-ZÀ-Ž])'
ur'[A-ZÀ-Ž-_0-9\s\"\']+:\s*)(?![0-9])'), "", name="HI_before_colon_caps"),
# any text before colon (at least 3 letters); at start or after a sentence,
# possibly with a dash in front; try not breaking actual sentences with a colon at the end by not matching if
# more than one space is inside the text
NReProcessor(re.compile(ur'(?u)(?:(?<=^)|(?<=[.\-!?\"\']))([\s-]*(?=[A-zÀ-ž]\s*[A-zÀ-ž]\s*[A-zÀ-ž])'
ur'[A-zÀ-ž-_0-9\s\"\']+:\s*)(?![0-9])'),
lambda match: match.group(1) if (match.group(1).count(" ") > 1
or match.group(1).count("-") > 1) else "",
name="HI_before_colon_noncaps"),
# text in brackets at start, after optional dash, before colon or at end of line
# fixme: may be too aggressive
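The replacement callback of `HI_before_colon_noncaps` decides whether a matched prefix looks like a speaker label or like an ordinary sentence. The decision can be sketched on its own. `replacement_for_prefix` is a hypothetical name and takes the text that `match.group(1)` would hold.

```python
def replacement_for_prefix(prefix):
    """Keep prefixes that look like sentences, drop short speaker labels."""
    looks_like_sentence = prefix.count(" ") > 1 or prefix.count("-") > 1
    return prefix if looks_like_sentence else ""
```

With the lines added to the test subtitle file, `butremove this:` has a single space and is dropped, while `here also no hi removal:` has several and is kept.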
@@ -66,6 +75,9 @@ class HearingImpaired(SubtitleTextModification):
# all caps at start before new sentence
NReProcessor(re.compile(ur'(?u)^(?=[A-ZÀ-Ž]{4,})[A-ZÀ-Ž-_\s]+\s([A-ZÀ-Ž][a-zà-ž].+)'), r"\1",
name="HI_starting_upper_then_sentence"),
# remove music symbols
NReProcessor(re.compile(ur'(?u)(^[*#¶♫♪\s]*[*#¶♫♪\s]+[*#¶♫♪\s]*$)'), "", name="HI_music_symbols_only"),
]
post_processors = empty_line_post_processors
@@ -39,8 +39,8 @@ class FixOCR(SubtitleTextModification):
return [
# remove broken HI tag colons (ANNOUNCER'., ". instead of :) after at least 3 uppercase chars
NReProcessor(re.compile(ur'(?u)(^.*(?<=[A-ZÀ-Ž]{3})[A-ZÀ-Ž-_\s0-9"\']+["\'’ʼ❜‘‛”“‟„][.,‚،⹁、]\s*)'), "",
name="OCR_fix_HI_colons"),
NReProcessor(re.compile(ur'(?u)(^.*(?<=[A-ZÀ-Ž]{3})[A-ZÀ-Ž-_\s0-9]+)(["\'’ʼ❜‘‛”“‟„]*[.,‚،⹁、;]+)(\s*)'),
r"\1:\3", name="OCR_fix_HI_colons"),
# fix F'bla
NReProcessor(re.compile(ur'(?u)(\bF)(\')([A-zÀ-ž]*\b)'), r"\1\3", name="OCR_fix_F"),
WholeLineProcessor(self.data_dict["WholeLines"], name="OCR_replace_line"),
+11 -3
@@ -1,7 +1,15 @@
# coding=utf-8
# restore builtins
import sys
def restore_builtins(module, base):
def fix_environment_stuff(module, base):
# restore builtins
module.__builtins__ = [x for x in base.__class__.__base__.__subclasses__() if x.__name__ == 'catch_warnings'][0]()._module.__builtins__
# patch getfilesystemencoding for NVIDIA Shield
getfilesystemencoding_orig = sys.getfilesystemencoding
def getfilesystemencoding():
return getfilesystemencoding_orig() or "utf-8"
sys.getfilesystemencoding = getfilesystemencoding
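The `getfilesystemencoding` patch wraps the original function and substitutes a default when it returns nothing usable. The same idea as a reusable wrapper is sketched below. `with_fallback` is a hypothetical name.

```python
def with_fallback(get_encoding, default="utf-8"):
    """Wrap an encoding getter so that None or "" becomes the default."""
    def wrapper():
        return get_encoding() or default
    return wrapper


# usage, equivalent in spirit to the patch above (not executed here):
#   import sys
#   sys.getfilesystemencoding = with_fallback(sys.getfilesystemencoding)
```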
@@ -12,7 +12,7 @@ import sys
from json_tricks.nonp import loads
from subzero.lib.json import dumps
from scandir import scandir
from subzero.lib.io import scandir
from constants import mode_map
logger = logging.getLogger(__name__)
+3 -2
@@ -4,7 +4,7 @@ import logging
import os
from babelfish.exceptions import LanguageError
from subzero.language import Language
from subzero.language import Language, language_from_stream
from subliminal_patch import scan_video, refine, search_external_subtitles
logger = logging.getLogger(__name__)
@@ -44,7 +44,8 @@ def set_existing_languages(video, video_info, external_subtitles=False, embedded
# mp4 and stuff, check burned in
for language in known_embedded:
try:
embedded_subtitle_languages.add(Language.fromalpha3b(language))
embedded_subtitle_languages.add(language_from_stream(language))
except LanguageError:
logger.error('Embedded subtitle track language %r is not a valid language', language)
embedded_subtitle_languages.add(Language('und'))
+11 -4
@@ -56,6 +56,9 @@ SHING) >>geil
12
00:00:34,783 --> 00:00:36,826
-- Blimey.
remove this hi you no, please: is that ok?
here also no hi removal:
butremove this:
13
00:00:36,828 --> 00:00:39,328
@@ -83,7 +86,7 @@ pepipi</i>
18
00:00:51,926 --> 00:00:55,304
That's the milometer
That's the milometer well
from the Roswell spaceship.
19
@@ -104,15 +107,19 @@ Ah, look at you!
22
00:01:13,489 --> 00:01:15,073
What is it?
- _
_
*
♫♫
23
00:01:15,075 --> 00:01:18,076
An old friend of mine. Well, enemy.
-- www.Addic7ed.com --
24
00:01:19,037 --> 00:01:22,367
The stuff of nightmares,
-- The stuff of nightmares,
reduced to an exhibit.
25
Binary file not shown (image; size before: 177 KiB, after: 49 KiB).

Binary file not shown (image; size before: 177 KiB, after: 50 KiB).

+25 -75
@@ -1,11 +1,11 @@
# Sub-Zero for Plex
[![](https://img.shields.io/github/release/pannal/Sub-Zero.bundle.svg?style=flat&label=stable)](https://github.com/pannal/Sub-Zero.bundle/releases/latest)<!--[![](https://img.shields.io/github/release/pannal/Sub-Zero.bundle/all.svg?maxAge=2592000&label=testing+2.0+RC9)](https://github.com/pannal/Sub-Zero.bundle/releases)--> [![master](https://img.shields.io/badge/master-stable-green.svg?maxAge=2592000)]()
[![Maintenance](https://img.shields.io/maintenance/yes/2017.svg)]()
[![Maintenance](https://img.shields.io/maintenance/yes/2018.svg)]()
[![Slack Status](https://szslack.fragstore.net/badge.svg)](https://szslack.fragstore.net)
<img src="https://raw.githubusercontent.com/pannal/Sub-Zero.bundle/master/Contents/Resources/subzero.gif" align="left" height="100"> <font size="5"><b>Subtitles done right!</b></font><br />
Check out **[the Sub-Zero Wiki](https://github.com/pannal/Sub-Zero.bundle/wiki)** by [@ukdtom](https://github.com/ukdtom) <br />
Check out **[the Sub-Zero Wiki](https://github.com/pannal/Sub-Zero.bundle/wiki)** by [@ukdtom](https://github.com/ukdtom) and [@mmgoodnow](https://github.com/mmgoodnow) <br />
<br style="clear:left;"/>
If you like this, buy me a beer: <br>[![Donate](https://www.paypalobjects.com/en_US/i/btn/btn_donate_LG.gif)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=G9VKR2B8PMNKG) <br>or become a Patreon starting at **1 $ / month** <br><a href="https://www.patreon.com/subzero_plex" target="_blank"><img src="http://www.wenspencer.com/wp-content/uploads/2017/02/patreon-button.png" height="42" /></a> <br>or use the OpenSubtitles Sub-Zero affiliate link to become VIP <br>**10€/year, ad-free subs, 1000 subs/day, no-cache *VIP* server**<br><a href="http://v.ht/osvip" target="_blank"><img src="https://static.opensubtitles.org/gfx/logo.gif" height="50" /></a>
@@ -77,83 +77,33 @@ Jacob K, Ninjouz, chopeta, fvb
## Changelog
2.5.0.2241
2.5.0.2287
- fix issue when removing crap from filenames to not accidentally remove release group #436
- fix initialization of soft ignore list after upgrade from 2.0
- core: reduce main icon size
- core: fix usage on NVIDIA SHIELD (hopefully, please report back), #441
- core: add scandir fallback to listdir in case of badly configured locale in environment, #441, #440
- core: get subtitles from archive: don't assume an episode match
- core: get subtitles from archive: don't assume any attributes in guess
- core: improve release group detection for drone/filebot/file_info refiners
- core: fix language detection for embedded subtitle streams
- core: support extraction of embedded mov_text subtitles in mp4 video files
- refiners: drone: add http:// to url if not given
- providers: opensubtitles: retry/reinitialize request when encountering ResponseNotReady
- config: clarify subscene being only enabled for TV series by default
- menu: when encountering permission errors when scanning media files, warn in the menu about them
- submod: common: don't break -- addic7ed --
- submod: common: remove lines that consist only of dash, underscore
- submod: OCR: fix Ls = Is
- submod: OCR: fix bad HI colons (ANNOUNCER; instead of ANNOUNCER:)
- submod: common: fix lines consisting only of bad music symbols (*#¶ = ♪)
- submod: HI: remove music-symbol-only-lines
- submod: HI: be less aggressive about lines ending with a colon; please re-apply all your mods via advanced menu
- submod: OCR: fix it'sjust, isn'tjust, Iam, Ican
2.5.0.2221
- refiners: add support for retrieving original filename from
    - drone derivatives: sonarr, radarr
    - filebot
    - symlinks
    - file_info meta file lists (see wiki)
- providers: add subscene (disabled by default to not flood subscene on release)
    - normal search
    - season pack search if season has concluded
- core: add provider subtitle-archive/pack cache for retrieving single subtitles from previously downloaded (season-) packs (subscene)
- core/agent: massive performance improvements over 2.0
- core/agent/background-tasks: reduce memory usage to a fraction of 2.0
- core/providers: add dynamic provider throttling when certain events occur (ServiceUnavailable, too many downloads, ...), to lighten the provider-load
- core/agent/config: automatically extract embedded subtitles (and use them if no current subtitle)
- core: fix internal subtitle info storage issues
- core: always store internal subtitle information even if no subtitle was downloaded (fixes SearchAllRecentlyAddedMissing)
- core: fix internal subtitle info storage on windows (gzip handling is broken there)
- core: don't fail on missing logfile paths
- core: fix default encoding order for non-script-serbian
- core: improve logging
- core: add AsRequested to cleanup garbage names
- core: treat SDTV and HDTV the same when searching for subtitles
- core: parse_video: trust PMS season and episode numbers
- core: parse_video: add series year information from PMS if none found
- core: upgrade dependencies
- core: update subliminal to 62cdb3c
- core: add new file based cache mechanism, rendering DBM/memory backends obsolete
- core: treat 23.980 fps as 23.976 and vice-versa
- core: add HTTP proxy support for querying the providers (supports credentials)
- core: only compute file hashes for enabled providers
- core: massive speedup; refine only when needed, exit early otherwise
- core: store last modified timestamp in subtitle info storage
- core: only write to subtitle info storage if we haven't had one or any subtitle was downloaded
- core: only clean up the sub-folder if a subtitle-sub-folder has been selected, and not the parent one also
- core: support for CP437 encoded filenames in ZIP-Archives
- core: use scandir library instead of os.listdir if possible, reducing performance-impact
- core: archives: support multi-episode subtitles (partly)
- core: subtitle cleanup: add support for hi, cc, sdh secondary filename tags; don't autoclean .txt
- core: increase request timeout by three times in case a proxy is being used
- core: fix language=Unknown in Plex when "Restrict to one language"-setting is set
- core: refining: re-add old detected title as alternative title after re-refining with plex metadata's title; fixes #428
- core: implement advanced_settings.json (see advanced_settings.json.template for reference, copy to "Plug-in Support/Data/com.plexapp.agents.subzero" to use it)
- core/tasks: fix search all recently added missing (the total number of items will change in the menu while running), reduces memory usage
- core/menu: add support for extracting embedded subtitles using the builtin plex transcoder
- core/menu: skip wrong season or episode in returned subtitle results
- core/config: fix language handling if treat undefined as first language is set
- providers: remove shooter.cn
- providers: add support for zip/rar archives containing more than one subtitle file
- submod: common: remove redundant punctuation ("Hello !!!" -> "Hello!")
- submod: skip provider hashing when applying mods
- submod: correctly drop empty line (fixing broken display)
- submod: OCR: fix F'xxxxx -> Fxxxxx
- submod: HI: improve bracket matching
- submod: OCR: fix l/L instead of I more aggressively
- submod: common: fix uppercase I's in lowercase words more aggressively
- submod: HI: improve HI_before_colon
- submod: common: be more aggressive when fixing numbers; correctly space out spaced ellipses; don't break spaced ellipses; handle multiple spaces in numbers
- menu: add support for extracting embedded subtitles for a whole season
- menu: add reapply mods to current subtitle
- menu: pad titles for more submenus, resulting in detail view in PlexWeb
- menu: add subtitle selection submenu (if multiple subtitles are inside the subtitle info storage; e.g. previously downloaded ones or extracted embedded)
- menu: advanced: add skip findbettersubtitles menu item, which sets the last_run to now (for debugging purposes)
- menu: ignore: add more natural title for seasons and episodes (kills your old ignore lists!)
- config: skip provider hashing on low impact mode
- config: add limit by air date setting to consider for FindBetterSubtitles task (default: 1 year)
- advanced settings: define enabled-for media types per provider
- advanced settings: define enabled-for languages per provider
- advanced settings: add deep-clean option (clean up the subtitle-sub-folder and the parent one)
2.5.0.2247
- fix ignoring by-hash-matched episodes
[older changes](CHANGELOG.md)