Compare commits

...

34 Commits

Author SHA1 Message Date
panni 92d0d70258 Release 2.6.5.3062 2019-05-01 15:32:36 +02:00
panni d44298993c Release 2.6.5.3055 2019-05-01 15:32:19 +02:00
panni 12300d4115 Merge branch 'develop-2.6' 2019-05-01 15:29:42 +02:00
pannal b4f08f61a6 Update README.md 2019-05-01 06:00:01 +02:00
pannal 861a25be41 Update README.md 2019-05-01 05:59:21 +02:00
pannal 3e175109a6 Merge pull request #641 from fossabot/master
Add license scan report and status
2019-05-01 05:48:14 +02:00
fossabot fb2210f2fd Add license scan report and status
Signed-off-by: fossabot <badges@fossa.io>
2019-04-30 20:44:05 -07:00
panni e928918201 add cloudscaper LICENSE 2019-05-01 05:13:13 +02:00
panni df607e5772 bump dev 2019-05-01 04:49:30 +02:00
panni a7cc470645 core: log cf domain 2019-05-01 04:48:48 +02:00
panni 4e6421b928 core: dns: set env var empty if not configured 2019-05-01 04:36:03 +02:00
panni df48e8fccd providers: subscene: remove obsolete imports 2019-05-01 04:27:11 +02:00
panni 58111bf204 core: remove old cfscrape implementation 2019-05-01 04:25:04 +02:00
panni 8c02e75fed providers: titlovi: match cfsrc for src 2019-05-01 04:24:31 +02:00
panni 6f3f1cb4b5 core: cf: harden. 2019-05-01 04:24:09 +02:00
panni dd27997deb core: cf: add cloudscaper 1.1.1@496900e instead of cfscrape 2019-05-01 03:12:01 +02:00
panni a1f70d1d4d core: add ENV:dns_resolvers_timeout 2019-05-01 02:39:18 +02:00
panni 7da0bac643 skip warning 2019-05-01 02:33:46 +02:00
panni b3ab2a451c core: http: don't query DNS with IPs. thanks @fgump 2019-05-01 02:27:30 +02:00
panni 850f836ebd back to dev 2019-04-28 05:27:26 +02:00
panni d9fa9d03da back to dev 2019-04-28 05:22:24 +02:00
pannal 76c20dc3d7 Update README.md 2019-04-28 05:21:35 +02:00
panni 4568e222d1 release 2.6.5.3041 2019-04-28 05:11:45 +02:00
panni 344025226a add missing changelog entry 2019-04-28 05:11:09 +02:00
panni f546fcffce release 2.6.5.3039 2019-04-28 05:08:00 +02:00
panni 068c2d4d00 Merge remote-tracking branch 'origin/master'
# Conflicts:
#	Contents/Info.plist
2019-04-28 05:04:51 +02:00
panni ccf5a902e5 core: cf: only store cookie if it had a value 2019-04-28 05:03:04 +02:00
panni 8c72cf9057 bump dev 2019-04-28 04:45:17 +02:00
panni 1ce14aa231 core: http: remove debug 2019-04-28 04:44:27 +02:00
panni 643485b879 core: cf: optimize
providers: titlovi: optimize cf/captcha handling
2019-04-28 04:43:03 +02:00
pannal 5b3d9f26be Update README.md 2019-04-28 03:47:55 +02:00
pannal 14f2f45f20 Update README.md 2019-04-22 05:37:47 +02:00
pannal 8ac6c9d7a7 Update README.md 2019-04-22 05:31:29 +02:00
pannal 237a47b8ed Update Info.plist 2019-04-21 03:48:37 +02:00
19 changed files with 3132 additions and 608 deletions
+34
View File
@@ -1,4 +1,38 @@
2.6.5.3017
Changelog
- core: SRT parsing: handle (bad) ASS color tag in SRT
- core: auto extract embedded: only use one unknown sub for first language
- core: better embedded streams language detection
- core: optimizations
- core: extract embedded: fix is_unknown check
- core: don't raise exception when subtitle not found inside archive
- core: search external subtitles: fix condition
- core: better plex transcoder path detection
- core: use Log.Warn instead of Log.Warning (#619, #629, #633)
- core: also check for "plex transcoder.exe" in case of windows (fixes #619)
- core: auto extract: use mbcs encoding for paths on windows
- core: Fix issue scandir not returning the name of the file inside Docker images on ARM systems. (thanks @giejay)
- core: also clean PYTHONHOME when calling external notification app
- core: update certifi to 2019.3.9
- core: scan_video: add series/title as alternative by scanning filename itself without parent folders
- core: add generic solution for solving captchas using anti captcha services
- core: increase cache time to 180d (was: 30d)
- core: guess_matches: handle multiple title matches; fixes bazarr#403
- windows: fix compatibility issues with plex transcoder
- compat: use lowercase paths on subtitle detection
- providers: addic7ed: re-enable (using paid anti captch service)
- providers: assrt: assume undefined Chinese flavor as Simplified (chs/zho-Hans)
- providers: subscene: make it work again by bypassing cf
- providers: subscene: don't fail on missing cover
- providers: titlovi: re-enable (might need paid anti captch service)
- providers: opensubtitles: fix only_foreign handling
- providers: opensubtitles: show subtitles with possibly mismatched series when manually listing subs
- menu: list subtitles: show subtitles with bad season/episode values as well
- refiners: omdb: fix imdb ids with spaces
2.6.4.2911
- core: improve file cache (windows especially); use fixed-length cache filenames; fixes #600
- core: don't log "Checking connections ..." when sonarr/radarr not activated
+1
View File
@@ -1083,6 +1083,7 @@ class Config(object):
def parse_custom_dns(self):
custom_dns = Prefs['use_custom_dns2'].strip()
os.environ["dns_resolvers"] = ""
if custom_dns:
ips = filter(lambda x: x, [d.strip() for d in custom_dns.split(",")])
if ips:
+3 -3
View File
@@ -13,7 +13,7 @@
<key>CFBundleSignature</key>
<string>????</string>
<key>CFBundleVersion</key>
<string>2.6.5.3023</string>
<string>2.6.5.3062</string>
<key>PlexFrameworkVersion</key>
<string>2</string>
<key>PlexPluginClass</key>
@@ -23,7 +23,7 @@
<key>PlexPluginConsoleLogging</key>
<string>0</string>
<key>PlexPluginDevMode</key>
<string>1</string>
<string>0</string>
<key>PlexPluginCodePolicy</key>
<!-- this allows channels to access some python methods which are otherwise blocked, as well as import external code libraries, and interact with the PMS HTTP API -->
<string>Elevated</string>
@@ -32,7 +32,7 @@
&lt;h1&gt;Sub-Zero for Plex&lt;/h1&gt;&lt;i&gt;Subtitles done right&lt;/i&gt;
Version 2.6.5.3023 DEV
Version 2.6.5.3062
Originally based on @bramwalet's awesome &lt;a href=&quot;https://github.com/bramwalet/Subliminal.bundle&quot;&gt;Subliminal.bundle&lt;/a&gt;
@@ -1,392 +0,0 @@
# coding=utf-8
import logging
import random
import re
import os
import json
import base64
from copy import deepcopy
from time import sleep
from collections import OrderedDict
from .jsfuck import jsunfuck
import js2py
from requests.sessions import Session
from subliminal_patch.pitcher import pitchers
try:
from requests_toolbelt.utils import dump
except ImportError:
pass
try:
from urlparse import urlparse
from urlparse import urlunparse
except ImportError:
from urllib.parse import urlparse
from urllib.parse import urlunparse
brotli_available = True
try:
from brotli import decompress as brdec
except:
brotli_available = False
logger = logging.getLogger(__name__)
__version__ = "2.0.3"
# Orignally written by https://github.com/Anorov/cloudflare-scrape
# Rewritten by VeNoMouS - <venom@gen-x.co.nz> for https://github.com/VeNoMouS/Sick-Beard - 24/3/2018 NZDT
DEFAULT_USER_AGENTS = [
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36",
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/65.0.3325.181 Chrome/65.0.3325.181 Safari/537.36",
"Mozilla/5.0 (Linux; Android 7.0; Moto G (5) Build/NPPS25.137-93-8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.137 Mobile Safari/537.36",
"Mozilla/5.0 (iPhone; CPU iPhone OS 7_0_4 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11B554a Safari/9537.53",
"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0",
"Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0",
]
BUG_REPORT = """\
Cloudflare may have changed their technique, or there may be a bug in the script.
"""
cur_path = os.path.abspath(os.path.dirname(__file__))
if brotli_available:
brwsrs = os.path.join(cur_path, "browsers_br.json")
with open(brwsrs, "r") as f:
UA_COMBO = json.load(f, object_pairs_hook=OrderedDict)["chrome"]
else:
brwsrs = os.path.join(cur_path, "browsers.json")
UA_COMBO = []
with open(brwsrs, "r") as f:
_brwsrs = json.load(f, object_pairs_hook=OrderedDict)
for entry in _brwsrs:
_entry = OrderedDict(("-".join(a.capitalize() for a in key.split("-")), value)
for key, value in entry.iteritems())
_entry["User-Agent"] = None
UA_COMBO.append({"User-Agent": [entry["user-agent"]], "headers": _entry})
class NeedsCaptchaException(Exception):
pass
class CloudflareScraper(Session):
def __init__(self, *args, **kwargs):
self.delay = kwargs.pop('delay', 8)
self.debug = False
self._ua = None
self._hdrs = None
super(CloudflareScraper, self).__init__(*args, **kwargs)
if not self._ua:
# Set a random User-Agent if no custom User-Agent has been set
ua_combo = random.choice(UA_COMBO)
self._ua = random.choice(ua_combo["User-Agent"])
self._hdrs = ua_combo["headers"].copy()
self._hdrs["User-Agent"] = self._ua
self.headers['User-Agent'] = self._ua
def set_cloudflare_challenge_delay(self, delay):
if isinstance(delay, (int, float)) and delay > 0:
self.delay = delay
def is_cloudflare_challenge(self, resp):
if resp.headers.get('Server', '').startswith('cloudflare'):
if b'why_captcha' in resp.content or b'/cdn-cgi/l/chk_captcha' in resp.content:
raise NeedsCaptchaException
return (
resp.status_code in [429, 503]
and b"jschl_vc" in resp.content
and b"jschl_answer" in resp.content
)
return False
def debugRequest(self, req):
try:
print (dump.dump_all(req).decode('utf-8'))
except:
pass
def request(self, method, url, *args, **kwargs):
# self.headers = (
# OrderedDict(
# [
# ('User-Agent', self.headers['User-Agent']),
# ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
# ('Accept-Language', 'en-US,en;q=0.5'),
# ('Accept-Encoding', 'gzip, deflate'),
# ('Connection', 'close'),
# ('Upgrade-Insecure-Requests', '1')
# ]
# )
# )
self.headers = self._hdrs.copy()
resp = super(CloudflareScraper, self).request(method, url, *args, **kwargs)
if resp.headers.get('content-encoding') == 'br' and brotli_available:
resp._content = brdec(resp._content)
# Debug request
if self.debug:
self.debugRequest(resp)
# Check if Cloudflare anti-bot is on
try:
if self.is_cloudflare_challenge(resp):
# Work around if the initial request is not a GET,
# Superseed with a GET then re-request the orignal METHOD.
if resp.request.method != 'GET':
self.request('GET', resp.url)
resp = self.request(method, url, *args, **kwargs)
else:
resp = self.solve_cf_challenge(resp, **kwargs)
except NeedsCaptchaException:
# solve the captcha
site_key = re.search(r'data-sitekey="(.+?)"', resp.content).group(1)
challenge_s = re.search(r'type="hidden" name="s" value="(.+?)"', resp.content).group(1)
challenge_ray = re.search(r'data-ray="(.+?)"', resp.content).group(1)
if not all([site_key, challenge_s, challenge_ray]):
raise Exception("cf: Captcha site-key not found!")
pitcher = pitchers.get_pitcher()("cf", resp.request.url, site_key,
user_agent=self.headers["User-Agent"],
cookies=self.cookies.get_dict(),
is_invisible=True)
logger.info("cf: Solving captcha")
result = pitcher.throw()
if not result:
raise Exception("cf: Couldn't solve captcha!")
parsed_url = urlparse(resp.url)
domain = parsed_url.netloc
submit_url = '{}://{}/cdn-cgi/l/chk_captcha'.format(parsed_url.scheme, domain)
method = resp.request.method
cloudflare_kwargs = {
'allow_redirects': False,
'headers': {'Referer': resp.url},
'params': OrderedDict(
[
('s', challenge_s),
('g-recaptcha-response', result)
]
)
}
return self.request(method, submit_url, **cloudflare_kwargs)
return resp
def solve_cf_challenge(self, resp, **original_kwargs):
body = resp.text
# Cloudflare requires a delay before solving the challenge
if self.delay == 8:
try:
delay = float(re.search(r'submit\(\);\r?\n\s*},\s*([0-9]+)', body).group(1)) / float(1000)
if isinstance(delay, (int, float)):
self.delay = delay
except:
pass
sleep(self.delay)
parsed_url = urlparse(resp.url)
domain = parsed_url.netloc
submit_url = '{}://{}/cdn-cgi/l/chk_jschl'.format(parsed_url.scheme, domain)
cloudflare_kwargs = deepcopy(original_kwargs)
headers = cloudflare_kwargs.setdefault('headers', {'Referer': resp.url})
try:
params = cloudflare_kwargs.setdefault(
'params', OrderedDict(
[
('s', re.search(r'name="s"\svalue="(?P<s_value>[^"]+)', body).group('s_value')),
('jschl_vc', re.search(r'name="jschl_vc" value="(\w+)"', body).group(1)),
('pass', re.search(r'name="pass" value="(.+?)"', body).group(1)),
]
)
)
except Exception as e:
# Something is wrong with the page.
# This may indicate Cloudflare has changed their anti-bot
# technique. If you see this and are running the latest version,
# please open a GitHub issue so I can update the code accordingly.
raise ValueError("Unable to parse Cloudflare anti-bots page: {} {}".format(e.message, BUG_REPORT))
# Solve the Javascript challenge
params['jschl_answer'] = self.solve_challenge(body, domain)
# Requests transforms any request into a GET after a redirect,
# so the redirect has to be handled manually here to allow for
# performing other types of requests even as the first request.
method = resp.request.method
cloudflare_kwargs['allow_redirects'] = False
redirect = self.request(method, submit_url, **cloudflare_kwargs)
redirect_location = urlparse(redirect.headers['Location'])
if not redirect_location.netloc:
redirect_url = urlunparse(
(
parsed_url.scheme,
domain,
redirect_location.path,
redirect_location.params,
redirect_location.query,
redirect_location.fragment
)
)
return self.request(method, redirect_url, **original_kwargs)
return self.request(method, redirect.headers['Location'], **original_kwargs)
def solve_challenge(self, body, domain):
try:
js = re.search(
r"setTimeout\(function\(\){\s+(var s,t,o,p,b,r,e,a,k,i,n,g,f.+?\r?\n[\s\S]+?a\.value =.+?)\r?\n",
body
).group(1)
except Exception:
raise ValueError("Unable to identify Cloudflare IUAM Javascript on website. {}".format(BUG_REPORT))
js = re.sub(r"a\.value = ((.+).toFixed\(10\))?", r"\1", js)
js = re.sub(r'(e\s=\sfunction\(s\)\s{.*?};)', '', js, flags=re.DOTALL|re.MULTILINE)
js = re.sub(r"\s{3,}[a-z](?: = |\.).+", "", js).replace("t.length", str(len(domain)))
js = js.replace('; 121', '')
# Strip characters that could be used to exit the string context
# These characters are not currently used in Cloudflare's arithmetic snippet
js = re.sub(r"[\n\\']", "", js)
if 'toFixed' not in js:
raise ValueError("Error parsing Cloudflare IUAM Javascript challenge. {}".format(BUG_REPORT))
try:
jsEnv = """
var t = "{domain}";
var g = String.fromCharCode;
o = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
e = function(s) {{
s += "==".slice(2 - (s.length & 3));
var bm, r = "", r1, r2, i = 0;
for (; i < s.length;) {{
bm = o.indexOf(s.charAt(i++)) << 18 | o.indexOf(s.charAt(i++)) << 12 | (r1 = o.indexOf(s.charAt(i++))) << 6 | (r2 = o.indexOf(s.charAt(i++)));
r += r1 === 64 ? g(bm >> 16 & 255) : r2 === 64 ? g(bm >> 16 & 255, bm >> 8 & 255) : g(bm >> 16 & 255, bm >> 8 & 255, bm & 255);
}}
return r;
}};
function italics (str) {{ return '<i>' + this + '</i>'; }};
var document = {{
getElementById: function () {{
return {{'innerHTML': '{innerHTML}'}};
}}
}};
{js}
"""
innerHTML = re.search(
'<div(?: [^<>]*)? id="([^<>]*?)">([^<>]*?)<\/div>',
body,
re.MULTILINE | re.DOTALL
)
innerHTML = innerHTML.group(2).replace("'", r"\'") if innerHTML else ""
js = jsunfuck(jsEnv.format(domain=domain, innerHTML=innerHTML, js=js))
def atob(s):
return base64.b64decode('{}'.format(s)).decode('utf-8')
js2py.disable_pyimport()
context = js2py.EvalJs({'atob': atob})
result = context.eval(js)
except Exception:
logging.error("Error executing Cloudflare IUAM Javascript. {}".format(BUG_REPORT))
raise
try:
float(result)
except Exception:
raise ValueError("Cloudflare IUAM challenge returned unexpected answer. {}".format(BUG_REPORT))
return result
@classmethod
def create_scraper(cls, sess=None, **kwargs):
"""
Convenience function for creating a ready-to-go CloudflareScraper object.
"""
scraper = cls(**kwargs)
if sess:
attrs = ['auth', 'cert', 'cookies', 'headers', 'hooks', 'params', 'proxies', 'data']
for attr in attrs:
val = getattr(sess, attr, None)
if val:
setattr(scraper, attr, val)
return scraper
# Functions for integrating cloudflare-scrape with other applications and scripts
@classmethod
def get_tokens(cls, url, user_agent=None, debug=False, **kwargs):
scraper = cls.create_scraper()
scraper.debug = debug
if user_agent:
scraper.headers['User-Agent'] = user_agent
try:
resp = scraper.get(url, **kwargs)
resp.raise_for_status()
except Exception as e:
logging.error("'{}' returned an error. Could not collect tokens.".format(url))
raise
domain = urlparse(resp.url).netloc
cookie_domain = None
for d in scraper.cookies.list_domains():
if d.startswith('.') and d in ('.{}'.format(domain)):
cookie_domain = d
break
else:
raise ValueError("Unable to find Cloudflare cookies. Does the site actually have Cloudflare IUAM (\"I'm Under Attack Mode\") enabled?")
return (
{
'__cfduid': scraper.cookies.get('__cfduid', '', domain=cookie_domain),
'cf_clearance': scraper.cookies.get('cf_clearance', '', domain=cookie_domain)
},
scraper.headers['User-Agent']
)
@classmethod
def get_cookie_string(cls, url, user_agent=None, debug=False, **kwargs):
"""
Convenience function for building a Cookie HTTP header value.
"""
tokens, user_agent = cls.get_tokens(url, user_agent=user_agent, debug=debug, **kwargs)
return "; ".join("=".join(pair) for pair in tokens.items()), user_agent
create_scraper = CloudflareScraper.create_scraper
get_tokens = CloudflareScraper.get_tokens
get_cookie_string = CloudflareScraper.get_cookie_string
@@ -1,80 +0,0 @@
[
{
"connection": "close",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"user-agent": "Mozilla/5.0 (Windows NT 5.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.102 Safari/537.36",
"accept-encoding": "gzip,deflate",
"accept-language": "en-US,en;q=0.8"
},
{
"connection": "close",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"user-agent": "Mozilla/5.0 (Windows NT 5.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.101 Safari/537.36",
"accept-encoding": "gzip,deflate",
"accept-language": "en-US,en;q=0.8"
},
{
"connection": "close",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"upgrade-insecure-requests": "1",
"user-agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36",
"accept-language": "en-US,en;q=0.8",
"accept-encoding": "gzip, deflate, "
},
{
"connection": "close",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"upgrade-insecure-requests": "1",
"user-agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36",
"accept-language": "en-US,en;q=0.8",
"accept-encoding": "gzip, deflate, "
},
{
"connection": "close",
"accept": "*/*",
"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:30.0) Gecko/20100101 Firefox/30.0"
},
{
"connection": "close",
"accept": "image/jpeg, image/gif, image/pjpeg, application/x-ms-application, application/xaml+xml, application/x-ms-xbap, */*",
"accept-language": "en-US",
"user-agent": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)",
"accept-encoding": "gzip, deflate"
},
{
"connection": "close",
"accept": "text/html, application/xhtml+xml, */*",
"accept-language": "en-US",
"user-agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)",
"accept-encoding": "gzip, deflate"
},
{
"connection": "close",
"accept": "text/html, application/xhtml+xml, */*",
"accept-language": "en-US",
"user-agent": "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)",
"accept-encoding": "gzip, deflate",
"dnt": "1"
},
{
"connection": "close",
"user-agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"accept-language": "en-US,en;q=0.5",
"accept-encoding": "gzip, deflate"
},
{
"connection": "close",
"user-agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:42.0) Gecko/20100101 Firefox/42.0",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"accept-language": "en-US,en;q=0.5",
"accept-encoding": "gzip, deflate"
},
{
"connection": "close",
"user-agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"accept-language": "en-US,en;q=0.5",
"accept-encoding": "gzip, deflate"
}
]
@@ -0,0 +1,346 @@
import re
import ssl
try:
import brotli
except ImportError:
brotli = None
import logging
from copy import deepcopy
from time import sleep
from collections import OrderedDict
from requests.sessions import Session
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.ssl_ import create_urllib3_context
from subliminal_patch.pitcher import pitchers
from .interpreters import JavaScriptInterpreter
from .user_agent import User_Agent
try:
from requests_toolbelt.utils import dump
except ImportError:
pass
try:
from urlparse import urlparse
from urlparse import urlunparse
except ImportError:
from urllib.parse import urlparse
from urllib.parse import urlunparse
logger = logging.getLogger(__name__)
##########################################################################################################################################################
__version__ = '1.1.1'
BUG_REPORT = 'Cloudflare may have changed their technique, or there may be a bug in the script.'
##########################################################################################################################################################
class CipherSuiteAdapter(HTTPAdapter):
def __init__(self, cipherSuite=None, **kwargs):
self.cipherSuite = cipherSuite
super(CipherSuiteAdapter, self).__init__(**kwargs)
##########################################################################################################################################################
def init_poolmanager(self, *args, **kwargs):
kwargs['ssl_context'] = create_urllib3_context(ciphers=self.cipherSuite)
return super(CipherSuiteAdapter, self).init_poolmanager(*args, **kwargs)
##########################################################################################################################################################
def proxy_manager_for(self, *args, **kwargs):
kwargs['ssl_context'] = create_urllib3_context(ciphers=self.cipherSuite)
return super(CipherSuiteAdapter, self).proxy_manager_for(*args, **kwargs)
##########################################################################################################################################################
class NeedsCaptchaException(Exception):
pass
class CloudScraper(Session):
was_cf_request = False
def __init__(self, *args, **kwargs):
self.debug = kwargs.pop('debug', False)
self.delay = kwargs.pop('delay', None)
self.interpreter = kwargs.pop('interpreter', 'js2py')
self.allow_brotli = kwargs.pop('allow_brotli', True) and bool(brotli)
self.cipherSuite = None
super(CloudScraper, self).__init__()
if 'requests' in self.headers['User-Agent']:
# Set a random User-Agent if no custom User-Agent has been set
self.headers = User_Agent(allow_brotli=self.allow_brotli).headers
self.mount('https://', CipherSuiteAdapter(self.loadCipherSuite()))
##########################################################################################################################################################
@staticmethod
def debugRequest(req):
try:
print(dump.dump_all(req).decode('utf-8'))
except: # noqa
pass
##########################################################################################################################################################
def loadCipherSuite(self):
if self.cipherSuite:
return self.cipherSuite
ciphers = [
'GREASE_3A', 'GREASE_6A', 'AES128-GCM-SHA256', 'AES256-GCM-SHA256', 'AES256-GCM-SHA384', 'CHACHA20-POLY1305-SHA256',
'ECDHE-ECDSA-AES128-GCM-SHA256', 'ECDHE-RSA-AES128-GCM-SHA256', 'ECDHE-ECDSA-AES256-GCM-SHA384',
'ECDHE-RSA-AES256-GCM-SHA384', 'ECDHE-ECDSA-CHACHA20-POLY1305-SHA256', 'ECDHE-RSA-CHACHA20-POLY1305-SHA256',
'ECDHE-RSA-AES128-CBC-SHA', 'ECDHE-RSA-AES256-CBC-SHA', 'RSA-AES128-GCM-SHA256', 'RSA-AES256-GCM-SHA384',
'ECDHE-RSA-AES128-GCM-SHA256', 'RSA-AES256-SHA', '3DES-EDE-CBC'
]
self.cipherSuite = ''
ctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
for cipher in ciphers:
try:
ctx.set_ciphers(cipher)
self.cipherSuite = '{}:{}'.format(self.cipherSuite, cipher).rstrip(':')
except ssl.SSLError:
pass
return self.cipherSuite
##########################################################################################################################################################
def request(self, method, url, *args, **kwargs):
ourSuper = super(CloudScraper, self)
resp = ourSuper.request(method, url, *args, **kwargs)
if resp.headers.get('Content-Encoding') == 'br':
if self.allow_brotli and resp._content:
resp._content = brotli.decompress(resp.content)
else:
logging.warning('Brotli content detected, But option is disabled, we will not continue.')
return resp
# Debug request
if self.debug:
self.debugRequest(resp)
# Check if Cloudflare anti-bot is on
try:
if self.isChallengeRequest(resp):
self.was_cf_request = True
if resp.request.method != 'GET':
# Work around if the initial request is not a GET,
# Supersede with a GET then re-request the original METHOD.
self.request('GET', resp.url)
resp = ourSuper.request(method, url, *args, **kwargs)
else:
# Solve Challenge
resp = self.sendChallengeResponse(resp, **kwargs)
except NeedsCaptchaException:
self.was_cf_request = True
parsed_url = urlparse(url)
domain = parsed_url.netloc
# solve the captcha
site_key = re.search(r'data-sitekey="(.+?)"', resp.content).group(1)
challenge_s = re.search(r'type="hidden" name="s" value="(.+?)"', resp.content).group(1)
challenge_ray = re.search(r'data-ray="(.+?)"', resp.content).group(1)
if not all([site_key, challenge_s, challenge_ray]):
raise Exception("cf: Captcha site-key not found!")
pitcher = pitchers.get_pitcher()("cf: %s" % domain, resp.request.url, site_key,
user_agent=self.headers["User-Agent"],
cookies=self.cookies.get_dict(),
is_invisible=True)
parsed_url = urlparse(resp.url)
logger.info("cf: %s: Solving captcha", domain)
result = pitcher.throw()
if not result:
raise Exception("cf: Couldn't solve captcha!")
submit_url = '{}://{}/cdn-cgi/l/chk_captcha'.format(parsed_url.scheme, domain)
method = resp.request.method
cloudflare_kwargs = {
'allow_redirects': False,
'headers': {'Referer': resp.url},
'params': OrderedDict(
[
('s', challenge_s),
('g-recaptcha-response', result)
]
)
}
return self.request(method, submit_url, **cloudflare_kwargs)
return resp
##########################################################################################################################################################
@staticmethod
def isChallengeRequest(resp):
if resp.headers.get('Server', '').startswith('cloudflare'):
if b'why_captcha' in resp.content or b'/cdn-cgi/l/chk_captcha' in resp.content:
raise NeedsCaptchaException
return (
resp.status_code in [429, 503]
and all(s in resp.content for s in [b'jschl_vc', b'jschl_answer'])
)
return False
##########################################################################################################################################################
def sendChallengeResponse(self, resp, **original_kwargs):
body = resp.text
# Cloudflare requires a delay before solving the challenge
if not self.delay:
try:
delay = float(re.search(r'submit\(\);\r?\n\s*},\s*([0-9]+)', body).group(1)) / float(1000)
if isinstance(delay, (int, float)):
self.delay = delay
except: # noqa
pass
sleep(self.delay)
parsed_url = urlparse(resp.url)
domain = parsed_url.netloc
submit_url = '{}://{}/cdn-cgi/l/chk_jschl'.format(parsed_url.scheme, domain)
cloudflare_kwargs = deepcopy(original_kwargs)
try:
params = OrderedDict()
s = re.search(r'name="s"\svalue="(?P<s_value>[^"]+)', body)
if s:
params['s'] = s.group('s_value')
params.update(
[
('jschl_vc', re.search(r'name="jschl_vc" value="(\w+)"', body).group(1)),
('pass', re.search(r'name="pass" value="(.+?)"', body).group(1))
]
)
params = cloudflare_kwargs.setdefault('params', params)
except Exception as e:
raise ValueError('Unable to parse Cloudflare anti-bots page: {} {}'.format(e.message, BUG_REPORT))
# Solve the Javascript challenge
params['jschl_answer'] = JavaScriptInterpreter.dynamicImport(self.interpreter).solveChallenge(body, domain)
# Requests transforms any request into a GET after a redirect,
# so the redirect has to be handled manually here to allow for
# performing other types of requests even as the first request.
cloudflare_kwargs['allow_redirects'] = False
redirect = self.request(resp.request.method, submit_url, **cloudflare_kwargs)
redirect_location = urlparse(redirect.headers['Location'])
if not redirect_location.netloc:
redirect_url = urlunparse(
(
parsed_url.scheme,
domain,
redirect_location.path,
redirect_location.params,
redirect_location.query,
redirect_location.fragment
)
)
return self.request(resp.request.method, redirect_url, **original_kwargs)
return self.request(resp.request.method, redirect.headers['Location'], **original_kwargs)
##########################################################################################################################################################
@classmethod
def create_scraper(cls, sess=None, **kwargs):
"""
Convenience function for creating a ready-to-go CloudScraper object.
"""
scraper = cls(**kwargs)
if sess:
attrs = ['auth', 'cert', 'cookies', 'headers', 'hooks', 'params', 'proxies', 'data']
for attr in attrs:
val = getattr(sess, attr, None)
if val:
setattr(scraper, attr, val)
return scraper
##########################################################################################################################################################
# Functions for integrating cloudscraper with other applications and scripts
@classmethod
def get_tokens(cls, url, **kwargs):
scraper = cls.create_scraper(
debug=kwargs.pop('debug', False),
delay=kwargs.pop('delay', None),
interpreter=kwargs.pop('interpreter', 'js2py'),
allow_brotli=kwargs.pop('allow_brotli', True),
)
try:
resp = scraper.get(url, **kwargs)
resp.raise_for_status()
except Exception:
logging.error('"{}" returned an error. Could not collect tokens.'.format(url))
raise
domain = urlparse(resp.url).netloc
# noinspection PyUnusedLocal
cookie_domain = None
for d in scraper.cookies.list_domains():
if d.startswith('.') and d in ('.{}'.format(domain)):
cookie_domain = d
break
else:
raise ValueError('Unable to find Cloudflare cookies. Does the site actually have Cloudflare IUAM ("I\'m Under Attack Mode") enabled?')
return (
{
'__cfduid': scraper.cookies.get('__cfduid', '', domain=cookie_domain),
'cf_clearance': scraper.cookies.get('cf_clearance', '', domain=cookie_domain)
},
scraper.headers['User-Agent']
)
##########################################################################################################################################################
@classmethod
def get_cookie_string(cls, url, **kwargs):
"""
Convenience function for building a Cookie HTTP header value.
"""
tokens, user_agent = cls.get_tokens(url, **kwargs)
return '; '.join('='.join(pair) for pair in tokens.items()), user_agent
##########################################################################################################################################################
create_scraper = CloudScraper.create_scraper
get_tokens = CloudScraper.get_tokens
get_cookie_string = CloudScraper.get_cookie_string
@@ -0,0 +1,89 @@
import re
import sys
import logging
import abc
if sys.version_info >= (3, 4):
ABC = abc.ABC # noqa
else:
ABC = abc.ABCMeta('ABC', (), {})
##########################################################################################################################################################
BUG_REPORT = 'Cloudflare may have changed their technique, or there may be a bug in the script.'
##########################################################################################################################################################
interpreters = {}
class JavaScriptInterpreter(ABC):
@abc.abstractmethod
def __init__(self, name):
interpreters[name] = self
@classmethod
def dynamicImport(cls, name):
if name not in interpreters:
try:
__import__('{}.{}'.format(cls.__module__, name))
if not isinstance(interpreters.get(name), JavaScriptInterpreter):
raise ImportError('The interpreter was not initialized.')
except ImportError:
logging.error('Unable to load {} interpreter'.format(name))
raise
return interpreters[name]
@abc.abstractmethod
def eval(self, jsEnv, js):
pass
def solveChallenge(self, body, domain):
try:
js = re.search(
r'setTimeout\(function\(\){\s+(var s,t,o,p,b,r,e,a,k,i,n,g,f.+?\r?\n[\s\S]+?a\.value =.+?)\r?\n',
body
).group(1)
except Exception:
raise ValueError('Unable to identify Cloudflare IUAM Javascript on website. {}'.format(BUG_REPORT))
js = re.sub(r'\s{2,}', ' ', js, flags=re.MULTILINE | re.DOTALL).replace('\'; 121\'', '')
js += '\na.value;'
jsEnv = '''
function italics (str) {{ return "<i>" + this + "</i>"; }};
var document = {{
createElement: function () {{
return {{ firstChild: {{ href: "http://{domain}/" }} }}
}},
getElementById: function () {{
return {{"innerHTML": "{innerHTML}"}};
}}
}};
'''
try:
innerHTML = re.search(
r'<div(?: [^<>]*)? id="([^<>]*?)">([^<>]*?)</div>',
body,
re.MULTILINE | re.DOTALL
)
innerHTML = innerHTML.group(2) if innerHTML else ''
except: # noqa
logging.error('Error extracting Cloudflare IUAM Javascript. {}'.format(BUG_REPORT))
raise
try:
result = self.eval(
re.sub(r'\s{2,}', ' ', jsEnv.format(domain=domain, innerHTML=innerHTML), flags=re.MULTILINE | re.DOTALL),
js
)
float(result)
except Exception:
logging.error('Error executing Cloudflare IUAM Javascript. {}'.format(BUG_REPORT))
raise
return result
@@ -0,0 +1,32 @@
from __future__ import absolute_import
import js2py
import logging
import base64
from . import JavaScriptInterpreter
from .jsunfuck import jsunfuck
class ChallengeInterpreter(JavaScriptInterpreter):
def __init__(self):
super(ChallengeInterpreter, self).__init__('js2py')
def eval(self, jsEnv, js):
if js2py.eval_js('(+(+!+[]+[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+[!+[]+!+[]]+[+[]])+[])[+!+[]]') == '1':
logging.warning('WARNING - Please upgrade your js2py https://github.com/PiotrDabkowski/Js2Py, applying work around for the meantime.')
js = jsunfuck(js)
def atob(s):
return base64.b64decode('{}'.format(s)).decode('utf-8')
js2py.disable_pyimport()
context = js2py.EvalJs({'atob': atob})
result = context.eval('{}{}'.format(jsEnv, js))
return result
ChallengeInterpreter()
@@ -80,18 +80,18 @@ CONSTRUCTORS = {
'RegExp': 'Function("return/"+false+"/")()'
}
def jsunfuck(jsfuckString):
for key in sorted(MAPPING, key=lambda k: len(MAPPING[k]), reverse=True):
if MAPPING.get(key) in jsfuckString:
jsfuckString = jsfuckString.replace(MAPPING.get(key), '"{}"'.format(key))
for key in sorted(SIMPLE, key=lambda k: len(SIMPLE[k]), reverse=True):
if SIMPLE.get(key) in jsfuckString:
jsfuckString = jsfuckString.replace(SIMPLE.get(key), '{}'.format(key))
#for key in sorted(CONSTRUCTORS, key=lambda k: len(CONSTRUCTORS[k]), reverse=True):
# for key in sorted(CONSTRUCTORS, key=lambda k: len(CONSTRUCTORS[k]), reverse=True):
# if CONSTRUCTORS.get(key) in jsfuckString:
# jsfuckString = jsfuckString.replace(CONSTRUCTORS.get(key), '{}'.format(key))
return jsfuckString
return jsfuckString
@@ -0,0 +1,46 @@
import base64
import logging
import subprocess
from . import JavaScriptInterpreter
##########################################################################################################################################################
BUG_REPORT = 'Cloudflare may have changed their technique, or there may be a bug in the script.'
##########################################################################################################################################################
class ChallengeInterpreter(JavaScriptInterpreter):
def __init__(self):
super(ChallengeInterpreter, self).__init__('nodejs')
def eval(self, jsEnv, js):
try:
js = 'var atob = function(str) {return Buffer.from(str, "base64").toString("binary");};' \
'var challenge = atob("%s");' \
'var context = {atob: atob};' \
'var options = {filename: "iuam-challenge.js", timeout: 4000};' \
'var answer = require("vm").runInNewContext(challenge, context, options);' \
'process.stdout.write(String(answer));' \
% base64.b64encode('{}{}'.format(jsEnv, js).encode('UTF-8')).decode('ascii')
return subprocess.check_output(['node', '-e', js])
except OSError as e:
if e.errno == 2:
raise EnvironmentError(
'Missing Node.js runtime. Node is required and must be in the PATH (check with `node -v`). Your Node binary may be called `nodejs` rather than `node`, '
'in which case you may need to run `apt-get install nodejs-legacy` on some Debian-based systems. (Please read the cloudscraper'
' README\'s Dependencies section: https://github.com/VeNoMouS/cloudscraper#dependencies.'
)
raise
except Exception:
logging.error('Error executing Cloudflare IUAM Javascript. %s' % BUG_REPORT)
raise
pass
ChallengeInterpreter()
@@ -0,0 +1,40 @@
import os
import json
import random
import logging
from collections import OrderedDict
##########################################################################################################################################################
class User_Agent():
##########################################################################################################################################################
def __init__(self, *args, **kwargs):
self.headers = None
self.loadUserAgent(*args, **kwargs)
##########################################################################################################################################################
def loadUserAgent(self, *args, **kwargs):
browser = kwargs.pop('browser', 'chrome')
user_agents = json.load(
open(os.path.join(os.path.dirname(__file__), 'browsers.json'), 'r'),
object_pairs_hook=OrderedDict
)
if not user_agents.get(browser):
logging.error('Sorry "{}" browser User-Agent was not found.'.format(browser))
raise
user_agent = random.choice(user_agents.get(browser))
self.headers = user_agent.get('headers')
self.headers['User-Agent'] = random.choice(user_agent.get('User-Agent'))
if not kwargs.get('allow_brotli', False):
if 'br' in self.headers['Accept-Encoding']:
self.headers['Accept-Encoding'] = ','.join([encoding for encoding in self.headers['Accept-Encoding'].split(',') if encoding.strip() != 'br']).strip()
File diff suppressed because it is too large Load Diff
@@ -10,6 +10,7 @@ import logging
import requests
import xmlrpclib
import dns.resolver
import ipaddress
from requests import exceptions
from urllib3.util import connection
@@ -17,7 +18,7 @@ from retry.api import retry_call
from exceptions import APIThrottled
from dogpile.cache.api import NO_VALUE
from subliminal.cache import region
from cfscrape import CloudflareScraper
from cloudscraper import CloudScraper
try:
from urlparse import urlparse
@@ -55,39 +56,44 @@ class CertifiSession(TimeoutSession):
self.verify = pem_file
class CFSession(CloudflareScraper):
def __init__(self):
super(CFSession, self).__init__()
class CFSession(CloudScraper):
_hdrs = None
def __init__(self, *args, **kwargs):
super(CFSession, self).__init__(*args, **kwargs)
self.debug = os.environ.get("CF_DEBUG", False)
self._was_cf = False
def request(self, method, url, *args, **kwargs):
parsed_url = urlparse(url)
domain = parsed_url.netloc
cache_key = "cf_data2_%s" % domain
cache_key = "cf_data3_%s" % domain
if not self.cookies.get("__cfduid", "", domain=domain):
if not self.cookies.get("cf_clearance", "", domain=domain):
cf_data = region.get(cache_key)
if cf_data is not NO_VALUE:
cf_cookies, user_agent, hdrs = cf_data
cf_cookies, hdrs = cf_data
logger.debug("Trying to use old cf data for %s: %s", domain, cf_data)
for cookie, value in cf_cookies.iteritems():
self.cookies.set(cookie, value, domain=domain)
self._hdrs = hdrs
self._ua = user_agent
self.headers['User-Agent'] = self._ua
self.headers = hdrs
ret = super(CFSession, self).request(method, url, *args, **kwargs)
try:
cf_data = self.get_cf_live_tokens(domain)
except:
pass
else:
if cf_data != region.get(cache_key) and cf_data[0]["__cfduid"] and cf_data[0]["cf_clearance"]:
logger.debug("Storing cf data for %s: %s", domain, cf_data)
region.set(cache_key, cf_data)
if self.was_cf_request:
self.was_cf_request = False
logger.debug("We've hit CF, seeing if we need to store previous data")
try:
cf_data = self.get_cf_live_tokens(domain)
except:
logger.debug("Couldn't get CF live tokens for re-use. Cookies: %r", self.cookies)
pass
else:
if cf_data != region.get(cache_key) and cf_data[0]["cf_clearance"]:
logger.debug("Storing cf data for %s: %s", domain, cf_data)
region.set(cache_key, cf_data)
return ret
@@ -101,11 +107,11 @@ class CFSession(CloudflareScraper):
"Unable to find Cloudflare cookies. Does the site actually have "
"Cloudflare IUAM (\"I'm Under Attack Mode\") enabled?")
return (OrderedDict([
return (OrderedDict(filter(lambda x: x[1], [
("__cfduid", self.cookies.get("__cfduid", "", domain=cookie_domain)),
("cf_clearance", self.cookies.get("cf_clearance", "", domain=cookie_domain))
]),
self._ua, self._hdrs
])),
self.headers
)
@@ -236,41 +242,47 @@ def patch_create_connection():
global _custom_resolver, _custom_resolver_ips, dns_cache
host, port = address
__custom_resolver_ips = os.environ.get("dns_resolvers", None)
try:
ipaddress.ip_address(unicode(host))
except (ipaddress.AddressValueError, ValueError):
__custom_resolver_ips = os.environ.get("dns_resolvers", None)
# resolver ips changed in the meantime?
if __custom_resolver_ips != _custom_resolver_ips:
_custom_resolver = None
_custom_resolver_ips = __custom_resolver_ips
dns_cache = {}
# resolver ips changed in the meantime?
if __custom_resolver_ips != _custom_resolver_ips:
_custom_resolver = None
_custom_resolver_ips = __custom_resolver_ips
dns_cache = {}
custom_resolver = _custom_resolver
custom_resolver = _custom_resolver
if not custom_resolver:
if _custom_resolver_ips:
logger.debug("DNS: Trying to use custom DNS resolvers: %s", _custom_resolver_ips)
custom_resolver = dns.resolver.Resolver(configure=False)
custom_resolver.lifetime = 8.0
try:
custom_resolver.nameservers = json.loads(_custom_resolver_ips)
except:
logger.debug("DNS: Couldn't load custom DNS resolvers: %s", _custom_resolver_ips)
if not custom_resolver:
if _custom_resolver_ips:
logger.debug("DNS: Trying to use custom DNS resolvers: %s", _custom_resolver_ips)
custom_resolver = dns.resolver.Resolver(configure=False)
custom_resolver.lifetime = os.environ.get("dns_resolvers_timeout", 8.0)
try:
custom_resolver.nameservers = json.loads(_custom_resolver_ips)
except:
logger.debug("DNS: Couldn't load custom DNS resolvers: %s", _custom_resolver_ips)
else:
_custom_resolver = custom_resolver
if custom_resolver:
if host in dns_cache:
ip = dns_cache[host]
logger.debug("DNS: Using %s=%s from cache", host, ip)
return _orig_create_connection((ip, port), *args, **kwargs)
else:
_custom_resolver = custom_resolver
if custom_resolver:
if host in dns_cache:
ip = dns_cache[host]
logger.debug("DNS: Using %s=%s from cache", host, ip)
else:
try:
ip = custom_resolver.query(host)[0].address
logger.debug("DNS: Resolved %s to %s using %s", host, ip, custom_resolver.nameservers)
dns_cache[host] = ip
except dns.exception.DNSException:
logger.warning("DNS: Couldn't resolve %s with DNS: %s", host, custom_resolver.nameservers)
raise
try:
ip = custom_resolver.query(host)[0].address
logger.debug("DNS: Resolved %s to %s using %s", host, ip, custom_resolver.nameservers)
dns_cache[host] = ip
return _orig_create_connection((ip, port), *args, **kwargs)
except dns.exception.DNSException:
logger.warning("DNS: Couldn't resolve %s with DNS: %s", host, custom_resolver.nameservers)
raise
logger.debug("DNS: Falling back to default DNS or IP on %s", host)
return _orig_create_connection((host, port), *args, **kwargs)
patch_create_connection._sz_patched = True
@@ -5,16 +5,11 @@ import logging
import os
import time
import inflect
import cfscrape
from random import randint
from zipfile import ZipFile
from babelfish import language_converters
from guessit import guessit
from dogpile.cache.api import NO_VALUE
from subliminal import Episode, ProviderError
from subliminal.cache import region
from subliminal.utils import sanitize_release_group
from subliminal_patch.http import RetryingCFSession
from subliminal_patch.providers import Provider
@@ -140,7 +140,7 @@ class TitloviProvider(Provider, ProviderSubtitleArchiveMixin):
def initialize(self):
self.session = RetryingCFSession()
load_verification("titlovi", self.session)
#load_verification("titlovi", self.session)
def terminate(self):
self.session.close()
@@ -181,42 +181,8 @@ class TitloviProvider(Provider, ProviderSubtitleArchiveMixin):
r = self.session.get(self.search_url, params=params, timeout=10)
r.raise_for_status()
except RequestException as e:
captcha_passed = False
if e.response.status_code == 403 and "data-sitekey" in e.response.content:
logger.info('titlovi: Solving captcha. This might take a couple of minutes, but should only '
'happen once every so often')
site_key = re.search(r'data-sitekey="(.+?)"', e.response.content).group(1)
challenge_s = re.search(r'type="hidden" name="s" value="(.+?)"', e.response.content).group(1)
challenge_ray = re.search(r'data-ray="(.+?)"', e.response.content).group(1)
if not all([site_key, challenge_s, challenge_ray]):
raise Exception("titlovi: Captcha site-key not found!")
pitcher = pitchers.get_pitcher()("titlovi", e.request.url, site_key,
user_agent=self.session.headers["User-Agent"],
cookies=self.session.cookies.get_dict(),
is_invisible=True)
result = pitcher.throw()
if not result:
raise Exception("titlovi: Couldn't solve captcha!")
s_params = {
"s": challenge_s,
"id": challenge_ray,
"g-recaptcha-response": result,
}
r = self.session.get(self.server_url + "/cdn-cgi/l/chk_captcha", params=s_params, timeout=10,
allow_redirects=False)
r.raise_for_status()
r = self.session.get(self.search_url, params=params, timeout=10)
r.raise_for_status()
store_verification("titlovi", self.session)
captcha_passed = True
if not captcha_passed:
logger.exception('RequestException %s', e)
break
logger.exception('RequestException %s', e)
break
else:
try:
soup = BeautifulSoup(r.content, 'lxml')
@@ -259,7 +225,8 @@ class TitloviProvider(Provider, ProviderSubtitleArchiveMixin):
# page link
page_link = self.server_url + sub.a.attrs['href']
# subtitle language
match = lang_re.search(sub.select_one('.lang').attrs['src'])
_lang = sub.select_one('.lang')
match = lang_re.search(_lang.attrs.get('src', _lang.attrs.get('cfsrc', '')))
if match:
try:
# decode language
@@ -2,7 +2,8 @@
OS_PLEX_USERAGENT = 'plexapp.com v9.0'
DEPENDENCY_MODULE_NAMES = ['subliminal', 'subliminal_patch', 'enzyme', 'guessit', 'subzero', 'libfilebot', 'cfscrape']
DEPENDENCY_MODULE_NAMES = ['subliminal', 'subliminal_patch', 'enzyme', 'guessit', 'subzero', 'libfilebot',
'cloudscraper']
PERSONAL_MEDIA_IDENTIFIER = "com.plexapp.agents.none"
PLUGIN_IDENTIFIER_SHORT = "subzero"
PLUGIN_IDENTIFIER = "com.plexapp.agents.%s" % PLUGIN_IDENTIFIER_SHORT
+21
View File
@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2019 VeNoMouS
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+25 -32
View File
@@ -1,7 +1,9 @@
# Sub-Zero for Plex
[![](https://img.shields.io/github/release/pannal/Sub-Zero.bundle.svg?style=flat&label=stable)](https://github.com/pannal/Sub-Zero.bundle/releases/latest)<!--[![](https://img.shields.io/github/release/pannal/Sub-Zero.bundle/all.svg?maxAge=2592000&label=testing+2.0+RC9)](https://github.com/pannal/Sub-Zero.bundle/releases)--> [![master](https://img.shields.io/badge/master-stable-green.svg?maxAge=2592000)]()
[![](https://img.shields.io/github/release/pannal/Sub-Zero.bundle.svg?style=flat&label=stable)](https://github.com/pannal/Sub-Zero.bundle/releases/latest)
[![master](https://img.shields.io/badge/master-stable-green.svg?maxAge=2592000)]()
[![Maintenance](https://img.shields.io/maintenance/yes/2019.svg)]()
[![Slack Status](https://szslack.fragstore.net/badge.svg)](https://szslack.fragstore.net)
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fpannal%2FSub-Zero.bundle.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2Fpannal%2FSub-Zero.bundle?ref=badge_shield)
<img src="https://raw.githubusercontent.com/pannal/Sub-Zero.bundle/master/Contents/Resources/subzero.gif" align="left" height="100"> <font size="5"><b>Subtitles done right!</b></font><br />
@@ -16,8 +18,12 @@ Check out **[the Sub-Zero Wiki](https://github.com/pannal/Sub-Zero.bundle/wiki)*
---
## Helping development
If you like this, buy me a beer: <br>[![Donate](https://www.paypalobjects.com/en_US/i/btn/btn_donate_LG.gif)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=G9VKR2B8PMNKG) <br>or become a Patreon starting at **1 $ / month** <br><a href="https://www.patreon.com/subzero_plex" target="_blank"><img src="http://www.wenspencer.com/wp-content/uploads/2017/02/patreon-button.png" height="42" /></a> <br>or use the OpenSubtitles Sub-Zero affiliate link to become VIP <br>**10€/year, ad-free subs, 1000 subs/day, no-cache *VIP* server**<br><a href="http://v.ht/osvip" target="_blank"><img src="https://static.opensubtitles.org/gfx/logo.gif" height="50" /></a>
If you register with an anti-captcha service and you decide to use [Anti-Captcha.com](http://getcaptchasolution.com/kkvviom7nh), you can use [this affiliate link](http://getcaptchasolution.com/kkvviom7nh) to help development.
## Introduction
#### What's Sub-Zero?
Sub-Zero is a metadata agent and interface-plugin at the same time, for the popular Plex Media Server environment.
@@ -84,43 +90,26 @@ the.vbm, mmgoodnow, Vertig0ne, thliu78, tattoomees, ostman, count_confucius, ehe
## Changelog
2.6.5.3017
2.6.5.3062
subscene, addic7ed and titlovi
- either of those providers might impose a reCAPTCHA verification. In order to use those providers, please create an account at an AntiCaptcha service (anti-captcha.com or deathbycaptcha.com), add funds, then supply your credentials/apikey in the configuration
- either of those providers might impose a reCAPTCHA verification. In order to use those providers, please create an account at an AntiCaptcha service ([anti-captcha.com](http://getcaptchasolution.com/kkvviom7nh) or [deathbycaptcha.com](http://deathbycaptcha.com)), add funds, then supply your credentials/apikey in the configuration
Changelog
- core: SRT parsing: handle (bad) ASS color tag in SRT
- core: auto extract embedded: only use one unknown sub for first language
- core: better embedded streams language detection
- core: optimizations
- core: extract embedded: fix is_unknown check
- core: don't raise exception when subtitle not found inside archive
- core: search external subtitles: fix condition
- core: better plex transcoder path detection
- core: use Log.Warn instead of Log.Warning (#619, #629, #633)
- core: also check for "plex transcoder.exe" in case of windows (fixes #619)
- core: auto extract: use mbcs encoding for paths on windows
- core: Fix issue scandir not returning the name of the file inside Docker images on ARM systems. (thanks @giejay)
- core: also clean PYTHONHOME when calling external notification app
- core: update certifi to 2019.3.9
- core: scan_video: add series/title as alternative by scanning filename itself without parent folders
- core: add generic solution for solving captchas using anti captcha services
- core: increase cache time to 180d (was: 30d)
- core: guess_matches: handle multiple title matches; fixes bazarr#403
- windows: fix compatibility issues with plex transcoder
- compat: use lowercase paths on subtitle detection
- providers: addic7ed: re-enable (using paid anti captch service)
- providers: assrt: assume undefined Chinese flavor as Simplified (chs/zho-Hans)
- providers: subscene: make it work again by bypassing cf
- providers: subscene: don't fail on missing cover
- providers: titlovi: re-enable (might need paid anti captch service)
- providers: opensubtitles: fix only_foreign handling
- providers: opensubtitles: show subtitles with possibly mismatched series when manually listing subs
- menu: list subtitles: show subtitles with bad season/episode values as well
- refiners: omdb: fix imdb ids with spaces
- core: cf: optimize
- core: http: don't query DNS with IPs. thanks @fgump (fixes sonarr/radarr)
2.6.5.3041
Changelog
- core: only reference guessed title if there actually is one
- core: cf: optimize
- core/config: add setting for one existing language to be enough, fixes #491
- core/compat: dns: support nameservers via ENV[dns_resolvers]; don't fall back to default DNS when configured custom DNS failed
- providers: titlovi: prevent repeated captcha solving for CF
[older changes](CHANGELOG.md)
@@ -128,3 +117,7 @@ Changelog
Subtitles provided by [OpenSubtitles.org](http://www.opensubtitles.org/), [Podnapisi.NET](https://www.podnapisi.net/), [TVSubtitles.net](http://www.tvsubtitles.net/), [Addic7ed.com](http://www.addic7ed.com/), [Legendas TV](http://legendas.tv/), [Napi Projekt](http://www.napiprojekt.pl/), [Shooter](http://shooter.cn/), [Titlovi](http://titlovi.com), [aRGENTeaM](http://argenteam.net), [SubScene](https://subscene.com/), [Hosszupuska](http://hosszupuskasub.com/)
[3rd party licenses](https://github.com/pannal/Sub-Zero.bundle/tree/master/Licenses)
## License
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fpannal%2FSub-Zero.bundle.svg?type=large)](https://app.fossa.io/projects/git%2Bgithub.com%2Fpannal%2FSub-Zero.bundle?ref=badge_large)