Add Cinemast provider #817
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,7 +16,7 @@ Changelog | |
^^^^^ | ||
**release date:** 2016-09-03 | ||
|
||
* Fix subscenter | ||
* Add Cinemast provider | ||
|
||
|
||
2.0.3 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,231 @@ | ||
# -*- coding: utf-8 -*- | ||
import bisect | ||
import io | ||
import logging | ||
import zipfile | ||
|
||
from babelfish import Language | ||
from guessit import guessit | ||
from requests import Session | ||
|
||
from . import Provider | ||
from .. import __short_version__ | ||
from ..exceptions import AuthenticationError, ConfigurationError, ProviderError | ||
from ..subtitle import Subtitle, fix_line_ending, guess_matches | ||
from ..utils import sanitize | ||
from ..video import Episode, Movie | ||
|
||
logger = logging.getLogger(__name__) | ||
|
||
|
||
class CinemastSubtitle(Subtitle): | ||
"""Cinemast Subtitle.""" | ||
provider_name = 'cinemast' | ||
|
||
def __init__(self, language, page_link, series, season, episode, title, subtitle_id, subtitle_key, | ||
releases): | ||
super(CinemastSubtitle, self).__init__(language, page_link=page_link) | ||
self.series = series | ||
self.season = season | ||
self.episode = episode | ||
self.title = title | ||
self.subtitle_id = subtitle_id | ||
self.subtitle_key = subtitle_key | ||
self.downloaded = 0 | ||
self.releases = releases | ||
|
||
@property | ||
def id(self): | ||
return str(self.subtitle_id) | ||
|
||
def get_matches(self, video): | ||
matches = set() | ||
|
||
# episode | ||
if isinstance(video, Episode): | ||
# series | ||
if video.series and (sanitize(self.title) in ( | ||
sanitize(name) for name in [video.series] + video.alternative_series)): | ||
matches.add('series') | ||
# season | ||
if video.season and self.season == video.season: | ||
matches.add('season') | ||
# episode | ||
if video.episode and self.episode == video.episode: | ||
matches.add('episode') | ||
# guess | ||
for release in self.releases: | ||
matches |= guess_matches(video, guessit(release, {'type': 'episode'})) | ||
# movie | ||
elif isinstance(video, Movie): | ||
# guess | ||
for release in self.releases: | ||
matches |= guess_matches(video, guessit(release, {'type': 'movie'})) | ||
Review comment: Same here |
||
|
||
# title | ||
if video.title and (sanitize(self.title) in ( | ||
sanitize(name) for name in [video.title] + video.alternative_titles)): | ||
matches.add('title') | ||
|
||
return matches | ||
|
||
|
||
class CinemastProvider(Provider): | ||
"""Cinemast Provider.""" | ||
languages = {Language.fromalpha2(l) for l in ['he']} | ||
server_url = 'http://www.cinemast.org/he/cinemast/api/' | ||
subtitle_class = CinemastSubtitle | ||
|
||
default_username = '[email protected]' | ||
default_password = 'subliminal' | ||
|
||
def __init__(self, username=None, password=None): | ||
if any((username, password)) and not all((username, password)): | ||
raise ConfigurationError('Username and password must be specified') | ||
|
||
self.session = None | ||
self.username = username or self.default_username | ||
self.password = password or self.default_password | ||
Review comment: Why is there a default user?
Review comment: The API is accessible only to registered users.
Review comment: In that case, don't provide a default user; use this one only for testing and require USER and PASSWORD to be provided in the CLI.
Review comment: I really don't think this is necessary for this provider, since it's not really private.
Review comment: If or when they introduce a limited number of downloads per user, this is not going to work anymore. Moreover, this is not a widely used provider, given the single language it supports. I don't think people would mind entering a user and password on their command line. A website generates ad revenue; an API does not, so maybe the API restriction is intentional.
Review comment: OK, removed. |
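The outcome of this thread (no built-in account, both credentials supplied by the caller) might look roughly like the sketch below. `ConfigurationError` is redefined locally as a stand-in for the exception imported from `..exceptions` so that the snippet runs on its own, and `CredentialsOnlyProvider` is a hypothetical name, not the class from this PR.

```python
class ConfigurationError(Exception):
    """Stand-in for the ConfigurationError imported in the diff (assumption: same role)."""


class CredentialsOnlyProvider(object):
    """Hypothetical provider skeleton without a default account."""

    def __init__(self, username=None, password=None):
        # Reject missing or partial credentials instead of falling back to a shared login.
        if not all((username, password)):
            raise ConfigurationError('Username and password must be specified')

        self.username = username
        self.password = password
        self.session = None
        self.user_id = None
        self.token = None
```

With this shape, the later `if self.username and self.password:` guard in `initialize` would no longer be needed, since construction already guarantees both values.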
||
self.user_id = None | ||
self.token = None | ||
self.session = None | ||
|
||
def initialize(self): | ||
self.session = Session() | ||
self.session.headers['User-Agent'] = 'Subliminal/{}'.format(__short_version__) | ||
|
||
# login | ||
if self.username and self.password: | ||
Review comment: This will always evaluate to
Review comment: Verification is for user input (in case
Review comment: @ofir123
Review comment: Done! |
||
logger.debug('Logging in') | ||
url = self.server_url + 'login/' | ||
|
||
# actual login | ||
data = {'username': self.username, 'password': self.password} | ||
r = self.session.post(url, data=data, allow_redirects=False, timeout=10) | ||
|
||
if r.status_code != 200: | ||
raise AuthenticationError(self.username) | ||
|
||
try: | ||
result = r.json() | ||
if 'token' not in result: | ||
raise AuthenticationError(self.username) | ||
|
||
logger.info('Logged in') | ||
self.user_id = r.json().get('user') | ||
self.token = r.json().get('token') | ||
except ValueError: | ||
raise AuthenticationError(self.username) | ||
|
||
def terminate(self): | ||
# logout | ||
if self.token or self.user_id: | ||
logger.info('Logged out') | ||
self.token = None | ||
self.user_id = None | ||
|
||
self.session.close() | ||
|
||
def query(self, title, season=None, episode=None, year=None): | ||
query = { | ||
'q': title, | ||
'user': self.user_id, | ||
'token': self.token | ||
} | ||
|
||
# episode | ||
if season and episode: | ||
query['type'] = 'series' | ||
query['season'] = season | ||
query['episode'] = episode | ||
else: | ||
query['type'] = 'movies' | ||
if year: | ||
query['year_start'] = year - 1 | ||
query['year_end'] = year | ||
|
||
# get the list of subtitles | ||
logger.debug('Getting the list of subtitles') | ||
url = self.server_url + 'search/' | ||
r = self.session.post(url, data=query) | ||
r.raise_for_status() | ||
|
||
try: | ||
results = r.json() | ||
except ValueError: | ||
return {} | ||
|
||
# loop over results | ||
subtitles = {} | ||
for group_data in results.get('data', []): | ||
# create page link | ||
slug_name = group_data.get('name_en').lower().replace(' ', '-').replace('\'', '').replace('"', '') | ||
Review comment: What about other characters, e.g. dots, semicolons?
Review comment: Not really..
Review comment: Make this a function so it doesn't look too long. |
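A minimal sketch of what moving the slug logic into a function could look like. `slugify_name` is a hypothetical helper, and the rule that every other run of punctuation collapses to a single hyphen is an assumption: the thread does not establish how the site builds its slugs for names containing dots or semicolons.

```python
import re


def slugify_name(name):
    """Hypothetical helper: lowercase, drop quotes, turn other punctuation and spaces into hyphens."""
    if not name:
        # Guard against a missing 'name_en' value instead of failing on None.lower().
        return ''
    # Remove straight quotes outright, as the chained replace() calls in the diff do.
    cleaned = name.lower().replace("'", '').replace('"', '')
    # Assumption: any remaining run of non-alphanumeric characters becomes one hyphen.
    return re.sub(r'[^a-z0-9]+', '-', cleaned).strip('-')
```

Called as `slugify_name(group_data.get('name_en'))`, this would also keep the line in `query` short.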
||
if query['type'] == 'series': | ||
page_link = self.server_url + 'subtitle/series/{}/{}/{}/'.format(slug_name, season, episode) | ||
Review comment: urlencode is required here due to the above comment.
Review comment: Done!
Review comment: @Diaoul urlencode is from urllib, right? Requests already does the encoding, so it's not needed. Can you confirm? No other provider uses urlencode.
Review comment: For |
||
else: | ||
page_link = self.server_url + 'subtitle/movie/{}/'.format(slug_name) | ||
Review comment: Same here
Review comment: Done! |
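The encoding question raised on both page links can be illustrated as follows. Both links are assembled by string formatting rather than passed to Requests as `params`, so one option is to percent-encode the slug explicitly. This is a sketch, not the change that was merged: `build_page_link` is a hypothetical helper, and the import is the Python 3 location of `quote` (on Python 2 it lives in `urllib`).

```python
from urllib.parse import quote

SERVER_URL = 'http://www.cinemast.org/he/cinemast/api/'  # copied from the diff


def build_page_link(slug_name, season=None, episode=None):
    """Hypothetical helper building the page link with a percent-encoded slug."""
    # safe='' also encodes '/', so a slug can never introduce extra path segments.
    encoded = quote(slug_name, safe='')
    if season is not None and episode is not None:
        return SERVER_URL + 'subtitle/series/{}/{}/{}/'.format(encoded, season, episode)
    return SERVER_URL + 'subtitle/movie/{}/'.format(encoded)
```

A slug made only of letters, digits and hyphens passes through unchanged, so the encoding matters only for the leftover characters discussed above.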
||
|
||
# go over each language | ||
for language_code, subtitles_data in group_data.get('subtitles', {}).items(): | ||
for subtitle_item in subtitles_data: | ||
# read the item | ||
language = Language.fromalpha2(language_code) | ||
subtitle_id = subtitle_item['id'] | ||
subtitle_key = subtitle_item['key'] | ||
release = subtitle_item['version'] | ||
|
||
# add the release and increment downloaded count if we already have the subtitle | ||
if subtitle_id in subtitles: | ||
logger.debug('Found additional release %r for subtitle %r', release, subtitle_id) | ||
bisect.insort_left(subtitles[subtitle_id].releases, release) # deterministic order | ||
subtitles[subtitle_id].downloaded += 1 | ||
Review comment: Why do you increase the downloaded count? Isn't that the same as
Review comment: Done! |
||
continue | ||
|
||
# otherwise create it | ||
subtitle = self.subtitle_class(language, page_link, title, season, episode, title, subtitle_id, | ||
subtitle_key, [release]) | ||
Review comment: 1 release = 1 instance |
||
logger.debug('Found subtitle %r', subtitle) | ||
subtitles[subtitle_id] = subtitle | ||
|
||
return subtitles.values() | ||
|
||
def list_subtitles(self, video, languages): | ||
season = episode = None | ||
|
||
if isinstance(video, Episode): | ||
titles = [video.series] + video.alternative_series | ||
season = video.season | ||
episode = video.episode | ||
else: | ||
titles = [video.title] + video.alternative_titles | ||
|
||
for title in titles: | ||
Review comment: As I said before, this is going to flood the servers with many useless requests, sometimes returning duplicated results. I'm really not a big fan of this.
Review comment: @Diaoul It will stop at the first result found, and dogpile caches it, so next time it won't hit the server. |
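To make the trade-off in this exchange concrete, the sketch below counts how many queries the "stop at the first result" loop issues. `first_non_empty` is a hypothetical helper and the `query` argument is any callable standing in for `CinemastProvider.query`; caching is not modeled.

```python
def first_non_empty(titles, query):
    """Return the results for the first title that yields any, plus the number of queries made."""
    calls = 0
    for title in titles:
        calls += 1
        results = query(title)
        if results:
            # Stop as soon as one title returns something.
            return results, calls
    # Nothing matched: every title cost one request.
    return [], calls


# Illustrative stand-in for the remote search; the catalog content is made up.
catalog = {'Alt Name': ['sub-1']}
results, calls = first_non_empty(['Main', 'Alt Name', 'Third'], lambda t: catalog.get(t, []))
```

Both remarks hold under this model: the loop exits early on a hit, but a video with no subtitles still costs one request per alternative title.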
||
subtitles = [s for s in self.query(title, season, episode) if s.language in languages] | ||
if subtitles: | ||
return subtitles | ||
|
||
return [] | ||
|
||
def download_subtitle(self, subtitle): | ||
# download | ||
url = self.server_url + 'subtitle/download/{}/'.format(subtitle.language.alpha2) | ||
params = { | ||
'v': subtitle.releases[0], | ||
Review comment: Does the result change if you take another release? |
||
'key': subtitle.subtitle_key, | ||
'sub_id': subtitle.subtitle_id | ||
} | ||
data = { | ||
'user': self.user_id, | ||
'token': self.token | ||
} | ||
r = self.session.post(url, data=data, params=params, timeout=10) | ||
r.raise_for_status() | ||
|
||
# open the zip | ||
with zipfile.ZipFile(io.BytesIO(r.content)) as zf: | ||
# remove some filenames from the namelist | ||
namelist = [n for n in zf.namelist() if not n.endswith('.txt')] | ||
if len(namelist) > 1: | ||
raise ProviderError('More than one file to unzip') | ||
|
||
subtitle.content = fix_line_ending(zf.read(namelist[0])) |
Review comment: This could give the subtitle an unrealistically good score, since the matches are the union of all releases instead of those of the best one.
Review comment: Not sure I understand what you mean. If the subtitle matches multiple releases, I want to check all of them. Should I change it so each subtitle will match only one release?
Review comment: I wonder whether in that case we should create a single Subtitle instance for each release, even if it is the same actual subtitle. For now, if it does the job, leave it as is, but that's a thought for the long term.
@pannal: what about harmonizing the Subtitle object so that it's not provider-specific but rather "subliminal"-specific? This would avoid you patching all the subtitle classes to change the guessing rules. Not sure about the feasibility, though.
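The scoring concern in this thread can be shown with a small, self-contained sketch. The match sets and weights below are made up for illustration and are not subliminal's real scores; `union_matches` mirrors the `matches |= guess_matches(...)` loop in `get_matches`, while `best_release_matches` is a hypothetical alternative that keeps only the strongest single release.

```python
def union_matches(release_matches):
    """Union of the match sets of every release, as the get_matches loop accumulates them."""
    matches = set()
    for release_set in release_matches:
        matches |= release_set
    return matches


def best_release_matches(release_matches, weights):
    """Match set of the single highest-scoring release (hypothetical alternative)."""
    return max(release_matches, key=lambda s: sum(weights[m] for m in s), default=set())


# Made-up weights and per-release matches, for illustration only.
weights = {'resolution': 2, 'format': 7, 'release_group': 15}
per_release = [{'release_group'}, {'resolution', 'format'}]

union_score = sum(weights[m] for m in union_matches(per_release))
best_score = sum(weights[m] for m in best_release_matches(per_release, weights))
```

Here no single release matches on all three properties, yet the union is credited with all of them, so `union_score` exceeds `best_score`. Creating one Subtitle instance per release, as suggested above, would avoid the gap by construction.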