Trigger
>>> from goose import Goose
>>> url = 'https://www.alienvault.com/blogs/security-essentials/11-simple-yet-important-tips-to-secure-aws'
>>> g = Goose()
>>> article = g.extract(url=url)
Traceback
Traceback (most recent call last):
    article = g.extract(url=url)
  File "/scripts/venv/lib/python2.7/site-packages/goose/__init__.py", line 56, in extract
    return self.crawl(cc)
  File "/scripts/venv/lib/python2.7/site-packages/goose/__init__.py", line 66, in crawl
    article = crawler.crawl(crawl_candiate)
  File "/scripts/venv/lib/python2.7/site-packages/goose/crawler.py", line 154, in crawl
    self.article.title = self.title_extractor.extract()
  File "/scripts/venv/lib/python2.7/site-packages/goose/extractors/title.py", line 99, in extract
    return self.get_title()
  File "/scripts/venv/lib/python2.7/site-packages/goose/extractors/title.py", line 78, in get_title
    return self.clean_title(title)
  File "/scripts/venv/lib/python2.7/site-packages/goose/extractors/title.py", line 42, in clean_title
    title = title.replace(site_name, '').strip()
TypeError: expected a string or other character buffer object
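The last frame points at str.replace being called with a site_name that is not a string. A minimal sketch of that failure outside Goose, assuming the Open Graph data holds None for site_name (the opengraph dict and the title value here are hypothetical stand-ins, not Goose objects):

```python
# Hypothetical stand-in for self.article.opengraph when site_name is None.
opengraph = {"site_name": None}
title = "11 Simple Yet Important Tips to Secure AWS"

try:
    # Same call shape as title.py line 42: replace() rejects a None argument.
    title.replace(opengraph["site_name"], "").strip()
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError: %s" % exc)
```

The exact wording of the error message differs between Python versions, but the exception type is TypeError in both Python 2.7 and Python 3.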
Fix
Check the value of site_name after it is read from self.article.opengraph. If it is None, don't modify the title.
if "site_name" in self.article.opengraph.keys():
    site_name = self.article.opengraph['site_name']
    # remove the site name from title
    if site_name:
        title = title.replace(site_name, '').strip()
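To check the guard in isolation, here is a standalone sketch of the same logic. The function name strip_site_name and its plain-dict opengraph argument are assumptions for illustration only; in Goose the logic lives in a method that reads self.article.opengraph.

```python
def strip_site_name(title, opengraph):
    """Remove the Open Graph site name from a title when one is usable.

    Hypothetical helper: `opengraph` is a plain dict standing in for
    self.article.opengraph.
    """
    site_name = opengraph.get("site_name")
    # Skip the replacement when site_name is missing, None, or empty,
    # so str.replace never receives a non-string argument.
    if site_name:
        title = title.replace(site_name, "").strip()
    return title


with_name = strip_site_name("Secure AWS AlienVault", {"site_name": "AlienVault"})
with_none = strip_site_name("Secure AWS", {"site_name": None})
without_key = strip_site_name("Secure AWS", {})
```

With this guard, a None or absent site_name leaves the title untouched instead of raising.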