Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

self.work() exception clause being raised to threading.py instead of being handled #1

Open
dale-wahl opened this issue Nov 3, 2021 · 1 comment

Comments

@dale-wahl
Copy link
Member

dale-wahl commented Nov 3, 2021

Theory: custom exceptions not being caught properly in the program itself? If that's the case, then is it a dmi-scheduler issue? The real problem is that any sort of logging is not handled and the job is not properly closed out by the dmi-scheduler.

Possible solution: at the very least we could ensure that our base logger also handles the root log so that logs are not just send to stderr/stdout and stored in the log file. This would not resolve the error handling/job closure.

Two examples where exceptions are being raised to _bootstrap_inner in threading.py.

Exception in thread crawler:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/middleman-python/dmi_scheduler/worker.py", line 57, in run
    self.work()
  File "/opt/middleman-python/workers/crawl.py", line 46, in work
    network_pending.iterate()
  File "/opt/middleman-python/lib/network.py", line 222, in iterate
    new_corpus.add_LRUs(starting_points)
  File "/opt/middleman-python/lib/corpus.py", line 283, in add_LRUs
    self.add_URLs([hyphe.LRU_to_URL(lru) for lru in lrus])
  File "/opt/middleman-python/lib/corpus.py", line 241, in add_URLs
    self.add_URL(url)
  File "/opt/middleman-python/lib/corpus.py", line 255, in add_URL
    response = hyphe.call("store.declare_webentity_by_lrus_as_urls", [url], None, None, [], False, self.corpus_id)
  File "/opt/middleman-python/lib/hyphe.py", line 90, in call
    raise HypheException("Unexpected response format\n\n%s" % repr(response))
lib.exceptions.HypheException: Unexpected response format

{'jsonrpc': '2.0', 'id': '0dbec3f9-ed8e-4afc-a262-90057e3a6cb3', 'error': {'message': '', 'code': 8002, 'data': ''}}

or

Exception in thread crawler:                                                                                                                                                               
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner                                                                                                                  
  self.run()
  File "/opt/middleman-python/dmi_scheduler/worker.py", line 57, in run                                                                                                                        
  self.work()
File "/opt/middleman-python/workers/crawl.py", line 46, in work
  network_pending.iterate()
File "/opt/middleman-python/lib/network.py", line 216, in iterate
  new_corpus = lib.corpus.Corpus.create(self.get_corpus_ID(next_iteration), self.local_db, self.remote_db, detail_level, self.depth)
File "/opt/middleman-python/lib/corpus.py", line 656, in create
  raise CorpusException("Could not create corpus %s: %s" % (name, repr(created)))
lib.exceptions.CorpusException: Could not create corpus ic-2-356267: {'message': 'There already is a corpus named "ic-2-356267".', 'code': 'fail'}
@dale-wahl
Copy link
Member Author

dale-wahl commented Nov 3, 2021

In middleman-python, I changed our base exception class from

class MiddlemanException(BaseException):
	pass

to

class MiddlemanException(Exception):
	pass

and a CorpusException error was caught.

2021-11-03 12:00:55,960 [ERROR]  Worker crawler raised exception CorpusException and will abort: Could not create corpus ic-2-356269: {'message': 'There already is a corpus named "ic-2-356269".', 'code': 'fail'} at worker.py:57->crawl.py:46->network.py:216->corpus.py:656

But it has always been BaseException so there must be another reason. At some point I switched self.log.error() to self.log.exception() in the Exception block after self.run() from the dmi-scheduler worker and then back again. Perhaps that was the culprit. I switched back to BaseException to test that, but it failed to catch the error again (while using self.log.error() per normal).

This seems fixed, but I do not know why it started to begin with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant