AssertionError: Only runnable jobs can list needed tasks #170

Open
albertz opened this issue Jan 9, 2024 · 1 comment

albertz commented Jan 9, 2024

I just got this:

...
[2024-01-09 22:54:58,572] ERROR: error: Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.uJVC0MQcnILG>                                                      
[2024-01-09 22:54:58,572] ERROR: error: Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.ugQXmf07iqFo>                                                      
[2024-01-09 22:54:58,572] ERROR: error: Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.vs5I2fwLg7W0>                                                      
...
[2024-01-09 22:54:59,027] INFO: error(51) queue(5) running(7) waiting(838)                                                                                      
Clear jobs in error state? [y/N] y
[2024-01-09 22:55:22,023] WARNING: Clearing: Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.uFPYnu3DCyrE>                                                 
[2024-01-09 22:55:22,159] INFO: Move: work/i6_core/returnn/forward/ReturnnForwardJobV2.uFPYnu3DCyrE to work/i6_core/returnn/forward/ReturnnForwardJobV2.uFPYnu3DCyrE.cleared.0002
[2024-01-09 22:55:22,244] WARNING: Clearing: Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.ronBYRCWAKNU>
[2024-01-09 22:55:22,264] INFO: Move: work/i6_core/returnn/forward/ReturnnForwardJobV2.ronBYRCWAKNU to work/i6_core/returnn/forward/ReturnnForwardJobV2.ronBYRCWAKNU.cleared.0002
...
[2024-01-09 22:55:23,374] WARNING: Clearing: Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.m77LwAXPRniE>                                                 
[2024-01-09 22:55:23,418] INFO: Move: work/i6_core/returnn/forward/ReturnnForwardJobV2.m77LwAXPRniE to work/i6_core/returnn/forward/ReturnnForwardJobV2.m77LwAXPRniE.cleared.0002
[2024-01-09 22:55:23,437] ERROR: Exception in thread <_MainThread(MainThread, started 140556651528192)>:
EXCEPTION
Traceback (most recent call last):
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/tools.py", line 311, in default_handle_exception_interrupt_main_thread.<locals>.wrapped_func
    line: return func(*args, **kwargs)
    locals:
      func = <local> <function Manager.run at 0x7fd5e3f81800>
      args = <local> (<Manager(Thread-2, initial)>,)
      kwargs = <local> {}
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 574, in Manager.run
    line: if not self.startup():
    locals:
      self = <local> <Manager(Thread-2, initial)>
      self.startup = <local> <bound method Manager.startup of <Manager(Thread-2, initial)>>
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 535, in Manager.startup
    line: maybe_clear_state(gs.STATE_ERROR, self.clear_errors_once, clear_error)
    locals:
      maybe_clear_state = <local> <function Manager.startup.<locals>.maybe_clear_state at 0x7fd577fc39c0>
      gs = <global> <module 'sisyphus.global_settings' from '/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/global_settings.py'>
      gs.STATE_ERROR = <global> 'error'
      self = <local> <Manager(Thread-2, initial)>
      self.clear_errors_once = <local> False
      clear_error = <local> <function Manager.startup.<locals>.clear_error at 0x7fd5760ce2a0>
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 532, in Manager.startup.<locals>.maybe_clear_state
    line: action()
    locals:
      action = <local> <function Manager.startup.<locals>.clear_error at 0x7fd5760ce2a0>
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 516, in Manager.startup.<locals>.clear_error
    line: self.clear_states(state=gs.STATE_ERROR)
    locals:
      self = <local> <Manager(Thread-2, initial)>
      self.clear_states = <local> <bound method Manager.clear_states of <Manager(Thread-2, initial)>>
      state = <global> 'input_missing', len = 13
      gs = <global> <module 'sisyphus.global_settings' from '/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/global_settings.py'>
      gs.STATE_ERROR = <global> 'error'
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 249, in Manager.clear_states
    line: job._sis_move()
    locals:
      job = <local> Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.m77LwAXPRniE>
      job._sis_move = <local> <bound method Job._sis_move of Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.m77LwAXPRniE>>
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/job.py", line 801, in Job._sis_move
    line: for t in self._sis_tasks():
    locals:
      t = <not found>
      self = <local> Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.m77LwAXPRniE>
      self._sis_tasks = <local> <bound method Job._sis_tasks of Job<work/i6_core/returnn/forward/ReturnnForwardJobV2.m77LwAXPRniE>>
  File "/u/zeyer/setups/combined/2021-05-31/tools/sisyphus/sisyphus/job.py", line 822, in Job._sis_tasks
    line: assert False, "Only runnable jobs can list needed tasks"
    locals:
       no locals
AssertionError: Only runnable jobs can list needed tasks
[2024-01-09 22:55:23,809] WARNING: Main thread exit. Still running non-daemon threads: {<LocalEngine(Thread-1, started 140556329707072)>}

After restarting the Sisyphus manager, it again asked whether to clear the jobs in error state. I again typed y, and this time there was no error.

critias commented Jan 11, 2024

I think what happens is that after moving the job, it tries to reset the tasks. For this it requests a list of all tasks, which fails because the job is no longer runnable at that point:
https://github.com/rwth-i6/sisyphus/blob/master/sisyphus/job.py#L801

It would probably be better to just remove the whole cache (del self._sis_task_cache) instead of trying to reset each task.
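
To illustrate the idea, here is a minimal toy sketch of the two approaches. It is not the actual Sisyphus code: ToyJob, its runnable flag, and both move_* methods are hypothetical, and only the names _sis_tasks and _sis_task_cache and the assertion message are taken from the traceback and the comment above. It assumes the job no longer counts as runnable after the move and that no task cache had been built yet.

```python
class ToyJob:
    """Hypothetical stand-in for a job; NOT the real sisyphus.job.Job."""

    def __init__(self, runnable):
        # Assumption: a plain flag instead of the real state checks.
        self.runnable = runnable

    def _sis_tasks(self):
        # Return cached tasks if present; otherwise only runnable jobs may build the list.
        if hasattr(self, "_sis_task_cache"):
            return self._sis_task_cache
        assert self.runnable, "Only runnable jobs can list needed tasks"
        self._sis_task_cache = ["run"]
        return self._sis_task_cache

    def move_resetting_each_task(self):
        # Pattern seen in the traceback: list all tasks after the move.
        self.runnable = False  # assumption: the moved job is no longer runnable
        for _task in self._sis_tasks():
            pass  # a per-task reset would happen here

    def move_dropping_cache(self):
        # Suggested alternative: discard the cache without listing any tasks.
        self.runnable = False
        # Same effect as `del self._sis_task_cache`, but safe if the cache was never set.
        self.__dict__.pop("_sis_task_cache", None)


def try_move(move):
    """Call a move method and report whether it raised AssertionError."""
    try:
        move()
    except AssertionError as exc:
        return f"failed: {exc}"
    return "ok"


results = (
    try_move(ToyJob(runnable=True).move_resetting_each_task),
    try_move(ToyJob(runnable=True).move_dropping_cache),
)
print(results)
```

In this sketch the first variant hits the assertion, while the second never calls _sis_tasks and so cannot trigger it. Note that a bare del would raise AttributeError if the cache attribute does not exist, so some guard like the pop above (my assumption, not part of the suggestion) would likely be needed.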
