Continuing a process from PARSING phase loses retrieved temp folder #4783

chrisjsewell · 2021-02-26T11:16:10Z

Originally posted by @chrisjsewell in #4648 (comment)

@sphuber it seems here:

aiida-core/aiida/engine/processes/calcjobs/tasks.py

Lines 390 to 395 in c07e3ef

    
    elif self.data == RETRIEVE_COMMAND:
        node.set_process_status(process_status)
        # Create a temporary folder that has to be deleted by JobProcess.retrieved after successful parsing
        temp_folder = tempfile.mkdtemp()
        await self._launch_task(task_retrieve_job, node, transport_queue, temp_folder)
        result = self.parse(temp_folder)

the temp_folder is created fresh every time you reach this part of the code. So if the daemon worker is stopped during parsing, and a new daemon worker picks up the task when the daemon is restarted, it will get a new temp_folder (and the old temp_folder will not be removed), but the retrieval will be skipped and you end up with an empty temp_folder:

aiida-core/aiida/engine/processes/calcjobs/tasks.py

Lines 236 to 238 in c07e3ef

    
    if node.get_state() == CalcJobState.PARSING:
        logger.warning(f'CalcJob<{node.pk}> already marked as PARSING, skipping task_retrieve_job')
        return

so perhaps the temp_folder location should be stored on the node, e.g. something like

temp_folder = node.get_retrieved_temp_folder()
if not temp_folder or not os.path.exists(temp_folder):
    temp_folder = tempfile.mkdtemp()
    node.set_retrieved_temp_folder(temp_folder)
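
To make the idea above concrete, here is a minimal, self-contained sketch. It does not use aiida-core: `FakeNode` is a stand-in for a `CalcJobNode`, and the attribute key `retrieved_temp_folder` and the helper `get_or_create_temp_folder` are hypothetical names chosen for illustration only. The helper also reports whether the folder is new, since a fresh (empty) folder would still need its contents retrieved again.

```python
import os
import tempfile


class FakeNode:
    """Stand-in for a CalcJobNode; it only keeps attributes in memory."""

    def __init__(self):
        self._attributes = {}

    def get_attribute(self, key, default=None):
        return self._attributes.get(key, default)

    def set_attribute(self, key, value):
        self._attributes[key] = value


# Hypothetical attribute key, not an existing aiida-core name.
TEMP_FOLDER_KEY = 'retrieved_temp_folder'


def get_or_create_temp_folder(node):
    """Return ``(path, created)`` for the node's temporary retrieval folder.

    If a folder was recorded earlier and still exists on disk it is reused,
    otherwise a new one is created and recorded on the node. ``created`` is
    True when the folder is new and therefore empty.
    """
    temp_folder = node.get_attribute(TEMP_FOLDER_KEY)
    if temp_folder and os.path.isdir(temp_folder):
        return temp_folder, False
    temp_folder = tempfile.mkdtemp()
    node.set_attribute(TEMP_FOLDER_KEY, temp_folder)
    return temp_folder, True
```

With this shape, a worker that picks the task up after a restart gets the same path back as long as the folder still exists, and is told explicitly when it does not, so it knows the temporary files have to be fetched again.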

chrisjsewell · 2021-02-26T11:17:17Z

or alternatively (as noted by @sphuber) you could re-do the retrieval

This definitely used to work at some point, and without storing the temporary folder on the node. Instead, after the restart it would generate a new temp folder and retrieve again. This is because you cannot guarantee that the original temp folder still exists after the restart, so you have to accept the inefficiency of retrieving again.

chrisjsewell · 2021-02-26T11:30:58Z

@sphuber I'm really not sure this ever worked lol, because there are multiple places in the code that would stop this from being possible. For example, you would also need to change this:

aiida-core/aiida/engine/daemon/execmanager.py

Lines 352 to 360 in c07e3ef

    
    # If the calculation already has a `retrieved` folder, simply return. The retrieval was apparently already completed
    # before, which can happen if the daemon is restarted and it shuts down after retrieving but before getting the
    # chance to perform the state transition. Upon reloading this calculation, it will re-attempt the retrieval.
    link_label = calculation.link_label_retrieved
    if calculation.get_outgoing(FolderData, link_label_filter=link_label).first():
        EXEC_LOGGER.warning(
            f'CalcJobNode<{calculation.pk}> already has a `{link_label}` output folder: skipping retrieval'
        )
        return

you would need to rework this so that, if the retrieved folder already exists and retrieve_temporary_list is not empty, you only retrieve the files in retrieve_temporary_list.
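
A rough sketch of that decision logic, written as a pure function so it can be read without aiida-core. `RetrievalPlan` and `plan_retrieval` are hypothetical names, and the boolean `has_retrieved_output` stands in for the `get_outgoing(...).first()` check in the excerpt above; this is an illustration of the proposed behaviour, not the actual execmanager code.

```python
from typing import List, NamedTuple, Sequence


class RetrievalPlan(NamedTuple):
    """Which files still have to be fetched from the remote working directory."""

    permanent: List[str]  # destined for the stored `retrieved` folder
    temporary: List[str]  # destined for the temporary folder handed to the parser


def plan_retrieval(
    has_retrieved_output: bool,
    retrieve_list: Sequence[str],
    retrieve_temporary_list: Sequence[str],
) -> RetrievalPlan:
    """Decide what to retrieve when a retrieve task is (re)started.

    If the `retrieved` output already exists, the permanent files are skipped,
    but the temporary files are still fetched because the temporary folder from
    the previous attempt cannot be assumed to exist any more.
    """
    temporary = list(retrieve_temporary_list)
    if has_retrieved_output:
        return RetrievalPlan(permanent=[], temporary=temporary)
    return RetrievalPlan(permanent=list(retrieve_list), temporary=temporary)
```

An early return without any retrieval then only happens when both lists in the plan are empty, rather than whenever the `retrieved` folder exists.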

chrisjsewell added topic/daemon topic/engine type/bug labels Feb 26, 2021

chrisjsewell mentioned this issue Feb 26, 2021

Restarting the daemon excepts all jobs (aiida-core 1.6, python 3.7) #4648

Closed

chrisjsewell added this to the v1.6.1 milestone Feb 26, 2021

chrisjsewell self-assigned this Feb 26, 2021

mjclarke94 mentioned this issue Apr 14, 2021

Rerun transport/other aiida tasks where errors have occurred #4857

Open

sphuber modified the milestones: Preparations for the new repository, v2.0.0 Apr 28, 2021

chrisjsewell modified the milestones: v2.0.0, Post 2.0 May 5, 2021

sphuber mentioned this issue Sep 20, 2021

Add Parser.retrieved_temporary_folder to easily access it, like retrieved #3502

Open

sphuber removed this from the v2.3.0 milestone Dec 6, 2022
