Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New logic for STAGE_INPUT valorization #70

Merged
merged 6 commits into from
Jul 18, 2024

Conversation

cosimolupo
Copy link
Contributor

This is a solution to issue #68, where value assigned to key STAGE_INPUT shows circular dependencies on the user-defined list of stages present in the top-level config file.

@cosimolupo cosimolupo requested review from rgutzen and gulpgiulia June 19, 2024 08:23
@cosimolupo cosimolupo added this to the v0.2.0 milestone Jun 19, 2024
@cosimolupo cosimolupo self-assigned this Jun 19, 2024
stage_idx = locate_str_in_list(config_dict['STAGES'], stage)
if stage_idx is None:
stage_idx_local = locate_str_in_list(config_dict['STAGES'], stage)
stage_idx_global = locate_str_in_list([v for k,v in stages.items()], stage)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stage_idx_global = locate_str_in_list([v for k,v in stages.items()], stage)
is_first_stage = locate_str_in_list(list(stages.values()), stage) == 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though the definition of the current stage index in the "global" (i.e. the one present in ~/.cobrawap/config file) list of stages is now become no longer useful, and hence left as a comment for potential upcoming changes, still the is_first_stage boolean should refer to the custom-defined list of stages. However, stage_idx_local already does the job, being equal to 0 or not, so introducing a further boolean variable for storing such info looks useless.

@@ -323,35 +323,36 @@ def run_stage(profile=None, stage=None, extra_args=None, **kwargs):
# lookup stage input file
pipeline_config_path = config_path / 'configs' / 'config.yaml'
config_dict = load_config_file(pipeline_config_path)
stage_idx = locate_str_in_list(config_dict['STAGES'], stage)
if stage_idx is None:
stage_idx_local = locate_str_in_list(config_dict['STAGES'], stage)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep the variable name as stage_idx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's important to differentiate among the stage index in the custom-defined list of stages, and the larger one present in the ~/.cobrawap/config file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only one stage_idx in use here (the "local" one), so there is no need to differentiate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but using "talking" names for variables helps... We can use "stage_idx" as it used to be, or "user_stage_idx", or any other meaningful suggestion...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's stick with stage_idx for now.
We may introduce specifications of different stage numeration in the future when the need arises.

cobrawap/__main__.py Outdated Show resolved Hide resolved
config_name=prev_config_name)
stage_input = output_path / profile / prev_stage / output_name
if stage_idx_local>0:
prev_stage = [v for k,v in stages.items()][stage_idx_local-1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert to the previous logic here.
This line tries to access the global cobrawap config with the index of the pipeline config!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a leftover bug from previous changes in this PR. Thanks for having pointed it out. Fixed through the last commit.

@mdenker
Copy link
Member

mdenker commented Jul 3, 2024

@rgutzen Please recheck the latest changes.

@@ -323,35 +323,36 @@ def run_stage(profile=None, stage=None, extra_args=None, **kwargs):
# lookup stage input file
pipeline_config_path = config_path / 'configs' / 'config.yaml'
config_dict = load_config_file(pipeline_config_path)
stage_idx = locate_str_in_list(config_dict['STAGES'], stage)
if stage_idx is None:
stage_idx_local = locate_str_in_list(config_dict['STAGES'], stage)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's stick with stage_idx for now.
We may introduce specifications of different stage numeration in the future when the need arises.

config_name=prev_config_name)
stage_input = output_path / profile / prev_stage / output_name

if stage_idx_local>0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if stage_idx_local>0:
if stage_idx_local > 0:

Comment on lines 16 to 20
logger.warning('No STAGE_INPUT defined for running the stage individually! '
'You can set it via the command line with '
'`--config STAGE_INPUT=/path/to/file`.')
if config['STAGE_NAME']!=get_setting('stages')['1']:
# logger.warning()
raise InputError('No STAGE_INPUT defined for running stage \'{}\' individually! '.format(config['STAGE_NAME']) +
'You can set it via the command line with ' +
'`--config STAGE_INPUT=/path/to/file`.')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be kept as a logger.warning. The stage itself will raise an error when there is a missing input with potentially additional information, that won't be shown if an error is raised before.

cobrawap/pipeline/utils/Snakefile Outdated Show resolved Hide resolved
cobrawap/pipeline/utils/Snakefile Outdated Show resolved Hide resolved
@gulpgiulia
Copy link
Collaborator

I also suggest to consider to send an alert to the user any time (as in this case) when the user's settings have been overwritten, something like: logger.warning(Cobrawap is running from first stage - ‘STAGE_INPUT forced to Nan. Your user settings have been overwritten)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants