-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix some merge bugs #757
Fix some merge bugs #757
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These seem like good things to catch! I had a different option for the first, up to you. But the second I think we want something a bit different?
bin/merge_sces.R
Outdated
purrr::map(\(sce) "adt" %in% altExpNames(sce)) |> | ||
unlist() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit more descriptive, I think?
purrr::map(\(sce) "adt" %in% altExpNames(sce)) |> | |
unlist() | |
purrr::map_lgl(\(sce) "adt" %in% altExpNames(sce)) |
merge.nf
Outdated
libraries_ch | ||
.filter{!(it.library_id in merge_libaries.getVal())} | ||
.subscribe{ | ||
log.warn("Processed files do not exist for ${it.library_id}. This library will not be included in the merged object.") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing the indentation, but I think the message here is a bit confusing, as there are processed files for multiplexed data, which will be listed again here, along with all of the spatial an bulk libraries.
libraries_ch | |
.filter{!(it.library_id in merge_libaries.getVal())} | |
.subscribe{ | |
log.warn("Processed files do not exist for ${it.library_id}. This library will not be included in the merged object.") | |
} | |
libraries_ch | |
.filter{!(it.library_id in merge_libaries.getVal())} | |
.subscribe{ | |
log.warn("Processed files do not exist for ${it.library_id}. This library will not be included in the merged object.") | |
} |
If you want to look for files without processed samples, I think you would have to do that more directly, maybe with a duplicate filter to the one below looking at the files themselves? Something like:
filtered_libraries_ch.single_sample
.map{
it.library_id,
file("${params.results_dir}/${it.project_id}/${it.sample_id}/${it.library_id}_processed.rds")
]}
.filter{!(it[1].exists() && it[1].size() > 0)}
.subscribe{
log.warn("Processed files do not exist for ${it[0]}. This library will not be included in the merged object.")
}
Thanks for the suggestions @jashapiro. I went ahead and incorporated both of them, I think we do want to avoid printing out multiplexed samples in the error message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Joshua Shapiro <[email protected]>
In merging the processed data for a project for the Portal, I came across a small error in the
merge_sces.R
script. R was complaining about calculating thesum()
of a list so I just added anunlist()
and that fixed that problem.The other precautionary measure I took here is to ensure that if a processed file exists but its size is 0, it's not actually included in the merging process. I can imagine that causing problems down the line if we try to read in an empty file. I also added a warning message that should print out any libraries that are not going to be included in the object.