Slurm_collect throws 'Error in x$njobs : $ operator is invalid for atomic vectors' #40

Open
tXiao95 opened this issue Oct 12, 2023 · 2 comments

tXiao95 commented Oct 12, 2023

I am testing the slurmR package on my school's HPC cluster. Everything works well when I use Slurm_lapply with plan="none" and then call sbatch to launch the job array. However, I get the following strange error when calling Slurm_collect.

Warning: The call to -sacct- failed. This is probably due to not having slurm accounting up and running. For more information, checkout this discussion: https://github.com/USCbiostats/slurmR/issues29
Error in x$njobs : $ operator is invalid for atomic vectors

The code I run is

library(slurmR)
ans <- Slurm_lapply(1:10, sqrt, plan="none")
sbatch(ans)
result <- Slurm_collect(ans)

I understand that my cluster does not have Slurm accounting enabled, but the error seems to be unrelated to the warning. Moreover, when I enter debug mode, the x object has an njobs attribute, and retrieving it directly does not throw an error.

@samkimhis

I have the same error. When this error occurs, the jobs were still executed but the job object was not assigned, so I had to look into the tmp_path folder and manually read the job.rds file. Looking at the traceback (copied below), it appears to happen when the input for status() is just a job ID (a character string, an atomic vector), which is allowed by the current function definition. Can you update the function so that the atomic vector input can be handled properly?

> traceback()
11: sprintf("%i_%i", job_id, 1:x$njobs)
10: data.frame(JobID = sprintf("%i_%i", job_id, 1:x$njobs), State = NA_character_, 
        ExitCode = "0:0", stringsAsFactors = FALSE)
9: sacct_(x, brief = TRUE, parsable = TRUE, allocations = TRUE)
8: status.default(x)
7: status(x)
6: wait_slurm.integer(get_job_id(x), ...)
5: wait_slurm.slurm_job(x)
4: wait_slurm(x)
3: sbatch.slurm_job(ans, wait = plan$wait, submit = plan$submit)
2: sbatch(ans, wait = plan$wait, submit = plan$submit)
1: Slurm_lapply(rep(1e+06, 100), simpi, njobs = 3, mc.cores = 16, plan = "wait")
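For reference, the failure at frame 11 can be reproduced in base R without slurmR at all. The snippet below is only a sketch: `get_njobs` is a hypothetical helper and is not part of slurmR, and the fallback of 1 job for a bare job ID is an assumption made for illustration, not the package's actual behavior.

```r
# A bare job ID is an atomic vector, so `$` cannot be applied to it.
job_id <- 12345L
msg <- tryCatch(job_id$njobs, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical guard (not slurmR code): only use `$` when x is a list or
# environment that actually carries an njobs element; otherwise fall back
# to an assumed default.
get_njobs <- function(x, default = 1L) {
  if ((is.list(x) || is.environment(x)) && !is.null(x$njobs)) {
    x$njobs
  } else {
    default
  }
}

print(get_njobs(job_id))             # atomic input, falls back to the default
print(get_njobs(list(njobs = 3L)))   # list input, uses the stored value
```

A check of this kind inside the accounting fallback would let an integer or character job ID pass through without hitting `x$njobs`, although the correct number of array tasks would still have to come from somewhere else.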

Member

gvegayon commented Apr 4, 2024

I think this is a bug I have been trying to fix: #29. I don't have much time these days, so any PRs are welcome :)
