Ability to run one's own HTCondor instance #3
Comments
Great question @Ferryistaken. We have been thinking about ways to make metl-sim more accessible to others. Our initial ideas were around OSG, but running HTCondor locally is another approach. If you are running HTCondor locally, there are a lot of code and data preparation steps you can skip. You do not need to upload anything to osdf/squid; it can stay on your local machine. Regarding your last question
@samgelman what other steps can @Ferryistaken skip in the setup?
To add to what @agitter said, you should be able to use a local HTCondor with modifications to the existing framework. The key files are:
Hopefully this is enough to get started, and if you encounter additional questions, I would be happy to help.
I wanted to add that if you plan to have a local install of Python and Rosetta on each execute node, then you won't need to transfer those from the submit node. You would need to modify the files I listed above, especially
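To illustrate the difference between transferring the software and relying on a local install, here is a minimal sketch that generates an HTCondor submit description as plain text. It is not taken from the metl-sim scripts. The function name `build_submit_description`, the file names (`run.sh`, `variants.txt`), the `osdf:///` paths, and the `ROSETTA_DIR` / `PYTHON_BIN` environment variables are all hypothetical placeholders. Only the submit commands themselves (`executable`, `transfer_input_files`, `environment`, `queue`, and so on) are standard HTCondor.

```python
from typing import Dict, List, Optional


def build_submit_description(
    executable: str,
    input_files: List[str],
    software_archives: Optional[List[str]] = None,
    environment: Optional[Dict[str, str]] = None,
) -> str:
    """Return the text of an HTCondor submit description.

    software_archives (e.g. Python/Rosetta tarballs) are added to
    transfer_input_files only when given. Leave it as None when the
    software is already installed on every execute node.
    """
    transfer = list(input_files) + list(software_archives or [])
    lines = [
        f"executable = {executable}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "log = job.log",
        "output = job.out",
        "error = job.err",
    ]
    if transfer:
        lines.append("transfer_input_files = " + ", ".join(transfer))
    if environment:
        # Assumes the values contain no whitespace or quotes.
        pairs = " ".join(f"{key}={value}" for key, value in environment.items())
        lines.append(f'environment = "{pairs}"')
    lines.append("queue")
    return "\n".join(lines) + "\n"


# Local installs: nothing software-related is transferred; the (hypothetical)
# wrapper script would read the install locations from the environment.
local = build_submit_description(
    executable="run.sh",
    input_files=["variants.txt"],
    environment={"ROSETTA_DIR": "/opt/rosetta", "PYTHON_BIN": "/opt/env/bin/python"},
)

# Shared pool without local installs: the archives travel with the job.
remote = build_submit_description(
    executable="run.sh",
    input_files=["variants.txt"],
    software_archives=[
        "osdf:///path/to/rosetta.tar.gz",  # placeholder path
        "osdf:///path/to/python_env.tar.gz",  # placeholder path
    ],
)

print(local)
```

In the local case the only thing that changes is that the software archives drop out of `transfer_input_files`, so the job wrapper would have to locate Python and Rosetta on the execute node instead of unpacking them.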
Yes, my current architecture involves having the Python and Rosetta installs on each execute node, and a modification to the
@Ferryistaken I would be interested in having you contribute your solution back to this repo if you get everything working. We would need to decide the best way to organize that based on how many files you modified and how much they changed.
Hello,

I've run the pipeline without HTCondor up until the processing results part (which I assume is not currently possible without running the pipeline in HTCondor unless I write a custom script that takes the non-HTCondor `energize_output` and packages it into a database understandable by `metl`).

From my understanding, it's unfeasible to generate a good enough training set without parallelizing the computation of Rosetta's energy parameters for all variants. I've set up my own HTCondor instance to which I'm able to connect a few `execute` nodes, and would like to run `metl-sim` on this cluster. The part that I don't understand is: do I really need to upload `rosetta` and `python` to osdf/squid if I'm running the algorithm only on my own machines? Or is there another way (such as adding the rosetta and python env to all `execute` nodes through my `docker-compose`)?

I might be wrong, but it seems like I would only need to upload to squid if I'm connecting to a highly distributed HTCondor cluster on which I don't have admin privileges, right?

Where in the scripts is the `osdf` python/rosetta env being accessed? Is there a workaround to skip that step and instead use a local install?
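One generic way to find where a checkout refers to osdf or squid is to scan its text files for those strings. The sketch below does only that. It is not part of metl-sim, the helper name `find_references` is hypothetical, and it makes no assumption about which files actually contain the references.

```python
import os
from typing import Iterator, Tuple


def find_references(
    root: str, needles: Tuple[str, ...] = ("osdf", "squid")
) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line) for each line mentioning a needle."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git below the root.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        lowered = line.lower()
                        if any(needle in lowered for needle in needles):
                            yield path, lineno, line.rstrip("\n")
            except (UnicodeDecodeError, OSError):
                # Binary or unreadable file: move on to the next one.
                continue


if __name__ == "__main__":
    # Run from the root of the checkout to list every matching line.
    for path, lineno, line in find_references("."):
        print(f"{path}:{lineno}: {line}")
```

The matches show which submit files and scripts would need editing to point at a local install instead.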