Using Hail with dsub #265

Open
buutrg opened this issue Jul 18, 2023 · 1 comment

Comments

buutrg commented Jul 18, 2023

Hi all, I am trying to use Hail via dsub to extract a subset of variants on the All of Us server. I think this is the most relevant image I can use: https://github.com/DataBiosphere/terra-docker/tree/master/terra-jupyter-hail

But it results in an error saying that pyspark is not found. I tried to install pyspark from https://dlcdn.apache.org/spark/spark-3.1.3/spark-3.1.3-bin-hadoop3.tgz. Now it says: No FileSystem for scheme "gs".

May I ask if you have any idea how to use Hail via dsub?
Your help is really appreciated!

Contributor

wnojopra commented Jul 18, 2023

It sounds like you're running dsub on the AoU platform. From this AoU Support Article: "Within the Researcher Workbench, internet access is restricted from batch VMs. With the exception of Google APIs, VMs are unable to send or receive network traffic including files, APIs, or packages/code". This isn't specifically a dsub issue. Please reach out to AoU support for help installing pyspark in that image.
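
For readers hitting the same two errors, here is a minimal diagnostic sketch, not a confirmed fix for the AoU environment. It checks whether `pyspark` and `hail` are importable inside the container, and builds the Spark settings that normally register the `gs://` scheme with Hadoop. The two property names and class names are the standard ones for the Google Cloud Storage connector; the helper names and the jar path are hypothetical, and the sketch assumes the connector jar is already baked into the image, since batch VMs cannot download it at run time.

```python
import importlib.util

# Standard Hadoop property names that map the gs:// scheme to the
# Google Cloud Storage connector classes.
GCS_SPARK_CONF = {
    "spark.hadoop.fs.gs.impl":
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    "spark.hadoop.fs.AbstractFileSystem.gs.impl":
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS",
}


def missing_modules(names=("pyspark", "hail")):
    """Return the top-level modules from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def build_spark_conf(connector_jar):
    """Return Spark settings that register the gs:// filesystem.

    `connector_jar` is a hypothetical local path to a gcs-connector jar
    that must already exist inside the Docker image.
    """
    conf = dict(GCS_SPARK_CONF)
    conf["spark.jars"] = connector_jar
    return conf


if __name__ == "__main__":
    # Hypothetical path, shown only for illustration.
    jar = "/opt/jars/gcs-connector-hadoop3.jar"
    print("Missing modules:", missing_modules())
    print("Spark conf:", build_spark_conf(jar))
```

If both modules are present, a dictionary like this could be passed to Hail through `hl.init(spark_conf=...)`. Whether that is sufficient on the Researcher Workbench is an assumption; the image would still need pyspark and the connector jar installed at build time.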
