Add DeepSpeed easyblock #3450

Open
wants to merge 15 commits into
base: develop
61 changes: 61 additions & 0 deletions easybuild/easyblocks/d/deepspeed.py
@@ -0,0 +1,61 @@
from easybuild.easyblocks.generic.pythonpackage import PythonPackage
from easybuild.tools.build_log import EasyBuildError
from easybuild.tools.config import build_option
import easybuild.tools.environment as env


class EB_DeepSpeed(PythonPackage):
    """Custom easyblock for DeepSpeed"""

    @staticmethod
    def extra_options():
        """Change some defaults for easyconfig parameters."""
        extra_vars = PythonPackage.extra_options()
        extra_vars['use_pip'][0] = True
        extra_vars['download_dep_fail'][0] = True
        extra_vars['sanity_pip_check'][0] = True
        return extra_vars

    def __init__(self, *args, **kwargs):
        """Initialize DeepSpeed easyblock."""
        super().__init__(*args, **kwargs)

        dep_names = set(dep['name'] for dep in self.cfg.dependencies())

        # require that PyTorch is listed as dependency
        if 'PyTorch' not in dep_names:
            raise EasyBuildError('PyTorch not found as a dependency')

Check failure on line 27 in easybuild/easyblocks/d/deepspeed.py

GitHub Actions / build (3.10, Lmod-7.8.22, Lua)

PyTorch not found as a dependency
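
The check failure above points at the `raise` inside `__init__`. One possible explanation, assuming the CI suite instantiates every easyblock with a minimal easyconfig that lists no dependencies, is that the check simply runs too early. A minimal sketch of deferring it to the configure step follows. Both classes are hypothetical stand-ins written for this illustration and are not the EasyBuild API.

```python
class EasyBuildError(Exception):
    """Stand-in for easybuild.tools.build_log.EasyBuildError (sketch only)."""


class SketchDeepSpeedBlock:
    """Hypothetical, simplified stand-in for the easyblock in this PR."""

    def __init__(self, dep_names):
        # Only record facts here and never raise, so that a bare
        # instantiation without any dependencies still succeeds.
        self.dep_names = set(dep_names)
        self.with_cuda = 'CUDA' in self.dep_names

    def configure_step(self):
        # Validate dependencies at the point where the build actually starts.
        if 'PyTorch' not in self.dep_names:
            raise EasyBuildError('PyTorch not found as a dependency')
        return 'configured'
```

With this shape, constructing the object with an empty dependency list works, and the error only surfaces once `configure_step()` is called. Whether that matches what the maintainers want for this easyblock is a separate question.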


        # enable building with GPU support if CUDA is included as dependency
        self.with_cuda = 'CUDA' in dep_names

    @property
    def cuda_compute_capabilities(self):
        """CUDA compute capabilities from the easyconfig, falling back to the EasyBuild build option."""
        return self.cfg['cuda_compute_capabilities'] or build_option('cuda_compute_capabilities')

    def configure_step(self):
        """Set up DeepSpeed config"""

        if self.with_cuda:
            # https://github.com/microsoft/DeepSpeed/issues/3358
            env.setvar('NVCC_PREPEND_FLAGS', '--forward-unknown-opts')

            if self.cuda_compute_capabilities:
                # specify CUDA compute capabilities via $TORCH_CUDA_ARCH_LIST
                env.setvar('TORCH_CUDA_ARCH_LIST', ';'.join(self.cuda_compute_capabilities))

        # By default prebuild all ops with a few exceptions
        # http://www.deepspeed.ai/tutorials/advanced-install/#pre-install-deepspeed-ops
        # > DeepSpeed will only install any ops that are compatible with your machine
        env.setvar('DS_BUILD_OPS', '1')

        # These have bothersome dependencies
        env.setvar('DS_BUILD_SPARSE_ATTN', '0')  # requires PyTorch<2.0, triton==1.0.0
        env.setvar('DS_BUILD_EVOFORMER_ATTN', '0')  # requires PyTorch<2.0, triton==1.0.0
        env.setvar('DS_BUILD_CUTLASS_OPS', '0')  # requires dskernels
        env.setvar('DS_BUILD_RAGGED_DEVICE_OPS', '0')  # requires dskernels

        super().configure_step()
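
To make the effect of `configure_step` easier to inspect, the environment it sets up can be sketched as a pure function that returns a dict instead of calling `env.setvar`. The function name `deepspeed_build_env` is hypothetical. The sketch assumes the compute-capability check is nested under the CUDA branch, and it uses `DS_BUILD_OPS`, the variable name given on the DeepSpeed advanced-install page linked in the comment.

```python
def deepspeed_build_env(with_cuda, cuda_compute_capabilities=None):
    """Return the environment variables the configure step would set (sketch)."""
    env_vars = {}

    if with_cuda:
        # Workaround referenced in https://github.com/microsoft/DeepSpeed/issues/3358
        env_vars['NVCC_PREPEND_FLAGS'] = '--forward-unknown-opts'

        if cuda_compute_capabilities:
            # PyTorch expects a semicolon-separated list, e.g. "7.0;8.0"
            env_vars['TORCH_CUDA_ARCH_LIST'] = ';'.join(cuda_compute_capabilities)

    # Prebuild all ops by default ...
    env_vars['DS_BUILD_OPS'] = '1'

    # ... except the ones the PR disables because of their dependencies.
    for op in ('SPARSE_ATTN', 'EVOFORMER_ATTN', 'CUTLASS_OPS', 'RAGGED_DEVICE_OPS'):
        env_vars['DS_BUILD_%s' % op] = '0'

    return env_vars
```

For example, `deepspeed_build_env(True, ['7.0', '8.0'])['TORCH_CUDA_ARCH_LIST']` evaluates to `'7.0;8.0'`, while a CPU-only call such as `deepspeed_build_env(False)` contains neither `NVCC_PREPEND_FLAGS` nor `TORCH_CUDA_ARCH_LIST`.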