
launch training task:

# [reference]: https://www.youtube.com/watch?v=KaAJtI1T2x4&list=PL_lsbAsL_o2CSuhUhJIiW0IkdT5C2wGWj

simple launch on one node:

python train/main.py

DDP (FSDP) launch on one node by torch.multiprocessing (e.g. 8 GPUs):

python train/main.py --dp_world_size 8 --torch_mp_launch

DDP (FSDP) launch on one node by torchrun (e.g. 8 GPUs):

torchrun --standalone --nproc_per_node=8 train/main.py

DDP (FSDP) launch on multiple nodes by torchrun (e.g. 2 * 8 GPUs, two nodes):

# on node 0
torchrun --nproc_per_node=8 --nnodes=2 --node_rank=0 --rdzv_backend=c10d --rdzv_endpoint=xxx.xxx.xxx.xxx:xxxx train/main.py

# on node 1
torchrun --nproc_per_node=8 --nnodes=2 --node_rank=1 --rdzv_backend=c10d --rdzv_endpoint=xxx.xxx.xxx.xxx:xxxx train/main.py
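
For orientation, here is a minimal sketch of how a training script can find out where it sits in the job. torchrun exports `RANK`, `LOCAL_RANK` and `WORLD_SIZE` to every worker it starts; the `DistEnv` class and `read_dist_env` helper below are hypothetical names used for illustration and are not taken from train/main.py. When the variables are absent (a plain `python train/main.py` run), the sketch falls back to a single process.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DistEnv:
    """Hypothetical container for the values torchrun exports to each worker."""
    rank: int        # global rank across all nodes
    local_rank: int  # rank within this node, commonly used as the GPU index
    world_size: int  # total number of worker processes in the job

    @property
    def is_distributed(self) -> bool:
        return self.world_size > 1


def read_dist_env(environ: Mapping[str, str] = os.environ) -> DistEnv:
    """Read RANK / LOCAL_RANK / WORLD_SIZE, defaulting to a single process."""
    return DistEnv(
        rank=int(environ.get("RANK", 0)),
        local_rank=int(environ.get("LOCAL_RANK", 0)),
        world_size=int(environ.get("WORLD_SIZE", 1)),
    )


if __name__ == "__main__":
    env = read_dist_env()
    print(f"rank {env.rank} of {env.world_size}, local rank {env.local_rank}")
```

Reading the ranks from the environment, rather than deriving them from the node index, keeps the script independent of how the rendezvous backend assigns ranks.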

launch generation task:

python gen/main.py

GPT configs:

gpt2

n_layer=12, n_head=12, n_embd=768 # 124M params

gpt2-medium

n_layer=24, n_head=16, n_embd=1024 # 350M params

gpt2-large

n_layer=36, n_head=20, n_embd=1280 # 774M params

gpt2-xl

n_layer=48, n_head=25, n_embd=1600 # 1558M params
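
The parameter counts in the comments above can be reproduced from `n_layer` and `n_embd` alone. The sketch below assumes the standard GPT-2 layout: a 50257-token vocabulary, a 1024-token context, biases on every linear layer, a 4x MLP expansion, and an output head that shares weights with the token embedding. These values are assumptions and may not match the model code in this repository; `count_params` is a hypothetical helper.

```python
VOCAB_SIZE = 50257  # assumption: GPT-2 BPE vocabulary size
BLOCK_SIZE = 1024   # assumption: GPT-2 context length

CONFIGS = {
    "gpt2": dict(n_layer=12, n_head=12, n_embd=768),
    "gpt2-medium": dict(n_layer=24, n_head=16, n_embd=1024),
    "gpt2-large": dict(n_layer=36, n_head=20, n_embd=1280),
    "gpt2-xl": dict(n_layer=48, n_head=25, n_embd=1600),
}


def count_params(n_layer, n_embd, vocab_size=VOCAB_SIZE, block_size=BLOCK_SIZE):
    """Count parameters of a GPT-2 style model with a tied output head."""
    d = n_embd
    embeddings = vocab_size * d + block_size * d  # token + position tables
    attention = (3 * d * d + 3 * d) + (d * d + d)  # fused qkv + output projection
    mlp = (4 * d * d + 4 * d) + (4 * d * d + d)    # up projection + down projection
    layer_norms = 2 * (2 * d)                      # two LayerNorms per block
    per_block = attention + mlp + layer_norms
    final_norm = 2 * d
    return embeddings + n_layer * per_block + final_norm


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        total = count_params(cfg["n_layer"], cfg["n_embd"])
        print(f"{name}: {total / 1e6:.1f}M")
```

`n_head` does not enter the count, since the heads only partition the `n_embd` dimensions. Under these assumptions the totals come out to roughly 124M, 355M, 774M and 1558M, so the 350M figure for gpt2-medium is a rounded nominal size.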

env configuration:

env pytorch

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

pip install transformers==4.44.0

pip install tiktoken==0.7.0

pip install tqdm==4.66.5
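
After installing, a quick check that the pinned versions are the ones actually in the environment can save some debugging. This is a small sketch using only the standard library; `PINS` and `check_pins` are hypothetical names, and the pins are copied from the commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip commands above.
PINS = {"transformers": "4.44.0", "tiktoken": "0.7.0", "tqdm": "4.66.5"}


def check_pins(pins, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every pinned package is present at the expected version.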

tensorrt-llm

cd TensorRT-LLM/examples/bloom

pip install torch torchvision torchaudio # 2.4.0, cuda 12.1

conda install -y mpi4py

conda install openmpi

pip install tensorrt_llm==0.13.0.dev2024081300 --extra-index-url https://pypi.nvidia.com

pip install -r ./requirements.txt

some useful links:

quant

https://chatgpt.com/share/31aa8af3-dce2-457f-85db-2b18b3c242ce

HTHloveYDH/custom_gpt2
