From 576e29c34df7516050851213757fea45aeaac38a Mon Sep 17 00:00:00 2001
From: Rex Cheng
Date: Mon, 23 Dec 2024 16:39:22 -0600
Subject: [PATCH] Update TRAINING.md

---
 docs/TRAINING.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/TRAINING.md b/docs/TRAINING.md
index 683b3f3..8e8d37e 100644
--- a/docs/TRAINING.md
+++ b/docs/TRAINING.md
@@ -13,7 +13,7 @@ Namely, before starting any training, we
 4. Encode all extracted features into [MemoryMappedTensors](https://pytorch.org/tensordict/main/reference/generated/tensordict.MemoryMappedTensor.html) with [TensorDict](https://pytorch.org/tensordict/main/reference/tensordict.html)
 
-**NOTE:** for maximum training speed (e.g., when training the base model with 2*H100s), you would need around 3~5 GB/s of random read speed. Spinning disks would not be able to catch up and most SSDs would struggle. In my experience, the best bet is to have a large enough system memory such that the OS can cache the data. This way, the data is read from RAM instead of disk.
+**NOTE:** for maximum training speed (e.g., when training the base model with 2*H100s), you would need around 3~5 GB/s of random read speed. Spinning disks would not be able to catch up and most consumer-grade SSDs would struggle. In my experience, the best bet is to have a large enough system memory such that the OS can cache the data. This way, the data is read from RAM instead of disk.
 
 ## Preparing Audio-Video-Text Features