Can a model be loaded inside "datasets.GeneratorBasedBuilder._generate_examples" for data processing, and can it be removed afterwards to free VRAM when using non-streaming load_dataset? #6204
For some specific configurations of the dataset, I need to load a model for preprocessing. At first I imported my own package in the header and
Now that it's working, I was wondering whether, and how, I can remove the model that was used to preprocess the data once training starts. I registered it to
Replies: 1 comment
Yes, feel free to do this. The only thing non-streaming HF datasets depend on are Arrow files that are memory-mapped when they are loaded (the `.cache_files` attribute returns them).
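For illustration, here is a minimal sketch of what "removing the model" could look like once `load_dataset` has returned. It assumes the model is a PyTorch model and that the loading script keeps it in some module-level container. `PreprocessingModel`, `registry`, `arrow_files` and `release_gpu_cache` are hypothetical names used only for this example, not part of `datasets`.

```python
import gc
import weakref


def arrow_files(dataset):
    """Return the paths of the Arrow files backing a loaded, non-streaming dataset.

    `dataset` is expected to be a `datasets.Dataset`; its `cache_files`
    attribute is a list of dicts with a "filename" key.
    """
    return [entry["filename"] for entry in dataset.cache_files]


def release_gpu_cache():
    """Release PyTorch's cached GPU memory if PyTorch with CUDA is available.

    Returns True if the cache was emptied, False otherwise.
    """
    try:
        import torch  # assumption: the preprocessing model is a PyTorch model
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()
    return True


class PreprocessingModel:
    """Hypothetical stand-in for the model used inside `_generate_examples`."""

    def __call__(self, text):
        return text.lower()


# Hypothetical registry standing in for wherever the loading script
# keeps its reference to the model.
registry = {"model": PreprocessingModel()}
model_ref = weakref.ref(registry["model"])

# ... ds = datasets.load_dataset(...) would run here and write the Arrow files ...

# Drop every reference to the model, then collect garbage and
# release any GPU memory that PyTorch is still caching.
registry.pop("model")
gc.collect()
release_gpu_cache()

# The weak reference is dead once no strong reference to the model remains.
model_freed = model_ref() is None
```

After this, the dataset can still be read as usual, since it only needs the files that `arrow_files(ds)` lists. If VRAM is not released, some other object (a global, a closure, a cached attribute) is probably still holding a reference to the model.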