Hi, I noticed that the implementations of MiniLLM and DPKD require installing your customized version of the transformers library. After reviewing the code, I found that your modifications mainly involve model parallelism. So, can I use the official transformers library and disable model parallelism? Will this allow me to run your full suite of code (including training and evaluation)?
We also modified the transformers library to implement teacher-mixed sampling (mixing the probabilities of the teacher and student models during decoding). The modified lines are wrapped with the following markers:
```python
# ### MiniLLM BEGIN ###
... SOME NEW CODES ...
# ### MiniLLM END ###
```
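For intuition, here is a minimal sketch of what teacher-mixed sampling could look like outside the patched library, based only on the description above (mixing the two next-token probability distributions at each decoding step). The mixing coefficient `mix_ratio`, the function name, and the example checkpoints are illustrative assumptions, not the repository's actual values or API.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def teacher_mixed_sample(teacher, student, tokenizer, prompt,
                         mix_ratio=0.2, max_new_tokens=32):
    """Sample each new token from a mixture of the teacher's and the
    student's next-token distributions (illustrative sketch)."""
    device = next(student.parameters()).device
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    for _ in range(max_new_tokens):
        with torch.no_grad():
            t_logits = teacher(input_ids).logits[:, -1, :]
            s_logits = student(input_ids).logits[:, -1, :]
        # Mix the probabilities (not the logits), as described in the thread.
        probs = (mix_ratio * torch.softmax(t_logits, dim=-1)
                 + (1.0 - mix_ratio) * torch.softmax(s_logits, dim=-1))
        next_token = torch.multinomial(probs, num_samples=1)
        input_ids = torch.cat([input_ids, next_token], dim=-1)
        if next_token.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(input_ids[0], skip_special_tokens=True)

# Hypothetical usage with stand-in checkpoints:
# teacher = AutoModelForCausalLM.from_pretrained("gpt2-xl").eval()
# student = AutoModelForCausalLM.from_pretrained("gpt2").eval()
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# print(teacher_mixed_sample(teacher, student, tokenizer, "Once upon a time"))
```

The actual patch lives inside the library's generation loop (hence the customized transformers), but the mixture above captures the idea the markers wrap.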