Issues: pytorch/torchtune
Llama3.1 models do not allow configuring max_seq_len
bug
Something isn't working
triaged
This issue has been assigned an owner and appropriate label
#2202
opened Dec 23, 2024 by
akashc1
NaN running official KD code on different dataset, with packing + compile
#2198
opened Dec 22, 2024 by
thnkinbtfly
Model request. Phi4
enhancement
New feature or request
triaged
This issue has been assigned an owner and appropriate label
#2190
opened Dec 20, 2024 by
krammnic
Add multiprocess dataset packing
enhancement
New feature or request
triaged
This issue has been assigned an owner and appropriate label
#2180
opened Dec 19, 2024 by
bratao
Add sample packing for DPO, PPO
enhancement
New feature or request
rlhf
Anything related to reinforcement learning w/ human feedback
#2177
opened Dec 18, 2024 by
SalmanMohammadi
GPU Middle Class?
discussion
Start a discussion
distributed
Anything related to distributed env (multi-GPU, multi-node)
triaged
This issue has been assigned an owner and appropriate label
#2161
opened Dec 16, 2024 by
EugenHotaj
Move update_recipe_state to its own util
best practice
Things we should be doing but aren't
triaged
This issue has been assigned an owner and appropriate label
#2158
opened Dec 13, 2024 by
joecummings
what should I do if I want to improve the performance of hellaswag?
discussion
Start a discussion
#2154
opened Dec 12, 2024 by
mathCrazyy
Invalid kwarg fused passed to bitsandbytes AdamW8bit
better engineering
Tasks which help improve eng productivity e.g. building tools, cleaning up code, writing docs
#2152
opened Dec 12, 2024 by
mlazos
How to retrieve the distilled model in a manner similar to the OpenAI API interface ?
discussion
Start a discussion
#2148
opened Dec 11, 2024 by
lingq1
Loss becomes NaN during finetuning when turning on optimizer_in_bwd=True
#2145
opened Dec 10, 2024 by
acisseJZhong
70B Fine-tuning GPUs Utilization
discussion
Start a discussion
distributed
Anything related to distributed env (multi-GPU, multi-node)
#2142
opened Dec 10, 2024 by
fabiogeraci
Are there any plans to support context parallel?
enhancement
New feature or request
#2141
opened Dec 10, 2024 by
dz1iang
[small bug + generalization] saving config.yaml to output_dir
better engineering
Tasks which help improve eng productivity e.g. building tools, cleaning up code, writing docs
bug
Something isn't working
#2137
opened Dec 9, 2024 by
felipemello1
Query on Gradient accumulation
discussion
Start a discussion
#2134
opened Dec 9, 2024 by
Vattikondadheeraj
want to fine-tuned llama3.2.1b on MMLU and Arc_challenge and gsm8k(maths)
discussion
Start a discussion
#2132
opened Dec 8, 2024 by
sorobedio
Make it possible to distill into a full finetune model
enhancement
New feature or request
#2122
opened Dec 6, 2024 by
joecummings
[RFC] Remove automatic weight merging when training LoRA
discussion
Start a discussion
#2115
opened Dec 5, 2024 by
felipemello1
[RFC] Unify activation checkpointing APIs
rfc
Request for comments
#2114
opened Dec 5, 2024 by
ebsmothers