cleaned up and tested tp support #976

debajyotidatta · 2024-09-30T15:07:46Z

No description provided.

pytorch-bot · 2024-09-30T15:07:50Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/976

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 273fd8c with merge base c0a81f9:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

HDCharles · 2024-10-01T04:36:55Z

torchao/_models/llama/generate.py

@@ -30,7 +31,7 @@ def device_sync(device):
 wd = Path(__file__).parent.parent.resolve()
 sys.path.append(str(wd))

-from torchao._models.llama.model import Transformer, prepare_inputs_for_model
+from model import Transformer, prepare_inputs_for_model


I think we want this import to stay the same as before; otherwise it can accidentally pull in the gpt-fast model, which makes testing and debugging annoying.
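To illustrate the concern, here is a minimal sketch of how a bare `import model` is resolved by `sys.path` order. The two directories (`gpt_fast` and `torchao_llama`) and the helper `resolve_bare_model_import` are hypothetical stand-ins, not paths or functions from this PR; the sketch only assumes that two different checkouts each ship a top-level `model.py`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def resolve_bare_model_import(search_dirs):
    """Return the file that a bare `import model` resolves to when
    search_dirs are placed at the front of sys.path, in the given order."""
    saved_path = list(sys.path)
    saved_module = sys.modules.pop("model", None)
    try:
        sys.path[:0] = [str(d) for d in search_dirs]
        importlib.invalidate_caches()
        module = importlib.import_module("model")
        return Path(module.__file__).resolve()
    finally:
        # Restore the interpreter state so the experiment has no side effects.
        sys.path[:] = saved_path
        sys.modules.pop("model", None)
        if saved_module is not None:
            sys.modules["model"] = saved_module


# Two hypothetical checkouts, each with its own top-level model.py.
root = Path(tempfile.mkdtemp())
torchao_like = root / "torchao_llama"
gpt_fast_like = root / "gpt_fast"
for d in (torchao_like, gpt_fast_like):
    d.mkdir()
    (d / "model.py").write_text(f"ORIGIN = {d.name!r}\n")

# The same import statement picks a different file depending on path order.
first = resolve_bare_model_import([gpt_fast_like, torchao_like])
second = resolve_bare_model_import([torchao_like, gpt_fast_like])
print(first.parent.name)
print(second.parent.name)
```

Whichever directory comes first on `sys.path` wins, so the bare import depends on the environment. The fully qualified `from torchao._models.llama.model import ...` names the package explicitly and does not have this ambiguity.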

HDCharles

otherwise looks good

jerryzh168 · 2024-10-01T16:13:52Z

torchao/_models/llama/tp.py

+    funcol = None
+
+from torchao._models.llama.model import Attention, FeedForward, Transformer
+from torchao.quantization.GPTQ import WeightOnlyInt4Linear


We are not using WeightOnlyInt4Linear any more in torchao, I think. Is this just for GPTQ?

jerryzh168 · 2024-10-01T17:35:54Z

@debajyotidatta we now support tp by composing the quantized tensor subclass with DTensor (see #939). I believe the way it's done in gpt-fast is outdated. Would you be interested in integrating the new tp flow in generate.py instead?

debajyotidatta · 2024-10-01T17:43:32Z

@debajyotidatta we now support tp by composing the quantized tensor subclass with DTensor (see #939). I believe the way it's done in gpt-fast is outdated. Would you be interested in integrating the new tp flow in generate.py instead?

Sounds good! I will do that.

HDCharles

Can you try to add tensor parallel support along the lines of https://github.com/pytorch/ao/blob/main/tutorials/developer_api_guide/tensor_parallel.py?
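For context, the sharding pattern that tensor parallel code of this kind typically builds on can be sketched in plain Python with simulated ranks. This is not the tutorial's code and it uses neither DTensor nor any torchao API. The function names and the tiny integer weights are illustrative assumptions, and the weights are assumed to be stored as [out_features, in_features], as in `nn.Linear`.

```python
def matvec(weight, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def relu(v):
    return [max(0, a) for a in v]


def colwise_shard(weight, rank, world_size):
    """Shard along the output dimension: each rank keeps a block of rows."""
    n = len(weight) // world_size
    return weight[rank * n:(rank + 1) * n]


def rowwise_shard(weight, rank, world_size):
    """Shard along the input dimension: each rank keeps a block of columns."""
    n = len(weight[0]) // world_size
    return [row[rank * n:(rank + 1) * n] for row in weight]


def feed_forward(w1, w2, x):
    """Unsharded reference: w2 @ relu(w1 @ x)."""
    return matvec(w2, relu(matvec(w1, x)))


def feed_forward_tp(w1, w2, x, world_size):
    """Simulated tensor parallel version of feed_forward."""
    partials = []
    for rank in range(world_size):
        # Each rank computes its slice of the hidden activation locally.
        hidden = relu(matvec(colwise_shard(w1, rank, world_size), x))
        # The matching column block of w2 yields a partial output.
        partials.append(matvec(rowwise_shard(w2, rank, world_size), hidden))
    # Stand-in for an all-reduce (sum) across ranks.
    return [sum(vals) for vals in zip(*partials)]


w1 = [[1, -2], [3, 4], [-5, 6], [7, -8]]  # 4 x 2
w2 = [[1, 2, 3, 4], [5, 6, 7, 8]]         # 2 x 4
x = [1, 2]

reference = feed_forward(w1, w2, x)
sharded = feed_forward_tp(w1, w2, x, world_size=2)
print(reference, sharded)
```

Pairing a column-wise shard of the first layer with a row-wise shard of the second means the element-wise activation can run locally on each rank, and only one sum across ranks is needed at the end.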

cleaned up and tested tp support

454f50d

facebook-github-bot added the CLA Signed label Sep 30, 2024

debajyotidatta marked this pull request as ready for review September 30, 2024 15:08

HDCharles reviewed Oct 1, 2024

HDCharles approved these changes Oct 1, 2024

fixing imports

273fd8c

jerryzh168 reviewed Oct 1, 2024

HDCharles requested changes Oct 8, 2024
