Batch inference over multiple nodes #3103

boyang-nlp · 2025-01-24T09:44:02Z

I want to run multiple sglang runtimes on multiple nodes, each using only 4 GPUs, and then perform batch inference through a router. How should I proceed?

zhaochenyang20 · 2025-01-25T04:25:54Z

@boyang-nlp Refer to the router documentation: https://docs.sglang.ai/router/router.html
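
For the client side of this setup, here is a minimal sketch of sending a batch of prompts through the router. It assumes each node already runs an SGLang server on its 4 GPUs and that a router has been started in front of them as described in the linked docs. The router address `http://localhost:30000`, the helper names (`build_payload`, `post_generate`, `batch_generate`), and the sampling values are illustrative assumptions, not something taken from this thread. The request body follows the shape of SGLang's native `/generate` endpoint (`text` plus `sampling_params`); check that against the version you have installed.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: the router listens here; replace with your router's host and port.
ROUTER_URL = "http://localhost:30000"


def build_payload(prompt, max_new_tokens=64, temperature=0.0):
    """Build one request body in the shape of SGLang's native /generate endpoint."""
    return {
        "text": prompt,
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def post_generate(payload, base_url=ROUTER_URL, timeout=600):
    """POST a single payload to the router and return the decoded JSON reply."""
    req = urllib.request.Request(
        base_url + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def batch_generate(prompts, send=post_generate, max_workers=16):
    """Send all prompts concurrently; results come back in prompt order."""
    payloads = [build_payload(p) for p in prompts]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even though requests run in parallel.
        return list(pool.map(send, payloads))
```

Calling `batch_generate(["Hello", "What is 2 + 2?"])` issues one request per prompt and leaves it to the router to spread them across the worker nodes. The `send` parameter is only there so the transport can be swapped out, for example with a stub when testing without a running cluster.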
