From eadedf7d7238b00da6dab63957836ed63d9fd072 Mon Sep 17 00:00:00 2001
From: Robert
Date: Tue, 4 Feb 2025 14:53:19 +0000
Subject: [PATCH] Wrote docs for inference_pool_gid

---
 docs-gb/user-guide/parallel-inference.md | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/docs-gb/user-guide/parallel-inference.md b/docs-gb/user-guide/parallel-inference.md
index 8a7c583b1..d45f0b710 100644
--- a/docs-gb/user-guide/parallel-inference.md
+++ b/docs-gb/user-guide/parallel-inference.md
@@ -77,6 +77,32 @@ The expected values are:
 - `0`, will disable the parallel inference feature.
   In other words, inference will happen within the main MLServer process.
 
+### `inference_pool_gid`
+
+The `inference_pool_gid` field of the `model-settings.json` file (or alternatively, the `MLSERVER_MODEL_INFERENCE_POOL_GID` global environment variable) lets you load a model onto a dedicated inference pool, identified by a group ID (GID). Isolating a model in its own pool prevents starvation, where one busy model monopolizes the workers that other models depend on (see the first example below).
+
+Complementing `inference_pool_gid`, if the `autogenerate_inference_pool_gid` field of the `model-settings.json` file (or alternatively, the `MLSERVER_MODEL_AUTOGENERATE_INFERENCE_POOL_GID` global environment variable) is set to `true`, a UUID is generated automatically as the GID and the model is loaded onto its own dedicated inference pool. This option is useful when you want to load a single model onto a dedicated inference pool without having to manage the GID yourself (see the second example below).
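+
+For example, a minimal `model-settings.json` along these lines could pin a model to a dedicated pool (the model name, implementation, and GID value are illustrative):
+
+```json
+{
+  "name": "my-model",
+  "implementation": "mlserver_sklearn.SKLearnModel",
+  "inference_pool_gid": "group-1"
+}
+```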
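+
+Similarly, with `autogenerate_inference_pool_gid` no explicit GID needs to be supplied (again, the model name and implementation are illustrative):
+
+```json
+{
+  "name": "my-other-model",
+  "implementation": "mlserver_sklearn.SKLearnModel",
+  "autogenerate_inference_pool_gid": true
+}
+```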
+
 ## References
 
 Jiale Zhi, Rui Wang, Jeff Clune, and Kenneth O. Stanley. Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods. arXiv:2003.11164 [cs, stat], March 2020. [arXiv:2003.11164](https://arxiv.org/abs/2003.11164).
\ No newline at end of file