[YUNIKORN-2894] Update KubeRay operator documentation for YuniKorn integration (#489)

Closes: #489

Signed-off-by: Craig Condit <[email protected]>
ryankert01 authored and craigcondit committed Oct 24, 2024
1 parent 46a5f58 commit 293b7b1
Showing 2 changed files with 126 additions and 9 deletions.
2 changes: 1 addition & 1 deletion docs/user_guide/workloads/kuberay/_ray_operator.mdx
```
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator --version 1.2.2 --set batchScheduler.name=yunikorn
```
- The result should be as shown below
![ray_cluster_operator](../../../assets/ray_cluster_operator.png)
133 changes: 125 additions & 8 deletions docs/user_guide/workloads/run_ray_cluster.md
specific language governing permissions and limitations
under the License.
-->

:::note
This example demonstrates how to set up [KubeRay](https://docs.ray.io/en/master/cluster/kubernetes/getting-started.html) and run a [RayCluster](https://docs.ray.io/en/master/cluster/kubernetes/getting-started/raycluster-quick-start.html) with the YuniKorn scheduler. It relies on an admission controller to configure the default applicationId and queue name; for details, refer to [YuniKorn supported labels](https://yunikorn.apache.org/docs/user_guide/labels_and_annotations_in_yunikorn) and [YuniKorn queue configuration](https://yunikorn.apache.org/docs/user_guide/queue_config).

Prerequisites:
- This tutorial assumes YuniKorn is [installed](../../get_started/get_started.md) under the namespace `yunikorn`.
- Use KubeRay version >= 1.2.2 to enable support for YuniKorn gang scheduling.
:::

<YunikornConfigMapPatch />
## Install YuniKorn

A simple script installs YuniKorn under the namespace `yunikorn`; refer to [Get Started](../../get_started/get_started.md) for more details.

```shell script
helm repo add yunikorn https://apache.github.io/yunikorn-release
helm repo update
helm install yunikorn yunikorn/yunikorn --create-namespace --namespace yunikorn
```

<RayOperator/>

## Create RayCluster with YuniKorn

In this example, we set the `ray.io/gang-scheduling-enabled` label to `true` so that YuniKorn schedules the head and worker pods as a gang.

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

<Tabs>
<TabItem value="amd64" label="x86-64 (Intel/Linux)">

```shell
cat <<EOF | kubectl apply -f -
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: test-yunikorn-0
  labels:
    ray.io/gang-scheduling-enabled: "true"
    yunikorn.apache.org/app-id: test-yunikorn-0
    yunikorn.apache.org/queue: root.default
spec:
  rayVersion: "2.9.0"
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.9.0
            resources:
              limits:
                cpu: "1"
                memory: "2Gi"
              requests:
                cpu: "1"
                memory: "2Gi"
  workerGroupSpecs:
    - groupName: worker
      rayStartParams: {}
      replicas: 2
      minReplicas: 2
      maxReplicas: 2
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.9.0
              resources:
                limits:
                  cpu: "1"
                  memory: "1Gi"
                requests:
                  cpu: "1"
                  memory: "1Gi"
EOF
```

</TabItem>
<TabItem value="aarch64" label="Apple Silicon (arm64)">

```shell
cat <<EOF | kubectl apply -f -
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: test-yunikorn-0
  labels:
    ray.io/gang-scheduling-enabled: "true"
    yunikorn.apache.org/app-id: test-yunikorn-0
    yunikorn.apache.org/queue: root.default
spec:
  rayVersion: "2.9.0"
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: rayproject/ray:2.9.0-aarch64
            resources:
              limits:
                cpu: "1"
                memory: "2Gi"
              requests:
                cpu: "1"
                memory: "2Gi"
  workerGroupSpecs:
    - groupName: worker
      rayStartParams: {}
      replicas: 2
      minReplicas: 2
      maxReplicas: 2
      template:
        spec:
          containers:
            - name: ray-worker
              image: rayproject/ray:2.9.0-aarch64
              resources:
                limits:
                  cpu: "1"
                  memory: "1Gi"
                requests:
                  cpu: "1"
                  memory: "1Gi"
EOF
```

</TabItem>
</Tabs>
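With gang scheduling enabled, YuniKorn waits until it can place the whole gang (one head pod plus two worker pods) before starting any of them. The sketch below, an illustrative helper that is not part of KubeRay or YuniKorn, totals the resources this manifest asks YuniKorn to reserve at once:

```python
# Sketch: aggregate resources YuniKorn must reserve for the whole gang.
# Numbers mirror the RayCluster manifest above (1 head, 2 workers).

def gang_totals(head, workers):
    """Sum CPU cores and memory (GiB) across the head and all worker replicas."""
    cpu = head["cpu"] + workers["replicas"] * workers["cpu"]
    mem = head["memory_gib"] + workers["replicas"] * workers["memory_gib"]
    return cpu, mem

head = {"cpu": 1, "memory_gib": 2}
workers = {"replicas": 2, "cpu": 1, "memory_gib": 1}

cpu, mem = gang_totals(head, workers)
print(f"gang needs {cpu} CPUs and {mem}Gi memory")  # gang needs 3 CPUs and 4Gi memory
```

If the cluster cannot hold all three pods at once, none of them are scheduled, which avoids the partial-allocation deadlocks that plain pod-by-pod scheduling can produce.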

- RayCluster result
![ray_cluster_cluster](../../assets/ray_cluster_cluster.png)
- YuniKorn UI
![ray_cluster_on_ui](../../assets/ray_cluster_on_ui.png)

<RayCRDYunikornConfig />

## Submit a RayJob to RayCluster
```
# Assumes $HEAD_POD holds the name of the Ray head pod, e.g.:
# HEAD_POD=$(kubectl get pods --selector=ray.io/node-type=head -o custom-columns=POD:metadata.name --no-headers)
kubectl exec -it $HEAD_POD -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
```

Services in Kubernetes aren't directly accessible by default. However, you can use port-forwarding to connect to them locally.
```
kubectl port-forward service/test-yunikorn-0-head-svc 8265:8265
```
After the port-forward is set up, you can access the Ray dashboard at `http://localhost:8265` in your web browser.
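Beyond the browser, the port-forwarded dashboard also serves Ray's job submission API. A minimal sketch, assuming `ray[default]` is installed locally and the port-forward above is running (the `entrypoint` helper is hypothetical, introduced here only to build the command string):

```python
# Sketch: talk to the RayCluster through the port-forwarded dashboard.
DASHBOARD = "http://localhost:8265"

def entrypoint(expr: str) -> str:
    """Build a `python -c` entrypoint that starts Ray and prints an expression."""
    return f'python -c "import ray; ray.init(); print({expr})"'

# With the port-forward active and `pip install "ray[default]"` done locally,
# the job can be submitted via Ray's job submission SDK:
# from ray.job_submission import JobSubmissionClient
# client = JobSubmissionClient(DASHBOARD)
# job_id = client.submit_job(entrypoint=entrypoint("ray.cluster_resources()"))

print(entrypoint("ray.cluster_resources()"))
```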

- Ray Dashboard
![ray_cluster_ray_dashborad](../../assets/ray_cluster_ray_dashborad.png)

Have doubts? Check out the official [KubeRay integration with Apache YuniKorn](https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/yunikorn.html) documentation.
