diff --git a/.github/workflows/docs.yaml b/.github/workflows/docs.yaml
index 7f5318dea..34c3f9f24 100644
--- a/.github/workflows/docs.yaml
+++ b/.github/workflows/docs.yaml
@@ -21,7 +21,7 @@ jobs:
- run: |
pip install pillow cairosvg
sudo apt-get install -y libcairo2-dev libfreetype6-dev libffi-dev libjpeg-dev libpng-dev libz-dev
- pip install mkdocs-material mkdocs-material-extensions mkdocs-redirects --upgrade
+          pip install "mkdocs-material[imaging]" mkdocs-material-extensions mkdocs-redirects --upgrade
pip install git+https://${{ secrets.GH_TOKEN }}@github.com/squidfunk/mkdocs-material-insiders.git
mkdocs gh-deploy --config-file ../dstack/mkdocs.yml --force
working-directory: ./dstackai.github.io
\ No newline at end of file
diff --git a/docs/assets/stylesheets/extra.css b/docs/assets/stylesheets/extra.css
index cea8ac637..490b75bcc 100644
--- a/docs/assets/stylesheets/extra.css
+++ b/docs/assets/stylesheets/extra.css
@@ -616,6 +616,10 @@ body {
width: 26px;
text-align: center;
}
+
+ .md-nav--lifted > .md-nav__list > .md-nav__item > [for] {
+ display: none;
+ }
}
.md-sidebar--primary .md-nav__link {
diff --git a/docs/blog/.authors.yml b/docs/blog/.authors.yml
index 854e6d39d..3c6513c07 100644
--- a/docs/blog/.authors.yml
+++ b/docs/blog/.authors.yml
@@ -1,3 +1,5 @@
-peterschmidt85:
- name: Andrey Cheptsov
- avatar: https://github.com/peterschmidt85.png
\ No newline at end of file
+authors:
+ peterschmidt85:
+ name: Andrey Cheptsov
+ avatar: https://github.com/peterschmidt85.png
+ description: Creator
\ No newline at end of file
diff --git a/docs/blog/posts/lambda-cloud-support-preview.md b/docs/blog/posts/lambda-cloud-support-preview.md
index 6d4f19520..7dfd77dda 100644
--- a/docs/blog/posts/lambda-cloud-support-preview.md
+++ b/docs/blog/posts/lambda-cloud-support-preview.md
@@ -70,7 +70,7 @@ ide: vscode
You only need to install `cuda` if you intend to build a custom CUDA kernel. Otherwise, it is not necessary as the
essential CUDA drivers are already pre-installed.
-The [documentation](../../docs) and [examples](https://github.com/dstackai/dstack-examples/blob/main/README.md)
+The [documentation](../../docs/index.md) and [examples](../../examples/index.md)
are updated to reflect the changes.
!!! info "Give it a try and share feedback"
diff --git a/docs/blog/posts/new-configuration-format-and-cli-experience.md b/docs/blog/posts/new-configuration-format-and-cli-experience.md
index f5d057af5..4e2cdd6a3 100644
--- a/docs/blog/posts/new-configuration-format-and-cli-experience.md
+++ b/docs/blog/posts/new-configuration-format-and-cli-experience.md
@@ -160,7 +160,7 @@ Last but not least, here's a brief list of other improvements:
configured on the machine where the server is running).
* We've introduced a new Python API and UI for working with artifacts (more details to come later this week).
-The [documentation](../../docs) and [examples](https://github.com/dstackai/dstack-examples/blob/main/README.md)
+The [documentation](../../docs/index.md) and [examples](../../examples/index.md)
are updated to reflect the changes.
!!! info "Try the update and share feedback"
diff --git a/docs/blog/posts/prebuilt-environments-spot-and-retry-policies.md b/docs/blog/posts/prebuilt-environments-spot-and-retry-policies.md
index 052486674..844d251cf 100644
--- a/docs/blog/posts/prebuilt-environments-spot-and-retry-policies.md
+++ b/docs/blog/posts/prebuilt-environments-spot-and-retry-policies.md
@@ -96,9 +96,9 @@ Other improvements in this release:
- Deletion of repositories is now possible through the UI.
- When running a dev environment from a Git repo, you can now pull and push changes directly from the dev environment,
with `dstack` correctly configuring your Git credentials.
-- The newly added Python API for working with artifacts is now documented [here](../../docs/reference/api/python.md).
+- The newly added Python API for working with artifacts is now documented [here](../../docs/reference/api/python/index.md).
-The [documentation](../../docs) and [examples](https://github.com/dstackai/dstack-examples/blob/main/README.md)
+The [documentation](../../docs/index.md) and [examples](../../examples/index.md)
are updated to reflect the changes.
!!! info "Give it a try and share feedback"
diff --git a/docs/docs/guides/clouds.md b/docs/docs/guides/clouds.md
index 44c838d13..6e4369411 100644
--- a/docs/docs/guides/clouds.md
+++ b/docs/docs/guides/clouds.md
@@ -49,16 +49,16 @@ Configuring backends involves providing cloud credentials, and specifying storag
- [**AWS**
Learn how to set up an Amazon Web Services backend.
- ](../../reference/backends/aws/)
+ ](../reference/backends/aws.md)
- [**GCP**
- Learn how to set up a Google Cloud backend.
- ](../../reference/backends/gcp/)
+    Learn how to set up a Google Cloud backend.
+ ](../reference/backends/gcp.md)
- [**Azure**
Learn how to set up an Microsoft Azure backend.
- ](../../reference/backends/azure/)
+ ](../reference/backends/azure.md)
- [**Lambda**
Learn how to set up a Lambda Cloud backend.
- ](../../reference/backends/lambda/)
+ ](../reference/backends/lambda.md)
diff --git a/docs/docs/reference/api/python/index.md b/docs/docs/reference/api/python/index.md
index 8811ed29e..c22af2422 100644
--- a/docs/docs/reference/api/python/index.md
+++ b/docs/docs/reference/api/python/index.md
@@ -5,7 +5,7 @@ The Python API allows for programmatically running tasks and services across con
#### Installation
Before you can use `dstack` Python API, ensure you have installed the `dstack` package,
-started a `dstack` server with [configured clouds](../../docs/docs/guides/clouds.md).
+started a `dstack` server with [configured clouds](../../../guides/clouds.md).
```shell
pip install "dstack[all]==0.11.3rc1"
diff --git a/docs/examples/deploy-python.md b/docs/examples/deploy-python.md
new file mode 100644
index 000000000..4a213cc30
--- /dev/null
+++ b/docs/examples/deploy-python.md
@@ -0,0 +1,150 @@
+# Deploying LLMs using Python API
+
+The [Python API](../docs/reference/api/python/index.md) of `dstack` can be used to run [tasks](../docs/guides/tasks.md) and [services](../docs/guides/services.md) programmatically.
+
+To demonstrate how it works, we've created a simple Streamlit app that uses `dstack`'s API to deploy a quantized
+version of Llama 2 to your cloud with a click of a button.
+
+![](images/python-api/dstack-python-api-streamlit-example.png){ width=800 }
+
+## Prerequisites
+
+Before you can use the `dstack` Python API, ensure you have installed the `dstack` package
+and started a `dstack` server with [configured clouds](../docs/guides/clouds.md).
+
+```shell
+pip install "dstack[all]==0.11.3rc1"
+dstack start
+```
+
+## How does it work?
+
+### Create a client
+
+If you're familiar with Docker's Python SDK, you'll find `dstack`'s [Python API](../docs/reference/api/python/index.md)
+quite similar, except that it runs your workload in the cloud.
+
+To get started, create an instance of `dstack.Client` and use its methods to submit and manage runs.
+
+```python
+import dstack
+import dstack.api.hub.errors  # imported explicitly so the exception type below is available
+
+try:
+    # Initialize the client from the local server config; "." is the repo directory
+    client = dstack.Client.from_config(repo_dir=".")
+except dstack.api.hub.errors.HubClientError as e:
+    print(e)
+```
+
+### Create a task
+
+!!! info "NOTE:"
+ With `dstack.Client`, you can run [tasks](../docs/guides/tasks.md) and [services](../docs/guides/services.md).
+ Running a task allows you to programmatically access its ports and
+ forward traffic to your local machine. For example, if you run an LLM as a task, you can access it on `localhost`.
+    Services, on the other hand, allow deploying applications as public endpoints.
+
+In our example, we'll deploy an LLM as a task. To do this, we'll create a `dstack.Task` instance that configures how the
+LLM should be run.
+
+```python
+# The model to deploy; see the resource requirements in the next step
+model_id = "TheBloke/Llama-2-13B-chat-GPTQ"
+
+configuration = dstack.Task(
+    image="ghcr.io/huggingface/text-generation-inference:latest",
+    env={"MODEL_ID": model_id},
+    commands=[
+        "text-generation-launcher --trust-remote-code --quantize gptq",
+    ],
+    ports=["8080:80"],  # the LLM runs on port 80, forwarded to localhost:8080
+)
+```
+
+### Create resources
+
+Then, we'll need to specify the resources our LLM will require. To do this, we'll create a `dstack.Resources` instance:
+
+```python
+# Pick the GPU memory requirement based on the selected model
+if model_id == "TheBloke/Llama-2-13B-chat-GPTQ":
+    gpu_memory = "20GB"
+elif model_id == "TheBloke/Llama-2-70B-chat-GPTQ":
+    gpu_memory = "40GB"
+else:
+    raise ValueError(f"Unsupported model: {model_id}")
+
+resources = dstack.Resources(gpu=dstack.GPU(memory=gpu_memory))
+```
+
+### Submit the run
+
+To deploy the LLM, we submit the task using the `runs.submit()` method on `dstack.Client`.
+
+```python
+run_name = "deploy-python"
+
+run = client.runs.submit(configuration=configuration, run_name=run_name, resources=resources)
+```
+
+### Attach to the run
+
+Then, we use the `attach()` method on `dstack.Run`. This method waits for the task to start
+and forwards the configured ports to `localhost`.
+
+```python
+run.attach()
+```
+
+### Wait for the endpoint to start
+
+Finally, we wait until `http://localhost:8080/health` returns `200`, which indicates that the LLM is deployed and ready to
+handle requests.
+
+```python
+import time
+import requests
+
+# Poll the health endpoint until the LLM server responds with 200
+while True:
+    time.sleep(0.5)
+    try:
+        r = requests.get("http://localhost:8080/health", timeout=1)
+        if r.status_code == 200:
+            break
+    except requests.exceptions.RequestException:
+        pass  # the server isn't up yet; keep polling
+```
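+
+Once the endpoint is up, you can send requests to it. Below is a minimal sketch that queries the
+model through `text-generation-inference`'s `/generate` endpoint; the prompt and the generation
+parameters are illustrative only.
+
+```python
+import requests
+
+# Query the deployed LLM via TGI's /generate endpoint
+# (the prompt and parameters here are illustrative)
+response = requests.post(
+    "http://localhost:8080/generate",
+    json={
+        "inputs": "What is deep learning?",
+        "parameters": {"max_new_tokens": 64},
+    },
+    timeout=60,
+)
+print(response.json()["generated_text"])
+```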
+
+### Stop the run
+
+To undeploy the model, we can use the `stop()` method on `dstack.Run`.
+
+```python
+run.stop()
+```
+
+### Retrieve the status of a run
+
+If you'd like to retrieve the `dstack.Run` instance by the name of the run,
+use the `runs.get()` method on `dstack.Client`.
+
+```python
+run = client.runs.get(run_name)
+```
+
+At any point, you can check the status of the run via the `status()` method on `dstack.Run`.
+
+```python
+if run:  # runs.get() returns a falsy value if no run with that name exists
+    print(run.status())
+```
+
+## Source code
+
+The complete, ready-to-run code is available in [dstackai/dstack-examples](https://github.com/dstackai/dstack-examples).
+
+```shell
+git clone https://github.com/dstackai/dstack-examples
+cd dstack-examples
+```
+
+Once the repository is cloned, feel free to install the requirements and run the app:
+
+```shell
+pip install -r deploy-python/requirements.txt
+streamlit run deploy-python/app.py
+```
\ No newline at end of file
diff --git a/docs/examples/python-api.md b/docs/examples/python-api.md
deleted file mode 100644
index 08fd0aee4..000000000
--- a/docs/examples/python-api.md
+++ /dev/null
@@ -1,154 +0,0 @@
-# Deploying LLMs using Python API
-
-The [Python API](../docs/reference/api/python/index.md) of `dstack` can be used to run
-[tasks](../docs/guides/tasks.md) and [services](../docs/guides/services.md) programmatically.
-
-Below is an example of a Streamlit app that uses `dstack`'s API to deploy a quantized version of Llama 2 to your cloud
-with a simple click of a button.
-
-![](images/python-api/dstack-python-api-streamlit-example.png){ width=800 }
-
-!!! info "How does the API work?"
- If you're familiar with Docker's Python SDK, you'll find dstack's Python API quite similar, except that it runs your
- workload in the cloud.
-
- To get started, create an instance of `dstack.Client` and use its methods to submit and manage runs.
-
- With `dstack.Client`, you can run [tasks](../docs/guides/tasks.md) and [services](../docs/guides/services.md). Running a task allows you to programmatically access its ports and
- forward traffic to your local machine. For example, if you run an LLM as a task, you can access it on localhost.
- Services on the other hand allow deploying applications as public endpoints.
-
-## Prerequisites
-
-Before you can use `dstack` Python API, ensure you have installed the `dstack` package,
-started a `dstack` server with [configured clouds](../../docs/docs/guides/clouds.md).
-
-```shell
-pip install "dstack[all]==0.11.3rc1"
-dstack start
-```
-
-## Run the app
-
-First, clone the repository with `dstack-examples`.
-
-```shell
-git clone https://github.com/dstackai/dstack-examples
-cd dstack-examples
-```
-
-Second, install the requirements, and run the app:
-
-```
-pip install -r streamlit-llama/requirements.txt
-streamlit run streamlit-llama/app.py
-```
-
-That's it! Now you can choose a model (e.g., 13B or 70B) and click the `Deploy` button.
-Once the LLM is up, you can access it at `localhost`.
-
-## Code walkthrough
-
-For the complete code,
-refer to the [full version](https://github.com/dstackai/dstack-examples/blob/main/streamlit-llama/app.py).
-
-First, we initialize the `dstack.Client`:
-
-```python
-if len(st.session_state) == 0:
- st.session_state.client = dstack.Client.from_config(".")
-```
-
-Then, we prompt the user to choose an LLM for deployment.
-
-```python
-def trigger_llm_deployment():
- st.session_state.deploying = True
- st.session_state.error = None
-
-with st.sidebar:
- model_id = st.selectbox("Choose an LLM to deploy",
- ("TheBloke/Llama-2-13B-chat-GPTQ",
- "TheBloke/Llama-2-70B-chat-GPTQ",),
- disabled=st.session_state.deploying or st.session_state.deployed)
- if not st.session_state.deploying:
- st.button("Deploy", on_click=trigger_llm_deployment, type="primary")
-```
-
-Prepare a `dstack` task and resource requirements based on the selected model.
-
-```python
-def get_configuration():
- return dstack.Task(
- image="ghcr.io/huggingface/text-generation-inference:latest",
- env={"MODEL_ID": model_id},
- commands=[
- "text-generation-launcher --trust-remote-code --quantize gptq",
- ],
- ports=["8080:80"],
- )
-
-
-def get_resources():
- if model_id == "TheBloke/Llama-2-13B-chat-GPTQ":
- gpu_memory = "20GB"
- elif model_id == "TheBloke/Llama-2-70B-chat-GPTQ":
- gpu_memory = "40GB"
- return dstack.Resources(gpu=dstack.GPU(memory=gpu_memory))
-```
-
-If the user clicks `Deploy`, we submit the task using `runs.submit()` on `dstack.Client`. Then, we use the `attach()`
-method on `dstack.Run`. This method waits for the task to start, forwarding the port to `localhost`.
-
-Finally, we wait until `http://localhost:8080/health` returns `200`.
-
-```python
-def wait_for_ok_status(url):
- while True:
- time.sleep(0.5)
- try:
- r = requests.get(url)
- if r.status_code == 200:
- break
- except Exception:
- pass
-
-if st.session_state.deploying:
- with st.sidebar:
- with st.status("Deploying the LLM...", expanded=True) as status:
- st.write("Provisioning...")
- try:
- run = st.session_state.client.runs.submit(configuration=get_configuration(), run_name=run_name,
- resources=get_resources())
- st.session_state.run = run
- st.write("Attaching to the LLM...")
- st.session_state.run.attach()
- wait_for_ok_status("http://localhost:8080/health")
- status.update(label="The LLM is ready!", state="complete", expanded=False)
- st.session_state.deploying = False
- st.session_state.deployed = True
- except Exception as e:
- st.session_state.error = str(e)
- st.session_state.deploying = False
- st.experimental_rerun()
-```
-
-If an error occurs, we display it. Additionally, we provide a button to undeploy the model using the `stop()` method on `dstack.Run`.
-
-```python
-def trigger_llm_undeployment():
- st.session_state.run.stop()
- st.session_state.deploying = False
- st.session_state.deployed = False
- st.session_state.run = None
-
-with st.sidebar:
- if st.session_state.error:
- st.error(st.session_state)
-
- if st.session_state.deployed:
- st.button("Undeploy", type="primary", key="stop", on_click=trigger_llm_undeployment)
-```
-
-!!! info "Source code"
- The complete, ready-to-run code is available in [dstackai/dstack-examples](https://github.com/dstackai/dstack-examples).
\ No newline at end of file
diff --git a/docs/overrides/examples.html b/docs/overrides/examples.html
index cd8eb69c5..4e0b28b93 100644
--- a/docs/overrides/examples.html
+++ b/docs/overrides/examples.html
@@ -1,4 +1,4 @@
-{% extends "main.html" %}
+{% extends "landing.html" %}
{% block content %}