Skip to content

Commit

Permalink
- Added terms and privacy pages
Browse files Browse the repository at this point in the history
- Updated the `fine-tuning` guide
- Minor changes to the landing page
  • Loading branch information
peterschmidt85 committed Nov 17, 2023
1 parent 6f9ddc9 commit 34eedc9
Show file tree
Hide file tree
Showing 12 changed files with 667 additions and 81 deletions.
6 changes: 3 additions & 3 deletions docs/assets/stylesheets/extra.css
Original file line number Diff line number Diff line change
Expand Up @@ -796,14 +796,14 @@ html .md-footer-meta.md-typeset a:is(:focus,:hover) {
}

.md-typeset :where(ul) > li:before {
background-color: #d1d5db;
background-color: rgba(0,0,0,87);
border-radius: 50%;
content: "";
height: 0.375em;
height: 0.48em;
width: 0.48em;
left: 0.25em;
position: absolute;
top: 0.6875em;
width: 0.375em;
}

.md-typeset :where(ol) > li:before {
Expand Down
31 changes: 30 additions & 1 deletion docs/assets/stylesheets/landing.css
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,28 @@
font-size: 20px;
}

.tx-container .md-button {
vertical-align: middle;
}

.tx-container .md-button--primary:hover {
transform: translateY(-2px);
transition: opacity .2s ease,transform .2s ease;
}

.tx-container .md-button .icon {
display: inline-block;
position: relative;
width: 15px;
height: 15px;
margin-left: 7px;
transition: opacity .2s ease,transform .2s ease;
}

.tx-container .md-button--primary:hover .icon {
transform: translateX(3px)
}

.md-header__buttons .md-button--primary,
.tx-container .md-button--primary {
background: -webkit-linear-gradient(45deg, #002aff, #002aff, #e165fe);
Expand Down Expand Up @@ -548,11 +570,18 @@

.plans_card__subtitle {
margin-bottom: 1.4rem;
font-size: 1.0em;
font-size: 0.98em;
line-height: 1.44;
color: #696F86;
}

.plans_card__buttons_subtitle {
margin-top: 10px;
margin-left: 10px;
color: #202128;
font-size: 0.7rem;
}

.plans_card__services {
display: flex;
flex-wrap: wrap;
Expand Down
67 changes: 47 additions & 20 deletions docs/docs/guides/fine-tuning.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Fine-tuning

For fine-tuning an LLM with `dstack`'s API, specify a model, dataset, training parameters,
and required compute resources. `dstack` takes care of everything else.
and required compute resources. The API takes care of everything else.

??? info "Prerequisites"
To use the fine-tuning API, ensure you have the latest version:
Expand All @@ -14,17 +14,36 @@ and required compute resources. `dstack` takes care of everything else.

</div>

> The API currently supports only supervised fine-tuning (SFT). Support for DPO and RLHF is coming soon.
## Prepare a dataset

The dataset should contain a `"text"` column with completions following the prompt format
of the corresponding model. Check the [example](https://huggingface.co/datasets/peterschmidt85/samsum)
(for fine-tuning Llama 2).

> Once the dataset is prepared, it must be [uploaded](https://huggingface.co/docs/datasets/upload_dataset) to Hugging Face.
??? info "Uploading a dataset"
Here's an example of how to upload a dataset programmatically:

```python
import pandas as pd
from datasets import Dataset

df = pd.read_json("samsum.jsonl", lines=True)
dataset = Dataset.from_pandas(df)
dataset.push_to_hub("peterschmidt85/samsum")
```

## Create a client

First, you connect to `dstack`:

```python
from dstack.api import Client, ClientError
from dstack.api import Client

try:
client = Client.from_config()
except ClientError:
print("Can't connect to the server")
client = Client.from_config()
```

## Create a task
Expand All @@ -41,15 +60,12 @@ task = FineTuningTask(
env={
"HUGGING_FACE_HUB_TOKEN": "...",
},
num_train_epochs=2
num_train_epochs=2,
max_seq_length=1024,
per_device_train_batch_size=2,
)
```

!!! info "Dataset format"
For the SFT fine-tuning method, the dataset should contain a `"text"` column with completions following the prompt format
of the corresponding model.
Check the [peterschmidt85/samsum](https://huggingface.co/datasets/peterschmidt85/samsum) example.

## Run the task

When running a task, you can configure resources, and many [other options](../../docs/reference/api/python/index.md#dstack.api.RunCollection.submit).
Expand All @@ -64,19 +80,22 @@ run = client.runs.submit(
)
```

!!! info "Fine-tuning methods"
The API currently supports only SFT, with support for DPO and other methods coming soon.
!!! info "GPU memory"
The API defaults to using QLoRA based on the provided
[training parameters](../../docs/reference/api/python/index.md#dstack.api.FineTuningTask).
When specifying GPU memory, consider both the model size and the specified batch size.
After a few attempts, you'll discover the best configuration.

When the training is done, `dstack` pushes the final model to the Hugging Face hub.
When the training is done, the API pushes the final model to the Hugging Face hub.

![](../../assets/images/dstack-finetuning-hf.png){ width=800 }

## Manage runs

You can use the instance of [`dstack.api.Client`](../../docs/reference/api/python/index.md#dstack.api.Client) to manage your runs,
including getting a list of runs, stopping a given run, etc.
You can manage runs using [API](../../docs/reference/api/python/index.md#dstack.api.Client),
the [CLI](../../docs/reference/cli/index.md), or the user interface.

## Track experiments
## Track metrics

To track experiment metrics, specify `report_to` and related authentication environment variables.

Expand All @@ -97,5 +116,13 @@ Currently, the API supports `"tensorboard"` and `"wandb"`.

![](../../assets/images/dstack-finetuning-wandb.png){ width=800 }

[//]: # (TODO: Example)
[//]: # (TODO: Next steps)
[//]: # (TODO: Examples - Llama 2, Mistral, etc)

## What's next?

- Once the model is trained, proceed to [deploy](text-generation.md) it as an endpoint.
The deployed endpoint can be used from your apps directly or via LangChain.
- The source code of the fine-tuning task is available
at [GitHub](https://github.com/dstackai/dstack/tree/master/src/dstack/api/_public/huggingface/finetuning/sft).
If you prefer using a custom script, feel free to do so using [dev environments](dev-environments.md) and
[tasks](tasks.md).
2 changes: 1 addition & 1 deletion docs/docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ or use the cloud version (which provides GPU out of the box).

The client configuration is stored via `~/.dstack/config.yml`.

??? info "GPU cloud"
??? info "dstack Cloud"

If you want to use the cloud version of `dstack`,
<a href="#" data-tally-open="w7K17R">sign up</a>, and configure the client
Expand Down
62 changes: 30 additions & 32 deletions docs/docs/reference/cli/index.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
# CLI

## dstack server
## Commands

### dstack server

This command starts the `dstack` server.

Expand All @@ -13,24 +15,9 @@ $ dstack server --help

</div>

### Environment variables

| Name | Description | Default |
|-----------------------------------|-----------------------------------------------|--------------------|
| `DSTACK_DEFAULT_CREDS_DISABLED` | Disables default credentials detection if set | `None` |
| `DSTACK_LOCAL_BACKEND_ENABLED` | Enables local backend for debug if set | `None` |
| `DSTACK_RUNNER_VERSION` | Sets exact runner version for debug | `latest` |
| `DSTACK_SERVER_ADMIN_TOKEN` | Has the same effect as `--token` | `None` |
| `DSTACK_SERVER_DIR` | Sets path to store data and server configs | `~/.dstack/server` |
| `DSTACK_SERVER_HOST` | Has the same effect as `--host` | `127.0.0.1` |
| `DSTACK_SERVER_LOG_LEVEL` | Has the same effect as `--log-level` | `WARNING` |
| `DSTACK_SERVER_PORT` | Has the same effect as `--port` | `3000` |
| `DSTACK_SERVER_ROOT_LOG_LEVEL` | Sets root logger log level | `ERROR` |
| `DSTACK_SERVER_UVICORN_LOG_LEVEL` | Sets uvicorn logger log level | `ERROR` |

[//]: # (DSTACK_SERVER_ENVIRONMENT, DSTACK_SERVER_CONFIG_DISABLED, DSTACK_SENTRY_DSN, DSTACK_SENTRY_TRACES_SAMPLE_RATE, DSTACK_SERVER_BUCKET_REGION, DSTACK_SERVER_BUCKET, DSTACK_ALEMBIC_MIGRATIONS_LOCATION)

## dstack init
### dstack init

This command initializes the current folder as a repo.

Expand All @@ -54,7 +41,7 @@ $ dstack init --help
By default, this command generates an SSH key that will be used for port forwarding and SSH access to running workloads.
You can override this key via `--ssh-identity`.

## dstack run
### dstack run

This command runs a given configuration.

Expand All @@ -68,11 +55,11 @@ $ dstack run . --help
</div>

??? info ".gitignore"
When running anything via CLI, `dstack` uses the exact version of code from your project directory.
When running anything via CLI, `dstack` uses the exact version of code from your project directory.

If there are large files, consider creating a `.gitignore` file to exclude them for better performance.

## dstack ps
### dstack ps

This command shows the status of runs.

Expand All @@ -85,7 +72,7 @@ $ dstack ps --help

</div>

## dstack stop
### dstack stop

This command stops run(s) within the current repository.

Expand All @@ -98,7 +85,7 @@ $ dstack stop --help

</div>

## dstack logs
### dstack logs

This command shows the output of a given run within the current repository.

Expand All @@ -111,12 +98,12 @@ $ dstack logs --help

</div>

## dstack config
### dstack config

Both the CLI and API need to be configured with the server address, user token, and project name
via `~/.dstack/config.yml`.
Both the CLI and API need to be configured with the server address, user token, and project name
via `~/.dstack/config.yml`.

At startup, the server automatically configures CLI and API with the server address, user token, and
At startup, the server automatically configures CLI and API with the server address, user token, and
the default project name (`main`). This configuration is stored via `~/.dstack/config.yml`.

To use CLI and API on different machines or projects, use the `dstack config` command.
Expand All @@ -130,7 +117,7 @@ $ dstack config --help

</div>

## dstack gateway
### dstack gateway

A gateway is required for running services.

Expand Down Expand Up @@ -187,8 +174,19 @@ $ dstack gateway update --help
</div>

## Environment variables
| Name | Description | Default |
|------------------------|------------------------------------|------------|
| `DSTACK_CLI_LOG_LEVEL` | Configures CLI logging level | `CRITICAL` |
| `DSTACK_PROFILE` | Has the same effect as `--profile` | `None` |
| `DSTACK_PROJECT` | Has the same effect as `--project` | `None` |

| Name | Description | Default |
|-----------------------------------|-----------------------------------------------|--------------------|
| `DSTACK_CLI_LOG_LEVEL` | Configures CLI logging level | `CRITICAL` |
| `DSTACK_PROFILE` | Has the same effect as `--profile` | `None` |
| `DSTACK_PROJECT` | Has the same effect as `--project` | `None` |
| `DSTACK_DEFAULT_CREDS_DISABLED` | Disables default credentials detection if set | `None` |
| `DSTACK_LOCAL_BACKEND_ENABLED` | Enables local backend for debug if set | `None` |
| `DSTACK_RUNNER_VERSION` | Sets exact runner version for debug | `latest` |
| `DSTACK_SERVER_ADMIN_TOKEN` | Has the same effect as `--token` | `None` |
| `DSTACK_SERVER_DIR` | Sets path to store data and server configs | `~/.dstack/server` |
| `DSTACK_SERVER_HOST` | Has the same effect as `--host` | `127.0.0.1` |
| `DSTACK_SERVER_LOG_LEVEL` | Has the same effect as `--log-level` | `WARNING` |
| `DSTACK_SERVER_PORT` | Has the same effect as `--port` | `3000` |
| `DSTACK_SERVER_ROOT_LOG_LEVEL` | Sets root logger log level | `ERROR` |
| `DSTACK_SERVER_UVICORN_LOG_LEVEL` | Sets uvicorn logger log level | `ERROR` |
6 changes: 6 additions & 0 deletions docs/overrides/header.html
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,12 @@
</div>
{% endif %}
<div class="md-header__buttons">
<!--<script>
function sign_in_on_click() {
window.location.href = "https://cloud.dstack.ai";
}
</script>
<a href="javascript:void(0)" class="md-button md-button-secondary" onclick="sign_in_on_click()">Sign in</a>-->
<a href="#" data-tally-open="w7K17R" class="md-button md-button--primary">Sign up</a>
</div>
</nav>
Expand Down
Loading

0 comments on commit 34eedc9

Please sign in to comment.