Merge pull request #15 from peiniliu/dev
Add GPU Numbers Support
Thor-wl authored Jul 18, 2022
2 parents b65d659 + 64dbad0 commit 820bcae
Showing 15 changed files with 627 additions and 71 deletions.
42 changes: 42 additions & 0 deletions doc/config.md
@@ -0,0 +1,42 @@
## Configure the Volcano device plugin binary

The Volcano device plugin exposes several options. Each option can be set as a command line flag, as an environment variable, or through a config file when the device plugin is launched. The following sections explain these options.

### As command line flags or envvars

| Flag | Envvar | Default Value |
|--------------------------|-------------------------|-----------------|
| `--gpu-strategy` | `$GPU_STRATEGY` | `"share"` |
| `--config-file` | `$CONFIG_FILE` | `""` |

When deploying `volcano-device-plugin.yml`, users can set these options by adding `args` to the `volcano-device-plugin` container.
For example:
- `args: ["--gpu-strategy=number"]` makes the device plugin use the GPU number strategy.

### As a configuration file
```
version: v1beta1
flags:
  GPUStrategy: "number"
```

### Configuration Option Details
**`GPU_STRATEGY`**:
the desired strategy for exposing GPU devices

`[number | share] (default 'share')`

The `GPU_STRATEGY` option configures the daemonset to expose GPU devices
either individually (`number`) or in sharing mode (`share`). More information on
these strategies and how to use them in Volcano can be found in the Volcano scheduler documentation.
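
The two accepted values can be checked before the plugin starts. The sketch below is only an illustration: `validateStrategy` and the two constants are hypothetical names, and the diff in this commit does not show any such validation step.

```go
package main

import (
	"fmt"
	"strings"
)

// Strategy names taken from the option description above.
const (
	StrategyShare  = "share"
	StrategyNumber = "number"
)

// validateStrategy is a hypothetical helper (not part of the plugin).
// It trims and lowercases the input, falls back to "share" when the
// input is empty, and rejects anything other than "share" or "number".
func validateStrategy(s string) (string, error) {
	v := strings.ToLower(strings.TrimSpace(s))
	switch v {
	case "":
		return StrategyShare, nil
	case StrategyShare, StrategyNumber:
		return v, nil
	default:
		return "", fmt.Errorf("unknown GPU strategy %q, expected %q or %q",
			s, StrategyNumber, StrategyShare)
	}
}

func main() {
	for _, in := range []string{"", "number", "share", "mig"} {
		v, err := validateStrategy(in)
		fmt.Printf("%q -> %q, err=%v\n", in, v, err)
	}
}
```

Rejecting unknown values early would surface a typo in `GPU_STRATEGY` as a startup error rather than as unexpected scheduling behavior later.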

**`CONFIG_FILE`**:
point the plugin at a configuration file instead of relying on command line
flags or environment variables

`(default '')`

The order of precedence for setting each option is (1) command line flag, (2)
environment variable, (3) configuration file. In this way, one could use a
pre-defined configuration file, but then override the values set in it at
launch time.
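
The precedence rule above can be sketched with the standard library alone. This is a simplified model, not the plugin's code: the `source` type and the `resolve` function are hypothetical, and each source is assumed to report whether it was explicitly set.

```go
package main

import "fmt"

// source is a hypothetical description of one place an option can come from.
type source struct {
	name  string // "flag", "env" or "file"
	value string
	set   bool // true only if the user explicitly provided a value
}

// resolve returns the value of the first source that was set, in the
// order given, together with the name of that source. If no source
// was set, it returns the default value.
func resolve(def string, sources ...source) (string, string) {
	for _, s := range sources {
		if s.set {
			return s.value, s.name
		}
	}
	return def, "default"
}

func main() {
	// Simulated launch: no command line flag, GPU_STRATEGY=number in the
	// environment, and GPUStrategy: "share" in the config file.
	value, from := resolve("share",
		source{name: "flag"},
		source{name: "env", value: "number", set: true},
		source{name: "file", value: "share", set: true},
	)
	fmt.Printf("gpu-strategy=%s (from %s)\n", value, from) // prints "gpu-strategy=number (from env)"
}
```

The key design point is the `set` field: without a way to tell "explicitly set" apart from "left at its default", a default flag value would always mask the config file.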
2 changes: 1 addition & 1 deletion docker/amd64/Dockerfile.centos7
@@ -20,7 +20,7 @@ RUN yum install -y \
wget && \
rm -rf /var/cache/yum/*

-ENV GOLANG_VERSION 1.14.4
+ENV GOLANG_VERSION 1.17.6
RUN wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-amd64.tar.gz \
| tar -C /usr/local -xz
ENV GOPATH /go
2 changes: 1 addition & 1 deletion docker/amd64/Dockerfile.ubuntu16.04
@@ -20,7 +20,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
wget && \
rm -rf /var/lib/apt/lists/*

-ENV GOLANG_VERSION 1.14.4
+ENV GOLANG_VERSION 1.17.6
RUN wget -nv -O - https://storage.googleapis.com/golang/go${GOLANG_VERSION}.linux-amd64.tar.gz \
| tar -C /usr/local -xz
ENV GOPATH /go
3 changes: 3 additions & 0 deletions go.mod
@@ -31,6 +31,8 @@ require (
github.com/NVIDIA/gpu-monitoring-tools v0.0.0-20200421213100-de959f43b55a
github.com/fsnotify/fsnotify v1.4.9
github.com/prometheus/common v0.4.1
github.com/stretchr/testify v1.5.1
github.com/urfave/cli/v2 v2.4.0
golang.org/x/net v0.0.0-20200421231249-e086a090c8fd // indirect
google.golang.org/grpc v1.29.0
k8s.io/api v0.18.2
Expand All @@ -39,4 +41,5 @@ require (
k8s.io/klog v1.0.0
k8s.io/kubelet v0.0.0
k8s.io/kubernetes v1.18.2
sigs.k8s.io/yaml v1.2.0
)
11 changes: 10 additions & 1 deletion go.sum
@@ -87,8 +87,11 @@ github.com/coreos/go-systemd v0.0.0-20181012123002-c6f51f82210d/go.mod h1:F5haX7
github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e/go.mod h1:F5haX7vjVVG0kc13fIWeqUViNPyEJxv/OmvnBo0Yme4=
github.com/coreos/pkg v0.0.0-20160727233714-3ac0863d7acf/go.mod h1:E3G3o1h8I7cfcXa63jLwjI0eiQQMgzzUDFVpN/nH/eA=
github.com/coreos/pkg v0.0.0-20180108230652-97fdf19511ea/go.mod h1:E3G3o1h8I7cfcXa63jLwjI0eiQQMgzzUDFVpN/nH/eA=
github.com/cpuguy83/go-md2man v1.0.10 h1:BSKMNlYxDvnunlTymqtgONjNnaRV1sTpcovwwjF22jk=
github.com/cpuguy83/go-md2man v1.0.10/go.mod h1:SmD6nW6nTyfqj6ABTjUi3V3JVMnlJmwcJI5acqYI6dE=
github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/cpuguy83/go-md2man/v2 v2.0.1 h1:r/myEWzV9lfsM1tFLgDyu0atFtJ1fXn261LKYj/3DxU=
github.com/cpuguy83/go-md2man/v2 v2.0.1/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/creack/pty v1.1.7/go.mod h1:lj5s0c3V2DBrqTV7llrYr5NG6My20zk30Fl46Y7DoTY=
github.com/cyphar/filepath-securejoin v0.2.2/go.mod h1:FpkQEhXnPnOthhzymB7CGsFk2G9VLXONKD9G7QGMM+4=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
@@ -392,7 +395,6 @@ github.com/pelletier/go-toml v1.1.0/go.mod h1:5z9KED0ma1S8pY6P1sdut58dfprrGBbd/9
github.com/pelletier/go-toml v1.2.0/go.mod h1:5z9KED0ma1S8pY6P1sdut58dfprrGBbd/94hg7ilaic=
github.com/peterbourgon/diskv v2.0.1+incompatible/go.mod h1:uqqh8zWWbv1HBMNONnaR/tNboyR3/BZd58JJSHlUSCU=
github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/errors v0.8.1 h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
Expand All @@ -417,8 +419,11 @@ github.com/rogpeppe/go-internal v1.1.0/go.mod h1:M8bDsm7K2OlrFYOpmOWEs/qY81heoFR
github.com/rogpeppe/go-internal v1.3.0/go.mod h1:M8bDsm7K2OlrFYOpmOWEs/qY81heoFRclV5y23lUDJ4=
github.com/rubiojr/go-vhd v0.0.0-20160810183302-0bfd3b39853c/go.mod h1:DM5xW0nvfNNm2uytzsvhI3OnX8uzaRAg8UX/CnDqbto=
github.com/russross/blackfriday v0.0.0-20170610170232-067529f716f4/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR/rfWxYHBV53g=
github.com/russross/blackfriday v1.5.2 h1:HyvC0ARfnZBqnXwABFeSZHpKvJHJJfPz81GNueLj0oo=
github.com/russross/blackfriday v1.5.2/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR/rfWxYHBV53g=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/ryanuber/go-glob v0.0.0-20170128012129-256dc444b735/go.mod h1:807d1WSdnB0XRJzKNil9Om6lcp/3a0v4qIHxIXzX/Yc=
github.com/satori/go.uuid v1.2.0/go.mod h1:dA0hQrYB0VpLJoorglMZABFdXlWrHn1NEOzdhQKdks0=
github.com/seccomp/libseccomp-golang v0.9.1/go.mod h1:GbW5+tmTXfcxTToHLXlScSlAvWlF4P2Ca7zGrPiEpWo=
@@ -459,6 +464,7 @@ github.com/spf13/viper v1.3.2/go.mod h1:ZiWeW+zYFKm7srdB9IoDzzZXaJaI5eL9QjNiN/DM
github.com/storageos/go-api v0.0.0-20180912212459-343b3eff91fc/go.mod h1:ZrLn+e0ZuF3Y65PNF6dIwbJPZqfmtCXxFm9ckv0agOY=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.1.1/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.2.0 h1:Hbg2NidpLE8veEBkEZTL3CvlkUIVzuU9jDplZO54c48=
github.com/stretchr/objx v0.2.0/go.mod h1:qt09Ya8vawLte6SNmTgCsAVtYtaKzEcn8ATUoHMkEqE=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
Expand All @@ -474,8 +480,11 @@ github.com/tmc/grpc-websocket-proxy v0.0.0-20170815181823-89b8d40f7ca8/go.mod h1
github.com/ugorji/go/codec v0.0.0-20181204163529-d75b2dcb6bc8/go.mod h1:VFNgLljTbGfSG7qAOspJ7OScBnGdDN/yBr0sguwnwf0=
github.com/ultraware/funlen v0.0.1/go.mod h1:Dp4UiAus7Wdb9KUZsYWZEWiRzGuM2kXM1lPbfaF6xhA=
github.com/ultraware/funlen v0.0.2/go.mod h1:Dp4UiAus7Wdb9KUZsYWZEWiRzGuM2kXM1lPbfaF6xhA=
github.com/urfave/cli v1.20.0 h1:fDqGv3UG/4jbVl/QkFwEdddtEDjh/5Ov6X+0B/3bPaw=
github.com/urfave/cli v1.20.0/go.mod h1:70zkFmudgCuE/ngEzBv17Jvp/497gISqfk5gWijbERA=
github.com/urfave/cli/v2 v2.2.0/go.mod h1:SE9GqnLQmjVa0iPEY0f1w3ygNIYcIJ0OKPMoW2caLfQ=
github.com/urfave/cli/v2 v2.4.0 h1:m2pxjjDFgDxSPtO8WSdbndj17Wu2y8vOT86wE/tjr+I=
github.com/urfave/cli/v2 v2.4.0/go.mod h1:NX9W0zmTvedE5oDoOMs2RTC8RvdK98NTYZE5LbaEYPg=
github.com/urfave/negroni v1.0.0/go.mod h1:Meg73S6kFm/4PpbYdq35yYWoCZ9mS/YSx+lKnmiohz4=
github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyCJ6HpOuEn7z0Csc=
github.com/valyala/fasthttp v1.2.0/go.mod h1:4vX61m6KN+xDduDNwXrhIAVZaZaZiQ1luJk8LWSxF3s=
74 changes: 67 additions & 7 deletions main.go
@@ -17,38 +17,98 @@
package main

import (
"flag"
"encoding/json"
"fmt"
"log"
"os"
"os/signal"
"syscall"

"github.com/fsnotify/fsnotify"
cli "github.com/urfave/cli/v2"
pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"

apis "volcano.sh/k8s-device-plugin/pkg/apis"
"volcano.sh/k8s-device-plugin/pkg/filewatcher"
"volcano.sh/k8s-device-plugin/pkg/plugin"
"volcano.sh/k8s-device-plugin/pkg/plugin/nvidia"
)

func getAllPlugins() []plugin.DevicePlugin {
return []plugin.DevicePlugin{
nvidia.NewNvidiaDevicePlugin(),
func loadConfig(c *cli.Context, flags []cli.Flag) (*apis.Config, error) {
config, err := apis.NewConfig(c, flags)
if err != nil {
return nil, fmt.Errorf("unable to finalize config: %v", err)
}
return config, nil
}

func getAllPlugins(c *cli.Context, flags []cli.Flag) ([]plugin.DevicePlugin, error) {
// Load the configuration file
log.Println("Loading configuration.")
config, err := loadConfig(c, flags)
if err != nil {
return nil, fmt.Errorf("unable to load config: %v", err)
}

// Print the config to the output.
configJSON, err := json.MarshalIndent(config, "", " ")
if err != nil {
return nil, fmt.Errorf("failed to marshal config to JSON: %v", err)
}
log.Printf("\nRunning with config:\n%v", string(configJSON))

return []plugin.DevicePlugin{
nvidia.NewNvidiaDevicePlugin(config),
}, nil
}

var version string

func main() {
flag.Parse()
var configFile string

log.Println("Starting file watcher.")
c := cli.NewApp()
c.Version = version
c.Action = func(ctx *cli.Context) error {
return start(ctx, c.Flags)
}

c.Flags = []cli.Flag{
&cli.StringFlag{
Name: "gpu-strategy",
Value: "share",
Usage: "the strategy for exposing GPU devices: 'share' (default) exposes shared GPU devices, 'number' exposes GPUs individually [number | share]",
EnvVars: []string{"GPU_STRATEGY"},
},
&cli.StringFlag{
Name: "config-file",
Usage: "the path to a config file as an alternative to command line options or environment variables",
Destination: &configFile,
EnvVars: []string{"CONFIG_FILE"},
},
}

err := c.Run(os.Args)
if err != nil {
log.SetOutput(os.Stderr)
log.Printf("Error: %v", err)
os.Exit(1)
}
}

func start(c *cli.Context, flags []cli.Flag) error {
watcher, err := filewatcher.NewFileWatcher(pluginapi.DevicePluginPath)
if err != nil {
log.Printf("Failed to create file watcher: %v", err)
os.Exit(1)
}

log.Println("Retrieving plugins.")
plugins := getAllPlugins()
plugins, err := getAllPlugins(c, flags)
if err != nil {
log.Printf("Failed to retrieve plugins: %v", err)
os.Exit(1)
}

log.Println("Starting OS signal watcher.")
sigCh := make(chan os.Signal, 1)
102 changes: 102 additions & 0 deletions pkg/apis/config.go
@@ -0,0 +1,102 @@
/*
Copyright 2022 The Volcano Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package apis

import (
"fmt"
"io"
"log"
"os"

cli "github.com/urfave/cli/v2"
"sigs.k8s.io/yaml"
)

// Version indicates the version of the 'Config' struct used to hold configuration information.
const Version = "v1beta1"

// Config is a versioned struct used to hold configuration information.
type Config struct {
Version string `json:"version" yaml:"version"`
Flags Flags `json:"flags,omitempty" yaml:"flags,omitempty"`
}

// NewConfig builds out a Config struct from a config file (or command line flags).
// The data stored in the config will be populated in order of precedence from
// (1) command line, (2) environment variable, (3) config file.
func NewConfig(c *cli.Context, flags []cli.Flag) (*Config, error) {
config := &Config{
Version: Version,
}

log.Printf("Requested gpu-strategy: %s", c.String("gpu-strategy"))

configFile := c.String("config-file")
if configFile != "" {
var err error
config, err = parseConfig(configFile)
if err != nil {
return nil, fmt.Errorf("unable to parse config file: %v", err)
}
}

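// Note: this assignment replaces any flags read from the config file with the
// values resolved by the CLI library (command line, environment variable, or default).
config.Flags.CommandLineFlags = NewCommandLineFlags(c)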

return config, nil
}

// parseConfig parses a config file as either YAML or JSON and unmarshals it into a Config struct.
func parseConfig(configFile string) (*Config, error) {
reader, err := os.Open(configFile)
if err != nil {
return nil, fmt.Errorf("error opening config file: %v", err)
}
defer reader.Close()

config, err := parseConfigFrom(reader)
if err != nil {
return nil, fmt.Errorf("error parsing config file: %v", err)
}

return config, nil
}

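// parseConfigFrom reads all data from reader, unmarshals it into a Config struct,
// defaults an empty version to Version, and rejects any other version.
func parseConfigFrom(reader io.Reader) (*Config, error) {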
var err error
var configYaml []byte

configYaml, err = io.ReadAll(reader)
if err != nil {
return nil, fmt.Errorf("read error: %v", err)
}

var config Config
err = yaml.Unmarshal(configYaml, &config)
if err != nil {
return nil, fmt.Errorf("unmarshal error: %v", err)
}

if config.Version == "" {
config.Version = Version
}

if config.Version != Version {
return nil, fmt.Errorf("unknown version: %v", config.Version)
}

return &config, nil
}
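
The version handling in `parseConfigFrom` (default an empty version, reject any other) can be exercised without a file on disk by passing an in-memory reader. The sketch below is a stdlib-only analogue and makes two assumptions: it uses `encoding/json` in place of `sigs.k8s.io/yaml`, so it only accepts JSON input, and `config`, `parseVersioned` and `parseString` are hypothetical stand-ins rather than names from this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// version mirrors the Version constant defined in pkg/apis/config.go.
const version = "v1beta1"

// config is a trimmed-down stand-in for the Config struct.
type config struct {
	Version string `json:"version"`
}

// parseVersioned reads JSON from r, defaults an empty version to the
// current one, and rejects every other version string.
func parseVersioned(r io.Reader) (*config, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return nil, fmt.Errorf("read error: %v", err)
	}
	var c config
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, fmt.Errorf("unmarshal error: %v", err)
	}
	if c.Version == "" {
		c.Version = version
	}
	if c.Version != version {
		return nil, fmt.Errorf("unknown version: %v", c.Version)
	}
	return &c, nil
}

// parseString is a small convenience wrapper for in-memory input.
func parseString(s string) (*config, error) {
	return parseVersioned(strings.NewReader(s))
}

func main() {
	for _, in := range []string{`{}`, `{"version":"v1beta1"}`, `{"version":"v1"}`} {
		c, err := parseString(in)
		fmt.Printf("input=%s config=%v err=%v\n", in, c, err)
	}
}
```

Taking an `io.Reader` rather than a path is what makes this kind of check possible in a unit test.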
37 changes: 37 additions & 0 deletions pkg/apis/flags.go
@@ -0,0 +1,37 @@
/*
Copyright 2022 The Volcano Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package apis

import (
cli "github.com/urfave/cli/v2"
)

// Flags holds the full list of flags used to configure the device plugin and GFD.
type Flags struct {
*CommandLineFlags
}

// CommandLineFlags holds the list of command line flags used to configure the device plugin and GFD.
type CommandLineFlags struct {
GPUStrategy string `json:"GPUStrategy" yaml:"GPUStrategy"`
}

// NewCommandLineFlags builds a CommandLineFlags struct from the values held in the cli.Context.
func NewCommandLineFlags(c *cli.Context) *CommandLineFlags {
return &CommandLineFlags{
GPUStrategy: c.String("gpu-strategy"),
}
}
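
Because `Flags` embeds `*CommandLineFlags`, the `GPUStrategy` key sits directly under `flags` in the serialized config. The sketch below illustrates this with `encoding/json` only. The struct definitions are copied in simplified form from the files above (YAML tags dropped), and `decode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified copies of the types defined in pkg/apis.
type CommandLineFlags struct {
	GPUStrategy string `json:"GPUStrategy"`
}

type Flags struct {
	*CommandLineFlags
}

type Config struct {
	Version string `json:"version"`
	Flags   Flags  `json:"flags,omitempty"`
}

// decode unmarshals a JSON document into a Config.
func decode(data string) (*Config, error) {
	var c Config
	if err := json.Unmarshal([]byte(data), &c); err != nil {
		return nil, err
	}
	return &c, nil
}

func main() {
	c, err := decode(`{"version":"v1beta1","flags":{"GPUStrategy":"number"}}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The promoted field is reachable without naming the embedded struct.
	fmt.Println(c.Flags.GPUStrategy)

	// Same shape on the way out, as in the "Running with config" log line.
	out, err := json.MarshalIndent(c, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

One caveat of the embedded pointer: when the input has no `flags` object, `Flags.CommandLineFlags` stays `nil`, and reading `c.Flags.GPUStrategy` would panic, so callers should make sure the pointer is populated first.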