roachprod: fix issue with gcloud not able to handle concurrency
The gcloud commands that update disk labels are run concurrently. However, gcloud could not handle that level of concurrency when we tried to run a 150-node cluster with 4 disks per node.
So, instead of running the command concurrently for every disk on a node, this change moves the loop over the persistent disks inside a single goroutine, so that only 2 disks per node are updated concurrently (1 boot disk and 1 PD).

Epic: None
Release note: None
nameisbhaskar committed Oct 28, 2024
1 parent 1b5c419 commit 8be1984
Showing 1 changed file with 8 additions and 5 deletions.
13 changes: 8 additions & 5 deletions pkg/roachprod/vm/gce/gcloud.go
@@ -2323,8 +2323,11 @@ func propagateDiskLabels(
 		if !useLocalSSD {
 			// The persistent disks are already created. The disks are suffixed with an offset
 			// which starts from 1. A total of "pdVolumeCount" disks are created.
-			for offset := 1; offset <= pdVolumeCount; offset++ {
-				g.Go(func() error {
+			g.Go(func() error {
+				// the loop is run inside the go-routine to ensure that we do not run all the gcloud commands.
+				// For a 150 node with 4 disks, we have seen that the gcloud command cannot handle so many concurrent
+				// commands.
+				for offset := 1; offset <= pdVolumeCount; offset++ {
 					persistentDiskArgs := append([]string(nil), argsPrefix...)
 					persistentDiskArgs = append(persistentDiskArgs, zoneArg...)
 					// N.B. additional persistent disks are suffixed with the offset, starting at 1.
@@ -2335,9 +2338,9 @@ func propagateDiskLabels(
 					if err != nil {
 						return errors.Wrapf(err, "Command: gcloud %s\nOutput: %s", persistentDiskArgs, output)
 					}
-					return nil
-				})
-			}
+				}
+				return nil
+			})
 		}
 	}
 }
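
The effect of moving the loop inside the goroutine can be illustrated with a small, self-contained sketch. It is not the roachprod code: `tracker`, `updateLabel`, and `labelDisks` are hypothetical names, the gcloud invocation is replaced by a short sleep, and the standard library's `sync.WaitGroup` stands in for the `g.Go` group used in the diff (so error propagation is omitted). The sketch only counts how many simulated commands are in flight at once under each strategy.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tracker records how many simulated commands run at the same time.
// It is a hypothetical helper, not part of roachprod.
type tracker struct {
	mu       sync.Mutex
	inFlight int
	peak     int
	total    int
}

// updateLabel stands in for one gcloud label update on one disk.
func (t *tracker) updateLabel() {
	t.mu.Lock()
	t.inFlight++
	t.total++
	if t.inFlight > t.peak {
		t.peak = t.inFlight
	}
	t.mu.Unlock()

	time.Sleep(time.Millisecond) // assumed stand-in for command latency

	t.mu.Lock()
	t.inFlight--
	t.mu.Unlock()
}

// labelDisks simulates labeling disksPerNode persistent disks on each node.
// With perDisk set, every disk gets its own goroutine (the old shape of the
// loop). Otherwise each node gets one goroutine that walks its disks in
// order (the new shape).
func labelDisks(nodes, disksPerNode int, perDisk bool) (peak, total int) {
	var t tracker
	var wg sync.WaitGroup
	for n := 0; n < nodes; n++ {
		if perDisk {
			for d := 1; d <= disksPerNode; d++ {
				wg.Add(1)
				go func() {
					defer wg.Done()
					t.updateLabel()
				}()
			}
			continue
		}
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Sequential within a node: at most one command per node in flight.
			for d := 1; d <= disksPerNode; d++ {
				t.updateLabel()
			}
		}()
	}
	wg.Wait()
	return t.peak, t.total
}

func main() {
	for _, perDisk := range []bool{true, false} {
		peak, total := labelDisks(150, 4, perDisk)
		fmt.Printf("perDisk=%v commands=%d peak in flight=%d\n", perDisk, total, peak)
	}
}
```

Both strategies issue the same number of commands, but with one goroutine per node the peak can never exceed the number of nodes, whereas one goroutine per disk allows up to nodes × disks commands at once. The exact peak observed in a given run depends on scheduling.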