Merge #133533
133533: roachprod: fix issue with gcloud not able to handle concurrency r=shailendra-patel,vidit-bhat a=nameisbhaskar

The `gcloud` commands that update disk labels run concurrently. However, `gcloud` could not handle that level of concurrency when we ran a cluster of 150 nodes with 4 disks each. Instead of running the command concurrently for every disk on a node, this change runs at most 2 commands per node concurrently: one for the boot disk and one that updates the persistent disks (PDs) sequentially.

Epic: None
Release note: None

Co-authored-by: Bhaskarjyoti Bora <[email protected]>
craig[bot] and nameisbhaskar committed Oct 28, 2024
2 parents 93619b8 + 8be1984 commit eb6778b
Showing 1 changed file with 8 additions and 5 deletions.
13 changes: 8 additions & 5 deletions pkg/roachprod/vm/gce/gcloud.go
@@ -2323,8 +2323,11 @@ func propagateDiskLabels(
 		if !useLocalSSD {
 			// The persistent disks are already created. The disks are suffixed with an offset
 			// which starts from 1. A total of "pdVolumeCount" disks are created.
-			for offset := 1; offset <= pdVolumeCount; offset++ {
-				g.Go(func() error {
+			g.Go(func() error {
+				// the loop is run inside the go-routine to ensure that we do not run all the gcloud commands.
+				// For a 150 node with 4 disks, we have seen that the gcloud command cannot handle so many concurrent
+				// commands.
+				for offset := 1; offset <= pdVolumeCount; offset++ {
 					persistentDiskArgs := append([]string(nil), argsPrefix...)
 					persistentDiskArgs = append(persistentDiskArgs, zoneArg...)
 					// N.B. additional persistent disks are suffixed with the offset, starting at 1.
@@ -2335,9 +2338,9 @@ func propagateDiskLabels(
 					if err != nil {
 						return errors.Wrapf(err, "Command: gcloud %s\nOutput: %s", persistentDiskArgs, output)
 					}
-					return nil
-				})
-			}
+				}
+				return nil
+			})
 		}
 	}
 }