-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add instrumentation #1001
base: main
Are you sure you want to change the base?
Add instrumentation #1001
Conversation
conn, err := network.ConnectToHost(rb.ctx, address, timeout, rb.tailscale, rb.tsConfig) | ||
if err != nil { | ||
return nil, err | ||
} | ||
// Use goroutine to avoid potential blocking from rate limiter's internal mutex | ||
go metrics.RecordDialTime(time.Since(start)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bit confused on this, why would it block? the rate limiter doesn't just return if its not "allowed". I could see that other weird issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a mutex inside the rate limiter to acquire the count. Its likely a small and quick acquisition, but it could still block
pkg/gateway/gateway.go
Outdated
@@ -68,7 +68,7 @@ type Gateway struct { | |||
WorkerPoolRepo repository.WorkerPoolRepository | |||
EventRepo repository.EventRepository | |||
Tailscale *network.Tailscale | |||
metricsRepo repository.MetricsRepository | |||
metricsRepo repository.UsageRepository |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be called usageRepo here?
pkg/scheduler/scheduler_test.go
Outdated
@@ -66,7 +66,7 @@ func NewSchedulerForTest() (*Scheduler, error) { | |||
workerPoolManager: workerPoolManager, | |||
requestBacklog: requestBacklog, | |||
containerRepo: containerRepo, | |||
schedulerMetrics: schedulerMetrics, | |||
schedulerUsage: schedulerMetrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
names should match
pkg/worker/image.go
Outdated
} | ||
} | ||
log.Info().Str("local_cache_path", localCachePath).Msg("local cache path") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this log anymore? this was for debugging right
pkg/worker/image.go
Outdated
@@ -293,7 +300,14 @@ func (c *ImageClient) BuildAndArchiveImage(ctx context.Context, outputLogger *sl | |||
if err != nil { | |||
return err | |||
} | |||
ociImageInfo, err := os.Stat(ociPath) | |||
if err != nil { | |||
log.Warn().Err(err).Msg("unable to inspect image size") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this fails, won't the following code not work? float64(ociImageInfo.Size()) / 1024 / 1024
or will it just report 0MB?
pkg/worker/image.go
Outdated
log.Warn().Err(err).Msg("unable to inspect image size") | ||
} | ||
bundleMB := float64(bundleInfo.Size()) / 1024 / 1024 | ||
metrics.RecordImageUnpackSpeed(bundleMB, time.Since(startTime)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
pkg/worker/image.go
Outdated
|
||
outputLogger.Info("Unpacking image...\n") | ||
tmpBundlePath := filepath.Join(baseTmpBundlePath, request.ImageId) | ||
err = c.unpack(ctx, baseImage.Repo, baseImage.Tag, tmpBundlePath) | ||
err = c.unpack(ctx, baseImage.Repo, baseImage.Tag, tmpBundlePath, imageMB) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe call this imageSizeMB
pkg/worker/image.go
Outdated
} | ||
|
||
func (c *ImageClient) unpack(ctx context.Context, baseImageName string, baseImageTag string, bundlePath string) error { | ||
func (c *ImageClient) unpack(ctx context.Context, baseImageName string, baseImageTag string, bundlePath string, imageMB float64) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above. although its kinda weird that we should even have to pass this in here, feels wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe would push this out
pkg/worker/worker.go
Outdated
@@ -56,7 +56,7 @@ type Worker struct { | |||
containerLock sync.Mutex | |||
containerWg sync.WaitGroup | |||
containerLogger *ContainerLogger | |||
workerMetrics *WorkerMetrics | |||
workerMetrics *WorkerUsage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
workerUsageMetrics? or workerUsage? I think in general all of these places where we have the repo called usage etc, it might still make sense to call it UsageMetrics. Usage alone isn't really clear imo
Resolve BE-2377