Support gRPC probe #14134

Merged
merged 18 commits into from
Sep 21, 2023

Conversation

seongpyoHong
Contributor

Fixes #12442

Proposed Changes

Currently, an exec probe that runs grpc-health-probe is the only way to check the health of a gRPC service.

Add the gRPC probe type that Kubernetes supports from version 1.24, to make probing gRPC services more convenient.

Release Note

Support gRPC probe.

@knative-prow

knative-prow bot commented Jun 29, 2023

Welcome @seongpyoHong! It looks like this is your first PR to knative/serving 🎉

@knative-prow

knative-prow bot commented Jun 29, 2023

Hi @seongpyoHong. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow knative-prow bot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. area/API API objects and controllers labels Jun 29, 2023
@knative-prow knative-prow bot added area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features labels Jun 29, 2023
@nak3 nak3 added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 29, 2023
@seongpyoHong
Contributor Author

seongpyoHong commented Jul 4, 2023

@evankanderson @mgencur @psschwei (I tagged you because you were designated as a reviewer.)

Could you please review this PR?

@psschwei
Contributor

psschwei commented Jul 5, 2023

cc @dprotaso @skonto

Member

@dprotaso dprotaso left a comment

We should also add some unit tests and e2e tests

client := grpchealth.NewHealthClient(conn)

resp, err := client.Check(metadata.NewOutgoingContext(ctx, make(metadata.MD)), &grpchealth.HealthCheckRequest{
	Service: ptr.ToString(config.Service),
Member

Don't need to pull in a new pkg - we have https://github.com/knative/pkg/blob/main/ptr/ptr.go#L53

},
}

v := version.Get()
Member

This doesn't work - this package is from the k8s code and assumes the binary is built with certain linker flags.

I'd look at what we're doing with the HTTPProbe
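The objection above is that the Kubernetes version package only reports real values when they are injected at link time, so a binary built without those flags sees placeholders. The commit list further down mentions using the config's KubeMajor and KubeMinor instead. A minimal sketch of that idea follows, assuming the version is only needed to build a kubelet-style `kube-probe/<major>.<minor>` user agent. The thread does not say what the value is used for, and `probeConfig` and `userAgent` are hypothetical names, not the PR's code.

```go
package main

import "fmt"

// probeConfig is a hypothetical stand-in for the probe configuration.
// KubeMajor and KubeMinor mirror the field names mentioned in the commit list.
type probeConfig struct {
	KubeMajor string
	KubeMinor string
}

// userAgent builds a kubelet-style user agent from values passed in via
// configuration, rather than from variables set with linker flags.
// Assumption: the "kube-probe/<major>.<minor>" format is what is wanted here.
func userAgent(c probeConfig) string {
	return fmt.Sprintf("kube-probe/%s.%s", c.KubeMajor, c.KubeMinor)
}

func main() {
	// prints "kube-probe/1.27"
	fmt.Println(userAgent(probeConfig{KubeMajor: "1", KubeMinor: "27"}))
}
```

Passing the version in explicitly keeps the probe code independent of how the binary was built.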

if err == context.DeadlineExceeded {
	return fmt.Errorf("timeout: failed to connect service %q within %v", addr, config.Timeout)
} else {
	return fmt.Errorf("error: failed to connect service at %q", addr)
Member

We should wrap the error in this message, e.g. use %w

		return fmt.Errorf("timeout: health rpc did not complete within %v", config.Timeout)
	}
}
return fmt.Errorf("error: health rpc probe failed: %+v", err)
Member

likewise wrap the error using %w

Comment on lines +225 to +236
dialer := &net.Dialer{
	Control: func(network, address string, c syscall.RawConn) error {
		return c.Control(func(fd uintptr) {
			unix.SetsockoptLinger(int(fd), syscall.SOL_SOCKET, syscall.SO_LINGER, &unix.Linger{Onoff: 1, Linger: 1})
		})
	},
}
Member

and mention why we are using these options

Contributor Author

@seongpyoHong seongpyoHong Jul 9, 2023

I have a question about it.

Is it necessary to handle the Windows container? 🤔

Member

No - we currently don't have windows support

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 9, 2023
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 9, 2023
@knative-prow knative-prow bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 9, 2023
@seongpyoHong
Contributor Author

@dprotaso
I've rebased and applied your feedback, except for the unit tests and e2e tests.

I will request a review again when the test-related work is completed. Thank you. 😄

@seongpyoHong
Contributor Author

/retest

@knative-prow knative-prow bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 10, 2023
@knative-prow knative-prow bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 10, 2023
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 14, 2023
@seongpyoHong
Contributor Author

@dprotaso

Sorry for the late response. I added 2 commits.

  • Ignore ReadinessProbe in user-container when it's a gRPC probe (4245075)
  • Add a unit test in pkg/reconciler/revision/resources/deploy_test.go and modify the test to use an ephemeral port (dd1b4f5)

@codecov

codecov bot commented Sep 15, 2023

Codecov Report

Patch coverage: 77.52% and project coverage change: -0.06% ⚠️

Comparison is base (4cb442c) 86.10% compared to head (d3e8464) 86.04%.
Report is 27 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14134      +/-   ##
==========================================
- Coverage   86.10%   86.04%   -0.06%     
==========================================
  Files         196      196              
  Lines       14787    14874      +87     
==========================================
+ Hits        12732    12799      +67     
- Misses       1749     1764      +15     
- Partials      306      311       +5     
Files Changed                                  Coverage Δ
pkg/apis/serving/v1/revision_defaults.go       95.70% <40.00%> (-1.74%) ⬇️
pkg/queue/health/probe.go                      53.84% <65.30%> (+4.67%) ⬆️
pkg/apis/serving/fieldmask.go                  94.66% <100.00%> (+0.12%) ⬆️
pkg/apis/serving/k8s_validation.go             94.36% <100.00%> (+0.03%) ⬆️
pkg/queue/readiness/probe.go                   90.62% <100.00%> (+0.88%) ⬆️
pkg/reconciler/revision/resources/deploy.go    90.13% <100.00%> (ø)
pkg/reconciler/revision/resources/queue.go     98.44% <100.00%> (+0.02%) ⬆️

... and 3 files with indirect coverage changes


@dprotaso
Member

hey @seongpyoHong thanks for the changes - looks like you have just one small lint error left

https://github.com/knative/serving/actions/runs/6236884761/job/16936938875?pr=14134#step:5:28916

@seongpyoHong
Contributor Author

Hi @dprotaso, I resolved the lint error.

How can I run the GitHub Actions lint (code check)?

@dprotaso
Member

How can I run GitHub Actions for lint(code check)?

Do you mean locally? We don't have an easy way to do it

What we do is here: https://github.com/knative/actions/blob/main/.github/workflows/reusable-helper-go-style.yaml

@dprotaso
Member

/lgtm
/approve

thanks for all the help 🎉 getting this over the line

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Sep 21, 2023
@knative-prow

knative-prow bot commented Sep 21, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, seongpyoHong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 21, 2023
@knative-prow knative-prow bot merged commit d02702e into knative:main Sep 21, 2023
63 checks passed
ReToCode pushed a commit to ReToCode/serving that referenced this pull request Jan 9, 2024
* Add gRPC probe

* Modify unit test

* Modify unit test

* Set default to grpcprobe's service field

* Use knative pkg for ptr operation

* Use config's KubeMajor & KubeMinor instead of k8s native version pkg

* Wrap error in GRPCProbe

* Add comment to explain why use dialer_others.go

* Run update scripts

* Add probe test

* Add test in readiness/probe_test.go

* update deps

* Ignore readinessProbe when it's gRPC

* Fix test to use empemeral port

* Resolve govet lint error

* Use errors.Is to compare

* Also use ephemeral port for handler test

* drop unneccesary else block

(cherry picked from commit d02702e)
openshift-merge-bot bot pushed a commit to openshift-knative/serving that referenced this pull request Jan 10, 2024
* Support gRPC probe (knative#14134)

* Add gRPC probe

* Modify unit test

* Modify unit test

* Set default to grpcprobe's service field

* Use knative pkg for ptr operation

* Use config's KubeMajor & KubeMinor instead of k8s native version pkg

* Wrap error in GRPCProbe

* Add comment to explain why use dialer_others.go

* Run update scripts

* Add probe test

* Add test in readiness/probe_test.go

* update deps

* Ignore readinessProbe when it's gRPC

* Fix test to use empemeral port

* Resolve govet lint error

* Use errors.Is to compare

* Also use ephemeral port for handler test

* drop unneccesary else block

(cherry picked from commit d02702e)

* Regenerate OCP release artifacts

---------

Co-authored-by: Seongpyo Hong <[email protected]>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Add support for GRPC probes
5 participants