
Enable QDR port 5671 to listen on both IPv4 and IPv6 #599

Merged · 4 commits · Jun 10, 2024

Conversation

@chaturvedi-kna (Contributor) commented May 26, 2024

Summary

This PR updates the Qpid Dispatch Router (QDR) configuration to allow port 5671 to listen on all local addresses, both IPv4 and IPv6. Previously, the port was configured to listen only on IPv4 addresses. With this change, QDR can support single-stack IPv6 clusters as well as dual-stack configurations.

Changes

  • Modified the Ansible task in component_qdr.yml to configure QDR to listen on all interfaces (both IPv4 and IPv6) on port 5671.
  • Updated the listener configuration to use host: '', which is the standard way to specify all interfaces for both IPv4 and IPv6 (a sketch of the resulting entry follows this list).
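
A minimal sketch of the resulting edge listener entry (the port value comes from the PR title; the rest mirrors the diff in this PR):

edgeListeners:
- expose: true
  host: ''      # empty host: bind on all local addresses, IPv4 and IPv6 (was 0.0.0.0, IPv4 only)
  port: 5671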

Testing

  • Dual-stack OCP 4.12.6: Verified that QDR is listening on both IPv4 and IPv6 addresses on port 5671.
  • Single-stack IPv6 OCP 4.14.0: Confirmed that QDR is correctly listening on IPv6 addresses.
  • Connected OpenStack clusters to verify connectivity and functionality.

Motivation and Context

This change is necessary to ensure compatibility with IPv6-only Kubernetes clusters, which are becoming more common as organizations prepare for the future of networking. Supporting both IPv4 and IPv6 ensures broader compatibility and future-proofs the service telemetry operator.

Related Issues

  • None


@csibbitt (Collaborator) left a comment
LGTM. I haven't tested it, but CI seems to be passing.

@csibbitt requested a review from vkmc May 27, 2024 16:28
@chaturvedi-kna (Contributor, Author)

Does the CI have an IPv6-only environment test case?

@chaturvedi-kna (Contributor, Author)

It looks like the change I made has an impact on the Smart Gateways (SGs) for IPv6-only clusters.
I am getting the error below:
cookie.go:209: net/http: invalid cookie.domain "fd01:0:0:1::896"; dropping domain attribute

Can someone provide insight into which component may be causing this problem? I am guessing that somewhere in the SG it is also configured to use IPv4 only for the AMQP connection.

@csibbitt (Collaborator)

> Does the CI have an IPv6-only environment test case?

No

@csibbitt (Collaborator)

> It looks like the change I made has an impact on the Smart Gateways (SGs) for IPv6-only clusters. I am getting the error below: cookie.go:209: net/http: invalid cookie.domain "fd01:0:0:1::896"; dropping domain attribute
>
> Can someone provide insight into which component may be causing this problem? I am guessing that somewhere in the SG it is also configured to use IPv4 only for the AMQP connection.

The SG connects to AMQP from the sg-bridge component, which needs a small fix here: infrawatch/sg-bridge#37, but this seems like a different problem.

Since it says cookie.go, it must be from sg-core (it's written in Go, while the bridge is in C). I'm guessing it's probably in the Prometheus exporter code, but I can't put my finger on it yet. Can you provide any additional logging context or a stack trace for that error?

@chaturvedi-kna (Contributor, Author)

> The SG connects to AMQP from the sg-bridge component, which needs a small fix here: infrawatch/sg-bridge#37, but this seems like a different problem.
>
> Since it says cookie.go, it must be from sg-core (it's written in Go, while the bridge is in C). I'm guessing it's probably in the Prometheus exporter code, but I can't put my finger on it yet. Can you provide any additional logging context or a stack trace for that error?

Logging context:
I got the following logs from the Smart Gateway pods that were created.
The IPv6 address that shows up here is the IP assigned to the pod itself (its own IP).

2024/05/24 13:42:57 provider.go:129: Defaulting client-id to system:serviceaccount:service-telemetry:smart-gateway
2024/05/24 13:42:57 provider.go:134: Defaulting client-secret to service account token /var/run/secrets/kubernetes.io/serviceaccount/token
2024/05/24 13:42:57 provider.go:358: Delegation of authentication and authorization to OpenShift is enabled for bearer tokens and client certificates.
2024/05/24 13:42:57 oauthproxy.go:210: mapping path "/" => upstream "http://localhost:8081/"
2024/05/24 13:42:57 oauthproxy.go:237: OAuthProxy configured for Client ID: system:serviceaccount:service-telemetry:smart-gateway
2024/05/24 13:42:57 oauthproxy.go:247: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain: samesite: refresh:disabled
2024/05/24 13:42:57 http.go:64: HTTP: listening on 127.0.0.1:4180
I0524 13:42:57.094070 1 dynamic_serving_content.go:132] "Starting controller" name="serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key"
2024/05/24 13:42:57 http.go:110: HTTPS: listening on [::]:8083
2024/05/28 11:51:31 cookie.go:209: net/http: invalid Cookie.Domain "fd01:0:0:3::883"; dropping domain attribute
2024/05/28 11:52:01 cookie.go:209: net/http: invalid Cookie.Domain "fd01:0:0:3::883"; dropping domain attribute
2024/05/28 11:52:31 cookie.go:209: net/http: invalid Cookie.Domain "fd01:0:0:3::883"; dropping domain attribute

@vkmc (Collaborator) left a comment
Looks good, thanks for your contribution. As already discussed, further changes are required in other STF components for this to work, and we don't have a CI w/IPv6.

@csibbitt (Collaborator)

> Logging context: I got the following logs from the Smart Gateway pods that were created. The IPv6 address that shows up here is the IP assigned to the pod itself (its own IP).
> ...
> 2024/05/24 13:42:57 oauthproxy.go:247: Cookie settings: name:_oauth_proxy secure(https):true httponly:true expiry:168h0m0s domain: samesite: refresh:disabled
> 2024/05/24 13:42:57 http.go:64: HTTP: listening on 127.0.0.1:4180
> I0524 13:42:57.094070 1 dynamic_serving_content.go:132] "Starting controller" name="serving::/etc/tls/private/tls.crt::/etc/tls/private/tls.key"
> 2024/05/24 13:42:57 http.go:110: HTTPS: listening on [::]:8083
> 2024/05/28 11:51:31 cookie.go:209: net/http: invalid Cookie.Domain "fd01:0:0:3::883"; dropping domain attribute

Thanks, this makes more sense now! The error is coming from the oauth-proxy container, not from the SG itself. It looks like this may be a problem in Go's cookie handling, though the ultimate impact is unclear without some additional testing. Here is another report of IPv6-incompatible behavior in Go's cookie.go: golang/go#65521

@@ -227,7 +227,7 @@
 prefix: ceilometer
 edgeListeners:
 - expose: true
-host: 0.0.0.0
+host: ''
A Member left a comment:

Just a note:

While the documentation [1] says that an empty string makes QDR listen on all local interfaces and [::] makes QDR listen on all IPv6 addresses, my manual testing with those two values was not successful:

Case 1:

2024-05-29 19:04:10.899935 +0000 CONN_MGR (info) Configured Listener: '':5999 proto=any, role=normal
2024-05-29 19:04:10.909703 +0000 SERVER (notice) Operational, 4 Threads Running (process ID 1)
2024-05-29 19:04:10.909981 +0000 SERVER (notice) Process VmSize 114.67 MiB (125.40 GiB available memory)
2024-05-29 19:04:10.910211 +0000 SERVER (error) Listener error on '':5999: Name or service not known - listen on '':5999 (proton:io)
2024-05-29 19:04:10.910241 +0000 SERVER (critical) Shutting down, required listener failed '':5999

Case 2:

2024-05-29 19:11:11.596032 +0000 CONN_MGR (info) Configured Listener: [::]:5999 proto=any, role=normal
2024-05-29 19:11:11.601813 +0000 SERVER (notice) Operational, 4 Threads Running (process ID 1)
2024-05-29 19:11:11.601937 +0000 SERVER (notice) Process VmSize 114.67 MiB (125.40 GiB available memory)
2024-05-29 19:11:11.602230 +0000 SERVER (error) Listener error on [::]:5999: Name or service not known - listen on [::]:5999 (proton:io)
2024-05-29 19:11:11.602280 +0000 SERVER (critical) Shutting down, required listener failed [::]:5999

The only way I could force QDR to listen on an IPv6 address was to specify the address explicitly (see the sketch below).

[1] https://qpid.apache.org/releases/qpid-dispatch-1.17.0/man/qdrouterd.conf.html#_listener
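
For illustration, forcing an explicit address would look roughly like this in the Interconnect edge listener; the IPv6 address below is a placeholder, not a value from this PR or its testing:

edgeListeners:
- expose: true
  host: 'fd01:0:0:1::10'   # hypothetical explicit IPv6 address; QDR binds to it directly rather than to a wildcard
  port: 5671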

A Collaborator left a comment:

This test is not official since my local build environment was giving me trouble today, but:

  • I edited my Interconnect object to say host: '' like in this patch
  • OCP returns it to me with empty double-quotes instead
$ oc get Interconnect -o yaml | grep host
      host: ""
  • The operator renders the config file with no host field at all
$ oc rsh pod/default-interconnect-574d468757-wshtn grep -C 4 "port: 5671" /opt/interconnect/etc/qdrouterd.conf
}

listener {
   role: edge
   port: 5671
   saslMechanisms: PLAIN
   authenticatePeer: true
   sslProfile: openstack
}

  • QDR is listening on both protocols
$ oc rsh pod/default-interconnect-574d468757-wshtn netstat -anp | grep :5671
tcp        0      0 0.0.0.0:5671            0.0.0.0:*               LISTEN      1/qdrouterd         
tcp6       0      0 :::5671                 :::*                    LISTEN      1/qdrouterd 

Note that this was done on top of upstream builds and by manually editing the Interconnect object, not from actually running this PR code. I'll need to figure out what's wrong with my cluster; I think I'm just using it for too many things at once.

A Collaborator left a comment:

I also tried this with no host field in the Interconnect object at all, and the results were the same. Maybe that's the preferred way to write it, to avoid any future doubt about the problems Martin has shown (see the sketch below).
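
A minimal sketch of what omitting the field would look like, assuming the same edgeListeners structure as this PR's diff:

edgeListeners:
- expose: true
  # no host field at all: the operator renders the qdrouterd listener without a host entry,
  # and qdrouterd then binds to both the IPv4 and IPv6 wildcard addresses
  port: 5671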

@@ -227,7 +227,7 @@
 prefix: ceilometer
 edgeListeners:
 - expose: true
-host: 0.0.0.0
+host: ''
A Collaborator left a comment:

Suggested change: remove the host: '' line (i.e. omit the host field entirely).

Zuul CI:

This change depends on a change that failed to merge.

Change infrawatch/sg-bridge#37 is needed.

@mrunge (Member) commented May 31, 2024

Can we please get some insight into why infrawatch/sg-bridge#37 does not pass CI, since both of these changes are needed for IPv6 to work (plus potentially more)?

@elfiesmelfie (Collaborator)

I am updating this PR to sync with master and include #603, so that it doesn't get stuck in the gate.

Zuul CI:

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/3774ffdb897742a4943a19c165568454

✔️ stf-crc-ocp_412-local_build SUCCESS in 37m 33s
✔️ stf-crc-ocp_414-local_build SUCCESS in 41m 33s
stf-crc-ocp_412-local_build-index_deploy RETRY_LIMIT in 20m 23s
✔️ stf-crc-ocp_414-local_build-index_deploy SUCCESS in 44m 11s

@chaturvedi-kna (Contributor, Author) commented May 31, 2024

recheck

Zuul CI:

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/531a6b4354344aaf9c2972c45fb94388

stf-crc-ocp_412-local_build FAILURE in 24m 22s
✔️ stf-crc-ocp_414-local_build SUCCESS in 44m 17s
stf-crc-ocp_412-local_build-index_deploy RETRY_LIMIT in 21m 01s
✔️ stf-crc-ocp_414-local_build-index_deploy SUCCESS in 51m 55s

@vkmc (Collaborator) commented Jun 3, 2024

recheck

@csibbitt enabled auto-merge (squash) June 5, 2024 21:05
@csibbitt (Collaborator) commented Jun 5, 2024

recheck

Zuul CI:

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/1fcf852d1d8a4bbf89d3917a4a1d8eb6

stf-crc-ocp_412-local_build FAILURE in 28m 43s
stf-crc-ocp_414-local_build RETRY_LIMIT in 9m 07s
stf-crc-ocp_412-local_build-index_deploy FAILURE in 29m 03s
stf-crc-ocp_414-local_build-index_deploy FAILURE in 27m 25s

@elfiesmelfie (Collaborator)

The Zuul jobs are blocked by this change: infrawatch/sg-core#135, which will be resolved shortly.

@csibbitt disabled auto-merge June 10, 2024 17:03
@csibbitt merged commit b46f215 into infrawatch:master Jun 10, 2024
1 check failed