Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS-only NATS #929

Closed
46bit opened this issue Jul 20, 2021 · 10 comments
Closed

TLS-only NATS #929

46bit opened this issue Jul 20, 2021 · 10 comments
Assignees

Comments

@46bit
Copy link
Contributor

46bit commented Jul 20, 2021

What is this issue about?

cf-deployment deploys a NATS cluster that is available over both plaintext and TLS. This was necessary because not all components supported TLS NATS.

Having a NATS cluster that supports both plaintext and TLS is complicated. NATS is only available on one at a time, so each NATS node is running two separate NATS agents (one for plaintext, one for TLS.) The cluster config has also been quite fragile (@ameowlia knows more about this.)

What change do you propose?

We want cf-deployment to default to deploying TLS-only NATS. This would simplify the cluster, and remove one of the remaining non-TLS components (furthering #906.)

We'd provide an optional opsfile so that users can choose to run plaintext-only NATS. Users with custom components that don't yet support TLS NATS could still use plaintext NATS.

We will also finish upgrading mainstream open-source components to support TLS NATS (e.g. service-discovery-controller and metric-discovery-registrar.)

How seriously is this work taken?

SAP want to get this done. For instance we've already switched Gorouter to use TLS NATS.

@cf-gitbot
Copy link

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/178948246

The labels on this github issue will be updated when the story is started.

@46bit 46bit changed the title Remove non-TLS NATS Default to TLS-only NATS Jul 20, 2021
@46bit 46bit changed the title Default to TLS-only NATS Switch the default to TLS-only NATS Jul 20, 2021
@46bit 46bit changed the title Switch the default to TLS-only NATS TLS-only NATS Jul 20, 2021
@ameowlia
Copy link
Member

ameowlia commented Jul 22, 2021

Hi @46bit,

🎉 I am so happy that you are taking on this work! Not only will this make our internal system more secure, but it will also cut the number of nodes in the NATS cluster in half, which will provide more stability.

Here is the list of NATS clients (that I know of):

bosh job able to be configured to use TLS Details
gorouter subscribes for route registration messages
route registrar publishes component routes for gorouter to consume
route emitter publishes app routes for gorouter to consume
route emitter windows publishes app routes on windows cells for gorouter to consume
service discovery controller 🚫 subscribes to internal route messages from route emitter
loggr-system-metric-scraper ✳️ publishes messages about metric targets
metrics discovery registrar ✳️ publishes messages about metric targets
scrape config generator ✳️ subscribes to information about metric targets

✅ - can be configured to use TLS or non-TLS
✳️ - always uses TLS
🚫 - always uses non-TLS

I've circulated this issue internally and no one (outside of my team) has jumped out saying that they maintain something that uses NATS, so hopefully this is a complete list.

@46bit
Copy link
Contributor Author

46bit commented Jul 23, 2021

@ameowlia Thanks for the list! Here's some first impressions.

  • I'm working on giving TLS NATS support to service-discovery-controller.
  • loggr-system-metric-scraper isn't shipped in default cf-deployment. It wasn't used by GOV.UK PaaS and isn't used by SAP BTP. I wonder if we can make using it conditional on plaintext NATS?
  • metrics-discovery-registrar seems to support TLS NATS since cloudfoundry/metrics-discovery-release@e8636c2.
  • scrape-config-generator seems to support TLS NATS since cloudfoundry/metrics-discovery-release@af2bfca.

@46bit
Copy link
Contributor Author

46bit commented Jul 23, 2021

@ameowlia Good news! Both metrics-discovery-registrar and scrape-config-generator seem to already be using mTLS for NATS in cf-deployment.

I've checked metrics-discovery-registrar in a running environment. Oddly I can't find scrape-config-generator used anywhere but I'm fairly sure its config would make it use TLS NATS as well.

@46bit
Copy link
Contributor Author

46bit commented Jul 23, 2021

I had hoped we'd ship TLS NATS by default, but provide an opsfile to revert to plaintext NATS if anyone needed to. Unfortunately that is not possible.

metrics-discovery-registrar and scrape-config-generator are both hardcoded to want TLS NATS. For instance see how https://github.com/cloudfoundry/metrics-discovery-release/blob/7964ba900fe89f301d67f398a7ed4a46f5c33572/src/cmd/discovery-registrar/main.go#L42-L84 will crash if it doesn't have mTLS details for NATS.

@ameowlia
Copy link
Member

ameowlia commented Jul 26, 2021

Thanks for the update @46bit !

metrics-discovery-registrar seems to support TLS NATS since cloudfoundry/metrics-discovery-release@e8636c2.
scrape-config-generator seems to support TLS NATS since cloudfoundry/metrics-discovery-release@af2bfca.

🤦‍♀️ yup, you are right. I was looking only for the "tls.enabled" property, but these don't have it because they ALWAYS use TLS. I fixed my table to reflect this.

loggr-system-metric-scraper isn't shipped in default cf-deployment. It wasn't used by GOV.UK PaaS and isn't used by SAP BTP.

You are right. It is currently (only?) used in Tanzu Application Service (see release notes here). I will bring up this work internally to get it prioritized.

I'm working on giving TLS NATS support to service-discovery-controller.

🎉 Yay! This work is closer to being done then I thought.

@46bit
Copy link
Contributor Author

46bit commented Jul 27, 2021

@ameowlia I've discovered that loggr-system-metric-scraper already uses TLS NATS: cloudfoundry/system-metrics-scraper-release@0c0dfac.

@ameowlia
Copy link
Member

Current state of the world:

  • All CF-d NATS clients have the ability to use nats-tls
    • Except for the service discovery controller nats client. The code is there (thanks to @46bit), but we still need to cut a release.

@davewalter
Copy link
Member

@ameowlia @46bit I just cut v17.0.0 that removes the plaintext NATs job so I think we can close this issue?

@46bit
Copy link
Contributor Author

46bit commented Nov 7, 2021

@davewalter agreed! and thanks to you, @ameowlia et al for getting this merged :)

@46bit 46bit closed this as completed Nov 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants