Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebSockets: add opt-in delegation of encryption to TLS #625

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

achingbrain
Copy link
Member

Rewords the Encryption section to allow for opt-in single encryption signalled via a noise extension.

Specifies the WebSockets transport with notes on certificates,
multiplexing, etc.

Fixes #594
@achingbrain achingbrain force-pushed the docs/fix-websockets-double-encryption branch from 1f227e4 to 0d8cb4e Compare August 13, 2024 13:23
Rewords the Encryption section to allow for opt-in single encryption
signaled via a noise extension.
@achingbrain achingbrain force-pushed the docs/fix-websockets-double-encryption branch from 0d8cb4e to b680589 Compare August 13, 2024 13:24
Copy link
Member

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could easily be missing stuff more libp2p-ish folks will see, but it makes sense to me


During connection establishment over WebSockets, before the connection is made available to the rest of the application, if all of the following criteria are met:

1. `noise` is negotiated as the connection encryption protocol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So despite noise being negotiated as the encryption protocol, no encryption is done at the libp2p level, because of the handshake_only which overrules it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. Noise does a handshake and encryption, and here we only want the handshake since encryption is done by TLS at the transport layer. This is similar to how the WebTransport and WebRTC transports work.

We still negotiate a connection encryption protocol for backwards compatibility and to allow for alternatives to noise in the future.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #564


This prevents double-encryption but only when both ends opt-in to ensure backwards compatibility with existing deployments.

### MITM mitigation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work unless the server somehow includes their domain name in their noise handshake. Otherwise, an attacker can simply announce that some peer id QmVictim is available at /dns/attacker.com/....

Furthermore, by including the domain name in the noise handshake, the server is opting-in to trusting the CA system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this also doesn't work unless the server trusts all owners of the given domain name. If the server operator is the sole owner of the domain name, this is fine. But if another party controls the domain name as well, but not the private key to the peer ID, they would be able to MITM/intercept this connection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points both.

This doesn't work unless the server somehow includes their domain name in their noise handshake

I've added a further tls_common_name extension to the noise handshake to prevent an attacker publishing an address with a bogus domain and then simply relaying the handshake.

by including the domain name in the noise handshake, the server is opting-in to trusting the CA system

I've added a note to this effect.

Note that this also doesn't work unless the server trusts all owners of the given domain name

I've copied the Security Considerations section from #564 as it seems like they have the same problems with third party root certificates installed on the machine.

Do you think this covers it?

Comment on lines +224 to +225
optional bool handshake_only = 3;
optional string tls_common_name = 4;
Copy link
Member

@Stebalien Stebalien Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this less specific to websocket/TLS? E.g., one way to do this is to:

  1. Setup a noise stream.
  2. Over the noise stream, send the TLS common name, etc.
  3. Close the noise stream but leave the connection open.
  4. Use the connection without encryption.

To support a better upgrade path, this method can be advertised as a new "stream muxer" (more like a "next protocol"). I.e.:

  1. Initiator negotiates the "/drop-noise" protocol (or something like that) along with the expected TLS common name (or nothing?).
  2. Receiver returns their TLS common name and terminates their end of the noise stream.
  3. Client terminates their end of the noise stream.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it would add quite a lot of round trips? Part of the joy of the extensions is they arrive as part of the handshake message.

I know connection establishment with WebSockets has a bunch of round trips already but is this a reason to make it slower, particularly when we already have WebTransport-specific stuff in the extensions registry?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. You're right, this'll add a round trip.

Copy link
Contributor

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a section saying the server MUST complete the handshake before reading or sending any application data. I’m afraid it might be a little too subtle that if the server drops the noise handshake after replying, it has failed to authenticate the client.

Also, just to check my understanding, how many round trips is this before we have a usable WSS connection?


The TLS certificate used should be signed by a trusted certificate authority, the host name should correspond to the common name contained within the certificate, and the domain being connected to should match the common name sent as part of the noise handshake.

This requires trusting the certificate authority to issue correct certificates, but is necessary due to limitations of certain user agents, namely web browsers which do not allow use of self-signed certificates that could be otherwise be verified via preshared certificate fingerprints.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d include a sentence that mentions Keying Material Exporters (RFC 5705), which, in a perfect world, would be the preferred solution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a sentence - please suggest an edit if you think it needs more exposition.

Copy link
Contributor

@marten-seemann marten-seemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a new feature to me. As far as I can tell, this isn't implemented anywhere yet. Why does the PR title say it's a fix?

I'm also not convinced that this is a problem worth solving. Sure, double encryption sucks, on paper. And maybe in a synthetic benchmark. But WebSocket connections are relatively rare, and there's a better solution around that solves this and many other problems: WebTransport.

@achingbrain
Copy link
Member Author

achingbrain commented Sep 4, 2024

@MarcoPolo

I think we should add a section saying the server MUST complete the handshake before reading or sending any application data

I've added this - please suggest an edit if you think it can be clearer.

how many round trips is this before we have a usable WSS connection?

..a few. You need to establish TLS, open a HTTP connection then upgrade it to WS. I think the actual number depends on support for things like TLS1.3 0-RTT, etc.

@marten-seemann

WebSocket connections are relatively rare

Right now, yes, but this will hopefully change with the introduction of Let's Encrypt certs for libp2p WebSocket listeners which will let us enable the the transport by default since historically the need to obtain and set up a cert was the blocker to it being usable.

there's a better solution... WebTransport

All things being equal WebTransport is a better solution, agreed, and will make a lot of this obsolete. However we're not quite there yet - support is not universal and implementations are still quite buggy.

The WebSocket API has the advantage of being supported everywhere, including mobile browsers which means you don't have to muck around with native WebRTC modules for things like react-native which results in better DX.

Not doing unnecessary encryption also means less CPU and battery usage which is very important for mobile so I think this is a problem worth solving.

@achingbrain achingbrain changed the title docs: fix WebSockets double encryption WebSockets: add opt-in delegation of encryption to TLS Sep 4, 2024
@marten-seemann
Copy link
Contributor

I don't have a dog in this fight anymore 🤷‍♂️, but here are my 2c. Will probably disengage from the discussion after this post.

I think the actual number depends on support for things like TLS1.3 0-RTT, etc.

WebSocket upgrade requests are not idempotent, and thus not permissible in 0-RTT.

Not doing unnecessary encryption also means less CPU and battery usage which is very important for mobile so I think this is a problem worth solving.

If you're on a mobile device and you're worried about CPU usage, you don't open dozens of WebSocket connections (or WebTransport, for that matter). Instead you do plain old and boring HTTPS, and ideally let the OS handle your connection pool.

Not doing unnecessary encryption also means less CPU and battery usage which is very important for mobile so I think this is a problem worth solving.

What I was asking for in my previous comment was actual real world data. Of course it's slower to encrypt twice (duh!), but how much does it actually matter? Modern CPUs are pretty fast when doing crypto, to the point where optimized QUIC stacks do multiple Gbit/s per CPU core (and that includes TLS, header / frame parsing, UDP syscalls and all the other QUIC logic).

Now you can argue that not all stacks will be that optimized (true!) and that doing crypto in the browser is slower than doing it in C / Assembly (also true!), but then you're also limited to a few dozens connections and you're probably not trying to achieve sustained Gbit/s throughput in your browser application anyway.

@MarcoPolo
Copy link
Contributor

I'm suggesting a change in #564 that would allow clients to initiate the mutual authentication. That would mean that HTTP Peer ID authentication would have the same number of round trips as this change. In that case, I'd prefer if we use HTTP Peer ID authentication to authenticate the WebSocket endpoint instead of this change.

A couple of reasons why:

  1. It fits neatly into the WebSocket standard by using HTTP Authentication.
  2. It happens before we establish a WebSocket connection. Which is useful because:
    a. Allows offloading authentication to a generic HTTP server.
    b. In the case you want to allow only some Peer IDs to establish a connection you can do that before allocating resources for a full duplex WebSocket.
  3. It allows clients and servers to use a bearer token to authenticate future connections.
    a. Important in the browser use case, where we may want to aggressively prune connections.
    b. Saves one round trip in connection establishment for future connections.
  4. Allows implementations to avoid a Noise dependency.
    a. We are relying on TLS to provide encryption. It seems silly to have a Noise dependency just for this handshake.
  5. Fewer equivalent ways to do the same thing. As far as I can tell, this has the same properties as HTTP Peer ID authentication, so there are no benefits specific to this change.

@achingbrain
Copy link
Member Author

achingbrain commented Sep 5, 2024

This scheme looks better, I agree. I like that it saves round trips and leverages plain-old-HTTP, however I'm not sure you can implement it using the browser WebSocket API as specified.

The WebSocket constructor takes only two arguments:

new WebSocket(url, protocols)

url is a URL (unsurprisingly), protocols is an optional string or Array<string> which result in a single Sec-WebSocket-Protocol header being set (with comma-delimited values if an array is passed - so protocols can't have a comma in them).

You cannot send custom headers when creating a WebSocket connection, and you cannot access any part of the flow before the upgrade to the WebSocket protocol is complete.

So we could (ab)use the protocols field to send authentication information in the Sec-WebSocket-Protocol header but I don't know if that's better or worse than mandating a noise dependency, which you'd have to have in your app anyway for backwards compatibility. Plus you can't read the response unless it's sent as a WebSocket message event after the protocol upgrade has completed so you've already allocated the resources at that point.

This is a good thread, including why URL params, cookies, HTTP auth, bearer tokens, etc are either unfeasible, insecure or don't work everywhere.

The one thing in all this, I guess, is that it will all be deprecated once WebTransport becomes viable.

@MarcoPolo
Copy link
Contributor

Some interesting history around supporting the "Authorization" header in WebSocket at the WHATWG: whatwg/websockets#16

@MarcoPolo
Copy link
Contributor

You cannot send custom headers when creating a WebSocket connection, and you cannot access any part of the flow before the upgrade to the WebSocket protocol is complete.

That's a shame. With that in mind, this change makes sense to me.

Now you can argue that not all stacks will be that optimized (true!) and that doing crypto in the browser is slower than doing it in C / Assembly (also true!), but then you're also limited to a few dozens connections and you're probably not trying to achieve sustained Gbit/s throughput in your browser application anyway.

My guess is that relying on the browser's TLS stack will make this all ~2-5x more efficient compared to the Noise JS implementation. This guess is informed by my noise-wasm package for js-libp2p which saw ~4x improvement compared to the JS version.

Avoiding double encryption here probably also allows the same server to handle more requests. Probably not 2x the number of requests, but a safe bet would be that it is more than 1x.

Another nice property of this is that it allows for securely connecting (via the Web PKI) to a peer without having to know that peer's peer id ahead of time.

@sukunrt
Copy link
Member

sukunrt commented Sep 8, 2024

You cannot send custom headers when creating a WebSocket connection, and you cannot access any part of the flow before the upgrade to the WebSocket protocol is complete.

Can we do this using http over libp2p(on the websocket connection)?

@MarcoPolo
Copy link
Contributor

Can we do this using http over libp2p(on the websocket connection)?

Yes, but this would add at least 1 RT.

Base automatically changed from docs/add-websockets-spec to master October 23, 2024 10:53
@achingbrain
Copy link
Member Author

I thought of another attack involving things like AutoTLS.

The normal flow is something like:

  1. Alice contacts Bob, asks Bob to respond to the ACME DNS challenge on her behalf
  2. Alice creates a CSR and sends it to Colin
  3. Colin performs the ACME DNS challenge, which Bob answers
  4. Colin sends Alice a certificate
  5. David does a DNS lookup for Alice, which is answered by Bob's DNS server
  6. David contacts Alice with the resolved IP, who supplies the certificate
  7. Noise handshake, etc.

The attack is:

  1. Eve compromises Bob
  2. Alice does the certificate dance with Bob and Colin
  3. David does a DNS lookup for Alice, which is answered by Bob's DNS server, but returns an IP of Eve's choosing
  4. Eve does the certificate dance with Colin (or maybe doesn't need to, since they have a wildcard cert)
  5. David contacts Eve, thinking they are Alice
  6. Eve supplies the certificate minted in 3, and relays traffic to Alice

I think the mitigation here is for Alice to also include the fingerprint of their cert in the Noise early data. Since it's signed, David can trust it and should abort the connection if the fingerprint of the TLS certificate used on the connection does not match the one in the signed data.

@MarcoPolo
Copy link
Contributor

Yes, this is the same attack I mentioned in #625 (comment)

hence the need for double encryption. The early data would work as well, but if an end client can access the cert fingerprint they could probably also access the key material exporter and use that instead.

I don’t think browsers support reading the fingerprint, so it wouldn’t help them. And I believe websocket is only useful for browser clients.

@2color 2color added the blocked label Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Triage
Development

Successfully merging this pull request may close these issues.

8 participants