Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cgiWebSocketRecv calls recvCb with frame fragments when frames span multiple TCP segment #51

Open
n1ywb opened this issue Nov 30, 2018 · 3 comments

Comments

@n1ywb
Copy link

n1ywb commented Nov 30, 2018

When it receives a websockets frame that spans multiple TCP segments, the receive callback gets called once with each segment, with no obvious way to reassemble them.

https://github.com/chmorgan/libesphttpd/blob/master/util/cgiwebsocket.c#L174

https://github.com/chmorgan/libesphttpd/blob/master/util/cgiwebsocket.c#L254
^ this should get called once, with the whole frame

console log

I (1420127) user_main: UartWs: connect
D (1420147) cgiwebsocket: Upgrade: websocket
I (1420147) user_main: McuWs: connect
D (1420167) cgiwebsocket: Upgrade: websocket
I (1420167) user_main: EchoWs: connect
D (1420187) cgiwebsocket: Frame payload. wasHeaderByte 1 fr.len 9 sl 9 cmd 0x82
I (1420187) user_main: McuWs: len=9, more=0, bin=2
I (1420187) user_main: set time: 1543542461
I (1420197) RTC: >> writeValue: 1543542461
I (1420197) RTC:  - Fri Nov 30 01:47:41 2018

D (1437817) cgiwebsocket: Frame payload. wasHeaderByte 1 fr.len 1538 sl 1428 cmd 0x82
I (1437817) user_main: McuWs: len=1428, more=0, bin=2
I (1437817) user_main: invalid schedule len: 1428; expected 1538
D (1437827) cgiwebsocket: Frame payload. wasHeaderByte 0 fr.len 110 sl 110 cmd 0x82
I (1437827) user_main: McuWs: len=110, more=0, bin=2
I (1437837) user_main: invalid set time len: 110

browser log

image

wireshark decode

image

@tidklaas
Copy link

Isn't that what the userData element in the struct Websock is for?

@dmcnaugh
Copy link

dmcnaugh commented Dec 2, 2018

Isn't that what the userData element in the struct Websock is for?

No, for a range of reasons:

  • userData is provided but not used by ay of the libesphttpd code, its for the user to attach web socket specific data to
  • lwip (like other ip stacks) does not perform TCP segment reassembly
  • this is not websocket fragmentation using the FIN flag, this is caused by TCP segmentation and only the next layer in the stack i.e. the websocket protocol layer has visibility of all the information that allows the TCP segments to be reassembled
  • the user implementation of a ws->recvCb is an application level function that does not need to know anything about the behaviour of the TCP stack or the websocket protocol layer and how the incoming data is segmented between packets or how it should be reassembled

The current implementation of cgiWebSocketRecv identifies that the websocket payload has been segmented and flags it in the ws->priv struct
https://github.com/chmorgan/libesphttpd/blob/master/util/cgiwebsocket.c#L272
but makes no attempt to reassemble the websocket payload when it receives the next part of the payload in the next TCP segment.

Reassembly can be non-trivial as a total websocket payload can be 2^63 bytes long, but this problem is not only exposed by big websocket payloads. It also manifests with small payloads when the client decides to transmit multiple WebSocket.send's in a single TCP frame (which cannot by controlled by the frontend developer in modern browsers) and the combined size exceeds a single TCP frame (typically <1500 bytes). In this case reassembly should be simpler but still should be implemented in the websocket protocol layer.

The first case of large websocket payloads spanning many TCP frames is made complex because of memory constraints with a micro-controller, and a limit could be set using a configuration parameter for libesphttp to make it manageable.
The second case of a websocket payload that has be segmented across two (2) TCP frames simply because of aggregation in TCP frames for transmission needs to be solved for even small payloads to work reliably in the libesphttpd websockets implementation.

I am going to try and fix the second case in the sort term while ignoring the first case as that is not currently the problem I am having. If I come up with something useful I will setup a PR with my working code.

@tidklaas
Copy link

tidklaas commented Dec 2, 2018

Disclaimer: I have not used Websockets for anything serious, just played around around a bit with some examples.

The way I understand the mechanisms here, the websocket will provide a reliable, ordered stream of application level data. Like with Berkeley sockets, there is no guarantee that blobs of data will be received in a single operation, so any application has to be able to either process incoming fragmented data on the fly, or buffer the data across multiple reads until a complete payload unit has been received. Such a buffering mechanism can easily be implemented using the private userData pointer.

Assembly by the libesphttpd would be a very bad idea, since, as you pointed out, payloads can get quite big and the lib has absolutely no idea of what the contents should look like, so it would have to buffer all of it. The application, on the other hand, knows exactly how to handle the data, which parts can be processed on the fly or simply be discarded, and, crucially, detect corrupted or malicious payloads and then act immediately.

I do see a minor problem in the way the MORE flag is handled. If a final frame gets fragmented, the MORE flag should be kept set until the last fragment's data gets handed to the callback function.
Line 253 should be something like this

if ((ws->priv->fr.flags & FLAG_FIN) == 0 || (ws->priv->fr.len > sl)){
    flags |= WEBSOCK_FLAG_MORE;
}

instead of this
if ((ws->priv->fr.flags&FLAG_FIN)==0) flags|=WEBSOCK_FLAG_MORE;

Again, technically this is not strictly necessary, as the application should (must!) verify the received data and can therefore deduce that fragmentation has happened.

Tido

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants