Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix clearing endpoint halt #13

Merged
merged 2 commits into from
Nov 23, 2022
Merged

fix clearing endpoint halt #13

merged 2 commits into from
Nov 23, 2022

Conversation

tlyu
Copy link

@tlyu tlyu commented Nov 23, 2022

Patch the low-level firmware to set the endpoint to NAK instead of ACK when clearing an endpoint halt. Prevents a bus reset loop situation with Windows due to an unexpectedly large queued IN packet.

Also, add a handler to reset our endpoint buffer state when the low-level firmware clears an endpoint halt.

@tlyu
Copy link
Author

tlyu commented Nov 23, 2022

The firmware was temporarily modified to make the write consolidation occur more often.

The hex digit strings are raw control request contents. The lines starting with S are the calls to USBCore_::send. The lines starting with _ are the beginning of the endpoint flush process. The lines starting with > are immediately after the flush has queued the packet on the peripheral. The format is endpoint > length.

Trace logs from waking Windows from sleep, before patch:

[10:56:43.646] 00 03 01 00 00 00 00 00

This is the SetFeature(RemoteWakeup) sent prior to suspending the bus.

[10:57:01.295] S5>1
[10:57:01.296] S5>29
[10:57:01.296] _5>30
[10:57:01.362] S5>1
[10:57:01.362] S5>29
[10:57:01.363] _5>60
[10:57:01.364] >5>60

These are the key press and release events to trigger the wakeup. The first transmission is skipped, and consolidated with the second.

[10:57:02.340] 00 01 01 00 00 00 00 00
[10:57:02.344] 21 0A 00 00 02 00 00 00
[10:57:02.348] 21 09 00 02 02 00 01 00
[10:57:02.353] 21 0A 00 00 03 00 00 00
[10:57:02.357] Error

This is the timeout when Windows refuses to ACK the overly-large packet.

[10:57:02.357] 21 09 08 02 03 00 02 00
[10:57:02.364] Reset
[10:57:02.448] 80 06 00 01 00 00 40 00
[10:57:02.452] Reset
[10:57:02.535] 00 05 01 00 00 00 00 00
[10:57:02.549] 80 06 00 01 00 00 12 00
[10:57:02.554] 00 09 01 00 00 00 00 00

Windows does a reset and re-enumeration.

[10:57:02.559] 02 01 00 00 85 00 00 00

This is the ClearFeature(EndpointHalt) sent by Windows.

[10:57:02.849] 21 09 08 02 03 00 02 00
[10:57:02.854] 21 09 00 02 02 00 01 00
[10:57:03.349] Error
[10:57:03.356] Reset
[10:57:03.440] 80 06 00 01 00 00 40 00
[10:57:03.444] Reset
[10:57:03.528] 00 05 01 00 00 00 00 00
[10:57:03.543] 80 06 00 01 00 00 12 00
[10:57:03.548] 00 09 01 00 00 00 00 00
[10:57:03.553] 02 01 00 00 85 00 00 00
[10:57:03.558] Error
[10:57:03.563] Reset
[10:57:03.647] 80 06 00 01 00 00 40 00
[10:57:03.651] Reset
[10:57:03.734] 00 05 01 00 00 00 00 00
[10:57:03.748] 80 06 00 01 00 00 12 00
[10:57:03.753] 00 09 01 00 00 00 00 00
[10:57:03.758] 02 01 00 00 85 00 00 00
[10:57:05.565] Error
[10:57:05.573] Reset
[10:57:05.656] 80 06 00 01 00 00 40 00
[10:57:05.660] Reset
[10:57:05.743] 00 05 01 00 00 00 00 00
[10:57:05.757] 80 06 00 01 00 00 12 00
[10:57:05.762] 00 09 01 00 00 00 00 00
[10:57:05.767] 02 01 00 00 85 00 00 00

And there is a bus reset loop, because the queued oversized packet keeps getting retransmitted.

After the patch:

[11:03:28.719] 00 03 01 00 00 00 00 00
[11:03:41.167] S5>1
[11:03:41.167] S5>29
[11:03:41.168] _5>30
[11:03:41.263] S5>1
[11:03:41.264] S5>29
[11:03:41.264] _5>60
[11:03:41.265] >5>60

Again, the consolidated write is queued on the peripheral.

[11:03:42.204] 00 01 01 00 00 00 00 00
[11:03:42.208] 21 0A 00 00 02 00 00 00
[11:03:42.212] 21 09 00 02 02 00 01 00
[11:03:42.217] 21 0A 00 00 03 00 00 00
[11:03:42.221] Error
[11:03:42.222] 21 09 08 02 03 00 02 00
[11:03:42.228] Reset
[11:03:42.312] 80 06 00 01 00 00 40 00
[11:03:42.316] Reset
[11:03:42.400] 00 05 01 00 00 00 00 00
[11:03:42.414] 80 06 00 01 00 00 12 00
[11:03:42.419] 00 09 01 00 00 00 00 00
[11:03:42.424] 02 01 00 00 85 00 00 00

This time, the ClearFeature(EndpointHalt) works correctly, as seen by the lack of reset loop below.

[11:03:42.695] 21 09 08 02 03 00 02 00
[11:03:42.700] 21 09 00 02 02 00 01 00
[11:03:57.805] S5>1
[11:03:57.806] S5>29
[11:03:57.806] _5>30
[11:03:57.807] >5>30
[11:03:57.911] S5>1
[11:03:57.911] S5>29
[11:03:57.912] _5>30
[11:03:57.913] >5>30

When clearing an endpoint halt condition, set endpoint status to NAK
instead of VALID.

When Windows receives more data than expected from an interrupt
endpoint, it resets the device and attempts to clear the endpoint halt
condition on that endpoint. On an IN endpoint, setting the status to
VALID will cause a retransmission of the offending packet, resulting in
a loop of attempted recovery operations.

Signed-off-by: Taylor Yu <[email protected]>
Re-init the endpoint buffer state on ClearFeature(EndpointHalt).

Signed-off-by: Taylor Yu <[email protected]>
@obra obra merged commit ece7362 into keyboardio:main Nov 23, 2022
@tlyu tlyu deleted the fix-clear-halt branch November 23, 2022 20:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants