Fix Missing Serial Data Under Heavy Load (RDT-1099) #334
Is this a very common error at the moment? Adding a 0.1 s sleep might help in some cases, but it won't solve the underlying issue.
I’m not sure which part of the code is causing the problem, but I've noticed that as the listener processes more data, it slows down and eventually starts missing data. This might be related to the queue size: on my macOS machine it is 32 KB, and the issue appears around offset 0x7C80, so there seems to be a correlation. Adding a sleep lets more data reach the console without corruption (at least over a 30-second period).
@erhankur Thanks for the PR. The serial object sets the read/write interval here. If the queue is empty, the `_listen` function is skipped. Could you check if changing the interval in
Unfortunately, increasing the timeout only in
Okay, let's merge it. Thank you @erhankur!
@erhankur Unfortunately, this implementation led to extremely slow chunk reading when used under QEMU.
While I was debugging, `len(s)` was at most 1020 bytes. Are you saying that if there are more than 1020 bytes, we need to read more and put them into the queue? If there is a way to increase the maximum chunk size, that would also be a solution.
Description
Under heavy load, the `_listen()` function produces truncated or corrupted log output.
Example test:
In a normal sequence, each log line should contain a contiguous block of data like `ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/`.
However, in the snippet below, we see a break in the expected pattern:
In line `7c80`, the string abruptly jumps from `ABCDEFGHIJ` to `qrstuvwxyz0123456789+/`, indicating partial or misaligned writes, a classic sign of log corruption. Furthermore, the next line jumps from data-len identifier `7c80` to `bc40`, which hints at missing data in between. Under high throughput, `_listen()` may process chunks so rapidly that they become interleaved or partially dropped.
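The corruption check described above can be sketched in a few lines of Python. This is an illustrative helper, not code from this PR: the names `is_contiguous` and `gap_bytes` are hypothetical, and it assumes the payload is a slice of the repeating 64-character alphabet and that the data-len identifiers are hexadecimal values.

```python
# Hypothetical helpers for spotting the corruption pattern described above.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def is_contiguous(payload: str) -> bool:
    """Return True if payload is an unbroken slice of the repeating alphabet."""
    if not payload:
        return True
    # Repeat the alphabet often enough that any unbroken run of
    # len(payload) characters, including wrap-around, is a substring.
    repeats = len(payload) // len(ALPHABET) + 2
    return payload in ALPHABET * repeats


def gap_bytes(prev_id_hex: str, next_id_hex: str) -> int:
    """Difference between two hexadecimal data-len identifiers."""
    return int(next_id_hex, 16) - int(prev_id_hex, 16)


# A clean line passes; the broken line from the description does not.
clean = is_contiguous(ALPHABET)
broken = is_contiguous("ABCDEFGHIJ" + "qrstuvwxyz0123456789+/")
jump = gap_bytes("7c80", "bc40")
```

Running such a check over every captured line would flag the first corrupted line and report the size of the jump between consecutive identifiers.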
During the internal CI tests, it appears in the `hw_stack_guard_cpu1` test with the following snippet:
Adding a 100 ms sleep at the end of each loop iteration fixes the truncation and the missing data.
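As a rough illustration of the change, the loop below sleeps at the end of every iteration. It is a minimal sketch and not the actual `_listen()` implementation: `listen`, `read_chunk`, `out`, and `stop` are hypothetical names, and the reader is an arbitrary callable rather than a real serial port.

```python
import queue
import threading
import time
from typing import Callable


def listen(
    read_chunk: Callable[[], bytes],
    out: "queue.Queue[bytes]",
    stop: threading.Event,
    interval: float = 0.1,
) -> None:
    """Forward chunks from read_chunk() to out until stop is set."""
    while not stop.is_set():
        chunk = read_chunk()
        if chunk:
            out.put(chunk)
        # Pause at the end of each iteration (100 ms by default).
        time.sleep(interval)


# Usage with a fake reader standing in for the serial port.
pending = [b"abc", b"def"]
stop = threading.Event()


def fake_read() -> bytes:
    if pending:
        return pending.pop(0)
    stop.set()  # nothing left to read: ask the loop to exit
    return b""


out: "queue.Queue[bytes]" = queue.Queue()
listen(fake_read, out, stop, interval=0.001)
received = b"".join(out.queue)
```

A fixed sleep caps how often the loop reads, so throughput is bounded by chunk size divided by the interval. That matches the slow chunk reading under QEMU reported in the conversation.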
Related
Testing