Flashing errors with recent Windows update #1025
Comments
@polat-ahmet reported in #1032:
@microbit-carlos In case this is related: I am seeing this error when using the microbit on Ubuntu 23.04. It initially works a few times, then the timeouts (503) start happening. I haven't tried stopping/starting the USB bus yet, since one bus impacts the keyboard and the other my wireless, but I could dig in deeper.
@microbit-carlos FYI: on the same hardware I booted into Windows 10 and it worked every time. This sort of feels like the USB emulation is incomplete, so could this bug be on the microbit side? By that I mean that after a file is dropped onto the microbit, the USB connection seems to be reset so users can have another go at flashing. This USB reset process may be faulty, and some required DAPLink API calls may not be made when they should have been.
@mathias-arm This is a critical issue for us. We would appreciate an estimate of when it can be fixed.
We had an update from Microsoft that they expect to release a fix in the September Windows update 🎉
Tested with Windows 11 22H2 22621.2283 September 12th 2023 build.
Yes, it looks like the update has been pushed to October; hopefully it'll finally be out by then.
From Microsoft
I tested the Windows 11 22H2 22621.2428 Oct 10th 2023 build.
I can still consistently repro this bug on Win 11 22H2 23560.1000 insider preview. DAPLink Build ID: v0257-gc782a5ba
The issue still reproduces while flashing firmware to the MAX32625PICO.
This should be fixed with Windows 11 build 22621.2506, released on the 31st of October. I've tested this build with a BBC micro:bit with DAPLink 0257 and could not replicate the issue anymore. @felix-qorvo @top-5 @selimgullulu could you update to this version and try again? Thanks!
Hi @microbit-carlos, is there an equivalent update for Windows 10? This seems to be for Windows 11.
I don't know, sorry. Do you have the latest cumulative update installed (probably KB5031445)? And does it still have the issue there?
Hi, I used KB5031445 on two different Windows 10 laptops (Surface & Dell) and the drag&drop success rate was 100%. I'm waiting for confirmation from some colleagues about the resolution. In the meantime, can you please let me know if your PCs are also using encryption for storage? Has anyone experienced this problem on a PC withOUT encryption? Thanks.
A recent Windows 10 and Windows 11 update has started triggering checksum and timeout errors on DAPLink.
This has been reported by micro:bit & Calliope users, and we have been able to replicate it in Windows 10 and 11 when the OS is kept up-to-date. We haven't tried Windows 8.1, but that reached end of life last January.
Triggering Windows Update
The cumulative updates have been found and installed using this Microsoft catalogue:
https://www.catalog.update.microsoft.com/Search.aspx?q=Cumulative%20Update%20Windows%2011%2022H2%20x64
Windows 11 22H2
I went through installing and uninstalling cumulative updates, and in my findings the problem is triggered when installing 2023-02 Cumulative Update Preview for Windows 11 Version 22H2 for x64-based Systems (KB5022913) from the 28th of February, which updates Windows 11 22H2 to OS Build 22621.1344.
The previous cumulative update KB5022845 from the 14th of Feb (OS Build 22621.1265) doesn't trigger this issue.
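As a rough aid for triage, the build numbers in this thread can be turned into a quick check. This is only a sketch: the function name is made up, it covers Windows 11 22H2 (major build 22621) only, and it assumes that every revision from 1344 up to (but not including) 2506, the build reported as fixed in the comments, is affected. That assumption has not been verified build by build.

```python
# Revisions of Windows 11 22H2 (major build 22621) taken from this thread.
FIRST_AFFECTED_REVISION = 1344  # KB5022913, first build found to trigger the issue
FIRST_FIXED_REVISION = 2506     # build reported in the comments as no longer failing


def win11_22h2_possibly_affected(build: str) -> bool:
    """Return True if a '22621.<revision>' build falls in the assumed affected range.

    Hypothetical helper: the range is an assumption drawn from the two
    boundary builds above, not a list confirmed by Microsoft.
    """
    major, _, revision = build.partition(".")
    if major != "22621" or not revision.isdigit():
        raise ValueError(f"not a Windows 11 22H2 build string: {build!r}")
    return FIRST_AFFECTED_REVISION <= int(revision) < FIRST_FIXED_REVISION


if __name__ == "__main__":
    for candidate in ("22621.1265", "22621.1344", "22621.2506"):
        print(candidate, win11_22h2_possibly_affected(candidate))
```

The build string can be read from `winver` or from Settings > System > About.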
Windows 11 21H2
The Microsoft update catalog doesn't show any updates for Win 11 21H2 since November 2022, so I won't bother to test this Windows version.
Windows 10 22H2
The issue was triggered for me using 2023-03 Cumulative Update Preview for Windows 10 Version 22H2 for x64-based Systems (KB5023773) from the 21st of March, which updates the OS to build 1904x.2788.
The previous cumulative update KB5023696 from the 14th of March (OS Build 1904x.2728) doesn't trigger this issue.
Windows 10 21H2
We've also been able to replicate this issue in Win 10 21H2, and Microsoft is still releasing updates for this OS version, so it makes sense that we could identify a specific cumulative update to introduce this issue.
I probably won't be looking into this one, nor Win 10 20H2 as it's unlikely to provide any additional useful information.
Failure modes
We've encountered a few different ways in which errors emerge:
The errors are not triggered on every flash, and different users have reported different error frequencies. In our internal testing some teammates measured a 20% failure rate and others up to 60%. Some users have reported errors happening on "almost every flash".
We've used micro:bit Universal Hex files for the majority of these tests, which are a bit more resilient to this issue (more info in the "Identifying the Cause" section), so other DAPLink users flashing Intel Hex files might encounter this problem more often (it's also likely that the micro:bit user that reported an error on "almost every flash" was using Intel Hex files as well).
Identifying the Cause
I’ve collected a couple of RTT logs from DAPLink with additional debug prints to track how the OS writes the file blocks to disk, and to peek at the actual data. While it’s still a bit early (I need more time to capture more data and analyse it), initial findings point at the problem being caused by file blocks being sent out of order by the OS.
In previous Windows versions, the file blocks are sent in order, but after the listed Windows updates are installed it looks like some file blocks are first sent as zeros, and then later down the file transfer the blocks are sent again with the real file data.
For example:
And this can happen more than once on the same file transfer.
However, not every file transfer sends files out of order, some are sent in order and it all works fine.
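The write pattern described above can be modelled in a few lines. This is a simplified simulation, not DAPLink code and not derived from the captured logs: the 512-byte block size, the function names and the "process each block at first arrival" consumer are all assumptions made for illustration. It shows why an ordinary disk ends up with the correct file while a consumer that processes blocks as they arrive sees a zero-filled block.

```python
BLOCK_SIZE = 512  # assumed block size for this model


def split_blocks(data: bytes) -> list[bytes]:
    """Split file data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def zero_first_writes(blocks: list[bytes], deferred: set[int]) -> list[tuple[int, bytes]]:
    """Build a write sequence where the `deferred` blocks are first written
    as zeros and then rewritten with their real data at the end."""
    writes = [(i, bytes(len(b)) if i in deferred else b) for i, b in enumerate(blocks)]
    writes.extend((i, blocks[i]) for i in sorted(deferred))
    return writes


def final_disk_image(writes: list[tuple[int, bytes]], block_count: int) -> bytes:
    """What a normal disk holds afterwards: the last write to each block wins."""
    disk = [b""] * block_count
    for index, data in writes:
        disk[index] = data
    return b"".join(disk)


def arrival_order_stream(writes: list[tuple[int, bytes]]) -> bytes:
    """What a consumer sees if it handles each block once, at first arrival,
    and ignores later rewrites (a hypothetical policy for this model)."""
    seen: set[int] = set()
    stream = []
    for index, data in writes:
        if index not in seen:
            seen.add(index)
            stream.append(data)
    return b"".join(stream)


if __name__ == "__main__":
    file_data = b"".join(bytes([c]) * BLOCK_SIZE for c in b"ABCD")
    blocks = split_blocks(file_data)
    writes = zero_first_writes(blocks, deferred={2})
    print("disk matches file:  ", final_disk_image(writes, len(blocks)) == file_data)
    print("stream matches file:", arrival_order_stream(writes) == file_data)
```

In this model the final disk image is identical to the source file, so the same write pattern would go unnoticed on a regular USB drive.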
The checksum errors are encountered when the OS sends a block filled with zeros and DAPLink tries to calculate the checksum of an Intel Hex record. I still need to capture a better log for timeout errors, but I believe those are usually triggered when out-of-order blocks are ignored by DAPLink. The ignored blocks are not counted when measuring how much file data was transferred, so once the OS has finished sending the file, DAPLink keeps waiting for more data to arrive until it eventually times out.
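To make the checksum failure concrete, here is a minimal Intel Hex record check. It is a sketch written for this explanation, not DAPLink's parser, and the function name is hypothetical. It relies only on the standard Intel Hex rule that all bytes of a record, including the trailing checksum byte, sum to zero modulo 256. A record whose tail has been replaced by zero bytes (as if it ran into a zero-filled block) fails the check.

```python
def record_checksum_ok(record: bytes) -> bool:
    """Return True if `record` is a well-formed Intel Hex record with a valid checksum.

    Record layout after the ':' start code (as hex digits):
    length (1 byte), address (2), type (1), data (length bytes), checksum (1).
    """
    if not record.startswith(b":"):
        return False
    try:
        payload = bytes.fromhex(record[1:].decode("ascii").strip())
    except ValueError:
        # Non-hex characters (for example NUL bytes from a zeroed block).
        return False
    if len(payload) < 5 or payload[0] != len(payload) - 5:
        return False
    # All bytes including the checksum must sum to 0 modulo 256.
    return sum(payload) % 256 == 0


if __name__ == "__main__":
    good = b":10010000214601360121470136007EFE09D2190140"
    # Same record, but everything after the first 20 characters arrived as zeros.
    torn = good[:20] + b"\x00" * (len(good) - 20)
    print("intact record:", record_checksum_ok(good))
    print("torn record:  ", record_checksum_ok(torn))
    print("zero block:   ", record_checksum_ok(b"\x00" * 512))
```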
For the micro:bit specifically we use Universal Hex files, a superset of the Intel Hex format, which contain data for micro:bit V1 and micro:bit V2 in the same file. In file transfers where the out-of-order blocks correspond only to a section of the Universal Hex file that is not relevant to the target MCU being flashed, the flash can still be successful. So while I haven't yet compared failure rates of Intel vs Universal Hex, it's very likely that Intel Hex (and bin) files fail more frequently.
A checksum error log and Universal Hex file can be found here:
(Also note that because there is a lot of log data captured, data is sometimes dropped, so it might look like some blocks are not being sent, but we can look at the variables tracking the file size transferred to confirm that data has been processed, it's just that the RTT buffer was likely full).
Workarounds
Using robocopy with the /z flag, for restartable mode, seems to work so far.
For example, with the terminal at the path where your file.hex is located, and assuming DAPLink is mounted as drive E:\ :
robocopy . E:\ file.hex /z
Also, WebUSB flashing works, so for Intel Hex and bin files this demo from DAPJs can still flash the boards:
https://armmbed.github.io/dapjs/examples/daplink-flash/web.html
For micro:bit Universal Hex files, with online WebUSB tool will work too:
https://microbit.org/tools/webusb-hex-flashing/
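For anyone who wants to script the robocopy workaround, a small wrapper could look like the sketch below. The helper names, the example path and the E:\ drive letter are assumptions for illustration. The command follows robocopy's documented `source destination file options` form, and the success test uses robocopy's convention that exit codes below 8 mean the copy did not fail.

```python
import subprocess
from pathlib import PureWindowsPath


def build_robocopy_command(hex_path: str, daplink_drive: str = "E:\\") -> list[str]:
    """Build the argument list for copying one file in restartable mode (/z).

    `daplink_drive` is an assumed mount point; adjust it to your setup.
    """
    path = PureWindowsPath(hex_path)
    return ["robocopy", str(path.parent), daplink_drive, path.name, "/z"]


def flash_with_robocopy(hex_path: str, daplink_drive: str = "E:\\") -> bool:
    """Run robocopy (Windows only) and report whether the copy succeeded."""
    result = subprocess.run(build_robocopy_command(hex_path, daplink_drive))
    # robocopy exit codes below 8 indicate that no copy failure occurred.
    return result.returncode < 8


if __name__ == "__main__":
    print(build_robocopy_command(r"C:\work\file.hex"))
```

A successful copy does not by itself confirm a successful flash, so it is still worth checking the board after it remounts.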