Can't parse zip if hostname contains '-' #52
Comments
That's definitely a bug, and thank you for reporting it. The zip file handling section probably needs to be updated to handle that correctly, likely here: https://github.com/orlikoski/CDQR/blob/master/src/cdqr.py#L1753-L1765
Does changing the name of the zip file get you past the problem for now?
No, and I even changed the hostname (to a single word, no spaces or special chars) and re-ran the CyLR tool, but it still stopped on the same error on both the 5.1.0 and 5.0.0 versions. However, I think the old hostname will be embedded in some of the files, but if the error is in the unzipping part then that should not affect it. On that thought, I also noticed when I did a general unzip
I just did a test on a zip file that has Can you try making a copy of the zip file with no name conflicts when it unzips manually and then try CDQR on that new zip file? That will determine if it's related or a red herring. Here are the logs of that run.
Yes, unzipping, accepting the clobbering of the dupes, and then re-zipping it worked. I'll have a look tomorrow at which files were duplicated to see if there is a source listed twice in CyLR. Thx
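For anyone who wants to script the unzip, clobber, and re-zip workaround described above, here is a minimal sketch using Python's standard `zipfile` module. The function name `rezip_without_duplicates` is hypothetical and not part of CDQR or CyLR. It keeps the last entry for each name (which is what overwriting during a manual unzip does) and does not preserve per-file timestamps, so treat it as a troubleshooting aid rather than a forensically sound copy.

```python
import zipfile


def rezip_without_duplicates(src_path, dst_path):
    """Copy src_path to dst_path, keeping only the last entry per file name."""
    with zipfile.ZipFile(src_path) as src:
        # A dict keyed by name keeps the last entry seen for each name,
        # mirroring "accept the clobbering" during a manual unzip.
        last_by_name = {info.filename: info for info in src.infolist()}
        with zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for name, info in last_by_name.items():
                # Reading by ZipInfo selects that exact entry, not just the name.
                dst.writestr(name, src.read(info))
    return len(last_by_name)
```

The return value is the number of unique entries written, which can be compared against the entry count of the original archive.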
That's great to know what it is and thank you for helping troubleshoot. What version of CyLR were you using and on what OS?
Versions: The files the unzip process wanted to add dupes of were all of the following type: (note I have changed the in path username to ):
There was a set of about 20 of these for each user account on the system (I'd added just 4). I wonder if the issue is that CyLR is accidentally adding the dir twice.
None of those are duplicates, as each has a unique name in the example provided and the username should be unique. It pulls the usernames from the registry in order to look up the pathing for those. I wonder if there are multiple registry entries for the same username (like added, deleted, added again).
This is so weird. I verified in the CyLR code that ....AppData/Roaming/Microsoft/Windows/Recent/AutomaticDestinations is only referenced once: https://github.com/orlikoski/CyLR/blob/master/CyLR/src/CollectionPaths.cs#L91 I also confirmed that everything in that folder is being stored twice (but only for that one path; none of the others have duplicates), so that is an issue. I don't know why it didn't show up before, nor do I know why CDQR is now spitting out errors when extracting it. I also don't know why the unzip is failing, as it uses ZipFile (https://docs.python.org/3/library/zipfile.html). I then did a test at the command line to extract a CyLR 2.1.0 collection from Windows 10 Pro WITH DUPLICATES using Then I tried using the CDQR 5.1.0 docker to process the zip file and it worked with no errors. I don't know what's going on to cause it to work for me and not for others.
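To check whether a given collection stores the same path more than once, a quick sketch with the standard `zipfile` module can list the repeated names. The helper name `duplicate_entries` is made up for illustration and is not something CDQR or CyLR provides.

```python
import zipfile
from collections import Counter


def duplicate_entries(zip_path):
    """Return {entry_name: count} for names stored more than once in the zip."""
    with zipfile.ZipFile(zip_path) as archive:
        # namelist() returns one name per stored entry, repeats included.
        counts = Counter(archive.namelist())
    return {name: n for name, n in counts.items() if n > 1}
```

An empty result means every entry name in the archive is unique; any other result shows which paths were written twice.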
Hi!
Note that originally the hostname was DESKTOP-XXXXXX, and I removed the dash in the filename (with the original filename, the same error happened in position 96). I'm running CyLR 2.1.0 on WIN10 PRO 1903, and CDQR on the Skadi OVA (default, just downloaded today and run in VMware Workstation). Edit: I tried the unzip (same problem with overwriting files in the same folder as in the above post) and zipping the result, and now it fails with another character:
Hi @jofarinha, I will try to take a look at this sometime this week and test it myself to see if I experience the same issue.
Any news on this? I'm facing a similar problem. The Results folder contains 16 GB of data after this, which means it was extracted successfully, but Kibana isn't showing any case data. The letter 'ä' is part of neither the zip folder name nor the hostname. The ZIP file is around 360 MB and processing just cancels after some time. Unzipping it yields 16 GB of data, and CDQR finishes instantly and successfully when given the folder instead of the zip file.
If I remember correctly, this is an issue with the library Python uses for archives. Changing that out is possible but will require reworking that function to ensure it works correctly in Windows, Linux, and macOS environments. The easiest and fastest way forward is to manually extract any archives that have this issue into a temp folder, such as
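As a rough sketch of that manual-extraction workaround, the archive can be unpacked into a fresh temp folder with Python's standard library, and CDQR can then be pointed at the resulting folder instead of the zip file. The function name `extract_to_temp` and the `cdqr_extract_` prefix are illustrative assumptions, not anything CDQR defines.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_to_temp(zip_path):
    """Extract zip_path into a new temp folder and return that folder's path."""
    # mkdtemp creates a unique, empty directory so nothing existing is touched.
    target = Path(tempfile.mkdtemp(prefix="cdqr_extract_"))
    with zipfile.ZipFile(zip_path) as archive:
        # Entries that share a name overwrite each other; the last one wins.
        archive.extractall(target)
    return target
```

The caller is responsible for deleting the folder afterwards, which matters here since a collection can expand to many gigabytes.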
First off - cool tool 👍
My hostname has two '-' in it and this causes CDQR to fail at position 113.
The u2013 char is the en dash (–): https://www.fileformat.info/info/unicode/char/2013/index.htm
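To illustrate how a "position N" figure like the one above can arise, here is a small sketch that finds the first character in a string that a given codec cannot encode. Which codec CDQR's environment was actually using is not stated in this thread, so the `ascii` default below is purely an assumption, and `first_unencodable` is a hypothetical helper.

```python
def first_unencodable(text, encoding="ascii"):
    """Return (position, character) of the first character that cannot be
    encoded with `encoding`, or None if the whole string encodes cleanly."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as err:
        # err.start is the index reported as "position" in the error message.
        return err.start, text[err.start]
    return None


# An en dash (U+2013) is not ASCII, while the hyphen-minus (U+002D) is.
print(first_unencodable("DESKTOP\u2013XXXXXX"))
print(first_unencodable("DESKTOP-XXXXXX"))
```

Running the same check with a different `encoding` argument shows whether a given name would pass under that codec instead.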