7-Zip and PeaZip can't open tar files created by DART #229

Open
diamondap opened this issue Mar 5, 2020 · 17 comments

@diamondap
Member

diamondap commented Mar 5, 2020

I'm recording this as an informational issue. The bug is actually in 7-Zip, which PeaZip uses under the hood, as shown in the attached screen shot. 7-Zip's inability to parse PAX headers is an open issue at https://sourceforge.net/p/sevenzip/bugs/2116/

Note that DART uses PAX headers because the old ustar headers could describe files only up to 8GB in size (the ustar size field holds 11 octal digits). PAX headers let us include larger files in the archive.
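
A minimal sketch of why PAX matters for large files, using Python's standard tarfile module rather than the tar-stream library DART uses. In this module the ustar size field holds 11 octal digits, so a plain ustar header is rejected once the size reaches 8**11 bytes (8 GiB); other tools may enforce a different cutoff. The member name and size below are made up for illustration.

```python
import tarfile

# Hypothetical member of 9,000,000,000 bytes. Only the header is built,
# so no file data of that size is needed.
info = tarfile.TarInfo(name="big.bin")
info.size = 9_000_000_000


def ustar_header_fits(member: tarfile.TarInfo) -> bool:
    """Return True if the member can be described by a plain ustar header."""
    try:
        member.tobuf(format=tarfile.USTAR_FORMAT)
        return True
    except ValueError:
        # tarfile raises ValueError when a number overflows its octal field.
        return False


# With PAX, the oversized value moves into an extended header record
# ("size=...") that precedes the regular header block.
pax_header = info.tobuf(format=tarfile.PAX_FORMAT)
ustar_ok = ustar_header_fits(info)
```

Here `ustar_ok` ends up false, while `pax_header` carries the real size in a `size=` record. Readers that do not understand PAX records cannot recover that size, which fits the failures described above.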

The image below shows PeaZip's detailed error. Before you see this, PeaZip shows a message saying it thinks the archive is encrypted, and it asks for a password. This is incorrect. DART does not create password-protected tar files.

PeaZip_Error

@diamondap
Member Author

diamondap commented Mar 5, 2020

You can open DART tar files on Windows using the built-in Windows tar command. See https://www.addictivetips.com/windows-tips/use-tar-on-windows-10/ for details on how to use it.

Microsoft's release of tar is described here. Their version uses libarchive under the hood, which does support PAX headers.

@diamondap
Member Author

Closing: This is a bug in 7-Zip, not in DART.

@kieranjol

Bug raised with https://sourceforge.net/p/sevenzip/bugs/2396/

@kieranjol

kieranjol commented Jul 18, 2023

Hi Andrew, there does appear to be an issue with the TAR files created by DART.
From Igor's thread on the 7-ZIP forum:

Tar requires 2 records of zeros at the end:
The end of an archive is marked by at least two consecutive zero-filled records.
Record is 512-byte.
So it's a bug in DART, which doesn't write zeros to the end of the tar.

If we have no zeros at the end of the tar, we have no confirmation that the archive is correct. For example, the archive file could be truncated.

I made a tar with 7-Zip and also with the tarfile module in Python, and both have the required zeroes at the end. DART tars have only 82 bytes of zeroes at the end, from my limited hex viewing.
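
The hex-viewer check can be roughly automated. This is a minimal sketch in Python, not anything DART or 7-Zip actually runs: it counts the zero bytes at the end of an archive and compares them with the two 512-byte records the quote above calls for. The helper names are hypothetical. Counting trailing zeros is a naive heuristic, since a last file whose own data ends in zero bytes would inflate the count.

```python
import io
import tarfile

BLOCK = 512  # tar record size in bytes


def trailing_zero_bytes(data: bytes) -> int:
    """Count the zero bytes at the very end of the archive."""
    return len(data) - len(data.rstrip(b"\0"))


def has_end_marker(data: bytes) -> bool:
    """Naive check: block-aligned length and at least two zero records."""
    return len(data) % BLOCK == 0 and trailing_zero_bytes(data) >= 2 * BLOCK


def make_sample_tar() -> bytes:
    """Build a small in-memory tar with Python's tarfile module."""
    payload = b"hello\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


good = make_sample_tar()
# Simulate an archive that ends with only 82 zero bytes, as observed above.
short = good.rstrip(b"\0") + b"\0" * 82
```

With these definitions, `has_end_marker(good)` is true and `has_end_marker(short)` is false. To check a real file, read it in binary mode and pass the bytes to `has_end_marker`; for very large bags, reading only the last few kilobytes would be kinder to memory.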

Edit: 7-Zip has improved TAR support in general in the last few years. Previous versions only extracted folders and no files, and there were PAXHEADER warnings. Newer versions extract files from DART TAR files just fine; they just show that warning at the end.

@diamondap
Member Author

Good to know. The issue likely comes from the underlying JavaScript tar-stream library. I'm in the process of rewriting DART, so I hope this will be fixed in future versions.

@kieranjol

I made a >4 gig file with https://github.com/mafintosh/tar-fs which uses tar-stream (same developer) and it has the correct number of zeroes at the end. Good to know that this is on the list anyhow. Should this be re-opened, or should I make a new issue, as I think it's now a different situation to the original post in this issue?

@diamondap
Member Author

I'll reopen it.

@diamondap diamondap reopened this Jul 18, 2023
@zoidy

zoidy commented Aug 28, 2023

I'm not sure on which side (DART or 7-Zip) or when the change happened, but I've been having no problems opening DART-created .tar files with 7-Zip for a while now. I'm using DART 2.0.22 and 7-Zip 22.01 dated 2022-07-15.

@diamondap
Member Author

I'm working on a new version of DART. It won't be available for a few months, but I will test DART tar files with 7-zip in the new version.

@zoidy

zoidy commented Aug 28, 2023

Not quite sure how to interpret your response. It's actually already working with the current DART version, 2.0.22, at least on Windows. :) See responses below.

@kieranjol

Are you not getting the 'Unexpected end of archive' message with 7-zip? 7-ZIP also successfully extracts TAR packages for me but I get that warning due to the insufficient number of zeros at the end of the TAR file.

@zoidy

zoidy commented Aug 31, 2023

@kieranjol I do not. Both testing and extracting TARs created using the versions of DART and 7-Zip I mentioned above work fine with no warnings or errors. This was curious to me and after a bit of testing, I think there are two peculiarities going on. The first is a 7-Zip issue, the second is DART-related.

1. Difference between 7-Zip GUI and 7-Zip command line
I took a look at your bug report on the 7-Zip bug tracker and downloaded your test file. I use the 7-Zip GUI and not the context menus. The GUI does not give any errors for me with your test file.
image
However, using the command line version does show the "Unexpected end of archive" message. This indicates that the 7-Zip GUI and the command line / DLL versions handle errors differently.

2. Difference in TARs generated by DART and dart-runner
I'm not sure what version of DART you used for your test file, but I've been using dart-runner to generate bags in an automated workflow. The bags generated by dart-runner don't seem to suffer from the "Unexpected end of archive" issue. To compare, I generated a bag using the DART GUI (on Windows) and did observe the issue there (notably, the 7-Zip GUI continued to not show any errors at all). This may be because dart-runner is Linux-based and might use a different TAR library? If so, then it may be that only the Windows version of DART is affected by this bug (I haven't tested the Mac or Linux versions).

@kieranjol

Thanks for this detailed breakdown.

  1. I can replicate this with 7-ZIP 23.01. Using the Extract menu in the GUI, rather than the context menu, there is no error warning. I would argue that there should be a warning, if an insufficient number of zero bytes are present at the end of the tar, as prescribed in the specification. So I think that this should be flagged with the 7-ZIP developer to see what he thinks, as I would imagine that his intention, based on the sourceforge thread, is that a warning should appear?
  2. I think I was actually using 2.0.21, but I just upgraded to 2.0.22 DART GUI on Mac there, and I can still see the same number of zeroes at the end of the tar in a hex editor. Very interesting that dart-runner behaves differently!

Can you share an example tar from dart-runner?

@zoidy

zoidy commented Aug 31, 2023

Sure thing. Here it is

@kieranjol

kieranjol commented Aug 31, 2023

That is so fascinating! There is the correct number of zero bytes at the end of the dart-runner example!
DART GUI on the left, dart-runner on the right.
Screenshot 2023-08-31 at 16 39 52

@diamondap
Member Author

Sorry I haven't been tuned in to this thread. I'm on vacation.

DART is written in JavaScript and uses the tar-stream library to write tar files. DART Runner is written in Go and uses the archive/tar library, which is part of the core of the Go language. It doesn't surprise me that one produces incorrect output while the other is correct. I've found many JavaScript libraries to be unreliable.

I'm in the process of writing DART 3.0 right now, ditching JavaScript in favor of Go. The JavaScript-Node-Electron platform on which DART is built has been exceedingly hard to maintain and extend. The goal of the rewrite is to provide a more stable, reliable, maintainable and extensible platform.

So the short answer to this issue is that it almost certainly won't be fixed in the DART 2.x line. It will be fixed in DART 3.0. I hope to have a beta release of 3.0 later this year.

@kieranjol

Thanks for all this, it's been helpful for me to get to the bottom of this as well. 3.0 fix sounds like it makes the most sense. Looking forward to testing it out.
