[.NET 10 Preview 1] System.IO.Compression.ZipArchive produces subtly incorrect zip headers #112017

Open
jonathanpeppers opened this issue Jan 30, 2025 · 32 comments

Comments

@jonathanpeppers
Member

Description

We noticed this here:

We use System.IO.Compression.ZipArchive to create Android .apk files (and other archives).

When signing an .apk with Android's zipalign tool, it reports:

01-30 21:38:27.669 38611 159726 W zip     : WARNING: header mismatch 

However, if I inspect the actual file, some tools are ok with it:

> 7z t D:\Downloads\com.companyname.DotNetNewandroid.apk

7-Zip 24.09 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-11-29

Scanning the drive for archives:
1 file, 7069900 bytes (6905 KiB)

Testing archive: D:\Downloads\com.companyname.DotNetNewandroid.apk
--
Path = D:\Downloads\com.companyname.DotNetNewandroid.apk
Type = zip
Physical Size = 7069900

Everything is Ok

Files: 46
Size:       20333467
Compressed: 7069900

So, then we tried:

$ zipdetails com.companyname.DotNetNewandroid.apk
...
#
# ERROR: Found 1 Field Mismatch for Filename 'AndroidManifest.xml'
#
#  --------------------------------------------------------------------------------------
#  | Field Name       | Central Offset     | Central Value | Local Offset | Local Value | 
#  --------------------------------------------------------------------------------------
#  | Extract Zip Spec | 0x6BD1E2 (7066082) | 0x14 (20) 2.0 | 0x7 (7)      | 0x0 (0) 0.0 | 
#  --------------------------------------------------------------------------------------
#
# ERROR: Found 1 Field Mismatch for Filename 'res/layout/activity_main.xml'
#
#  --------------------------------------------------------------------------------------
#  | Field Name       | Central Offset     | Central Value | Local Offset | Local Value | 
#  --------------------------------------------------------------------------------------
#  | Extract Zip Spec | 0x6BD223 (7066147) | 0x14 (20) 2.0 | 0x460 (1120) | 0x0 (0) 0.0 | 
#  --------------------------------------------------------------------------------------
#
# ERROR: Found 1 Field Mismatch for Filename 'res/layout/layout.xml'
#
#  --------------------------------------------------------------------------------------
#  | Field Name       | Central Offset     | Central Value | Local Offset | Local Value | 
#  --------------------------------------------------------------------------------------
#  | Extract Zip Spec | 0x6BD26D (7066221) | 0x14 (20) 2.0 | 0x5C0 (1472) | 0x0 (0) 0.0 | 
#  --------------------------------------------------------------------------------------
#
# ERROR: Found 1 Field Mismatch for Filename 'res/mipmap-anydpi-v26/appicon.xml'
#
#  ----------------------------------------------------------------------------------------
#  | Field Name       | Central Offset     | Central Value | Local Offset   | Local Value | 
#  ----------------------------------------------------------------------------------------
#  | Extract Zip Spec | 0x6BD7A4 (7067556) | 0x14 (20) 2.0 | 0x779C (30620) | 0x0 (0) 0.0 | 
#  ----------------------------------------------------------------------------------------
#
# ERROR: Found 1 Field Mismatch for Filename 'res/mipmap-anydpi-v26/appicon_round.xml'
#
#  ----------------------------------------------------------------------------------------
#  | Field Name       | Central Offset     | Central Value | Local Offset   | Local Value | 
#  ----------------------------------------------------------------------------------------
#  | Extract Zip Spec | 0x6BD7F3 (7067635) | 0x14 (20) 2.0 | 0x78B0 (30896) | 0x0 (0) 0.0 | 
#  ----------------------------------------------------------------------------------------
#
# Error Count: 5

Full output: https://gist.github.com/grendello/3007335dba59383ac7782311d262e844

Reproduction Steps

  • dotnet new android
  • dotnet build
Xamarin.Android.Common.targets(2547,2): error ANDZA0000: 01-30 21:38:27.669 38611 159726 W zip     : WARNING: header mismatch 

Expected behavior

ZipArchive produces files that do not cause Android tooling to produce errors.

Actual behavior

ZipArchive produces files that cause Android tooling to produce errors.

Regression?

Yes

Known Workarounds

We are trying an existing flag, $(_AndroidUseLibZipSharp)=true, which falls back to an open-source zip library we used to use.

Configuration

.NET SDK: 10.0.100-preview.1.25079.13
.NET runtime: 10.0.0-preview.1.25078.5

Archive: com.companyname.DotNetNewandroid.apk.zip

Other information

No response

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 30, 2025
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 30, 2025
Contributor

Tagging subscribers to this area: @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

@carlossanlop
Member

@edwardneal seems there's a bug from the latest changes.

@jonathanpeppers
Member Author

It could be related to us using this file as input: packaged_resources.zip

A tool named aapt2 creates this original file.

Then we add additional files and save it to disk.

@carlossanlop
Member

Actually, it might not be a bug. We did change the logic that writes the zip headers, @jonathanpeppers. If the files are not malformed and can be read without issues, then maybe we need to let the tool know that the mismatch is expected this time.

@carlossanlop
Member

These were the changes introduced to System.IO.Compression.ZipArchive for preview1:

#103153
#102704
#111802

@carlossanlop carlossanlop removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 30, 2025
@carlossanlop carlossanlop self-assigned this Jan 30, 2025
@jonathanpeppers
Member Author

then maybe we need to let the tool know that the mismatch is expected this time.

zipalign is not owned by us; it's part of Google's Android SDK and we have to shell out to it.

Some .zip tools can open the archive, but we don't yet know whether the resulting Android apps are installable and able to run.

We are trying a workaround to see if this is blocking, thanks.

@carlossanlop
Member

If you open the archive with different tools, can you open the individual files and read their contents normally?

@jonathanpeppers
Member Author

Android Studio can open the file and view contents:

[Image: screenshot of the .apk contents open in Android Studio]

But because the "warning" message causes a build error, I don't know if it will run on Android yet.

@lewing
Member

lewing commented Jan 30, 2025

for reference https://github.com/aosp-mirror/platform_build/blob/e8ac1cdd2583213911e91a990398e19c51002924/tools/zipalign/ZipEntry.cpp#L93

@carlossanlop
Member

The full log prints the local header and the central header, and indeed there is only one field that does not match between the two, the Extract Zip Spec:

000000 LOCAL HEADER #1       04034B50 (67324752)
000004 Extract Zip Spec      00 (0) '0.0'   <- Mismatch
000005 Extract OS            00 (0) 'MS-DOS'
000006 General Purpose Flag  0000 (0)
       [Bits 1-2]            0 'Normal Compression'
000008 Compression Method    0008 (8) 'Deflated'
00000A Modification Time     00210000 (2162688) 'Tue Jan  1 01:00:00 1980'
00000E CRC                   44FE7197 (1157525911)
000012 Compressed Size       00000428 (1064)
000016 Uncompressed Size     00000C20 (3104)
00001A Filename Length       0013 (19)
00001C Extra Length          0000 (0)
00001E Filename              'AndroidManifest.xml'
000031 PAYLOAD

vs

6BD1DB CENTRAL HEADER #1     02014B50 (33639248)
6BD1DF Created Zip Spec      14 (20) '2.0'
6BD1E0 Created OS            03 (3) 'Unix'
6BD1E1 Extract Zip Spec      14 (20) '2.0'    <- Mismatch
6BD1E2 Extract OS            00 (0) 'MS-DOS'
6BD1E3 General Purpose Flag  0000 (0)
       [Bits 1-2]            0 'Normal Compression'
6BD1E5 Compression Method    0008 (8) 'Deflated'
6BD1E7 Modification Time     00210000 (2162688) 'Tue Jan  1 01:00:00 1980'
6BD1EB CRC                   44FE7197 (1157525911)
6BD1EF Compressed Size       00000428 (1064)
6BD1F3 Uncompressed Size     00000C20 (3104)
6BD1F7 Filename Length       0013 (19)
6BD1F9 Extra Length          0000 (0)
6BD1FB Comment Length        0000 (0)
6BD1FD Disk Start            0000 (0)
6BD1FF Int File Attributes   0000 (0)
       [Bit 0]               0 'Binary Data'
6BD201 Ext File Attributes   00000000 (0)
6BD205 Local Header Offset   00000000 (0)
6BD209 Filename              'AndroidManifest.xml'
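
For anyone who wants to reproduce this comparison without zipdetails, here is a minimal Python sketch. It is not how zipalign or zipdetails do it; it only assumes the standard ZIP layout, in which the one-byte "Extract Zip Spec" value sits directly after the 4-byte local header signature. The function name `find_version_mismatches` is made up for illustration.

```python
import io
import zipfile

LOCAL_SIG = b"PK\x03\x04"  # 0x04034B50, stored little-endian


def find_version_mismatches(data: bytes):
    """Return (name, central_version, local_version) for every entry whose
    'version needed to extract' byte differs between its central directory
    header and its local file header."""
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            off = info.header_offset
            if data[off:off + 4] != LOCAL_SIG:
                raise ValueError(f"no local header at {off:#x} for {info.filename}")
            # Byte 4 of the local header is the spec version ("Extract Zip Spec");
            # zipfile exposes the central directory copy as info.extract_version.
            local_version = data[off + 4]
            if local_version != info.extract_version:
                mismatches.append((info.filename, info.extract_version, local_version))
    return mismatches
```

Running this over the bytes of the attached .apk should list the same entries that zipdetails flags, assuming the central directory itself is readable.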

@carlossanlop
Member

carlossanlop commented Jan 30, 2025

The erroneous value seems to be 0. The possible values for VersionMadeBySpecification are those indicated in this enum, and 0 is not among them:

internal enum ZipVersionNeededValues : ushort
{
    Default = 10,
    ExplicitDirectory = 20,
    Deflate = 20,
    Deflate64 = 21,
    Zip64 = 45
}
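
To make the enum concrete, here is a small Python sketch of how a minimum "version needed to extract" could be derived from an entry's properties. It mirrors the enum values above but is only an illustration, not the runtime's actual logic; the function name and parameters are hypothetical.

```python
# Compression method IDs as stored in ZIP headers.
STORED, DEFLATE, DEFLATE64 = 0, 8, 9


def min_version_needed(method: int, is_directory: bool = False,
                       needs_zip64: bool = False) -> int:
    """Return the smallest valid 'version needed to extract' value
    (10 = 1.0, 20 = 2.0, 21 = 2.1, 45 = 4.5) for an entry."""
    version = 10                      # Default
    if is_directory or method == DEFLATE:
        version = 20                  # ExplicitDirectory / Deflate
    if method == DEFLATE64:
        version = 21                  # Deflate64
    if needs_zip64:
        version = 45                  # Zip64
    return version
```

Under this sketch a Deflate entry can never legitimately carry 0, which is the value seen in the five local headers.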

@jonathanpeppers you mention the zip file is created with the external tool aapt2. If we're opening the original zip archive using System.IO.Compression to add extra files, do you know if the files reporting the warning were added by us, or did they come from the aapt2?


Just so we have these links available for quick reach if needed, I collected the related lines of code that were changed recently:


The length of the VersionMadeBySpecification value is now specified here:

public const int VersionMadeBySpecification = sizeof(byte);

The location of the field is specified here:

public static readonly int VersionMadeBySpecification = Signature + FieldLengths.Signature;

The central directory value is read here:

header.VersionMadeBySpecification = buffer[FieldLocations.VersionMadeBySpecification];

The central directory value is written here:

cdStaticHeader[ZipCentralDirectoryFileHeader.FieldLocations.VersionMadeBySpecification] = (byte)_versionMadeBySpecification;

Also, the shared apk.zip file weighs 6.75 MB, meaning we're not using zip64 (that code kicks in at 4GB and above). I mention this because there's a method called VersionToExtractAtLeast (which was not modified in any of the recent PRs) that can modify the value of _versionMadeBySpecification when zip64 is needed, but we can discard this case.

@jonathanpeppers
Member Author

The entries for which zipdetails reports "ERROR: Found 1 Field Mismatch for Filename" appear to have been in the input archive created by aapt2: #112017 (comment)

But not all of them are incorrect, only 5 seem to be wrong?

@carlossanlop
Member

carlossanlop commented Jan 30, 2025

Appear to have been in the input archive created by aapt2: #112017 (comment)

Okay, that might be good news from the runtime perspective, then. The recent changes may not have introduced the bug.

But not all of them are incorrect, only 5 seem to be wrong?

You're right, that's weird. The 5 files reported with warnings had a problem with the spec field. But for most of the entries, that field is correct. Example of an entry without issues:

654120 LOCAL HEADER #46      04034B50 (67324752)
654124 Extract Zip Spec      14 (20) '2.0'
654125 Extract OS            00 (0) 'MS-DOS'
654126 General Purpose Flag  0000 (0)
       [Bits 1-2]            0 'Normal Compression'
654128 Compression Method    0008 (8) 'Deflated'
65412A Modification Time     5A3EA840 (1514055744) 'Thu Jan 30 22:02:00 2025'
65412E CRC                   A34CA322 (2739708706)
654132 Compressed Size       00069074 (430196)
654136 Uncompressed Size     00152E18 (1388056)
65413A Filename Length       0029 (41)
65413C Extra Length          0000 (0)
65413E Filename              'lib/x86_64/libxamarin-debug-app-helper.so'
654167 PAYLOAD
6BE05F CENTRAL HEADER #46    02014B50 (33639248)
6BE063 Created Zip Spec      14 (20) '2.0'
6BE064 Created OS            03 (3) 'Unix'
6BE065 Extract Zip Spec      14 (20) '2.0'
6BE066 Extract OS            00 (0) 'MS-DOS'
6BE067 General Purpose Flag  0000 (0)
       [Bits 1-2]            0 'Normal Compression'
6BE069 Compression Method    0008 (8) 'Deflated'
6BE06B Modification Time     5A3EA840 (1514055744) 'Thu Jan 30 22:02:00 2025'
6BE06F CRC                   A34CA322 (2739708706)
6BE073 Compressed Size       00069074 (430196)
6BE077 Uncompressed Size     00152E18 (1388056)
6BE07B Filename Length       0029 (41)
6BE07D Extra Length          0000 (0)
6BE07F Comment Length        0000 (0)
6BE081 Disk Start            0000 (0)
6BE083 Int File Attributes   0000 (0)
       [Bit 0]               0 'Binary Data'
6BE085 Ext File Attributes   81E40000 (2179203072)
       [Bits 16-24]          01E4 (484) 'Unix attrib: rwxr--r--'
       [Bits 28-31]          08 (8) 'Regular File'
6BE089 Local Header Offset   00654120 (6635808)
6BE08D Filename              'lib/x86_64/libxamarin-debug-app-helper.so'

@jonathanpeppers did the aapt2 tool get updated recently? Is it being used for the first time? I think the investigation should now go in that direction.

@carlossanlop
Member

carlossanlop commented Jan 31, 2025

@jonathanpeppers There is another case we need to verify: Do you have access to the original archive, before it is updated by System.IO.Compression? Can you run that original file with the zipdetails tool?

I ask because the new ZipArchive code will reorder and rewrite any existing entries that were modified, and will also rewrite all the entries that follow the first modified one. After that, it will write the new entries at the end.
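
As a rough illustration of that write-out behaviour, here is a toy Python sketch. It is not the ZipArchive implementation: `plan_save` and its parameters are invented names, and it ignores the reordering step, modelling only which entries are left in place, which are rewritten, and which are appended.

```python
def plan_save(existing, modified, added):
    """Split entries into (untouched, rewritten, appended).

    existing: entry names in their on-disk order
    modified: set of existing names whose data or metadata changed
    added:    names of brand-new entries
    """
    # Find the first modified entry; everything from there on is rewritten.
    first = next((i for i, name in enumerate(existing) if name in modified), None)
    if first is None:
        untouched, rewritten = list(existing), []
    else:
        untouched, rewritten = list(existing[:first]), list(existing[first:])
    return untouched, rewritten, list(added)
```

In this model, entries before the first modified one keep whatever local headers they had in the input file, which is why the state of the original archive matters.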

I'd like to confirm the file was malformed before it was updated.

@edwardneal
Contributor

Thanks @carlossanlop. I've checked the original and updated ZIP files. It's the combination of not rewriting the archive, and an unusual input file.

The original archive has entries which are written with an Extract Zip Spec ("version needed to extract" in the spec.) field of 0x00. In the five cases we see, this isn't valid - the entry's compression methods are Deflate, and the specification says that the minimum Extract Zip Spec for this is 2.0 (0x14).

ZipArchiveEntry notes this when it's constructed and sets the Extract Zip Spec field correctly. When the Central Directory Header is written, this correct value is then written out as part of that process. The entries with incorrect Extract Zip Spec fields haven't had their contents changed though, so they (and their local file headers containing the field value of 0x00) aren't rewritten. This is where the mismatch comes from - the CD header has been corrected, but the local file headers haven't been.

I'm fairly sure that we can modify VersionToExtractAtLeast to set the ChangeState.FixedLengthMetadata bit on ZipArchiveEntry.Changes when the value of the Extract Zip Spec field needs to change. This'll force the local file headers of those five entries to be rewritten with the correct values when the file is written out.
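
The mechanism and the proposed fix can be modelled with a toy Python sketch. None of this is the real ZipArchive code: the `Entry` class, `load`, `save`, and the numeric value of `FIXED_LENGTH_METADATA` are assumptions made for illustration, loosely named after the ChangeState.FixedLengthMetadata bit mentioned above.

```python
from dataclasses import dataclass

DEFLATE = 8
FIXED_LENGTH_METADATA = 0x1  # hypothetical flag value


@dataclass
class Entry:
    name: str
    method: int
    local_version: int   # value currently stored in the on-disk local header
    version_needed: int  # in-memory value, always written to the central directory
    changes: int = 0


def load(name, method, on_disk_version, mark_changed):
    """Read an entry, correcting an invalid version in memory.
    mark_changed=True models the proposed fix."""
    entry = Entry(name, method, on_disk_version, on_disk_version)
    required = 20 if method == DEFLATE else 10
    if entry.version_needed < required:
        entry.version_needed = required
        if mark_changed:
            entry.changes |= FIXED_LENGTH_METADATA
    return entry


def save(entry):
    """Return (local, central) versions as they would end up on disk."""
    if entry.changes & FIXED_LENGTH_METADATA:
        entry.local_version = entry.version_needed  # local header rewritten
    return entry.local_version, entry.version_needed
```

Without the flag, `save(load("AndroidManifest.xml", DEFLATE, 0, mark_changed=False))` yields the mismatched pair `(0, 20)`; with `mark_changed=True` both headers carry 20.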

I can submit a PR fixing this tomorrow, but if that's too close to the release and _AndroidUseLibZipSharp doesn't fix the problem then perhaps rolling the three changes back and pushing them into the next preview would be safer?

I'm not sure why aapt2 is generating the files as it does though.

@carlossanlop
Member

We are past code complete for Preview1. If there's no workaround for this, @jonathanpeppers , we can bring this up to Tactics and determine if we should revert the changes or take the fix or do nothing.

Thanks for the confirmation @edwardneal . Please send the PR with the fix to have it ready.

@carlossanlop
Member

carlossanlop commented Jan 31, 2025

Good news: We got green light to merge a fix, @edwardneal . We still have some runway.

You'd send it to main, I'd take care of backporting it to preview1.

@jonathanpeppers
Member Author

@carlossanlop we have a valid workaround, we switched to using our old library for handling zip files:

The only drawback is that it has worse performance than System.IO.Compression. But I think that is ok for a preview.

@carlossanlop
Member

carlossanlop commented Jan 31, 2025

@jonathanpeppers I am ready to send an email to Tactics requesting backport approval for the fix #112032 .

Do you prefer to use the workaround for preview1 (no backport) and then switch back to aapt2 + the ziparchive fix for preview2?

@jonathanpeppers
Member Author

It can be up to you, we are OK to leave the workaround in for preview 1.

And then main/preview 2, we can verify the fix.

@carlossanlop
Member

Ok thanks. Let's follow the cautious approach and make the fix available in preview 2 instead.

jonathanpeppers added a commit to dotnet/android that referenced this issue Jan 31, 2025
Changes: dotnet/sdk@aca4b81...ee4ea82

Updates:

* Microsoft.NET.Sdk: from 10.0.100-alpha.1.25069.2 to 10.0.100-preview.1.25080.14
* Microsoft.NETCore.App.Ref: from 10.0.0-alpha.1.25067.10 to 10.0.0-preview.1.25078.5 (parent: Microsoft.NET.Sdk)
* Microsoft.NET.ILLink.Tasks: from 10.0.0-alpha.1.25067.10 to 10.0.0-preview.1.25078.5 (parent: Microsoft.NET.Sdk)

Other changes:

* Default app project builds to `$(_AndroidUseLibZipSharp)=true` to avoid:

    error ANDZA0000: 01-30 21:38:27.669 38611 159726 W zip     : WARNING: header mismatch

Reported: dotnet/runtime#112017

* Update `<GitBranch/>` MSBuild task to substring long `darc-` branch names to avoid:

    NuGet.Build.Tasks.Pack.targets(221,5): error NU5123: Warning As Error: The file 'package/services/metadata/core-properties/027a96e260344b159133798c830dab61.psmdcp' path, name, or both are too long. Your package might not work without long file path support. Please shorten the file path or file name.

Co-authored-by: Jonathan Peppers <[email protected]>
jonathanpeppers added a commit to dotnet/android that referenced this issue Feb 3, 2025
Changes: dotnet/sdk@aca4b81...d6bc791
Changes: dotnet/runtime@6c58f79...e51af40

Updates:

* Microsoft.NET.Sdk: from 10.0.100-alpha.1.25069.2 to 10.0.100-preview.2.25102.3
* Microsoft.NETCore.App.Ref: from 10.0.0-alpha.1.25067.10 to 10.0.0-preview.2.25101.4
* Microsoft.NET.ILLink.Tasks: from 10.0.0-alpha.1.25067.10 to 10.0.0-preview.2.25101.4

Other changes:

Context: dotnet/runtime#112017
Context: dotnet/runtime#112032

* Default to `$(_AndroidUseLibZipSharp)=true` to temporarily work around
a `System.IO.Compression.ZipArchive` issue.

In a future PR, we'll revert this change to use System.IO.Compression where
appropriate.

Co-authored-by: Jonathan Peppers <[email protected]>
@carlossanlop
Member

@jonathanpeppers the fix is part of preview2. Can you confirm the failure is gone? Can we close this?

jonathanpeppers added a commit to dotnet/android that referenced this issue Feb 25, 2025
Context: dotnet/runtime#112017

In .NET 10 Preview 1, when using System.IO.Compression to create
`.apk` files, `zipalign` was giving the error:

    01-30 21:38:27.669 38611 159726 W zip     : WARNING: header mismatch

To work around this, we temporarily set `$(_AndroidUseLibZipSharp)=true`.

We think this is fixed now, so partially revert f3ef4fe.
@jonathanpeppers
Member Author

I think the results now might be worse...

The new error acts like the file is missing (but it is present at that path):

`zipalign` Unable to open 'obj/Debug/android/bin/com.xamarin.defaultitems.apk' as zip archive: No such file or directory 

I think they might just print this error if unable to open the file:

When I download the file and inspect, 7z reports errors:

> 7z t "D:\Downloads\com.xamarin.defaultitems.apk"

7-Zip 24.09 (x64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-11-29

Scanning the drive for archives:
1 file, 7063450 bytes (6898 KiB)

Testing archive: D:\Downloads\com.xamarin.defaultitems.apk
--
Path = D:\Downloads\com.xamarin.defaultitems.apk
Type = zip
Physical Size = 7063450

ERROR: Headers Error : assets\foo\bar.txt

Sub items Errors: 1

Archives with Errors: 1

Sub items Errors: 1

Here is the file, I renamed the extension to .zip:

com.xamarin.defaultitems.zip

@carlossanlop
Member

Okay, I think we will have to revert the last fix.

@edwardneal
Contributor

I agree - the current behaviour leaves us in a worse position than before, can we revert it from preview2?

I'll try to smoke-test the Android build with my local runtime in case there are any other issues, then reattempt the fix.

@carlossanlop
Member

can we revert it from preview2

Maybe. I need to check if we have runway.

I'll submit the revert to main for now, and if I get a green light, will submit the backport too.

I'll try to smoke-test the Android build with my local runtime in case there are any other issues, then reattempt the fix.

Do you mean retrying this fix to mitigate the aapt2/zipalign problem? I don't think it's worth it. We aren't really at fault here.

@carlossanlop
Member

carlossanlop commented Feb 25, 2025

I don't think it's worth it. We aren't really at fault here.

@edwardneal But what about reverting to the previous behavior before the refactoring? Looks like we tried to be more correct with the refactoring but maybe we should keep the handling of the "version needed to extract" exactly the same as it was before the refactoring.

@edwardneal
Contributor

Thanks @carlossanlop. We could do that - we'd need to revert #102704, maintaining the fix from #111802. We'd lose the memory usage improvements, but the BinaryReader/BinaryWriter removals would remain in situ (and so the API proposal for async ZipFile APIs would remain unblocked.)

At the moment though, I think it'd be better to get more feedback from preview1. As of the current release, the issue is that ZipArchive doesn't gracefully handle files which were originally slightly malformed, not that it's corrupting well-formed inputs. If we see any instances of the latter, then I'm in favour of reverting.

Separately: although we're not strictly at fault for the malformed ZIP file, the step to correct the header is also fairly trivial. I'd personally prefer to make ZipArchive more forgiving in this case, so that dotnet/android can use the library for .NET 10 and doesn't need to account for the version of aapt2 in use. What do you think?

@carlossanlop
Member

Sorry if I wasn't clear, I don't think we need to revert the memory improvements you already merged. They are all good.

I agree with you that we should make the behavior more forgiving. I also agree that the change to allow that is probably trivial.

So what I actually meant is that specifically for the header bug, we should try to find a way to treat the version needed to extract exactly like we used to before all these improvements were merged (and only that field, everything else is good and should stay merged).

@edwardneal
Contributor

Thanks carlossanlop.

I've spent a little time with the Android build, and can reproduce the problem with preview1 if I set _AndroidUseLibZipSharp appropriately in a project file. I've created a new PR (#113306) which is identical to the original bugfix (but also adds an extra verification step to the test.)

When I substitute my local Release build of System.IO.Compression, I can successfully build a previously-failing Android project. All headers match, and the output passes a test in 7-Zip. Besides testing the change when building an Android project, I've also created a new .apk file with aapt2 and modified it, and this also worked.

The second problem we're dealing with is related to zipalign, not aapt2. zipalign alters the size of the extra field in the ZipArchiveEntry's local file header, adding extra data. This extra data is literally just zeroes though, not a well-formed ZipExtraField structure.

Previously, ZipArchive would have silently ignored this malformed data; only well-formed ZipExtraField structures would have been written out to the backing stream upon disposal. For unmodified entries, we now try to seek past the local file header and file data. We work out how many bytes forward to seek by summing up the lengths of all well-formed ZipExtraField structures - and this is where the problem lies. The X bytes of malformed extra data aren't accounted for. As a result, we find that the field in the central directory header which stores the offset of the local file header falls slightly out of sync with the stream's position.

This shows up in com.xamarin.defaultitems.zip. The ZIP file content is correct, but the central directory header for assets/foo/bar.txt specified a local file header offset of 0x5579. The local file header for this file was actually at 0x557c, a three-byte difference. The local file header directly before this specified a three-byte "extra data" section.
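
The three-byte drift can be reproduced with a short Python sketch. It assumes only the standard extra field layout (a 2-byte header ID, a 2-byte data size, then the data); the function names are made up, and this is not the parsing code used by ZipArchive.

```python
import struct


def well_formed_extra_length(extra: bytes) -> int:
    """Sum the sizes of the well-formed (id, size, data) records at the
    start of an extra field, stopping at the first record that does not fit."""
    pos = 0
    while pos + 4 <= len(extra):
        _header_id, size = struct.unpack_from("<HH", extra, pos)
        if pos + 4 + size > len(extra):
            break  # record claims more data than is present
        pos += 4 + size
    return pos


def offset_drift(extra: bytes) -> int:
    """Bytes missed by a seek that only counts well-formed records."""
    return len(extra) - well_formed_extra_length(extra)
```

Three zero bytes are too short to hold even one record header, so `offset_drift(b"\x00\x00\x00")` is 3, which matches the gap between 0x5579 and 0x557c described above.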

#113306 will reintroduce the previously-reverted change and fix the first problem. I think a fix for the second problem is a few days away though.

@carlossanlop
Member

Thanks @edwardneal. I'll review your PR.
