Replace WixSharp's MSI parser with a pure C# parser based on OpenMcdf #154

Merged
merged 12 commits into from
Dec 23, 2024


miyoyo
Contributor

@miyoyo miyoyo commented Dec 18, 2024

The current version of WinTuner uses WixSharp.UI.MsiParser to read the database embedded in msi files and get the product code and version. That parser is not available on 'nixes.

While the product code is easily available via winget, it appears that the product version field is not available, or not available anymore. Since the product version is required for msi deployment, and it is not available otherwise, this means msi packages cannot be created from 'nixes, like Linux or macOS.

This pull request implements a fallback to msitools, a Red Hat/GNOME-maintained project that implements support for reading msi files.

This package is available in all Linux distros I have checked (including smaller ones like Alpine and Void) as well as in Homebrew on macOS.

I have run the Pester test suite on Linux, as well as manually tested New-WtWingetPackage, and it does manage to package msi files with no issues.

PS /mnt/c/Users/Hidden/Documents/WingetIntune> ./test.ps1

Starting discovery in 10 files.
Discovery found 28 tests in 467ms.
Running tests.
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/Deploy-WtMsStoreApp.Tests.ps1 419ms (65ms|202ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/Deploy-WtWin32App.Tests.ps1 57ms (2ms|25ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/Get-WtWin32Apps.Tests.ps1 43ms (2ms|12ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/New-IntuneWinPackage.Tests.ps1 38ms (2ms|8ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/NewWtWingetPackage.Tests.ps1 60ms (13ms|17ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/RemoveWtWin32App.Tests.ps1 54ms (4ms|22ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/Search-WtWinGetPackage.Tests.ps1 457ms (413ms|16ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/Unprotect-IntuneWinPackage.Tests.ps1 39ms (2ms|9ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/Update-WtIntuneApp.Tests.ps1 47ms (2ms|11ms)
[+] /mnt/c/Users/Hidden/Documents/WingetIntune/tests/WinTuner.Cmdlets.Tests/WinTuner.Tests.ps1 1.14s (1.03s|63ms)
Tests completed in 2.38s
Tests Passed: 28, Failed: 0, Skipped: 0, Inconclusive: 0, NotRun: 0
hidden@Prosopagnosia:/mnt/c/Users/Hidden/Documents/WingetIntune$ pwsh -Command "Import-Module ./dist/WinTuner/WinTuner.psd1; New-WtWingetPackage 7zip.7zip repo -TempFolder broken -Debug -Verbose; Remove-Module WinTuner"
INFO: Packaging package 7zip.7zip 24.09
INFO: [WingetManager] Getting package info for 7zip.7zip 24.09 from github
INFO: [ComputeBestInstallerForPackageCommand] Computing best installer for 7zip.7zip 24.09 Unknown
INFO: [IntuneManager] Generating IntuneWin package for 7zip.7zip 24.09 X64 System in repo
INFO: [IntuneManager] Downloading installer from https://7-zip.org/a/7z2409-x64.msi to broken/7zip.7zip/24.09/7z2409-x64.msi
INFO: [IntuneManager] Found 1 captures
INFO: [Packager] Creating Intune package from broken/7zip.7zip/24.09
INFO: [Packager] Creating package for broken/7zip.7zip/24.09/7z2409-x64.msi in broken/7zip.7zip/24.09 to repo/7zip.7zip/24.09
INFO: [Packager] Compressing the source folder broken/7zip.7zip/24.09 to /tmp/f99daf83-b0d8-4d63-8e8f-4c849a38c4ca/IntuneWinPackage/Contents/IntunePackage.intunewin
INFO: [Packager] Generating application info
INFO: [Packager] Encrypting file /tmp/f99daf83-b0d8-4d63-8e8f-4c849a38c4ca/IntuneWinPackage/Contents/IntunePackage.intunewin
INFO: [Packager] Generated detection XML file /tmp/f99daf83-b0d8-4d63-8e8f-4c849a38c4ca/IntuneWinPackage/Metadata/Detection.xml
INFO: [Packager] Done creating package for broken/7zip.7zip/24.09/7z2409-x64.msi in broken/7zip.7zip/24.09 to repo/7zip.7zip/24.09
INFO: [IntuneManager] Downloading logo from https://api.winstall.app/icons/7zip.7zip.png
INFO: [DefaultFileManager] Skipping download of https://api.winstall.app/icons/7zip.7zip.png to /mnt/c/Users/Hidden/Documents/WingetIntune/repo/7zip.7zip/logo.png because the file already exists
INFO: [IntuneManager] Writing detection info with msi details 7zip.7zip {23170F69-40C1-2702-2409-000001000000}
INFO: [IntuneManager] Writing package readme for package 7zip.7zip

PackageId          : 7zip.7zip
Version            : 24.09
PackageFolder      : repo/7zip.7zip/24.09
PackageFile        : 7z2409-x64.intunewin
InstallerFile      : 7z2409-x64.msi
InstallerArguments : -x64.msi /qn /norestart

Do note: this is the first C# code I've written in a very long time, so it may be better to cherry-pick parts of this implementation if the code quality is insufficient, especially around error handling.

Closes #79

@svrooij
Owner

svrooij commented Dec 18, 2024

I like that you found a way to gather msi product codes on other platforms, but executing a process on the machine does not really provide a solution that also works in the cloud. If this is possible in another language (so it seems), would it also be possible to do in C#?

@miyoyo
Contributor Author

miyoyo commented Dec 18, 2024

> I like that you found a way to gather msi product codes on other platforms, but executing a process on the machine does not really provide a solution that also works in the cloud. If this is possible in another language (so it seems), would it also be possible to do in C#?

I can think of using libmsi as an FFI'd library, but reimplementing the entire thing in C# would be an entire project by itself: there are multiple thousand lines of code in libmsi.

Which kind of cloud workflow are you thinking of where shell commands (well, in this case there is no shell, it's direct program execution with an argument) would not be allowed? As far as I know, most cloud platforms would let you ship msitools as part of the runtime container.

@svrooij
Owner

svrooij commented Dec 18, 2024

I spent a great deal of time making this fully cross-platform, and I want to remove everything that depends on a binary that can only run on Windows. That is why I'm quite hesitant about introducing another platform dependency.

I'm thinking of Azure Functions. I know you can run those from a container and you could modify the build process to include another binary, but that will still cause a lot of headaches. This is just a one-man project (for now), and I'm not sure I have the time to support this.

May I propose something else? What if you could specify a command format that is executed to extract the details from the msi, for example:
-MsiCommand "msitool.ps1 {msiLocation}"
This command should then return productCode;version.

That way this library does not call any binaries itself, and you are still able to extract msi details on Linux. A separate discussion is that this information should be correct in the Winget manifest...

@miyoyo
Contributor Author

miyoyo commented Dec 18, 2024

That command argument would probably be fine, although it would change nothing (you would still require calling external binaries on Linux or it would break, and it would not need to do so on Windows thanks to WixSharp)

I'm trying to see if https://github.com/ironfede/openmcdf can be used to extract the version information, hopefully that can also remove WixSharp, as it's pure C#

@svrooij
Owner

svrooij commented Dec 18, 2024

> That command argument would probably be fine, although it would change nothing (you would still require calling external binaries on Linux

What this would change is that I can mark the argument as "Use this unsupported argument at your own risk", and then I would not have to deal with people asking questions about it.

@miyoyo
Contributor Author

miyoyo commented Dec 19, 2024

I definitely believe OpenMcdf is an option. I've already implemented decoding the string names and tables from an MSI; all I have to do now is figure out which strings are the version and the product code, but it's almost 2 AM, so that'll be tomorrow.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Graph.Beta.Models;
using OpenMcdf;

namespace WingetIntune.Internal.Msi;
internal class MsiDecoder
{
    public string GetCode()
    {
        throw new NotImplementedException();
    }
    public string GetVersion()
    {
        throw new NotImplementedException();
    }
    public MsiDecoder(string filePath)
    {
        using (var cf = new CompoundFile(filePath))
        {
            var pool = LoadStringPool(cf);
        }
    }



    // references for the next lines:
    // https://stackoverflow.com/questions/9734978/view-msi-strings-in-binary

    private char BaseMSIDecode(char c)
    {
        // 0-0x3F converted to '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz._'
        // all other values higher as 0x3F converted also to '_'

        int result;

        if (c < 10)
            result = c + '0';             // 0-9 (0x0-0x9) -> '0123456789'
        else if (c < (10 + 26))
            result = c - 10 + 'A';        // 10-35 (0xA-0x23) -> 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
        else if (c < (10 + 26 + 26))
            result = c - 10 - 26 + 'a';   // 36-61 (0x24-0x3D) -> 'abcdefghijklmnopqrstuvwxyz'
        else if (c == (10 + 26 + 26))       // 62 (0x3E) -> '.'
            result = '.';
        else
            result = '_';                 // 63-0xffffffff (0x3F-0xFFFFFFFF) -> '_'

        return (char)result;
    }

    string DecodeStreamName(string name)
    {
        var result = new List<char>();
        var source = name.ToCharArray();

        foreach (char c in source)
        {
            var reduced = 0;
            if (c == 0x4840)
            {
                result.Add('$');
            }
            else if ((c >= 0x3800) && (c < 0x4840))
            {
                if (c >= 0x4800)
                {
                    reduced = c - 0x4800;
                    result.Add(BaseMSIDecode((char)(reduced)));
                }
                else
                {
                    reduced = c - 0x3800;
                    result.Add(BaseMSIDecode((char)(reduced & 0x3F)));
                    result.Add(BaseMSIDecode((char)((reduced >> 6) & 0x3F)));
                }
            }
            else
            {
                result.Add(c);
            }
        }

        return new string(result.ToArray());
    }

    private char BaseMSIEncode(char c)
    {
        // only '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz._' are allowed and converted to 0-0x3F

        int result;


        if ((c >= '0') && (c <= '9'))   // '0123456789' -> 0-9  (0x0-0x9)
            result = c - '0';
        else if ((c >= 'A') && (c <= 'Z'))   // 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' -> 10-35 (26 chars) - (0xA-0x23)
            result = c - 'A' + 10;
        else if ((c >= 'a') && (c <= 'z'))   // 'abcdefghijklmnopqrstuvwxyz' -> 36-61 (26 chars) - (0x24-0x3D)
            result = c - 'a' + 10 + 26;
        else if (c == '.')
            result = 10 + 26 + 26;        // '.' -> 62 (0x3E)
        else if (c == '_')
            result = 10 + 26 + 26 + 1;      // '_' -> 63 (0x3F) - 6 bits
        else
            result = -1; // other -> -1 (0xFF)
        return (char)result;
    }

    string EncodeStreamName(string name)
    {
        var result = new List<char>();

        for (int i = 0; i < name.Length; i++)
        {
            var c = name[i];
            if (c == '$')
            {
                result.Add((char)0x4840);
            }
            else if (c < 0x80 && BaseMSIEncode(c) <= 0x3F && i + 1 != name.Length)
            {
                i++;
                var first = BaseMSIEncode(c);
                var second = BaseMSIEncode(name[i]);
                result.Add((char)(first + (second << 6) + 0x3800));
            }
            else
            {
                result.Add((char)(BaseMSIEncode(c) + 0x4800));
            }

        }

        return new string(result.ToArray());
    }

    Dictionary<int, string> LoadStringPool(CompoundFile cf)
    {
        var decodedStringPool = EncodeStreamName("$_StringPool");
        var streamStringPool = cf.RootStorage.GetStream(decodedStringPool);
        var stringPoolBytes = streamStringPool.GetData();
        var poolLength = streamStringPool.Size;
        var poolWLength = BitConverter.ToInt16(stringPoolBytes, 0);
        var poolRefCount = BitConverter.ToInt16(stringPoolBytes, 2);

        var decodedStringData = EncodeStreamName("$_StringData");
        var streamStringData = cf.RootStorage.GetStream(decodedStringData);
        var stringDataBytes = streamStringData.GetData();

        var strings = new Dictionary<int, string>();

        for (int src = 4, stringId = 1, offset = 0; src < poolLength; src += 4)
        {
            Console.WriteLine("Starting decode");
            var entryLength = (int)BitConverter.ToInt16(stringPoolBytes, src);
            var entryRef = (int)BitConverter.ToInt16(stringPoolBytes, src + 2);

            Console.WriteLine($"Of entry {entryLength} {entryRef}");

            if (entryLength == 0 && entryRef == 0)
            {
                // Empty entry, skip.
                Console.WriteLine("Skipping");
                stringId++;
                continue;
            }
            else if (entryLength == 0 && entryRef != 0)
            {
                // wide entry over 64kb
                Console.WriteLine("Wide Entry");
                continue;
            }

            if (src != 4)
            {
                // A preceding "wide" entry (length 0, non-zero reference count) carries
                // the upper WORD of this entry's length in its reference count field.
                var previousEntryLength = BitConverter.ToUInt16(stringPoolBytes, src - 4);
                var previousEntryRef = BitConverter.ToUInt16(stringPoolBytes, src - 2);
                Console.WriteLine($"Previous entry {previousEntryLength} {previousEntryRef}");

                if (previousEntryLength == 0 && previousEntryRef != 0)
                {
                    entryLength += previousEntryRef << 16;
                    Console.WriteLine($"New Size {entryLength}");
                }
            }

            Console.WriteLine($"Adding {Encoding.UTF8.GetString(stringDataBytes.Skip(offset).Take(entryLength).ToArray())}");

            strings.Add(stringId, Encoding.UTF8.GetString(stringDataBytes.Skip(offset).Take(entryLength).ToArray()));
            offset += entryLength;
            stringId++;
        }

        return strings;
    }
}

I want to put a special lowlight to whoever thought GLib was a good idea. msitools is unreadable.

@miyoyo
Contributor Author

miyoyo commented Dec 19, 2024

Welp, looks like addiction won in the end, it works.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Graph.Beta.Models;
using OpenMcdf;

namespace WingetIntune.Internal.Msi;
internal class MsiDecoder
{
    private int stringSize = 2;
    private Dictionary<int, string> intToString;
    private Dictionary<string, int> stringToInt;
    public string GetCode()
    {
        throw new NotImplementedException();
    }
    public string GetVersion()
    {
        throw new NotImplementedException();
    }
    public MsiDecoder(string filePath)
    {
        using (var cf = new CompoundFile(filePath))
        {
            intToString = LoadStringPool(cf);
            stringToInt = intToString.ToDictionary(x => x.Value, x => x.Key);

            foreach(var entry in intToString)
            {
                Console.WriteLine($"{entry.Key}: {entry.Value}");
            }

            var decodedPropertyName = EncodeStreamName("$Property");
            var streamProperty = cf.RootStorage.GetStream(decodedPropertyName);
            var propertyBytes = streamProperty.GetData();
            var enubytes = Enumerable.Range(0, propertyBytes.Length / 2).Select(i => BitConverter.ToUInt16(propertyBytes, i * 2)).ToArray();
            var cells = new List<string>();

            foreach (var b in enubytes)
            {
                cells.Add(intToString[b]);
            }

            var tableSize = cells.Count() / 2;

            var code = cells[cells.IndexOf("ProductCode") + tableSize];
            var version = cells[cells.IndexOf("ProductVersion") + tableSize];

            Console.WriteLine($"Code: {code}, Version: {version}");
        }
    }



    // references for the next lines:
    // https://stackoverflow.com/questions/9734978/view-msi-strings-in-binary

    private char BaseMSIDecode(char c)
    {
        // 0-0x3F converted to '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz._'
        // all other values higher as 0x3F converted also to '_'

        int result;

        if (c < 10)
            result = c + '0';             // 0-9 (0x0-0x9) -> '0123456789'
        else if (c < (10 + 26))
            result = c - 10 + 'A';        // 10-35 (0xA-0x23) -> 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
        else if (c < (10 + 26 + 26))
            result = c - 10 - 26 + 'a';   // 36-61 (0x24-0x3D) -> 'abcdefghijklmnopqrstuvwxyz'
        else if (c == (10 + 26 + 26))       // 62 (0x3E) -> '.'
            result = '.';
        else
            result = '_';                 // 63-0xffffffff (0x3F-0xFFFFFFFF) -> '_'

        return (char)result;
    }

    string DecodeStreamName(string name)
    {
        var result = new List<char>();
        var source = name.ToCharArray();

        foreach (char c in source)
        {
            var reduced = 0;
            if (c == 0x4840)
            {
                result.Add('$');
            }
            else if ((c >= 0x3800) && (c < 0x4840))
            {
                if (c >= 0x4800)
                {
                    reduced = c - 0x4800;
                    result.Add(BaseMSIDecode((char)(reduced)));
                }
                else
                {
                    reduced = c - 0x3800;
                    result.Add(BaseMSIDecode((char)(reduced & 0x3F)));
                    result.Add(BaseMSIDecode((char)((reduced >> 6) & 0x3F)));
                }
            }
            else
            {
                result.Add(c);
            }
        }

        return new string(result.ToArray());
    }

    private char BaseMSIEncode(char c)
    {
        // only '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz._' are allowed and converted to 0-0x3F

        int result;


        if ((c >= '0') && (c <= '9'))   // '0123456789' -> 0-9  (0x0-0x9)
            result = c - '0';
        else if ((c >= 'A') && (c <= 'Z'))   // 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' -> 10-35 (26 chars) - (0xA-0x23)
            result = c - 'A' + 10;
        else if ((c >= 'a') && (c <= 'z'))   // 'abcdefghijklmnopqrstuvwxyz' -> 36-61 (26 chars) - (0x24-0x3D)
            result = c - 'a' + 10 + 26;
        else if (c == '.')
            result = 10 + 26 + 26;        // '.' -> 62 (0x3E)
        else if (c == '_')
            result = 10 + 26 + 26 + 1;      // '_' -> 63 (0x3F) - 6 bits
        else
            result = -1; // other -> -1 (0xFF)
        return (char)result;
    }

    string EncodeStreamName(string name)
    {
        var result = new List<char>();

        for (int i = 0; i < name.Length; i++)
        {
            var c = name[i];
            if (c == '$')
            {
                result.Add((char)0x4840);
            }
            else if (c < 0x80 && BaseMSIEncode(c) <= 0x3F && i + 1 != name.Length)
            {
                i++;
                var first = BaseMSIEncode(c);
                var second = BaseMSIEncode(name[i]);
                result.Add((char)(first + (second << 6) + 0x3800));
            }
            else
            {
                result.Add((char)(BaseMSIEncode(c) + 0x4800));
            }

        }

        return new string(result.ToArray());
    }

    Dictionary<int, string> LoadStringPool(CompoundFile cf)
    {
        var decodedStringPool = EncodeStreamName("$_StringPool");
        var streamStringPool = cf.RootStorage.GetStream(decodedStringPool);
        var stringPoolBytes = streamStringPool.GetData();
        var poolLength = streamStringPool.Size;
        var poolWLength = BitConverter.ToUInt16(stringPoolBytes, 0);
        var poolRefCount = BitConverter.ToUInt16(stringPoolBytes, 2);

        if (poolRefCount == 0)
            stringSize = 2;
        else if (poolRefCount == 0x8000)
            stringSize = 3;

        var decodedStringData = EncodeStreamName("$_StringData");
        var streamStringData = cf.RootStorage.GetStream(decodedStringData);
        var stringDataBytes = streamStringData.GetData();

        var strings = new Dictionary<int, string>();

        for (int src = 4, stringId = 1, offset = 0; src < poolLength; src += 4)
        {
            Console.WriteLine("Starting decode");
            var entryLength = (int)BitConverter.ToUInt16(stringPoolBytes, src);
            var entryRef = (int)BitConverter.ToUInt16(stringPoolBytes, src + 2);

            Console.WriteLine($"Of entry {entryLength} {entryRef}");

            if (entryLength == 0 && entryRef == 0)
            {
                // Empty entry, skip.
                Console.WriteLine("Skipping");
                stringId++;
                continue;
            }
            else if (entryLength == 0 && entryRef != 0)
            {
                // wide entry over 64kb
                Console.WriteLine("Wide Entry");
                continue;
            }

            if (src != 4)
            {
                // A preceding "wide" entry (length 0, non-zero reference count) carries
                // the upper WORD of this entry's length in its reference count field.
                var previousEntryLength = BitConverter.ToUInt16(stringPoolBytes, src - 4);
                var previousEntryRef = BitConverter.ToUInt16(stringPoolBytes, src - 2);
                Console.WriteLine($"Previous entry {previousEntryLength} {previousEntryRef}");

                if (previousEntryLength == 0 && previousEntryRef != 0)
                {
                    entryLength += previousEntryRef << 16;
                    Console.WriteLine($"New Size {entryLength}");
                }
            }

            Console.WriteLine($"Adding {Encoding.UTF8.GetString(stringDataBytes.Skip(offset).Take(entryLength).ToArray())}");

            strings.Add(stringId, Encoding.UTF8.GetString(stringDataBytes.Skip(offset).Take(entryLength).ToArray()));
            offset += entryLength;
            stringId++;
        }

        return strings;
    }
}

This is overly simplistic, as it can only get the product code and product version, but if you want other data from the MSI, I can get that too.

A few tables that could be interesting:
Property Table

Property        Value
s72     l0
Property        Property
UpgradeCode     {23170F69-40C1-2702-0000-000004000000}
LicenseAccepted 1
Manufacturer    Igor Pavlov
ProductCode     {23170F69-40C1-2702-2409-000001000000}
ProductLanguage 1033
ProductName     7-Zip 24.09 (x64 edition)
ProductVersion  24.09.00.0
MSIRMSHUTDOWN   2
ALLUSERS        2
ARPURLINFOABOUT http://www.7-zip.org/
ARPHELPLINK     http://www.7-zip.org/support.html
ARPURLUPDATEINFO        http://www.7-zip.org/download.html
DefaultUIFont   WixUI_Font_Normal
WixUI_Mode      FeatureTree
WixUI_WelcomeDlg_Next   LicenseAgreementDlg
WixUI_LicenseAgreementDlg_Back  WelcomeDlg
WixUI_LicenseAgreementDlg_Next  CustomizeDlg
WixUI_CustomizeDlg_BackChange   MaintenanceTypeDlg
WixUI_CustomizeDlg_BackCustom   SetupTypeDlg
WixUI_CustomizeDlg_BackFeatureTree      LicenseAgreementDlg
WixUI_CustomizeDlg_Next VerifyReadyDlg
WixUI_VerifyReadyDlg_BackCustom CustomizeDlg
WixUI_VerifyReadyDlg_BackChange CustomizeDlg
WixUI_VerifyReadyDlg_BackRepair MaintenanceTypeDlg
WixUI_VerifyReadyDlg_BackTypical        SetupTypeDlg
WixUI_VerifyReadyDlg_BackFeatureTree    CustomizeDlg
WixUI_VerifyReadyDlg_BackComplete       SetupTypeDlg
WixUI_MaintenanceWelcomeDlg_Next        MaintenanceTypeDlg
WixUI_MaintenanceTypeDlg_Change CustomizeDlg
WixUI_MaintenanceTypeDlg_Repair VerifyRepairDlg
WixUI_MaintenanceTypeDlg_Remove VerifyRemoveDlg
WixUI_MaintenanceTypeDlg_Back   MaintenanceWelcomeDlg
WixUI_VerifyRemoveDlg_Back      MaintenanceTypeDlg
WixUI_VerifyRepairDlg_Back      MaintenanceTypeDlg
ErrorDialog     ErrorDlg
SecureCustomProperties  OLDERVERSIONBEINGUPGRADED

File table
File	Component_	FileName	FileSize	Version	Language	Attributes	Sequence
s72	s72	l255	i4	S72	S20	I2	i4
File	File
_7zFM.exe	Fm	7zFM.exe	990720	24.9.0.0	1033	0	1
_7zip32.dll	ShellExt32	7-zip32.dll	67072	24.9.0.0	1033	0	2
_7zip.dll	ShellExt	7-zip.dll	101376	24.9.0.0	1033	0	3
_7zG.exe	Gui	7zG.exe	712704	24.9.0.0	1033	0	4
_7z.dll	Formats	7z.dll	1907712	24.9.0.0	1033	0	5
_7z.exe	CmdLine	7z.exe	564736	24.9.0.0	1033	0	6
_7z.sfx	GuiSfx	7z.sfx	213504	24.9.0.0	1033	0	7
_7zCon.sfx	ConSfx	7zCon.sfx	193024	24.9.0.0	1033	0	8
descript.ion	Docs	descript.ion	366			0	9
_7zip.chm	Help	7-zip.chm	124920			0	13
en.ttt	Lang	en.ttt	7881			0	14
History.txt	Docs	History.txt	8226			0	10
License.txt	Docs	License.txt	6031			0	11
readme.txt	Docs	readme.txt	1714			0	12
af.txt	Lang	af.txt	4621			0	15
an.txt	Lang	an.txt	7372			0	16
ar.txt	Lang	ar.txt	12299			0	17
ast.txt	Lang	ast.txt	4967			0	18
az.txt	Lang	az.txt	10289			0	19
ba.txt	Lang	ba.txt	10837			0	20
be.txt	Lang	be.txt	11457			0	21
bg.txt	Lang	bg.txt	17574			0	22
bn.txt	Lang	bn.txt	14633			0	23
br.txt	Lang	br.txt	4953			0	24
ca.txt	Lang	ca.txt	9747			0	25
co.txt	Lang	co.txt	11444			0	26
cs.txt	Lang	cs.txt	9720			0	27
cy.txt	Lang	cy.txt	4812			0	28
da.txt	Lang	da.txt	7870			0	29
de.txt	Lang	de.txt	10040			0	30
el.txt	Lang	el.txt	18214			0	31
eo.txt	Lang	eo.txt	4848			0	32
es.txt	Lang	es.txt	10597			0	33
et.txt	Lang	et.txt	6667			0	34
eu.txt	Lang	eu.txt	8399			0	35
ext.txt	Lang	ext.txt	7317			0	36
fa.txt	Lang	fa.txt	13282			0	37
fi.txt	Lang	fi.txt	8517			0	38
fr.txt	Lang	fr.txt	10868			0	39
fur.txt	Lang	fur.txt	7113			0	40
fy.txt	Lang	fy.txt	6029			0	41
ga.txt	Lang	ga.txt	7906			0	42
gl.txt	Lang	gl.txt	9099			0	43
gu.txt	Lang	gu.txt	17365			0	44
he.txt	Lang	he.txt	10909			0	45
hi.txt	Lang	hi.txt	17467			0	46
hr.txt	Lang	hr.txt	8122			0	47
hu.txt	Lang	hu.txt	10373			0	48
hy.txt	Lang	hy.txt	13636			0	49
id.txt	Lang	id.txt	8739			0	50
io.txt	Lang	io.txt	4604			0	51
is.txt	Lang	is.txt	8251			0	52
it.txt	Lang	it.txt	9906			0	53
ja.txt	Lang	ja.txt	12387			0	54
ka.txt	Lang	ka.txt	17799			0	55
kaa.txt	Lang	kaa.txt	7698			0	56
kab.txt	Lang	kab.txt	8094			0	57
kk.txt	Lang	kk.txt	10328			0	58
ko.txt	Lang	ko.txt	10267			0	59
ku.txt	Lang	ku.txt	5370			0	60
ku_ckb.txt	Lang	ku-ckb.txt	11933			0	61
ky.txt	Lang	ky.txt	12052			0	62
lij.txt	Lang	lij.txt	7471			0	63
lt.txt	Lang	lt.txt	9030			0	64
lv.txt	Lang	lv.txt	5016			0	65
mk.txt	Lang	mk.txt	8352			0	66
mn.txt	Lang	mn.txt	8069			0	67
mng.txt	Lang	mng.txt	19786			0	68
mng2.txt	Lang	mng2.txt	21169			0	69
mr.txt	Lang	mr.txt	10395			0	70
ms.txt	Lang	ms.txt	4785			0	71
ne.txt	Lang	ne.txt	13050			0	72
nl.txt	Lang	nl.txt	9608			0	73
nb.txt	Lang	nb.txt	5649			0	74
nn.txt	Lang	nn.txt	5525			0	75
pa_in.txt	Lang	pa-in.txt	14259			0	76
pl.txt	Lang	pl.txt	9911			0	77
ps.txt	Lang	ps.txt	8236			0	78
pt.txt	Lang	pt.txt	10167			0	79
pt_br.txt	Lang	pt-br.txt	10094			0	80
ro.txt	Lang	ro.txt	10040			0	81
ru.txt	Lang	ru.txt	15936			0	82
sa.txt	Lang	sa.txt	18834			0	83
si.txt	Lang	si.txt	18706			0	84
sk.txt	Lang	sk.txt	10142			0	85
sl.txt	Lang	sl.txt	8407			0	86
sq.txt	Lang	sq.txt	5579			0	87
sr_spl.txt	Lang	sr-spl.txt	6765			0	88
sr_spc.txt	Lang	sr-spc.txt	11589			0	89
sv.txt	Lang	sv.txt	8711			0	90
sw.txt	Lang	sw.txt	8039			0	91
ta.txt	Lang	ta.txt	12057			0	92
tg.txt	Lang	tg.txt	14632			0	93
th.txt	Lang	th.txt	15450			0	94
tk.txt	Lang	tk.txt	8736			0	95
tr.txt	Lang	tr.txt	9756			0	96
tt.txt	Lang	tt.txt	13706			0	97
ug.txt	Lang	ug.txt	10982			0	98
uk.txt	Lang	uk.txt	16419			0	99
uz.txt	Lang	uz.txt	8888			0	100
uz_cyrl.txt	Lang	uz-cyrl.txt	14672			0	101
va.txt	Lang	va.txt	9508			0	102
vi.txt	Lang	vi.txt	8111			0	103
yo.txt	Lang	yo.txt	10469			0	104
zh_cn.txt	Lang	zh-cn.txt	8236			0	105
zh_tw.txt	Lang	zh-tw.txt	8383			0	106
Upgrade table

UpgradeCode	VersionMin	VersionMax	Language	Attributes	Remove	ActionProperty
s38	S20	S20	S255	i4	S255	s72
Upgrade	UpgradeCode	VersionMin	VersionMax	Language	Attributes
{23170F69-40C1-2702-0000-000004000000}	4.38	24.09.00.0		256		OLDERVERSIONBEINGUPGRADED

@miyoyo
Contributor Author

miyoyo commented Dec 19, 2024

Alright, pushed my changes. This version is not perfect yet, as I'm parsing large numbers (i4) incorrectly; otherwise, number parsing is sufficient, and more queries can be added in the same way as GetCode and GetVersion.

Probably deserves a cleanup before it's accepted too.

EDIT: Also tried it on Windows with WixSharp disabled, and it seemingly works just fine!

EDIT2: If anybody stumbles upon this because you're trying to find info on MSI databases, here you go, lil' writeup

MSI files are OLE2 compound files, akin to Word documents and Thumbs.db thumbnail databases.

They are essentially composed of two things:
- One or more cabinet files to extract and install on disk
- An embedded database defining what the msi runtime will show (options, disclaimers, legal babble, paths...)

Within the context of WinTuner, we want to read the ProductCode and ProductVersion embedded in the msi database.

First of all, msi stream names (and therefore table names) are... really messed up for no reason whatsoever.
They use some form of base64 packed in a nonsensical format, and table streams have a special prefix character, which I represent using '$', but it's not actually a $ (the code uses 0x4840).
Thanks to https://stackoverflow.com/questions/9734978/view-msi-strings-in-binary for documenting it in a readable way.
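
To make the packing concrete, here is a minimal sketch of the encoder, condensed from the Stack Overflow answer above. The class and method names (StreamNameSketch, Encode) and the isTable flag are made up for this example; they are not part of WinTuner or OpenMcdf, and this has only been reasoned through, not battle-tested.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of WinTuner or OpenMcdf.
public static class StreamNameSketch
{
    // The 64 characters that can be packed; the index in this string is the 6-bit value.
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz._";

    public static string Encode(string name, bool isTable)
    {
        var sb = new StringBuilder();
        if (isTable)
            sb.Append((char)0x4840); // table prefix, written as '$' in the code above

        for (int i = 0; i < name.Length; i++)
        {
            int first = Alphabet.IndexOf(name[i]);
            if (first < 0)
            {
                sb.Append(name[i]); // not packable: copied through unchanged
                continue;
            }

            int second = i + 1 < name.Length ? Alphabet.IndexOf(name[i + 1]) : -1;
            if (second >= 0)
            {
                // Two characters packed into one UTF-16 code unit.
                sb.Append((char)(0x3800 + first + (second << 6)));
                i++;
            }
            else
            {
                // A single leftover character.
                sb.Append((char)(0x4800 + first));
            }
        }
        return sb.ToString();
    }
}
```

For example, Encode("_StringPool", true) yields 7 code units: the 0x4840 prefix, then "_S" packed as 0x3800 + 63 + (28 << 6) = 0x3F3F, and so on until the lone trailing 'l' becomes 0x482F.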

The msi database has 5 "utility"/"known" streams (byte arrays):
- _StringPool
- _StringData
- _Tables
- _Columns
- _Validation

The first step to read anything in the database is to decode the embedded strings. To do so, we read _StringPool,
which is composed of entries of two WORDs (16 bits each), making each entry 32 bits.

The very first entry is a header: if the top bit (0x8000) of its second WORD is set, string IDs are 3 bytes long, which is only required if you have more than (2^16)-1 IDs.
Every other entry contains the length of a string, as well as a reference count.

If the length and the reference count are both non-zero, take 'length' bytes from _StringData, which is your string, and update the base offset by adding 'length' to it.

If the length is zero and the reference count is not zero, then the entry is an extension of the next entry:
its reference count contains the upper WORD of the length of the next string. Otherwise, the decoding is the same.

If the length and the reference count are both zero, the entry should be counted as an empty entry (AKA, its ID should point to nothing).

Adding an empty string at ID 0 is fine for pure reading purposes. Practically, a string with ID 0 is null.
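
The rules above can be sketched as a small standalone decoder that works on the raw bytes of the two streams, so it does not depend on OpenMcdf. StringPoolSketch and its members are hypothetical names for this example, and decoding the text as UTF-8 is an assumption carried over from the PR's code.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of WinTuner or OpenMcdf.
public static class StringPoolSketch
{
    // pool and data are the raw bytes of the _StringPool and _StringData streams.
    public static Dictionary<int, string> Decode(byte[] pool, byte[] data, out int stringIdSize)
    {
        // Header entry: the top bit of its second WORD selects 3-byte string IDs.
        bool longIds = pool.Length >= 4 && (ReadWord(pool, 2) & 0x8000) != 0;
        stringIdSize = longIds ? 3 : 2;

        var strings = new Dictionary<int, string>();
        int id = 1, offset = 0, highWord = 0;
        bool extended = false;

        for (int src = 4; src + 4 <= pool.Length; src += 4)
        {
            int length = ReadWord(pool, src);
            int refCount = ReadWord(pool, src + 2);

            if (!extended && length == 0)
            {
                if (refCount == 0)
                {
                    id++; // empty entry: it uses up an ID that maps to nothing
                }
                else
                {
                    highWord = refCount; // extension: upper WORD of the next length
                    extended = true;
                }
                continue;
            }

            length += highWord << 16;
            highWord = 0;
            extended = false;

            // Assumption: UTF-8 text, as in the PR's code.
            strings[id++] = Encoding.UTF8.GetString(data, offset, length);
            offset += length;
        }
        return strings;
    }

    private static int ReadWord(byte[] bytes, int index) =>
        BinaryPrimitives.ReadUInt16LittleEndian(bytes.AsSpan(index, 2));
}
```

Reading the WORDs through BinaryPrimitives keeps the sketch little-endian regardless of the machine it runs on, which BitConverter does not guarantee.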


Once you have decoded the strings, you can start to decode the _Tables table.

The byte structure of tables is pretty simple: they are stored column by column, so, for example, this table:
Name|Num|Site
----+---+----
John|123|Home
Mark|456|Work

Would have its data structured like

John
Mark
123
456
Home
Work

(Do note that the strings are stored as IDs pointing to the string data we decoded in the previous step)

_Tables is therefore easy to decode, as it only has one column: collect all string IDs from the table, then look them up in the ID:String map.
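
As a rough sketch of that column-by-column layout, the helper below turns the raw bytes of a table back into rows. To keep it short it assumes every column is a 2-byte string ID (so the Num column of the example would be stored as a string too); ColumnMajorSketch and ReadStringTable are made-up names for this example.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of WinTuner or OpenMcdf.
public static class ColumnMajorSketch
{
    // Assumes every column holds 2-byte string IDs, stored column by column.
    public static string[][] ReadStringTable(
        byte[] table, int columnCount, IReadOnlyDictionary<int, string> strings)
    {
        const int idSize = 2;
        int rowCount = table.Length / (idSize * columnCount);

        var rows = new string[rowCount][];
        for (int r = 0; r < rowCount; r++)
            rows[r] = new string[columnCount];

        for (int c = 0; c < columnCount; c++)
        {
            for (int r = 0; r < rowCount; r++)
            {
                // All cells of column c come before any cell of column c + 1.
                int offset = (c * rowCount + r) * idSize;
                int id = BinaryPrimitives.ReadUInt16LittleEndian(table.AsSpan(offset, idSize));
                rows[r][c] = strings.TryGetValue(id, out var s) ? s : string.Empty;
            }
        }
        return rows;
    }
}
```

With columnCount set to 1, this is all that is needed for _Tables.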

However, for _Columns, we need to properly decode the table, so, first, we need to understand how the column types work.

All column types are stored in _Columns. However, _Columns itself has to be decoded with types that are hardcoded in the program.

A column type is a WORD: the upper byte is used as a bitfield, and the lower byte is an 8 bit number:

UTKN SLV? iiii iiii

U: Unknown Type
T: Temporary
K: Key
N: Nullable

S: String
L: Localizable
V: Valid (I don't know what is meant by this)
?: not sure what this is, maybe Number

i: 8bit number specifying the size of the type, maxing out at 255
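
As a sketch of how this bitfield could be unpacked, the Python below pulls out only the flags the rest of the decoding relies on (Temporary, Key, Nullable, String) plus the size byte, taking the leftmost letter in the diagram as the most significant bit. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ColumnType:
    temporary: bool
    key: bool
    nullable: bool
    is_string: bool
    size: int


def parse_column_type(word: int) -> ColumnType:
    """Split a 16-bit column type into its flags and its 8-bit size."""
    return ColumnType(
        temporary=bool(word & 0x4000),  # T
        key=bool(word & 0x2000),        # K
        nullable=bool(word & 0x1000),   # N
        is_string=bool(word & 0x0800),  # S
        size=word & 0x00FF,             # iiii iiii
    )
```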

As far as types go, there are really only 2 types we care about: String (stored as an ID that is 2 or 3 bytes long) and Numeric (i2 is 2 bytes long, i4 is 4 bytes long).

Since we now know our types and their length, we can determine how many rows a table has: it's the number of bytes divided by the combined byte length of all columns together.

Finally, we can loop column by column, and row by row, by taking however many bytes the column type specifies, adding the count to the offset, and saving the data we took.

Numbers are stored with an offset, so i2's need to subtract 0x8000, and i4's need to subtract 0x80000000.

Once _Columns is decoded, the same operation can be repeated on other tables by sourcing column names and type information from _Columns.
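
Putting the last few steps together, here is a hedged Python sketch of a generic table decoder. It assumes little-endian values and column definitions given as `(name, is_string, size)` tuples; both helper names are invented for the example, and string cells are returned as raw IDs to be looked up separately.

```python
def column_width(is_string: bool, size: int, string_id_size: int = 2) -> int:
    """Byte width of one cell: a string ID, or a 2/4 byte number."""
    if is_string:
        return string_id_size
    if size in (2, 4):
        return size
    raise ValueError(f"unsupported numeric size: {size}")


def decode_table(raw: bytes, columns: list[tuple[str, bool, int]], string_id_size: int = 2) -> list[dict]:
    """Decode a column-by-column table stream into a list of row dicts."""
    widths = [column_width(is_string, size, string_id_size) for _, is_string, size in columns]
    # Row count: total bytes divided by the combined width of all columns.
    row_count = len(raw) // sum(widths)
    rows = [{} for _ in range(row_count)]
    offset = 0
    for (name, is_string, _), width in zip(columns, widths):
        for row in rows:
            value = int.from_bytes(raw[offset:offset + width], "little")
            offset += width
            if not is_string:
                # Numbers are stored with an offset that has to be removed.
                value -= 0x8000 if width == 2 else 0x80000000
            row[name] = value
    return rows
```

The same function can be pointed at _Columns first (with its hardcoded column definitions) and then at any other table once its definitions are known.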

@miyoyo miyoyo changed the title Support parsing msi product code and version on Linux via msitools Replace WixSharp's MSI parser with a pure C# parser based on OpenMcdf Dec 19, 2024
@miyoyo miyoyo changed the title Replace WixSharp's MSI parser with a pure C# parser based on OpenMcdf [BETA] Replace WixSharp's MSI parser with a pure C# parser based on OpenMcdf Dec 19, 2024
@miyoyo miyoyo changed the title [BETA] Replace WixSharp's MSI parser with a pure C# parser based on OpenMcdf Replace WixSharp's MSI parser with a pure C# parser based on OpenMcdf Dec 19, 2024
Owner

svrooij commented Dec 19, 2024

If this all works as expected, I'm also for removing the wixsharp code that is included here: https://github.com/svrooij/WingetIntune/tree/main/src/WingetIntune/Internal/Msi

The idea is that it will use this implementation on all platforms right?

Contributor Author

miyoyo commented Dec 19, 2024

> If this all works as expected, I'm also for removing the wixsharp code that is included here: https://github.com/svrooij/WingetIntune/tree/main/src/WingetIntune/Internal/Msi
>
> The idea is that it will use this implementation on all platforms right?

Yeah, since this is pure C#, I don't see a reason to keep WixSharp around

I haven't disabled it yet, but it's probably prudent to keep it until the new parser is battle-tested? Up to you.

Contributor Author

miyoyo commented Dec 19, 2024

Wixsharp's gone, tests pass on my end.

Owner

svrooij commented Dec 20, 2024

I think it might be a good idea to add a test that uses this new code.
But for the rest I think it looks good

Contributor Author

miyoyo commented Dec 20, 2024

Alright, do you think it's better to embed an msi file, or redownload one for the test?

I'll write the test(s) tonight (gmt+1) otherwise.

Owner

svrooij commented Dec 20, 2024

If you can find a really small MSI somewhere, I would add that to the tests folder and use that. Otherwise just take a URL from somewhere.

Also had to implement a second constructor based on a stream, so the MSI file doesn't have to be written to disk.
Contributor Author

miyoyo commented Dec 20, 2024

There we go, I'm using a small msi from Microsoft's servers (LAPS 6.2.0.0, 1.1MB), so I'm reasonably confident it'll stay in place for a while.

I'm only testing the publicly exposed interface, since that's what's in use, and I don't see the point in testing it from the IntuneManager side, since it'd basically be the same code, just with more effort, as OpenMcdf reads the files itself otherwise.

Contributor Author

miyoyo commented Dec 22, 2024

No clue why the tests failed, there shouldn't be an impact on them, but I just imported MSAL directly.

Also ran formatter (dammit I keep forgetting that one)

@svrooij svrooij merged commit 1d6f429 into svrooij:main Dec 23, 2024
Owner

svrooij commented Dec 23, 2024

That did not go as planned, let me try again.

I always squash to clean up the history....

Development

Successfully merging this pull request may close these issues.

[Bug]: PowerShell Deploy fails with "Value cannot be null. (Parameter 's')" (mac only)