Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add (Windows): Unicode and long-paths support #36

Closed
wants to merge 0 commits into from

Conversation

dnzbk
Copy link

@dnzbk dnzbk commented Oct 9, 2024

@animetosho hi,

I added support for Unicode and long-paths on Windows - looking forward to your feedback,

NZBGet currently uses the par2cmdline library, which was vendored and patched to support Unicode some time ago. However, I’d like to switch to your improved fork as it offers better performance and maintenance. I’m aiming to use it as an external dependency, pulling it directly from the repository. To achieve this, I’ll need to refactor the code significantly, as par2cmdline-turbo isn’t currently designed for easy library integration. This will involve creating namespaces, separating header files for external and internal usage, and making other necessary changes.
Additionally, I’d like to transition from using msbuild and autotools to CMake, primarily due to its cross-platform compatibility and better dependency management.
I plan to implement these changes in my fork. Would you be interested in reviewing my changes and potentially incorporating them into your main repository?
I’d appreciate any feedback or suggestions you might have on this plan. Are there any aspects I might have overlooked?

@animetosho
Copy link
Owner

animetosho commented Oct 9, 2024

Really appreciate all the work you've put into this!

Sorry to ask this of you, but would you be able to file this PR to the upstream project instead? I don't actually intend to maintain par2cmdline - this fork is mostly a simple modification, but I intend to pull down changes made upstream.

I haven't looked at either, but you also might be interested in this PR made there, which sounds like it tries to do something similar.

The other changes you intend to make sound really useful to me, but please also file PRs for them upstream.
Unfortunately the upstream project is relatively dead, so there's a good chance it won't get reviewed or merged, however they are seeking a maintainer; considering the amount of changes you intend to contribute, that might end up being easier for you.


Alternatively, if you prefer to not be an 'official' par2cmdline maintainer, you could maintain your own fork, and I'll consider re-pointing this project to that as upstream.

@dnzbk
Copy link
Author

dnzbk commented Oct 10, 2024

I see, thanks.
I’d definitely like to use your improved version of par2cmdline.
I’ll try to make improvements in my fork and we’ll see how things go.

@animetosho
Copy link
Owner

If it helps, I have CMake scripts for the gf16 and hasher components.
They're meant for the testing application, but it might help you if you're trying to do a full CMake migration.

@dnzbk
Copy link
Author

dnzbk commented Oct 14, 2024

That will be very helpful, thank you!

@Safihre
Copy link

Safihre commented Oct 15, 2024

@dnzbk We at SABnzbd have been waiting/hoping for par2cmdline-turbo with UTF8 and Long-path support, so if you could get this fixed in general also for the executable version, that would really help us ❤️
(cc: @mnightingale)

@dnzbk
Copy link
Author

dnzbk commented Oct 16, 2024

@Safihre
Sure, that's the plan.

@animetosho
Copy link
Owner

Re f65eb8a: not freeing is actually intentional. gfmat_init initialises some lookup constants, but only allocates once, so it's not a leak in the sense that it'll constantly grow.

gfmat_free is mostly there for test cases to stop ASAN complaining about a possible leak.
Doing extra frees should be fine though - gfmat_init will have to recompute all the constants when it's called next, though I don't think that ever happens in par2cmdline.

Good on you for identifying it nonetheless.

@dnzbk
Copy link
Author

dnzbk commented Oct 30, 2024

@animetosho thanks for the feedback.

Yes, it's not that important if Par2 is used as a standalone command line tool, but I think it’s crucial when it used as a library as we do in NZBGet.

@animetosho
Copy link
Owner

but I think it’s crucial when it used as a library as we do in NZBGet.

I'm interested in changing things to avoid issues, but I don't understand how this would be an issue when used as a library?

@dnzbk
Copy link
Author

dnzbk commented Oct 30, 2024

I'm interested in changing things to avoid issues, but I don't understand how this would be an issue when used as a library?

Each new instance of Par2Repairer calls gfmat_init() as it aggregates PAR2Proc which leads to memory allocations without freeing it when PAR2Proc and Par2Repairer are destroyed.

@animetosho
Copy link
Owner

That's not quite it, only the first call allocates memory. Subsequent instances do not allocate memory.

@dnzbk
Copy link
Author

dnzbk commented Oct 31, 2024

That's not quite it, only the first call allocates memory. Subsequent instances do not allocate memory.

Yes, you're right. Memory is indeed allocated only once. Valgrind was producing false positives as the memory is not actually explicitly freed up at program termination but this is not really something we should care about.
I'll revert my changes.
Thanks for pointing that out.

@animetosho
Copy link
Owner

Ah that makes sense - thanks for letting me know! Maybe I can look at adding a Valgrind leak suppression file so that it doesn't turn up in reports (though people running Valgrind will need to know to include it).

@Safihre
Copy link

Safihre commented Nov 11, 2024

@dnzbk I saw you merged the nzbget PR. Is there also a way to get par2cmdline (turbo) with proper long path and utf8 support as a stand-alone binary? :)

@dnzbk
Copy link
Author

dnzbk commented Nov 12, 2024

@Safihre
It’s still under testing, but sure, I can do that. Just give me a moment to make some adjustments in the repo.

@dnzbk dnzbk marked this pull request as draft November 12, 2024 13:01
@dnzbk dnzbk force-pushed the turbo branch 2 times, most recently from ffee646 to 2a27e47 Compare November 12, 2024 13:14
@dnzbk dnzbk closed this Nov 12, 2024
@dnzbk
Copy link
Author

dnzbk commented Nov 12, 2024

@Safihre
I built the binaries for you. Unfortunately, there’s no macOS AMD64 build yet due to a pipeline failure, but that's not a big deal right now. Please test it and let me know your feedback.

@animetosho
I investigated the macOS AMD64 build pipeline failures:
So, the get-release step fails with this error:

/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:7768
    throw error;
    ^

TypeError: undefined is not iterable (cannot read property Symbol(Symbol.iterator))
    at macosRelease (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:622:26)
    at osName (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:120:18)
    at getUserAgent (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:[7](https://github.com/nzbgetcom/par2cmdline-turbo/actions/runs/11802451849/job/32878141973#step:5:8)762:53)
    at parseOptions (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:2561:57)
    at new Octokit (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:355[8](https://github.com/nzbgetcom/par2cmdline-turbo/actions/runs/11802451849/job/32878141973#step:5:9):31)
    at 46[9](https://github.com/nzbgetcom/par2cmdline-turbo/actions/runs/11802451849/job/32878141973#step:5:10) (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:5663:28)
    at __webpack_require__ (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:22:30)
    at 940 (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:25090:29)
    at __webpack_require__ (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:22:30)
    at 31 (/Users/runner/work/_actions/bruceadams/get-release/v1.2.3/dist/index.js:394:[13](https://github.com/nzbgetcom/par2cmdline-turbo/actions/runs/11802451849/job/32878141973#step:5:14))

After updating bruceadams/get-release to version v1.3.2 the pipeline started failing at the Run make step:

`ld: Undefined symbols:
  std::exception_ptr::__from_native_exception_pointer(void*), referenced from:
      std::__1::promise<bool>::~promise() in libpar2.a[25](libparpar_gf16_a-controller.o)
  ___cxa_init_primary_exception, referenced from:
      std::__1::promise<bool>::~promise() in libpar2.a[25](libparpar_gf16_a-controller.o)
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [par2] Error 1
make: *** [all] Error 2`

I hope my investigation will help you fix this. Or maybe I’ll take care of it when I have some free time.

animetosho added a commit that referenced this pull request Nov 13, 2024
animetosho added a commit that referenced this pull request Nov 13, 2024
@animetosho
Copy link
Owner

Ah yes, diagnosing failing Github Actions - lots of fun.
Thanks for pointing that out - it looks like this fixed it (in addition to setting the OS to macos-13 as newer versions may be ARM64).

@Safihre
Copy link

Safihre commented Nov 13, 2024

@animetosho still can't persuade you to merge this into turbo directly? 🤗

@animetosho
Copy link
Owner

I had a quick look at the changes made in nzbgetcom's fork. I mostly don't get the intention behind moving the headers around, and this could make future updates a little finicky. So I'm a bit reluctant to use the repository as an upstream at the moment.

For @Safihre's purposes, using the nzbgetcom fork might do everything you need it to.

I'm reiterating, but appreciate the work @dnzbk has put in.

@dnzbk
Copy link
Author

dnzbk commented Nov 14, 2024

The nzbget branch includes some changes (signals, namespaces, headers refactoring and etc.) specific to nzbget and is not needed by anyone else except nzbget itself.
The utf8 branch is a “vanilla” Par2-turbo + better UTF-8 support on Windows, which I used to build the binaries for @Safihre and maybe someone else.

@animetosho
Copy link
Owner

Ah I see, thanks for the explanation!

@Safihre
Copy link

Safihre commented Nov 16, 2024

@animetosho we can merge the utf8 branch?

@Safihre
Copy link

Safihre commented Nov 16, 2024

@dnzbk I tried it but it still has a problem with our unicode-test case NZB (which works with Multipar).
I also tried it in the latest nzbget build, and there it fails to. So it seems a problem with the utf8 code.
https://sabnzbd.org/tests/unicode_rar_broken.nzb

@dnzbk
Copy link
Author

dnzbk commented Nov 17, 2024

@Safihre Thanks, I'll take a look.

@dnzbk
Copy link
Author

dnzbk commented Nov 17, 2024

@Safihre Should be fixed now. At least it does work in nzbget now. Please test it out.

@Safihre
Copy link

Safihre commented Nov 17, 2024

Success!

@Safihre
Copy link

Safihre commented Nov 17, 2024

Now we can finally switch to par2cmdline (from Multipar) for the Windows release of SABnzbd. It was the only platform where we didn't use it yet 🥳
But for maintainability I think we will have to fork turbo also in sabnzbd Github organization, if this doesn't become part of base turbo.

@animetosho
Copy link
Owner

If you're happy with maintaining a fork, I can point this repository to that as upstream.
(I'm a bit reluctant to point to the nzbget branch as upstream due to changes to ParPar code)

animetosho added a commit to animetosho/ParPar that referenced this pull request Dec 10, 2024
Not sure how to make this file obvious to those that need it, but at least it's here
Ref animetosho/par2cmdline-turbo#36 (comment)
animetosho added a commit that referenced this pull request Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants