Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libjpeg-turbo: Support non-Altivec PowerPC #11625

Merged
merged 1 commit into from
Nov 11, 2021

Conversation

evanmiller
Copy link
Contributor

The CMake file supplied by libjpeg-turbo 2.1.0 incorrectly detects SIMD support on G3 processors. As a result, dynamic libraries are produced which will not load on the host machine. This patch forces SIMD to be disabled on G3 processors.

Closes: https://trac.macports.org/ticket/63259

Description

Type(s)
  • bugfix
  • enhancement
  • security fix
Tested on

macOS 10.4.11 8S165 Power Macintosh
Component versions: DevToolsCore-798.0; DevToolsSupport-794.0

Verification

Have you

  • followed our Commit Message Guidelines?
  • squashed and minimized your commits?
  • checked that there aren't other open pull requests for the same change?
  • referenced existing tickets on Trac with full URL?
  • checked your Portfile with port lint?
  • tried existing tests with sudo port test?
  • tried a full install with sudo port -vst install?
  • tested basic functionality of all binary files?

@macportsbot
Copy link

Notifying maintainers:
@larryv for port libjpeg-turbo.

@kencu
Copy link
Contributor

kencu commented Jul 19, 2021

We wouldn’t want to turn off Altivec for the 99% of users who have it, to support one person on a G3.

Instead, I mentioned there are Portfiles in the repo that test for Altivec using sysctl now. We want to do what they do, and only turn it off if the user doesn’t have it.

@kencu
Copy link
Contributor

kencu commented Jul 19, 2021

something like this:

platform powerpc {
    # absence of altivec is not automatically detected
    if {[catch {sysctl hw.vectorunit} result] || $result == 0} {
        configure.args-append   -DWITH_SIMD=OFF
    }
}

I copied this right out of the ffmpeg Portfile and added your suggested configure arg. I would test this first of course to make sure that it is the right way around -- I always have to verify that these nested constructs actually give the right answer I am expecting -- but it's been in the ffmpeg for years, so presumably is right.

I don't know about the other part of your PR here yet.

@evanmiller
Copy link
Contributor Author

@kencu Thanks for the feedback. Do G4s have a vector unit? My understanding was that Altivec is only available on ppc64 (i.e. G5 processors).

@kencu
Copy link
Contributor

kencu commented Jul 19, 2021

the G4/74xx series was the first to have Altivec, I believe. For these older systems we like to squeak all the work out of the cpu cycles we can, ergo the test-before-disable is desirable.

Thanks!

@evanmiller
Copy link
Contributor Author

@kencu Please see the latest commit. I can squash and force-push if needed.

@evanmiller
Copy link
Contributor Author

@kencu It configures correctly on my G3. I do not have a G4/G5 to test with.

@ryandesign
Copy link
Contributor

platform powerpc {
    # absence of altivec is not automatically detected
    if {[catch {sysctl hw.vectorunit} result] || $result == 0} {
        configure.args-append   -DWITH_SIMD=OFF
    }
}

This means the port builds differently depending on the user's computer, which we don't usually want to do in MacPorts. (The ffmpeg port that this method was copied from is wrong in this regard too.) We don't currently have binaries for Tiger, but if we did, and if our Tiger PowerPC build machine had a G4 or better, then the binary we shipped would be built for G4 and would not work on a G3.

Usually we like to express build differences as variants. For example we might define a variant called "g3" that turns off the G3-incompatible SIMD optimizations (with optimizations on being the default if the variant is not selected). Or we could invert it and define a variant called "g4" that turns them on (and make the default that optimizations are off unless the variant is selected). Whichever we decide, code like the above could be used to decide whether to enable or disable the variant by default. If we use a variant name "g3" or "g4" which is PowerPC-specific we would have to decide how to handle non-PowerPC systems. Defining the variants only on PowerPC systems is one possibility, though the buildbot mirroring script doesn't like ports that define variants conditionally. A more platform-neutral variant name might be "simd"; it could be turned on by default on all systems except G3s.

It would be terribly nifty if MacPorts could just build it twice—once for arch ppc without SIMD optimizations for G3s and once for arch ppc7400 with SIMD for G4s and G5s—and ship it as a universal binary. MacPorts does have support for universal building but it was never designed for building for two different 32-bit PowerPC architectures and there is probably not a great deal of interest in adding such support at this late date.

@ryandesign
Copy link
Contributor

There's also the matter of universal builds. platform powerpc only applies when building on a PowerPC. What happens when you build universal for i386 / ppc on a PowerPC Mac vs. what happens when you build universal for i386 / ppc on an Intel Mac?

What you probably meant is for -DWITH_SIMD=OFF to be used when building for PowerPC (regardless what system you're building on). So you need to add that flag only for the ppc part of the build. Doing that requires using the muniversal portgroup; fortunately, this port already does.

@kencu
Copy link
Contributor

kencu commented Jul 20, 2021

that's what Evan had at first and I chased him away from it.

I would not like to cripple most systems for the extremely rare G3 users.

As you say there is no ppc Tiger buildbot. Leopard requires a G4. So this is not an issue really, but to future-proof it if you someday put up a ppc Tiger buildbot we can just disable archive sites so the user has to build it.

ppc needs every Altivec enhancement it can get!

@evanmiller
Copy link
Contributor Author

"The little G3 that could" has raised a more complicated issue than I expected! I'm treating it as a recreational coding machine, so I'd be open to working on a universal ppc + ppc7400 build in the future.

If you want to take the variant approach for now, I'd suggest a name along the lines of simd, nosimd, or scalar. The WITH_SIMD=OFF build option in libjpeg-turbo can be applied to any architecture, not just PowerPC.

@evanmiller
Copy link
Contributor Author

Here's a quick implementation of a no_simd variant in a few ports, removing the problematic run-time CPU checks: evanmiller@633979a

My thinking is that we happy few G3 users can just add +no_simd to our variants.conf and avoid fighting with the G4/G5 gang in the future.

@mojca
Copy link
Member

mojca commented Jul 21, 2021

Using negative variants has been considered an antipattern for years now. The idea is to have simd and have it turned on by default where supported.

@@ -46,8 +47,11 @@ if {${universal_possible} && [variant_isset universal]} {
} elseif {${configure.build_arch} in {i386 x86_64}} {
depends_build-append port:nasm
configure.env ASM_NASM=${prefix}/bin/nasm
} elseif {${configure.build_arch} eq "ppc"} {
if {[catch {sysctl hw.vectorunit} result] || $result == 0} {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct if binary distribution can occur (the sysctl result on the build machine could easily be different to that on the host machine), but of course we don't build binaries for ppc. Someone else could though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added an archive_sites directive so that G3s will not pull binary distributions.

@@ -46,8 +47,11 @@ if {${universal_possible} && [variant_isset universal]} {
} elseif {${configure.build_arch} in {i386 x86_64}} {
depends_build-append port:nasm
configure.env ASM_NASM=${prefix}/bin/nasm
} elseif {${configure.build_arch} eq "ppc"} {
if {[catch {sysctl hw.vectorunit} result] || $result == 0} {
configure.args-append -DWITH_SIMD=OFF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also needs to go in merger_configure_args(ppc) in the universal case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify this comment or point me to an example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there is a reasonable strategy for adding this option in the universal case. If the host is i386 then the test will not produce a useful result.

@evanmiller
Copy link
Contributor Author

Here is a solution following @mojca’s comment, adding a simd variant and disabling by default on G3:

evanmiller@95745c3

If it looks OK to the parties concerned, I will open a separate pull request.

@kencu
Copy link
Contributor

kencu commented Jul 26, 2021

the test you have put in to override the simd detection built in to libjpeg-turbo is set to affect every system of every kind now, not just the powerpc systems.

I'm not sure every processor with simd instructions (which I think is all of them except a G3 ppc) passes that sysctlt test, and for the future, who knows?

So whatever is done to support G3 powerpc has to be set up to leave everything else alone.

By the way, any idea why the existing Altivec test in the simd CMakeLists.txt is apparently passing for you, even though you don't have Altivec? They are going to some lengths to do this right already, and would be likely interested in fixing this if it is broken somehow.

I continue to fell that the better approach here, is to use runtime detection (best by the build itself, but in the Portfile if the build detection is broken) but disable the ( nonexistant) buildbot by clearing archive_sites instead of these variants, which add confusion.

@evanmiller
Copy link
Contributor Author

@kencu You can see how libjpeg-turbo tests for Altivec support here:

https://github.com/libjpeg-turbo/libjpeg-turbo/blob/e6e952d5d5db6c0f9c505854995988ad13b30ec8/simd/CMakeLists.txt#L410-L419

If it can compile Altivec code using the -maltivec flag, the test passes. However, this compilation succeeds on G3 machines, even when setting other flags such as -mcpu=G3. So technically it appears to be a bug in GCC.

Since getting a bug fixed in GCC will take a while to make its way to MacPorts, I can modify the sysctl test to run only on PowerPC. I will note that hw.vectorunit = 0 on the Apple M1 machine on which I'm typing – so you're right, the test shouldn't be relied upon on other platforms.

@evanmiller
Copy link
Contributor Author

Yes, the Altivec test passes on a G3. GCC emits ppc7400 code when -maltivec is set, regardless of other flags.

If the libjpeg-turbo test attempts to run the code, rather than just compile, it will break cross-compilation.

@kencu
Copy link
Contributor

kencu commented Jul 26, 2021

that's what I thought.. gcc has no bug. The G3 can build Altivec code, just can't run it.

We don't care about cross-compilation here...we want to build what the build machine can run.

@kencu
Copy link
Contributor

kencu commented Jul 27, 2021

Would you have a moment to change:

https://github.com/libjpeg-turbo/libjpeg-turbo/blob/e6e952d5d5db6c0f9c505854995988ad13b30ec8/simd/CMakeLists.txt#L412

from check_c_source_compiles to check_c_source_runs?

That should fix it to properly do the test, that can be sent upstream, and we can keep that patch in the port until they do.

@evanmiller
Copy link
Contributor Author

@kencu I'm afraid the check_c_source_runs change cannot be sent upstream. It will break the build for anyone trying to cross-compile libjpeg-turbo for PowerPC (e.g. Intel -> PowerPC builds). We're not the only cross-compilers!

I guess it's debatable, but the fact that gcc -mcpu=G3 -maltivec produces code that does not actually run on a G3 seems like a problem to me. I would expect an error about conflicting flags.

I will wait for a consensus on the correct overall strategy before making additional commits. For anyone interested, there is a similar issue with -maltivec incorrectly enabled on clang-3.4 that I have reported here: https://trac.macports.org/ticket/63304

@evanmiller
Copy link
Contributor Author

Sure it can. I will send it. We are interested here in the code running on the build machine. Not cross compiling.

I think we are misunderstanding each other. Other consumers of libjpeg-turbo may want to cross-compile. I'm not talking about MacPorts folks. If someone downloads libjpeg-turbo with the proposed change onto an Intel machine and tries to cross-compile for PPC/Altivec, check_c_source_compiles will succeed, but check_c_source_runs will fail. So they will not get the SIMD/Altivec build that they desire. That's why I'm saying check_c_source_runs cannot be sent upstream.

evanmiller referenced this pull request Jul 31, 2021
Closes: https://trac.macports.org/ticket/63231

Remove libtool dependency which seems no longer to be used, at least on
current macOS systems.

Switch i386 CPU optimizations from i586 to sse now that the upstream
bug with text relocations is fixed (since several years already).

Remove Portfile code duplication regarding CPU optimization flags.
@evanmiller
Copy link
Contributor Author

How difficult / insane would it be to define a separate ppc7400 (i.e. Altivec) architecture in MacPorts? My thinking is that instead of reinventing the wheel with run-time detection and/or variants just use the existing architecture mechanism to select the correct build settings and provide better error messages on unsupported hardware. G4s would prefer this architecture to ppc the same way G5s currently prefer ppc64. Would such an approach be at all feasible?

@evanmiller
Copy link
Contributor Author

I have outlined a "correct" alternative approach to this problem in macports/macports-base#245.

Absent that, I would like to see the current PR (which uses host feature detection just like ffmpeg and others) merged so that we can close this down and move on.

@ccawley2011
Copy link

I'm not an expert, but wouldn't it be possible to adapt the init_simd function to detect the presence of Altivec on PPC Macs at runtime, rather than at build time? In particular, SDL detects Altivec at runtime, so I'd assume it would be simple to do for libjpeg-turbo as well.

This was referenced Oct 14, 2021
@mascguy mascguy assigned mascguy and unassigned larryv Oct 16, 2021
@mascguy
Copy link
Member

mascguy commented Oct 16, 2021

Evan, I've created a new devel version of this port, so that we can proactively test these types of changes. And given that libjpeg-turbo is very much a foundational port, it would be good to test in the wild first.

So if you're interested in applying these changes to libjpeg-turbo-devel instead, feel free to update your commits accordingly. And I'll happily push those in a prompt manner, since we don't have to worry about breaking anything.

@evanmiller
Copy link
Contributor Author

@mascguy OK, opened #12597

@evanmiller
Copy link
Contributor Author

The only issue I see with the current approach is that branching on the build_arch potentially means that an x86 => PPC cross-compile will have SIMD disabled. Possibly better to test the host architecture? Anyway, someone's cross-compiles will be broken since we are feature-testing the host.

@mascguy
Copy link
Member

mascguy commented Oct 26, 2021

So now that we've been able to proactively release this into the wild, via libjpeg-turbo-devel, what do folks think of the changes? Are we good with the approach thus far, or are further refinements necessary?

@evanmiller
Copy link
Contributor Author

I'm not an expert, but wouldn't it be possible to adapt the init_simd function to detect the presence of Altivec on PPC Macs at runtime, rather than at build time? In particular, SDL detects Altivec at runtime, so I'd assume it would be simple to do for libjpeg-turbo as well.

@ccawley2011 The issue is one of instruction sets. G3 machines cannot load a dylib that self-reports as ppc7400. I believe you'd need to have a separate Altivec dylib that is conditionally loaded at run-time after the initial detection takes place, or else some PPC assembly trickery. This may require significant changes to libjpeg-turbo.

@mascguy
Copy link
Member

mascguy commented Nov 11, 2021

I'm not an expert, but wouldn't it be possible to adapt the init_simd function to detect the presence of Altivec on PPC Macs at runtime, rather than at build time? In particular, SDL detects Altivec at runtime, so I'd assume it would be simple to do for libjpeg-turbo as well.

@ccawley2011 The issue is one of instruction sets. G3 machines cannot load a dylib that self-reports as ppc7400. I believe you'd need to have a separate Altivec dylib that is conditionally loaded at run-time after the initial detection takes place, or else some PPC assembly trickery. This may require significant changes to libjpeg-turbo.

Okay, this PR has been sitting here long enough. And these changes have been in-play with libjpeg-turbo-devel for nearly a month, without issue.

@jmroot Apart from some minor disagreement related to precisely how we detect Altivec - and frankly, that shouldn't hold this up any longer - are you OK with merging?

@mascguy
Copy link
Member

mascguy commented Nov 11, 2021

(Evan, my force-pushes were simply to rebase against master. Otherwise, nothing was touched/changed.)

@mascguy
Copy link
Member

mascguy commented Nov 11, 2021

Evan, are there any other refinements you'd like to make, or are we ready to merge?

@evanmiller
Copy link
Contributor Author

@mascguy I'm fine with it as it stands. Since it will need a rev-bump to get the change out to my fellow G3 user(s)(?), it would be great to roll out the 2.1.1 release whenever it is deemed ready.

@mascguy
Copy link
Member

mascguy commented Nov 11, 2021

@mascguy I'm fine with it as it stands. Since it will need a rev-bump to get the change out to my fellow G3 user(s)(?), it would be great to roll out the 2.1.1 release whenever it is deemed ready.

I'll update the non-Devel port to 2.1.1, via a separate PR.

On a semi-related note, it looks like upstream recently committed a change related to AltiVec detection:

libjpeg-turbo/libjpeg-turbo#552

It was added just 12 days ago, so it's not released yet. But worth taking a look at, if you haven't seen it!

@mascguy
Copy link
Member

mascguy commented Nov 11, 2021

PR created for 2.1.1 update:

#12910

@mascguy mascguy merged commit 85c0218 into macports:master Nov 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

9 participants