Optimize CI workflow. #1743

player-03 · 2024-01-12T22:06:05Z

The -eval option means most of the CI tasks no longer have to wait for all the others. If we run more of them in parallel, users will be able to get much quicker feedback on their pull requests.

This way, it doesn't have to wait for the slow macos-ndll step.

We want as few steps as possible to have to wait for macos-ndll. There's no way around package-haxelib, but `-eval` gives us a much better option for ios-ndll.

Not required, but it'll make the workflow summary neater.

I'm almost sure Homebrew is what makes macos-ndll so slow, but this will help confirm it.

This function attempts to skip steps if the corresponding function was not overridden. However, in `-eval` mode, it may incorrectly detect the `@ignore` metadata on subclass functions, causing it to always skip. Running an empty function is not a problem, but we wouldn't want to log that step if nothing was being done. Therefore I switched the check to only apply to the log call. For one reason or another, the "watch" command can't be skipped, so I left it alone.

If switching to Linux is what made android-ndll so much faster, we should do the same for android-samples.

This removes the need to wait on macos-ndll.

air-samples and neko-samples run on Haxe 3, which didn't have Eval mode, so they need to wait for the host platform's ndll. Everything else only needs the ndll that will be included in the app (if any). Now we have some samples that can start immediately!

Maybe this requirement could be removed in the future? Not sure why `lime setup` would even need it.

Some of these make sense, but I have no idea why air-samples is complaining about hxcpp. I also don't understand why macos-samples needs the submodules (or perhaps why linux-samples doesn't), but `-eval` should get around that.

For air-samples, this means downloading the submodules so that `lime rebuild tools` can work. For windows-samples, we can use `-eval` to skip `rebuild tools` entirely.

player-03 · 2024-01-14T04:49:30Z

At this point, it's possible to complete everything else while waiting for the brew bundle step of macos-ndll.

Apparently Homebrew dropped support for macOS 11 sometime last month, which may line up with when it started being extremely slow. Worth a shot!

player-03 · 2024-01-14T05:11:52Z

Credit to @Apprentice-Alchemist for figuring out why Homebrew was so slow.

player-03 · 2024-01-14T05:31:16Z

Workflow succeeded in 25 minutes, compared to ~1h45m before this PR.

joshtynjala · 2024-01-15T22:30:36Z

Great to see the CI finishing faster! I'm concerned about one thing that you changed, though. The removal of this dependency from the samples builds:

needs: package-haxelib

I made the samples builds rely on package-haxelib on purpose. The idea was to ensure that the Haxelib .zip bundle that we're generating is valid, and can build projects for all supported targets.

player-03 · 2024-01-16T05:06:47Z

I see where you're coming from. On the one hand, with the brew issue fixed (for now), going back to the package-haxelib bottleneck shouldn't slow things down too much. On the other hand, it has been packaging correctly for years, it would throw an error if any of the pieces were missing, and we're testing the pieces individually. I don't see how it could start going wrong now.

How about this as a compromise? We test each target as soon as the required ndlls become available, saving time. Later when package-haxelib finishes, we test its output by downloading the bundle on Windows/Mac/Linux, and building a sample in HashLink. (HashLink because all we're trying to do is prove that the bundled version of Lime works.)

joshtynjala · 2024-01-16T05:23:07Z

How about this as a compromise? We test each target as soon as the required ndlls become available, saving time. Later when package-haxelib finishes, we test its output by downloading the bundle on Windows/Mac/Linux, and building a sample in HashLink. (HashLink because all we're trying to do is prove that the bundled version of Lime works.)

Works for me!

For instance, we're merging "android-ndll" with "android-samples" and merging "windows-ndll" with "windows-samples." There's no need to set up the same environment twice.

We need to make sure that lime-haxelib.zip is enough, on its own, to run Lime. And it needs to work on all three of Windows, Mac, and Linux. Since this code will run at the very end of the workflow, not in parallel with anything, it's important to keep it quick. HashLink, Neko, and Flash are good candidates, but of those three, only HashLink is maintained.

It's quick too, might as well test package-haxelib a bit more thoroughly.

This allows it to run immediately, rather than having to wait for Windows. Plus, it makes the workflow summary clearly legible again. The downside is that `-eval` wasn't available in Haxe 3, so we can no longer test that. However, I just added more Neko+Haxe 3 tests, so I think it balances out.

player-03 · 2024-01-16T21:20:39Z

Ok, the variables gave me a fair bit of trouble, and I think the 25-minute run might have been a fluke, but it's working.

player-03 · 2024-01-16T21:21:56Z

Hmm, some of the Neko runs take a while to download the artifact. Maybe there are just too many in parallel. Could scale it back, I guess.

The post-package-haxelib jobs all need to download its output, and some of them (seemingly at random) spend an extra ~6m waiting for it.

player-03 · 2024-01-17T01:36:41Z

Ugh, reducing the number of jobs didn't even help. All of the time saved was because the iOS builds happened to go a bit faster.

The ios job is currently the biggest bottleneck, and this sample is the least informative of the three.

Because apparently Mac can be the bottleneck sometimes.

player-03 · 2024-01-17T16:25:13Z

As an experiment, let's try going back to hard-coded runs-on values. It's possible GitHub optimizes based on knowing in advance which machines to use, and that's how we got that 25-minute run.

I guess you can use job output variables there, but not environment ones. But the environment variable is nicer for everything else, so we'll just not use a variable here.

player-03 · 2024-01-17T18:17:23Z

It's definitely faster with hard-coded machines to run on. Still not 25 minutes, but I think that was only possible because we didn't test lime-haxelib.zip. @joshtynjala, does this look good to merge?

player-03 added 9 commits January 12, 2024 16:51

CI: run android-ndll on Linux.

d3771bb

This way, it doesn't have to wait for the slow macos-ndll step.

CI: remove macos-ndll dependency from ios-ndll.

dc02d9d

We want as few steps as possible to have to wait for macos-ndll. There's no way around package-haxelib, but `-eval` gives us a much better option for ios-ndll.

CI: remove linux-ndll dependency from android-ndll.

0b68e8e

Not required, but it'll make the workflow summary neater.

CI: split brew bundle into its own step.

3a39670

I'm almost sure Homebrew is what makes macos-ndll so slow, but this will help confirm it.

CI: run android-samples on Linux.

a83f405

If switching to Linux is what made android-ndll so much faster, we should do the same for android-samples.

CI: update step names.

74c939a

CI: give each sample its own step.

a4e3f9c

CI: run android-samples right after android-ndll.

0a47462

This removes the need to wait on macos-ndll.

player-03 marked this pull request as draft January 14, 2024 00:52

player-03 added 5 commits January 13, 2024 21:30

CI: install format before using lime -eval.

7cd2def

Maybe this requirement could be removed in the future? Not sure why `lime setup` would even need it.

CI: also install hxp.

3cad65a

CI: fix dependency errors.

b8d974a

Some of these make sense, but I have no idea why air-samples is complaining about hxcpp. I also don't understand why macos-samples needs the submodules (or perhaps why linux-samples doesn't), but `-eval` should get around that.

CI: fix missing submodules.

2b602c2

For air-samples, this means downloading the submodules so that `lime rebuild tools` can work. For windows-samples, we can use `-eval` to skip `rebuild tools` entirely.

CI: switch to macos-12.

5060bf2

Apparently Homebrew dropped support for macOS 11 sometime last month, which may line up with when it started being extremely slow. Worth a shot!

player-03 marked this pull request as ready for review January 14, 2024 05:31

player-03 added 7 commits January 16, 2024 11:51

CI: after creating an ndll, immediately test it.

9980d25

For instance, we're merging "android-ndll" with "android-samples" and merging "windows-ndll" with "windows-samples." There's no need to set up the same environment twice.

CI: use single quotes.

2c7ed59

CI: fix air-samples Haxe version.

fcbf65d

CI: convert version numbers to variables.

39328dc

CI: also run neko-samples after package-haxelib.

42700b1

It's quick too, might as well test package-haxelib a bit more thoroughly.

player-03 force-pushed the organize-workflow branch from 5532882 to dc9d325 Compare January 16, 2024 18:59

CI: use job outputs instead of variables.

8ea9c16

player-03 force-pushed the organize-workflow branch from dc9d325 to 8ea9c16 Compare January 16, 2024 19:12

player-03 added 2 commits January 16, 2024 14:13

CI: create dummy step for init job.

3d17ead

CI: try another way to get variables.

7423585

Reduce how many jobs compete for a download.

ed1eab8

The post-package-haxelib jobs all need to download its output, and some of them (seemingly at random) spend an extra ~6m waiting for it.

player-03 added 2 commits January 16, 2024 20:41

CI: skip HelloWorld sample on iOS.

5c00ccb

The ios job is currently the biggest bottleneck, and this sample is the least informative of the three.

CI: skip HelloWorld sample on Mac.

ac61c34

Because apparently Mac can be the bottleneck sometimes.

player-03 force-pushed the organize-workflow branch 2 times, most recently from aaebef7 to 053e01f Compare January 17, 2024 16:33

CI: try hard-coding the runs-on values again.

c209c8d

player-03 force-pushed the organize-workflow branch from 053e01f to c209c8d Compare January 17, 2024 16:34

CI: hard-code Haxe versions for neko-samples.

def9e9b

I guess you can use job output variables there, but not environment ones. But the environment variable is nicer for everything else, so we'll just not use a variable here.

joshtynjala self-requested a review January 18, 2024 06:49

joshtynjala approved these changes Jan 18, 2024

View reviewed changes

player-03 merged commit a414f64 into openfl:develop Jan 18, 2024
23 checks passed

player-03 deleted the organize-workflow branch January 18, 2024 18:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize CI workflow. #1743

Optimize CI workflow. #1743

player-03 commented Jan 12, 2024 •

edited

Loading

player-03 commented Jan 14, 2024

player-03 commented Jan 14, 2024

player-03 commented Jan 14, 2024

joshtynjala commented Jan 15, 2024

player-03 commented Jan 16, 2024

joshtynjala commented Jan 16, 2024

player-03 commented Jan 16, 2024

player-03 commented Jan 16, 2024

player-03 commented Jan 17, 2024

player-03 commented Jan 17, 2024

player-03 commented Jan 17, 2024

Optimize CI workflow. #1743

Optimize CI workflow. #1743

Conversation

player-03 commented Jan 12, 2024 • edited Loading

player-03 commented Jan 14, 2024

player-03 commented Jan 14, 2024

player-03 commented Jan 14, 2024

joshtynjala commented Jan 15, 2024

player-03 commented Jan 16, 2024

joshtynjala commented Jan 16, 2024

player-03 commented Jan 16, 2024

player-03 commented Jan 16, 2024

player-03 commented Jan 17, 2024

player-03 commented Jan 17, 2024

player-03 commented Jan 17, 2024

player-03 commented Jan 12, 2024 •

edited

Loading