
Run tests in parallel #240

Open

gforcada opened this issue Oct 5, 2018 · 6 comments

@gforcada
Member

gforcada commented Oct 5, 2018

During the saltlab sprint (October 2018) @Rotonen worked on getting our Jenkins jobs to run the tests much faster by splitting the test setup into different layers and parallelizing their run (or something along those lines 😅)

During the sprint it was reverted as it still had some issues.

Here is a list of the changes to be done to implement it again on a given job:

  • set the Job Weight to 6
  • add the -j 6 parameter to the test runner (i.e. bin/test -j 6 OTHER PARAMETERS FOLLOW)
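
For illustration, a minimal sketch of what the resulting shell step could look like; the weight is set in the job configuration itself (presumably via the Heavy Job plugin) rather than on the command line, and the extra parameter shown is just an assumption standing in for whatever the job normally passes:

    # hypothetical Jenkins "Execute shell" step for a job whose weight is set to 6
    bin/test -j 6 --all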

@Rotonen anything else that I missed? 🤔 Weren't you using an mtest script of sorts?

TODO:

  • make output visible
  • Robot Framework tests clash when logging their output?
  • verbose output (-vvv) does not play nice with splitting
  • anything else?
@Rotonen
Contributor

Rotonen commented Oct 5, 2018

The mtest script is only relevant if we need to also split execution within a layer. We're in a fortunate position to not have to bother with it as there are no disproportionately slow individual layers standing out from the mass.

Ultimately I hope to tackle that in zope.testbrowser itself, but I'm not deep enough into Python multiprocessing shared memory data structures yet to pull that off.

@Rotonen
Contributor

Rotonen commented Oct 7, 2018

On closer inspection I've found the root cause of the Robot hangs on bin/test -j builds - they share the same xvfb. So we do need mtest, just to have a unique xvfb instance per layer. Oh well.
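
A rough sketch of that idea, assuming xvfb-run is available; LAYERS is a placeholder for however the layer names get discovered, not something mtest necessarily does:

    # hypothetical: one xvfb instance (and X display) per layer so parallel
    # Robot runs do not share an X server; xvfb-run -a picks a free display
    for layer in $LAYERS; do
        xvfb-run -a bin/test --layer "$layer" &
    done
    wait    # wait for all per-layer runs to finish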

@gforcada
Member Author

@davisagli this might interest you, given that we have the robot tests on another job already, we might want to try this again and see how it fares.

I will unfortunately be on vacation from tomorrow until October 24th; we can try this when I'm back if no one wants to try it before 😄

@Rotonen
Contributor

Rotonen commented Oct 13, 2022

At this point in time I'd recommend taking the rough test discovery mechanism and porting that over to a massive pile of GitHub Actions jobs.

Run bin/test --list or whatever it was, then parse that with sed and feed the results as inputs, or chunked inputs, into the workflow.
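
A sketch of that discovery-plus-chunking step, assuming the listing flag is zope.testrunner's --list-tests and that GNU split is available; the exact flag name and output format would need checking, as noted above:

    # hypothetical: list the tests, strip leading whitespace with sed, and split
    # the list into 10 chunks (chunk_aa, chunk_ab, ...) for a workflow matrix
    bin/test --list-tests \
        | sed -e 's/^[[:space:]]*//' \
        | split --number=l/10 - chunk_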

@davisagli
Member

@gforcada Thanks for pointing it out. I guess the issue with shared xvfb might not matter now that we use headlesschrome.

@Rotonen I'm hoping to take a look at whether we can move more of our builds to GitHub Actions during the sprint this weekend. The initial focus will be on how to replace the things that mr.roboto does, so I won't worry about parallelizing it yet, but yeah, something like that could work.

@Rotonen
Contributor

Rotonen commented Oct 14, 2022

In case someone looks into going parallel at some point later: the tests leak a lot of state. Many of the layer setups have over time ended up accidentally depending on stuff leaked from previous layers.

Local testing with a simple bin/test -j "$(($(getconf _NPROCESSORS_ONLN) * 3))" will start exposing those to you. The -j parameter runs layers independently of each other in subprocesses.

If you also start chunking the large layers into multiple sub-5-minute chunks for a speedup, then all the within-layer leaks need to be addressed as well.

The tests are highly run-order dependent due to basically every layer of the onion leaking state and other stuff later down the line accidentally depending on it. There's a neat way to catch these in-layer leaks with a parameter which randomises the test run order.
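
Assuming the parameter in question is zope.testrunner's --shuffle, the in-layer check looks roughly like this (--shuffle-seed reproduces a particular order when chasing a failure):

    # randomise the in-layer test order to surface within-layer state leaks
    bin/test --shuffle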

Here's my old bag of testing tricks from 2019 that one can lean on to figure out what it was I saw back then that made me give up on making progress here.

https://github.com/4teamwork/opengever.core/blob/1e523c9a54a91dc4616b105db3cca202a9e84b44/docs/intern/development/testing.rst
