Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync report names across ranks #25

Merged
merged 4 commits into from
Apr 14, 2022
Merged

Conversation

jorblancoa
Copy link
Collaborator

@jorblancoa jorblancoa commented Apr 12, 2022

  • Fixed deadlock bug when some ranks don't have all the reports

 - Create sub communicators for report names accordingly
 - Fixed deadlock bug when some ranks dont have all the reports
…report names accordingly - Fixed deadlock bug when some ranks dont have all the reports
@WeinaJi
Copy link
Collaborator

WeinaJi commented Apr 12, 2022

I have run my CI test with this PR and it solved the deadlock issue. Thanks @jorblancoa

Copy link
Collaborator

@WeinaJi WeinaJi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@jorblancoa
Copy link
Collaborator Author

jorblancoa commented Apr 14, 2022

Runing blueconfigs pipeline here
https://bbpgitlab.epfl.ch/hpc/sim/blueconfigs/-/pipelines/48793

@pramodk
Copy link
Contributor

pramodk commented Apr 14, 2022

@jorblancoa : let's not forget to update one of the existing BlueConfig to cover this regression :)

@@ -129,86 +198,33 @@ struct ParallelImplementation {
int num_reports = report_names.size();
MPI_Comm_split(MPI_COMM_WORLD, num_reports == 0, 0, &SonataReport::has_nodes_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it's worth having ticket about not referring to MPI_COMM_WORLD anywhere in the parallel MPI library i.e. when libsonatareport is initialised, the communicator should be passed as an argument (almost always).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created the issue as good for beginners :)
#27

@WeinaJi
Copy link
Collaborator

WeinaJi commented Apr 14, 2022

@jorblancoa : let's not forget to update one of the existing BlueConfig to cover this regression :)

The deadlock issue showed up in the blueconfig ngv test with Fernando's mulitpop improvements, not the main branch.
Once this PR is merged, Fernando's MRs will pass and be merged.
https://bbpgitlab.epfl.ch/hpc/sim/blueconfigs/-/merge_requests/35
https://bbpgitlab.epfl.ch/hpc/sim/neurodamus-py/-/merge_requests/39

@jorblancoa jorblancoa merged commit 5f3fdda into master Apr 14, 2022
@jorblancoa jorblancoa deleted the jblanco/sync_report_names branch April 14, 2022 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants