Coarray exploitation #3
Hi Stefano,
I will update these GitHub repositories within the next few days by adding new
code versions using F2008 SYNC MEMORY and atomic subroutines. The current
code versions do work with the compilers but do not conform to the
(F2008/15) standard. I can already tell that the Load Balancing Example
does work using SYNC MEMORY and atomic subroutines: Since the
synchronizations are still coded 'manually', it is possible to synchronize
between objects on the same coarray image. Nevertheless, I was unable to
achieve something similar using F2015 Events: From my current practical
experiences you can use Events only for synchronizations between distinct
coarray images.
It is important to notice that the use of SYNC MEMORY does form execution
segments which are unordered within the example program.
Best regards
Michael
|
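As an illustration of the kind of manual synchronization described above, here is a minimal sketch (not taken from the repositories mentioned) of a spin-wait built from the F2008 atomic subroutines and SYNC MEMORY. The names flag, seen and payload are hypothetical, and the handshake only happens when the program runs with at least two images.

```fortran
program atomic_sync_sketch
  ! Sketch only: image 1 hands a value to image 2 and signals it through an atomic flag.
  use, intrinsic :: iso_fortran_env, only: atomic_int_kind
  implicit none
  integer(atomic_int_kind) :: flag[*]   ! hypothetical signalling variable
  integer(atomic_int_kind) :: seen      ! local copy of the flag
  integer :: payload[*]                 ! hypothetical data being handed over

  flag = 0
  payload = 0
  sync all                              ! both variables are initialized on every image

  if (num_images() >= 2) then
    if (this_image() == 1) then
      payload[2] = 42                   ! remote write to image 2
      sync memory                       ! end the segment that contains the write
      call atomic_define(flag[2], 1)    ! signal image 2
    else if (this_image() == 2) then
      do                                ! spin until the flag has been set
        call atomic_ref(seen, flag)
        if (seen == 1) exit
      end do
      sync memory                       ! start a new segment before reading payload
      print *, 'image 2 received payload ', payload
    end if
  end if
end program atomic_sync_sketch
```

Because the ordering is coded by hand, the same pattern can in principle be driven by any flag variable the program chooses, which is the flexibility referred to above.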
Dear @MichaelSiehl, you are very kind. I was not expecting to bother you so early; I still have to study your great work much more carefully. Anyhow, since you mentioned the topic, I would like to start our correspondence 😄 Your approach for obtaining MPMD is fundamental for me to exploit coarrays for this generic container. The F2015 events are not necessary; your sync method should fit my aims perfectly. My main concerns are twofold:
In particular, for this project (which seems a stupid toy, but is a very fundamental building block for others...), I need a generic container that entails MPMD features. Half a year has passed since I last looked, but I remember that there were issues in encapsulating your MPMD technique into an object, namely building an OOP class entailing your MPMD approach. Is this still an issue, or have you built derived types encapsulating MPMD features? At some point I would like to talk with all the other members of the group (e.g. @rouson and @zbeekman), but for now I have to study. Nevertheless, if you are so kind, I would like to bother you even during my coarray-training period; this could be of great help for me. Cheers. |
I know one potential limitation, which is almost fixed: allocatable components of derived type coarrays will not be supported by GCC/OpenCoarrays until some of the latest patches to GCC are merged and support for this is finalized in OpenCoarrays. In the near future, code such as the following will be possible using GCC-7 (AKA GCC trunk, until the 7.1 release in the spring) in combination with a future OpenCoarrays release:
type foo
  real, allocatable :: bar(:)
end type
type(foo) :: foobar[*]
It is possible that there are unexercised bugs in OpenCoarrays, but the more people use it and report back to us, the faster we can resolve them. Anyway, @rouson is the real expert here, so my apologies if I've misinterpreted anything. |
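To show how the declaration above might be exercised, here is a minimal sketch of a complete program. It keeps the names foo, bar and foobar from the snippet; everything else (the variable me, the sizes, the stored values) is a hypothetical choice, and whether it runs depends on the compiler and runtime support discussed above.

```fortran
program alloc_comp_sketch
  ! Sketch only: an allocatable component of a derived type coarray,
  ! allocated with a different size on every image.
  implicit none
  type foo
    real, allocatable :: bar(:)
  end type
  type(foo) :: foobar[*]
  integer :: me

  me = this_image()
  allocate(foobar%bar(me))   ! the component is not a coarray: no implicit synchronization here
  foobar%bar = real(me)
  sync all                   ! make every image's component visible before remote reads

  if (me == 1 .and. num_images() >= 2) then
    ! image 2 allocated two elements, so element 2 exists there
    print *, 'value read from image 2: ', foobar[2]%bar(2)
  end if
end program alloc_comp_sketch
```

The explicit sync all matters in this sketch because allocating the component, unlike allocating a coarray itself, does not synchronize the images.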
@zbeekman I was thinking of exactly this issue, which I tagged as weak in the previous post. As I said, I need a lot of time to study the MPMD technique of @MichaelSiehl, so by the time I am ready I am sure that the implementations (OpenCoarrays/GNU gfortran) supporting CAF will be bullet-proof 😄 |
Hi Izaak, Best Regards 2016-10-02 16:12 GMT+02:00 Izaak Beekman [email protected]:
|
@rouson: We may want to incorporate tests like @MichaelSiehl's into OpenCoarrays. @szaghi sorry for hijacking your thread 😄 🔫 |
Regarding allocatable components of derived type coarrays, I would also IF (Coarray_Object[intImageNumber] % logAllocationStatus) THEN As far as I remember, ifort did allow to use the LBOUND and UBOUND 2016-10-03 23:13 GMT+02:00 Stefano Zaghi [email protected]:
|
I just created a new GitHub repository containing the above example program 2016-10-05 1:22 GMT+02:00 michael siehl [email protected]:
|
@MichaelSiehl would you be willing to submit this as a test case to OpenCoarrays, to test our implementation once it goes live? CC: @rouson |
@zbeekman: Yes, of course. The code compiles and runs well with ifort. Does submitting this as a test case to OpenCoarrays require any further steps by myself or can you do it yourself? |
I think the one thing that should probably happen is that you open a pull On Fri, Oct 14, 2016 at 4:49 PM Michael Siehl [email protected]
|
@MichaelSiehl Great! Thank you very much! @zbeekman Enjoy your trip 😄 |
@MichaelSiehl @zbeekman @rouson (and everyone else with coarray experience). Dear all, I have started my experiments with CAF on this project. There is something that I do not understand; please find below some questions for you:
type :: baz
  integer, pointer :: mic(:)=>null()
contains
  procedure :: add
end type baz
type :: foo
  type(baz), allocatable :: bar(:)[:]
end type foo
type(foo) :: a_foo ! an object of type foo (foo itself is only a type name)
...
! somewhere in the rainbow
allocate(a_foo%bar(99)[*])
call a_foo%bar(32)%add(923)
My guess about the last point is that I am missing some synchronization steps. Actually, I synchronize images only before destroying the tables; maybe I also need synchronization before using them. However, I do not see why. If you consider the pseudo code above,
My implied idea was that allocating a coarray implies a synchronization, thus I did not add further ones. Consider that
Thank you in advance for any help! Cheers.
HASTY failing test report
Compilation
stefano@zaghi(04:58 PM Mon Nov 14) on feature/add-coarray-buckets [!]
~/fortran/HASTY 13 files, 320Kb
→ FoBiS.py build -mode tests-intel-caf-debug
Builder options
Directories
Building directory: "exe"
Compiled-objects .o directory: "exe/obj"
Compiled-objects .mod directory: "exe/mod"
Compiler options
Vendor: "intel"
Compiler command: "ifort"
Module directory switch: "-module"
Compiling flags: "-cpp -c -assume realloc_lhs -O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray -DCAF"
Linking flags: "-O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray"
Preprocessing flags: "-DCAF"
Coverage: False
Profile: False
PreForM.py used: False
PreForM.py output directory: None
PreForM.py extensions processed: []
Building src/tests/hasty_test_dictionary.f90
Compiling src/third_party/PENF/src/lib/penf_global_parameters_variables.F90 serially
Compiling src/third_party/PENF/src/lib/penf_b_size.F90 serially
Compiling src/third_party/PENF/src/lib/penf_stringify.F90 serially
Compiling src/third_party/PENF/src/lib/penf.F90 serially
Compiling src/lib/hasty_key_base.f90 serially
Compiling src/lib/hasty_content_adt.f90 serially
Compiling src/lib/hasty_dictionary_node.f90 serially
Compiling src/lib/hasty_dictionary.f90 serially
Compiling src/lib/hasty_key_morton.f90 src/lib/hasty_hash_table.f90 using 2 concurrent processes
Compiling src/lib/hasty.f90 src/third_party/fortran_tester/src/tester.f90 using 2 concurrent processes
Compiling src/tests/hasty_test_dictionary.f90 serially
src/tests/hasty_test_dictionary.f90(93): remark #7712: This variable has not been used. [KEY]
subroutine iterator_max(key, content, done)
--------------------------^
Linking exe/hasty_test_dictionary
Target src/tests/hasty_test_dictionary.f90 has been successfully built
Builder options
Directories
Building directory: "exe"
Compiled-objects .o directory: "exe/obj"
Compiled-objects .mod directory: "exe/mod"
Compiler options
Vendor: "intel"
Compiler command: "ifort"
Module directory switch: "-module"
Compiling flags: "-cpp -c -assume realloc_lhs -O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray -DCAF"
Linking flags: "-O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray"
Preprocessing flags: "-DCAF"
Coverage: False
Profile: False
PreForM.py used: False
PreForM.py output directory: None
PreForM.py extensions processed: []
Building src/tests/hasty_test_hash_table.f90
Compiling src/tests/hasty_test_hash_table.f90 serially
src/tests/hasty_test_hash_table.f90(72): remark #7712: This variable has not been used. [KEY]
subroutine iterator_max(key, content, done)
--------------------------^
Linking exe/hasty_test_hash_table
Target src/tests/hasty_test_hash_table.f90 has been successfully built
Builder options
Directories
Building directory: "exe"
Compiled-objects .o directory: "exe/obj"
Compiled-objects .mod directory: "exe/mod"
Compiler options
Vendor: "intel"
Compiler command: "ifort"
Module directory switch: "-module"
Compiling flags: "-cpp -c -assume realloc_lhs -O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray -DCAF"
Linking flags: "-O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray"
Preprocessing flags: "-DCAF"
Coverage: False
Profile: False
PreForM.py used: False
PreForM.py output directory: None
PreForM.py extensions processed: []
Building src/tests/hasty_test_hash_table_homo.f90
Compiling src/tests/hasty_test_hash_table_homo.f90 serially
Linking exe/hasty_test_hash_table_homo
Target src/tests/hasty_test_hash_table_homo.f90 has been successfully built
Builder options
Directories
Building directory: "exe"
Compiled-objects .o directory: "exe/obj"
Compiled-objects .mod directory: "exe/mod"
Compiler options
Vendor: "intel"
Compiler command: "ifort"
Module directory switch: "-module"
Compiling flags: "-cpp -c -assume realloc_lhs -O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray -DCAF"
Linking flags: "-O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray"
Preprocessing flags: "-DCAF"
Coverage: False
Profile: False
PreForM.py used: False
PreForM.py output directory: None
PreForM.py extensions processed: []
Building src/tests/hasty_test_hash_table_homokey_failure.f90
Compiling src/tests/hasty_test_hash_table_homokey_failure.f90 serially
Linking exe/hasty_test_hash_table_homokey_failure
Target src/tests/hasty_test_hash_table_homokey_failure.f90 has been successfully built
Builder options
Directories
Building directory: "exe"
Compiled-objects .o directory: "exe/obj"
Compiled-objects .mod directory: "exe/mod"
Compiler options
Vendor: "intel"
Compiler command: "ifort"
Module directory switch: "-module"
Compiling flags: "-cpp -c -assume realloc_lhs -O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray -DCAF"
Linking flags: "-O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces#-fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -coarray"
Preprocessing flags: "-DCAF"
Coverage: False
Profile: False
PreForM.py used: False
PreForM.py output directory: None
PreForM.py extensions processed: []
Building src/tests/hasty_test_hash_table_homocontent_failure.f90
Compiling src/tests/hasty_test_hash_table_homocontent_failure.f90 serially
Linking exe/hasty_test_hash_table_homocontent_failure
Target src/tests/hasty_test_hash_table_homocontent_failure.f90 has been successfully built
Execution error
stefano@zaghi(05:17 PM Mon Nov 14) on feature/add-coarray-buckets
~/fortran/HASTY 14 files, 324Kb
→ export FOR_COARRAY_NUM_IMAGES=2
stefano@zaghi(05:30 PM Mon Nov 14) on feature/add-coarray-buckets
~/fortran/HASTY 14 files, 324Kb
→ ./exe/hasty_test_hash_table
forrtl: severe (174): SIGSEGV, segmentation fault occurred
In coarray image 1
Image PC Routine Line Source
hasty_test_hash_t 00000000004F7021 Unknown Unknown Unknown
hasty_test_hash_t 00000000004F515B Unknown Unknown Unknown
hasty_test_hash_t 00000000004A5B84 Unknown Unknown Unknown
hasty_test_hash_t 00000000004A5996 Unknown Unknown Unknown
hasty_test_hash_t 0000000000467EB9 Unknown Unknown Unknown
hasty_test_hash_t 000000000046C8A6 Unknown Unknown Unknown
libpthread-2.24.s 00007FF2913A2080 Unknown Unknown Unknown
hasty_test_hash_t 0000000000443A4D hasty_hash_table_ 112 hasty_hash_table.f90
hasty_test_hash_t 0000000000460BA2 hasty_test_hash_t 67 hasty_test_hash_table.f90
hasty_test_hash_t 000000000045DD6B MAIN__ 23 hasty_test_hash_table.f90
hasty_test_hash_t 0000000000403DAE Unknown Unknown Unknown
libc-2.24.so 00007FF290E0F291 __libc_start_main Unknown Unknown
hasty_test_hash_t 0000000000403CAA Unknown Unknown Unknown
application called MPI_Abort(comm=0x84000004, 3) - process 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
In coarray image 2
Image PC Routine Line Source
hasty_test_hash_t 00000000004F7021 Unknown Unknown Unknown
hasty_test_hash_t 00000000004F515B Unknown Unknown Unknown
hasty_test_hash_t 00000000004A5B84 Unknown Unknown Unknown
hasty_test_hash_t 00000000004A5996 Unknown Unknown Unknown
hasty_test_hash_t 0000000000467EB9 Unknown Unknown Unknown
hasty_test_hash_t 000000000046C8A6 Unknown Unknown Unknown
libpthread-2.24.s 00007F672A75E080 Unknown Unknown Unknown
hasty_test_hash_t 0000000000443A4D hasty_hash_table_ 112 hasty_hash_table.f90
hasty_test_hash_t 0000000000460BA2 hasty_test_hash_t 67 hasty_test_hash_table.f90
hasty_test_hash_t 000000000045DD6B MAIN__ 23 hasty_test_hash_table.f90
hasty_test_hash_t 0000000000403DAE Unknown Unknown Unknown
libc-2.24.so 00007F672A1CB291 __libc_start_main Unknown Unknown
hasty_test_hash_t 0000000000403CAA Unknown Unknown Unknown
application called MPI_Abort(comm=0x84000002, 3) - process 1 |
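On the question raised before the report, whether allocating a coarray implies a synchronization: the standard does make ALLOCATE and DEALLOCATE of a coarray image control statements with an implicit synchronization of all images, but data defined after the allocation still needs an explicit synchronization before another image reads it. The following minimal sketch illustrates this under those assumptions; the names table and me are hypothetical, and it is not meant to explain the crash reported above.

```fortran
program alloc_sync_sketch
  ! Sketch only: which synchronizations come for free with a coarray
  ! allocation and which ones must still be written explicitly.
  implicit none
  integer, allocatable :: table(:)[:]   ! hypothetical coarray
  integer :: me

  me = this_image()
  allocate(table(4)[*])   ! coarray allocation: implicit synchronization of all images
  table = me              ! purely local definition, not yet ordered with other images
  sync all                ! required before any image reads another image's data
  if (me == 1) then
    print *, 'first element on the last image: ', table(1)[num_images()]
  end if
  deallocate(table)       ! coarray deallocation: implicit synchronization again
end program alloc_sync_sketch
```

Without the sync all, the remote read on image 1 could race with the assignment on the last image, even though the allocation itself was synchronized.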
Dear all (@rouson @zbeekman @MichaelSiehl), I am out of the office, but I have done a test on my tablet: with OpenCoarrays 1.7.5 (manually compiled), GNU gfortran 6.2.0 (from the dev Ubuntu repo) and MPICH 3.2 (manually compiled) the above test works well 🎉 The problem is with the Intel compiler 😢 The code seems to be standard conforming. For those interested, there is an official bug report on Intel support here As a consequence, for HASTY I have to stick to OpenCoarrays/GNU gfortran only until this Intel bug is solved. This is a problem for me, because using as many compilers as possible concurrently is my preferred workflow for catching my errors... If some of you have access to other compilers, e.g. IBM, Cray, NAG..., and would be so kind as to occasionally do a compilation on the fly, I will be very happy 😄 |
@szaghi, I have access to a Cray and would be glad to test for you. As I'm sure you know, I work best interactively so I'd suggest occasionally setting up a time for us to talk so that you can walk me through running your tests. Alternatively, it would be great if you would contribute your bug reports to the AdHoc repository that I set up to track bug reports related to modern Fortran. It's nice to have everything in one place and run all the tests interactively. I will also make sure an Intel compiler support engineer with whom I've interacted quite often is aware of this bug. I'm certain Steve will do a great job tracking it, but he has announced his retirement so it might help to have another engineer watching it also. |
Dear Damian, you are very kind! When I have meaningful tests I'll bother you. I always assumed that Cray is a really great compiler (like IBM/Intel/GNU). One "obscure object" for me is NAG: have you ever used it? As for the Intel bug, I am very happy to contribute to your (great) AdHoc. Later today I'll try to create a PR. Cheers. |
Dear Damian, I just created a PR for AdHoc with the Intel issue. Steve has raised it as an official issue, but not yet as an official bug. I hope my PR conforms to your conventions; if it does not, feel free to reject it and I'll amend the PR with your corrections. Cheers. |
Thanks for contributing the PR to AdHoc. Regarding compilers, I view GNU as being in the lead right now for Fortran 2015 support. It's only missing two major 2015 features: parameterized derived types (which happens to be a 2003 feature) and teams. NAG is generally considered the best compiler for checking standards-conformance. For that reason, it would be great if NAG were the most common second compiler, sort of like how English is the most common second language. As of the last time I checked, NAG was missing only one feature to reach 2003 compliance (user-defined derived type input/output) and hadn't yet started on any of the bigger features required for 2008 or 2015 compliance. Cray is a great compiler: it is fully 2008-compliant and supports a sizable amount of 2015, but not quite as much as GNU. For my purposes, there are only three useful compilers: Cray, Intel, and GNU. Almost everything I do these days involves CAF and those are the only three compilers that support CAF. However, even in what I just wrote, I'm being overly generous because Fortran 2015 events are very important to parallel performance with CAF and only GNU supports events. Once GNU became the first compiler to support Fortran 2015 events, I decided to fully embrace Fortran 2015 and stop waiting for other compilers to catch up. In the multi-core/many-core era, it doesn't make much sense to talk about serial performance, and GNU is generally in the lead for parallel performance, especially if one fully embraces all available Fortran 2015 features. |
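For readers who have not used the Fortran 2015 events mentioned above, here is a minimal sketch of a point-to-point handshake with EVENT POST and EVENT WAIT. The names ready and payload are hypothetical, the handshake only happens with at least two images, and the program requires a compiler and runtime with event support.

```fortran
program event_sketch
  ! Sketch only: image 1 writes a value on image 2 and posts an event there;
  ! image 2 waits on its own event variable before reading the value.
  use, intrinsic :: iso_fortran_env, only: event_type
  implicit none
  type(event_type) :: ready[*]   ! hypothetical event variable
  integer :: payload[*]          ! hypothetical data being handed over

  payload = 0
  sync all                       ! payload is initialized everywhere before any remote write

  if (num_images() >= 2) then
    if (this_image() == 1) then
      payload[2] = 42
      event post (ready[2])      ! notify image 2 without blocking image 1
    else if (this_image() == 2) then
      event wait (ready)         ! blocks until the post from image 1 has arrived
      print *, 'image 2 received payload ', payload
    end if
  end if
end program event_sketch
```

Only the two images involved take part in the handshake, which is why events can be cheaper than a sync all that involves every image.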
You are much more than welcome.
This is what I read elsewhere: NAG is one of the best "debugging compilers".
As I said, I am trying to move my bosses from MPI to CAF, which is a complex goal... When I joined CNR-INSEAN my first goal was to move them from static allocation to dynamic allocation. Then I showed that OOP could help us and pushed to adopt it; I started in 2010... and I have won only now 😢 I hope the CAF mission will take less than 6 years... (OT: do you think that Fanfarillo or Filippone could accept an invited lecture at INSEAN about CAF?) I rely on GNU for almost all of my stuff, and I also use it for research simulations, but for the few commercial works that we do, we prefer to use Intel: the code we use for commercial works was initially developed with PGI; then, when I moved my bosses to dynamic allocation, I convinced them to buy Intel, so they are now more confident with Intel than with GNU (my bad 😢 ).
Sante Parole (English: Holy Words): I am still discussing on google whether goto is faster/clearer than select case... My best regards. |
@rouson @zbeekman @MichaelSiehl @afanfa @LadaF @jeffhammond @certik @victorsndvg @milancurcic @muellermichel @jacobwilliams @cmacmackin Dear all, I am sorry to ping/bother you directly; feel free to ignore this help request without further comments. The baseline serial hash-table structure is now ready and my preliminary tests with CAF are really interesting (although I had to switch off Intel in favor of only GNU). I am now at the point where I need the help of more expert parallel programmers like you. A very brief description:
My experience with parallel hash tables is nearly zero... I studied many resources, but most of them refer to distributed or lock-free hash tables designed for the cloud (mostly peer-to-peer, torrent-like aims), where the main need is to avoid re-hashing/re-distributing all nodes when a bucket is added/removed (mainly related to consistent hashing) by means of circular mappings. My needs are different:
In general, I know a priori the maximum number of grid-refinement levels allowed, thus I know a priori the maximum dimension of the hash table, so I can select the number of buckets accordingly, and re-hashing is a minor necessity. My main concern is
In my current test all images have the same number of buckets, thus the hashed keys map onto the same bucket on all images..., meaning the same identical hash table is copied over all images. Probably I need some sort of offset to (uniquely) distribute the nodes over the CAF images, but I have no idea which kind of offset to use or what hash function results from it. For example I can imagine something like:
Are some of you aware of something similar or can you give me a better idea? Thank you in advance for any hints! Cheers. |
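One common way to realize the kind of offset asked about above is to hash into a single global bucket range and then split the global bucket index into an owning image and a local bucket. The sketch below is only an illustration under stated assumptions: it is not HASTY code, the names locate, buckets_per_image and images_number are hypothetical, the key is used directly as its own hash, and a fixed images_number stands in for num_images() so the program also runs serially.

```fortran
program bucket_owner_sketch
  ! Sketch only: map a key to (owning image, local bucket) so that each key
  ! lives on exactly one image instead of being replicated on all of them.
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: buckets_per_image = 4   ! hypothetical bucket count per image
  integer, parameter :: images_number = 3       ! stands in for num_images()
  integer(int64) :: key
  integer :: owner, local

  do key = 0_int64, 13_int64
    call locate(key, owner, local)
    print '(a,i3,a,i2,a,i2)', 'key', key, ' -> image', owner, ', local bucket', local
  end do

contains

  pure subroutine locate(key, owner, local)
    integer(int64), intent(in) :: key
    integer, intent(out) :: owner, local
    integer(int64) :: global

    ! global bucket index in 0 .. images_number*buckets_per_image - 1
    global = modulo(key, int(images_number, int64)*buckets_per_image)
    ! block distribution: consecutive global buckets belong to the same image
    owner = int(global/buckets_per_image) + 1
    local = int(modulo(global, int(buckets_per_image, int64))) + 1
  end subroutine locate

end program bucket_owner_sketch
```

The block layout keeps neighbouring global buckets on the same image; a cyclic layout (owner = modulo(global, images_number) + 1) would spread them round-robin instead. Which one suits the AMR use case better depends on how much locality the keys carry, which this sketch does not try to settle.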
although I had to switch off Intel in favor of only GNU
Why?
|
Hi Jeff, the reason is a small Intel compiler issue, see this It is still not an official bug, but Steve Lionel raised it as an official issue of the current Intel version. Anyhow, this issue prevents me from calling TBPs of CAF members, which makes the whole HASTY structure not viable. Indeed, relying only on GNU is a concern for me: my "bosses" want Intel for production, but, more importantly, I am somewhat OCD and testing my bad codes with at least 2 different compilers alleviates my irritability... the chance of missing my errors is reduced by 50%... |
Hi Jeff, the reason is a small Intel compiler issue, see this It is still not an official bug, but Steve Lionel raised it as an official issue of the current Intel version. Anyhow, this issue prevents me from calling TBPs of CAF members, which makes the whole HASTY structure not viable. Indeed, relying only on GNU is a concern for me: my "bosses" want Intel for production, but, more importantly, I am somewhat OCD and testing my bad codes with at least 2 different compilers alleviates my irritability... the chance of missing my errors is reduced by 50%...
You want to try Cray compiler? I can help you with NERSC access.
|
Jeff, you are too kind, thank you very much! In the future, when I have some meaningful tests, I would like to prepare some for you and Damian, who have access to Cray, but for now it is premature and I do not want to waste your time. Your help in the form of comments/ideas/criticism here is already a godsend for me! Cheers. |
I looked at SAMRAI, BoxLib, Paramesh and many others. You are right, it is not good to reinvent the wheel, but in this case I prefer to... The AMR data structure is crucial for my work: I want it all in my hands and, probably more importantly, I want to learn, and not only to use an AMR library. Finally, to my knowledge, none of them is CAF based 😄 HASTY is a small piece of my tool for my personal research. Concerning the parallel hashing, I have made some small but important steps today: hopefully on Monday I'll ask for your comments. Cheers. |
Exploiting coarrays to make the hash table a massively parallel container is a challenging aim; probably I will not be up to the task. To start, I have to study the work of @MichaelSiehl in depth.