Hello,
I've been trying to run distributed Galois for quite some time. I've tried running all the provided apps and have been encountering a segmentation fault.
Command used : ./sssp-pull --startNode=0 $graphPath
Error observed:
[0] Master distribution time : 0.239983 seconds to read 168 bytes in 20 seeks (0.00070005 MBPS)
[0] Starting graph reading.
[0] Reading graph complete.
[0] Edge inspection time: 0.246308 seconds to read 148615096 bytes (603.371 MBPS)
Loading edge-data while creating edges
[0] Edge loading time: 0.529808 seconds to read 271105352 bytes (511.705 MBPS)
[0] Graph construction complete.
[0] InitializeGraph::go called
[0] SSSP::go run 0 called
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6dc8604 in PMPI_Iallreduce () from /lfs/sware/openmpi411/lib/libmpi.so.40
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64 numactl-libs-2.0.9-7.el7.x86_64
Here are some observations from running the binary under gdb.
The segfault occurs in code from the library libdist/libgalois_dist_async.a, on a PMPI_Iallreduce call.
After the segfault, I inspected the process in gdb and noticed that a constant address outside the process's mapped memory is being accessed.
This address is loaded into r9 in the preamble to the MPI_Iallreduce call and is then moved into rbp. It never appears to be accessible.
Per the System V AMD64 ABI, r9 carries the sixth integer or pointer argument of a function, which for MPI_Iallreduce is the communicator, here MPI_COMM_WORLD.
This could happen if MPI_COMM_WORLD was not initialised, which would indicate that the code path is missing a call to MPI_Init().
Also, gdb could only set a pending (future) breakpoint on MPI_Init, and the segfault in MPI_Iallreduce occurs before that breakpoint is ever hit. I don't see any boost_mpi libraries either.
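To make the register reasoning above easier to check, here is a minimal sketch in Python. It assumes the standard C binding of MPI_Iallreduce (sendbuf, recvbuf, count, datatype, op, comm, request) and that every argument is an integer or pointer, so the System V AMD64 integer registers are assigned in order. The names `INT_ARG_REGS`, `MPI_IALLREDUCE_PARAMS` and `arg_locations` are made up for this illustration and are not part of Galois or MPI.

```python
# Integer/pointer argument registers of the System V AMD64 calling convention, in order.
INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

# Parameters of MPI_Iallreduce in declaration order (standard C binding).
MPI_IALLREDUCE_PARAMS = [
    "sendbuf", "recvbuf", "count", "datatype", "op", "comm", "request",
]


def arg_locations(params):
    """Map each parameter to its register, or to 'stack' once the six registers are used.

    Assumption: every parameter is an int or a pointer (INTEGER class),
    so no floating-point registers are involved.
    """
    return {
        name: INT_ARG_REGS[i] if i < len(INT_ARG_REGS) else "stack"
        for i, name in enumerate(params)
    }


locations = arg_locations(MPI_IALLREDUCE_PARAMS)
for name, where in locations.items():
    print(f"{name:9s} -> {where}")
```

Under these assumptions the communicator lands in r9 and the request pointer is passed on the stack, which matches what I see in the disassembly.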
This is the preamble to the PMPI_Iallreduce call:
This is where the segfault happens.
Note the address 0x44000000, which does not appear to be accessible.