Threads in thread pool getting hung on ppc64le #79

Closed
npanpaliya opened this issue Sep 30, 2016 · 1 comment

Comments

@npanpaliya

Hi,

We have built Torch on ppc64le. While testing it, we found that threads hang when a thread pool contains a larger number of threads, say 16 (at times even 8). This looks similar to #62 or #16.

Here is our test case:

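local Threads = require 'threads'
local nthreads = 64
Threads.serialization('threads.sharedserialize')
-- Each thread runs both initialization callbacks, in order
local thrds = Threads(nthreads,
         function()  print('Starting thread') end,
         function()  require 'image' end
      )
-- Wait for all threads to finish their pending work
thrds:synchronize()
print('Done')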

Some observations:

  • Debugging with gdb showed that all of the threads are sitting in pthread_cond_wait.
  • If I remove "require 'image'" from the example above, it works even with 64 threads. Since requiring image loads a lot of code, the problem may be related to memory use, but I am not sure.
  • The same test works on x86 even with 64 threads.
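
One way to narrow down the observations above is to grow the pool size step by step and watch where the run stops. The following is a minimal sketch, not a verified fix: the helper `pool_sizes` is a hypothetical name introduced here, and the script assumes the `threads` and `image` packages from the test case are installed. Because a hang produces no error, the size is printed before each pool is created.

```lua
-- Sketch: find the smallest pool size at which the hang appears.
-- Assumes the torch 'threads' and 'image' packages are installed.

-- Returns the powers of two (1, 2, 4, ...) that do not exceed `max`.
-- `pool_sizes` is a hypothetical helper, not part of the threads package.
local function pool_sizes(max)
   local sizes = {}
   local n = 1
   while n <= max do
      sizes[#sizes + 1] = n
      n = n * 2
   end
   return sizes
end

local ok, Threads = pcall(require, 'threads')
if not ok then
   print('threads package not available; nothing to test')
else
   Threads.serialization('threads.sharedserialize')
   for _, n in ipairs(pool_sizes(64)) do
      -- Print before creating the pool so a hang leaves the size on screen.
      io.write(string.format('nthreads = %d ... ', n))
      io.flush()
      local pool = Threads(n, function() require 'image' end)
      pool:synchronize()
      pool:terminate()
      print('ok')
   end
end
```

If the output stops after a line such as `nthreads = 16 ... `, the hang first appears at that size. Replacing `require 'image'` with an empty function gives a baseline without the image package.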

Any help would be really appreciated.

@npanpaliya
Author

We fixed this issue by rebuilding Torch with the latest Advanced Toolchain, version 10.0, which ships gcc 6.2.1 and glibc 2.24.
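
To confirm that a rebuilt Torch actually runs against the newer glibc, the version can be queried from inside the process. This is a sketch under assumptions: it needs an FFI module loadable as `ffi` (built into LuaJIT, or installed separately on plain Lua builds) and a glibc-based system. `glibc_version` is a hypothetical helper name; `gnu_get_libc_version` is the glibc function declared in `<gnu/libc-version.h>`.

```lua
-- Sketch: report the glibc version the current process is running against.
-- Assumptions: an FFI module is loadable as 'ffi' and the C library is glibc.
-- `glibc_version` is a hypothetical helper, not part of Torch.
local function glibc_version()
   local ok, ffi = pcall(require, 'ffi')
   if not ok then
      return nil, 'ffi module not available'
   end
   -- Declare the glibc function so it can be called through ffi.C.
   local declared = pcall(ffi.cdef, 'const char *gnu_get_libc_version(void);')
   if not declared then
      return nil, 'could not declare gnu_get_libc_version'
   end
   -- The symbol lookup fails on non-glibc systems, so guard the call.
   local found, version = pcall(function()
      return ffi.string(ffi.C.gnu_get_libc_version())
   end)
   if not found then
      return nil, 'gnu_get_libc_version not found (non-glibc system?)'
   end
   return version
end

local version, err = glibc_version()
print('glibc version: ' .. (version or ('unknown (' .. err .. ')')))
```

Running this with the same interpreter that runs the test case shows which glibc that build picks up at run time.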
