Each task process only use one CPU? #279

Open
buaasun opened this issue Apr 24, 2016 · 6 comments

Comments

@buaasun

buaasun commented Apr 24, 2016

I use srun -N2 -n4 ... to start a Grappa application, but I notice that the CPU usage of each process is about 100%, so it seems each process uses only one CPU. How can I configure the number of CPUs for a Grappa application?

@nelsonje
Member

On some systems, srun is not sufficient to run an MPI job properly; you need to use salloc and mpirun or mpiexec to run multi-node jobs properly. Does the command

make demo-hello_world && salloc -N2 -n4 mpirun applications/demos/hello_world.exe

behave differently than

make demo-hello_world && srun -N2 -n4 applications/demos/hello_world.exe

In general, Grappa CPU usage will be high whether or not a task is idle: it busy-waits for new work to minimize startup latency.

@buaasun
Author

buaasun commented Apr 25, 2016

Can a Grappa process reach CPU usage of 200%, 300%, or even higher?
I wrote a simple multithreaded program, test_thread.cpp:

#include <thread>
#include <vector>
int main(int argc, char *argv[]) {
  std::vector<std::thread> threads(2);
  for (auto& thr : threads) {
    thr = std::thread([](){
      while(1){
      }
    });
  }
  for (auto& thr : threads) {
    thr.join();
  }
}

Then I ran the program with srun:
srun -N1 -n2 test_thread
This produced 2 test_thread processes, and each process used 200% CPU.

I also wrote a simple Grappa app, test_grappa:

#include <Grappa.hpp>
void app_main(int argc, char *argv[]) {
  Grappa::CompletionEvent ce(2);
  for(int i=0;i<2;i++){
    Grappa::spawn([&ce]{
      while(1){
      }
      ce.complete();
    });
  }
  ce.wait();
}
int  main(int argc, char *argv[]) {
  Grappa::init(&argc,&argv);
  Grappa::run([=]{
    app_main(argc,argv);
  });
  Grappa::finalize();
}

Then I ran the program with srun:
srun -N1 -n2 test_grappa
This also produced 2 test_grappa processes, but each process used only 100% CPU.
I think each Grappa process is bound to a CPU core and cannot use more than one CPU.
Am I right? @nelsonje

@bmyerz
Member

bmyerz commented Apr 25, 2016

First, the program test_grappa is probably not what you intended to write. When you call Grappa::run, it runs the lambda on core0 as a task. Grappa::spawn runs a task that is private to the calling core. So once you call ce.wait, your program has 3 tasks, all running on core0. If you want the program to actually run 2 tasks on each of the two cores, then try this:

void app_main(int argc, char *argv[]) {
  Grappa::on_all_cores([] {
    Grappa::CompletionEvent ce(2);
    for (int i = 0; i < 2; i++) {
      Grappa::spawn([&ce] {
        while (1) {
        }
        ce.complete();
      });
    }
    ce.wait();
  });
}

When you call ce.wait() on both cores, you will have 7 tasks: the main task on core0, 2 tasks spawned by on_all_cores (1 on core0 and 1 on core1), 2 tasks spawned by the for loop on core0, and 2 tasks spawned by the for loop on core1.

To answer your question directly, each Grappa process is bound to a CPU core. If you want to use more cores, you run more processes.
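
The effect described above (one process, at most 100% CPU, no matter how many threads spin inside it) can be reproduced outside Grappa. The following is a minimal sketch, assuming Linux with glibc; it does not use Grappa's actual pinning code, and the function names are hypothetical. It pins the calling thread to a single CPU and shows that a thread created afterwards inherits the same one-CPU affinity mask.

```cpp
#include <sched.h>   // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)
#include <thread>

// Number of CPUs the calling thread may run on, or -1 on error.
int allowed_cpu_count() {
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
  return CPU_COUNT(&set);
}

// Pin the calling thread to the lowest-numbered CPU it is currently allowed
// to use. Returns the chosen CPU index, or -1 on failure.
// (Hypothetical helper; Grappa's own pinning logic may differ.)
int pin_to_first_allowed_cpu() {
  cpu_set_t current;
  CPU_ZERO(&current);
  if (sched_getaffinity(0, sizeof(current), &current) != 0) return -1;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &current)) {
      cpu_set_t one;
      CPU_ZERO(&one);
      CPU_SET(cpu, &one);
      return sched_setaffinity(0, sizeof(one), &one) == 0 ? cpu : -1;
    }
  }
  return -1;
}

// A thread created after pinning inherits its creator's affinity mask,
// so it reports the same CPU count as the pinned creator.
int allowed_cpu_count_in_new_thread() {
  int count = -1;
  std::thread t([&count] { count = allowed_cpu_count(); });
  t.join();
  return count;
}
```

Calling pin_to_first_allowed_cpu() before spawning the two spinning threads in test_thread.cpp would be expected to cap that process at roughly 100% CPU as well, since both threads then compete for the same core.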

@buaasun
Author

buaasun commented Apr 26, 2016

Do you have plans to improve Grappa to support multiple CPU cores per process? @nelsonje @bmyerz

I think multithreading and multiprocessing are quite different: threads can share memory while processes cannot. So I want Grappa to use not only multiple processes but also multiple threads.

If you agree with me but do not plan to do it, I'd like to give it a try. Do you have any ideas?

@simonkahan
Member

My concern would be that doing so would complicate the programming model. In Grappa today, when tasks executed by distinct cores access the same global variable, the accesses are guaranteed to be serialized through the core to which the process owning the variable is bound. No atomics are needed.

In what you propose, when tasks executed by distinct cores access the same global variable, they would need to use atomic operations to maintain consistency: otherwise, for example, the operation performed by one core might be overwritten by the other.

I believe that mixing access to global variables, with some tasks using delegates and others using atomics, would be a programmer's nightmare: diagnosing races would be insufferable.

So, rather than corrupt the existing memory semantics, I suggest you consider adding an additional abstraction: a task "bundle". The bundle consists of a set of tasks all executed as a single process that can span multiple cores. Tasks within the bundle share memory within their own address space only with other tasks in the bundle. Access to a global variable uses the Grappa delegate mechanism.
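
To illustrate the contrast drawn above, here is a minimal sketch in plain C++ (not Grappa's API) of serializing every access to a variable through a single owner. The Owner class and run_delegated_increments are hypothetical names. A mutex-protected queue stands in for Grappa's inter-core messaging, which is an assumption made purely for illustration; the point is that the operations on the counter are ordinary non-atomic code, because only the owner thread ever executes them.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for "the core that owns a variable": one thread that
// runs every operation shipped to it, strictly one at a time.
class Owner {
 public:
  Owner() : worker_([this] { loop(); }) {}

  ~Owner() {
    {
      std::lock_guard<std::mutex> lock(m_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();  // the worker drains any remaining operations first
  }

  // Ship an operation to the owner instead of touching the data directly.
  void delegate(std::function<void()> op) {
    {
      std::lock_guard<std::mutex> lock(m_);
      ops_.push(std::move(op));
    }
    cv_.notify_one();
  }

 private:
  void loop() {
    for (;;) {
      std::function<void()> op;
      {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return done_ || !ops_.empty(); });
        if (ops_.empty()) return;  // shutting down and nothing left to run
        op = std::move(ops_.front());
        ops_.pop();
      }
      op();  // executed only on the owner thread
    }
  }

  std::mutex m_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> ops_;
  bool done_ = false;
  std::thread worker_;  // declared last so the members above exist before it starts
};

// Several client threads increment a plain (non-atomic) counter by delegation.
long run_delegated_increments(int num_threads, int per_thread) {
  long counter = 0;
  {
    Owner owner;
    std::vector<std::thread> clients;
    for (int t = 0; t < num_threads; ++t) {
      clients.emplace_back([&owner, &counter, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
          owner.delegate([&counter] { ++counter; });
        }
      });
    }
    for (auto& c : clients) c.join();
  }  // Owner's destructor runs all queued operations, then joins its thread
  return counter;
}
```

If the client threads instead ran ++counter themselves, the increments would race and some could be lost unless counter became a std::atomic, which is the extra burden the comment above warns about.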

@bmyerz
Member

bmyerz commented Apr 26, 2016

I'm curious what your specific motivation is for implementing multi-threaded Grappa processes. Grappa's programming model provides its own shared memory abstraction.

Is it that you want to implement core-to-core communication with intra-process shared memory because you think it will be faster? Then you might change the implementation of Grappa to be multithreaded without changing its user API.

Or do you just want to be able to use existing multithreaded applications within a Grappa program? If so, it should be possible to tweak the code that pins Grappa processes to cores (found within this function, I think: https://github.com/uwsampa/grappa/blob/master/system/Grappa.cpp#L359) so that when you spawn a pthread in your Grappa process, that pthread is allowed to run on a core other than the one the main pthread is running on. Be warned: if you spawn more pthreads, you should not have them call Grappa APIs. Grappa's APIs assume they are being called only by the main pthread.
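
One possible shape for that tweak, sketched under assumptions: this is not the code in Grappa.cpp, the function names are hypothetical, and it assumes Linux with glibc. The idea is to save the process's original affinity mask before the main thread is pinned, and have each extra thread restore that wider mask for itself, leaving the main thread pinned.

```cpp
#include <sched.h>   // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)
#include <thread>

// Affinity mask of the calling thread (empty set on error).
cpu_set_t calling_thread_affinity() {
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0) CPU_ZERO(&set);
  return set;
}

// Pin the calling thread to the lowest-numbered CPU in `allowed`.
// Stands in for whatever pinning the runtime performs at startup.
bool pin_calling_thread_within(const cpu_set_t& allowed) {
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &allowed)) {
      cpu_set_t one;
      CPU_ZERO(&one);
      CPU_SET(cpu, &one);
      return sched_setaffinity(0, sizeof(one), &one) == 0;
    }
  }
  return false;
}

// Spawn a helper thread that widens its own mask back to `saved`.
// Returns how many CPUs the helper ends up allowed to use, or -1 on failure.
// The helper must not call Grappa APIs; it only runs ordinary code.
int helper_cpu_count_after_unpinning(const cpu_set_t& saved) {
  int count = -1;
  std::thread helper([&saved, &count] {
    // pid 0 means "the calling thread", so only the helper is affected.
    if (sched_setaffinity(0, sizeof(saved), &saved) != 0) return;
    cpu_set_t now;
    CPU_ZERO(&now);
    if (sched_getaffinity(0, sizeof(now), &now) == 0) count = CPU_COUNT(&now);
  });
  helper.join();
  return count;
}
```

A caller would take calling_thread_affinity() before any pinning happens, keep it around, and pass it to each helper thread; the main thread's one-core mask is left untouched.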
