Multithreading efficiency #208

Open
patham9 opened this issue Jun 22, 2022 · 8 comments

@patham9
Member

patham9 commented Jun 22, 2022

Make multi-threading effective again.
See if it can be used for temporal compounding in sensorimotor inference, which would likely bring the biggest performance gain.

@patham9 patham9 added this to the v0.9.2 milestone Jun 22, 2022
@automenta

Let's see some benchmarks to know whether the concurrency support actually does anything, including whether it makes performance better or worse and by how much, on different numbers of threads (1, 2, 4, ...) and also on different CPUs/architectures/OSes.

@patham9
Member Author

patham9 commented Jun 26, 2022

Since the inverted term index was introduced I removed the related parallel-for loops, as they didn't bring benefits for v0.9.1.
But with the new additions planned for this issue, multi-threading will be effective again.
I typically use the time command on evaluation.py:
time python3 evaluation.py
This produces a timing summary across all examples, so it captures performance changes in both semantic and sensorimotor inference.
We can run it once the branch is merged to see.

@automenta

https://en.wikipedia.org/wiki/Amdahl's_law
A profiler's thread view will show which parts are parallelized and which parts, if any, are serial.

Running synthetic tests will probably not represent the actual kind of workload in, for example, a NAR with a steady full memory "at cruising altitude", i.e. gains made on tests are not likely to apply completely to actual workloads.

As far as I can tell, the only way to sufficiently parallelize a NAR is a fixed threadpool (fewer threads than, or about as many as, the CPU cores) whose threads execute reasoner cycles asynchronously and independently.

This still requires some synchronization points, for example around the various mutable data structures like Bags, BeliefTables, etc. It is possible to safely isolate the critical sections to these data structures - particularly where writes are involved - by using locks, atomic CAS, etc.
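
Whatever remains serial bounds the overall speedup per Amdahl's law (roughly 1 / ((1 - p) + p/n) for a parallel fraction p on n threads). As a minimal sketch of the lock-free variant of such a write-side critical section, C11 atomics with a CAS retry loop could look roughly like this; the SharedConcept type, the fixed-point priority field, and Concept_RaisePriority are hypothetical names for illustration, not ONA's actual API:

```c
#include <stdatomic.h>

/* Hypothetical shared structure: a concept priority stored as a fixed-point
   integer so it fits into a single atomic word. Names are illustrative only. */
typedef struct {
    _Atomic unsigned int priority_fixed; /* priority scaled by 65536 */
} SharedConcept;

/* Lock-free "raise to at least new_fixed": retry the CAS until either the
   stored value is already higher or our swap succeeds. Safe to call from
   any reasoner thread without taking a lock. */
static void Concept_RaisePriority(SharedConcept *c, unsigned int new_fixed)
{
    unsigned int cur = atomic_load(&c->priority_fixed);
    while (new_fixed > cur) {
        if (atomic_compare_exchange_weak(&c->priority_fixed, &cur, new_fixed)) {
            break; /* our write won */
        }
        /* on failure, cur now holds the latest stored value and the loop re-checks */
    }
}
```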

@patham9
Member Author

patham9 commented Jun 26, 2022

Over time new experiments have been added to the collection, some of which drive the system at full capacity, while others are more "synthetic". Of course this collection should be extended and stay part of the repository; that way the benchmarks can be reproduced by others.

Regarding profiling and thread pools: I agree, this seems to be the way to go for this implementation.

@automenta

so to make it thread safe:

  1. identify all the mutable data structures and make those thread safe (locks, atomics, etc.). Assume everything else will run in its own oblivious single thread that just calls the shared data structures' functions.

  2. manage a threadpool with individual on/off toggles. Simply run whatever cycle procedure in each thread. The toggles provide a basic throttling ability (e.g. 40% CPU usage).

if possible I would try to use a well-developed external lib for the threading, locks, and concurrent data structures, something like Boost. Java provides so much infrastructure for this and it's easy to take it for granted, e.g. its StampedLock.
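
A minimal sketch of point 1 above, assuming a hypothetical Bag type where the lock is owned by the data structure itself so calling threads stay oblivious (none of these names are ONA's actual structures):

```c
#include <pthread.h>
#include <stdbool.h>

#define BAG_CAPACITY 256

/* Hypothetical bag of item ids; the real structures would differ. */
typedef struct {
    pthread_mutex_t lock;
    int items[BAG_CAPACITY];
    int count;
} Bag;

void Bag_Init(Bag *b)
{
    pthread_mutex_init(&b->lock, NULL);
    b->count = 0;
}

/* The critical section lives inside the data structure's own function;
   reasoner threads just call Bag_Add and never touch the lock directly. */
bool Bag_Add(Bag *b, int item)
{
    bool added = false;
    pthread_mutex_lock(&b->lock);
    if (b->count < BAG_CAPACITY) {
        b->items[b->count++] = item;
        added = true;
    }
    pthread_mutex_unlock(&b->lock);
    return added;
}
```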

@patham9
Member Author

patham9 commented Jun 27, 2022

I agree with the former two points.
As to the latter: I know you are not very used to OS APIs, but POSIX threads are all that is needed, featuring locks, condition variables, etc. Boost is a C++ lib and a bit of bloat when only used for threading, and last time I checked it even lacked some relevant functions for fine-grained thread control.
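
For point 2, a minimal sketch of a fixed POSIX thread pool with per-thread on/off toggles, using only pthread mutexes and condition variables; Cycle and all other names here are placeholders, not ONA's actual functions:

```c
#include <pthread.h>
#include <stdbool.h>

#define NUM_WORKERS 4

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t wake;
    bool enabled;   /* on/off toggle used for throttling */
    bool shutdown;
} Worker;

static Worker workers[NUM_WORKERS];

/* Placeholder for one reasoner cycle. */
static void Cycle(int worker_id) { (void)worker_id; }

static void *WorkerLoop(void *arg)
{
    Worker *w = (Worker *)arg;
    int id = (int)(w - workers);
    for (;;) {
        pthread_mutex_lock(&w->lock);
        while (!w->enabled && !w->shutdown) {
            pthread_cond_wait(&w->wake, &w->lock); /* parked while toggled off */
        }
        bool stop = w->shutdown;
        pthread_mutex_unlock(&w->lock);
        if (stop) {
            break;
        }
        Cycle(id); /* cycles run asynchronously and independently per thread */
    }
    return NULL;
}

void Workers_Start(pthread_t tids[NUM_WORKERS])
{
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_init(&workers[i].lock, NULL);
        pthread_cond_init(&workers[i].wake, NULL);
        workers[i].enabled = true;
        workers[i].shutdown = false;
        pthread_create(&tids[i], NULL, WorkerLoop, &workers[i]);
    }
}

/* Toggling individual workers off gives coarse throttling,
   e.g. enabling 2 of 4 workers targets roughly half the pool's CPU use. */
void Worker_SetEnabled(Worker *w, bool on)
{
    pthread_mutex_lock(&w->lock);
    w->enabled = on;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
}
```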

@automenta

I mean, for example: you may want to borrow off-the-shelf data structures (e.g. a concurrent hashtable, concurrent skip-list, etc.). I know you don't want Boost - I'm saying don't implement these from scratch. You might as well use something already made:

https://www.reddit.com/r/cpp/comments/mzxjwk/experiences_with_concurrent_hash_map_libraries/

@jnorthrup
Member

https://keep-calm.medium.com/rules-of-structured-concurrency-in-kotlin-dad5623423a4
Ctrl-Alt-K all the things in IntelliJ. Concurrency is boring in Kotlin.

@patham9 patham9 modified the milestones: v0.9.2, v0.9.3 Oct 3, 2022