implement polite waiting #685
Script:
import matplotlib.pyplot as plt
import subprocess

# Build and benchmark the first commit (plotted as "futex").
subprocess.run(["git", "checkout", "460c5552b74da2118802f53258b4ef37a2a85601"], check=True)
subprocess.run(["cmake", "--build", ".", "--parallel"], check=True)
time = []
for i in range(1, 100):
    out = subprocess.run(["./perf-startup-fast", str(i)], check=True, capture_output=True)
    out = out.stdout.decode("utf-8")
    # Keep the number from the first "Taken: <value>" line of each run.
    for line in out.split("\n"):
        if line.startswith("Taken: "):
            time.append(float(line.split()[1]))
            break

# Build and benchmark the second commit (plotted as "nofutex").
subprocess.run(["git", "checkout", "97b76756707371dda8ae74c95274266a21d217e3"], check=True)
subprocess.run(["cmake", "--build", ".", "--parallel"], check=True)
time2 = []
for i in range(1, 100):
    out = subprocess.run(["./perf-startup-fast", str(i)], check=True, capture_output=True)
    out = out.stdout.decode("utf-8")
    for line in out.split("\n"):
        if line.startswith("Taken: "):
            time2.append(float(line.split()[1]))
            break

# Compare the two timing distributions side by side.
plt.boxplot([time, time2], labels=["futex", "nofutex"], vert=True, patch_artist=True, showmeans=True)
plt.show()
Results look great. How many cores on the test box?
It is on a 7773X with 64 cores / 128 threads.
Some additional tests on the latest Surface Pro (X Elite) with native ARM64 VS2022. (Ignore the MSYS path; I simply use their git because I didn't want to install it separately 😄.)
import matplotlib.pyplot as plt
import subprocess

# Build and benchmark the first commit (plotted as "futex").
subprocess.run(["C:\\msys64\\usr\\bin\\git", "checkout", "14c439afcd08a00c48d23df18fd17403a12cf4da"], check=True)
subprocess.run(["cmake", "--build", ".", "--config", "Release", "--parallel", "--target", "perf-startup-fast"], check=True)
time = []
for i in range(1, 1000):
    out = subprocess.run(["Release\\perf-startup-fast", str(i)], check=True, capture_output=True)
    out = out.stdout.decode("utf-8")
    # splitlines() handles Windows "\r\n" endings; split("\n\r") would not split them.
    for line in out.splitlines():
        if line.startswith("Taken: "):
            time.append(float(line.split()[1]))
            break

# Build and benchmark the second commit (plotted as "nofutex").
subprocess.run(["C:\\msys64\\usr\\bin\\git", "checkout", "97b76756707371dda8ae74c95274266a21d217e3"], check=True)
subprocess.run(["cmake", "--build", ".", "--config", "Release", "--parallel", "--target", "perf-startup-fast"], check=True)
time2 = []
for i in range(1, 1000):
    out = subprocess.run(["Release\\perf-startup-fast", str(i)], check=True, capture_output=True)
    out = out.stdout.decode("utf-8")
    for line in out.splitlines():
        if line.startswith("Taken: "):
            time2.append(float(line.split()[1]))
            break

# Compare the two timing distributions side by side.
plt.boxplot([time, time2], labels=["futex", "nofutex"], vert=True, patch_artist=True, showmeans=True)
plt.show()
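Both scripts repeat the same loop that looks for the `Taken: ` line, and they run on platforms with different line endings. A small helper along the following lines could cover both cases. This is only a sketch: the name `parse_taken` is hypothetical and not part of the repository, and it assumes the benchmark prints a line of the form `Taken: <number> ...`, as the scripts above do.

```python
from typing import Optional


def parse_taken(output: str) -> Optional[float]:
    """Return the number from the first 'Taken: <value>' line, or None if absent."""
    # splitlines() accepts "\n" and "\r\n", so one helper serves the
    # Linux and the Windows runs alike.
    for line in output.splitlines():
        if line.startswith("Taken: "):
            # The value is the second whitespace-separated token on the line.
            return float(line.split()[1])
    return None
```

With this, each measurement loop would reduce to calling `parse_taken(out)` and appending the result when it is not `None`.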
This looks great to me. Thanks for doing this.
@SchrodingerZhu, so I ran it and got a performance slowdown. This is running on an Azure F72s-class VM. I am guessing the virtualisation layer could be making the remote wake-up slower? Do you have any thoughts? In the results, futex is the version from this PR (retry 100 times) and futexN is the same version with a retry count of N.
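For readers following the retry-count discussion, here is a minimal sketch of the general spin-then-block pattern that a retry count tunes: poll a flag up to N times, then fall back to a blocking wait. It is not the code from this PR. The class name `PoliteFlag` and the `max_spins` parameter are hypothetical, and `threading.Event` is only a stand-in for a futex-style wait/wake primitive.

```python
import threading


class PoliteFlag:
    """Illustrative only: spin a bounded number of times, then block."""

    def __init__(self, max_spins: int = 100):
        # Hypothetical knob that plays the role of "retry N times".
        self.max_spins = max_spins
        # Stand-in for a futex-style wait/wake primitive (an assumption).
        self._event = threading.Event()

    def set(self) -> None:
        # Marks the flag and wakes any thread blocked in wait().
        self._event.set()

    def wait(self) -> int:
        """Wait for the flag; return how many polls failed before it was seen."""
        for spins in range(self.max_spins):
            if self._event.is_set():
                return spins
        # Spin budget exhausted: block instead of continuing to burn CPU.
        self._event.wait()
        return self.max_spins
```

A larger `max_spins` keeps the waiter polling for longer before it blocks, while a smaller one blocks sooner and so relies more on the wake-up path, which is the cost the comment above suspects is higher under virtualisation.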
Interesting. I ran it again just now under WSL on the Surface Pro and I do indeed see a slowdown. Do you have the bandwidth to try this on more devices? If the performance is unstable, I think we can make this a CMake option.
My observations:
Given that both WSL2 and the Azure VM show the degradation, maybe it is due to virtualization?
Do you want me to make this an option?
If that is okay, it would be great. Could you add a CI pipeline to cover the non-default option? I'm thinking polite by default, with disabling it as the option.
Looks like macOS 12 doesn't support this. Perhaps we need to disable it on that version.
Co-authored-by: Matthew Parkinson <[email protected]>
@mjp41 addressed all issues. PTAL.
LGTM, thanks for packaging this so nicely.
fixed formatting
Results:
I think the performance is actually slightly better on the left?
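To put a number on "slightly better" rather than reading it off a box plot, the two timing lists can be compared by their medians. The sketch below is one way to do that. The function name `compare_runs` is hypothetical, and it assumes lower timings are better, which matches how the benchmark output is used in the scripts earlier in the thread.

```python
import statistics


def compare_runs(baseline, candidate):
    """Summarise two timing samples by their medians (lower assumed better)."""
    base_med = statistics.median(baseline)
    cand_med = statistics.median(candidate)
    return {
        "baseline_median": base_med,
        "candidate_median": cand_med,
        # Above 1.0 means the candidate's median is slower than the baseline's.
        "ratio": cand_med / base_med,
    }
```

For example, `compare_runs(time2, time)` would report how the futex timings relate to the nofutex ones, using the list names from the scripts above.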