Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Per-thread cpu time #181

Open
wants to merge 1 commit into
base: v1.10.2+RAI
Choose a base branch
from
Open

Per-thread cpu time #181

wants to merge 1 commit into from

Conversation

d-netto
Copy link
Member

@d-netto d-netto commented Sep 20, 2024

PR Description

Expands on #175 by:

  • Adding per-thread start-time.
  • Using a proper macro/API to increment metrics.
  • Splitting GC time into safe-point time and proper GC time.
  • Introducing internal SpinLock time.
  • Introducing feature flag to report metrics in terms of CPU time or wall time.

Checklist

Requirements for merging:

  • I have opened an issue or PR upstream on JuliaLang/julia: <link to JuliaLang/julia>
  • I have removed the port-to-* labels that don't apply.
  • I have opened a PR on raicode to test these changes:

@github-actions github-actions bot added port-to-v1.10 This change should apply to Julia v1.10 builds port-to-master This change should apply to all future Julia builds labels Sep 20, 2024
@d-netto d-netto force-pushed the dcn-user-cpu-time branch 3 times, most recently from 4e1bf4c to 296eeb4 Compare September 23, 2024 16:40
Copy link
Member

@nickrobinson251 nickrobinson251 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work. Have you been able to test this out on some simple examples to sense-check the numbers?

base/timing.jl Outdated Show resolved Hide resolved
src/julia_locks.h Show resolved Hide resolved
src/threading.c Outdated Show resolved Hide resolved
base/timing.jl Outdated Show resolved Hide resolved
base/timing.jl Outdated Show resolved Hide resolved
src/threading.c Outdated Show resolved Hide resolved
JL_DLLEXPORT jl_task_t *jl_task_get_next(jl_value_t *trypoptask, jl_value_t *q, jl_value_t *checkempty)
{
uint64_t t0 = jl_record_time_for_tls_metric();
jl_task_t *task = _jl_task_get_next(trypoptask, q, checkempty);
jl_increment_timing_tls_metric(jl_current_task->ptls, scheduler_time, jl_record_time_for_tls_metric() - t0);
return task;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this got lost in the handoff, but the original plan was to not record the current time twice for the "fast-path" of task switching.

The concern is that adding two syscalls can meaningfully slow down task switching, and the idea was that we probably don't need to do this anyway since the time we'd be measuring in the fast-path case is trivial (we believe).

So ideally I think we'd want to only do this in the slow paths inside jl_task_get_next?

BUT to be honest, I find your approach much cleaner and clearer - and it's nice that we would get an accurate measure of that switch time which is supposed to be cheap......... AND we are going to need something like this anyway for the per-task CPU time in https://relationalai.atlassian.net/browse/RAI-30482...

So maybe if we could solve the vDSO / fast instruction problem, then we could do it this way after all?
Something to resolve CC @kpamnany.

@d-netto d-netto force-pushed the dcn-user-cpu-time branch 5 times, most recently from abe0d2e to d7c0b4c Compare September 30, 2024 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
port-to-master This change should apply to all future Julia builds port-to-v1.10 This change should apply to Julia v1.10 builds
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants