Add profiling for C calls #5

Open
saulshanabrook opened this issue Jun 4, 2020 · 7 comments

Comments

@saulshanabrook
Contributor

Currently, if a library calls another library through its C API, we are unable to trace it. This includes anything called from Cython. This is too bad, because a lot of calls to NumPy come from Cython or C libraries.
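To illustrate the gap, here is a minimal sketch using only the standard library's `sys.setprofile`. It is not this project's tracer, just an assumption-free stand-in for any Python-level hook. The helper name `record_c_calls` is made up for the example. The profiler reports a `c_call` event when Python bytecode calls a C function, but it is not notified when that C function goes on to call another C function itself.

```python
import sys


def record_c_calls(func):
    """Run func() and return the names of C functions the profiler reports."""
    seen = []

    def hook(frame, event, arg):
        # For "c_call" events, arg is the C function object being called.
        if event == "c_call":
            seen.append(getattr(arg, "__name__", repr(arg)))

    sys.setprofile(hook)
    try:
        func()
    finally:
        sys.setprofile(None)
    return seen


def demo():
    # sorted() is called from Python bytecode, so the profiler sees it.
    # abs() is then called from inside sorted's C code, one call per element.
    return sorted([3, -1, 2], key=abs)


names = record_c_calls(demo)
print(names)
```

The recorded names include `sorted` but not `abs`, even though `abs` ran three times. Calls made from Cython or C into NumPy are hidden in the same way.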

One idea for how to achieve this, from talking to @scopatz, is to use lldb's Python API. It now builds on conda-forge for macOS, so I can start exploring this.

@amueller

amueller commented Jun 4, 2020

FWIW, most calls to NumPy in sklearn are made from Python. I think our Cython code either calls BLAS more directly or implements its own logic.

@saulshanabrook
Contributor Author

Looking through the skimage codebase, I saw a bunch of code that basically just calls the normal NumPy API, but from Cython, which we totally miss. For example: https://github.com/scikit-image/scikit-image/blob/f71be82423e73cda4f3026a0eb656614db937bbc/skimage/feature/_cascade.pyx#L581-L598

@mattip

mattip commented Aug 17, 2020

PEP 578 provides C- and Python-level hooks for this kind of thing. Maybe there could be an opt-in Cython mode for this?
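For reference, a minimal sketch of the Python-level side of PEP 578 (Python 3.8+). The event name `demo.c_call` and its arguments are invented for illustration. An opt-in Cython mode would presumably raise events like this from C with `PySys_Audit`, but no such mode exists as far as this thread establishes.

```python
import sys

events = []


def audit_hook(event, args):
    # Audit hooks receive every audit event in the process, so filter
    # down to the hypothetical event name used in this example.
    if event == "demo.c_call":
        events.append((event, args))


# Note: audit hooks cannot be removed once added.
sys.addaudithook(audit_hook)

# Stand-in for what instrumented extension code might emit.
sys.audit("demo.c_call", "numpy.add", 2)

print(events)
```

The hook receives the event name and a tuple of the remaining arguments, so a tracer could record which NumPy function was called and with what.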

@saulshanabrook
Contributor Author

> Maybe there could be an opt-in Cython mode for this?

That would definitely help... It would require an upstream change to Cython, right?

Another thought would be to have Cython build in such a way that it doesn't actually unroll the Python interpreter... For debugging purposes? Not sure how hard this would be.

@mattip

mattip commented Aug 17, 2020

> Cython build in such a way that it doesn't actually unroll the Python interpreter.

I think that would have a severe performance hit, but it is worth exploring these ideas with them.

@saulshanabrook
Contributor Author

> I think that would have a severe performance hit, but it is worth exploring these ideas with them.

Cool, that would be nice to explore down the road then. This whole thing is a super severe performance hit already! So I wouldn't worry about that for our use case, although of course you would only want to build in this mode for debugging or tracing like this.

@jack-pappas

What about gathering the data using something like bpftrace / bcc? The PEP 578 audit hook @mattip mentioned is included in the static probes / tracepoints compiled into CPython (search for PyDTrace_AUDIT), so you should be able to get at it with bpftrace's usdt probe on Linux (or DTrace, if you're on a Mac).

The ustack function can be used to get all the user-mode C calls within a process; I think you'd then filter down to look for stacks containing calls to the numpy C API. uprobe / uretprobe probes can instrument specific functions so you can e.g. print out arguments and return values to numpy C API functions.
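As a rough sketch of what that could look like, the helper below only builds a bpftrace program as a string and runs nothing. The binary path, library path, and symbol name are hypothetical placeholders. It also assumes a CPython built with DTrace support, and that the target C function is an exported symbol a uprobe can attach to, which may not hold for every NumPy C API function.

```python
def bpftrace_program(python_binary, library, symbol):
    """Build a bpftrace program string; nothing is executed here."""
    return "\n".join([
        # USDT probe for CPython's audit tracepoint; arg0 is the event name.
        f'usdt:{python_binary}:python:audit {{ printf("audit %s\\n", str(arg0)); }}',
        # Count the user-mode stacks seen each time the C function is entered.
        f"uprobe:{library}:{symbol} {{ @stacks[ustack] = count(); }}",
        # Print the raw return value when the function returns.
        f'uretprobe:{library}:{symbol} {{ printf("ret 0x%lx\\n", retval); }}',
    ])


# All three arguments are placeholders, not verified paths or symbols.
program = bpftrace_program(
    "/usr/bin/python3",
    "/path/to/numpy_extension.so",
    "some_numpy_c_function",
)

# Command one might run as root against a live process (pid is hypothetical).
command = ["bpftrace", "-e", program, "-p", "12345"]
print(program)
```

Filtering the `@stacks` map for frames inside the NumPy shared object would then narrow the output to the calls of interest.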

Additional references:
