Profiling speed and memory consumption #176
jcgraciosa started this conversation in General
It is sometimes necessary to profile code to determine which parts are slow or whether memory is leaking, among other things. Here, I show various ways to profile speed and memory consumption that can be used with Underworld3:
A. Profiling speed
A.1. Using Underworld3 timing
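As a sketch of typical usage (the start/stop/print_table call names are assumptions based on the Underworld timing interface and may differ in your version; the essential point is that UW_ENABLE_TIMING must be set before the import):

```python
import os

# must be set BEFORE underworld3 is imported
os.environ["UW_ENABLE_TIMING"] = "1"

import underworld3 as uw

uw.timing.start()
# ... set up and solve the model ...
uw.timing.stop()
uw.timing.print_table()
```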
This module, described in
src/underworld3/timing.py
, allows for high-level timing operations and is mostly used when performing weak and strong scaling tests. Note that this only records timing for Underworld3 calls, and that the "UW_ENABLE_TIMING" environment variable must be set to "1" prior to importing underworld3.

A.2. Using PETSc logging features
PETSc includes features for speed profiling that are enabled by default (--with-log=1). One can use:
1. -log_view :<log_filename> to print performance data at the end of the program to <log_filename> (or to the console if no log_filename is given). For example, in parallel: mpiexec -n 8 python3 uw3_model.py -log_view :<log_filename>. This produces a .txt file that is hard to read directly. To make it more digestible, install enscript, run enscript -r -fCourier9 logfile.txt -o - | ps2pdf - logfile.pdf, and open the output PDF file.
2. -log_view :<log_filename>:ascii_flamegraph to create a flame graph output that can be visualised using speedscope, for example. Although interesting, I had difficulty interpreting the outputs obtained from this method.
3. -log_view :<log_filename>:ascii_xml to create an XML output. While this option lets you view the output in a nested format where the event hierarchy is explicit, some processes are left out and customised calls are not included.
Due to the limitations of options 2 and 3, I recommend starting profiling with option 1.
A.2.1. Customisation of events to track
Using the above options as-is profiles only the parts of the code related to PETSc. However, it is possible to define customised events (e.g. a specific subset of the code) for profiling. The easiest way to do so would be:
The profiles of the defined events are subsequently found in the generated flame graph and log file.
B. Profiling memory consumption
B.1. psutil and mem_footprint defined in Underworld3
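For context, the RSS of the current process can be queried directly with psutil, which mem_footprint presumably wraps (psutil.Process().memory_info().rss is standard psutil; this is a sketch, not Underworld3's exact code):

```python
import psutil

# resident set size (RSS) of the current process, in bytes
rss = psutil.Process().memory_info().rss
print(f"RSS: {rss / 1024**2:.1f} MiB")
```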
This function, defined in
src/underworld3/utilities/_utils.py
, calculates the resident set size (RSS) of a process. To use it, call it at the points of interest and record the returned value.

B.2. PETSc calls and how to add them
It is sometimes necessary to use PETSc calls that are not exposed in petsc4py. To do so, one can create a customised wrapper function in their own fork. As an example, we create a function for getting the current memory usage using PETSc function calls, adding the necessary lines in the appropriate places in each of the following files:
- src/underworld3/function/_function.pyx
- src/underworld3/function/__init__.py
- src/underworld3/cython/petsc_types.pxd
- src/underworld3/cython/petsc_extras.pxi
- src/underworld3/utilities/_jitextension.py
After rebuilding, you should be able to use the function petsc_memory_get_current_usage.
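For orientation only, a heavily condensed sketch of what the additions amount to; the actual code is spread across the files listed above, and the exact declarations here are assumptions (PetscMemoryGetCurrentUsage itself is a documented PETSc C function):

```cython
# in petsc_types.pxd (sketch): declare the PETSc C function
cdef extern from "petsc.h":
    ctypedef double PetscLogDouble
    ctypedef int PetscErrorCode
    PetscErrorCode PetscMemoryGetCurrentUsage(PetscLogDouble *mem)

# in _function.pyx (sketch): expose it to Python
def petsc_memory_get_current_usage():
    cdef PetscLogDouble mem = 0.0
    PetscMemoryGetCurrentUsage(&mem)
    return mem  # current resident set size, in bytes
```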
B.3. Heapy/guppy and tracemalloc
guppy3 and tracemalloc are Python 3 packages that can be used to analyse the memory heap in order to trace possible memory leaks. guppy3 is installable through pip; tracemalloc is part of the Python standard library.
To use guppy to determine changes in the memory heap:
It is then possible to view the changes in the memory heap by inspecting h. The most useful views (so far) are:
- h.bytype
- h.byrcs
- h.bysize
- h.byid
- h.byvia
One can also drill further into the heap data, for example h.byrcs[0].byvia.

Using tracemalloc follows similar steps, with the additional step of setting an environment variable beforehand (for the standard-library tracemalloc, this is PYTHONTRACEMALLOC).

B.4. Valgrind
Valgrind seems to be a powerful tool for analysing the memory usage of code. However, I currently don't know how to interpret its results. Nevertheless, for completeness, here are instructions for running it:
valgrind --tool=memcheck --leak-check=full --suppressions=valgrind-python.supp --log-file=<test.log> python3 <uw3_model.py>
. Running will take much longer than usual (up to 30x longer).

B.5. Memory profiler
Memory profiler is a module for monitoring the memory usage of a Python program as a function of time. This temporal monitoring makes it potentially useful, as it lets you watch memory consumption "in real time". However, the module appears to be no longer maintained. :( Still, to use it:
pip install memory_profiler
mprof run <uw3_model.py> --output <output_log.dat>