pfm parsing for perf stat codes per instruction #33

Merged (5 commits) on Dec 8, 2024

Conversation

@PhilipDeegan PhilipDeegan (Owner) commented Nov 9, 2024

Summary by CodeRabbit

  • New Features
    • Added command-line argument -e/--extra for enhanced performance command options.
    • Introduced new functionality for performance event checks in a specified directory.
    • Added new modules for NVIDIA Compute Sanitizer and Nsight Compute CLI.
    • New YAML configuration file for project build profiles.
    • New script to automate the setup of the libpfm4 library.
  • Bug Fixes
    • Improved error handling for command-line argument verification.
  • Documentation
    • Updated available modules in main application files.
  • Chores
    • Enhanced cleanup script for broader file removal.
    • Minor updates to logging configuration across various scripts.

coderabbitai bot commented Nov 9, 2024

Walkthrough

The changes in this pull request involve multiple updates across various files, including enhancements to command-line interfaces, logging configurations, and the introduction of new functionalities for performance monitoring and event checking. Notably, new command-line arguments have been added to several scripts, allowing for more flexible user inputs. Additionally, new files have been created for handling performance events and for integrating NVIDIA tools. The .gitignore file has been modified to include new patterns, and various scripts have been improved for better error handling and operational efficiency.

Changes

| File | Change Summary |
| --- | --- |
| .gitignore | Updated to ignore files matching *scope_timer.txt and added tpp to ignored files. |
| phlop/app/perf.py | Added -e/--extra argument to cli_args_parser for passing extra strings. |
| phlop/app/pfm/__init__.py | Added blank lines; no functional changes. |
| phlop/app/pfm/check_events.py | Introduced functions for performance event checks: parse_check_events_output, run_check_events, get_evt_perf_code. |
| phlop/app/pfm/showevtinfo.py | Added EVTInfos class for managing event information, including methods for filtering and appending. |
| phlop/app/stats_man.py | Added -s/--summary argument to cli_args_parser and updated logic in verify_cli_args. |
| phlop/run/mpirun_perf.py | Removed DEBUG logging configuration and enhanced error handling for the interval argument. |
| phlop/run/mpirun_stats_man.py | Removed DEBUG logging configuration; no other changes. |
| phlop/run/perf.py | Updated perf_stat_cmd to accept an additional options parameter. |
| phlop/testing/parallel_processor.py | Reformatted TestCaseFailure class; no functional changes. |
| sh/setup_pfm.sh | New script to automate setup of the libpfm4 library. |
| sh/test.sh | Simplified directory change command and added checks for tpp/pfm directory existence. |
| tests/_phlop/app/pfm/test_pfm.py | New script for gathering performance event information and executing commands. |
| phlop/app/nvidia/__init__.py | Added blank lines; no functional changes. |
| phlop/app/nvidia/csan.py | Introduced functions for the NVIDIA Compute Sanitizer and updated argument parsing. |
| phlop/app/nvidia/ncu.py | Enhanced Nsight Compute CLI functionality with new command-building functions and a metrics list. |
| inc/phlop/timing/threaded_scope_timer.hpp | Modified ScopeTimerMan class for better timer report management and introduced ThreadLifeWatcher. |
| phlop/app/__main__.py | Updated available_modules to include phlop.app.nvidia and phlop.app.pfm. |
| phlop/app/nvidia/__main__.py | Added available_modules for NVIDIA modules. |
| phlop/app/pfm/__main__.py | Added available_modules for PFM modules. |
| sh/clean.sh | Enhanced error handling and expanded cleanup targets. |
| mkn.pfm.yaml | New YAML configuration file defining build profiles for phlop.pfm. |
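
Taken together, the new pfm modules translate libpfm4 event descriptions into raw perf stat codes. A condensed, hedged sketch of that flow, based on the test script and check_events snippets reviewed below (the MULT_FLOPS umask is just an example event):

from phlop.app.pfm.check_events import get_evt_perf_code
from phlop.app.pfm.showevtinfo import get_evt_info
from phlop.proc import run
from phlop.string import decode_bytes

# Pick the first event exposing the MULT_FLOPS umask and translate it to a raw perf code.
code = next(
    f"{info.name}:{umask.code}"
    for info in get_evt_info()
    for key, umask in info.umask.items()
    if key == "[MULT_FLOPS]"
)
print(decode_bytes(run(f"perf stat -e {get_evt_perf_code(code)} sleep 5").stderr))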

Poem

In the land of code where rabbits hop,
New features bloom, and changes don't stop.
With timers and stats, we dance in delight,
Performance checks ready, all set for flight!
So let’s hop to it, with joy in our hearts,
For every new change, a fresh work of art! 🐇✨


@PhilipDeegan PhilipDeegan force-pushed the next branch 3 times, most recently from 6f0ed76 to 5cf7cd4, November 9, 2024 21:17
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 13

🧹 Outside diff range and nitpick comments (17)
sh/setup_pfm.sh (1)

5-5: Consider adding error handling and logging.

While set -ex provides basic error handling and debugging:

  • -e exits on error
  • -x prints commands before execution

Consider adding:

  • -u to error on undefined variables
  • -o pipefail to catch pipe failures
-set -ex
+set -euxo pipefail
sh/test.sh (2)

21-21: Improve error handling for perf command.

The current error handling assumes any failure is due to the CI environment, which could mask real issues in other environments.

Consider this more descriptive approach:

-py -Om phlop.run.perf -e="--all-user" echo yes || echo "perf failed, assumed CI"
+if ! py -Om phlop.run.perf -e="--all-user" echo yes; then
+  if [ -n "$CI" ]; then
+    echo "perf failed in CI environment (expected)"
+  else
+    echo "WARNING: perf failed in non-CI environment"
+    exit 1
+  fi
+fi

23-23: Improve setup script reference.

The comment should provide more detailed setup instructions.

Consider this more informative comment:

-# install via ./sh/setup_pfm.sh
+# Required: Install libpfm4 by running ./sh/setup_pfm.sh before running this test
tests/_phlop/app/pfm/test_pfm.py (3)

1-8: Add proper file documentation.

The file is missing a docstring explaining its purpose and usage. Consider removing the empty comments and adding a proper module-level docstring.

-#
-#
-#
-
-
+"""
+Test script for performance monitoring functionality.
+
+This script gathers performance event information and executes perf stat
+command to monitor specific performance events.
+"""
+
 from phlop.proc import run
 from phlop.string import decode_bytes

9-11: Consider moving imports to the module level.

Unless there's a specific reason for importing within the __main__ block (like avoiding circular imports), it's better to move these imports to the top of the file for better readability and consistency.

 from phlop.proc import run
 from phlop.string import decode_bytes
+from phlop.app.pfm.check_events import get_evt_perf_code
+from phlop.app.pfm.showevtinfo import get_evt_info
 
 if __name__ == "__main__":
-    from phlop.app.pfm.check_events import get_evt_perf_code
-    from phlop.app.pfm.showevtinfo import get_evt_info

9-27: Add proper unit tests for the functionality.

This script appears to be a test but lacks proper unit testing structure. Consider:

  1. Converting it into proper pytest-style tests
  2. Adding test cases for different event types
  3. Mocking the command execution for reliable testing

Would you like me to help create a proper test structure for this functionality? A rough starting point is sketched below.
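
For instance, a hedged pytest-style sketch that mocks get_evt_info instead of invoking showevtinfo; the EVTInfo/EVTUMask shapes are taken from the snippets in this review, while build_event_codes is a hypothetical helper wrapping the loop under test:

from unittest.mock import patch

from phlop.app.pfm.showevtinfo import EVTInfo, EVTUMask


def build_event_codes():
    # Mirrors the loop under review: collect "<name>:<umask code>" for events exposing MULT_FLOPS.
    from phlop.app.pfm.showevtinfo import get_evt_info

    codes = []
    for info in get_evt_info():
        if "[MULT_FLOPS]" in info.umask:
            codes += [f"{info.name}:{umask.code}" for umask in info.umask.values()]
            break
    return " ".join(codes)


def test_build_event_codes_with_mocked_evtinfo():
    fake = EVTInfo(idx="0", pmu="fake_pmu", name="RETIRED_FLOPS")
    fake.umask["[MULT_FLOPS]"] = EVTUMask(id="[MULT_FLOPS]", desc="mult", code="0x1")
    with patch("phlop.app.pfm.showevtinfo.get_evt_info", return_value=[fake]):
        assert build_event_codes() == "RETIRED_FLOPS:0x1"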

🧰 Tools
🪛 Ruff

17-17: Loop control variable key not used within loop body

Rename unused key to _key

(B007)

mkn.yaml (1)

22-30: Consider relocating source files from examples directory.

The current configuration uses files from the examples directory as main entry points. This might not be suitable for production use as example code often lacks proper error handling and production-grade robustness.

Consider:

  1. Moving these files to a dedicated src or tools directory
  2. Adapting them for production use with proper error handling and logging
  3. Updating the paths in the configuration accordingly:
-  main: tpp/pfm/examples/check_events.c
+  main: tpp/pfm/src/tools/check_events.c
-  main: tpp/pfm/examples/showevtinfo.c
+  main: tpp/pfm/src/tools/showevtinfo.c
phlop/app/pfm/check_events.py (3)

1-6: Add file documentation.

Please add a docstring describing the purpose of this module, its main functionality, and any important details about the performance event checks it handles.

 #
 #
 #
 #
 #

+"""
+Parser for performance event checks using libpfm4.
+
+This module provides functionality to:
+- Parse check_events output from libpfm4
+- Run check_events commands
+- Convert event codes to perf stat compatible format
+"""

18-18: Remove unused constant.

The check_events_start constant is not used anywhere in the code.

-check_events_start = "Total events:"

36-48: Refactor main block for better maintainability.

The main execution block could be improved in several ways:

  1. Remove commented debug code
  2. Extract the logic into a testable function
  3. Use proper error handling for the perf stat command
 if __name__ == "__main__":
     from phlop.app.pfm.showevtinfo import get_evt_info
 
-    key, code = "[MULT_FLOPS]", ""
-    for info in get_evt_info():
-        if key in info.umask:
-            code = f"{info.name}:{info.umask[key].code}"
-            break
+    def get_mult_flops_code() -> str:
+        """Get the event code for MULT_FLOPS."""
+        key = "[MULT_FLOPS]"
+        for info in get_evt_info():
+            if key in info.umask:
+                return f"{info.name}:{info.umask[key].code}"
+        raise ValueError(f"Event {key} not found")
 
-    assert code != ""
+    try:
+        code = get_mult_flops_code()
+        evt_code = get_evt_perf_code(code)
+        result = run(f"perf stat -e {evt_code} sleep 5", check=True)
+        print(result)
+    except Exception as e:
+        logger.error(f"Failed to run perf stat: {e}")
+        raise
-    # print("get_evt_perf_code", get_evt_perf_code(code))
-    print(run(f"perf stat -e {get_evt_perf_code(code)} sleep 5"))
phlop/app/pfm/showevtinfo.py (4)

1-6: Add file header documentation.

The empty comment block should include copyright information, license details, and a module docstring explaining the purpose and usage of this module.

-#
-#
-#
-#
-#
-
+# Copyright (c) 2024 CodeRabbit Inc.
+# SPDX-License-Identifier: MIT
+"""
+Parser for performance monitoring event information.
+
+This module provides functionality to parse and structure event information
+from libpfm4's showevtinfo tool output.
+"""

19-19: Follow Python naming conventions for constants.

Constants should use UPPER_CASE naming.

-EVTINFO_delimiter = "#-----------------------------"
+EVTINFO_DELIMITER = "#-----------------------------"

22-36: Enhance dataclasses with type hints and documentation.

The dataclasses should include:

  • Type hints for all fields
  • Class docstrings explaining their purpose
  • Input validation where appropriate
 @dataclass
 class EVTUMask:
+    """Represents a unit mask for a performance monitoring event."""
-    id: str
-    desc: str
-    code: str
+    id: str
+    desc: str
+    code: str

 @dataclass
 class EVTInfo:
+    """Represents detailed information about a performance monitoring event."""
-    idx: str
-    pmu: str
-    name: str
-    umask: dict = field(default_factory=lambda: {})
-    ect: dict = field(default_factory=lambda: {})
+    idx: str
+    pmu: str
+    name: str
+    umask: dict[str, EVTUMask] = field(default_factory=dict)
+    ect: dict[str, str] = field(default_factory=dict)

80-81: Enhance the main execution block.

The main block should include proper logging setup and error handling.

 if __name__ == "__main__":
-    print(get_evt_info())
+    import sys
+    logging.basicConfig(level=logging.INFO)
+    try:
+        events = get_evt_info()
+        for event in events:
+            print(f"\nEvent: {event.name} (PMU: {event.pmu})")
+            for umask_id, umask in event.umask.items():
+                print(f"  Umask {umask_id}: {umask.desc} (code: {umask.code})")
+    except Exception as e:
+        logger.error(f"Failed to get event info: {e}")
+        sys.exit(1)
phlop/run/perf.py (1)

22-26: LGTM! Consider adding type hints.

The function modification correctly implements the new options parameter and maintains clean path handling.

Consider adding type hints for better code maintainability:

-def perf_stat_cmd(cli_args, path, line, options):
+def perf_stat_cmd(cli_args: argparse.Namespace, path: Path, line: str, options: str) -> str:
phlop/app/perf.py (2)

120-120: Enhance the help text for the --extra argument.

The current help text "forward string to perf command" is too vague. Consider providing more detailed information about valid options and usage examples.

-        extra="forward string to perf command",
+        extra="additional perf command options (e.g., '-A' for per-cpu stats, '-g' for call-graph)",

Line range hint 69-70: Fix unreachable code in stat_cmd function.

The function has two return statements, with the second one being unreachable:

    return f"perf stat -j {options} {out_str(output_file)} {events_str(events)} {exe}"
    return f"perf stat -x , {options} {out_str(output_file)} {events_str(events)} {exe}"

This needs to be fixed. Consider:

  1. Removing one of the return statements, or
  2. Adding a condition to choose between JSON (-j) and CSV (-x ,) output formats, as sketched below
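
A hedged sketch of option 2; out_str and events_str are stand-ins here since only the two return lines are quoted, and the csv flag name is an assumption:

def out_str(output_file):
    # Stand-in for the module's real helper.
    return f"-o {output_file}" if output_file else ""


def events_str(events):
    # Stand-in for the module's real helper.
    return " ".join(f"-e {e}" for e in events)


def stat_cmd(exe, events, output_file, options="", csv=False):
    # Select one output format explicitly instead of leaving a second, unreachable return.
    fmt = "-x ," if csv else "-j"
    return f"perf stat {fmt} {options} {out_str(output_file)} {events_str(events)} {exe}"


print(stat_cmd("sleep 5", ["instructions"], "perf.json", options="--all-user"))
# perf stat -j --all-user -o perf.json -e instructions sleep 5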
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8810bad and 9f7f103.

📒 Files selected for processing (14)
  • .gitignore (1 hunks)
  • mkn.yaml (2 hunks)
  • phlop/app/perf.py (2 hunks)
  • phlop/app/pfm/__init__.py (1 hunks)
  • phlop/app/pfm/check_events.py (1 hunks)
  • phlop/app/pfm/showevtinfo.py (1 hunks)
  • phlop/app/stats_man.py (0 hunks)
  • phlop/run/mpirun_perf.py (0 hunks)
  • phlop/run/mpirun_stats_man.py (0 hunks)
  • phlop/run/perf.py (2 hunks)
  • phlop/testing/parallel_processor.py (1 hunks)
  • sh/setup_pfm.sh (1 hunks)
  • sh/test.sh (2 hunks)
  • tests/_phlop/app/pfm/test_pfm.py (1 hunks)
💤 Files with no reviewable changes (3)
  • phlop/app/stats_man.py
  • phlop/run/mpirun_perf.py
  • phlop/run/mpirun_stats_man.py
✅ Files skipped from review due to trivial changes (2)
  • phlop/app/pfm/__init__.py
  • phlop/testing/parallel_processor.py
🧰 Additional context used
🪛 Ruff
tests/_phlop/app/pfm/test_pfm.py

17-17: Loop control variable key not used within loop body

Rename unused key to _key

(B007)

🔇 Additional comments (8)
.gitignore (2)

9-9: LGTM! Good improvement using wildcard pattern.

The change from a specific filename to a wildcard pattern *scope_timer.txt provides better flexibility for ignoring all files ending with scope_timer.txt.


10-10: LGTM! Clear and concise ignore pattern.

The addition of tpp aligns well with the project's new directory structure mentioned in the AI summary.

sh/setup_pfm.sh (1)

1-4: LGTM: Script initialization looks good.

The script properly:

  • Uses the appropriate shebang
  • Determines the current working directory
  • Changes to the parent directory
sh/test.sh (2)

3-3: LGTM!

The combined command correctly sets and changes to the script's directory.


24-24: 🛠️ Refactor suggestion

Verify setup script and improve error handling.

The current implementation silently skips the test if the directory is missing, which might hide setup issues.

Let's verify the setup script exists and suggest improved error handling:

Consider this more robust implementation:

-[ -d "tpp/pfm" ] && py -O tests/_phlop/app/pfm/test_pfm.py || echo "pfm missing, skipped"
+if [ ! -d "tpp/pfm" ]; then
+  echo "ERROR: pfm directory not found. Please run ./sh/setup_pfm.sh first"
+  exit 1
+fi
+py -O tests/_phlop/app/pfm/test_pfm.py
mkn.yaml (1)

14-20: LGTM! Well-structured library configuration.

The static library configuration is appropriate for a performance monitoring library, with proper include path prioritization.

phlop/app/pfm/check_events.py (1)

37-37: Verify showevtinfo module integration.

Let's verify that the showevtinfo module is properly integrated and contains the required functionality.

phlop/run/perf.py (1)

53-57: Verify CLI argument handling for the 'extra' parameter.

While the code structure is clean, we should ensure that cli_args.extra is properly initialized and handled when not provided.

Let's verify the CLI argument setup:

✅ Verification successful

The verification results show that:

  1. The --extra argument is properly defined in phlop/app/perf.py with a default empty string value
  2. It's defined as a string type parameter with a default value of ""
  3. The parameter is used only once in the codebase, specifically in the reviewed code

CLI argument handling for 'extra' parameter is properly implemented

The code is safe as the --extra argument is properly defined with a default empty string value, ensuring it always exists when accessed via cli_args.extra. No additional validation is needed since the type and default value are handled by the argument parser.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the 'extra' argument is properly defined in the argument parser
# and verify its default value handling

# Look for the argument definition
rg -A 5 "add_argument.*--extra" .

# Check for any default value assignments
rg "extra.*default" .

# Look for any validation of the extra parameter
rg "cli_args.extra" .

Length of output: 628

sh/setup_pfm.sh: outdated review comment (resolved)
Comment on lines +26 to +27
    events = " ".join([f"-e {get_evt_perf_code(ev)}" for ev in code.split(" ")])
    print(decode_bytes(run(f"perf stat {events} sleep 5").stderr))

🛠️ Refactor suggestion

Improve command execution robustness and configurability.

The command execution could be improved in several ways:

  1. Make the sleep duration configurable
  2. Add error handling for the command execution
  3. Consider parsing and validating the command output instead of directly printing it
+    # Configuration
+    SLEEP_DURATION = 5  # seconds
+
     events = " ".join([f"-e {get_evt_perf_code(ev)}" for ev in code.split(" ")])
-    print(decode_bytes(run(f"perf stat {events} sleep 5").stderr))
+    try:
+        result = run(f"perf stat {events} sleep {SLEEP_DURATION}")
+        output = decode_bytes(result.stderr)
+        # TODO: Add output parsing and validation
+        print(output)
+    except Exception as e:
+        print(f"Error executing perf stat command: {e}")

Comment on lines +13 to +24
    code = ""
    key0, key1 = "[MULT_FLOPS]", "[ADD_SUB_FLOPS]"
    for info in get_evt_info():
        if key0 in info.umask:
            for key, umask in info.umask.items():
                code += f"{info.name}:{umask.code} "
            break
        # if key1 in info.umask:
        #     code += f"{info.name}:{info.umask[key1].code} "

    code = code.strip()
    assert code != ""

🛠️ Refactor suggestion

Multiple improvements needed in the event code gathering logic.

  1. The hardcoded keys make the script inflexible. Consider making them configurable.
  2. Remove commented code if it's no longer needed.
  3. The loop variable key is unused as flagged by static analysis.
-    code = ""
-    key0, key1 = "[MULT_FLOPS]", "[ADD_SUB_FLOPS]"
+    # Define events to monitor - consider moving to a config file
+    EVENTS_TO_MONITOR = ["[MULT_FLOPS]"]
+    code = ""
     for info in get_evt_info():
-        if key0 in info.umask:
-            for key, umask in info.umask.items():
+        if any(event in info.umask for event in EVENTS_TO_MONITOR):
+            for _key, umask in info.umask.items():
                 code += f"{info.name}:{umask.code} "
             break
-        # if key1 in info.umask:
-        #     code += f"{info.name}:{info.umask[key1].code} "
🧰 Tools
🪛 Ruff

17-17: Loop control variable key not used within loop body

Rename unused key to _key

(B007)

mkn.yaml: 2 outdated review comments (resolved)
Comment on lines +71 to +78
def run_evtinfo():
    with pushd(FILE_DIR.parent.parent.parent):
        return decode_bytes(run("./tpp/pfm/examples/showevtinfo").stdout).splitlines()


def get_evt_info():
    return parse_evtinfo_output(run_evtinfo())


⚠️ Potential issue

Improve error handling and configuration in execution functions.

The functions need better error handling and should avoid hard-coded paths.

+SHOWEVTINFO_PATH = "tpp/pfm/examples/showevtinfo"

 def run_evtinfo():
+    """Execute the showevtinfo command and return its output.
+    
+    Returns:
+        list[str]: Command output lines
+        
+    Raises:
+        RuntimeError: If command execution fails
+    """
     with pushd(FILE_DIR.parent.parent.parent):
-        return decode_bytes(run("./tpp/pfm/examples/showevtinfo").stdout).splitlines()
+        try:
+            result = run(f"./{SHOWEVTINFO_PATH}", check=True)
+            return decode_bytes(result.stdout).splitlines()
+        except Exception as e:
+            raise RuntimeError(f"Failed to execute {SHOWEVTINFO_PATH}: {e}") from e

 def get_evt_info():
+    """Get parsed event information.
+    
+    Returns:
+        list[EVTInfo]: List of parsed event information
+        
+    Raises:
+        RuntimeError: If event information cannot be retrieved
+    """
     return parse_evtinfo_output(run_evtinfo())

phlop/app/pfm/showevtinfo.py: outdated review comment (resolved)
Comment on lines +38 to +51
def _parse_evtinfo(bits_list):
    assert len(bits_list) >= 7
    info = EVTInfo(
        idx=bits_list[0][1].strip(),
        pmu=bits_list[1][1].strip(),
        name=bits_list[2][1].strip(),
    )
    for bits in bits_list[7:]:
        if bits[0].strip().startswith("Umask"):
            info.umask[bits[3].strip()] = EVTUMask(
                id=bits[3].strip(), desc=bits[5].strip(), code=bits[1].strip()
            )
    return info


⚠️ Potential issue

Improve error handling and documentation in _parse_evtinfo.

The function has several areas for improvement:

  1. The assert statement should include an error message
  2. The magic number 7 should be explained or made into a constant
  3. Missing error handling for malformed input
+MIN_BITS_LENGTH = 7  # Minimum required fields: idx, pmu, name, and 4 other required fields

 def _parse_evtinfo(bits_list):
+    """Parse event information from a list of bit fields.
+    
+    Args:
+        bits_list: List of strings containing event information fields
+    
+    Returns:
+        EVTInfo: Parsed event information
+        
+    Raises:
+        ValueError: If input format is invalid
+    """
-    assert len(bits_list) >= 7
+    if len(bits_list) < MIN_BITS_LENGTH:
+        raise ValueError(f"Expected at least {MIN_BITS_LENGTH} fields, got {len(bits_list)}")
+
+    try:
         info = EVTInfo(
             idx=bits_list[0][1].strip(),
             pmu=bits_list[1][1].strip(),
             name=bits_list[2][1].strip(),
         )
+    except IndexError as e:
+        raise ValueError("Invalid field format in input") from e

     for bits in bits_list[7:]:
-        if bits[0].strip().startswith("Umask"):
-            info.umask[bits[3].strip()] = EVTUMask(
-                id=bits[3].strip(), desc=bits[5].strip(), code=bits[1].strip()
-            )
+        try:
+            if bits[0].strip().startswith("Umask"):
+                info.umask[bits[3].strip()] = EVTUMask(
+                    id=bits[3].strip(),
+                    desc=bits[5].strip(),
+                    code=bits[1].strip()
+                )
+        except IndexError as e:
+            logger.warning(f"Skipping malformed umask entry: {bits}")
     return info

Comment on lines +71 to +74
def run_evtinfo():
    with pushd(FILE_DIR.parent.parent.parent):
        return decode_bytes(run("./tpp/pfm/examples/showevtinfo").stdout).splitlines()


💡 Codebase verification

Security concerns identified with external binary execution

The code attempts to execute a binary ./tpp/pfm/examples/showevtinfo that:

  • Is not present in the repository
  • Has no associated build or installation instructions
  • Uses relative path traversal with parent.parent.parent
  • Has no path validation or existence checks
  • Has no documented source or security considerations

Recommendations:

  • Add proper validation for binary existence before execution (sketched after this list)
  • Document the binary's source and installation process
  • Consider using absolute paths or environment variables instead of path traversal
  • Add error handling for cases when the binary is missing or has incorrect permissions
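
A minimal sketch covering the validation, environment-variable, and error-handling recommendations; the PHLOP_SHOWEVTINFO variable name is an assumption, not something defined by this PR:

import os
from pathlib import Path

from phlop.proc import run
from phlop.string import decode_bytes

SHOWEVTINFO = Path(os.environ.get("PHLOP_SHOWEVTINFO", "tpp/pfm/examples/showevtinfo"))


def run_evtinfo(root: Path = Path(".")):
    # Validate the binary exists and is executable before running it.
    binary = (root / SHOWEVTINFO).resolve()
    if not binary.is_file() or not os.access(binary, os.X_OK):
        raise FileNotFoundError(f"{binary} missing or not executable; build it via ./sh/setup_pfm.sh")
    return decode_bytes(run(str(binary)).stdout).splitlines()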
🔗 Analysis chain

Verify security implications of external command execution.

The code executes an external binary. Please ensure:

  1. The binary path is properly sanitized
  2. The binary's permissions are correctly set
  3. The binary's source is trusted

Let me gather more information about the binary execution and path handling.


Let me check the actual implementation and context of the binary execution.


Let me check one final aspect about the binary and its source.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check the permissions and ownership of the showevtinfo binary
fd showevtinfo --type f --exec ls -l {}

Length of output: 123


Script:

#!/bin/bash
# Check for the actual binary and its location
fd -t f "tpp/pfm/examples/showevtinfo"

# Look for related build or installation files
rg -l "showevtinfo" --type make --type cmake

# Check how the binary is used in the codebase
rg -l "run\(.*showevtinfo" -g '!*.pyc'

# Check the path handling
ast-grep --pattern 'pushd($$$)'

Length of output: 559


Script:

#!/bin/bash
# Check the implementation details of the run function
ast-grep --pattern 'def run($$$)'

# Check if there's any validation or security measures
rg -A 5 "run\(" phlop/app/pfm/showevtinfo.py

# Check the FILE_DIR definition
rg "FILE_DIR =" phlop/app/pfm/showevtinfo.py

# Look for any setup or installation instructions
fd "README" --type f --exec cat {}

Length of output: 2125


Script:

#!/bin/bash
# Check if the binary is part of the repository or expected to be built
rg -l "showevtinfo" -g "setup.*" -g "*.toml" -g "*.cmake" -g "CMakeLists.txt"

# Check for any documentation about the binary
rg -l "showevtinfo" -g "*.md" -g "*.rst"

# Check the import and dependencies
rg "^from|^import" phlop/app/pfm/showevtinfo.py

Length of output: 339

Comment on lines +137 to +138
parser.add_argument("-e", "--extra", type=str, default="", help=_help.extra)


💡 Codebase verification

⚠️ Potential issue

Extra argument needs input validation but shell injection is properly handled

The review comment's security concerns are partially valid, but the shell command execution is actually safe:

  1. The extra parameter is used in perf command construction through string formatting (as seen in stat_cmd and record_cmd functions)
  2. Command execution is handled safely through phlop/proc.py which uses subprocess.run with proper parameter handling
  3. The suggestion to move the argument up for better organization is valid

However, input validation is still recommended to ensure only valid perf options are passed. Here's a revised suggestion:

     parser.add_argument("-o", "--outfile", default=None, help=_help.outfile)
     parser.add_argument("-t", "--tool", default="stat", help=_help.tool)
+    parser.add_argument("-e", "--extra", type=str, default="",
+                        help=_help.extra,
+                        # Validate perf options
+                        choices=["", "-A", "-g", "--per-thread", "--per-core"])
     parser.add_argument("--logging", type=int, default=1, help=_help.logging)
-
-    parser.add_argument("-e", "--extra", type=str, default="", help=_help.extra)
🔗 Analysis chain

Security and validation concerns with the --extra argument.

Forwarding raw strings to shell commands without validation could pose security risks. Additionally, consider:

  1. Moving this argument up with the other related arguments for better code organization
  2. Adding input validation to prevent command injection

Let's check if the extra options are used in shell commands:

Consider applying this diff for better organization and basic validation:

     parser.add_argument("-o", "--outfile", default=None, help=_help.outfile)
     parser.add_argument("-t", "--tool", default="stat", help=_help.tool)
+    parser.add_argument("-e", "--extra", type=str, default="",
+                        help=_help.extra,
+                        # Basic validation to prevent command injection
+                        choices=["-A", "-g", "--per-thread"])
     parser.add_argument("--logging", type=int, default=1, help=_help.logging)

-    parser.add_argument("-e", "--extra", type=str, default="", help=_help.extra)
-
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for shell=True usage with the extra parameter
rg -A 5 "shell=True.*extra" 

Length of output: 29


Script:

#!/bin/bash
# Let's search more broadly for usage of the 'extra' parameter
rg -A 5 '\bextra\b' phlop/app/perf.py

# Also check for subprocess calls that might use the extra parameter
ast-grep --pattern 'subprocess.run($$$)'

Length of output: 490


Script:

#!/bin/bash
# Let's check how the extra parameter is used in command construction
rg -B 10 -A 10 'cmd.*=.*extra' phlop/app/perf.py

# Also check the verify_cli_args function to see if there's any validation
ast-grep --pattern 'def verify_cli_args($_) {
  $$$
}'

Length of output: 107


Script:

#!/bin/bash
# Let's check the entire perf.py file to understand the command construction
rg -B 5 -A 5 'subprocess|cmd.*=' phlop/app/perf.py

# Also check for any imports of subprocess or command execution related modules
rg '^from.*import|^import.*' phlop/app/perf.py

Length of output: 998


Script:

#!/bin/bash
# Let's check how the extra parameter flows through the code
rg -B 5 -A 10 'def main' phlop/app/perf.py

# Also check the proc.py module since it handles command execution
rg -l 'run.*=' phlop/proc.py && cat phlop/proc.py

Length of output: 2030

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (1)
phlop/app/nvidia/csan.py (1)

1-56: Consider architectural improvements

Both NVIDIA tool wrapper files (ncu.py and csan.py) share similar patterns and could benefit from the following improvements:

  1. Create a common base class or utilities for shared functionality
  2. Add proper error handling for command execution
  3. Add type hints for better code maintainability
  4. Add unit tests for command construction and argument validation

Would you like me to help create:

  1. A base class design for NVIDIA tools (a rough sketch follows below)
  2. Unit test templates
  3. Type hint additions
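
As a rough illustration of item 1, a shared wrapper base might look like the following; the class and method names are hypothetical, and only phlop.proc.run (used with check=True elsewhere in this review) is assumed to exist:

from dataclasses import dataclass, field

from phlop.proc import run


@dataclass
class NvidiaToolBase:
    binary: str                      # e.g. "ncu" or "compute-sanitizer"
    options: list = field(default_factory=list)

    def build_cmd(self, exe: str) -> str:
        # Join the tool binary, its options, and the target executable into one command line.
        return " ".join([self.binary, *self.options, exe])

    def exec(self, exe: str):
        try:
            return run(self.build_cmd(exe), check=True)
        except Exception as e:
            raise RuntimeError(f"{self.binary} failed: {e}") from e

Both ncu.py and csan.py could then subclass this and only override command construction and argument verification.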
🧰 Tools
🪛 Ruff (0.8.0)

11-11: os imported but unused

Remove unused import: os

(F401)


14-14: phlop.proc.run_mp imported but unused

Remove unused import: phlop.proc.run_mp

(F401)


23-23: Redefinition of unused run from line 14

(F811)


24-24: Local variable env is assigned to but never used

Remove assignment to unused variable env

(F841)


25-25: Local variable cmd is assigned to but never used

Remove assignment to unused variable cmd

(F841)


25-25: f-string without any placeholders

Remove extraneous f prefix

(F541)


26-26: Undefined name record_cmd

(F821)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 9f7f103 and e5a03fe.

📒 Files selected for processing (6)
  • phlop/app/nvidia/__init__.py (1 hunks)
  • phlop/app/nvidia/csan.py (1 hunks)
  • phlop/app/nvidia/ncu.py (1 hunks)
  • phlop/app/perf.py (2 hunks)
  • phlop/app/pfm/showevtinfo.py (1 hunks)
  • sh/setup_pfm.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • phlop/app/nvidia/__init__.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • sh/setup_pfm.sh
  • phlop/app/perf.py
  • phlop/app/pfm/showevtinfo.py
🧰 Additional context used
🪛 Ruff (0.8.0)
phlop/app/nvidia/ncu.py

11-11: os imported but unused

Remove unused import: os

(F401)


14-14: phlop.proc.run_mp imported but unused

Remove unused import: phlop.proc.run_mp

(F401)


23-23: Redefinition of unused run from line 14

(F811)


24-24: Local variable env is assigned to but never used

Remove assignment to unused variable env

(F841)


25-25: f-string without any placeholders

Remove extraneous f prefix

(F541)

phlop/app/nvidia/csan.py

11-11: os imported but unused

Remove unused import: os

(F401)


14-14: phlop.proc.run_mp imported but unused

Remove unused import: phlop.proc.run_mp

(F401)


23-23: Redefinition of unused run from line 14

(F811)


24-24: Local variable env is assigned to but never used

Remove assignment to unused variable env

(F841)


25-25: Local variable cmd is assigned to but never used

Remove assignment to unused variable cmd

(F841)


25-25: f-string without any placeholders

Remove extraneous f prefix

(F541)


26-26: Undefined name record_cmd

(F821)

🔇 Additional comments (1)
phlop/app/nvidia/csan.py (1)

47-47: ⚠️ Potential issue

Fix undefined help text

The CLI parser references an undefined help text 'infiles'.

-    parser.add_argument("-i", "--infiles", default=None, help=_help.infiles)
+    parser.add_argument("-i", "--infiles", default=None, help="input files to process")

Likely invalid or redundant comment.

phlop/app/nvidia/ncu.py: 3 outdated review comments (resolved)
Comment on lines +16 to +20
metrics = [
    "all",
    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",  # read
    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",  # wrte
]

⚠️ Potential issue

Remove inappropriate metrics list

The metrics list appears to be copied from ncu.py but is not relevant for the Compute Sanitizer, which focuses on memory checking and race detection rather than performance metrics.

-metrics = [
-    "all",
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",  # read
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",  # wrte
-]

Comment on lines +55 to +56
def verify_cli_args(cli_args):
    return cli_args

🛠️ Refactor suggestion

Implement proper argument verification

The verify_cli_args function should validate the tool type against supported sanitizer tools.

 def verify_cli_args(cli_args):
+    """Verify command line arguments.
+    
+    Args:
+        cli_args: Parsed command line arguments
+    
+    Returns:
+        Verified arguments
+    
+    Raises:
+        ValueError: If tool type is not supported
+    """
+    supported_tools = ['memcheck', 'racecheck', 'initcheck', 'synccheck']
+    if cli_args.tool and cli_args.tool not in supported_tools:
+        raise ValueError(f"Unsupported tool type. Must be one of: {supported_tools}")
     return cli_args

phlop/app/nvidia/csan.py: outdated review comment (resolved)
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e5a03fe and 2176713.

📒 Files selected for processing (1)
  • inc/phlop/timing/threaded_scope_timer.hpp (6 hunks)
🔇 Additional comments (5)
inc/phlop/timing/threaded_scope_timer.hpp (5)

59-59: Cleanup of thread_reports during shutdown

Good to see that thread_reports is being cleared during shutdown to prevent resource leaks.


145-145: Adding thread_reports vector to manage report lifetimes

The addition of the thread_reports vector ensures that RunTimerReport instances are kept alive by maintaining shared ownership. This aids in proper lifetime management across threads.


172-173: Updating k and f to std::string for safe ownership

Changing the member variables k and f from std::string_view to std::string ensures that RunTimerReport owns its data, preventing dangling references if the original strings go out of scope. This improves safety at the cost of additional memory usage due to string copying.


195-200: Introducing ThreadLifeWatcher for automatic report management

The ThreadLifeWatcher struct ensures that the RunTimerReport is moved to ScopeTimerMan upon destruction. This practice automatically handles resource management when a thread terminates, preventing potential leaks.


380-386: Improved memory management in PHLOP_SCOPE_TIMER macro

The updated PHLOP_SCOPE_TIMER macro now uses std::make_shared to create RunTimerReport instances and introduces ThreadLifeWatcher to manage their lifetimes. This ensures proper memory management and cleanup of timer reports.

Comment on lines +136 to +140
    void move(std::shared_ptr<RunTimerReport>& report)
    {
        std::unique_lock<std::mutex> lk(work_);
        thread_reports.emplace_back(std::move(report));
    }

🛠️ Refactor suggestion

Consider passing report by value to clarify ownership transfer

In the void move(std::shared_ptr<RunTimerReport>& report) method, report is moved into thread_reports, which can leave the original shared pointer in a moved-from state. To make ownership transfer explicit and safer, consider passing report by value.

Apply this diff to update the method signature and usage:

-    void move(std::shared_ptr<RunTimerReport>& report)
+    void move(std::shared_ptr<RunTimerReport> report)
     {
         std::unique_lock<std::mutex> lk(work_);
         thread_reports.emplace_back(std::move(report));
     }

Comment on lines +283 to +284
assert(c);
assert(c->self);

⚠️ Potential issue

Replace assert with explicit null checks for safety in release builds

Using assert to check for null pointers means these checks are disabled in release builds where NDEBUG is defined. This may result in null pointer dereferences. Consider replacing assert with explicit null checks to ensure safety in all build configurations.

Apply this diff to handle null pointers explicitly:

-    assert(c);
-    assert(c->self);
+    if (!c || !c->self) {
+        // Handle the error appropriately, e.g., log the error or throw an exception
+        throw std::runtime_error("Null pointer encountered in recurse_traces_for_keys");
+    }

Committable suggestion skipped: line range outside the PR's diff.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (10)
phlop/app/nvidia/__main__.py (1)

1-6: Remove or utilize empty comment lines

The empty comment lines at the top of the file serve no purpose. Consider either removing them or adding meaningful documentation such as a module docstring explaining the purpose of this nvidia module.

-#
-#
-#
-#
-#
-
-
+"""
+NVIDIA module entry point.
+
+This module provides access to NVIDIA-specific functionality including:
+- csan: CUDA Sanitizer functionality
+- ncucsan: NVIDIA CUDA C Sanitizer functionality
+"""
sh/clean.sh (2)

6-6: Consider protecting against accidental deletion of important directories

While the cleanup targets are appropriate (cache and build artifacts), the script could benefit from additional safety measures.

Consider adding these improvements:

-  __pycache__ .ruff_cache phlop.egg-info dist
+  "__pycache__" ".ruff_cache" "phlop.egg-info" "dist"

Also, consider adding these safety checks before the cleanup loop:

# Ensure we're in the correct directory
if [[ ! -f "setup.py" ]] || [[ ! -d "sh" ]]; then
    echo "Error: Must be run from repository root" >&2
    exit 1
fi

# Backup check for critical paths
for RM in ${RM_RF[@]}; do
    if [[ "$RM" == "/" ]] || [[ "$RM" == "." ]] || [[ "$RM" == ".." ]]; then
        echo "Error: Invalid cleanup target: $RM" >&2
        exit 1
    fi
done

Line range hint 8-10: Consider using safer cleanup alternatives

The current cleanup approach using find with xargs could be improved for better error handling and safety.

Consider this safer alternative:

-for RM in ${RM_RF[@]}; do
-  find . -name "${RM}" | xargs rm -rf
-done
+for RM in "${RM_RF[@]}"; do
+  while IFS= read -r -d '' file; do
+    rm -rf "$file"
+  done < <(find . -name "${RM}" -print0)
+done

This improvement:

  1. Handles filenames with spaces correctly
  2. Processes null-terminated filenames safely
  3. Provides better control over the deletion process
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 9-9: Double quote array expansions to avoid re-splitting elements.

(SC2068)

phlop/app/nvidia/csan.py (1)

1-11: Enhance documentation with complete examples and usage information.

The documentation should include:

  • Complete list of supported sanitizer tools
  • Full example commands with sanitizer options
  • Expected output format
 # compute sanitizer frontend

 # https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html

 ## samples
 #  compute-sanitizer --tool memcheck [sanitizer_options] app_name [app_options]
 #  compute-sanitizer --tool racecheck [sanitizer_options] app_name [app_options]
-#
-#
-#
+# Example sanitizer options:
+#  --leak-check full    # For memcheck
+#  --print-level <N>    # Set output verbosity (1-4)
+#  --save <file>        # Save output to file
phlop/app/pfm/showevtinfo.py (4)

1-6: Add proper file documentation and license information.

The file header is empty. Please add:

  • Copyright notice
  • License information
  • Module docstring describing the purpose and usage of this module

19-19: Follow Python naming conventions for constants.

Rename EVTINFO_delimiter to EVTINFO_DELIMITER to follow Python's naming convention for constants.

-EVTINFO_delimiter = "#-----------------------------"
+EVTINFO_DELIMITER = "#-----------------------------"

22-55: Add type hints and documentation to data classes.

The data classes are well-structured but lack documentation. Please add:

  • Class docstrings explaining the purpose of each class
  • Method docstrings with parameters and return types
  • Type hints for the data field in EVTInfos
 @dataclass
 class EVTUMask:
+    """Represents an event umask with its properties."""
     id: str
     desc: str
     code: str

 @dataclass
 class EVTInfo:
+    """Represents performance event information."""
     idx: str
     pmu: str
     name: str
     umask: dict[str, EVTUMask] = field(default_factory=lambda: {})
     etc: dict[str, str] = field(default_factory=lambda: {})

 @dataclass
 class EVTInfos:
+    """Collection of performance event information with filtering capabilities."""
-    data: list = field(default_factory=lambda: [])
+    data: list[EVTInfo] = field(default_factory=lambda: [])

     def umasks(self) -> 'EVTInfos':
+        """Return events that have umasks defined."""
         return EVTInfos(data=[d for d in self.data if d.umask])

     def umasks_in(self, needle: str) -> 'EVTInfos':
+        """Return events that have the given needle string in their umask keys."""
         return EVTInfos(data=[d for d in self.data if any(needle in k for k in d.umask)])

     def append(self, ev: EVTInfo) -> None:
+        """Append an event to the collection."""
         self.data.append(ev)

99-102: Enhance the main block with proper CLI handling.

The main block could be improved with argument parsing and error handling.

 if __name__ == "__main__":
     import json
+    import argparse
+    import sys
+
+    parser = argparse.ArgumentParser(description="Show event information in JSON format")
+    parser.add_argument("--pretty", action="store_true", help="Pretty print JSON output")
+    args = parser.parse_args()
+
+    try:
+        info = get_evt_info()
+        json_kwargs = {"indent": 2} if args.pretty else {}
+        print(json.dumps(asdict(info), **json_kwargs))
+    except Exception as e:
+        print(f"Error: {e}", file=sys.stderr)
+        sys.exit(1)
-    print(json.dumps(asdict(get_evt_info()), tabs=2))
phlop/app/nvidia/ncu.py (2)

1-11: Enhance documentation structure

Consider improving the documentation by:

  1. Adding a brief module description
  2. Formatting the sample commands as a code block
  3. Adding expected output or common usage patterns

Apply this diff to improve the documentation:

 # Nsight Compute CLI
+"""
+Module for interfacing with NVIDIA Nsight Compute CLI (ncu).
+Provides functionality to profile CUDA applications and collect performance metrics.
+"""
 
 # https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html
 
-## samples
-#  ncu --help
-#  ncu --metrics all
-#  ncu --metrics l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum
-#  ncu --target-processes all -o <report-name> mpirun [mpi arguments] <app> [app arguments]
-#
+# Usage Examples:
+# ```bash
+#   ncu --help
+#   ncu --metrics all
+#   ncu --metrics l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum
+#   ncu --target-processes all -o <report-name> mpirun [mpi arguments] <app> [app arguments]
+# ```

16-20: Document metrics and consider performance implications

The metrics list should include documentation for each metric and a warning about using "all".

Apply this diff to improve the metrics documentation:

 metrics = [
-    "all",
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",  # read
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",  # wrte
+    "all",  # Collects all available metrics (WARNING: resource intensive)
+    # L1 Texture Cache Metrics
+    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",  # Shared memory load bank conflicts
+    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",  # Shared memory store bank conflicts
 ]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2176713 and aa3b325.

📒 Files selected for processing (7)
  • phlop/app/__main__.py (1 hunks)
  • phlop/app/nvidia/__main__.py (1 hunks)
  • phlop/app/nvidia/csan.py (1 hunks)
  • phlop/app/nvidia/ncu.py (1 hunks)
  • phlop/app/pfm/__main__.py (1 hunks)
  • phlop/app/pfm/showevtinfo.py (1 hunks)
  • sh/clean.sh (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • phlop/app/pfm/__main__.py
  • phlop/app/__main__.py
🔇 Additional comments (6)
sh/clean.sh (2)

2-2: LGTM: Enhanced error handling and debugging

The change from set -e to set -ex improves debugging by printing commands before execution while maintaining strict error handling.
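
For readers unfamiliar with the flag pair, a tiny illustration of the behaviour (not taken from the repository's scripts): set -x echoes each command before it runs, and set -e aborts the script on the first non-zero exit status.

#!/usr/bin/env bash
# illustration only; not the project's clean.sh
set -ex
rm -rf __pycache__   # traced first as "+ rm -rf __pycache__", then executed
false                # traced, then the script stops here with exit status 1
echo "never reached"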


Line range hint 1-10: Verify the impact of cleanup on the build process

Let's ensure the cleanup script doesn't remove any files that might be needed by the build process.

✅ Verification successful

Let me gather more information about the build configuration and dependencies to ensure we have a complete picture.


Let me check the content of pyproject.toml and requirements.txt to complete our verification of build-related configurations.


The cleanup script safely removes only build artifacts and caches

The cleanup targets (__pycache__, .ruff_cache, phlop.egg-info, dist) are standard Python build artifacts and caches that are safely regenerated during the build process; a minimal cleanup sketch follows the list below. This is verified by:

  • The build configuration in pyproject.toml and setup.py shows a standard Python package setup
  • The cleaned directories are not referenced as source locations in the build configs
  • No custom build process or special handling of these directories is defined
  • All cleaned paths are standard Python temporary/generated directories
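
For reference, removing the targets above typically reduces to something like the sketch below; the repository's actual clean.sh may differ in options and layout.

#!/usr/bin/env bash
# minimal sketch, assuming the standard targets listed above
set -ex
rm -rf dist phlop.egg-info .ruff_cache
# remove every __pycache__ directory in the tree
find . -type d -name "__pycache__" -exec rm -rf {} +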
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any build-related files that might be affected by cleanup

# Check if any of the cleanup targets are referenced in build files
echo "Checking build file references..."
for target in __pycache__ .ruff_cache phlop.egg-info dist; do
    echo "Checking references to: $target"
    rg -l "$target" "*.yaml" "*.toml" "setup.py" || true
done

# Check for any additional build directories that should be cleaned
echo "Checking for other potential cleanup targets..."
fd -t d -d 2 "build|target|\.cache|\.egg-info"

Length of output: 1374


Script:

#!/bin/bash
# Check for build configuration files and their content
echo "Checking for build configuration files..."
fd -t f "setup.py|pyproject.toml|setup.cfg|requirements.txt"

# If setup.py exists, let's see its content
echo -e "\nChecking setup.py content if it exists..."
if [ -f "setup.py" ]; then
    cat setup.py
fi

# Check for any Python package configuration
echo -e "\nChecking for package configuration..."
fd -t f "requirements.*|Pipfile|poetry.lock"

# Look for any CI/build scripts
echo -e "\nChecking for CI/build scripts..."
fd -t f "^build.*\.sh$|^install.*\.sh$"

Length of output: 828


Script:

#!/bin/bash
# Check contents of remaining build files
echo "Checking pyproject.toml content..."
cat pyproject.toml

echo -e "\nChecking requirements.txt content..."
cat requirements.txt

Length of output: 789

🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 9-9: Double quote array expansions to avoid re-splitting elements.

(SC2068)

phlop/app/nvidia/csan.py (2)

16-20: ⚠️ Potential issue

Remove inappropriate metrics list

The metrics list appears to be copied from ncu.py but is not relevant for the Compute Sanitizer, which focuses on memory checking and race detection rather than performance metrics.

-metrics = [
-    "all",
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",  # read
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",  # wrte
-]

57-58: ⚠️ Potential issue

Implement proper argument verification

The verify_cli_args function should validate the tool type against supported sanitizer tools.

 def verify_cli_args(cli_args):
+    """Verify command line arguments.
+    
+    Args:
+        cli_args: Parsed command line arguments
+    
+    Returns:
+        Verified arguments
+    
+    Raises:
+        ValueError: If tool type is not supported
+    """
+    supported_tools = ['memcheck', 'racecheck', 'initcheck', 'synccheck']
+    if cli_args.tool and cli_args.tool not in supported_tools:
+        raise ValueError(f"Unsupported tool type. Must be one of: {supported_tools}")
     return cli_args
phlop/app/pfm/showevtinfo.py (2)

57-88: 🛠️ Refactor suggestion

Improve error handling and documentation in parsing functions.

The parsing functions need improvements in error handling and documentation.

+def _parse_evtinfo(bits_list: list[list[str]]) -> EVTInfo:
+    """Parse event information from a list of bit fields.
+    
+    Args:
+        bits_list: List of field pairs containing event information
+    
+    Returns:
+        EVTInfo: Parsed event information
+        
+    Raises:
+        ValueError: If input format is invalid
+    """
-    assert len(bits_list) >= 7
+    if len(bits_list) < 7:
+        raise ValueError(f"Expected at least 7 fields, got {len(bits_list)}")
+
+    try:
         info = EVTInfo(
             idx=bits_list[0][1].strip(),
             pmu=bits_list[1][1].strip(),
             name=bits_list[2][1].strip(),
         )
+    except (IndexError, AttributeError) as e:
+        raise ValueError(f"Invalid field format in input: {e}") from e

     for bits in bits_list[7:]:
+        try:
             if bits[0].strip().startswith("Umask"):
                 info.umask[bits[3].strip()] = EVTUMask(
                     id=bits[3].strip(),
                     desc=bits[5].strip(),
                     code=bits[1].strip(),
                 )
+        except (IndexError, AttributeError) as e:
+            logger.warning(f"Skipping malformed umask entry: {bits}")
     return info

+def parse_evtinfo_output(lines: list[str]) -> EVTInfos:
+    """Parse multiple event information entries from output lines.
+    
+    Args:
+        lines: List of strings containing event information
+        
+    Returns:
+        EVTInfos: Collection of parsed event information
+    """
+    if not lines:
+        logger.warning("Empty input received")
+        return EVTInfos()

     start_idx = 0
     for line in lines:
         start_idx += 1
-        if line.strip() == EVTINFO_delimiter:
+        if line.strip() == EVTINFO_DELIMITER:
             break
+    else:
+        logger.error("No delimiter found in input")
+        return EVTInfos()

     bits_list, results = [], EVTInfos()
     for line in lines[start_idx:]:
-        if line == EVTINFO_delimiter:
+        if line == EVTINFO_DELIMITER:
+            try:
                 results.append(_parse_evtinfo(bits_list))
+            except ValueError as e:
+                logger.error(f"Failed to parse event info: {e}")
             bits_list = []
             continue
         bits_list.append(line.strip().split(":"))

     return results

90-97: ⚠️ Potential issue

Address security concerns with external binary execution.

The execution of the external binary needs proper validation and error handling.

+SHOWEVTINFO_PATH = Path("tpp/pfm/examples/showevtinfo")

-def run_evtinfo():
+def run_evtinfo() -> list[str]:
+    """Execute the showevtinfo command and return its output.
+    
+    Returns:
+        list[str]: Command output lines
+        
+    Raises:
+        RuntimeError: If command execution fails
+    """
     with pushd(FILE_DIR.parent.parent.parent):
-        return decode_bytes(run("./tpp/pfm/examples/showevtinfo").stdout).splitlines()
+        binary_path = SHOWEVTINFO_PATH.resolve()
+        if not binary_path.exists():
+            raise RuntimeError(f"Binary not found: {binary_path}")
+        if not os.access(binary_path, os.X_OK):
+            raise RuntimeError(f"Binary not executable: {binary_path}")
+        try:
+            result = run(f"./{SHOWEVTINFO_PATH}", check=True)
+            return decode_bytes(result.stdout).splitlines()
+        except Exception as e:
+            raise RuntimeError(f"Failed to execute {SHOWEVTINFO_PATH}: {e}") from e

-def get_evt_info():
+def get_evt_info() -> EVTInfos:
+    """Get parsed event information.
+    
+    Returns:
+        EVTInfos: Collection of parsed event information
+        
+    Raises:
+        RuntimeError: If event information cannot be retrieved
+    """
     return parse_evtinfo_output(run_evtinfo())

Comment on lines +34 to +41
    _help = ValDict(
        dir="working directory",
        quiet="Redirect output to /dev/null",
        logging="0=off, 1=on non zero exit code, 2=always",
        outfile="path for saved file if active",
        tool="",
        extra="forward string to csan command",
    )

⚠️ Potential issue

Improve help text definitions

The help texts need improvement and some are undefined.

Apply this diff to fix the help texts:

     _help = ValDict(
         dir="working directory",
         quiet="Redirect output to /dev/null",
         logging="0=off, 1=on non zero exit code, 2=always",
         outfile="path for saved file if active",
-        tool="",
-        extra="forward string to csan command",
+        tool="ncu tool type (e.g., profile, stats)",
+        extra="additional arguments to pass to ncu command",
+        infiles="input files to process",
     )

Comment on lines +57 to +58
def verify_cli_args(cli_args):
    return cli_args

⚠️ Potential issue

Implement proper CLI arguments verification

The current verification function doesn't perform any checks.

Consider this improved implementation:

 def verify_cli_args(cli_args):
-    return cli_args
+    if not cli_args.remaining:
+        raise ValueError("No command specified to profile")
+    
+    if cli_args.outfile and not cli_args.outfile.endswith(('.ncu-rep', '.nsight-cuprof-report')):
+        raise ValueError("Output file must have .ncu-rep or .nsight-cuprof-report extension")
+    
+    return cli_args

Comment on lines +23 to +29
def build_command(cli_args):
    return f"ncu {cli_args.remaining}"


def exec(cli_args):
    return run(build_command(cli_args), check=True)


⚠️ Potential issue

Enhance command building and execution robustness

The current implementation lacks:

  1. Validation of cli_args.remaining
  2. Error handling for command execution
  3. Support for metrics from the defined list

Consider this improved implementation:

 def build_command(cli_args):
-    return f"ncu {cli_args.remaining}"
+    if not cli_args.remaining:
+        raise ValueError("No command specified to profile")
+    
+    cmd_parts = ["ncu"]
+    if cli_args.outfile:
+        cmd_parts.extend(["-o", cli_args.outfile])
+    if cli_args.extra:
+        cmd_parts.append(cli_args.extra)
+    cmd_parts.extend(cli_args.remaining)
+    return " ".join(cmd_parts)

 def exec(cli_args):
-    return run(build_command(cli_args), check=True)
+    try:
+        cmd = build_command(cli_args)
+        return run(cmd, check=True)
+    except Exception as e:
+        raise RuntimeError(f"Failed to execute ncu command: {e}") from e

Committable suggestion skipped: line range outside the PR's diff.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)
mkn.pfm.yaml (1)

18-19: Review the include path format

The include path format tpp/pfm/lib, 0 appears unusual. Consider using a standard YAML sequence format if multiple paths are needed.

-  inc: tpp/pfm/lib, 0
+  inc: 
+    - tpp/pfm/lib
+    - "0"
phlop/app/nvidia/ncu.py (1)

1-11: Enhance documentation structure

Consider restructuring the documentation to include:

  • Purpose and overview of the module
  • Description of key functions and their usage
  • Example usage patterns with expected outputs

Apply this diff to improve documentation:

 # Nsight Compute CLI
+"""
+NVIDIA Nsight Compute CLI wrapper for performance profiling.
+
+This module provides functionality to:
+- Execute NVIDIA Nsight Compute (ncu) commands
+- Profile CUDA applications
+- Collect performance metrics
+
+Example usage:
+    cli_args = cli_args_parser().parse_args()
+    exec(cli_args)  # Runs: ncu --metrics all my_cuda_app
+"""

 # https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html

-## samples
-#  ncu --help
-#  ncu --metrics all
-#  ncu --metrics l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum
-#  ncu --target-processes all -o <report-name> mpirun [mpi arguments] <app> [app arguments]
-#
+# Usage examples:
+#   ncu --help
+#   ncu --metrics all
+#   ncu --metrics l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum
+#   ncu --target-processes all -o <report-name> mpirun [mpi arguments] <app> [app arguments]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between aa3b325 and 940c8ce.

📒 Files selected for processing (5)
  • mkn.pfm.yaml (1 hunks)
  • phlop/app/nvidia/__main__.py (1 hunks)
  • phlop/app/nvidia/csan.py (1 hunks)
  • phlop/app/nvidia/ncu.py (1 hunks)
  • sh/setup_pfm.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • phlop/app/nvidia/__main__.py
  • sh/setup_pfm.sh
  • phlop/app/nvidia/csan.py
🔇 Additional comments (4)
mkn.pfm.yaml (2)

5-7: LGTM: Project configuration is well-structured

The project name and inheritance hierarchy are clearly defined.


1-4: Verify compiler compatibility and setup script execution

The build configuration requires C++20 support and a setup script. Please ensure:

  1. The target environment's compiler supports C++20
  2. The setup script ./sh/setup_pfm.sh exists and is executable
✅ Verification successful

Let me verify the C++20 usage more thoroughly by checking the actual source files.


Let me check the content of the C++ files to verify if they contain any C++20-specific features that would justify the C++20 requirement.


C++20 requirement is justified for this codebase

The codebase uses several modern C++ features that benefit from C++20, including:

  • Thread-local storage with guaranteed destruction order
  • Extensive use of std::string_view
  • Parameter pack expansion with fold expressions
  • Complex template metaprogramming
  • Modern concurrency primitives with mutexes and thread safety

Additionally, the setup script ./sh/setup_pfm.sh exists and is executable, satisfying the prerequisite mentioned in the configuration.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify setup script existence and compiler C++20 support

# Check if setup script exists and is executable
if [ -f "./sh/setup_pfm.sh" ]; then
  echo "Setup script exists"
  if [ -x "./sh/setup_pfm.sh" ]; then
    echo "Setup script is executable"
  else
    echo "Setup script is not executable"
  fi
else
  echo "Setup script not found"
fi

# Search for any C++20 specific features in the codebase
rg -l "concept|requires|constinit|consteval|co_await|co_yield|co_return"

Length of output: 263


Script:

#!/bin/bash
# Search for C++ source files
fd -e cpp -e hpp -e h

# Check content of pyproject.toml for context of why it matched
cat pyproject.toml

# Look for specific C++20 features in source files
rg -A 2 "concept|requires|constinit|consteval|co_await|co_yield|co_return" --type cpp

Length of output: 902


Script:

#!/bin/bash
# Check content of key source files
echo "=== scope_timer.hpp ==="
cat inc/phlop/timing/scope_timer.hpp

echo -e "\n=== threaded_scope_timer.hpp ==="
cat inc/phlop/timing/threaded_scope_timer.hpp

# Look for other modern C++ features that might require C++20
rg -l "template|constexpr|auto|decltype|nullptr|static_assert|using|noexcept" --type cpp

Length of output: 19139

phlop/app/nvidia/ncu.py (2)

23-29: ⚠️ Potential issue

Improve command building and execution robustness

The current implementation needs enhancement for:

  1. Command validation and sanitization
  2. Support for metrics and output file
  3. Proper error handling and logging

Apply this diff to improve the implementation:

-def build_command(cli_args):
-    return f"ncu {cli_args.remaining}"
+def build_command(cli_args) -> str:
+    """Build ncu command with proper argument handling.
+    
+    Args:
+        cli_args: Parsed command line arguments
+        
+    Returns:
+        Constructed ncu command string
+        
+    Raises:
+        ValueError: If required arguments are missing
+    """
+    if not cli_args.remaining:
+        raise ValueError("No command specified to profile")
+
+    cmd_parts = ["ncu"]
+    
+    # Add output file if specified
+    if cli_args.outfile:
+        if not cli_args.outfile.endswith('.ncu-rep'):
+            raise ValueError("Output file must have .ncu-rep extension")
+        cmd_parts.extend(["-o", cli_args.outfile])
+    
+    # Add metrics if specified in extra args
+    if cli_args.extra:
+        metric_args = [arg for arg in cli_args.extra.split() if arg.startswith('--metrics')]
+        if metric_args:
+            metrics_list = metric_args[0].split('=')[1].split(',')
+            validate_metrics(metrics_list)
+    
+    # Add extra arguments
+    if cli_args.extra:
+        cmd_parts.append(cli_args.extra)
+    
+    # Add the command to profile
+    cmd_parts.extend(cli_args.remaining)
+    
+    return " ".join(cmd_parts)

-def exec(cli_args):
-    return run(build_command(cli_args), check=True)
+def exec(cli_args):
+    """Execute ncu command with proper error handling.
+    
+    Args:
+        cli_args: Parsed command line arguments
+        
+    Returns:
+        CompletedProcess instance
+        
+    Raises:
+        RuntimeError: If command execution fails
+    """
+    try:
+        cmd = build_command(cli_args)
+        return run(cmd, check=True)
+    except Exception as e:
+        raise RuntimeError(f"Failed to execute ncu command: {e}") from e

Likely invalid or redundant comment.


56-57: ⚠️ Potential issue

Implement proper CLI arguments verification

The current implementation doesn't perform any validation.

Apply this diff to add proper validation:

 def verify_cli_args(cli_args):
-    return cli_args
+    """Verify and validate command line arguments.
+    
+    Args:
+        cli_args: Parsed command line arguments
+        
+    Returns:
+        Validated command line arguments
+        
+    Raises:
+        ValueError: If arguments are invalid
+    """
+    if cli_args.list_metrics:
+        for metric in metrics:
+            print(f"{metric.name}: {metric.description}")
+        exit(0)
+    
+    if not cli_args.remaining:
+        raise ValueError("No command specified to profile")
+    
+    if cli_args.outfile and not cli_args.outfile.endswith('.ncu-rep'):
+        raise ValueError("Output file must have .ncu-rep extension")
+    
+    if cli_args.extra:
+        metric_args = [arg for arg in cli_args.extra.split() if arg.startswith('--metrics')]
+        if metric_args:
+            metrics_list = metric_args[0].split('=')[1].split(',')
+            validate_metrics(metrics_list)
+    
+    return cli_args

Likely invalid or redundant comment.

Comment on lines +12 to +29
- name: pfm
  inc: tpp/pfm/include

- name: pfm_lib
  parent: pfm
  mode: static
  inc: tpp/pfm/lib, 0
  src: tpp/pfm/lib

- name: pfm_events
  self: pfm_lib
  main: tpp/pfm/examples/check_events.c
  out: check_events

- name: pfm_info
  self: pfm_lib
  main: tpp/pfm/examples/showevtinfo.c
  out: showevtinfo

💡 Codebase verification

Missing directories and version constraint needed

The verification reveals two issues:

  • None of the referenced directories (tpp/pfm/include, tpp/pfm/lib, tpp/pfm/examples) exist in the repository
  • The setup script clones libpfm4 without specifying a version, which could lead to build reproducibility issues

Recommendations:

  • Run the setup script first to create the required directories
  • Add a specific git tag or commit hash in the clone command to ensure consistent builds (see the sketch below)
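
As an illustration of the second recommendation, a pinned clone could look like the sketch below; the URL and tag here are placeholders, not values read from sh/setup_pfm.sh.

# hypothetical pinned clone; substitute the source and release the project actually uses
git clone --depth 1 --branch v4.13.0 https://github.com/wcohen/libpfm4.git tpp/pfm
# or clone fully and check out a known-good commit for exact reproducibility
# git clone <libpfm4-repo-url> tpp/pfm && git -C tpp/pfm checkout <commit-sha>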
🔗 Analysis chain

Verify path existence and add version constraints

The configuration references several paths under tpp/pfm/ but lacks version constraints for the libpfm4 dependency.

Consider adding version constraints for libpfm4 to ensure build reproducibility.


Let me check the setup script to understand how libpfm is being handled.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify referenced paths and check libpfm version information

# Check if referenced directories exist
echo "Checking referenced directories..."
for dir in "tpp/pfm/include" "tpp/pfm/lib" "tpp/pfm/examples"; do
  if [ -d "$dir" ]; then
    echo "$dir exists"
  else
    echo "$dir not found"
  fi
done

# Check for version information in setup script
echo -e "\nChecking for version constraints..."
rg -l "libpfm" ./sh/setup_pfm.sh

Length of output: 681


Script:

#!/bin/bash
# Check the content of setup script to understand libpfm handling
cat ./sh/setup_pfm.sh

Length of output: 275

Comment on lines +16 to +20
metrics = [
    "all",
    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",  # read
    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",  # wrte
]

🛠️ Refactor suggestion

Enhance metrics validation and documentation

The metrics list is currently limited and lacks validation in the code. Consider:

  1. Adding more commonly used metrics
  2. Adding validation against this list
  3. Documenting each metric's purpose

Apply this diff to improve metrics handling:

+from typing import List
+
+class NCUMetric:
+    def __init__(self, name: str, description: str):
+        self.name = name
+        self.description = description
+
 metrics = [
-    "all",
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",  # read
-    "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",  # wrte
+    NCUMetric("all", "All available metrics"),
+    NCUMetric(
+        "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_ld.sum",
+        "Number of shared memory bank conflicts on read operations"
+    ),
+    NCUMetric(
+        "l1tex__data_bank_conflicts_pipe_lsu_mem_shared_op_st.sum",
+        "Number of shared memory bank conflicts on write operations"
+    ),
 ]
+
+def validate_metrics(metric_names: List[str]) -> None:
+    """Validate that all provided metrics are supported."""
+    valid_metrics = {m.name for m in metrics}
+    invalid_metrics = set(metric_names) - valid_metrics
+    if invalid_metrics:
+        raise ValueError(f"Unsupported metrics: {invalid_metrics}")

Comment on lines +31 to +54
def cli_args_parser(description="ncu tool"):
    import argparse

    _help = ValDict(
        dir="working directory",
        quiet="Redirect output to /dev/null",
        logging="0=off, 1=on non zero exit code, 2=always",
        outfile="path for saved file if active",
        tool="",
        extra="forward string to csan command",
    )

    parser = argparse.ArgumentParser(
        description=description, formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    parser.add_argument("remaining", nargs=argparse.REMAINDER)
    parser.add_argument("-d", "--dir", default=".", help=_help.dir)
    parser.add_argument("-o", "--outfile", default=None, help=_help.outfile)
    parser.add_argument("-t", "--tool", default="stat", help=_help.tool)
    parser.add_argument("--logging", type=int, default=1, help=_help.logging)
    parser.add_argument("-e", "--extra", type=str, default="", help=_help.extra)

    return parser


⚠️ Potential issue

Improve CLI argument parser implementation

The current implementation has several issues:

  1. Incomplete help texts
  2. Potentially irrelevant arguments
  3. Missing metrics-specific arguments

Apply this diff to improve the implementation:

 def cli_args_parser(description="ncu tool"):
+    """Create argument parser for ncu tool.
+    
+    Args:
+        description: Parser description
+        
+    Returns:
+        ArgumentParser instance
+    """
     import argparse
 
     _help = ValDict(
         dir="working directory",
         quiet="Redirect output to /dev/null",
         logging="0=off, 1=on non zero exit code, 2=always",
-        outfile="path for saved file if active",
-        tool="",
-        extra="forward string to csan command",
+        outfile="path for saving ncu report (must end with .ncu-rep)",
+        extra="additional arguments to pass to ncu command (e.g., --metrics all)",
     )
 
     parser = argparse.ArgumentParser(
         description=description, formatter_class=argparse.ArgumentDefaultsHelpFormatter
     )
     parser.add_argument("remaining", nargs=argparse.REMAINDER)
     parser.add_argument("-d", "--dir", default=".", help=_help.dir)
     parser.add_argument("-o", "--outfile", default=None, help=_help.outfile)
-    parser.add_argument("-t", "--tool", default="stat", help=_help.tool)
     parser.add_argument("--logging", type=int, default=1, help=_help.logging)
     parser.add_argument("-e", "--extra", type=str, default="", help=_help.extra)
+    
+    # Add metrics-specific arguments group
+    metrics_group = parser.add_argument_group('Metrics')
+    metrics_group.add_argument(
+        "--list-metrics",
+        action="store_true",
+        help="List available metrics and exit"
+    )
 
     return parser

@PhilipDeegan PhilipDeegan merged commit 8fbc0eb into master Dec 8, 2024
2 checks passed
@PhilipDeegan PhilipDeegan deleted the next branch December 8, 2024 15:07