Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc spec feedback #7

Open
bcstrongx opened this issue Oct 9, 2024 · 6 comments
Open

Misc spec feedback #7

bcstrongx opened this issue Oct 9, 2024 · 6 comments

Comments

@bcstrongx
Copy link
Collaborator

Reading through the spec doc (here), a few issues came to mind:

  1. To avoid unnecessary duplication, for all events that include both RET and SPEC flavors, I'd list only the RET events, and precede the table with text that says something like:

The following non-speculative (*.RET) events also have speculative (*.SPEC) variants. The speculative versions include events that occur as a result of instructions or uops that do not retire.

  1. I think we should remove example formulas for events we suspect an implementation may opt to use from the tables and event files. We can add non-normative text if we believe these formula methods should be broadcast.

  2. I find the RETIRE and SPEC groups confusing, since there are RET and SPEC events in many groups. What if we moved the events in the RETIRE and SPEC groups into GEN? Or preferably GENERAL?

  3. The PKI/MPKI metrics aren't clear on the order of operations. For instance, CTRL_FLOW.PKI has the following formula:

CTRL_FLOW.RET / INST.RET * 1000

It should read:

CTRL_FLOW.RET / (INST.RET * 1000)

  1. I'd add the FLUSH.* events to GENERAL as well. They aren't really CTRL_FLOW events.

  2. Our CACHE.* events count load/store instructions. But some instructions perform multiple accesses, and we don't have any way to count all accesses. I would recommend moving to the following:

    • INST.{LOAD|STORE}.{L1D|L2|L3|...}.*.{RET|SPEC} - counts load/store instructions
    • CACHE.{L1D|L2|L3}.{LOAD|STORE|PREF|SNOOP}.* - counts cache accesses (may be multiple per instruction), including speculative and implicit accesses (like prefetches, page walk loads, etc). Do we need a PGWALK access type too?

    This way the CACHE events are from the perspective of the associated cache, which knows nothing about instructions or retirement, while the load/store instruction events are part of the INST events. These new INST events could go in a new LOAD_STORE group.

    Actually, if we do this then, for consistency, we should probably name the CTRL_FLOW events INST.CTRL_FLOW too. So that all events that count instructions are INST.* events.

  3. Shouldn't CACHE.SNOOP.REMOTE_REQ_LOCAL_HITM be per cache level?

  4. I think we should include TLB.L*.CODE*.RET events. They are more costly to implement, but useful. Akin to Intel's FRONTEND_RETIRED.

  5. Problem with the formula for TOPDOWN.BACKEND_BOUND.MEMORY_BOUND.DATA_BOUND.L2_BOUND.RATE:

(TOPDOWN.BACKEND_BOUND.MEMORY.DATA.L1_MISS.SLOTS - TOPDOWN.BACKEND_BOUND.MEMORY.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.DATA.L2_MISS.SLOTS) / TOPDOWN.SLOTS

@dr-sc
Copy link
Contributor

dr-sc commented Oct 16, 2024

  1. The RET and SPEC events differ in description as well. As I understand with this proposal you assume that implementors will themselves adopt description for SPEC variants? Even though this adoption looks trivial I'm not sure this is a good direction. I personally like the current explicit version more. Yes it has some duplication, but it is simple, clear and has one-to-one relation with actual event files the users will be dealing with.
  2. See Remove alternative event definitions as formulas #9
  3. See Remove RETIRED and SPEC groups and move events from them to GENERAL #10
  4. The correct formula is CTRL_FLOW.RET / (INST.RET / 1000) - which is identical to CTRL_FLOW.RET / INST.RET * 1000. We can use 1000 * CTRL_FLOW.RET / INST.RET or CTRL_FLOW.RET / (INST.RET / 1000) variants to avoid such confusions.
  5. Also addressed in Remove RETIRED and SPEC groups and move events from them to GENERAL #10
  6. INST.{LOAD|STORE}.{L1D|L2|L3|...}..{RET|SPEC} - probably the SPEC variants are redundant considering that we'll have CACHE ones as well? We may have only RET ones I think. Also when counting events on cache IP block level it is probably more natural to operate with READ/WRITE terms instead of LOAD/STORE? E.g. a store instruction can cause read transaction to certain cache level.

@dr-sc
Copy link
Contributor

dr-sc commented Oct 16, 2024

  1. Addressed in Fix L2_BOUND TMA metric formula #11

@dr-sc
Copy link
Contributor

dr-sc commented Oct 16, 2024

  1. From the perf analysis point of view knowing cache level here is not important I think. On the other side different cache levels will count this event independently so may be yes, such breakdown may be needed.
    See Split CACHE.SNOOP.REMOTE_REQ_LOCAL_HITM event per cache level #12

@dr-sc
Copy link
Contributor

dr-sc commented Oct 16, 2024

@bcstrongx
Copy link
Collaborator Author

  1. I'm aware that some events are SPEC only, but for those that have SPEC and RET flavors, are there differences beyond the RET versions only count events incurred by uops that retire, and the SPEC versions count all? If so I agree we should keep them separate.
  2. You're right, not sure why I was trying to multiply INST.RET * 1000. But I agree that putting 1000 in front is more clear.
  3. The difference between INST.{LOAD|STORE} and CACHE.{LOAD|STORE} is that the former only increments once per instruction, even for a vector operation that may perform many cache accesses, while the latter counts all accesses. My instinct is that an implementation would only support the INST.{LOAD|STORE}*.RET (for purposes of precise attribution) and the CACHE.{LOAD|STORE} (speculative) events, but in the interest of simplicity I would include SPEC versions for all the RET events.

@bcstrongx
Copy link
Collaborator Author

Everything else looks good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants