diff --git a/sycl/doc/SYCLInstrumentationUsingXPTI.md b/sycl/doc/SYCLInstrumentationUsingXPTI.md index 0d6746a5afacb..933aee67bfa58 100644 --- a/sycl/doc/SYCLInstrumentationUsingXPTI.md +++ b/sycl/doc/SYCLInstrumentationUsingXPTI.md @@ -245,17 +245,41 @@ All trace point types in bold provide semantic information about the graph, node | Trace Point Type | Parameter Description | Metadata | | :--------------: | :-------------------- | :------- | | **`graph_create`** |
  • **trace_type**: `xpti::trace_point_type_t::graph_create` that marks the creation of an asynchronous graph.
  • **parent**: `nullptr`
  • **event**: The global asynchronous graph object ID. All other graph related events such as node and edge creation will always this ID as the parent ID.
  • **instance**: Unique ID related to the event, but not a correlation ID as there are other events to correlate to.
  • **user_data**: `nullptr`
  • SYCL runtime will always have one instance of a graph object with many disjoint subgraphs that get created during the execution of an application.
    | None | -| **`node_create`** |
  • **trace_type**: `xpti::trace_point_type_t::node_create` that marks the creation of a node in the graph, which could be a computational kernel or memory operation.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The unique ID that identifies the data parallel compute operation or memory operation.
  • **instance**: Unique ID related to the event, but not a correlation ID as there are other events to correlate to.
  • **user_data**: Command type that has been submitted through the command group handler, which could be one of: `command_group_node`, `memory_transfer_node`, `memory_allocation_node`, `sub_buffer_creation_node`, `memory_deallocation_node`, `host_acc_create_buffer_lock_node`, `host_acc_destroy_buffer_release_node` combined with the address of the command group object and represented as a string [`const char *`]
  • SYCL runtime will always have one instance of a graph object with many disjoint subgraphs that get created during the execution of an application.
    |
  • Computational Kernels
  • `sycl_device`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`
  • Memory operations
  • `memory_object`, `offset`, `access_range`, `allocation_type`, `copy_from`, `copy_to` | +| **`node_create`** |
  • **trace_type**: `xpti::trace_point_type_t::node_create` that marks the creation of a node in the graph, which could be a computational kernel or memory operation.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The unique ID that identifies the data parallel compute operation or memory operation.
  • **instance**: Unique ID related to the event, but not a correlation ID as there are other events to correlate to.
  • **user_data**: Command type that has been submitted through the command group handler, which could be one of: `command_group_node`, `memory_transfer_node`, `memory_allocation_node`, `sub_buffer_creation_node`, `memory_deallocation_node`, `host_acc_create_buffer_lock_node`, `host_acc_destroy_buffer_release_node` combined with the address of the command group object and represented as a string [`const char *`]
  • SYCL runtime will always have one instance of a graph object with many disjoint subgraphs that get created during the execution of an application.
    |
  • Computational Kernels
  • `sycl_device`, `sycl_device_type`, `sycl_device_name`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`
  • Memory operations
  • `memory_object`, `offset`, `access_range`, `allocation_type`, `copy_from`, `copy_to` | | **`edge_create`** |
  • **trace_type**: `xpti::trace_point_type_t::graph_create` that marks the creation of an asynchronous graph.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The unique ID that identifies the dependence relationship between two operations.
  • **instance**: Unique ID related to the event, but not a correlation ID as there are other events to correlate to.
  • **user_data**: `nullptr`
  • Edges capture dependence relationships between computations or computations and memory operations.
    | `access_mode`, `memory_object`, `event` | | `task_begin` |
  • **trace_type**: `xpti::trace_point_type_t::task_begin` that marks the beginning of a task belonging to one of the nodes in the graph. When the trace event is for a kernel executing on a device other than the the CPU, this `task_begin` and corresponding `task_end` mark the submit call. To track the execution of the kernel on the device, the `trace_signal` event must be monitored to get the kernel event handle from which the execution statistics can be gathered.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The event ID will reflect the ID of the computation or memory operation kernel, which would be one of the nodes in the graph.
  • **instance**: Instance ID for the task that can be used to correlate it with the corresponding `task_end` trace event.
  • **user_data**: `nullptr`
  • | Same metadata defined for the node the trace task belongs to. | | `task_end` |
  • **trace_type**: `xpti::trace_point_type_t::task_end` that marks the end of a task belonging to one of the nodes in the graph. The specific task instance can be tacked through the instance ID parameter which helps correlate the `task_end` with the corresponding `task_begin`.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The event ID will reflect the ID of the computation or memory operation kernel, which would be one of the nodes in the graph.
  • **instance**: Instance ID for the task that can be used to correlate it with the corresponding `task_begin` trace event.
  • **user_data**: `nullptr`
  • | Same metadata defined for the node the trace task belongs to. | | `signal` |
  • **trace_type**: `xpti::trace_point_type_t::signal` that marks the an event that contains the `event` handle of an executing kernel on a device.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The event ID will reflect the ID of the computation or memory operation kernel, which would be one of the nodes in the graph.
  • **instance**: Instance ID for the task for which the signal has been generated.
  • **user_data**: Address of the kernel event that is returned by the device so the progress of the execution can be tracked.
  • | Same metadata defined for the node the trace task belongs to. | | `wait_begin` |
  • **trace_type**: `xpti::trace_point_type_t::wait_begin` that marks the beginning of the wait on an `event`
  • **parent**: `nullptr`
  • **event**: The event ID will reflect the ID of the command group object submission that created this event or the graph event, if the event is an external event.
  • **instance**: Unique ID to allow the correlation of the `wait_begin` event with the `wait_end` event.
  • **user_data**: String indicating `event.wait` and the address of the event sent in as `const char *`
  • Tracing the `event.wait()` or `event.wait_and_throw()` will capture the waiting on the action represented by the event object, which could be the execution of a kernel, completion of a memory operation, etc.
    | None | | `wait_end` |
  • **trace_type**: `xpti::trace_point_type_t::wait_end` that marks the beginning of the wait on an `event`
  • **parent**: `nullptr`
  • **event**: The event ID will reflect the ID of the command group object submission that created this event or the graph event, if the event is an external event.
  • **instance**: Unique ID to allow the correlation of the `wait_begin` event with the `wait_end` event.
  • **user_data**: String indicating `event.wait` and the address of the event sent in as `const char *`
  • | None | -| `wait_begin` |
  • **trace_type**: `xpti::trace_point_type_t::wait_begin` that marks the beginning of the wait on an `event`
  • **parent**: `nullptr`
  • **event**: The event ID will reflect the ID of the command group object submission that created this event or a new event based on the combination of the string "queue.wait" and the address of the event.
  • **instance**: Unique ID to allow the correlation of the `wait_begin` event with the `wait_end` event.
  • **user_data**: String indicating `queue.wait` and the address of the event sent in as `const char *`
  • Tracing the `queue.wait()` or `queue.wait_and_throw()` will capture the waiting on the action represented by the event object, which could be the execution of a kernel, completion of a memory operation, etc that is embedded in the command group handler. All wait events contain metadata that indicates the SYCL device on which the corresponding operation has been submitted. If the event is from a command group handler, then the source location information is available as well.
    | **`sycl_device`**, `sym_function_name`, `sym_source_file_name`, `sym_line_no` | -| `wait_end` |
  • **trace_type**: `xpti::trace_point_type_t::wait_end` that marks the beginning of the wait on an `event`
  • **parent**: `nullptr`
  • **event**: The event ID will reflect the ID of the command group object submission that created this event or a new event based on the combination of the string "queue.wait" and the address of the event.
  • **instance**: Unique ID to allow the correlation of the `wait_begin` event with the `wait_end` event.
  • **user_data**: String indicating `queue.wait` and the address of the event as `const char *`
  • | **`sycl_device`**, `sym_function_name`, `sym_source_file_name`, `sym_line_no` | -| `barrier_begin` |
  • **trace_type**: `xpti::trace_point_type_t::barrier_begin` that marks the beginning of a barrier while enqueuing a command group object
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The event ID will reflect the ID of the command group object that has encountered a barrier during the enqueue operation.
  • **instance**: Unique ID to allow the correlation of the `barrier_begin` event with the `barrier_end` event.
  • **user_data**: String indicating `enqueue.barrier` and the reason for the barrier as a `const char *`
  • The reason for the barrier could be one of `Buffer locked by host accessor`, `Blocked by host task` or `Unknown reason`.
    |
  • Computational Kernels
  • `sycl_device`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`
  • Memory operations
  • `memory_object`, `offset`, `access_range`, `allocation_type`, `copy_from`, `copy_to` | -| `barrier_end` |
  • **trace_type**: `xpti::trace_point_type_t::barrier_end` that marks the end of the barrier that is encountered during enqueue.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The event ID will reflect the ID of the command group object that has encountered a barrier during the enqueue operation.
  • **instance**: Unique ID to allow the correlation of the `barrier_begin` event with the `barrier_end` event.
  • **user_data**: String indicating `enqueue.barrier` and the reason for the barrier as a `const char *`
  • The reason for the barrier could be one of `Buffer locked by host accessor`, `Blocked by host task` or `Unknown reason`.
    |
  • Computational Kernels
  • `sycl_device`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`
  • Memory operations
  • `memory_object`, `offset`, `access_range`, `allocation_type`, `copy_from`, `copy_to` | +| `wait_begin` |
  • **trace_type**: `xpti::trace_point_type_t::wait_begin` that marks the beginning of the wait on an `event`
  • **parent**: `nullptr`
  • **event**: The event ID will reflect the ID of the command group object submission that created this event or a new event based on the combination of the string "queue.wait" and the address of the event.
  • **instance**: Unique ID to allow the correlation of the `wait_begin` event with the `wait_end` event.
  • **user_data**: String indicating `queue.wait` and the address of the event sent in as `const char *`
  • Tracing the `queue.wait()` or `queue.wait_and_throw()` will capture the waiting on the action represented by the event object, which could be the execution of a kernel, completion of a memory operation, etc that is embedded in the command group handler. All wait events contain metadata that indicates the SYCL device on which the corresponding operation has been submitted. If the event is from a command group handler, then the source location information is available as well.
    | `sycl_device`, `sycl_device_type`, `sycl_device_name`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no` | +| `wait_end` |
  • **trace_type**: `xpti::trace_point_type_t::wait_end` that marks the beginning of the wait on an `event`
  • **parent**: `nullptr`
  • **event**: The event ID will reflect the ID of the command group object submission that created this event or a new event based on the combination of the string "queue.wait" and the address of the event.
  • **instance**: Unique ID to allow the correlation of the `wait_begin` event with the `wait_end` event.
  • **user_data**: String indicating `queue.wait` and the address of the event as `const char *`
  • | `sycl_device`, `sycl_device_type`, `sycl_device_name`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no` | +| `barrier_begin` |
  • **trace_type**: `xpti::trace_point_type_t::barrier_begin` that marks the beginning of a barrier while enqueuing a command group object
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The event ID will reflect the ID of the command group object that has encountered a barrier during the enqueue operation.
  • **instance**: Unique ID to allow the correlation of the `barrier_begin` event with the `barrier_end` event.
  • **user_data**: String indicating `enqueue.barrier` and the reason for the barrier as a `const char *`
  • The reason for the barrier could be one of `Buffer locked by host accessor`, `Blocked by host task` or `Unknown reason`.
    |
  • Computational Kernels
  • `sycl_device`, `sycl_device_type`, `sycl_device_name`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no`
  • Memory operations
  • `memory_object`, `offset`, `access_range_start`, `access_range_end`, `allocation_type`, `copy_from`, `copy_to` | +| `barrier_end` |
  • **trace_type**: `xpti::trace_point_type_t::barrier_end` that marks the end of the barrier that is encountered during enqueue.
  • **parent**: The global graph event that is created during the `graph_create` event.
  • **event**: The event ID will reflect the ID of the command group object that has encountered a barrier during the enqueue operation.
  • **instance**: Unique ID to allow the correlation of the `barrier_begin` event with the `barrier_end` event.
  • **user_data**: String indicating `enqueue.barrier` and the reason for the barrier as a `const char *`
  • The reason for the barrier could be one of `Buffer locked by host accessor`, `Blocked by host task` or `Unknown reason`.
    |
  • Computational Kernels
  • `sycl_device`, `sycl_device_type`, `sycl_device_name`, `kernel_name`, `from_source`, `sym_function_name`, `sym_source_file_name`, `sym_line_no`, `sym_column_no`
  • Memory operations
  • `memory_object`, `offset`, `access_range_start`, `access_range_end`, `allocation_type`, `copy_from`, `copy_to` | + +### Metadata description + +| Metadata | Type | Description | +| :------: | :--: | :---------- | +| `access_mode` | `int` | Value of `sycl::access::mode` enum | +| `access_range_start` | `size_t` | Start of accessor range | +| `access_range_end` | `size_t` | End of accessor range | +| `allocation_type` | C-style string | Allocation type | +| `copy_from` | `size_t` | ID of source device | +| `copy_to` | `size_t` | ID of target device | +| `event` | `size_t` | Unique identifier of event | +| `from_source` | `bool` | `true` if kernel comes from user source | +| `kernel_name` | C-style string | Kernel name | +| `memory_object` | `size_t` | Unique identifier of memory object | +| `offset` | `size_t` | Accessor offset size in bytes | +| `sycl_device` | `size_t` | Unique identifier of SYCL device | +| `sycl_device_type` | C-style string | `CPU`, `GPU`, `ACC`, or `HOST` | +| `sycl_device_name` | C-style string | Result of `sycl::device::get_info()` | +| `sym_function_name` | C-style string | Function name | +| `sym_source_file_name` | C-style string | Source file name | +| `sym_line_no` | `int32_t` | File line number | +| `sym_column_no` | `int32_t` | File column number | + ## Buffer management stream `"sycl.experimental.buffer"` Notification Signatures diff --git a/sycl/source/detail/device_impl.cpp b/sycl/source/detail/device_impl.cpp index 243ca75b5f808..e5337b97d8521 100644 --- a/sycl/source/detail/device_impl.cpp +++ b/sycl/source/detail/device_impl.cpp @@ -347,6 +347,13 @@ bool device_impl::isAssertFailSupported() const { return MIsAssertFailSupported; } +std::string device_impl::getDeviceName() const { + std::call_once(MDeviceNameFlag, + [this]() { MDeviceName = get_info(); }); + + return MDeviceName; +} + } // namespace detail } // namespace sycl } // __SYCL_INLINE_NAMESPACE(cl) diff --git a/sycl/source/detail/device_impl.hpp b/sycl/source/detail/device_impl.hpp index 01299271ec4ac..650569225fa4c 100644 --- a/sycl/source/detail/device_impl.hpp +++ b/sycl/source/detail/device_impl.hpp @@ -16,6 +16,7 @@ #include #include +#include __SYCL_INLINE_NAMESPACE(cl) { namespace sycl { @@ -225,6 +226,8 @@ class device_impl { bool isAssertFailSupported() const; + std::string getDeviceName() const; + private: explicit device_impl(pi_native_handle InteropDevice, RT::PiDevice Device, PlatformImplPtr Platform, const plugin &Plugin); @@ -234,6 +237,8 @@ class device_impl { bool MIsHostDevice; PlatformImplPtr MPlatform; bool MIsAssertFailSupported = false; + mutable std::string MDeviceName; + mutable std::once_flag MDeviceNameFlag; }; // class device_impl } // namespace detail diff --git a/sycl/source/detail/queue_impl.cpp b/sycl/source/detail/queue_impl.cpp index d484e10b5e5f5..f786a87ef98c0 100644 --- a/sycl/source/detail/queue_impl.cpp +++ b/sycl/source/detail/queue_impl.cpp @@ -231,12 +231,12 @@ void *queue_impl::instrumentationProlog(const detail::code_location &CodeLoc, DevStr = "ACCELERATOR"; else DevStr = "UNKNOWN"; - xptiAddMetadata(WaitEvent, "sycl_device", DevStr.c_str()); + xpti::addMetadata(WaitEvent, "sycl_device", DevStr); if (HasSourceInfo) { - xptiAddMetadata(WaitEvent, "sym_function_name", CodeLoc.functionName()); - xptiAddMetadata(WaitEvent, "sym_source_file_name", CodeLoc.fileName()); - xptiAddMetadata(WaitEvent, "sym_line_no", - std::to_string(CodeLoc.lineNumber()).c_str()); + xpti::addMetadata(WaitEvent, "sym_function_name", CodeLoc.functionName()); + xpti::addMetadata(WaitEvent, "sym_source_file_name", CodeLoc.fileName()); + xpti::addMetadata(WaitEvent, "sym_line_no", + std::to_string(CodeLoc.lineNumber())); } xptiNotifySubscribers(StreamID, xpti::trace_wait_begin, nullptr, WaitEvent, QWaitInstanceNo, diff --git a/sycl/source/detail/scheduler/commands.cpp b/sycl/source/detail/scheduler/commands.cpp index 3688bb518dcd7..7a7d4f594a13d 100644 --- a/sycl/source/detail/scheduler/commands.cpp +++ b/sycl/source/detail/scheduler/commands.cpp @@ -31,6 +31,7 @@ #include #include +#include #include #include @@ -85,6 +86,13 @@ static std::string deviceToString(device Device) { return "UNKNOWN"; } +static size_t deviceToID(const device &Device) { + if (Device.is_host()) + return 0; + else + return reinterpret_cast(getSyclObjImpl(Device)->getHandleRef()); +} + static std::string accessModeToString(access::mode Mode) { switch (Mode) { case access::mode::read: @@ -381,18 +389,20 @@ void Command::emitInstrumentationDataProxy() { /// access mode to the buffer if it is due to an accessor /// @param IsCommand True if the dependency has a command object as the source, /// false otherwise -void Command::emitEdgeEventForCommandDependence(Command *Cmd, void *ObjAddr, - const std::string &Prefix, - bool IsCommand) { +void Command::emitEdgeEventForCommandDependence( + Command *Cmd, void *ObjAddr, bool IsCommand, + std::optional AccMode) { #ifdef XPTI_ENABLE_INSTRUMENTATION // Bail early if either the source or the target node for the given dependency // is undefined or NULL if (!(xptiTraceEnabled() && MTraceEvent && Cmd && Cmd->MTraceEvent)) return; + // If all the information we need for creating an edge event is available, // then go ahead with creating it; if not, bail early! xpti::utils::StringHelper SH; std::string AddressStr = SH.addressAsString(ObjAddr); + std::string Prefix = AccMode ? accessModeToString(AccMode.value()) : "Event"; std::string TypeString = SH.nameWithAddressString(Prefix, AddressStr); // Create an edge with the dependent buffer address for which a command // object has been created as one of the properties of the edge @@ -407,10 +417,12 @@ void Command::emitEdgeEventForCommandDependence(Command *Cmd, void *ObjAddr, EdgeEvent->source_id = SrcEvent->unique_id; EdgeEvent->target_id = TgtEvent->unique_id; if (IsCommand) { - xptiAddMetadata(EdgeEvent, "access_mode", TypeString.c_str()); - xptiAddMetadata(EdgeEvent, "memory_object", AddressStr.c_str()); + xpti::addMetadata(EdgeEvent, "access_mode", + static_cast(AccMode.value())); + xpti::addMetadata(EdgeEvent, "memory_object", + reinterpret_cast(ObjAddr)); } else { - xptiAddMetadata(EdgeEvent, "event", TypeString.c_str()); + xpti::addMetadata(EdgeEvent, "event", reinterpret_cast(ObjAddr)); } xptiNotifySubscribers(MStreamID, xpti::trace_edge_create, detail::GSYCLGraphEvent, EdgeEvent, EdgeInstanceNo, @@ -437,7 +449,7 @@ void Command::emitEdgeEventForEventDependence(Command *Cmd, if (Cmd && Cmd->MTraceEvent) { // If the event is associated with a command, we use this command's trace // event as the source of edge, hence modeling the control flow - emitEdgeEventForCommandDependence(Cmd, (void *)PiEventAddr, "Event", false); + emitEdgeEventForCommandDependence(Cmd, (void *)PiEventAddr, false); return; } if (PiEventAddr) { @@ -455,7 +467,7 @@ void Command::emitEdgeEventForEventDependence(Command *Cmd, xptiMakeEvent(NodeName.c_str(), &VNPayload, xpti::trace_graph_event, xpti_at::active, &VNodeInstanceNo); // Emit the virtual node first - xptiAddMetadata(NodeEvent, "kernel_name", NodeName.c_str()); + xpti::addMetadata(NodeEvent, "kernel_name", NodeName); xptiNotifySubscribers(MStreamID, xpti::trace_node_create, detail::GSYCLGraphEvent, NodeEvent, VNodeInstanceNo, nullptr); @@ -472,7 +484,8 @@ void Command::emitEdgeEventForEventDependence(Command *Cmd, xpti_td *TgtEvent = static_cast(MTraceEvent); EdgeEvent->source_id = NodeEvent->unique_id; EdgeEvent->target_id = TgtEvent->unique_id; - xptiAddMetadata(EdgeEvent, "event", EdgeName.c_str()); + xpti::addMetadata(EdgeEvent, "event", + reinterpret_cast(PiEventAddr)); xptiNotifySubscribers(MStreamID, xpti::trace_edge_create, detail::GSYCLGraphEvent, EdgeEvent, EdgeInstanceNo, nullptr); @@ -596,9 +609,9 @@ Command *Command::addDep(DepDesc NewDep, std::vector &ToCleanUp) { } #ifdef XPTI_ENABLE_INSTRUMENTATION - emitEdgeEventForCommandDependence( - NewDep.MDepCommand, (void *)NewDep.MDepRequirement->MSYCLMemObj, - accessModeToString(NewDep.MDepRequirement->MAccessMode), true); + emitEdgeEventForCommandDependence(NewDep.MDepCommand, + (void *)NewDep.MDepRequirement->MSYCLMemObj, + true, NewDep.MDepRequirement->MAccessMode); #endif return ConnectionCmd; @@ -759,7 +772,8 @@ void Command::resolveReleaseDependencies(std::set &DepList) { xpti_td *SrcTraceEvent = static_cast(Item->MTraceEvent); EdgeEvent->target_id = TgtTraceEvent->unique_id; EdgeEvent->source_id = SrcTraceEvent->unique_id; - xptiAddMetadata(EdgeEvent, "memory_object", AddressStr.c_str()); + xpti::addMetadata(EdgeEvent, "memory_object", + reinterpret_cast(MAddress)); xptiNotifySubscribers(MStreamID, xpti::trace_edge_create, detail::GSYCLGraphEvent, EdgeEvent, EdgeInstanceNo, nullptr); @@ -801,9 +815,12 @@ void AllocaCommandBase::emitInstrumentationData() { // Set the relevant meta data properties for this command if (MTraceEvent && MFirstInstance) { xpti_td *TE = static_cast(MTraceEvent); - xptiAddMetadata(TE, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(TE, "memory_object", MAddressString.c_str()); + xpti::addMetadata(TE, "sycl_device", deviceToID(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(TE, "memory_object", reinterpret_cast(MAddress)); } #endif } @@ -916,12 +933,11 @@ void AllocaSubBufCommand::emitInstrumentationData() { // data that is available for the command if (MFirstInstance) { xpti_td *TE = static_cast(MTraceEvent); - xptiAddMetadata(TE, "offset", - std::to_string(this->MRequirement.MOffsetInBytes).c_str()); - std::string range = std::to_string(this->MRequirement.MAccessRange[0]) + - "-" + - std::to_string(this->MRequirement.MAccessRange[1]); - xptiAddMetadata(TE, "access_range", range.c_str()); + xpti::addMetadata(TE, "offset", this->MRequirement.MOffsetInBytes); + xpti::addMetadata(TE, "access_range_start", + this->MRequirement.MAccessRange[0]); + xpti::addMetadata(TE, "access_range_end", + this->MRequirement.MAccessRange[1]); makeTraceEventEpilog(); } #endif @@ -992,10 +1008,13 @@ void ReleaseCommand::emitInstrumentationData() { if (MFirstInstance) { xpti_td *TE = static_cast(MTraceEvent); - xptiAddMetadata(TE, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(TE, "allocation_type", - commandToName(MAllocaCmd->getType()).c_str()); + xpti::addMetadata(TE, "sycl_device", deviceToID(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(TE, "allocation_type", + commandToName(MAllocaCmd->getType())); makeTraceEventEpilog(); } #endif @@ -1106,9 +1125,12 @@ void MapMemObject::emitInstrumentationData() { if (MFirstInstance) { xpti_td *TE = static_cast(MTraceEvent); - xptiAddMetadata(TE, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(TE, "memory_object", MAddressString.c_str()); + xpti::addMetadata(TE, "sycl_device", deviceToID(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(TE, "memory_object", reinterpret_cast(MAddress)); makeTraceEventEpilog(); } #endif @@ -1164,9 +1186,12 @@ void UnMapMemObject::emitInstrumentationData() { if (MFirstInstance) { xpti_td *TE = static_cast(MTraceEvent); - xptiAddMetadata(TE, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(TE, "memory_object", MAddressString.c_str()); + xpti::addMetadata(TE, "sycl_device", deviceToID(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(TE, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(TE, "memory_object", reinterpret_cast(MAddress)); makeTraceEventEpilog(); } #endif @@ -1249,13 +1274,20 @@ void MemCpyCommand::emitInstrumentationData() { if (MFirstInstance) { xpti_td *CmdTraceEvent = static_cast(MTraceEvent); - xptiAddMetadata(CmdTraceEvent, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(CmdTraceEvent, "memory_object", MAddressString.c_str()); - std::string From = deviceToString(MSrcQueue->get_device()); - std::string To = deviceToString(MQueue->get_device()); - xptiAddMetadata(CmdTraceEvent, "copy_from", From.c_str()); - xptiAddMetadata(CmdTraceEvent, "copy_to", To.c_str()); + xpti::addMetadata(CmdTraceEvent, "sycl_device", + deviceToID(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(CmdTraceEvent, "memory_object", + reinterpret_cast(MAddress)); + xpti::addMetadata(CmdTraceEvent, "copy_from", + reinterpret_cast( + getSyclObjImpl(MSrcQueue->get_device()).get())); + xpti::addMetadata( + CmdTraceEvent, "copy_to", + reinterpret_cast(getSyclObjImpl(MQueue->get_device()).get())); makeTraceEventEpilog(); } #endif @@ -1411,13 +1443,20 @@ void MemCpyCommandHost::emitInstrumentationData() { if (MFirstInstance) { xpti_td *CmdTraceEvent = static_cast(MTraceEvent); - xptiAddMetadata(CmdTraceEvent, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(CmdTraceEvent, "memory_object", MAddressString.c_str()); - std::string From = deviceToString(MSrcQueue->get_device()); - std::string To = deviceToString(MQueue->get_device()); - xptiAddMetadata(CmdTraceEvent, "copy_from", From.c_str()); - xptiAddMetadata(CmdTraceEvent, "copy_to", To.c_str()); + xpti::addMetadata(CmdTraceEvent, "sycl_device", + deviceToID(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(CmdTraceEvent, "memory_object", + reinterpret_cast(MAddress)); + xpti::addMetadata(CmdTraceEvent, "copy_from", + reinterpret_cast( + getSyclObjImpl(MSrcQueue->get_device()).get())); + xpti::addMetadata( + CmdTraceEvent, "copy_to", + reinterpret_cast(getSyclObjImpl(MQueue->get_device()).get())); makeTraceEventEpilog(); } #endif @@ -1502,9 +1541,14 @@ void EmptyCommand::emitInstrumentationData() { if (MFirstInstance) { xpti_td *CmdTraceEvent = static_cast(MTraceEvent); - xptiAddMetadata(CmdTraceEvent, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(CmdTraceEvent, "memory_object", MAddressString.c_str()); + xpti::addMetadata(CmdTraceEvent, "sycl_device", + deviceToID(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(CmdTraceEvent, "memory_object", + reinterpret_cast(MAddress)); makeTraceEventEpilog(); } #endif @@ -1567,9 +1611,14 @@ void UpdateHostRequirementCommand::emitInstrumentationData() { if (MFirstInstance) { xpti_td *CmdTraceEvent = static_cast(MTraceEvent); - xptiAddMetadata(CmdTraceEvent, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); - xptiAddMetadata(CmdTraceEvent, "memory_object", MAddressString.c_str()); + xpti::addMetadata(CmdTraceEvent, "sycl_device", + deviceToID(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); + xpti::addMetadata(CmdTraceEvent, "memory_object", + reinterpret_cast(MAddress)); makeTraceEventEpilog(); } #endif @@ -1635,19 +1684,20 @@ void ExecCGCommand::emitInstrumentationData() { // Create a payload with the command name and an event using this payload to // emit a node_create bool HasSourceInfo = false; - std::string KernelName, FromSource; + std::string KernelName; + std::optional FromSource; switch (MCommandGroup->getType()) { case detail::CG::Kernel: { auto KernelCG = reinterpret_cast(MCommandGroup.get()); if (KernelCG->MSyclKernel && KernelCG->MSyclKernel->isCreatedFromSource()) { - FromSource = "true"; + FromSource = true; pi_kernel KernelHandle = KernelCG->MSyclKernel->getHandleRef(); MAddress = KernelHandle; KernelName = MCommandGroup->MFunctionName; } else { - FromSource = "false"; + FromSource = false; KernelName = demangleKernelName(KernelCG->getKernelName()); } } break; @@ -1693,20 +1743,24 @@ void ExecCGCommand::emitInstrumentationData() { if (CGKernelInstanceNo > 1) return; - xptiAddMetadata(CmdTraceEvent, "sycl_device", - deviceToString(MQueue->get_device()).c_str()); + xpti::addMetadata(CmdTraceEvent, "sycl_device", + deviceToID(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_type", + deviceToString(MQueue->get_device())); + xpti::addMetadata(CmdTraceEvent, "sycl_device_name", + getSyclObjImpl(MQueue->get_device())->getDeviceName()); if (!KernelName.empty()) { - xptiAddMetadata(CmdTraceEvent, "kernel_name", KernelName.c_str()); + xpti::addMetadata(CmdTraceEvent, "kernel_name", KernelName); } - if (!FromSource.empty()) { - xptiAddMetadata(CmdTraceEvent, "from_source", FromSource.c_str()); + if (FromSource.has_value()) { + xpti::addMetadata(CmdTraceEvent, "from_source", FromSource.value()); } if (HasSourceInfo) { - xptiAddMetadata(CmdTraceEvent, "sym_function_name", KernelName.c_str()); - xptiAddMetadata(CmdTraceEvent, "sym_source_file_name", - MCommandGroup->MFileName.c_str()); - xptiAddMetadata(CmdTraceEvent, "sym_line_no", - std::to_string(MCommandGroup->MLine).c_str()); + xpti::addMetadata(CmdTraceEvent, "sym_function_name", KernelName); + xpti::addMetadata(CmdTraceEvent, "sym_source_file_name", + MCommandGroup->MFileName); + xpti::addMetadata(CmdTraceEvent, "sym_line_no", MCommandGroup->MLine); + xpti::addMetadata(CmdTraceEvent, "sym_column_no", MCommandGroup->MColumn); } xptiNotifySubscribers(MStreamID, xpti::trace_node_create, diff --git a/sycl/source/detail/scheduler/commands.hpp b/sycl/source/detail/scheduler/commands.hpp index 756ceadc7dc03..82c42711b2da1 100644 --- a/sycl/source/detail/scheduler/commands.hpp +++ b/sycl/source/detail/scheduler/commands.hpp @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -161,9 +162,9 @@ class Command { /// instrumentation to report these dependencies as edges. void resolveReleaseDependencies(std::set &list); /// Creates an edge event when the dependency is a command. - void emitEdgeEventForCommandDependence(Command *Cmd, void *ObjAddr, - const std::string &Prefix, - bool IsCommand); + void emitEdgeEventForCommandDependence( + Command *Cmd, void *ObjAddr, bool IsCommand, + std::optional AccMode = std::nullopt); /// Creates an edge event when the dependency is an event. void emitEdgeEventForEventDependence(Command *Cmd, RT::PiEvent &EventAddr); /// Creates a signal event with the enqueued kernel event handle. diff --git a/xpti/CMakeLists.txt b/xpti/CMakeLists.txt index b5e8860be2deb..e1dee3bc93fe5 100644 --- a/xpti/CMakeLists.txt +++ b/xpti/CMakeLists.txt @@ -6,7 +6,7 @@ set(CMAKE_CXX_STANDARD 14) set(XPTI_DIR ${CMAKE_CURRENT_LIST_DIR}) # Setting the same version as SYCL -set(CMAKE_CXX_STANDARD 11) +set(CMAKE_CXX_STANDARD 17) option(XPTI_ENABLE_WERROR OFF) diff --git a/xpti/include/xpti/xpti_data_types.h b/xpti/include/xpti/xpti_data_types.h index a4f4eb6b4ec4b..86f92ad54167f 100644 --- a/xpti/include/xpti/xpti_data_types.h +++ b/xpti/include/xpti/xpti_data_types.h @@ -105,6 +105,7 @@ enum class payload_flag_t { using trace_point_t = uint16_t; using event_type_t = uint16_t; using string_id_t = int32_t; +using object_id_t = int32_t; using safe_flag_t = std::atomic; using safe_uint64_t = std::atomic; @@ -113,7 +114,7 @@ using safe_uint16_t = std::atomic; using safe_int64_t = std::atomic; using safe_int32_t = std::atomic; using safe_int16_t = std::atomic; -using metadata_t = std::unordered_map; +using metadata_t = std::unordered_map; #define XPTI_EVENT(val) xpti::event_type_t(val) #define XPTI_TRACE_POINT_BEGIN(val) xpti::trace_point_t(val << 1 | 0) @@ -123,6 +124,12 @@ using metadata_t = std::unordered_map; #define XPTI_PACK16_RET32(value1, value2) ((value1 << 16) | value2) #define XPTI_PACK32_RET64(value1, value2) (((uint64_t)value1 << 32) | value2) +struct object_data_t { + size_t size; + const char *data; + uint8_t type; +}; + /// @brief Payload data structure that is optional for trace point callback /// API /// @details The payload structure, if determined at compile time, can deliver @@ -479,6 +486,16 @@ enum class trace_activity_type_t { sleep_activity = 1 << 3 }; +/// Provides hints to the tools on how to interpret unknown metadata values. +enum class metadata_type_t { + binary = 0, + string = 1, + signed_integer = 2, + unsigned_integer = 3, + floating = 4, + boolean = 5 +}; + struct reserved_data_t { /// Has a reference to the associated payload field for an event payload_t *payload = nullptr; diff --git a/xpti/include/xpti/xpti_trace_framework.h b/xpti/include/xpti/xpti_trace_framework.h index 12454cd4f2c6e..6bc86b58ab041 100644 --- a/xpti/include/xpti/xpti_trace_framework.h +++ b/xpti/include/xpti/xpti_trace_framework.h @@ -126,6 +126,32 @@ XPTI_EXPORT_API xpti::string_id_t xptiRegisterString(const char *string, /// @return A reference to the string identified by the string ID. XPTI_EXPORT_API const char *xptiLookupString(xpti::string_id_t id); +/// @brief Register an object to the object table +/// +/// @details All object in the XPTI framework are referred to by their object +/// IDs and this method allow you to register an object and get the object ID +/// for it. This lifetime of this object reference is equal to the lifetime of +/// the XPTI framework. +/// @param data Raw bytes of data to be registered with the object table. If the +/// object already exists in the table, the previous ID is returned. +/// @param size Size in bytes of the object. +/// @param type One of xpti::metadata_type_t values. These only serve as a hint +/// to the tools for processing unknown values. +/// @return The ID of the object being registered. If an error occurs +/// during registration, xpti::invalid_id is returned. +XPTI_EXPORT_API xpti::object_id_t xptiRegisterObject(const char *data, + size_t size, uint8_t type); + +/// @brief Lookup an object in the object table with its ID +/// +/// @details All object in the XPTI framework are referred to by their object +/// IDs and this method allows you to lookup an object by its object ID. The +/// lifetime of the returned object reference is equal to the lifetime of the +/// XPTI framework. +/// @param id The ID of the object to lookup. +/// @return A reference to the object identified by the object ID. +XPTI_EXPORT_API xpti::object_data_t xptiLookupObject(xpti::object_id_t id); + /// @brief Register a payload with the framework /// @details Since a payload may contain multiple strings that may have been /// defined on the stack, it is recommended the payload object is registered @@ -389,14 +415,14 @@ xptiNotifySubscribers(uint8_t stream_id, uint16_t trace_type, /// /// @param e The event for which the metadata is being added /// @param key The key that identifies the metadata as a string -/// @param value The value for the key as a string +/// @param value_id The value for the key as an ID of a registered object. /// @return The result code which can be one of: /// 1. XPTI_RESULT_SUCCESS when the add is successful /// 2. XPTI_RESULT_INVALIDARG when the inputs are invalid /// 3. XPTI_RESULT_DUPLICATE when the key-value pair already exists XPTI_EXPORT_API xpti::result_t xptiAddMetadata(xpti::trace_event_data_t *e, const char *key, - const char *value); + xpti::object_id_t value_id); /// @brief Query the metadata table for a given event /// @details In order to retrieve metadata information for a given event, you @@ -453,6 +479,9 @@ typedef void (*xpti_set_universal_id_t)(uint64_t uid); typedef uint64_t (*xpti_get_unique_id_t)(); typedef xpti::string_id_t (*xpti_register_string_t)(const char *, char **); typedef const char *(*xpti_lookup_string_t)(xpti::string_id_t); +typedef xpti::string_id_t (*xpti_register_object_t)(const char *, size_t, + uint8_t); +typedef xpti::object_data_t (*xpti_lookup_object_t)(xpti::object_id_t); typedef uint64_t (*xpti_register_payload_t)(xpti::payload_t *); typedef uint8_t (*xpti_register_stream_t)(const char *); typedef xpti::result_t (*xpti_unregister_stream_t)(const char *); @@ -473,7 +502,7 @@ typedef xpti::result_t (*xpti_notify_subscribers_t)( uint8_t, uint16_t, xpti::trace_event_data_t *, xpti::trace_event_data_t *, uint64_t instance, const void *temporal_user_data); typedef xpti::result_t (*xpti_add_metadata_t)(xpti::trace_event_data_t *, - const char *, const char *); + const char *, xpti::object_id_t); typedef xpti::metadata_t *(*xpti_query_metadata_t)(xpti::trace_event_data_t *); typedef bool (*xpti_trace_enabled_t)(); } diff --git a/xpti/include/xpti/xpti_trace_framework.hpp b/xpti/include/xpti/xpti_trace_framework.hpp index 75f4b09af8a1a..855a72138d8d0 100644 --- a/xpti/include/xpti/xpti_trace_framework.hpp +++ b/xpti/include/xpti/xpti_trace_framework.hpp @@ -8,7 +8,9 @@ #pragma once #include +#include #include +#include #include #include #include @@ -306,6 +308,165 @@ struct finally { } // namespace utils +template +inline result_t addMetadata(trace_event_data_t *Event, const std::string &Key, + const T &Data) { + static_assert(std::is_trivially_copyable_v, + "T must be trivially copyable"); + static_assert(!std::is_same_v); + + const uint8_t Type = [] { + if (std::is_same_v) { + return static_cast(metadata_type_t::boolean); + } + if (std::numeric_limits::is_integer && + std::numeric_limits::is_signed) { + return static_cast(metadata_type_t::signed_integer); + } + if (std::numeric_limits::is_integer && + !std::numeric_limits::is_signed) { + return static_cast(metadata_type_t::unsigned_integer); + } + if (std::numeric_limits::is_specialized && + !std::numeric_limits::is_integer) { + return static_cast(metadata_type_t::floating); + } + + return static_cast(metadata_type_t::binary); + }(); + + object_id_t Value = xptiRegisterObject(reinterpret_cast(&Data), + sizeof(Data), Type); + return xptiAddMetadata(Event, Key.c_str(), Value); +} + +template <> +inline result_t addMetadata(trace_event_data_t *Event, + const std::string &Key, + const std::string &Data) { + const uint8_t Type = static_cast(metadata_type_t::string); + object_id_t Value = xptiRegisterObject(Data.c_str(), Data.size(), Type); + return xptiAddMetadata(Event, Key.c_str(), Value); +} + +template <> +inline result_t addMetadata(trace_event_data_t *Event, + const std::string &Key, + const char *const &Data) { + const uint8_t Type = static_cast(metadata_type_t::string); + object_id_t Value = xptiRegisterObject(Data, strlen(Data), Type); + return xptiAddMetadata(Event, Key.c_str(), Value); +} + +template +inline std::pair +getMetadata(const metadata_t::value_type &MD) { + static_assert(std::is_trivially_copyable::value, + "T must be trivially copyable"); + + object_data_t RawData = xptiLookupObject(MD.second); + assert(RawData.size == sizeof(T)); + + T Value = *reinterpret_cast(RawData.data); + + const char *Key = xptiLookupString(MD.first); + + return std::make_pair(std::string_view(Key), Value); +} + +template <> +inline std::pair +getMetadata(const metadata_t::value_type &MD) { + object_data_t RawData = xptiLookupObject(MD.second); + + std::string Value(RawData.data, RawData.size); + + const char *Key = xptiLookupString(MD.first); + + return std::make_pair(std::string_view(Key), Value); +} + +template <> +inline std::pair +getMetadata(const metadata_t::value_type &MD) { + object_data_t RawData = xptiLookupObject(MD.second); + + std::string_view Value(RawData.data, RawData.size); + + const char *Key = xptiLookupString(MD.first); + + return std::make_pair(std::string_view(Key), Value); +} + +inline std::string readMetadata(const metadata_t::value_type &MD) { + object_data_t RawData = xptiLookupObject(MD.second); + + if (RawData.type == static_cast(metadata_type_t::binary)) { + return std::string("Binary data, size: ") + std::to_string(RawData.size); + } + + if (RawData.type == static_cast(metadata_type_t::boolean)) { + bool Value = *reinterpret_cast(RawData.data); + return Value ? "true" : "false"; + } + + if (RawData.type == static_cast(metadata_type_t::signed_integer)) { + if (RawData.size == 1) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + if (RawData.size == 2) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + if (RawData.size == 4) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + if (RawData.size == 8) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + } + + if (RawData.type == static_cast(metadata_type_t::unsigned_integer)) { + if (RawData.size == 1) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + if (RawData.size == 2) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + if (RawData.size == 4) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + if (RawData.size == 8) { + auto I = *reinterpret_cast(RawData.data); + return std::to_string(I); + } + } + + if (RawData.type == static_cast(metadata_type_t::floating)) { + if (RawData.size == 4) { + auto F = *reinterpret_cast(RawData.data); + return std::to_string(F); + } + if (RawData.size == 8) { + auto F = *reinterpret_cast(RawData.data); + return std::to_string(F); + } + } + + if (RawData.type == static_cast(metadata_type_t::string)) { + return std::string(RawData.data, RawData.size); + } + + return std::string("Unknown metadata type, size ") + + std::to_string(RawData.size); +} + namespace framework { constexpr uint16_t signal = (uint16_t)xpti::trace_point_type_t::signal; constexpr uint16_t graph_create = diff --git a/xpti/src/xpti_proxy.cpp b/xpti/src/xpti_proxy.cpp index ad07862368539..6f63d00930690 100644 --- a/xpti/src/xpti_proxy.cpp +++ b/xpti/src/xpti_proxy.cpp @@ -23,6 +23,8 @@ enum functions_t { XPTI_GET_UNIQUE_ID, XPTI_REGISTER_STRING, XPTI_LOOKUP_STRING, + XPTI_REGISTER_OBJECT, + XPTI_LOOKUP_OBJECT, XPTI_REGISTER_STREAM, XPTI_UNREGISTER_STREAM, XPTI_REGISTER_USER_DEFINED_TP, @@ -56,6 +58,8 @@ class ProxyLoader { {XPTI_GET_UNIQUE_ID, "xptiGetUniqueId"}, {XPTI_REGISTER_STRING, "xptiRegisterString"}, {XPTI_LOOKUP_STRING, "xptiLookupString"}, + {XPTI_REGISTER_OBJECT, "xptiRegisterObject"}, + {XPTI_LOOKUP_OBJECT, "xptiLookupObject"}, {XPTI_REGISTER_PAYLOAD, "xptiRegisterPayload"}, {XPTI_REGISTER_STREAM, "xptiRegisterStream"}, {XPTI_UNREGISTER_STREAM, "xptiUnregisterStream"}, @@ -265,6 +269,28 @@ XPTI_EXPORT_API const char *xptiLookupString(xpti::string_id_t id) { return nullptr; } +XPTI_EXPORT_API xpti::object_id_t +xptiRegisterObject(const char *data, size_t size, uint8_t type) { + if (xpti::ProxyLoader::instance().noErrors()) { + auto f = + xpti::ProxyLoader::instance().functionByIndex(XPTI_REGISTER_OBJECT); + if (f) { + return (*(xpti_register_object_t)f)(data, size, type); + } + } + return xpti::invalid_id; +} + +XPTI_EXPORT_API xpti::object_data_t xptiLookupObject(xpti::object_id_t id) { + if (xpti::ProxyLoader::instance().noErrors()) { + auto f = xpti::ProxyLoader::instance().functionByIndex(XPTI_LOOKUP_OBJECT); + if (f) { + return (*(xpti_lookup_object_t)f)(id); + } + } + return xpti::object_data_t{0, nullptr, 0}; +} + XPTI_EXPORT_API uint64_t xptiRegisterPayload(xpti::payload_t *payload) { if (xpti::ProxyLoader::instance().noErrors()) { auto f = @@ -396,11 +422,11 @@ XPTI_EXPORT_API bool xptiTraceEnabled() { XPTI_EXPORT_API xpti::result_t xptiAddMetadata(xpti::trace_event_data_t *e, const char *key, - const char *value) { + xpti::object_id_t value_id) { if (xpti::ProxyLoader::instance().noErrors()) { auto f = xpti::ProxyLoader::instance().functionByIndex(XPTI_ADD_METADATA); if (f) { - return (*(xpti_add_metadata_t)f)(e, key, value); + return (*(xpti_add_metadata_t)f)(e, key, value_id); } } return xpti::result_t::XPTI_RESULT_FAIL; diff --git a/xptifw/CMakeLists.txt b/xptifw/CMakeLists.txt index b3dac5f39af40..a568f09cd2d31 100644 --- a/xptifw/CMakeLists.txt +++ b/xptifw/CMakeLists.txt @@ -3,7 +3,7 @@ cmake_minimum_required(VERSION 3.8) set(XPTI_VERSION 0.4.1) project (xptifw VERSION "${XPTI_VERSION}" LANGUAGES CXX) -set(CMAKE_CXX_STANDARD 14) +set(CMAKE_CXX_STANDARD 17) set(XPTIFW_DIR ${CMAKE_CURRENT_LIST_DIR}) # The XPTI framework requires the includes from @@ -14,8 +14,9 @@ set(XPTI_DIR ${CMAKE_CURRENT_LIST_DIR}/../xpti) option(XPTI_ENABLE_TBB "Enable TBB in the framework" OFF) option(XPTI_ENABLE_WERROR OFF) - option(XPTI_BUILD_SAMPLES OFF) +option(XPTI_BUILD_BENCHMARK OFF) +option(XPTI_ENABLE_STATISTICS OFF) if (XPTI_ENABLE_WERROR) if(MSVC) @@ -60,6 +61,10 @@ if (XPTI_BUILD_SAMPLES) add_subdirectory(samples/syclpi_collector) endif() +if (XPTI_BUILD_BENCHMARK) + add_subdirectory(benchmark) +endif() + # The tests in basic_test are written using TBB, so these tests are enabled # only if TBB has been enabled. if (XPTI_ENABLE_TBB) diff --git a/xptifw/benchmark/CMakeLists.txt b/xptifw/benchmark/CMakeLists.txt new file mode 100644 index 0000000000000..59917b4773c6d --- /dev/null +++ b/xptifw/benchmark/CMakeLists.txt @@ -0,0 +1,16 @@ +add_executable(XPTIFWBenchmark + object_table.cpp + string_table.cpp + main.cpp +) + +target_include_directories(XPTIFWBenchmark PRIVATE + $ + $ +) + +target_link_libraries(XPTIFWBenchmark PRIVATE benchmark) + +if (XPTI_ENABLE_STATISTICS) + target_compile_definitions(XPTIFWBenchmark PRIVATE XPTI_STATISTICS) +endif() diff --git a/xptifw/benchmark/helpers.hpp b/xptifw/benchmark/helpers.hpp new file mode 100644 index 0000000000000..ad10af8e9433b --- /dev/null +++ b/xptifw/benchmark/helpers.hpp @@ -0,0 +1,24 @@ +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +#pragma once + +#include + +inline static std::string getRandomString() { + std::random_device Dev; + std::mt19937 Range(Dev()); + std::uniform_int_distribution Dist(1, 255); + + size_t Size = Dist(Range); + std::string Result = ""; + Result.resize(Size); + + for (char &C : Result) { + C = static_cast(Dist(Range)); + } + + return Result; +} diff --git a/xptifw/benchmark/main.cpp b/xptifw/benchmark/main.cpp new file mode 100644 index 0000000000000..004e5d03afa28 --- /dev/null +++ b/xptifw/benchmark/main.cpp @@ -0,0 +1,8 @@ +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +#include "benchmark/benchmark.h" + +BENCHMARK_MAIN(); diff --git a/xptifw/benchmark/object_table.cpp b/xptifw/benchmark/object_table.cpp new file mode 100644 index 0000000000000..7ec4a7b40bc16 --- /dev/null +++ b/xptifw/benchmark/object_table.cpp @@ -0,0 +1,88 @@ +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// + +#include "helpers.hpp" +#include "xpti/xpti_data_types.h" +#include "xpti_object_table.hpp" + +#include "benchmark/benchmark.h" + +#include + +constexpr uint64_t NUM_ITERATIONS = 100'000; + +static std::vector *GIDs; +static xpti::ObjectTable *GObjTable = nullptr; + +static void ObjectTable_Insert(benchmark::State &State) { + if (State.thread_index == 0) { + GObjTable = new xpti::ObjectTable(); + } + for (auto _ : State) { + State.PauseTiming(); + const std::string Str = getRandomString(); + State.ResumeTiming(); + + benchmark::DoNotOptimize(GObjTable->insert( + Str, static_cast(xpti::metadata_type_t::string))); + } + if (State.thread_index == 0) { +#ifdef XPTI_STATISTICS + State.counters["Cache hits"] = GObjTable->getCacheHits(); + State.counters["Small objects"] = GObjTable->getSmallObjectsCount(); + State.counters["Large objects"] = GObjTable->getLargeObjectsCount(); +#endif + delete GObjTable; + GObjTable = nullptr; + } +} + +BENCHMARK(ObjectTable_Insert)->Threads(1)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Insert)->Threads(2)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Insert)->Threads(4)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Insert)->Threads(8)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Insert)->Threads(16)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Insert)->Threads(24)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Insert)->Threads(32)->Iterations(NUM_ITERATIONS); + +static void ObjectTable_Lookup(benchmark::State &State) { + if (State.thread_index == 0) { + GObjTable = new xpti::ObjectTable(100'000); + GIDs = new std::vector(); + GIDs->resize(100'000); + for (int I = 0; I < 100'000; I++) { + const std::string Rand = getRandomString(); + (*GIDs)[I] = GObjTable->insert( + Rand, static_cast(xpti::metadata_type_t::string)); + } + } + + for (auto _ : State) { + State.PauseTiming(); + std::random_device Dev; + std::mt19937 Range(Dev()); + std::uniform_int_distribution Dist(0, 99'999); + size_t ID = Dist(Range); + State.ResumeTiming(); + + benchmark::DoNotOptimize(GObjTable->lookup((*GIDs)[ID])); + } + + if (State.thread_index == 0) { + delete GObjTable; + delete GIDs; + GObjTable = nullptr; + GIDs = nullptr; + } +} + +BENCHMARK(ObjectTable_Lookup)->Threads(1)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Lookup)->Threads(2)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Lookup)->Threads(4)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Lookup)->Threads(8)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Lookup)->Threads(16)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Lookup)->Threads(24)->Iterations(NUM_ITERATIONS); +BENCHMARK(ObjectTable_Lookup)->Threads(32)->Iterations(NUM_ITERATIONS); diff --git a/xptifw/benchmark/string_table.cpp b/xptifw/benchmark/string_table.cpp new file mode 100644 index 0000000000000..4b02bdf1c01c8 --- /dev/null +++ b/xptifw/benchmark/string_table.cpp @@ -0,0 +1,90 @@ +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +#include "helpers.hpp" +#include "xpti_string_table.hpp" + +#include "benchmark/benchmark.h" + +#include +#include + +constexpr uint64_t NUM_ITERATIONS = 100'000; + +static std::vector *GIDs; +static xpti::StringTable *GStringTable = nullptr; + +static void StringTable_Insert(benchmark::State &State) { + if (State.thread_index == 0) { + GStringTable = new xpti::StringTable(); + } + for (auto _ : State) { + State.PauseTiming(); + const std::string Str = getRandomString(); + const char *Ref = nullptr; + State.ResumeTiming(); + + benchmark::DoNotOptimize(GStringTable->add(Str.c_str(), &Ref)); + } + if (State.thread_index == 0) { +#ifdef XPTI_STATISTICS + State.counters["Retrievals"] = GStringTable->getRetrievals(); + State.counters["Insertions"] = GStringTable->getInsertions(); +#endif + delete GStringTable; + GStringTable = nullptr; + } +} + +BENCHMARK(StringTable_Insert)->Threads(1)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Insert)->Threads(2)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Insert)->Threads(4)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Insert)->Threads(8)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Insert)->Threads(16)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Insert)->Threads(24)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Insert)->Threads(32)->Iterations(NUM_ITERATIONS); + +static void StringTable_Lookup(benchmark::State &State) { + if (State.thread_index == 0) { + GStringTable = new xpti::StringTable(100'000); + GIDs = new std::vector(); + GIDs->resize(100'000); + for (int I = 0; I < 100'000; I++) { + const char *Ref; + const std::string Rand = getRandomString(); + (*GIDs)[I] = GStringTable->add(Rand.c_str(), &Ref); + } + } + + for (auto _ : State) { + State.PauseTiming(); + std::random_device Dev; + std::mt19937 Range(Dev()); + std::uniform_int_distribution Dist(0, 99'999); + size_t ID = Dist(Range); + State.ResumeTiming(); + + benchmark::DoNotOptimize(GStringTable->query((*GIDs)[ID])); + } + + if (State.thread_index == 0) { +#ifdef XPTI_STATISTICS + State.counters["Retrievals"] = GStringTable->getRetrievals(); + State.counters["Insertions"] = GStringTable->getInsertions(); +#endif + delete GStringTable; + delete GIDs; + GStringTable = nullptr; + GIDs = nullptr; + } +} + +BENCHMARK(StringTable_Lookup)->Threads(1)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Lookup)->Threads(2)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Lookup)->Threads(4)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Lookup)->Threads(8)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Lookup)->Threads(16)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Lookup)->Threads(24)->Iterations(NUM_ITERATIONS); +BENCHMARK(StringTable_Lookup)->Threads(32)->Iterations(NUM_ITERATIONS); diff --git a/xptifw/include/spin_lock.hpp b/xptifw/include/spin_lock.hpp new file mode 100644 index 0000000000000..0405b48c644ef --- /dev/null +++ b/xptifw/include/spin_lock.hpp @@ -0,0 +1,174 @@ +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +#pragma once + +#include +#include + +#ifndef __has_include +#define __has_include(x) +#endif + +#if __has_include() +#define HAS_PAUSE +#include +#endif + +namespace xpti { + +namespace detail { +class Backoff { +public: + void pause() { + if (MCount <= LOOPS_BEFORE_YIELD) { +#ifdef HAS_PAUSE + _mm_pause(); +#else + std::this_thread::yield(); +#endif + MCount *= 2; + } else { + std::this_thread::yield(); + } + } + + void reset() noexcept { MCount = 1; } + +private: + static constexpr uint32_t LOOPS_BEFORE_YIELD = 16; + uint32_t MCount = 1; +}; +} // namespace detail + +/// RAII-style read lock for \c SharedSpinLock. +/// +/// Unlike std::shared_lock this class provides aims to upgrade reader lock to +/// writer lock, which in some cases can improve performance. +template class SharedLock { +public: + SharedLock() = default; + + SharedLock(Mutex &M) { acquire(M); } + + void acquire(Mutex &M) { + MMutex = &M; + M.lock_shared(); + } + + void release() { + if (MIsWriter) + MMutex->unlock(); + else + MMutex->unlock_shared(); + MMutex = nullptr; + MIsWriter = false; + } + + void upgrade_to_writer() { + if (!MIsWriter) { + MIsWriter = true; + MMutex->upgrade(); + } + } + + ~SharedLock() { + if (MMutex) + release(); + } + +private: + Mutex *MMutex; + bool MIsWriter = false; +}; + +/// SpinLock is a synchronization primitive, that uses atomic variable and +/// causes thread trying acquire lock wait in loop while repeatedly check if +/// the lock is available. +/// +/// One important feature of this implementation is that std::atomic_flag can +/// be zero-initialized. This allows SpinLock to have trivial constructor and +/// destructor, which makes it possible to use it in global context (unlike +/// std::mutex, that doesn't provide such guarantees). +class SpinLock { +public: + void lock() { + detail::Backoff B; + while (MLock.test_and_set(std::memory_order_acquire)) + B.pause(); + } + void unlock() { MLock.clear(std::memory_order_release); } + +private: + std::atomic_flag MLock = ATOMIC_FLAG_INIT; +}; + +/// SharedSpinLock is a synchronization primitive, that allows RW-locks. +/// +/// Unlike std::shared_mutex, SharedSpinLock is guaranteed to be trivially +/// destructible. It also provides aims to upgrade reader lock to writer lock. +class SharedSpinLock { +public: + void lock() noexcept { + for (detail::Backoff B;; B.pause()) { + uint32_t CurState = MState.load(std::memory_order_relaxed); + if (!(CurState & BUSY)) { + if (MState.compare_exchange_strong(CurState, WRITER)) { + break; + } + B.reset(); + } else if (!(CurState & WRITER_PENDING)) { + MState |= WRITER_PENDING; + } + } + } + + void unlock() noexcept { MState &= READERS; } + + void lock_shared() noexcept { + for (detail::Backoff B;; B.pause()) { + uint32_t CurState = MState.load(std::memory_order_relaxed); + if (!(CurState & (WRITER | WRITER_PENDING))) { + uint32_t OldState = MState.fetch_add(ONE_READER); + if (!(OldState & WRITER)) + break; + + MState -= ONE_READER; + } + } + } + + void unlock_shared() noexcept { + MState.fetch_sub(ONE_READER, std::memory_order_release); + } + + void upgrade() noexcept { + uint32_t CurState = MState.load(std::memory_order_relaxed); + if ((CurState & READERS) == ONE_READER || !(CurState & WRITER_PENDING)) { + if (MState.compare_exchange_strong(CurState, + CurState | WRITER | WRITER_PENDING)) { + detail::Backoff B; + while ((MState.load(std::memory_order_relaxed) & READERS) != + ONE_READER) { + B.pause(); + } + + MState -= (ONE_READER + WRITER_PENDING); + return; + } + } + unlock_shared(); + lock(); + } + +private: + static constexpr uint32_t WRITER = 1 << 31; + static constexpr uint32_t WRITER_PENDING = 1 << 30; + static constexpr uint32_t READERS = ~(WRITER | WRITER_PENDING); + static constexpr uint32_t ONE_READER = 1; + static constexpr uint32_t BUSY = WRITER | READERS; + std::atomic MState{0}; +}; +} // namespace xpti diff --git a/xptifw/include/xpti_object_table.hpp b/xptifw/include/xpti_object_table.hpp new file mode 100644 index 0000000000000..52bb599834d1e --- /dev/null +++ b/xptifw/include/xpti_object_table.hpp @@ -0,0 +1,161 @@ +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +#pragma once + +#include "spin_lock.hpp" + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +namespace xpti { +/// A thread-safe caching table for arbitrary objects in XPTI framework. +/// +/// This class enables registration of arbitrary objects within XPTI framework +/// to allow passing them as metadata. If an object being added already exists, +/// an existing ID will be returned. +/// +/// @tparam KeyType is the data type of the returned key. +/// @tparam SmallSize is the size of an object, that will fit within table +/// without allocation of additional memory (i.e. small size optimization). +/// The default value of 224 is carefully chosen for Value struct to take +/// 4 cache lines on x86. +template +class ObjectTable { +public: + using HashFunction = std::function; + + constexpr static auto DefaultHash = [](std::string_view Data) -> uint64_t { + // This is an implementation of FNV hash function. + constexpr uint64_t Prime = 1099511628211; + uint64_t Hash = 14695981039346656037UL; + + for (char C : Data) { + Hash *= Prime; + Hash ^= C; + } + + return Hash; + }; + + /// Constructs empty object table. + /// + /// @param InitialSize is the number of pre-allocated values in the table. + /// @param HashFunc is a callable object, that given raw bytes returns some + /// hash value. This is only used upon insertion to quickly scan through the + /// table and return an existing ID, if any. + ObjectTable(size_t InitialSize = 4096, + const HashFunction &HashFunc = DefaultHash) + : MHashFunction(HashFunc) { + MValues.reserve(InitialSize); + } + + /// Inserts an object into a table or retrieves an existing object ID. + KeyType insert(std::string_view Data, uint8_t Type) { + uint64_t Hash = MHashFunction(Data); + + SharedLock Lock(MMutex); + // Check if this data object already exists + if (MCache.count(Hash) > 0) { + for (KeyType Key : MCache[Hash]) { + // Avoid collisions + if (getValue(MValues[Key]).first == Data) { +#ifdef XPTI_STATISTICS + MCacheHits++; +#endif + return Key; + } + } + } + + const Value &V = makeValue(Data, Hash, Type); + + Lock.upgrade_to_writer(); + + MValues.push_back(std::move(V)); + KeyType Key = MValues.size() - 1; + MCache[MValues.back().MHash].push_back(Key); + + return Key; + } + + /// @returns a pair of raw data bytes and registered data type. + std::pair lookup(KeyType Key) { + SharedLock Lock(MMutex); + + return getValue(MValues[Key]); + } + +#ifdef XPTI_STATISTICS + size_t getCacheHits() const noexcept { return MCacheHits; } + size_t getSmallObjectsCount() const noexcept { return MSmallObjects; } + size_t getLargeObjectsCount() const noexcept { return MLargeObjects; } +#endif + +private: + using Item = std::variant, std::vector>; + + struct Value { + uint64_t MSize = 0; + uint64_t MHash = 0; + Item MItem; + uint8_t MType = 0; + }; + + Value makeValue(std::string_view Data, uint64_t Hash, uint8_t Type) { + Value V; + V.MSize = Data.size(); + V.MHash = Hash; + V.MType = Type; + + char *Dest = nullptr; + + if (V.MSize > SmallSize) { + V.MItem = std::vector(V.MSize, 0); + Dest = std::get<1>(V.MItem).data(); +#ifdef XPTI_STATISTICS + MLargeObjects++; +#endif + } else { + V.MItem = std::array(); + Dest = std::get<0>(V.MItem).data(); +#ifdef XPTI_STATISTICS + MSmallObjects++; +#endif + } + + std::uninitialized_copy(Data.begin(), Data.end(), Dest); + + return V; + } + + std::pair getValue(const Value &V) { + return std::visit( + [&V](auto &&Data) { + return std::make_pair(std::string_view(Data.data(), V.MSize), + V.MType); + }, + V.MItem); + } + + HashFunction MHashFunction; + std::vector MValues; + std::unordered_map> MCache; + mutable xpti::SharedSpinLock MMutex; + +#ifdef XPTI_STATISTICS + size_t MCacheHits = 0; + size_t MSmallObjects = 0; + size_t MLargeObjects = 0; +#endif +}; +} // namespace xpti diff --git a/xptifw/include/xpti_string_table.hpp b/xptifw/include/xpti_string_table.hpp index 6f74ccdd74bdd..060ea0a61ca88 100644 --- a/xptifw/include/xpti_string_table.hpp +++ b/xptifw/include/xpti_string_table.hpp @@ -326,6 +326,22 @@ class StringTable { #endif } + int getInsertions() const noexcept { +#ifdef XPTI_STATISTICS + return MInsertions; +#else + return 0; +#endif + } + + int getRetrievals() const noexcept { +#ifdef XPTI_STATISTICS + return MRetrievals; +#else + return 0; +#endif + } + private: safe_int32_t MIds; ///< Thread-safe ID generator st_forward_t MStringToID; ///< Forward lookup hash map diff --git a/xptifw/samples/basic_collector/basic_collector.cpp b/xptifw/samples/basic_collector/basic_collector.cpp index 0f8176afea5f2..bc864cd65f98c 100644 --- a/xptifw/samples/basic_collector/basic_collector.cpp +++ b/xptifw/samples/basic_collector/basic_collector.cpp @@ -5,7 +5,7 @@ // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // // -#include "xpti/xpti_trace_framework.h" +#include "xpti/xpti_trace_framework.hpp" #include "xpti_timers.hpp" #include @@ -21,12 +21,6 @@ static uint8_t GStreamID = 0; std::mutex GIOMutex; xpti::ThreadID GThreadIDEnum; -static const char *TPTypes[] = { - "unknown", "graph_create", "node_create", "edge_create", - "region_", "task_", "barrier_", "lock_", - "signal ", "transfer_", "thread_", "wait_", - 0}; - // The lone callback function we are going to use to demonstrate how to attach // the collector to the running executable XPTI_CALLBACK_API void tpCallback(uint16_t trace_type, @@ -44,13 +38,10 @@ XPTI_CALLBACK_API void xptiTraceInit(unsigned int major_version, // The basic collector will take in streams from anyone as we are just // printing out the stream data if (stream_name) { - char *tstr; // Register this stream to get the stream ID; This stream may already have // been registered by the framework and will return the previously // registered stream ID GStreamID = xptiRegisterStream(stream_name); - xpti::string_id_t dev_id = xptiRegisterString("sycl_device", &tstr); - (void)dev_id; // Register our lone callback to all pre-defined trace point types xptiRegisterCallback(GStreamID, (uint16_t)xpti::trace_point_type_t::graph_create, @@ -139,9 +130,10 @@ XPTI_CALLBACK_API void tpCallback(uint16_t TraceType, ID); // Go through all available meta-data for an event and print it out xpti::metadata_t *Metadata = xptiQueryMetadata(Event); - for (auto &Item : *Metadata) { - printf(" %-25s:%s\n", xptiLookupString(Item.first), - xptiLookupString(Item.second)); + for (const auto &Item : *Metadata) { + std::cout << " "; + std::cout << xptiLookupString(Item.first) << " : "; + std::cout << xpti::readMetadata(Item) << "\n"; } if (Payload->source_file_sid() != xpti::invalid_id && Payload->line_no > 0) { diff --git a/xptifw/src/xpti_trace_framework.cpp b/xptifw/src/xpti_trace_framework.cpp index 081dc073f55f1..4ab65285872d3 100644 --- a/xptifw/src/xpti_trace_framework.cpp +++ b/xptifw/src/xpti_trace_framework.cpp @@ -5,6 +5,7 @@ // #include "xpti/xpti_trace_framework.hpp" #include "xpti_int64_hash_table.hpp" +#include "xpti_object_table.hpp" #include "xpti_string_table.hpp" #include @@ -17,6 +18,7 @@ #include #include #include +#include #include #include @@ -360,18 +362,15 @@ class Tracepoints { // data types, we will allow them to add these pairs as strings. Internally, // we will store key-value pairs as a map of string ids. xpti::result_t addMetadata(xpti::trace_event_data_t *Event, const char *Key, - const char *Value) { - if (!Event || !Key || !Value) + object_id_t ValueID) { + if (!Event || !Key) return xpti::result_t::XPTI_RESULT_INVALIDARG; string_id_t KeyID = MStringTableRef.add(Key); if (KeyID == xpti::invalid_id) { return xpti::result_t::XPTI_RESULT_INVALIDARG; } - string_id_t ValueID = MStringTableRef.add(Value); - if (ValueID == xpti::invalid_id) { - return xpti::result_t::XPTI_RESULT_INVALIDARG; - } + // Protect simultaneous insert operations on the metadata tables { std::lock_guard HashLock(MMetadataMutex); @@ -836,8 +835,8 @@ class Framework { void setUniversalID(uint64_t uid) noexcept { g_tls_uid = uid; } xpti::result_t addMetadata(xpti::trace_event_data_t *Event, const char *Key, - const char *Value) { - return MTracepoints.addMetadata(Event, Key, Value); + object_id_t ValueID) { + return MTracepoints.addMetadata(Event, Key, ValueID); } xpti::trace_event_data_t * @@ -920,6 +919,18 @@ class Framework { return MStringTableRef.query(ID); } + object_id_t registerObject(const char *Object, size_t Size, uint8_t Type) { + if (!Object) + return xpti::invalid_id; + + return MObjectTable.insert(std::string_view(Object, Size), Type); + } + + object_data_t lookupObject(object_id_t ID) { + auto [Result, Type] = MObjectTable.lookup(ID); + return {Result.size(), Result.data(), Type}; + } + uint64_t registerPayload(xpti::payload_t *payload) { if (!payload) return xpti::invalid_id; @@ -1037,6 +1048,8 @@ class Framework { xpti::Notifications MNotifier; /// Thread-safe string table xpti::StringTable MStringTableRef; + /// Thread-safe object table + xpti::ObjectTable MObjectTable; /// Thread-safe string table, used for stream IDs xpti::StringTable MStreamStringTable; /// Thread-safe string table, used for vendor IDs @@ -1118,6 +1131,15 @@ XPTI_EXPORT_API const char *xptiLookupString(xpti::string_id_t ID) { return xpti::Framework::instance().lookupString(ID); } +XPTI_EXPORT_API xpti::object_id_t +xptiRegisterObject(const char *Data, size_t Size, uint8_t Type) { + return xpti::Framework::instance().registerObject(Data, Size, Type); +} + +XPTI_EXPORT_API xpti::object_data_t xptiLookupObject(xpti::object_id_t ID) { + return xpti::Framework::instance().lookupObject(ID); +} + XPTI_EXPORT_API uint64_t xptiRegisterPayload(xpti::payload_t *payload) { return xpti::Framework::instance().registerPayload(payload); } @@ -1180,8 +1202,8 @@ XPTI_EXPORT_API bool xptiTraceEnabled() { XPTI_EXPORT_API xpti::result_t xptiAddMetadata(xpti::trace_event_data_t *Event, const char *Key, - const char *Value) { - return xpti::Framework::instance().addMetadata(Event, Key, Value); + xpti::object_id_t ID) { + return xpti::Framework::instance().addMetadata(Event, Key, ID); } XPTI_EXPORT_API xpti::metadata_t * diff --git a/xptifw/unit_test/xpti_api_tests.cpp b/xptifw/unit_test/xpti_api_tests.cpp index da5eb8b206753..1af17abf27c97 100644 --- a/xptifw/unit_test/xpti_api_tests.cpp +++ b/xptifw/unit_test/xpti_api_tests.cpp @@ -3,6 +3,7 @@ // See https://llvm.org/LICENSE.txt for license information. // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // +#include "xpti/xpti_trace_framework.h" #include "xpti/xpti_trace_framework.hpp" #include @@ -521,13 +522,9 @@ TEST_F(xptiApiTest, xptiAddMetadataBadInput) { &instance); EXPECT_NE(Event, nullptr); - auto Result = xptiAddMetadata(nullptr, nullptr, nullptr); + auto Result = xptiAddMetadata(nullptr, nullptr, 0); EXPECT_EQ(Result, xpti::result_t::XPTI_RESULT_INVALIDARG); - Result = xptiAddMetadata(Event, nullptr, nullptr); - EXPECT_EQ(Result, xpti::result_t::XPTI_RESULT_INVALIDARG); - Result = xptiAddMetadata(Event, "foo", nullptr); - EXPECT_EQ(Result, xpti::result_t::XPTI_RESULT_INVALIDARG); - Result = xptiAddMetadata(Event, nullptr, "bar"); + Result = xptiAddMetadata(Event, nullptr, 0); EXPECT_EQ(Result, xpti::result_t::XPTI_RESULT_INVALIDARG); } @@ -539,9 +536,10 @@ TEST_F(xptiApiTest, xptiAddMetadataGoodInput) { &instance); EXPECT_NE(Event, nullptr); - auto Result = xptiAddMetadata(Event, "foo", "bar"); + xpti::object_id_t ID = xptiRegisterObject("bar", 3, 0); + auto Result = xptiAddMetadata(Event, "foo", ID); EXPECT_EQ(Result, xpti::result_t::XPTI_RESULT_SUCCESS); - Result = xptiAddMetadata(Event, "foo", "bar"); + Result = xptiAddMetadata(Event, "foo", ID); EXPECT_EQ(Result, xpti::result_t::XPTI_RESULT_DUPLICATE); } @@ -556,12 +554,14 @@ TEST_F(xptiApiTest, xptiQueryMetadata) { auto md = xptiQueryMetadata(Event); EXPECT_NE(md, nullptr); - auto Result = xptiAddMetadata(Event, "foo1", "bar1"); + xpti::object_id_t ID = xptiRegisterObject("bar1", 4, 0); + auto Result = xptiAddMetadata(Event, "foo1", ID); EXPECT_EQ(Result, xpti::result_t::XPTI_RESULT_SUCCESS); char *ts; EXPECT_EQ(md->size(), 1); - auto ID = (*md)[xptiRegisterString("foo1", &ts)]; - auto str = xptiLookupString(ID); - EXPECT_STREQ(str, "bar1"); + auto MDID = (*md)[xptiRegisterString("foo1", &ts)]; + auto obj = xptiLookupObject(MDID); + std::string str{obj.data, obj.size}; + EXPECT_EQ(str, "bar1"); }