From 9237f48def4da458a96f99759c2e2f7806e7d882 Mon Sep 17 00:00:00 2001 From: David Solt Date: Mon, 27 Feb 2023 12:10:42 -0600 Subject: [PATCH 1/6] Minor edits to Events chapter --- Chap_API_Event.tex | 21 +++++++++++++-------- Chap_API_Server.tex | 2 ++ 2 files changed, 15 insertions(+), 8 deletions(-) diff --git a/Chap_API_Event.tex b/Chap_API_Event.tex index a3a7d4a7..5a3ad55e 100644 --- a/Chap_API_Event.tex +++ b/Chap_API_Event.tex @@ -23,7 +23,7 @@ \section{Notification and Management} Both \ac{SMS} elements and applications can register for events of either type. \adviceimplstart -Race conditions can cause the registration to come after events of possible interest (e.g., a memory \ac{ECC} event that occurs after start of execution but prior to registration, or an application process generating an event prior to another process registering to receive it). \ac{SMS} vendors are \textit{requested} to cache environment events for some time to mitigate this situation, but are not \textit{required} to do so. However, \ac{PMIx} implementers are \textit{required} to cache all events received by the \ac{PMIx} server library and to deliver them to registering clients in the same order in which they were received +Race conditions can cause the registration to come after events of possible interest (e.g., a memory \ac{ECC} event that occurs after start of execution but prior to registration, or an application process generating an event prior to another process registering to receive it). \ac{SMS} vendors are \textit{requested} to cache environment events for some time to mitigate this situation, but are not \textit{required} to do so. However, \ac{PMIx} implementers are \textit{required} to cache all events received by the \ac{PMIx} server library and to deliver them to registering clients in the same order in which they were received. 
\adviceimplend \adviceuserstart @@ -43,7 +43,10 @@ \section{Notification and Management} % \end{itemize} -Users can specify the callback order of a handler within its category at the time of registration. Ordering can be specified by providing the relevant event handler names, if the user specified an event handler name when registering the corresponding event. Thus, users can specify that a given handler be executed before or after another handler should both handlers appear in an event chain (the ordering is ignored if the other handler isn't included). Note that ordering does not imply immediate relationships. For example, multiple handlers registered to be serviced after event handler \textit{A} will all be executed after \textit{A}, but are not guaranteed to be executed in any particular order amongst themselves. +Users can specify the callback order of a handler within its category at the time of registration. +Users can specify that a given handler be executed before or after another target handler should both handlers appear in an event chain (the ordering is ignored if the other handler isn't included). +The ordering is dictated by providing the event handler name of the target. The name must have been assigned when registereding the target handler. +Note that ordering does not imply immediate relationships. For example, multiple handlers registered to be serviced after event handler \textit{A} will all be executed after \textit{A}, but are not guaranteed to be executed in any particular order amongst themselves. In addition, one event handler can be declared as the \textit{first} handler to be executed in the chain. This handler will \textit{always} be called prior to any other handler, regardless of category, provided the incoming event matches both the specified range and event code. Only one handler can be so designated --- attempts to designate additional handlers as \textit{first} will return an error. 
Deregistration of the declared \textit{first} handler will re-open the position for subsequent assignment. @@ -143,10 +146,10 @@ \subsection{\code{PMIx_Register_event_handler}} %%%% \descr -Register an event handler to report events. Note that the codes being registered do \textit{not} need to be \ac{PMIx} error constants --- any integer value can be registered. This allows for registration of non-PMIx events such as those defined by a particular \ac{SMS} vendor or by an application itself. +Register an event handler to report events. Note that the codes being registered do \textit{not} need to be \ac{PMIx} event constants --- any integer value can be registered. This allows for registration of non-PMIx events such as those defined by a particular \ac{SMS} vendor or by an application itself. \adviceuserstart -In order to avoid potential conflicts, users are advised to only define codes that lie outside the range of the \ac{PMIx} standard's error codes. Thus, \ac{SMS} vendors and application developers should constrain their definitions to positive values or negative values beyond the \refconst{PMIX_EXTERNAL_ERR_BASE} boundary. +In order to avoid potential conflicts, users are advised to only use values as event codes if they lie outside the range of the \ac{PMIx} standard's error codes. Thus, \ac{SMS} vendors and application developers should constrain their definitions to positive values or negative values beyond the \refconst{PMIX_EXTERNAL_ERR_BASE} boundary. \adviceuserend @@ -375,7 +378,7 @@ \subsection{Notification Function} \adviceuserend \advicermstart -On the server side, the notification function is used to inform the \ac{PMIx} server library's host of a detected event in the \ac{PMIx} server library. Events generated by \ac{PMIx} clients are communicated to the \ac{PMIx} server library, but will be relayed to the host via the \refapi{pmix_server_notify_event_fn_t} function pointer, if provided. 
+On the server side, the notification function, \refapi{pmix_server_notify_event_fn_t} (See \ref{chap:server:func_pointers}), is used to inform the \ac{PMIx} server library's host of a detected event in the \ac{PMIx} server library. Events generated by \ac{PMIx} clients are communicated to the \ac{PMIx} server library, but will be relayed to the host via this \refapi{pmix_server_notify_event_fn_t} function pointer, if provided. \advicermend @@ -504,7 +507,9 @@ \subsection{\code{PMIx_Notify_event}} %%%% \descr -Report an event for notification via any registered event handler. This function can be called by any \ac{PMIx} process, including application processes, \ac{PMIx} servers, and \ac{SMS} elements. The \ac{PMIx} server calls this \ac{API} to report events it detected itself so that the host \ac{SMS} daemon distribute and handle them, and to pass events given to it by its host down to any attached client processes for processing. Examples might include notification of the failure of another process, detection of an impending node failure due to rising temperatures, or an intent to preempt the application. Events may be locally generated or come from anywhere in the system. +Report an event for notification via any registered event handler. This function can be called by any \ac{PMIx} process, including application processes, \ac{PMIx} servers, and \ac{SMS} elements. + +The \ac{PMIx} server calls this \ac{API} to report events it detected itself so that the host \ac{SMS} daemon can distribute and handle them, and to pass events given to it by its host down to any attached client processes for processing. Examples might include notification of the failure of another process, detection of an impending node failure due to rising temperatures, or an intent to preempt the application. Events may be locally generated or come from anywhere in the system. 
Host \ac{SMS} daemons call the \ac{API} to pass events down to its embedded \ac{PMIx} server both for transmittal to local client processes and for the host's own internal processing where the host has registered its own event handlers. The \ac{PMIx} server library is not allowed to echo any event given to it by its host via this \ac{API} back to the host through the \refapi{pmix_server_notify_event_fn_t} server module function. The host is required to deliver the event to all \ac{PMIx} servers where the targeted processes either are currently running, or (if they haven't started yet) might be running at some point in the future as the events are required to be cached by the \ac{PMIx} server library. @@ -512,7 +517,7 @@ \subsection{\code{PMIx_Notify_event}} \adviceuserstart The callback function will be called upon completion of the -\code{notify_event} function's actions. At that time, any messages required for executing the operation (e.g., to send the notification to the local \ac{PMIx} server) will +\refapi{PMIx_Notify_event} function's actions. At that time, any messages required for executing the operation (e.g., to send the notification to the local \ac{PMIx} server) will have been queued, but may not yet have been transmitted. The caller is required to maintain the input data until the callback function has been executed --- the sole purpose of the callback function is to indicate when the input data is no longer required. \adviceuserend @@ -567,7 +572,7 @@ \subsubsection{Completion Callback Function Status Codes} Event handler: Action deferred. % \declareconstitemvalue{PMIX_EVENT_ACTION_COMPLETE}{-334} -Event handler: Action complete. +Event handler: Action complete. Further handlers in the same chain are not called. 
% \end{constantdesc} diff --git a/Chap_API_Server.tex b/Chap_API_Server.tex index 0b26eba3..0ac91b1f 100644 --- a/Chap_API_Server.tex +++ b/Chap_API_Server.tex @@ -1940,6 +1940,8 @@ \subsection{\code{PMIx_server_delete_process_set}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Server Function Pointers} +\label{chap:server:func_pointers} + \ac{PMIx} utilizes a "function-shipping" approach to support for implementing the server-side of the protocol. This method allows \acp{RM} to implement the server without being burdened with \ac{PMIx} internal details. When a request is received from the client, the corresponding server function will be called with the information. Any functions not supported by the \ac{RM} can be indicated by a \code{NULL} for the function pointer. \ac{PMIx} implementations are required to return a \refconst{PMIX_ERR_NOT_SUPPORTED} status to all calls to functions that require host environment support and are not backed by a corresponding server module entry. Host environments may, if they choose, include a function pointer for operations they have not yet implemented and simply return \refconst{PMIX_ERR_NOT_SUPPORTED}. From a7894d4908fb7d0c8f11b71191a7098ff1252dc7 Mon Sep 17 00:00:00 2001 From: David Solt Date: Mon, 6 Mar 2023 13:59:00 -0600 Subject: [PATCH 2/6] Some clarifications after getting feedback from Ralph --- Chap_API_Event.tex | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/Chap_API_Event.tex b/Chap_API_Event.tex index 5a3ad55e..73edb6ea 100644 --- a/Chap_API_Event.tex +++ b/Chap_API_Event.tex @@ -32,7 +32,7 @@ \section{Notification and Management} The generator of an event can specify the \textit{target range} for delivery of that event.
Thus, the generator can choose to limit notification to processes on the local node, processes within the same job as the generator, processes within the same allocation, other threads within the same process, only the \ac{SMS} (i.e., not to any application processes), all application processes, or to a custom range based on specific process identifiers. Only processes within the given range that register for the provided event code will be notified. In addition, the generator can use attributes to direct that the event not be delivered to any default event handlers, or to any multi-code handler (as defined below). -Event notifications provide the process identifier of the source of the event plus the event code and any additional information provided by the generator. When an event notification is received by a process, the registered handlers are scanned for their event code(s), with matching handlers assembled into an \textit{event chain} for servicing. Note that users can also specify a \textit{source range} when registering an event (using the same range designators described above) to further limit when they are to be invoked. When assembled, PMIx event chains are ordered based on both the specificity of the event handler and user directives at time of handler registration. By default, handlers are grouped into three categories based on the number of event codes that can trigger the callback: +Event notifications provide the process identifier of the source of the event plus the event code and any additional information provided by the generator. When an event notification is received by a process, the registered handlers are scanned for their event code(s), with matching handlers assembled into an \textit{event chain} for servicing. Note that users can also specify a \textit{source range} when registering an event (using the same range designators described above) to further limit when they are to be invoked. 
When assembled, the PMIx event chain is ordered based on both the specificity of the event handler and user directives at time of handler registration. By default, handlers are grouped into three categories based on the number of event codes that can trigger the callback: \begin{itemize} % \item \textit{single-code} handlers are serviced first as they are the most specific. These are handlers that are registered against one specific event code. @@ -45,7 +45,7 @@ \section{Notification and Management} Users can specify the callback order of a handler within its category at the time of registration. Users can specify that a given handler be executed before or after another target handler should both handlers appear in an event chain (the ordering is ignored if the other handler isn't included). -The ordering is dictated by providing the event handler name of the target. The name must have been assigned when registereding the target handler. +The ordering is dictated by providing the event handler name of the target. The name must have been assigned the target handler was registered. Note that ordering does not imply immediate relationships. For example, multiple handlers registered to be serviced after event handler \textit{A} will all be executed after \textit{A}, but are not guaranteed to be executed in any particular order amongst themselves. In addition, one event handler can be declared as the \textit{first} handler to be executed in the chain. This handler will \textit{always} be called prior to any other handler, regardless of category, provided the incoming event matches both the specified range and event code. Only one handler can be so designated --- attempts to designate additional handlers as \textit{first} will return an error. Deregistration of the declared \textit{first} handler will re-open the position for subsequent assignment. @@ -149,7 +149,7 @@ \subsection{\code{PMIx_Register_event_handler}} Register an event handler to report events. 
Note that the codes being registered do \textit{not} need to be \ac{PMIx} event constants --- any integer value can be registered. This allows for registration of non-PMIx events such as those defined by a particular \ac{SMS} vendor or by an application itself. \adviceuserstart -In order to avoid potential conflicts, users are advised to only use values as event codes if they lie outside the range of the \ac{PMIx} standard's error codes. Thus, \ac{SMS} vendors and application developers should constrain their definitions to positive values or negative values beyond the \refconst{PMIX_EXTERNAL_ERR_BASE} boundary. +In order to avoid potential conflicts, users are advised to only use values as event codes if they lie outside the range of the \ac{PMIx} standard's error and event codes (See \ref{api:struct:usererrors}). Thus, \ac{SMS} vendors and application developers should constrain their definitions to positive values or negative values beyond the \refconst{PMIX_EXTERNAL_ERR_BASE} boundary. \adviceuserend @@ -291,19 +291,19 @@ \subsubsection{Fault tolerance event attributes} % \declareAttribute{PMIX_EVENT_TERMINATE_SESSION}{"pmix.evterm.sess"}{bool}{ -The \ac{RM} intends to terminate this session. +The \ac{RM} intends to terminate the affected session. } % \declareAttribute{PMIX_EVENT_TERMINATE_JOB}{"pmix.evterm.job"}{bool}{ -The \ac{RM} intends to terminate this job. +The \ac{RM} intends to terminate the affected job. } % \declareAttribute{PMIX_EVENT_TERMINATE_NODE}{"pmix.evterm.node"}{bool}{ -The \ac{RM} intends to terminate all processes on this node. +The \ac{RM} intends to terminate all processes on the affected node. } % \declareAttribute{PMIX_EVENT_TERMINATE_PROC}{"pmix.evterm.proc"}{bool}{ -The \ac{RM} intends to terminate just this process. +The \ac{RM} intends to terminate just the affected process. 
} % \declareAttribute{PMIX_EVENT_ACTION_TIMEOUT}{"pmix.evtimeout"}{int}{ @@ -476,7 +476,7 @@ \subsection{\code{PMIx_Notify_event}} \returnend \reqattrstart -The following attributes are required to be supported by all \ac{PMIx} libraries: +The following attributes are required to be supported by all \ac{PMIx} libraries and will be passed to all invoked handlers: \pasteAttributeItem{PMIX_EVENT_NON_DEFAULT} \pasteAttributeItem{PMIX_EVENT_CUSTOM_RANGE} @@ -570,10 +570,16 @@ \subsubsection{Completion Callback Function Status Codes} % \declareconstitemvalue{PMIX_EVENT_ACTION_DEFERRED}{-333} Event handler: Action deferred. +\end{constantdesc} % -\declareconstitemvalue{PMIX_EVENT_ACTION_COMPLETE}{-334} -Event handler: Action complete. Further handlers in the same chain are not called. + +The following status code may be returned by a handler to stop the handler chain. It will not appear in the \refarg{results} for any handler invocation since it is the last handler executed in the chain. + +\begin{constantdesc} % +\declareconstitemvalue{PMIX_EVENT_ACTION_COMPLETE}{-334} +Event handler: Action complete. \end{constantdesc} +% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% From 99d628f55b315906ae22af2d45966c3741b64bc8 Mon Sep 17 00:00:00 2001 From: David Solt Date: Thu, 4 May 2023 15:50:00 -0500 Subject: [PATCH 3/6] Some feedback from Aurelien. Will squash before merging. Signed-off-by: David Solt --- Chap_API_Event.tex | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Chap_API_Event.tex b/Chap_API_Event.tex index 73edb6ea..884363f3 100644 --- a/Chap_API_Event.tex +++ b/Chap_API_Event.tex @@ -45,7 +45,7 @@ \section{Notification and Management} Users can specify the callback order of a handler within its category at the time of registration. Users can specify that a given handler be executed before or after another target handler should both handlers appear in an event chain (the ordering is ignored if the other handler isn't included). 
-The ordering is dictated by providing the event handler name of the target. The name must have been assigned the target handler was registered. +The ordering is dictated by providing the event handler name of the target. The name must have been assigned when the target handler was registered. Note that ordering does not imply immediate relationships. For example, multiple handlers registered to be serviced after event handler \textit{A} will all be executed after \textit{A}, but are not guaranteed to be executed in any particular order amongst themselves. In addition, one event handler can be declared as the \textit{first} handler to be executed in the chain. This handler will \textit{always} be called prior to any other handler, regardless of category, provided the incoming event matches both the specified range and event code. Only one handler can be so designated --- attempts to designate additional handlers as \textit{first} will return an error. Deregistration of the declared \textit{first} handler will re-open the position for subsequent assignment. @@ -378,7 +378,7 @@ \subsection{Notification Function} \adviceuserend \advicermstart -On the server side, the notification function, \refapi{pmix_server_notify_event_fn_t} (See \ref{chap:server:func_pointers}), is used to inform the \ac{PMIx} server library's host of a detected event in the \ac{PMIx} server library. Events generated by \ac{PMIx} clients are communicated to the \ac{PMIx} server library, but will be relayed to the host via this \refapi{pmix_server_notify_event_fn_t} function pointer, if provided. +On the server side, the notification function, \refapi{pmix_server_notify_event_fn_t} (See \ref{chap:server:func_pointers}), is used to inform the \ac{PMIx} server library's host of a detected event in the \ac{PMIx} server library. 
Events generated by \ac{PMIx} clients are communicated to the \ac{PMIx} server library, but will be relayed to the host via its \refapi{pmix_server_notify_event_fn_t} function pointer, if provided. \advicermend @@ -573,7 +573,7 @@ \subsubsection{Completion Callback Function Status Codes} \end{constantdesc} % -The following status code may be returned by a handler to stop the handler chain. It will not appear in the \refarg{results} for any handler invocation since it is the last handler executed in the chain. +The following status code may be returned by a handler to stop the handler chain. It will not appear in the \refarg{results} for any handler invocation since the handler that produces it is the last handler executed in the chain. \begin{constantdesc} % From aa9793c81b051b84e2130be993da40c4ebb688f5 Mon Sep 17 00:00:00 2001 From: David Solt Date: Mon, 18 Sep 2023 11:26:31 -0500 Subject: [PATCH 4/6] Make clearer that there is a single event chain Co-authored-by: Josh Hursey <4259120+jjhursey@users.noreply.github.com> --- Chap_API_Event.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Chap_API_Event.tex b/Chap_API_Event.tex index 884363f3..e3b73b83 100644 --- a/Chap_API_Event.tex +++ b/Chap_API_Event.tex @@ -44,7 +44,7 @@ \section{Notification and Management} \end{itemize} Users can specify the callback order of a handler within its category at the time of registration. -Users can specify that a given handler be executed before or after another target handler should both handlers appear in an event chain (the ordering is ignored if the other handler isn't included). +Users can specify that a given handler be executed before or after another target handler should both handlers appear in the event chain (the ordering is ignored if the other handler isn't included). The ordering is dictated by providing the event handler name of the target. The name must have been assigned when the target handler was registered. 
Note that ordering does not imply immediate relationships. For example, multiple handlers registered to be serviced after event handler \textit{A} will all be executed after \textit{A}, but are not guaranteed to be executed in any particular order amongst themselves. From 8262de4c7073e33d1bb29f3e9b54b69c8686af6e Mon Sep 17 00:00:00 2001 From: David Solt Date: Mon, 18 Sep 2023 11:27:24 -0500 Subject: [PATCH 5/6] Better wording of advice on user-defined event codes Co-authored-by: Aurelien Bouteiller --- Chap_API_Event.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Chap_API_Event.tex b/Chap_API_Event.tex index e3b73b83..db1ccd97 100644 --- a/Chap_API_Event.tex +++ b/Chap_API_Event.tex @@ -149,7 +149,7 @@ \subsection{\code{PMIx_Register_event_handler}} Register an event handler to report events. Note that the codes being registered do \textit{not} need to be \ac{PMIx} event constants --- any integer value can be registered. This allows for registration of non-PMIx events such as those defined by a particular \ac{SMS} vendor or by an application itself. \adviceuserstart -In order to avoid potential conflicts, users are advised to only use values as event codes if they lie outside the range of the \ac{PMIx} standard's error and event codes (See \ref{api:struct:usererrors}). Thus, \ac{SMS} vendors and application developers should constrain their definitions to positive values or negative values beyond the \refconst{PMIX_EXTERNAL_ERR_BASE} boundary. +In order to avoid potential conflicts, users are advised to only use event code values that lie outside the range of the \ac{PMIx} standard's error and event codes (See \ref{api:struct:usererrors}). Thus, \ac{SMS} vendors and application developers should constrain their definitions to positive values or negative values beyond the \refconst{PMIX_EXTERNAL_ERR_BASE} boundary.
\adviceuserend From 52febfb6b2a2039c283c63bad647d9f01b87a6d7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Aur=C3=A9lien=20Bouteiller?= Date: Thu, 25 Jan 2024 01:35:24 -0500 Subject: [PATCH 6/6] Issue175: minor edits from 23Q4 --- Chap_API_Event.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Chap_API_Event.tex b/Chap_API_Event.tex index db1ccd97..faeded98 100644 --- a/Chap_API_Event.tex +++ b/Chap_API_Event.tex @@ -476,7 +476,7 @@ \subsection{\code{PMIx_Notify_event}} \returnend \reqattrstart -The following attributes are required to be supported by all \ac{PMIx} libraries and will be passed to all invoked handlers: +The following attributes are required to be supported by all \ac{PMIx} libraries and will be passed, when relevant, to all invoked handlers: \pasteAttributeItem{PMIX_EVENT_NON_DEFAULT} \pasteAttributeItem{PMIX_EVENT_CUSTOM_RANGE}