%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Chapter: API Server
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Server-Specific Interfaces}
\label{chap:api_server}
The process that hosts the \ac{PMIx} server library interacts with that library in two distinct manners. First, \ac{PMIx} provides a set of \acp{API} by which the host can request specific services from its library. This includes:
\begin{compactitemize}
\item collecting inventory to support scheduling algorithms,
\item providing subsystems with an opportunity to precondition their resources for optimized application support,
\item generating regular expressions,
\item registering information to be passed to client processes, and
\item requesting information on behalf of a remote process.
\end{compactitemize}
Note that the host always has access to all \ac{PMIx} client \acp{API} - the functions listed below are in addition to those available to a \ac{PMIx} client.
Second, the host can provide a set of callback functions by which the \ac{PMIx} server library can pass requests upward for servicing by the host. These include notifications of client connection and finalize, as well as requests by clients for information and/or services that the \ac{PMIx} server library does not itself provide.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Server Initialization and Finalization}
\label{chap:api_init:server}
Initialization and finalization routines for \ac{PMIx} servers.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_server_init}}
\declareapi{PMIx_server_init}
%%%%
\summary
Initialize the \ac{PMIx} server.
%%%%
\format
\copySignature{PMIx_server_init}{1.0}{
pmix_status_t \\
PMIx_server_init(pmix_server_module_t *module, \\
\hspace*{17\sigspace}pmix_info_t info[], size_t ninfo);
}
\begin{arglist}
\arginout{module}{\refapi{pmix_server_module_t} structure (handle)}
\argin{info}{Array of \refstruct{pmix_info_t} structures (array of handles)}
\argin{ninfo}{Number of elements in the \refarg{info} array (\code{size_t})}
\end{arglist}
\returnsimple
\reqattrstart
The following attributes are required to be supported by all \ac{PMIx} libraries:
\pasteAttributeItem{PMIX_SERVER_NSPACE}
\pasteAttributeItem{PMIX_SERVER_RANK}
\pasteAttributeItem{PMIX_SERVER_TMPDIR}
\pasteAttributeItem{PMIX_SYSTEM_TMPDIR}
\pasteAttributeItem{PMIX_SERVER_TOOL_SUPPORT}
\pasteAttributeItem{PMIX_SERVER_SYSTEM_SUPPORT}
\pasteAttributeItem{PMIX_SERVER_SESSION_SUPPORT}
\pasteAttributeItem{PMIX_SERVER_GATEWAY}
\pasteAttributeItem{PMIX_SERVER_SCHEDULER}
\reqattrend
\optattrstart
The following attributes are optional for implementers of \ac{PMIx} libraries:
\pasteAttributeItemBegin{PMIX_USOCK_DISABLE} If the library supports Unix socket connections, this attribute may be supported for disabling it.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_SOCKET_MODE} If the library supports socket connections, this attribute may be supported for setting the socket mode.
\pasteAttributeItemEnd{}
\pasteAttributeItem{PMIX_SINGLE_LISTENER}
\pasteAttributeItemBegin{PMIX_TCP_REPORT_URI} If the library supports TCP socket connections, this attribute may be supported for reporting the URI.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_TCP_IF_INCLUDE} If the library supports TCP socket connections, this attribute may be supported for specifying the interfaces to be used.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_TCP_IF_EXCLUDE} If the library supports TCP socket connections, this attribute may be supported for specifying the interfaces that are \textit{not} to be used.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_TCP_IPV4_PORT} If the library supports IPV4 connections, this attribute may be supported for specifying the port to be used.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_TCP_IPV6_PORT} If the library supports IPV6 connections, this attribute may be supported for specifying the port to be used.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_TCP_DISABLE_IPV4} If the library supports IPV4 connections, this attribute may be supported for disabling it.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_TCP_DISABLE_IPV6} If the library supports IPV6 connections, this attribute may be supported for disabling it.
\pasteAttributeItemEnd{}
\pasteAttributeItemBegin{PMIX_SERVER_REMOTE_CONNECTIONS} If the library supports connections from remote tools, this attribute may be supported for enabling or disabling it.
\pasteAttributeItemEnd{}
\pasteAttributeItem{PMIX_EXTERNAL_PROGRESS}
\pasteAttributeItem{PMIX_EVENT_BASE}
\pasteAttributeItem{PMIX_TOPOLOGY2}
\pasteAttributeItemBegin{PMIX_SERVER_SHARE_TOPOLOGY}The \ac{PMIx} server will
perform the necessary actions to scalably expose the description to the local
clients. This includes creating any required shared memory backing stores and/
or \ac{XML} representations, plus ensuring that all necessary key-value pairs
for clients to access the description are included in the job-level
information provided to each client. All required files are to be installed
under the effective \refattr{PMIX_SERVER_TMPDIR} directory. The \ac{PMIx}
server library is responsible for cleaning up any artifacts (e.g., shared
memory backing files or cached key-value pairs) at library finalize.
\pasteAttributeItemEnd{}
\pasteAttributeItem{PMIX_SERVER_ENABLE_MONITORING}
\pasteAttributeItem{PMIX_HOMOGENEOUS_SYSTEM}
\pasteAttributeItem{PMIX_SINGLETON}
\pasteAttributeItem{PMIX_IOF_LOCAL_OUTPUT}
\optattrend
%%%%
\descr
Initialize the \ac{PMIx} server support library, and provide a pointer to a \refapi{pmix_server_module_t} structure containing the caller's callback functions.
The array of \refstruct{pmix_info_t} structs is used to pass additional info that may be required by the server when initializing.
For example, it may include the \refattr{PMIX_SERVER_TOOL_SUPPORT} attribute, thereby indicating that the daemon is willing to accept connection requests from tools.
\advicermstart
Providing a value of \code{NULL} for the \refarg{module} argument is permitted, as is passing an empty \refarg{module} structure. Doing so indicates that the host environment will not provide support for multi-node operations such as \refapi{PMIx_Fence}, but does intend to support local clients' access to information.
\advicermend
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_server_finalize}}
\declareapi{PMIx_server_finalize}
%%%%
\summary
Finalize the \ac{PMIx} server library.
%%%%
\format
\copySignature{PMIx_server_finalize}{1.0}{
pmix_status_t \\
PMIx_server_finalize(void);
}
\returnsimple
%%%%
\descr
Finalize the \ac{PMIx} server support library, terminating all connections to attached tools and any local clients.
All memory usage is released.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Server Initialization Attributes}
\label{chap:api_init:serverattrs}
These attributes are used to direct the configuration and operation of the \ac{PMIx} server library by passing them into \refapi{PMIx_server_init}.
%
\declareAttribute{PMIX_TOPOLOGY2}{"pmix.topo2"}{pmix_topology_t}{
Provide a pointer to an implementation-specific description of the local node
topology.
}
%
\declareAttribute{PMIX_SERVER_SHARE_TOPOLOGY}{"pmix.srvr.share"}{bool}{
The \ac{PMIx} server is to share its copy of the local node topology (whether given to it or self-discovered) with any clients.
}
%
\declareAttribute{PMIX_USOCK_DISABLE}{"pmix.usock.disable"}{bool}{
Disable legacy UNIX socket (usock) support.
}
%
\declareAttribute{PMIX_SOCKET_MODE}{"pmix.sockmode"}{uint32_t}{
POSIX \var{mode_t} (9 bits valid).
}
%
\declareAttribute{PMIX_SINGLE_LISTENER}{"pmix.sing.listnr"}{bool}{
Use only one rendezvous socket, letting priorities and/or environment parameters select the active transport.
}
%
\declareAttribute{PMIX_SERVER_TOOL_SUPPORT}{"pmix.srvr.tool"}{bool}{
The host \ac{RM} wants to declare itself as willing to accept tool connection requests.
}
%
\declareAttribute{PMIX_SERVER_REMOTE_CONNECTIONS}{"pmix.srvr.remote"}{bool}{
Allow connections from remote tools. Forces the \ac{PMIx} server to not exclusively use the loopback device.
}
%
\declareAttribute{PMIX_SERVER_SYSTEM_SUPPORT}{"pmix.srvr.sys"}{bool}{
The host \ac{RM} wants to declare itself as being the local system server for PMIx connection requests.
}
%
\declareAttribute{PMIX_SERVER_SESSION_SUPPORT}{"pmix.srvr.sess"}{bool}{
The host \ac{RM} wants to declare itself as being the local session server for PMIx connection requests.
}
%
\declareAttribute{PMIX_SERVER_START_TIME}{"pmix.srvr.strtime"}{char*}{
Time when the server started - i.e., when the server created its rendezvous file (given in ctime string format).
}
%
\declareAttribute{PMIX_SERVER_TMPDIR}{"pmix.srvr.tmpdir"}{char*}{
Top-level temporary directory for all client processes connected to this server, and where the PMIx server will place its tool rendezvous point and contact information.
}
%
\declareAttribute{PMIX_SYSTEM_TMPDIR}{"pmix.sys.tmpdir"}{char*}{
Temporary directory for this system, and where a PMIx server that declares itself to be a system-level server will place a tool rendezvous point and contact information.
}
%
\declareAttribute{PMIX_SERVER_ENABLE_MONITORING}{"pmix.srv.monitor"}{bool}{
Enable PMIx internal monitoring by the PMIx server.
}
%
\declareAttribute{PMIX_SERVER_NSPACE}{"pmix.srv.nspace"}{char*}{
Name of the namespace to use for this PMIx server.
}
%
\declareAttribute{PMIX_SERVER_RANK}{"pmix.srv.rank"}{pmix_rank_t}{
Rank of this PMIx server.
}
%
\declareAttribute{PMIX_SERVER_GATEWAY}{"pmix.srv.gway"}{bool}{
Server is acting as a gateway for PMIx requests that cannot be serviced on backend nodes (e.g., logging to email).
}
%
\declareAttribute{PMIX_SERVER_SCHEDULER}{"pmix.srv.sched"}{bool}{
Server is supporting system scheduler and desires access to appropriate \ac{WLM}-supporting features. Indicates that the library is to be initialized for scheduler support.
}
%
\declareAttribute{PMIX_EXTERNAL_PROGRESS}{"pmix.evext"}{bool}{
The host shall progress the \ac{PMIx} library via calls to \refapi{PMIx_Progress}.
}
%
\declareAttribute{PMIX_HOMOGENEOUS_SYSTEM}{"pmix.homo"}{bool}{
The nodes comprising the session are homogeneous - i.e., they each contain the same number of identical packages, fabric interfaces, \acp{GPU}, and other devices.
}
%
\declareAttributeProvisional{PMIX_SINGLETON}{"pmix.singleton"}{char*}{
String representation (nspace.rank) of the process ID of the singleton that the server was started to support.
}
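The string form used by \refattr{PMIX_SINGLETON} can be split back into its namespace and rank. The following is a minimal sketch in C, assuming that the rank is written in decimal and that the \emph{last} period separates the namespace from the rank (the attribute definition does not state how a namespace that itself contains periods is handled). The helper \code{parse_singleton} is hypothetical and is not part of the \ac{PMIx} \acp{API}.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a PMIx function): split "nspace.rank" at the
 * last '.', copy the namespace into nspace_out, and store the rank.
 * Assumes a decimal rank. Returns 0 on success, -1 on malformed input
 * or if nspace_out is too small. */
static int parse_singleton(const char *id, char *nspace_out,
                           size_t nspace_len, unsigned long *rank_out)
{
    const char *dot;
    char *end;
    size_t n;
    unsigned long rank;

    if (id == NULL || nspace_out == NULL || rank_out == NULL) {
        return -1;
    }
    dot = strrchr(id, '.');
    if (dot == NULL || dot == id) {
        return -1;              /* no separator, or empty namespace */
    }
    if (dot[1] < '0' || dot[1] > '9') {
        return -1;              /* rank must start with a digit */
    }
    n = (size_t)(dot - id);
    if (n >= nspace_len) {
        return -1;              /* namespace does not fit */
    }
    errno = 0;
    rank = strtoul(dot + 1, &end, 10);
    if (errno != 0 || *end != '\0') {
        return -1;              /* overflow or trailing characters */
    }
    memcpy(nspace_out, id, n);
    nspace_out[n] = '\0';
    *rank_out = rank;
    return 0;
}
```

Splitting at the last period is a design choice of this sketch: it allows namespaces such as \code{my.job} to survive the round trip, at the cost of rejecting any rank that is not a plain decimal number.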
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Server Support Functions}
The following \acp{API} allow the \ac{RM} daemon that hosts the \ac{PMIx} server library to request specific services from the \ac{PMIx} library.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_generate_regex}}
\declareapi{PMIx_generate_regex}
%%%%
\summary
Generate a compressed representation of the input string.
%%%%
\format
\copySignature{PMIx_generate_regex}{1.0}{
pmix_status_t \\
PMIx_generate_regex(const char *input, char **output);
}
\begin{arglist}
\argin{input}{String to process (string)}
\argout{output}{Compressed representation of \refarg{input} (array of bytes)}
\end{arglist}
\returnsimple
%%%%
\descr
Given a comma-separated list of \refarg{input} values, generate a reduced size representation of the input that can be passed down to the \ac{PMIx} server library's \refapi{PMIx_server_register_nspace} \ac{API} for parsing. The order of the individual values in the \refarg{input} string is preserved across the operation. The caller is responsible for releasing the returned data.
\label{regex:fmt}The precise compressed representations will be implementation specific. The regular expression itself is not required to be a printable string nor to obey typical string constraints (e.g., include a \code{NULL} terminator byte). However, all \ac{PMIx} implementations are required to include a colon-delimited \code{NULL}-terminated string at the beginning of the output representation that can be printed for diagnostic purposes and identifies the method used to generate the representation. The following identifiers are reserved by the \ac{PMIx} Standard:
\begin{itemize}
\item \code{"raw:\textbackslash0"} - indicates that the expression following the identifier is simply the comma-delimited input string (no processing was performed).
\item \code{"pmix:\textbackslash0"} - a \ac{PMIx}-unique regular expression represented as a \code{NULL}-terminated string following the identifier.
\item \code{"blob:\textbackslash0"} - a \ac{PMIx}-unique regular expression that is not represented as a \code{NULL}-terminated string following the identifier. Additional implementation-specific metadata may follow the identifier along with the data itself. For example, a compressed binary array format based on the \emph{zlib} compression package, with the size encoded in the space immediately following the identifier.
\end{itemize}
Communicating the resulting output should be done by first packing the returned expression using the \refapi{PMIx_Data_pack} \ac{API}, declaring the input to be of type \refconst{PMIX_REGEX}, and then obtaining the resulting blob to be communicated using the \refmacro{PMIX_DATA_BUFFER_UNLOAD} macro. The reciprocal method can be used on the remote end prior to passing the regex into \refapi{PMIx_server_register_nspace}. The pack/unpack routines will ensure proper handling of the data based on the regex prefix.
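For diagnostic purposes, a host can inspect the leading identifier of a returned representation. The following is a minimal sketch in C, assuming that the identifier is stored as the literal bytes \code{raw:}, \code{pmix:}, or \code{blob:} followed by a single \code{NULL} byte, as the reserved identifiers listed above suggest. The type \code{regex_fmt_t} and both helper functions are hypothetical names introduced for illustration; they are not part of the \ac{PMIx} \acp{API}.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only by this sketch. */
typedef enum {
    REGEX_FMT_RAW,    /* "raw:"  - payload is the unprocessed input string */
    REGEX_FMT_PMIX,   /* "pmix:" - payload is a NULL-terminated expression */
    REGEX_FMT_BLOB,   /* "blob:" - payload is not a NULL-terminated string */
    REGEX_FMT_OTHER   /* implementation-specific or unrecognized */
} regex_fmt_t;

/* Classify a representation by its leading NULL-terminated identifier. */
static regex_fmt_t regex_format(const char *repr)
{
    if (repr == NULL) {
        return REGEX_FMT_OTHER;
    }
    if (strcmp(repr, "raw:") == 0) {
        return REGEX_FMT_RAW;
    }
    if (strcmp(repr, "pmix:") == 0) {
        return REGEX_FMT_PMIX;
    }
    if (strcmp(repr, "blob:") == 0) {
        return REGEX_FMT_BLOB;
    }
    return REGEX_FMT_OTHER;
}

/* Return the payload as a C string for the two string-based formats,
 * or NULL when the payload cannot be treated as a string. */
static const char *regex_string_payload(const char *repr)
{
    regex_fmt_t fmt = regex_format(repr);

    if (fmt != REGEX_FMT_RAW && fmt != REGEX_FMT_PMIX) {
        return NULL;
    }
    /* Skip the identifier and its terminating NULL byte. */
    return repr + strlen(repr) + 1;
}
```

The sketch deliberately refuses to return a string for \code{blob:} representations, since any metadata and data following that identifier are implementation specific.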
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_generate_ppn}}
\declareapi{PMIx_generate_ppn}
%%%%
\summary
Generate a compressed representation of the input identifying the processes on each node.
%%%%
\format
\copySignature{PMIx_generate_ppn}{1.0}{
pmix_status_t \\
PMIx_generate_ppn(const char *input, char **ppn);
}
\begin{arglist}
\argin{input}{String to process (string)}
\argout{ppn}{Compressed representation of \refarg{input} (array of bytes)}
\end{arglist}
\returnsimple
%%%%
\descr
The input shall consist of a semicolon-separated list of ranges representing the ranks of processes on each node of the job - e.g., \code{"1-4;2-5;8,10,11,12;6,7,9"}. Each field of the input must correspond to the node name provided at that position in the input to \refapi{PMIx_generate_regex}. Thus, in the example, ranks 1-4 would be located on the first node of the comma-separated list of names provided to \refapi{PMIx_generate_regex}, and ranks 2-5 would be located on the second node in that list.
Rules governing the format of the returned regular expression are the same as those specified for \refapi{PMIx_generate_regex}, as detailed \hyperref[regex:fmt]{here}.
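The structure of the \refarg{input} string can be illustrated by a small validator that counts the ranks listed for each node. The following is a minimal sketch in C, not the parsing performed by any \ac{PMIx} implementation; the function \code{ppn_counts} is a hypothetical helper, and the sketch assumes that every field is non-empty and consists only of decimal ranks and inclusive ranges.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not a PMIx function): walk a semicolon-separated
 * ppn string and store in counts[i] the number of ranks listed for the
 * i-th node. Returns the number of node fields found, or -1 if the input
 * is malformed or names more than max_nodes nodes. */
static int ppn_counts(const char *input, long *counts, int max_nodes)
{
    const char *p = input;
    int node = 0;

    if (input == NULL || counts == NULL || max_nodes <= 0) {
        return -1;
    }
    counts[0] = 0;
    for (;;) {
        char *end;
        long lo, hi;

        /* Each entry starts with a rank ... */
        if (!isdigit((unsigned char)*p)) {
            return -1;
        }
        lo = strtol(p, &end, 10);
        hi = lo;
        p = end;
        /* ... optionally followed by "-<rank>" to form a range. */
        if (*p == '-') {
            ++p;
            if (!isdigit((unsigned char)*p)) {
                return -1;
            }
            hi = strtol(p, &end, 10);
            p = end;
            if (hi < lo) {
                return -1;
            }
        }
        counts[node] += hi - lo + 1;

        if (*p == '\0') {
            return node + 1;        /* end of input */
        } else if (*p == ',') {
            ++p;                    /* next entry on the same node */
        } else if (*p == ';') {
            ++p;                    /* first entry on the next node */
            if (++node >= max_nodes) {
                return -1;
            }
            counts[node] = 0;
        } else {
            return -1;
        }
    }
}
```

Applied to the example string above, the helper reports four node fields. A host could compare that number against the number of names passed to \refapi{PMIx_generate_regex} before generating either representation.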
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_server_register_nspace}}
\declareapi{PMIx_server_register_nspace}
%%%%
\summary
Set up the data about a particular namespace.
%%%%
\format
\copySignature{PMIx_server_register_nspace}{1.0}{
pmix_status_t \\
PMIx_server_register_nspace(const pmix_nspace_t nspace, \\
\hspace*{28\sigspace}int nlocalprocs, \\
\hspace*{28\sigspace}pmix_info_t info[], size_t ninfo, \\
\hspace*{28\sigspace}pmix_op_cbfunc_t cbfunc, \\
\hspace*{28\sigspace}void *cbdata);
}
\begin{arglist}
\argin{nspace}{Character array of maximum size \refconst{PMIX_MAX_NSLEN} containing the namespace identifier (string)}
\argin{nlocalprocs}{number of local processes (integer)}
\argin{info}{Array of info structures (array of handles)}
\argin{ninfo}{Number of elements in the \refarg{info} array (integer)}
\argin{cbfunc}{Callback function \refapi{pmix_op_cbfunc_t} to be executed upon completion of the operation. A \code{NULL} function reference indicates that the function is to be executed as a blocking operation (function reference)}
\argin{cbdata}{Data to be passed to the callback function (memory reference)}
\end{arglist}
\returnsimplenb
\returnstart
\begin{itemize}
\item \refconst{PMIX_OPERATION_SUCCEEDED}, indicating that the request was immediately processed and returned \textit{success} - the \refarg{cbfunc} will not be called
\end{itemize}
\returnend
\reqattrstart
The following attributes are required to be supported by all \ac{PMIx} libraries:
\pasteAttributeItem{PMIX_REGISTER_NODATA}
\pasteAttributeItem{PMIX_SESSION_INFO_ARRAY}
\pasteAttributeItem{PMIX_JOB_INFO_ARRAY}
\pasteAttributeItem{PMIX_APP_INFO_ARRAY}
\pasteAttributeItem{PMIX_PROC_INFO_ARRAY}
\pasteAttributeItem{PMIX_NODE_INFO_ARRAY}
\divider
Host environments are required to provide a wide range of session-, job-, application-, node-, and process-realm information, and may choose to provide a similarly wide range of optional information. The information is broadly separated into categories based on the \emph{data realm} definitions explained in Section \ref{api:struct:attributes:retrieval}, and retrieved according to the rules detailed in Section \ref{chap:api_rsvd_keys:retrules}.
Session-realm information may be passed as individual \refstruct{pmix_info_t} entries, or as part of a \refstruct{pmix_data_array_t} using the \refattr{PMIX_SESSION_INFO_ARRAY} attribute. The list of data referenced in this way shall include:
\begin{itemize}
\item \pasteAttributeItem{PMIX_UNIV_SIZE}
\item \pasteAttributeItemBegin{PMIX_MAX_PROCS}Must be provided if \refattr{PMIX_UNIV_SIZE} is not given. Requires use of the \refattr{PMIX_SESSION_INFO} attribute to avoid ambiguity when retrieving it.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_SESSION_ID}
\end{itemize}
plus the following optional information:
\begin{itemize}
\item \pasteAttributeItemBegin{PMIX_CLUSTER_ID}As this information is not related to the namespace, it is best passed using the \refapi{PMIx_server_register_resources} \ac{API}.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_ALLOCATED_NODELIST}
\item \pasteAttributeItemBegin{PMIX_RM_NAME}As this information is not related to the namespace, it is best passed using the \refapi{PMIx_server_register_resources} \ac{API}.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_RM_VERSION}As this information is not related to the namespace, it is best passed using the \refapi{PMIx_server_register_resources} \ac{API}.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_SERVER_HOSTNAME}As this information is not related to the namespace, it is best passed using the \refapi{PMIx_server_register_resources} \ac{API}.
\pasteAttributeItemEnd
\end{itemize}
Job-realm information may be passed as individual \refstruct{pmix_info_t} entries, or as part of a \refstruct{pmix_data_array_t} using the \refattr{PMIX_JOB_INFO_ARRAY} attribute. The list of data referenced in this way shall include:
\begin{itemize}
\item \pasteAttributeItemBegin{PMIX_SERVER_NSPACE}Identifies the namespace of the \ac{PMIx} server itself.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_SERVER_RANK}Identifies the rank of the \ac{PMIx} server itself.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_NSPACE}Identifies the namespace of the job being registered.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_JOBID}
\item \pasteAttributeItem{PMIX_JOB_SIZE}
\item \pasteAttributeItemBegin{PMIX_MAX_PROCS}Retrieval of this attribute defaults to the job level unless an appropriate specification is given (e.g., \refattr{PMIX_SESSION_INFO}).
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_NODE_MAP}
\item \pasteAttributeItem{PMIX_PROC_MAP}
\end{itemize}
plus the following optional information:
\begin{itemize}
\item \pasteAttributeItem{PMIX_NPROC_OFFSET}
\item \pasteAttributeItemBegin{PMIX_JOB_NUM_APPS}This is a required attribute if more than one application is included in the job.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_MAPBY}
\item \pasteAttributeItem{PMIX_RANKBY}
\item \pasteAttributeItem{PMIX_BINDTO}
\item \pasteAttributeItem{PMIX_HOSTNAME_KEEP_FQDN}
\item \pasteAttributeItem{PMIX_ANL_MAP}
\item \pasteAttributeItem{PMIX_TDIR_RMCLEAN}
\item \pasteAttributeItem{PMIX_CRYPTO_KEY}
\end{itemize}
If more than one application is included in the namespace, then the host environment is also required to supply data consisting of the following items for each application in the job, passed as a \refstruct{pmix_data_array_t} using the \refattr{PMIX_APP_INFO_ARRAY} attribute:
\begin{itemize}
\item \pasteAttributeItemBegin{PMIX_APPNUM}This attribute must appear at the beginning of the array.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_APP_SIZE}
\item \pasteAttributeItemBegin{PMIX_MAX_PROCS}Requires use of the \refattr{PMIX_APP_INFO} attribute to avoid ambiguity when retrieving it.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_APPLDR}
\item \pasteAttributeItemBegin{PMIX_WDIR}This attribute is required for all registrations, but may be provided as an individual \refstruct{pmix_info_t} entry if only one application is included in the namespace.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_APP_ARGV}This attribute is required for all registrations, but may be provided as an individual \refstruct{pmix_info_t} entry if only one application is included in the namespace.
\pasteAttributeItemEnd
\end{itemize}
plus the following optional information:
\begin{itemize}
\item \pasteAttributeItem{PMIX_PSET_NAMES}
\item \pasteAttributeItemBegin{PMIX_APP_MAP_TYPE}This attribute may be provided as an individual \refstruct{pmix_info_t} entry if only one application is included in the namespace.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_APP_MAP_REGEX}This attribute may be provided as an individual \refstruct{pmix_info_t} entry if only one application is included in the namespace.
\pasteAttributeItemEnd
\end{itemize}
The data may also include attributes provided by the host environment that identify the programming model (as specified by the user) being executed within the application. The \ac{PMIx} server library may utilize this information to customize the environment to fit that model (e.g., adding environmental variables specified by the corresponding standard for that model):
\begin{itemize}
\item \pasteAttributeItem{PMIX_PROGRAMMING_MODEL}
\item \pasteAttributeItem{PMIX_MODEL_LIBRARY_NAME}
\item \pasteAttributeItem{PMIX_MODEL_LIBRARY_VERSION}
\end{itemize}
Node-realm information may be passed as individual \refstruct{pmix_info_t} entries if only one node will host processes from the job being registered, or as part of a \refstruct{pmix_data_array_t} using the \refattr{PMIX_NODE_INFO_ARRAY} attribute when multiple nodes are involved in the job. The list of data referenced in this way shall include:
\begin{itemize}
\item \pasteAttributeItem{PMIX_NODEID}
\item \pasteAttributeItemBegin{PMIX_HOSTNAME}As this information is not related to the namespace, it can be passed using the \refapi{PMIx_server_register_resources} \ac{API}. However, either it or the \refattr{PMIX_NODEID} must be included in the array to properly identify the node.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_HOSTNAME_ALIASES}As this information is not related to the namespace, it is best passed using the \refapi{PMIx_server_register_resources} \ac{API}.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_LOCAL_SIZE}
\item \pasteAttributeItem{PMIX_NODE_SIZE}
\item \pasteAttributeItem{PMIX_LOCALLDR}
\item \pasteAttributeItem{PMIX_LOCAL_PEERS}
\item \pasteAttributeItem{PMIX_NODE_OVERSUBSCRIBED}
\end{itemize}
plus the following information for the server's own node:
\begin{itemize}
\item \pasteAttributeItem{PMIX_TMPDIR}
\item \pasteAttributeItem{PMIX_NSDIR}
\item \pasteAttributeItem{PMIX_LOCAL_PROCS}
\end{itemize}
The data may also include the following optional information for the server's own node:
\begin{itemize}
\item \pasteAttributeItem{PMIX_LOCAL_CPUSETS}
\item \pasteAttributeItemBegin{PMIX_AVAIL_PHYS_MEMORY}As this information is not related to the namespace, it can be passed using the \refapi{PMIx_server_register_resources} \ac{API}.
\pasteAttributeItemEnd
\end{itemize}
and the following optional information for other nodes:
\begin{itemize}
\item \pasteAttributeItemBegin{PMIX_MAX_PROCS}Requires use of the \refattr{PMIX_NODE_INFO} attribute to avoid ambiguity when retrieving it.
\pasteAttributeItemEnd
\end{itemize}
Process-realm information shall include the following data for each process in the job, passed as a \refstruct{pmix_data_array_t} using the \refattr{PMIX_PROC_INFO_ARRAY} attribute:
\begin{itemize}
\item \pasteAttributeItem{PMIX_RANK}
\item \pasteAttributeItemBegin{PMIX_APPNUM}This attribute may be omitted if only one application is present in the namespace.
\pasteAttributeItemEnd
\item \pasteAttributeItemBegin{PMIX_APP_RANK}This attribute may be omitted if only one application is present in the namespace.
\pasteAttributeItemEnd
\item \pasteAttributeItem{PMIX_GLOBAL_RANK}
\item \pasteAttributeItem{PMIX_LOCAL_RANK}
\item \pasteAttributeItem{PMIX_NODE_RANK}
\item \pasteAttributeItem{PMIX_NODEID}
\item \pasteAttributeItem{PMIX_REINCARNATION}
\item \pasteAttributeItem{PMIX_SPAWNED}
\end{itemize}
plus the following information for processes that are local to the server:
\begin{itemize}
\item \pasteAttributeItem{PMIX_LOCALITY_STRING}
\item \pasteAttributeItem{PMIX_PROCDIR}
\item \pasteAttributeItem{PMIX_PACKAGE_RANK}
\end{itemize}
and the following optional information - note that some of this information can be derived from information already provided by other attributes, but it may be included here for ease of retrieval by users:
\begin{itemize}
\item \pasteAttributeItem{PMIX_HOSTNAME}
\item \pasteAttributeItem{PMIX_CPUSET}
\item \pasteAttributeItem{PMIX_CPUSET_BITMAP}
\item \pasteAttributeItem{PMIX_DEVICE_DISTANCES}
\end{itemize}
\divider
Attributes not directly provided by the host environment may be derived by the \ac{PMIx} server library from other required information and included in the data made available to the server library's clients.
\reqattrend
%%%%
\descr
Pass job-related information to the \ac{PMIx} server library for distribution to local client processes.
\advicermstart
Host environments are required to execute this operation prior to starting any local application process within the given namespace.
The \ac{PMIx} server must register all namespaces that will participate in collective operations with local processes.
This means that the server must register a namespace, even if it will not host any local processes from that namespace, whenever a local process of another namespace might at some point perform an operation involving one or more processes from the new namespace.
This is necessary so that the collective operation can identify the participants and know when it is locally complete.
The caller must also provide the number of local processes that will be launched within this namespace.
This is required for the \ac{PMIx} server library to correctly handle collectives as a collective operation call can occur before all the local processes have been started.
A \code{NULL} \refarg{cbfunc} reference indicates that the function is to be executed as a blocking operation.
\advicermend
\adviceuserstart
The number of local processes for any given namespace is generally fixed at the time of application launch. Calls to \refapi{PMIx_Spawn} result in processes launched in their own namespace, not that of their parent. However, it is possible for processes to \textit{migrate} to another node via a call to \refapi{PMIx_Job_control_nb}, thus resulting in a change to the number of local processes on both the initial node and the node to which the process moved. It is therefore critical that applications not migrate processes without first ensuring that \ac{PMIx}-based collective operations are not in progress, and that no such operations be initiated until process migration has completed.
\adviceuserend
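The role of the \refarg{nlocalprocs} argument in determining local completion of a collective can be pictured with a simple counter. The following is a conceptual sketch in C under the assumption that every local process of the namespace participates in the collective; it does not describe the internal bookkeeping of any \ac{PMIx} server library, and \code{local_tracker_t} and its functions are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one collective within one namespace. */
typedef struct {
    int nlocalprocs;  /* value supplied when the namespace was registered */
    int narrived;     /* local processes that have entered the collective */
} local_tracker_t;

/* Start tracking a collective for a namespace with nlocalprocs local
 * processes. */
static void tracker_init(local_tracker_t *t, int nlocalprocs)
{
    t->nlocalprocs = nlocalprocs;
    t->narrived = 0;
}

/* Record the arrival of one local participant. Returns true once all
 * expected local processes have arrived, i.e., when the collective is
 * locally complete. */
static bool tracker_arrive(local_tracker_t *t)
{
    if (t->narrived < t->nlocalprocs) {
        ++t->narrived;
    }
    return t->narrived == t->nlocalprocs;
}
```

Because the expected count is fixed at registration, a collective call that arrives before all local processes have been started simply leaves the tracker incomplete. The same fixed count is why migrating a process while a collective is in progress would leave the counter inconsistent with the processes actually present.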
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Namespace registration attributes}
\label{api:struct:attributes:storage}
The following attributes are defined specifically for use with the \refapi{PMIx_server_register_nspace} \ac{API}:
%
\declareAttribute{PMIX_REGISTER_NODATA}{"pmix.reg.nodata"}{bool}{
Registration is for this namespace only, do not copy job data.
}
\vspace{\baselineskip}
The following attributes are used to assemble information according to its data realm (\refterm{session}, \refterm{job}, \refterm{application}, \refterm{node}, or \refterm{process} as defined in Section \ref{api:struct:attributes:retrieval}) for registration where ambiguity may exist - see \ref{chap:api_server:assemble} for examples of their use.
%
\declareAttribute{PMIX_SESSION_INFO_ARRAY}{"pmix.ssn.arr"}{pmix_data_array_t}{
Provide an array of \refstruct{pmix_info_t} containing session-realm information. The \refattr{PMIX_SESSION_ID} attribute is required to be included in the array.
}
%
\declareAttribute{PMIX_JOB_INFO_ARRAY}{"pmix.job.arr"}{pmix_data_array_t}{
Provide an array of \refstruct{pmix_info_t} containing job-realm information. The \refattr{PMIX_SESSION_ID} attribute of the \refterm{session} containing the \refterm{job} is required to be included in the array whenever the \ac{PMIx} server library may host multiple sessions (e.g., when executing with a host \ac{RM} daemon). As information is registered one job (aka namespace) at a time via the \refapi{PMIx_server_register_nspace} \ac{API}, there is no requirement that the array contain either the \refattr{PMIX_NSPACE} or \refattr{PMIX_JOBID} attributes when used in that context (though either or both of them may be included). At least one of the job identifiers must be provided in all other contexts where the job being referenced is ambiguous.
}
%
\declareAttribute{PMIX_APP_INFO_ARRAY}{"pmix.app.arr"}{pmix_data_array_t}{
Provide an array of \refstruct{pmix_info_t} containing application-realm information. The \refattr{PMIX_NSPACE} or \refattr{PMIX_JOBID} attributes of the \refterm{job} containing the application, plus its \refattr{PMIX_APPNUM} attribute, must be included in the array when the array is \textit{not} included as part of a call to \refapi{PMIx_server_register_nspace} - i.e., when the job containing the application is ambiguous. The job identification is otherwise optional.
}
%
\declareAttribute{PMIX_PROC_INFO_ARRAY}{"pmix.pdata"}{pmix_data_array_t}{
Provide an array of \refstruct{pmix_info_t} containing process-realm information. The \refattr{PMIX_RANK} and \refattr{PMIX_NSPACE} attributes, or the \refattr{PMIX_PROCID} attribute, are required to be included in the array when the array is not included as part of a call to \refapi{PMIx_server_register_nspace} - i.e., when the job containing the process is ambiguous. All three may be included if desired. When the array is included in some broader structure that identifies the job, then only the \refattr{PMIX_RANK} or the \refattr{PMIX_PROCID} attribute must be included (the others are optional).
}
%
\declareAttribute{PMIX_NODE_INFO_ARRAY}{"pmix.node.arr"}{pmix_data_array_t}{
Provide an array of \refstruct{pmix_info_t} containing node-realm information. At a minimum, either the \refattr{PMIX_NODEID} or \refattr{PMIX_HOSTNAME} attribute is required to be included in the array, though both may be included.
}
Note that these assemblages can be used hierarchically:
\begin{itemize}
\item a \refattr{PMIX_JOB_INFO_ARRAY} might contain multiple \refattr{PMIX_APP_INFO_ARRAY} elements, each describing values for a specific application within the job.
\item a \refattr{PMIX_JOB_INFO_ARRAY} could contain a \refattr{PMIX_NODE_INFO_ARRAY} for each node hosting processes from that job, each array describing job-level values for that node.
\item a \refattr{PMIX_SESSION_INFO_ARRAY} might contain multiple \refattr{PMIX_JOB_INFO_ARRAY} elements, each describing a job executing within the session. Each job array could, in turn, contain both application and node arrays, thus providing a complete picture of the active operations within the allocation.
\end{itemize}
\adviceimplstart
\ac{PMIx} implementations must be capable of properly parsing and storing any hierarchical depth of information arrays. The resulting stored values must be accessible via both the \refapi{PMIx_Get} and \refapi{PMIx_Query_info_nb} \acp{API}, assuming appropriate directives are provided by the caller.
\adviceimplend
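Handling arbitrary nesting depth naturally suggests a recursive descent. The following is a minimal sketch under stated assumptions: \code{demo_info_t} is a simplified stand-in for \refstruct{pmix_info_t}, \code{demo_find} is a hypothetical helper, and the leaf keys used with it are illustrative. It returns the first match in depth-first order and deliberately ignores realm qualifiers, which a real implementation would need to honor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for pmix_info_t: an entry is either a leaf with a
 * string value or a container holding a nested array of entries */
typedef struct demo_info demo_info_t;
struct demo_info {
    const char *key;
    const char *value;            /* leaf value, NULL for a container */
    const demo_info_t *children;  /* nested entries, NULL for a leaf */
    size_t nchildren;
};

/* Depth-first search for a leaf with the given key at any nesting depth.
 * Returns the first matching value, or NULL if the key is not present. */
const char *demo_find(const demo_info_t *info, size_t ninfo, const char *key)
{
    assert(NULL != key);
    for (size_t n = 0; n < ninfo; n++) {
        if (NULL != info[n].value && 0 == strcmp(info[n].key, key)) {
            return info[n].value;
        }
        if (NULL != info[n].children) {
            const char *hit = demo_find(info[n].children,
                                        info[n].nchildren, key);
            if (NULL != hit) {
                return hit;
            }
        }
    }
    return NULL;
}
```

Because the function calls itself on every container it meets, an application array nested inside a job array nested inside a session array is searched without any depth-specific code.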
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Assembling the registration information}
\label{chap:api_server:assemble}
The following description is not intended to represent the actual layout of information in a given \ac{PMIx} library. Instead, it describes how information provided in the \refarg{info} parameter of the \refapi{PMIx_server_register_nspace} \ac{API} shall be organized for proper processing by a \ac{PMIx} server library. The ordering of the various information elements is arbitrary - they are presented in a top-down hierarchical form solely for clarity in reading.
\advicermstart
Creating the \refarg{info} array of data requires knowing in advance the number of elements required for the array. This can be difficult to compute and somewhat fragile in practice. One method for resolving the problem is to create a linked list of objects, each containing a single \refstruct{pmix_info_t} structure. Allocation and manipulation of the list can then be accomplished using existing standard methods. Upon completion, the final \refarg{info} array can be allocated based on the number of elements on the list, and then the values in the list object \refstruct{pmix_info_t} structures transferred to the corresponding array element utilizing the \refapi{PMIx_Info_xfer} \ac{API}.
\advicermend
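The list-then-array approach described above can be sketched as follows. This is a self-contained illustration, not library code: \code{demo_info_t} is a simplified stand-in for \refstruct{pmix_info_t}, the \code{demo_list_*} names are hypothetical, and the plain structure assignment marks the place where a host would use \refapi{PMIx_Info_xfer} to copy each element.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for pmix_info_t (two fixed-size strings) */
typedef struct {
    char key[64];
    char value[64];
} demo_info_t;

/* One list item holding a single info structure */
typedef struct demo_item {
    struct demo_item *next;
    demo_info_t info;
} demo_item_t;

typedef struct {
    demo_item_t *head;
    demo_item_t *tail;
    size_t count;
} demo_list_t;

/* Append a key/value pair; returns 0 on success, -1 on allocation failure */
int demo_list_append(demo_list_t *list, const char *key, const char *value)
{
    assert(NULL != list && NULL != key && NULL != value);
    demo_item_t *item = calloc(1, sizeof(*item));
    if (NULL == item) {
        return -1;
    }
    snprintf(item->info.key, sizeof(item->info.key), "%s", key);
    snprintf(item->info.value, sizeof(item->info.value), "%s", value);
    if (NULL == list->head) {
        list->head = item;
    } else {
        list->tail->next = item;
    }
    list->tail = item;
    list->count++;
    return 0;
}

/* Allocate an array sized from the list count, copy each element in order,
 * and release the list. Returns NULL for an empty list or on allocation
 * failure (in which case the list is left intact). */
demo_info_t *demo_list_to_array(demo_list_t *list, size_t *ninfo)
{
    assert(NULL != list && NULL != ninfo);
    *ninfo = 0;
    if (0 == list->count) {
        return NULL;
    }
    demo_info_t *array = malloc(list->count * sizeof(*array));
    if (NULL == array) {
        return NULL;
    }
    size_t n = 0;
    demo_item_t *item = list->head;
    while (NULL != item) {
        demo_item_t *next = item->next;
        array[n++] = item->info;  /* a host would call PMIx_Info_xfer here */
        free(item);
        item = next;
    }
    list->head = NULL;
    list->tail = NULL;
    list->count = 0;
    *ninfo = n;
    return array;
}
```

The array size never has to be computed in advance: elements are appended as they are discovered, and the final allocation is sized from the list itself.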
\label{cptr:api_server:noderegex}A common building block used in several areas is the construction of a regular expression identifying the nodes involved in that area - e.g., the nodes in a \refterm{session} or \refterm{job}. \ac{PMIx} provides several tools to facilitate this operation, beginning by constructing an argv-like array of node names. This array is then passed to the \refapi{PMIx_generate_regex} function to create a regular expression parseable by the \ac{PMIx} server library, as shown below:
\cspecificstart
\begin{codepar}
char **nodes = NULL;
char *nodelist;
char *regex;
size_t n;
pmix_status_t rc;
pmix_info_t info;
/* loop over an array of nodes, adding each
* name to the array */
for (n=0; n < num_nodes; n++) \{
/* filter the nodes to ignore those not included
* in the target range (session, job, etc.). In
* this example, all nodes are accepted */
PMIX_ARGV_APPEND(&nodes, node[n]->name);
\}
/* join into a comma-delimited string */
nodelist = PMIX_ARGV_JOIN(nodes, ',');
/* release the array */
PMIX_ARGV_FREE(nodes);
/* generate regex */
rc = PMIx_generate_regex(nodelist, &regex);
/* release list */
free(nodelist);
/* pass the regex as the value to the PMIX_NODE_MAP key */
PMIx_Info_load(&info, PMIX_NODE_MAP, regex, PMIX_REGEX);
/* release the regex */
free(regex);
\end{codepar}
\cspecificend
Changing the filter criteria allows the construction of node maps for any level of information. A description of the returned regular expression is provided \hyperref[regex:fmt]{here}.
\label{cptr:api_server:ppnregex}A similar method is used to construct the map of processes on each node from the namespace being registered. This may be done for each information level of interest (e.g., to identify the process map for the entire \refterm{job} or for each \refterm{application} in the job) by changing the search criteria. An example is shown below for the case of creating the process map for a \refterm{job}:
\cspecificstart
\begin{codepar}
char **ndppn;
char rank[30];
char **ppnarray = NULL;
char *ppn;
char *localranks;
char *regex;
size_t n, m;
pmix_status_t rc;
pmix_info_t info;
/* loop over an array of nodes */
for (n=0; n < num_nodes; n++) \{
/* for each node, construct an array of ranks on that node */
ndppn = NULL;
for (m=0; m < node[n]->num_procs; m++) \{
/* ignore processes that are not part of the target job */
if (!PMIX_CHECK_NSPACE(targetjob,node[n]->proc[m].nspace)) \{
continue;
\}
snprintf(rank, 30, "%u", node[n]->proc[m].rank);
PMIX_ARGV_APPEND(&ndppn, rank);
\}
/* convert the array into a comma-delimited string of ranks */
localranks = PMIX_ARGV_JOIN(ndppn, ',');
/* release the local array */
PMIX_ARGV_FREE(ndppn);
/* add this node's contribution to the overall array */
PMIX_ARGV_APPEND(&ppnarray, localranks);
/* release the local list */
free(localranks);
\}
/* join into a semicolon-delimited string */
ppn = PMIX_ARGV_JOIN(ppnarray, ';');
/* release the array */
PMIX_ARGV_FREE(ppnarray);
/* generate ppn regex */
rc = PMIx_generate_ppn(ppn, &regex);
/* release list */
free(ppn);
/* pass the regex as the value to the PMIX_PROC_MAP key */
PMIx_Info_load(&info, PMIX_PROC_MAP, regex, PMIX_REGEX);
/* release the regex */
free(regex);
\end{codepar}
\cspecificend
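Because the process map must carry exactly one entry per node in the node map, a host may find it useful to sanity-check the two delimited strings before generating the regular expressions. The sketch below is one hypothetical way to do so. The \code{demo_count_entries} and \code{demo_maps_consistent} helpers are not part of \ac{PMIx}, and the check covers only entry counts and empty entries, since matching order cannot be verified from the strings alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count delimiter-separated entries in a string. Sets *has_empty to true
 * if any entry (including a leading or trailing one) is empty. */
size_t demo_count_entries(const char *list, char delim, bool *has_empty)
{
    assert(NULL != list && NULL != has_empty);
    size_t count = 1;
    size_t len = 0;  /* length of the entry currently being scanned */
    *has_empty = false;
    for (const char *p = list; '\0' != *p; p++) {
        if (delim == *p) {
            if (0 == len) {
                *has_empty = true;
            }
            count++;
            len = 0;
        } else {
            len++;
        }
    }
    if (0 == len) {
        *has_empty = true;
    }
    return count;
}

/* True when the comma-delimited node list and the semicolon-delimited
 * ppn string have the same number of entries and none of them is empty */
bool demo_maps_consistent(const char *nodelist, const char *ppn)
{
    bool empty_node, empty_ppn;
    size_t nnodes = demo_count_entries(nodelist, ',', &empty_node);
    size_t nppn = demo_count_entries(ppn, ';', &empty_ppn);
    return !empty_node && !empty_ppn && nnodes == nppn;
}
```

For example, a node list of \code{node0,node1} is consistent with a process list of \code{0,1;2,3}, but not with one that ends in an empty entry for a node hosting no processes.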
Note that the \refattr{PMIX_NODE_MAP} and \refattr{PMIX_PROC_MAP} attributes are linked in that the order of entries in the process map must match the ordering of nodes in the node map - i.e., there is no provision in the \ac{PMIx} process map regular expression generator/parser pair supporting an out-of-order node or a node that has no corresponding process map entry (e.g., a node with no processes on it). Armed with these tools, the registration \refarg{info} array can be constructed as follows:
\begin{itemize}
\item Session-level information includes all session-specific values. In many cases, only two values (\refattr{PMIX_SESSION_ID} and \refattr{PMIX_UNIV_SIZE}) are included in the registration array. Since both of these values are session-specific, they can be specified independently - i.e., in their own \refstruct{pmix_info_t} elements of the \refarg{info} array. Alternatively, they can be provided as a \refstruct{pmix_data_array_t} array of \refstruct{pmix_info_t} using the \refattr{PMIX_SESSION_INFO_ARRAY} attribute and identified by including the \refattr{PMIX_SESSION_ID} attribute in the array - this is required in cases where non-specific attributes (e.g., \refattr{PMIX_NUM_NODES} or \refattr{PMIX_NODE_MAP}) are passed to describe aspects of the session. Note that the node map can include nodes not used by the job being registered as no corresponding process map is specified.
The \refarg{info} array at this point might look like (where the labels identify the corresponding attribute - e.g., ``Session ID'' corresponds to the \refattr{PMIX_SESSION_ID} attribute):
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.3\textwidth]{figs/sessioninfo.pdf}
\end{center}
\caption{Session-level information elements}
\label{fig:sessioninfo}
\end{figure*}
\endgroup
\item Job-level information includes all job-specific values such as \refattr{PMIX_JOB_SIZE}, \refattr{PMIX_JOB_NUM_APPS}, and \refattr{PMIX_JOBID}. Since each invocation of \refapi{PMIx_server_register_nspace} describes a single \refterm{job}, job-specific values can be specified independently - i.e., in their own \refstruct{pmix_info_t} elements of the \refarg{info} array. Alternatively, they can be provided as a \refstruct{pmix_data_array_t} array of \refstruct{pmix_info_t} identified by the \refattr{PMIX_JOB_INFO_ARRAY} attribute - this is required in cases where non-specific attributes (e.g., \refattr{PMIX_NODE_MAP}) are passed to describe aspects of the job. Note that since the invocation only involves a single namespace, there is no need to include the \refattr{PMIX_NSPACE} attribute in the array.
Upon conclusion of this step, the \refarg{info} array might look like:
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.4\textwidth]{figs/jobinfo.pdf}
\end{center}
\caption{Job-level information elements}
\label{fig:jobinfo}
\end{figure*}
\endgroup
Note that in this example, \refattr{PMIX_NUM_NODES} is not required as that information is contained in the \refattr{PMIX_NODE_MAP} attribute. Similarly, \refattr{PMIX_JOB_SIZE} is not technically required as that information is contained in the \refattr{PMIX_PROC_MAP} when combined with the corresponding node map - however, there is no issue with including the job size as a separate entry.
The example also illustrates the hierarchical use of the \refattr{PMIX_NODE_INFO_ARRAY} attribute. In this case, we have chosen to pass several job-related values for each node - since those values are non-unique across the job, they must be passed in a node-info container. Note that the choice of what information to pass into the \ac{PMIx} server library versus what information to derive from other values at time of request is left to the host environment. \ac{PMIx} implementors in turn may, if they choose, pre-parse registration data to create expanded views (thus enabling faster response to requests at the expense of memory footprint) or to compress views into tighter representations (thus trading minimized footprint for longer response times).
\item Application-level information includes all application-specific values such as \refattr{PMIX_APP_SIZE} and \refattr{PMIX_APPLDR}. If the \refterm{job} contains only a single \refterm{application}, then the application-specific values can be specified independently - i.e., in their own \refstruct{pmix_info_t} elements of the \refarg{info} array - or as a \refstruct{pmix_data_array_t} array of \refstruct{pmix_info_t} using the \refattr{PMIX_APP_INFO_ARRAY} attribute and identified by including the \refattr{PMIX_APPNUM} attribute in the array. Use of the array format is required in cases where non-specific attributes (e.g., \refattr{PMIX_NODE_MAP}) are passed to describe aspects of the application.
However, in the case of a job consisting of multiple applications, all application-specific values for each application must be provided using the \refattr{PMIX_APP_INFO_ARRAY} format, each identified by its \refattr{PMIX_APPNUM} value.
Upon conclusion of this step, the \refarg{info} array might look like that shown in \ref{fig:appinfo}, assuming there are two applications in the job being registered:
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.5\textwidth]{figs/appinfo.pdf}
\end{center}
\caption{Application-level information elements}
\label{fig:appinfo}
\end{figure*}
\endgroup
\item Process-level information includes an entry for each process in the job being registered, each entry marked with the \refattr{PMIX_PROC_INFO_ARRAY} attribute. The \refterm{rank} of the process must be the first entry in the array - this provides efficiency when storing the data. Upon conclusion of this step, the \refarg{info} array might look like the diagram in \ref{fig:procinfo}:
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.8\textwidth]{figs/procinfo.pdf}
\end{center}
\caption{Process-level information elements}
\label{fig:procinfo}
\end{figure*}
\endgroup
\item For purposes of this example, node-level information only includes values describing the local node - i.e., it does not include information about other nodes in the job or session. In many cases, the values included in this level are unique to it and can be specified independently - i.e., in their own \refstruct{pmix_info_t} elements of the \refarg{info} array. Alternatively, they can be provided as a \refstruct{pmix_data_array_t} array of \refstruct{pmix_info_t} using the \refattr{PMIX_NODE_INFO_ARRAY} attribute - this is required in cases where non-specific attributes are passed to describe aspects of the node, or where values for multiple nodes are being provided.
The node-level information requires two elements that must be constructed in a manner similar to that used for the node map. The \refattr{PMIX_LOCAL_PEERS} value is computed based on the processes on the local node, filtered to select those from the job being registered, as shown below using the tools provided by \ac{PMIx}:
\cspecificstart
\begin{codepar}
char **ndppn = NULL;
char rank[30];
char *localranks;
size_t m;
pmix_info_t info;
for (m=0; m < mynode->num_procs; m++) \{
/* ignore processes that are not part of the target job */
if (!PMIX_CHECK_NSPACE(targetjob,mynode->proc[m].nspace)) \{
continue;
\}
snprintf(rank, 30, "%u", mynode->proc[m].rank);
PMIX_ARGV_APPEND(&ndppn, rank);
\}
/* convert the array into a comma-delimited string of ranks */
localranks = PMIX_ARGV_JOIN(ndppn, ',');
/* release the local array */
PMIX_ARGV_FREE(ndppn);
/* pass the string as the value to the PMIX_LOCAL_PEERS key */
PMIx_Info_load(&info, PMIX_LOCAL_PEERS, localranks, PMIX_STRING);
/* release the list */
free(localranks);
\end{codepar}
\cspecificend
The \refattr{PMIX_LOCAL_CPUSETS} value is constructed in a similar manner. In the provided example, it is assumed that an \ac{HWLOC} cpuset representation (a comma-delimited string of processor IDs) of the processors assigned to each process has previously been generated and stored on the process description. Thus, the value can be constructed as shown below:
\cspecificstart
\begin{codepar}
char **ndcpus = NULL;
char *localcpus;
size_t m;
pmix_info_t info;
for (m=0; m < mynode->num_procs; m++) \{
/* ignore processes that are not part of the target job */
if (!PMIX_CHECK_NSPACE(targetjob,mynode->proc[m].nspace)) \{
continue;
\}
PMIX_ARGV_APPEND(&ndcpus, mynode->proc[m].cpuset);
\}
/* convert the array into a colon-delimited string */
localcpus = PMIX_ARGV_JOIN(ndcpus, ':');
/* release the local array */
PMIX_ARGV_FREE(ndcpus);
/* pass the string as the value to the PMIX_LOCAL_CPUSETS key */
PMIx_Info_load(&info, PMIX_LOCAL_CPUSETS, localcpus, PMIX_STRING);
/* release the list */
free(localcpus);
\end{codepar}
\cspecificend
Note that for efficiency, these two values can be computed at the same time.
\end{itemize}
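The observation that the local peers and local cpusets values can be computed at the same time can be sketched in a single pass over the local processes. This example is self-contained and uses no \ac{PMIx} calls: \code{demo_proc_t}, \code{demo_append}, and \code{demo_local_strings} are hypothetical names, and plain \code{realloc}-based string building stands in for the argv helper macros shown earlier.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one process on the local node */
typedef struct {
    const char *nspace;  /* namespace the process belongs to */
    unsigned rank;       /* rank within that namespace */
    const char *cpuset;  /* comma-delimited processor IDs */
} demo_proc_t;

/* Append a non-empty piece to *str, preceded by delim if *str already has
 * content. Returns 0 on success, -1 on failure (*str is left unchanged). */
int demo_append(char **str, char delim, const char *piece)
{
    assert(NULL != str && NULL != piece && '\0' != piece[0]);
    size_t oldlen = (NULL == *str) ? 0 : strlen(*str);
    size_t sep = (oldlen > 0) ? 1 : 0;
    char *tmp = realloc(*str, oldlen + sep + strlen(piece) + 1);
    if (NULL == tmp) {
        return -1;
    }
    if (sep) {
        tmp[oldlen] = delim;
    }
    strcpy(tmp + oldlen + sep, piece);
    *str = tmp;
    return 0;
}

/* Build both strings in one loop: ranks joined by ',' and cpusets joined
 * by ':', considering only processes from the target job. On success the
 * caller owns (and must free) *peers and *cpusets. */
int demo_local_strings(const demo_proc_t *procs, size_t nprocs,
                       const char *targetjob, char **peers, char **cpusets)
{
    char rank[30];
    *peers = NULL;
    *cpusets = NULL;
    for (size_t m = 0; m < nprocs; m++) {
        /* ignore processes that are not part of the target job */
        if (0 != strcmp(targetjob, procs[m].nspace)) {
            continue;
        }
        snprintf(rank, sizeof(rank), "%u", procs[m].rank);
        if (0 != demo_append(peers, ',', rank) ||
            0 != demo_append(cpusets, ':', procs[m].cpuset)) {
            free(*peers);
            free(*cpusets);
            *peers = NULL;
            *cpusets = NULL;
            return -1;
        }
    }
    return 0;
}
```

Building both strings in the same loop guarantees that the n-th cpuset entry corresponds to the n-th rank in the peers string, since both are appended for the same process in the same iteration.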
The final \refarg{info} array might therefore look like the diagram in \ref{fig:nodeinfo}:
\begingroup
\begin{figure*}[ht!]
\begin{center}
\includegraphics[clip,width=0.8\textwidth]{figs/nodeinfo.pdf}
\end{center}
\caption{Final information array}
\label{fig:nodeinfo}
\end{figure*}
\endgroup
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_server_deregister_nspace}}
\declareapi{PMIx_server_deregister_nspace}
%%%%
\summary
Deregister a namespace.
%%%%
\format
\copySignature{PMIx_server_deregister_nspace}{1.0}{
void PMIx_server_deregister_nspace(const pmix_nspace_t nspace, \\
\hspace*{24\sigspace}pmix_op_cbfunc_t cbfunc, void *cbdata);
}
\begin{arglist}
\argin{nspace}{Namespace (string)}
\argin{cbfunc}{Callback function \refapi{pmix_op_cbfunc_t}. A \code{NULL} function reference indicates that the function is to be executed as a blocking operation. (function reference)}
\argin{cbdata}{Data to be passed to the callback function (memory reference)}
\end{arglist}
%%%%
\descr
Deregister the specified \refarg{nspace} and purge all objects relating to it, including any client information from that namespace.
This is intended to support persistent \ac{PMIx} servers by providing an opportunity for the host \ac{RM} to tell the \ac{PMIx} server library to release all memory for a completed job. Note that the library must not invoke the callback function prior to returning from the \ac{API}, and that a \code{NULL} \refarg{cbfunc} reference indicates that the function is to be executed as a blocking operation.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_server_register_resources}}
\declareapi{PMIx_server_register_resources}
%%%%
\summary
Register non-namespace related information with the local \ac{PMIx} server library.
%%%%
\format
\copySignature{PMIx_server_register_resources}{4.0}{
pmix_status_t \\
PMIx_server_register_resources(pmix_info_t info[], size_t ninfo, \\
\hspace*{31\sigspace}pmix_op_cbfunc_t cbfunc, \\
\hspace*{31\sigspace}void *cbdata);
}
\begin{arglist}
\argin{info}{Array of info structures (array of handles)}
\argin{ninfo}{Number of elements in the \refarg{info} array (integer)}
\argin{cbfunc}{Callback function \refapi{pmix_op_cbfunc_t}. A \code{NULL} function reference indicates that the function is to be executed as a blocking operation (function reference)}
\argin{cbdata}{Data to be passed to the callback function (memory reference)}
\end{arglist}
%%%%
\descr
Pass information about resources not associated with a given namespace to the \ac{PMIx} server library for distribution to local client processes. This includes information on fabric devices, \acp{GPU}, and other resources. All information provided through this \ac{API} shall be made available to each job as part of its job-level information. Duplicate information provided with the \refapi{PMIx_server_register_nspace} \ac{API} shall override any information provided by this function for that namespace, but only for that specific namespace.
\returnsimple
\advicermstart
Note that information passed in this manner could also have been included in a call to \refapi{PMIx_server_register_nspace} - e.g., as part of a \refattr{PMIX_NODE_INFO_ARRAY} array. This \ac{API} is provided as a logical alternative for code clarity, especially where multiple jobs may be supported by a single \ac{PMIx} server library instance, to avoid multiple registration of static resource information.
A \code{NULL} \refarg{cbfunc} reference indicates that the function is to be executed as a blocking operation.
\advicermend
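The precedence rule between the two registration paths can be pictured as a two-level lookup. The sketch below is only an illustration under stated assumptions: \code{demo_kv_t}, \code{demo_lookup}, and \code{demo_job_lookup} are hypothetical, values are simplified to strings, and an actual library is free to organize its storage differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified key/value entry standing in for stored registration data */
typedef struct {
    const char *key;
    const char *value;
} demo_kv_t;

/* Linear search of one store; returns the value or NULL if absent */
static const char *demo_lookup(const demo_kv_t *kv, size_t nkv,
                               const char *key)
{
    for (size_t n = 0; n < nkv; n++) {
        if (0 == strcmp(kv[n].key, key)) {
            return kv[n].value;
        }
    }
    return NULL;
}

/* Resolve a key for one job: data registered with the namespace takes
 * precedence, and resource-level data is used only as a fallback. Other
 * namespaces are unaffected because each has its own nspace_data store. */
const char *demo_job_lookup(const demo_kv_t *nspace_data, size_t nns,
                            const demo_kv_t *resources, size_t nres,
                            const char *key)
{
    assert(NULL != key);
    const char *value = demo_lookup(nspace_data, nns, key);
    if (NULL != value) {
        return value;
    }
    return demo_lookup(resources, nres, key);
}
```

A job whose namespace registration supplies its own value for a key sees that value, while a job that supplied none sees the value registered at the resource level.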
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_server_deregister_resources}}
\declareapi{PMIx_server_deregister_resources}
%%%%
\summary
Remove specified non-namespace related information from the local \ac{PMIx} server library.
%%%%
\format
\copySignature{PMIx_server_deregister_resources}{4.0}{
pmix_status_t \\
PMIx_server_deregister_resources(pmix_info_t info[], size_t ninfo, \\
\hspace*{33\sigspace}pmix_op_cbfunc_t cbfunc, \\
\hspace*{33\sigspace}void *cbdata);
}
\begin{arglist}
\argin{info}{Array of info structures (array of handles)}
\argin{ninfo}{Number of elements in the \refarg{info} array (integer)}
\argin{cbfunc}{Callback function \refapi{pmix_op_cbfunc_t}. A \code{NULL} function reference indicates that the function is to be executed as a blocking operation (function reference)}
\argin{cbdata}{Data to be passed to the callback function (memory reference)}
\end{arglist}
%%%%
\descr
Remove information about resources not associated with a given namespace from the \ac{PMIx} server library. Only the \refarg{key} fields of the provided \refarg{info} array shall be used for the operation - the associated values shall be ignored except where they serve as qualifiers to the request. For example, to remove a specific fabric device from a given node, the \refarg{info} array might include a \refattr{PMIX_NODE_INFO_ARRAY} containing the \refattr{PMIX_NODEID} or \refattr{PMIX_HOSTNAME} identifying the node hosting the device, and the \refattr{PMIX_FABRIC_DEVICE_NAME} specifying the device to be removed. Alternatively, the device could be removed using only the \refattr{PMIX_DEVICE_ID} as this is unique across the overall system.
\returnsimple
\advicermstart
As information not related to namespaces is considered \emph{static}, there is no requirement that the host environment deregister resources prior to finalizing the \ac{PMIx} server library. The server library shall properly clean up as part of its normal finalize operations. Deregistration of resources is only required, therefore, when the host environment determines that client processes should no longer have access to that information.
A \code{NULL} \refarg{cbfunc} reference indicates that the function is to be executed as a blocking operation.
% TODO: If this is one of those functions that can be blocking or non-blocking, then
% it needs a full return code explanation, not the returnsimple
\advicermend
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{\code{PMIx_server_register_client}}
\declareapi{PMIx_server_register_client}
%%%%
\summary
Register a client process with the \ac{PMIx} server library.
%%%%
\format
\copySignature{PMIx_server_register_client}{1.0}{
pmix_status_t \\
PMIx_server_register_client(const pmix_proc_t *proc, \\
\hspace*{24\sigspace}uid_t uid, gid_t gid, \\
\hspace*{24\sigspace}void *server_object, \\
\hspace*{24\sigspace}pmix_op_cbfunc_t cbfunc, void *cbdata);
}
\begin{arglist}
\argin{proc}{\refstruct{pmix_proc_t} structure (handle)}
\argin{uid}{user id (integer)}
\argin{gid}{group id (integer)}
\argin{server_object}{(memory reference)}
\argin{cbfunc}{Callback function \refapi{pmix_op_cbfunc_t}. A \code{NULL} function reference indicates that the function is to be executed as a blocking operation (function reference)}
\argin{cbdata}{Data to be passed to the callback function (memory reference)}
\end{arglist}
\returnsimplenb
\returnstart
\begin{itemize}
\item \refconst{PMIX_OPERATION_SUCCEEDED}, indicating that the request was immediately processed and returned \textit{success} - the \refarg{cbfunc} will not be called
\end{itemize}
\returnend
%%%%