forked from bsc-performance-tools/extrae
ChangeLog
[ + added, - removed, * changed ]
* 13May2022 Removes autogenerated INSTALL from git and updates README with instructions to install from git (closes #65)
* 12May2022 Changes config macros variable assignments to make them POSIX compliant (closes #56)
* 12May2022 Fixes allocated memory tracker to prevent infinite recursive calls (closes #66)
* 11May2022 Recovers merger flags [-no]-translate-data-addresses removed by commit cf9e823b32e1441d9b426dc37e7ed7320dd8a59c
* 11May2022 Fixes incorrect sequence of initialization of CUDA instrumentation components
* 02May2022 Fixes uninitialized HWC structures that were causing abnormally high counter values
+ 08Apr2022 Adds --enable-riscv configure option so ARCH_RISCV64 is enabled and PC is correctly retrieved
* 07Apr2022 GASPI labels are not written in the PCF when there are no events present in the trace
* 31Mar2022 Adds PAPI linker flags to parallel merger and sets merger default synchronization to by_node
* 28Mar2022 Fixes check for optional headers in AX_FIND_INSTALLATION
* 23Mar2022 Applies corrections in MPI_Comm_spawn support
* 08Mar2022 Reworks synchronization methods to work with and among multiple apps
+ 08Mar2022 Adds option to disable rpath per dependency
+ 04Mar2022 Adds option to disable rpath in the objects built
* 02Mar2022 Fixes OpenACC states
* 25Feb2022 Adds rpath to MPI libraries and improves the AX_FIND_INSTALLATION macro
* 16Feb2022 Changes user function instrumentation to work with function names only, using function addresses as fallback
+ 16Feb2022 Adds support for GOMP_taskloop_ull and changes interposition mechanism for a more robust one
+ 24Jan2022 Adds OPENACC instrumentation in host and CUDA devices
* 14Jan2022 Reworks allocated memory tracker and removes need for critical zones
+ 04Jan2022 Adds XML option to exclude some MPI_Comm_* calls from tracing
* 16Dec2021 Assigns same IDs to Infiniband counters ignoring Mellanox driver version
* 09Nov2021 Updates and upgrades CUDA support
* 09Nov2021 Ensures native and uncore counters are uniquely identified
* 19Oct2021 Fixes trace control file checks, skipping them if needed structures are not allocated
* 09Oct2020 GASPI parameters now start at 1 and PCF has labels to their actual values
+ 01Oct2020 Adds instrumentation support for gaspi_queue_create/delete
+ 30Sep2020 Adds instrumentation support for gaspi_read_notify & gaspi_read_list_notify
* 27Sep2021 Fixes states in OpenMP taskgroup and taskloop regions
+ 22Sep2021 Adds dependency lines between instantiation and execution of OpenMP tasks for GNU libgomp runtime
+ 09Sep2020 Adds callers at GASPI instrumentation points
* 07Sep2020 Fixes notification_id parameter for gaspi_notify_waitsome & gaspi_notify_reset
+ 06Sep2021 Adds missing wrapper for __kmpc_critical_with_hint
+ 31Aug2021 Extends communications matching algorithm to support applications calling MPI_Irecv and MPI_Wait from different threads
* 24Aug2021 Fixes unreleased mutex in xtr_hash_query and xtr_hash_add
* 05Aug2021 Creates new event handler for ADD_RESERVED_MEM_EV and SUB_RESERVED_MEM_EV
* 05Aug2021 Fix wrong arguments for mpi_win_lock and mpi_compare_and_swap wrappers
* 03Aug2021 Makes CPUID event start at 1
* 03Aug2021 Sampling temporal files not created if disabled in XML
+ 03Aug2021 Reads PC on RISC-V architecture to add support for time sampling
* 06Jul2021 Adds L3 store misses offcore counter PEBS sampling (closes #62)
* 06Jul2021 Adds new option ([-no]-translate-data-addresses) to mpi2prv (closes #61)
* 06Jul2021 Fixes calloc() behaviour (closes #60)
* 06Jul2021 Fixes realloc comment in mpi2prv (closes #59)
* 06Jul2021 Fix for missing symbols of libraries loaded dynamically (closes #58)
+ 27May2021 Adds support for MPI_Dist_graph_create_adjacent and fixes bug in xtr_MPI_Comm_neighbors_count parameters
* 26Jan2021 Checks if EXTRAE_PY_CEVENTS is not empty before checking its value
+ 29Sep2020 Adds configure option to disable instrumentation of pthread_cond_* calls
* 21Aug2020 Fixes bug instrumenting user code inside OpenMP
* 19Aug2020 Fixes wrong exit event in cudaFree and cudaMemset
* 12Aug2020 Modifications to support CUDA10
+ 14Jul2020 Adds missing wrappers/CUPTI and a catch-all for CUPTI uninstrumented routines
+ 03Jul2020 Enforces a second file system synchronization for the merger on the TRACE.mpits file
+ 03Jul2020 Adds Makefile.extrae_module to distribution to compile extrae_module.f90
+ 29Jun2020 Adds instrumentation for close and fclose IO calls
* 26Jun2020 Synchronization points are now stored in the local SYM to allow MPI_Init to be called from threads other than 0
* 25Jun2020 Changes call to MPI_Barrier into PMPI_Barrier
* 23Jun2020 Fixes wrong JAVAH definition and conditional check
* 23Jun2020 Fixes for dyninst make check tests and configure checks for libsynapse
* 23Jun2020 Forces exit on second termination signal
* 22Jun2020 Changes to use new binutils >= 2.34 API
* 22Jun2020 Changes merger block distribution to avoid processes with 0 traces
* 22Jun2020 Fixes checks for `javah` to compile Extrae with JDK > 9
* 19Jun2020 Updates common.h to meet C++11 syntax
* 12Jun2020 Solves duplicate capture of C and Fortran MPI calls when using Intel implementations
* 08Jun2020 Fixes finalization issues when Extrae is initialized by API and another runtime
* 26May2020 Adds option to instrument I/O internals
+ 25May2020 Fixes race condition in pthread instrumentation and adds wrappers for pthread_cond_signal, pthread_cond_broadcast, pthread_cond_wait, pthread_cond_timedwait
* 29Apr2020 Fixes local to world rank translations for comms going through an intercommunicator created with MPI_Intercomm_create
+ 29Apr2020 Adds environment variable EXTRAE_ENFORCE_FS_SYNC to ensure file system synchronization after creating trace files and folders
* 17Mar2020 Changes for Workflows and Distributed Computing group
+ 13Mar2020 Added instrumentation for MPI_Comm_dup_with_info
+ 28Jan2020 Adds events ADD_RESERVED_MEM_EV and SUB_RESERVED_MEM_EV to track dynamic memory usage
* 22Jan2020 Fixes race condition in flushing/closing buffers between Backend_Finalize and Backend_Flush_pThread
* 20Jan2020 Fixes circular buffering to dump remaining data at the end of the run outside online mode
+ 13Jan2020 Adds xml option and flag to the merger to stop the process at a given percentage
+ 18Nov2019 Adds instrumentation wrappers for MPI_Comm_split_type
+ 07Nov2019 Adds environment variable EXTRAE_UNSET_PRELOAD to clear the LD_PRELOAD once the tracing is loaded
* 25Oct2019 Fixes double emission of python events call and c_call
* 10Sep2019 Fixes configure check for MPI_Comm_spawn
* 09Sep2019 Fixes online install-data-hook directory creation, others substitution, & misspellings (closes #33 #34 #36)
* 04Sep2019 Adds support for GASPI next branch and adds notification_id and queue_id parameters
+ 17Jul2019 Adds Python bindings for Extrae_shutdown and _restart
* 16Jul2019 Fixes correlation of PEBS samples with reallocated objects
+ 09Jul2019 Adds wrappers for MPI split collective data access routines
* 05Jul2019 Fixes bug in MPI_Test & MPI_Imrecv where the request is not properly copied for later processing
* 21Jun2019 Fixes potentially incomplete paths in Makefiles (closes #22)
* 13Jun2019 Adds support to read MPI task identifier from `PBS` environment variables
* 12Jun2019 Guards the definition of SaveMessage and ProcessMessage only under MPI3
* 11Jun2019 Replaces request's hash table for a new implementation
* 11Jun2019 Fixes bug in MPI_Wait where the input request is not copied for later processing
* 11Jun2019 Adds instrumentation support for GASPI
* 03Jun2019 Increases static buffer for `calloc` to 8MB
* 03Jun2019 Fixes libbfd detection in native arm64 machines
* 03Jun2019 Fixes Dyninst URL
* 25Apr2019 Fixes race conditions between thread creations and PEBS samples
* 17Apr2019 Fixes bug in pebs sampling store misses identification
+ 02Apr2019 Adds support for changing num_threads in OpenMP parallel constructs
* 29Mar2019 Changes order in which MPI env vars are checked to discover process' task id during auto init
* 28Mar2019 Extends support for PEBS sampling on Skylake processors, xml 'period' attribute changed to 'frequency'
+ 27Mar2019 Adds missing user_lock routines in kmp runtime
* 27Mar2019 Removes warnings for ambiguous `if` and non-std99 compliant `for`
* 27Mar2019 Fixes incorrect states for MPI_Win_flush_* calls
* 27Mar2019 Breaks infinite loop in signal handler (closes #19)
* 26Mar2019 Installs Python modules only if their support is enabled by `configure`
* 26Mar2019 Fixes bug in configure mis-detecting libiberty even if it is not available (closes #7)
- 26Mar2019 Reverts partial support for nested OpenMP in Intel KMPC runtime
+ 26Mar2019 Adds instrumentation support for Mprobe, Improbe, Mrecv, Imrecv
* 26Mar2019 Reverts commit 1a215e849f to stop using dladdr to translate UF from dynamic libraries
* 25Mar2019 Fixes missing worksharings in GNU and Intel OpenMP
+ 25Mar2019 Adds pyextrae.pthreads module
* 25Mar2019 Fixes timestamp of rusage and memusage events
+ 25Mar2019 Adds functions in pyextrae to toggle the profiler on/off
* 25Mar2019 Fixes compatibility issue with cudaStreamLegacy available since CUDA 7
* 25Mar2019 Fixes bug in initialization of OmpSs tracing (only master thread was detected)
+ 06Mar2019 Adds GASPI and GASPI+OMP examples (closes #41)
* 24Jan2019 Always use POSIX clock except if it is disabled setting the EXTRAE_USE_POSIX_CLOCK environment variable to 0
* 19Sep2018 Fixes bug in variable declaration in 'free' wrapper
* 18Sep2018 Fixes bug in OpenMP common wrapper hook points
* 18Sep2018 Removes debug for OpenMP Fortran wrappers
* 17Sep2018 Fixes race condition when issuing install in a parallel make (closes #18)
* 17Sep2018 Fixes memory corruption when reading CPU frequency (closes #17)
* 17Sep2018 Translates UF symbol addresses to find symbols in dynamic libraries
* 17Sep2018 Increases maximum number of threads for IBM XL OMP runtime helpers to 256 and removes checks when the IBM runtime is not hooked
* 17Sep2018 Adds wrapper for taskgroup construct in Intel runtime
* 17Sep2018 Fixes dlsym loop when trying to obtain the pointer to calloc
* 17Sep2018 Solves infinite loop when getting the real pointer for 'free'
* 09Jul2018 Fixes compatibility of pyextrae with Python 3
* 13Jun2018 Activates library auto-init by default in all tracing libraries (closes #5 #12)
* 06Jun2018 Implements OpenMP Fortran wrappers
* 05Jun2018 Refactors functions and structures used in the requests hash table
* 01Jun2018 Clarifies mpimpi2prv error message when running with 1 task
* 01Jun2018 Fixes wrong include in UF_xl_instrument.c
* 01Jun2018 Compile MPI_Fetch_and_op, MPI_Compare_and_swap and MPI_Win_flush* only if MPI3 is supported
* 31May2018 Fixes cudaStreamDestroy wrappers
+ 29May2018 Adds missing CUDA wrappers and callbacks for the creation and destruction of streams
+ 29May2018 Add $DESTDIR to src/others/Makefile.am to be able to create RPMs
+ 22May2018 Adds instrumentation support for MPI_Fetch_and_op, MPI_Compare_and_swap and MPI_Win_flush* routines (closes #2)
* 18May2018 Fixed wrong format of tag in the prv trace that resulted in negative comm tags
* 17May2018 Adds extrae_module.f90 to distribution
+ 15May2018 Added configure check for MPI_Get_accumulate, BullMPI does not implement this routine
+ 11May2018 Added support for ARM64 cross-compilation (GitHub pull #11)
+ 10May2018 Added instrumentation wrapper for ioctl()
* 06Apr2018 Fixed missing states for MPI_*neighbor* routines
* 20Mar2018 Added instrumentation support for MPI_*neighbor* routines
* 19Mar2018 Allows defining a trace-control gops interval starting at 0
* 15Mar2018 Fixes segfault when EXTRAE_CONFIG_FILE is wrong (GitHub issue #1)
* 28Feb2018 Fix invalid read in WriteFileBuffer_delete (GitHub pull #9)
* 28Feb2018 Adds R/W lock in OMPT-helper functions (GitHub pull #6)
* 13Feb2018 Adds message to warn about merger not able to open the binary to translate addresses
* 09Feb2018 Changes index calculation when there is no nesting
* 23Jan2018 Fixes Testsome wrapper calling PMPI_Waitsome instead of PMPI_Testsome (GitHub issue #3)
* 09Jan2018 Added mutex to protect buffers' double frees from dying pthreads
* 08Jan2018 Adds Skylake support in PEBS
* 19Dec2017 Defer the initialization of each IO, DYNAMIC MEMORY and SYSTEM calls to the first time they are used
* 13Dec2017 Increase maximum events for buffer transactions
* 29Nov2017 Refactor utils function names
+ 09Nov2017 Extended pyextrae API with calls to nevent and neventandcounters
* 27Oct2017 No longer set a value for OMP_NUM_THREADS if it was not defined by the user to avoid conflicts with schedulers that assign threads through omp_set_num_threads
+ 26Oct2017 Added header file extrae_version.h
* 25Oct2017 Fixes compatibility issue accessing struct ucontext
* 16Oct2017 Reduces maximum number of arguments in intel-kmpc wrappers
+ 13Oct2017 Added fallback mechanism for taskloop instrumentation that recovers from runtime internal copies of the instrumented parameters
* 10Oct2017 Changes RECHECK_INIT macro so it also allocates nested helpers
* 29Sep2017 Fixes missing OMP user functions in parallel loops
* 29Sep2017 Fixes invalid reference to task_helper in GOMP_task and callme_task
* 21Sep2017 Sampling macros now check if task events have to be stored
* 10Aug2017 Fixes confusing error message when merger reaches quota limit
* 20Jul2017 Fixed wrong dependencies with OMPT in OpenMP libraries
* 13Jul2017 Upgrade to v3.5.0
+ 12Jul2017 New documentation using sphinx-doc
* 12Jul2017 Fixes configure check for without-cuda and without-cupti
* 10Jul2017 Make the I/O labels appear in the PCF only if each specific call has been used during the run
* 10Jul2017 Include elapsed time outside MPI in Test* calls
+ 14Jun2017 Added instrumentation support for OpenMP taskloop and ordered directives
* 06Jun2017 Make DLB available by default
+ 01Jun2017 Emits cpu_event for each OpenMP thread at least once
* 30May2017 Clears compilation warnings and MPI_HAS_MPI_F_STATUS_IGNORE bug
* 30May2017 Compiles and distributes a Fortran module to use the Extrae API with Fortran programs
+ 17May2017 Added instrumentation support for mpi_reduce_scatter_block, mpi_ireduce_scatter_block, mpi_alltoallw, mpi_ialltoallw, mpi_win_lock, mpi_win_unlock, mpi_get_accumulate
+ 12May2017 Added compatibility with new events in CUPTI 5 for stream creation
* 10May2017 Upgraded pyextrae support for Python 3 and added submodule for CUDA tracing
+ 02May2017 Adds instrumentation for kmpc dynamic memory routines
* 27Apr2017 Extrae_get_version API call now relies on the configure.ac value instead of using a separate include file
* 26Apr2017 Initializes a different PEBS fd for every thread
* 26Apr2017 Don't exit when the binary is linked against multiple OpenMP runtimes
* 24Apr2017 Applied patch to emit a different event for the allocation id events
* 20Apr2017 Fixes typo in the convenience library for Intel KMPC wrappers
* 12Apr2017 Fixes CUDA tracing when using CUPTI
* 12Apr2017 Fixed missing dependency with librt in clocks module
* 11Apr2017 Changed PEBS initialization to activate in all the threads
* 04Apr2017 Unifies GNU OpenMP APIs in a single library
* 23Mar2017 Added missing initialization of the control variables that track whether we are in instrumentation or sampling; the omission prevented PEBS sampling from triggering
* 22Mar2017 Fixed configure checks for Java instrumentation support
* 15Mar2017 pyextrae no longer requires a list of functions to instrument to activate the tracing
* 03Mar2017 Fixed missing python scripts after make install
* 14Feb2017 Fixed conflict between sampling and I/O tracing where we would capture I/O calls inside the sampling handler
* 13Feb2017 Fixed critical bug in __kmpc_fork_call due to the use of the non-reentrant par_func variable to store the user task pointer
* 13Feb2017 Upgraded Dyninst compatibility to series 9.x
* 31Jan2017 Fixes bug in libdwarf detection macro
* 12Jan2017 Fixes shared-libraries tests being compiled during 'make' instead of during 'make check'
+ 28Dec2016 Add instrumentation support for sched_yield syscall and others
+ 27Dec2016 Added support for Python MPI and multiprocessing tracing
* 13Dec2016 Adds new build information into configured.sh script
* 13Dec2016 Adds tracing of MPI IO size in Fortran
* 01Dec2016 Increases default buffer size in the examples by x10
* 17Nov2016 Patched IO calls again to save/restore the value of errno twice, before and after calling the real I/O symbol
* 14Nov2016 Builds additional C/Fortran tracing library by default
* 14Nov2016 Fixes MPI_Ialltoallv label
- 08Nov2016 Removed flag -Wall from tests/overhead
* 08Nov2016 Patched IO calls to save/restore the value of errno at entry/exit of the wrappers
* 07Nov2016 Fixes PEBS memory translations
* 27Oct2016 Fixes Fortran interfaces for MPI3 non-blocking collective calls
* 21Oct2016 Changed checks in IO wrappers to ensure the compiler doesn't optimize the conditions and tests for an invalid THREADID
* 14Oct2016 Fixed interface and wrapper for MPI_Intercomm_create that had a recursion
* 14Oct2016 Added event for memkind partition
* 11Oct2016 Fixes online and dyninst tests
* 10Oct2016 Fixes IO segfault when opening non-existing file and enables it by default
* 10Oct2016 Updated configure check for newer Cray XT systems
* 23Sep2016 Upgraded to v3.4.1
* 23Sep2016 Fixed Java and MPI checks
* 23Sep2016 Run JAVA checks with make check instead of with make installcheck
* 22Sep2016 Fixes for make check
+ 14Sep2016 Adds XML counter distribution (thread-cyclic) to allow each thread to read a different counter set
+ 12Sep2016 Adds states for memory allocation calls
* 08Sep2016 Changed management of MPI_Cancel calls
+ 07Sep2016 Adds event marking the periodicity of CPU events emission
* 06Sep2016 Changed environment scripts extrae.sh and configured.sh to locate the EXTRAE_HOME path automatically
* 06Sep2016 Fixes BFD configure check
* 02Sep2016 Removes strict XML version check
* 08Aug2016 Fixes make uninstall
* 04Aug2016 Fixed bug in configure macro to check for pebs sampling
* 29Jul2016 Upgraded to v3.4.0
* 28Jul2016 Adds XML option (cpu-events) to specify the emission frequency of CPU events
* 26Jul2016 Repeat file open if the assigned fd for the trace file is 0
* 14Jul2016 Fixed bug in malloc wrappers of null THREADID in combination with OpenMP
* 08Jul2016 Removes bootstrap warning due to shared-libraries test Makefile.am debug rules.
* 06Jul2016 Fixes shared-libraries test
+ 06Jul2016 Added posix_memalign to the list of memory wrappers
+ 05Jul2016 Always enable POSIX clock and control it via EXTRAE_USE_POSIX_CLOCK environment variable.
* 04Jul2016 Enable PEBS sampling by default
+ 28Jun2016 Adds .gitignore with build generated files
+ 28Jun2016 Added instrumentation wrappers for memkind_malloc* routines
* 28Jun2016 Fixes tests that fail when Extrae is built from within a folder
* 15Jun2016 Consider callers when computing buffer size for IO calls
* 13Jun2016 Fix compilation error when enabling heterogeneous
* 09Jun2016 JAVA tracer tests now run after installation.
            Fix ompss tests reference files.
            Fix MPI Iprobe_wait test.
+ 03Jun2016 Add recording of MPI one-sided operations' size
* 23May2016 Rank 0 was exiting if XML file couldn't be opened, leaving the others stalled in a collective
+ 17May2016 Add wrappers and merger support to instrument open, fopen and the string name of the opened files
* 13May2016 Add MPI_Finalize to dlsym PMPI hook
+ 12May2016 Add configure option to use dlsym as PMPI hook
* 12May2016 Fixed table of states colors with missing states
+ 12May2016 Added wrappers to instrument both 32 and 64-bit I/O calls preadv, pwritev, preadv64, pwritev64
* 11May2016 Remove SVN revision and branch check and creation of SVN-branch and SVN-revision files
* 04May2016 Added the missing case MPI_WIN_CREATE_EV in the Get_State function
* 29Apr2016 Added check for POE environments in the launcher of the online_root
+ 28Apr2016 Added tracing finalization mechanism for the OMPT target devices, started by Extrae at Backend_Finalization using calls to ompt_target_stop_trace.
+ 27Apr2016 Add support for IBM Platform MPI
* 27Apr2016 Bug fixes in the instrumentation of OMPT target devices
+ 25Apr2016 Added instrumentation for the following IO calls: fread, fwrite, pread, pwrite, readv, writev, preadv, pwritev
* 21Apr2016 Update m4 macro to look in lib/x86_64-linux-gnu (support Ubuntu)
* 08Apr2016 Changed the default PAPI counters in LINUX examples to avoid errors with the latest versions of PAPI and PAPI_BR_MSP
* 07Apr2016 Removed flag -shared from the lib_dyn_mpitracec_la_LDFLAGS and lib_dyn_mpitracef_la_LDFLAGS, which was set to tell libtool just to generate shared libs (no .a's), but this generated .so's with two dependencies to MRNet, one resolved and one unresolved.
* 07Apr2016 Extrae version 3.3.0 final release
* 07Apr2016 Added initial support for tracing of OMPT target devices (accelerators, FPGAs, etc)
* 06Apr2016 Prioritize the search for mpicc family of compilers under bin64 before bin
* 29Mar2016 Fix trace.sh within DynInst examples (CUDA+MPI and MPI)
* 24Mar2016 Emit only once types associated to Extrae_register_codelocation_type & Extrae_register_function_address
* 22Mar2016 Fixes in extrae-cmd / Added documentation for Extrae-cmd
* 11Mar2016 read()/write() now emit callstack & descriptor type
* 10Mar2016 Add support to capture different signal types
            Add <flush-sampling-buffer-at-instrumentation-point> option
* 09Mar2016 Increase classification of memory hierarchy in PEBS samples
* 04Mar2016 Fix trace-file naming (sometimes failed to guess the binary name)
+ 02Mar2016 Add code reference to memory allocated object
+ 29Feb2016 Included the size of MPI-IO operations in the trace
* 26Feb2016 States per thread are no longer allocated statically but reallocated on demand.
            Remove SVN keywords for most of the files.
* 19Feb2016 Fixed bug preparing the output trace name
* 10Feb2016 OpenCL fixes: clFinish communication to reach end, emit OpenCL synchro at the clFinish begin, communication of executing kernel goes to beginning
* 04Feb2016 Fix MPI/F77 examples. Neither IBM/Intel compilers liked the extra space.
* 03Feb2016 Sanitized memory reference samples point to the variable
            Fix compilation with -lpthread/-pthread when building pthread instrumentation libraries
* 01Feb2016 Improve memory reference samples point to the variable
+ 01Feb2016 Emit event OpenCL thread id for command queue in clFinish
+ 22Jan2016 Add instrumentation for cudaDeviceSynchronize
- 18Jan2016 Remove extrae-post-installation.sh. It was broken.
* 13Jan2016 Memory reference samples point to the variable (either static or dynamically allocated)
* 08Jan2016 Improved boost search in configure
* 07Jan2016 Emit raw system time at APPL_EV events for synchronization purposes with other tools
* 05Jan2016 Fixed bug when generating communications between Memcpy commands in CUDA applications
* 18Dec2015 Fixed bug in Time Synchronization due to uninit Spawn time if MPI_Comm_spawn was not used
            Emit Java thread name in Java thread start
* 15Dec2015 Fixed bug in Map_Paraver_Files at file_set.c. Field SkipAsMasterOfSubtree of PRVFileSet_t was not initialized, so it could randomly take garbage values, leading to nondeterministic problems that ended in a later race condition in the parallel merge.
+ 11Dec2015 Added documentation for use on top of PnMPI
* 03Dec2015 In BG/Q MPI_get_processor_name != gethostname -> fails when matching the contents of .mpits and each of the .mpit files
* 23Nov2015 Guess location for EXTRAE_HOME based on the configured.sh script
            Further OMPT upgrades to latest spec
* 19Nov2015 Added overhead section in the user guide using the overhead tests in a variety of systems.
* 18Nov2015 Changed calls to MPI_Comm_compare in Trace_MPI_Communicator (mpi_wrapper.c) into PMPI_Comm_compare so as not to intercept those calls
* 16Nov2015 Fix extra comma in papi_best_set
            Upgrade OMPT instrumentation
            Fix #pragma omp taskgroup instrumentation
* 10Nov2015 Fix unfinished states due to the last event/state and improve messaging
* 09Nov2015 Fix label for MPI_Ibarrier (shown as MPI_Ibcast)
* 03Nov2015 Fix state generation for several new MPI calls (backported from 3.2.1)
* 02Nov2015 Extended Java instrumentation through AspectJ and JVMTI
            Added $EXTRAE_HOME/bin/extraej launcher for Java apps
            Added new examples in JAVA directory
              manual/    --> instrumentation without AspectJ
              automatic/ --> instrumentation with AspectJ
* 30Oct2015 Extrae 3.2.0 released!
            Fix compilation of extrae-cmd in BG/* if PAPI is not set
            Fix compilation in a directory other than $top_src_dir
* 27Oct2015 Free buffer when pthread routine finishes, not at pthread_join time
            Move MPI_FINALIZE_EV so that it covers instrumentation code
* 23Oct2015 Waste some CPU time in mpi_ping example for use with extrae_bursts_1ms.xml
            Fix timing Extrae check
* 22Oct2015 Improved "make check". Added -without-addresses to mpi2prv to ignore addresses
* 21Oct2015 Fixed visibility of the on-line API routines that are called from the wrappers
            Fix example installation. Use cp instead of ln -s.
            Fix Java compilation in a directory other than the extraction dir
            Fix configured.sh when building in a directory other than extraction dir
* 15Oct2015 Code adapted to use Synapse libraries v2.0 instead of the older version libMRNetApp
* 13Oct2015 Fix parametrization of clCreateKernelsInProgram
* 05Oct2015 Adding support for MPI3 immediate collectives
            Revamping source code structure, each wrapper generates its intermediate lib_wrap_*
            Removed support for TRT, UPC (unfinished), PACX, CELL
* 16Sep2015 Fixed configure check to look for MPI Fortran libraries named like "libmpifort"
* 05Aug2015 papi_best_set now checks whether counters and created eventsets are 0 or not
+ 20Jul2015 Support for minimum number of cycles of reference in PEBS
* 16Jul2015 Fix missing pthread_mutex_unlock in persistent request
+ 08Jul2015 Imported first bits for sionlib
* 25Jun2015 Fix crash when temporal dir != final dir and their shared characteristics differ.
            Fix compilation into a separate directory.
            Fix installation from a separate building directory.
* 23Jun2015 Add initial PEBS sampling support
            Split sampling files into src/tracer/sampling
* 17Jun2015 Fix extrae command line, lack of thread init
* 12Jun2015 Do not invoke MPI_* routines when initializing Extrae but MPI is not present (as in MPI+OmpSs).
* 04Jun2015 Sanitized environment variables, in particular LD_LIBRARY_PATH
            Added extrae-test-dyninst tool to check DynInst functionality
            Annotate the node name in the .mpit file to avoid file collisions
            Extend Java documentation
* 25May2015 Extrae 3.1.0 released
* 25May2015 Fixes for bootstrapping, changed into autoreconf.
            Added libxml2's m4 file into config/
* 20May2015 Fujitsu compiler does not have -Wall or similar flag
* 05May2015 Fixes for K computer
* 28Apr2015 papi_best_set cannot support 64 or more counters
* 27Apr2015 Fixed Makefile rules for the pyextrae script. It is now installed in libexec instead of lib.
            Support for MPI_UNDEFINED in mpi_stats & filtering
            Promoted usage of stat instead of open+read to check for shared dirs among MPI processes
* 23Apr2015 Upgrade to 3.1.0rc
            Missing MPI+OpenMP example installation
            Improve directory construction on shared disks
            Shorten labels related to user source code references
            Change format for performance counters labels
* 22Apr2015 Fix timings for first mode & hardware counter events
            Fix compilation in IBM ppc when --disable-openmp-gnu is given
            Make Extrae_user_function use the same parameters
+ 21Apr2015 Add check for MPI_Comm_spawn
            Add omp_get_thread_num default to 0 and message in Extrae
            Automatically detect the system type to enable/disable openmp runtimes
            Rebirth IBM xlsmp instrumentation (requires further testing)
+ 20Apr2015 --with-libgomp supports auto to determine the appropriate version from CC
Add OpenMP task-based statistics (instantiated, executed)
Add OMPT dependencies (MB project)
Freeing memory, reported from Valgrind
* 10Apr2015 Fixed bug in Python instrumentation. The signature of the
Extrae_define_event_type receives parameters by reference.
* 31Mar2015 Added configure options in the documentation regarding the on-line analysis and the instrumentation of OpenSHMEM.
+ 31Mar2015 Add SEQ example using -finstrument-functions
* 30Mar2015 Create directory if target trace-file is in a directory that does not exist
* 27Mar2015 Changed memory allocation for the InputTraces and FileSet structures in the merger: instead of allocating space for MAX_FILES, we allocate only for the maximum of mpits. MAX_FILES constant has been removed.
Extended examples to support dynamic memory instrumentation.
* 16Mar2015 Support up to 512 params in Intel KMP OpenMP runtime
* 13Mar2015 Support to compile Extrae outside its src directory
(thanks to Jorge Bellon)
Fixed some Extrae tests
* 10Mar2015 Improved/fixed support for task constructs in libgomp
(thanks to Eduard Ayguade)
* 05Mar2015 Fixed pthread support (missing pthread_id for master pthread)
Use multiarch triplet where available
(thanks to JM Perez)
* 03Mar2015 Revamped papi_best_set
* 26Feb2015 Completed MPI_Intercomm_create support (serial & parallel mergers)
+ 23Feb2015 Instrumentation of MPI_intercomm_[create|merge]
+ 13Feb2015 Extend malloc-related instrumentation
* 28Jan2015 Fixed bug in calltrace.c that caused wrong caller levels to be traced
when using backtrace instead of libunwind.
* 26Jan2015 Fixed bug in calltrace.h that caused the label of the caller
events to always be 70000000 when a single level of the stack was requested,
regardless of the level requested.
* 23Jan2015 Changed primary lib_LTLIBRARIES to noinst_LTLIBRARIES at
src/tracer/stats/Makefile.am to fix a bug in K computer.
* 19Dec2014 Improve Java examples for Darwin & Linux
Improve support for Darwin
* 18Dec2014 Fix installation script (substitute) to work on AIX & Darwin
* 15Dec2014 Fix CUDA+MPI+OpenMP instrumentation library
Improve location for libbfd & libiberty from the binutils package
Capture callstack at dynamic memory instrumentation points
Improve support for AARCH64 architectures
* 12Dec2014 Upgraded DLB support
* 04Dec2014 Fixed PCF labels regarding OpenSHMEM; they no longer appear if
there are no SHMEM events in the trace
+ 03Dec2014 Adding list-functions DynInst based tool (not installed)
Improve detection for libiberty & libbfd within binutils package
Add minimal support for aarch64 (ARM64)
+ 19Nov2014 Recorded events for bytes sent/received via SHMEM calls
Added new states for SHMEM operations
Now emitting callers at the entry of SHMEM calls
* 18Nov2014 Fixed bug in *Distribute_XML routines that caused sporadic crashes
* 17Nov2014 Fixed time synchronization and running states in shmem traces
Added configure parameters to specify the OpenSHMEM dependencies
--with-openshmem-deps-libsdir and --with-openshmem-deps-libs
+ 14Nov2014 Added 'wrapgen' script to generate lists of wrappers/probes automatically
* 03Nov2014 Fix capturing >1k function names through binutils
Added instrumentation for:
MPI_Win_post, MPI_Win_complete, MPI_Win_wait
Fixed bug in the initialization of the on-line gremlins
+ 31Oct2014 Added instrumentation for:
MPI_Win_create, MPI_Win_free, MPI_Win_start, MPI_Win_fence
Fix ordering in ROW file / objects in .prv
* 30Oct2014 Fixed the DistributeWork routine to support dynamic threads per task.
Fixed the Search_Synchronization_Point routine to support the different distribution methods (block, cyclic, size...)
Upgraded to v3.0.3.
* 16Oct2014 Added configuration options for the gremlins analysis.
Also changed the initialization of the gremlins.
* 09Oct2014 Added interface for libgomp 4.9.
Added configure parameter '--with-libgomp-version' to choose the interface.
For both versions 4.2 and 4.9, the helper struct used to pass the parameters to the tasks has been changed to be allocated dynamically.
Instrumentation for tasks has been disabled in version 4.9 because it crashes inside the OpenMP runtime.
Upgraded to v3.0.2.
+ 04Oct2014 Fix request for bfd_demangle (reported by David Clarke)
+ 25Sep2014 Move .sym file from $temporal_dir to $final_dir
* 19Sep2014 Fixed minor version from 3.01 to 3.0.1
+ 08Sep2014 Added support for OpenMP threads in the on-line analysis.
Upgraded version to 3.01.
+ 05Sep2014 Added instrumentation for mpi_request_get_status/fortran
* 02Sep2014 Fixed bug in the buffers management. Need to forward to the
SEEK_END of the file before writing to disk, due to the interactions between
the normal tracing buffer and the cache.
* 01Sep2014 Fixed bug in the gremlins initial configuration that crashed
Extrae when not doing gremlins analysis
* 28Aug2014 Fixed support for MPI-3 for MPI_Comm_spawn* calls
* 27Aug2014 Upgraded version to 3.0 release. TA-DA!
+ 27Aug2014 Added support for MPI-3
* 27Aug2014 Fixed configure/compilation bugs in the OpenSHMEM instrumentation.
* 11Aug2014 Extensions to OpenCL instrumentation
* 04Aug2014 Disable instrumentation of mpi_file_open in fortran
Don't emit counters for a thread when flushing at the end if the
calling thread is not the exec thread.
+ 31Jul2014 Added instrumentation support for OpenSHMEM
* 14Jul2014 Fix permissions for files/directories in set-X
* 11Jul2014 GOMP_parallel support (appeared in libgomp for GCC 4.9)
* 24Jun2014 More fixes to OMPT support
* 04Jun2014 Fixed bug in MPI statistics in burst mode: the resets were missing, and the timestamp of the events was wrong.
+ 04Jun2014 Preliminary OMPT support (tested with IBM OpenMP rte).
+ 14May2014 Added support for activating/pausing gremlins at run-time and options in the xml file to configure a repetitive pattern
* 05May2014 Fixes for additional CUDA/Dimemas simulations
* 25Apr2014 Changed the synchronization point for spawned processes from the start of the spawn call to the end.
* 17Apr2014 MPI_Comm_spawn*: added time synchronization, fixed communications from child to parent.
Online clustering: pass all counters to the clustering library, not only the common ones.
Added online gremlins.
Added mock-up for standalone libraries and a binary loader.
Removed debug messages.
* 16Apr2014 Emit additional CUDA information for future Dimemas simulations
* 14Apr2014 Bring libcudaompitrace to life
* 03Apr2014 Fix improper access to MPI_STATUS_IGNORE in mpi_sendrecv, mpi_sendrecv_replace and MPI_Sendrecv_replace
MFB 2.5
small fixes in documentation
missing OpenCL bits
* 27Mar2014 Reverted prototype of API call Extrae_define_event_type to receive parameters by reference to be compatible with Fortran
* 19Mar2014 Sanitize examples' Makefiles (MPI basically)
Fix unmatched persistent requests with parallel merge.
* 13Mar2014 Ensure DYNINSTAPI_RT_LIB is correctly set in extrae.sh/csh
* 27Feb2014 Added option to run the on-line analysis at the end of the execution instead of periodically.
Fixed bug in the Makefiles that prevented the online_env.sh script from being reinstalled after a reconfigure.
Added routine to dump the states stack in the merger when the MAX_STATES limit is reached.
The extrae.sh sources the necessary environment for the online analysis (no support for csh yet).
Online examples updated.
Updated version to 3.0rc6.
+ 17Feb2014 Split --enable-openmp for the several supported runtimes
Extrae_suspend/resume_virtual thread no longer captures HWCs
* 07Feb2014 Fixed missing include 'stats/MPI/mpi_utils.h' in the distribution
+ 05Feb2014 Added API call Extrae_flush to force flushing of the calling thread
* 31Jan2014 Fixed bug that stalled the on-line analysis (on-line root running in double-background was killed by garbage collector)
Removed unnecessary recursive mutexes.
Refactored the way the spectral worker emits online events
Fixed bug in the call to Extrae_define_event_type
Updated version to 3.0rc5
* 23Jan2014 Nanos examples for BG/Q
* 20Jan2014 Added initialization of mutex in the on-line root that went missing in the previous update
Updated version to 3.0rc4
+ 13Jan2014 Added advanced functionalities in the on-line mode (automatic threshold, conversion from detail to burst, phase profiling)
Added new MPI statistics
Changed the management of the HWC_CHANGE_EV
Updated version to 3.0rc3
* 09Jan2014 Extended sampling addresses support
* 07Jan2014 Support up to 256 parameters for Intel OMP/runtime.
* 18Dec2013 Improved support to instrument OpenCL in Apple machines
+ 04Dec2013 Added example of OpenMP using LD_PRELOAD in Linux
+ 03Dec2013 Added a new configure flag (--with-mpi-lib-name) to force the specific name of the MPI library to link with.
Changed the mallinfo() wrapper (Extrae_memusage_Wrapper) to emit absolute values instead of deltas.
Removed the initialization calls to Extrae_memusage_set_to_0_Wrapper().
* 29Nov2013 Added initial Java instrumentation (--with-java at configure)
Added initial Extrae command line instrumentation
Added gettimeofday clock
Added flag for David Carrera's Hadoop instrumentation (--enable-dcarrera-hadoop)
Bring MPI I/O instrumentation back to life
* 22Nov2013 Honor <cuda enabled> and <opencl enabled>
MFB missing OpenCL probes
Experimental support for instrumenting I/O & dynamic memory calls
* 21Nov2013 Added improved support to instrument nanos+MPI apps including
examples for OmpSs and MPI+OmpSs
In Linux, merging process automatically uses binary name as
default trace file name and no longer requires -e, neither on the command line nor in the XML.
Add checkers for PIC/noPIC code in instrumentation libraries that use dlsym
* 07Nov2013 Disallow --with-X= in the configure line
* 05Nov2013 Added -rpath to the used MPI library to all lib*mpi*so libraries
Fix segfault when no performance counters are given to Extrae
Create TMPDIR directory if it does not exist
* 09Oct2013 Fixed bug in merger when processing HWC_CHANGE_EV: counters shared by the old and the new set appeared twice.
+ 26Sep2013 Fix missing header in merger files
BFD compilation in cross-compiled environments
Autodetection of librt using gcc --print-file-name=librt.so
New --with-librt to pass the location of librt
* 17Sep2013 Improved instrumentation support for MPI_Comm_spawn. Now supports concurrent spawns from different processes of the parent application.
+ 09Sep2013 Added a quicksort algorithm for files
* 09Sep2013 Revamped pthread event types so that all calls share a single type
Prepare scripts for LSF & slurm (MPI)
* 06Sep2013 Fixed some warnings
Read /proc/self/maps to know which binaries & libraries are used
Create tests to test the usage of binaries & libraries
Additional location for MPI libraries / includes / binaries in BG/Q
Many minor bugfixes
Fixed minor configure/compilation issues
Protected includes of system headers by their appropriate #ifdef's
Added missing licensing headers
* 05Sep2013 Upgraded to version 3.0rc2
Major changes to the on-line analysis. The front-end is now launched as a separate process.
+ 05Sep2013 Added a new control (matching_zone) to the communications matching algorithm, so as not to make matches that cross areas where the tracing was disabled.
* 02Sep2013 Fixing OpenCL instrumentation (added clRetain & clRelease calls)
Fixed some event mismatches in the OpenCL accelerator side
Added OpenCL C++ example
* 30Jul2013 Added overwrite option in <merge> section
Removed unused 'remove-files' in favor of 'keep-mpits'
Fix fortran declarations on API
Fix MPI interfaces declarations
* 29Jul2013 Fixed generation of codelocation & user function in nanos stacked
+ 19Jul2013 Add CPU information at begin & end of
- routine executing pthread_create
- OpenMP worksharing, function, ... *
Add multiple overhead tests
+ 17Jul2013 Added overhead tests
* 15Jul2013 Fixed OpenCL timing issues, added MPI+OpenCL example
* 09Jul2013 Fix sampling handler callstack level
* 08Jul2013 Allow OpenMP user code to generate samples again
Moved Extrae_set_initial_taskid at the very beginning
* 04Jul2013 Fix compilation of papi_best_set for PAPI when using CUDA
Move Extrae_CUDA_fini to cuda_common.c
* 02Jul2013 Fixed compilation flags for the merger to enable the online support.
Fixed which events are cached to keep the communicators definitions in the online analysis.
The first spectral analysis now applies windowing to the first 10% of the data.
Fixed missing dependencies in the online libraries.
Fixed report message for the circular buffer when parsing the XML configuration.
* 25Jun2013 Fixed bug in the cpu IDs of unmatched communications through different tasks of the parallel merge.
Removed extra debug messages.
Added instrumentation support for MPI_Comm_spawn and MPI_Comm_spawn_multiple
+ 17Jun2013 Add instrumentation for routines that call fork/wait/...
Fix bug that malformed communication records in mpi2prv serial
+ 12Jun2013 Add check for CUDA
+ 07Jun2013 Add CUDA fini to flush all streams just in case some did
not get flushed
+ 06Jun2013 Added check for Extrae fortran API / define_event_type
Additional bits for instrumenting OpenCL
+ 30May2013 First part of the OpenCL instrumentation
Added tests to check clock resolution
+ 23May2013 Emitted pid(), ppid() and fork depth in the tracefile
getrusage and mallinfo are emitted to 0 at the beginning
Fixed make check
Dyninst/fork instrumentation, free parent's buffer
Trace initialization state / delimited even for Extrae_init
* 21May2013 Get performance counters at flushes
+ 17May2013 Added extrae_define_event_type to be callable by Fortran
Added sequential example of pi, including the above call.
* 16May2013 Restart sampling once leaving the fork call as child
Renamed devices and streams in CUDA traces so that their
numbering goes from 1 to N instead of 0 to N-1.
Support for time suffixes ns, ms, and us.
Instrumented MPI_Comm_free.
* 13May2013 Added instrumentation for #pragma omp for ordered in GNU
Removed useless warnings when instrumenting OpenMP using preload
* 10May2013 Improved support for fork+wait+waitpid+system+exec calls
using dyninst launcher
* 09May2013 Require binutils if using unwind or compiling in linuxos
(provides backtrace)
* 07May2013 Multiple small fixes in the bursty tracing
Avoided problems when Extrae_shutdown was called
Emission of MPI others statistics
MPI communicators minor refactoring
Removed erroneous emission of hardware counters when a set is
about to start
Added installation of extrae_module.f
OpenMP snippets were misused in DynInst, use the appropriate
Add cross-compiled librt for ARM architectures if available
+ 02May2013 Added support for CUDA5+CUPTI
Changed substitute script to use sed -i
+ 30Apr2013 Added support for cudaDeviceReset.
Honor --with-cuda instead of adding /opt/cuda/4.0 in Makefile
Look for specific intel mpi libs
* 29Apr2013 Do not require -e to import symbols
Added 'random' starting-set-distribution for performance counter
sets
+ 11Apr2013 Instrumented pthread_barrier_wait
+ 20Mar2013 Added example for Python instrumentation
* 20Mar2013 Fixed paths in pyextrae.py
+ 20Mar2013 Added pyextrae.py module in src/others to instrument python programs
* 15Mar2013 Added Fortran module with Extrae constant and function
declarations
* 12Mar2013 Modified SVN propset using
svn propset svn:keywords "Date Revision Author HeadURL Id"
* 12Mar2013 Code restructuring.
* 11Mar2013 Changed the automatic generation of topologies for small online executions
* 11Mar2013 Fixed Makefiles to distribute some missing files
* 08Mar2013 Fixed minor compilation issues.
* 08Mar2013 Extrae version 3.0rc1
* 08Mar2013 Fixed compilation issues (made on-line support only available to the MPI tracing libraries).
Fixed XML parse to skip comments.
Hardware counters are no longer a requirement for the on-line spectral analysis.
Enable sampling by default
* 07Mar2013 Fixed minor compilation issues and changed the default configuration of the online example.
Extrae version 3.0
Added on-line spectral analysis.
+ 01Mar2013 Emit CPU through sched_getcpu at Init and Flushes.
Removed MN-specific code that no longer makes sense.
+ 27Feb2013 Emit lock address when instrumenting named locks (MFB 2.3)
Allow environment variables when parsing the XML file through the
DynInst mutator (MFB 2.3).
Add exclude-automatic-functions in the <user-functions> tag (MFB
2.3).
Allow cross-compiling for ARM through --enable-arm (MFB 2.3)
Cleaned obj_table and changed into ApplicationTable (MFB 2.3)
+ 26Feb2013 Initial fork+wait+waitpid instrumentation under dyninst (MFB 2.3)
Improved detection of binutils package (uses find)
* 22Feb2013 Fixed bad instrumentation of GOMP_*_next using dyninst (MFB 2.3)
* 20Feb2013 Added instrumentation for Intel omp_set_num_threads (which is
named ompc_set_num_threads) (MFB 2.3)
Added support for Intel fortran compiler with F90 code to
instrument using dyninst (MFB 2.3)
* 15Feb2013 Improve detection of libbfd*.so (MFB 2.3)
* 14Feb2013 Removes all temporal files at the end of the merge process (MFB
2.3)
Don't require MPI to generate the MPI communicators within the
PRV file (MFB 2.3)
MPI statistics (burst mode) were calculated incorrectly, 1 missing
Add variability to the sampling period
* 13Feb2013 Removes duplicate creation of temporal files (MFB 2.3)
Actually Fixed SendRecv communication (refixes 06Feb2013) (MFB
2.3)
- 11Feb2013 Support for freq_table in pair with freqtable to determine
whether to use the posix clock routines.
Removed auto instrumentation of libpttrace and libtrtrace because
pthread_mutex was issued before starting and was not hooked
(MF-tag-2.3.2)
* 06Feb2013 Fixed SendRecv communication matching error (MF-tag-2.3.2)
* 01Feb2013 Removed useless lib dependencies for MN3 (MF-tag-2.3.2)
* 25Jan2013 Fixed configure checks to link with the spectral analysis toolkit
+ 25Jan2013 Added base code for on-line support
* 24Jan2013 Extrae version 2.3.2
* 24Jan2013 Fixed Paraver header generation when number of threads differs
across tasks.
Release resources (files and memory) as soon as pthreads terminate
(only through pthread_exit() or pthread_join())
+ 14Jan2013 Extrae version 2.3.1
* 14Jan2013 Fixed typo in MPI_Testall wrapper for Fortran
Fixed error when generating .sym files. Now creating also local
.sym files
Fixed locating binutils package when --enable-shared is set
* 03Jan2013 Added -lintl in some OSes
* 18Dec2012 Added instrumentation for MPI_Testany/all/some
Fixed issues when renaming *.ttmp into *.mpit tracefiles for MPI apps
* 14Dec2012 Fixed bug in the generation of the ROW file, where the threads appeared disordered.
* 05Dec2012 Fixed problems with trace names when using omp_set_num_threads
with OmpSs.
Added -lb libraries to nanos+mpi.
+ 04Dec2012 Added libz to dependencies when generating the shared libraries
* 13Nov2012 Added checks for the C++ compiler only when it is required by
other functionalities like Dyninst or MRNet. Updated documentation.
+ 03Nov2012 Extrae version 2.3
* 24Oct2012 Sampling support for pthreads/openmp -> change type from default
to virtual or prof.
+ 23Oct2012 Added support for binary rewriting using DynInst
* 22Oct2012 DynInst compilation changed and broke into different components.
Now sports pcontrol, stackwalk, dynelf/dwarf and symlite.
* 18Oct2012 Fixes for DynInst launch
CUDA applications instrumented with DynInst now instruments
routines that call kernels.
* 17Oct2012 Added Intel OpenMP runtime instrumentation when using Dyninst.
Also, some minor enhancements made when using DynInst.
* 16Oct2012 Do not use -lrt if the system automatically adds clock_gettime
* 11Oct2012 Improved support for Intel MIC KNC/KNF, now with support for
Intel MPI
* 10Oct2012 Changed --enable-cuda for --with-cuda=DIR in the configure script
+ 09Oct2012 Improved support for BG/Q systems. Improved example for this
architecture using libxml2 and binutils.
+ 04Oct2012 Added void Extrae_define_event_type (extrae_type_t type, char
*type_description, unsigned nvalues, extrae_value_t *values, char
**values_description); into the API
+ 04Oct2012 Added void Extrae_register_function_address (void *ptr, char
*funcname, char *modname, unsigned line); into the API
+ 04Oct2012 Added additional checks for Extrae_register_function_address call
+ 01Oct2012 Added two additional checks tests/functional/dump-events and
tests/functional/auto-init-fini. The former checks the basic API to emit
events whereas the second checks for the instrumentation auto
initialization and finalization.
* 01Oct2012 Extrae API calls now honor extrae_type_t / extrae_value_t instead
of basic types (unsigned/unsigned long).
* 28Sep2012 Turned
void Extrae_register_codelocation_type (extrae_type_t t, char* s1, char
*s2)
into
void Extrae_register_codelocation_type (extrae_type_t t1, extrae_type_t t2,
char* s1, char *s2)
+ 28Sep2012 Added make check capability into Extrae. Currently there is only
one check
+ 21Sep2012 Added instrumentation call
void Extrae_register_codelocation_type (extrae_type_t t, char* s1, char *s2)
+ 14Aug2012 Added -remove-files to the mpi2prv (and to the xml) so as to
remove the .mpit, .mpits and .sym files related to an execution. These
files are removed if the .prv generation is successful.
* 21Jun2012 Fixed instrumentation of cudaStreamCreate
* 18Jun2012 Combined the configure options --with-bfd and --with-liberty into --with-binutils
+ 15Jun2012 Automatically load instrumentation with:
- libseqtrace
- libomptrace
- libpttrace
When these libraries are LD_PRELOADed or linked dynamically with the application.
* 08May2012 Refactored and simplified examples. Usage of Makefile.inc
extrae-post-installation-upgrade.sh gets alive!
Check whether all timings are in us instead of ns.
* 04May2012 Support for NANOS distributed
+ 19Apr2012 Add functionality -task-view -no-task-view in mpi2prv to generate
traces specifically for OmpSs.
Add Extrae_get_version API call.
Updated documentation.
+ 18Apr2012 Build of shared libraries on BG/{P,Q}
+ 13Apr2012 Instrumentation of CUDA runtime through DynInst
Also, support for CUDA+MPI using DynInst. Added an example.
Removed lib_dyn_omptrace. Using libomptrace instead.
Minor changes in documentation
* 02Apr2012 Generation of the PCF file. Static entries now only appear if
they are present in the tracefile.
+ 30Mar2012 Documented new instrumentation of OpenMP tasks
Documented alternative libraries to use during instrumentation
at the quick guide.
+ 30Mar2012 Support to do Extrae_init + MPI_Init (and also MPI_Finalize +
Extrae_fini)
+ 27Mar2012 Instrumentation of Intel/GNU #pragma omp task
* 15Mar2012 Fixed versioning number for lib MPI+OMP when using dyninst
Fixed nesting creation of pthreads.
+ 13Mar2012 Instrumentation for pthread_exit
* 09Mar2012 Fixes in the #pragma omp parallel sections constructs in icc/gcc.
* 06Mar2012 First parallel event when thread > 1 was missing its HWC.
XML parser was looking for sort-address instead of sort-addresses
within <merge> block.
Several exit cleanups.
Documentation is no longer generated at make, but distributed
in its PS/PDF/HTML forms. This allows having the user-guide
without needing latex/dvi2ps/dvipdf/latex2html
* 27Feb2012 Fixes in the instrumentation of worksharings for Intel/OpenMP
* 24Feb2012 Extrae 2.2.1 is released!
* 24Feb2012 Fixed a bug in the name of the nodes in the row file
* 24Feb2012 Fixes in the instrumentation of GNU/OpenMP
* 01Feb2012 Added CUDA instrumentation through CUPTI
+ 27Jan2012 Support for BG/Q machines
* 23Jan2012 Fix when handling MPI_ANY_TAG in MPI calls (also for mpimpi2prv)
* 19Jan2012 Improved support for Intel OpenMP rte. Some files autogenerated.
+ 03Jan2012 Support to instrument Intel OpenMP rte v11/v12 through LD_PRELOAD.
* 01Dec2011 Improved support for DynInst (C&Fortran bugs in MT)
* 01Dec2011 Improved support for instrumenting NANOS+MPI
* 30Nov2011 Increased verbosity of papi_best_set
* 28Nov2011 Fixed a bug that appeared when creating communicators
* 14Nov2011 Added support for Mac OS X and improved support for FreeBSD
* 07Nov2011 Extrae 2.2.0 is released!
* 07Nov2011 Made --with-unwind, --with-papi, --with-mpi, --with-dyninst mandatory.
Can be avoided through the respective --without.
* 07Nov2011 Autodetect whether --enable-posix-clock is needed.
* 25Oct2011 Adding thread names, starting by CUDA
+ 19Oct2011 Support for MPI+OpenMP applications using Dyninst launcher
* 17Oct2011 Translation of @ of CUDA kernels into kernel names
* 14Oct2011 Updating PACX instrumentation
* 13Oct2011 Several OpenMP bugfixes. Added instrumentation for set/get num
threads. Fixed hwc counts at openmp when counter set changes.
Fixed timing routines when using multiple threads.
* 08Sep2011 Several OpenMP bugfixes
* 26Aug2011 Modified code to support virtual_thread (from nanos) to view
nanos tasks as threads in paraver.
* 23Aug2011 Updated examples and added DynInst examples to the user-guide.
* 23Aug2011 Fixed LINUX/{SEQ,OMP} examples to use either DynInst or static
instrumentation.
* 23Aug2011 Fixed emitting HWC when HWC_CHANGE_EV also occurs. HWC values
are 0 for the new set and also the HWC counters are not emitted
at the same timestamp.
* 23Aug2011 Added example for CUDA instrumentation
* 01Aug2011 Added instrumentation of MPI_Get and MPI_Put
* 29Jun2011 Modified SVN propset using
svn propset svn:keywords "Date Revision Author HeadURL Id"
* (29/Jun/2011) Added extrae_version.h to track in extrae_user_events.h the
current version of the instrumentation package
* (23/Jun/2011) mpi_ping examples now accept > 2 processes
* (20/Jun/2011) Optimized initialization of mpimpi2prv when loading a large
number of files using a large number of processes.
* (10/Jun/2011) Improved CUDA support
* (07/Jun/2011) BG/P fails to execute PAPI_read in PAPI_read when using time
sampling (fixed by additional logic & avoiding read inside).
* (07/Jun/2011) --sort-addresses is enabled by default
* (07/Jun/2011) Fixed emitting flush events in between multiple events
through routines Backend_Enter_Instrumentation and
Backend_Leave_Instrumentation. This should avoid most
Skipping state with negative duration messages at mpi2prv.
* (07/Jun/2011) Fixed hwc set change when mixing changeat-time and
changeat-globalops within the XML config file
* (07/Jun/2011) Linux/amd64 can rely on libc to call backtrace instead of
requiring libunwind
+ (02/Jun/2011) Added initial (& very limited) CUDA instrumentation (through --enable-cuda)
* (30/May/2011) Added time sampling capabilities (doc, xml, and source changes)
* (02/May/2011) Honor $DESTDIR in make install
+ (15/Apr/2011) Import initial sampling support based on alarm (2).
+ (05/Apr/2011) Additional support for pthread library
* (31/Mar/2011) Fix for matching user communications in merger.
* (28/Mar/2011) Support for XL -qdebug=function_trace
* (23/Mar/2011) Several fixes to revive CBEA support
* (22/Mar/2011) Increase buffer size to 500k elements
* (03/Mar/2011) Fix communication between threads (i.e. task != 1 is sending
or receiving)
Differentiate MPI_Init_thread per C/Fortran (Fortran
implementation may be missing but C available)
* (24/Feb/2011) Fix typo in Extrae_fini that prevented tracing applications in fortran
+ (02/Feb/2011) Added example of dynamic load instrumentation in AIX
* (01/Feb/2011) Bug solved: looking for hw signals in PAPI 4.x
Change at global operations now uses MPI_Comm_compare
Honor -f/-f-relative with the new MPIT file distribution in set-* directories
Fix sampled address translation. The address that was
obtained by the overflow was pointing to the incorrect line no
after a fix in revision 425.
* (21/Jan/2011) AIX support to generate libmpitrace.so without libtool
(libtool in AIX does not allow generating shared libraries)
+ (19/Nov/2010) FreeBSD support (examples & --with-libexecinfo)
Simplification of example install rules in Makefile.am
Change bash to sh in substitute/substitute-all
* (18/Nov/2010) Fix: when sorting addresses wipe the address2info cache.
* (12/Nov/2010) Fixes compilation when struct mallinfo is not available.
Fixes compilation if BFD&liberty are not available in the system.
Put the EXTRAE_LABELS content at the end of the PCF file.
* (09/Nov/2010) Tagged Extrae 2.1.1
* (09/Nov/2010) Bugfix, mpi2prv tried *always* to generate dimemas traces.
New -sort-addresses functionality in merger, to sort addresses of
MPI callers, user functions and so on.
Improved XML parsing (env vars & case insensitive)
Added aliases to Extrae_* API/fortran
Added -fno-optimize-sibling-calls to avoid unexpected
optimizations (we found that sometimes some routines were
called directly skipping routines: for example,
misc_interface calls did not appear)
Simplified behavior with/without -e and with/without BFD
support:
* no bfd support / no -e / -e and invalid binary,
addresses are left unchanged in the trace
* -e binary translates addresses to source code through
dictionary
* (20/Oct/2010) added -with-dwarf for dyninst
support for upcoming dyninst (post 6.1)
+ (18/Oct/2010) .sym file is automatically loaded based on the .mpits files given through -f at merge step
+ (15/Oct/2010) Added automatic merge in the tracing libraries (see --enable-merge-in-trace in configure)
Improved documentation: examples, XML, FAQ
+ (16/Jul/2010) Now parallel merge works in a tree-based topology (must be run with NP >= 2)
+ (13/Apr/2010) Added support for IBM POE on Linux.
+ (08/Apr/2010) Intermediate files are stored in separate directories instead of a single one.
+ (10/Jan/2010) Instrumentation library in bursts mode can gather MPI calls.
+ (07/Jan/2010) Initial tracing of PACX
Removed license code.
* (07/Dec/2009) Force mkdir of the storage directory, so make-dir entries are no longer needed in the XML files.
Improve -f on merger. First try on absolute path, then in relative.
* (03/Dec/2009) Fixed a bug in 64bit systems where MPI_Request is a pointer. May produce mismatching communications.
Added MPItrace_user_function (int) into the headers.
* (26/Nov/2009) Fixed conversion of caller lines (sampling or MPI)
* (25/Nov/2009) Fixed access to ptr_statuses which caused a segfault when calling mpi_waitall/mpi_waitsome
+ (04/Nov/2009) Added DLB support for MPI & SMPss applications.
* (03/Nov/2009) Support of using MPI_STATUS_IGNORE in MPI_Status parameters (MPI_Wait, MPI_Recv, ...)
* (03/Sep/2009) Fixed matching communications in mpi2prv that made mpi_sendrecv fail.
* (17/Aug/2009) Fixed MPItrace_nevents to put all events to the same timestamp
Changed number of events in buffer in CELL (from 64 to 256).
Reduced clock-skew between CPU/SPU in CELL machine.
* (28/May/2009) Fixed configure checking of mpi fortran decoration type.
* (28/May/2009) Reverted status of libmpitrace.*. Now contains C & Fortran symbols because some MPI
implementations have their MPI Fortran symbols rely on C symbols (MN/MPICH) and others do
not (AIX/POE).
* (28/May/2009) Changes in calltrace (and similar information). Uniformization of routines and lines:
MPI callers, OpenMP outlined routines, pthread called routines, user functions, sampled points
* (26/May/2009) Fixed Dimemas translation problems related with communicators.
* (25/May/2009) Fixed a problem with a call to fstat that delays the next event after a flush.
* (25/May/2009) Fixed problems with the timestamp where the HWC_CHANGE events appeared
* (25/May/2009) Fixed problems with make-dir in XML parsing. Also added full path to *.mpits file.
* (25/May/2009) Parallel merge improvements
* (20/May/2009) IBM MPI implementation does not support calling MPI_Get_count with MPI_STATUS_IGNORE, fix.
* (23/Feb/2009) Improved support for SMPss+CELLss.
+ (18/Feb/2009) Bluegene/P examples imported. Support for PAPI 3.9.0 (BG/P).
* (09/Feb/2009) Bluegene/P support.
* (05/Jan/2009) Basic OpenMP instrumentation.
* (12/Dec/2008) Basic API instrumentation performed by the DynInst launcher.
* (11/Dec/2008) Split MPI instrumentation libraries (C/Fortran) and unified in a single one.
* (24/Oct/2008) Improved time reading for Linux/PPC 64bits
+ (24/Oct/2008) MPItrace dyninst-based instrumentation is working
+ (22/Oct/2008) XML files automatically use the xml-parser.c rcsid variable.
+ (20/Oct/2008) XML Example files use the $PREFIX$
+ (20/Oct/2008) Added support for SMPss
*** CVS Branch STABLE-1.2
* (25/Jun/2008) Basic OpenMP instrumentation for GNU OpenMP library (aka GOMP)
* (30/May/2008) Improved the IBM XL openmp support (now with reductions)
* (21/May/2008) Patch to work around an #ifdef inside atomic.h /* linux - ppc32bits - openmp */
* (19/May/2008) Changed ifndef USE_HARDWARE_COUNTERS -> if !USE_HARDWARE_COUNTERS
* (19/May/2008) Updated the timestamp routine for the CBEA.
* (03/Mar/2008) Modified the timestamp routine on x86/64
+ (28/Feb/2008) Added -syn-node option to the mpi2prv.
+ (11/Feb/2008) Added caller support for FreeBSD.
+ (28/Jan/2008) Added "xml-parser-id" to the XML in order to add a versioning system for the config file.
Added MPItrace_next_hwc_set / MPItrace_previous_hwc_set
Fixed MPItrace_counters
Fixed handling of large intermediate files when merging.
Imported Load Balancing into the tracing package
+ (21/Jan/2008) Added initial parallel merge for Dimemas traces (lacks 'caller' support)
Updated mpi2prv.1 manual file
Added mpimpi2prv.1, mpi2dim and mpimpi2dim manual files
XML can be used to support Dimemas/Paraver MPIT files.
* (17/Jan/2008) Added initial support to generate Dimemas trace files.
Reduced sizeof(paraver_rec_t) on 64-bit systems by 10%
* (03/Jan/2008) OpenMP instrumentation ignored "MPITRACE_ON"; fixed.
Pthread instrumentation.
+ (21/Dec/2007) Added initial support for multiple output semantics in the merger. It currently generates PRV semantics only.
* Fixed an undefined reference in the calltrace module on AIX.
* (19/Dec/2007) Improved verbosity of HWC output.
+ (18/Dec/2007) New utility under bin/ called papi_best_set that searches for groups of PAPI counters.
* (17/Dec/2007) Solved a bug in makedir_recursive that failed when depth > 1
* (10/Dec/2007) Modified the Fortran header file -- it contained some typos.
* (29/Nov/2007) Trace package can be compiled without MPI.
Added more verbose information on HWC that cannot be added.
* (28/Nov/2007) Package compiles for CELL & SDK 3.0 -- minor changes in send/receive from mailboxes --.
* (13/Nov/2007) Fixed a link problem with libmpitrace (included the MPI library inside)
Version 1.X
Thu 25/Oct/2007
List of changes
[ + added, - removed, * changed ]
+ Minimal support for DynInst instrumentation package (now just builds on IA64).
* Fixed a bug when calling MPI_Test. Tracing was unable to remove the request from the hash_table.
* Fixed a bug that ignored make-dir in the final-directory.
Version 1.X
Wed 10/Oct/2007
List of changes
[ + added, - removed, * changed ]
+ Added MPItrace wizard script