OpenMP offloading #280

Open
ooreilly opened this issue May 8, 2023 · 3 comments
Labels
bug (Something isn't working), libomnitrace (Involves omnitrace library), OMPT (OpenMP tools), Under Investigation

Comments

@ooreilly

ooreilly commented May 8, 2023

I'm trying omnitrace with OpenMP offloading on a small Fortran test code. Depending on the system I tested on, I encountered different issues. The test code is compiled with the HPE Cray compiler, CCE 15.0.1.

I either saw:

WARNING: Unrecognized OMPT entry_point request ompt_get_record_type
WARNING: Unrecognized OMPT entry_point request ompt_get_record_ompt
WARNING: Unrecognized OMPT entry_point request ompt_get_device_num_procs
WARNING: Unrecognized OMPT entry_point request ompt_callback_mutex
WARNING: Unrecognized OMPT entry_point request ompt_callback_nest_lock
WARNING: Unrecognized OMPT entry_point request ompt_callback_flush
WARNING: Unrecognized OMPT entry_point request ompt_callback_cancel
WARNING: Unrecognized OMPT entry_point request ompt_callback_dispatch
WARNING: Unrecognized OMPT entry_point request ompt_callback_buffer_request
WARNING: Unrecognized OMPT entry_point request ompt_callback_buffer_complete
WARNING: Unrecognized OMPT entry_point request ompt_callback_dependences
WARNING: Unrecognized OMPT entry_point request ompt_callback_task_dependence
[omnitrace][21794][2045] No signals to block...
[omnitrace][21794][2044] No signals to block...
[omnitrace][21794][OnLoad] Loading ROCm tooling...
[omnitrace][21794][0][OnLoad] Setting rocm_smi state to active...
[omnitrace][21794][0][OnLoad] Requesting roctracer to setup...
[omnitrace][21794][PID=21794][rank=0] Thread 1 [0x000000000000552b] (#5) (parent: 0 [0x0000000000005522] (#0)) created
[omnitrace][21794][PID=21794][rank=0] Thread 1 [0x000000000000552b] (#5) (parent: 0 [0x0000000000005522] (#0)) exited
 n =  1100000000
 Data size (read and write): 17.600000000000001 GB
terminate called after throwing an instance of 'std::runtime_error'
  what():  Error! nullptr to ompt_data_t! key = ompt_target_enter_data_dev_0

or:

OMNITRACE: HSA_TOOLS_LIB=/pfs/lustrep2/projappl/project_462000125/omnitrace/lib/libomnitrace-dl.so.1.10.0
OMNITRACE: HSA_TOOLS_REPORT_LOAD_FAILURE=1
OMNITRACE: LD_PRELOAD=/pfs/lustrep2/projappl/project_462000125/omnitrace/lib/libomnitrace-dl.so.1.10.0
OMNITRACE: OMP_TOOL_LIBRARIES=/pfs/lustrep2/projappl/project_462000125/omnitrace/lib/libomnitrace-dl.so.1.10.0
OMNITRACE: ROCP_HSA_INTERCEPT=1
OMNITRACE: ROCP_TOOL_LIB=/pfs/lustrep2/projappl/project_462000125/omnitrace/lib/libomnitrace.so.1.10.0
srun: error: nid007263: task 0: Exited with exit code 255
srun: launch/slurm: _step_signal: Terminating StepId=3480167.3

Any idea what is happening here? Thanks!

jrmadsen added the bug, libomnitrace, and OMPT labels on Jun 22, 2023
@ppanchad-amd

Hi @ooreilly. An internal ticket has been created to investigate your issue. Thanks!

@darren-amd

darren-amd commented Oct 9, 2024

Hi @ooreilly,

I tried running a simple Fortran example with OpenMP offloading and was unable to reproduce the error on omnitrace-instrument v1.11.2, ROCm 6.2.2, and the GNU Fortran compiler. Could you please provide more information so that I may further investigate:

  1. The Fortran example you are running
  2. The OS, GPU and ROCm version of the 2 systems
  3. The Omnitrace version (output of omnitrace-instrument --version)
  4. Commands you are using to compile the test code and run omnitrace

Also, could you confirm whether the compiled executable runs as expected without omnitrace? Having this information should allow me to help further, thanks!

@ooreilly
Author

Hi @darren-amd,

Thanks for investigating. Please point me to the internal ticket (ping Ossian O'Reilly on Teams).
1. The Fortran example:

program bandwidth

    use iso_c_binding
    use omp_lib
    implicit none
    !$omp requires unified_shared_memory
    ! Set input array size to be a multiple of the CU count on a single MI250x
    integer, parameter :: n = 110 * 10000000, nthreads = 1024
    integer :: i, j, num_devices, nteams
    double precision :: GB
    double precision, allocatable, dimension(:) :: a, b
    double precision :: t0, t1, elapsed

    allocate(a(n))
    allocate(b(n))

    GB = 1000**3

    call omp_set_default_device(0)
    num_devices = omp_get_num_devices()

    ! Pick a number of teams that is multiple of the CU count
    nteams = 110 * 1000

    a = 1.0

    print *, "n = ", n
    print *, "Data size (read and write):", (c_sizeof(a) + c_sizeof(b)) / GB, "GB"

    t0 = omp_get_wtime()
    !$omp target enter data map(to:a, b)
    t1 = omp_get_wtime()

    elapsed = t1 - t0
    print *, "Initial Map elapsed:", elapsed, " s", " Bandwidth:", ( (c_sizeof(a) + c_sizeof(b)) / GB ) / elapsed, " GB/s"

    do i=1,100

        t0 = omp_get_wtime()
        !$omp target teams distribute parallel do simd num_teams(nteams) thread_limit(nthreads)
        do j=1,n
            b(j) = a(j)
        end do
        t1 = omp_get_wtime()

        elapsed = t1 - t0

        print *, "Elapsed:", elapsed, " s", " Bandwidth:", ( (c_sizeof(a) + c_sizeof(b)) / GB ) / elapsed, " GB/s"

    end do

    !$omp target update from(a,b)

    if (a(n) /= b(n)) then
        print *, "Error: a != b!", a(n), b(n)
    endif


end program
2. openSUSE 15.4, MI250X, ROCm 5.3. I am not sure what you mean by "two systems".

3. omnitrace-instrument v1.10.0 (rev: 9de3a6b0b4243bf8ec10164babdd99f64dbc65f2, tag: v1.10.0, compiler: GNU v7.5.0, rocm: v5.3.x)

4. I don't recall.

Yes, the compiled executable runs as expected without omnitrace.
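
As a quick cross-check of the reproducer, the 17.6 GB data size printed in the first log follows from the array length alone, assuming each double precision element occupies 8 bytes (an assumption; the program itself relies on c_sizeof). The sketch below uses Python rather than Fortran purely for brevity, and the helper name data_size_gb is hypothetical:

```python
# Cross-check of the "Data size (read and write)" line printed by the reproducer.
# Assumptions: a double precision element is 8 bytes, and 1 GB = 1000**3 bytes
# (the same GB definition the Fortran program uses).

BYTES_PER_DOUBLE = 8   # assumed element size
CU_COUNT = 110         # factor taken from the reproducer's comments and constants


def data_size_gb(n: int, num_arrays: int = 2) -> float:
    """Combined size of `num_arrays` arrays of n doubles, in GB (10**9 bytes)."""
    return num_arrays * n * BYTES_PER_DOUBLE / 1000**3


n = 110 * 10_000_000   # same expression as in the Fortran program
nteams = 110 * 1000    # same expression as in the Fortran program

print("n =", n)
print("Data size (read and write):", data_size_gb(n), "GB")
print("n divisible by CU count:", n % CU_COUNT == 0)
print("nteams divisible by CU count:", nteams % CU_COUNT == 0)
```

The two arrays together come to about 17.6 GB, which matches the log. Since the std::runtime_error appears right after that line and before any "Initial Map elapsed" output, the abort seems to occur at or around the first target enter data directive, consistent with the key ompt_target_enter_data_dev_0 in the error message.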
