[Flang][LICM] deferred-shape arrays are not vectorized in some cases #110613
Author: Yusuke MINATO (yus3710-fj)
Flang can't vectorize some loops in [TSVC](https://www.netlib.org/benchmark/vectors) when the arrays are `ALLOCATABLE`. For example, Flang can't vectorize the loop in `s271` of TSVC once I change its explicit-shape arrays to deferred-shape arrays.
! s271_allocatable.f90
subroutine s271 (ld,n,a,b,c)
  implicit none
  integer ld, n, i
  real, allocatable :: a(:), b(:), c(:) ! added ALLOCATABLE attribute
  call init(ld,n,a,b,c,'s271 ')
  do i=1,n
    if (b(i) .gt. 0.) a(i) = a(i) + b(i) * c(i)
  end do
  call dummy(ld,n,a,b,c,1.)
end subroutine s271

$ flang-new -v -O3 -flang-experimental-integer-overflow s271_allocatable.f90 -S -Rpass=vector -mcpu=a64fx
flang-new version 20.0.0git (https://github.com/llvm/llvm-project.git 2c770675ce36402b51a320ae26f369690c138dc1)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/build/bin
Build config: +assertions
Found candidate GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Selected GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/path/to/build/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu a64fx -target-feature +outline-atomics -target-feature +v8.2a -target-feature +aes -target-feature +complxnum -target-feature +crc -target-feature +fp-armv8 -target-feature +fullfp16 -target-feature +lse -target-feature +neon -target-feature +perfmon -target-feature +ras -target-feature +rdm -target-feature +sha2 -target-feature +sve -fversion-loops-for-stride -flang-experimental-integer-overflow -Rpass=vector -resource-dir /path/to/build/lib/clang/20 -mframe-pointer=non-leaf -O3 -o /dev/null -x f95-cpp-input s271_allocatable.f90

The base addresses and the lower bounds of arrays aren't recognized as loop-invariant.

11: ; preds = %.lr.ph, %25
%indvars.iv = phi i64 [ 1, %.lr.ph ], [ %indvars.iv.next, %25 ] ;; i
%12 = sub nsw i64 %indvars.iv, %.unpack322.unpack.unpack ;; i - lbound(b,1)
%13 = getelementptr float, ptr %.unpack266.pre, i64 %12
%14 = load float, ptr %13, align 4, !tbaa !12
%15 = fcmp fast ogt float %14, 0.000000e+00 ;; b(i) > 0
br i1 %15, label %16, label %25
16: ; preds = %11
%.unpack329 = load ptr, ptr %2, align 8, !tbaa !4 ;; a
%.unpack343.unpack.unpack = load i64, ptr %.elt342, align 8, !tbaa !4 ;; lbound(a,1)
%17 = sub nsw i64 %indvars.iv, %.unpack343.unpack.unpack ;; i - lbound(a,1)
%18 = getelementptr float, ptr %.unpack329, i64 %17
%19 = load float, ptr %18, align 4, !tbaa !14 ;; a(i)
%.unpack350 = load ptr, ptr %4, align 8, !tbaa !4 ;; c
%.unpack364.unpack.unpack = load i64, ptr %.elt363, align 8, !tbaa !4 ;; lbound(c,1)
%20 = sub nsw i64 %indvars.iv, %.unpack364.unpack.unpack ;; i - lbound(c,1)
%21 = getelementptr float, ptr %.unpack350, i64 %20
%22 = load float, ptr %21, align 4, !tbaa !16 ;; c(i)
%23 = fmul fast float %22, %14
%24 = fadd fast float %23, %19
store float %24, ptr %18, align 4, !tbaa !14
br label %25
25: ; preds = %16, %11
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond.not = icmp eq i64 %indvars.iv.next, %10
br i1 %exitcond.not, label %._crit_edge.loopexit, label %11
If I move `%.unpack329`, `%.unpack343.unpack.unpack`, `%.unpack350` and `%.unpack364.unpack.unpack` outside the loop manually, the loop is vectorized.
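
For anyone who wants a source-level counterpart to hoisting those loads by hand, one possible sketch is to keep the `ALLOCATABLE` arrays in the caller and run the loop in a helper with explicit-shape dummy arguments. With explicit-shape dummies the helper gets no descriptor, so there is nothing to reload inside the loop. The module `s271_kernel_mod`, the helper `s271_kernel`, and the driver program are hypothetical names made up for this illustration. They are not part of TSVC or of the original report, and I have not checked that this form vectorizes with the build shown above.

```fortran
! Hypothetical workaround sketch; names below are illustrative, not from TSVC.
module s271_kernel_mod
  implicit none
contains
  ! Explicit-shape dummies: each array has lower bound 1 and extent n,
  ! so the loop body does not read an array descriptor.
  subroutine s271_kernel(n, a, b, c)
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    real, intent(in) :: b(n), c(n)
    integer :: i
    do i = 1, n
      if (b(i) .gt. 0.) a(i) = a(i) + b(i) * c(i)
    end do
  end subroutine s271_kernel
end module s271_kernel_mod

! Small driver (assumption: stands in for the TSVC init/dummy calls).
program s271_demo
  use s271_kernel_mod
  implicit none
  integer, parameter :: n = 8
  real, allocatable :: a(:), b(:), c(:)
  integer :: i
  allocate(a(n), b(n), c(n))
  a = 1.0
  b = [(real(i) - 4.5, i = 1, n)]  ! first four negative, last four positive
  c = 2.0
  ! The descriptors of a, b and c are read once here, at the call site.
  call s271_kernel(n, a, b, c)
  print *, a  ! only the elements where b(i) > 0 are updated
end program s271_demo
```

This only sidesteps the problem in user code; the missed hoisting in the deferred-shape version is still the thing to fix in the compiler.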