merge develop in preparation of the 0.3.28 release #4854

Merged on Aug 8, 2024
399 commits
0073aff
Merge pull request #4693 from goplanid/locks-improvement
martin-frbg May 26, 2024
1036378
add cblas_?gemm_batch
martin-frbg May 29, 2024
89c7bbc
add cblas_?gemm_batch
martin-frbg May 29, 2024
833a888
add cblas_?gemm_batch
martin-frbg May 29, 2024
d0794f8
add gemm_batch driver
martin-frbg May 29, 2024
362a063
remove return value
martin-frbg May 29, 2024
dd4505c
Fix CMake warning
Neumann-A May 30, 2024
ff6670c
don't generate non-cblas files for gemm_batch
martin-frbg May 30, 2024
b9a1c9a
Merge pull request #4725 from Neumann-A/patch-1
martin-frbg May 30, 2024
0d007ad
fix clang_cl-flang job to use flang-new after the llvm update
martin-frbg May 30, 2024
ad2b5c6
fix another corner case involving infinity
martin-frbg May 30, 2024
a16f824
add tests with the imaginary part of the array infinite
martin-frbg May 30, 2024
ab13cfe
more fixes for infinite x
martin-frbg May 31, 2024
ce130f1
Update zscal.c
martin-frbg May 31, 2024
9ff4e97
additional fixes for handling INF arguments
martin-frbg May 31, 2024
516743f
fix other instances of mishandling INF
martin-frbg May 31, 2024
8c05765
fix other corner cases where x=INF
martin-frbg May 31, 2024
076766d
Update CMakeLists.txt
martin-frbg May 31, 2024
db070a9
add gemm_batch drivers
martin-frbg May 31, 2024
6b564d5
Merge pull request #4727 from martin-frbg/issue4726
martin-frbg May 31, 2024
56bd57c
Merge pull request #4720 from martin-frbg/issue3039
martin-frbg May 31, 2024
020b3e1
fix handling of INF arguments
martin-frbg May 31, 2024
83bc8d5
Merge pull request #4712 from RajalakshmiSR/zscalp10
martin-frbg Jun 1, 2024
4400417
Updated CONTRIBUTORS.md
jake-arkinstall-quantinuum Jun 1, 2024
a9fae32
Merge pull request #4730 from jake-arkinstall/develop
martin-frbg Jun 1, 2024
db9f7bc
fix float array types to include bfloat16
martin-frbg Jun 2, 2024
3a3ff1b
Merge pull request #4732 from martin-frbg/issue4731
martin-frbg Jun 3, 2024
df87aeb
Drop the -static Fortran flag from generic builds as it breaks OpenMP
martin-frbg Jun 4, 2024
8ab2e9e
LoongArch: DGEMM small matrix opt
XiWeiGu Sep 16, 2023
913be34
Merge pull request #4733 from martin-frbg/issue4719
martin-frbg Jun 4, 2024
0c2ac76
Merge pull request #4734 from XiWeiGu/loongarch64_small_matrix
martin-frbg Jun 5, 2024
4e9144b
Update .cirrus.yml (#4735)
martin-frbg Jun 5, 2024
af73ae6
LoongArch: Fixed issue 4728
XiWeiGu Jun 6, 2024
2787c9f
Disable GEMM3M for generic targets (not implemented)
martin-frbg Jun 6, 2024
0cf8b98
Merge pull request #4736 from XiWeiGu/loongarch_issue4728
martin-frbg Jun 6, 2024
442dec2
Merge pull request #4738 from martin-frbg/issue4737
martin-frbg Jun 6, 2024
f96ee86
remove .mod files during make clean
martin-frbg Jun 6, 2024
f955616
Merge pull request #4740 from martin-frbg/fixlapackmod
martin-frbg Jun 6, 2024
ffc1ab3
Test corner cases of all SCAL variants
martin-frbg Jun 6, 2024
1abafcd
handle corner cases involving NAN and/or INF
martin-frbg Jun 6, 2024
2bd43ad
Merge branch 'OpenMathLib:develop' into issue4728
martin-frbg Jun 6, 2024
9e22d70
Dynamic locking in Pthread Backend to allow multiple BLAS calls to be…
shivammonaka Jun 7, 2024
5ed4f24
Handle corner cases with INF and NAN arguments
martin-frbg Jun 7, 2024
c7cacd9
disable the shortcut for da=0 to ensure proper handling of INF and NAN
martin-frbg Jun 7, 2024
6ffaf99
disable da=0 shortcut to handle NAN and INF correctly
martin-frbg Jun 7, 2024
2f12a47
fix build options for CAXPYC/ZAXPYC
martin-frbg Jun 9, 2024
62c33db
Merge pull request #4746 from martin-frbg/issue4743
martin-frbg Jun 9, 2024
1ca1bb8
LoongArch64: Update QEMU
XiWeiGu Jun 13, 2024
ed5db5b
LoongArch64: Update the address for obtaining the Clang cross-toolchain
XiWeiGu Jun 13, 2024
dd7efcf
Avoid exceeding the configured thread count in x86_64 TOBF16 (#4748)
martin-frbg Jun 14, 2024
fdb88e0
Merge pull request #4749 from XiWeiGu/loongarch64-qemu-update
martin-frbg Jun 14, 2024
3d8054f
add clobber list
martin-frbg Jun 14, 2024
21c0f76
ensure that cpu-specific -march options are always applied to icx
martin-frbg Jun 14, 2024
d25ee4d
Fix detection of Intel ifx and apply -fp-model option to it
martin-frbg Jun 14, 2024
8bc37f9
Merge pull request #4754 from martin-frbg/issue4750-2
martin-frbg Jun 15, 2024
f13403b
Merge pull request #4755 from martin-frbg/issue4739
martin-frbg Jun 15, 2024
33bb4b9
Improve error message output from the fork() utest (#4753)
martin-frbg Jun 15, 2024
cf2962b
fix possible infinite loop on error (Reference-LAPACK PR 1024)
martin-frbg Jun 18, 2024
bf521a2
fix possible infinite loop on error (Reference-LAPACK PR 1024)
martin-frbg Jun 18, 2024
a9817b4
fix reference in format (Reference-LAPACK PR 1024)
martin-frbg Jun 18, 2024
2152796
fix possible infinite loop on error (Reference-LAPACK PR 1024)
martin-frbg Jun 18, 2024
18063b1
Merge pull request #4757 from martin-frbg/lapack1024
martin-frbg Jun 19, 2024
7582796
Add support forZhaoxin KX7000
martin-frbg Jun 20, 2024
9b2a0c7
Add Zhaoxin KX7000
martin-frbg Jun 20, 2024
0773695
Merge pull request #4760 from martin-frbg/zhaoxin7k
martin-frbg Jun 20, 2024
7e9a4ba
Merge pull request #4741 from shivammonaka/Pthread_Scalability_Improv…
martin-frbg Jun 20, 2024
3ec5992
Add a clobber list to fix utest errors seen with gcc13 on Apple M
martin-frbg Jun 20, 2024
1ba1b9c
Merge pull request #4761 from martin-frbg/m1zdot
martin-frbg Jun 20, 2024
a2ee4b1
Merge branch 'OpenMathLib:develop' into issue4728
martin-frbg Jun 21, 2024
f1248b8
handle INF and NAN in input
martin-frbg Jun 22, 2024
7f8f037
handle INF and NAN in input
martin-frbg Jun 22, 2024
0a744a9
temporarily(?) disable the alpha=0 branch to handle NaN/Inf in x
martin-frbg Jun 22, 2024
68f2501
temporarily(?) disable the alpha=0 branch to handle Inf/NaN in x
martin-frbg Jun 22, 2024
bd47630
exclude the alpha=0 branch as it does not handle NaN or Inf in x
martin-frbg Jun 22, 2024
c08113c
fix special cases of x= NAN or INF
martin-frbg Jun 22, 2024
541e1b6
disable the fast path for inc=1, alpha=0 as it does not handle x=NaN …
martin-frbg Jun 23, 2024
a11f086
Update sscal_msa.c
martin-frbg Jun 23, 2024
9e24121
temporarily(?) disable da=0 shortcut to handle x=Inf or NAN
martin-frbg Jun 23, 2024
8c472ef
Further tweak small GEMM for AArch64
Mousius Jun 24, 2024
37a8547
BENCH: sync codspeed-benchmarks with BLAS-benchmarks
ev-br Jun 24, 2024
400cf9f
restore the problem sizes for codspeed benchmarks
ev-br Jun 24, 2024
11a0c56
BENCH: add BLAS level 2 gemv and gbmv
ev-br Jun 27, 2024
c1019d5
Handle INF and NAN in inputs
martin-frbg Jun 27, 2024
28fb95d
BENCH: actually add gemv/gbmv f2py wrappers
ev-br Jun 27, 2024
2a5fe97
temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN
martin-frbg Jun 27, 2024
f3c364c
temporarily(?) disable the alpha=0 branch as it fails to handle INF,NAN
martin-frbg Jun 27, 2024
1b52e3d
openblas: fix `BUFFERSIZE` value
drupol Jun 28, 2024
e8da541
fix regex for renaming callbacks
isuruf Jun 28, 2024
017a268
Update GitHub Actions used in docs.yml
rgommers Jun 28, 2024
97df476
Improve html theme: dark mode, nicer color scheme, icons for github/l…
rgommers Jun 29, 2024
30770c5
Merge pull request #4772 from isuruf/rename
martin-frbg Jun 29, 2024
8cbb797
Merge pull request #4773 from rgommers/update-docs-yml
martin-frbg Jun 29, 2024
c33bc84
Merge pull request #4729 from martin-frbg/issue4728
martin-frbg Jun 29, 2024
3a8e72c
docs: improve the "About" documentation page
rgommers Jun 29, 2024
237c2c4
docs: fix footnote rendering on "Redistributing OpenBLAS" page
rgommers Jun 29, 2024
c1b9bb8
docs: improvements to the User Manual
rgommers Jun 29, 2024
a8e1ff8
docs: improve the "Build system" page
rgommers Jun 30, 2024
3eba16c
docs: improve the Developer manual
rgommers Jun 30, 2024
ca9a0c2
docs: improve extensions page
rgommers Jun 30, 2024
d0b9948
Guard against invalid thread_status.queue
martin-frbg Jun 30, 2024
3677b38
Merge pull request #4702 from bashimao/detect-nv-grace
martin-frbg Jun 30, 2024
4052b31
Merge pull request #4763 from ev-br/sync-codspeed
martin-frbg Jun 30, 2024
bdb6069
Merge pull request #4775 from martin-frbg/issue4770
martin-frbg Jun 30, 2024
c1c0dbf
docs: address review comments on PR 4774
rgommers Jul 2, 2024
5b385fd
WIP: fish out the gesdd failure?
ev-br Jul 2, 2024
9d0abe2
Add support for RISCV64_GENERIC in cmake
JAicewizard Jul 2, 2024
cd3c167
ignore sgesdd failure on codspeed
ev-br Jul 3, 2024
74f059a
Update OSX jobs to use the macos-12 image
martin-frbg Jul 3, 2024
acf0c3c
Merge pull request #4777 from ev-br/sgesdd_ci_err
martin-frbg Jul 3, 2024
2df4007
Update compiler and sdk versions for osx
martin-frbg Jul 3, 2024
df81b15
Merge pull request #4774 from rgommers/improve-docs
martin-frbg Jul 3, 2024
9836883
Merge pull request #4780 from martin-frbg/azureosx12
martin-frbg Jul 3, 2024
6ede8b1
ci: fix CI job to deploy docs, and make it run on pull requests too
rgommers Jul 3, 2024
f729013
Merge pull request #4781 from rgommers/fix-docs-deployment
martin-frbg Jul 3, 2024
cea4abc
Fix compiling on mingw
JAicewizard Jul 4, 2024
b422742
collect error output from ctest, if any
martin-frbg Jul 4, 2024
3063d03
Add another CPUID for Meteor Lake
martin-frbg Jul 4, 2024
536200b
fix handling of INF or NAN
martin-frbg Jul 4, 2024
e1eef56
Merge pull request #4783 from martin-frbg/cpuid_meteor
martin-frbg Jul 4, 2024
4547908
docs: rewrite "Install OpenBLAS" page (part 1: binaries, basic from s…
rgommers Jul 4, 2024
4520143
docs: rework building from source on Windows section
rgommers Jul 4, 2024
268dcd8
docs: convert remaining install sections (Android, iOS, FreeBSD, Cort…
rgommers Jul 4, 2024
a5c04e3
Update scal.c
martin-frbg Jul 4, 2024
cb15483
Vectorize SBGEMM incopy - 4x faster.
ChipKerchner Jul 9, 2024
e706bc1
Fix core assignment for Intel family 15
martin-frbg Jul 9, 2024
f708944
Add all 4 variations of the SBGEMM to compare_sgemm_sbgemm
ChipKerchner Jul 10, 2024
4c12090
Fix build on FreeBSD/powerpc64*
pkubaj Jul 10, 2024
1d77647
Merge pull request #4769 from drupol/fix-buffersize-value
martin-frbg Jul 11, 2024
362856f
Merge pull request #4778 from JAicewizard/develop
martin-frbg Jul 11, 2024
f0fc724
Merge pull request #4792 from martin-frbg/issue4790
martin-frbg Jul 11, 2024
8277828
Merge pull request #4785 from rgommers/docs-install
martin-frbg Jul 11, 2024
b70227a
Merge pull request #4795 from pkubaj/patch-1
martin-frbg Jul 11, 2024
475bd24
Suffix BUFFERSIZEs as UL to prevent int overflow in computations
martin-frbg Jul 11, 2024
2fefdfa
Merge branch 'OpenMathLib:develop' into azurewincl
martin-frbg Jul 11, 2024
dfc11ef
Merge pull request #4791 from ChipKerchner/vectorizeSBGEMMincopy
martin-frbg Jul 11, 2024
5d08ec7
Merge pull request #4782 from martin-frbg/azurewincl
martin-frbg Jul 11, 2024
f3cebb3
x86: Fixed numpy CI failure when the target is ZEN.
XiWeiGu Jul 10, 2024
9789034
Merge branch 'OpenMathLib:develop' into ppcbuf
martin-frbg Jul 12, 2024
6013b36
Merge pull request #4796 from martin-frbg/ppcbuf
martin-frbg Jul 12, 2024
3f39c8f
LoongArch: Fixed numpy CI failure
XiWeiGu Jul 12, 2024
9b3e80e
utest: Add test_gemv
XiWeiGu Jul 15, 2024
3b715e6
Add autodetection for riscv64
markdryan Jul 5, 2024
67bf4b6
Fix axpby_rvv kernels for cases where inc_y = 0
markdryan Jul 12, 2024
a373d0f
Improve the error message for thread creation failure
martin-frbg Jul 15, 2024
a3c10c6
Merge pull request #4799 from martin-frbg/issue4762
martin-frbg Jul 15, 2024
127ea5d
Add missing parenthesis
vlad0x00 Jul 15, 2024
56e1782
Add another missing parenthesis
vlad0x00 Jul 15, 2024
0985fdc
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
Jul 16, 2024
b1aa2e1
Merge pull request #4802 from markdryan/markdryan/rvv_axpby_incy0
martin-frbg Jul 16, 2024
e9f6aa4
Merge pull request #4800 from vlad0x00/patch-2
martin-frbg Jul 16, 2024
f6d6c14
mips: Fixed numpy CI failure
XiWeiGu Jul 17, 2024
34b80ce
mips64: Fixed numpy CI failure
XiWeiGu Jul 17, 2024
ee87cb9
Merge pull request #4803 from iha-taisei/SVESupportSDGEMV
martin-frbg Jul 17, 2024
eb4879e
make NAN handling depend on the dummy2 parameter
martin-frbg Jul 17, 2024
b9bfc8c
make NAN handling depend on dummy2 parameter
martin-frbg Jul 17, 2024
7375121
make NAN handling depend on dummy2 parameter
martin-frbg Jul 17, 2024
7284c53
make NAN handling depend on dummy2 parameter
martin-frbg Jul 17, 2024
3870995
make NAN handling depend on dummy2 parameter
martin-frbg Jul 17, 2024
2020569
fix NAN handling and make it depend on dummy2 parameter
martin-frbg Jul 17, 2024
b1c9faf
Remove k2 loop from DGEMM TN and use a more conservative heuristic fo…
Mousius Jul 18, 2024
9984c5c
Clean up k2 removal more and unroll SGEMM more
Mousius Jul 18, 2024
a9edddb
Unroll TN further
Mousius Jul 18, 2024
db98f87
Try to fix LAPACK testing failures on P7.
penghongbo Jul 19, 2024
5a845ef
Merge pull request #4809 from penghongbo/reorder_gemm_gemvt
martin-frbg Jul 19, 2024
dd6c33d
make NAN handling depend on dummy2 parameter
martin-frbg Jul 19, 2024
a815594
Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch
martin-frbg Jul 19, 2024
7311d93
Unroll TT further
Mousius Jul 19, 2024
ea4ab3b
Better header guard around bridge
Mousius Jul 20, 2024
c2ffd90
make NAN handling depend on dummy2 parameter
martin-frbg Jul 20, 2024
c064319
fix alpha=NAN case
martin-frbg Jul 20, 2024
dfbc234
fix NAN handling
martin-frbg Jul 20, 2024
73f8866
make NAN handling depend on DUMMY2 parameter
martin-frbg Jul 21, 2024
f5d0431
Merge branch 'OpenMathLib:develop' into scalfixes
martin-frbg Jul 21, 2024
29f3e75
work around a gcc14.1 bug observed on Loongarch
martin-frbg Jul 23, 2024
821ef34
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
yamazakimitsufumi Jul 23, 2024
ed82fd2
Merge pull request #4810 from martin-frbg/issue4805
martin-frbg Jul 23, 2024
0096482
fix incompatible definitions of MAXLOC
martin-frbg Jul 23, 2024
4140ac4
Merge pull request #4813 from martin-frbg/issue4812
martin-frbg Jul 23, 2024
b613754
Update scal..c
martin-frbg Jul 24, 2024
88caf02
Fix ambiguous error on Mac OS
yamazakimitsufumi Jul 25, 2024
949a7f9
Merge pull request #4811 from yamazakimitsufumi/add_a64fx_to_dynamic_…
martin-frbg Jul 25, 2024
a4e56e0
Merge pull request #4806 from Mousius/small-gemm
martin-frbg Jul 25, 2024
15c53dd
Merge pull request #4794 from XiWeiGu/Fixed_Numpy_CI_Test
martin-frbg Jul 25, 2024
fb7c53c
Merge pull request #4807 from martin-frbg/scalfixes
martin-frbg Jul 25, 2024
24acdd6
correct offset
martin-frbg Jul 26, 2024
a875304
fix inverted conditional for NAN handling
martin-frbg Jul 26, 2024
d9ae460
remove C99 requirement
martin-frbg Jul 26, 2024
db5328e
make array dimensions constant
martin-frbg Jul 26, 2024
7006492
replace "Preview" in the MSVC vcvarsall path with "Community"
martin-frbg Jul 26, 2024
a090011
just use numeric constants in dimensions
martin-frbg Jul 26, 2024
25e148e
Merge pull request #4817 from martin-frbg/fix4807
martin-frbg Jul 26, 2024
0929865
Merge pull request #4818 from martin-frbg/docs_winbuild
martin-frbg Jul 26, 2024
4460d3e
re-enable the sgesdd benchmark
martin-frbg Jul 26, 2024
886acfc
Merge pull request #4819 from martin-frbg/issue4776
martin-frbg Jul 26, 2024
175008c
harden against a dashed suffix to the gcc version number
martin-frbg Jul 27, 2024
05bf35f
Merge pull request #4822 from martin-frbg/issue4821
martin-frbg Jul 27, 2024
85ca003
Add fallback compile options for A64FX target
Mousius Jul 29, 2024
3ed226d
Re-add ISCLANG filter
Mousius Jul 29, 2024
6d071f1
Merge pull request #4826 from Mousius/a64fx-fallback
martin-frbg Jul 29, 2024
54ce33e
Fix GCC11 check for A64FX target
Mousius Jul 29, 2024
d11e734
Merge pull request #4827 from Mousius/a64fx-gcc11
martin-frbg Jul 29, 2024
a13015b
try requesting ubuntu22 instead of latest
martin-frbg Jul 30, 2024
86c15f0
Update Jenkinsfile.pwr
martin-frbg Jul 30, 2024
136a4ed
Merge pull request #4830 from martin-frbg/jenk
martin-frbg Jul 30, 2024
3db5dbc
forward to GEMV when one argument is actually a vector
martin-frbg May 20, 2024
28b5334
Complete implementation of GEMV forwarding
Mousius Jul 23, 2024
90eb863
Re-add accidental removal
Mousius Jul 23, 2024
b26424c
Allow opt into GEMM -> GEMV forwarding
Mousius Jul 24, 2024
ba2e989
Add accumulators to AArch64 GEMV Kernels
Mousius Jul 31, 2024
edbf093
Update zarch SCAL kernels to handle INF and NAN arguments (#4829)
martin-frbg Jul 31, 2024
9afd0c8
Merge pull request #4814 from Mousius/gemv-proxy
martin-frbg Jul 31, 2024
fcb88b9
enable GEMM/GEMV forwarding for riscv and ppc
martin-frbg Jul 31, 2024
9eecd0d
enable GEMM/GEMV forwarding for riscv and ppc
martin-frbg Jul 31, 2024
42d8865
fix typo
martin-frbg Aug 1, 2024
abff4ba
re-enable queue struct members related to locking
martin-frbg Aug 2, 2024
6468dc1
restore the coarse locking of the pre-4359 version
martin-frbg Aug 2, 2024
2c2b6bc
Merge pull request #4831 from martin-frbg/gemmforward
martin-frbg Aug 3, 2024
7af3c55
use TARGET rather than CORE from Makefile.conf_last to fill in pkgconfig
martin-frbg Aug 3, 2024
f408194
mention RISCV64 as a permitted architecture for DYNAMIC_ARCH
martin-frbg Aug 3, 2024
e8bd97a
add RISCV64 entries for DYNAMIC_ARCH
martin-frbg Aug 3, 2024
2aed901
Add riscv sources for DYNAMIC_ARCH
martin-frbg Aug 3, 2024
5257f80
fix invalid ifdef syntax in HUGETLB handling
martin-frbg Aug 3, 2024
60abcc3
add proper return statement
martin-frbg Aug 3, 2024
f1c9803
add proper return statement
martin-frbg Aug 3, 2024
ae27b02
Merge pull request #4837 from martin-frbg/dyn_riscv_cmake
martin-frbg Aug 4, 2024
50397e0
Merge pull request #4838 from martin-frbg/fix4662-3
martin-frbg Aug 4, 2024
cf483d9
Merge pull request #4836 from martin-frbg/issue4275-3
martin-frbg Aug 4, 2024
19f8a8d
Merge pull request #4839 from martin-frbg/fix4794
martin-frbg Aug 4, 2024
a4845fa
set MACOSX_RPATH to true on Apple
martin-frbg Aug 4, 2024
14a8a9a
Merge pull request #4840 from martin-frbg/issue4823
martin-frbg Aug 5, 2024
c8b4cec
prevent compilers from using FMA (Reference-LAPACK PR 1033)
martin-frbg Aug 5, 2024
bce48d4
Fix typos and sytrd boundary workspace (Reference-LAPACK PR 1030)
martin-frbg Aug 5, 2024
ae9e0e3
Merge pull request #4842 from martin-frbg/lapack1030
martin-frbg Aug 5, 2024
5bdd3a0
Merge pull request #4841 from martin-frbg/lapack1033
martin-frbg Aug 5, 2024
7e8118d
Support new build option LAPACK_STRLEN
martin-frbg Aug 6, 2024
cc36db6
Support new LAPACK build option LAPACK_STRLEN
martin-frbg Aug 6, 2024
923b79d
make the type of the hidden arguments configurable via LAPACK_STRLEN …
martin-frbg Aug 6, 2024
797ae08
Add explanation of LAPACK_STRLEN
martin-frbg Aug 6, 2024
3b8d7df
Merge pull request #4846 from martin-frbg/lapack1025
martin-frbg Aug 6, 2024
753c7eb
Merge pull request #4835 from martin-frbg/revertwin4359
martin-frbg Aug 7, 2024
ccc2333
have the dummy GEMM3M kernel at least forward to regular GEMM
martin-frbg Aug 7, 2024
46e331a
remove the unworkable GEMM3M restriction from GENERIC again
martin-frbg Aug 7, 2024
deae7cf
Merge pull request #4850 from martin-frbg/generic_3m
martin-frbg Aug 7, 2024
76db713
fix invocation of GEMM3M tests
martin-frbg Aug 7, 2024
d92cc96
Merge pull request #4851 from martin-frbg/test3m
martin-frbg Aug 7, 2024
7878976
disable forwarding from SBGEMM to SBGEMV for now
martin-frbg Aug 8, 2024
1df95bb
Update Changelog.txt for 0.3.28
martin-frbg Aug 8, 2024
1c2bfea
Merge pull request #4852 from martin-frbg/fix4814
martin-frbg Aug 8, 2024
2c8e001
Merge pull request #4853 from martin-frbg/changelog0328
martin-frbg Aug 8, 2024
884a949
Merge branch 'release-0.3.0' into develop
martin-frbg Aug 8, 2024
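
Several commits in the list above disable the `da=0` / `alpha=0` shortcut in SCAL kernels so that INF and NAN entries in `x` are handled properly. The following is a minimal pure-Python sketch of why such a shortcut changes the result. It is not OpenBLAS code: the function names `scal_shortcut` and `scal_ieee` are hypothetical and exist only for illustration, and Python floats stand in for the kernels' double-precision arithmetic.

```python
import math


def scal_shortcut(alpha, x):
    """Hypothetical SCAL with a fast path: alpha == 0 simply zero-fills x."""
    if alpha == 0.0:
        return [0.0] * len(x)
    return [alpha * xi for xi in x]


def scal_ieee(alpha, x):
    """Hypothetical SCAL that always multiplies, so IEEE 754 rules apply."""
    return [alpha * xi for xi in x]


x = [1.0, math.inf, math.nan]

fast = scal_shortcut(0.0, x)  # the shortcut hides INF and NAN in x
full = scal_ieee(0.0, x)      # 0 * INF and 0 * NAN both yield NAN

print(fast)
print(full)
```

With the shortcut, every element becomes zero. Without it, the INF and NAN entries propagate as NAN, which is the distinction the "handle INF and NAN" commits appear to be concerned with.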
26 changes: 14 additions & 12 deletions .cirrus.yml
@@ -41,7 +41,7 @@ macos_instance:
# - make CC=gcc-11 FC=gfortran-11 USE_OPENMP=1

macos_instance:
image: ghcr.io/cirruslabs/macos-monterey-xcode:latest
image: ghcr.io/cirruslabs/macos-sonoma-xcode:latest
task:
name: AppleM1/LLVM x86_64 xbuild
compile_script:
@@ -58,8 +58,8 @@ task:
- export VALID_ARCHS="i386 x86_64"
- xcrun --sdk macosx --show-sdk-path
- xcodebuild -version
- export CC=/Applications/Xcode-15.3.0.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
- export CFLAGS="-O2 -unwindlib=none -Wno-macro-redefined -isysroot /Applications/Xcode-15.3.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.4.sdk -arch x86_64"
- export CC=/Applications/Xcode_15.4.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
- export CFLAGS="-O2 -unwindlib=none -Wno-macro-redefined -isysroot /Applications/Xcode_15.4.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.5.sdk -arch x86_64"
- make TARGET=CORE2 DYNAMIC_ARCH=1 NUM_THREADS=32 HOSTCC=clang NOFORTRAN=1 RANLIB="ls -l"
always:
config_artifacts:
@@ -70,16 +70,18 @@ task:
# type: application/octet-streamm

macos_instance:
image: ghcr.io/cirruslabs/macos-monterey-xcode:latest
image: ghcr.io/cirruslabs/macos-sonoma-xcode:latest
task:
name: AppleM1/LLVM armv8-ios xbuild
compile_script:
- #brew install llvm
- export #PATH=/opt/homebrew/opt/llvm/bin:$PATH
- export #LDFLAGS="-L/opt/homebrew/opt/llvm/lib"
- export #CPPFLAGS="-I/opt/homebrew/opt/llvm/include"
- export CC=/Applications/Xcode-15.3.0.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
- export CFLAGS="-O2 -unwindlib=none -Wno-macro-redefined -isysroot /Applications/Xcode-15.3.0.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS17.4.sdk -arch arm64 -miphoneos-version-min=10.0"
- export CC=/Applications/Xcode_15.4.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
- export CFLAGS="-O2 -unwindlib=none -Wno-macro-redefined -isysroot /Applications/Xcode_15.4.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS17.5.sdk -arch arm64 -miphoneos-version-min=10.0"
- xcrun --sdk iphoneos --show-sdk-path
- ls -l /Applications
- make TARGET=ARMV8 NUM_THREADS=32 HOSTCC=clang NOFORTRAN=1 CROSS=1
always:
config_artifacts:
@@ -96,11 +98,11 @@ task:
- export #LDFLAGS="-L/opt/homebrew/opt/llvm/lib"
- export #CPPFLAGS="-I/opt/homebrew/opt/llvm/include"
- ls /System/Volumes/Data/opt/homebrew
- ls -l /System/Volumes/Data/opt/homebrew/Caskroom/
- ls -l /System/Volumes/Data/opt/homebrew/Caskroom/android-ndk
- find /System/Volumes/Data/opt/homebrew -name "armv7a-linux-androideabi*-ranlib"
- #export CC=/Applications/Xcode-13.4.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
- #export CFLAGS="-O2 -unwindlib=none -Wno-macro-redefined -isysroot /Applications/Xcode-13.4.1.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS16.0.sdk -arch arm64 -miphoneos-version-min=10.0"
- export CC=/System/Volumes/Data/opt/homebrew/Caskroom/android-ndk/26c/AndroidNDK*.app/Contents/NDK/toolchains/llvm/prebuilt/darwin-x86_64/bin/armv7a-linux-androideabi23-clang
- export CC=/System/Volumes/Data/opt/homebrew/Caskroom/android-ndk/26d/AndroidNDK*.app/Contents/NDK/toolchains/llvm/prebuilt/darwin-x86_64/bin/armv7a-linux-androideabi23-clang
- make TARGET=ARMV7 ARM_SOFTFP_ABI=1 NUM_THREADS=32 HOSTCC=clang NOFORTRAN=1 RANLIB="ls -l"
always:
config_artifacts:
@@ -132,7 +134,7 @@ task:
FreeBSD_task:
name: FreeBSD-gcc12
freebsd_instance:
image_family: freebsd-13-2
image_family: freebsd-13-3
install_script:
- pkg update -f && pkg upgrade -y && pkg install -y gmake gcc
compile_script:
@@ -143,7 +145,7 @@ FreeBSD_task:
FreeBSD_task:
name: freebsd-gcc12-ilp64
freebsd_instance:
image_family: freebsd-13-2
image_family: freebsd-13-3
install_script:
- pkg update -f && pkg upgrade -y && pkg install -y gmake gcc
compile_script:
@@ -153,10 +155,10 @@ FreeBSD_task:
FreeBSD_task:
name: FreeBSD-clang-openmp
freebsd_instance:
image_family: freebsd-13-2
image_family: freebsd-13-3
install_script:
- pkg update -f && pkg upgrade -y && pkg install -y gmake gcc
- ln -s /usr/local/lib/gcc12/libgfortran.so.5.0.0 /usr/lib/libgfortran.so
- ln -s /usr/local/lib/gcc13/libgfortran.so.5.0.0 /usr/lib/libgfortran.so
compile_script:
- gmake CC=clang FC=gfortran USE_OPENMP=1 CPP_THREAD_SAFETY_TEST=1

1 change: 1 addition & 0 deletions .github/workflows/c910v.yml
@@ -84,6 +84,7 @@ jobs:
run: |
export PATH=$GITHUB_WORKSPACE/qemu-install/bin/:$PATH
qemu-riscv64 ./utest/openblas_utest
qemu-riscv64 ./utest/openblas_utest_ext
OPENBLAS_NUM_THREADS=2 qemu-riscv64 ./ctest/xscblat1
OPENBLAS_NUM_THREADS=2 qemu-riscv64 ./ctest/xdcblat1
OPENBLAS_NUM_THREADS=2 qemu-riscv64 ./ctest/xccblat1
157 changes: 157 additions & 0 deletions .github/workflows/codspeed-bench.yml
@@ -0,0 +1,157 @@
name: Run codspeed benchmarks

on: [push, pull_request]

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true

permissions:
contents: read # to fetch code (actions/checkout)

jobs:
benchmarks:
if: "github.repository == 'OpenMathLib/OpenBLAS'"
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest]
fortran: [gfortran]
build: [make]
pyver: ["3.12"]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v3
with:
python-version: ${{ matrix.pyver }}

- name: Print system information
run: |
if [ "$RUNNER_OS" == "Linux" ]; then
cat /proc/cpuinfo
fi

- name: Install Dependencies
run: |
if [ "$RUNNER_OS" == "Linux" ]; then
sudo apt-get update
sudo apt-get install -y gfortran cmake ccache libtinfo5
else
echo "::error::$RUNNER_OS not supported"
exit 1
fi

- name: Compilation cache
uses: actions/cache@v3
with:
path: ~/.ccache
# We include the commit sha in the cache key, as new cache entries are
# only created if there is no existing entry for the key yet.
# GNU make and cmake call the compilers differently. It looks like
# that causes the cache to mismatch. Keep the ccache for both build
# tools separate to avoid polluting each other.
key: ccache-${{ runner.os }}-${{ matrix.build }}-${{ matrix.fortran }}-${{ github.ref }}-${{ github.sha }}
# Restore a matching ccache cache entry. Prefer same branch and same Fortran compiler.
restore-keys: |
ccache-${{ runner.os }}-${{ matrix.build }}-${{ matrix.fortran }}-${{ github.ref }}
ccache-${{ runner.os }}-${{ matrix.build }}-${{ matrix.fortran }}
ccache-${{ runner.os }}-${{ matrix.build }}

- name: Write out the .pc
run: |
cd benchmark/pybench
cat > openblas.pc << EOF
libdir=${{ github.workspace }}
includedir= ${{ github.workspace }}
openblas_config= OpenBLAS 0.3.27 DYNAMIC_ARCH NO_AFFINITY Haswell MAX_THREADS=64
version=0.0.99
extralib=-lm -lpthread -lgfortran -lquadmath -L${{ github.workspace }} -lopenblas
Name: openblas
Description: OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version
Version: ${version}
URL: https://github.com/xianyi/OpenBLAS
Libs: ${{ github.workspace }}/libopenblas.so -Wl,-rpath,${{ github.workspace }}
Libs.private: -lm -lpthread -lgfortran -lquadmath -L${{ github.workspace }} -lopenblas
Cflags: -I${{ github.workspace}}
EOF
cat openblas.pc

- name: Configure ccache
run: |
if [ "${{ matrix.build }}" = "make" ]; then
# Add ccache to path
if [ "$RUNNER_OS" = "Linux" ]; then
echo "/usr/lib/ccache" >> $GITHUB_PATH
elif [ "$RUNNER_OS" = "macOS" ]; then
echo "$(brew --prefix)/opt/ccache/libexec" >> $GITHUB_PATH
else
echo "::error::$RUNNER_OS not supported"
exit 1
fi
fi
# Limit the maximum size and switch on compression to avoid exceeding the total disk or cache quota (5 GB).
test -d ~/.ccache || mkdir -p ~/.ccache
echo "max_size = 300M" > ~/.ccache/ccache.conf
echo "compression = true" >> ~/.ccache/ccache.conf
ccache -s

- name: Build OpenBLAS
run: |
case "${{ matrix.build }}" in
"make")
make -j$(nproc) DYNAMIC_ARCH=1 USE_OPENMP=0 FC="ccache ${{ matrix.fortran }}"
;;
"cmake")
mkdir build && cd build
cmake -DDYNAMIC_ARCH=1 \
-DNOFORTRAN=0 \
-DBUILD_WITHOUT_LAPACK=0 \
-DCMAKE_VERBOSE_MAKEFILE=ON \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_Fortran_COMPILER=${{ matrix.fortran }} \
-DCMAKE_C_COMPILER_LAUNCHER=ccache \
-DCMAKE_Fortran_COMPILER_LAUNCHER=ccache \
..
cmake --build .
;;
*)
echo "::error::Configuration not supported"
exit 1
;;
esac

- name: Show ccache status
continue-on-error: true
run: ccache -s

- name: Install benchmark dependencies
run: pip install meson ninja numpy pytest pytest-codspeed --user

- name: Build the wrapper
run: |
cd benchmark/pybench
export PKG_CONFIG_PATH=$PWD
meson setup build --prefix=$PWD/build-install
meson install -C build
#
# sanity check
cd build/openblas_wrap
python -c'import _flapack; print(dir(_flapack))'

- name: Run benchmarks under pytest-benchmark
run: |
cd benchmark/pybench
pip install pytest-benchmark
export PYTHONPATH=$PWD/build-install/lib/python${{matrix.pyver}}/site-packages/
OPENBLAS_NUM_THREADS=1 pytest benchmarks/bench_blas.py -k 'gesdd'

- name: Run benchmarks
uses: CodSpeedHQ/action@v2
with:
token: ${{ secrets.CODSPEED_TOKEN }}
run: |
cd benchmark/pybench
export PYTHONPATH=$PWD/build-install/lib/python${{matrix.pyver}}/site-packages/
OPENBLAS_NUM_THREADS=1 pytest benchmarks/bench_blas.py --codspeed
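
The "Compilation cache" step in the workflow above pairs a commit-specific `key` with progressively shorter `restore-keys`. A rough sketch of how that fallback is understood to behave follows, assuming the documented `actions/cache` semantics (exact match on `key` first, then prefix matching of each restore key in order, preferring the most recent entry). The helper `pick_cache` and the sample cache entries are hypothetical, and the real action applies additional scoping rules that are not modelled here.

```python
def pick_cache(key, restore_keys, entries):
    """Return the stored cache key to restore, or None on a full miss.

    `entries` lists the stored cache keys, oldest first (an assumption
    made for this sketch).
    """
    if key in entries:
        return key  # exact hit: nothing new will be saved
    for prefix in restore_keys:
        matches = [e for e in entries if e.startswith(prefix)]
        if matches:
            return matches[-1]  # most recently created prefix match
    return None


base = "ccache-Linux-make-gfortran"
entries = [
    f"{base}-refs/heads/develop-aaa111",
    f"{base}-refs/heads/develop-bbb222",
]


def restore_keys_for(ref):
    # Mirrors the three restore-keys lines of the workflow.
    return [f"{base}-{ref}", base, "ccache-Linux-make"]


# New commit on develop: falls back to the latest develop cache.
same_branch = pick_cache(
    f"{base}-refs/heads/develop-ccc333",
    restore_keys_for("refs/heads/develop"),
    entries,
)

# First run on a pull request ref: falls back to the compiler-level prefix.
new_branch = pick_cache(
    f"{base}-refs/pull/1/merge-ddd444",
    restore_keys_for("refs/pull/1/merge"),
    entries,
)

print(same_branch)
print(new_branch)
```

Because the commit SHA is part of `key`, an exact hit is rare, so each run typically restores an older cache through a prefix and then saves a fresh entry under the new key.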

40 changes: 40 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,40 @@
name: Publish docs via GitHub Pages

on:
push:
branches:
- develop
pull_request:
branches:
- develop

jobs:
build:
name: Deploy docs
if: "github.repository == 'OpenMathLib/OpenBLAS'"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0

- uses: actions/setup-python@v5
with:
python-version: "3.10"

- name: Install MkDocs and doc theme packages
run: pip install mkdocs mkdocs-material mkdocs-git-revision-date-localized-plugin

- name: Build docs site
run: mkdocs build

# mkdocs gh-deploy command only builds to the top-level, hence deploying
# with this action instead.
# Deploys to http://www.openmathlib.org/OpenBLAS/docs/
- name: Deploy docs
uses: peaceiris/actions-gh-pages@4f9cc6602d3f66b9c108549d475ec49e8ef4d45e # v4.0.0
if: ${{ github.ref == 'refs/heads/develop' }}
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./site
destination_dir: docs/
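
The docs workflow above runs on both pushes and pull requests against `develop`, but guards the deploy step with a check on `github.ref`. The gating logic can be sketched as below. This only models the two `if:` conditions shown in the file; the function `docs_job_plan` is a hypothetical name, and nothing here calls the GitHub API.

```python
def docs_job_plan(repository, ref):
    """Return the phases the docs job would run for a given event context."""
    steps = []
    if repository != "OpenMathLib/OpenBLAS":
        return steps  # job-level `if` skips forks entirely
    steps.append("build")  # `mkdocs build` runs for pushes and pull requests
    if ref == "refs/heads/develop":
        steps.append("deploy")  # only pushes to develop publish the site
    return steps


print(docs_job_plan("OpenMathLib/OpenBLAS", "refs/heads/develop"))
print(docs_job_plan("OpenMathLib/OpenBLAS", "refs/pull/4781/merge"))
print(docs_job_plan("someuser/OpenBLAS", "refs/heads/develop"))
```

In this reading, pull requests still exercise the docs build, so a broken page would surface in review, while publishing stays limited to the main development branch.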