GH-41816: [C++] Meson Build System Support #45441
base: main
@kou this is a simplified attempt at adding Meson to address your comment #41816 (comment). This is currently slower than the CMake configuration by a good deal (pending investigation) and the configuration file is not 100% complete, but this should give us an idea of what a Meson configuration may look like.
To use, from the cpp directory developers can:
    meson setup builddir -Dtests=true -Dcompute=true
    meson compile -C builddir
    meson test -C builddir
For ASAN/UBSAN, users could simply:
    meson setup builddir -Dtests=true -Dcompute=true -Db_sanitize=address,undefined
Or, if the project is already set up, run:
    meson configure -C builddir -Db_sanitize=address,undefined
Coverage can be enabled with:
    meson configure -C builddir -Db_coverage=true
and tests can be run under valgrind with:
    meson test -C builddir --wrap='valgrind --track-origins=yes --leak-check=full' --print-errorlog
Thanks. Can we start from a more simplified version? For example, we don't need
We want to add a nightly CI for this to detect regressions. We want to update version information automatically in the release process. For example: arrow/dev/release/utils-prepare.sh, lines 39 to 44 in 0556905
(I can do it in this branch later.) Anyway, I didn't know that
Sure, I can strip it down further. So do you think just something that builds libarrow is the right starting point?
Yes. Only minimal
Remove extraneous files
80c606c to 3760159
c492403 to 61f4bbb
Thanks @kou . Hope this is more in line with your expectation.
As far as nightlies go, can you point me to an existing nightly CI setup in the repo? I was able to grep for some R nightlies, but not sure if there is existing infrastructure for C++ nightly jobs where this would be better placed
cpp/src/arrow/meson.build
Outdated
    objects: objlibs,
    include_directories: [include_dir],
    install: true,
    # compute/expression.cc may have undefined IPC symbols in non-IPC builds
Wasn't sure if this was intentional or not in the existing code base
Ah, we should disable features that depend on IPC in cpp/src/arrow/compute/expression.cc, like GH-45171. Could you open an issue for this?
No problem - see #45512
    override_options: {'b_lundef': 'false'},
    )

    # Meson does not allow you to glob for headers to install. See also
Meson is pretty strict about wildcards for the reasons outlined in their FAQ. Idiomatically, Meson would want you to put the files you want to install in a separate directory and call install_subdir, but that would go beyond the scope of this initial PR, I think.
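The two approaches mentioned in this comment could look roughly like the sketch below. The header names are only a small illustrative subset, and the public_headers directory is hypothetical; neither layout is taken from the PR.

```meson
# Option A: list every header explicitly (Meson has no globbing).
# The three file names are an illustrative subset only.
install_headers(
    [
        'api.h',
        'array.h',
        'buffer.h',
    ],
    subdir: 'arrow',
)

# Option B: keep the installable headers in a dedicated directory and
# install that directory as a whole. 'public_headers/arrow' is a
# hypothetical path; its last component ('arrow') becomes the name of
# the directory created under includedir.
# install_subdir(
#     'public_headers/arrow',
#     install_dir: get_option('includedir'),
# )
```

Option A keeps the current source layout at the cost of a long explicit list, while Option B would require moving files, which is why it is out of scope here.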
cpp/src/arrow/meson.build
Outdated
    arrow_so_version = (ver_major.to_int() * 100 + ver_minor.to_int()).to_string()
    arrow_full_so_version = '@0@.@1@.@2@'.format(arrow_so_version, ver_patch, 0)

    # TODO: The Meson generated .pc file does not include the Apache license
I did not research this in too much detail yet; figured I'd check if it was a big deal before investing time into a resolution
We don't need it. arrow.pc.in has the license header because it's in this repository. Files that don't exist in this repository don't need it.
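As a side note on the arrow_so_version lines quoted at the top of this thread, the arithmetic can be checked with a small standalone meson.build. This is only a sketch: the project name is made up, and the way ver_major, ver_minor, and ver_patch are derived from the version string is an assumption, since the quoted snippet does not show it.

```meson
# Standalone sketch; 'soversion-demo' is a hypothetical project name.
project('soversion-demo', version: '19.0.0-SNAPSHOT')

# Assumed derivation: drop the '-SNAPSHOT' suffix, then split on dots.
version_no_snapshot = meson.project_version().split('-')[0]
components = version_no_snapshot.split('.')
ver_major = components[0]
ver_minor = components[1]
ver_patch = components[2]

# Same two lines as in the quoted snippet.
arrow_so_version = (ver_major.to_int() * 100 + ver_minor.to_int()).to_string()
arrow_full_so_version = '@0@.@1@.@2@'.format(arrow_so_version, ver_patch, 0)

# For 19.0.0 this is 19 * 100 + 0 = 1900, so the full version is 1900.0.0.
message('so version: ' + arrow_so_version)
message('full so version: ' + arrow_full_so_version)
```

Values of this shape are what one would typically pass to the soversion and version keyword arguments of library().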
Can I push some commits to this branch for nightly CI and auto version update?
cpp/meson.build
Outdated
    'cpp',
    'c',
    version: '19.0.0-SNAPSHOT',
    license: 'Apache 2.0',
Suggested change:
- license: 'Apache 2.0',
+ license: 'Apache-2.0',
cpp/meson.build
Outdated
    git_id = get_option('git_id')
    if git_id == ''
        git_id = run_command('git', 'log', '-n1', '--format=%H', check: true).stdout().strip()
Can we ignore the git log error? This will fail when we use a source archive.
Suggested change:
- git_id = run_command('git', 'log', '-n1', '--format=%H', check: true).stdout().strip()
+ git_id = run_command('git', 'log', '-n1', '--format=%H').stdout().strip()
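One way to make the fallback explicit is sketched below. It assumes the string option git_id from this PR is declared in the options file. The find_program() guard and the returncode() check are additions for illustration and were not discussed in this thread.

```meson
# Fragment for cpp/meson.build; assumes the 'git_id' string option exists.
git_id = get_option('git_id')
if git_id == ''
    # Look for git without failing configuration when it is not installed.
    git = find_program('git', required: false)
    if git.found()
        # check: false lets configuration continue when the command fails,
        # for example in a release archive that has no .git directory.
        git_log = run_command(git, 'log', '-n1', '--format=%H', check: false)
        if git_log.returncode() == 0
            git_id = git_log.stdout().strip()
        endif
    endif
endif
# If nothing worked, git_id stays '' rather than containing error output.
```

Spelling out check: false keeps the intent visible to readers. The same pattern would apply to the git describe call used for git_description.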
cpp/meson.build
Outdated
    git_description = get_option('git_description')
    if git_description == ''
        git_description = run_command('git', 'describe', '--tags', check: true).stdout().strip()
Suggested change:
- git_description = run_command('git', 'describe', '--tags', check: true).stdout().strip()
+ git_description = run_command('git', 'describe', '--tags').stdout().strip()
cpp/src/arrow/meson.build
Outdated
    # Meson does not natively support object libraries
    # https://github.com/mesonbuild/meson/issues/13843
    objlib_sources = {
We don't need to emulate objlib in Meson. It's just for faster build. We can just use library().
I have a pretty poor understanding of how object libraries actually work, but I don't think we can just convert these to using the library call unless we want the linker to allow undefined symbols in these libraries as well (?)
Trying to do what I think you suggest generates a huge number of errors like:
[7/96] Linking target src/arrow/libarrow_io.so
FAILED: src/arrow/libarrow_io.so
c++ -o src/arrow/libarrow_io.so src/arrow/libarrow_io.so.p/io_buffered.cc.o src/arrow/libarrow_io.so.p/io_caching.cc.o src/arrow/libarrow_io.so.p/io_compressed.cc.o src/arrow/libarrow_io.so.p/io_file.cc.o src/arrow/libarrow_io.so.p/io_hdfs.cc.o src/arrow/libarrow_io.so.p/io_hdfs_internal.cc.o src/arrow/libarrow_io.so.p/io_interfaces.cc.o src/arrow/libarrow_io.so.p/io_memory.cc.o src/arrow/libarrow_io.so.p/io_slow.cc.o src/arrow/libarrow_io.so.p/io_stdio.cc.o src/arrow/libarrow_io.so.p/io_transform.cc.o -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared -fPIC -Wl,--start-group -Wl,-soname,libarrow_io.so -Wl,--end-group
/usr/bin/ld: src/arrow/libarrow_io.so.p/io_buffered.cc.o: in function `arrow::io::BufferedInputStream::SetBufferSize(long)':
buffered.cc:(.text+0x13a9): undefined reference to `arrow::Buffer::CheckCPU() const'
/usr/bin/ld: buffered.cc:(.text+0x13b1): undefined reference to `arrow::Buffer::CheckMutable() const'
/usr/bin/ld: buffered.cc:(.text+0x1407): undefined reference to `arrow::util::detail::StringStreamWrapper::StringStreamWrapper()'
/usr/bin/ld: buffered.cc:(.text+0x149e): undefined reference to `arrow::util::detail::StringStreamWrapper::str[abi:cxx11]()'
/usr/bin/ld: buffered.cc:(.text+0x14a6): undefined reference to `arrow::util::detail::StringStreamWrapper::~StringStreamWrapper()'
/usr/bin/ld: buffered.cc:(.text+0x14b6): undefined reference to `arrow::Status::Status(arrow::StatusCode, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
Ah, we can remove objlib_sources and static_library() for them entirely. We can just add sources in objlib_sources to arrow_srcs and use it in one library() (that already exists in this PR).
(We don't need objlib feature entirely with Meson.)
Makes sense. I've still kept the top level dict for parity with CMake, and since some of the sources vary depending on compilation options (like arrow compute). Let me know if this is more in line with what you are thinking
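The outcome of this thread, a per-component dict of sources flattened into a single library() call, might look roughly like the sketch below. The component names, groupings, and abbreviated file lists are illustrative rather than copied from the PR; include_dir is the variable used in the quoted snippets.

```meson
# Fragment for cpp/src/arrow/meson.build. Component names and file lists
# are illustrative and heavily abbreviated.
arrow_components = {
    'arrow_core': ['buffer.cc', 'status.cc'],
    'arrow_io': ['io/buffered.cc', 'io/caching.cc', 'io/file.cc'],
}

# Flatten every component into one source list.
arrow_srcs = []
foreach name, sources : arrow_components
    arrow_srcs += sources
endforeach

# A single shared library: symbols that io/buffered.cc needs from
# buffer.cc are resolved in the same link step, so no per-component
# library has to tolerate undefined references.
arrow_lib = library(
    'arrow',
    sources: arrow_srcs,
    include_directories: [include_dir],
    install: true,
)
```

Keeping the dict leaves room to add or skip whole components based on build options before the list is flattened.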
cpp/src/arrow/util/meson.build
Outdated
    configuration: conf_data,
    format: 'cmake@',
    install: true,
    install_dir: 'arrow',
Suggested change:
- install_dir: 'arrow',
+ install_dir: 'arrow/util',
cpp/src/arrow/util/meson.build
Outdated
    foreach cmakedefine : [
        'ARROW_COMPUTE',
        'ARROW_CSV',
        'ARROW_CUDA',
        'ARROW_DATASET',
        'ARROW_FILESYSTEM',
        'ARROW_FLIGHT',
        'ARROW_FLIGHT_SQL',
        'ARROW_IPC',
        'ARROW_JEMALLOC',
        'ARROW_JEMALLOC_VENDORED',
        'ARROW_JSON',
        'ARROW_MIMALLOC',
        'ARROW_ORC',
        'ARROW_PARQUET',
        'ARROW_SUBSTRAIT',
        'ARROW_AZURE',
        'ARROW_ENABLE_THREADING',
        'ARROW_GCS',
        'ARROW_HDFS',
        'ARROW_S3',
        'ARROW_USE_GLOG',
        'ARROW_USE_NATIVE_INT128',
        'ARROW_WITH_BROTLI',
        'ARROW_WITH_BZ2',
        'ARROW_WITH_LZ4',
        'ARROW_WITH_MUSL',
        'ARROW_WITH_OPENTELEMETRY',
        'ARROW_WITH_RE2',
        'ARROW_WITH_SNAPPY',
        'ARROW_WITH_UCX',
        'ARROW_WITH_UTF8PROC',
        'ARROW_WITH_ZLIB',
        'ARROW_WITH_ZSTD',
        'PARQUET_REQUIRE_ENCRYPTION',
    ]
        conf_data.set(cmakedefine, false)
    endforeach
It seems that we don't need to use foreach here:
    conf_data.set('ARROW_COMPUTE', false)
    conf_data.set('ARROW_CSV', false)
    ...
cpp/src/arrow/util/meson.build
Outdated
    install: true,
    install_dir: 'arrow',
We don't need to install _internal.h:
    install: true,
    install_dir: 'arrow',
Sure, no problem.
Rationale for this change
The Meson build system may be more user-friendly to some developers, and may make it easier to perform tasks like running tests under valgrind, collecting coverage, or building with ASAN/UBSAN. There is also prior art for using Meson in the nanoarrow and arrow-adbc projects.
What changes are included in this PR?
This PR implements a Meson configuration that can support the ARROW_BUILD_TESTS and ARROW_COMPUTE options, as well as the dependencies for those options.
Are these changes tested?
Not in CI. You can download the branch and, once it is built, run meson test to run the test suite.
Are there any user-facing changes?
No. This is strictly for developers.