Golink wait for too much time with bazel 7 and remote execution #4116

Open
hawkingrei opened this issue Sep 23, 2024 · 3 comments

@hawkingrei
Contributor

What version of rules_go are you using?

v0.50.1

What version of gazelle are you using?

v0.38.0

What version of Bazel are you using?

[root@10-2-12-124 tidb]# bazelisk version
Bazelisk version: v1.17.0
Starting local Bazel server and connecting to it...
Build label: 7.3.1
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Mon Aug 19 16:12:50 2024 (1724083970)
Build timestamp: 1724083970
Build timestamp as int: 1724083970

Does this issue reproduce with the latest releases of all the above?

bazel build //...

Run with remote execution, using bazel-buildfarm v2.11.1 and the following .bazelrc:

build --announce_rc
build --experimental_guard_against_concurrent_changes
build --experimental_remote_merkle_tree_cache
build --disk_cache=/data1/bazel/cache
build --experimental_remote_cache_compression
build --repository_cache=/data1/bazel/cache
run --color=yes
build --remote_executor=grpc://xxxx:8980
build --jobs 100

What operating system and processor architecture are you using?

Any other potentially useful information about your toolchain?

What did you do?

What did you expect to see?

The build should finish, as it does with Bazel v6.5:

https://tiprow.hawkingrei.com/view/gs/pingcapprow/pr-logs/pull/pingcap_tidb/51126/fast_test_tiprow/1838293257872740352#1:build-log.txt%3A11

What did you see instead?

The build waits for a long time. It never finishes and never raises an error.

Starting local Bazel server and connecting to it...
INFO: Invocation ID: aa345b60-b676-4ecb-bca1-12c92e87a140
INFO: Reading 'startup' options from /data3/wangweizhen/tidb/.bazelrc: --host_jvm_args=-Xmx4g, --unlimit_coredumps
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=148
INFO: Reading rc options for 'build' from /data3/wangweizhen/tidb/.bazelrc:
  'build' options: --announce_rc --experimental_guard_against_concurrent_changes --experimental_remote_merkle_tree_cache --java_language_version=17 --java_runtime_version=17 --tool_java_language_version=17 --tool_java_runtime_version=17 --incompatible_strict_action_env --incompatible_enable_cc_toolchain_resolution
INFO: Reading rc options for 'build' from /root/.bazelrc:
  'build' options: --announce_rc --experimental_guard_against_concurrent_changes --experimental_remote_merkle_tree_cache --disk_cache=/data1/bazel/cache --experimental_remote_cache_compression --repository_cache=/data1/bazel/cache --remote_executor=grpc://10.2.12.124:8980 --jobs 100
WARNING: Found stale downloads from previous build, deleting...
INFO: Analyzed 1202 targets (2310 packages loaded, 29891 targets configured).
[23,322 / 24,720] 100 actions, 0 running
    [Prepa] GoLink br/pkg/checkpoint/checkpoint_test_/checkpoint_test; 213s
    [Prepa] GoLink pkg/lightning/backend/kv/kv_test_/kv_test; 213s
    [Prepa] GoLink pkg/sessionctx/stmtctx/stmtctx_test_/stmtctx_test; 212s
    [Prepa] GoLink pkg/sessionctx/sessionstates/sessionstates_test_/sessionstates_test; 212s
    [Prepa] GoLink pkg/meta/meta_test_/meta_test; 212s
    [Prepa] GoLink pkg/statistics/statistics_test_/statistics_test; 212s
    [Prepa] GoLink pkg/extension/extension_test_/extension_test; 212s
    [Prepa] GoLink pkg/table/tables/tables_test_/tables_test; 212s ...
@hawkingrei
Contributor Author

hawkingrei commented Sep 23, 2024

You can reproduce this problem with:

git clone https://github.com/pingcap/tidb.git
cd tidb
bazel build //... --//build:with_nogo_flag=true --//build:with_rbe_flag=true

@fmeum
Member

fmeum commented Sep 25, 2024

If this fails with Bazel 7 but succeeds with Bazel 6, I would expect this to be due to a Bazel bug or incompatible change. Could you share a Starlark profile so that we can get some idea of what's taking so long?

Edit: --experimental_remote_merkle_tree_cache is known to be quite problematic in certain cases. Could you try disabling it?
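
As a rough sketch of both suggestions, the bazelrc lines below turn the Merkle tree cache off and write a JSON trace profile to a fixed path. The profile path is a hypothetical placeholder, and this assumes the lines are read after the existing ones: the log above shows the flag being set in both the workspace .bazelrc and /root/.bazelrc, so the enabling lines may need to be removed from both, or the override passed on the command line instead.

```
# Override the earlier "build --experimental_remote_merkle_tree_cache" lines.
build --noexperimental_remote_merkle_tree_cache

# Write the profile to a known location (path is only an example).
build --profile=/tmp/golink_profile.gz
```

The resulting file can then be attached to the issue or inspected with `bazel analyze-profile /tmp/golink_profile.gz`.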

@arjantop-cai

bazelbuild/bazel#21626 (comment)

We hit the same issue. Either increase the cache size with --experimental_remote_merkle_tree_cache_size or disable the Merkle tree cache.
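
For the first option, a minimal bazelrc sketch might look like the following. The value 10000 is purely illustrative and was not tested against this repository; the documented default is 1000, and the Bazel documentation warns that out-of-memory errors can occur if the value is set too high, which matters here because the startup options cap the JVM heap at 4g.

```
# Keep the Merkle tree cache enabled but allow it to memoize more trees.
# 10000 is an example value, not a recommendation from this thread.
build --experimental_remote_merkle_tree_cache
build --experimental_remote_merkle_tree_cache_size=10000
```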
