HyperQueue has crashed in image v24.10.0a3 on Ubuntu 24.04.1 LTS #923

Open
superstar54 opened this issue Nov 11, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@superstar54
Member

I ran the image v24.10.0a3 and got an error related to HyperQueue (`hq`).

System: Ubuntu 24.04.1 LTS

You can also re-run HyperQueue server (and its workers) with the `RUST_LOG=hq=debug,tako=debug`
environment variable, and attach the logs to the issue, to provide us more information.

thread 'main' panicked at crates/tako/src/internal/common/resources/descriptor.rs:112:9:
assertion failed: size > 0
stack backtrace:
   0:     0x627f15cb3bf9 - std::backtrace_rs::backtrace::libunwind::trace::hbee8a7973eeb6c93
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
   1:     0x627f15cb3bf9 - std::backtrace_rs::backtrace::trace_unsynchronized::hc8ac75eea3aa6899
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x627f15cb3bf9 - std::sys_common::backtrace::_print_fmt::hc7f3e3b5298b1083
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:68:5
   3:     0x627f15cb3bf9 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hbb235daedd7c6190
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x627f159feb60 - core::fmt::rt::Argument::fmt::h76c38a80d925a410
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/rt.rs:142:9
   5:     0x627f159feb60 - core::fmt::write::h3ed6aeaa977c8e45
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/mod.rs:1120:17
   6:     0x627f15c7c87e - std::io::Write::write_fmt::h78b18af5775fedb5
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/io/mod.rs:1810:15
   7:     0x627f15cb5c2e - std::sys_common::backtrace::_print::h5d645a07e0fcfdbb
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:47:5
   8:     0x627f15cb5c2e - std::sys_common::backtrace::print::h85035a511aafe7a8
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:34:9
   9:     0x627f15cb54d7 - std::panicking::default_hook::{{closure}}::hcce8cea212785a25
  10:     0x627f15cb50bf - std::panicking::default_hook::hf5fcb0f213fe709a
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:292:9
  11:     0x627f15962eeb - call<(&core::panic::panic_info::PanicInfo), (dyn core::ops::function::Fn<(&core::panic::panic_info::PanicInfo), Output=()> + core::marker::Send + core::marker::Sync), alloc::alloc::Global>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2029:9
  12:     0x627f15962eeb - {closure#0}
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:360:9
  13:     0x627f15cb621a - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::hbc5ccf4eb663e1e5
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/alloc/src/boxed.rs:2029:9
  14:     0x627f15cb621a - std::panicking::rust_panic_with_hook::h095fccf1dc9379ee
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:783:13
  15:     0x627f15cb5f68 - std::panicking::begin_panic_handler::{{closure}}::h032ba12139b353db
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:649:13
  16:     0x627f15cb5ef6 - std::sys_common::backtrace::__rust_end_short_backtrace::h9259bc2ff8fd0f76
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:171:18
  17:     0x627f15cb5eef - rust_begin_unwind
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
  18:     0x627f1583d074 - core::panicking::panic_fmt::h784f20a50eaab275
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
  19:     0x627f1583d242 - core::panicking::panic::hb837a5ebbbe5b188
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:144:5
  20:     0x627f15b3c7d6 - simple_indices
                               at /__w/hyperqueue/hyperqueue/crates/tako/src/internal/common/resources/descriptor.rs:112:9
  21:     0x627f15b3c7d6 - parse_cpu_definition
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/worker/parser.rs:15:19
  22:     0x627f15b7b3a9 - call<fn(&str) -> core::result::Result<tako::internal::common::resources::descriptor::ResourceDescriptorKind, anyhow::Error>, (&str)>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:79:5
  23:     0x627f15b7b3a9 - parse_ref<fn(&str) -> core::result::Result<tako::internal::common::resources::descriptor::ResourceDescriptorKind, anyhow::Error>, tako::internal::common::resources::descriptor::ResourceDescriptorKind, anyhow::Error>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:928:25
  24:     0x627f15b7b3a9 - parse_ref<tako::internal::common::resources::descriptor::ResourceDescriptorKind>
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/client/utils.rs:56:9
  25:     0x627f15b7ad75 - parse_ref_<hyperqueue::client::utils::PassthroughParser<tako::internal::common::resources::descriptor::ResourceDescriptorKind>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:773:9
  26:     0x627f15b7ad75 - parse_ref_<hyperqueue::client::utils::PassThroughArgument<tako::internal::common::resources::descriptor::ResourceDescriptorKind>, hyperqueue::client::utils::PassthroughParser<tako::internal::common::resources::descriptor::ResourceDescriptorKind>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:658:25
  27:     0x627f159de91e - parse_ref
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/value_parser.rs:242:9
  28:     0x627f159de91e - push_arg_values
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:1083:27
  29:     0x627f159c4da7 - react
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:1192:21
  30:     0x627f159c3dad - parse_opt_value
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:1037:36
  31:     0x627f159bb962 - parse_long_arg
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:801:17
  32:     0x627f159bb962 - get_matches_with
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:115:44
  33:     0x627f159c161e - parse_subcommand
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:720:37
  34:     0x627f159c161e - get_matches_with
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:474:17
  35:     0x627f159c161e - parse_subcommand
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:720:37
  36:     0x627f159c161e - get_matches_with
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/parser/parser.rs:474:17
  37:     0x627f159b6f51 - _do_parse
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:4000:29
  38:     0x627f1595a9c2 - try_get_matches_from_mut<std::env::ArgsOs, std::ffi::os_str::OsString>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:830:9
  39:     0x627f1595a9c2 - get_matches_from<std::env::ArgsOs, std::ffi::os_str::OsString>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:701:9
  40:     0x627f1595a9c2 - get_matches
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/clap_builder-4.5.1/src/builder/command.rs:610:9
  41:     0x627f1595a9c2 - {async_block#0}
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:375:19
  42:     0x627f1594883d - poll<&mut hq::main::{async_block_env#0}>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/future/future.rs:124:9
  43:     0x627f1594883d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:659:57
  44:     0x627f1594883d - with_budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:107:5
  45:     0x627f1594883d - budget<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/coop.rs:73:5
  46:     0x627f1594883d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:659:25
  47:     0x627f1594883d - enter<core::task::poll::Poll<core::result::Result<(), hyperqueue::common::error::HqError>>, tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure#0}::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:404:19
  48:     0x627f1594883d - {closure#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:658:36
  49:     0x627f1594883d - {closure#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:737:68
  50:     0x627f1594883d - set<tokio::runtime::scheduler::Context, tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context/scoped.rs:40:9
  51:     0x627f1594883d - {closure#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context.rs:176:26
  52:     0x627f1594883d - try_with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:270:16
  53:     0x627f1594883d - with<tokio::runtime::context::Context, tokio::runtime::context::set_scheduler::{closure_env#0}<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>, (alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>)>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:246:9
  54:     0x627f1594883d - set_scheduler<(alloc::boxed::Box<tokio::runtime::scheduler::current_thread::Core, alloc::alloc::Global>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>), tokio::runtime::scheduler::current_thread::{impl#8}::enter::{closure_env#0}<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context.rs:176:17
  55:     0x627f1594883d - enter<tokio::runtime::scheduler::current_thread::{impl#8}::block_on::{closure_env#0}<core::pin::Pin<&mut hq::main::{async_block_env#0}>>, core::option::Option<core::result::Result<(), hyperqueue::common::error::HqError>>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:737:27
  56:     0x627f1594883d - block_on<core::pin::Pin<&mut hq::main::{async_block_env#0}>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:646:19
  57:     0x627f1594883d - {closure#0}<hq::main::{async_block_env#0}>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:175:28
  58:     0x627f1594883d - enter_runtime<tokio::runtime::scheduler::current_thread::{impl#0}::block_on::{closure_env#0}<hq::main::{async_block_env#0}>, core::result::Result<(), hyperqueue::common::error::HqError>>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/context/runtime.rs:65:16
  59:     0x627f1594883d - block_on<hq::main::{async_block_env#0}>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/scheduler/current_thread/mod.rs:167:9
  60:     0x627f1594883d - block_on<hq::main::{async_block_env#0}>
                               at /github/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.36.0/src/runtime/runtime.rs:348:47
  61:     0x627f1594883d - main
                               at /__w/hyperqueue/hyperqueue/crates/hyperqueue/src/bin/hq.rs:456:5
  62:     0x627f158d4203 - call_once<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, ()>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5
  63:     0x627f158d4203 - __rust_begin_short_backtrace<fn() -> core::result::Result<(), hyperqueue::common::error::HqError>, core::result::Result<(), hyperqueue::common::error::HqError>>
                               at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:155:18
  64:     0x627f15963500 - main
  65:     0x786503f29d90 - <unknown>
  66:     0x786503f29e40 - __libc_start_main
  67:     0x627f1587d049 - <unknown>
  68:                0x0 - <unknown>
Oops, HyperQueue has crashed. This is a bug, sorry for that.
If you would be so kind, please report this issue at the HQ issue tracker: https://github.com/It4innovations/hyperqueue/issues/new?title=HQ%20crashes
Please include the above error (starting from "thread ... panicked ...") and the stack backtrace in the issue contents, along with the following information:

HyperQueue version: v0.19.0

superstar54 added the "bug" label Nov 11, 2024
@superstar54
Member Author

Ping @unkcpz in case you have any idea about the error.

@unkcpz
Member

unkcpz commented Nov 11, 2024

Thanks @superstar54, how did you get the exception? Were you inside a container, or is it raised when you start the image? More information on how to reproduce it would be helpful.
Since it is a HyperQueue exception, can you open an issue on the hyperqueue repo?

@unkcpz
Member

unkcpz commented Nov 11, 2024

@superstar54 and I had a quick check offline.
The problem is that, without using --cpus <n_cpus>, the CPU_LIMIT that is set comes from


The nproc was missing; it came from the previous implementation. This is a bug and I'll fix it.
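
For illustration only, here is a minimal Python sketch of the kind of guard this implies: fall back to the detected core count when no CPU limit is provided, and never hand a non-positive count to `hq`. The backtrace (`assertion failed: size > 0`, reached through `parse_cpu_definition`) is consistent with a CPU count of 0 reaching the worker's CPU argument, but that is an inference. The helper name `resolve_cpu_count` and the way the limit is passed in are hypothetical and are not taken from the image's startup scripts.

```python
import os


def resolve_cpu_count(cpu_limit=None):
    """Return a positive CPU count (hypothetical helper, not the image's code).

    cpu_limit stands in for a value such as the CPU_LIMIT mentioned above;
    it may be None, an empty string, or a number given as text.
    """
    # Equivalent of `nproc` as a fallback; os.cpu_count() can return None.
    detected = os.cpu_count() or 1

    text = "" if cpu_limit is None else str(cpu_limit).strip()
    if not text:
        # No limit configured: use the detected core count.
        return detected

    try:
        # Fractional limits such as "0.5" are truncated toward zero.
        count = int(float(text))
    except (ValueError, OverflowError):
        # Unparseable limit: fall back to the detected core count.
        return detected

    # Clamp so that a count of 0 (or a negative one) is never returned.
    return max(count, 1)
```

Clamping to at least 1 is a design choice made for this sketch: a fractional limit below one core would otherwise truncate to 0, which is the value the assertion in the backtrace rejects.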
