Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible Bug: try_await_forever letting panic through #53

Open
bcpeinhardt opened this issue Nov 11, 2023 · 15 comments
Open

Possible Bug: try_await_forever letting panic through #53

bcpeinhardt opened this issue Nov 11, 2023 · 15 comments

Comments

@bcpeinhardt
Copy link

My understanding of task.try_await_forever is that it should catch when a process panics and return an error. However it seems to cause the calling process to panic

pub fn main() {
  io.println("Try await a panicking task")
  let _ = task.try_await_forever(task.async(fn() { panic }))

  // This line is never reached
  io.println("Try await caught the error")
}

Is this a bug or am I misunderstanding the function?

@lpil
Copy link
Member

lpil commented Nov 11, 2023

For errors not to propagate you'll need to make the process that spawns the task to trap exits.

I agree though, this API is a bit confusing, I'm not sure what the intention is for how and when to use it. @TanklesXL do you remember better than I do?

@TanklesXL
Copy link
Contributor

From what I remember when I tried this my assumption was that trap exits was enabled by default. Maybe this might be something we want as a parameter to task.async or maybe a separate function, something like a task.async_trap? It just seemed weird to me that trap exits wasn't the default behaviour since it would be easy to return an error if the task process crashed

@TanklesXL
Copy link
Contributor

TanklesXL commented Nov 11, 2023

I suspect part of the confusion here is that when I used this last I assumed the await function would be able to preemptively handle any trapping but I don't think that makes sense, this is something that needs to be set up when the task process is spawned

@TanklesXL
Copy link
Contributor

For the historical context side of things try_await_forever was added after the original try_await to allow for certain cases where you don't necessarily care or want a timeout on your task

@bcpeinhardt
Copy link
Author

So for context I think this used to work? I ran into the issue trying to update testbldr for the new gleam version, which included updating the gleam_otp dependency.

@TanklesXL
Copy link
Contributor

I wouldn't be surprised if it never actually trapped exits, I remember having to do some funky rescueshenanigans in gladvent

@TanklesXL
Copy link
Contributor

I've verified that this behaviour of not trapping exits is independent of whether you use task.await or task.await_forever. Upon further inspection I suspect there's a race condition in task.async() where the process.monitor_process may only be called after the task process has already crashed:

pub fn async(work: fn() -> value) -> Task(value) {
 ...
  let pid =
    process.start(linked: True, running: fn() { process.send(subject, work()) })
  let monitor = process.monitor_process(pid)
  ...
      |> process.selecting_process_down(monitor, FromMonitor)
...
}

Notice that the process can only be monitored after the task has already been spawned, this is likely what happens.

What i find weird though is that when I add a sleep to the task process the exit still isn't trapped, which leads me to believe there might be a bug in monitor_process on top of the race condition, note the following test still crashes even though the sleep should allow the task process to be monitored correctly:

pub fn async_await_forever_panic_test() {
  task.async(fn() {
    process.sleep(2000)
    panic as "panic"
  })
  |> task.await_forever()
  |> should.be_error
}

If i add a process.trap_exits to the beginning of the test I am able to successfully trap the exit message and cause task.await_forever to fail with a match error since it's not being set up with selecting_trapped_exits but instead since it's depending on monitor_process it's using selecting_process_down.

@TanklesXL
Copy link
Contributor

Update: I'm playing around with using spawn_monitor rather than spawn_link in an attempt to address that race condition but I'm getting badarg in the demonitor function, I need to investigate this further 🤔

@lpil
Copy link
Member

lpil commented Nov 13, 2023

Sounds like there's no clear use for this function. Trapping exits is not something one should do unless you're implementing a supervisor, and will cause all sorts of problems if it's enabled temporarily as concurrent processes could crash and cause unexpected behaviour.

Let's remove this function.

@TanklesXL
Copy link
Contributor

TanklesXL commented Nov 13, 2023

Sounds like there's no clear use for this function

I'm not sure how you ended up at this conclusion after what i said earlier 😆 i use these task functions all the time, the only caveat here is that the user needs to know that if a crash occurs it'll crash the calling process

Which do you mean to remove? The issue occurs regardless of which await you use? This is not unique to try_await or try_await_forever it is also present in await and await_forever.

unless you mean remove the task module as a whole?

@lpil
Copy link
Member

lpil commented Nov 13, 2023

Any functions that require the user to enable trapping exits should be removed. I think that is only try_await_forever as try_await returns an error in the case of a timeout.

@TanklesXL
Copy link
Contributor

Any functions that require the user to enable trapping exits should be removed. I think that is only try_await_forever as try_await returns an error in the case of a timeout.

It does yes but they both still allow a panic to crash the calling process, the only difference is the timeout so they both require trapping exits

@lpil
Copy link
Member

lpil commented Nov 13, 2023

A crash is supposed to crash the calling process, Task is not an exception catching system, that's what supervisors are for.

@TanklesXL
Copy link
Contributor

TanklesXL commented Nov 14, 2023

so this has really been bugging me so i played around a bit more and managed to fix these issues usingspawn_monitor(not currently included in gleam/erlang/process), it required a change to add process_monitor_from_ref in gleam/erlang/process (this can also be achieved by making that type non-opaque)

in gleam/erlang/process:

pub fn process_monitor_from_ref(ref: Reference) -> ProcessMonitor {
  ProcessMonitor(ref)
}

the problem was indeed with how we are attaching monitors to processes after the fact (no idea if there is a bug in the monitor_process function but that's a separate conversation...

@external(erlang, "erlang", "spawn_monitor")
fn spawn_monitor(running f: fn() -> any) -> #(Pid, Reference)

/// Spawn a task process that calls a given function in order to perform some
/// work. The result of this function is send back to the parent and can be
/// received using the `await` function.
///
/// See the top level module documentation for more information on async/await.
///
pub fn async(work: fn() -> value) -> Task(value) {
  let owner = process.self()
  let subject = process.new_subject()
  let #(pid, ref) =
    spawn_monitor(running: fn() { process.send(subject, work()) })
  let monitor = process.process_monitor_from_ref(ref)
  let selector =
    process.new_selector()
    |> process.selecting_process_down(monitor, FromMonitor)
    |> process.selecting(subject, FromSubject)
  Task(owner: owner, pid: pid, monitor: monitor, selector: selector)
}

corresponding passing test cases:

pub fn try_await_forever_panic_test() {
  // Start with an empty mailbox
  flush()

  // Perform an asynchronous task
  // and monitor it until it's done
  let _result =
    task.async(fn() { panic as "hehe" })
    |> task.try_await_forever
    |> should.be_error()

  // Mailbox should be empty;
  // no "DOWN" message should have been sent
  process.self()
  |> get_message_queue_length
  |> should.equal(0)
}

pub fn try_await_panic_test() {
  // Start with an empty mailbox
  flush()

  // Perform an asynchronous task
  // and monitor it until it's done
  let _result =
    task.async(fn() { panic as "haha" })
    |> task.try_await(1000)
    |> should.be_error()

  // Mailbox should be empty;
  // no "DOWN" message should have been sent
  process.self()
  |> get_message_queue_length
  |> should.equal(0)
}

@lpil
Copy link
Member

lpil commented Nov 20, 2023

We have decided on Discord to rewrite the Task module using the Elixir one as a reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants