
Cannot use process-based parallel computing inside Pluto #1023

Closed
pgagarinov opened this issue Mar 24, 2021 · 6 comments
Labels
other packages Integration with other Julia packages

Comments

@pgagarinov

The last cell in the following notebook

### A Pluto.jl notebook ###
# v0.12.21

using Markdown
using InteractiveUtils

# ╔═╡ 28ae80d8-8ce4-11eb-1a51-bba047294cfd
using Pkg 

# ╔═╡ 11ee6c60-8cdc-11eb-294a-adb6a3d02321
using MLJ

# ╔═╡ dbc6ed10-8cd6-11eb-1f5f-e90bbeb88d5f
Pkg.activate("MLJ_tour", shared=true)

# ╔═╡ 2fc99522-8ce4-11eb-03fe-ed9ca3d061d5
Pkg.add("MLJ")

# ╔═╡ 40262afc-8ce4-11eb-292f-d78daf85b56e
Pkg.add("EvoTrees")

# ╔═╡ 19b4340c-8cdc-11eb-00df-81f976249503
X, y = @load_reduced_ames;

# ╔═╡ 2df9ff1e-8cdc-11eb-1025-810b25be781d
begin
        Booster = @load EvoTreeRegressor
        booster = Booster(max_depth=2)  # specify hyperparameter at construction
        booster.nrounds = 50            # or mutate post facto
end

# ╔═╡ 3778b33c-8cdc-11eb-13bf-e3d79a67a1f2
pipe = @pipeline ContinuousEncoder booster

# ╔═╡ 5725e0ea-8cdc-11eb-33ed-bb1d273a7571
max_depth_range = range(pipe,
                        :(evo_tree_regressor.max_depth),
                        lower = 1,
                        upper = 10)

# ╔═╡ 5fd7ea32-8cdc-11eb-171b-8fea8465d636
self_tuning_pipe = TunedModel(model=pipe,
                              tuning=RandomSearch(),
                              ranges = max_depth_range,
                              resampling=CV(nfolds=3, rng=456),
                              measure=l1,
                              acceleration=CPUThreads(),
                              n=50)

# ╔═╡ 7d7929ac-8cdc-11eb-0695-8de142dfb433
mach = machine(self_tuning_pipe, X, y)

# ╔═╡ 92b907ba-8cdc-11eb-3fa9-a1dc824c56d1
evaluate!(mach,
        measures=[l1, l2],
        resampling=CV(nfolds=6, rng=123),
        acceleration=CPUProcesses(), verbosity=2)

# ╔═╡ 97c7a18a-8ce6-11eb-0862-25d6ce67a1ea


# ╔═╡ Cell order:
# ╠═28ae80d8-8ce4-11eb-1a51-bba047294cfd
# ╠═dbc6ed10-8cd6-11eb-1f5f-e90bbeb88d5f
# ╠═2fc99522-8ce4-11eb-03fe-ed9ca3d061d5
# ╠═40262afc-8ce4-11eb-292f-d78daf85b56e
# ╠═11ee6c60-8cdc-11eb-294a-adb6a3d02321
# ╠═19b4340c-8cdc-11eb-00df-81f976249503
# ╠═2df9ff1e-8cdc-11eb-1025-810b25be781d
# ╠═3778b33c-8cdc-11eb-13bf-e3d79a67a1f2
# ╠═5725e0ea-8cdc-11eb-33ed-bb1d273a7571
# ╠═5fd7ea32-8cdc-11eb-171b-8fea8465d636
# ╠═7d7929ac-8cdc-11eb-0695-8de142dfb433
# ╠═92b907ba-8cdc-11eb-3fa9-a1dc824c56d1
# ╠═97c7a18a-8ce6-11eb-0862-25d6ce67a1ea

fails with the following error message:

KeyError: key MLJBase [a7f614a8-145f-11e9-1d2a-a57a1082229d] not found

Stacktrace:
  [1] getindex
    @ ./dict.jl:482 [inlined]
  [2] root_module
    @ ./loading.jl:957 [inlined]
  [3] deserialize_module
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:954
  [4] handle_deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:856
  [5] deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:774
  [6] deserialize_datatype
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:1279
  [7] handle_deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:827
  [8] deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:774
  [9] handle_deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:834
  [10] deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:774
  [11] #5
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:933
  [12] ntupleany
    @ ./ntuple.jl:43
  [13] deserialize_tuple
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:933
  [14] handle_deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:817
  [15] deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:774 [inlined]
  [16] deserialize_msg
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/messages.jl:87
  [17] #invokelatest#2
    @ ./essentials.jl:708 [inlined]
  [18] invokelatest
    @ ./essentials.jl:706 [inlined]
  [19] message_handler_loop
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:169
  [20] process_tcp_streams
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:126
  [21] #99
    @ ./task.jl:406

var"#remotecall_fetch#143"(::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ::typeof(remotecall_fetch), ::Function, ::Distributed.Worker, ::Function, ::Vararg{Any, N} where N)@remotecall.jl:394
remotecall_fetch(::Function, ::Distributed.Worker, ::Function, ::Vararg{Any, N} where N)@remotecall.jl:386
var"#remotecall_fetch#146"(::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ::typeof(remotecall_fetch), ::Function, ::Int64, ::Function, ::Vararg{Any, N} where N)@remotecall.jl:421
[email protected]:421[inlined]
[email protected]:118[inlined]
macro [email protected]:731[inlined]
macro [email protected]:382[inlined]
_evaluate!(::MLJBase.var"#fit_and_extract_on_fold#286"{Vector{Tuple{Vector{Int64}, Vector{Int64}}}, Nothing, Nothing, Int64, Vector{MLJBase.LPLoss{Int64}}, typeof(MLJModelInterface.predict), Bool, Bool, Vector{Float64}, NamedTuple{(:OverallQual, :GrLivArea, :Neighborhood, :x1stFlrSF, :TotalBsmtSF, :BsmtFinSF1, :LotArea, :GarageCars, :MSSubClass, :GarageArea, :YearRemodAdd, :YearBuilt), Tuple{CategoricalArrays.CategoricalVector{Int64, UInt32, Int64, CategoricalArrays.CategoricalValue{Int64, UInt32}, Union{}}, Vector{Float64}, CategoricalArrays.CategoricalVector{String, UInt32, String, CategoricalArrays.CategoricalValue{String, UInt32}, Union{}}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Vector{Float64}, Vector{Int64}, CategoricalArrays.CategoricalVector{String, UInt32, String, CategoricalArrays.CategoricalValue{String, UInt32}, Union{}}, Vector{Float64}, Vector{Int64}, Vector{Int64}}}}, ::MLJBase.Machine{MLJTuning.DeterministicTunedModel{MLJTuning.RandomSearch, Main.workspace26.Pipeline275}, true}, ::ComputationalResources.CPUProcesses{Nothing}, ::Int64, ::Int64)@resampling.jl:724
evaluate!(::MLJBase.Machine{MLJTuning.DeterministicTunedModel{MLJTuning.RandomSearch, Main.workspace26.Pipeline275}, true}, ::Vector{Tuple{Vector{Int64}, Vector{Int64}}}, ::Nothing, ::Nothing, ::Nothing, ::Int64, ::Int64, ::Vector{MLJBase.LPLoss{Int64}}, ::typeof(MLJModelInterface.predict), ::ComputationalResources.CPUProcesses{Nothing}, ::Bool)@resampling.jl:900
evaluate!(::MLJBase.Machine{MLJTuning.DeterministicTunedModel{MLJTuning.RandomSearch, Main.workspace26.Pipeline275}, true}, ::MLJBase.CV, ::Nothing, ::Nothing, ::Nothing, ::Int64, ::Int64, ::Vector{MLJBase.LPLoss{Int64}}, ::Function, ::ComputationalResources.CPUProcesses{Nothing}, ::Bool)@resampling.jl:965
#evaluate!#[email protected]:677[inlined]
top-level scope@Local: 1[inlined]
  [c3e4b0f8] Pluto v0.12.21
  [f6006082] EvoTrees v0.7.0
  [add582a8] MLJ v0.16.0
  [a7f614a8] MLJBase v0.17.7
Julia Version 1.6.0-rc3
Commit 23267f0d46 (2021-03-16 17:04 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2689 0 @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, sandybridge)
fonsp (Owner) commented Mar 25, 2021

Can you explain what the error is, what you expected and why you think that it is a bug in Pluto?

Consider creating a smaller example, we don't know how those packages work.

@fonsp fonsp added the other packages Integration with other Julia packages label Mar 25, 2021
pgagarinov (Author) commented Mar 25, 2021

> Can you explain what the error is,

The last cell creates a number of parallel Julia processes, each training a portion of an ML model (a gradient-boosting model in this case). The example is taken from here: https://alan-turing-institute.github.io/MLJ.jl/dev/#Lightning-tour-1

I do not know the internals of Pluto in detail (at least not yet), but I watched your presentation and I know that Pluto creates artificial modules. Probably this mechanism (or some other one) inside Pluto breaks certain assumptions that MLJ makes when it runs training in parallel using Distributed.jl.
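If the KeyError indeed means that the worker processes never had MLJBase loaded as a root module, the usual pattern outside Pluto is to start workers with the notebook's project environment and load the packages on all of them before requesting `CPUProcesses()`. A sketch under that assumption (the worker count and flags here are illustrative, not taken from the notebook above):

```julia
# Sketch, assuming the KeyError is caused by workers that never loaded
# MLJBase. Outside Pluto, one would typically spawn workers that inherit
# the active project and load the packages everywhere:
using Distributed

# workers started with the same project environment as the master process
# (worker count is illustrative)
addprocs(4; exeflags = "--project=$(Base.active_project())")

@everywhere using MLJ   # makes MLJBase a root module on every worker
```

Whether this is even possible inside a Pluto cell is exactly what this issue is about; Pluto manages its own processes, so the notebook code never calls addprocs itself.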

> what you expected

I expected all cells, including the last one, to finish without errors. I checked that the same code works when run directly from the REPL, from VSCode, and from Jupyter Lab.
This is how the result of the last cell looks in Jupyter Lab:

[screenshot: the evaluate! results table as rendered in Jupyter Lab]

> and why you think that it is a bug in Pluto?

Because exactly the same code works in the REPL, VSCode, and Jupyter Lab.

> Consider creating a smaller example, we don't know how those packages work.

I'll try to come up with something smaller. I really enjoy using Pluto.jl btw, cool project!

fonsp (Owner) commented Mar 25, 2021

If it's using Distributed, then #300

pgagarinov (Author) commented Mar 25, 2021

> If it's using Distributed, then #300

Got it, thanks.

pgagarinov (Author)

Duplicate of #300

pgagarinov (Author)

@fonsp JFYI - I ran one more experiment, replacing acceleration=CPUProcesses() with acceleration=CPUThreads() in the last cell. This changed the behavior of Pluto. The good news is that Pluto doesn't throw any errors. The bad news is that it hangs at the end of the calculation: the UI becomes unresponsive, with a lag of about 30 seconds between a mouse click and the action. I know you would prefer a shorter self-contained example, and I'll try to come up with one. I'm not sure this has anything to do with Distributed.jl this time, given that I use CPUThreads and not CPUProcesses.

For now the problem can be formulated as
"Pluto hangs when using CPUThreads acceleration in MLJ".

Concretely, it hangs after executing the last cell of the following notebook.

### A Pluto.jl notebook ###
# v0.12.21

using Markdown
using InteractiveUtils

# ╔═╡ cd8229de-8ce7-11eb-14e8-ad7c955b670e
using Pkg; Pkg.activate("MLJ_tour", shared=true)

# ╔═╡ e5c491bc-8ce7-11eb-09ff-a308c7dbceab
using MLJ

# ╔═╡ ed17427c-8ce7-11eb-0126-4368e77ea879
X, y = @load_reduced_ames;

# ╔═╡ f50bbd1c-8ce7-11eb-1545-f3de1dc22e9d
begin
        Booster = @load EvoTreeRegressor
        booster = Booster(max_depth=2)  # specify hyperparameter at construction
        booster.nrounds = 50            # or mutate post facto
end

# ╔═╡ fa10aac0-8ce7-11eb-37f4-9d245e26d2af
pipe = @pipeline ContinuousEncoder booster

# ╔═╡ 00519d36-8ce8-11eb-2109-55be54d08146
max_depth_range = range(pipe,
                        :(evo_tree_regressor.max_depth),
                        lower = 1,
                        upper = 10)

# ╔═╡ 1cc9ae40-8ce8-11eb-227d-290e298d68b1
self_tuning_pipe = TunedModel(model=pipe,
                              tuning=RandomSearch(),
                              ranges = max_depth_range,
                              resampling=CV(nfolds=3, rng=456),
                              measure=l1,
                              acceleration=CPUThreads(),
                              n=50)

# ╔═╡ 05c689ca-8ce8-11eb-25f4-dd0cc86c9cfc
mach = machine(self_tuning_pipe, X, y)

# ╔═╡ 2a6488ae-8ce8-11eb-318f-39347e6f7d87
evaluate!(mach,
        measures=[l1, l2],
        resampling=CV(nfolds=6, rng=123),
        acceleration=CPUThreads(), verbosity=2)

# ╔═╡ Cell order:
# ╠═cd8229de-8ce7-11eb-14e8-ad7c955b670e
# ╠═e5c491bc-8ce7-11eb-09ff-a308c7dbceab
# ╠═ed17427c-8ce7-11eb-0126-4368e77ea879
# ╠═f50bbd1c-8ce7-11eb-1545-f3de1dc22e9d
# ╠═fa10aac0-8ce7-11eb-37f4-9d245e26d2af
# ╠═00519d36-8ce8-11eb-2109-55be54d08146
# ╠═1cc9ae40-8ce8-11eb-227d-290e298d68b1
# ╠═05c689ca-8ce8-11eb-25f4-dd0cc86c9cfc
# ╠═2a6488ae-8ce8-11eb-318f-39347e6f7d87
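One thing that might be worth ruling out first (purely an assumption on my side, not something I verified against Pluto's internals): `CPUThreads()` only helps when the Julia process actually has multiple threads, and a compute-bound multithreaded loop can make everything else in that process sluggish. Checking the thread count of the process running the notebook is cheap:

```julia
# Sketch: verify the notebook process actually has multiple threads.
# Pluto inherits this from how `julia` was launched, e.g.
# `julia --threads=auto`, or from the JULIA_NUM_THREADS environment variable.
using Base.Threads
@show Threads.nthreads()
```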
