How to add checkpointing scheme for reactant #777
Comments
Adding that line 39 is revolve = Reactant.to_rarray(revolve), so I know it's some problem related to how I've included |
so where you have the previously checkpointed loop, you definitely should have Reactant.jl/test/control_flow.jl Line 503 in 66b035e
|
Not sure I follow. If I just use
swilliamson@CRIOS-A66253 ~/D/G/S/eddy-stresses> julia --project=. eddy_paper.jl :( main!?#
┌ Warning: It's not recommended to use allowscalar([true]) to allow scalar indexing.
│ Instead, use `allowscalar() do end` or `@allowscalar` to denote exactly which operations can use scalar operations.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:184
ERROR: LoadError: malformed for loop assignment
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] trace_for(mod::Module, expr::Expr)
@ ReactantCore ~/.julia/packages/ReactantCore/t40zc/src/ReactantCore.jl:148
[3] var"@trace"(__source__::LineNumberNode, __module__::Module, expr::Any)
@ ReactantCore ~/.julia/packages/ReactantCore/t40zc/src/ReactantCore.jl:135
[4] include(fname::String)
@ Base.MainInclude ./client.jl:494
[5] top-level scope
@ ~/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:24
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_integration.jl:71
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_integration.jl:67
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:24
and if I try to keep the things needed for checkpointing, like
swilliamson@CRIOS-A66253 ~/D/G/S/eddy-stresses> julia --project=. eddy_paper.jl :( main!?#
┌ Warning: It's not recommended to use allowscalar([true]) to allow scalar indexing.
│ Instead, use `allowscalar() do end` or `@allowscalar` to denote exactly which operations can use scalar operations.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:184
ERROR: LoadError: Checkpointing.jl: Unknown loop construct.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] var"@checkpoint_struct"(__source__::LineNumberNode, __module__::Module, alg::Any, model::Any, loop::Any)
@ Checkpointing ~/.julia/packages/Checkpointing/uBrnJ/src/Checkpointing.jl:200
[3] include(fname::String)
@ Base.MainInclude ./client.jl:494
[4] top-level scope
@ ~/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:24
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_integration.jl:71
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_integration.jl:67
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:24
Mainly I don't think I can just get rid of the |
can you instead do
|
This was the first thing I tried, it led to:
swilliamson@CRIOS-A66253 ~/D/G/S/eddy-stresses> julia --project=. eddy_paper.jl :( main!?#
┌ Warning: It's not recommended to use allowscalar([true]) to allow scalar indexing.
│ Instead, use `allowscalar() do end` or `@allowscalar` to denote exactly which operations can use scalar operations.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:184
ERROR: LoadError: malformed for loop assignment
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] trace_for(mod::Module, expr::Expr)
@ ReactantCore ~/.julia/packages/ReactantCore/t40zc/src/ReactantCore.jl:148
[3] var"@trace"(__source__::LineNumberNode, __module__::Module, expr::Any)
@ ReactantCore ~/.julia/packages/ReactantCore/t40zc/src/ReactantCore.jl:135
[4] include(fname::String)
@ Base.MainInclude ./client.jl:494
[5] top-level scope
@ ~/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:24
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_integration.jl:71
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_integration.jl:67
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:24
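For context on the "malformed for loop assignment" error: the stack trace shows it is raised by trace_for while the @trace macro is being expanded, so it concerns the syntactic shape of the loop header rather than anything that happens at run time. The sketch below uses only Base Julia to inspect how a for header parses. The helper loop_shape is hypothetical, and the shapes it distinguishes are assumptions chosen for illustration, not ReactantCore's actual check.

```julia
# Hypothetical helper (NOT part of ReactantCore): classify the header of a
# `for` loop by inspecting the expression a macro would receive.
function loop_shape(ex)
    Meta.isexpr(ex, :for, 2) || return :not_a_for
    header = ex.args[1]
    # `for i in r` and `for i = r` both parse to a single `=` expression;
    # `for i in a, j in b` parses to a :block of them instead.
    Meta.isexpr(header, :(=), 2) || return :multiple_iterators
    index, iter = header.args
    # A field access such as `S.i` is an Expr, not a Symbol.
    index isa Symbol || return :index_is_not_a_plain_variable
    if Meta.isexpr(iter, :call) && iter.args[1] === :(:)
        return :literal_range      # written as `a:b` directly in the header
    end
    return :other_iterable         # e.g. a variable holding a range
end

println(loop_shape(Meta.parse("for i in 1:n\nend")))
println(loop_shape(Meta.parse("for S.i = 1:n\nend")))
println(loop_shape(Meta.parse("for i in steps\nend")))
```

If the loop in the real code uses a struct field as its index, or a range stored in a variable, comparing its header shape with what @trace accepts would be one thing to check.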
Oh wait my mistake, I can adjust how |
Okay, using
swilliamson@CRIOS-A66253 ~/D/G/S/eddy-stresses> julia --project=. eddy_paper.jl :( main!?#
┌ Warning: It's not recommended to use allowscalar([true]) to allow scalar indexing.
│ Instead, use `allowscalar() do end` or `@allowscalar` to denote exactly which operations can use scalar operations.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:184
typeof(P) = Parameter{Array{Float64, 3}, Vector{Float64}, Vector{Float64}}
[ Info: Revolve: Number of checkpoints: 15
[ Info: Revolve: Number of steps: 225
[ Info: Prediction:
[ Info: Forward steps : 522
[ Info: Overhead factor : 2.32
ERROR: LoadError: Abstract type Function does not have a definite size.
Stacktrace:
[1] sizeof
@ ./essentials.jl:631 [inlined]
[2] traced_type_inner(T::Type{<:Function}, seen::Dict{Type, Type}, mode::Reactant.TraceMode, track_numbers::Type)
@ Reactant ~/.julia/dev/Reactant/src/Tracing.jl:76
[3] traced_type_inner(T::Type, seen::Dict{Type, Type}, mode::Reactant.TraceMode, track_numbers::Type)
@ Reactant ~/.julia/dev/Reactant/src/Tracing.jl:415
[4] traced_type_inner(T::Type, seen::Dict{Type, Type}, mode::Reactant.TraceMode, track_numbers::Type)
@ Reactant ~/.julia/dev/Reactant/src/Tracing.jl:444
[5] traced_type(T::Type, ::Val{Reactant.ArrayToConcrete}, track_numbers::Type)
@ Reactant ~/.julia/dev/Reactant/src/Tracing.jl:611
[6] make_tracer(seen::Reactant.OrderedIdDict{Any, Any}, prev::Any, path::Any, mode::Reactant.TraceMode; toscalar::Bool, tobatch::Nothing, track_numbers::Type, kwargs::@Kwargs{})
@ Reactant ~/.julia/dev/Reactant/src/Tracing.jl:668
[7] make_tracer
@ ~/.julia/dev/Reactant/src/Tracing.jl:654 [inlined]
[8] to_rarray_internal
@ ~/.julia/dev/Reactant/src/Tracing.jl:1067 [inlined]
[9] #to_rarray#79
@ ~/.julia/dev/Reactant/src/Tracing.jl:1063 [inlined]
[10] to_rarray
@ ~/.julia/dev/Reactant/src/Tracing.jl:1061 [inlined]
[11] run_adjoint(::Type{Float32}; kwargs::@Kwargs{output::Bool, L_ratio::Int64, g::Float64, H::Int64, wind_forcing_x::String, Lx::Float64, seasonal_wind_x::Bool, topography::String, bc::String, bottom_drag::String, nn_forcing_dissipation::Bool, handwritten::Bool, α::Int64, nx::Int64, Ndays::Int64})
@ Main ~/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_experiment_functions.jl:39
[12] run_adjoint
@ ~/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_experiment_functions.jl:14 [inlined]
[13] top-level scope
@ ~/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_run_experiments.jl:3
[14] include(fname::String)
@ Base.MainInclude ./client.jl:494
[15] top-level scope
@ ~/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:28
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper_run_experiments.jl:3
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters_work/eddy-stresses/eddy_paper.jl:28
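The top frames of this trace show Reactant.to_rarray reaching traced_type_inner for a Type{<:Function} and calling sizeof on it, which fails because Function is an abstract type. The Base-only sketch below reproduces that failure in isolation and lists the abstractly typed fields of a struct. ToyScheme and abstract_fields are hypothetical stand-ins for illustration; whether the real checkpointing scheme has such a field is an assumption to verify against its definition.

```julia
# Hypothetical stand-in for a scheme object; NOT Checkpointing.jl's actual type.
struct ToyScheme
    steps::Int
    callback::Function   # abstractly typed field
end

# Hypothetical helper: names of fields whose declared type is not concrete.
abstract_fields(T::Type) =
    [name for (name, ft) in zip(fieldnames(T), fieldtypes(T)) if !isconcretetype(ft)]

println(abstract_fields(ToyScheme))

# sizeof on an abstract type throws, as in the trace above.
threw = try
    sizeof(Function)
    false
catch err
    println(err)
    true
end
```

Under that assumption, one option to explore is converting only the array-carrying objects and leaving an object with function-typed fields as a plain Julia value.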
Sorry if this isn't the right place to post, but in the process of adding Reactant to my code I'm seeing two main issues and wanted to document them here:
1. If I don't use Checkpointing.jl, memory usage grows until my computer kills the process. This happens even for a one-day integration, which shouldn't be long enough to cause memory issues.
2. I tried switching to the integration that uses checkpointing to circumvent (1) by adding a few lines inside the relevant functions run_adjoint and outer, but this just resulted in an error message.
I tried to add the scheme in the same way that S and dS are treated. Is this not the right idea? I mainly want to move to using checkpointing because, without it, I can't even see any error messages.
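As a rough illustration of why checkpointing helps with the memory growth described above, here is a minimal sketch on a toy scalar model. It uses uniform checkpointing (keep every k-th state and recompute the rest), which is simpler than the binomial Revolve schedule that Checkpointing.jl reports in its log. All names are hypothetical and nothing here calls Checkpointing.jl, Enzyme, or Reactant.

```julia
# Toy model: one explicit step of dx/dt = -x^2, and the derivative of that step.
advance(x) = x - 0.01 * x^2
dadvance(x) = 1 - 0.02 * x

# Reverse sweep that stores every intermediate state: memory grows with n.
function grad_store_all(x0, n)
    states = Vector{Float64}(undef, n)
    x = x0
    for k in 1:n
        states[k] = x              # state entering step k
        x = advance(x)
    end
    g = 1.0
    for k in n:-1:1                # chain rule, last step first
        g *= dadvance(states[k])
    end
    return g, n                    # d x_n / d x_0, number of states held
end

# Uniform checkpointing: keep every `every`-th state, recompute each segment
# during the reverse sweep.
function grad_checkpointed(x0, n, every)
    checkpoints = Dict{Int,Float64}()
    x = x0
    for k in 1:n
        (k - 1) % every == 0 && (checkpoints[k] = x)
        x = advance(x)
    end
    g = 1.0
    segment = Float64[]
    for start in maximum(keys(checkpoints)):-every:1
        empty!(segment)
        x = checkpoints[start]
        for k in start:min(start + every - 1, n)   # recompute this segment only
            push!(segment, x)
            x = advance(x)
        end
        for xk in reverse(segment)
            g *= dadvance(xk)
        end
    end
    return g, length(checkpoints) + every   # upper bound on states held at once
end

g_all, held_all = grad_store_all(1.0, 225)
g_ckpt, held_ckpt = grad_checkpointed(1.0, 225, 15)
println((g_all, held_all), " vs ", (g_ckpt, held_ckpt))
```

Both functions return the same gradient, but the checkpointed version holds far fewer states at once, at the cost of running the forward steps a second time.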