Add @when for conditional transformations #376

Open
wants to merge 18 commits into
base: master
1 change: 1 addition & 0 deletions docs/src/index.md
@@ -25,6 +25,7 @@ In addition, DataFramesMeta provides
* `@byrow` for applying functions to each row of a data frame (only supported inside other macros).
* `@passmissing` for propagating missing values inside row-wise DataFramesMeta.jl transformations.
* `@astable` to create multiple columns within a single transformation.
* `@when` to non-destructively work with a subset of observations (similar to Stata's `if`).
* `@chain`, from [Chain.jl](https://github.com/jkrumbiegel/Chain.jl) for piping the above macros together, similar to [magrittr](https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html)'s
`%>%` in R.
* `@label!` and `@note!` for attaching metadata to columns.
2 changes: 1 addition & 1 deletion src/DataFramesMeta.jl
@@ -24,7 +24,7 @@ export @with,
@rtransform, @rselect, @rtransform!, @rselect!,
@distinct, @rdistinct, @distinct!, @rdistinct!,
@eachrow, @eachrow!,
@byrow, @passmissing, @astable, @kwarg,
@byrow, @passmissing, @astable, @kwarg, @when,
@label!, @note!, printlabels, printnotes,
@groupby,
@based_on, @where # deprecated
202 changes: 152 additions & 50 deletions src/macros.jl
@@ -547,11 +547,33 @@
"is currently allowed with $DOLLAR"))

function with_helper(d, body)
# Get rid of the leading @byrow, @passmissing etc.
# but otherwise leave body untouched
body, outer_flags = extract_macro_flags(body)
if outer_flags[ASTABLE_SYM][]
throw(ArgumentError("@astable macro-flag cannot be used inside of @with"))
end
# If we have a begin...end somewhere, we might
# have a @when.
# Remove the @when statements, recording that they
# exist. To do this we also have to de-construct
# body into a vector of expressions.
es, when = get_when_statements(MacroTools.rmlines(MacroTools.block(body)).args)
newbody = Expr(:block, es...)
# Make body an expression to force the
# complicated method of fun_to_vec
# in the case of QuoteNode
t = fun_to_vec(Expr(:block, body); no_dest=true)
:($exec($d, $t))
t = fun_to_vec(newbody; no_dest=true, outer_flags = outer_flags)
if !isnothing(when)
w = fun_to_vec(when; no_dest = true, gensym_names=false, outer_flags = outer_flags)
z = gensym()
quote
$z = $subset($d, $w; view = true, skipmissing = true)
$exec($z, $t)
end
else
:($exec($d, $t))
end
end

"""
@@ -663,6 +685,83 @@
esc(with_helper(d, body))
end

"""
@when(args...)

Perform operations on a subset of `df`, but still
return a data frame with the same number of rows as `df`. `@when` can be used
with the `@transform` macros, `@select` macros, and `@with`.

`@when` is not a "real" macro. It is only functional inside DataFramesMeta.jl macros.
A motivating example:

```
@transform df begin
@when :a .== 1
:y = :y .- mean(:y)
end
```

The above block generates the column `:y` which is de-meaned with respect to observations where
`:a == 1`. If `:y` already exists in `df`, then new values over-write old values only
when `:a == 1`. If `:y` does not already exist in `df`, then new values are written
when `:a == 1`, and remaining values are filled with `missing`.

Only one `@when` statement is allowed per transformation macro and it must be the
first argument in the transformation.

`@when` inherits `@byrow` and `@passmissing` from the transformation. As an example:

```
@transform df @byrow begin
@when :a == 1
...
end
```

In the above, the condition inside `@when` operates row-wise. However, `@byrow` and `@passmissing` can
also be passed independently, such as `@byrow @when :a == 1`.

Like `@subset`, `@when` treats rows where the condition returns `missing` as not selected,
so those rows are left unmodified. Unlike `@subset`, there is currently no way to control this behavior.

## Details

`@when` operates by calling `subset` with the `view = true` keyword argument,
followed by a `transform!` call on the resulting view (`select!` for the `@select` macros).
See `?transform!` for more details. Roughly,
the expression

```
@transform df begin
@when :a .== 1
:y = 5
end
```

translates to

```
df1 = @subset(copy(df), :a .== 1; view = true)
df2 = @transform! df1 :y = 5
parent(df2)
```

Unlike the other macro-flags, such as `@passmissing` and `@byrow`, `@when` cannot be
used at the top-level.
```
@transform df @byrow @when(:a == 1) begin
:x = 1
:y = 2
end
```
is not supported.

"""
macro when(args...)
throw(ArgumentError("@when only works inside DataFramesMeta macros."))
end
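
The semantics described in the docstring can be reproduced by hand with plain DataFrames.jl calls. The following is a minimal sketch, not the macro's actual expansion: the sample data, the new column `:z`, and the variable names are made up for illustration, and it assumes a DataFrames.jl version in which `transform!` accepts a `SubDataFrame` created with `:` as the column selector.

```julia
using DataFrames, Statistics

# Illustrative data; the last row has a missing condition value.
df = DataFrame(a = [1, 2, 1, missing], y = [1.0, 2.0, 3.0, 4.0])

# Step 1: view the rows of a copy where the condition is `true`.
# `skipmissing = true` treats a `missing` condition as "not selected".
sub = subset(copy(df), :a => ByRow(==(1)); view = true, skipmissing = true)

# Step 2: transform the view in place. `:y` already exists, so only the
# selected rows are overwritten; `:z` is new, so the other rows get `missing`.
transform!(sub,
           :y => (y -> y .- mean(y)) => :y,
           :a => ByRow(_ -> 5) => :z)

# Step 3: return the full-length parent of the view.
result = parent(sub)
```

In this sketch `result` has as many rows as `df`, and `df` itself is untouched because the view was taken on a copy.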


ASTABLE_RHS_ORDERBY_DOCS = """
In operations, it is also allowed to use `AsTable(cols)` to work with
multiple columns at once, where the columns are grouped together in a
@@ -860,7 +959,7 @@

### Examples

```jldoctest

Check failure on line 962 in src/macros.jl (GitHub Actions / build): doctest failure in ~/work/DataFramesMeta.jl/DataFramesMeta.jl/src/macros.jl:962-1053. Subexpression: @subset df begin :x .> globalvar; :y .== 3 end. Evaluated output: 0×2 DataFrame followed by the column header (Row │ x y, Int64 Int64). Expected output: the single line 0×2 DataFrame.
julia> using DataFramesMeta, Statistics

julia> df = DataFrame(x = 1:3, y = [2, 1, 2]);
@@ -976,7 +1075,7 @@
Use this function as an alternative to placing the `.` to broadcast row-wise operations.

### Examples
```jldoctest

Check failure on line 1078 in src/macros.jl (GitHub Actions / build): doctest failure in ~/work/DataFramesMeta.jl/DataFramesMeta.jl/src/macros.jl:1078-1108. Subexpression: @rsubset df :A > 3 || :B == "pear". Evaluated and expected output both show a 3×2 DataFrame with rows (2, pear), (4, orange), (5, pear); the flattened diff does not show which characters differ.
julia> using DataFramesMeta

julia> df = DataFrame(A=1:5, B=["apple", "pear", "apple", "orange", "pear"])
@@ -1128,7 +1227,7 @@

### Examples

```jldoctest

Check failure on line 1230 in src/macros.jl (GitHub Actions / build): doctest failure in ~/work/DataFramesMeta.jl/DataFramesMeta.jl/src/macros.jl:1230-1304. Subexpression: @subset! copy(df) begin :x .> globalvar; :y .== 3 end. Evaluated output: 0×2 DataFrame followed by the column header (Row │ x y, Int64 Int64). Expected output: the single line 0×2 DataFrame.
julia> using DataFramesMeta, Statistics

julia> df = DataFrame(x = 1:3, y = [2, 1, 2]);
@@ -1312,7 +1411,7 @@

### Examples

```jldoctest

Check failure on line 1414 in src/macros.jl (GitHub Actions / build): doctest failure in ~/work/DataFramesMeta.jl/DataFramesMeta.jl/src/macros.jl:1414-1486, reported for two subexpressions. In both, the x and n columns match and only the c column differs.
Subexpression: @orderby d begin :x; abs.(:n .- mean(:n)) end. Evaluated c column (rows 1 to 10): d, g, f, j, h, e, i, b, c, a. Expected c column: e, f, g, i, j, d, h, c, b, a.
Subexpression: @orderby d @byrow :x^2. Evaluated c column (rows 1 to 10): d, g, f, j, h, e, i, a, c, b. Expected c column: e, f, g, i, j, d, h, a, b, c.
julia> using DataFramesMeta, Statistics

julia> d = DataFrame(x = [3, 3, 3, 2, 1, 1, 1, 2, 1, 1], n = 1:10,
@@ -1407,7 +1506,7 @@
Use this function as an alternative to placing the `.` to broadcast row-wise operations.

### Examples
```jldoctest

Check failure on line 1509 in src/macros.jl (GitHub Actions / build): doctest failure in ~/work/DataFramesMeta.jl/DataFramesMeta.jl/src/macros.jl:1509-1546. Subexpression: @rorderby df abs(:x) (:x * :y^3). The evaluated output begins with the summary line 6×2 DataFrame, which the expected output omits; the rows themselves match.
julia> using DataFramesMeta

julia> df = DataFrame(x = [8,8,-8,7,7,-7], y = [-1, 1, -2, 2, -3, 3])
@@ -1456,15 +1555,53 @@
## transform & @transform
##
##############################################################################
copy_gd(x::GroupedDataFrame) = transform(x; ungroup = false)
copy_gd(x::AbstractDataFrame) = copy(x)
function generic_transform_select_helper(x, args...; wrap_byrow::Bool = false, modify::Bool = false, selectfun::Bool = false)
if selectfun
secondstagefun = select!
if modify
transformfun = select!
else
transformfun = select
end
else
secondstagefun = transform!
if modify
transformfun = transform!
else
transformfun = transform
end
end

x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = wrap_byrow)
exprs, when = get_when_statements(exprs)
if !isnothing(when)
w = fun_to_vec(when; no_dest = true, gensym_names=false, outer_flags=outer_flags)
t = (fun_to_vec(ex; gensym_names=false, outer_flags=outer_flags) for ex in exprs)
z = gensym()
if modify
quote
$z = $subset($x, $w; view = true, skipmissing = true)
$parent($secondstagefun($z, $(t...); $(kw...)))
end
else
quote
$z = $subset($copy_gd($x), $w; view = true, skipmissing = true)
$parent($secondstagefun($z, $(t...); $(kw...)))
end
end
else
t = (fun_to_vec(ex; gensym_names=false, outer_flags=outer_flags) for ex in exprs)
quote
$transformfun($x, $(t...); $(kw...))
end
end
end
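
The `modify` branch above decides whether the input is copied before the view is taken. A rough sketch of that choice for a plain data frame follows. `apply_when` is a hypothetical helper (not part of the PR), `cond` and `ops` are assumed to be ordinary DataFrames.jl `source => function` pairs, and grouped data frames (handled by `copy_gd` in the PR) are left out.

```julia
using DataFrames

# Hypothetical helper mirroring the two quoted branches for a plain DataFrame.
function apply_when(df::AbstractDataFrame, cond, ops...; modify::Bool)
    # Non-mutating macros work on a copy; mutating macros work on `df` itself.
    target = modify ? df : copy(df)
    sub = subset(target, cond; view = true, skipmissing = true)
    # transform! writes through the view; parent returns the full-length table.
    return parent(transform!(sub, ops...))
end

df = DataFrame(a = [1, 2, 1], y = [10, 20, 30])
cond = :a => ByRow(==(1))
op = :y => ByRow(v -> v + 1) => :y

out = apply_when(df, cond, op; modify = false)   # `df` is left unchanged
y_after_copy = copy(df.y)

out2 = apply_when(df, cond, op; modify = true)   # `df` is updated in place
```

With `modify = true` the returned object is `df` itself, since `parent` of a view taken directly on `df` is `df`.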

function transform_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = false)

t = (fun_to_vec(ex; gensym_names = false, outer_flags = outer_flags) for ex in exprs)
quote
$transform($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = false, modify = false)
end

"""
@@ -1593,12 +1730,7 @@
end

function rtransform_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = true)

t = (fun_to_vec(ex; gensym_names=false, outer_flags=outer_flags) for ex in exprs)
quote
$transform($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = true, modify = false)
end

"""
@@ -1646,12 +1778,7 @@


function transform!_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = false)

t = (fun_to_vec(ex; gensym_names = false, outer_flags = outer_flags) for ex in exprs)
quote
$transform!($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = false, modify = true)
end

"""
@@ -1760,12 +1887,7 @@
end

function rtransform!_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = true)

t = (fun_to_vec(ex; gensym_names=false, outer_flags=outer_flags) for ex in exprs)
quote
$transform!($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = true, modify = true)
end

"""
@@ -1784,12 +1906,7 @@
##############################################################################

function select_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = false)

t = (fun_to_vec(ex; gensym_names = false, outer_flags = outer_flags, allow_multicol = true) for ex in exprs)
quote
$select($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = false, modify = false, selectfun = true)
end

"""
@@ -1929,12 +2046,7 @@
end

function rselect_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = true)

t = (fun_to_vec(ex; gensym_names=false, outer_flags=outer_flags) for ex in exprs)
quote
$select($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = true, modify = false, selectfun = true)
end

"""
@@ -1982,12 +2094,7 @@
##############################################################################

function select!_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = false)

t = (fun_to_vec(ex; gensym_names = false, outer_flags = outer_flags) for ex in exprs)
quote
$select!($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = false, modify = true, selectfun = true)
end

"""
@@ -2118,12 +2225,7 @@
end

function rselect!_helper(x, args...)
x, exprs, outer_flags, kw = get_df_args_kwargs(x, args...; wrap_byrow = true)

t = (fun_to_vec(ex; gensym_names=false, outer_flags=outer_flags) for ex in exprs)
quote
$select!($x, $(t...); $(kw...))
end
generic_transform_select_helper(x, args...; wrap_byrow = true, modify = true, selectfun = true)
end

"""
@@ -2530,7 +2632,7 @@

### Examples

```jldoctest

Check failure on line 2635 in src/macros.jl (GitHub Actions / build): doctest failure in ~/work/DataFramesMeta.jl/DataFramesMeta.jl/src/macros.jl:2635-2655, reported for two subexpressions: @distinct(df, :x .+ :y) and @distinct df begin :x .+ :y end. In both cases the evaluated and expected output show a 1×2 DataFrame with the single row (1, 10) and differ only in the width of the horizontal rule under the header.
julia> using DataFramesMeta;

julia> df = DataFrame(x = 1:10, y = 10:-1:1);
54 changes: 54 additions & 0 deletions src/parsing.jl
@@ -179,6 +179,7 @@ is_macro_head(ex::Expr, name) = ex.head == :macrocall && ex.args[1] == Symbol(na
const BYROW_SYM = Symbol("@byrow")
const PASSMISSING_SYM = Symbol("@passmissing")
const ASTABLE_SYM = Symbol("@astable")
const WHEN_SYM = Symbol("@when")
const DEFAULT_FLAGS = (;BYROW_SYM => Ref(false), PASSMISSING_SYM => Ref(false), ASTABLE_SYM => Ref(false))

extract_macro_flags(ex, exprflags = deepcopy(DEFAULT_FLAGS)) = (ex, exprflags)
@@ -191,6 +192,9 @@ function extract_macro_flags(ex::Expr, exprflags = deepcopy(DEFAULT_FLAGS))
throw(ArgumentError("Redundant flag $macroname used."))
end
exprflag[] = true
if length(ex.args) > 3
throw(ArgumentError("Too many arguments passed to $macroname"))
end
return extract_macro_flags(MacroTools.unblock(ex.args[3]), exprflags)
else
return (ex, exprflags)
@@ -199,6 +203,56 @@ function extract_macro_flags(ex::Expr, exprflags = deepcopy(DEFAULT_FLAGS))
return (ex, exprflags)
end

"""
omit_nested_when(ex::Expr, when = Ref(false))

For a statement of the form `@passmissing @when x` return `@passmissing x` and
a flag signifying a `@when` statement was present.
"""
function omit_nested_when(ex::Expr, when = Ref(false))
if ex.head == :macrocall && ex.args[1] in keys(DEFAULT_FLAGS) || is_macro_head(ex, "@when")
macroname = ex.args[1]
if length(ex.args) > 3
throw(ArgumentError("Too many arguments passed to $macroname"))
end
if macroname == Symbol("@when")
when[] = true
return omit_nested_when(MacroTools.unblock(ex.args[3]), when)
else
new_expr, when = omit_nested_when(MacroTools.unblock(ex.args[3]), when)
ex.args[3] = new_expr
end
end
return ex, when
end
omit_nested_when(ex, when = Ref(false)) = ex, when
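
To see what stripping a nested `@when` does to an expression tree, here is a simplified, self-contained sketch in Base Julia. `strip_when`, `FLAG_MACROS`, and `WHEN_MACRO` are hypothetical names chosen for this illustration. Unlike the PR's function, the sketch builds a new expression instead of mutating its input, skips `MacroTools.unblock`, and requires exactly one argument per macro call.

```julia
# Hypothetical stand-ins for the flag names used in the PR.
const FLAG_MACROS = (Symbol("@byrow"), Symbol("@passmissing"), Symbol("@astable"))
const WHEN_MACRO = Symbol("@when")

# Return the expression with any `@when` wrapper removed, plus a flag
# recording whether a `@when` was seen anywhere in the chain of macro flags.
function strip_when(ex, found = Ref(false))
    if !(ex isa Expr && ex.head == :macrocall)
        return ex, found
    end
    name = ex.args[1]
    if !(name == WHEN_MACRO || name in FLAG_MACROS)
        return ex, found
    end
    # A macrocall stores (name, line number, arguments...).
    if length(ex.args) != 3
        throw(ArgumentError("Expected exactly one argument for $name"))
    end
    inner, _ = strip_when(ex.args[3], found)
    if name == WHEN_MACRO
        found[] = true
        return inner, found          # drop the `@when` wrapper itself
    end
    return Expr(:macrocall, name, ex.args[2], inner), found
end

stripped, found = strip_when(:(@passmissing @when :a == 1))
plain, found_plain = strip_when(:(:a + 1))
```

Here `stripped` is the expression `@passmissing :a == 1` and `found[]` is `true`, while an expression without any macro flag passes through unchanged.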

function get_when_statements(exprs)
new_exprs = []
when_statement = nothing
seen_non_when = false
seen_when = false
for expr in exprs
e, when = omit_nested_when(expr)
if when[]
if seen_when
throw(ArgumentError("Only one @when statement allowed at a time"))
end
if seen_non_when
throw(ArgumentError("All @when statements must come first"))
end
seen_when = true
when_statement = e
else
seen_non_when = true
push!(new_exprs, expr)
end
end

new_exprs, when_statement
end
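
The two ordering rules enforced above (at most one `@when`, and it must come before any other statement) can be exercised in isolation. This is a minimal sketch in Base Julia under simplifying assumptions: `split_when` and `is_when` are hypothetical names, only a top-level `@when` is recognized (no nested flags such as `@byrow @when`), and the second error message is reworded.

```julia
# Recognize a bare `@when cond` macro call (top level only in this sketch).
is_when(ex) = ex isa Expr && ex.head == :macrocall && ex.args[1] == Symbol("@when")

# Split a list of statements into the non-`@when` statements and the
# condition of the single allowed `@when` (or `nothing` if there is none).
function split_when(exprs)
    rest = Any[]
    cond = nothing
    for ex in exprs
        if is_when(ex)
            cond === nothing ||
                throw(ArgumentError("Only one @when statement allowed at a time"))
            isempty(rest) ||
                throw(ArgumentError("The @when statement must come first"))
            cond = ex.args[3]
        else
            push!(rest, ex)
        end
    end
    return rest, cond
end

block = quote
    @when :a == 1
    :y = 5
end
# Drop the line-number nodes that `quote` inserts between statements.
stmts = filter(x -> !(x isa LineNumberNode), block.args)

rest, cond = split_when(stmts)
```

Passing the same statements in reverse order (`@when` last) raises an `ArgumentError` in this sketch, matching the "must come first" rule.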


"""
check_macro_flags_consistency(exprflags)

1 change: 1 addition & 0 deletions test/runtests.jl
@@ -16,6 +16,7 @@ my_tests = ["dataframes.jl",
"astable.jl",
"astable_flag.jl",
"passmissing.jl",
"when.jl",
"multicol.jl"]

println("Running tests:")