-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Another attempt at an astable flag #298
Conversation
More progress! Not ready for a review yet as i have not added a complete test set or begun documenting. |
Ready for a review! This is a big PR. In the process, I have to finally remove some deprecated functionality for grouped data frames. In particular
used to work. In the past release we had a warning that this needed to be
and now we throw an error in the original expression. Here is the docstring for
|
I would think it should be |
Good catch. Will update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty cool! This will require a breaking release, right?
Using @astable
to ensure operations are run sequentially is clever. The name is a bit surprising for this, but well... I also hope the compilation overhead isn't too large.
src/macros.jl
Outdated
julia> @by df :a @astable begin | ||
$(DOLLAR)"Mean of b" = mean(:b) | ||
$(DOLLAR)"Standard deviation of b" = std(:b) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example can be achieved without @astable
, right? Maybe do m = mean(:b); std(:b, mean=m)
to illustrate the power of this function? Or, simpler, call extrema(:b)
to create two columns.
Also, I wouldn't use long column names with spaces in them: better illustrate a single feature at a time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great. changed.
src/macros.jl
Outdated
`@astable` is useful when performing intermediate calculations | ||
yet store their results in new columns. For example, the following fails. | ||
|
||
``` | ||
@rtransform df begin | ||
:new_col_1 = :x + :y | ||
:new_col_2 = :new_col_1 + :z | ||
end | ||
``` | ||
|
||
This because DataFrames.jl does not guarantee sequential evaluation of | ||
transformations. `@astable` solves this problem |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this is an interesting side-effect, the main goal of AsTable
is to allow returning multiple columns from a single "function". Probably worth mentioning? For example it's useful with extrema
to compute the minimum and the maximum at the same time.
This macro does more than that. It allows for local variables to per persistent as well. If I were to just force sequential transform calls, I would just create a |
Co-authored-by: Milan Bouchet-Valat <[email protected]>
This is ready for another round of reviews. I have added docs to the manual as well. |
docs/src/index.md
Outdated
information. | ||
|
||
In a single block, all assignments of the form `:y = f(:x)` | ||
or `$y = f(:x)` at the top-level are generate new columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or `$y = f(:x)` at the top-level are generate new columns. | |
or `$y = f(:x)` at the top-level generate new columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add what $y
has to resolve to (I understand it has to be Symbol
, or strings are also accepted?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Turns out I was allowing unexpected behavior and patched the code.
src/macros.jl
Outdated
|
||
Column assignment in `@astable` follows the same rules as | ||
column assignment more generally. Construct a new column | ||
from a string by escaping it with `$DOLLAR`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add an example of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added.
Thanks, I've responded to the most recent round of reviews. I have added more tests. I can't think of any new tests to add at the moment, they seem pretty well covered. |
cc @jkrumbiegel, if you want to review. |
Yes I know. What I'm saying is that mentioning sequential operations as the main justification for it was a bit weird. |
test/astable_flag.jl
Outdated
:b_max = ex[2] | ||
end | ||
|
||
@test sort(d.b_min) == [5, 7] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of test is quite fragile. Better sort the whole data frame and compare it to the reference to make sure that groups and values match.
Co-authored-by: Milan Bouchet-Valat <[email protected]>
I don't understand the changes to the github review interface, but I think i've addressed lingering issues. I just added some checks to make sure you can't do |
Tests pass and this can be merged. |
test/astable_flag.jl
Outdated
end | ||
|
||
@testset "@astable with just assignments, mutating" begin | ||
# After finalizing above testset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdeffebach - this seems to be WIP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that. Thank you working on it now.
test/astable_flag.jl
Outdated
|
||
d = @rtransform df @astable begin | ||
:x = 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline is not needed. Also I am not clear why you add nothing
here and below? Does it change anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I remember.
@transform df begin
:x = 1
end
is valid code and does the same thing as the same with @astable
does. So I wanted to test something that made sure it was hittng the @astable
path and not the vanilla path.
I have deleted the extra new lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the round of reviews. I think I have addressed everything. Sorry about forgetting some of the tests.
test/astable_flag.jl
Outdated
end | ||
|
||
@testset "@astable with just assignments, mutating" begin | ||
# After finalizing above testset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that. Thank you working on it now.
test/astable_flag.jl
Outdated
|
||
d = @rtransform df @astable begin | ||
:x = 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I remember.
@transform df begin
:x = 1
end
is valid code and does the same thing as the same with @astable
does. So I wanted to test something that made sure it was hittng the @astable
path and not the vanilla path.
I have deleted the extra new lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just please check why Documenter fails before merging. Thank you!
src/parsing.jl
Outdated
println(MacroTools.prettify(fun)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
println(MacroTools.prettify(fun)) |
Co-authored-by: Milan Bouchet-Valat <[email protected]>
Thanks for the review. @nalimilan ready to merge? |
Bravo! |
This attempt should be a more robust strategy than the current
@t
flag in DataFrameMacros.jl.I will outline it in more detail shortly.