Support appending arbitrary container #57
Codecov Report
@@ Coverage Diff @@
## master #57 +/- ##
==========================================
+ Coverage 67.06% 67.19% +0.12%
==========================================
Files 6 6
Lines 498 509 +11
==========================================
+ Hits 334 342 +8
- Misses 164 167 +3
Continue to review full report at Codecov.
Awesome, this seems super useful!
Now, I may be overthinking things, but I’m deep in thought about insertable and ordered containers in Dictionaries.jl, and I’m slightly worried we might be complecting the interface of appendable containers (or, ordered insertable containers) with the duality between column-access and row-access tables. I’m pushing towards stronger interfaces with the guarantees needed for writing robust generic code, which put boundaries on what generic functions like append! can do. There are certain relationships and identities that should hold between generic functions, and in this case we are speaking of appending iterable containers.
An example of the kind of identity I might keep in mind is that
length(b) == 1 && append!(a, b)
might be thought of as equivalent to
push!(a, only(b))
Basically, if we could limit this to table containers b where first(b) is a row, then everything follows naturally from the container properties. Appending a column-access table to another should concatenate the columns, in my opinion (and it should be easy to switch between the row-based and column-based iteration orders).
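To make the identity concrete, here is a minimal sketch that uses only Base (Julia 1.4 or later, for only). A plain Vector of NamedTuples stands in for a row table and a NamedTuple of Vectors stands in for a column table; both are assumptions for illustration and not the Table type from this package.

```julia
# Illustrative stand-ins only: a Vector of NamedTuples plays the role of a
# row table, and a NamedTuple of Vectors plays the role of a column table.
fresh_rows() = [(a = 1, b = "x"), (a = 2, b = "y")]
single = [(a = 3, b = "z")]               # one-row container: first(single) is a row

# The identity: appending a one-element container matches pushing its only row.
via_append = append!(fresh_rows(), single)
via_push = push!(fresh_rows(), only(single))

# Column view of the same operation: concatenate each column in place.
cols = (a = [1, 2], b = ["x", "y"])
more = (a = [3], b = ["z"])
for name in keys(cols)
    append!(cols[name], more[name])
end

# Switching the column table back to row iteration order.
rows_from_cols = [(a = a, b = b) for (a, b) in zip(cols.a, cols.b)]
```

Under these assumptions, both routes produce the same rows, which is the kind of guarantee generic code could rely on.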
What are your thoughts?
That's an interesting point. I think I agree. But a pain point is that there is no easy way to query if
isrowiterator(t) =
    Tables.istable(t) && Tables.rowaccess(t) && Tables.rows(t) === t
function Base.append!(t::Table, t2)
    if isrowiterator(t2) && Tables.columnaccess(t2)
        return append_columnaccess!(t, t2)
    end
    return append_rows!(t, t2)
end
append_rows!(t::Table, rows) =  # `rows` is a generic iterator
    mapfoldl(_asnamedtuple(NamedTuple{columnnames(t)}), push!, rows; init = t)
It's nice that now it naturally supports iterators with unknown eltype. (BTW, it would be nice to add something like With this, I think |
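The row-by-row fold above can be tried without any packages. The sketch below is a rough approximation under stated assumptions: the destination is a plain Vector of NamedTuples rather than a Table, and asnamedtuple_sketch and append_rows_sketch! are hypothetical names standing in for the PR's _asnamedtuple and row-append helper.

```julia
# Hypothetical helper: returns a function that rebuilds any row-like object
# as a NamedTuple with the given field names, in the given order.
asnamedtuple_sketch(names::Tuple) =
    row -> NamedTuple{names}(map(n -> getproperty(row, n), names))

# Hypothetical row-wise append: convert each incoming row, then push! it.
# `rows` can be any iterator; its eltype does not need to be known up front.
append_rows_sketch!(dest::Vector, rows) =
    mapfoldl(asnamedtuple_sketch(fieldnames(eltype(dest))), push!, rows; init = dest)

dest = [(a = 1, b = "x")]
# A generator whose rows list their fields in a different order.
src = ((b = string(i), a = i) for i in 2:3)
result = append_rows_sketch!(dest, src)
```

Because push! returns its first argument, the fold threads the destination through every step and hands back the same container it was given.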
This already works. But we need JuliaData/Tables.jl#126 for the optimization (column-oriented memory access). |
I'm not sure I quite understand what
if isrowiterator(t2) && Tables.columnaccess(t2)
    return append_columnaccess!(t, t2)
end
is it
Also note that:
Tables.istable(t) && Tables.rowaccess(t) && Tables.rows(t) === t
doesn't return true for all valid row-iterators. For example, |
Yes,
As
They hit |
So why not just always call |
I thought it's nice to avoid allocating unnecessary objects. If performance is not a concern, I think the cleanest approach here would be to add things row-by-row as there is already Line 218 in 018a2d1
|
Then I would just check |
I initially had something like that. But then @andyferris pointed out that it might be better to strictly require that |
This PR implements append!(::Table, source) for arbitrary source table types: