-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for Table
output in Data.read_many
#11546
base: develop
Are you sure you want to change the base?
Conversation
4c4b4e0
to
68df118
Compare
type Return_As_Table | ||
## Returns a table with a new column `Value` containing the objects loaded | ||
from each file. | ||
|
||
When the source for files to load was a table, all columns from the | ||
original table are also retained. In case of name clashes, the newly | ||
added columns will get a suffix. | ||
|
||
When the source was a simple Vector, the returned table will also contain | ||
a `File Name` column. | ||
Table_Of_Objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name Table_Of_Objects
was a working name. I'm not sure if we want to keep it or come up with something better?
Definitely Table_Of_Tables
in Excel sounded much better. But here it may not necessarily be tables, it may be various objects. So Table_Of_Objects
is quite precise, but it sounds weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should change approach altogether and try something like Added_Column
? (meaning that the loaded files are added as a column to the input)
Values_Column
Table_With_New_Column
any thoughts on these names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Table_With_New_Column
or Table_Of_Objects
seem the best to me, Values_Column
I think could end up being confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about more of a boolean? Merged/UnMerged. Merged True/False? I feel like it would be hard for someone to guess what table_of_objects is going to do without reading the help or trying it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if:
type Return_As
## Single table with values expanded
Merged_Table
## PRIVATE
Add this for backwards compatiblity then hide in widget?
Table_Of_Tables
##
Individual_Cells
## Objects returned as a raw vector allowing zipping to input
Vector
Not sure on name yet either - will think some more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about more of a boolean? Merged/UnMerged. Merged True/False?
But note that
Merged_Table
has additional arguments (columns_to_keep
andmatch
) that are only relevant in this mode. Switching to a boolean would make this problematic.
Plus dropdown options are clearer in the GUI generally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sorry that wasn't clear. When I said boolean I didn't mean a boolean type. More text options that were Merged or Not Merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vector
is handled byReturn_As_Base
- the type needs to be split up so that theVector
return works even ifTable
is not imported. It is opaque to the user - the dropdown displays all available types and autoscoping works as shown in tests.I'd not add
Table_Of_Tables
toData.read_many
. It is supported inExcel_Workbook.read_many
to retain backwards compatibility. But adding it to a brand new function seems unnecessary - why add an obsolete option?
I was in my head thinking of it as a single type used by both functions but I appreciate we are actually building this dynamically. Just for working out how it might look in the dropdown thinking as a single type is a bit easier!
If the ..Table_Of_Tables
was an option not shown in the dropdown but in one of the mix in types then no special handling would be needed for the two reads to share the same object was my thought.
I wonder if:
..As_Vector
..With_New_Column
..As_Merged_Table
Might be cleanest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the
..Table_Of_Tables
was an option not shown in the dropdown but in one of the mix in types then no special handling would be needed for the two reads to share the same object was my thought.
But then more code would be affected by the legacy options, I'd rather not grow places that take legacy options when not needed. Unless there's any problem with the current approach, I'd keep the compatibility conversion as-is.
I wonder if:
..As_Vector
..With_New_Column
..As_Merged_Table
Might be cleanest?
That sounds quite clear to me 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I like those options
@@ -16,146 +7,3 @@ type Match_Columns | |||
|
|||
Note: column names are not compared. | |||
By_Position | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No meaningful code changes in this file, just moving the helpers to separate module - Match_Columns_Helpers
.
771a52c
to
eead84a
Compare
…ashing names with Excel Sheet Name
Pull Request Description
Data.read_many
#11311Important Notes
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
or the Snowflake database integration, a run of the Extra Tests has been scheduled.