Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to DB native export format #357
Switch to DB native export format #357
Changes from 4 commits
f012522
bc33066
29174e4
a7051c9
33a5ba3
a6b7768
e4131b4
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could? create a centralized jinja template for these DB specific queires. I don't know how much this export mechanism needs the injection protection, since it's not in queries that are being distributed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The table name comes from the manifest yeah? So it is user input, which does mean it's subject to chicanery, by the user or study author or malicious 3rd party app modifying files in $HOME... (btw: do we have any sanitizing of table names when reading the manifest?)
But I'm not overly stressed - I'll leave it up to you on risk assessment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do have a regex validation for custom prefixes, but not for table names. if we did that, I'd be fine saying this is safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but... on the other hand, a user can run a DROP TABLE query if they edit the manifest and it would be valid, so... maybe we're just not safe from a malicious user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or a malicious script on the user's machine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
table_type
seems like an odd addition here. In both implementations, it's just used to create the output file name, which felt like a bit of duplicated business logic.What if you used a "input/output, did it write anything" pattern like (totally ignorable suggestion, just brainstorming):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you need the bare table name for the actual sql query, and then this concat with table_type to tell you if it's flat/cube/something else. I like that a :little: bit more than having a split for readability.
And I need the path downstream so that I have one place to handle the 'take the parquet and create a csv from it once it's downloaded' logic. So... I think I'm going to soft advocate for as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, as discussed offline, this now takes a
filename
arg and returns a bool