Fixes #1601 Add initial support for Unizin Synthetic data #1600
base: master
# Change the default Bigquery Project ID
"DEFAULT_PROJECT_ID": "udp-umich-prod",
# Change the dataset project ID where queries are run against
"DATASET_PROJECT_ID": "unizin-shared"
I will leave this line commented out by default, so a developer needs to enable DATASET_PROJECT_ID on purpose to connect to the Unizin synthetic data.
Yeah, I don't think it would be needed otherwise. Though there could be a case where a default project is used but the data sets are in another project. Like when/if we get true shared repositories.
Anyway, this is a temporary solution and I believe we'll need a future issue to set this value on a per course level in the admin.
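A rough sketch of the case described above, where a default project is used but the datasets live in another project. The helper name `qualify_table` and the dataset and table names in the usage lines are hypothetical and not part of this PR; the only assumption about BigQuery is that tables can be addressed as `project.dataset.table`.

```python
from typing import Optional


def qualify_table(dataset: str, table: str,
                  default_project: str,
                  dataset_project: Optional[str] = None) -> str:
    """Return a backtick-quoted, fully qualified table name.

    Uses dataset_project when it is set, otherwise falls back to
    default_project. (Hypothetical helper, for illustration only.)
    """
    project = dataset_project or default_project
    return f"`{project}.{dataset}.{table}`"


# Regular data: tables are looked up in the default project.
print(qualify_table("my_dataset", "my_table", "udp-umich-prod"))

# Synthetic data: tables are looked up in the shared project instead.
print(qualify_table("my_dataset", "my_table", "udp-umich-prod", "unizin-shared"))
```

With something like this, the default project can stay whatever the service account expects, and only the table references change when DATASET_PROJECT_ID is enabled.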
DEFAULT_PROJECT_ID = ENV.get("DEFAULT_PROJECT_ID", None)

# Override the default project ID for BigQuery if needed, like to unizin-shared
DATASET_PROJECT_ID = ENV.get("DATASET_PROJECT_ID", None)
Should the default value be DEFAULT_PROJECT_ID, instead of None?
Maybe, but then we might need a new variable to indicate whether this is running on synthetic data.
Maybe it's better to make that explicit. I'm not sure yet; this still needs some more work.
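To make the trade-off in this thread concrete, here is a minimal sketch of the fallback idea combined with an explicit flag. The function name `resolve_project_ids` and the returned `dataset_overridden` flag are assumptions for illustration; the PR itself reads both values with a default of None, and an override being set does not by itself prove the data is synthetic.

```python
from typing import Mapping, Optional, Tuple


def resolve_project_ids(
    env: Mapping[str, str],
) -> Tuple[Optional[str], Optional[str], bool]:
    """Resolve project IDs from an ENV-like mapping.

    Returns (default_project, dataset_project, dataset_overridden).
    dataset_project falls back to default_project when no override is set.
    (Hypothetical helper, not the code in this PR.)
    """
    default_project = env.get("DEFAULT_PROJECT_ID")
    override = env.get("DATASET_PROJECT_ID")
    dataset_project = override or default_project
    # Keep the "was an override given?" signal explicit instead of
    # inferring it later by comparing the two IDs.
    dataset_overridden = bool(override)
    return default_project, dataset_project, dataset_overridden


default_id, dataset_id, overridden = resolve_project_ids(
    {"DEFAULT_PROJECT_ID": "udp-umich-prod", "DATASET_PROJECT_ID": "unizin-shared"}
)
```

This keeps the fallback suggested in the question while still leaving an explicit signal that other code could check.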
"CRON_QUERY_FILE": "config/cron_udp.hjson",

# Change the default Bigquery Project ID
"DEFAULT_PROJECT_ID": "udp-umich-prod",
I would use a placeholder value here, instead of "udp-umich-prod", like
"DEFAULT_PROJECT_ID": "<UDP_institution_id>",
I think this was just empty before. This should all work with the service account and the regular data if it's not set and I don't think it will work yet.
Maybe it is worthwhile to add a section to loading_data.md with directions for using the Unizin synthetic dataset.
Draft, as there are a few things to fix up:
… DATASET_PROJECT_ID is not set.
… connection_property.
… DEFAULT_PROJECT_ID to env, since GOOGLE_CLOUD_PROJECT is one that is expected to be defined in the environment.
This is initial support.
Future support ideas include