Set up script for Composer and support for getting gcsDagLocation from Composer environment #585
@@ -12,7 +12,6 @@

    """Google Cloud Platform library - BigQuery IPython Functionality."""
    from builtins import str
    import google
    import google.datalab.utils as utils
@@ -21,6 +20,10 @@ def _create_pipeline_subparser(parser):

                                     'transform data using BigQuery.')
    pipeline_parser.add_argument('-n', '--name', type=str, help='BigQuery pipeline name',
                                 required=True)
    pipeline_parser.add_argument('-e', '--environment', type=str,
                                 help='The name of the Composer or Airflow environment.')
    pipeline_parser.add_argument('-z', '--zone', type=str,
                                 help='The name of the Composer or Airflow zone.')
    return pipeline_parser
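For illustration, the new flags can be exercised with a standalone parser that mirrors the arguments above (a sketch, not pydatalab's own magic machinery — `%%bq pipeline` wires these up through its command framework):

```python
import argparse

# Mirror of the subparser's arguments (sketch; names match the diff above).
parser = argparse.ArgumentParser(prog='pipeline')
parser.add_argument('-n', '--name', type=str, required=True,
                    help='BigQuery pipeline name')
parser.add_argument('-e', '--environment', type=str,
                    help='The name of the Composer or Airflow environment.')
parser.add_argument('-z', '--zone', type=str,
                    help='The name of the Composer or Airflow zone.')

args = parser.parse_args(['-n', 'my_pipeline', '-e', 'my-env', '-z', 'us-central1'])
print(args.name, args.environment, args.zone)  # my_pipeline my-env us-central1
```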
@@ -44,11 +47,17 @@ def _pipeline_cell(args, cell_body):

      bq_pipeline_config = utils.commands.parse_config(
          cell_body, utils.commands.notebook_environment())
      pipeline_spec = _get_pipeline_spec_from_config(bq_pipeline_config)
      # TODO(rajivpb): This import is a stop-gap for
      # https://github.com/googledatalab/pydatalab/issues/593
      import google.datalab.contrib.pipeline._pipeline
      pipeline = google.datalab.contrib.pipeline._pipeline.Pipeline(name, pipeline_spec)
      utils.commands.notebook_environment()[name] = pipeline

      # If a composer environment and zone are specified, we deploy to composer
      if 'environment' in args and 'zone' in args:
        # TODO(rajivpb): This import is a stop-gap for
        # https://github.com/googledatalab/pydatalab/issues/593
        import google.datalab.contrib.pipeline.composer._composer
        composer = google.datalab.contrib.pipeline.composer._composer.Composer(
            args.get('zone'), args.get('environment'))
        composer.deploy(name, pipeline._get_airflow_spec())

Review thread on the line "if 'environment' in args and 'zone' in args:":

Reviewer: I think "'environment' in args and 'zone' in args" is always True. If the user does not provide values for them, they will be None, but 'environment' and 'zone' will still exist in args. Perhaps you want "if args['environment'] and args['zone']:".

Author: Oh, I didn't know that. Will verify and fix in a different PR, mostly because I'm trying to get this in before demo time today.
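The reviewer's point can be checked in isolation: with argparse, every declared option always appears as a key in the parsed-arguments dict, defaulting to None when not supplied, so the membership test never fails. A minimal sketch (generic argparse, not pydatalab's command framework):

```python
import argparse

# Declare the two optional flags from the diff above.
parser = argparse.ArgumentParser()
parser.add_argument('-e', '--environment', type=str)
parser.add_argument('-z', '--zone', type=str)

# Parse with NO flags supplied: the keys still exist, with value None.
args = vars(parser.parse_args([]))

print('environment' in args and 'zone' in args)    # True: membership check always passes
print(bool(args['environment'] and args['zone']))  # False: the value check the reviewer suggested
```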
@@ -0,0 +1,15 @@

    #!/usr/bin/env bash
    PROJECT=${1:-cloud-ml-dev}
    EMAIL=${2:-rajivpb@google.com}
    ZONE=${3:-us-central1}
    ENVIRONMENT=${3:-rajivpb-airflow}

    gcloud config set project $PROJECT
    gcloud config set account $EMAIL
    gcloud auth login --activate $EMAIL

    # We use the default cluster spec.
    gcloud container clusters create $ENVIRONMENT --zone $ZONE

    # Deploying the airflow container
    kubectl run airflow --image=gcr.io/cloud-airflow-releaser/airflow-worker-scheduler-1.8

Review thread on the line "ENVIRONMENT=${3:-rajivpb-airflow}":

Reviewer: Should it be 4?

Reviewer: Please fix.

Author: Correct; fixed now.

Reviewer: I think you fixed the composer one but not the airflow one yet?

Author: Yeah, my bad - I kept thinking that all the comments referred to the composer setup.sh and was a little confused about getting multiple comments for the same change. Will fix in a different PR, mostly because I'm trying to get this in before demo time today.
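The bug the reviewers flagged: both ZONE and ENVIRONMENT read positional parameter 3, so the fourth argument is ignored and the zone silently doubles as the environment name. A small sketch of the `${n:-default}` behavior (hypothetical argument values, not the PR's script):

```shell
#!/usr/bin/env bash
# Simulate invoking the script with four arguments.
set -- my-project someone@example.com europe-west1 my-env

# Buggy version: both variables read $3.
ZONE=${3:-us-central1}
ENVIRONMENT=${3:-rajivpb-airflow}   # bug: should read $4
echo "buggy: zone=$ZONE environment=$ENVIRONMENT"

# Corrected version: ENVIRONMENT falls back to the fourth argument.
ENVIRONMENT=${4:-rajivpb-airflow}
echo "fixed: zone=$ZONE environment=$ENVIRONMENT"
```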
@@ -0,0 +1,37 @@

    #!/usr/bin/env bash
    # We remove the local installation and install a version that allows custom repositories.
    sudo apt-get -y remove google-cloud-sdk

    # If an installation still exists after executing apt-get remove, try the following:
    #   gcloud info --format="value(installation.sdk_root)"
    # That will give you the installation directory of whichever installation gets executed;
    # you can remove that entire directory. Restart the shell, rinse and repeat until gcloud
    # is no longer on your path. You might have to clean up your PATH in .bashrc and nuke
    # the .config/gcloud directory.

    # Hopefully by now the machine is clean, so install gcloud.
    curl https://sdk.cloud.google.com | CLOUDSDK_CORE_DISABLE_PROMPTS=1 bash

    # Recycling shell to pick up the new gcloud.
    exec -l $SHELL

    # These have to be set here and not at the top of the script because we recycle the shell
    # somewhere between the start of this script and here.
    PROJECT=${1:-cloud-ml-dev}
    EMAIL=${2:-rajivpb@google.com}
    ZONE=${3:-us-central1}
    ENVIRONMENT=${4:-datalab-testing-1}

    gcloud config set project $PROJECT
    gcloud config set account $EMAIL
    gcloud auth login $EMAIL

    gcloud components repositories add https://storage.googleapis.com/composer-trusted-tester/components-2.json
    gcloud components update -q
    gcloud components install -q alpha kubectl

    gcloud config set composer/location $ZONE
    gcloud alpha composer environments create $ENVIRONMENT
    gcloud alpha composer environments describe $ENVIRONMENT
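The PR title mentions getting gcsDagLocation from the Composer environment; that value comes out of the describe call that ends the script above. A hedged sketch of the lookup (the field path `config.dagGcsPrefix` is the name in today's Composer API — the alpha API this PR targets may expose it as gcsDagLocation instead):

```shell
#!/usr/bin/env bash
# Sketch: build the describe command that would fetch the DAG bucket for an
# environment. Defaults match the setup script above; the --format field name
# is an assumption based on the current Composer API.
ENVIRONMENT=${1:-datalab-testing-1}
LOCATION=${2:-us-central1}

describe_cmd="gcloud composer environments describe $ENVIRONMENT --location $LOCATION --format=value(config.dagGcsPrefix)"
echo "$describe_cmd"
# Running it against a real environment prints something like gs://<bucket>/dags,
# which is where a deploy step would copy the generated Airflow spec.
```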
@@ -16,6 +16,7 @@

    # Compiles the typescript sources to javascript and submits the files
    # to the pypi server specified as first parameter, defaults to testpypi
    # In order to run this script locally, make sure you have the following:
    # - A Python 3 environment (due to urllib issues)
    # - Typescript installed
    # - A configured ~/.pypirc containing your pypi/testpypi credentials with
    #   the server names matching the name you're passing in. Do not include

Review thread on the Python 3 line:

Reviewer: Is there an issue open to fix this for Py2?

@@ -28,6 +29,7 @@

    tsc --module amd --noImplicitAny datalab/notebook/static/*.ts
    tsc --module amd --noImplicitAny google/datalab/notebook/static/*.ts

    # Provide https://upload.pypi.org/legacy/ for prod binaries
    server="${1:-https://test.pypi.python.org/pypi}"
    echo "Submitting package to ${server}"
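Since the publish script requires a Python 3 environment, a guard near its top could fail fast instead of breaking later inside urllib. A hypothetical addition, not part of this PR (the `PYTHON` override variable is an assumption):

```shell
#!/usr/bin/env bash
# Hypothetical guard (not in the PR): abort early when the active interpreter
# is not Python 3, rather than failing later on urllib differences.
PY_MAJOR=$("${PYTHON:-python3}" -c 'import sys; print(sys.version_info[0])')
if [ "$PY_MAJOR" -lt 3 ]; then
  echo "publish script needs Python 3 (urllib issues); found major version $PY_MAJOR" >&2
  exit 1
fi
echo "Python $PY_MAJOR detected; OK to publish"
```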
Review thread:

Reviewer: It's only the name of the Composer environment, right? Not used in Airflow?

Author: You're right. At the time of writing the comment, I made it slightly future-looking since we plan to work on an Airflow environment. Will change in a different PR, mostly because I'm trying to get this in before demo time today.