-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add exporter DSL #402
Add exporter DSL #402
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this pending whatever additional changes we need to get the job to run during testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a few naming nits.
} | ||
|
||
publishers { | ||
textFinder("\\[WARNING\\]", '', true, false, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this configuration is self-documenting, but here I've no idea what the true, false, true refer to. Would it be reasonable to indicate in a comment what this is doing? (I assume it's something like mark the build unstable if it finds a warning.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment.
--external-prefix=databases/${DATE:-$TODAY} \ | ||
--django-admin=${PLATFORM_VENV}/bin/django-admin.py \ | ||
--django-pythonpath=${PLATFORM_VENV}/edx-platform \ | ||
--gpg-keys=${GPG_KEYS_PATH} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I wonder if this will be actually needed. One of the ways that the course exporter differs from the regular exporter is that the output is not encrypted. That is useful for RDX, where we need to actually read and transform the dumps. And useful for general testing of exporter changes, for example. (Like the addition we're supposed to do for cohorts, IIRC.)
Oh, wait. My mistake. This is actually called by the exporter-slave job, not by the course exporter. I was misled by the name of this file -- this script is actually exporting an org, not a list of courses. The output does produce files at the level of individual courses, but the presence of job that takes courses as input made this confusing to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed.
# Install requirements into this (exporter) virtual environment | ||
pushd analytics-exporter/ | ||
pip install -r github_requirements.txt | ||
pip install --allow-external mysql-connector-python -e . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I think you and Andrew already noted, the --allow-external option is no longer used. We pulled it from pipeline makefile in Jan. 2017: edx/edx-analytics-pipeline@fcebbb7#diff-b67911656ef5d18c4ae36cb6741b7965.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
|
||
steps { | ||
shell(readFileFromWorkspace("dataeng/resources/course-exporter-worker.sh")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should analytics-exporter-slave call org-exporter-worker.sh, instead of course-exporter-worker.sh, or something like that? The name course-exporter-worker.sh seems a better fit to be called by analytics-exporter-course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the name of script.
virtualenv { | ||
nature("shell") | ||
command( | ||
readFileFromWorkspace("dataeng/resources/setup-course-exporter.sh") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this sets up the course-exporter, it also runs it. I would suggest calling it run-course-exporter, or something like that.
DAYOFWEEK=$(date +%u ${DATE_MODIFIER}) | ||
if [ $DAYOFWEEK != 7 ]; then | ||
exit 1 | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this about? The job can only use data that was generated on Sundays? I hope that job never breaks....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we have it, the CWSM dumps are only generated on sundays ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote that because the job will run but fail to upload to S3 unless it's run on a date that's a sunday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CWSM dumps are scheduled on Sundays, but if they fail on Sunday and are run on a Monday, then this prevents us from being able to change the date modifier to some other date. I think that some new users (maybe Emilio?) when first running this specified today's date instead of the Sunday date, and I think Andrew put that in originally to deal with that. I'm fine with it present or absent, until it becomes a problem one way or the other.
5bca0f2
to
4b606cd
Compare
This is ready for a final review. |
This looks good but I want to make sure the job history section is fixed before we merge this code into production (unless that investigation takes too long). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've fixed the job config history plugin via the ansible code. I'm 👍 on this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. 👍
@@ -0,0 +1,30 @@ | |||
import static analytics.AnalyticsEmailOptin.job as AnalyticsEmailOptinJob | |||
import static analytics.AnalyticsExporter.job as AnalyticsExporterJob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why does github highlight the characters on either side of " as " on these two lines? If they were just spaces I wouldn't expect a red square.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's going on, files created using vim and github show same behavior. I found this: textmate/groovy.tmbundle#17 which made me think it was textmate related(I use textmate). But doesn't look like that it is. olivex internal dsl repo show the same red blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it has something to do with asciidoc ?
DAYOFWEEK=$(date +%u ${DATE_MODIFIER}) | ||
if [ $DAYOFWEEK != 7 ]; then | ||
exit 1 | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CWSM dumps are scheduled on Sundays, but if they fail on Sunday and are run on a Monday, then this prevents us from being able to change the date modifier to some other date. I think that some new users (maybe Emilio?) when first running this specified today's date instead of the Sunday date, and I think Andrew put that in originally to deal with that. I'm fine with it present or absent, until it becomes a problem one way or the other.
dataeng/jobs/createJobsTest.groovy
Outdated
@@ -0,0 +1 @@ | |||
import static analytics.AnalyticsEmailOptin.job as TestJob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what's going on. I created this file using vim and the same red blocks appear.
dataeng/jobs/test.groovy
Outdated
@@ -0,0 +1 @@ | |||
import static analytics.AnalyticsEmailOptin.job as SomeJob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file was created using github itself.
7a6c14d
to
ae2eb1f
Compare
TODO: