Driver script #3
Comments
Sounds very similar to the oa-get routine in
Yes, I saw that. But I didn't think oa-get is ready for prime time, and I just want something simple. But you're right that these should be tied together at some point. I might have to learn Python ...
The OA Media Importer as a whole is not ready yet, but the crawling part mostly is, and using it does not require coding anything in Python. One use case is the "Wikipedia" circle in http://malaria.bibsoup.net/.
I have now removed download_examples.sh, as fetch-samples.sh does the job. open-access-media-importer has some dependencies which might be a hurdle for some people. I think for our purpose it is fine to fetch the selected examples with wget. But we should refer to oa-get as the tool for downloading articles other than those used for our testing.
Right now this is fetch-samples, but it needs to morph into a real driver script with these features:
- articles, which specifies the list of articles to process: either an explicit list, or a reference to an XML file that contains a list, or (default) all the articles that have been updated since last time, according to the oa-service.
- steps, which specifies which step in the pipeline to execute (default is all):
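A minimal sketch of how the two options described above might be parsed, written in Python only for illustration (the existing script is a shell script). Everything concrete here is an assumption: the flag names `--articles`, `--articles-file` and `--steps`, the step names in `PIPELINE_STEPS` (the issue does not list the actual steps), the `<article id="..."/>` layout of the XML list, and the `updated_since_last_run` placeholder, which stands in for the oa-service query and is not implemented.

```python
import argparse
import xml.etree.ElementTree as ET

# Hypothetical step names: the issue does not list the real pipeline steps.
PIPELINE_STEPS = ["fetch", "convert", "validate"]


def build_parser():
    """Build the command-line parser for the (hypothetical) driver script."""
    parser = argparse.ArgumentParser(description="Driver script sketch")
    source = parser.add_mutually_exclusive_group()
    source.add_argument("--articles", nargs="+", metavar="ID",
                        help="explicit list of article identifiers")
    source.add_argument("--articles-file", metavar="XML",
                        help="XML file that contains a list of articles")
    parser.add_argument("--steps", nargs="+", choices=PIPELINE_STEPS,
                        default=PIPELINE_STEPS,
                        help="pipeline steps to run (default: all)")
    return parser


def read_article_ids(xml_source):
    """Read ids from <article id="..."/> elements (assumed file layout)."""
    root = ET.parse(xml_source).getroot()
    return [el.get("id") for el in root.iter("article") if el.get("id")]


def updated_since_last_run():
    """Placeholder for the oa-service query; returns nothing in this sketch."""
    return []


def resolve_articles(args):
    """Pick the article list: explicit ids, then XML file, then the default."""
    if args.articles:
        return list(args.articles)
    if args.articles_file:
        return read_article_ids(args.articles_file)
    return updated_since_last_run()


def main(argv=None):
    args = build_parser().parse_args(argv)
    articles = resolve_articles(args)
    # Run steps in pipeline order, whatever order was given on the command line.
    steps = [s for s in PIPELINE_STEPS if s in args.steps]
    for article in articles:
        for step in steps:
            print(f"{step}: {article}")  # a real driver would invoke the step here
    return articles, steps
```

Calling `main()` with no arguments reads the command line; with neither `--articles` nor `--articles-file` it falls back to the placeholder default. Making the two article sources mutually exclusive is a design choice of this sketch, not something the issue specifies.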