Process data #8
Thanks for the PR @wagenrace - a couple of important comments below:
1. This code is used to download the data from figshare, not to process the data for upload to figshare. In Adding Data Preparation Code #4, I am specifically interested in adding the code that @fheigwer used to process the data that was presumably downloaded from IDR (DOI: http://doi.org/10.17867/10000101). I think changing the naming conventions would make this much clearer. Perhaps instead of `0.process-data` it should be `0.process-idr-images`. To merge this PR, please change the module to `1.download-data`.
2. I am thinking that the function `downloadData()` in `main.py` will never be used outside the `1.download-data` module. Let's put this file in a folder called `1.download-data/scripts`.
3. I was probably not being 100% clear, but in an analysis module repository style, each folder stands alone. Therefore, there should not be any need to import functions across folders, and the current file `0process-data/__init__.py` is not necessary.
4. The current file `0process-data/main.py` is the code used to download the data. So anytime someone wanted access to the csv files, they would have to re-download. This module should be run only once by an external user. For example, the user will run some command that calls `main.py` and the data will be placed in appropriate folders (see Adding Data Download Code #5).
5. Please add execution instructions. This could be in the form of a jupyter notebook that, when run, will download the data.
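The run-once behaviour requested in the review could look roughly like the sketch below. It is only an illustration under stated assumptions: `download_once` and its `fetch` parameter are hypothetical names, not code from this PR, and the actual figshare URLs and target folders are not shown in this conversation.

```python
from pathlib import Path
from urllib.request import urlretrieve


def download_once(url, dest, fetch=urlretrieve):
    """Download url to dest unless dest already exists; return the path.

    Hypothetical helper: `fetch` defaults to urllib's urlretrieve and can be
    swapped for any callable taking (url, filename).
    """
    dest = Path(dest)
    if dest.exists():
        # The data is already on disk, so skip the network entirely.
        return dest
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Fetch into a temporary name first so an interrupted download
    # is not mistaken for a complete file on the next run.
    tmp = dest.with_name(dest.name + ".part")
    fetch(url, str(tmp))
    tmp.replace(dest)
    return dest
```

A script such as `main.py` could call this helper once per file; running the script a second time would then leave existing files untouched instead of downloading them again.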
About point 4: why is unzipping needed? The zipped files can be loaded directly into Python and unzipped there, the same way it is done with the MNIST dataset.
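As a rough illustration of loading a zipped file directly, the sketch below reads a CSV member out of a zip archive without extracting it to disk. It uses only the standard library. The function name `read_csv_from_zip` and the archive and member names in the usage note are hypothetical, and UTF-8 encoding is an assumption about the data.

```python
import csv
import io
import zipfile


def read_csv_from_zip(zip_path, member):
    """Return the rows of a CSV stored inside a zip archive as a list of dicts.

    Nothing is extracted to disk; the member is decompressed in memory.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # archive.open() yields bytes, so wrap it to decode text for csv.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

For example, `read_csv_from_zip("data.zip", "features.csv")` would return one dict per row, keyed by the CSV header. Whether to keep the archives zipped or extract them once at download time is the design choice under discussion here.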
Thanks for the work @wagenrace - we are nearly there
Thanks for the important PR @wagenrace - I will go ahead and merge so we can continue with the project 👍
Add dependencies:
Issue #4
I performed the following prior to filing this pull request: