-
-
Notifications
You must be signed in to change notification settings - Fork 426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
from_hdf5 function (TEP014) #711
Conversation
@vg3095 show me |
@wkerzendorf Now , I have used ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general that's quite good work. I left some inline comments which you may find useful in making the design more modular.
@@ -4,10 +4,11 @@ | |||
import pandas as pd | |||
from astropy import units as u | |||
from collections import OrderedDict | |||
|
|||
import h5py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think pandas should be used instead of h5py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, pandas.HDFStore()
does not support iterating over hierarchically , while h5py
does.
The issue is still opened there . Link.
So, I am using h5py
only for transversing HDF file hierarchically, and for reading particular attribute (like time_explosion
) , I am using pd.HDFStore()
tardis/simulation/base.py
Outdated
@@ -410,3 +411,138 @@ def from_config(cls, config, **kwargs): | |||
convergence_strategy=config.montecarlo.convergence_strategy, | |||
nthreads=config.montecarlo.nthreads) | |||
|
|||
@classmethod | |||
def from_hdf5(cls, file_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename this to from_hdf
tardis/simulation/base.py
Outdated
|
||
|
||
@classmethod | ||
def read_plasma_data(cls, h5_file, path, sim_dict, file_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should live in the BasePlasma class, named as from_hdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think , It will be better if I move it to BasePlasma
class, without renaming.(as for the same reasons , mentioned above for Radial1DModel
class)
tardis/simulation/base.py
Outdated
return model | ||
|
||
@classmethod | ||
def read_model_data(cls, h5_file, path, sim_dict, file_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should live in the Radial1DModel
class and named from_hdf
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the renaming part , read_model_data
, just stores attributes in a nested dictionary , and it does not return anything. Renaming it to from_hdf
under Radial1DModel
class will be misleading , as it can mean, it returns Radial1DModel
object .
So , I think , it will be better if I move it to Radial1DModel
, without renaming.
Also, I cannot return Radial1DModel
,just by reading model
attributes in HDF file .
Because , some attributes, which are required for generating this object , also resides in plasma
and runner
path. (like plasma/scalars/time_explosion
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I am suggesting is to take the work for reading a Radial1DModel
from an hdf file away from Simulation
and into Radial1DModel
. If you do that, you may find out that you probably don't need a separate read_model_data
function, but just combining it with from_hdf
could be fine.
In order to understand better what I'd want it to look like, take a look at to_hdf
. You'll see that the simulation.to_hdf
, basicall does nothing more than calling model.to_hdf
, runner.to_hdf
and plasma.to_hdf
and redirecting the responsibility towards the smaller structures.
Regarding your second point, there are two solutions:
- Either change the
to_hdf
s to store the missing attributes, or - Just assume access to the entire hdf file in each class (Model, Plasma, Runner, Simulation, ...) and read attributes from other sections.
For now, doing the second is just fine.
tardis/simulation/base.py
Outdated
sim_dict[key] = {} | ||
if 'model' in key: | ||
cls.read_model_data( | ||
h5_file, simulation + '/model/', sim_dict[key], file_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you create a classmethod under Radial1DModel
named from_hdf
you could call this here like Radial1DModel.from_hdf(hdf_file, ...)
and you'd get a fully populated Model. This way the from_hdf
concept would be more modular and easier to extend.
tardis/simulation/base.py
Outdated
h5_file, simulation + '/model/', sim_dict[key], file_path) | ||
if 'plasma' in key: | ||
cls.read_plasma_data( | ||
h5_file, simulation + '/plasma/', sim_dict[key], file_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar here, for BasePlasma
.
tardis/simulation/base.py
Outdated
v_boundary_outer) | ||
# TODO : Extend it to plasma and montecarlo objects and return Simulation object | ||
|
||
return model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's just a demo, but make sure the Simulation.from_hdf
returns a Simulation
object, not a model. Essentially, each class with a to_hdf
should probably have a from_hdf
classmethod, which returns a populated instance of the class it refers to.
@ftsamis Thanks , for the detailed review . I have replied to some of the comments . Please take a look. As for the rest , I will update it accordingly. |
tardis/simulation/base.py
Outdated
raise IOError("Supplied HDF5 File %s does not exists" % file_path) | ||
if not h5_file: | ||
raise ValueError("h5file Parameter can`t be None") | ||
with pd.HDFStore(file_path, 'r') as data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least two of these checks are unnecessary, since when you reach the pd.HDFStore()
call, it would also raise an exception itself.
@ftsamis Example Usage (Renamed
As of now , it cannot return Changes:
|
@wkerzendorf
This PR is related to first and bonus objective for TEP014 (Reading Simulation from file).
I have implemented a from_hdf5 function in Simulation Base class, which returns Radial1DModel object.
Presently it is not extended for Plasma and MonteCarlo Runner objects.
homogeneous_density and luminosity_requested are set to None for now.
Example Usage :