Serialization of big Numpy arrays #150
Comments
@JackTemaki @Atticus1806 opinions?
I would really prefer external handling. The config is usually already a lot longer than "old" configs because everything is set explicitly, but it is still readable. I feel like dumping big arrays into it would probably make it unreadable, or at least very annoying to work with (it slows down the text editor, etc.).
Yes, me too. But then the next question is, how exactly? I mean, probably Should the path be relative? Relative to what? Where should those files be stored? What should the API look like?
For Sisyphus usage the
I would prefer if it works with relative paths though, because then you can move both the config and the extra dirs around.
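To make the relative-path idea concrete, here is a minimal sketch. It assumes the loading code knows where the config file itself lives (passed explicitly as `config_path` here; how the config would learn its own location is exactly the open question in this thread). The helper name `load_relative` and the directory layout (`extra/filter.npy` next to the config) are hypothetical, chosen only for illustration.

```python
import os
import shutil
import tempfile

import numpy


def load_relative(config_path, rel_path):
    """Load a .npy file whose path is given relative to the config file's directory."""
    base_dir = os.path.dirname(os.path.abspath(config_path))
    return numpy.load(os.path.join(base_dir, rel_path))


# Demo: the config and its extra dir live side by side.
root = tempfile.mkdtemp()
old_dir = os.path.join(root, "exp1")
os.makedirs(os.path.join(old_dir, "extra"))
array = numpy.arange(12, dtype="float32").reshape(3, 4)
numpy.save(os.path.join(old_dir, "extra", "filter.npy"), array)
old_config = os.path.join(old_dir, "returnn.config")
with open(old_config, "w"):
    pass  # empty stand-in for the real config file

before = load_relative(old_config, "extra/filter.npy")

# Move config and extra dir together; the relative path stays valid.
new_dir = os.path.join(root, "moved")
shutil.move(old_dir, new_dir)
after = load_relative(os.path.join(new_dir, "returnn.config"), "extra/filter.npy")
```

With an absolute path stored in the config, the second load would fail after the move; with a path relative to the config, nothing in the config has to change.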
Ok, where do we expect the config to be? So how should we generate relative paths to Should there be a reasonable default for
Yes, why not do it this way. With Sisyphus we always know where the file should be, and for the tests it can be within the config.
So, where do we expect the config to be? So how should we generate relative paths to |
There are cases where a bigger Numpy array is part of your net dict, e.g. when you have a custom init for some parameter, as in the case of `GammatoneV2`.

When a bigger Numpy array is part of the net dict, it is currently serialized as is, via `__repr__`. This makes the produced net dict very difficult to read when 99% of it is just the Numpy array.

So, should we do something about it?
What are possible things we could do? Here are some ideas:
- We could at least move the definition to the top, similar to what we do for dim tags. Then the net dict itself stays readable. But still 99% of the resulting RETURNN config would be just the Numpy array.
- We could move them outside, either as Numpy txt files loaded via `numpy.loadtxt`, or as Python files that we import. However, this means that any config serialization logic now needs extra logic to handle these cases. Although we would probably write this only once anyway and then not care about it anymore.

Such external file handling of the serialization could also be done in a generic way, and maybe it becomes useful for other purposes as well.
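The external-file idea could look roughly like the sketch below. This is not the actual serialization code of this project, only an illustration under stated assumptions: the helper `externalize_arrays`, the placeholder class `_ArrayRef`, the `min_size` threshold and the file naming scheme are all hypothetical. It walks a nested dict, writes each sufficiently large 1D/2D array to a text file with `numpy.savetxt`, and replaces it with an object whose `repr` is a `numpy.loadtxt(...)` call, so the serialized config stays short.

```python
import os
import tempfile

import numpy


class _ArrayRef:
    """Hypothetical placeholder: its repr is the code that reloads the array."""

    def __init__(self, path, dtype, ndim):
        self.path = path
        self.dtype = dtype
        self.ndim = ndim

    def __repr__(self):
        # ndmin keeps length-1 axes that loadtxt would otherwise squeeze away.
        return "numpy.loadtxt(%r, dtype=%r, ndmin=%d)" % (self.path, self.dtype, self.ndim)


def externalize_arrays(obj, out_dir, min_size=100, _prefix="array"):
    """Return a copy of obj where big arrays are replaced by _ArrayRef objects.

    Only 1D/2D arrays are moved out, since numpy.savetxt supports no other ranks.
    """
    if isinstance(obj, dict):
        return {
            key: externalize_arrays(value, out_dir, min_size, "%s_%s" % (_prefix, key))
            for key, value in obj.items()
        }
    if isinstance(obj, numpy.ndarray) and obj.size >= min_size and obj.ndim in (1, 2):
        os.makedirs(out_dir, exist_ok=True)
        path = os.path.join(out_dir, _prefix + ".txt")
        numpy.savetxt(path, obj)
        return _ArrayRef(path, obj.dtype.name, obj.ndim)
    return obj


# Usage: a toy net dict with one big array (layer and key names are made up).
net_dict = {"gammatone": {"class": "conv", "filter": numpy.random.rand(40, 64).astype("float32")}}
out_dir = tempfile.mkdtemp()
small = externalize_arrays(net_dict, out_dir)
config_code = "import numpy\nnetwork = %r\n" % (small,)
```

The generated `config_code` contains only the `numpy.loadtxt` call instead of the 2560 array values. This sketch stores the absolute path returned by `os.path.join(out_dir, ...)`; making it relative to the config would need the path handling discussed in the comments. A binary format via `numpy.save` / `numpy.load` would be an alternative that also handles arrays of higher rank.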