docs: example writing multiple datasets to orso file #92
Conversation
src/ess/reflectometry/tools.py
Outdated
for fname, quantities in files.items():
    wf = workflow.copy()
    wf[Filename[SampleRun]] = fname
    for name, value in quantities.items():
        wf[name] = value
    reflectivity_curves.append(wf.compute(NormalizedIofQ))
Can be done using something like:
reflectivity_curves = sciline.compute_mapped(wf.map(files), NormalizedIofQ)
provided that files is set up in a way compatible with map. I suggest using a pandas.DataFrame with column names as keys for Sciline.
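For illustration, a minimal sketch of what that could look like; the file names and the import path of the domain types are assumptions, not taken from the actual package layout:

```python
# Hypothetical sketch of the suggested mapping; file names and import path
# are illustrative only.
import pandas as pd
import sciline

from ess.reflectometry.types import Filename, NormalizedIofQ, SampleRun

# One row per measurement; the column name is the Sciline key to be set.
param_table = pd.DataFrame({Filename[SampleRun]: ['run_1.hdf', 'run_2.hdf']})

# `workflow` is assumed to be the base pipeline from the function under review.
reflectivity_curves = list(
    sciline.compute_mapped(workflow.map(param_table), NormalizedIofQ)
)
```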
Sure, I can use that here. But it seems to be slightly different from the current implementation. In the current implementation the runs that are mapped over don't all have to set the same parameters: one run can set the SampleRotation and another can set some other parameters. But if the parameters are defined in a table, then all "runs" (rows of that table) need to set the same parameters.
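For example, a hypothetical sketch of the current dict-based interface, where each run may override a different subset of parameters; the file names, the angle value, and the exact key spelling and import path are assumptions:

```python
# Illustrative only: each run may override a different subset of parameters;
# anything not listed falls back to the defaults of the base workflow
# (or to what is read from the file).
import scipp as sc

from ess.reflectometry.types import SampleRotation, SampleRun  # path assumed

files = {
    # this run overrides the sample rotation read from the file
    'low_angle.hdf': {SampleRotation[SampleRun]: sc.scalar(0.65, unit='deg')},
    # this run sets nothing extra and uses the defaults
    'high_angle.hdf': {},
}
```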
Do they have different parameters?
Yes, that's plausible. For example, in the reduction of data from the experiments we did at PSI one month ago, there were some special settings in the low-angle measurement; the high-angle measurement just used the default setting.
We could of course specify a value for that setting for both runs, giving the default value explicitly for the high-angle run, but that is more cumbersome and error-prone.
src/ess/reflectometry/tools.py
Outdated
datasets = []
for (fname, quantities), curve, scale_factor in zip(
    files.items(), reflectivity_curves, scale_factors, strict=True
):
    wf = workflow.copy()
    wf[Filename[SampleRun]] = fname
    for name, value in quantities.items():
        wf[name] = value
    wf[NormalizedIofQ] = scale_factor * curve
    dataset = wf.compute(orso.OrsoIofQDataset)
datasets = sciline.compute_mapped(wf.map(dataframe), orso.OrsoIofQDataset)
If you set up dataframe with a NormalizedIofQ column (I am not 100% sure we can map non-source nodes. Edit: no, we cannot; I think we should add __delitem__, as also discussed in scipp/cyclebane#14).
Not sure about find_corrections, but I presume the corrections are the same for all files, i.e., you can just run that once on the base workflow?
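A sketch of that idea, purely for illustration since, as noted in the edit above, mapping a non-source node is not currently supported; it assumes files, reflectivity_curves, scale_factors, workflow, and the orso import as defined in the function under review:

```python
# Illustrative only, not working code: the scaled curves would be provided as
# an extra column of the parameter table, so the orso datasets could be
# computed through the same mapping mechanism. This requires mapping a
# non-source node (NormalizedIofQ), which is not supported at the time of writing.
import pandas as pd
import sciline

dataframe = pd.DataFrame(
    {
        Filename[SampleRun]: list(files),
        NormalizedIofQ: [
            scale * curve
            for scale, curve in zip(scale_factors, reflectivity_curves, strict=True)
        ],
    }
)
datasets = list(
    sciline.compute_mapped(workflow.map(dataframe), orso.OrsoIofQDataset)
)
```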
but I presume they are the same for all files, i.e., you can just run that once on the base workflow?
They are probably the same, but I don't see the benefit of making that assumption here.
You are literally using the same workflow; how could it be different?
What corrections are applied could depend on the parameters of the workflow. Right now they don't, but I don't think that is an implausible situation. Besides, there's no cost to rerunning it; it takes 2 ms.
Seems quite implausible; I thought this was for handling measurements from the same "run" at multiple angles?
But if you are convinced that this is what you need, then I don't have any suggestions for improvement.
Seems quite implausible; I thought this was for handling measurements from the same "run" at multiple angles?
That's the main use case, yes. But it's wrong to say that I'm convinced this is what we need: I don't know, and therefore opted to keep it flexible, since the cost of doing so is negligible.
Missing tests.
    [normalized_ioq, orso_dataset], params={Filename[SampleRun]: 'default'}
)
datasets = orso_datasets_from_measurements(
    workflow,
    [{}, {Filename[SampleRun]: 'special'}],
You are not really testing what you claimed above was needed: different sets of parameters for different runs. Yes, you pass {} for the first, but since it relies on a default value, that is not saying much. For what you are testing here, you could have gone with the simpler version I had suggested, simply by passing the filename for each. As it is now, what orso_datasets_from_measurements will do for the first run is a mystery for the reader.
There seems to be a misunderstanding here. Maybe we should talk about this in person? If you have a simpler and better solution in mind, I'll be happy to use that.

You are not really testing what you claimed above was needed: different sets of parameters for different runs.

We want to override different parameters in different runs. For example, in one run we might want to override the SampleRotation parameter, but in another run we want to use the default (meaning it is read from the file).
The intent of this test is to assert (restated as code below):
- If no value was provided for a parameter (in this case Filename), is the default value used?
- If a value was provided, is that value used?
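To spell that out, a hypothetical restatement of the list passed to the function in this test:

```python
# Hypothetical restatement of the test input to orso_datasets_from_measurements.
runs = [
    {},                                # no override: the default Filename[SampleRun] ('default') is used
    {Filename[SampleRun]: 'special'},  # override: this run uses 'special' instead
]
```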
    scale_to_overlap=False,
)
assert len(datasets) == 2
assert tuple(d.info.name for d in datasets) == ('default.orso', 'special.orso')
Above you said orso_datasets_from_measurements should be able to perform and record possibly distinct corrections for each run. Here you are testing the same correction, but it retrieves a different value (the filename). Is that the same?
This doesn't test any aspect of the "correction finder" mechanism, so I'm not sure what you mean.
It only tests whether the provided parameter value was used, or whether the default parameter value was used.
Fixes #87
This solution is not so well integrated into the sciline pipeline.
I'd appreciate suggestions on how to do this better.