Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statsmodels: rename contrasts if they cannot be uniquely identified downstream #861

Merged
merged 12 commits into from
Jun 2, 2022

Conversation

adelavega
Copy link
Collaborator

@adelavega adelavega commented May 21, 2022

Addresses: bids-standard/stats-models#40

  • Implements renaming these contrasts to prepend the incoming contrast name
  • Refactored DummyContrasts to simply create Contrasts prior to processing Contrasts. That way, they are processed like any other contrast, and the logic for dealing w/ 1 and other special contrasts is not repeated.

This might be more controversial but previously the 1 construct would add an intercept named either "intercept" or (in the case of only having one incoming contrast to that Node) the contrast name. Instead, I propose we simply rename the Contrasts if it references the special 1 construct, but always call the intercept "intercept" in the design matrix itself.

I don't feel super strongly about this last bit, but it complicated the logic. Now simply we need to rename the Contrast if it references 1, and add an intercept called intercept if needed to X.

@codecov
Copy link

codecov bot commented May 21, 2022

Codecov Report

Merging #861 (50429e1) into master (f4bf70c) will increase coverage by 0.16%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master     #861      +/-   ##
==========================================
+ Coverage   86.06%   86.22%   +0.16%     
==========================================
  Files          35       35              
  Lines        3955     3943      -12     
  Branches     1022     1019       -3     
==========================================
- Hits         3404     3400       -4     
+ Misses        360      354       -6     
+ Partials      191      189       -2     
Impacted Files Coverage Δ
bids/modeling/statsmodels.py 85.86% <85.71%> (+2.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f4bf70c...50429e1. Read the comment docs.

@adelavega adelavega requested a review from effigies May 23, 2022 17:23
@@ -263,11 +264,11 @@ class BIDSStatsModelsNode:
overridden if one is passed when run() is called on a node.
"""

def __init__(self, level, name, transformations=None, model=None,
contrasts=None, dummy_contrasts=False, group_by=None):
def __init__(self, level, name, model, group_by, transformations=None,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made model and group_by mandatory, because they are mandatory in the model spec

# Loop over node level of this node
if self.level != "dataset":
group_by.append(self.level)
raise ValueError(f"group_by is not defined for Node: {name}")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be handling if group_by is None, because it should always be defined

var_names.append(int_name)
# If a single incoming contrast
if ('contrast' in df.columns and df['contrast'].nunique() == 1):
unique_in_contrast = df['contrast'].unique()[0]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this variable is only set if a Node iteration is grouped by contrast.

we later use this to rename contrasts with concatenation

@@ -53,14 +53,21 @@
"ConditionList": [1],
"Weights": [1],
"Test": "FEMA"
},
{
"Name": "neg",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified test model to add a user defined contrast (not using 1)

@@ -140,7 +152,7 @@ def test_entire_graph_smoketest(graph):
assert model_spec.X.shape == (2, 2)
assert model_spec.Z is None
assert len(model_spec.terms) == 2
assert not set(model_spec.terms.keys()) - {"RT", "gain", "RT:gain", "sex"}
assert not set(model_spec.terms.keys()) - {"intercept", "sex"}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

X now just had intercept not contrast name in DM

]

expected_outs = [
'gain', 'gain_neg', 'RT', 'RT_neg', 'RT:gain', 'RT:gain_neg'
Copy link
Collaborator Author

@adelavega adelavega May 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the expected outputs at the participant level w/ the concatenation I implemented.

For this model: bids/tests/data/ds005/models/ds-005_type-test_model.json

cc: @jmumford

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work and seems intuitive. This is a nice model spec to learn from. One thing, I wouldn't scale RT (no mean centering and no sd scaling) as it adds between-subject RT differences at the group level (I have been very slowly writing a paper about this).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'm happy to make that change.

I'm modifying this model as an example on the next chapter of the walkthrough.

First chapter is now live (https://bids-standard.github.io/stats-models/)-- feedback welcome

@adelavega adelavega requested a review from jmumford May 23, 2022 20:05
@adelavega
Copy link
Collaborator Author

@effigies any other comments? I'm going to make the change Jeannette suggested and merge otherwise

@adelavega adelavega merged commit 5a166c3 into master Jun 2, 2022
name = int_name

# Rename special 1 construct
condition_list = ['intercept' if i == 1 else i for i in condition_list]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is going to fail:

{
  ...
  "DummyContrasts": {"Test": "t"}
}

The reason is that instead of changing the 1 in the design matrix to match the column name, it's now going to be "intercept", and when you expand the dummy contrasts to proper contrasts, then the "name" will be "intercept" so we will never pick the contrast name back up. There's also nothing stopping somebody from explicitly creating a contrast named "intercept", so detecting this case will now be harder.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, you're right. I am passing the name of the unique variable to this function, so I can rename the "intercept" contrast the name of the unique contrast grouping variable in this special case. i think i meant to do that but forgot.

however, about catching an explicitly named "intercept" contrast, i have two suggestions:

  1. the user can deal w/ the consequences
  2. throw an error if any explicitly named contrast conflict w/ dummycontrasts.

the spec currently says that explicilt contrasts over-ride dummycontrasts, so 2) would be most consistent

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants