Status quo
As of now we have the following interface for the pseudo-outcome methods in the DR-Learner and R-Learner:
DR-Learner
metalearners/metalearners/drlearner.py
Lines 381 to 390 in d863df1
R-Learner
metalearners/metalearners/rlearner.py
Lines 469 to 479 in d863df1
Both kinds of pseudo-outcome require nuisance model estimates. Since these are visibly not provided as input arguments, they are computed inside the respective pseudo-outcome method.
Importantly, the pseudo-outcome methods are treatment-variant specific. Yet the nuisance estimates computed inside them are not:
In the case of the R-Learner, the overall outcome model $\hat{\mu}$ and the overall propensity model $\hat{e}$ are both applied to all data. Only after this estimation is the data filtered with respect to the treatment variant at hand:
metalearners/metalearners/rlearner.py
Lines 495 to 508 in d863df1
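To make the "estimate on everything, filter afterwards" pattern concrete, here is a minimal sketch of an R-Learner pseudo-outcome for one variant. It is not the library's code: the function name, the argument layout, the renormalisation of the propensity to the (control, variant) pair and the `eps` guard are all assumptions made for illustration.

```python
import numpy as np


def r_learner_pseudo_outcome_sketch(y, w, mu_hat, e_hat, variant, control=0, eps=1e-6):
    """Illustrative only (hypothetical helper, not the metalearners API).

    y:      outcomes, shape (n,)
    w:      integer treatment variant per row, shape (n,)
    mu_hat: overall outcome estimates for ALL rows, shape (n,)
    e_hat:  propensity estimates for ALL rows, shape (n, n_variants)
    """
    # Step 1: outcome residuals are computed for every row, whatever its variant.
    y_residual = y - mu_hat

    # Step 2: only now restrict to rows belonging to the control or the variant at hand.
    mask = (w == control) | (w == variant)

    # Assumption of this sketch: renormalise the propensity to the two variants compared.
    e_pair = e_hat[mask, variant] / (e_hat[mask, variant] + e_hat[mask, control])
    w_residual = (w[mask] == variant).astype(float) - e_pair

    # Keep treatment residuals away from zero before dividing, preserving their sign.
    sign = np.where(w_residual >= 0, 1.0, -1.0)
    w_residual = sign * np.maximum(np.abs(w_residual), eps)

    pseudo_outcomes = y_residual[mask] / w_residual
    weights = w_residual**2
    return pseudo_outcomes, weights, mask
```

Note that `mu_hat` and `e_hat` do not depend on `variant`; only the masking and the column selection do.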
In the case of the DR-Learner, the propensity $\hat{e}$ and all conditional average outcomes $\hat{\mu}_k$ are estimated for all data points; variant-specific filtering only happens thereafter:
metalearners/metalearners/drlearner.py
Lines 394 to 411 in d863df1
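For comparison, a minimal sketch of a doubly robust pseudo-outcome for variant $k$ against the control, written in the standard AIPW form. The function name and the array layout are hypothetical, and the library's actual implementation may differ, for instance in how propensities are clipped.

```python
import numpy as np


def dr_learner_pseudo_outcome_sketch(y, w, mu_hat, e_hat, variant, control=0):
    """Illustrative only (hypothetical helper, not the metalearners API).

    y:      outcomes, shape (n,)
    w:      integer treatment variant per row, shape (n,)
    mu_hat: conditional average outcome estimates, shape (n, n_variants)
    e_hat:  propensity estimates, shape (n, n_variants)
    """
    # The nuisance arrays cover all rows and all variants;
    # the variant-specific part is merely a column selection.
    mu_k, mu_0 = mu_hat[:, variant], mu_hat[:, control]
    e_k, e_0 = e_hat[:, variant], e_hat[:, control]

    is_k = (w == variant).astype(float)
    is_0 = (w == control).astype(float)

    # Plug-in difference plus inverse-propensity-weighted residual corrections.
    return mu_k - mu_0 + is_k * (y - mu_k) / e_k - is_0 * (y - mu_0) / e_0
```

Rows belonging to a third variant contribute only the plug-in term `mu_k - mu_0`, since both indicators are zero for them.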
Assessment
With $k>2$ treatment variants, the above approach causes needless effort: the same nuisance estimates are recomputed for every treatment variant that is not the 'control'.
Computational burden aside, it is not clear that having the pseudo-outcome methods do the estimation themselves makes for a better method interface. Wouldn't it feel more natural, and separate concerns better, if the pseudo-outcome methods merely defined the pseudo-outcome given the nuisance estimates, rather than estimating quantities themselves?
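The difference between the two interface shapes can be sketched with a toy model that counts its prediction calls. Everything here is hypothetical: `CountingPropensityModel` stands in for a fitted nuisance model, and the returned column is a placeholder for the actual pseudo-outcome formula.

```python
import numpy as np


class CountingPropensityModel:
    """Toy stand-in for a fitted nuisance model; counts predict_proba calls."""

    def __init__(self, n_variants):
        self.n_variants = n_variants
        self.calls = 0

    def predict_proba(self, X):
        self.calls += 1
        # Uniform propensities; the values are irrelevant for the call count.
        return np.full((len(X), self.n_variants), 1.0 / self.n_variants)


def pseudo_outcome_estimating(model, X, variant):
    # Status-quo shape: the method triggers the nuisance prediction itself.
    e_hat = model.predict_proba(X)
    return e_hat[:, variant]  # placeholder for the real pseudo-outcome formula


def pseudo_outcome_given(e_hat, variant):
    # Proposed shape: the method only consumes estimates handed to it.
    return e_hat[:, variant]  # placeholder for the real pseudo-outcome formula


def count_calls(n_variants, n_rows=5):
    X = np.zeros((n_rows, 2))

    status_quo = CountingPropensityModel(n_variants)
    for k in range(1, n_variants):  # every non-control variant
        pseudo_outcome_estimating(status_quo, X, k)

    decoupled = CountingPropensityModel(n_variants)
    e_hat = decoupled.predict_proba(X)  # estimated once, reused for all variants
    for k in range(1, n_variants):
        pseudo_outcome_given(e_hat, k)

    return status_quo.calls, decoupled.calls
```

With four variants, `count_calls(4)` reports three nuisance predictions for the status-quo shape and one for the decoupled shape, and the gap grows linearly with the number of variants.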