G_String Legacy is a software program for Generalizability analyses. Its graphical user interface guides data entry, and it computes Generalizability coefficients from variance component estimates generated by urGenova. It can also create synthetic data sets based on user-provided designs and variance components. It was designed and coded by Ralph Bloch in cooperation with Geoff Norman as part of a project originally commissioned by the Medical Council of Canada and subsequently developed further. Most recently, G_String Legacy has moved to GitHub to build a community around the software.
G_String is written in Java 8 on the Linux platform. It can run on Macintosh or PC computers under Windows, Linux, or macOS using the Java Runtime Environment.
urGenova was written by R.L. Brennan at the University of Iowa. It is included in the installation package for G_String Legacy and operates in the background. It is a traditional command-line program written in ANSI C. Because urGenova has no graphical user interface, users must specify their parameters in a somewhat cryptic control file. urGenova also has difficulties with current long directory and file names. G_String takes care of that. While urGenova provides the variance components for the individual effects, it does not calculate Generalizability coefficients under different conditions; G_String does that as well.
Generalizability theory (G theory) is complex, and readers interested in a deeper understanding should consult the resources listed in the Bibliography. Here we explain only a few essential terms, and only superficially. For novices in G theory the YouTube video provides a good eye-opener, and the AMEE Guide #68 may be particularly helpful.
The purpose of a generalizability study is to estimate the reliability of a specific behavioral measurement with the intent of generalizing its findings, and to identify potential sources of measurement error. Behavioral measurements require defined methods to elicit and collect responses from subjects of interest to a set of stimuli such as statements on a questionnaire, observations of performance, or almost any psychometric measurement. The responses will typically vary widely, depending on the individual subject, as well as other factors such as the rater, the specific question, or other sources of error. G theory is a statistical strategy to identify and quantify these various sources of error.
The variability can be expressed mathematically as a variance, which generalizability analysis subdivides into a variance component attributable to the subjects and variance components attributable to the various sources of error in the measurement process.
The variance component attributed to the subjects is designated '$\sigma^{2}(\tau)$', while the variance component resulting from all the other aspects of the measurement is referred to as '$\sigma^{2}(\delta)$' or '$\sigma^{2}(\Delta)$', depending on the intended interpretation. The '$\delta$'-component is called relative or norm-referenced; the '$\Delta$'-component is called absolute or criterion-referenced.
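The ratio between the subject variance component and the relative total variance, '$E\rho^{2}$', is called the ‘generalizability coefficient’, while '$\Phi$', the ratio between the subject variance component and the absolute total variance, is called the ‘index of dependability’:

$$E\rho^{2} = \frac{\sigma^{2}(\tau)}{\sigma^{2}(\tau) + \sigma^{2}(\delta)}$$

and

$$\Phi = \frac{\sigma^{2}(\tau)}{\sigma^{2}(\tau) + \sigma^{2}(\Delta)}$$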
The individual aspects of the measurement design are customarily referred to as 'facets'. The facet corresponding to the subjects under investigation is referred to as the 'facet of differentiation'; the other aspects are called 'facets of generalization'. If the facet of differentiation is nested in some other facet (for example, students within classrooms), the nesting facet (classroom) is called a 'facet of stratification'.
'$\sigma^{2}(\tau)$', '$\sigma^{2}(\delta )$', '$\sigma^{2}(\Delta )$', '$E\rho^{2}$', and '$\Phi$' are calculated by G_String from the variance components corresponding to the various facets and their appropriate combinations, provided automatically by Brennan's urGenova contained in G_String Legacy. The actual calculations are detailed in the printout.
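To make the ratios described above concrete, the following is a minimal sketch for the simplest textbook case: a fully crossed design with persons (p) as the facet of differentiation and items (i) as the only facet of generalization. The class name, method names, and numeric values are hypothetical and chosen for illustration only; this is not G_String's own code, and more complex designs combine the variance components differently.

```java
/**
 * Illustrative sketch (not G_String's implementation): coefficients for an
 * assumed fully crossed persons x items (p x i) design with n_i items.
 */
final class GCoefficientsSketch {

    /** Relative error variance: sigma^2(delta) = sigma^2(pi) / n_i. */
    static double relativeError(double varPI, int nItems) {
        return varPI / nItems;
    }

    /** Absolute error variance: sigma^2(Delta) = (sigma^2(i) + sigma^2(pi)) / n_i. */
    static double absoluteError(double varI, double varPI, int nItems) {
        return (varI + varPI) / nItems;
    }

    /** Generalizability coefficient: sigma^2(tau) / (sigma^2(tau) + sigma^2(delta)). */
    static double eRho2(double varTau, double varDelta) {
        return varTau / (varTau + varDelta);
    }

    /** Index of dependability: sigma^2(tau) / (sigma^2(tau) + sigma^2(Delta)). */
    static double phi(double varTau, double varCapDelta) {
        return varTau / (varTau + varCapDelta);
    }

    public static void main(String[] args) {
        // Hypothetical variance components, chosen only for illustration.
        double varP = 0.50;   // persons: sigma^2(tau) in this design
        double varI = 0.10;   // items
        double varPI = 0.40;  // person-by-item interaction, confounded with residual
        int nItems = 8;

        double delta = relativeError(varPI, nItems);
        double capDelta = absoluteError(varI, varPI, nItems);
        System.out.println("E rho^2 = " + eRho2(varP, delta));
        System.out.println("Phi     = " + phi(varP, capDelta));
    }
}
```

Because the absolute error variance includes the item component in addition to the interaction component, the index of dependability in this design can never exceed the generalizability coefficient.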
G theory is based on the linear model of statistics, as used in the analysis of variance. The analysis estimates the variance components in a data set collected under a specific experimental design.
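As an illustration of what estimating variance components means in the simplest case, the sketch below applies the textbook expected-mean-squares method to a balanced, fully crossed persons x items score matrix with one observation per cell. This is an assumption-laden example, not urGenova's algorithm: the class and method names are hypothetical, and designs with nesting or unbalanced data need the more general procedures that urGenova provides.

```java
/**
 * Illustrative sketch (not urGenova's code): ANOVA-based variance component
 * estimates for a balanced, fully crossed persons x items design.
 */
final class VarianceComponentsSketch {

    /** Returns { var(p), var(i), var(pi,e) } for scores x[person][item]. */
    static double[] estimate(double[][] x) {
        int nP = x.length;
        int nI = x[0].length;
        double[] pMean = new double[nP];
        double[] iMean = new double[nI];
        double grand = 0.0;
        for (int p = 0; p < nP; p++) {
            for (int i = 0; i < nI; i++) {
                pMean[p] += x[p][i] / nI;
                iMean[i] += x[p][i] / nP;
                grand += x[p][i] / (nP * nI);
            }
        }

        // Sums of squares for persons, items, and the residual (pi,e).
        double ssP = 0.0, ssI = 0.0, ssRes = 0.0;
        for (int p = 0; p < nP; p++) {
            ssP += nI * (pMean[p] - grand) * (pMean[p] - grand);
        }
        for (int i = 0; i < nI; i++) {
            ssI += nP * (iMean[i] - grand) * (iMean[i] - grand);
        }
        for (int p = 0; p < nP; p++) {
            for (int i = 0; i < nI; i++) {
                double r = x[p][i] - pMean[p] - iMean[i] + grand;
                ssRes += r * r;
            }
        }

        double msP = ssP / (nP - 1);
        double msI = ssI / (nI - 1);
        double msRes = ssRes / ((nP - 1) * (nI - 1));

        // Solve the expected-mean-square equations. Estimates can come out
        // negative in small samples; they are returned here without truncation.
        return new double[] { (msP - msRes) / nI, (msI - msRes) / nP, msRes };
    }

    public static void main(String[] args) {
        double[][] scores = { { 1, 2 }, { 3, 4 }, { 5, 6 } }; // toy data
        double[] vc = estimate(scores);
        System.out.println("var(p)=" + vc[0] + " var(i)=" + vc[1] + " var(pi,e)=" + vc[2]);
    }
}
```

The sketch requires at least two persons and two items, since the mean squares divide by the degrees of freedom.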
G_String also allows the researcher to construct synthetic data sets that would result from a specified design, assuming postulated variance components. Being able to create synthetic data sets is not only useful in teaching G analysis; it also allows for experimentation, thus promoting a deeper understanding of G theory.
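The idea behind synthetic data can be sketched for the simplest case. The example below assumes a fully crossed persons x items design with independent, normally distributed effects, and builds each score as a grand mean plus a person effect, an item effect, and a residual. It is a hypothetical illustration only: the class name, parameters, and generation scheme are assumptions, not G_String's actual generator, and it does not reproduce the signature that G_String embeds in its synthetic data.

```java
/**
 * Illustrative sketch (not G_String's generator): synthetic scores for an
 * assumed fully crossed persons x items design with normal random effects.
 */
final class SyntheticDataSketch {

    /** Returns scores x[person][item] drawn from the postulated variance components. */
    static double[][] generate(int nPersons, int nItems, double grandMean,
                               double varP, double varI, double varPI, long seed) {
        java.util.Random rng = new java.util.Random(seed); // fixed seed: reproducible

        // One random effect per person and per item.
        double[] pEffect = new double[nPersons];
        double[] iEffect = new double[nItems];
        for (int p = 0; p < nPersons; p++) {
            pEffect[p] = Math.sqrt(varP) * rng.nextGaussian();
        }
        for (int i = 0; i < nItems; i++) {
            iEffect[i] = Math.sqrt(varI) * rng.nextGaussian();
        }

        // Each cell adds its own interaction/residual term.
        double[][] x = new double[nPersons][nItems];
        for (int p = 0; p < nPersons; p++) {
            for (int i = 0; i < nItems; i++) {
                double residual = Math.sqrt(varPI) * rng.nextGaussian();
                x[p][i] = grandMean + pEffect[p] + iEffect[i] + residual;
            }
        }
        return x;
    }

    public static void main(String[] args) {
        // Hypothetical design: 5 persons, 4 items, postulated variance components.
        double[][] scores = generate(5, 4, 10.0, 1.0, 0.5, 0.25, 42L);
        for (double[] row : scores) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Running an analysis on data generated this way and comparing the estimated variance components with the postulated ones is one way to experiment with how sample size affects the precision of the estimates.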
Users should be warned not to misrepresent synthetic data sets as empirical data. G_String synthetic data sets carry a pseudo-random signature that allows detection of their synthetic nature by statistical methods.