Releases: gavinsimpson/gratia
Version 0.10.0 released and on CRAN
Earlier this evening I wrapped up the source tarball for version 0.10.0 of gratia
and submitted it to CRAN. Following automated checks, this new version of
gratia is now available from CRAN 🥳
This is a small release of gratia to coincide with the final stages of the review
process of a Journal of Open Source Software paper on gratia that I submitted
earlier in the year. Apart from a slew of bug fixes, the main new feature in this
release is `conditional_values()`, a tidy/ggplot version of `mgcv::vis.gam()`
that is based on marginaleffects `plot_predictions()`. `conditional_values()` is
intended as a user friendly way to visualise predicted values of the model that
are conditional on supplied values of covariates. For more complex GAMs, such
visualisations are an essential way to understand and interpret the fitted model.
New features
- `conditional_values()` and its `draw()` method compute and plot predictions
  from a fitted GAM that are conditional on one or more covariates. The
  function is a wrapper around `fitted_values()` but allows the user simple ways
  to specify which covariates to condition on and what values those
  covariates should take. It provides similar functionality to
  `marginaleffects::plot_predictions()`, but is simpler. See #300.
- `penalty()` and `basis()` can now allow the smooth to be reparameterized such
  that the resulting basis has an identity matrix. This more clearly highlights
  the penalty null space, the functions that the penalty has no effect on.
- `draw.gam()` and `draw.smooth_estimates()` gain argument `caption`, which, if
  set to `FALSE`, will not plot the smooth basis type as a caption on the plot.
  #307
- `appraise()` and `qq_plot.gam()` now allow the user to set a random seed that
  is used when generating reference quantiles with `method = "uniform"` or
  `method = "simulate"`.
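As a rough illustration of the features listed above, the sketch below fits a small GAM to simulated data and then calls `conditional_values()`, `draw()` with `caption = FALSE`, and `appraise()` with a seed. The simulated data set, the model formula, and the object names are illustrative choices only, and the seed argument is assumed to be named `seed`.

```r
library("mgcv")
library("gratia")

# Illustrative data: the four-covariate "eg1" example simulated by gratia
df <- data_sim("eg1", n = 400, seed = 42)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Predictions conditional on x2; the other covariates are held fixed
cv <- conditional_values(m, condition = "x2")
p_cv <- draw(cv)

# Partial effect plots without the basis-type caption
p_smooths <- draw(m, caption = FALSE)

# Diagnostics with reproducible simulated reference quantiles
# (assumes the new argument is called `seed`)
p_diag <- appraise(m, method = "simulate", seed = 42)
```

Printing `p_cv`, `p_smooths`, or `p_diag` draws the respective figure.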
Bug fixes
- `derivative_samples()` was ignoring the `scale` argument. #293 Reported by
  @jonathonmellor
- Argument `level` to `derivative_samples()` was included accidentally. As of
  v0.9.2.9002 this argument is deprecated and using it will now generate a
  warning. #291
- `draw()` was not plotting cyclic P spline smooths. Reported by @Zuckerbrot
  #297
- `derivatives()` would fail for `"fs"` smooths with other parametric effects in
  the model. Reported by @mahonmb #301
- Partial residuals in `partial_residuals()` and `draw.gam()` were wrong for
  GAMs fitted with `family = binomial()` where the `weights` argument contained
  the binomial sample sizes because the prior weights were being used to form
  weighted working residuals. Now working weights are used instead. Reported by
  @emchuron #273
- Internal function `gammals_link()` was expecting `"theta"` as a synonym for
  the scale parameter but the master table has `"phi"` coded as the synonym.
  Now both work as expected.
- `level()` assumed that `level` would have only a single value even though it
  could handle multiple levels. #321
Version 0.9.2 of gratia now on CRAN
Version 0.9.2 released to CRAN, June 25, 2024
This patch release is largely motivated to fix a few bugs that came to light recently as I was teaching my GAM course for Physalia and preparing a paper for submission to the Journal of Open Source Software. Version 0.9.1 was never released (submission was rejected by CRAN as the package vignettes took the package over the 5Mb limit and CRAN finally said "Nope").
The entries below summarise the changes in this version of gratia. Nothing major here, but I have started building in support for location, scale, shape families in `fitted_samples()`, although currently only the location parameter of those models is supported.
Breaking changes
`parametric_effects()` slightly escaped the great renaming that happened for
0.9.0. Columns `type` and `term` did not gain a prefix `.`. This is now
rectified and these two columns are now `.type` and `.term`.
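A quick way to see the renamed columns is to inspect the names of the object returned by `parametric_effects()`. The sketch below is only illustrative: the simulated data set (`"eg4"`, which includes a factor `fac`) and the model formula are assumptions made for the example.

```r
library("mgcv")
library("gratia")

# Illustrative data with a factor covariate `fac`
df <- data_sim("eg4", n = 400, seed = 42)
m <- gam(y ~ fac + s(x2), data = df, method = "REML")

# Evaluate the parametric (here, factor) effects
pe <- parametric_effects(m)

# The term and type columns now carry the `.` prefix
names(pe)
```

Code written against earlier versions that refers to `pe$term` or `pe$type` would need updating to `pe$.term` and `pe$.type`.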
User visible changes
- Plots of random effects are now labelled with their smooth label. Previously,
  the title was taken from the variable involved in the smooth, but this doesn't
  work for terms like `s(subject, continuous_var, bs = "re")` for random slopes,
  which previously would have the title `"subject"`. Now such terms will have
  title `"s(subject,continuous_var)"`. Simple random intercept terms,
  `s(subject, bs = "re")`, are now titled `"s(subject)"`. #287
- The vignettes `custom-plotting.Rmd` and `posterior-simulation.Rmd`
  were moved to `vignettes/articles` and thus are no longer available as package
  vignettes. Instead, they are accessible as Articles through the package
  website: https://gavinsimpson.github.io/gratia/
New features
`fitted_samples()` now works for `gam()` models with multiple linear
predictors, but currently only the location parameter is supported. The
parameter is indicated through a new variable `.parameter` in the returned
object.
Bug fixes
- `partial_residuals()` was computing partial residuals from the deviance
  residuals. For compatibility with `mgcv::plot.gam()`, partial residuals are
  now computed from the working residuals. Reported by @wStockhausen #273
- `appraise()` was not passing the `ci_col` argument on to `qq_plot()` and
  `worm_plot()`. Reported by Sate Ahmed.
- Couldn't pass `mvn_method` on to posterior sampling functions from user facing
  functions `fitted_samples()`, `posterior_samples()`, `smooth_samples()`,
  `derivative_samples()`, and `response_derivatives()`. Reported by @stefgehrig
  #279
- `fitted_values()` works again for quantile GAMs fitted by `qgam()`.
- `confint.gam()` was not applying `shift` to the estimate and upper and lower
  interval. #280 reported by @TIMAVID & @rbentham
- `parametric_effects()` and `draw.parametric_effects()` would forget about the
  levels of factors (intentionally), but this would lead to problems with
  ordered factors where the ordering of levels was not preserved. Now,
  `parametric_effects()` returns a named list of factor levels as attribute
  `"factor_levels"` containing the required information and the order of levels
  is preserved when plotting. #284 Reported by @mhpob
- `parametric_effects()` would fail if there were parametric terms in the model
  but they were all interaction terms (which we don't currently handle). #282
gratia 0.9.0
Breaking changes
- Many functions now return objects with different named variables. In order to
  avoid clashes with variable names used in user's models or data, a period
  (`.`) is now being used as a prefix for generated variable names. The
  functions whose returned variable names have changed are: `smooth_estimates()`,
  `fitted_values()`, `fitted_samples()`, `posterior_samples()`, `derivatives()`,
  `partial_derivatives()`, and `derivative_samples()`. In addition,
  `add_confint()` also adds newly-named variables.

  1. `est` is now `.estimate`,
  2. `lower` and `upper` are now `.lower_ci` and `.upper_ci`,
  3. `draw` and `row` are now `.draw` and `.row` respectively,
  4. `fitted`, `se`, `crit` are now `.fitted`, `.se`, `.crit`, respectively
  5. `smooth`, `by`, and `type` in `smooth_estimates()` are now `.smooth`,
     `.by`, `.type`, respectively.
- `derivatives()` and `partial_derivatives()` now work more like
  `smooth_estimates()`; in place of the `var` and `data` columns, gratia now
  stores the data variables at which the derivatives were evaluated as columns
  in the object with their actual variable names.
- The way spline-on-the-sphere (SOS) smooths (`bs = "sos"`) are plotted has
  changed to use `ggplot2::coord_sf()` instead of the previously-used
  `ggplot2::coord_map()`. This change has been made as a result of
  `coord_map()` being soft-deprecated ("superseded") for a few minor versions of
  ggplot2 by now already, and changes to the guides system in version 3.5.0 of
  ggplot2.

  The axes on plots created with `coord_map()` never really worked
  correctly and changing the angle of the tick labels never worked. As
  `coord_map()` is superseded, it didn't receive the updates to the guides
  system and, as a side effect of these changes, the code that plotted SOS
  smooths was producing a warning with the release of ggplot2 version 3.5.0.

  The projection settings used to draw SOS smooths were previously controlled
  via arguments `projection` and `orientation`. These arguments do not affect
  `ggplot2::coord_sf()`. Instead the projection used is controlled through new
  argument `crs`, which takes a PROJ string detailing the projection to use or
  an integer that refers to a known coordinate reference system (CRS). The
  default projection used is `+proj=ortho +lat_0=20 +lon_0=XX` where `XX` is the
  mean of the longitude coordinates of the data points.
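To make the renaming concrete, the following sketch evaluates the smooths of a small model and lists the column names of the returned objects. The simulated data and the model are illustrative assumptions; only the column names are taken from the list above.

```r
library("mgcv")
library("gratia")

# Illustrative model fitted to simulated data
df <- data_sim("eg1", n = 400, seed = 42)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Smooth estimates with a credible interval added
sm <- smooth_estimates(m) |> add_confint()
names(sm) # includes .smooth, .type, .by, .estimate, .se, .lower_ci, .upper_ci

# Fitted values at the observed data
fv <- fitted_values(m)
names(fv) # includes .fitted, .se, .lower_ci, .upper_ci
```

Scripts written for versions before 0.9.0 that select columns such as `est` or `lower` would need to switch to the prefixed names.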
Defunct and deprecated functions and arguments
Defunct
`evaluate_smooth()` was deprecated in gratia version 0.7.0. This function and
all its methods have been removed from the package. Use `smooth_estimates()`
instead.
Deprecated functions
The following functions were deprecated in version 0.9.0 of gratia. They will
eventually be removed from the package as part of a clean up ahead of an
eventual 1.0.0 release. These functions will become defunct by version 0.11.0 or
1.0.0, whichever is released soonest.
- `evaluate_parametric_term()` has been deprecated. Use `parametric_effects()`
  instead.
- `datagen()` has been deprecated. It never really did what it was originally
  designed to do, and has been replaced by `data_slice()`.
Deprecated arguments
To make functions in the package more consistent, the arguments `select`,
`term`, and `smooth` are all used for the same thing and hence the latter two
have been deprecated in favour of `select`. If a deprecated argument is used, a
warning will be issued but the value assigned to the argument will be assigned
to `select` and the function will continue.
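A minimal sketch of the preferred `select` argument is shown below, using `derivatives()` and `draw()` on an illustrative model. The data set, formula, and object names are assumptions made for the example.

```r
library("mgcv")
library("gratia")

# Illustrative model fitted to simulated data
df <- data_sim("eg1", n = 400, seed = 42)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Use `select` (not the deprecated `term` or `smooth`) to pick a smooth
d <- derivatives(m, select = "s(x1)")
unique(d$.smooth)

# The same argument selects which smooths to plot
p <- draw(m, select = c("s(x0)", "s(x2)"))
```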
User visible changes
- `smooth_samples()` now uses a single call to the RNG to generate draws from
  the posterior of smooths. Previous to version 0.9.0, `smooth_samples()` would
  do a separate call to `mvnfast::rmvn()` for each smooth. As a result, the
  result of a call to `smooth_samples()` on a model with multiple smooths will
  now produce different results to those generated previously. To regain the
  old behaviour, add `rng_per_smooth = TRUE` to the `smooth_samples()` call.

  Note, however, that using per-smooth RNG calls with `method = "mh"` will be
  very inefficient as, with that method, posterior draws for all coefficients
  in the model are sampled at once. So, only use `rng_per_smooth = TRUE` with
  `method = "gaussian"`.
- The output of `smooth_estimates()` and its `draw()` method have changed for
  tensor product smooths that involve one or more 2D marginal smooths. Now,
  if no covariate values are supplied via the `data` argument,
  `smooth_estimates()` identifies if one of the marginals is a 2d surface and
  allows the covariates involved in that surface to vary fastest, ahead of terms
  in other marginals. This change has been made as it provides a better default
  when nothing is provided to `data`.

  This also affects `draw.gam()`.
- `fitted_values()` now has some level of support for location, scale, shape
  families. Supported families are `mgcv::gaulss()`, `mgcv::gammals()`,
  `mgcv::gumbls()`, `mgcv::gevlss()`, `mgcv::shash()`, `mgcv::twlss()`, and
  `mgcv::ziplss()`.
- gratia now requires dplyr versions >= 1.1.0 and tidyselect >= 1.2.0.
- A new vignette Posterior Simulation is available, which describes how to
  do posterior simulation from fitted GAMs using {gratia}.
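The change to how `smooth_samples()` calls the RNG can be sketched as follows: the same seed is used twice, once with the new single-call default and once with `rng_per_smooth = TRUE` to request the pre-0.9.0 behaviour. The model and data here are illustrative assumptions.

```r
library("mgcv")
library("gratia")

# Illustrative model with several smooths
df <- data_sim("eg1", n = 400, seed = 42)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# New default: one RNG call generates the draws for all smooths
s_new <- smooth_samples(m, n = 5, seed = 1)

# Old behaviour: a separate RNG call per smooth (Gaussian approximation only)
s_old <- smooth_samples(m, n = 5, seed = 1, rng_per_smooth = TRUE)
```

Both objects have the same shape, but for a model with several smooths the drawn values are expected to differ between the two calls even though the seed is the same.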
New features
- Soap film smooths using basis `bs = "so"` are now handled by `draw()`,
  `smooth_estimates()` etc. #8
- `response_derivatives()` is a new function for computing derivatives of the
  response with respect to a (continuous) focal variable. First or second
  order derivatives can be computed using forward, backward, or central
  finite differences. The uncertainty in the estimated derivative is determined
  using posterior sampling via `fitted_samples()`, and hence can be derived
  from a Gaussian approximation to the posterior or using a Metropolis Hastings
  sampler (see below.)
- `derivative_samples()` is the workhorse function behind
  `response_derivatives()`, which computes and returns posterior draws of the
  derivatives of any additive combination of model terms. Requested by
  @jonathanmellor #237
- `data_sim()` can now simulate response data from gamma, Tweedie and ordered
  categorical distributions.
- `data_sim()` gains two new example models `"gwf2"`, simulating data only from
  Gu & Wahba's f2 function, and `"lwf6"`, example function 6 from Luo & Wahba
  (1997 JASA 92(437), 107-116).
- `data_sim()` can also simulate data for use with GAMs fitted using
  `family = gfam()` for grouped families where different types of data in
  the response are handled. #266 and part of #265
- `fitted_samples()` and `smooth_samples()` can now use the Metropolis Hastings
  sampler from `mgcv::gam.mh()`, instead of a Gaussian approximation, to sample
  from the posterior distribution of the model or specific smooths
  respectively.
- `posterior_samples()` is a new function in the family of `fitted_samples()`
  and `smooth_samples()`. `posterior_samples()` returns draws from the
  posterior distribution of the response, combining the uncertainty in the
  estimated expected value of the response and the dispersion of the response
  distribution. The difference between `posterior_samples()` and
  `predicted_samples()` is that the latter only includes variation due to
  drawing samples from the conditional distribution of the response (the
  uncertainty in the expected values is ignored), while the former includes
  both sources of uncertainty.
- `fitted_samples()` can now use a matrix of user-supplied posterior draws.
  Related to #120
- `add_fitted_samples()`, `add_predicted_samples()`, `add_posterior_samples()`,
  and `add_smooth_samples()` are new utility functions that add the respective
  draws from the posterior distribution to an existing data object for the
  covariate values in that object: `obj |> add_posterior_samples(model)`. #50
- `basis_size()` is a new function to extract the basis dimension (number of
  basis functions) for smooths. Methods are available for objects that inherit
  from classes `"gam"`, `"gamm"`, and `"mgcv.smooth"` (for individual smooths).
- `data_slice()` gains a method for data frames and tibbles.
- `typical_values()` gains a method for data frames and tibbles.
- `fitted_values()` now works with models fitted using the `mgcv::ocat()`
  family. The predicted probability for each category is returned, alongside a
  Wald interval created using the standard error (SE) of the estimated
  probability. The SE and estimated probabilities are transformed to the logit
  (linear predictor) scale, a Wald credible interval is formed, which is then
  back-transformed to the response (probability) scale.
- `fitted_values()` now works for GAMMs fitted using `mgcv::gamm()`. Fitted
  (predicted) values only use the GAM part of the model, and thus exclude the
  random effects.
- `link()` and `inv_link()` work for models fitted using the `cnorm()` family.
- A worm plot can now be drawn in place of the QQ plot with `appraise()` via
  new argument `use_worm = TRUE`. #62
- `smooths()` now works for models fitted with `mgcv::gamm()`.
- `overview()` now returns the basis dimension for each smooth and gains an
  argument `stars` which, if `TRUE`, adds significance stars to the output plus
  a legend is printed in the tibble footer. Part of wish of @noamross #214
- New `add_constant()` and `transform_fun()` methods for `smooth_samples()`.
- `evenly()` gains arguments `lower` and `upper` to modify the lower and / or
  upper bound of the interval over which evenly spaced values will be generated.
- `add_sizer()` is a new function to add information on whether the derivative
  of a smooth is significantly changing (where the credible interval excludes
  0). Currently, methods for `derivatives()` and `smooth_estimates()` objects
  are implemented. Part of request of @asanders11 #117
- `draw.derivatives()` gains arguments `add_change` and `change_type` to ...
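The distinction drawn above between `fitted_samples()`, `predicted_samples()`, and `posterior_samples()` may be easier to see side by side. The sketch below draws from each for a small data slice; the model, the slice, and the number of draws are illustrative assumptions.

```r
library("mgcv")
library("gratia")

# Illustrative model fitted to simulated data
df <- data_sim("eg1", n = 400, seed = 42)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Covariate values at which to sample: x2 varies, the rest are held fixed
ds <- data_slice(m, x2 = evenly(x2, n = 50))

# Uncertainty in the expected value of the response only
fs <- fitted_samples(m, data = ds, n = 20, seed = 2)

# Sampling variation of the response only (expected values treated as known)
ps <- predicted_samples(m, data = ds, n = 20, seed = 2)

# Both sources of uncertainty combined
pos <- posterior_samples(m, data = ds, n = 20, seed = 2)
```

Each object identifies the posterior draw in its `.draw` column, so the three sets of draws can be summarised or plotted in the same way.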
gratia version 0.8.1 on CRAN
Version 0.8.1 of gratia is on CRAN. Version 0.8.0 was not released due to changes necessitated for the 1.1.0 release of dplyr. The full list of changes in the 0.8.0 and 0.8.1 versions is given below.
gratia 0.8.1
User visible changes
- `smooth_samples()` now returns objects with variables involved in smooths
  that have their correct name. Previously variables were named `.x1`, `.x2`,
  etc. Fixing #126 and improving compatibility with `compare_smooths()` and
  `smooth_estimates()` allowed the variables to be named correctly.
- gratia now depends on version 1.8-41 or later of the mgcv package.
New features
`draw.gam()` can now handle tensor products that include a marginal random
effect smooth. Beware plotting such smooths if there are many levels,
however, as a separate surface plot will be produced for each level.
Bug fixes
- Additional fixes for changes in dplyr 1.1.0.
- `smooth_samples()` now works when sampling from posteriors of multiple smooths
  with different dimension. #126 reported by @Aariq
gratia 0.8.0
User visible changes
- {gratia} now depends on R version 4.1 or later.
- A new vignette "Data slices" is supplied with {gratia}.
- Functions in {gratia} have been harmonised to use an argument named `data`
  instead of `newdata` for passing new data at which to evaluate features of
  smooths. A message will be printed if `newdata` is used from now on. Existing
  code does not need to be changed as `data` takes its value from `newdata`.

  Note that due to the way `...` is handled in R, if your R script uses the
  `data` argument, and is run with versions of gratia prior to 0.8.0 (when
  released; 0.7.3.8 if using the development version) the user-supplied data
  will be silently ignored. As such, scripts using `data` should check that the
  installed version of gratia is >= 0.8 and package developers should update
  to depend on versions >= 0.8 by using `gratia (>= 0.8)` in `DESCRIPTION`.
- The order of the plots of smooths has changed in `draw.gam()` so that they
  again match the order in which smooths were specified in the model formula.
  See Bug Fixes below for more detail or #154.
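One way a script might guard against the silent-ignore problem described above is to check the installed version before using `data`. This is only a sketch; the variable name `gratia_ok` and the error message are illustrative.

```r
# Check that the installed gratia understands the `data` argument
gratia_ok <- utils::packageVersion("gratia") >= "0.8.0"

if (!gratia_ok) {
  stop("This script passes `data` to gratia functions; ",
       "please install gratia >= 0.8.0")
}
```

Package authors would instead declare the minimum version in `DESCRIPTION`, as noted in the entry above.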
New features
- Added basic support for GAMLSS (distributional GAMs) fitted with the
  `gamlss()` function from package GJRM. Support is currently restricted to a
  `draw()` method.
- `difference_smooths()` can now include the group means in the difference,
  which many users expected. To include the group means use `group_means = TRUE`
  in the function call, e.g.
  `difference_smooths(model, smooth = "s(x)", group_means = TRUE)`. Note: this
  function still differs from `plot_diff()` in package itsadug, which
  essentially computes differences of model predictions. The main practical
  difference is that other effects beyond the factor by smooth, including random
  effects, may be included with `plot_diff()`.

  This implements the main wish of #108 (@dinga92) and #143 (@mbolyanatz)
  despite my protestations that this was complicated in some cases (it isn't;
  the complexity just cancels out.)
- `data_slice()` has been totally revised. Now, the user provides the values for
  the variables they want in the slice and any variables in the model that are
  not specified will be held at typical values (i.e. the value of the
  observation that is closest to the median for numeric variables, or the modal
  factor level.)

  Data slices are now produced by passing `name` = `value` pairs for the
  variables and their values that you want to appear in the slice. For example

      m <- gam(y ~ s(x1) + x2 + fac)
      data_slice(m, x1 = evenly(x1, n = 100), x2 = mean(x2))

  The `value` in the pair can be an expression that will be looked up
  (evaluated) in the `data` argument or the model frame of the fitted model
  (the default). In the above example, the resulting slice will be a data frame
  of 100 observations, comprising `x1`, which is a vector of 100 values spread
  evenly over the range of `x1`, a constant value of the mean of `x2` for the
  `x2` variable, and a constant factor level, the modal class of `fac`, for the
  `fac` variable of the model.
- `partial_derivatives()` is a new function for computing partial derivatives
  of multivariate smooths (e.g. `s(x,z)`, `te(x,z)`) with respect to one of
  the margins of the smooth. Multivariate smooths of any dimension are handled,
  but only one of the dimensions is allowed to vary. Partial derivatives are
  estimated using the method of finite differences, with forward, backward,
  and central finite differences available. Requested by @noamross #101
- `overview()` provides a simple overview of model terms for fitted GAMs.
- The new `bs = "sz"` basis that was released with mgcv version 1.8-41 is
  now supported in `smooth_estimates()`, `draw.gam()`, and
  `draw.smooth_estimates()` and this basis has its own unique plotting method.
  #202
- `basis()` now has a method for fitted GAM(M)s which can extract the estimated
  basis from the model and plot it, using the estimated coefficients for the
  smooth to weight the basis. #137

  There is also a new `draw.basis()` method for plotting the results of a call
  to `basis()`. This method can now also handle bivariate bases.

  `tidy_basis()` is a lower level function that does the heavy lifting in
  `basis()`, and is now exported. `tidy_basis()` returns a tidy representation
  of a basis supplied as an object inheriting from class `"mgcv.smooth"`. These
  objects are returned in the `$smooth` component of a fitted GAM(M) model.
- `lp_matrix()` is a new utility function to quickly return the linear predictor
  matrix for an estimated model. It is a wrapper to
  `predict(..., type = "lpmatrix")`
- `evenly()` is a synonym for `seq_min_max()` and is preferred going forward.
  Gains argument `by` to produce sequences over a covariate that increment in
  units of `by`.
- `ref_level()` and `level()` are new utility functions for extracting the
  reference or a specific level of a factor respectively. These will be most
  useful when specifying covariate values to condition on in a data slice.
- `model_vars()` is a new, public facing way of returning a vector of variables
  that are used in a model.
- `difference_smooths()` will now use the user-supplied data as points at
  which to evaluate a pair of smooths. Also note that the argument `newdata` has
  been renamed `data`. #175
- The `draw()` method for `difference_smooths()` now uses better labels for
  plot titles to avoid long labels with even modest factor levels.
- `derivatives()` now works for factor-smooth interaction (`"fs"`) smooths.
- `draw()` methods now allow the angle of tick labels on the x axis of plots to
  be rotated using argument `angle`. Requested by @tamas-ferenci #87
- `draw.gam()` and related functions (`draw.parametric_effects()`,
  `draw.smooth_estimates()`) now add the basis to the plot using a caption.
  #155
- `smooth_coefs()` is a new utility function for extracting the coefficients
  for a particular smooth from a fitted model. `smooth_coef_indices()` is an
  associated function that returns the indices (positions) in the vector of
  model coefficients (returned by `coef(gam_model)`) of those coefficients that
  pertain to the stated smooth.
- `draw.gam()` now better handles patchworks of plots where one or more of
  those plots has fixed aspect ratios. #190
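The revised `data_slice()` and the helpers `evenly()` and `ref_level()` combine naturally when building prediction data. The sketch below is illustrative only: the simulated data set (`"eg4"`), the factor-by model, and the object names are assumptions made for the example.

```r
library("mgcv")
library("gratia")

# Illustrative data with a factor `fac` and a factor-by smooth
df <- data_sim("eg4", n = 400, seed = 42)
m <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = df, method = "REML")

# 100 evenly spaced values of x2 at the reference level of fac;
# x0 is not mentioned so it is held at a typical value
ds <- data_slice(m, x2 = evenly(x2, n = 100), fac = ref_level(fac))

# Predictions (with credible interval) over the slice
fv <- fitted_values(m, data = ds)

# evenly() can also step in fixed increments via `by`
xs <- evenly(c(0, 1), by = 0.25)
```

Here `ds` has one row per requested value of `x2`, and `xs` runs from 0 to 1 in steps of 0.25.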
Bug fixes
- `draw.posterior_smooths` now plots posterior samples with a fixed aspect ratio
  if the smooth is isotropic. #148
- `derivatives()` now ignores random effect smooths (for which derivatives
  don't make sense anyway). #168
- `confint.gam(...., method = "simultaneous")` now works with factor by smooths
  where `parm` is passed the full name of a specific smooth `s(x)faclevel`.
- The order of plots produced by `gratia::draw.gam()` again matches the order
  in which the smooths entered the model formula. Recent changes to the
  internals of `gratia::draw.gam()` when the switch to `smooth_estimates()` was
  undertaken led to a change in behaviour resulting from the use of
  `dplyr::group_split()`, and its coercion internally of a character vector to
  a factor. This factor is now created explicitly, and the levels set to the
  correct order. #154
- Setting the `dist` argument to set response or smooth values to `NA` if they
  lay too far from the support of the data in multivariate smooths would
  lead to an incorrect scale for the response guide. This is now fixed. #193
- Argument `fun` to `draw.gam()` was not being applied to any parametric terms.
  Reported by @grasshoppermouse #195
- `draw.gam()` was adding the uncertainty for all linear predictors to smooths
  when `overall_uncertainty = TRUE` was used. Now `draw.gam()` only includes the
  uncertainty for those linear predictors in which a smooth takes part. #158
- `partial_derivatives()` works when provided with a single data point at
  which to evaluate the derivative. #199
- `transform_fun.smooth_estimates()` was addressing the wrong variable names
  when trying to transform the confidence interval. #201
- `data_slice()` doesn't fail with an error when used with a model that contains
  an offset term. #198
- `confint.gam()` no longer uses `evaluate_smooth()`, which is soft deprecated.
  #167
- `qq_plot()` and `worm_plot()` could compute the wrong deviance residuals used
  to generate the theoretical quantiles for some of the more exotic families
  (distributions) available in mgcv. This also affected `appraise()` but only
  for the QQ plot; the residuals shown in the other plots and the deviance
  residuals shown on the y-axis of the QQ plot were correct. Only the
  generation of the reference intervals/quantiles was affected.
gratia version 0.7.3 is released and on CRAN
gratia 0.7.3
This is a minor release for gratia, mainly motivated by a request to fix outputs from examples on M1 Macs where the results printed deviated markedly from the reference output generated on my Linux machine. The full entry for the release in `NEWS.md` is reproduced below.
User visible changes
- Plots of smooths now use "Partial effect" for the y-axis label in place of "Effect", to better indicate what is displayed.
New features
- `confint.fderiv()` and `confint.gam()` now return their results as a tibble instead of a common-or-garden data frame. The latter mostly already did this.
- Examples for `confint.fderiv()` and `confint.gam()` were reworked, in part to remove some inconsistent output in the examples when run on M1 Macs.
Bug fixes
`compare_smooths()` failed when passed non-standard model "names" like `compare_smooths(m_gam, m_gamm$gam)` or `compare_smooths(l[[1]], l[[2]])` even if the evaluated objects were valid GAM(M) models. Reported by Andrew Irwin #150
gratia version 0.7.2 is released and on CRAN
gratia 0.7.2 is available and on CRAN
Following the release of version 0.7.0, a couple of annoying bugs were identified which necessitated a patch release. I had implemented methods to plot partial effects for 3d and 4d smooths so decided to include these early enhancements in the patch release to try to shake out any bugs or problems with the implementation prior to a more substantial point (0.8.0) release later in the year (planned for September 2022 at the latest as gratia is needed for a GAM course). Similarly, the problem that delayed 0.7.1 (below) meant that a new plotting method to handle splines on the sphere snuck in to the release, for the same reasons as handling >2d smooths.
Due to an issue with the size of the package source tarball, which wasn't discovered until after submission to CRAN, 0.7.1 was never released.
While binaries for Windows and MacOS X systems are being built, you can install version 0.7.2 from R Universe: https://gavinsimpson.r-universe.dev/ui#builds
New features
- `draw.gam()` and `draw.smooth_estimates()` can now handle splines on the sphere (`s(lat, long, bs = "sos")`) with special plotting methods using `ggplot2::coord_map()` to handle the projection to spherical coordinates. An orthographic projection is used by default, with an essentially arbitrary (and northern hemisphere-centric) default for the orientation of the view.
- `draw.gam()` and `draw.smooth_estimates()`: {gratia} can now handle smooths of 3 or 4 covariates when plotting. As an example of what is possible, the figure below shows the estimated smooths from `y ~ s(x,z) + s(year, bs = "cr") + ti(x,z, year, d = c(2,1), bs = c("tp", "cr"))` for a space-time GAM modelling shrimp abundance. The layout has been tweaked a little (via the `design` argument to `patchwork::plot_layout()`) from the default you get with `draw.gam()` but otherwise it is unchanged.

  For smooths of 3 covariates, the third covariate is handled with `ggplot2::facet_wrap()` and a set (default `n` = 16) of small multiples is drawn, each a 2d surface evaluated at the specified value of the third covariate. For smooths of 4 covariates, `ggplot2::facet_grid()` is used to draw the small multiples, with the default producing 4 rows by 4 columns of plots at the specific values of the third and fourth covariates. The number of small multiples produced is controlled by new arguments `n_3d` (default `n_3d = 16`) and `n_4d` (default `n_4d = 4`, yielding `n_4d * n_4d` = 16 facets) respectively.

  This only affects plotting; `smooth_estimates()` has been able to handle smooths of any number of covariates for a while.

  When handling higher-dimensional smooths, actually drawing the plots on the default device can be slow, especially with the default value of `n = 100` (which for 3D or 4D smooths would result in 160,000 data points being plotted). As such it is recommended that you reduce `n` to a smaller value: `n = 50` is a reasonable compromise of resolution and speed.
- `model_concurvity()` returns concurvity measures from `mgcv::concurvity()` for estimated GAMs in a tidy format. The synonym `concrvity()` [sic] is also provided. A `draw()` method is provided which produces a bar plot or a heatmap of the concurvity values depending on whether the overall concurvity of each smooth or the pairwise concurvity of each smooth in the model is requested.
- `fitted_values()` ensures that `data` (and hence the returned object) is a tibble rather than a common or garden data frame.
- `draw.gam()` gains argument `resid_col = "steelblue3"` that allows the colour of the partial residuals (if plotted) to be changed.
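As a rough sketch of the concurvity helper described above, the code below requests both the overall and the pairwise measures for an illustrative model. The data, the model, and the assumption that the pairwise measures are requested with `pairwise = TRUE` are not taken from the release notes.

```r
library("mgcv")
library("gratia")

# Illustrative model fitted to simulated data
df <- data_sim("eg1", n = 400, seed = 42)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Overall concurvity of each smooth, in tidy form
mc <- model_concurvity(m)

# Pairwise concurvity between smooths (argument name assumed)
mc_pair <- model_concurvity(m, pairwise = TRUE)

# draw() on either object gives the bar plot or heatmap respectively
p_mc <- draw(mc)
```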
Bug fixes
- `draw.posterior_smooths()` was redundantly plotting duplicate data in the rug plot. Now only the unique set of covariate values are used for drawing the rug.
- `data_sim()` was not passing the `scale` argument in the bivariate example setting (`"eg2"`).
- `draw()` methods for `gamm()` and `gamm4::gamm4()` fits were not passing arguments on to `draw.gam()`.
- `draw.smooth_estimates()` would produce a subtitle with data for a continuous by smooth as if it were a factor by smooth. Now the subtitle only contains the name of the continuous by variable.
- `model_edf()` was not using the `type` argument. As a result it only ever returned the default EDF type.
- `add_constant()` methods weren't applying the constant to all the required variables.
- `draw.gam()`, `draw.parametric_effects()` now actually work for a model with only parametric effects. #142 Reported by @Nelson-Gon
- `parametric_effects()` would fail for a model with only parametric terms because `predict.gam()` returns empty arrays when passed `exclude = character(0)`.
gratia version 0.7.0 now on CRAN
gratia version 0.7.0 released
I am pleased to announce the release of version 0.7.0 of the gratia package. gratia is intended to make working with generalized additive models (GAMs) easier and to facilitate the production of high quality visualizations of estimated smooths and entire models using the ggplot2 package.
Version 0.7.0 of the package represents a significant milestone: the main user-facing and internal functions for evaluating estimated smooths at covariate values have been entirely replaced by new functions written from the ground up to be easier to extend and maintain than the original functions. These new functions are smooth_estimates()
and parametric_effects()
. Consequently, functions evaluate_smooth()
and evaluate_parametric_term()
are now soft-deprecated; a warning will be issued upon their first usage to encourage the use of the new functions.
smooth_estimates()
and parametric_effects()
are more capable and easier to extend than their deprecated forebears. They can return results for multiple smooth or parametric terms in a single call, while the internals allow for new smooth types that require specialist handling to be added without rewriting the main code base or extensive redesigns.
The main user-facing plotting function draw()
for fitted GAMs and related models has been rewritten to use smooth_estimates()
and parametric_effects()
. Some small differences in behaviour may be encountered, but it is expected that previous code using gratia is backward compatible.
In addition to the major changes described above, version 0.7.0 also introduces a ranges of new functions to make the GAM-related aspects of your life a little bit easier.
- fitted_values() produces fitted or estimated values from the model. These can be on the scale of the link function or the response, and a credible interval is provided for the requested coverage on the chosen scale.
- rootogram() provides rootogram diagnostics, mainly for count-based models (fitted with families poisson(), negbin(), nb(), and gaussian()), but other families may be supported in the future. The draw() method can plot various kinds of rootogram from the results of rootogram().
- New helper functions typical_values(), factor_combos() and data_combos() for quickly creating data sets for producing predictions from fitted models where some of the covariates are fixed at some typical or representative values.
- edf() extracts the effective degrees of freedom (EDF) of a fitted model or a specific smooth in the model. Various forms for the EDF can be extracted.
- model_edf() returns the EDF of the overall model. If supplied with multiple models, the EDFs of each model are returned for comparison.
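A minimal sketch of a few of the functions listed above, again assuming an illustrative model fitted to simulated data (the formula and object names are hypothetical). Column names in the returned tibbles differ between gratia versions, so the sketch avoids relying on them.

```r
library(mgcv)
library(gratia)

# Illustrative data and model; not part of the release notes
set.seed(1)
df <- data_sim("eg1", n = 300)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Fitted values with a credible interval, on the response scale
fv <- fitted_values(m, scale = "response")

# Effective degrees of freedom of each smooth in the model
e <- edf(m)

# Effective degrees of freedom of the model as a whole
me <- model_edf(m)

print(fv)
print(e)
print(me)
```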
Additional new features and information on bugs fixed can be found in the news.
The package has a new pkgdown website, with search facility: https://gavinsimpson.github.io/gratia/
Finally, I know the documentation available for the package and individual functions isn't anywhere near as good as it could be. I have tried to provide examples for the user-facing functions in the package. In addition, this version of gratia comes with a Getting Started vignette, which shows some of the main functions for working with GAMs with gratia. Development on the package towards version 0.8.0 will have a focus on providing better documentation and additional vignettes to illustrate the range of functionality in the package.
gratia version 0.5.1 now on CRAN
This release was prompted by an issue with an argument naming choice in the new smooth_estimates() function. Some additional functionality was completed prior to realising I needed to release 0.5.1.
User visible changes
- The newdata argument to smooth_estimates() has been changed to data as was originally intended.
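To illustrate the renamed argument, here is a hedged sketch in which user-supplied covariate values are passed via data. The model, the simulated data, and the name new_df are assumptions made for the example; the supplied data frame contains every covariate used by the smooths being evaluated.

```r
library(mgcv)
library(gratia)

# Illustrative data and model
set.seed(1)
df <- data_sim("eg1", n = 300)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Hypothetical covariate values at which to evaluate the smooths
grid <- seq(0, 1, length.out = 50)
new_df <- data.frame(x0 = grid, x1 = grid, x2 = grid, x3 = grid)

# Pass the values via `data` (this argument was `newdata` in 0.5.0)
sm <- smooth_estimates(m, data = new_df)
print(sm)
```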
New features
- smooth_estimates() can now handle
  - bivariate and multivariate thinplate regression spline smooths, e.g. s(x, z, a),
  - tensor product smooths (te(), t2(), & ti()), e.g. te(x, z, a)
  - factor smooth interactions, e.g. s(x, f, bs = "fs")
  - random effect smooths, e.g. s(f, bs = "re")
- penalty() provides a tidy representation of the penalty matrices of smooths. The tidy representation is most suitable for plotting with ggplot().

  A draw() method is provided, which represents the penalty matrix as a heatmap.
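A minimal sketch of the penalty() workflow described above, assuming an illustrative GAM fitted to simulated data; the object names are hypothetical.

```r
library(mgcv)
library(gratia)

# Illustrative data and model
set.seed(1)
df <- data_sim("eg1", n = 300)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Tidy (long-format) representation of the penalty matrices of all smooths
pen <- penalty(m)
print(pen)

# Heatmap of each penalty matrix via the draw() method
draw(pen)
```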
gratia version 0.5.0 now on CRAN
gratia 0.5.0
Covid-19 and teaching left me little development time, but a prompt from CRAN to address the use of {vdiffr} 📦 in package tests spurred me to wrap up some of the new features I had committed to the development version.
I also took the opportunity to complete the initial steps on a replacement for (or more accurately a successor to) evaluate_smooth(). Some early decisions I made when developing evaluate_smooth() meant that it was increasingly difficult to maintain and add support for more complex models, due to the way I had handled factor by variable smooths.
The replacement/successor is smooth_estimates(). At the moment it only handles simple 1-D smooths, but it should be much easier to accommodate other smooth types and more complex models with multiple linear predictors.
Eventually, once smooth_estimates() can handle the range of smooths and models that evaluate_smooth() can currently, I'll swap out instances of evaluate_smooth() from the higher-level functions that rely upon it. At the moment I don't plan on removing evaluate_smooth() from {gratia}, but its use will be at the very least soft-deprecated.
Some of the News for the release is copied below.
New features
- Partial residuals for models can be computed with partial_residuals(). The partial residuals are the weighted residuals of the model added to the contribution of each smooth term (as returned by predict(model, type = "terms")).

  Also, new function add_partial_residuals() can be used to add the partial residuals to data frames.
- Users can now control to some extent what colour or fill scales are used when plotting smooths in those draw() methods that use them. This is most useful to change the fill scale when plotting 2D smooths, or to change the discrete colour scale used when plotting random factor smooths (bs = "fs").

  The user can pass scales via arguments discrete_colour and continuous_fill.
- The effects of certain smooths can be excluded from data simulated from a model using simulate.gam() and predicted_samples() by passing exclude or terms on to predict.gam(). This allows for excluding random effects, for example, from model predicted values that are then used to simulate new data from the conditional distribution. See the example in predicted_samples().

  Wish of #74 (@hgoldspiel)
- draw.gam() and related functions gain arguments constant and fun to allow for user-defined constants and transformations of smooth estimates and confidence intervals to be applied.

  Part of wish of #79.
- confint.gam() now works for 2D smooths also.
- smooth_estimates() is an early version of code to replace (or more likely supersede) evaluate_smooth(). smooth_estimates() can currently only handle 1D smooths of the standard types.
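The partial residual functions in the list above might be used along these lines. This is only a sketch under stated assumptions: the simulated data, the model formula, and the object names are illustrative.

```r
library(mgcv)
library(gratia)

# Illustrative data and model
set.seed(1)
df <- data_sim("eg1", n = 300)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Partial residuals: one column per smooth, one row per observation
pr <- partial_residuals(m)
print(pr)

# Append the partial residuals to an existing data frame
df_pr <- add_partial_residuals(df, m)
print(df_pr)
```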
User visible changes
- The meaning of parm in confint.gam has changed. This argument now requires a smooth label to match a smooth. A vector of labels can be provided, but partial matching against a smooth label only works with a single parm value.

  The default behaviour remains unchanged however; if parm is NULL then all smooths are evaluated and returned with confidence intervals.
- data_class() is no longer exported; it was only ever intended to be an internal function.
Version 0.4.1 released to CRAN
Version 0.4.1 of gratia has been released to CRAN. Version 0.4.0 existed for a short while but the release to CRAN was pulled because of a last minute change needed to accommodate v 1.0.0 of dplyr that had been overlooked in the testing for 0.4.0.
This gave me an opportunity to fix an additional bug (#73) as well.
The full list of changes is reproduced below for version 0.4.1 and 0.4.0.
gratia 0.4.1
User visible changes
- draw.gam() with scales = "fixed" now applies to all terms that can be plotted, including 2d smooths.

  Reported by @StefanoMezzini #73
Bug fixes
- dplyr::combine() was deprecated. Switch to vctrs::vec_c().
- draw.gam() with scales = "fixed" wasn't using fixed scales where 2d smooths were in the model.

  Reported by @StefanoMezzini #73
gratia 0.4.0
New features
- draw.gam() can now include partial residuals when drawing univariate smooths. Use residuals = TRUE to add partial residuals to each univariate smooth that is drawn. This feature is not available for smooths of more than one variable, by smooths, or factor-smooth interactions (bs = "fs").
- The coverage of credible and confidence intervals drawn by draw.gam() can be specified via argument ci_level. The default is arbitrarily 0.95 for no other reason than (rough) compatibility with plot.gam().

  This change has had the effect of making the intervals slightly narrower than in previous versions of gratia; intervals were drawn at ± 2 × the standard error. The default intervals are now drawn at ± ~1.96 × the standard error.
- New function difference_smooth() for computing differences between factor smooth interactions. Methods available for gam(), bam(), gamm() and gamm4::gamm4(). Also has a draw() method, which can handle differences of 1D and 2D smooths currently (handling 3D and 4D smooths is planned).
- New functions add_fitted() and add_residuals() to add fitted values (expectations) and model residuals to an existing data frame. Currently methods available for objects fitted by gam() and bam().
- data_sim() is a tidy reimplementation of mgcv::gamSim() with the added ability to use sampling distributions other than the Gaussian for all models implemented. Currently Gaussian, Poisson, and Bernoulli sampling distributions are available.
- smooth_samples() can handle continuous by variable smooths such as in varying coefficient models.
- link() and inv_link() now work for all families available in mgcv, including the location, scale, shape families, and the more specialised families described in ?mgcv::family.mgcv.
- evaluate_smooth(), data_slice(), family(), link(), inv_link() methods for models fitted using gamm4() from the gamm4 package.
- data_slice() can generate data for a 1-d slice (a single variable varying).
- The colour of the points, reference lines, and simulation band in appraise() can now be specified via arguments
  - point_col,
  - point_alpha,
  - ci_col,
  - ci_alpha,
  - line_col

  These are passed on to qq_plot(), observed_fitted_plot(), residuals_linpred_plot(), and residuals_hist_plot(), which also now take the new arguments where applicable.
- Added utility functions is_factor_term() and term_variables() for working with models. is_factor_term() identifies if the named term is a factor using information from the terms() object of the fitted model. term_variables() returns a character vector of variable names that are involved in a model term. These are strictly for working with parametric terms in models.
- appraise() now works for models fitted by glm() and lm(), as do the underlying functions it calls, especially qq_plot.

  appraise() also works for models fitted with family gaulss(). Further location scale models and models fitted with extended family functions will be supported in upcoming releases.
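The change in interval width noted above for ci_level can be checked with a few lines of base R. This is a small worked example, not gratia's internal code: it assumes intervals of the form estimate ± critical value × standard error, with the critical value taken from the standard normal distribution.

```r
# Coverage requested via ci_level (the new default)
ci_level <- 0.95

# Critical value for a two-sided normal interval at that coverage
crit <- qnorm(1 - (1 - ci_level) / 2)
crit
# roughly 1.96

# Coverage implied by the old rule of +/- 2 standard errors
cover_2se <- 2 * pnorm(2) - 1
cover_2se
# roughly 0.9545
```

So the older ± 2 × standard error intervals corresponded to slightly more than 95% coverage, which is why the new default intervals are a little narrower.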
User visible changes
- datagen() is now an internal function and is no longer exported. Use data_slice() instead.
- evaluate_parametric_term() is now much stricter and can only evaluate main effect terms, i.e. those whose order, as stored in the terms object of the model, is 1.
Bug fixes
- The draw() method for derivatives() was not getting the x-axis label for factor by smooths correctly, and instead was using NA for the second and subsequent levels of the factor.
- The datagen() method for class "gam" couldn't possibly have worked for anything but the simplest models and would fail even with simple factor by smooths. These issues have been fixed, but the behaviour of datagen() has changed, and the function is now not intended for use by users.
- Fixed an issue where model terms of the form factor1:factor2 were incorrectly identified as being numeric parametric terms. #68