Package: mlr3resampling
Type: Package
Title: Resampling Algorithms for 'mlr3' Framework
Version: 2024.9.6
Authors@R: c(
    person("Toby", "Hocking",
           email="[email protected]",
           role=c("aut", "cre"),
           comment = c(ORCID="0000-0002-3146-0865")),
    person("Michel", "Lang", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0001-9754-0393", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Bernd", "Bischl", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0001-6002-6980", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Jakob", "Richter", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0003-4481-5554", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Patrick", "Schratz", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0003-0748-6624", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Giuseppe", "Casalicchio", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0001-5324-5966", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Stefan", "Coors", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0002-7465-2146", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Quay", "Au", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0002-5252-8902", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Martin", "Binder", , "[email protected]", role = "ctb"),
    person("Florian", "Pfisterer", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0001-8867-762X", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Raphael", "Sonabend", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0001-9225-4654", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Lennart", "Schneider", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0003-4152-5308", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Marc", "Becker", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0002-8115-0400", "Author of mlr3 when Resampling/ResamplingCV was copied/modified")),
    person("Sebastian", "Fischer", , "[email protected]", role = "ctb",
           comment = c(ORCID = "0000-0002-9609-3197", "Author of mlr3 when Resampling/ResamplingCV was copied/modified"))
    )
Description: A supervised learning algorithm inputs a train set,
    and outputs a prediction function, which can be used on a test set.
    If each data point belongs to a group
    (such as geographic region, year, etc.), then
    how do we know whether it is possible to train on one group, and predict
    accurately on another group? Cross-validation can be used to determine
    the extent to which this is possible, by first assigning fold IDs from
    1 to K to all data (possibly using stratification, usually by group
    and label). Then we loop over test sets (group/fold combinations),
    train sets (same group, other groups, all groups), and compute
    test/prediction accuracy for each combination. Comparing
    test/prediction accuracy between same and other, we can determine the
    extent to which it is possible (perfect if same/other have similar
    test accuracy for each group; other is usually somewhat less accurate
    than same; other can be just as bad as a featureless baseline when the
    groups have different patterns).
    For more information, see
    <https://tdhock.github.io/blog/2023/R-gen-new-subsets/>,
    which describes the method in depth.
    How many train samples are required to get accurate predictions on a
    test set? Cross-validation can be used to answer this question, with
    train sets of variable size.
License: GPL-3
URL: https://github.com/tdhock/mlr3resampling
BugReports: https://github.com/tdhock/mlr3resampling/issues
Imports:
    data.table,
    R6,
    checkmate,
    paradox,
    mlr3,
    mlr3misc
Suggests:
    ggplot2,
    animint2,
    mlr3tuning,
    lgr,
    future,
    testthat,
    knitr,
    markdown,
    nc,
    rpart,
    directlabels
VignetteBuilder: knitr