% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/rename.R
\name{rename}
\alias{rename}
\title{Rename variables according to a specified dictionary}
\usage{
rename(x, dictionary = "ioos.csv", debug = 0)
}
\arguments{
\item{x}{either an \linkS4class{oce} object, the elements of which will
be renamed, or NULL. In the latter case, the dictionary is returned
as a data frame, which can be useful for users who want to use \code{\link[=rbind]{rbind()}}
to append dictionary elements of their own, thus customizing the
action of \code{rename()}.}
\item{dictionary}{either a string or a data frame. If a string, then it is
either the name of a built-in vocabulary (\code{ioos} or \code{sbe})
or the name of a CSV file that defines a dictionary in a four-column
format as described in \sQuote{Details}. If it is a data frame, then
it must hold four columns that follow the same pattern as in the CSV
style.}
\item{debug}{an integer specifying whether debugging information is
to be printed during the processing. This is a general parameter that
is used by many \code{oce} functions. Generally, setting \code{debug=0}
turns off the printing, while higher values suggest that more information
be printed. If one function calls another, it usually reduces the value of
\code{debug} first, so that a user can often obtain deeper debugging
by specifying higher \code{debug} values.}
}
\description{
There are many conventions for naming oceanographic variables, and this
function provides a way to map names in data files to names to be used in an
object created from those files.
}
\details{
A dictionary, whether read from a built-in CSV file, read from a
user-supplied CSV file, or supplied as a data frame, contains four
character-valued columns, as follows.
\enumerate{
\item The original name of a variable in the \code{data} slot of \code{x}. This is used in
matching such names against targets. A match may be either an exact
equality or a \link{regexp} match. In the latter case, a \verb{#} character may be
used as an abbreviation for a digit. Note that \code{^} is inserted at the
start of the value, and \code{$} at the end, before searching for a match with
\code{\link[=grep]{grep()}}.
\item The desired oce-convention name to be used for a match. Many files will
yield duplicates, e.g. for multiple temperature sensors, so
\code{\link[=unduplicateNames]{unduplicateNames()}} is called after all names are processed, to avoid
problems.
\item The unit for the column, typically in a format handled by \code{\link[=expression]{expression()}}.
Note that this value is ignored if the object already holds stated units
for the quantity in question.
\item The scale for the column (again, only used if the object does not already
hold a scale).
}
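The matching rule in item 1 can be illustrated with base R alone. The
following is a minimal sketch, not the code used inside \code{rename()}; the
helper name \code{matchesEntry()} is hypothetical.
\if{html}{\out{<div class="sourceCode">}}\preformatted{# Hypothetical helper: does a variable name match a dictionary pattern?
matchesEntry <- function(pattern, name) {
    # a "#" in the pattern stands for a single digit
    regex <- gsub("#", "[0-9]", pattern, fixed = TRUE)
    # anchor at both ends, so that partial matches are rejected
    regex <- paste0("^", regex, "$")
    grepl(regex, name)
}
matchesEntry("PSALST##", "PSALST01") # TRUE
matchesEntry("PSALST##", "PSALST1") # FALSE: only one digit
matchesEntry("PSALST##", "XPSALST01") # FALSE: anchored at the start
}\if{html}{\out{</div>}}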
The built-in dictionaries are stored in locations
\if{html}{\out{<div class="sourceCode">}}\preformatted{system.file("extdata", "dictionary_codas.csv", package = "oce")
system.file("extdata", "dictionary_ioos.csv", package = "oce")
system.file("extdata", "dictionary_sbe.csv", package = "oce")
}\if{html}{\out{</div>}}
The data for these come from References 1, 2 and 3, respectively. The format
is simple, consisting of 4 columns, with no header. The column entries
are as follows.
\enumerate{
\item The first column holds a specialized regular expression for the variable
name as stored in the datafile. This is conventional, except that \verb{#} is a
stand-in for the regular expression \verb{[0-9]} (that is, a single digit).
Formulating these expressions requires a bit of care, so it can make sense to
look at the \code{dictionary_sbe.csv} file to get some hints.
\item The second column holds the oce name.
\item The third column is the unit.
\item The fourth column is the scale.
}
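A built-in dictionary can also be inspected directly. The following sketch
assumes that \code{oce} is installed and that the file can be read as a
plain headerless CSV, as described above; the columns then receive the
default names \code{V1} through \code{V4} assigned by \code{read.csv()}.
\if{html}{\out{<div class="sourceCode">}}\preformatted{f <- system.file("extdata", "dictionary_sbe.csv", package = "oce")
if (nzchar(f)) { # system.file() returns "" if the file is not found
    dict <- read.csv(f, header = FALSE, colClasses = "character")
    head(dict) # first few patterns, oce names, units and scales
}
}\if{html}{\out{</div>}}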
In many cases, the third and fourth columns are empty, and even if values are
provided, they will be superseded by values within the data file.
As an example, the entry
\if{html}{\out{<div class="sourceCode">}}\preformatted{PSALST##,salinity,,PSS-78
}\if{html}{\out{</div>}}
indicates that a variable named \code{"PSALST"} followed by two digits is to be
renamed as \code{"salinity"}, that the unit (if not already defined within \code{x}) is
to be blank, and that the scale (again, if not already defined within \code{x}) is
to be \code{"PSS-78"}.
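The fallback rule for units and scales can be sketched in base R as follows.
This is an illustration, not the \code{oce} implementation: \code{chooseValue()}
is a hypothetical helper, and the \code{"PSS-68"} value is made up to stand for
a scale that \code{x} already holds.
\if{html}{\out{<div class="sourceCode">}}\preformatted{# Parse the entry as a one-line, headerless CSV
entry <- read.csv(
    text = "PSALST##,salinity,,PSS-78", header = FALSE,
    colClasses = "character"
)
# Hypothetical helper: prefer a value that the object already holds
chooseValue <- function(existing, fromDictionary) {
    if (is.null(existing)) fromDictionary else existing
}
chooseValue(NULL, entry[[4]]) # "PSS-78": the dictionary scale is used
chooseValue("PSS-68", entry[[4]]) # "PSS-68": the existing scale is kept
entry[[3]] # "": the unit column is blank
}\if{html}{\out{</div>}}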
}
\section{History and Plans}{
This function was written in late September, 2024. It is likely
to evolve through the remaining months of 2024, after real-world
testing by the developers.
}
\examples{
library(oce)
# Example 1: made-up data
d <- new("oce")
d <- oceSetData(d, "S", c(30, 31))
d <- oceSetData(d, "T", c(10, 11))
dictText <- "S,salinity,,
T,temperature,degree*C,ITS-90"
dictionary <- read.csv(text = dictText, header = FALSE)
rename(d, dictionary)
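#
# Example 1b: a sketch of customizing a built-in dictionary, based on
# the description of the x argument. The pattern MYTEMP## is made up
# for illustration, and a four-column dictionary is assumed.
dict <- rename(NULL, "ioos") # with x = NULL, the dictionary is returned
extra <- setNames(
    data.frame("MYTEMP##", "temperature", "degree*C", "ITS-90"),
    names(dict) # reuse the column names, so that rbind() will work
)
custom <- rbind(dict, extra)
tail(custom, 2)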
#
# Example 2: a CIOOS NetCDF file. Note that this file
# is downloaded and removed at the end; in practice,
# it is likely that the file might be retained locally.
if (requireNamespace("curl")) {
file <- tempfile(fileext = ".nc") # removed later
server <- "https://cioosatlantic.ca/erddap/files"
program <- "bio_atlantic_zone_monitoring_program_ctd"
subprogram <- "Bedford\%20Basin\%20Monitoring\%20Program"
year <- 2023
cast <- 1
url <- sprintf(
"\%s/\%s/\%s/\%s/CTD_BCD\%s667_\%03d_1_DN.ODF.nc",
server, program, subprogram, year, year, cast
)
t <- try(curl::curl_download(url, file), silent = TRUE)
if (!inherits(t, "try-error")) {
d <- read.netcdf(file)
summary(d)
dd <- rename(d, "ioos")
summary(dd)
} else {
message("Cannot connect to ", url)
}
unlink(file)
}
}
\references{
\enumerate{
\item CODAS naming convention
\url{https://currents.soest.hawaii.edu/docs/adcp_doc/UHDAS_OPERATIONS/UHDAS_atsea/adcp_access/read_netCDF.html}
\item IOOS naming convention
\url{https://cfconventions.org/Data/cf-standard-names/78/build/cf-standard-name-table.html}
\item The SBE names come from a processing manual that was once at
\verb{http://www.seabird.com/document/sbe-data-processing-manual}, but as of
summer 2018, this no longer seems to be provided by SeaBird. A web search
will turn up copies of the manual that have been put online by various
research groups and data-archiving agencies. On 2018-07-05, the latest
version was named \code{SBEDataProcessing_7.26.4.pdf} and had release date
12/08/2017; this was the reference version used in coding \code{oce}.
}
}
\author{
Dan Kelley
}