diff --git a/DESCRIPTION b/DESCRIPTION index c5d0bc6..08bb6be 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -14,7 +14,7 @@ Description: Intense parallel workloads can be difficult to monitor. visualize the log file to reveal potential resource-related reasons for the crash. The 'autometric' package borrows heavily from the methods of packages 'ps' and 'psutil'. -Version: 0.1.2.9000 +Version: 0.1.2.9001 License: MIT + file LICENSE URL: https://wlandau.github.io/autometric/, diff --git a/NAMESPACE b/NAMESPACE index 18fbfe8..b2d1caa 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -1,5 +1,6 @@ # Generated by roxygen2: do not edit by hand +export(dir_stat) export(log_active) export(log_phase_get) export(log_phase_reset) diff --git a/NEWS.md b/NEWS.md index 5a5dd9e..22ce6c7 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,6 +1,6 @@ -# autometric 0.1.2.9000 - +# autometric 0.1.2.9001 +* Add `dir_stat()` to return file info metadata in a directory faster than `base::file.info()` (#5). # autometric 0.1.2 @@ -36,4 +36,3 @@ # autometric 0.0.3 * First version - diff --git a/R/dir_stat.R b/R/dir_stat.R new file mode 100644 index 0000000..4c83178 --- /dev/null +++ b/R/dir_stat.R @@ -0,0 +1,100 @@ +#' @title Efficiently describe files in a directory. +#' @export +#' @family utilities +#' @description List the paths, sizes, and modification times +#' of all the regular files at (or linked from) +#' the top level in a directory. +#' @details In large computational pipelines, it is common to end up with +#' tens of thousands of log files in a directory. +#' At this scale, [base::file.info()] +#' is slow on older file systems. +#' [autometric::dir_stat()] can be up to 40 times faster where +#' the C implementation is supported +#' (POSIX.1-2008 machines and Mac OS). +#' +#' [dir_stat()] is not recursive: it only queries regular files at the +#' top level of a directory.
In addition, it follows symbolic links: +#' if a file is a link, then [dir_stat()] describes the file it points +#' to, rather than the link itself. +#' @return A data frame with one row per file and columns for the file path, +#' numeric size, and modification time stamp of each file. +#' The units of these last two columns are controlled by the +#' `units_size` and `units_mtime` arguments, respectively. +#' @param path Character string, file path to the directory of files +#' to describe. +#' @param units_size Character string with the units of the returned +#' `size` column in the output: `"megabytes"`, `"bytes"`, `"kilobytes"`, +#' or `"gigabytes"`. +#' @param units_mtime Character string with the units of the returned +#' `mtime` column in the output with file modification time stamps. +#' Choices are `"POSIXct"` for a `POSIXct` time object or `"numeric"` +#' for an ordinary numeric vector. +#' @param recent Either `NULL` or an optional `"difftime"` object. +#' If a `"difftime"` object is supplied, then [dir_stat()] +#' only returns files modified within that time window. +#' For example, `recent = as.difftime(1.5, units = "hours")` tells +#' [dir_stat()] to only return information on files modified within +#' the last 1.5 hours. +#' @param method Character string, type of implementation used. +#' Set to `"c"` for an implementation that is up to 40 times faster than +#' [base::file.info()] but may not be supported on certain platforms. +#' Set to `"r"` to run [base::file.info()], which is slower. +#' If `method` is `"c"` but the C implementation is not supported +#' on your platform, [dir_stat()] automatically falls back on +#' [base::file.info()]. +#' The C implementation is supported on POSIX.1-2008 machines and on Mac OS.
+#' @examples +#' file.create(tempfile()) +#' file.create(tempfile()) +#' if (tolower(Sys.info()["sysname"]) != "windows") { +#' print(dir_stat(tempdir(), recent = as.difftime(1, units = "hours"))) +#' } +dir_stat <- function( + path, + units_size = c("megabytes", "bytes", "kilobytes", "gigabytes"), + units_mtime = c("POSIXct", "numeric"), + recent = NULL, + method = c("c", "r") +) { + stopifnot(is.character(path)) + stopifnot(!anyNA(path)) + stopifnot(all(nzchar(path))) + stopifnot(dir.exists(path)) + units_size <- match.arg(units_size) + units_mtime <- match.arg(units_mtime) + method <- match.arg(method) + if (method == "r" || is.null(out <- dir_stat_c(path, units_mtime))) { + out <- dir_stat_r(path, units_mtime) + } + out$size <- out$size * get_factor_size(units_size) + if (!is.null(recent)) { + stopifnot(length(recent) == 1L) + stopifnot(!anyNA(recent)) + stopifnot(inherits(recent, "difftime")) + out <- out[.POSIXct(out$mtime) > Sys.time() - recent, ] + } + out +} + +dir_stat_c <- function(path, units_mtime) { + out <- .Call(r_dir_stat, path, PACKAGE = "autometric") + if (identical(units_mtime, "POSIXct")) { + out$mtime <- .POSIXct(out$mtime) + } + as.data.frame(out) +} + +dir_stat_r <- function(path, units_mtime) { + directories <- list.dirs(path, full.names = TRUE, recursive = FALSE) + files <- setdiff(list.files(path, full.names = TRUE), directories) + info <- file.info(files, extra_cols = FALSE) + out <- data.frame( + path = rownames(info), + size = as.numeric(info$size), + mtime = info$mtime + ) + if (identical(units_mtime, "numeric")) { + out$mtime <- as.numeric(out$mtime) + } + out +} diff --git a/R/log_active.R b/R/log_active.R index 10fca88..0eb4403 100644 --- a/R/log_active.R +++ b/R/log_active.R @@ -1,5 +1,6 @@ #' @title Check the log thread. #' @export +#' @family log #' @description Check if the log is running. #' @return `TRUE` if a background thread is actively writing to the log, #' `FALSE` otherwise. 
The result is based on a static C variable, diff --git a/R/log_phase_get.R b/R/log_phase_get.R index f758402..88b87c9 100644 --- a/R/log_phase_get.R +++ b/R/log_phase_get.R @@ -1,5 +1,6 @@ #' @title Get log phase #' @export +#' @family phase #' @description Get the current log phase. #' @return Character string with the name of the current log phase. #' @examples diff --git a/R/log_phase_reset.R b/R/log_phase_reset.R index 27f817b..6c8e470 100644 --- a/R/log_phase_reset.R +++ b/R/log_phase_reset.R @@ -1,5 +1,6 @@ #' @title Reset log phase #' @export +#' @family phase #' @description Reset the current log phase to the default value. #' @return `NULL` (invisibly). Called for its side effects. #' @examples diff --git a/R/log_phase_set.R b/R/log_phase_set.R index 741a938..af3ca01 100644 --- a/R/log_phase_set.R +++ b/R/log_phase_set.R @@ -1,5 +1,6 @@ #' @title Set log phase #' @export +#' @family phase #' @description Set the current log phase. #' @return `NULL` (invisibly). Called for its side effects. #' @param phase Character string with the phase of the log. diff --git a/R/log_plot.R b/R/log_plot.R index 44126a2..6757e70 100644 --- a/R/log_plot.R +++ b/R/log_plot.R @@ -1,5 +1,6 @@ #' @title Plot a metric of a process over time #' @export +#' @family log #' @description Visualize a metric of a log over time for a single process ID #' in a single log file. #' @return A base plot of a metric of a log over time. diff --git a/R/log_print.R b/R/log_print.R index 27251ce..b959143 100644 --- a/R/log_print.R +++ b/R/log_print.R @@ -1,5 +1,6 @@ #' @title Print once to the log. #' @export +#' @family log #' @description Sample CPU load metrics and #' print a single line to the log for each process in `pids`. #' Used for debugging and testing only. Not for users. diff --git a/R/log_read.R b/R/log_read.R index d5476d6..7b38f14 100644 --- a/R/log_read.R +++ b/R/log_read.R @@ -1,5 +1,6 @@ #' @title Read a log. #' @export +#' @family log #' @description Read a log file into R. 
#' @details [log_read()] is capable of reading a log file where both #' `autometric` and other processes have printed. Whenever `autometric` @@ -131,7 +132,7 @@ log_read <- function( out$name[is.na(out$name)] <- "" out$status <- as.integer(out$status) factor_cpu <- get_factor_cpu(units_cpu) - factor_memory <- get_factor_memory(units_memory) + factor_memory <- get_factor_size(units_memory) factor_time <- get_factor_time(units_time) for (field in c("core", "cpu")) { out[[field]] <- as.numeric(out[[field]] * factor_cpu) @@ -164,33 +165,3 @@ list_files <- function(path, hidden) { character(0L) } } - -get_factor_time <- function(units) { - switch( - units, - seconds = 1, - minutes = 1 / 60, - hours = 1 / (60 * 60), - days = 1 / (60 * 60 * 24) - ) -} - -get_factor_cpu <- function(units) { - switch( - units, - percentage = 1, - fraction = 1 / 100 - ) -} - -c("megabytes", "bytes", "kilobytes", "gigabytes") - -get_factor_memory <- function(units) { - switch( - units, - bytes = 1L, - kilobytes = 1e-3, - megabytes = 1e-6, - gigabytes = 1e-9 - ) -} diff --git a/R/log_start.R b/R/log_start.R index 8cdfa17..ab69987 100644 --- a/R/log_start.R +++ b/R/log_start.R @@ -1,5 +1,6 @@ #' @title Start the log thread. #' @export +#' @family log #' @description Start a background thread that periodically writes #' system usage metrics of the current R process to a log file. #' See [log_read()] for explanations of the specific metrics. diff --git a/R/log_stop.R b/R/log_stop.R index 3b359f1..cc26cb0 100644 --- a/R/log_stop.R +++ b/R/log_stop.R @@ -1,5 +1,6 @@ #' @title Stop the log thread. #' @export +#' @family log #' @description Stop the background thread that periodically writes #' system usage metrics of the current R process to a log file. 
#' @details The background thread is detached, so is there no way to diff --git a/R/log_support.R b/R/log_support.R index e24140d..3903552 100644 --- a/R/log_support.R +++ b/R/log_support.R @@ -1,5 +1,6 @@ #' @title Log support #' @export +#' @family log #' @description Check if your system supports background logging. #' @details The background logging functionality requires a Linux, Mac, #' or Windows computer, It also requires POSIX thread support diff --git a/R/utils_units.R b/R/utils_units.R new file mode 100644 index 0000000..7e525f4 --- /dev/null +++ b/R/utils_units.R @@ -0,0 +1,27 @@ +get_factor_time <- function(units) { + switch( + units, + seconds = 1, + minutes = 1 / 60, + hours = 1 / (60 * 60), + days = 1 / (60 * 60 * 24) + ) +} + +get_factor_cpu <- function(units) { + switch( + units, + percentage = 1, + fraction = 1 / 100 + ) +} + +get_factor_size <- function(units) { + switch( + units, + bytes = 1L, + kilobytes = 1e-3, + megabytes = 1e-6, + gigabytes = 1e-9 + ) +} diff --git a/inst/tinytest/test-dir_stat.R b/inst/tinytest/test-dir_stat.R new file mode 100644 index 0000000..e242c97 --- /dev/null +++ b/inst/tinytest/test-dir_stat.R @@ -0,0 +1,125 @@ +local({ + if (tolower(Sys.info()["sysname"]) != "windows") { + for (units_mtime in c("POSIXct", "numeric")) { + path <- tempfile() + dir.create(path) + writeLines("line", file.path(path, "a")) + writeLines("line2", file.path(path, "b")) + dir.create(file.path(path, "dir")) + file.create(file.path(path, "dir", "x")) + out_c <- dir_stat( + path, + method = "c", + units_size = "bytes", + units_mtime = units_mtime + ) + out_r <- dir_stat( + path, + method = "r", + units_size = "bytes", + units_mtime = units_mtime + ) + out_c <- out_c[order(out_c$path), ] + out_r <- out_c[order(out_r$path), ] + out_c$path <- basename(out_c$path) + out_r$path <- basename(out_r$path) + out_c$size <- as.integer(out_c$size) + out_r$size <- as.integer(out_r$size) + expect_equal(out_c, out_r) + expect_equal(as.numeric(out_c$mtime) - 
as.numeric(out_r$mtime), c(0, 0)) + expect_equal( + as.character(.POSIXct(out_c$mtime)), + as.character(.POSIXct(out_r$mtime)) + ) + expect_equal(out_c$size[out_c$path == "a"], 5L) + expect_equal(out_c$size[out_c$path == "b"], 6L) + expect_equal(sort(colnames(out_c)), sort(c("path", "size", "mtime"))) + expect_equal(nrow(out_c), 2L) + unlink(path, recursive = TRUE) + } + } +}) + +local({ + if (tolower(Sys.info()["sysname"]) != "windows") { + path <- tempfile() + dir.create(path) + writeLines("line", file.path(path, "a")) + writeLines("line2", file.path(path, "b")) + out <- dir_stat( + path, + method = "c", + units_size = "bytes", + units_mtime = "numeric", + recent = as.difftime(1, units = "hours") + ) + out$path <- basename(out$path) + expect_equal(nrow(out), 2L) + expect_equal(out$size[out$path == "a"], 5L) + expect_equal(out$size[out$path == "b"], 6L) + expect_true(is.numeric(out$mtime)) + unlink(path, recursive = TRUE) + } +}) + +local({ + if (tolower(Sys.info()["sysname"]) != "windows") { + path <- tempfile() + dir.create(path) + out <- dir_stat( + path, + method = "c", + units_size = "bytes", + units_mtime = "numeric", + recent = as.difftime(1, units = "hours") + ) + expect_equal(nrow(out), 0L) + expect_equal(sort(colnames(out)), sort(c("path", "size", "mtime"))) + unlink(path, recursive = TRUE) + } +}) + +local({ + if (tolower(Sys.info()["sysname"]) != "windows") { + path <- tempfile() + dir.create(path) + link <- file.path(path, "link") + target <- tempfile() + writeLines("a", target) + file.symlink(from = target, to = link) + out_c <- dir_stat( + path, + method = "c", + units_size = "bytes", + units_mtime = "numeric", + recent = as.difftime(1, units = "hours") + ) + out_r <- dir_stat( + path, + method = "r", + units_size = "bytes", + units_mtime = "numeric", + recent = as.difftime(1, units = "hours") + ) + expect_equal(out_c$size, 2L) + expect_equal(out_r$size, 2L) + writeLines("abc", target) + out_c <- dir_stat( + path, + method = "c", + units_size = 
"bytes", + units_mtime = "numeric", + recent = as.difftime(1, units = "hours") + ) + out_r <- dir_stat( + path, + method = "r", + units_size = "bytes", + units_mtime = "numeric", + recent = as.difftime(1, units = "hours") + ) + expect_equal(out_c$size, 4L) + expect_equal(out_r$size, 4L) + unlink(c(path, target), recursive = TRUE) + } +}) diff --git a/inst/tinytest/test-log_read.R b/inst/tinytest/test-log_read.R index 09231cd..7902e57 100644 --- a/inst/tinytest/test-log_read.R +++ b/inst/tinytest/test-log_read.R @@ -66,8 +66,8 @@ local({ local({ package <- getNamespace("autometric") - expect_equal(package$get_factor_memory("bytes"), 1) - expect_equal(package$get_factor_memory("kilobytes"), 1e-3) - expect_equal(package$get_factor_memory("megabytes"), 1e-6) - expect_equal(package$get_factor_memory("gigabytes"), 1e-9) + expect_equal(package$get_factor_size("bytes"), 1) + expect_equal(package$get_factor_size("kilobytes"), 1e-3) + expect_equal(package$get_factor_size("megabytes"), 1e-6) + expect_equal(package$get_factor_size("gigabytes"), 1e-9) }) diff --git a/man/dir_stat.Rd b/man/dir_stat.Rd new file mode 100644 index 0000000..a6a881e --- /dev/null +++ b/man/dir_stat.Rd @@ -0,0 +1,76 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/dir_stat.R +\name{dir_stat} +\alias{dir_stat} +\title{Efficiently describe files in a directory.} +\usage{ +dir_stat( + path, + units_size = c("megabytes", "bytes", "kilobytes", "gigabytes"), + units_mtime = c("POSIXct", "numeric"), + recent = NULL, + method = c("c", "r") +) +} +\arguments{ +\item{path}{Character string, file path to the directory of files +to describe.} + +\item{units_size}{Character string with the units of the returned +\code{size} column in the output: \code{"megabytes"}, \code{"bytes"}, \code{"kilobytes"}, +or \code{"gigabytes"}.} + +\item{units_mtime}{Character string with the units of the returned +\code{mtime} column in the output with file modification time stamps. 
+Choices are \code{"POSIXct"} for a \code{POSIXct} time object or \code{"numeric"} +for an ordinary numeric vector.} + +\item{recent}{Either \code{NULL} or an optional \code{"difftime"} object. +If a \code{"difftime"} object is supplied, then \code{\link[=dir_stat]{dir_stat()}} +only returns files modified within that time window. +For example, \code{recent = as.difftime(1.5, units = "hours")} tells +\code{\link[=dir_stat]{dir_stat()}} to only return information on files modified within +the last 1.5 hours.} + +\item{method}{Character string, type of implementation used. +Set to \code{"c"} for an implementation that is up to 40 times faster than +\code{\link[base:file.info]{base::file.info()}} but may not be supported on certain platforms. +Set to \code{"r"} to run \code{\link[base:file.info]{base::file.info()}}, which is slower. +If \code{method} is \code{"c"} but the C implementation is not supported +on your platform, \code{\link[=dir_stat]{dir_stat()}} automatically falls back on +\code{\link[base:file.info]{base::file.info()}}. +The C implementation is supported on POSIX.1-2008 machines and on Mac OS.} +} +\value{ +A data frame with one row per file and columns for the file path, +numeric size, and modification time stamp of each file. +The units of these last two columns are controlled by the +\code{units_size} and \code{units_mtime} arguments, respectively. +} +\description{ +List the paths, sizes, and modification times +of all the regular files at (or linked from) +the top level in a directory. +} +\details{ +In large computational pipelines, it is common to end up with +tens of thousands of log files in a directory. +At this scale, \code{\link[base:file.info]{base::file.info()}} +is slow on older file systems. +\code{\link[=dir_stat]{dir_stat()}} can be up to 40 times faster where +the C implementation is supported +(POSIX.1-2008 machines and Mac OS).
+ +\code{\link[=dir_stat]{dir_stat()}} is not recursive: it only queries regular files at the +top level of a directory. In addition, it follows symbolic links: +if a file is a link, then \code{\link[=dir_stat]{dir_stat()}} describes the file it points +to, rather than the link itself. +} +\examples{ + file.create(tempfile()) + file.create(tempfile()) + if (tolower(Sys.info()["sysname"]) != "windows") { + print(dir_stat(tempdir(), recent = as.difftime(1, units = "hours"))) + } +} +\concept{utilities} diff --git a/man/log_active.Rd b/man/log_active.Rd index fb56594..f81c5fb 100644 --- a/man/log_active.Rd +++ b/man/log_active.Rd @@ -25,3 +25,13 @@ Check if the log is running. log_active() unlink(path) } +\seealso{ +Other log: +\code{\link{log_plot}()}, +\code{\link{log_print}()}, +\code{\link{log_read}()}, +\code{\link{log_start}()}, +\code{\link{log_stop}()}, +\code{\link{log_support}()} +} +\concept{log} diff --git a/man/log_phase_get.Rd b/man/log_phase_get.Rd index 13d284e..842fcec 100644 --- a/man/log_phase_get.Rd +++ b/man/log_phase_get.Rd @@ -23,3 +23,9 @@ Get the current log phase. log_phase_get() log_read(path) } +\seealso{ +Other phase: +\code{\link{log_phase_reset}()}, +\code{\link{log_phase_set}()} +} +\concept{phase} diff --git a/man/log_phase_reset.Rd b/man/log_phase_reset.Rd index cc338c7..b7279b6 100644 --- a/man/log_phase_reset.Rd +++ b/man/log_phase_reset.Rd @@ -23,3 +23,9 @@ Reset the current log phase to the default value. log_phase_get() log_read(path) } +\seealso{ +Other phase: +\code{\link{log_phase_get}()}, +\code{\link{log_phase_set}()} +} +\concept{phase} diff --git a/man/log_phase_set.Rd b/man/log_phase_set.Rd index 0c64a46..c59b2c0 100644 --- a/man/log_phase_set.Rd +++ b/man/log_phase_set.Rd @@ -29,3 +29,9 @@ Set the current log phase. 
log_phase_get() log_read(path) } +\seealso{ +Other phase: +\code{\link{log_phase_get}()}, +\code{\link{log_phase_reset}()} +} +\concept{phase} diff --git a/man/log_plot.Rd b/man/log_plot.Rd index 5ecdf92..c330ea1 100644 --- a/man/log_plot.Rd +++ b/man/log_plot.Rd @@ -49,3 +49,13 @@ in a single log file. log_plot(log, metric = "cpu") unlink(path) } +\seealso{ +Other log: +\code{\link{log_active}()}, +\code{\link{log_print}()}, +\code{\link{log_read}()}, +\code{\link{log_start}()}, +\code{\link{log_stop}()}, +\code{\link{log_support}()} +} +\concept{log} diff --git a/man/log_print.Rd b/man/log_print.Rd index 39b3f09..e15624b 100644 --- a/man/log_print.Rd +++ b/man/log_print.Rd @@ -62,3 +62,13 @@ Used for debugging and testing only. Not for users. log_read(path) unlink(path) } +\seealso{ +Other log: +\code{\link{log_active}()}, +\code{\link{log_plot}()}, +\code{\link{log_read}()}, +\code{\link{log_start}()}, +\code{\link{log_stop}()}, +\code{\link{log_support}()} +} +\concept{log} diff --git a/man/log_read.Rd b/man/log_read.Rd index af0171b..f30a26f 100644 --- a/man/log_read.Rd +++ b/man/log_read.Rd @@ -102,3 +102,13 @@ into the units \code{units_time}, log_read(path) unlink(path) } +\seealso{ +Other log: +\code{\link{log_active}()}, +\code{\link{log_plot}()}, +\code{\link{log_print}()}, +\code{\link{log_start}()}, +\code{\link{log_stop}()}, +\code{\link{log_support}()} +} +\concept{log} diff --git a/man/log_start.Rd b/man/log_start.Rd index 222d957..36bf786 100644 --- a/man/log_start.Rd +++ b/man/log_start.Rd @@ -73,3 +73,13 @@ the first one. 
log_read(path) unlink(path) } +\seealso{ +Other log: +\code{\link{log_active}()}, +\code{\link{log_plot}()}, +\code{\link{log_print}()}, +\code{\link{log_read}()}, +\code{\link{log_stop}()}, +\code{\link{log_support}()} +} +\concept{log} diff --git a/man/log_stop.Rd b/man/log_stop.Rd index 71161fb..5f1d0e0 100644 --- a/man/log_stop.Rd +++ b/man/log_stop.Rd @@ -30,3 +30,13 @@ lines in the log even after you call \code{\link[=log_stop]{log_stop()}}. log_read(path) unlink(path) } +\seealso{ +Other log: +\code{\link{log_active}()}, +\code{\link{log_plot}()}, +\code{\link{log_print}()}, +\code{\link{log_read}()}, +\code{\link{log_start}()}, +\code{\link{log_support}()} +} +\concept{log} diff --git a/man/log_support.Rd b/man/log_support.Rd index 250c1d0..80c9ecd 100644 --- a/man/log_support.Rd +++ b/man/log_support.Rd @@ -21,3 +21,13 @@ and the \code{nanosleep()} C function. \examples{ log_support() } +\seealso{ +Other log: +\code{\link{log_active}()}, +\code{\link{log_plot}()}, +\code{\link{log_print}()}, +\code{\link{log_read}()}, +\code{\link{log_start}()}, +\code{\link{log_stop}()} +} +\concept{log} diff --git a/src/error.h b/src/constant.h similarity index 72% rename from src/error.h rename to src/constant.h index 2aada3e..a02b211 100644 --- a/src/error.h +++ b/src/constant.h @@ -1,6 +1,7 @@ -#ifndef ERROR_H -#define ERROR_H +#ifndef CONSTANT_H +#define CONSTANT_H +#define BUFFER_SIZE 2048 #define ERROR_ARITHMETIC 10000 #define ERROR_BUFFER 10001 #define ERROR_FILE 10002 diff --git a/src/export.c b/src/export.c index b8cb716..f9b0f85 100644 --- a/src/export.c +++ b/src/export.c @@ -1,6 +1,8 @@ -#include "autometric.h" +#include "log.h" +#include "stat.h" static const R_CallMethodDef call_methods[] = { + {"r_dir_stat", (DL_FUNC) &dir_stat, 1}, {"r_log_active", (DL_FUNC) &log_active, 0}, {"r_log_phase_get", (DL_FUNC) &log_phase_get, 0}, {"r_log_phase_reset", (DL_FUNC) &log_phase_reset, 0}, diff --git a/src/autometric.c b/src/log.c similarity index 96% rename from 
src/autometric.c rename to src/log.c index e372f3b..de0bf8e 100644 --- a/src/autometric.c +++ b/src/log.c @@ -1,6 +1,6 @@ -#include "autometric.h" +#include "log.h" #include "support.h" -#if SUPPORT +#if SUPPORT_LOG #include "metrics.h" #include "thread.h" @@ -10,7 +10,7 @@ SEXP log_active(void) { } SEXP log_phase_get(void) { - char phase[PHASE_N]; + char phase[BUFFER_SIZE]; pthread_phase_get(phase); SEXP out = PROTECT(mkString(phase)); UNPROTECT(1); @@ -41,7 +41,7 @@ SEXP log_print( const int* pids_ = INTEGER(pids); const int n_pids_ = INTEGER(n_pids)[0]; const char** names_ = (const char**) malloc(n_pids_ * sizeof(char*)); - char phase[PHASE_N]; + char phase[BUFFER_SIZE]; if (names_ == NULL) { return R_NilValue; } diff --git a/src/autometric.h b/src/log.h similarity index 100% rename from src/autometric.h rename to src/log.h diff --git a/src/metrics.h b/src/metrics.h index c426399..76e8650 100644 --- a/src/metrics.h +++ b/src/metrics.h @@ -1,7 +1,7 @@ #ifndef METRICS_H #define METRICS_H -#include "error.h" +#include "constant.h" #include "timers.h" #include #include diff --git a/src/metrics_linux.c b/src/metrics_linux.c index 6d46f22..31718f7 100644 --- a/src/metrics_linux.c +++ b/src/metrics_linux.c @@ -6,7 +6,7 @@ */ #include "metrics.h" -#if LINUX +#if OS_LINUX #include #include diff --git a/src/metrics_mac.c b/src/metrics_mac.c index 5e4b653..dd45905 100644 --- a/src/metrics_mac.c +++ b/src/metrics_mac.c @@ -9,7 +9,7 @@ */ #include "metrics.h" -#if MAC +#if OS_MAC #include #include diff --git a/src/metrics_windows.c b/src/metrics_windows.c index 5187cc7..03cffd8 100644 --- a/src/metrics_windows.c +++ b/src/metrics_windows.c @@ -6,7 +6,7 @@ */ #include "metrics.h" -#if WINDOWS +#if OS_WINDOWS #define PSAPI_VERSION 1 #include diff --git a/src/stat.c b/src/stat.c new file mode 100644 index 0000000..6f9cc0f --- /dev/null +++ b/src/stat.c @@ -0,0 +1,78 @@ +#include "stat.h" + +#if SUPPORT_READDIR + +#if OS_MAC + #define STAT_MTIME st_mtimespec +#else + #define 
STAT_MTIME st_mtim +#endif + +SEXP dir_stat(SEXP path) { + const char* parent = CHAR(STRING_ELT(path, 0)); + DIR* handle = opendir(parent); + if (handle == NULL) { + Rf_error("opendir() failed on %s: %s", parent, strerror(errno)); + } + int capacity = BUFFER_SIZE; + int count = 0; + int stat_result; + SEXP file; + SEXP size; + SEXP mtime; + PROTECT_INDEX index_path; + PROTECT_INDEX index_size; + PROTECT_INDEX index_mtime; + PROTECT_WITH_INDEX(file = allocVector(STRSXP, capacity), &index_path); + PROTECT_WITH_INDEX(size = allocVector(REALSXP, capacity), &index_size); + PROTECT_WITH_INDEX(mtime = allocVector(REALSXP, capacity), &index_mtime); + struct dirent *entry; + struct stat stat_info; + char buffer[BUFFER_SIZE]; + while ((entry = readdir(handle)) != NULL) { + R_CheckUserInterrupt(); + snprintf(buffer, BUFFER_SIZE, "%s/%s", parent, entry->d_name); + if (entry->d_type == DT_REG || entry->d_type == DT_LNK) { + stat_result = stat(buffer, &stat_info); + } else { + continue; + } + if (stat_result == -1) { + continue; + } + SET_STRING_ELT(file, count, mkChar(buffer)); + REAL(size)[count] = (double) stat_info.st_size; + REAL(mtime)[count] = (double) stat_info.STAT_MTIME.tv_sec + + 1e-9 * (double) stat_info.STAT_MTIME.tv_nsec; + ++count; + if (count == capacity) { + capacity *= 2; + REPROTECT(file = Rf_xlengthgets(file, capacity), index_path); + REPROTECT(size = Rf_xlengthgets(size, capacity), index_size); + REPROTECT(mtime = Rf_xlengthgets(mtime, capacity), index_mtime); + } + } + closedir(handle); + REPROTECT(file = Rf_xlengthgets(file, count), index_path); + REPROTECT(size = Rf_xlengthgets(size, count), index_size); + REPROTECT(mtime = Rf_xlengthgets(mtime, count), index_mtime); + SEXP result = PROTECT(allocVector(VECSXP, 3)); + SEXP names = PROTECT(allocVector(STRSXP, 3)); + SET_STRING_ELT(names, 0, mkChar("path")); + SET_STRING_ELT(names, 1, mkChar("size")); + SET_STRING_ELT(names, 2, mkChar("mtime")); + SET_VECTOR_ELT(result, 0, file); + SET_VECTOR_ELT(result, 1, 
size); + SET_VECTOR_ELT(result, 2, mtime); + setAttrib(result, R_NamesSymbol, names); + UNPROTECT(5); + return result; +} + +#else + +SEXP dir_stat(SEXP path) { + return R_NilValue; +} + +#endif diff --git a/src/stat.h b/src/stat.h new file mode 100644 index 0000000..c824ffb --- /dev/null +++ b/src/stat.h @@ -0,0 +1,18 @@ +#ifndef STAT_H +#define STAT_H + +#include +#include +#include "support.h" + +#if SUPPORT_READDIR + #include "constant.h" + #include + #include + #include + #include +#endif + +SEXP dir_stat(SEXP path); + +#endif diff --git a/src/support.h b/src/support.h index 1b65ea0..c24a4a1 100644 --- a/src/support.h +++ b/src/support.h @@ -4,46 +4,72 @@ #include #include -#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__) -#define WINDOWS 1 -#else -#define WINDOWS 0 +#ifndef OS_WINDOWS + #if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || \ + defined(__NT__) + #define OS_WINDOWS 1 + #else + #define OS_WINDOWS 0 + #endif #endif -#if defined(__APPLE__) && defined(__MACH__) -#define MAC 1 -#else -#define MAC 0 +#ifndef OS_MAC + #if defined(__APPLE__) && defined(__MACH__) + #define OS_MAC 1 + #else + #define OS_MAC 0 + #endif #endif -#if defined(__linux__) -#define LINUX 1 -#else -#define LINUX 0 +#ifndef OS_LINUX + #if defined(__linux__) + #define OS_LINUX 1 + #else + #define OS_LINUX 0 + #endif #endif -#if defined(WINDOWS) || defined(MAC) || defined(LINUX) -#define SUPPORTED_OS 1 -#else -#define SUPPORTED_OS 0 +#ifndef SUPPORT_OS + #if defined(OS_WINDOWS) || defined(OS_MAC) || defined(OS_LINUX) + #define SUPPORT_OS 1 + #else + #define SUPPORT_OS 0 + #endif #endif -#if (defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)) || MAC -#define TIMERS 1 -#else -#define TIMERS 0 +#ifndef SUPPORT_TIMERS + #if (defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)) || OS_MAC + #define SUPPORT_TIMERS 1 + #else + #define SUPPORT_TIMERS 0 + #endif #endif -#ifdef _POSIX_THREADS -#define THREADS 1 -#else -#define THREADS 0 +#ifndef SUPPORT_THREADS + 
#ifdef _POSIX_THREADS + #define SUPPORT_THREADS 1 + #else + #define SUPPORT_THREADS 0 + #endif #endif -#if SUPPORTED_OS && THREADS && (TIMERS || WINDOWS) -#define SUPPORT 1 -#else -#define SUPPORT 0 +#ifndef SUPPORT_LOG + #if SUPPORT_OS && SUPPORT_THREADS && (SUPPORT_TIMERS || OS_WINDOWS) + #define SUPPORT_LOG 1 + #else + #define SUPPORT_LOG 0 + #endif +#endif + +#ifndef SUPPORT_READDIR + #define SUPPORT_READDIR 0 + #ifdef _POSIX_VERSION + #if _POSIX_C_SOURCE >= 200809L || OS_MAC + #undef SUPPORT_READDIR + #define SUPPORT_READDIR 1 + #define _XOPEN_SOURCE 600 + #endif + #endif #endif #endif diff --git a/src/thread.c b/src/thread.c index 299742d..67f5df1 100644 --- a/src/thread.c +++ b/src/thread.c @@ -1,11 +1,11 @@ #include "support.h" -#if SUPPORT +#if SUPPORT_LOG #include "thread.h" pthread_mutex_t run_mutex = PTHREAD_MUTEX_INITIALIZER; int run_flag = 0; -char run_phase[PHASE_N] = PHASE_DEFAULT; +char run_phase[BUFFER_SIZE] = PHASE_DEFAULT; pthread_args_t* pthread_args_init( SEXP path, @@ -87,8 +87,8 @@ void pthread_phase_reset(void) { void pthread_phase_set(const char* phase) { pthread_mutex_lock(&run_mutex); - strncpy(run_phase, phase, PHASE_N - 1); - run_phase[PHASE_N - 1] = '\0'; + strncpy(run_phase, phase, BUFFER_SIZE - 1); + run_phase[BUFFER_SIZE - 1] = '\0'; pthread_mutex_unlock(&run_mutex); } @@ -101,7 +101,7 @@ int pthread_run_flag_get(void) { } void* pthread_run(void* arg) { - char phase[PHASE_N]; + char phase[BUFFER_SIZE]; pthread_args_t* args = (pthread_args_t*) arg; time_spec_t sleep_spec = time_spec_init(args->seconds, args->nanoseconds); metrics_t* metrics_array = metrics_array_init(args->n_pids); diff --git a/src/thread.h b/src/thread.h index d198e0f..0c99ed5 100644 --- a/src/thread.h +++ b/src/thread.h @@ -2,7 +2,7 @@ #define THREAD_H #include "support.h" -#if SUPPORT +#if SUPPORT_LOG #include "metrics.h" #include @@ -15,7 +15,6 @@ #include #define PHASE_DEFAULT "__DEFAULT__" -#define PHASE_N 256 typedef struct { char* path; diff --git 
a/src/timers.c b/src/timers.c index 7640d83..f0add35 100644 --- a/src/timers.c +++ b/src/timers.c @@ -1,6 +1,6 @@ #include "timers.h" -#if TIMERS +#if SUPPORT_TIMERS double seconds_current(void) { struct timespec time; @@ -25,7 +25,7 @@ void sleep_interval(const time_spec_t time_spec) { nanosleep(&time_spec, NULL); } -#elif WINDOWS +#elif OS_WINDOWS time_spec_t time_spec_init( const int seconds, diff --git a/src/timers.h b/src/timers.h index 87606f6..b91fa36 100644 --- a/src/timers.h +++ b/src/timers.h @@ -3,11 +3,11 @@ #include "support.h" -#if TIMERS +#if SUPPORT_TIMERS typedef struct timespec time_spec_t; -#elif WINDOWS +#elif OS_WINDOWS #include diff --git a/src/version.h b/src/version.h index 98a02b3..8893203 100644 --- a/src/version.h +++ b/src/version.h @@ -1 +1 @@ -#define VERSION "0.1.2.9000" +#define VERSION "0.1.2.9001"