✨ Track artifacts as inputs #124

Merged 6 commits on Dec 2, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
# laminr v0.2.1

## NEW FUNCTIONALITY

- Allow tracking of artifacts loaded for non-default instances (PR #124)

## BUG FIXES

- Allow connecting to private LaminDB instances (PR #118)
23 changes: 21 additions & 2 deletions R/Artifact.R
@@ -22,11 +22,30 @@ ArtifactRecord <- R6::R6Class( # nolint object_name_linter
load_file(file_path, suffix, ...)
},
#' @description
#' Cache the artifact to the local filesystem. This currently only supports
#' S3 storage.
#' Cache the artifact to the local filesystem. When the Python `lamindb`
#' package is not available this only supports S3 storage.
#'
#' @return The path to the cached artifact
cache = function() {

py_lamin <- private$.instance$get_py_lamin()
if (!is.null(py_lamin)) {
Member commented:

Is there ever a case in which the Python package is not available? We already needed it for `lamin connect` etc. -- so I think it's always going to be available?

Collaborator commented:

Not necessarily -- if the user provides a correct `current_user.env`, they could still connect to a lamindb instance without the Python client. They just won't be able to track, create or delete records at this stage.

Collaborator (author) commented:

I guess you need the CLI installed somewhere to do that, but that doesn't necessarily mean it's in an environment that {reticulate} can find.

Member commented:

Ok!

if (isTRUE(private$.instance$is_default)) {
py_artifact <- py_lamin$Artifact$get(self$uid)
} else {
instance_settings <- private$.instance$get_settings()
slug <- paste0(instance_settings$owner, "/", instance_settings$name)

py_artifact <- py_lamin$Artifact$using(slug)$get(self$uid)
}
return(py_artifact$cache()$path)
}

cli::cli_warn(paste(
"The Python {.pkg lamindb} package is not available.",
"Loaded artifacts will not be tracked."
))

# assume that an artifact will have a storage field,
# and that the storage field will have a type field
artifact_storage <- private$get_value("storage")
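
A minimal sketch of the dispatch added to `cache()` above, written so it runs without Python. The names `make_mock_lamin` and `cache_via_python` are hypothetical and not part of {laminr}; the mock only imitates the `Artifact$get()`, `Artifact$using()` and `cache()$path` calls that appear in the diff, and the returned paths are made up for illustration.

```r
# Hypothetical stand-in for the Python `lamindb` module. In the PR the real
# object comes from private$.instance$get_py_lamin().
make_mock_lamin <- function() {
  make_registry <- function(source) {
    list(
      get = function(uid) {
        # Fake artifact whose cache() reports where it "was" cached
        list(cache = function() list(path = file.path("/cache", source, uid)))
      }
    )
  }
  artifact <- make_registry("default")
  # using(slug) returns a registry bound to another instance
  artifact$using <- function(slug) make_registry(slug)
  list(Artifact = artifact)
}

# Mirrors the new branch in cache(): returns a path when Python is available,
# or NULL to signal that the caller should fall back to the R-only code path.
cache_via_python <- function(py_lamin, uid, is_default, settings = NULL) {
  if (is.null(py_lamin)) {
    return(NULL)
  }
  if (isTRUE(is_default)) {
    py_artifact <- py_lamin$Artifact$get(uid)
  } else {
    # Non-default instances are addressed by their "owner/name" slug
    slug <- paste0(settings$owner, "/", settings$name)
    py_artifact <- py_lamin$Artifact$using(slug)$get(uid)
  }
  py_artifact$cache()$path
}

py_lamin <- make_mock_lamin()
other <- list(owner = "some-owner", name = "some-instance")
default_path <- cache_via_python(py_lamin, "abc", is_default = TRUE)
other_path <- cache_via_python(py_lamin, "abc", is_default = FALSE, settings = other)
no_python <- cache_via_python(NULL, "abc", is_default = TRUE)
```

Returning `NULL` when Python is missing is a choice made for this sketch; the method in the diff instead warns and continues with the storage-based logic that follows.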
32 changes: 16 additions & 16 deletions R/Instance.R
@@ -59,23 +59,23 @@ create_instance <- function(instance_settings, is_default = FALSE) {
active = active
)

py_lamin <- NULL
if (isTRUE(is_default)) {
check_requires("Connecting to Python", "reticulate", type = "warning")

py_lamin <- tryCatch(
reticulate::import("lamindb"),
error = function(err) {
cli::cli_warn(c(
paste(
"Failed to connect to the Python {.pkg lamindb} package,",
"you will not be able to create records"
),
"i" = "See {.run reticulate::py_config()} for more information"
))
NULL
}
)
py_lamin <- NULL
check_requires("Connecting to Python", "reticulate", type = "warning")
py_lamin <- tryCatch(
reticulate::import("lamindb"),
error = function(err) {
NULL
}
)
if (isTRUE(is_default) && is.null(py_lamin)) {
cli::cli_warn(c(
paste(
"Default instance failed to connect to the Python {.pkg lamindb} package,",
"you will not be able to create records"
),
"i" = "See {.run reticulate::py_config()} for more information"
))
}

# create the instance
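
The change above can be read as "always attempt the import, but only warn for the default instance". A minimal sketch under that reading follows. `try_import_lamin` and its `importer` argument are hypothetical, introduced so the example runs without {reticulate}; the PR calls `reticulate::import("lamindb")` directly and warns with `cli::cli_warn()` rather than base `warning()`.

```r
# `importer` is a hypothetical injection point for the import call
try_import_lamin <- function(importer, is_default = FALSE) {
  # Any import failure is swallowed and represented as NULL
  py_lamin <- tryCatch(importer(), error = function(err) NULL)

  # Only the default instance warns, since it is the one used to create records
  if (isTRUE(is_default) && is.null(py_lamin)) {
    warning(
      "Default instance failed to connect to the Python lamindb package, ",
      "you will not be able to create records"
    )
  }
  py_lamin
}

failing_import <- function() stop("module not found")
working_import <- function() "lamindb-module"

# Record whether a warning was raised instead of printing it
outcome <- function(importer, is_default) {
  tryCatch(
    {
      try_import_lamin(importer, is_default)
      "silent"
    },
    warning = function(w) "warned"
  )
}

default_failure <- outcome(failing_import, TRUE)
other_failure <- outcome(failing_import, FALSE)
default_success <- try_import_lamin(working_import, TRUE)
```

With this shape, a non-default instance still gets a usable `py_lamin` whenever the import succeeds, which is what lets `cache()` track artifacts loaded from other instances.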
1 change: 1 addition & 0 deletions vignettes/development.qmd
@@ -77,6 +77,7 @@ This document outlines the features of the **{laminr}** package and the roadmap
* [x] **Track code execution**: Automatically track the execution of R scripts and notebooks.
* [ ] **Capture run context**: Record information about the execution environment (e.g., package versions, parameters).
* [x] **Link code to artifacts**: Associate code execution with generated artifacts.
- [x] Link to artifacts loaded from other instances
* [ ] **Visualize data lineage**: Create visualizations of data lineage and dependencies.
* [x] **Finalize tracking**: End and save a run.
