Add usage vignette #18

Merged
merged 6 commits into from
Oct 11, 2024
20 changes: 11 additions & 9 deletions CHANGELOG.md
@@ -6,9 +6,7 @@

* Read user settings from env file created by lamin Python package (PR #2, PR #8).

* Render a pkgdown website (PR #13).

* Add `to_string()` and `print()` methods to the `Record` class and (incomplete) `describe()` method to the `Artifact()` class (PR #22)
* Add `to_string()` and `print()` methods to the `Record` class and (incomplete) `describe()` method to the `Artifact()` class (PR #22).

## MAJOR CHANGES

@@ -23,19 +21,23 @@

## MINOR CHANGES

* Update `README` with new set up instructions and simplify (PR #14).

* Do not complain when foreign keys are not found in a record, but also do not complain when they are (PR #13).

* Further simplify the `README`, and move the detailed usage description to a separate vignette (PR #13).
* Define a current user and current instance with lamin-cli prior to testing and generating documentation in the CI (PR #23).

* Add a simple unit test which queries laminlabs/lamindata (PR #27).

## DOCUMENTATION

* Update `README` with new set up instructions and simplify (PR #14).

* Add a `pkgdown` website to the project (PR #13).

* Generate vignettes using Quarto (PR #13).
* Further simplify the `README`, and move the detailed usage description to a separate vignette (PR #13).

* Define a current user and current instance with lamin-cli prior to testing and generating documentation in the CI (PR #23).
* Generate vignettes using Quarto (PR #13).

* Add a simple unit test which queries laminlabs/lamindata (PR #27).
* Add vignette to showcase laminr usage (PR #18).

## BUG FIXES

111 changes: 111 additions & 0 deletions vignettes/usage.Rmd
@@ -0,0 +1,111 @@
---
title: "Usage"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Usage}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(laminr)
```

LaminDB is an open-source data framework for biology. You can find out about some of its features in the [documentation of the lamindb Python package](https://docs.lamin.ai/introduction).

This vignette will show you how to use the `laminr` package to interact with LaminDB.

## Initial setup

For first-time setup, you will need to install `laminr` and the Python `lamin-cli` package, and then connect to an instance.

```bash
pip install lamin-cli
lamin connect laminlabs/cellxgene
```

```R
install.packages("remotes")
remotes::install_github("laminlabs/laminr")
```
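
Before connecting, it can help to confirm that both pieces are visible from your R session. The following is a minimal sketch in base R. It assumes the `lamin-cli` executable is called `lamin` (as in the command above) and is on the `PATH` that R sees; `check_prerequisites()` is a hypothetical helper, not part of `laminr`.

```R
# Hypothetical helper (not part of laminr): report whether the laminr package
# and the `lamin` command-line tool can be found from this R session.
check_prerequisites <- function() {
  c(
    laminr = requireNamespace("laminr", quietly = TRUE),
    # Sys.which() returns "" when the executable is not on the PATH
    lamin_cli = nzchar(Sys.which("lamin"))[[1]]
  )
}

status <- check_prerequisites()
if (!all(status)) {
  message("Missing: ", paste(names(status)[!status], collapse = ", "))
}
```

If `lamin-cli` was installed into a Python virtual environment, the check may report it as missing until that environment is activated for the R session.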

## Connect to a LaminDB instance

This vignette uses the [`laminlabs/cellxgene`](https://lamin.ai/laminlabs/cellxgene) instance, a LaminDB instance that interfaces with the CELLxGENE data.

You can connect to the instance using the `connect` R function:

```{r connect}
db <- connect("laminlabs/cellxgene")
```

By printing the instance, you can see which registries are available, including Artifact, Collection, Feature, etc. Each of these registries has a corresponding [Python class](https://docs.lamin.ai/lamindb).

```{r print_instance}
db
```

All of the 'core' registries are directly available from the `db` object, while registries from other modules can be accessed via `db$<module_name>`, e.g.:

```{r get_module}
db$bionty
```

The `bionty` and other registries also have corresponding [Python classes](https://docs.lamin.ai/bionty).

## Registry

A registry is used to query, store and manage data. For instance, the `Artifact` registry stores datasets and models as files, folders, or arrays.

You can see which functions you can use to interact with the registry by printing the registry object:

```{r get_artifact_registry}
db$Artifact
```

You can fetch an Artifact by ID or UID. For example, Artifact [KBW89Mf7IGcekja2hADu](https://lamin.ai/laminlabs/cellxgene/artifact/KBW89Mf7IGcekja2hADu) is an AnnData object containing myeloid cells.

```{r get_artifact}
artifact <- db$Artifact$get("KBW89Mf7IGcekja2hADu")
```

You can view its metadata by printing the object:

```{r print_artifact}
artifact
```

Or get more detailed information by calling the `$describe()` method:

```{r describe_artifact}
artifact$describe()
```

You can access its fields as follows:

* `artifact$id`: `r artifact$id`
* `artifact$uid`: `r artifact$uid`
* `artifact$key`: `r artifact$key`

Or fetch data from related registries:

* `artifact$storage`: `r artifact$storage$to_string()`
* `artifact$created_by`: `r artifact$created_by$to_string()`
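
The field access shown above can be wrapped in a small function when you need the same metadata for several artifacts. This is only a sketch: `artifact_summary()` is a hypothetical helper, and it relies solely on the `$get()` method and the `$id`, `$uid` and `$key` fields used in this vignette.

```R
# Hypothetical helper (not part of laminr): fetch an artifact by UID and
# collect a few of its fields into a plain named list.
artifact_summary <- function(db, uid) {
  artifact <- db$Artifact$get(uid)
  list(
    id = artifact$id,
    uid = artifact$uid,
    key = artifact$key
  )
}

# Usage (requires a connected instance):
# db <- laminr::connect("laminlabs/cellxgene")
# str(artifact_summary(db, "KBW89Mf7IGcekja2hADu"))
```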

Finally, for Artifact objects, you can download the data to a local cache with `$cache()` or load it into memory with `$load()`.

```{r cache_artifact}
artifact$cache()
artifact$load()
```

:::{.callout-note}
Only S3 storage and AnnData accessors are supported at the moment. If you need additional storage backends or data accessors, please open an issue on the [laminr GitHub repository](https://github.com/laminlabs/laminr/issues).
:::
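
Because support is limited, a script that loops over many artifacts may want to skip the ones it cannot load instead of stopping. The sketch below assumes that an unsupported artifact makes `$load()` signal an ordinary R error, which this vignette does not confirm; `safe_load()` is a hypothetical helper, not part of `laminr`.

```R
# Hypothetical helper (not part of laminr): try to load an artifact and
# return NULL with a warning if loading fails.
# Assumption: unsupported storage or data types raise a regular R error.
safe_load <- function(artifact) {
  tryCatch(
    artifact$load(),
    error = function(e) {
      warning("Could not load artifact: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Usage (requires a connected instance):
# adata <- safe_load(db$Artifact$get("KBW89Mf7IGcekja2hADu"))
```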