Lensed UMAP provides three methods that apply lens functions to a UMAP model. Lens functions can be used to untangle embeddings along a particular dimension. This dimension may be part of the data or come from another information source. Using lens functions, analysts can adapt their UMAP models to the questions they are investigating, effectively viewing their data from different perspectives.
The lensed UMAP package provides functions that operate on (fitted) UMAP objects.
import numpy as np
import pandas as pd
from umap import UMAP
import lensed_umap as lu
import matplotlib.pyplot as plt
# Load data and extract lens
df = pd.read_csv("./data/five_circles.csv", header=0)
lens = np.log(df.hue)
# Compute initial UMAP model
projector = UMAP(
    repulsion_strength=0.1,  # To avoid tears in projection that
    negative_sample_rate=2,  # are not in the modelled graph!
).fit(df[["x", "y"]])
# Draw initial model
x, y = lu.extract_embedding(projector)
plt.scatter(x, y, 2, lens, cmap="viridis")
plt.axis("off")
plt.show()
# Apply a global lens
lensed = lu.apply_lens(projector, lens, resolution=6)
x, y = lu.extract_embedding(lensed)
plt.scatter(x, y, 2, lens, cmap="viridis")
plt.axis("off")
plt.show()
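To give some intuition for what a global lens does to the model, here is a minimal sketch, not the package's implementation. It assumes the lens is discretized into `resolution` equal-width bins and that edges of the neighbour graph are kept only when both endpoints fall in the same or adjacent bins. The function name `filter_edges_by_lens`, the edge-list representation, and the equal-width binning are all hypothetical choices made for illustration.

```python
import numpy as np


def filter_edges_by_lens(rows, cols, lens, resolution):
    """Hypothetical helper: keep edges between the same or adjacent lens bins."""
    lens = np.asarray(lens, dtype=float)
    # Split the lens range into `resolution` equal-width bins (an assumption).
    boundaries = np.linspace(lens.min(), lens.max(), resolution + 1)
    # Use interior boundaries only, so bin indices run from 0 to resolution - 1.
    bins = np.digitize(lens, boundaries[1:-1])
    # An edge survives when its endpoints are at most one bin apart.
    keep = np.abs(bins[rows] - bins[cols]) <= 1
    return rows[keep], cols[keep]


# Toy graph: five points with lens values and five undirected edges.
toy_lens = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
toy_rows = np.array([0, 0, 1, 2, 0])
toy_cols = np.array([1, 2, 3, 3, 4])

kept_rows, kept_cols = filter_edges_by_lens(toy_rows, toy_cols, toy_lens, resolution=3)
print(list(zip(kept_rows.tolist(), kept_cols.tolist())))
```

With three bins, the edges (1, 3) and (0, 4) connect points at opposite ends of the lens range and are dropped, while (0, 1), (0, 2), and (2, 3) remain. Under this reading, a higher `resolution` gives narrower bins and removes more edges, which would separate the embedding more strongly along the lens dimension.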
A notebook demonstrating how the package works is available at How lensed UMAP Works. The other notebooks demonstrate lenses on several datasets and contain the analyses presented in our paper. The datasets we used as input and the data generated by our notebooks are stored using git lfs, which turns the files in this repository into versioned links to the actual data files. The git lfs documentation explains how to retrieve the actual data files.
lensed_umap is available on PyPI:
pip install lensed_umap
A scientific paper describing our work is available on arXiv:
@misc{bot2024lens,
title={Lens functions for exploring UMAP Projections with Domain Knowledge},
author={Daniel M. Bot and Jan Aerts},
year={2024},
eprint={2405.09204},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
The lensed UMAP package is released under the 3-Clause BSD license.