Add feature to calculate similarity score between two embeddings without storing them in a collection #804

mocobeta · 2025-02-28T09:47:23Z

Embeddings are great, but they are only meaningful in relation to each other.
When trying out embedding models, I thought it would be helpful to have a command that directly calculates the similarity score between two pieces of content without storing them in a collection.

This PR adds a command, embed-score, which takes two pieces of content and returns their cosine similarity score (plus the actual embeddings when the given format is 'json').

# Basic usage
llm embed-score -c1 "I like pelicans" -c2 "I love pelicans" -m 3-small
0.9376833959553552
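For readers unfamiliar with the metric, here is a minimal sketch of how a cosine similarity score between two embedding vectors can be computed. It is not the code from this PR: the function name, the input checks, and the toy vectors are assumptions for illustration only, and real embeddings would come from the selected model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length numeric vectors.

    Illustrative helper only; not taken from this PR's implementation.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors standing in for real embeddings (hypothetical values).
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, which is why two near-paraphrases such as the pelican sentences above produce a value near 1.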

The new command reflects my rough intention, but I'm fully open to suggestions on the interface or implementation, if this feature is worth having in this tool.

Confession: This is a collaborative work with Cline and Anthropic's Claude 3.7. I prompted Cline to write the new function I wanted and post-edited the generated code. Claude did a great job but couldn't produce unit tests that worked and aligned with the existing fixtures (in a reasonable amount of time). The whole process greatly helped me understand the codebase.

Still to do:

Add documentation about the new command

mocobeta added 4 commits February 28, 2025 18:03

add command to calculate similarity score between two contents

3ed2382

update help message

d8b5c46

update docs/help.md

144e7d1

run black

99c270f
