feat: Enhance ai-cache Plugin with Vector Similarity-Based LLM Cache Recall and Multi-DB Support #1248

Merged on Nov 21, 2024 (92 commits). This view shows changes from 3 commits.

Commits (92):
4f7bfbd
fix bugs
johnlanni Jul 31, 2024
0f9e816
fix bugs
Suchun-sv Aug 1, 2024
ff1bce6
fix bugs
Suchun-sv Aug 12, 2024
1e9d42e
init
EnableAsync Aug 15, 2024
f2a9ff6
fix conflict
Suchun-sv Aug 23, 2024
5cbae03
Merge branch 'alibaba:main' into main
Suchun-sv Aug 23, 2024
27b2f71
alter some errors
Suchun-sv Aug 24, 2024
130f2ee
fix: embedding error
EnableAsync Aug 24, 2024
56314d7
fix bugs && update interface design
Suchun-sv Aug 24, 2024
3d7e85c
feat: add elasticsearch
EnableAsync Aug 25, 2024
85549d0
fix bugs && refine the variable names
Suchun-sv Aug 25, 2024
8444f5e
update design for cache to support extension
Suchun-sv Aug 25, 2024
a655bc4
Merge branch 'alibaba:main' into main
Suchun-sv Sep 5, 2024
57bc863
Merge branch 'alibaba:main' into feat/chroma
Suchun-sv Sep 5, 2024
d68fa88
Refined the code; README.md content needs to be updated.
Suchun-sv Sep 5, 2024
d6c643f
add: makefile for weaviate
EnableAsync Sep 6, 2024
3f3a1bc
feat: add weaviate
EnableAsync Sep 6, 2024
71cc25b
feat: add pinecone
EnableAsync Sep 6, 2024
5179392
fix bugs, README.md to be updated
Suchun-sv Sep 6, 2024
ece7e2f
fix bugs, refine variable name, update README.md
Suchun-sv Sep 6, 2024
e868a1a
Merge branch 'alibaba:main' into main
Suchun-sv Sep 6, 2024
138a526
delete folder
Suchun-sv Sep 6, 2024
65aafbd
Merge branch 'feat/chroma' of https://github.com/Suchun-sv/higress in…
EnableAsync Sep 6, 2024
bfaed4c
fix: format
EnableAsync Sep 6, 2024
e8ad550
fix typos
Suchun-sv Sep 6, 2024
95b06b7
Merge branch 'alibaba:main' into feat/chroma
Suchun-sv Sep 6, 2024
a40f5e9
update
EnableAsync Sep 6, 2024
c83f5c4
fix typos
Suchun-sv Sep 6, 2024
f3d3292
change append to appendMsg
Suchun-sv Sep 6, 2024
b0cf29d
fix bugs and refine code
Suchun-sv Sep 11, 2024
4a18f96
Merge branch 'main' into main
Suchun-sv Sep 11, 2024
21c9a79
fix bugs and update the SetEx function
Suchun-sv Sep 12, 2024
1767896
Merge branch 'main' into main
Suchun-sv Sep 12, 2024
71b9530
Optimize query flow logic (not fully tested)
Suchun-sv Sep 17, 2024
51b9ccc
Fix bugs and verify removal of cache setting
Suchun-sv Sep 21, 2024
3583bc9
fix bugs and update logic as requested
Suchun-sv Sep 21, 2024
10cc7ef
Merge branch 'main' into main
Suchun-sv Oct 10, 2024
36ca3f1
Merge branch 'alibaba:main' into main
Suchun-sv Oct 13, 2024
c261583
add cacheKeyStrategy and enableSemanticCache
Suchun-sv Oct 14, 2024
fa22d63
add cacheKeyStrategy and enableSemanticCache
Suchun-sv Oct 14, 2024
9145132
Vector or cache database must be configured
Suchun-sv Oct 14, 2024
ef443bf
new version envoy
EnableAsync Oct 18, 2024
14a2a3d
fix: GetContext type
EnableAsync Oct 18, 2024
b862ef9
feat: chroma
EnableAsync Oct 18, 2024
7bc5f65
merge
EnableAsync Oct 18, 2024
303f6ed
feat: weaviate
EnableAsync Oct 18, 2024
fb2c26c
fix: clean useless code
EnableAsync Oct 18, 2024
8486555
fix: clean useless code
EnableAsync Oct 18, 2024
e9a14d8
feat: es
EnableAsync Oct 18, 2024
32eccd7
feat: pinecone
EnableAsync Oct 18, 2024
e6f700c
feat: chroma dasvector es pinecone weaviate
EnableAsync Oct 18, 2024
02bc9a2
Merge remote-tracking branch 'origin/main' into feat/chroma
EnableAsync Oct 18, 2024
440cd8d
fix: bugs
EnableAsync Oct 18, 2024
628b74b
fix: bugs
EnableAsync Oct 18, 2024
342bd94
fix: remove uesless files
EnableAsync Oct 18, 2024
cbeb71b
fix: remove uesless files
EnableAsync Oct 18, 2024
43cfdaf
feat: qdrant
EnableAsync Oct 19, 2024
2a4363a
feat: milvus
EnableAsync Oct 19, 2024
9603479
feat: custom threshold
EnableAsync Oct 19, 2024
3d615cc
feat: custom threshold
EnableAsync Oct 19, 2024
558e75e
fix: code format
EnableAsync Oct 20, 2024
2cfcda6
add ai cache test
Suchun-sv Oct 20, 2024
4caf9be
update test
Suchun-sv Oct 20, 2024
d04d78a
fix bugs
Suchun-sv Oct 24, 2024
81bde6d
update
EnableAsync Oct 24, 2024
ea34f4a
fix: bugs
EnableAsync Oct 24, 2024
784740f
Merge branch 'main' into main
Suchun-sv Oct 24, 2024
f5b50fd
add support for skip-cache
Suchun-sv Oct 24, 2024
a1fe701
update README.md and change to FQDNCluster
Suchun-sv Oct 24, 2024
730d951
change to FQDNCluster
Suchun-sv Oct 24, 2024
335c04c
provide support for the legacy configuration
Suchun-sv Oct 25, 2024
59bddf6
simplify resp func, add func name when debug
Suchun-sv Oct 26, 2024
e4901d9
Merge branch 'alibaba:main' into main
Suchun-sv Oct 26, 2024
36f0d77
change *.typ to *
Suchun-sv Oct 26, 2024
009a1b1
add support for legacy config
Suchun-sv Oct 26, 2024
4515f43
update content_type in stream resp
Suchun-sv Oct 26, 2024
c048280
fix bugs
Suchun-sv Oct 26, 2024
0ec24f3
add support for legacy configuration
Suchun-sv Oct 26, 2024
a658bfe
fix bugs
Suchun-sv Oct 26, 2024
a199144
handle the data: [DONE] and return in escaped string
Suchun-sv Oct 26, 2024
77f05d6
dont read resp when ERROR_PARTIAL_MESSAGE_KEY not nil
Suchun-sv Oct 26, 2024
28c629c
Update redis_wrapper.go
CH3CHO Oct 27, 2024
bd84cd0
merge
EnableAsync Oct 29, 2024
d9ce358
merge
EnableAsync Oct 29, 2024
4a95557
update: README.md
EnableAsync Oct 29, 2024
04f288c
merge
EnableAsync Oct 29, 2024
902d810
fix: READMME.md
EnableAsync Oct 29, 2024
a1a7eef
Update README.md
EnableAsync Oct 29, 2024
d1b99b3
Merge remote-tracking branch 'my/feat/chroma' into feat/chroma
EnableAsync Nov 17, 2024
6a782a4
update
EnableAsync Nov 17, 2024
014d3ea
update
EnableAsync Nov 19, 2024
134aecc
Merge branch 'main' into feat/chroma
CH3CHO Nov 20, 2024
18 changes: 12 additions & 6 deletions plugins/wasm-go/extensions/ai-cache/README.md
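@@ -101,29 +101,35 @@ LLM result caching plugin; the default configuration can be used directly with the OpenAI protocol

 # Vector Database Provider-Specific Configuration
 ## Chroma
-The `vector.type` for Chroma is `chroma`. It has no provider-specific configuration fields.
+The `vector.type` for Chroma is `chroma`. It has no provider-specific configuration fields. The Collection must be created in advance.

 ## DashVector
-The `vector.type` for DashVector is `dashvector`. It has no provider-specific configuration fields.
+The `vector.type` for DashVector is `dashvector`. It has no provider-specific configuration fields. The Collection must be created in advance.

 ## ElasticSearch
-The `vector.type` for ElasticSearch is `elasticsearch`. Its provider-specific configuration fields are as follows:
+The `vector.type` for ElasticSearch is `elasticsearch`. An Index must be created in advance and its name entered in `vector.collectionID`. The plugin currently relies on the [KNN](https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html) search method, so make sure your ES version supports `KNN`; it has been tested on version `8.16`.
+Its provider-specific configuration fields are as follows:
 | Name | Data Type | Requirement | Default | Description |
 |-------------------|----------|----------|--------|-------------------------------------------------------------------------------|
 | `vector.esUsername` | string | Optional | - | ElasticSearch username |
 | `vector.esPassword` | string | Optional | - | ElasticSearch password |

+`vector.esUsername` and `vector.esPassword` are used for Basic authentication. API Key authentication is also supported: it is enabled whenever `vector.apiKey` is set. If you use the SaaS version, enter the key's `encoded` value.
+
 ## Milvus
-The `vector.type` for Milvus is `milvus`. It has no provider-specific configuration fields.
+The `vector.type` for Milvus is `milvus`. It has no provider-specific configuration fields. The Collection must be created in advance.

 ## Pinecone
-The `vector.type` for Pinecone is `pinecone`. It has no provider-specific configuration fields.
+The `vector.type` for Pinecone is `pinecone`. It has no provider-specific configuration fields. An Index must be created in advance, and the Index's access domain name must be entered in `serviceHost`.
+Pinecone's `Namespace` parameter is configured through the plugin's `vector.collectionID`.

 ## Qdrant
-The `vector.type` for Qdrant is `qdrant`. It has no provider-specific configuration fields.
+The `vector.type` for Qdrant is `qdrant`. It has no provider-specific configuration fields. The Collection must be created in advance.

 ## Weaviate
 The `vector.type` for Weaviate is `weaviate`. It has no provider-specific configuration fields.
+The Collection must be created in advance. Note that Weaviate automatically capitalizes the first letter of the name, so the first letter must be capitalized when filling in `collectionID`.

Collaborator (review comment on the line above):

In Weaviate this seems to be called a class, not a collection.

Collaborator (review comment):

Please adjust the corresponding descriptions for the other providers as well. Users need to be told what they actually have to create and what should be entered in the collectionId setting.

+If you use the SaaS version, the `serviceHost` parameter must be set.

 ## Configuration Examples
 ### Basic Configuration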
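The Weaviate note in the README hunk above says the first letter of the name is capitalized automatically, so `collectionID` has to match that form. A minimal sketch of that normalization, assuming a hypothetical helper `weaviateClassName` that is not part of this PR and is shown only to make the rule concrete:

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// weaviateClassName is a hypothetical helper (not in the plugin). It returns
// the name with its first rune upper-cased, mirroring the capitalization rule
// described in the README note.
func weaviateClassName(collectionID string) string {
	r, size := utf8.DecodeRuneInString(collectionID)
	if size == 0 || r == utf8.RuneError {
		// Empty or invalid input: return it unchanged.
		return collectionID
	}
	return string(unicode.ToUpper(r)) + collectionID[size:]
}

func main() {
	fmt.Println(weaviateClassName("higress"))
}
```

Whether the plugin should normalize the value itself or leave it to the user is a design choice; the README change in this PR takes the second route and documents the requirement.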
27 changes: 0 additions & 27 deletions plugins/wasm-go/extensions/ai-cache/embedding/weaviate.go

This file was deleted.

13 changes: 9 additions & 4 deletions plugins/wasm-go/extensions/ai-cache/vector/elasticsearch.go
@@ -82,11 +82,16 @@ func (d *ESProvider) QueryEmbedding(
 	)
 }

-// Base64-encode the ES credentials string
+// Base64-encode the ES credentials string, or use the API key
 func (d *ESProvider) getCredentials() string {
-	credentials := fmt.Sprintf("%s:%s", d.config.esUsername, d.config.esPassword)
-	encodedCredentials := base64.StdEncoding.EncodeToString([]byte(credentials))
-	return fmt.Sprintf("Basic %s", encodedCredentials)
+	if len(d.config.apiKey) != 0 {
+		return fmt.Sprintf("ApiKey %s", d.config.apiKey)
+	} else {
+		credentials := fmt.Sprintf("%s:%s", d.config.esUsername, d.config.esPassword)
+		encodedCredentials := base64.StdEncoding.EncodeToString([]byte(credentials))
+		return fmt.Sprintf("Basic %s", encodedCredentials)
+	}
+
 }

 func (d *ESProvider) UploadAnswerAndEmbedding(
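The selection logic in `getCredentials` can be exercised on its own. The sketch below is a standalone restatement, not the plugin code: `esAuthConfig` and `authorizationHeader` are hypothetical names standing in for the relevant fields of the plugin's provider config.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// esAuthConfig is a hypothetical stand-in for the three config fields
// that the diff above reads from d.config.
type esAuthConfig struct {
	apiKey     string
	esUsername string
	esPassword string
}

// authorizationHeader prefers API key authentication when apiKey is set and
// falls back to HTTP Basic authentication otherwise.
func authorizationHeader(c esAuthConfig) string {
	if c.apiKey != "" {
		return "ApiKey " + c.apiKey
	}
	credentials := c.esUsername + ":" + c.esPassword
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(credentials))
}

func main() {
	fmt.Println(authorizationHeader(esAuthConfig{apiKey: "example-encoded-key"}))
	fmt.Println(authorizationHeader(esAuthConfig{esUsername: "user", esPassword: "pass"}))
}
```

Note that with this ordering a configured `apiKey` silently takes precedence over a username and password if both are present.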
5 changes: 1 addition & 4 deletions plugins/wasm-go/extensions/ai-cache/vector/pinecone.go
@@ -15,17 +15,14 @@ type pineconeProviderInitializer struct{}

 func (c *pineconeProviderInitializer) ValidateConfig(config ProviderConfig) error {
 	if len(config.serviceHost) == 0 {
-		return errors.New("[Pinecone] serviceDomain is required")
+		return errors.New("[Pinecone] serviceHost is required")
 	}
 	if len(config.serviceName) == 0 {
 		return errors.New("[Pinecone] serviceName is required")
 	}
 	if len(config.apiKey) == 0 {
 		return errors.New("[Pinecone] apiKey is required")
 	}
-	if len(config.collectionID) == 0 {
-		return errors.New("[Pinecone] collectionID is required")
-	}
 	return nil
 }

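After this change `collectionID` is no longer required for Pinecone, which fits the README statement that it maps to the Pinecone `Namespace`. A small self-contained sketch of the resulting validation rules, using a hypothetical `pineconeConfig` struct in place of the plugin's `ProviderConfig`:

```go
package main

import (
	"errors"
	"fmt"
)

// pineconeConfig is a hypothetical stand-in for the fields of ProviderConfig
// that the Pinecone validation touches.
type pineconeConfig struct {
	serviceHost  string
	serviceName  string
	apiKey       string
	collectionID string // used as the Pinecone namespace; may be empty
}

// validatePinecone mirrors the post-change checks: host, service name and
// API key are required, collectionID is optional.
func validatePinecone(c pineconeConfig) error {
	if c.serviceHost == "" {
		return errors.New("[Pinecone] serviceHost is required")
	}
	if c.serviceName == "" {
		return errors.New("[Pinecone] serviceName is required")
	}
	if c.apiKey == "" {
		return errors.New("[Pinecone] apiKey is required")
	}
	return nil
}

func main() {
	cfg := pineconeConfig{serviceHost: "example-index.example.invalid", serviceName: "pinecone.dns", apiKey: "key"}
	fmt.Println(validatePinecone(cfg))
}
```

The host and service name in `main` are placeholder values chosen for the example.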
4 changes: 2 additions & 2 deletions plugins/wasm-go/extensions/ai-cache/vector/provider.go
@@ -26,8 +26,8 @@ var (
 	providerInitializers = map[string]providerInitializer{
 		PROVIDER_TYPE_DASH_VECTOR: &dashVectorProviderInitializer{},
 		PROVIDER_TYPE_CHROMA:      &chromaProviderInitializer{},
-		PROVIDER_TYPE_ES:          &weaviateProviderInitializer{},
-		PROVIDER_TYPE_WEAVIATE:    &esProviderInitializer{},
+		PROVIDER_TYPE_ES:          &esProviderInitializer{},
+		PROVIDER_TYPE_WEAVIATE:    &weaviateProviderInitializer{},
 		PROVIDER_TYPE_PINECONE:    &pineconeProviderInitializer{},
 		PROVIDER_TYPE_QDRANT:      &qdrantProviderInitializer{},
 		PROVIDER_TYPE_MILVUS:      &milvusProviderInitializer{},
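The swapped ES and Weaviate entries fixed above are the kind of mistake a small consistency check could catch. The sketch below is one possible approach and not something this PR adds: it assumes each initializer exposes a hypothetical `providerType()` method, which the real `providerInitializer` interface is not known to have.

```go
package main

import "fmt"

// initializer is a hypothetical interface used only for this sketch.
type initializer interface {
	providerType() string
}

type esInitializer struct{}

func (esInitializer) providerType() string { return "elasticsearch" }

type weaviateInitializer struct{}

func (weaviateInitializer) providerType() string { return "weaviate" }

// checkRegistry reports the first entry whose map key differs from the type
// its initializer claims to serve.
func checkRegistry(registry map[string]initializer) error {
	for key, ini := range registry {
		if got := ini.providerType(); got != key {
			return fmt.Errorf("provider %q is mapped to the %q initializer", key, got)
		}
	}
	return nil
}

func main() {
	swapped := map[string]initializer{
		"elasticsearch": weaviateInitializer{},
		"weaviate":      esInitializer{},
	}
	fmt.Println(checkRegistry(swapped) != nil)
}
```

A check like this could run in a unit test so that a mismatched registration fails before the plugin is built.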
2 changes: 2 additions & 0 deletions plugins/wasm-go/extensions/ai-cache/vector/weaviate.go
@@ -88,6 +88,7 @@ func (d *WeaviateProvider) QueryEmbedding(
 		"/v1/graphql",
 		[][2]string{
 			{"Content-Type", "application/json"},
+			{"Authorization", fmt.Sprintf("Bearer %s", d.config.apiKey)},
 		},
 		requestBody,
 		func(statusCode int, responseHeaders http.Header, responseBody []byte) {
@@ -128,6 +129,7 @@ func (d *WeaviateProvider) UploadAnswerAndEmbedding(
 		"/v1/objects",
 		[][2]string{
 			{"Content-Type", "application/json"},
+			{"Authorization", fmt.Sprintf("Bearer %s", d.config.apiKey)},
 		},
 		requestBody,
 		func(statusCode int, responseHeaders http.Header, responseBody []byte) {
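Both Weaviate hunks add the same `Authorization` header, and they send it even when `apiKey` is empty. One possible variation, shown here as a sketch rather than as what the PR does, is a shared helper that omits the header when no key is configured; `weaviateHeaders` is a hypothetical name.

```go
package main

import "fmt"

// weaviateHeaders is a hypothetical helper that builds the request headers
// in the [][2]string form used by the calls in the diff above. Unlike the
// PR, it adds the Authorization header only when an API key is configured.
func weaviateHeaders(apiKey string) [][2]string {
	headers := [][2]string{
		{"Content-Type", "application/json"},
	}
	if apiKey != "" {
		headers = append(headers, [2]string{"Authorization", "Bearer " + apiKey})
	}
	return headers
}

func main() {
	fmt.Println(weaviateHeaders(""))
	fmt.Println(weaviateHeaders("example-key"))
}
```

Whether an empty `Bearer ` value causes problems depends on the Weaviate deployment, so this is a judgment call rather than a required fix.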