Integrate VectorStore from Elasticsearch client #13291

Merged: 11 commits, May 15, 2024
412 changes: 250 additions & 162 deletions docs/docs/examples/vector_stores/ElasticsearchIndexDemo.ipynb

Large diffs are not rendered by default.

81 changes: 39 additions & 42 deletions docs/docs/examples/vector_stores/Elasticsearch_demo.ipynb
Original file line number Diff line number Diff line change
@@ -26,9 +26,7 @@
"id": "b5331b6b",
"metadata": {},
"source": [
"## Basic Example\n",
"\n",
"In this basic example, we take the a Paul Graham essay, split it into chunks, embed it using an open-source embedding model, load it into Elasticsearch, and then query it."
"## Basic Example\n"
]
},
{
@@ -37,6 +35,8 @@
"id": "f3aaf790",
"metadata": {},
"source": [
"In this basic example, we take a Paul Graham essay, split it into chunks, embed it using an open-source embedding model, load it into Elasticsearch, and then query it. For an example using different retrieval strategies, see [Elasticsearch Vector Store](https://docs.llamaindex.ai/en/stable/examples/vector_stores/ElasticsearchIndexDemo/).\n",
"\n",
"If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙."
]
},
@@ -47,20 +47,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index-embeddings-huggingface\n",
"%pip install llama-index-vector-stores-elasticsearch"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b3df0b97",
"metadata": {},
"outputs": [],
"source": [
"# !pip install llama-index elasticsearch --quiet\n",
"# !pip install sentence-transformers\n",
"# !pip install pydantic==1.10.11"
"%pip install -qU llama-index-vector-stores-elasticsearch llama-index-embeddings-huggingface llama-index"
]
},
{
@@ -73,8 +60,7 @@
"# import\n",
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
"from llama_index.vector_stores.elasticsearch import ElasticsearchStore\n",
"from llama_index.core import StorageContext\n",
"from IPython.display import Markdown, display"
"from llama_index.core import StorageContext"
]
},
{
@@ -105,10 +91,18 @@
"execution_count": null,
"id": "06874a37",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2024-05-13 15:10:43 URL:https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt [75042/75042] -> \"data/paul_graham/paul_graham_essay.txt\" [1]\n"
]
}
],
"source": [
"!mkdir -p 'data/paul_graham/'\n",
"!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
"!wget -nv 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
]
},
{
@@ -132,38 +126,41 @@
"execution_count": null,
"id": "667f3cb3-ce18-48d5-b9aa-bfc1a1f0f0f6",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"<b>The author worked on writing and programming outside of school. They wrote short stories and tried writing programs on an IBM 1401 computer. They also built a microcomputer kit and started programming on it, writing simple games and a word processor.</b>"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"outputs": [],
"source": [
"# load documents\n",
"documents = SimpleDirectoryReader(\"./data/paul_graham/\").load_data()\n",
"\n",
"# define index\n",
"vector_store = ElasticsearchStore(\n",
" index_name=\"paul_graham_essay\", es_url=\"http://localhost:9200\"\n",
" es_url=\"http://localhost:9200\", # see Elasticsearch Vector Store for more authentication options\n",
" index_name=\"paul_graham_essay\",\n",
")\n",
"storage_context = StorageContext.from_defaults(vector_store=vector_store)\n",
"\n",
"index = VectorStoreIndex.from_documents(\n",
" documents,\n",
" storage_context=storage_context,\n",
")\n",
"\n",
" documents, storage_context=storage_context\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d3658bd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The author worked on writing and programming outside of school. They wrote short stories and tried writing programs on an IBM 1401 computer. They also built a microcomputer kit and started programming on it, writing simple games and a word processor.\n"
]
}
],
"source": [
"# Query Data\n",
"query_engine = index.as_query_engine()\n",
"response = query_engine.query(\"What did the author do growing up?\")\n",
"display(Markdown(f\"<b>{response}</b>\"))"
"print(response)"
]
}
],
@@ -1,3 +1,16 @@
from llama_index.vector_stores.elasticsearch.base import ElasticsearchStore

__all__ = ["ElasticsearchStore"]
from elasticsearch.helpers.vectorstore import (
AsyncBM25Strategy,
AsyncSparseVectorStrategy,
AsyncDenseVectorStrategy,
AsyncRetrievalStrategy,
)

__all__ = [
"AsyncBM25Strategy",
"AsyncDenseVectorStrategy",
"AsyncRetrievalStrategy",
"AsyncSparseVectorStrategy",
"ElasticsearchStore",
]
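The strategies re-exported above map to different Elasticsearch query types; the dense strategy's hybrid mode, for instance, combines a lexical (BM25) ranking with a vector-similarity ranking, which Elasticsearch fuses server-side using reciprocal rank fusion (RRF). A rough, self-contained sketch of that fusion step (not the client's actual implementation; `k=60` is the conventional RRF constant):

```python
def rrf_fuse(rankings, k=60):
    """Combine several ranked result lists via reciprocal rank fusion.

    rankings: list of lists of document ids, best first.
    Returns document ids sorted by fused score, best first.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank) for every document it ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A BM25 ranking and a dense-vector ranking for the same query
bm25_ranked = ["doc_a", "doc_b", "doc_c"]
dense_ranked = ["doc_b", "doc_c", "doc_a"]

fused = rrf_fuse([bm25_ranked, dense_ranked])  # → ["doc_b", "doc_a", "doc_c"]
```

Documents ranked well by either retriever rise toward the top even when the two rankings disagree, which is why the hybrid strategy tends to be more robust than either retriever alone.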