Add Support for OCI Data Science Embedding Models #2

mrDzurb · 2024-12-05T02:45:42Z

Description

Implement support for invoking OCI DSC Model Deployment Embedding models from LlamaIndex.
https://jira.oci.oraclecorp.com/browse/ODSC-63451
Updates the LlamaIndex documentation to include a detailed guide on integrating OCI DSC models, covering the necessary configurations, API connections, and common use cases.
https://jira.oci.oraclecorp.com/browse/ODSC-63453

Testing

Installation

Pull the repo
pip install -e llama-index-integrations/embeddings/llama-index-embeddings-oci-data-science
You also need to install oracle-ads and llama-index.
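
To confirm that these dependencies are importable before running the examples, a quick check along the following lines may help. It is only a sketch: the helper name `missing_modules` is made up for illustration, and it assumes the import names `ads` (for oracle-ads) and `llama_index` (for llama-index) that the usage examples in this description rely on.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names: oracle-ads -> ads, llama-index -> llama_index.
missing = missing_modules(["ads", "llama_index"])
print("Missing modules:", missing or "none")
```

An empty result only means the modules can be located; it does not verify versions or OCI credentials.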

Usage

import ads
from llama_index.embeddings.oci_data_science import OCIDataScienceEmbedding

# Authenticate with a security token from the named OCI profile.
ads.set_auth(auth="security_token", profile="<replace-with-your-profile>")

# Point the client at the model deployment's predict endpoint.
embeddings = OCIDataScienceEmbedding(
    endpoint="https://<MD_OCID>/predict",
)

# Embed a single query string.
e1 = embeddings.get_query_embedding("This is a test document")
print(e1)

# Embed a single document string.
e2 = embeddings.get_text_embedding("This is a test document")
print(e2)

# Embed several documents in one call.
e3 = embeddings.get_text_embedding_batch(
    [
        "This is a test document",
        "This is another test document",
    ]
)
print(e3)
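
One common thing to do with the returned vectors is to compare them. The sketch below assumes each embedding is a plain list of floats and uses made-up placeholder vectors rather than real model output; `cosine_similarity` is an illustrative helper, not part of the integration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors given as sequences of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for real model output (not actual embeddings).
doc_a = [0.1, 0.3, 0.5]
doc_b = [0.2, 0.1, 0.4]
print(cosine_similarity(doc_a, doc_b))
```

With real embeddings, the same function could be applied to any two vectors returned by the calls above, provided they come from the same model.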

Async

import ads
from llama_index.embeddings.oci_data_science import OCIDataScienceEmbedding

# Authenticate with a security token from the named OCI profile.
ads.set_auth(auth="security_token", profile="<replace-with-your-profile>")

embeddings = OCIDataScienceEmbedding(
    endpoint="https://<MD_OCID>/predict",
)

# Top-level await works in a notebook; in a script, call these from an async function.
e1 = await embeddings.aget_text_embedding("This is a test document")
print(e1)

e2 = await embeddings.aget_text_embedding_batch(
    [
        "This is a test document",
        "This is another test document",
    ]
)
print(e2)
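
Outside a notebook there is no running event loop, so the async calls need to be driven explicitly. The following is a minimal sketch of that pattern; `StubEmbedding` and `embed_all` are hypothetical stand-ins so the example runs offline, and only the method name `aget_text_embedding` is taken from the example above.

```python
import asyncio


class StubEmbedding:
    """Hypothetical stand-in for the real client so the sketch runs offline."""

    async def aget_text_embedding(self, text):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [float(len(text))]  # fake one-dimensional "embedding"


async def embed_all(client, texts):
    # Schedule one request per text and wait for all of them together.
    return await asyncio.gather(*(client.aget_text_embedding(t) for t in texts))


results = asyncio.run(embed_all(StubEmbedding(), ["first", "second one"]))
print(results)
```

Forgetting the `await` (or the `asyncio.run`) leaves you holding a coroutine object instead of the embedding.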

Notebook Examples

Archive.zip

darenr · 2024-12-08T23:42:10Z

llama-index-integrations/embeddings/llama-index-embeddings-oci-data-science/README.md

+e1 = embeddings.get_query_embedding("This is a test document")
+print(e1)
+
+e2 = embeddings.get_text_embedding("This is a test document")


don't really need the second identical example

Agree, removed one.

darenr · 2024-12-08T23:42:24Z

llama-index-integrations/embeddings/llama-index-embeddings-oci-data-science/README.md

+    endpoint="https://<MD_OCID>/predict",
+)
+
+e1 = embeddings.aget_text_embedding("This is a test document")


doesn't this need an await?

Good catch, fixed.

VipulMascarenhas

changes look good overall, minor comment on the notebook.

VipulMascarenhas · 2024-12-10T07:06:47Z

docs/docs/examples/embeddings/oci_data_science.ipynb

nit:

colab link points to the bedrock notebook, replace it with oci_data_science.ipynb.

replace "%pip install llama-index-embeddings-oci-data-science" with "!%pip install llama-index-embeddings-oci-data-science".

Good catch, thanks!
As for `%pip install llama-index-embeddings-oci-data-science`, I think the first form is the recommended way to install the package into the kernel without having to reload the kernel afterwards.

* Add test_async_basic_flow() * Make test_async_basic_flow() pass * Make sync tests work again * Make all tests pass * Move client connect & close to fixture also in async tests * Add adelete, adelete_nodes, aclear implementations * Deduplicate code in query() / aquery() * Get rid of clarified comment This gets tested in `test_query_kwargs()`. * Update WeaviateVectorStore documentation * Remove from_params() This method was not working as intended, it created a Weaviate v3 client instead of a v4 client as required by the rest of the code. It should be possible to just use the regular constructor instead of this method. * Throw a custom exception when calling async methods without providing an async client * Move AsyncClientNotProvidedError to a separate exceptions module * Bump llama-index-vector-stores-weaviate version to 2.0.0 * Remove debug output * DRY checks if _aclient is set, add SyncClientNotProvidedError * Pass either sync or async client in `weaviate_client` parameter, fix connect()/close() when no weaviate client is provided * Change new llama-index-vector-stores-weaviate version to 1.3.0 as the change is no longer breaking * Downgrade to pytest 7 for compatibility with currently used pants configuration in CI This commit can be reverted as soon as pytest >= 8 is used during the pants run. * Reorganize test modules to prevent parallel execution during pants run * Delete llama-index-integrations/vector_stores/llama-index-vector-stores-weaviate/poetry.lock --------- Co-authored-by: Massimiliano Pippi <[email protected]>

…" out of "vector_store_kwargs" (run-llama#17221) * parse "milvus_search_config" out of "vector_store_kwargs" passed to MilvusVectorStore.query * MilvusVectorStore Query parses "milvus_search_config" out of "vector_store_kwargs" * use kwargs dict.get instead of named index; * use kwargs.get for milvus_search_config throughout class; * Update pyproject.toml * pass **kwargs to _default_search * linting * actual linting --------- Co-authored-by: Massimiliano Pippi <[email protected]>

* rename resource fields * refactor Document * fix typing, bring back text_template for backward compat * fix bug in keyval docstore * make TextNode forward-compatible * redo deprecations * fix model identifier * update mocks * update mocks * fix fixture check

mrDzurb added 2 commits December 4, 2024 18:39

Add Support for OCI Data Science Embedding Models

5e2368f

Renames OCIDataScienceEmbeddings to OCIDataScienceEmbedding

977ba79

mrDzurb requested review from darenr, VipulMascarenhas, qiuosier and dipatidar December 8, 2024 21:59

darenr approved these changes Dec 8, 2024

darenr self-requested a review December 8, 2024 23:41

darenr requested changes Dec 8, 2024

Fixes the docs and examples

59b32fb

mrDzurb requested a review from darenr December 9, 2024 04:57

darenr approved these changes Dec 9, 2024

VipulMascarenhas approved these changes Dec 10, 2024

mrDzurb and others added 17 commits December 10, 2024 09:59

Fixes the notebook example.

9500d65

Ruff fixes.

8906ac9

make openai content blocks optional (run-llama#17240)

260de3f

Adjustments by black formatter.

3b826f7

Merge branch 'main' into ODSC-63451/oci_odsc_embedding

3cf8bab

refactor and optimize milvus code (run-llama#17229)

54e30cf

* refactor and optimize milvus code Signed-off-by: ChengZi <[email protected]> * Update pyproject.toml --------- Signed-off-by: ChengZi <[email protected]> Co-authored-by: Massimiliano Pippi <[email protected]>

fix(metrics): fixed NDCG calculation and updated previous tests (run-…

21f6e34

…llama#17236)

fix: update unstructured dependency pin (run-llama#17246)

cc1a8b6

fix: accept already base64-encoded data in ImageBlock (run-llama#17244)

d9348e3

Add ILIKE feature for PGVectorStore and NileVectorStore. (run-llama#1…

524befd

…7211)

Fix content blocks again (run-llama#17247)

ac5b735

Add Scrapegraph tool integration (run-llama#17238)

e258110

feat: rewrite the example using AzureOpenAI (run-llama#17245)

4500005

Handle empty retrieved Pinecone index values (run-llama#17242)

b876e73

LHFO94 and others added 30 commits January 27, 2025 09:18

.get_nodes now accepts include_values param to return embeddings (run…

c727d39

…-llama#17635)

fix repeated sources when doing parallel tool calling (run-llama#17645)

0d09a61

small typo fix in the default plan refine prompt (run-llama#17644)

c766521

update llama-cpp integration + docs (run-llama#17647)

c2a323e

fix bedrock function calling (run-llama#17658)

9b59881

Deepseek-r1 is now supported by fireworks (run-llama#17657)

c26f0f9

removed a dead link from fine-tuning.md in docs (run-llama#17652)

4d395a7

[docs] Fix 404 link for Q&A Patterns (run-llama#17660)

20e0a8f

Fix: [Bug] NameError: name 'p' is not defined run-llama#15581 (run-ll…

7eb8a82

…ama#17577)

Add Snowflake Cortex Integration (run-llama#17585)

384e143

Add endpoint parameter to TextEmbeddingsInference (run-llama#17598)

c8a7c90

Remove API key and fix colab link in deepseek.ipynb (run-llama#17664)

b15182f

[Fix] deps in llama-index-networks/examples (run-llama#17665)

d73b99d

feat: allow to exclude empty file simple directory reader (run-llama#…

d22fdcd

…17656)

Merge branch 'main' into ODSC-63451/oci_odsc_embedding

4a04612

fix: make ctx._events_buffer json-serializable (run-llama#17676)

7e396ae

[docs] Fix typo in BM25 Retriever Example (run-llama#17668)

c944c1e

Get description from pydantic field (run-llama#17679)

ff6d802

Fixed issue run-llama#17671 by adding the required decorators (run-ll…

0a6fa89

…ama#17678)

Add error_on_tool_error param to `FunctionCallingLLM.predict_and_ca…

772b6d2

…ll` (run-llama#17663)

Feat/fix Azure AI Search Hybrid Semantic Search Unusability due to ha…

38e68b2

…rdcoded parameter (run-llama#17683)

Fix workflow module guide doc (run-llama#17686)

f8d29ae

v0.12.15 (run-llama#17688)

3e2caa7

o3 mini support (run-llama#17689)

b989803

Fix user_msg vs chat_history AgentWorkflow inputs (run-llama#17690)

bc9da18

tweak to deepseek function calling support (run-llama#17692)

3d55cd2

Fixing broken link (run-llama#17691)

d00d5e9

fix max_tokens, add reasoning_effort for openai reasoning models (run…

89396ae

…-llama#17694) fix max_tokens, add reasoning_effort

small async voyageai fix (run-llama#17698)

4c81fff

Merge branch 'main' into ODSC-63451/oci_odsc_embedding

07a88ff
