oracle-database-embedding-flow

This code sample illustrates how to:

Stream documents from OCI object storage, see OCIDocumentLoader
Split those documents into chunks, see Splitter
Embed document chunks using OCI GenAI Embeddings, see OCIEmbeddingModel
Store the resulting embeddings as vectors in Oracle Database, see OracleVectorStore

An example workflow (EmbeddingWorkflowIT) ties these steps together. A snippet is shown below:

// Stream documents from OCI object storage.
documentLoader.streamDocuments(BUCKET_NAME, OBJECT_PREFIX)
    // Split each object storage document into chunks.
    .map(splitter::split)
    // Embed each chunk list using OCI GenAI service.
    .map(embeddingModel::embedAll)
    // Store embeddings in Oracle Database 23ai.
    .forEach(vectorStore::addAll);
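
To make the shape of this pipeline easier to follow without OCI credentials or a database, here is a minimal, self-contained sketch of the same map/map/forEach flow. Everything in it is a hypothetical stand-in and not the sample's actual API: the `EmbeddingFlowSketch` class, the `Embedding` record, the fixed-size `split` method, the toy `embedAll` method (which produces a two-element vector of chunk length and vowel count rather than a real embedding), and the in-memory list used in place of OracleVectorStore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class EmbeddingFlowSketch {
    // Hypothetical stand-in for a stored vector: the chunk text plus its embedding.
    record Embedding(String text, float[] vector) {}

    // Stand-in splitter (assumption): cuts a document into fixed-size character chunks.
    static List<String> split(String document, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < document.length(); i += chunkSize) {
            chunks.add(document.substring(i, Math.min(document.length(), i + chunkSize)));
        }
        return chunks;
    }

    // Stand-in embedding model (assumption): a toy vector of [length, vowel count].
    // A real model would call an embedding service here.
    static List<Embedding> embedAll(List<String> chunks) {
        List<Embedding> embeddings = new ArrayList<>();
        for (String chunk : chunks) {
            long vowels = chunk.chars().filter(ch -> "aeiouAEIOU".indexOf(ch) >= 0).count();
            embeddings.add(new Embedding(chunk, new float[] {chunk.length(), vowels}));
        }
        return embeddings;
    }

    // Mirrors the workflow: stream documents, split, embed, then store.
    static List<Embedding> run(Stream<String> documents, int chunkSize) {
        List<Embedding> store = new ArrayList<>(); // in-memory stand-in for the vector store
        documents
            .map(doc -> split(doc, chunkSize))
            .map(EmbeddingFlowSketch::embedAll)
            .forEach(store::addAll);
        return store;
    }

    public static void main(String[] args) {
        List<Embedding> stored = run(Stream.of("hello world", "oracle"), 5);
        System.out.println(stored.size() + " embeddings stored"); // prints "5 embeddings stored"
    }
}
```

Because each stage consumes and produces one document's worth of data at a time, only a single document's chunks and embeddings need to be held in memory before they are handed to the store.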

Run the test

The sample test loads documents from an object storage bucket named "mybucket" using the object prefix "documents". These documents are then embedded using the OCI GenAI service, and finally stored in a local Oracle Database container.

# Set your OCI compartment and namespace before running the test
export OCI_COMPARTMENT="my compartment OCID"
export OCI_NAMESPACE="my oci namespace"
mvn integration-test
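
If either variable is unset, a failure deep inside an OCI call can be hard to diagnose. One option is to check the variables up front and fail fast with a clear message. The sketch below is an assumption about how that could look. The `EnvCheck` class and `requireEnv` helper are hypothetical, and the sample's test may read these variables differently.

```java
import java.util.Map;

public class EnvCheck {
    // Hypothetical helper: returns the variable's value or fails with a clear message.
    // The environment is passed in as a map so the check can be exercised in isolation.
    static String requireEnv(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        String compartment = requireEnv(env, "OCI_COMPARTMENT");
        String namespace = requireEnv(env, "OCI_NAMESPACE");
        System.out.println("Using namespace " + namespace + " in compartment " + compartment);
    }
}
```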

The test should take about 30-40 seconds to run, and asserts that vectors have been successfully added to the database:

[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.75 s -- in com.example.EmbedddingWorkflowIT
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  36.016 s
