
Use pgai with Cohere

This page shows you how to:

  • Configure pgai for Cohere
  • Use AI directly from your database using SQL

Configure pgai for Cohere

Cohere functions in pgai require a Cohere API key.

In production, we suggest setting the API key using an environment variable. During testing and development, it may be easiest to configure the key value as a session-level parameter. For more options and details, consult the Handling API keys document.

  1. Set your Cohere key as an environment variable in your shell:

    export COHERE_API_KEY="this-is-my-super-secret-api-key-dont-tell"
  2. Use the session-level parameter when you connect to your database:

    PGOPTIONS="-c ai.cohere_api_key=$COHERE_API_KEY" psql -d "postgres://<username>:<password>@<host>:<port>/<database-name>"
  3. Run your AI query:

    Because ai.cohere_api_key is set for the duration of your psql session, you do not need to specify it when calling pgai functions.

    select ai.cohere_chat_complete
    ( 'command-r-plus'
    , jsonb_build_array
      ( jsonb_build_object
        ( 'role', 'user'
        , 'content', 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?'
        )
      )
    , seed=>42
    , temperature=>0.0
    )->'message'->'content'->0->>'text'
    ;
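
If you already have a psql session open, you can also set the key for just that session with SET. This is a minimal sketch; the key value below is a placeholder:

    -- set the key for the current session only (placeholder value)
    set ai.cohere_api_key = 'this-is-my-super-secret-api-key-dont-tell';

    -- confirm the setting
    show ai.cohere_api_key;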

Usage

This section shows you how to use AI directly from your database using SQL.

cohere_list_models

  • List the models supported by the Cohere platform.

    select *
    from ai.cohere_list_models()
    ;

    Results:

                 name              |         endpoints         | finetuned | context_length |                                       tokenizer_url                                        | default_endpoints 
    -------------------------------+---------------------------+-----------+----------------+--------------------------------------------------------------------------------------------+-------------------
     embed-english-light-v2.0      | {embed,classify}          | f         |            512 |                                                                                            | {}
     embed-english-v2.0            | {embed,classify}          | f         |            512 |                                                                                            | {}
     command-r                     | {generate,chat,summarize} | f         |         128000 | https://storage.googleapis.com/cohere-public/tokenizers/command-r.json                     | {}
     embed-multilingual-light-v3.0 | {embed,classify}          | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-multilingual-light-v3.0.json | {}
     command-nightly               | {generate,chat,summarize} | f         |         128000 | https://storage.googleapis.com/cohere-public/tokenizers/command-nightly.json               | {}
     command-r-plus                | {generate,chat,summarize} | f         |         128000 | https://storage.googleapis.com/cohere-public/tokenizers/command-r-plus.json                | {chat}
     embed-multilingual-v3.0       | {embed,classify}          | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-multilingual-v3.0.json       | {}
     embed-multilingual-v2.0       | {embed,classify}          | f         |            256 | https://storage.googleapis.com/cohere-public/tokenizers/embed-multilingual-v2.0.json       | {}
     c4ai-aya-23                   | {generate,chat}           | f         |           8192 | https://storage.googleapis.com/cohere-public/tokenizers/c4ai-aya-23.json                   | {}
     command-light-nightly         | {generate,summarize,chat} | f         |           4096 | https://storage.googleapis.com/cohere-public/tokenizers/command-light-nightly.json         | {}
     rerank-multilingual-v2.0      | {rerank}                  | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/rerank-multilingual-v2.0.json      | {}
     embed-english-v3.0            | {embed,classify}          | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-english-v3.0.json            | {}
     command                       | {generate,summarize,chat} | f         |           4096 | https://storage.googleapis.com/cohere-public/tokenizers/command.json                       | {generate}
     rerank-multilingual-v3.0      | {rerank}                  | f         |           4096 | https://storage.googleapis.com/cohere-public/tokenizers/rerank-multilingual-v3.0.json      | {}
     rerank-english-v2.0           | {rerank}                  | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/rerank-english-v2.0.json           | {}
     command-light                 | {generate,summarize,chat} | f         |           4096 | https://storage.googleapis.com/cohere-public/tokenizers/command-light.json                 | {}
     rerank-english-v3.0           | {rerank}                  | f         |           4096 | https://storage.googleapis.com/cohere-public/tokenizers/rerank-english-v3.0.json           | {}
     embed-english-light-v3.0      | {embed,classify}          | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-english-light-v3.0.json      | {}
    (18 rows)
    
  • List the models on the Cohere platform that support a particular endpoint.

    select *
    from ai.cohere_list_models(endpoint=>'embed')
    ;

    Results:

                 name              |    endpoints     | finetuned | context_length |                                       tokenizer_url                                        | default_endpoints 
    -------------------------------+------------------+-----------+----------------+--------------------------------------------------------------------------------------------+-------------------
     embed-english-light-v2.0      | {embed,classify} | f         |            512 |                                                                                            | {}
     embed-english-v2.0            | {embed,classify} | f         |            512 |                                                                                            | {}
     embed-multilingual-light-v3.0 | {embed,classify} | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-multilingual-light-v3.0.json | {}
     embed-multilingual-v3.0       | {embed,classify} | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-multilingual-v3.0.json       | {}
     embed-multilingual-v2.0       | {embed,classify} | f         |            256 | https://storage.googleapis.com/cohere-public/tokenizers/embed-multilingual-v2.0.json       | {}
     embed-english-v3.0            | {embed,classify} | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-english-v3.0.json            | {}
     embed-english-light-v3.0      | {embed,classify} | f         |            512 | https://storage.googleapis.com/cohere-public/tokenizers/embed-english-light-v3.0.json      | {}
    (7 rows)
    
  • List the default model for a given endpoint.

    select * 
    from ai.cohere_list_models(endpoint=>'generate', default_only=>true);

    Results:

      name   |         endpoints         | finetuned | context_length |                            tokenizer_url                             | default_endpoints 
    ---------+---------------------------+-----------+----------------+----------------------------------------------------------------------+-------------------
     command | {generate,summarize,chat} | f         |           4096 | https://storage.googleapis.com/cohere-public/tokenizers/command.json | {generate}
    (1 row)
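
Because ai.cohere_list_models returns an ordinary rowset, you can filter it with plain SQL, for example to find chat-capable models with large context windows. This sketch assumes the endpoints column is a text array, as the braces in the output above suggest:

select name, context_length
from ai.cohere_list_models()
where 'chat' = any(endpoints)
and context_length >= 100000
order by context_length desc
;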
    

cohere_tokenize

Tokenize text content.

select ai.cohere_tokenize
( 'command'
, 'One of the best programming skills you can have is knowing when to walk away for awhile.'
);

Results:

                                      cohere_tokenize                                       
--------------------------------------------------------------------------------------------
 {5256,1707,1682,2383,9461,4696,1739,1863,1871,1740,9397,2112,1705,4066,3465,1742,38700,21}
(1 row)
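
Since ai.cohere_tokenize returns an array of token ids, you can combine it with standard array functions, for example to count the tokens in a prompt before sending it to a model:

select array_length
( ai.cohere_tokenize
  ( 'command'
  , 'One of the best programming skills you can have is knowing when to walk away for awhile.'
  )
, 1
) as token_count
;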

cohere_detokenize

Reverse the tokenization process, turning token ids back into text.

select ai.cohere_detokenize
( 'command'
, array[14485,38374,2630,2060,2252,5164,4905,21,2744,2628,1675,3094,23407,21]
);

Results:

                              cohere_detokenize                               
------------------------------------------------------------------------------
 Good programmers don't just write programs. They build a working vocabulary.
(1 row)
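
Detokenizing the output of ai.cohere_tokenize should give back the original text for the same model, which makes a handy round-trip sanity check:

select ai.cohere_detokenize
( 'command'
, ai.cohere_tokenize
  ( 'command'
  , 'What one programmer can do in one month, two programmers can do in two months.'
  )
);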

cohere_embed

Embed content.

select ai.cohere_embed
( 'embed-english-light-v3.0'
, 'if a woodchuck could chuck wood, a woodchuck would chuck as much wood as he could'
, input_type=>'search_document'
);

Results:

                     cohere_embed                      
-------------------------------------------------------
 [-0.066833496,-0.052337646,...0.014167786,0.02053833]
(1 row)
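
A common next step is to store the embeddings next to your data. The sketch below is hypothetical: it assumes the pgvector extension is available, a table named quote that you create yourself, and that embed-english-light-v3.0 returns 384-dimensional vectors (check the dimension for the model you actually use):

-- hypothetical table; vector(384) assumes the embed-english-light-v3.0 dimension
create table quote
( id bigint generated by default as identity primary key
, body text not null
, embedding vector(384)
);

insert into quote (body) values
  ($$Good programmers don't just write programs. They build a working vocabulary.$$)
, ('What one programmer can do in one month, two programmers can do in two months.')
;

-- embed each row; the value returned by ai.cohere_embed is assumed to be pgvector-compatible
update quote
set embedding = ai.cohere_embed
( 'embed-english-light-v3.0'
, body
, input_type=>'search_document'
)
;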

cohere_classify

Classify inputs, assigning labels.

with examples(example, label) as
(
    values
      ('cat', 'animal')
    , ('dog', 'animal')
    , ('car', 'machine')
    , ('truck', 'machine')
    , ('apple', 'food')
    , ('broccoli', 'food')
)
select *
from jsonb_to_recordset
(
    ai.cohere_classify
    ( 'embed-english-light-v3.0'
    , array['bird', 'airplane', 'corn'] --inputs we want to classify
    , examples=>(select jsonb_agg(jsonb_build_object('text', examples.example, 'label', examples.label)) from examples)
    )->'classifications'
) x(input text, prediction text, confidence float8)
;

Results:

  input   | prediction | confidence 
----------+------------+------------
 bird     | animal     |  0.3708435
 airplane | machine    |   0.343932
 corn     | food       | 0.37896726
(3 rows)

cohere_classify_simple

A simpler interface to classification.

with examples(example, label) as
(
    values
      ('cat', 'animal')
    , ('dog', 'animal')
    , ('car', 'machine')
    , ('truck', 'machine')
    , ('apple', 'food')
    , ('broccoli', 'food')
)
select *
from ai.cohere_classify_simple
( 'embed-english-light-v3.0'
, array['bird', 'airplane', 'corn']
, examples=>(select jsonb_agg(jsonb_build_object('text', examples.example, 'label', examples.label)) from examples)
) x
;

Results:

  input   | prediction | confidence 
----------+------------+------------
 bird     | animal     |  0.3708435
 airplane | machine    |   0.343932
 corn     | food       | 0.37896726
(3 rows)
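
Since the simple variant returns a plain rowset, you can filter predictions with a WHERE clause, for example keeping only those above an arbitrary confidence threshold:

with examples(example, label) as
(
    values
      ('cat', 'animal')
    , ('dog', 'animal')
    , ('car', 'machine')
    , ('truck', 'machine')
    , ('apple', 'food')
    , ('broccoli', 'food')
)
select *
from ai.cohere_classify_simple
( 'embed-english-light-v3.0'
, array['bird', 'airplane', 'corn']
, examples=>(select jsonb_agg(jsonb_build_object('text', examples.example, 'label', examples.label)) from examples)
) x
where x.confidence >= 0.35 -- arbitrary threshold for illustration
;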

cohere_rerank

Rank documents according to semantic similarity to a query prompt.

select
  x."index"
, x.relevance_score
from jsonb_to_recordset
(
    ai.cohere_rerank
    ( 'rerank-english-v3.0'
    , 'How long does it take for two programmers to work on something?'
    , array
      [ $$Good programmers don't just write programs. They build a working vocabulary.$$
      , 'One of the best programming skills you can have is knowing when to walk away for awhile.'
      , 'What one programmer can do in one month, two programmers can do in two months.'
      , 'how much wood would a woodchuck chuck if a woodchuck could chuck wood?'
      ]
    )->'results'
) x("index" int, relevance_score float8)
order by relevance_score desc
;

Results:

 index | relevance_score
-------+-----------------
     2 |       0.8003801
     0 |    0.0011559008
     1 |    0.0006932423
     3 |    2.637042e-07
(4 rows)
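
In practice the documents usually live in a table rather than a literal array. The following is a hypothetical sketch that reranks rows from an assumed table document(id, body); it relies on the fact, visible in the results above, that "index" is the zero-based position of each document in the array passed to ai.cohere_rerank:

-- hypothetical table of candidate passages: document(id bigint, body text)
with docs as
(
    select row_number() over (order by id) - 1 as idx, body
    from document
)
select d.body, x.relevance_score
from jsonb_to_recordset
(
    ai.cohere_rerank
    ( 'rerank-english-v3.0'
    , 'How long does it take for two programmers to work on something?'
    , (select array_agg(body order by idx) from docs)
    )->'results'
) x("index" int, relevance_score float8)
join docs d on d.idx = x."index"
order by x.relevance_score desc
;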

cohere_rerank_simple

A simpler interface to reranking.

select *
from ai.cohere_rerank_simple
( 'rerank-english-v3.0'
, 'How long does it take for two programmers to work on something?'
, array
  [ $$Good programmers don't just write programs. They build a working vocabulary.$$
  , 'One of the best programming skills you can have is knowing when to walk away for awhile.'
  , 'What one programmer can do in one month, two programmers can do in two months.'
  , 'how much wood would a woodchuck chuck if a woodchuck could chuck wood?'
  ]
) x
order by relevance_score desc
;

Results:

 index |                                         document                                         | relevance_score
-------+------------------------------------------------------------------------------------------+-----------------
     2 | What one programmer can do in one month, two programmers can do in two months.           |       0.8003801
     0 | Good programmers don't just write programs. They build a working vocabulary.             |    0.0011559008
     1 | One of the best programming skills you can have is knowing when to walk away for awhile. |    0.0006932423
     3 | how much wood would a woodchuck chuck if a woodchuck could chuck wood?                   |    2.637042e-07
(4 rows)
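
When you only need the best match, keep the same call and add a limit:

select x.document, x.relevance_score
from ai.cohere_rerank_simple
( 'rerank-english-v3.0'
, 'How long does it take for two programmers to work on something?'
, array
  [ $$Good programmers don't just write programs. They build a working vocabulary.$$
  , 'One of the best programming skills you can have is knowing when to walk away for awhile.'
  , 'What one programmer can do in one month, two programmers can do in two months.'
  , 'how much wood would a woodchuck chuck if a woodchuck could chuck wood?'
  ]
) x
order by relevance_score desc
limit 1
;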

cohere_chat_complete

Complete chat prompts.

select ai.cohere_chat_complete
( 'command-r-plus'
, jsonb_build_array
  ( jsonb_build_object
    ( 'role', 'user'
    , 'content', 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?'
    )
  )
, seed=>42
, temperature=>0.0
)->'message'->'content'->0->>'text'
;

Results:

                              ?column?
---------------------------------------------------------------------
 As much wood as a woodchuck would, if a woodchuck could chuck wood.
(1 row)
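
The messages array can carry more than one entry. The sketch below adds a system message to steer the tone; it assumes the model accepts a system role alongside user messages:

select ai.cohere_chat_complete
( 'command-r-plus'
, jsonb_build_array
  ( jsonb_build_object
    ( 'role', 'system'
    , 'content', 'You are a terse assistant. Answer in a single sentence.'
    )
  , jsonb_build_object
    ( 'role', 'user'
    , 'content', 'Why would a woodchuck chuck wood at all?'
    )
  )
, seed=>42
, temperature=>0.0
)->'message'->'content'->0->>'text'
;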