Client Doc #56

Open · wants to merge 18 commits into main
11 changes: 0 additions & 11 deletions blog.mdx

This file was deleted.

File renamed without changes.
55 changes: 55 additions & 0 deletions llmvm/features/awq.mdx
---
title: AWQ
description: 'AWQ takes the concept of weight quantization to the next level by considering the activations of the model during the quantization process.'
---

## The Role of AWQ
**Activation-Aware Weight Quantization**

In traditional weight quantization, the weights are quantized independently of the data they process. In AWQ, the quantization process takes into account the actual data distribution in the activations produced by the model during inference.

Here's how:

1. Collecting activation statistics: During a calibration phase, the model is run on a representative subset of the data while the range and distribution of the activations it produces are recorded.
2. Searching for weight-quantization parameters: Weights are quantized with these activation statistics taken into account. Concretely, we search the space of quantization parameters (e.g., scales and zero-points) to minimize the distortion that quantization incurs on the output activations; a minimal sketch of this search follows the list. As a result, the quantized weights can be represented accurately with fewer bits.
3. Quantizing: With the quantization parameters in place, the model weights are quantized using the reduced number of bits.
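
A minimal NumPy sketch of the search in step 2, assuming per-input-channel scaling factors searched over a small grid; this is illustrative, not the library's actual implementation:

```python
import numpy as np

def quantize(w, n_bits=4):
    # Symmetric uniform quantization to n_bits.
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale) * scale

def awq_scale_search(W, X, n_bits=4, grid=20):
    # W: (in_features, out_features) weight matrix.
    # X: (n_samples, in_features) calibration activations from step 1.
    act_magnitude = np.abs(X).mean(axis=0) + 1e-8   # per-channel statistic
    Y_ref = X @ W                                    # full-precision output
    best_err, best_s = np.inf, np.ones_like(act_magnitude)
    for i in range(grid):
        alpha = i / grid
        s = act_magnitude ** alpha                   # candidate per-channel scales
        # Scale weights up before quantizing and undo the scale afterwards;
        # salient channels (large activations) then lose less precision.
        W_q = quantize(W * s[:, None], n_bits) / s[:, None]
        err = np.mean((X @ W_q - Y_ref) ** 2)        # distortion on outputs
        if err < best_err:
            best_err, best_s = err, s
    return best_s
```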

### Running REBEL
**Getting started with REBEL is easy**
```python quickstart_REBEL.py
# import our client
from llm_vm.client import Client
import os

# Instantiate the client, specifying which LLMs you want to use
client = Client(big_model='chat_gpt', small_model='gpt')  # REBEL always uses chat_gpt; big_model only affects non-REBEL completion calls

# Calling client.complete with a tool list causes the REBEL agent to be used
response = client.complete(
    prompt='Is it warmer in Paris or Timbuktu, and what are the temperatures in each city?',
    context='',
    openai_key=os.getenv("OPENAI_API_KEY"),  # REBEL requires an OpenAI key
    tools=[{'description': 'Finds the weather at a location and returns it in Celsius.',  # one dictionary in the list, so one tool
            'dynamic_params': {
                "latitude": 'the latitude as a float',
                "longitude": 'the longitude as a float'
            },
            'method': 'GET',
            'url': "https://api.open-meteo.com/v1/forecast",
            'static_params': {'current_weather': 'true'}}])  # no tools by default, so you must add your own
print(response)
```
### Tool Definition
Tools are defined by dictionaries added to the `tools` list. Each dictionary must contain the following fields:

|Field| Type | Description|
|-|-|-|
|`'description'`| string | A description of what the tool does|
|`'dynamic_params'`| dictionary | Key-value pairs (parameter name : description) for the API endpoint's mutable parameters, which REBEL must fill in to answer a query|
|`'method'`| string | `GET` or `POST`, whichever the API endpoint expects|
|`'url'`| string | URL of the API endpoint that the tool calls|
|`'static_params'`| dictionary | Any parameters that are constant across all API calls; an API key or token is an example|

You can add any tools you want, and as many as you need. REBEL will attempt to answer a question compositionally, which may require calling multiple tools.

<Snippet file="github.mdx" />
128 changes: 128 additions & 0 deletions llmvm/features/client.mdx
The Anarchy Client is a Python library that facilitates easy access to Anarchy, a large language model (LLM) optimized for generating text-based completions. It provides a convenient interface to interact with the LLM, perform fine-tuning, and utilize various tools and agents for text generation. The client includes two main classes: `Simple_Inference_Client` and `Client`.

## Simple Inference Client

The `Simple_Inference_Client` is designed for straightforward text generation without the need for fine-tuning or additional agents. It allows you to generate text completions using the specified Anarchy LLM model.

### Usage

```python
from anarchy_client import Simple_Inference_Client

# Initialize the Simple Inference Client
client = Simple_Inference_Client(model="chat_gpt", openai_key="YOUR_OPENAI_API_KEY")

# Generate text completion
response = client.complete(prompt="Generate text based on this prompt.")
print(response)
```

### Methods

`complete(prompt, max_len=256, **kwargs)`

Generate text completion based on the provided prompt.

- `prompt` (str): The input prompt for text generation.
- `max_len` (int, optional): The maximum length of the generated text. Default is 256.
- `**kwargs` (dict, optional): Additional keyword arguments for text generation.
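
For example, capping the completion length (a minimal sketch; any extra keyword arguments are forwarded to the underlying model, and their names depend on the backend):

```python
# Generate a short completion by lowering max_len from its default of 256.
response = client.complete(
    prompt="Summarize activation-aware weight quantization in one sentence.",
    max_len=64,
)
print(response)
```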

## Client

The `Client` class offers more advanced features, including fine-tuning, data synthesis, and integration with agents for specialized text generation tasks.

### Usage

```python
from anarchy_client import Client

# Initialize the Client with default models
client = Client()

# Generate text completion
response = client.complete(prompt="Generate text based on this prompt.")
print(response)
```

### Initialization Parameters

- `big_model` (str, optional): Name of the reliable source Anarchy LLM model. Default is `"chat_gpt"`.
- `small_model` (str, optional): Name of the small model used for fine-tuning. Default is `"pythia"`.
- `big_model_config` (dict, optional): Configuration options for the reliable source model.
- `small_model_config` (dict, optional): Configuration options for the small model used in fine-tuning.

### Methods

`complete(prompt, context="", openai_key=None, finetune=False, data_synthesis=False, temperature=0, stoptoken=None, tools=None, openai_kwargs={}, hf_kwargs={})`

Generate text completion using the Anarchy LLM.

- `prompt` (str): The input prompt for text generation.
- `context` (str, optional): Unchanging context sent to the LLM with every generation. Defaults to an empty string, in which case no fine-tuning is performed.
- `openai_key` (str, optional): API key for OpenAI access.
- `finetune` (bool, optional): If `True`, fine-tuning is initiated. Default is `False`.
- `data_synthesis` (bool, optional): Boolean value to determine whether data should be synthesized for fine-tuning. Default is `False`.
- `temperature` (float, optional): Sampling temperature for text generation, ranging from 0 to 2.
- `stoptoken` (str or list, optional): Sequence for stopping token generation.
- `tools` (list, optional): List of API tools for use with the Rebel agent.
- `openai_kwargs` (dict, optional): Keyword arguments for generation with OpenAI models.
- `hf_kwargs` (dict, optional): Keyword arguments for generation with Hugging Face models.
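
As a sketch, a call that triggers fine-tuning of the small model on synthesized data might look like this (the prompt and context are illustrative):

```python
import os

# finetune=True initiates fine-tuning of the small model;
# data_synthesis=True lets the client synthesize extra training data.
response = client.complete(
    prompt="What is the capital of France?",
    context="Answer geography questions concisely.",
    openai_key=os.getenv("OPENAI_API_KEY"),
    finetune=True,
    data_synthesis=True,
    temperature=0,
)
print(response)
```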

`load_finetune(model_filename=None, small_model=False)`

Load a fine-tuned model for either the reliable source (big) model or the small model.

- `model_filename` (str, optional): The filename of the fine-tuned model.
- `small_model` (bool, optional): If `True`, load the model for the small model. Default is `False`.

`change_model_dtype(big_model_dtype=None, small_model_dtype=None)`

Change the model data type for either the reliable source (big) model or the small model.

- `big_model_dtype` (str, optional): Data type for the reliable source model.
- `small_model_dtype` (str, optional): Data type for the small model.
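
For illustration (the model filename and dtype string below are placeholders, not values mandated by the library):

```python
# Load a previously saved fine-tune into the small-model slot.
client.load_finetune(model_filename="my_finetuned_small_model", small_model=True)

# Switch the small model to half precision.
client.change_model_dtype(small_model_dtype="float16")
```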

`set_pinecone_db(api_key, env_name)`

Set up the connection to a Pinecone vector database for storing and querying embeddings.

- `api_key` (str): API key for accessing the Pinecone database.
- `env_name` (str): Name of the Pinecone environment.

`store_pdf(pdf_file_path, chunk_size=1024, **kwargs)`

Store text data from a PDF file into the Pinecone vector database after encoding it into embeddings.

- `pdf_file_path` (str): Path to the PDF file.
- `chunk_size` (int, optional): Chunk size for processing the PDF. Default is 1024.
- `**kwargs` (dict, optional): Additional keyword arguments for storing data in the database.

`create_pinecone_index(**kwargs)`

Create an index in the Pinecone vector database.

- `**kwargs` (dict, optional): Additional keyword arguments for creating the index.

`RAG_complete(prompt, context="", openai_key=None, finetune=False, data_synthesis=False, temperature=0, stoptoken=None, tools=None, query_kwargs={}, hf_kwargs={}, openai_kwargs={})`

Generate text completion using the RAG (Retrieval-Augmented Generation) approach. This method first retrieves relevant documents from the Pinecone database based on the prompt's embeddings and then generates text completions.

- `prompt` (str): The input prompt for text generation.
- `context` (str, optional): Unchanging context sent to the LLM with every generation. Defaults to an empty string, in which case no fine-tuning is performed.
- `openai_key` (str, optional): API key for OpenAI access.
- `finetune` (bool, optional): If `True`, fine-tuning is initiated. Default is `False`.
- `data_synthesis` (bool, optional): Boolean value to determine whether data should be synthesized for fine-tuning. Default is `False`.
- `temperature` (float, optional): Sampling temperature for text generation, ranging from 0 to 2.
- `stoptoken` (str or list, optional): Sequence for stopping token generation.
- `tools` (list, optional): List of API tools for use with the Rebel agent.
- `query_kwargs` (dict, optional): Keyword arguments for querying the Pinecone vector database.
- `hf_kwargs` (dict, optional): Keyword arguments for generation with Hugging Face models.
- `openai_kwargs` (dict, optional): Keyword arguments for generation with OpenAI models.
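
Putting the database methods and `RAG_complete` together, a retrieval-augmented workflow might look like the following sketch (the API key, environment name, and PDF path are placeholders):

```python
import os

# Connect to Pinecone, create an index, and ingest a document.
client.set_pinecone_db(api_key=os.getenv("PINECONE_API_KEY"), env_name="us-west1-gcp")
client.create_pinecone_index()
client.store_pdf("employee_handbook.pdf", chunk_size=1024)

# Retrieve relevant chunks for the prompt, then generate a completion.
response = client.RAG_complete(
    prompt="What does the handbook say about vacation policy?",
    openai_key=os.getenv("OPENAI_API_KEY"),
)
print(response)
```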

## Examples

For detailed examples and usage of the Anarchy Client, refer to the provided code snippets and the Anarchy Client documentation.

## License

This library is provided under the MIT License. See the [LICENSE](LICENSE) file for more details.
File renamed without changes.
File renamed without changes.
49 changes: 49 additions & 0 deletions llmvm/features/rag.mdx
---
title: RAG
description: 'RAG models combine the strengths of pretrained dense passage retrieval (DPR) and sequence-to-sequence models.'
---

## The RAG Model
**Retrieval-augmented generation**

RAG models retrieve documents, pass them to a seq2seq model, then marginalize to generate outputs. The retriever and seq2seq modules are initialized from pretrained models, and fine-tuned jointly, allowing both retrieval and generation to adapt to downstream tasks.
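
Concretely, for an input x, the RAG-Sequence formulation (Lewis et al., 2020) marginalizes the seq2seq likelihood over the top-k documents z scored by the retriever:

```latex
% RAG-Sequence: marginalize over the top-k retrieved documents z
p(y \mid x) \;\approx\; \sum_{z \,\in\, \mathrm{top}\text{-}k\,(p_\eta(\cdot \mid x))}
  p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta\!\left(y_i \mid x, z, y_{1:i-1}\right)
```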

### Running RAG
**Getting started with REBEL is easy**
```python quickstart_REBEL.py
# import our client
from llm_vm.client import Client
import os

# Instantiate the client, specifying which LLMs you want to use
client = Client(big_model='chat_gpt', small_model='gpt')  # REBEL always uses chat_gpt; big_model only affects non-REBEL completion calls

# Calling client.complete with a tool list causes the REBEL agent to be used
response = client.complete(
    prompt='Is it warmer in Paris or Timbuktu, and what are the temperatures in each city?',
    context='',
    openai_key=os.getenv("OPENAI_API_KEY"),  # REBEL requires an OpenAI key
    tools=[{'description': 'Finds the weather at a location and returns it in Celsius.',  # one dictionary in the list, so one tool
            'dynamic_params': {
                "latitude": 'the latitude as a float',
                "longitude": 'the longitude as a float'
            },
            'method': 'GET',
            'url': "https://api.open-meteo.com/v1/forecast",
            'static_params': {'current_weather': 'true'}}])  # no tools by default, so you must add your own
print(response)
```
### Tool Definition
Tools are defined by dictionaries added to the `tools` list. Each dictionary must contain the following fields:

|Field| Type | Description|
|-|-|-|
|`'description'`| string | A description of what the tool does|
|`'dynamic_params'`| dictionary | Key-value pairs (parameter name : description) for the API endpoint's mutable parameters, which REBEL must fill in to answer a query|
|`'method'`| string | `GET` or `POST`, whichever the API endpoint expects|
|`'url'`| string | URL of the API endpoint that the tool calls|
|`'static_params'`| dictionary | Any parameters that are constant across all API calls; an API key or token is an example|

You can add any tools you want, and as many as you need. REBEL will attempt to answer a question compositionally, which may require calling multiple tools.

<Snippet file="github.mdx" />
49 changes: 49 additions & 0 deletions llmvm/features/rebel.mdx
---
title: REBEL
description: 'Our AI agents expand what you can do with LLMs!'
---

## The REBEL Agent
**REcursion Based Extensible LLM**

Our REBEL agent takes a novel approach to answering complex questions. Using recursive reasoning, REBEL expands what LLMs can do with problem decomposition and tool use. In this way, we are able to answer questions requiring data LLMs were not directly trained on.
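
Conceptually, the recursion looks like the following sketch (illustrative only; `decompose`, `answer_atomic`, and `combine` stand in for LLM-driven steps and are not functions exposed by the library):

```python
def rebel_answer(question, tools):
    # Ask an LLM whether the question should be split into subquestions.
    subquestions = decompose(question)
    if not subquestions:
        # Base case: answer directly, calling a tool if one applies.
        return answer_atomic(question, tools)
    # Recursive case: solve each subquestion, then fuse the answers.
    answers = [rebel_answer(q, tools) for q in subquestions]
    return combine(question, answers)
```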

### Running REBEL
**Getting started with REBEL is easy**
```python quickstart_REBEL.py
# import our client
from llm_vm.client import Client
import os

# Instantiate the client, specifying which LLMs you want to use
client = Client(big_model='chat_gpt', small_model='gpt')  # REBEL always uses chat_gpt; big_model only affects non-REBEL completion calls

# Calling client.complete with a tool list causes the REBEL agent to be used
response = client.complete(
    prompt='Is it warmer in Paris or Timbuktu, and what are the temperatures in each city?',
    context='',
    openai_key=os.getenv("OPENAI_API_KEY"),  # REBEL requires an OpenAI key
    tools=[{'description': 'Finds the weather at a location and returns it in Celsius.',  # one dictionary in the list, so one tool
            'dynamic_params': {
                "latitude": 'the latitude as a float',
                "longitude": 'the longitude as a float'
            },
            'method': 'GET',
            'url': "https://api.open-meteo.com/v1/forecast",
            'static_params': {'current_weather': 'true'}}])  # no tools by default, so you must add your own
print(response)
```
### Tool Definition
Tools are defined by dictionaries added to the `tools` list. Each dictionary must contain the following fields:

|Field| Type | Description|
|-|-|-|
|`'description'`| string | A description of what the tool does|
|`'dynamic_params'`| dictionary | Key-value pairs (parameter name : description) for the API endpoint's mutable parameters, which REBEL must fill in to answer a query|
|`'method'`| string | `GET` or `POST`, whichever the API endpoint expects|
|`'url'`| string | URL of the API endpoint that the tool calls|
|`'static_params'`| dictionary | Any parameters that are constant across all API calls; an API key or token is an example|

You can add any tools you want, and as many as you need. REBEL will attempt to answer a question compositionally, which may require calling multiple tools.

<Snippet file="github.mdx" />
54 changes: 54 additions & 0 deletions llmvm/features/selfins.mdx
---
title: SELF-INSTRUCT
description: 'The motivation for SELF-INSTRUCT is to reduce dependence on human annotators.'
---

## The SELF-INSTRUCTor
**The non-human annotator**

Large language models can follow instructions effectively when tuned on annotated “instructional” data. However, this requires a large dataset of human-generated instructional commands and their desired instances, which is expensive to create; moreover, such human-generated datasets can lack diversity and creativity.

SELF-INSTRUCT tackles this bottleneck by reducing the dependence on human annotators. At a high level, there are two components:

1. Starting with a seed set of instructions+instances, the self-instruct process uses LLMs to generate new instructions and instances in a bootstrapped manner.

2. Next, it uses the generated instructions to fine-tune an instruction-following model from a vanilla pre-trained language model. A schematic of the bootstrap loop in step 1 follows.
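
This schematic is illustrative only; the prompt-building, parsing, and filtering helpers are placeholders, not library functions:

```python
import random

def self_instruct_bootstrap(seed_tasks, llm, rounds=4):
    pool = list(seed_tasks)                       # seed instructions + instances
    for _ in range(rounds):
        # Few-shot prompt the LLM with examples sampled from the pool ...
        prompt = build_fewshot_prompt(random.sample(pool, k=8))
        # ... and let it propose new instructions and instances.
        candidates = parse_tasks(llm.generate(prompt))
        # Keep only novel, well-formed tasks to preserve diversity.
        pool += [t for t in candidates if is_novel(t, pool)]
    return pool                                    # used to fine-tune the base model
```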
### Running REBEL
**Getting started with REBEL is easy**
```python quickstart_REBEL.py
# import our client
from llm_vm.client import Client
import os

# Instantiate the client, specifying which LLMs you want to use
client = Client(big_model='chat_gpt', small_model='gpt')  # REBEL always uses chat_gpt; big_model only affects non-REBEL completion calls

# Calling client.complete with a tool list causes the REBEL agent to be used
response = client.complete(
    prompt='Is it warmer in Paris or Timbuktu, and what are the temperatures in each city?',
    context='',
    openai_key=os.getenv("OPENAI_API_KEY"),  # REBEL requires an OpenAI key
    tools=[{'description': 'Finds the weather at a location and returns it in Celsius.',  # one dictionary in the list, so one tool
            'dynamic_params': {
                "latitude": 'the latitude as a float',
                "longitude": 'the longitude as a float'
            },
            'method': 'GET',
            'url': "https://api.open-meteo.com/v1/forecast",
            'static_params': {'current_weather': 'true'}}])  # no tools by default, so you must add your own
print(response)
```
### Tool Definition
Tools are defined by dictionaries added to the `tools` list. Each dictionary must contain the following fields:

|Field| Type | Description|
|-|-|-|
|`'description'`| string | A description of what the tool does|
|`'dynamic_params'`| dictionary | Key-value pairs (parameter name : description) for the API endpoint's mutable parameters, which REBEL must fill in to answer a query|
|`'method'`| string | `GET` or `POST`, whichever the API endpoint expects|
|`'url'`| string | URL of the API endpoint that the tool calls|
|`'static_params'`| dictionary | Any parameters that are constant across all API calls; an API key or token is an example|

You can add any tools you want, and as many as you need. REBEL will attempt to answer a question compositionally, which may require calling multiple tools.

<Snippet file="github.mdx" />
File renamed without changes.