Merge branch 'master' into on_chain_start_fix
keenborder786 authored Sep 19, 2024
2 parents 7bba1ea + c453b76 commit 1c4db33
Showing 264 changed files with 5,003 additions and 4,830 deletions.
25 changes: 15 additions & 10 deletions .github/DISCUSSION_TEMPLATE/q-a.yml
@@ -96,22 +96,27 @@ body:
- type: textarea
id: system-info
attributes:
label: System Info
description: |
Please share your system info with us. Do NOT skip this step and please don't trim
the output. Most users don't include enough information here and it makes it harder
for us to help you.
Please share your system info with us.
Run the following command in your terminal and paste the output here:
"pip freeze | grep langchain"
platform (windows / linux / mac)
python version
python -m langchain_core.sys_info
OR if you're on a recent version of langchain-core you can paste the output of:
or if you have an existing python interpreter running:
python -m langchain_core.sys_info
placeholder: |
"pip freeze | grep langchain"
platform
python version
from langchain_core import sys_info
sys_info.print_sys_info()
Alternatively, if you're on a recent version of langchain-core you can paste the output of:
alternatively, put the entire output of `pip freeze` here.
placeholder: |
python -m langchain_core.sys_info
These will only surface LangChain packages, don't forget to include any other relevant
packages you're using (if you're not sure what's relevant, you can paste the entire output of `pip freeze`).
validations:
required: true
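
The template above asks reporters for their LangChain package versions, platform, and Python version. A minimal sketch of collecting the same details with only the standard library follows; it is not part of this commit, and the helper name `collect_system_info` and its `needle` parameter are hypothetical. Where `langchain-core` is installed, `python -m langchain_core.sys_info` (named in the template) is the simpler route.

```python
import platform
import sys
from importlib import metadata


def collect_system_info(needle: str = "langchain") -> dict:
    """Gather platform, Python version, and installed packages whose name contains `needle`.

    Hypothetical helper: it approximates `pip freeze | grep langchain`
    without shelling out to pip.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        # Skip distributions with missing metadata instead of failing.
        if name and needle in name.lower():
            packages[name] = dist.version
    return {
        "platform": platform.system(),
        "python_version": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    info = collect_system_info()
    print(f"platform: {info['platform']}")
    print(f"python version: {info['python_version']}")
    for name, version in info["packages"].items():
        print(f"{name}=={version}")
```

The output lists only packages matching the filter, so any other relevant packages still need to be added by hand, as the template text notes.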
2 changes: 1 addition & 1 deletion .github/workflows/_release.yml
@@ -85,7 +85,7 @@ jobs:
path: langchain
sparse-checkout: | # this only grabs files for relevant dir
${{ inputs.working-directory }}
ref: master # this scopes to just master branch
ref: ${{ github.ref }} # this scopes to just ref'd branch
fetch-depth: 0 # this fetches entire commit history
- name: Check Tags
id: check-tags
@@ -13,7 +13,7 @@
"\n",
"This sample demonstrates the use of `Amazon Textract` in combination with LangChain as a DocumentLoader.\n",
"\n",
"`Textract` supports`PDF`, `TIF`F, `PNG` and `JPEG` format.\n",
"`Textract` supports`PDF`, `TIFF`, `PNG` and `JPEG` format.\n",
"\n",
"`Textract` supports these [document sizes, languages and characters](https://docs.aws.amazon.com/textract/latest/dg/limits-document.html)."
]
@@ -6,7 +6,7 @@
"source": [
"# Google Speech-to-Text Audio Transcripts\n",
"\n",
"The `GoogleSpeechToTextLoader` allows to transcribe audio files with the [Google Cloud Speech-to-Text API](https://cloud.google.com/speech-to-text) and loads the transcribed text into documents.\n",
"The `SpeechToTextLoader` allows to transcribe audio files with the [Google Cloud Speech-to-Text API](https://cloud.google.com/speech-to-text) and loads the transcribed text into documents.\n",
"\n",
"To use it, you should have the `google-cloud-speech` python package installed, and a Google Cloud project with the [Speech-to-Text API enabled](https://cloud.google.com/speech-to-text/v2/docs/transcribe-client-libraries#before_you_begin).\n",
"\n",
@@ -41,7 +41,7 @@
"source": [
"## Example\n",
"\n",
"The `GoogleSpeechToTextLoader` must include the `project_id` and `file_path` arguments. Audio files can be specified as a Google Cloud Storage URI (`gs://...`) or a local file path.\n",
"The `SpeechToTextLoader` must include the `project_id` and `file_path` arguments. Audio files can be specified as a Google Cloud Storage URI (`gs://...`) or a local file path.\n",
"\n",
"Only synchronous requests are supported by the loader, which has a [limit of 60 seconds or 10MB](https://cloud.google.com/speech-to-text/v2/docs/sync-recognize#:~:text=60%20seconds%20and/or%2010%20MB) per audio file."
]
@@ -52,13 +52,13 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_google_community import GoogleSpeechToTextLoader\n",
"from langchain_google_community import SpeechToTextLoader\n",
"\n",
"project_id = \"<PROJECT_ID>\"\n",
"file_path = \"gs://cloud-samples-data/speech/audio.flac\"\n",
"# or a local file path: file_path = \"./audio.wav\"\n",
"\n",
"loader = GoogleSpeechToTextLoader(project_id=project_id, file_path=file_path)\n",
"loader = SpeechToTextLoader(project_id=project_id, file_path=file_path)\n",
"\n",
"docs = loader.load()"
]
@@ -152,7 +152,7 @@
" RecognitionConfig,\n",
" RecognitionFeatures,\n",
")\n",
"from langchain_google_community import GoogleSpeechToTextLoader\n",
"from langchain_google_community import SpeechToTextLoader\n",
"\n",
"project_id = \"<PROJECT_ID>\"\n",
"location = \"global\"\n",
@@ -171,7 +171,7 @@
" ),\n",
")\n",
"\n",
"loader = GoogleSpeechToTextLoader(\n",
"loader = SpeechToTextLoader(\n",
" project_id=project_id,\n",
" location=location,\n",
" recognizer_id=recognizer_id,\n",
45 changes: 43 additions & 2 deletions docs/docs/integrations/document_loaders/unstructured_file.ipynb
@@ -16,7 +16,7 @@
"\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/document_loaders/file_loaders/unstructured/)|\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [UnstructuredLoader](https://python.langchain.com/api_reference/unstructured/document_loaders/langchain_unstructured.document_loaders.UnstructuredLoader.html) | [langchain_community](https://python.langchain.com/api_reference/unstructured/index.html) | ✅ | ❌ | ✅ | \n",
"| [UnstructuredLoader](https://python.langchain.com/api_reference/unstructured/document_loaders/langchain_unstructured.document_loaders.UnstructuredLoader.html) | [langchain_unstructured](https://python.langchain.com/api_reference/unstructured/index.html) | ✅ | ❌ | ✅ | \n",
"### Loader features\n",
"| Source | Document Lazy Loading | Native Async Support\n",
"| :---: | :---: | :---: | \n",
@@ -519,6 +519,47 @@
"print(\"Length of text in the document:\", len(docs[0].page_content))"
]
},
{
"cell_type": "markdown",
"id": "3ec3c22d-02cd-498b-921f-b839d1404f32",
"metadata": {},
"source": [
"## Loading web pages\n",
"\n",
"`UnstructuredLoader` accepts a `web_url` kwarg when run locally that populates the `url` parameter of the underlying Unstructured [partition](https://docs.unstructured.io/open-source/core-functionality/partitioning). This allows for the parsing of remotely hosted documents, such as HTML web pages.\n",
"\n",
"Example usage:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "bf9a8546-659d-4861-bff2-fdf1ad93ac65",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"page_content='Example Domain' metadata={'category_depth': 0, 'languages': ['eng'], 'filetype': 'text/html', 'url': 'https://www.example.com', 'category': 'Title', 'element_id': 'fdaa78d856f9d143aeeed85bf23f58f8'}\n",
"\n",
"page_content='This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.' metadata={'languages': ['eng'], 'parent_id': 'fdaa78d856f9d143aeeed85bf23f58f8', 'filetype': 'text/html', 'url': 'https://www.example.com', 'category': 'NarrativeText', 'element_id': '3652b8458b0688639f973fe36253c992'}\n",
"\n",
"page_content='More information...' metadata={'category_depth': 0, 'link_texts': ['More information...'], 'link_urls': ['https://www.iana.org/domains/example'], 'languages': ['eng'], 'filetype': 'text/html', 'url': 'https://www.example.com', 'category': 'Title', 'element_id': '793ab98565d6f6d6f3a6d614e3ace2a9'}\n",
"\n"
]
}
],
"source": [
"from langchain_unstructured import UnstructuredLoader\n",
"\n",
"loader = UnstructuredLoader(web_url=\"https://www.example.com\")\n",
"docs = loader.load()\n",
"\n",
"for doc in docs:\n",
" print(f\"{doc}\\n\")"
]
},
{
"cell_type": "markdown",
"id": "ce01aa40",
@@ -546,7 +587,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
"version": "3.10.4"
}
},
"nbformat": 4,
120 changes: 1 addition & 119 deletions docs/docs/integrations/llms/sambanova.ipynb
@@ -6,129 +6,11 @@
"source": [
"# SambaNova\n",
"\n",
"**[SambaNova](https://sambanova.ai/)'s** [Sambaverse](https://sambaverse.sambanova.ai/) and [Sambastudio](https://sambanova.ai/technology/full-stack-ai-platform) are platforms for running your own open-source models\n",
"**[SambaNova](https://sambanova.ai/)'s** [Sambastudio](https://sambanova.ai/technology/full-stack-ai-platform) is a platform for running your own open-source models\n",
"\n",
"This example goes over how to use LangChain to interact with SambaNova models"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sambaverse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Sambaverse** allows you to interact with multiple open-source models. You can view the list of available models and interact with them in the [playground](https://sambaverse.sambanova.ai/playground).\n",
" **Please note that Sambaverse's free offering is performance-limited.** Companies that are ready to evaluate the production tokens-per-second performance, volume throughput, and 10x lower total cost of ownership (TCO) of SambaNova should [contact us](https://sambaverse.sambanova.ai/contact-us) for a non-limited evaluation instance."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An API key is required to access Sambaverse models. To get a key, create an account at [sambaverse.sambanova.ai](https://sambaverse.sambanova.ai/)\n",
"\n",
"The [sseclient-py](https://pypi.org/project/sseclient-py/) package is required to run streaming predictions "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --quiet sseclient-py==1.8.0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Register your API key as an environment variable:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"sambaverse_api_key = \"<Your sambaverse API key>\"\n",
"\n",
"# Set the environment variables\n",
"os.environ[\"SAMBAVERSE_API_KEY\"] = sambaverse_api_key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Call Sambaverse models directly from LangChain!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.llms.sambanova import Sambaverse\n",
"\n",
"llm = Sambaverse(\n",
" sambaverse_model_name=\"Meta/llama-2-7b-chat-hf\",\n",
" streaming=False,\n",
" model_kwargs={\n",
" \"do_sample\": True,\n",
" \"max_tokens_to_generate\": 1000,\n",
" \"temperature\": 0.01,\n",
" \"select_expert\": \"llama-2-7b-chat-hf\",\n",
" \"process_prompt\": False,\n",
" # \"stop_sequences\": '\\\"sequence1\\\",\\\"sequence2\\\"',\n",
" # \"repetition_penalty\": 1.0,\n",
" # \"top_k\": 50,\n",
" # \"top_p\": 1.0\n",
" },\n",
")\n",
"\n",
"print(llm.invoke(\"Why should I use open source models?\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Streaming response\n",
"\n",
"from langchain_community.llms.sambanova import Sambaverse\n",
"\n",
"llm = Sambaverse(\n",
" sambaverse_model_name=\"Meta/llama-2-7b-chat-hf\",\n",
" streaming=True,\n",
" model_kwargs={\n",
" \"do_sample\": True,\n",
" \"max_tokens_to_generate\": 1000,\n",
" \"temperature\": 0.01,\n",
" \"select_expert\": \"llama-2-7b-chat-hf\",\n",
" \"process_prompt\": False,\n",
" # \"stop_sequences\": '\\\"sequence1\\\",\\\"sequence2\\\"',\n",
" # \"repetition_penalty\": 1.0,\n",
" # \"top_k\": 50,\n",
" # \"top_p\": 1.0\n",
" },\n",
")\n",
"\n",
"for chunk in llm.stream(\"Why should I use open source models?\"):\n",
" print(chunk, end=\"\", flush=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
10 changes: 5 additions & 5 deletions docs/docs/integrations/providers/mlflow.mdx
@@ -1,12 +1,12 @@
# MLflow Deployments for LLMs
# MLflow AI Gateway for LLMs

>[The MLflow Deployments for LLMs](https://www.mlflow.org/docs/latest/llms/deployments/index.html) is a powerful tool designed to streamline the usage and management of various large
>[The MLflow AI Gateway for LLMs](https://www.mlflow.org/docs/latest/llms/deployments/index.html) is a powerful tool designed to streamline the usage and management of various large
> language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface
> that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests.
## Installation and Setup

Install `mlflow` with MLflow Deployments dependencies:
Install `mlflow` with MLflow GenAI dependencies:

```sh
pip install 'mlflow[genai]'
@@ -39,10 +39,10 @@ endpoints:
openai_api_key: $OPENAI_API_KEY
```
Start the deployments server:
Start the gateway server:
```sh
mlflow deployments start-server --config-path /path/to/config.yaml
mlflow gateway start --config-path /path/to/config.yaml
```

## Example provided by `MLflow`