Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asr、general-ocr、object-recognize、translate组件补充cookbook #198

Merged
merged 4 commits into from
Mar 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
151 changes: 151 additions & 0 deletions cookbooks/asr.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# 短语音识别组件\n",
"\n",
"## 目标\n",
"使用短语音识别组件对输入的语音文件进行识别,返回识别的文字。\n",
"\n",
"## 准备工作\n",
"### 平台注册\n",
"1.先在appbuilder平台注册,获取token\n",
"\n",
"2.安装appbuilder-sdk"
]
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"!pip install appbuilder-sdk"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"## 基本用法\n",
"\n",
"### 快速开始\n",
"\n",
"下面是短语音识别的代码示例:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import os\n",
"import requests\n",
"import appbuilder\n",
"# 设置环境变量和初始化\n",
"# 请前往千帆AppBuilder官网创建密钥,流程详见:https://cloud.baidu.com/doc/AppBuilder/s/Olq6grrt6#1%E3%80%81%E5%88%9B%E5%BB%BA%E5%AF%86%E9%92%A5\n",
"os.environ[\"APPBUILDER_TOKEN\"] = \"...\"\n",
"\n",
"asr = appbuilder.ASR()\n",
"\n",
"audio_file_url = \"https://bj.bcebos.com/v1/appbuilder/asr_test.pcm?authorization=bce-auth-v1\" \\\n",
" \"%2FALTAKGa8m4qCUasgoljdEDAzLm%2F2024-01-11T10%3A56%3A41Z%2F-1%2Fhost\" \\\n",
" \"%2Fa6c4d2ca8a3f0259f4cae8ae3fa98a9f75afde1a063eaec04847c99ab7d1e411\"\n",
"audio_data = requests.get(audio_file_url).content\n",
"content_data = {\"audio_format\": \"pcm\", \"raw_audio\": audio_data, \"rate\": 16000}\n",
"msg = appbuilder.Message(content_data)\n",
"out = asr.run(msg)\n",
"print(out.content)"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"## 参数说明\n",
"\n",
"### 鉴权配置\n",
"使用组件之前,请首先申请并设置鉴权参数,可参考[组件使用流程](https://cloud.baidu.com/doc/AppBuilder/s/Olq6grrt6#1%E3%80%81%E5%88%9B%E5%BB%BA%E5%AF%86%E9%92%A5)。\n",
"```python\n",
"# 设置环境中的TOKEN,以下示例略\n",
"os.environ[\"APPBUILDER_TOKEN\"] = \"bce-YOURTOKEN\"\n",
"```\n",
"\n",
"### 初始化参数\n",
"\n",
"无\n",
"\n",
"### 调用参数\n",
"\n",
"|参数名称 |参数类型 |是否必须 |描述 | 示例值 |\n",
"|--------|--------|--------|----|--------|\n",
"|message |String |是 |输入的消息,用于模型的主要输入内容。这是一个必需的参数| Message(content={\"raw_audio\": b\"...\"}) |\n",
"|audio_format|String|是 |定义语言文件的格式,包括\"pcm\"、\"wav\"、\"amr\"、\"m4a\",默认值为\"pcm\"| pcm |\n",
"|rate|Integer|是 |定义录音采样率,固定值16000| 16000 |\n",
"|timeout| Float | 否 | HTTP超时时间,单位:秒 |1|\n",
"|retry|Integer|是 |HTTP重试次数| 3 |\n",
"\n",
"### 响应参数\n",
"|参数名称 | 参数类型 |描述 |示例值|\n",
"|--------|--------------|----|------|\n",
"|result | List[String] |返回结果|[\"北京科技馆。\"]|\n",
"\n",
"### 响应示例\n",
"```json\n",
"{\"result\": [\"北京科技馆。\"]}\n",
"```\n",
"### 错误码\n",
"| 错误码 |描述|\n",
"|---|---|\n",
"| 0 |success|\n",
"| 2000 |data empty|"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
182 changes: 182 additions & 0 deletions cookbooks/general_ocr.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# 通用文字识别-高精度版组件\n",
"\n",
"## 目标\n",
"使用通用文字识别-高精度版组件对图片上的全部文字内容进行检测识别。\n",
"\n",
"## 准备工作\n",
"### 平台注册\n",
"1.先在appbuilder平台注册,获取token\n",
"\n",
"2.安装appbuilder-sdk"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"!pip install appbuilder-sdk"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"## 基本用法\n",
"\n",
"### 快速开始\n",
"\n",
"以下是一个简单的例子来演示如何开始使用GeneralOCR组件:"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import os\n",
"import appbuilder\n",
"import requests\n",
"\n",
"# 请前往千帆AppBuilder官网创建密钥,流程详见:https://cloud.baidu.com/doc/AppBuilder/s/Olq6grrt6#1%E3%80%81%E5%88%9B%E5%BB%BA%E5%AF%86%E9%92%A5\n",
"os.environ[\"APPBUILDER_TOKEN\"] = '...'\n",
"# 从BOS读取样例图片\n",
"image_url = \"https://bj.bcebos.com/v1/appbuilder/general_ocr_test.png?\"\\\n",
" \"authorization=bce-auth-v1%2FALTAKGa8m4qCUasgoljdEDAzLm%2F2024-01-\"\\\n",
" \"11T10%3A59%3A17Z%2F-1%2Fhost%2F081bf7bcccbda5207c82a4de074628b04ae\"\\\n",
" \"857a27513734d765495f89ffa5f73\"\n",
"raw_image = requests.get(image_url).content\n",
"general_ocr = appbuilder.GeneralOCR()\n",
"out = general_ocr.run(appbuilder.Message(content={\"raw_image\": raw_image}))\n",
"print(out.content)"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"## 参数说明\n",
"\n",
"### 鉴权说明\n",
"使用组件之前,请首先申请并设置鉴权参数,可参考[组件使用流程](https://cloud.baidu.com/doc/AppBuilder/s/Olq6grrt6#1%E3%80%81%E5%88%9B%E5%BB%BA%E5%AF%86%E9%92%A5)。\n",
"```python\n",
"# 设置环境中的TOKEN,以下示例略\n",
"os.environ[\"APPBUILDER_TOKEN\"] = \"bce-YOURTOKEN\"\n",
"```\n",
"\n",
"### 初始化参数\n",
"\n",
"无\n",
"\n",
"### 调用参数\n",
"\n",
"| 参数名称 | 参数类型 | 是否必须 | 描述 | 示例值 |\n",
"|---------|---------|------|-----------------------------|------------------------------------------------|\n",
"| message | String | 是 | 输入的消息,用于模型的主要输入内容。这是一个必需的参数 | Message(content={\"raw_image\": b\"待识别的图片字节流数据\"}) |\n",
"|timeout| Float | 否 | HTTP超时时间,单位:秒 |1|\n",
"| retry | Integer | 否 | HTTP重试次数 | 3 |\n",
"\n",
"### 响应参数\n",
"| 参数名称 | 参数类型 | 描述 | 示例值 |\n",
"|--------------|---------|---------|---------------------------------------------------|\n",
"| words_result | Array[] | 返回结果 | [{\"words\":\"一站式企业级大模型平台,提供先进的生成式AI生产及应用全流程开发工具链\"}] |\n",
"| + words | String | 识别结果字符串 | \"百度智能云千帆大模型平台\" |\n",
"\n",
"### 响应示例\n",
"#### 示例图片\n",
"![示例图片](./image/general_ocr示例.png)\n",
"#### 识别结果\n",
"```json\n",
"{\n",
" \"words_result\":[\n",
" {\n",
" \"words\":\"一站式企业级大模型平台,提供先进的生成式AI生产及应用全流程开发工具链\"\n",
" },\n",
" {\n",
" \"words\":\"百度智能云千帆大模型平台\"\n",
" },\n",
" {\n",
" \"words\":\"文心大模型4.0已正式发布,个人和企业客户可通过百度智能云千帆大模型平台接入使用\"\n",
" },\n",
" {\n",
" \"words\":\"立即使用\"\n",
" },\n",
" {\n",
" \"words\":\"在线体验\"\n",
" },\n",
" {\n",
" \"words\":\"使用文档\"\n",
" },\n",
" {\n",
" \"words\":\"定价说明\"\n",
" },\n",
" {\n",
" \"words\":\"千帆社区\"\n",
" },\n",
" {\n",
" \"words\":\"常见概念、使用指导\"\n",
" },\n",
" {\n",
" \"words\":\"定价、计费方式、计量说明\"\n",
" },\n",
" {\n",
" \"words\":\"大模型开发学习、交流社区\"\n",
" }\n",
" ]\n",
"}\n",
"```\n"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
Binary file added cookbooks/image/general_ocr示例.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added cookbooks/image/object_recognize示例.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Loading