Strange output when using the return_word_box feature #14430
Unanswered
huameinan219 asked this question in Q&A
Replies: 1 comment
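You mentioned that using return_word_box=True produces output with an unexpected structure.
Below is a breakdown of that output structure and an explanation of how to convert it into detection boxes.
Output structure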
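As a minimal sketch of how the sample line from the question could be taken apart: the field labels below (col_count, chars, cols, group_types) are this sketch's own names and reflect one reading of the data, not documented PaddleOCR identifiers. The reading assumes the first number is a column count for the text line and that each integer is the column at which a character was recognized.

```python
# Sample line copied from the question. The labels assigned below are
# assumptions made for illustration, not names taken from PaddleOCR.
sample_line = [
    [[26.0, 37.0], [304.0, 37.0], [304.0, 73.0], [26.0, 73.0]],
    ("纯臻营养护发素", 0.9946897625923157,
     [46.085826210826205,
      [["纯", "臻", "营", "养", "护", "发", "素"]],
      [[3, 10, 16, 23, 30, 36, 43]],
      ["cn"]]),
]


def unpack_word_info(line):
    """Split one result line into labeled fields (labels are assumptions)."""
    box, (text, confidence, extra) = line
    col_count, char_groups, col_groups, group_types = extra
    return {
        "box": box,                    # 4-point box of the whole text line
        "text": text,
        "confidence": confidence,
        "col_count": col_count,        # assumed: number of columns across the line
        "chars": [ch for group in char_groups for ch in group],
        "cols": [c for group in col_groups for c in group],
        "group_types": group_types,    # e.g. "cn" in the sample
    }


info = unpack_word_info(sample_line)
for ch, col in zip(info["chars"], info["cols"]):
    print(ch, col)
```

In the sample, every column index is smaller than the leading number and the indices increase from left to right. This is consistent with reading them as positions along the line rather than as widths.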
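Why does the output contain per-character information?
This is because return_word_box=True is enabled.
How do I convert the per-character information into detection boxes?
If you only need the per-character detection boxes, you can proceed as follows: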
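One possible way to carry out the conversion, offered as an assumption about what the numbers mean rather than as documented behavior: treat the first value N as the number of recognition columns across the line, and each integer c_i as the column where character i was emitted. With x_left and x_right taken from the line box and n characters in the line, the column width w, the character centers x_i, and an average character width d would be:

```latex
w = \frac{x_{\text{right}} - x_{\text{left}}}{N}, \qquad
x_i = x_{\text{left}} + (c_i + 0.5)\, w, \qquad
d = \frac{(c_{\max} - c_{\min})\, w}{n - 1}
```

For the sample in the question (x_left = 26, x_right = 304, N ≈ 46.09, n = 7), this gives w ≈ 6.03 px, a first center near x ≈ 47.1, a last center near x ≈ 288.4, and d ≈ 40.2 px. Each character box would then span from x_i - d/2 to x_i + d/2, clipped to the line box, and reuse the line's top and bottom y values. Seven boxes of about 40 px roughly fill the 278 px line, which suggests the reading is at least plausible.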
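Example code
Here is a complete example:
from paddleocr import PaddleOCR

# Initialize OCR with per-character information enabled
ocr = PaddleOCR(lang="ch", return_word_box=True)

# Run OCR
img_path = "./your_image.jpg"
results = ocr.ocr(img_path)

# Parse the results; results[0] holds the lines of the first image
# (it may be None when nothing is detected)
for line in results[0] or []:
    overall_box, (text, confidence, extra_info) = line
    # Layout inferred from the sample output in the question (an interpretation,
    # not a documented API): [column count, character groups,
    # column index of each character, group types]
    col_count, char_groups, col_groups, _group_types = extra_info
    x_left, y_top = overall_box[0]
    y_bottom = overall_box[2][1]
    overall_width = overall_box[1][0] - x_left
    # Width of one recognition column in pixels (assumes an axis-aligned line box)
    cell_width = overall_width / col_count

    chars = [ch for group in char_groups for ch in group]
    cols = [c for group in col_groups for c in group]
    # Approximate the character width by the average spacing between centers
    if len(cols) > 1:
        char_width = (cols[-1] - cols[0]) * cell_width / (len(cols) - 1)
    else:
        char_width = overall_width

    # Compute the per-character boxes
    char_boxes = []
    for col in cols:
        center_x = x_left + (col + 0.5) * cell_width
        x_start = max(center_x - char_width / 2, x_left)
        x_end = min(center_x + char_width / 2, x_left + overall_width)
        char_boxes.append([[x_start, y_top], [x_end, y_top],
                           [x_end, y_bottom], [x_start, y_bottom]])

    print(f"Text: {text}, Confidence: {confidence}")
    print(f"Overall Box: {overall_box}")
    print(f"Character Boxes: {list(zip(chars, char_boxes))}")
Summary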
If you have further questions, feel free to ask!
Response generated by feifei-bot | chatgpt-4o-latest
🔎 Search before asking
🐛 Bug (problem description)
After installing the paddleocr package directly and calling PaddleOCR(lang="ch", return_word_box=True), I get output like this:
[[[26.0, 37.0], [304.0, 37.0], [304.0, 73.0], [26.0, 73.0]], ('纯臻营养护发素', 0.9946897625923157, [46.085826210826205, [['纯', '臻', '营', '养', '护', '发', '素']], [[3, 10, 16, 23, 30, 36, 43]], ['cn']])]
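What does the trailing [46.085826210826205, [['纯', '臻', '营', '养', '护', '发', '素']], [[3, 10, 16, 23, 30, 36, 43]] mean, and how can it be converted into 4-point detection boxes? I looked at #10377 but could not follow the explanation there.
Looking at line 233 of the source code, the cal_ocr_word_box function appears to return detection boxes. Why, then, does calling the paddleocr package return per-character results that are not boxes, and how should they be converted into boxes?
🏃‍♂️ Environment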
x86 CPU
🌰 Minimal Reproducible Example
PaddleOCR(lang="ch", return_word_box=True)