From 8e7b6ab199c75ac8fb04637347f8e675e2b864a3 Mon Sep 17 00:00:00 2001
From: myhloli
Date: Sun, 12 Jan 2025 03:57:08 +0800
Subject: [PATCH] docs(faq): add troubleshooting guide for old GPUs
 encountering CUDA errors

Added a new section to both the English and Chinese FAQs addressing the
issue where old GPUs such as the M40 encounter a RuntimeError due to
unsupported BF16 precision. The guide includes steps to manually disable
BF16 precision by modifying the relevant code in
"pdf_parse_union_core_v2.py".
---
 docs/FAQ_en_us.md | 20 ++++++++++++++++++++
 docs/FAQ_zh_cn.md | 23 ++++++++++++++++++++++-
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/docs/FAQ_en_us.md b/docs/FAQ_en_us.md
index f62a7849..053145f4 100644
--- a/docs/FAQ_en_us.md
+++ b/docs/FAQ_en_us.md
@@ -73,3 +73,23 @@ pip install -U magic-pdf[full,old_linux] --extra-index-url https://wheels.myhlol
 ```
 
 Reference: https://github.com/opendatalab/MinerU/issues/1004
+
+### 9. Old Graphics Cards Such as the M40 Encounter "RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED"
+
+The following error occurs at runtime (CUDA):
+```
+RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasGemmStridedBatchedEx(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
+```
+BF16 precision is not supported on graphics cards older than the Turing architecture, and some of these cards are not correctly recognized by torch, so BF16 precision must be disabled manually.
+Modify lines 287-290 of `pdf_parse_union_core_v2.py` (note that the exact location may vary between versions):
+```python
+if torch.cuda.is_bf16_supported():
+    supports_bfloat16 = True
+else:
+    supports_bfloat16 = False
+```
+Change it to:
+```python
+supports_bfloat16 = False
+```
+Reference: https://github.com/opendatalab/MinerU/issues/1508
\ No newline at end of file

diff --git a/docs/FAQ_zh_cn.md b/docs/FAQ_zh_cn.md
index d3616d6a..795dd1b5 100644
--- a/docs/FAQ_zh_cn.md
+++ b/docs/FAQ_zh_cn.md
@@ -57,7 +57,6 @@ CUDA 11 has poor compatibility with newer graphics cards; the CUDA version used by Paddle needs to be upgraded
 ```bash
 pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
 ```
-
 Reference: https://github.com/opendatalab/MinerU/issues/558
 
 ### 7. On some Linux servers, the program fails on startup with `非法指令 (核心已转储)` or `Illegal instruction (core dumped)`
@@ -74,3 +73,25 @@ pip install -U magic-pdf[full,old_linux] --extra-index-url https://wheels.myhlol
 ```
 
 Reference: https://github.com/opendatalab/MinerU/issues/1004
+
+### 9. Old graphics cards such as the M40 encounter "RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED"
+
+The following error occurs at runtime (using CUDA):
+```
+RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasGemmStridedBatchedEx(handle, opa, opb, (int)m, (int)n, (int)k, (void*)&falpha, a, CUDA_R_16BF, (int)lda, stridea, b, CUDA_R_16BF, (int)ldb, strideb, (void*)&fbeta, c, CUDA_R_16BF, (int)ldc, stridec, (int)num_batches, compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
+```
+Graphics cards older than the Turing architecture do not support BF16 precision, and some cards are not correctly recognized by PyTorch, so BF16 precision must be disabled manually.
+
+Locate and modify lines 287-290 of `pdf_parse_union_core_v2.py` (note: the location may differ between versions). The original code is:
+```python
+if torch.cuda.is_bf16_supported():
+    supports_bfloat16 = True
+else:
+    supports_bfloat16 = False
+```
+Change it to:
+```python
+supports_bfloat16 = False
+```
+
+Reference: https://github.com/opendatalab/MinerU/issues/1508
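
A quick way to check whether this workaround applies to a given machine: the diagnostic sketch below (not part of the patch; it assumes only a PyTorch build with CUDA) prints the GPU's compute capability and what torch reports for BF16. Native BF16 kernels generally require compute capability 8.0 (Ampere) or newer, while the M40 reports 5.2, so cards below 8.0 are candidates for the `supports_bfloat16 = False` override.

```python
# bf16_check.py: diagnostic sketch, not part of the patch above.
# Prints the GPU's compute capability and torch's BF16 report so you
# can decide whether the supports_bfloat16 = False edit is needed.
import torch

if not torch.cuda.is_available():
    print("CUDA is not available; this FAQ entry does not apply.")
else:
    major, minor = torch.cuda.get_device_capability(0)
    print(f"GPU: {torch.cuda.get_device_name(0)} (compute capability {major}.{minor})")
    print(f"torch.cuda.is_bf16_supported() -> {torch.cuda.is_bf16_supported()}")
    # Native BF16 kernels generally require compute capability >= 8.0
    # (Ampere). As the FAQ notes, torch may still misreport support on
    # some older cards, hence the manual override in the patch.
    if (major, minor) < (8, 0):
        print("Pre-Ampere GPU detected: apply the supports_bfloat16 = False edit.")
```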
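
Where editing the installed file is inconvenient, the same effect can in principle be obtained from a small wrapper, since the guard shown in the diff calls `torch.cuda.is_bf16_supported()` at runtime. The sketch below is a hypothetical, untested alternative that only helps when magic-pdf is driven from Python in the same process; the in-file edit described above remains the documented fix.

```python
# force_no_bf16.py: hypothetical alternative to editing
# pdf_parse_union_core_v2.py, shown only as an illustration.
import torch

# Make the probe used by the guard always report "no BF16 support".
# Accept any arguments, since the signature of is_bf16_supported()
# differs between torch versions.
torch.cuda.is_bf16_supported = lambda *args, **kwargs: False

# Import and run magic-pdf's Python API after this point in the same
# process; the guard will then set supports_bfloat16 = False.
```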