Missing of Text and recognizing fractions as Text #1356

Closed
karthikak24 opened this issue Dec 24, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@karthikak24

Description of the bug

During layout detection, some text content is missing and some fractions are identified as plain text.
Screenshot 2024-12-24 143648
Screenshot 2024-12-24 143725

How to reproduce the bug

AQA-83001H-QP-JUN22.PDF
This is the PDF I tried. Most of the content is extracted correctly, but 20% is missed during layout detection itself.
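
To put a number on how much is missed, one option is to compare the word boxes in the PDF's text layer against the regions returned by layout detection. The sketch below is only an illustration: `Box`, `covered`, `missed_fraction`, and the 0.5 overlap threshold are hypothetical names and values, not part of magic-pdf, and the boxes are assumed to be already extracted as `(x0, y0, x1, y1)` tuples in the same coordinate system.

```python
from typing import List, Tuple

# Hypothetical box type: (x0, y0, x1, y1) in page coordinates.
Box = Tuple[float, float, float, float]


def covered(word: Box, region: Box, min_overlap: float = 0.5) -> bool:
    """Return True if at least min_overlap of the word's area lies inside region."""
    wx0, wy0, wx1, wy1 = word
    rx0, ry0, rx1, ry1 = region
    inter_w = min(wx1, rx1) - max(wx0, rx0)
    inter_h = min(wy1, ry1) - max(wy0, ry0)
    if inter_w <= 0 or inter_h <= 0:
        return False  # no intersection at all
    word_area = (wx1 - wx0) * (wy1 - wy0)
    return word_area > 0 and (inter_w * inter_h) / word_area >= min_overlap


def missed_fraction(words: List[Box], regions: List[Box]) -> float:
    """Fraction of word boxes not covered by any detected layout region."""
    if not words:
        return 0.0
    missed = sum(1 for w in words if not any(covered(w, r) for r in regions))
    return missed / len(words)


# Made-up example data: five word boxes and one detected layout region.
words = [(0, 0, 10, 10), (20, 0, 30, 10), (0, 20, 10, 30),
         (0, 50, 10, 60), (100, 100, 110, 110)]
regions = [(0, 0, 40, 35)]
print(f"missed: {missed_fraction(words, regions):.0%}")
```

Reporting this per page alongside the screenshots would show where the missed regions are concentrated.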

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.9.x

Device mode

cuda

@karthikak24 karthikak24 added the bug label Dec 24, 2024
@karthikak24 karthikak24 changed the title from "Missing of Text and marking formulas as Text" to "Missing of Text and recognizing fractions as Text" Dec 24, 2024
@myhloli
Collaborator

myhloli commented Jan 22, 2025

In the upcoming 1.1.0 version, this issue has been significantly improved. Please stay tuned for the release updates of the new version.

@myhloli myhloli closed this as completed Jan 22, 2025