AttributeError: 'NoneType' object has no attribute 'bboxes' #252

Open
lianyant opened this issue Aug 13, 2024 · 1 comment

Comments

@lianyant

marker_single /home/llw/github/marker/pdf_marker/workspace2/pdf/1-200.pdf /home/llw/github/marker/pdf_marker/workspace2/markdown/ --batch_multiplier 3 --langs Chinese
Loaded detection model vikp/surya_det3 on device cuda with dtype torch.float16
Loaded detection model vikp/surya_layout3 on device cuda with dtype torch.float16
Loaded reading order model vikp/surya_order on device cuda with dtype torch.float16
Loaded recognition model vikp/surya_rec on device cuda with dtype torch.float16
Loaded texify model to cuda with torch.float16 dtype
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:12<00:00, 1.40it/s]
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:17<00:00, 1.47s/it]
Finding reading order: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:12<00:00, 1.05s/it]
Traceback (most recent call last):
  File "/home/llw/.local/bin/marker_single", line 8, in <module>
    sys.exit(main())
  File "/home/llw/.local/lib/python3.10/site-packages/convert_single.py", line 33, in main
    full_text, images, out_meta = convert_single_pdf(fname, model_lst, max_pages=args.max_pages, langs=langs, batch_multiplier=args.batch_multiplier, start_page=args.start_page)
  File "/home/llw/.local/lib/python3.10/site-packages/marker/convert.py", line 127, in convert_single_pdf
    table_count = format_tables(pages)
  File "/home/llw/.local/lib/python3.10/site-packages/marker/tables/table.py", line 138, in format_tables
    table_rows = get_table_pdftext(page, table_box)
  File "/home/llw/.local/lib/python3.10/site-packages/marker/tables/table.py", line 103, in get_table_pdftext
    table_rows = assign_cells_to_columns(page, table_box, table_rows)
  File "/home/llw/.local/lib/python3.10/site-packages/marker/tables/cells.py", line 56, in assign_cells_to_columns
    separators = find_column_separators(page, table_box, round_factor=round_factor)
  File "/home/llw/.local/lib/python3.10/site-packages/marker/tables/cells.py", line 31, in find_column_separators
    line_boxes = [p.bbox for p in page.text_lines.bboxes]
AttributeError: 'NoneType' object has no attribute 'bboxes'

@alevonian

alevonian commented Sep 12, 2024

I am also getting this if I set:
export OCR_ENGINE=ocrmypdf
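
For anyone reading the traceback above: the failing expression is `page.text_lines.bboxes`, and the error message means `page.text_lines` itself is `None` when `find_column_separators` runs. Below is a minimal sketch of a defensive guard around that access. It is not marker's actual code or an official fix: `LineBox`, `TextLines`, `Page`, and `line_boxes_or_empty` are hypothetical stand-ins that only mirror the attribute names visible in the traceback, and returning an empty list is an assumed fallback.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LineBox:
    """Hypothetical stand-in for one detected text line."""
    bbox: List[float]


@dataclass
class TextLines:
    """Hypothetical stand-in for the object expected at page.text_lines."""
    bboxes: List[LineBox] = field(default_factory=list)


@dataclass
class Page:
    """Hypothetical stand-in for a page; text_lines may be None."""
    text_lines: Optional[TextLines] = None


def line_boxes_or_empty(page: Page) -> List[List[float]]:
    """Return the line bounding boxes, or [] when text_lines is missing.

    Same list comprehension as the failing line in the traceback, with a
    None check in front of it (an assumption, not the library's behavior).
    """
    if page.text_lines is None:
        return []
    return [p.bbox for p in page.text_lines.bboxes]
```

With a guard like this, a page without text lines would contribute no line boxes instead of raising `AttributeError`. Whether skipping such pages is the right behavior for table formatting is a separate question for the maintainers.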
