Python 3.13 support #136

Open
dolfim-ibm opened this issue Oct 10, 2024 · 9 comments
@dolfim-ibm
Contributor

dolfim-ibm commented Oct 10, 2024

This issue tracks Python 3.13 support in Docling and its components. In most cases, we cannot declare complete support until the main dependencies are also distributed for Python 3.13.

Docling components

🟢 The package is available and fully working
🟠 The package relies on dependencies which are not available on Python 3.13
🔴 The package is not working or available for Python 3.13

| Package | Status | PR |
| --- | --- | --- |
| Docling | 🟠 | |
| Docling Core | 🟢 | |
| Docling IBM Models | 🟠 (torch) | |
| Docling Parse | 🟢 | DS4SD/docling-parse#39 |
| DeepSearch GLM | 🟢 | DS4SD/deepsearch-glm#82 |

Dependencies

| Package | Status | Details |
| --- | --- | --- |
| Torch | 🔴 | Nightly builds working |

Workaround

Full support for Python 3.13 is currently waiting on PyTorch.

At the moment, no PyTorch release has full support, but nightly builds are available. Docling was tested on Python 3.13 with the following steps:

# Create a python 3.13 virtualenv
python3.13 -m venv venv
source ./venv/bin/activate

# Install torch nightly builds, see https://pytorch.org/
pip3 install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu

# Install docling
pip3 install docling

# Run docling
docling --no-ocr https://arxiv.org/pdf/2408.09869

Note: we are disabling OCR since easyocr and the nightly torch builds have some conflicts.
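
Once the steps above have run, it can help to confirm what actually ended up in the virtualenv. The following is a minimal sketch using only the standard library; the function name `check_environment` and the default package list are illustrative assumptions, not part of Docling.

```python
import importlib.metadata
import importlib.util
import sys


def check_environment(packages=("torch", "torchvision", "docling")):
    """Report the interpreter version and which packages are importable."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in packages:
        if importlib.util.find_spec(name) is None:
            report[name] = None  # not importable in this environment
            continue
        try:
            report[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            report[name] = "unknown"  # importable, but no distribution metadata
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value if value is not None else 'not installed'}")
```

Run inside the activated virtualenv, this should print `python: 3.13` followed by the installed versions, or `not installed` for anything missing.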

Cc @cau-git @vagenas @PeterStaar-IBM

@PeterStaar-IBM
Contributor

@dolfim-ibm I think we need to follow this PR: pytorch/pytorch#130249 for docling-ibm-models

@dolfim-ibm dolfim-ibm pinned this issue Nov 9, 2024
@cau-git cau-git unpinned this issue Nov 11, 2024
@dolfim-ibm dolfim-ibm pinned this issue Nov 11, 2024
@imene-swaan

any updates on this?

@dolfim-ibm
Contributor Author

> any updates on this?

We are waiting for torch to support Python 3.13. They started with Linux x86, but all other platforms and architectures are still missing.

@cau-git cau-git unpinned this issue Nov 25, 2024
@dolfim-ibm dolfim-ibm pinned this issue Nov 27, 2024
@w0o
w0o commented Nov 29, 2024

It seems the nightlies are now working on OSX with Python 3.13. However, Docling requires torch <2.5.0, and the expected fix will be delivered in 2.5.1. Can we consider extending the upper version bound by a major version, or even removing it?

@dolfim-ibm
Contributor Author

> It seems the nightlies are now working on OSX with Python 3.13. However, Docling requires torch <2.5.0, and the expected fix will be delivered in 2.5.1. Can we consider extending the upper version bound by a major version, or even removing it?

Great, thanks for checking that!

Where do you see the <2.5.0 dependency? It should be something more flexible like >=2.2.2,<3.0.0.

$ poetry show torch

required by
 - docling-ibm-models >=2.2.2,<3.0.0
 - easyocr *
 - sentence-transformers >=1.11.0
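
To see why a hypothetical `<2.5.0` cap would block torch 2.5.1 while `>=2.2.2,<3.0.0` would not, the constraint strings can be checked with a small script. This is a simplified sketch: `satisfies` is a hypothetical helper that compares only numeric release segments and ignores pre-release and dev suffixes, so it is not a full PEP 440 implementation.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_release(version):
    """Return the leading numeric segments, e.g. '2.5.1' -> (2, 5, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a numeric version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies(version, constraint):
    """Check a version against a constraint such as '>=2.2.2,<3.0.0'."""
    release = parse_release(version)
    for clause in constraint.split(","):
        clause = clause.strip()
        if clause == "*":  # "any version", as shown for easyocr above
            continue
        match = re.fullmatch(r"(>=|<=|==|>|<)\s*(\S+)", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        bound = parse_release(match.group(2))
        # Pad with zeros so that 2.5 and 2.5.0 compare as equal.
        width = max(len(release), len(bound))
        left = release + (0,) * (width - len(release))
        right = bound + (0,) * (width - len(bound))
        if not OPS[match.group(1)](left, right):
            return False
    return True


print(satisfies("2.5.1", ">=2.2.2,<3.0.0"))  # True
print(satisfies("2.5.1", "<2.5.0"))          # False
```

For real tooling, the `packaging` library's `SpecifierSet` handles the full version grammar, including the `.dev` suffixes that nightly builds carry.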

@w0o
w0o commented Nov 29, 2024

You are right. Actually, after successfully installing the nightly on 3.13, I ran into yet another conflict when trying to install Docling:

ERROR: Cannot install docling==1.10.0, docling==1.11.0, docling==1.12.0, docling==1.12.1, docling==1.12.2, docling==1.13.0, docling==1.13.1, docling==1.14.0, docling==1.15.0, docling==1.16.0, docling==1.16.1, docling==1.17.0, docling==1.18.0, docling==1.19.0, docling==1.19.1, docling==1.2.0, docling==1.2.1, docling==1.20.0, docling==1.3.0, docling==1.4.0, docling==1.5.0, docling==1.6.0, docling==1.6.1, docling==1.6.2, docling==1.6.3, docling==1.8.5, docling==1.9.0, docling==2.0.0, docling==2.1.0, docling==2.2.0, docling==2.2.1 and docling==2.3.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    docling 2.3.0 depends on torchvision<1 and >=0; sys_platform != "darwin" or platform_machine != "x86_64"
    docling 2.2.1 depends on torchvision<1 and >=0; sys_platform != "darwin" or platform_machine != "x86_64"
    docling 2.2.0 depends on torchvision<1 and >=0; sys_platform != "darwin" or platform_machine != "x86_64"
    docling 2.1.0 depends on torchvision<1 and >=0; sys_platform != "darwin" or platform_machine != "x86_64"
    docling 2.0.0 depends on torchvision<1 and >=0; sys_platform != "darwin" or platform_machine != "x86_64"
    docling 1.20.0 depends on deepsearch-glm<0.23.0 and >=0.22.0
    docling 1.19.1 depends on deepsearch-glm<0.23.0 and >=0.22.0
    docling 1.19.0 depends on deepsearch-glm<0.23.0 and >=0.22.0
    docling 1.18.0 depends on deepsearch-glm<0.23.0 and >=0.22.0
    docling 1.17.0 depends on deepsearch-glm<0.23.0 and >=0.22.0
    docling 1.16.1 depends on deepsearch-glm<0.22.0 and >=0.21.1
    docling 1.16.0 depends on deepsearch-glm<0.22.0 and >=0.21.1
    docling 1.15.0 depends on deepsearch-glm<0.22.0 and >=0.21.1
    docling 1.14.0 depends on deepsearch-glm<0.22.0 and >=0.21.1
    docling 1.13.1 depends on deepsearch-glm<0.22.0 and >=0.21.1
    docling 1.13.0 depends on deepsearch-glm<0.22.0 and >=0.21.1
    docling 1.12.2 depends on deepsearch-glm<0.22.0 and >=0.21.0
    docling 1.12.1 depends on deepsearch-glm<0.22.0 and >=0.21.0
    docling 1.12.0 depends on deepsearch-glm<0.22.0 and >=0.21.0
    docling 1.11.0 depends on deepsearch-glm<0.22.0 and >=0.21.0
    docling 1.10.0 depends on deepsearch-glm<0.22.0 and >=0.21.0
    docling 1.9.0 depends on deepsearch-glm<0.20.0 and >=0.19.1
    docling 1.8.5 depends on deepsearch-glm<0.20.0 and >=0.19.1
    docling 1.6.3 depends on docling-parse<0.3.0 and >=0.2.0
    docling 1.6.2 depends on docling-parse<0.3.0 and >=0.2.0
    docling 1.6.1 depends on docling-parse<0.3.0 and >=0.2.0
    docling 1.6.0 depends on docling-parse<0.3.0 and >=0.2.0
    docling 1.5.0 depends on docling-parse<0.3.0 and >=0.2.0
    docling 1.4.0 depends on docling-parse<0.3.0 and >=0.2.0
    docling 1.3.0 depends on docling-parse<0.0.2 and >=0.0.1
    docling 1.2.1 depends on docling-parse<0.0.2 and >=0.0.1
    docling 1.2.0 depends on docling-parse<0.0.2 and >=0.0.1

Untangling this one does not look easy: when I try to install these manually, I run into various other issues with deepsearch-glm (torchvision installs fine).
Any ideas about what is going on, or a workaround, would be much appreciated.
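
The `sys_platform != "darwin" or platform_machine != "x86_64"` suffix in the error above is an environment marker: the torchvision requirement applies everywhere except Intel Macs. Below is a minimal sketch of how that marker evaluates, assuming the usual mapping of `sys_platform` to `sys.platform` and `platform_machine` to `platform.machine()`. The function name is hypothetical.

```python
import platform
import sys


def torchvision_marker_applies(sys_platform=None, platform_machine=None):
    """Evaluate: sys_platform != "darwin" or platform_machine != "x86_64"."""
    # Fall back to the values of the running interpreter when not given.
    if sys_platform is None:
        sys_platform = sys.platform
    if platform_machine is None:
        platform_machine = platform.machine()
    return sys_platform != "darwin" or platform_machine != "x86_64"


print(torchvision_marker_applies("darwin", "x86_64"))  # False: Intel Mac
print(torchvision_marker_applies("darwin", "arm64"))   # True: Apple Silicon
print(torchvision_marker_applies("linux", "x86_64"))   # True: Linux x86
```

Calling `torchvision_marker_applies()` with no arguments reports whether the requirement applies on the current machine.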

@dolfim-ibm
Contributor Author

Interesting. We should double-check all the numpy requirements, because Python 3.13 is supported only by numpy >=2.1.0.
I think we aligned all of them already, but some may have been missed. We will also try to reproduce the nightly installation.
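
One way to audit the numpy requirements is to read the metadata of the packages installed in the environment. The sketch below uses only the standard library; `numpy_requirements` is a hypothetical helper, and the distribution names in the usage example are assumed PyPI names for the components listed in this issue.

```python
import importlib.metadata
import re

# Matches requirement strings for numpy itself, not e.g. "numpy-quaternion".
NUMPY_REQ = re.compile(r"^numpy(?![\w.-])", re.IGNORECASE)


def numpy_requirements(distributions):
    """Map each distribution name to the numpy requirements it declares."""
    found = {}
    for name in distributions:
        try:
            requires = importlib.metadata.requires(name) or []
        except importlib.metadata.PackageNotFoundError:
            requires = []  # distribution is not installed here
        found[name] = [req for req in requires if NUMPY_REQ.match(req)]
    return found


if __name__ == "__main__":
    # Assumed distribution names; adjust to what is actually installed.
    names = ["docling", "docling-core", "docling-ibm-models",
             "docling-parse", "deepsearch-glm"]
    for name, reqs in numpy_requirements(names).items():
        print(name, reqs)
```

Any entry whose upper bound excludes numpy 2.1.0 or later would be a candidate for the conflict described above.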

@dolfim-ibm
Contributor Author

@w0o we found some small conflicts and fixed them in v2.8.1. The description of this issue was also updated with guidelines for using PyTorch nightly builds.

@w0o
w0o commented Nov 30, 2024

Nicely done 🎉
It worked like a charm, and I was able to run it on 3.13 (OSX). Just a quick note: I noticed that torchaudio was included. I tested without it and found that everything installs and works fine, so maybe consider removing it (unless some Docling use case requires it).
pip3 install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu

Props for the swift update!
