Checking if the file exists or not by simply opening and closing. #124 #125

Open
wants to merge 1 commit into
base: main

Viddesh1

Issue number #124

After:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\pdfextract\markitdown\src\markitdown\_markitdown.py", line 1289, in convert       
    with open(source, 'r'):
         ^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'test.xlsx'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\pdfextract\VK\two.py", line 4, in <module>
    result = md.convert("test.xlsx")
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\Desktop\pdfextract\markitdown\src\markitdown\_markitdown.py", line 1292, in convert       
    raise FileNotFoundError
FileNotFoundError

@Viddesh1 Viddesh1 marked this pull request as ready for review December 18, 2024 11:00
@gagb
Contributor

gagb commented Dec 20, 2024

Why not use os.path.exists()?

@gagb gagb added the awaiting op response The PR is awaiting response/edits from the original poster. label Dec 20, 2024
@Viddesh1
Author

Hello @gagb ,

Thanks for taking a look.

        try:
            with open(source, 'r'):
                pass  # Just checking if the file can be opened; no need to read its contents.
        except FileNotFoundError:
            raise FileNotFoundError

with open("file_name.xlsx", "r") checks whether the file exists and also opens it in read mode, which confirms that the file is readable.

        if not os.path.exists(source):
            raise FileNotFoundError(f"File {source} does not exists. Please check")

os.path.exists("file_name.xlsx") only checks whether the path exists. It does not open the file or read its contents.

With os.path.exists, the traceback is much shorter:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\pdfextract\VK\two.py", line 5, in <module>
    result = md.convert("test.xlsx")
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\Desktop\pdfextract\markitdown\src\markitdown\_markitdown.py", line 1295, in convert
    raise FileNotFoundError(f"File {source} does not exists. Please check.")
FileNotFoundError: File test.xlsx does not exists. Please check.
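
For comparison, the shorter traceback can also be obtained while keeping the open() check, by suppressing exception chaining with `raise ... from None`. The sketch below is an illustration only, not what this PR implements; the helper name `check_readable` and the temporary path are assumptions made for the example.

```python
import os
import tempfile


def check_readable(source: str) -> None:
    """Hypothetical helper: raise a descriptive error if `source` cannot be opened."""
    try:
        # Binary mode, so no text decoding is involved in the check.
        with open(source, "rb"):
            pass
    except FileNotFoundError:
        # `from None` hides the "During handling of the above exception,
        # another exception occurred" section of the traceback.
        raise FileNotFoundError(
            f"File {source} does not exist. Please check."
        ) from None


caught = None
with tempfile.TemporaryDirectory() as tmp:
    # This path is guaranteed not to exist inside the fresh temporary directory.
    missing = os.path.join(tmp, "test.xlsx")
    try:
        check_readable(missing)
    except FileNotFoundError as exc:
        caught = exc

print(caught)
```

Because the new exception is raised `from None`, an uncaught error would show only the final `FileNotFoundError` with its message, much like the os.path.exists version.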

Regards!
Viddesh

@gagb
Contributor

gagb commented Dec 21, 2024

What's an example of a case where the file does exist but open() raises an error?

@Viddesh1
Author

Hello @gagb ,

Some examples are below:

  1. Read/write permission on the file has not been granted (e.g. by the root user).
  2. The path is a directory, e.g. a directory named test.xlsx.
  3. The file is locked.
  4. Invalid encoding (maybe).
  5. The file is corrupted.
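
Case 2 is the easiest to reproduce. The following is a minimal sketch, assuming a temporary directory and a hypothetical directory named test.xlsx: os.path.exists reports the path as present, yet open() fails. The exact exception type depends on the platform, so the sketch only catches the common base class OSError.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A directory whose name looks like a spreadsheet (hypothetical name).
    path = os.path.join(tmp, "test.xlsx")
    os.mkdir(path)

    exists = os.path.exists(path)    # the path exists
    is_file = os.path.isfile(path)   # but it is not a regular file

    open_error = None
    try:
        with open(path, "rb"):
            pass
    except OSError as exc:
        # Typically IsADirectoryError on POSIX and PermissionError on Windows.
        open_error = exc

print(exists, is_file, type(open_error).__name__)
```

If only the existence check is wanted, os.path.isfile distinguishes this case, while the open() approach surfaces it as an exception.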

Thanks!
Viddesh

@gagb gagb removed the awaiting op response The PR is awaiting response/edits from the original poster. label Dec 23, 2024