Add checks for unsupported characters within image path and output path #4390
Thank you for contributing a pull request! 🙏
Welcome to the ITK community! 🤗👋☀️
We are glad you are here and appreciate your contribution. Please keep in mind our community participation guidelines. 📜
More support and guidance on the contribution process can be found in our contributing guide. 📖
This is an automatic message. Allow time for the ITK community to read the pull request and comment on it.
d1bdbce to 7304267
Ah, this check makes sense in the Python interface. Mostly looks good.
try:
    filename.encode('ascii')
except UnicodeEncodeError:
    msg += "\nThe output path contains not supported special characters. \n" + f"Filename = {filename}"
not supported -> unsupported
Perhaps give user a hint about setting UTF-8 as the default code page?
try:
    inputFileName.encode('ascii')
except UnicodeEncodeError:
    msg += "\nThe image path contains not supported special characters. \n" + f"Filename = {inputFileName}"
Preferably give a hint about UTF8 code page.
There may be something in the SWIG layer that can be done with the conversion. There is some information here: That needs to be gone through.
@Pfleiderer-Adrian thanks for having a look at this. A few comments:
The second commit should be squashed into the first one.
A couple suggestions.
    msg += "\nThe output dir doesn't exist. \n" + f"Filename = {filename}"
raise RuntimeError(
    f"Could not create IO object for writing file {filename}" + msg
)
This would be a useful check to occur in the C++ code.
I agree, but I assume that performing an equivalent check in C++ would be a lot harder. If the author has the will to explore this, it would be good.
try:
    filename.encode('ascii')
except UnicodeEncodeError:
    msg += "\nThe output path contains not supported special characters. \n" + f"Filename = {filename}"
It may be useful to place this check into a function.
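One way the suggestion above could look is sketched below. The helper name `describe_non_ascii_path` and its `label` parameter are hypothetical and not part of this PR; the message text follows the diff, with the "unsupported" wording requested in review.

```python
def describe_non_ascii_path(path: str, label: str) -> str:
    """Return an extra error-message line if `path` has non-ASCII characters.

    Hypothetical helper, not part of ITK. `label` names the kind of path
    ("image" or "output"). Returns an empty string for pure-ASCII paths.
    """
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return (
            f"\nThe {label} path contains unsupported special characters. \n"
            f"Filename = {path}"
        )
    return ""
```

Both call sites could then reduce to something like `msg += describe_non_ascii_path(filename, "output")`, which keeps the wording in one place.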
Thank you for your feedback! I'll post an update this weekend. :)
Thank you for diving into this issue, @Pfleiderer-Adrian. Do I understand correctly that this issue (#4388) is only a problem for Windows users, not for Linux users? (Specifically, only those Windows users who do not have UTF-8 support enabled and have selected a locale that does not support the filename?) I'm asking because it feels too strict to accept only ASCII characters from now on, for all users. So is it really the intention to reject any non-ASCII character in file names, for any user? Aren't we then throwing the baby out with the bathwater? Would it be an idea to only do those checks after trying to use
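As a rough illustration of a less strict check, the sketch below only flags a path on Windows, and only when the path cannot be encoded in the locale's preferred encoding. It rests on assumptions: the function name `path_may_be_unreadable` is hypothetical, and whether `locale.getpreferredencoding(False)` matches the encoding ITK's file IO actually uses on Windows has not been confirmed in this thread.

```python
import locale
import sys
from typing import Optional


def path_may_be_unreadable(
    path: str,
    platform: Optional[str] = None,
    encoding: Optional[str] = None,
) -> bool:
    """Guess whether `path` might fail to open because of its characters.

    Hypothetical helper, not part of ITK. `platform` and `encoding` default
    to the running system's values and exist mainly to make testing easy.
    """
    platform = sys.platform if platform is None else platform
    if platform != "win32":
        # Assumption: the problem in #4388 only affects Windows.
        return False
    if encoding is None:
        # Assumption: this reflects the code page used for file access.
        encoding = locale.getpreferredencoding(False)
    try:
        path.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False
```

With this approach a path such as `Bär.png` would pass on a Windows system using cp1252 or UTF-8, while the ASCII-only check in the diff would flag it everywhere.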
Description
I encountered an issue while attempting to load images using the ITK / SimpleITK framework in my Python script. The traceback indicates that the problem arises when attempting to create an IO object for reading an image file that contains Umlauts (ä, ü, ö) in its path. See: #4388
Providing a more descriptive error message indicating that the path cannot be read due to special characters, such as Umlauts, would be helpful for users to understand and address the issue.
Changes
I have added a simple filepath check for illegal characters in the _NewImageReader and imwrite routines. Additionally, for the imwrite routine, a check was added to verify if the output directory exists.
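The output-directory check described above might be sketched as follows. The function name `describe_missing_output_dir` is hypothetical; the actual change appends to `msg` inline inside `imwrite`, and the exact condition used there is not shown in this conversation.

```python
import os


def describe_missing_output_dir(filename: str) -> str:
    """Return an extra error-message line if the output directory is missing.

    Hypothetical helper, not part of ITK. A bare file name is resolved
    against the current working directory before checking.
    """
    out_dir = os.path.dirname(os.path.abspath(filename))
    if not os.path.isdir(out_dir):
        return "\nThe output dir doesn't exist. \n" + f"Filename = {filename}"
    return ""
```

The returned string can be appended to the `RuntimeError` message in the same way as the character check.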
PR Checklist