Infinite loop on empty input #16
Comments
It enters an infinite loop on single-line text files and on some other files too.
I got this bug too!
Proposed patch::

    diff --git a/pyPdf/pdf.py b/pyPdf/pdf.py
    index bf60d01..586ea81 100644
    --- a/pyPdf/pdf.py
    +++ b/pyPdf/pdf.py
    @@ -701,7 +701,7 @@ class PdfFileReader(object):
             # start at the end:
             stream.seek(-1, 2)
             line = ''
    -        while not line:
    +        while not line and stream.tell():
                 line = self.readNextEndLine(stream)
             if line[:5] != "%%EOF":
                 raise utils.PdfReadError, "EOF marker not found"

Without patch::

    >>> import pyPdf
    >>> from cStringIO import StringIO
    >>> c = StringIO('')
    >>> pdf = pyPdf.PdfFileReader(c)
    --- Infinite loop ---
    ^CTraceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/tmp/pyPdf2/lib/python2.7/site-packages/pyPdf/pdf.py", line 374, in __init__
        self.read(stream)
      File "/tmp/pyPdf2/lib/python2.7/site-packages/pyPdf/pdf.py", line 705, in read
        line = self.readNextEndLine(stream)
      File "/tmp/pyPdf2/lib/python2.7/site-packages/pyPdf/pdf.py", line 870, in readNextEndLine
        line = x + line
    KeyboardInterrupt

With patch::

    >>> import pyPdf
    >>> from cStringIO import StringIO
    >>> c = StringIO('')
    >>> pdf = pyPdf.PdfFileReader(c)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/tmp/pyPdf2/lib/python2.7/site-packages/pyPdf/pdf.py", line 374, in __init__
        self.read(stream)
      File "/tmp/pyPdf2/lib/python2.7/site-packages/pyPdf/pdf.py", line 707, in read
        raise utils.PdfReadError, "EOF marker not found"
    pyPdf.utils.PdfReadError: EOF marker not found
Hmm, a better patch::

    --- a/pyPdf/pdf.py
    +++ b/pyPdf/pdf.py
    @@ -701,7 +701,7 @@ class PdfFileReader(object):
             # start at the end:
             stream.seek(-1, 2)
             line = ''
    -        while not line:
    +        while not line and stream.tell():
                 line = self.readNextEndLine(stream)
             if line[:5] != "%%EOF":
                 raise utils.PdfReadError, "EOF marker not found"
    @@ -857,7 +857,7 @@ class PdfFileReader(object):
         def readNextEndLine(self, stream):
             line = ""
    -        while True:
    +        while stream.tell():
                 x = stream.read(1)
                 stream.seek(-2, 1)
                 if x == '\n' or x == '\r':

This one works with an empty stream and also with a one-line stream::

    >>> import pyPdf
    >>> from cStringIO import StringIO
    >>> c = StringIO(' ')
    >>> pdf = pyPdf.PdfFileReader(c)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/tmp/pyPdf2/lib/python2.7/site-packages/pyPdf/pdf.py", line 374, in __init__
        self.read(stream)
      File "/tmp/pyPdf2/lib/python2.7/site-packages/pyPdf/pdf.py", line 707, in read
        raise utils.PdfReadError, "EOF marker not found"
    pyPdf.utils.PdfReadError: EOF marker not found
The second chunk is not really going to work...
Sorry, corrected :-)
Create an empty StringIO and call the pdf reader on it. It will loop in the readNextEndLine calls before the %%EOF check in read.
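The idea behind the patches in this thread, bounding the backward scan by the start of the stream, can be sketched on its own. This is a minimal, self-contained illustration in Python 3 and not pyPdf's actual code: `read_line_backwards`, `check_eof_marker` and `EOFMarkerNotFound` are hypothetical names standing in for `readNextEndLine`, the start of `read`, and `PdfReadError`.

```python
import io


class EOFMarkerNotFound(Exception):
    """Hypothetical stand-in for pyPdf's PdfReadError."""


def read_line_backwards(stream):
    """Return the line that ends at the current offset, scanning towards offset 0.

    The stream is left positioned just before the returned line and the run
    of newline bytes that precedes it. Every loop is bounded by offset 0,
    so the function terminates even on empty or single-line input.
    """
    pos = stream.tell()
    line = b""
    while pos > 0:
        pos -= 1
        stream.seek(pos)
        byte = stream.read(1)
        if byte in (b"\n", b"\r"):
            # Swallow the rest of the newline run, still bounded by offset 0.
            while pos > 0:
                stream.seek(pos - 1)
                if stream.read(1) not in (b"\n", b"\r"):
                    break
                pos -= 1
            break
        line = byte + line
    stream.seek(pos)
    return line


def check_eof_marker(stream):
    """Raise unless the last non-empty line starts with %%EOF.

    Returns the offset at which the scan stopped.
    """
    stream.seek(0, io.SEEK_END)
    line = b""
    # Stop on the first non-empty line OR when the start of the stream is hit.
    while not line and stream.tell() > 0:
        line = read_line_backwards(stream)
    if line[:5] != b"%%EOF":
        raise EOFMarkerNotFound("EOF marker not found")
    return stream.tell()


if __name__ == "__main__":
    for data in (b"", b" ", b"\n\n\n", b"%PDF-1.4\n%%EOF\n"):
        try:
            check_eof_marker(io.BytesIO(data))
            print(repr(data), "-> marker found")
        except EOFMarkerNotFound as exc:
            print(repr(data), "->", exc)
```

Tracking the offset explicitly and calling `seek(pos)` avoids relative seeks such as `seek(-2, 1)`, which are what step before offset 0 on very short input.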