#!/usr/bin/env python3
"""scanf.py: scanf-style input for Python.

Danny Yoo ([email protected])
Ported by third party to Python 3.2 in 2012

The initial motivation for this module was based on a posting on
Python-tutor:

    http://mail.python.org/pipermail/tutor/2004-July/030480.html

I haven't been able to find a nice module to do scanf-style input.
Even the Library Reference recommends regular expressions as a
substitute:

    http://docs.python.org/lib/node109.html

But there appears to have been activity about this on python-list:

    http://aspn.activestate.com/ASPN/Mail/Message/python-list/785450

Still, let's see if we can get a close equivalent scanf() in place.
At the least, it'll be fun for me, and it might be useful for people
who are still recovering from C. *grin*

Functions provided:

    scanf(formatString) -- formatted scanning across stdin
    sscanf(sourceString, formatString) -- formatted scanning across strings
    fscanf(sourceFile, formatString) -- formatted scanning across files

The behavior of this scanf() will be slightly different from that
defined in C, because, in truth, I'm a little lazy, and am not quite
sure if people will need all of scanf's features in typical Python
programming.

But let's first show what conversions this scanf() will support.
Format strings are of the following form:

    % [*] [width] [format]

where [*] and [width] are optional, and [format] is mandatory. The
optional flags modify the format.

    *         suppresses variable capture.
    width     maximum character width.

We support the following scanf conversion formats (copied from K&R):

    d    decimal integer.

    i    integer. The integer may be in octal (leading zero) or
         hexadecimal (leading 0x or 0X). ## fixme

    o    octal integer (with or without leading zero). ## fixme

    x    hexadecimal integer (with or without leading 0x or 0X) ## fixme

    c    characters. The next input characters (default 1) are
         placed at the indicated spot. The normal skip over white space
         is suppressed; to read the next non-white space character, use
         %1s.

    s    character string (not quoted).

    f    floating-point number with optional sign and optional decimal point.

    %    literal %; no assignment is made.
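
Literal characters can appear in the scanf format string: they must
match the same characters in the input.

There is no guarantee of what happens if calls to scanf are mixed with
other input functions. See the BUGS section below for details on this.

If the format string itself is malformed, compile() raises a
FormatError. If the input doesn't conform to the format string, the
*scanf() functions raise an IncompleteCaptureError whose args hold the
underlying FormatError and a tuple of the values captured so far.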

Example format strings:

    "%d %d"           Two decimal integers.

    "%d.%d.%d.%d"     Four decimal integers, separated by literal periods.
                      The periods won't be captured.

    "hello %s"        Literally matches "hello" followed by any number of
                      spaces, followed by a captured word.
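
For readers who think in regular expressions, the format strings above
can be read roughly as follows. This is only a sketch built on the
standard re module; it is not how this module is implemented, it
handles just %d and %s, and the names sscanf_sketch, _WS and
_CONVERSIONS are made up for the illustration.

```python
import re
import string

# Assumed mapping from two scanf conversions to (regex fragment, converter).
# Whitespace is spelled as a character class built from string.whitespace.
_WS = "[" + string.whitespace + "]"
_CONVERSIONS = {
    "d": ("([-+]?[0-9]+)", int),
    "s": ("([^" + string.whitespace + "]+)", str),
}


def sscanf_sketch(text, format_string):
    # Build one regex from the format string, remembering which
    # converter belongs to each captured group.
    parts, converters = [], []
    chars = iter(format_string)
    for ch in chars:
        if ch == "%":
            fragment, convert = _CONVERSIONS[next(chars)]
            # Both %d and %s skip leading whitespace.
            parts.append(_WS + "*" + fragment)
            converters.append(convert)
        elif ch in string.whitespace:
            # Whitespace in the format matches any run of whitespace.
            parts.append(_WS + "*")
        else:
            # Any other character must match itself literally.
            parts.append(re.escape(ch))
    match = re.match("".join(parts), text)
    if match is None:
        raise ValueError("input does not match format")
    return tuple(c(g) for c, g in zip(converters, match.groups()))


two_ints = sscanf_sketch("42 43", "%d %d")
address = sscanf_sketch("192.168.1.1", "%d.%d.%d.%d")
word = sscanf_sketch("hello   world", "hello %s")
```

Here two_ints is (42, 43), address is (192, 168, 1, 1), and word is
("world",): the literal periods and the literal "hello" are matched
but never captured.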

There's also an interface for calling the internal function bscanf()
that works on CharacterBuffer types, if in the future there is
something that supports getc() and ungetc() natively. There's also an
undocumented compile() function that takes format strings and returns
a function that can scan through CharacterBuffers. Ooops, I guess I
just documented it. *grin*

######################################################################

BUGS and GOTCHAS:

One major problem that I'm running into is a lack of ungetc(); it
would be nice if there were such a function in Python, but I can't
find it. I have to simulate it by using a CharacterBuffer object, but
it's not an ideal solution.

So at most, you may lose a single character to the internal buffers
maintained by this module if you use scanf(). The other two *scanf()
functions, thankfully, aren't affected by this problem, since I can
simulate ungetc() more accurately by using seek() in the other two
cases.

If you really need to get that buffered character back, you can grab
it through _STDIN.lastChar, though manually fiddling with this is not
recommended.

So use scanf() with the following caveat: unlike C's scanf() on stdin,
this version of scanf() can't be interchanged with calls to other input
functions without some kind of weird side effect. We keep a
one-character buffer into stdin, so at most you might lose one
character to the internal buffers.
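
To make the "lost character" concrete, here is a small stand-alone
sketch of a one-character pushback buffer. It mirrors the
getch()/ungetch() idea described above but is not the module's own
class; PushbackReader and read_digits are hypothetical names used only
for this illustration.

```python
class PushbackReader:
    # Hypothetical stand-in for a buffer with a one-character "unget" slot.
    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._pushed = ''

    def getch(self):
        # Hand back the pushed-back character first, if there is one.
        if self._pushed != '':
            ch, self._pushed = self._pushed, ''
            return ch
        # Otherwise pull from the stream; '' signals end of input.
        return next(self._iterator, '')

    def ungetch(self, ch):
        self._pushed = ch


def read_digits(reader):
    # Read digits until a non-digit shows up, then push that one back.
    digits = []
    while True:
        ch = reader.getch()
        if ch.isdigit():
            digits.append(ch)
        else:
            reader.ungetch(ch)
            break
    return ''.join(digits)


reader = PushbackReader("42abc")
number = read_digits(reader)      # consumes '4' and '2', peeks at 'a'
# Code that reads the underlying stream directly now misses 'a',
# because 'a' is parked in the pushback slot rather than in the stream.
direct = next(reader._iterator)
```

After this runs, number is "42" and direct is "b". The "a" is still
sitting in the reader's pushback slot, which is the same situation as
the one character that scanf() may hold back from stdin.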

fscanf() is only allowed to work on things that support read(1),
tell() and seek(), since then I can reliably do an ungetch-like thing.

scanf("%s") can be dangerous in a hostile environment, since it's very
possible for something to pass in a huge string without spaces. So use
an explicit width instead if you can help it.
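
The effect of an explicit width can be sketched without this module.
The helper below is hypothetical (read_word is not part of the API); it
skips leading whitespace and then stops after 'width' non-whitespace
characters, which is roughly what a conversion like "%8s" promises.

```python
import itertools


def read_word(chars, width=None):
    # Hypothetical helper: skip leading whitespace, then collect
    # non-whitespace characters, stopping once 'width' of them are read.
    word = []
    for ch in chars:
        if ch.isspace():
            if word:
                break       # whitespace after the word ends it
            continue        # still skipping leading whitespace
        word.append(ch)
        if width is not None and len(word) >= width:
            break
    return ''.join(word)


# An endless stream with no spaces stands in for hostile input.
hostile = itertools.repeat("A")
capped = read_word(hostile, width=8)
short = read_word("   xyz", width=2)
```

With the cap in place, capped is "AAAAAAAA" and short is "xy". Calling
read_word(hostile) with no width would never return, which is the
danger the paragraph above warns about.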

######################################################################

LICENSE

Copyright (c) 2008, Danny yoo
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.

    * Redistributions in binary form must reproduce the above
      copyright notice, this list of conditions and the following
      disclaimer in the documentation and/or other materials provided
      with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
import sys
import unittest
from string import whitespace as WHITESPACE
from string import digits as DIGITS
__all__ = ['scanf', 'sscanf', 'fscanf']
__version__ = '1.0'


class CharacterBuffer(object):
    """A CharacterBuffer allows us to get a character, and to "unget" a
    character. Abstract class"""
    def getch(self):
        """Returns the next character. If there are no more characters
        left in the stream, returns the empty string."""
        pass ## implement me!

    def ungetch(self, ch):
        """Tries to put back a character. Can be called at most once
        between calls to getch()."""
        pass ## implement me!

    def scanCharacterSet(self, characterSet, maxChars=0):
        """Support function that scans across a buffer till we hit
        something outside the allowable characterSet."""
        return self.scanPredicate(lambda ch: ch in characterSet, maxChars)

    def scanPredicate(self, predicate, maxChars=0):
        """Support function that scans across a buffer till we hit
        something outside what's allowable by the predicate."""
        chars = []
        countChars = 0
        while True:
            if (maxChars != 0 and countChars >= maxChars):
                break
            ch = self.getch()
            if ch != '' and predicate(ch):
                chars.append(ch)
                countChars += 1
            else:
                self.ungetch(ch)
                break
        return ''.join(chars)


class CharacterBufferFromIterable(CharacterBuffer):
    """Implementation of CharacterBuffers for iterable things.
    We keep a 'lastChar' attribute to simulate ungetc()."""
    def __init__(self, iterable):
        self.iterator = iter(iterable)
        self.lastChar = ''

    def getch(self):
        if self.lastChar == '':
            try:
                return next(self.iterator)
            except StopIteration:
                return ''
        else:
            (ch, self.lastChar) = (self.lastChar, '')
            return ch

    def ungetch(self, ch):
        self.lastChar = ch


class CharacterBufferFromStdin(CharacterBuffer):
    def __init__(self, stdin):
        self.stdin = stdin
        self.lastChar = ''

    def getch(self):
        if self.lastChar == '':
            try:
                return self.stdin.read(1)
            except StopIteration:
                return ''
        else:
            (ch, self.lastChar) = (self.lastChar, '')
            return ch

    def ungetch(self, ch):
        self.lastChar = ch


class CharacterBufferFromFile(CharacterBuffer):
    """Implementation of CharacterBuffers for files. We use the native
    read(1) and seek() calls, so we don't have to do so much magic."""
    def __init__(self, myfile):
        self.myfile = myfile

    def getch(self):
        return self.myfile.read(1)

    def ungetch(self, ch):
        self.myfile.seek(self.myfile.tell() - len(ch.encode()))
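

def readiter(inputFile, *args):
    """Returns an iterator that calls read(*args) on the inputFile."""
    while True:
        ch = inputFile.read(*args)
        if ch:
            yield ch
        else:
            ## A plain return ends the generator. Raising StopIteration
            ## here becomes a RuntimeError on Python 3.7+ (PEP 479).
            return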


def isIterable(thing):
    """Returns true if 'thing' looks iterable."""
    try:
        iter(thing)
    except TypeError:
        return False
    return True


def isFileLike(thing):
    """Returns true if thing looks like a file."""
    if hasattr(thing, "read") and hasattr(thing, "seek"):
        try:
            thing.seek(1, 1)
            thing.seek(-1, 1)
            return True
        except IOError:
            pass
    return False


def makeCharBuffer(thing):
    """Try to coerce 'thing' into a CharacterBuffer. 'thing' can be
    an instance of:

        1. CharacterBuffer
        2. A file-like object,
        3. An iterable.

    makeCharBuffer() will make guesses in that order.
    """
    if isinstance(thing, CharacterBuffer):
        return thing
    elif isFileLike(thing):
        ## this check must come before isIterable, since files
        ## provide a line-based iterator that we don't want to use.
        ## Plus we want to take advantage of file.seek()
        return CharacterBufferFromFile(thing)
    elif isIterable(thing):
        return CharacterBufferFromIterable(thing)
    else:
        raise ValueError("Can't coerce %r to CharacterBuffer" % thing)


class CappedBuffer(CharacterBuffer):
    """Implementation of a buffer that caps the number of bytes we can
    getch(). The cap may or may not include whitespace characters."""
    def __init__(self, buffer, width, ignoreWhitespace=False):
        self.buffer = buffer
        self.bytesRead = 0
        self.width = width
        self.ignoreWhitespace = ignoreWhitespace

    def getch(self):
        if self.bytesRead < self.width:
            nextChar = self.buffer.getch()
            if not self.isIgnoredChar(nextChar):
                self.bytesRead += len(nextChar)
            return nextChar
        else:
            return ''

    def isIgnoredChar(self, ch):
        return self.ignoreWhitespace and isWhitespaceChar(ch)

    def ungetch(self, ch):
        self.buffer.ungetch(ch)
        if not self.isIgnoredChar(ch):
            self.bytesRead -= len(ch)
        # make sure wacky things don't happen when ungetch()ing.
        assert self.bytesRead >= 0


class FormatError(ValueError):
    """A FormatError is raised if we run into errors while scanning
    for input."""
    pass


class IncompleteCaptureError(ValueError):
    """The *scanf() functions raise IncompleteCaptureError if a problem
    occurs during scanning."""
    pass


"""We keep a module-level STDIN CharacterBuffer, so that we can call
scanf() several times and not lose characters between invocations."""
_STDIN = CharacterBufferFromStdin(sys.stdin)


def scanf(formatString):
    """scanf(formatString) -> tuple

    Scans standard input for formats specified in the formatString. See
    module's docs for list of supported format characters."""
    return bscanf(_STDIN, formatString)


def sscanf(inputString, formatString):
    """sscanf(inputString, formatString) -> tuple

    Scans inputString for formats specified in the formatString. See
    module's docs for list of supported format characters."""
    return bscanf(CharacterBufferFromIterable(inputString), formatString)


def fscanf(inputFile, formatString):
    """fscanf(inputFile, formatString) -> tuple

    Scans inputFile for formats specified in the formatString. See
    module's docs for list of supported format characters."""
    buffer = CharacterBufferFromFile(inputFile)
    return bscanf(buffer, formatString)


def bscanf(buffer, formatString):
    """bscanf(buffer, formatString) -> tuple

    Scans a CharacterBuffer 'buffer' for formats specified in the
    formatString. See scanf module's docs for list of supported format
    characters."""
    ## TODO: we may want to do some caching here of compiled formatStrings,
    ## similar to that of the 're' module.
    parser = compile(formatString)
    return parser(buffer)


def isWhitespaceChar(ch, _set=set(WHITESPACE)):
    """Returns true if the character looks like whitespace.
    We follow the definition of C's isspace() function.
    """
    return ch in _set


def handleWhitespace(buffer):
    """Scans for whitespace. Returns all the whitespace it collects."""
    chars = []
    while True:
        ch = buffer.getch()
        if isWhitespaceChar(ch):
            chars.append(ch)
        else:
            buffer.ungetch(ch)
            break
    return ''.join(chars)


## We keep a few sets as module variables just to incur the cost of
## constructing them just once.
_PLUS_MINUS_SET = set("+-")
_DIGIT_SET = set(DIGITS)
_OCT_SET = set("01234567")
_HEX_SET = set("0123456789ABCDEFabcdef")


def handleDecimalInt(buffer, optional=False, allowLeadingWhitespace=True):
    """Tries to scan for an integer. If 'optional' is set to True,
    returns None if an integer can't be successfully scanned;
    otherwise raises FormatError."""
    if allowLeadingWhitespace:
        handleWhitespace(buffer) ## eat leading spaces
    chars = []
    chars += buffer.scanCharacterSet(_PLUS_MINUS_SET, 1)
    chars += buffer.scanCharacterSet(_DIGIT_SET)
    try:
        return int(''.join(chars), 10)
    except ValueError:
        if optional:
            return None
        raise FormatError("invalid literal characters: %s" % ''.join(chars))


def handleOct(buffer):
    chars = []
    chars += buffer.scanCharacterSet(_PLUS_MINUS_SET)
    chars += buffer.scanCharacterSet(_OCT_SET)
    try:
        return int(''.join(chars), 8)
    except ValueError:
        raise FormatError("invalid literal characters: %s" % ''.join(chars))


def handleInt(buffer, base=0):
    chars = []
    chars += buffer.scanCharacterSet(_PLUS_MINUS_SET)
    chars += buffer.scanCharacterSet("0")
    if chars and chars[-1] == '0':
        chars += buffer.scanCharacterSet("xX")
    chars += buffer.scanCharacterSet(_HEX_SET)
    try:
        return int(''.join(chars), base)
    except ValueError:
        raise FormatError("invalid literal characters: %s" % ''.join(chars))


def handleHex(buffer):
    return handleInt(buffer, 16)


def handleFloat(buffer, allowLeadingWhitespace=True):
    if allowLeadingWhitespace:
        handleWhitespace(buffer) ## eat leading whitespace
    chars = []
    chars += buffer.scanCharacterSet(_PLUS_MINUS_SET)
    chars += buffer.scanCharacterSet(_DIGIT_SET)
    chars += buffer.scanCharacterSet(".")
    chars += buffer.scanCharacterSet(_DIGIT_SET)
    chars += buffer.scanCharacterSet("eE")
    chars += buffer.scanCharacterSet(_PLUS_MINUS_SET)
    chars += buffer.scanCharacterSet(_DIGIT_SET)
    try:
        return float(''.join(chars))
    except ValueError:
        raise FormatError("invalid literal characters: %s" % ''.join(chars))


def handleChars(buffer,
                allowLeadingWhitespace=False,
                isBadCharacter=lambda ch: False,
                optional=False):
    """Read as many characters as there are in the buffer."""
    if allowLeadingWhitespace:
        handleWhitespace(buffer)
    chars = []
    chars += buffer.scanPredicate(lambda ch: not isBadCharacter(ch))
    if chars:
        return ''.join(chars)
    else:
        if optional:
            return None
        raise FormatError("Empty buffer.")


def handleString(buffer, allowLeadingWhitespace=True):
    """Reading a string format is just an application of reading
    characters (skipping leading spaces, and reading up to space)."""
    return handleChars(buffer,
                       allowLeadingWhitespace=allowLeadingWhitespace,
                       isBadCharacter=isWhitespaceChar)


def makeHandleLiteral(literal):
    def f(buffer, optional=False):
        ch = buffer.getch()
        if ch == literal:
            return ch
        else:
            buffer.ungetch(ch)
            if optional:
                return None
            raise FormatError("%s != %s" % (literal, ch))
    return f


def makeWidthLimitedHandler(handler, width, ignoreWhitespace=False):
    """Constructs a Handler that caps the number of bytes that can be read
    from the byte buffer."""
    def f(buffer):
        return handler(CappedBuffer(buffer, width, ignoreWhitespace))
    return f


"""Just for kicks: handleChar is a handler for a single character."""
handleChar = makeWidthLimitedHandler(handleChars, 1, ignoreWhitespace=False)


def makeIgnoredHandler(handler):
    def f(buffer):
        handler(buffer)
        return None
    return f


class CompiledPattern:
    def __init__(self, handlers, formatString):
        self.handlers = handlers
        self.formatString = formatString

    def __call__(self, buffer):
        results = []
        try:
            for h in self.handlers:
                value = h(buffer)
                ## We use None as the sentinel value that ignored handlers
                ## will emit.
                if value is not None:
                    results.append(value)
            return tuple(results)
        except FormatError as e:
            raise IncompleteCaptureError(e, tuple(results))

    def __repr__(self):
        return "compile(%r)" % self.formatString


def compile(formatString):
    """Given a format string, emits a new CompiledPattern that eats
    CharacterBuffers and returns captured values as a tuple.

    If there's a failure during scanning, raises IncompleteCaptureError,
    with args being a two-tuple of the FormatError, and the results that
    were captured before the error occurred.
    """
    handlers = []
    formatBuffer = CharacterBufferFromIterable(formatString)
    while True:
        ch = formatBuffer.getch()
        if ch == '':
            break
        if isWhitespaceChar(ch):
            handleWhitespace(formatBuffer)
            handlers.append(makeIgnoredHandler(handleWhitespace))
        elif ch == '%':
            handlers.append(_compileFormat(formatBuffer))
        else:
            handlers.append(makeIgnoredHandler(makeHandleLiteral(ch)))
    return CompiledPattern(handlers, formatString)


def _compileFormat(formatBuffer):
    def readOptionalSuppression():
        f = makeHandleLiteral("*")
        return f(formatBuffer, optional=True) == "*"

    def readOptionalWidth():
        return handleDecimalInt(formatBuffer,
                                optional=True,
                                allowLeadingWhitespace=False)

    def readFormat():
        return formatBuffer.getch() ## Finally, read the format

    suppression = readOptionalSuppression()
    width = readOptionalWidth()
    formatCh = readFormat()
    handler = makeFormattedHandler(suppression, width, formatCh)
    if handler:
        return handler
    else:
        ## At this point, since we couldn't figure out the format, die loudly.
        raise FormatError("Invalid format character %s" % formatCh)


_FORMAT_HANDLERS = {'d': handleDecimalInt,
                    'i': handleInt,
                    'x': handleHex,
                    'o': handleOct,
                    's': handleString,
                    'f': handleFloat,
                    '%': makeIgnoredHandler(makeHandleLiteral('%'))
                    }


def makeFormattedHandler(suppression, width, formatCh):
    """Given suppression, width, and a formatType, returns a function
    that eats a buffer and returns that thing."""
    def applySuppression(handler):
        if suppression:
            return makeIgnoredHandler(handler)
        return handler

    def applyWidth(handler):
        if width is not None:
            return makeWidthLimitedHandler(handler, width,
                                           ignoreWhitespace=True)
        return handler

    ## 'c' is a special case: it's the only handler that can't ignore
    ## whitespace.
    if formatCh == 'c':
        if width is None:
            return applySuppression(handleChar)
        else:
            return applySuppression(
                makeWidthLimitedHandler(handleChars, width,
                                        ignoreWhitespace=False))
    if formatCh in _FORMAT_HANDLERS:
        return applySuppression(applyWidth(_FORMAT_HANDLERS[formatCh]))
    else:
        return None


######################################################################
##
## Of course we have test cases. *grin*
##
class ScanfTests(unittest.TestCase):
    def bufferFromString(self, s):
        return CharacterBufferFromIterable(s)

    def testBufferFromString(self):
        b = self.bufferFromString("hello")
        for letter in list('hello'):
            self.assertEqual(letter, b.getch())
        self.assertEqual('', b.getch())

    def testCharacterSetScanning(self):
        b = makeCharBuffer("+++-+++++1234")
        self.assertEqual("+++", b.scanCharacterSet(set("+")))
        self.assertEqual("", b.scanCharacterSet(set("+")))
        self.assertEqual("-", b.scanCharacterSet(set("-")))
        self.assertEqual("+", b.scanCharacterSet(set("+"), 1))

    def testPredicateScanning(self):
        b = makeCharBuffer("+++-+++++1234")
        self.assertEqual("+++", b.scanPredicate(lambda ch: ch == '+'))

    def testUngetch(self):
        b = self.bufferFromString("ong")
        b.ungetch('y')
        self.assertEqual('y', b.getch())
        self.assertEqual('o', b.getch())
        b.ungetch('u')
        self.assertEqual('u', b.getch())
        self.assertEqual('n', b.getch())
        self.assertEqual('g', b.getch())
        self.assertEqual('', b.getch())

    def testRepeatedGetchOnEmptyStreamIsOk(self):
        b = self.bufferFromString("")
        self.assertEqual('', b.getch())
        self.assertEqual('', b.getch())

    def testCappedBuffer(self):
        b = CappedBuffer(self.bufferFromString("supercalifragilisticexpialidocious"), 5)
        self.assertEqual("s", b.getch())
        self.assertEqual("u", b.getch())
        self.assertEqual("p", b.getch())
        self.assertEqual("e", b.getch())
        self.assertEqual("r", b.getch())
        self.assertEqual('', b.getch())
        self.assertEqual('', b.getch())
        b.ungetch('r')
        self.assertEqual("r", b.getch())
        self.assertEqual('', b.getch())

    def testWhitespaceScanning(self):
        b = self.bufferFromString(" 42\n43")
        self.assertEqual(" ", handleWhitespace(b))
        self.assertEqual("", handleWhitespace(b))
        self.assertEqual("4", b.getch())

    def testDecimalDigitScanning(self):
        b = self.bufferFromString("42 43!44")
        self.assertEqual(42, handleDecimalInt(b))
        self.assertEqual(" ", handleWhitespace(b))
        self.assertEqual(43, handleDecimalInt(b))

        b2 = self.bufferFromString("-1-2+3-4")
        self.assertEqual(-1, handleDecimalInt(b2))
        self.assertEqual(-2, handleDecimalInt(b2))
        self.assertEqual(3, handleDecimalInt(b2))
        self.assertEqual(-4, handleDecimalInt(b2))
        self.assertRaises(FormatError, handleDecimalInt, b2)

    def testCharacter(self):
        b = self.bufferFromString("hi!")
        self.assertEqual("h", handleChar(b))
        self.assertEqual("i", handleChar(b))
        self.assertEqual("!", handleChar(b))
        self.assertRaises(FormatError, handleChar, b)

    def testString(self):
        b = self.bufferFromString("-42 + 1 equals -41")
        self.assertEqual("-42", handleString(b))
        handleWhitespace(b)
        self.assertEqual("+", handleString(b))
        handleWhitespace(b)
        self.assertEqual("1", handleString(b))
        handleWhitespace(b)
        self.assertEqual("equals", handleString(b))
        handleWhitespace(b)
        self.assertEqual("-41", handleString(b))

    def testIntegerScanning(self):
        self.assertEqual((42, 43),
                         sscanf(" 42\n 43 ", "%d %d"))
        self.assertEqual((8,), sscanf("10", "%o"))
        self.assertEqual((8,), sscanf("010", "%o"))
        self.assertEqual((15,), sscanf("F", "%x"))
        self.assertEqual((15,), sscanf("f", "%x"))
        self.assertEqual((15,), sscanf("0xF", "%x"))
        self.assertEqual((15,), sscanf("0XF", "%x"))
        self.assertEqual((15,), sscanf("0Xf", "%x"))
        self.assertEqual((-1, -2, 3, -4), sscanf("-1-2+3-4", "%d%d%d%d"))

    def testWordScanning(self):
        self.assertEqual(("hello", "world"),
                         sscanf(" hello world", "%s %s"))

    def testSuppression(self):
        self.assertEqual((), sscanf(" hello world", "%*s %*s"))
        self.assertEqual(("happy",),
                         sscanf("hello happy world", "%*s %s %*s"))
        self.assertEqual((), sscanf("h", "%*c"))

    def testWidth(self):
        self.assertEqual(("00010",), sscanf("00010101010111", "%5c"))
        self.assertEqual(("xy",), sscanf("xyz", "%2s"))
        self.assertEqual(("xy",), sscanf(" xyz", "%2s"))
        ## %2c does not skip whitespace, so two leading spaces in the
        ## input are captured as two spaces.
        self.assertEqual(("  ",), sscanf("  xyz", "%2c"))

    def testFscanf(self):
        import io
        b = io.StringIO("hello world")
        self.assertEqual(("hello", " ", "world"), fscanf(b, "%s%c%s"))
        ## Check that calling fscanf() twice doesn't
        ## drop the last character
        b2 = io.StringIO("hello world")
        self.assertEqual(("hello",), fscanf(b2, "%s"))
        self.assertEqual((" ",), fscanf(b2, "%c"))
        self.assertEqual(("world",), fscanf(b2, "%s"))

    def testSkipLeadingSpaceOnScanning(self):
        """Ralph Heinkel reported a bug where floats weren't being
        parsed properly if there were leading whitespace for %f.
        This case checks that"""
        self.assertEqual((42.0,),
                         sscanf(" 42.0", "%f"))

    def testFloats(self):
        self.assertEqual((3.14,
                          10.,
                          .001,
                          1e100,
                          3.14e-10,
                          0e0,), sscanf("""3.14
                          10.
                          .001
                          1e100
                          3.14e-10
                          0e0""", "%f %f %f %f %f %f"))

    def testMoreSimpleScanningExamples(self):
        self.assertEqual((192,168,1,1),
                         sscanf("192.168.1.1", "%d.%d.%d.%d"))
        self.assertEqual(("a", "b", "c"),
                         sscanf(" ab c ", "%1s%1s%s"))
        self.assertEqual(("hello", " ", "world"),
                         sscanf("hello world", "%s%c%s"))
        self.assertRaises(IncompleteCaptureError,
                          sscanf, "192.168.1.1", "%d %d %d %d")
        self.assertEqual(("danny",),
                         sscanf("hi danny", "hi %s"))
        self.assertEqual(("danny",),
                         sscanf(" hi danny", " hi %s"))
        self.assertEqual(("a", "b", 3),
                         sscanf("ab3", "%c%c%d"))
        ## this case is weird, but it happens in C too!
        self.assertRaises(IncompleteCaptureError,
                          sscanf, " hi danny", "hi %s")

        ## The example that's used in
        ## 'http://docs.python.org/lib/node109.html'
        self.assertEqual(("/usr/bin/sendmail", 0, 4),
                         sscanf("/usr/bin/sendmail - 0 errors, 4 warnings",
                                "%s - %d errors, %d warnings"))

    def testErroneousFormats(self):
        self.assertRaises(FormatError, compile, "%")
        self.assertRaises(FormatError, compile, "% ")
        self.assertRaises(FormatError, compile, "%*")
        self.assertRaises(FormatError, compile, "%*z")
        self.assertRaises(FormatError, compile, "% d")
        self.assertRaises(FormatError, compile, "%* d")


if __name__ == '__main__':
    unittest.main()