diff --git a/doc/tutorial.org b/doc/tutorial.org index e3ddf2b..9c7360c 100644 --- a/doc/tutorial.org +++ b/doc/tutorial.org @@ -3,7 +3,7 @@ This tutorial is essentially a translation of Monadic Parser Combinators[1], or at least the first half, into common lisp. Discussion of static types and the details of monads are - omitted as we're simply concerned with parsing. + somewhat omitted as we're simply concerned with parsing. The example code in the following document is completely self-contained, and does not require an installation of the =SMUG= @@ -13,7 +13,7 @@ descent parsing is assumed. The only requirements are an ANSI common lisp environment, and a burning desire to find a better way to parse. - + * Introduction This tutorial, like this library, is based on an approach to @@ -30,36 +30,17 @@ ** Common Lisp In some cases, the natural name for a parser conflicts with a name - in the =COMMON-LISP= package. Rather then shadow the symbols, i've + in the =COMMON-LISP= package. Rather then shadow the symbols, I've chosen to prefix names with a #\. character. It is thought that this aids usability, as one can simply (:use :smug/tutorial). It also - helps to distiguish parser returning/running functions from other + helps to distinguish parser returning/running functions from other functions. -*** Testing - -#+BEGIN_SRC lisp - (defmacro test> (form &key ((=> provided-result) nil result-provided?) - (test 'equal)) - `(let* ((form-result ,form) - (result (if ',result-provided? - (funcall ',test form-result ',provided-result) - form-result))) - (prog1 result (assert result - () "~A~% => ~A ~% ...should be :~% ~A" - ',form form-result ',(if result-provided? - provided-result - "Something that evaluates to a non-NIL value"))))) -#+END_SRC - -#+RESULTS: -: TEST> - * How To Combine Parsers #+BEGIN_QUOTE A Parser for Things - is a functions from Strings + is a function from Strings to Lists of Pairs of Things and Strings! --- Fritz Ruehr, Willamette University [2] @@ -71,9 +52,8 @@ some way. Or, to put it simply, a function from strings to things. #+BEGIN_SRC lisp - ;; our fictional parser matches the string "string" - ;; and returns a SYMBOL thing - + ;; our fictional parser matches the string "string" + ;; and returns a SYMBOL thing (parse-thing "string") => THING #+END_SRC @@ -125,10 +105,12 @@ (defgeneric input-rest (input) (:method ((input string)) - (make-array (1- (length input)) - :displaced-to input - :displaced-index-offset 1 - :element-type (array-element-type input)))) + (multiple-value-bind (string displacement) + (array-displacement input) + (make-array (1- (length input)) + :displaced-to (or string input) + :displaced-index-offset (1+ displacement) + :element-type (array-element-type input))))) #+END_SRC #+BEGIN_SRC lisp @@ -149,7 +131,7 @@ The first parser is =.RESULT=, which always succeeds by returning the value passed to it, and does not consume any input. Because we've earlier defined parsers as functions that take a single argument - we'll make a curry with the input parameter. + we'll make a curry[5] with the input parameter. #+NAME: tutorial_.result #+BEGIN_SRC lisp @@ -210,12 +192,12 @@ * =.RUN=: =FUNCALL= in disguise -All the primatives return a =FUNCTION= that must be =FUNCALL='ed with =INPUT= in +All the primitives return a =FUNCTION= that must be =FUNCALL='ed with =INPUT= in order to run the parser. There are many reasons to define a =.RUN= function. We can =CL:TRACE= it, or change the input parameter =TYPE=, or change what the actual -primatives return. +primitives return. #+NAME: tutorial_.run #+BEGIN_SRC lisp @@ -223,7 +205,7 @@ primatives return. (funcall parser input)) #+END_SRC -* =.BIND=: Our first primative combinator +* =.BIND=: Our first primitive combinator Now that we have our primitive parsers, we need a way to combine them. We'd like to be able to apply parsers in sequence, and it @@ -287,7 +269,7 @@ primatives return. Of course, just having functions called =.BIND= and =.RESULT= does not a monad make. There are other contracts that =.BIND= (also known as pipe, >>=, *, or let) or =.RESULT= (aka lift, unit, return) must - fulfil. + fulfill. *** The monad laws @@ -534,21 +516,19 @@ order to save a bundle in the future. required, a deterministic combinator =.OR=, will be introduced later in the tutorial. -** TODO Effeciency +** TODO Efficiency =.FIRST= is the real choice when it comes down to it, as =.PLUS= really does matter. -#+BEGIN_SRC lisp +This works. +#+NAME: tutorial_.first +#+BEGIN_SRC lisp (defun .first (parser) (lambda (input) - (let ((results (funcall parser input))) + (let ((results (.run parser input))) (when results (list (cl:first results)))))) - #+END_SRC - -#+RESULTS: -: \.FIRST * Syntax : =LET*= and the identity monad @@ -677,14 +657,17 @@ and is a much nicer way to work over nesting =.BIND='s. #+NAME: tutorial_.string= #+BEGIN_SRC lisp (defun .string= (string) - (if (input-empty-p string) - (.result "") + (if (string= string "") + (.fail) (.let* - ((_ (.is 'char= (input-first string))) + ((_ (.is 'char= (aref string 0))) (_ (.string= (input-rest string)))) (.result string)))) #+END_SRC +#+RESULTS: tutorial_.string= +: \.STRING= + #+NAME: tutorial_test.string= #+BEGIN_SRC lisp (funcall (.string= "asdf") "asdfjkl") ;=> (("asdf" . "jkl")) @@ -694,7 +677,6 @@ and is a much nicer way to work over nesting =.BIND='s. Once can see how much nicer =.LET*= notation is, and also how the ignorable =_= comes in handy. - * =.MAP= : The repetition combinator Earlier, we defined a parser, =.WORD=, using =.BIND= and a recursive @@ -752,32 +734,40 @@ and is a much nicer way to work over nesting =.BIND='s. #+END_SRC We could now define a =TWO-OR-MORE= and =THREE-OR-MORE= etc., but it -is likely better to define a function to rule them all. The -=:AT-LEAST= keyword solves the "how many do we want to start with?" +is likely better to define a function to rule them all. + +** =.MAP= :: ONE FUNCTION TO RULE THEM ALL -# There is also the simple matter of using =.PLUS= or **. The -# =:USING= argument takes care of that. +The=:AT-LEAST= keyword solves the "how many do we want to start with?" + + There is also the simple matter of using =.PLUS= or **. The + =:USING= argument takes care of that. So, we can specify the =.MAP= parser as follows. #+NAME: tutorial_.map #+BEGIN_SRC lisp (defun .map (result-type parser - &key - (at-least 1)) + &key + (at-least 1) + (using #'.plus)) "=> =LIST= of /parser/ results." (labels ((.zero-or-more (parser) - (.plus - (.result nil) + (funcall + using (.let* ((x parser) (xs (.zero-or-more parser))) - (.result (cons x xs)))))) + (.result (when result-type + (cons x xs)))) + (.result nil)))) (if (zerop at-least) - (.zero-or-more parser) + (.let* ((result (.zero-or-more parser))) + (.result (when result-type (coerce result result-type)))) (.let* ((x parser) - (xs (.map result-type parser + (xs (.map 'list parser :at-least (1- at-least)))) - (.result (coerce (cons x xs) result-type)))))) + (.result (when result-type + (coerce (cons x xs) result-type))))))) #+END_SRC #+RESULTS: tutorial_.map @@ -793,24 +783,21 @@ So, we can specify the =.MAP= parser as follows. #+BEGIN_SRC lisp :results value (defun .quoted-string (&key (quote #\') (escape #\|)) - (.prog2 (.is 'char= quote)) - (.map (.plus (.is-not 'char= quote) - (.progn - (.is 'char= escape) - (.item))) - :result-type 'cl:string)) - - 'cl:string)) + (.let* ((_ (.char= quote)) + (string + (.map 'string + (.plus (.let* ((_ (.char= escape))) + (.item)) + (.is-not 'char= quote)))) + (_ (.char= quote))) + (.result string))) - (funcall (.quoted-string) "'The quote char is |' and the escape char is ||.'''")) #+END_SRC -#+RESULTS: | (The quote char is ' and the escape char is | . . '') | * =.OR=, =.NOT=, and =.AND= : deterministic logic combinators - ** =.OR= =.OR= is a deterministic =.PLUS=. It take any number of parsers. The @@ -865,21 +852,25 @@ So, we can specify the =.MAP= parser as follows. #+END_SRC ** Examples using =.OR=, =.NOT=, and =.AND= +*** =.NO-MORE-INPUT= Now that we have =.NOT=, we can specifically test for failure rather than abort the parse entirely. since the primitive parser =.ITEM= only fails when the input is empty, we can define =.NO-MORE-INPUT= by negating it. +#+name:tutorial_.no-more-input #+BEGIN_SRC lisp (defun .no-more-input () (.not (.item))) #+END_SRC - - Using =.AND=, we can implement =.PROGN= (which is really just =.AND= - because it will fail when the parser does), =.PROG1= (which comes in - handy for matching things and the end of the line, or when there - is no more input) and =.PROG2=, which as we will see is also quite useful. + +*** =.PROGN=, =.PROG1=, =.PROG2= + Using =.AND=, we can implement =.PROGN= (which is really just + =.AND= because it will fail when the parser does), =.PROG1= (which + comes in handy for matching things and the end of the line, or + when there is no more input) and =.PROG2=, which as we will see is + also quite useful. #+NAME: tutorial_.progn #+BEGIN_SRC lisp @@ -895,12 +886,14 @@ So, we can specify the =.MAP= parser as follows. (.and parser1 (apply #'.prog1 parser2 parsers))) #+END_SRC - The MAYBE combinator, which allows a parser to fail and still +*** =.OPTIONAL= + + The OPTIONAL combinator, which allows a parser to fail and still continue, is a natural use of =.OR=. -#+NAME: tutorial_.maybe +#+NAME: tutorial_.optional #+BEGIN_SRC lisp - (defun .maybe (parser) + (defun .optional (parser) (.or parser (.result nil))) #+END_SRC @@ -924,38 +917,165 @@ So, we can specify the =.MAP= parser as follows. #+END_SRC -* Parsing Literate Org Mode Source +* Literate Org Mode The file that this tutorial is 'weaved'[4] from has a lot of code. That code really needs to be in a source file. While we 'weave' the documentation, we 'tangle' the source code itself, all from the same -'literate' programming documents. +'literate' programming documents. -** The Code Block +** Tangle : make code from source document [[http://orgmode.org/][Org Mode]] syntax is used to layout the literate 'essay' that makes up this tutorial. The source code itself is stored in Code Blocks. -#+BEGIN_QUOTE -Live code blocks can be specified with a `src' block or inline. The -structure of a `src' block is - - ,#+BEGIN_SRC org - ,#+NAME: - ,#+BEGIN_SRC
- - ,#+END_SRC - ,#+END_SRC - -The `#+NAME:' line is optional, and can be used to name the code -block. Live code blocks require that a language be specified on -the #+BEGIN\_SRC' line. Switches and header arguments are optional. +#+BEGIN_SRC org + Live code blocks can be specified with a `src' block or inline. The + structure of a `src' block is + + + ,#+NAME: + ,#+BEGIN_SRC
+ + ,#+END_SRC + + The `#+NAME:' line is optional, and can be used to name the code + block. Live code blocks require that a language be specified on + the #+BEGIN\_SRC' line. Switches and header arguments are optional. + + -- http://orgmode.org/org.html#Structure-of-code-blocks +#+END_SRC + +#+BEGIN_SRC lisp + (defparameter *code-blocks* + (org-code-blocks (merge-pathnames + "doc/tutorial.org" + (asdf:system-source-directory :smug)))) +#+END_SRC + +*** :tangle "../tangle.lisp" + +Somewhere in this file is a line that resembles the following. + +#+BEGIN_SRC org + ,#+BEGIN_SRC lisp :noweb yes :tangle "../tangle.lisp" :padline no +#+END_SRC + +What we care about right now is the header arguments, and in specific +the :tangle keyword. + +In [[%3D%3D][=
=]] we set the =CODE-BLOCK-HEADER-ARGUMENTS= to an +=ALIST= if it parses as such, and a =.LINE= otherwise. We really only +care about the =ALIST=, and besides, it really is a =PLIST= after all, +so =GETF-CODE-BLOCK= is a decent name. + +#+BEGIN_SRC lisp + (defun getf-code-block (cb indicator) + (let* ((db (code-block-header-arguments cb)) + (value (when (listp db) (assoc indicator db :test 'string-equal)))) + (values (cdr value) (car value)))) +#+END_SRC + +So, in specific, we are looking for the =|:tangle "../tangle.lisp"|= +header argument, as that one code block that makes this a literate +/program/. + +#+BEGIN_SRC lisp +(defun tangle-code-blocks (&optional (code-blocks *code-blocks*)) + (remove-if-not (lambda (cb) (code-block-getf cb "tangle")) + code-blocks)) +#+END_SRC + +**** The tangle.lisp code block + +#+name: tangle_tangle-code-block +#+BEGIN_SRC lisp :results output code +(find-if (lambda (cb) + (string-equal "../tangle.lisp" + (read-from-string (getf-code-block cb "tangle")))) + (tangle-code-blocks)) + +#+END_SRC + +#+RESULTS: tangle_tangle-code-block +#+BEGIN_SRC lisp + +#S(CODE-BLOCK + :NAME NIL + :LANGUAGE "lisp" + :SWITCHES NIL + :HEADER-ARGUMENTS (("noweb" . "yes ") ("tangle" . "\"../tangle.lisp\" ") + ("padline" . "no")) + :BODY (" (defpackage :smug/tangle" " (:use :cl :smug/tutorial)" + " (:export))" " (in-package :smug/tangle) " "" + " <>" "" " <>" "" + " <>" "" + " <>" "" + " <>" "" " <>")) +#+END_SRC + +**** :noweb yes + +The =tangle.lisp= code block has a =:noweb yes= header argument. + +#+BEGIN_SRC lisp + :HEADER-ARGUMENTS (("noweb" . "yes ") ("tangle" . "\"../tangle.lisp\" ") + ("padline" . "no")) +#+END_SRC + + +#+BEGIN_SRC lisp :noweb yes :results output code + + (let ((cb + <>)) + (code-block-body cb)) +#+END_SRC + +#+RESULTS: +#+BEGIN_SRC lisp + +(" (defpackage :smug/tangle" " (:use :cl :smug/tutorial)" " (:export))" + " (in-package :smug/tangle) " "" " <>" "" + " <>" "" " <>" "" + " <>" "" + " <>" "" " <>") +#+END_SRC + + + +*** Parsing the file + +#+BEGIN_SRC lisp + + ;; #+quicklisp (ql:quickload "alexandria") + + (defun org-code-blocks (org-doc) + (let ((string + (etypecase org-doc + (string org-doc) + (pathname (alexandria:read-file-into-string org-doc))))) + (destructuring-bind ((list . input)) + (.run (.first (.map 'list (.or (.code-block) (.line)))) string) + (values (remove-if-not #'code-block-p list) input)))) + + +#+END_SRC + +#+RESULTS: +: ORG-CODE-BLOCKS --- http://orgmode.org/org.html#Structure-of-code-blocks -#+END_QUOTE *** =.STRING-EQUAL=: For case insensitivity +#+BEGIN_QUOTE +Org uses option keywords (like #+TITLE to set the title) and +environment keywords (like #+BEGIN\_HTML to start a HTML +environment). They are written in uppercase in the manual to enhance +its readability, but you can use lowercase in your Org files. +-- http://orgmode.org/org.html#Conventions + +#+END_QUOTE + The =#+NAME=, =#+BEGIN_SRC= and =#+END_SRC= are case insensitive. We have =.CHAR== and =.STRING== already, so =.CHAR-EQUAL= and =.STRING-EQUAL= are in order[fn:f_chareq]. @@ -963,10 +1083,10 @@ are in order[fn:f_chareq]. #+NAME: tutorial_.char-equal #+BEGIN_SRC lisp (defun .char-equal (char) - (.is #'char-equal char)) + (.is #'cl:char-equal char)) #+END_SRC -For our =.STRING=, we simply return the string we passed in. Because +For our =.STRING==, we simply return the string we passed in. Because =CL:STRING-EQUAL= "ignore[s] differences in case"[fn:f_chareq], we actually need to return that matched string from what we are parsing. @@ -974,89 +1094,271 @@ actually need to return that matched string from what we are parsing. #+BEGIN_SRC lisp (defun .string-equal (string) (labels ((%string-equal (string) - (if (input-empty-p string) - (.result nil) - (.let* ((first (.char-equal (input-first string))) - (rest (%string-equal (input-rest string)))) - (.result (cons first rest)))))) + (.let* ((first (.char-equal (aref string 0))) + (rest (if (> (length string) 1) + (%string-equal (subseq string 1)) + (.result nil)))) + (.result (cons first rest))))) (.let* ((list (%string-equal string))) (.result (coerce list 'string))))) #+END_SRC +#+RESULTS: tutorial_.string-equal +: \.STRING-EQUAL + #+BEGIN_SRC lisp (test> (.run (.string-equal "asd") "AsD") - :=> (("AsD" . ""))) + => (("AsD" . ""))) #+END_SRC +#+RESULTS: +: T + +*** The Code Block +# <> -*** =.switches= +#+name: tangle_.code-block +#+BEGIN_SRC lisp + (defstruct code-block + name + language + switches + header-arguments + body) + + (defun .code-block () + (.let* ((plist (.code-block-plist))) + (.result (apply 'make-code-block plist)))) +#+END_SRC + + +**** =|#+NAME: |= + +#+name: code-block-name +#+BEGIN_SRC lisp + (defun |#+NAME: | () + (.progn (.map nil (.is 'member *whitespace*)) + (.string-equal "#+NAME: ") + (.prog1 (.map 'string (.is-not 'char= #\Newline)) + (.char= #\Newline)))) +#+END_SRC + + +#+NAME: tutorial_test.code-block-name +#+BEGIN_SRC lisp :noweb yes + (test> (.run + (|#+NAME: |) +" #+naME: foobar +") + => (("foobar" . ""))) + +#+END_SRC + +**** =|#+BEGIN\_SRC"
|= #+BEGIN_QUOTE +Live code blocks require that a language be specified on +the #+BEGIN\_SRC' line. Switches and header arguments are optional. + +-- http://orgmode.org/org.html#Structure-of-code-blocks +#+END_QUOTE + +#+name: tutorial_.begin_src +#+BEGIN_SRC lisp :noweb yes + (defun |#+BEGIN_SRC
| () + (.let* ((language (|#+BEGIN_SRC |)) + (switches (||)) + (args (|
|))) + (.result (list :language language + :switches switches + :header-arguments args)))) +#+END_SRC + +#+RESULTS: tutorial_.begin_src +: | + +***** =|#+BEGIN_SRC |= + +#+name: tutorial_.begin_src +#+BEGIN_SRC lisp + (defun |#+BEGIN_SRC | () + (.progn (.whitespace nil :at-least 0) + (.string-equal "#+BEGIN_SRC ") + (.atom))) +#+END_SRC + + +#+BEGIN_SRC lisp :results output code +(test> (.run (|#+BEGIN_SRC |) "#+BEGIN_SRC lisp :noweb yes") + => (("lisp" . " :noweb yes"))) + +#+END_SRC + -Both in example and in src snippets, you can add a -n switch to -the end of the BEGIN line, to get the lines of the example -numbered. +***** == -If you use a +n switch, the numbering from the previous +#+BEGIN_QUOTE +[...] in =src= snippets, you can add a -n switch to +the end of the =BEGIN= line, to get the lines of the example +numbered. -- http://orgmode.org/org.html#Literal-examples +#+END_QUOTE + +#+name: code-block-switches +#+BEGIN_SRC lisp -n +(.string= " -n") +#+END_SRC + +#+BEGIN_QUOTE +If you use a =+n= switch, the numbering from the previous numbered snippet will be continued in the current one. -In literal -examples, Org will interpret strings like ‘(ref:name)’ as labels, -and use them as targets for special hyperlinks like -[[(name)]] (i.e., the reference name enclosed in single -parenthesis). In HTML, hovering the mouse over such a link will -remote-highlight the corresponding code line, which is kind of -cool. + -- http://orgmode.org/org.html#Literal-examples +#+END_QUOTE + +#+name: code-block-switches +#+BEGIN_SRC lisp +n +(.string= " +n") +#+END_SRC + +#+BEGIN_QUOTE +In literal examples, Org will interpret strings like ‘(ref:name)’ as +labels, and use them as targets for special hyperlinks like [[(name)]] +(i.e., the reference name enclosed in single parenthesis). In HTML, +hovering the mouse over such a link will remote-highlight the +corresponding code line, which is kind of cool. You can also add a -r switch which removes the labels from the source code121. With the -n switch, links to these references -will be labeled by the line numbers from the code listing, -otherwise links will use the labels with no parentheses. Here is -an example: +will be labelled by the line numbers from the code listing, +otherwise links will use the labels with no parentheses. +#+END_QUOTE + +#+name: code-block-switches +#+BEGIN_SRC lisp +n -r +(.string= " -r") ;; (ref:switch) +#+END_SRC - ,#+BEGIN_SRC emacs-lisp -n -r - (save-excursion (ref:sc) - (goto-char (point-min))) (ref:jump) - ,#+END_SRC - In line [[(sc)]] we remember the current position. [[(jump)][Line (jump)]] - jumps to point-min. +So, using [[(switch)]] you now can link to a specific label. It is a +normal org [[(switch)][link]], so it can be edited. +#+BEGIN_QUOTE If the syntax for the label format conflicts with the language syntax, use a -l switch to change the format, for example - ‘#+BEGIN_SRC pascal -n -r -l "((%s))"’. See also the variable org-coderef-label-format. + ‘#+BEGIN\_SRC pascal -n -r -l "((%s))"’. See also the variable =org-coderef-label-format=. + +-- http://orgmode.org/org.html#Literal-examples #+END_QUOTE -#+NAME: tutorial_code-block-switches +#+name: code-block-switches +#+BEGIN_SRC lisp + (.let* ((-l (.string= " -l ")) + (format (.prog2 (.char= #\") + (.map 'string (.is-not 'char= #\")) + (.char= #\")))) + (.result (list :switch -l + :format format))) +#+END_SRC +****** =(defun || () ...)= + +#+NAME: tangle_code-block-switches +#+BEGIN_SRC lisp :noweb yes + (defun || () + (flet ((.switch () + (.or + <>))) + (.map 'list + (.let* ((switch (.switch))) + (.result (if (stringp switch) + (list :switch switch) + switch))) + :at-least 0))) +#+END_SRC + +***** =
= +# <<=
=>> + +"Header Arguments" are quite odd. As far as I can tell, a [[http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node108.html][Propery List]] +with keys as symbols in the keyword package, but the values are not +=.ATOM='s, but rather whatever comes before the next key, syntax wise. + +Regardless, what is nice about it is the fact that is does make up the +rest of the line. An =ALIST= will be fine on our side. + +#+NAME: tangle_code-block-header-arguments-org +#+BEGIN_SRC org + ,#+BEGIN_SRC lisp :results output code :noweb yes :padline no +#+END_SRC + +#+NAME: tangle_code-block-header-arguments +#+BEGIN_SRC lisp :results value code + (defun |
| () + (flet ((.arg () + (.let* ((key (.progn + (.optional (.whitespace)) + (.char= #\:) + (.atom))) + (value (.progn + (.whitespace) + (.map 'string (.is-not 'member '(#\Newline #\:)))))) + (.result (cons key value))))) + (.or (.prog1 (.first (.map 'list (.arg))) + (.char= #\Newline)) + (.line)))) +#+END_SRC + +#+RESULTS: #+BEGIN_SRC lisp - (.map 'list (.let* ((symbol (.is 'find "+-")) - - (.result symbol))) - #+END_SRC -*** =.code-block= : Phase 1 +#+BEGIN_SRC lisp :results output code :noweb yes + (test> (.run (|
|) + (format nil " :results output code :noweb yes :padline no~%"))) +#+END_SRC +#+RESULTS: #+BEGIN_SRC lisp - (defun .code-block-plist () - (flet ((.begin - (.prog2 (.string-equal "#+BEGIN_SRC ") - (.let* ((language - (.map 'string (.is-not 'member '(#\Space #\Newline)))) - (sep (.item))) - (src (.prog1 (.string-of (.item)) - (.string-equal "#+END_SRC")))) - (.result (list :name name - :language language + +(((("results" . "output code ") ("noweb" . "yes ") ("padline" . "no")) . "")) +#+END_SRC + +**** =| #+END_SRC|= +#+name: tutorial_body-end-src +#+BEGIN_SRC lisp + (defun | #+END_SRC| () + (flet ((.end () + (.prog2 (.map nil (.is 'member *whitespace*)) + (.string-equal "#+END_SRC") + (.or (.is 'member '(#\space #\newline)) + (.not (.item)))))) + (.prog1 + (.map 'list (.and (.not (.end)) + (.line))) + (.end)))) +#+END_SRC + +**** =.code-block-plist= + +#+name: tutorial_.code-block-plist +#+BEGIN_SRC lisp :noweb yes - :src src)))) - (.let* ((name (.prog2 (.string-equal "#+NAME: ") - (.map 'string (.item)) - (.char= #\Newline))) - (begin + <> + + <> + + <> + + <> + + (defun .code-block-plist () + (.let* ((name (.optional (|#+NAME: |))) + (begin (|#+BEGIN_SRC
|)) + (body (| #+END_SRC|))) + (.result (list* :name name :body body begin)))) #+END_SRC + #+BEGIN_SRC lisp (let ((string "#+NAME: ,#+BEGIN_SRC
@@ -1071,11 +1373,160 @@ use a -l switch to change the format, for example #+END_SRC +#<< +** =*debug-input*= and friends. + +This is the first real parser. One of the things about the +non-determinism is that things get FUBAR'd. + +#+BEGIN_SRC lisp + + (defparameter *debug-input* (make-hash-table :test #'equal)) + + (defmethod input-first :before (input) + ,#+#:|remove to debug| + (let ((num (gethash input *debug-input*))) + (when (and num (> num 10)) + (break "~A times we've tried ~A" num input)) + (setf (gethash input *debug-input*) (1+ (or num 0))))) + +#+END_SRC + +** =.make-string= + +#+BEGIN_SRC lisp + + + +#+END_SRC +** =.LINE=: many lines make up a document + +With all the parsers needed defined, a =.LINE= is trivial. + +#+name: tangle_.line +#+BEGIN_SRC lisp + (defun .line () + (.prog1 (.map 'string (.is-not 'char= #\newline)) + (.char= #\newline))) +#+END_SRC + +** =.WHITESPACE=: member of? + +#+name: tangle_whitespace +#+BEGIN_SRC lisp +(defparameter *whitespace* '(#\space #\tab)) + +(defun .whitespace (&optional (result-type 'string) &key (at-least 1)) + (.map result-type (.is 'member *whitespace*) :at-least at-least)) +#+END_SRC + +#+RESULTS: tangle_whitespace +: \.WHITESPACE + + +#+BEGIN_SRC lisp :results output code +(test> (.run (.whitespace) (format nil "~t asd")) + => ((" " . "asd") (" " . " asd"))) + +#+END_SRC + +** =.ATOM= +#+name: tangle_.atom +#+BEGIN_SRC lisp + (defun .atom () + (.first (.map 'string (.is-not 'member (cons #\Newline *whitespace*)) + :at-least 1))) + +#+END_SRC + +#+BEGIN_SRC lisp :results output code + (test> (.run (.map 'list (.prog1 (.atom) + (.or (.whitespace) (.not (.item))))) + ":foo bar :baz bat") + => (((":foo" "bar" ":baz" "bat") . "") ((":foo" "bar" ":baz") . "bat") + ((":foo" "bar") . ":baz bat") ((":foo") . "bar :baz bat") + (NIL . ":foo bar :baz bat"))) + +#+END_SRC + +#+RESULTS: +#+BEGIN_SRC lisp + +(((":foo" "bar" ":baz" "bat") . "") ((":foo" "bar" ":baz") . "bat") + ((":foo" "bar") . ":baz bat") ((":foo") . "bar :baz bat") + (NIL . ":foo bar :baz bat")) +#+END_SRC + +#+BEGIN_SRC lisp :results output code + (test> (.run (.first + (.map 'list (.prog1 + (.first (.map + 'list (.prog2 + (.optional (.whitespace)) + (.atom) + (.optional (.whitespace))) + :at-least 1)) + (.plus (.char= #\Newline) (.not (.item)))))) + + ":foo bar :baz bat + :jkl asd :qwerty uiop + ")) + +#+END_SRC + +#+RESULTS: +#+BEGIN_SRC lisp + +((((":foo" "bar" ":baz" "bat") (":jkl" "asd" ":qwerty" "uiop")) . "")) +#+END_SRC + + * The Source Code + + +** SMUG/TANGLE +*** The tangle.lisp file +#+BEGIN_SRC lisp :noweb yes :tangle "../tangle.lisp" :padline no + (defpackage :smug/tangle + (:use :cl :smug/tutorial) + (:export)) + (in-package :smug/tangle) + + <> + + <> + + <> + + <> + + <> + + <> +#+END_SRC + ** SMUG/TUTORIAL +*** Testing + +#+BEGIN_SRC lisp + (defmacro test> (form &key ((=> provided-result) nil result-provided?) + (test 'equal)) + `(let* ((form-result ,form) + (result (if ',result-provided? + (funcall ',test form-result ',provided-result) + form-result))) + (prog1 result (assert result + () "~A~% => ~S ~% ...should be :~% ~S" + ',form form-result ',(if result-provided? + provided-result + "Something that evaluates to a non-NIL value"))))) +#+END_SRC -*** SMUG/TUTORIAL primatives +#+RESULTS: +: TEST> + +*** SMUG/TUTORIAL primitives **** =.RUN= @@ -1138,7 +1589,6 @@ use a -l switch to change the format, for example #:.not #+END_SRC - *** The =INPUT= interface functions #+NAME: tutorial-source @@ -1155,8 +1605,6 @@ use a -l switch to change the format, for example *** The =PARSER= itself - - #+NAME: tutorial-source #+BEGIN_SRC lisp :noweb yes <> @@ -1174,47 +1622,60 @@ use a -l switch to change the format, for example (defpackage :smug/tutorial (:use :cl) (:export + #:.let* + #:.map + #:.is + #:.is-not + #:.char= + #:.char-equal + #:.string-equal + #:.string= + #:.progn + #:.prog1 + #:.prog2 + #:.and + #:.or + #:.not + #:.first + #:.optional <>)) (in-package :smug/tutorial) - + <> - + <> - + <> - + <> - - <> - + + <> + <> - + <> - + <> - + <> - + <> - + <> <> - + <> <> - + <> - - - <> - - <> + + <> + #+END_SRC - * COMMENT =.PROGN=, =.PROG1= and =.PROG2= This likely should not be here. @@ -1280,6 +1741,7 @@ This likely should not be here. #+BEGIN_SRC emacs-lisp (setq org-src-fontify-natively t) (setq *org-babel-use-quick-and-dirty-noweb-expansion* t) + (setq org-use-sub-superscripts "{}") #+END_SRC tutorial_.bind @@ -1297,6 +1759,16 @@ You can make words `*bold*', `/italic/', `_underlined_', `=code=' and and verbatim string is not processed for Org mode specific syntax; it is exported verbatim. + +* Task List +** TODO Patch printf.lisp to use this code + +#+BEGIN_QUOTE +(10:36:46 AM) nyef: drewc: In case you're here, someone is asking about your little toy. + +(10:37:23 AM) dlowe: drmeister: I've used smug. It's quite nice. http://github.com/dlowe-net/printf +#+END_QUOTE + * footnotes [1] [[file:monparsing.org][Monadic Parser Combinators]] (pdf, ps, bibtex) Graham Hutton and @@ -1310,13 +1782,16 @@ Nottingham, 1996. [3] like, say, scheme - -#(end-lisp) - [4] Literate Programming [fn:f_chareq] http://clhs.lisp.se/Body/f_chareq.htm +[5] "In mathematics and computer science, currying is the technique of +translating the evaluation of a function that takes multiple arguments +(or a tuple of arguments) into evaluating a sequence of functions, +each with a single argument (partial application). " + -- https://en.wikipedia.org/wiki/Currying +