Version History
- To Do
- 0.26.4
- 0.26.2
- 0.26.0
- 0.24.0
- 0.22.24
- 0.22.22
- 0.22.20
- 0.22.18
- 0.22.16
- 0.22.14
- 0.22.12
- 0.22.10
- 0.22.8
- 0.22.6
- 0.22.4
- 0.22.2
- 0.22.0
- 0.21.0
- 0.20.2
- 0.20.0
- 0.19.8
- 0.19.6
- 0.19.5
- 0.19.4
- 0.19.3
- 0.19.2
- 0.19.1
- 0.19.0
- 0.18.9
- 0.18.8
- 0.18.7
- 0.18.6
- 0.18.5
- 0.18.4
- 0.18.3
- 0.18.2
- 0.18.1
- 0.18.0
- 0.17.4
- 0.17.3
- 0.17.2
- 0.17.1
- 0.17.0
- 0.16.1
- 0.16.0
- 0.15.4
- 0.15.3
- 0.15.2
- 0.15.1
- 0.15.0
- 0.14.0
- 0.13.7
- 0.13.6
- 0.13.5
- 0.13.4
- 0.13.3
- 0.13.2
- 0.13.1
- 0.13.0
- 0.12.3
- 0.12.2
- 0.12.1
- 0.12.0
- 0.11.10
- 0.11.9
- 0.11.8
- 0.11.7
- 0.11.6
- 0.11.5
- 0.11.4
- 0.11.3
- 0.11.2
- 0.11.1
- 0.11.0
- 0.10.3
- 0.10.2
- 0.10.1
- 0.10.0
- 0.9.4
- 0.9.3
- 0.9.2
- 0.9.1
- 0.9.0
- 0.8.0
- 0.7.0
- 0.6.1
- 0.6.0
- 0.5.0
- 0.4.17
- 0.4.16
- 0.4.15
- 0.4.14
- 0.4.13
- 0.4.12
- 0.4.11
- 0.4.10
- 0.4.9
- 0.4.8
- 0.4.7
- 0.4.6
- 0.4.5
- 0.4.4
- 0.4.3
- 0.4.2
- 0.4.1
- 0.4.0
- 0.3.2
- 0.3.1
- 0.3.0
- 0.2.9
- 0.2.8
- 0.2.7
- 0.2.6
- 0.2.5
- 0.2.4
- 0.2.3
- 0.2.2
- 0.2.1
- 0.2.0
- 0.1.9
- 0.1.8
- 0.1.7
- 0.1.6
- 0.1.5
- 0.1.4
- 0.1.3
- 0.1.2
- 0.1.1
- 0.1.0
To Do
-
Change: Extensions wiki from table format for options to list, which is easier to maintain and read when descriptions can benefit from complex formatting
-
Fix: clean up and verify the Extensions wiki options lists for name changes, missing or extra entries. Update descriptions for better understanding.
-
Add: generated HTML element positions to TagRanges to allow mapping from source offset to HTML offset for the element(s). This is needed to allow synchronization with source when using an attribute to hold the source information is not an option.
-
Add: Latex extension
-
Change: complete parser profiles for variations within a family Markdown Parser Emulation.
- League/CommonMark
- Jekyll
- Php Markdown Extra
- CommonMark (default for family): ParserEmulationProfile.COMMONMARK
- FixedIndent (default for family): ParserEmulationProfile.FIXED_INDENT
- GitHub Comments (just CommonMark): ParserEmulationProfile.COMMONMARK
- GitHub Docs (now CommonMark): ParserEmulationProfile.COMMONMARK, old parsing was ParserEmulationProfile.GITHUB_DOC
- Kramdown (default for family): ParserEmulationProfile.KRAMDOWN
- Markdown.pl (default for family): ParserEmulationProfile.MARKDOWN
- MultiMarkdown: ParserEmulationProfile.MULTI_MARKDOWN
- Pegdown, with pegdown extensions use PegdownOptionsAdapter in flexmark-profile-pegdown
- Pegdown, without pegdown extensions ParserEmulationProfile.PEGDOWN
-
Fix: #146, Formatter missing blank line after HTML blocks. General formatter issue if Parser.BLANK_LINES_IN_AST was false. Affected elements:
- Abbreviations
- Definitions
- Html Blocks
- Lists
-
Fix: Node.segmentSpanChars(StringBuilder, int, int, String, String, String, String) would not output the segment if startOffset >= endOffset, which would be the case if the node's segment was replaced with a string. Now dumping the AST using AstCollectingVisitor will reflect replaced segments.
-
Fix: #143, FlexmarkHtmlParser builders not static, merged #144, Update FlexmarkHtmlParser.java
-
Fix: make html parser instance re-usable, now resets state for each parse invocation
-
Fix: Update parser to CommonMark Spec 0.28, the only effects on parsing are:
-
***foo*** is now parsed as italic(bold(foo)) whereas previously it was bold(italic(foo)).
Use ParserEmulationProfile.COMMONMARK_0_27 or just set Parser.STRONG_WRAPS_EMPHASIS to true to get old parsing behaviour.
-
Matched nested parentheses in link URLs now do not need to be escaped. [foo](fun(bar)) will now be parsed as a link, previously it would be a [foo] reference followed by text (fun(bar)).
Use ParserEmulationProfile.COMMONMARK_0_27 or just set Parser.LINKS_ALLOW_MATCHED_PARENTHESES to false to get old parsing behaviour.
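The matched-parentheses rule can be illustrated with a small depth counter. This is only a sketch of the idea, not the parser's implementation: the class and method names are hypothetical, and link titles and whitespace handling are ignored.

```java
// Hypothetical illustration of matched parentheses in a link destination.
final class LinkDestinationSketch {
    /** Returns the index of the ')' matching the '(' at openParen, or -1 if unmatched. */
    static int findClosingParen(CharSequence text, int openParen) {
        int depth = 0;
        for (int i = openParen; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length()) {
                i++; // skip the escaped character
                continue;
            }
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) return i;
            }
        }
        return -1;
    }

    /** Extracts the destination of the first inline link, or null when parentheses do not match. */
    static String destination(String markdown) {
        int bracket = markdown.indexOf("](");
        if (bracket < 0) return null;
        int open = bracket + 1; // index of '('
        int close = findClosingParen(markdown, open);
        return close < 0 ? null : markdown.substring(open + 1, close);
    }
}
```

With this sketch, `[foo](fun(bar))` yields the destination `fun(bar)`, while an unbalanced `[foo](fun(bar)` yields no link.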
-
-
Add: profiles for COMMONMARK_0_26, COMMONMARK_0_27, COMMONMARK_0_28, with COMMONMARK defaulting to the most recent implemented spec: 0.28 for now. The 0.26 profile only differs from the 0.27 profile by using double blank lines to terminate lists. Delimiter parsing is not downgraded from 0.27.
-
Add: spec tests for 0.27 and 0.28 which use the version specific emulation profiles.
-
API Change: add node renderer API for delegating rendering to previously registered node renderer stack, allowing partial node rendering customizations with non-customized rendering passed to previous renderer.
- NodeRendererContext.delegateRender() to delegate current node rendering to previously registered handler for the current node. Should only be called from handler's render() method. If there is no renderer to which to delegate then the node will not be rendered.
- NodeRendererContext.getDelegatedSubContext(Appendable, boolean) to get a sub-context to be used in delegated rendering, this context inherits current node rendering handler and node information needed for delegateRender() method.
- Added DelegatingNodeRendererFactory which should be implemented by custom renderers which rely on delegating some rendering of their nodes to existing renderers. It provides a set of NodeRendererFactory classes to which this renderer may delegate to provide a controlled ordering of renderers independent of registration order.
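The delegation idea can be sketched without the library. The types below (NodeHandler, Context) are hypothetical stand-ins, not the flexmark API: handlers are kept in registration order, the most recently registered one runs first, and each may hand the node to the handler registered before it.

```java
import java.util.List;

// Hypothetical model of a renderer stack with delegation; not the flexmark classes.
final class DelegationSketch {
    interface NodeHandler {
        void render(String node, Context context, StringBuilder out);
    }

    static final class Context {
        private final List<NodeHandler> handlers; // registration order
        private final String node;
        private final StringBuilder out;
        private int current; // index of the handler currently rendering

        Context(List<NodeHandler> handlers, String node, StringBuilder out) {
            this.handlers = handlers;
            this.node = node;
            this.out = out;
            this.current = handlers.size();
        }

        /** Passes the node to the previously registered handler; does nothing when there is none. */
        void delegateRender() {
            if (current == 0) return; // no handler left: the node is not rendered
            current--;
            try {
                handlers.get(current).render(node, this, out);
            } finally {
                current++;
            }
        }
    }

    static String render(List<NodeHandler> handlers, String node) {
        StringBuilder out = new StringBuilder();
        new Context(handlers, node, out).delegateRender();
        return out.toString();
    }
}
```

A custom handler that appends a wrapper, calls `context.delegateRender()` and then closes the wrapper gets the default output in between, which is the partial customization the API change describes.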
- Fix: remove flexmark-ext-spec-example module from flexmark-profile-pegdown dependencies because it isn't used and caused junit to be included in runtime scope.
- Fix: remove flexmark-ext-spec-example module from flexmark-all dependencies because it would cause junit to be included in runtime scope.
-
Change: refactor link and reference nodes for common access to url, pageref, anchor and title fields. Also add convenience Node.getSegmentsForChars() which returns only sequences which when appended into SegmentedSequence will result in the chars for the node.
-
Add: sample for url change in AST and output via formatter: FormatterWithMods.java
-
Fix: #138, HTML to Markdown converter missing list end for two consecutive lists
-
Add: FlexmarkHtmlParser.LISTS_END_ON_DOUBLE_BLANK default false, when set to true consecutive lists are separated by double blank line, otherwise by an empty HTML comment line.
-
Change: when Parser.CODE_SOFT_LINE_BREAKS is true and HtmlRenderer.SOFT_BREAK is not all spaces or tabs then inline code will render soft line breaks as per HtmlRenderer.SOFT_BREAK definition. A side-effect of this is white-space collapsing only occurs for inline code line segments.
-
Change: add Parser.CODE_BLOCK_INDENT, defaults to value of Parser.LISTS_ITEM_INDENT, allows for separate control of indented block setting from list item settings.
-
Fix: #133, The pom.xml file's url should be update
-
Add: extra wiki link tests for combination with TypographicExtension
-
Fix: escaped HTML blocks should be wrapped in <p>...</p>
-
Fix: allow Document.getLineNumber(int offset) to return line number when Parser.TRACK_DOCUMENT_LINES is false. This is less efficient because it counts EOLs in the document char sequence, but it is useful when a line number is needed for error reporting without incurring the storage overhead of storing individual line character sequences.
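Counting EOLs up to an offset can be sketched as follows. This is an illustration only, not the library's code: the class name is hypothetical and the sketch clamps out-of-range offsets to the text bounds, which is an assumption.

```java
// Hypothetical sketch: 0-based line number for an offset, found by counting EOLs.
final class LineNumberSketch {
    static int lineNumber(CharSequence text, int offset) {
        // Assumption: offsets outside the text are clamped to its bounds.
        int end = Math.min(Math.max(offset, 0), text.length());
        int line = 0;
        for (int i = 0; i < end; i++) {
            char c = text.charAt(i);
            if (c == '\n') {
                line++; // counts "\n" and the "\n" of "\r\n"
            } else if (c == '\r' && (i + 1 >= text.length() || text.charAt(i + 1) != '\n')) {
                line++; // a lone "\r" is also an EOL
            }
        }
        return line;
    }
}
```

Each call scans the text from the start, so the cost is linear in the offset. That is the efficiency trade-off against keeping a list of line segments.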
-
Fix: potential exception when generating source position in HTML for individual lines
-
Add: Parser.TRACK_DOCUMENT_LINES default false. When true document lines are tracked in the document's lineSegments list and offset to line method can be used to get the 0-based line number for the given offset. When false these functions return 0.
-
Add: line number API functions to Node and Document, set Parser.TRACK_DOCUMENT_LINES to true to have these functions return other than 0.
- Node.getLineNumber() and Node.getStartLineNumber() return 0-based line number for node's start offset.
- Node.getEndLineNumber() returns 0-based line number for node's end offset - 1.
- Document.getLineCount() returns number of lines in the document
- Document.getLineNumber(int offset) returns 0-based line number for the given offset or Document.getLineCount() if offset is outside the document text range.
-
-
Add: GfmUsersExtension to parse Gfm style user refs @user-name
-
Add: GfmIssuesExtension to parse Gfm style issue refs #123
-
Add: Parser.INLINE_DELIMITER_DIRECTIONAL_PUNCTUATIONS default false, when true allows delimiters to start after opening brackets ([{< and end after closing brackets )}]> without needing to have whitespace characters on the other side of delimiter.
Currently aa**foo**aa will parse bold but aa**foo()**aa will not because bracket direction is not taken into account. With above option set to true the second case will also parse bold but aa**foo(**aa and aa**)foo**aa will not.
-
Add: FlexmarkHtmlParser.PRE_CODE_PRESERVE_EMPHASIS, default false. When true will preserve inline emphasis tags (<strong>, <em>, <del>, <ins>, <sub>, <sup>) and convert them to markdown syntax inside <pre><code> HTML blocks. Otherwise will strip these out of generated markdown.
<pre><code class="html"><div> <strong>test</strong> </div> </code></pre>
By default will strip out emphasis and convert to:
```html
<div>
test
</div>
```
With option set to true will preserve emphasis and convert to:
```html
<div>
**test**
</div>
```
- Add: TablesExtension.MIN_SEPARATOR_DASHES to control the minimum number of dashes before a separator column is recognized as a table separator.
-
Add: parser family specific HTML block test cases
-
Add: Parser.HTML_BLOCK_DEEP_PARSE_INDENTED_CODE_INTERRUPTS default false, when true indented code can interrupt HTML block without a preceding blank line.
-
Add: ParserEmulationProfile.PEGDOWN_STRICT profile to emulate HTML block parsing according to pegdown rules. ParserEmulationProfile.PEGDOWN uses less strict HTML block parsing which will end an HTML block on a blank line.
-
Add: Parser.HTML_BLOCK_DEEP_PARSE_FIRST_OPEN_TAG_ON_ONE_LINE to not parse open tags unless they are contained on one line. Parsers like MultiMarkdown 6.0 are more compatible with this mode on.
-
Add: html deep block parsing for non-commonmark parsers. Need to add HTML block parsing tests to parser emulation family tests.
-
API Change: BlockParser.isRawText() used for interruptible blocks, when this method returns true then indenting spaces are passed to the block. Used by HtmlBlockParser to keep indents on continuation lines that could be interrupted by another markdown element.
-
Fix: add optional tag logic to HtmlDeepParser so that optional end tags when omitted do not cause nesting of tags as per 8.1.2.4. Optional tags
- Fix: Deep HTML parser double parsing the first line of the HTML
-
API Change: remove tagOpened and tagClosed from HtmlFormattingAppendable. These methods are part of implementation. Moved to HtmlFormattingAppendableBase and made protected.
-
API Change: add BlockParser.isInterruptible() and BlockParser.canInterruptBy(BlockParserFactory) to allow block parser control over which blocks can interrupt them.
-
Add: HtmlDeepParser class to handle chunkwise parsing of HTML blocks to allow better HTML block parsing behaviour and pegdown compatibility.
- Parser.HTML_BLOCK_DEEP_PARSER default false, enable deep HTML block parsing
- Parser.HTML_BLOCK_DEEP_PARSE_NON_BLOCK default true, parse non-block tags inside HTML blocks
- Parser.HTML_BLOCK_DEEP_PARSE_BLANK_LINE_INTERRUPTS default true, when true blank line interrupts HTML block when not in raw tag, otherwise only when closed
- Parser.HTML_BLOCK_DEEP_PARSE_MARKDOWN_INTERRUPTS_CLOSED default false, when true other markdown elements can interrupt a closed HTML block without an intervening blank line
- Parser.HTML_BLOCK_DEEP_PARSE_BLANK_LINE_INTERRUPTS_PARTIAL_TAG default true, when true blank line interrupts partially open tag, i.e. <TAG without a corresponding > and having a blank line before >
-
Fix: ParserEmulationProfile.PEGDOWN now will do deep HTML parsing by default with greater pegdown compatibility.
-
API Change: add open tag tracking to HtmlFormattingAppendable to allow generating correct HTML when generating src line position span wrapping of individual paragraph lines:
- public Stack<String> getOpenTags();
- public List<String> getOpenTagsAfterLast(CharSequence latestTag);
- public void tagOpened(CharSequence tagName);
- public void tagClosed(CharSequence tagName);
-
Add: Parser.CODE_SOFT_LINE_BREAKS to generate SoftLineBreak nodes inside Code nodes. Needed when generating line based src pos and inline code spans lines.
-
Fix: src line position information rendering now properly closes and re-opens any inline tags spanning lines of source.
-
Fix: when src line position option is enabled, inline tags no longer contain source position since they are wrapped in <span> tags with line source position.
-
Fix: add src position information to superscript, insert nodes.
-
Fix: #120, Fixed Indent mode always has the last list item as tight. Fixed Spaces parsing ignores blank line after last list item and does not check if previous item is loose either. Making it impossible to make the last item loose.
Added new Parser.LISTS_LOOSE_WHEN_LAST_ITEM_PREV_HAS_TRAILING_BLANK_LINE, default false. When true the last item in a list will be loose if previous item has a trailing blank line. ParserEmulationProfile.FIXED_INDENT to allow last item to be loose.
- API Change: #117, Add target attribute support to ResolvedLink and updates to the associated core renderer code. Now LinkResolver can also set any attribute of the link via ResolvedLink.getAttributes() and ResolvedLink.getNonNullAttributes() methods and manipulate the desired attributes of the link.
- Fix: #112, Potential bug: Node.getChildOfType always returns null unless parent is an instance of the desired class. Affected some list parsing in Old Gfm compatibility, which were bugs.
-
Fix: #109, Image Ref missing title tag in rendered HTML, this bug also affects title tag of link ref elements.
-
Fix: assertion error in ImageRef rendering.
-
Add: Parser.SPACE_IN_LINK_ELEMENTS, default false, to allow whitespace between ![] or [] and () of links or images.
-
Add: Parser.SPACE_IN_LINK_ELEMENTS setting to true when pegdown profile is selected.
- Add: LinkType.LINK_REF, LinkType.IMAGE_REF and ResolvedLink.getTitle() to allow link resolvers to specify/modify link title and also to allow link resolvers to provide url and title for unresolved LinkRef and ImageRef nodes.
-
Fix: index out of bounds exception in flexmark-ext-escaped-character processing elements embedded in other elements with prefix text removed.
-
Fix: flexmark-ext-escaped-character erroneously processing fenced code content
-
Fix: change ReplacedTextMapper original range to base range and add real original offset range. Otherwise processing SegmentedSequence would be wrong and cause index out of bounds exceptions.
-
Fix: flexmark-ext-autolink erroneously processing fenced code content
- Fix: Formatter lost SimTocBlock element line spacer and other blank line formatting in the sim toc content
- Fix: #97, Use UnsupportedOperationException
-
Fix: #93, Adding TextCollectingVisitor class handlers
-
Change: add Parser.FENCED_CODE_BLOCK_PARSER as replacement for Parser.CODE_CONTENT_BLOCK with the latter marked as deprecated.
-
API Change: add child Text nodes to Code and FencedCodeBlock that contain the text of the element. NOTE: text node is only added to FencedCodeBlock if Parser.FENCED_CODE_CONTENT_BLOCK is false, otherwise CodeContent block is used.
- Add: #86, how can use definition extension correctly ?, definition item spaced two or more blank lines from previous element when DefinitionExtension.DOUBLE_BLANK_LINE_BREAKS_LIST is true to not parse as definition item.
-
Fix: #86, how can use definition extension correctly ?
-
Add: DefinitionExtension.DOUBLE_BLANK_LINE_BREAKS_LIST, default false. When true double blank line between definition item and next definition term will break a definition list.
- Fix: #91, DocumentParser Exception on Empty Documents
- Add: vararg argument to PegdownOptionsAdapter.getFlexmarkOptions(Extension[]) and PegdownOptionsAdapter.flexmarkOptions(int, Extension[]) for additional extensions to add.
- Add: helper functions to Parser.addExtensions(MutableDataHolder, Extension[]) and Parser.removeExtensions(MutableDataHolder, Extension[])
- Fix: IndexOutOfBoundsException on paragraph pre-processing under some conditions
- Fix: #83, Question: is there any way to keep every html entities as-is?, add HtmlRenderer.UNESCAPE_HTML_ENTITIES, default true. Set to false to leave HTML entities as is in the rendered HTML.
- Add: HtmlRenderer.HTML_BLOCK_OPEN_TAG_EOL and HtmlRenderer.HTML_BLOCK_CLOSE_TAG_EOL, default true. When false will suppress EOL before/after HTML block tags which are automatically generated during html rendering. Used for doxia compatibility.
-
Add: #78, Space in link, option Parser.SPACE_IN_LINK_URLS, default false. When enabled will allow spaces in link address as long as they are not followed by a ", used to start the title of the link. Link address wrapped in <> will include any trailing spaces.
Auto-links, wrapped or not in <>, cannot have spaces. These require URL encoding the space with %20.
-
Fix: #76, HTML to Markdown hangs if comments included in Text nodes
-
Add: MS-Word generated HTML list basic recognition: 1., 1), A., A), a., a), IV., IV), iv., iv)
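The marker shapes listed above can be matched with one regular expression. This is a hypothetical sketch of the recognition, not the converter's actual pattern. It accepts digits, roman numerals and single letters followed by `.` or `)`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of recognizing MS-Word style list markers in item text.
final class WordListMarkerSketch {
    // number, upper or lower roman numeral, or a single letter, then '.' or ')' and whitespace
    private static final Pattern MARKER =
            Pattern.compile("^\\s*(\\d+|[IVXLCDM]+|[ivxlcdm]+|[A-Za-z])([.)])\\s+");

    /** Returns "." or ")" when the text starts with a list marker, otherwise null. */
    static String delimiterOf(String itemText) {
        Matcher m = MARKER.matcher(itemText);
        return m.find() ? m.group(2) : null;
    }

    /** Removes a leading list marker, if any. */
    static String stripMarker(String itemText) {
        return MARKER.matcher(itemText).replaceFirst("");
    }
}
```

The sketch cannot tell a roman numeral from a single letter (`I.` matches both), so a real converter would need list context to decide the numbering style.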
-
Fix: MS-Excel generated HTML table parsing bug
-
Add: FlexmarkHtmlParser.RENDER_COMMENTS, default false. When set to true HTML comments will be rendered in the Markdown.
-
Add: FlexmarkHtmlParser.DOT_ONLY_NUMERIC_LISTS, default true. When set to false closing parenthesis as a list delimiter will be used in Markdown if present in MS-Word style list. Otherwise parenthesis delimited list will be converted to dot.
-
Add: DoNotLinkDecorate interface to distinguish text decoration from link text decoration for flexibility.
-
Fix: TypographicQuotes and TypographicSmarts to not implement DoNotDecorate interface so AbbreviationExtension will process text in quotes.
-
Fix: replace regex used for extracting HTML comments from HTML blocks with a manual search. The regex would go into an infinite loop on MS Word created HTML.
-
Fix: TocExtension and SimTocExtension to set HtmlRenderer.GENERATE_HEADER_ID to true and HtmlRenderer.RENDER_HEADER_ID to true if they are not already explicitly set.
-
Fix: HTML to Markdown converter to add a space after empty list items
-
Fix: #75, Incorrect footnote link. \r\n sequence was not properly recognized in Parsing patterns used by the parser, causing parsing discrepancies when EOL was not \n in many elements.
-
Fix: HTML to Markdown converter to not ignore text in lists which is not included in a list item but instead to put this text into a new list item.
-
API Change: add DelimiterProcessor.canBeOpener(boolean, boolean, boolean, boolean, boolean, boolean) and DelimiterProcessor.canBeCloser(boolean, boolean, boolean, boolean, boolean, boolean) to allow customization of when a delimiter can be an opener or closer. Default CommonMark does not work for << and >> which can open and close anywhere.
- Fix: #73, Can't nest code blocks in ordered list. GitHub Doc compatibility error in parsing fenced code nested in list items.
-
Fix: remove unused flexmark-jira-parser module from repo.
-
Fix: #72, Multiple angle quotes not being handled correctly
- Fix: #70, parse failed for angle quotes if the end angle quote follows with a line feed or a carriage return. Actual error was in flexmark-ext-typographic in QuoteDelimiterProcessorBase not checking for sequence length.
-
Add: #69, Add options to allow changing the rendering of bold, italic and other styles with different tags.
The following String options are defined that will override the wrapper used for the HTML of styles.
⚠️ both open and close values must be set to non-null values for the setting to take effect:
- HtmlRenderer.STRONG_EMPHASIS_STYLE_HTML_OPEN
- HtmlRenderer.STRONG_EMPHASIS_STYLE_HTML_CLOSE
- HtmlRenderer.EMPHASIS_STYLE_HTML_OPEN
- HtmlRenderer.EMPHASIS_STYLE_HTML_CLOSE
- HtmlRenderer.CODE_STYLE_HTML_OPEN
- HtmlRenderer.CODE_STYLE_HTML_CLOSE
- InsExtension.INS_STYLE_HTML_OPEN
- InsExtension.INS_STYLE_HTML_CLOSE
- StrikethroughSubscriptExtension.STRIKETHROUGH_STYLE_HTML_OPEN
- StrikethroughSubscriptExtension.STRIKETHROUGH_STYLE_HTML_CLOSE
- StrikethroughSubscriptExtension.SUBSCRIPT_STYLE_HTML_OPEN
- StrikethroughSubscriptExtension.SUBSCRIPT_STYLE_HTML_CLOSE
- StrikethroughExtension.STRIKETHROUGH_STYLE_HTML_OPEN
- StrikethroughExtension.STRIKETHROUGH_STYLE_HTML_CLOSE
- SubscriptExtension.SUBSCRIPT_STYLE_HTML_OPEN
- SubscriptExtension.SUBSCRIPT_STYLE_HTML_CLOSE
- SuperscriptExtension.SUPERSCRIPT_STYLE_HTML_OPEN
- SuperscriptExtension.SUPERSCRIPT_STYLE_HTML_CLOSE
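The "both values must be non-null" rule can be sketched as a small selection function. The class and parameter names below are hypothetical and the code is only an illustration of the rule, not the renderer's implementation.

```java
// Hypothetical sketch: custom open/close wrappers apply only when both are set.
final class StyleWrapperSketch {
    static String wrap(String text, String defaultTag, String customOpen, String customClose) {
        boolean useCustom = customOpen != null && customClose != null;
        String open = useCustom ? customOpen : "<" + defaultTag + ">";
        String close = useCustom ? customClose : "</" + defaultTag + ">";
        return open + text + close; // text is assumed to be already escaped
    }
}
```

For example, `wrap("x", "strong", "<b>", null)` falls back to `<strong>x</strong>` because only one of the two custom values is set.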
- Change: language level 6 for source and byte code in attempt to get compatibility for Doxia. Code still requires JDK7 for BitSet functionality and org.nibor.autolink is JDK 1.7 bytecode level. So no success but the source is JDK 1.6.
- Add: Parser.CODE_CONTENT_BLOCK, default false. If set to true will create an AST CodeBlock node as a child node of FencedCodeBlock and IndentedCodeBlock. Allows custom rendering of CodeBlock content code while leaving rendering of fenced code and indented code blocks standard.
- Change: PhasedNodeRenderer.renderDocument(NodeRendererContext, HtmlWriter, Document, RenderingPhase) is now also called for RenderingPhase.BODY but no rendering should be done for this phase. All rendering for the body phase should be done through regular node renderer API.
- Fix: #67, Formatter with GitHubDoc emulation indented code of list items not indented enough
- Fix: #65, Add some comments to Extension, Parser.ParserExtension, Parser.ReferenceHoldingExtension and HtmlRenderer.HtmlRendererExtension
- API Change: To fix #66 it was necessary to add more parameters to BlockParser.canContain(ParserState, BlockParser, Block) to allow for more testing whether a fenced code block can be contained by a list item.
- Fix: #66, GitHub Doc profile incorrect parsing of following markdown
- Fix: #62, Autolinks extension for http:// and https:// links includes trailing spaces
- Fix: #60, Kramdown parser discrepancy for mismatched ordered/unordered list items. Now if both LISTS_ITEM_TYPE_MISMATCH_TO_NEW_LIST and LISTS_ITEM_TYPE_MISMATCH_TO_SUB_LIST are set to true then a new list will be created if the item had a blank line, otherwise a sub-list is created.
- Change: Kramdown profile to set LISTS_ITEM_TYPE_MISMATCH_TO_NEW_LIST so that parsing of mismatched item type starts a new list.
- Add: default class option for fenced code and indented code HtmlRenderer.FENCED_CODE_NO_LANGUAGE_CLASS which can be used to disable highlighting if info not specified for fenced code or in indented code.
- Change: flexmark-parent renamed to flexmark-all
-
Fix: default gfm task list item to include spacer after input check box and add readonly attribute
-
Add: flexmark-parent artifact with classifier lib to create a jar with all core, extension and conversion modules
- Add: PDF converter extension Usage: PDF Output
-
Fix: Implement TypographicExtension to convert smarts and quotes:
- ' to apostrophe &rsquo; ’
- ... and . . . to ellipsis &hellip; …
- -- to en dash &ndash; –
- --- to em dash &mdash; —
- single quoted 'some text' to &lsquo;some text&rsquo; ‘some text’
- double quoted "some text" to &ldquo;some text&rdquo; “some text”
- double angle quoted <<some text>> to &laquo;some text&raquo; «some text»
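The conversions above can be approximated with ordered string replacements. This is a sketch only: the extension itself works on parsed inline nodes, while this hypothetical class works on plain text and ignores code spans and escapes. The order matters, since `---` must be replaced before `--`.

```java
// Hypothetical plain-text approximation of the smarts and quotes conversions.
final class TypographicSketch {
    static String smarts(String text) {
        return text
                .replace(". . .", "\u2026")                              // spaced ellipsis
                .replace("...", "\u2026")                                // ellipsis
                .replace("---", "\u2014")                                // em dash, before en dash
                .replace("--", "\u2013")                                 // en dash
                .replaceAll("<<(.+?)>>", "\u00AB$1\u00BB")               // double angle quotes
                .replaceAll("\"(.+?)\"", "\u201C$1\u201D")               // double quotes
                .replaceAll("(?<!\\w)'(.+?)'(?!\\w)", "\u2018$1\u2019")  // single quotes
                .replace("'", "\u2019");                                 // remaining apostrophes
    }
}
```

The lookarounds in the single-quote pattern keep an apostrophe inside a word, as in `it's`, from being treated as an opening quote.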
-
API Change: DelimiterProcessor.unmatchedDelimiterNode(InlineParser, DelimiterRun) method added to allow substituting unmatched delimiter text by another node.
-
Fix: #56, Pegdown HARDWRAPS option adds an additional line break when using trailing spaces
-
Add: test for pegdown HARDWRAPS extension.
- Fix: #55, Indented Link Reference Definitions not parsed correctly
- Fix: #54 when two spaces followed by \r\n would not parse as HARD break.
-
Fix: #54, 2 spaces at end of line is not recognized as a newline if the end of line has \r\n
-
Fix: related to #54, \r\n should not include \r as part of node's text
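The three #54 fixes come down to stripping the EOL, whether `\n` or `\r\n`, before looking at trailing spaces. The sketch below illustrates that with hypothetical helper names. It is not the parser's code and ignores the backslash form of hard breaks.

```java
// Hypothetical sketch: hard break detection that treats "\r\n" and "\n" alike.
final class HardBreakSketch {
    /** Returns the line without its EOL, so "\r" never becomes part of the text. */
    static String stripEol(String line) {
        int end = line.length();
        if (end > 0 && line.charAt(end - 1) == '\n') end--;
        if (end > 0 && line.charAt(end - 1) == '\r') end--;
        return line.substring(0, end);
    }

    /** True when a non-blank line ends with two or more spaces before its EOL. */
    static boolean endsWithHardBreak(String line) {
        String text = stripEol(line);
        return !text.trim().isEmpty() && text.endsWith("  ");
    }
}
```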
-
Add: options to HTML to Markdown to handle auto link conversion:
- FlexmarkHtmlParser.EXTRACT_AUTO_LINKS default true, to convert to auto links those links which do not contain a title attribute or have one which is empty, and whose text and href are equal
- FlexmarkHtmlParser.WRAP_AUTO_LINKS default false, to wrap auto links in <>
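The auto link rule can be sketched as a single decision. The method and parameter names are hypothetical, and URL and text escaping are left out. The two boolean parameters stand in for the EXTRACT_AUTO_LINKS and WRAP_AUTO_LINKS options.

```java
// Hypothetical sketch of the auto link decision in HTML to Markdown conversion.
final class AutoLinkSketch {
    static String toMarkdown(String href, String text, String title,
                             boolean extractAutoLinks, boolean wrapAutoLinks) {
        boolean noTitle = title == null || title.isEmpty();
        if (extractAutoLinks && noTitle && href.equals(text)) {
            // link text repeats the URL: emit an auto link, optionally wrapped in <>
            return wrapAutoLinks ? "<" + href + ">" : href;
        }
        return noTitle
                ? "[" + text + "](" + href + ")"
                : "[" + text + "](" + href + " \"" + title + "\")";
    }
}
```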
-
Add: options to HTML to Markdown to handle smarts and quotes conversion:
- &ldquo;, “, &rdquo;, ” to "
- &lsquo;, ‘, &rsquo;, ’, &apos; to '
- &laquo;, « to <<
- &raquo;, » to >>
- &hellip;, … to ...
- &endash;, – to --
- &emdash;, — to ---
- Fix: #53, HTML to Markdown converter does handle pre wrapped text properly
- Add: table formatting options for column width adjustment and alignment application
-
Add: Formatter.BLOCK_QUOTE_BLANK_LINES default true, to wrap block quotes in blank lines
-
Change: defaults for:
- BLOCK_QUOTE_MARKERS to BlockQuoteMarker.ADD_COMPACT_WITH_SPACE
- INDENTED_CODE_MINIMIZE_INDENT to true
- FENCED_CODE_MINIMIZE_INDENT to true
- FENCED_CODE_MATCH_CLOSING_MARKER to true
-
Add: sample for formatter use.
-
Add: #50, Add Email obfuscation?, using pegdown profile e-mail obfuscation is on by default. For CommonMark and other processors you will need to set HtmlRenderer.OBFUSCATE_EMAIL to true. If you need repeatability for testing then set HtmlRenderer.OBFUSCATE_EMAIL_RANDOM to false.
-
Add: #47, Add option to have BlankLine nodes in the AST. Option Parser.BLANK_LINES_IN_AST results in BlankLine nodes being put into the AST for every blank line in the file.
⚠️ A blank line in terms of Markdown syntax is not necessarily a blank line in the file. Lines prefixed with > that are otherwise empty are blank lines inside the block quote and will be in the AST as blank lines.
-
Fix: #48, When Parser.HEADING_NO_ATX_SPACE is enabled trailing ### should not require a space
-
Add: #49, Add flexmark-formatter module to render AST as markdown with formatting options, and Formatting API that can be used by extensions to customize formatting markdown source of custom elements. This module implements formatting of core nodes, all unknown nodes are passed through as is. See Markdown Formatter for options.
- Formatter.FormatterExtension implementation to all modules where it makes sense to format custom elements.
-
Add: Extra extension flags to com.vladsch.flexmark.profiles.pegdown.Extensions to allow easy configuration of extensions that don't exist in pegdown:
- Extensions.SUBSCRIPT
- Extensions.EXTANCHORLINKS_WRAP
- Extensions.FOOTNOTES
- Extensions.TOC
- Extensions.MULTI_LINE_IMAGE_URLS
- Extensions.SUPERSCRIPT
- Extensions.INSERTED
-
Fix: HTML to Markdown
- skipped list items under some conditions.
- style markers now wrapping full text instead of breaking it up into text sections
- make sure there is no space before trailing marker or after leading marker and that there are spaces surrounding the style markers
- Change: HTML to Markdown conversion
- blank line before headings.
- a few elements to be unwrapped on conversion and some to wrap but as block elements so they could be easily cleaned in resulting markdown.
- text node handling to trim text and escape \
- suppress empty headings
- task list item when first content of list item is input check box
- options for output control, pass via DataHolder taking parse() method:
  - FlexmarkHtmlParser.LIST_CONTENT_INDENT, default true, continuation lines of list items and definitions indent to content column otherwise 4 spaces
  - FlexmarkHtmlParser.SETEXT_HEADINGS, default true, if true then use Setext headings for h1 and h2
  - FlexmarkHtmlParser.OUTPUT_UNKNOWN_TAGS, default false, when true unprocessed tags will be output, otherwise they are ignored
  - FlexmarkHtmlParser.ORDERED_LIST_DELIMITER, default '.', delimiter for ordered items
  - FlexmarkHtmlParser.UNORDERED_LIST_DELIMITER, default '*', delimiter for unordered list items
  - FlexmarkHtmlParser.DEFINITION_MARKER_SPACES, default 3, min spaces after : for definitions
  - FlexmarkHtmlParser.MIN_TABLE_SEPARATOR_COLUMN_WIDTH, default 1, min 1, minimum number of - in separator column, excluding alignment colons :
  - FlexmarkHtmlParser.MIN_TABLE_SEPARATOR_DASHES, default 3, min 3, minimum separator column width, including alignment colons :
  - FlexmarkHtmlParser.CODE_INDENT, default 4 spaces, indent to use for indented code
  - FlexmarkHtmlParser.EOL_IN_TITLE_ATTRIBUTE, default " ", string to use in place of EOL in image and link title attribute.
  - FlexmarkHtmlParser.NBSP_TEXT, default " ", string to use in place of non-break-space
  - FlexmarkHtmlParser.THEMATIC_BREAK, default "*** ** * ** ***", <hr> replacement
- Change: FormattingAppendableImpl.flush() no longer forces an EOL, only allows it if it is already pending. Also FormattingAppendableImpl.flush(int) and FormattingAppendableImpl.getText(int) allow -1 in which case they will suppress trailing EOL.
- Change: HtmlRenderer.render() methods now add a trailing EOL to resulting HTML if it is missing. Previously this was done by FormattingAppendableImpl.flush()
-
Fix: FlexmarkHtmlParser
- infinite loop when handling unknown/unhandled HTML tags.
- recognizing emoji shortcuts for GitHub emoji, GitHub emoji URL images and images using EmojiCheatSheet file names.
- convert <br> to hard break in paragraphs
- th tags in tbody tr now create a thead row instead of tbody row
-
Update: Emoji Extension to latest Emoji Cheat Sheet shortcuts
-
Fix: XWiki Macro.getAttributes() to use LinkedHashMap to preserve attribute order.
-
Add: flexmark-html-parser module to convert HTML to markdown, uses jsoup for HTML parsing.
Converts HTML to Markdown, assumes all non-application specific extensions are available:
- abbreviations
- aside
- block quotes
- bold, italic, inline code
- bullet and numbered lists
- definition
- emoji shortcuts
- fenced code
- strike through
- subscript
- superscript
- tables
- will also handle conversion for multi-line URL images
- Fix: #44, flexmark-profile-pegdown maven, pegdown profile module was misnamed in pom, no errors but was not deploying.
- Fix: #43, Out of date documentation for Pegdown Migration Helper, caused by flexmark-profile-pegdown missing from project dependencies and not being updated.
- Fix: #42, Index out of bounds when XWiki macro match is rejected because of context mismatch
-
Fix: HtmlRenderer.Builder and Parser.Builder constructor which takes an instance of other plus DataHolder to reload all extensions to allow extensions to register different processors based on options. Previously these copied all registered components plus new ones, not allowing components to be not-registered based on options.
-
Add: XWiki Macro members to facilitate usage of these nodes in XWiki processing:
- MacroBlock
  - Map<String, String> getAttributes(): map of attributes
  - Macro getMacroNode(): returns the Macro first child node
  - boolean isClosedTag(): true if macro has {{macro /}} form
  - BasedSequence getMacroContentChars(): raw text of macro content
- Macro
  - Map<String, String> getAttributes(): map of attributes
  - boolean isClosedTag(): true if macro has {{macro /}} form
  - boolean isBlockMacro(): true if macro is the first child of MacroBlock
  - BasedSequence getMacroContentChars(): raw text of macro content
-
Add: XWiki options MacroExtension.ENABLE_BLOCK_MACROS, MacroExtension.ENABLE_INLINE_MACROS and MacroExtension.ENABLE_RENDERING. If block macros are disabled then all macros will be inline macros.
-
Fix: jekyll tags not parsed if two spaces included after tag and before the closing marker
-
Add: jekyll tags option JekyllTagExtension.INCLUDED_HTML, a map of include parameter strings to string of HTML content to replace the include tag: {% include file %}
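The include replacement can be pictured as a lookup keyed by the tag parameter. The sketch below works on plain text with a regular expression, whereas the extension works on parsed tag nodes. The class name and the choice to leave unknown includes untouched are assumptions.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: replace {% include param %} tags using a parameter-to-HTML map.
final class JekyllIncludeSketch {
    private static final Pattern INCLUDE = Pattern.compile("\\{%\\s*include\\s+(.+?)\\s*%\\}");

    static String expand(String text, Map<String, String> includedHtml) {
        Matcher m = INCLUDE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String html = includedHtml.get(m.group(1));
            // Assumption: tags without a map entry are left as they are.
            m.appendReplacement(out, Matcher.quoteReplacement(html != null ? html : m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```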
-
Add: HtmlRenderer.RECHECK_UNDEFINED_REFERENCES to check link and image refs which are not defined again, used by Parser.transferReferences(Document, Document) to let extensions know they need to try resolving references.
-
Add: Parser.ReferenceHoldingExtension to be implemented by extensions that create node repositories so that these can be copied from included documents to including documents.
-
Add: Parser.transferReferences(Document, Document) for use by library users and static Parser.transferReferences(NodeRepository<T> destination, NodeRepository<T> included, boolean ifUndefined) for use by extensions to implement Parser.ReferenceHoldingExtension
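The three-argument form suggests a copy that can be limited to keys the destination does not define yet. The sketch below models repositories as plain maps. The meaning given to ifUndefined and the boolean return value are assumptions read from the parameter names, not confirmed library behaviour.

```java
import java.util.Map;

// Hypothetical sketch of copying references from an included document's repository.
final class TransferReferencesSketch {
    /** Returns true when at least one entry was copied into destination. */
    static <T> boolean transfer(Map<String, T> destination, Map<String, T> included, boolean ifUndefined) {
        boolean transferred = false;
        for (Map.Entry<String, T> entry : included.entrySet()) {
            // Assumption: with ifUndefined set, existing definitions in destination win.
            if (!ifUndefined || !destination.containsKey(entry.getKey())) {
                destination.put(entry.getKey(), entry.getValue());
                transferred = true;
            }
        }
        return transferred;
    }
}
```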
-
Add: Reference, Footnote and Abbreviation reference copying
-
Add: Jekyll Tags extension to parse tags of the form:
{% tag params %}
, with options JekyllTagExtension.ENABLE_BLOCK_TAGS and JekyllTagExtension.ENABLE_INLINE_TAGS to disable individual types of parsers. -
Fix: remove unused xwiki macro options
-
Fix: #40, XWiki Ext Block Macros should not interrupt a paragraph
-
Add: test for #40
-
Add:
InlineParserExtensionFactory
andInlineParserExtension
to allow custom inline parsing elements. -
Add: inline xwiki macro processing
-
Change: default node renderer for xwiki macros now renders the macro as text so that if node rendering is not done by the application then output will be equivalent to text for unhandled macros.
-
Add:
Parser.HEADING_NO_EMPTY_HEADING_WITHOUT_SPACE
option. When true and Parser.HEADING_NO_ATX_SPACE is false, a bare # is not interpreted as a heading unless it is followed by at least one space or tab character. -
Fix: treat spaces and tabs for purpose of heading parsing as equivalent
-
Add:
flexmark-ext-xwiki-macros
xwiki application specific macro extension, for now only the block macros are implemented. Inline are treated as text.
- Remove:
ReversedCharSequence and IndexMapper, because these were moved to the reverse-regex-util library, which implements reverse search using regex.
- Fix: GitHub profile to correctly parse deeply nested lists aligned on content indent, at the expense of matching poorly formatted markdown. The GitHub Doc profile still differs from GitHub parsing in many edge cases because GitHub has very unorthodox and irregular parsing rules that I have not been able to completely figure out.
-
Fix: #38, AutoLink extension does not recognize links with ampersand in the link
-
Add: Footnote extension to use
Parser.LISTS_ITEM_INDENT
for determining which elements make up part of the footnote. -
Fix:
PegdownOptionsAdapter.flexmarkOptions(int)
was not passing pegdown options to the constructor, so the options always defaulted to 0. -
Fix:
PegdownOptionsAdapter
did not properly set some list options for pegdown compatibility. -
Fix: removed some flags from
org.pegdown.Extensions
that were only added in my fork of pegdown and never made it into a released version:
Extensions.EXTANCHORLINKS_WRAP
Extensions.FOOTNOTES
Extensions.INTELLIJ_DUMMY_IDENTIFIER
Extensions.MULTI_LINE_IMAGE_URLS
Extensions.RELAXED_STRONG_EMPHASIS_RULES
Extensions.TOC
Extensions.TRACE_PARSER
-
Add: Tests to
flexmark-profile-pegdown
based on actual pegdown HTML rendering of markdown for compatibility testing. Some pegdown idiosyncrasies/bugs not replicated. -
API Change:
Parser.LISTS_LOOSE_ON_PREV_LOOSE_ITEM
renamed to Parser.LISTS_LOOSE_WHEN_PREV_HAS_TRAILING_BLANK_LINE, which accurately reflects the setting. -
API Change:
Parser.LISTS_LOOSE_WHEN_BLANK_FOLLOWS_ITEM_PARAGRAPH
renamed to Parser.LISTS_LOOSE_WHEN_BLANK_LINE_FOLLOWS_ITEM_PARAGRAPH, which accurately reflects the setting. -
API Change:
Parser.LISTS_LOOSE_WHEN_HAS_NON_LIST_CHILDREN
to be able to emulate GitHub Doc, MultiMarkdown and pegdown with the Extensions.FORCELISTITEMPARA option. -
API Change: add block quote options
-
Parser.BLOCK_QUOTE_ALLOW_LEADING_SPACE
, default true, when false block quote with leading spaces will be ignored, i.e. not parsed as a block quote -
Parser.BLOCK_QUOTE_INTERRUPTS_PARAGRAPH
, default true, when false block quote will not interrupt a paragraph, requiring a blank line before all block quotes -
Parser.BLOCK_QUOTE_INTERRUPTS_ITEM_PARAGRAPH
, default true, when false block quote will not interrupt an item paragraph, requiring a blank line between list item text and the block quote. -
Parser.BLOCK_QUOTE_WITH_LEAD_SPACES_INTERRUPTS_ITEM_PARAGRAPH
, default true, when false block quote will be ignored if it has leading space before the marker and there is no blank line between the item text and the block quote.
-
-
API Change: add heading parsing option
Parser.HEADING_CAN_INTERRUPT_ITEM_PARAGRAPH
, default true, when false a heading in a list item paragraph is ignored. Currently not used, but it can be applied for a partial GitHub compatibility implementation: GitHub parses ATX headings in list items only if the list contains a blank line, which is too much of a kludge to replicate. -
API Change: rename
ParserEmulationFamily
to ParserEmulationProfile, make it implement MutableDataSetter, and add the ParserEmulationProfile.family field to get the emulation family. Now there are predefined profiles for:
COMMONMARK, family COMMONMARK
FIXED_INDENT, family FIXED_INDENT
KRAMDOWN, family KRAMDOWN
MARKDOWN, family MARKDOWN
GITHUB_DOC, family MARKDOWN
MULTI_MARKDOWN, family FIXED_INDENT
PEGDOWN, family FIXED_INDENT
Usage is simplified to:
private static final DataHolder OPTIONS = new MutableDataSet()
.setFrom(ParserEmulationProfile.MULTI_MARKDOWN);
-
Add: More tests to all non-CommonMark compatibility tests, block quote handling, block quote in list item handling, etc
-
Add: GitHub document compatibility tests. GitHub Doc is a mix of MARKDOWN and some KRAMDOWN list parsing qualities; its family was changed to MARKDOWN since it is closer to it. Some edge cases are not replicated because there are too many kludges in its parsing rules.
-
Fix: #36, Gfm Task Items incorrectly converts deeply indented content as indented code when non-commonmark family
-
Fix: #34, Add option to Wiki link extension to escape the pipe separating text and link
-
Add:
WikiLinkExtension.ALLOW_ANCHORS
, default false, to parse the link for anchor refs. If the link is text and page ref combined with an anchor ref, then the text will not include the trailing anchor marker # or the anchor ref. Setting this key to true gives a somewhat backwards compatible mode: it parses and sets the node's anchorRef field, but also removes the anchor marker and ref from the node's text when text and page ref are combined. -
Add:
WikiLinkExtension.ALLOW_ANCHOR_ESCAPE
, default false, to allow \ escapes for anchor markers. -
Add:
WikiLinkExtension.ALLOW_PIPE_ESCAPE
, default false, to allow \ escapes for the pipe | that separates text from link. -
Add:
WikiLinkNodeRenderer
to unescape the link for wiki and image links before passing it to be resolved. Previously the link was not unescaped, causing backslashes \ to appear in the HTML link address.
-
-
Fix: #31, empty Gfm-Task list item nodes character span does not include the task marker
-
Fix: #32, Thematic Break AST node includes the full line when embedded in other elements
- Fix: definitions would lose consecutive definition items without intervening definition terms.
-
Fix: abbreviation extension for JDK7 needed to sort regex list of abbreviations in descending alphabetical order.
-
Fix: travis YAML to JDK7
-
Fix: Emoji extension to not allow spaces between delimiters and add API methods to support delimiters converting themselves to text if contained text does not meet constraints.
-
Add: method
Delimiter.convertDelimitersToText(int, Delimiter)
to remove used delimiters and convert them to text nodes instead of the expected delimited text node. For use when a delimiter processor determines that the delimited text is not valid for its wrapping. -
Change:
LinkRefProcessor.adjustInlineText(Document, Node)
to allow access to options; it now returns a BasedSequence of the characters to keep as part of its contained text children. -
Add:
LinkRefProcessor.allowDelimiters(BasedSequence, Document, Node)
to have delimiter processing disabled for some regions within the contained text -
Add:
LinkRefProcessor.updateNodeElements(Document, Node)
to allow the processor to update some node elements after all delimiter processing for contained text has completed. For example, this allows WikiLink nodes with WikiLinkExtension.ALLOW_INLINES enabled to adjust the link field so that it has no inline markers. -
Add:
WikiLinkExtension.ALLOW_INLINES
, default false, to allow delimiter processing in the text part when | is used, or in the contained text when text and page ref are combined. -
Change: wiki links now split text from link using
| based on the WikiLinkExtension.LINK_FIRST_SYNTAX setting. If link is first then the last | is used to split off the text portion. If text is first then the first | is used. This ensures that there are no | in the text, allowing | in links to be handled by the LinkResolver
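The splitting rule described above can be sketched in a few lines of plain Java. This is only an illustration of the rule, not the extension's code: the class `WikiLinkSplitSketch`, the method `split` and the `linkFirst` flag are hypothetical names standing in for the `WikiLinkExtension.LINK_FIRST_SYNTAX` setting.

```java
final class WikiLinkSplitSketch {
    /**
     * Splits the content found between [[ and ]] into {text, link}.
     * linkFirst mirrors the LINK_FIRST_SYNTAX idea: when true the content is
     * "link|text" and the LAST pipe is the separator; when false the content
     * is "text|link" and the FIRST pipe is the separator.
     */
    static String[] split(String content, boolean linkFirst) {
        int pos = linkFirst ? content.lastIndexOf('|') : content.indexOf('|');
        if (pos < 0) {
            // No separator: the same string serves as both text and link.
            return new String[] { content, content };
        }
        String before = content.substring(0, pos);
        String after = content.substring(pos + 1);
        return linkFirst
                ? new String[] { after, before }
                : new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] a = split("page|a|label", true);
        System.out.println("text=" + a[0] + " link=" + a[1]);
        String[] b = split("label|page|a", false);
        System.out.println("text=" + b[0] + " link=" + b[1]);
    }
}
```

In both orientations the text part ends up free of pipes, while any remaining pipes stay in the link part for the link resolver to interpret.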
-
Change: Wiki link nodes now pass the whole link text to link resolver, previously only the
pageRef()
portion was passed and the anchorRef()
was appended to the resolved link. Now it is the link resolver's responsibility to handle this extraction and attachment of anchor refs. -
API Change: add
ParserExtension.parserOptions(MutableDataHolder)
method that is called on all extensions before calling theParserExtension.extend(Builder)
so that all extensions get a chance to update options before any are loaded. -
API Change: add
HtmlRendererExtension.rendererOptions(MutableDataHolder)
method that is called on all extensions before calling theHtmlRendererExtension.extend(Builder, String)
so that all extensions get a chance to update options before any are loaded. -
API Change:
Node.childChars()
renamed toNode.getChildChars()
for consistency. -
Add: list parser options
Parser.LISTS_ITEM_MARKER_SUFFIXES
defaultString[]{}
, to allow suffixes after item markers that will be treated as part of the marker not content for purposes of computing content offset. -
Add: list parser options
Parser.LISTS_NUMBERED_ITEM_MARKER_SUFFIXED
defaulttrue
, to disable list marker suffix processing for ordered list items. -
API Change: gfm-tasks extension now using list suffix options for proper parsing of task item sub-item and child paragraphs.
-
TaskListExtension.CONVERT_ORDERED_LIST_ITEMS
changed to Parser.LISTS_NUMBERED_ITEM_MARKER_SUFFIXED, same default of true but now a core parser option. -
TaskListItem.getTaskOpeningMarker() and TaskListItem.setTaskOpeningMarker(BasedSequence) are now part of ListItem, as ListItem.getMarkerSuffix() and TaskListItem.setMarkerSuffix(BasedSequence)
-
-
Fix: Definition items were included as child nodes of definition terms.
-
Fix: Definition list character range was not being updated
-
Fix: Definition list to respect the auto-loose setting for lists. If one definition item in the list is loose then all items will be loose, i.e. have their text wrapped in
<p></p>
-
Add:
WikiLinkExtension.IMAGE_LINKS
, default false, when true enables wiki images of the form ![[]] with an optional | used for separating the file ref from the alt text. Other options: -
WikiLinkExtension.IMAGE_FILE_EXTENSION
, default""
, to add file extension or suffix to file reference. -
WikiLinkExtension.IMAGE_PREFIX
, default""
, to add a prefix to file reference
-
-
Add: #24, DefinitionList extension doesn't seem to work, implemented definition lists as per PHP Markdown Extra.
-
API Change: paragraph rendering now can determine whether
<p>
wrapping is disabled by invoking its parent's ParagraphItemContainer.isParagraphWrappingDisabled(Paragraph, ListOptions, DataHolder)
method. This removes the need to know anything about the parent type other than that it implements ParagraphItemContainer
-
Change: make
NO_FILE_EOL
the default condition for tests. The last EOL will be stripped unless the previous line is blank. Also added FILE_EOL to reverse the condition. The default for tests is now NO_FILE_EOL, to force testing of input that is not terminated by an EOL. -
Fix: #28, Table caption support?, add
TablesExtension.WITH_CAPTION
, default true; when true, the table caption line is parsed, which is the line after the table with the format [...]
-
Change: methods that had
String
arguments toCharSequence
-
Change: add
BasedSequence.appendTo(StringBuilder)
and start/end variations to allow optimized appending to string builder for specific implementations.
-
Fix: #27, Abbreviation node not called when 2 abbreviations, was expecting \n even at end of file.
-
Add:
RenderingTestCase.NO_FILE_EOL
option NO_FILE_EOL
to test cases, which strips the last EOL of an example in a spec to simulate input without a trailing EOL. Otherwise every test case has a trailing EOL.
-
API Change: for
HtmlWriter
now using FormattingAppendableImpl
to handle output formatting with greater flexibility. This affects how NodeRenderers generate output by allowing more control over when line breaks should be suppressed or blank lines added.
Modifications to existing code, some from Java 7 downgrade:
-
HtmlWriter.getAppendCount()
changed to HtmlWriter.getModCount(); alternatively use HtmlWriter.getOffsetAfter() to get the actual character offset after an append, to find out whether something was appended in child rendering code. -
Now there is EOL and blank line tracking even when using
HtmlWriter.raw()
so trying to output an extra blank line via HtmlWriter.raw("\n") will no longer work. Use HtmlWriter.blankLine() instead. -
There are several
raw
output methods, each with slightly different behaviour, since normal output does processing you may not want for raw text. -
HtmlWriter.raw(String)
will output raw text without setting a pre-formatted region, which means accumulated spaces before the raw() call will be output. This is desired for inline HTML. -
HtmlWriter.rawPre(String)
will set a pre-formatted region and output text as is, without indents, but will output any accumulated spaces before the rawPre() call. This is desired for real <pre><code> output. -
HtmlWriter.rawIndentedPre(String)
will also set a pre-formatted region but will set a fixed indent at the current indentation level that will be prefixed to all the lines and not output any pending white space characters. This is desired for HTML block output.
-
-
HtmlWriter.tagIndent(String, Runnable)
will no longer output an end of line after the opening tag if no lines were appended between the opening and closing tags; in that case both tags will be on the same line. If you want the tags always split onto separate lines, use HtmlWriter.tagLineIndent(String, Runnable)
, which will force the closing tag on a new line, even if no lines were appended by child content. -
Rendered HTML by default will now have a maximum of 1 blank line, no matter how many are output with the data. This is controlled by the
HtmlWriter.flush(int)
call; HtmlRenderer.render(Node, Appendable, int) can also be used, which takes maxBlankLines as the last argument and controls the maximum number of trailing blank lines that will be appended.
Changes made:
-
\n in appended text is equivalent to
HtmlWriter.line()
call. It will only terminate a line of text, never create blank lines. For blank lines use HtmlWriter.blankLine() or HtmlWriter.blankLine(int)
-
spaces before and after \n are suppressed unless in
preFormatted
region. Before are not needed and after are controlled withHtmlWriter.indent()
andHtmlWriter.unIndent()
-
tabs outside of
preFormatted
region are converted to spaces and multiple spaces are collapsed to a single space, or eliminated if they come before or after \n. This allows generating clean HTML. -
all forms of blank lines not generated by calls to
HtmlWriter.blankLine()
orHtmlWriter.blankLine(int)
are suppressed in order to generate properly formatted output without extra blank lines. -
flush()
orflush(int)
methods can now be used to control number of trailing blank lines in the output. They also need to be called to make sure trailing EOL is output to the underlying appendable. -
conditional line() and indent() formatting is now available without having to use HtmlWriter.tag() with a runnable argument for the child element text generation: use the HtmlWriter.openConditional(ConditionalFormatter) and HtmlWriter.closeConditional(ConditionalFormatter) methods, which give the parent element the ability to change indent, add a new line or blank lines, or prefix any output before the child element's text. A similar ability is available on closing the conditional formatting region. -
Add:
HtmlWriter.blankLine()
andHtmlWriter.blankLine(int)
methods that will add a single blank line or a count of blank lines. Calling them multiple times without intervening output does not add more: they ensure blank lines where needed, but no more than the requested number. -
to preserve spaces and EOL in
HtmlWriter.raw()
now temporarily sets a pre-formatted region for the call. If this gets in the way and you want raw() text without HTML escapes but without pre-formatting, just use the append() methods passed through from FormattingAppendableImpl; they don't do any escaping but process all other formatting, unless in a pre-formatted region.
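The line and blank line rules listed above can be illustrated with a small stand-alone writer. This is a sketch under stated assumptions, not the `FormattingAppendableImpl` implementation: `BlankLineWriterSketch` and its fields are hypothetical, it ignores indentation, space collapsing and pre-formatted regions, it drops blank lines requested before any output, and `flush(int)` is meant to be the final call.

```java
final class BlankLineWriterSketch {
    private final StringBuilder out = new StringBuilder();
    private boolean lineOpen = false;   // characters written since the last line end
    private boolean pendingEol = false; // a line end that has not been written yet
    private int pendingBlankLines = 0;  // requested blank lines not yet written

    BlankLineWriterSketch append(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') {
                line(); // "\n" only terminates a line, it never creates a blank line
            } else {
                writePending(Integer.MAX_VALUE);
                out.append(c);
                lineOpen = true;
            }
        }
        return this;
    }

    BlankLineWriterSketch line() {
        if (lineOpen) {
            pendingEol = true;
            lineOpen = false;
        }
        return this;
    }

    BlankLineWriterSketch blankLine() {
        return blankLine(1);
    }

    BlankLineWriterSketch blankLine(int count) {
        line();
        // Repeated requests do not accumulate: keep the largest pending request.
        if (out.length() > 0) pendingBlankLines = Math.max(pendingBlankLines, count);
        return this;
    }

    /** Writes the trailing EOL and at most maxBlankLines trailing blank lines. */
    String flush(int maxBlankLines) {
        line();
        writePending(maxBlankLines);
        return out.toString();
    }

    private void writePending(int maxBlankLines) {
        if (pendingEol) out.append('\n');
        int blanks = Math.min(pendingBlankLines, maxBlankLines);
        for (int i = 0; i < blanks; i++) out.append('\n');
        pendingEol = false;
        pendingBlankLines = 0;
    }
}
```

Deferring the line end until the next character or the flush is what lets repeated `\n` and repeated `blankLine()` calls collapse instead of piling up.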
-
-
API Change: clean up of sequences in flexmark-java-util was long overdue. The
BasedSequence
interface has evolved but not all existing uses were updated. The following changes allow optimization of BasedSequence.subSequence() and BasedSequence.baseSubSequence() depending on implementation particulars.
Fastest parsing results will be achieved if the sequence passed to the parser is a CharSubSequence.
Modifications to existing code, some from Java 7 downgrade:
-
BasedSequenceImpl.NULL to BasedSequence.NULL
-
new SubSequence( to SubSequence.of(
-
BasedSequence.of( to BasedSequenceImpl.of(
-
new StringSubSequence( to BasedSequenceImpl.of( or CharSubSequence.of(
-
com.vladsch.flexmark.util.Escaping moved to com.vladsch.flexmark.util.html.Escaping
-
com.vladsch.flexmark.util.Html5Entities moved to com.vladsch.flexmark.util.html.Html5Entities
-
Attribute
changes:-
com.vladsch.flexmark.util.options.Attributes moved to com.vladsch.flexmark.util.html.Attributes
-
com.vladsch.flexmark.util.options.Attribute moved to com.vladsch.flexmark.util.html.Attribute
-
Attribute is now an interface, with the MutableAttribute interface extending it for in-place attribute manipulation. It is implemented in AttributeImpl; instantiate with the AttributeImpl.of() variants.
- assign delimiters for class and style attribute names that you cannot override. The rest will get NUL delimiters unless you specify otherwise.
-
Attribute
now has a value list delimiter and a value name delimiter, which are used to split and combine values. If both are Attribute.NUL then multiple values cannot be stored or manipulated in the attribute. If only the value name delimiter is Attribute.NUL then multiple values are delimited by the value list delimiter and are treated as named values without individual item values. For example:
-
class attribute has a list delimiter of ' ', and class names can be added or removed but they have no values associated with them beyond their name.
-
style attribute, on the other hand, has a list delimiter of ';' and a name delimiter of ':'. The item part before the colon is the name and the part after it is the value. So you can change individual style settings using the attribute manipulation functions Attribute.removeValue(String) and Attribute.setValue(String), which parse the strings and add, remove or replace values. Any item whose value is empty is removed, any existing item's value is changed to the new value, and any new items are appended at the end.
-
-
-
class Test {
    void test() {
        MutableAttribute attr = MutableAttribute.of("style");
        attr.setValue("color:#white;background:#black");
        // now attr.getValue() == "color:#white;background:#black"
        attr.setValue("font-family:monospaced;color:#green");
        // now attr.getValue() == "color:#green;background:#black;font-family:monospaced;"
        attr.setValue("background");
        // now attr.getValue() == "color:#green;font-family:monospaced;"
    }
}
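The merge rules for a named-value attribute such as style can also be written out as a stand-alone sketch. The class below is hypothetical and does not use the flexmark `Attribute` types; it only mirrors the described behaviour (replace in place, append new items, remove items with an empty value). It joins items without a trailing delimiter, which may differ from the library's actual output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class StyleAttributeSketch {
    private static final char LIST_DELIMITER = ';'; // separates items
    private static final char NAME_DELIMITER = ':'; // separates name from value

    // LinkedHashMap keeps an item's original position when its value is replaced.
    private final Map<String, String> items = new LinkedHashMap<>();

    /** Parses the given string and adds, replaces or removes the items it names. */
    void setValue(String value) {
        for (String item : value.split(String.valueOf(LIST_DELIMITER))) {
            String trimmed = item.trim();
            if (trimmed.isEmpty()) continue;
            int pos = trimmed.indexOf(NAME_DELIMITER);
            String name = pos < 0 ? trimmed : trimmed.substring(0, pos).trim();
            String itemValue = pos < 0 ? "" : trimmed.substring(pos + 1).trim();
            if (itemValue.isEmpty()) {
                items.remove(name);         // empty value: remove the item
            } else {
                items.put(name, itemValue); // replace in place or append at the end
            }
        }
    }

    String getValue() {
        StringJoiner joiner = new StringJoiner(String.valueOf(LIST_DELIMITER));
        for (Map.Entry<String, String> entry : items.entrySet()) {
            joiner.add(entry.getKey() + NAME_DELIMITER + entry.getValue());
        }
        return joiner.toString();
    }
}
```

An insertion-ordered map is a convenient fit here because replacing a value must not move the item, while new names must land at the end.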
Changes made:
-
Add:
CharSubSequence
implementation of BasedSequence
-
uses char[] as the base that it accesses directly, and creates subSequence and baseSubSequence instances that also access it directly, eliminating the nested calls through charAt
-
converts \0 to \uFFFD on construction, eliminating this test on every character access. -
in the future has the potential to support fast
toCharArray()
and related optimizations. -
Remove: all string based
BasedSequence
implementations, now replaced byCharSubSequence
-
Change:
BasedSequence.getBase()
now returnsObject
to reflect that the underlying source of text could be anything. -
Add:
BasedSequence.getBaseSequence()
which returns aBasedSequence
of the wrapped full source. -
Change: all sequence classes are now final and constructors private. Use static methods
BasedSequenceImpl.of()
or the same method of specific classes to get sequences of the source, which allows optimization based on inputs. When a subSequence() of the base is required, use the BasedSequence.baseSubSequence() method, which will return an optimized instance where possible. -
Add: documentation to
BasedSequence
interface -
Fix: YouTrack and Jira renderers to remove soft line breaks and to make sure no extra blank lines are added after loose items. Loose items had 2 blank lines between them because one was added by the paragraph node and the other by the list item.
-
Add: added Toc and SimToc options:
-
hierarchy: as before hierarchical list of headings
-
flat: flat list of headings
-
reversed: flat reversed list of headings
-
increasing: flat, alphabetically increasing by heading text
-
decreasing: flat, alphabetically decreasing by heading text
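As a rough illustration of how the flat list types differ, the sketch below orders a list of heading texts for each type. The enum and method are hypothetical stand-ins, not the extension's `TocOptions.ListType`; nesting for the hierarchical type is omitted, and the case-insensitive comparison is an assumption since the notes only say "alphabetically".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TocOrderSketch {
    enum ListType { HIERARCHY, FLAT, REVERSED, INCREASING, DECREASING }

    /** Returns the heading texts in the order the given list type would show them. */
    static List<String> order(List<String> headings, ListType type) {
        List<String> result = new ArrayList<>(headings);
        switch (type) {
            case REVERSED:
                Collections.reverse(result);
                break;
            case INCREASING:
                result.sort(String.CASE_INSENSITIVE_ORDER);
                break;
            case DECREASING:
                result.sort(String.CASE_INSENSITIVE_ORDER.reversed());
                break;
            default:
                // HIERARCHY and FLAT keep document order; nesting is not modelled here.
                break;
        }
        return result;
    }
}
```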
Extension changes:
-
new key
TocExtension.LIST_TYPE
takes a TocOptions.ListType
enum for list generation type. -
TocOptions
now supports the Immutable interface: toMutable() returns a mutable copy whose fields can be modified directly, and toImmutable() converts it back to TocOptions. -
SimTocOptions
has been removed. TocOptions holds all the options for both the regular TOC and simulated TOC.
-
Add:
SubscriptExtension
and StrikethroughSubscriptExtension extensions to the flexmark-ext-gfm-strikethrough artifact. You guessed it, these add ~subscript~ parsing. -
Add:
SuperscriptExtension
toflexmark-ext-superscript
artifact for^superscript^
parsing. -
Add:
InsExtension
to the flexmark-ext-ins artifact for ++inserted++ (also known as underlined) text parsing.
-
Change: remove Java 8 language constructs and reduce language level to 7 for android support.
-
Change: remove
Substring
class from sequences and rename StringSequence to StringBasedSequence. Substring was duplicating code and was not used except in tests. -
Change: move some statics from
BasedSequenceImpl to BasedSequence, and add BasedSequence.of() variations to the latter. These return the argument as is if it is already a BasedSequence, wrap it in StringBasedSequence if a String is passed, and otherwise wrap it in SubSequence. -
Fix: #21, table column alignment was not taking accumulated spans in the row into account when getting alignment for cell from table separator row.
-
Fix: strike through extension was not rendering correctly for YouTrack conversion.
-
Fix: moved all renderer specific extension tests to their corresponding renderer instead of having each extension test for each renderer. This makes it much easier to ensure the extensions are properly tested and to spot any extension that has not been properly updated. -
-
Add: Blank lines to Jira and YouTrack renderer after: heading, thematic break, block quote and table.
-
Add: YouTrack converter, same as Jira but with a few differences
-
Fix: Jira converter not to add an extra blank line after paragraphs in block quotes
-
Add: final to all node visitors' node parameter
-
Add: upgrade to CommonMark spec 0.27
-
Add: option
Parser.PARSE_JEKYLL_MACROS_IN_URLS
which allows any characters to appear between {{ and }} in URLs, including spaces, pipes and backslashes. -
Add: option
HTML_COMMENT_BLOCKS_INTERRUPT_PARAGRAPH
, true by default; when false, HTML comment blocks require a preceding blank line, otherwise they become inline HTML. -
Add: test for
HTML_COMMENT_BLOCKS_INTERRUPT_PARAGRAPH
option -
Add: header id generator options to allow slight customization of heading ids
-
HtmlRenderer.HEADER_ID_GENERATOR_RESOLVE_DUPES
when set to true, adds a - followed by an increasing number from 1 to N for each duplicated id generated. -
HtmlRenderer.HEADER_ID_GENERATOR_TO_DASH_CHARS
a string of characters in the heading text which will be mapped to -; alphanumerics are passed as is and anything else is suppressed. -
HtmlRenderer.HEADER_ID_GENERATOR_NO_DUPED_DASHES
if set will not generate consecutive dashes in the reference ids. -
Change: complete rewrite of list handling and list handling options to allow for Markdown parser emulation based on major parser families, as described in Markdown Parser Emulation. All major families are done and tested.
-
CommonMark
-
GitHub Comments
-
CommonMark (default for family)
-
FixedIndent
-
MultiMarkdown
-
Kramdown
-
Kramdown (default for family)
-
Markdown
-
Markdown.pl (default for family)
No attempt was made to emulate inline processing of these parsers. Only list processing for now. Most of the other element discrepancies are already addressable with existing parser options.
-
Fix: #19, ArrayIndexOutOfBounds while parsing markdown with backslash as last character of text block
-
Change:
SpecReader
to parse for headings only if spaces are present after the leading #. Otherwise, a leading GitHub issue reference such as #7 in the description is treated as a heading. -
Add: list parsing option
Parser.LISTS_EMPTY_BULLET_ITEM_INTERRUPTS_ITEM_PARAGRAPH
to allow an empty list sub-item to be recognized as such. Needed by Markdown Navigator to recognize empty sub-items during parsing for list item formatting ops
-
Fix: #13, Hard line breaks do not work if markdown text/files uses CR LF as line separator
-
Fix: #14, Link reference definitions indented by spaces not recognized
-
Fix: #17, IndentedCodeBlock endOffset too large?, Added an option
Parser.INDENTED_CODE_NO_TRAILING_BLANK_LINES
-
Fix: #18, Unclosed FencedCodeBlock endOffset too small
-
Add test for
Parser.HARD_LINE_BREAK_LIMIT
option -
Change
HtmlRenderer.GENERATE_HEADER_ID
default to true -
Fix #16, jira converter does not include lang= for fenced code with info
-
Fix jira converter output to better handle loose lists, items and options
-
Fix jira converter to output an extra blank line after the end of the outermost list
-
Add aside extension that behaves as block quotes with
|
marker and generates <aside> tags instead of block quotes. -
Add multi-line url image links option
-
Add
HtmlRenderer.SOURCE_WRAP_HTML
and HtmlRenderer.SOURCE_WRAP_HTML_BLOCKS options to wrap HTML blocks in a div with a source position information attribute. -
Fix #10, Wrong startOffset in HardLineBreak
-
Add
Parser.HARD_LINE_BREAK_LIMIT
false by default, when true only the last 2 spaces of a hard line break will be part of the node. The rest are part of the previous text node.
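The effect of the limit on the node's start offset can be shown with a small helper. This is a hypothetical sketch, not parser code: it assumes the line is passed without its EOL and only considers hard breaks written as two or more trailing spaces.

```java
final class HardLineBreakSketch {
    /**
     * Returns the offset in the line (EOL excluded) where the hard line break
     * node would start, or -1 if the line has fewer than two trailing spaces.
     * With limit == true only the last two spaces belong to the node; any
     * earlier trailing spaces stay with the preceding text.
     */
    static int breakStart(String line, boolean limit) {
        int end = line.length();
        int start = end;
        while (start > 0 && line.charAt(start - 1) == ' ') start--;
        int spaces = end - start;
        if (spaces < 2) return -1;
        return limit ? end - 2 : start;
    }
}
```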
-
Add table class option to tables extension
-
Add
Parser.BLOCK_QUOTE_IGNORE_BLANK_LINE
option to make block quotes ignore blank lines between block quotes. This duplicates pegdown and GFM parsing. -
Add:
HtmlRenderer.TYPE
data key to allow extensions to set rendering type. Default valueHTML
. -
Add:
JiraConverterExtension
to convert standard markdown AST to JIRA formatted text -
Add: Rendering type to allow standard non-html rendering capability to be added by extensions. For now rendering types are:
HTML (default) and JIRA. The latter generates JIRA formatted text from Markdown AST. -
Change:
HtmlRendererExtension.extend(Builder, String)
now gets a string argument for the desired rendering type. If the extension does not recognize the type it should not register a renderer. Current values are HTML and JIRA; others may be added by extensions. -
Change: standard extensions that can be mapped to JIRA formatted text to implement JIRA renderer.
-
Add:
JIRA
rendering for auto-link, emoji, strikethrough, tables and wiki link extensions.
-
Factor out utilities to a separate module to eliminate maven pom cycles
-
Prepare for maven release
-
Fix Toc extension did not generate the right table of contents hierarchy if header levels were missing. The indentation is still not perfect but will do for now.
-
Add source position attribute to list item tags
-
Rename
TablesExtension.HEADER_SEPARATOR_COLUMNS
toTablesExtension.HEADER_SEPARATOR_COLUMN_MATCH
and implement this option in the table parser. -
Fix child paragraphs of list items would not be wrapped in
p
tags if list auto-loose option was disabled and the item was not loose
-
Add
HtmlWriter.srcPosWithTrailingEOL()
that will extend the position information to include the trailing EOL after the end of the sequence, skipping any spaces or tabs. Used to include the EOL for the closing code fence marker. -
Fix #8, Fenced code closing sequence allows trailing spaces but not tabs. Tab added to ignored trailing characters after the closing marker.
-
Add
HtmlWriter.srcPos()
methods to add source position information to the next.withAttr()
tag. -
Add
HtmlRenderer.SOURCE_POSITION_ATTRIBUTE
the name of the HTML attribute to set to the source position given in the .srcPos() and .srcPosWithEOL() methods. These behave as .attr() methods: the source position attribute will be applied to the next tag that is preceded by one of the .withAttr() methods. -
Add
HtmlRenderer.SOURCE_POSITION_PARAGRAPH_LINES
if true then paragraph source lines will be wrapped in <span></span> with source position information for the line. Works even for tight list items that do not generate a <p> wrapper for their text. ⚠️ Only works if the source position attribute is set to a non-empty value. -
Add
AttributablePart
instances: -
CoreNodeRenderer.CODE_CONTENT
to mark thecode
tag part of fenced code and indented code -
CoreNodeRenderer.PARAGRAPH_LINE
to mark line spans of paragraphs source positions, list items or any other text block supporting lazy continuation. -
Refine source position attribute generation to make highlighting HTML elements from source position information more intuitive.
-
Fix #5, SimToc did not unescape the title string when rendering
-
Add
EmojiExtension.ATTR_IMAGE_SIZE
andEmojiExtension.ATTR_ALIGN
options to control image size and align attributes for rendering -
Change rename
RenderAs.SIMPLE
toRenderAs.SECTIONS
-
Change
SpecExampleNodeRenderer
for the RenderAs.SECTIONS option to render the spec example section and number, if available, as h5, and to render the hr tag only if the section text is not empty. -
Add
Parser.LISTS_LOOSE_ON_PREV_LOOSE_ITEM
option to set the next list item as loose if previous one was loose. This makes list items mimic GFM quirky (diplomatic way of saying buggy) list item parsing. -
Add
HtmlRenderer.HARD_BREAK
option so that GFM comment mode, where soft wraps are turned into <br> and hard wraps are turned into <br><br>, could be emulated. Are we having fun yet!?!? -
Fix #6, List items are not properly marked as tight/loose
-
Fix #7, Task list items not copying the original list item's tight/loose flag
-
Add
CoreNodeRenderer.LOOSE_LIST_ITEM
andCoreNodeRenderer.TIGHT_LIST_ITEM
instances ofAttributablePart
to identify attributes used for generating<li>
tag for list items. -
Add
TaskListNodeRenderer.TASK_ITEM_PARAGRAPH
AttributablePart
to identify when a loose task list item<p>
tag attributes are being requested. -
Add
TableRow.rowNumber
in flexmark-ext-tables, which is the row number within the table section. This allows easy first/even/odd determination for rendering where the browser does not have the CSS capabilities to handle this. -
Add
TaskListExtension.CONVERT_ORDERED_LIST_ITEMS
option to convert ordered list items to task list items, default set to true. -
Change
TaskListItem.isOrderedItem()
added to allow distinguishing ordered from unordered task list items.
-
Add
WikiLinkExtension.DISABLE_RENDERING
option to render wiki links as the text of the node, for cases where wiki links are not allowed in the document but for purposes of error annotations they should still be parsed. -
Fix
Attributes.getValue(String)
was not checking if the attribute was missing, causing NPE. -
Change rename
HtmlRenderer.LANGUAGE_CLASS_PREFIX
toHtmlRenderer.FENCED_CODE_LANGUAGE_CLASS_PREFIX
-
Add test in
HtmlRenderer.MainNodeRenderer.resolveLink(LinkType, String, Boolean)
for empty url in which case no link resolvers are called. -
Fix emoji renderer was not setting image height, width nor align attributes
-
Fix
SimTocBlockParser
was not including the blank line spacer as part of theSimTocContent
node. -
Change bare
AttributeProvider
in the API to add AttributeProviderFactory, to allow for context based construction of attribute providers and for dependency resolution between attribute provider factories. Attribute provider factories that do not define any dependencies should extend IndependentAttributeProviderFactory, which provides defaults so that only the AttributeProviderFactory.create(NodeRendererContext) method needs to be implemented. -
Add `NodeAdaptingVisitor`, `NodeAdaptingVisitHandler` and `NodeAdaptedVisitor` to handle customized node mapping functions for easier multiplexing from base class Node to specific node subclasses.
-
Change
Visitor
,VisitHandler
,NodeVisitor
to use the new adapting node visitor classes -
Change
CustomNodeRenderer
,NodeRenderingHandler
,NodeVisitor
to use the new adapting node visitor classes -
Add
LinkResolvingVisitor
,LinkResolvingHandler
andLinkResolverAdapter
to allow generic link resolving mapping. -
Add Node parameter to
LinkResolver.resolveLink(Node, NodeRendererContext, ResolvedLink)
for symmetry and to allow node specific link resolution mapping viaLinkResolverAdapter
. -
Change
WikiLinkExtension.LINK_FILE_EXTENSION
default to""
-
Add `Parser.BLOCK_QUOTE_TO_BLANK_LINE` option; when true, a block quote stays open until a blank line. This makes block quote parsing compatible with GFM and most other markdown processors.
-
Fix `ImageRef` and `LinkRef` did not properly set the source information for a dummy reference part of a reference `[ref][]`
-
Add a bunch of list parsing options to allow mimicking list parsing by various markdown implementations:
-
Add `Parser.LISTS_ITEM_TYPE_MATCH`; when true, a new list is started when the list item type does not match an existing list type. When false, a bullet list can contain ordered list items and vice versa. In combination with `Parser.LISTS_ITEM_MISMATCH_TO_SUBITEM`, this allows mimicking different parser behavior: kramdown, GFM, Markdown.pl, ...
-
Add `Parser.LISTS_ITEM_MISMATCH_TO_SUBITEM`; when true, a mismatched item is treated as a sub item instead of starting a new list. When false, a new list will be started. ℹ️ only applicable if `Parser.LISTS_ITEM_TYPE_MATCH` is true.
-
Change
Parser.ORDERED_LIST_DOT_ONLY
toParser.LISTS_ORDERED_ITEM_DOT_ONLY
-
Add
Parser.LISTS_BULLET_ITEM_INTERRUPTS_PARAGRAPH
option, when true a bullet list item can interrupt a paragraph. i.e. start without having a blank line before -
Add
Parser.LISTS_BULLET_ITEM_INTERRUPTS_ITEM_PARAGRAPH
option, when true a bullet list sub item can interrupt the parent item's item text paragraph. -
Change `Parser.ORDERED_LIST_INTERRUPTS_PARAGRAPH` to `Parser.LISTS_ORDERED_ITEM_INTERRUPTS_PARAGRAPH` option, which now controls whether an ordered list item can interrupt a paragraph, i.e. can start without having a blank line before. `Parser.LISTS_ORDERED_NON_ONE_ITEM_INTERRUPTS_PARAGRAPH` controls whether this is only true for items with a 1. prefix, or any ordered item.
-
Add
Parser.LISTS_ORDERED_ITEM_INTERRUPTS_ITEM_PARAGRAPH
option, when true an ordered list sub item can interrupt the parent item's item text paragraph.Parser.LISTS_ORDERED_NON_ONE_ITEM_INTERRUPTS_PARENT_ITEM_PARAGRAPH
controls whether this is only true for items with 1. prefix, or any ordered item. -
Change
Parser.ORDERED_LIST_START
toParser.LISTS_ORDERED_LIST_MANUAL_START
-
Add `Parser.LISTS_ORDERED_NON_ONE_ITEM_INTERRUPTS_PARAGRAPH`, which controls whether any ordered item can interrupt a paragraph or only one starting with 1. ℹ️ only applies if `Parser.LISTS_ORDERED_ITEM_INTERRUPTS_PARAGRAPH` is true.
-
Add `Parser.LISTS_ORDERED_NON_ONE_ITEM_INTERRUPTS_PARENT_ITEM_PARAGRAPH`, which controls whether any ordered sub item can interrupt the parent item's paragraph or only one starting with 1. ℹ️ only applies if `Parser.LISTS_ORDERED_ITEM_INTERRUPTS_ITEM_PARAGRAPH` is true.
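The interaction between the two ordered-item "interrupts paragraph" options above can be pictured with a small sketch. This is not flexmark-java code: the `Options` class and `orderedItemCanInterrupt` method are hypothetical stand-ins that only model the rule as the entries describe it.

```java
// Hypothetical sketch, not flexmark-java code: models how the two options
// described above could combine when deciding whether an ordered list item
// may start directly after paragraph text (no blank line in between).
final class ListInterruptSketch {
    /** Stand-ins for Parser.LISTS_ORDERED_ITEM_INTERRUPTS_PARAGRAPH and
     *  Parser.LISTS_ORDERED_NON_ONE_ITEM_INTERRUPTS_PARAGRAPH. */
    static final class Options {
        final boolean orderedItemInterruptsParagraph;
        final boolean orderedNonOneItemInterruptsParagraph;

        Options(boolean orderedItemInterruptsParagraph,
                boolean orderedNonOneItemInterruptsParagraph) {
            this.orderedItemInterruptsParagraph = orderedItemInterruptsParagraph;
            this.orderedNonOneItemInterruptsParagraph = orderedNonOneItemInterruptsParagraph;
        }
    }

    /** @param itemNumber the number written in the source, e.g. 3 for "3." */
    static boolean orderedItemCanInterrupt(int itemNumber, Options options) {
        // The "non one" option only applies when interruption is enabled at all.
        if (!options.orderedItemInterruptsParagraph) return false;
        return itemNumber == 1 || options.orderedNonOneItemInterruptsParagraph;
    }
}
```

With interruption enabled and the "non one" option off, only an item numbered 1 interrupts a paragraph, which matches the spec 0.26 behavior mentioned further down.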
- Fix #3, Incorrect emphasis close marker source offset
- Fix
SimTocBlockParser
to useParsing.LINK_TITLE
pattern for TOC title to match what is allowed for reference title.
-
Change compliance to spec 0.26, with better emphasis delimiter parsing rules, list rules without double line breaks, and ordered lists that don't interrupt paragraphs unless they start with 1.
-
Add
Parser.ORDERED_LIST_INTERRUPTS_PARAGRAPH
to control whether an ordered list item can start without having a blank line before, default true. -
Change `Parser.ORDERED_LIST_START` also controls if an ordered list interrupts a paragraph. If ordered list start is set to false, then an ordered list always starts at 1, so it will interrupt a paragraph regardless of whether it is `1.` in the source.
-
Change
HtmlRenderer.ORDERED_LIST_START
moved toParser.ORDERED_LIST_START
since it is needed during parsing. -
Add
Parser.ORDERED_SUBITEM_INTERRUPTS_PARENT_ITEM
, when true, even withParser.ORDERED_LIST_INTERRUPTS_PARAGRAPH
set to false an ordered sub-item will interrupt parent's item paragraph. -
Remove
Parser.INLINE_RELAXED_EMPHASIS
since spec 0.26 does the right thing without needing an option.
-
Change move
Parsing
static strings and patterns into instance fields so that they can be changed according to selected options, making it easier to configure parsing patterns. -
Change move all
InlineParserImpl
parsing string and patterns toParsing
-
Add
ParserState.getParsing()
to allow block parsers to use the current parsing config from the core. Extensions can extend this class for their own option dependent patterns. -
Add
Parser.ORDERED_LIST_DOT_ONLY
to not allow)
ordered list item delimiter -
Add
Parser.INTELLIJ_DUMMY_IDENTIFIER
to include the'\u001f'
completion location dummy identifier used by Markdown Navigator. Other characters can be easily added if the need arises. -
Change
InlineParserImpl.parseCustom(BasedSequence, Node, BitSet, Map<Character, CharacterNodeFactory>)
now if a character is in the bit set but the map does not contain a node factory then the character will be treated as text and prevent any standard delimiter or other inline parsing checking. Can be used by extensions to prevent their special characters from being hijacked by other processors. -
Add
RenderingTestCase.IGNORE
andRenderingTestCase.FAIL
boolean data keys. WhenRenderingTestCase.getOptions(SpecExample, String)
returns a data set with one of these keys set to true then it is treated the same way as if the option string wasIGNORE
orFAIL
respectively. Allows conditional ignore or fail on spec examples in tests.
- Change `TextBase` node now represents text equivalent nodes and can contain `Text` and other decorated text nodes processed by extensions. For example, `Abbreviation` is really just text but is decorated with link-like rendering. Similarly, auto links are just text with link decorations.
To allow extensions to create such decorated text while allowing contiguous plain text processing without a lot of code, extensions should replace `Text` nodes that they decorate with `TextBase` and add their unprocessed text as `Text` nodes under the `TextBase` node, along with their extension specific decorated nodes with a child `Text` node for the decorated part of the text. The custom decorated text nodes should also implement the `DoNotDecorate` interface so that other extensions will know not to decorate their text.
If a `Text` node is not a child of `TextBase` then a new instance of `TextBase` should be created and all undecorated and decorated text nodes should be its children.
Text decoration by extensions should always be done on `Text` and never on `TextBase` nodes.
`TextBase` rendering is just rendering of its children. `TextCollectingVisitor` uses the characters of the `TextBase` node and does not descend into its children.
- Add `TypographicExtension` for typographic quotes and smarts, `EscapedCharacterExtension` for syntax highlighting escaped characters, `DefinitionExtension` for definition lists. These are placeholders for now. Code and tests to be done. For now all plugin required nodes are there.
-
Remove all the dependencies between nodes and their visitors: no more global visitor, no maintaining of visitor derived classes, and no `accept(Visitor)` implementation required in every non-abstract class.
-
Change all extensions to implement a custom node visitor interface which defines a `VISITOR_HANDLERS` static method taking an instance that implements the visitor interface and returning an array of `VisitHandler` which can be passed to the `NodeVisitor` constructor along with any other visitor handlers as vararg.
Nodes no longer have an `accept()` method. A `NodeVisitor` delegate is used for recursive traversal of the AST. Its generic `NodeVisitor.visit(Node)` can be used to start the visit. This method will map the actual node class to the `VisitHandler` associated with the given node.
Maintaining the core `Visitor` interface and its derivatives became too much of a pain. Handling custom nodes is now identical to handling core nodes and the limitation of inheritance other than from Node has been removed from all nodes.
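The class-to-handler dispatch described above can be sketched in a few lines. This is a simplified, self-contained illustration and not the library's implementation: the `Node`, `Text`, `Emphasis`, `VisitHandler` and `NodeVisitor` classes below are minimal stand-ins written for this example, and visiting children when no handler is registered is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-ins, not the flexmark-java classes of the same names.
final class VisitorSketch {
    static class Node {
        final List<Node> children = new ArrayList<>();
        Node add(Node child) { children.add(child); return this; }
    }
    static class Text extends Node {
        final String text;
        Text(String text) { this.text = text; }
    }
    static class Emphasis extends Node {}

    /** Pairs a node class with the code that handles nodes of that class. */
    static final class VisitHandler<N extends Node> {
        final Class<N> nodeType;
        final Consumer<N> visitor;
        VisitHandler(Class<N> nodeType, Consumer<N> visitor) {
            this.nodeType = nodeType;
            this.visitor = visitor;
        }
        void visit(Node node) { visitor.accept(nodeType.cast(node)); }
    }

    /** Maps the actual node class to its handler; nodes need no accept() method. */
    static final class NodeVisitor {
        private final Map<Class<?>, VisitHandler<?>> handlers = new HashMap<>();
        NodeVisitor(VisitHandler<?>... visitHandlers) {
            for (VisitHandler<?> handler : visitHandlers) handlers.put(handler.nodeType, handler);
        }
        void visit(Node node) {
            VisitHandler<?> handler = handlers.get(node.getClass());
            if (handler != null) handler.visit(node);
            else visitChildren(node); // assumption: unhandled nodes just recurse
        }
        void visitChildren(Node node) {
            for (Node child : node.children) visit(child);
        }
    }

    /** Example use: collect the text of all Text nodes in document order. */
    static String collectText(Node root) {
        StringBuilder out = new StringBuilder();
        NodeVisitor visitor = new NodeVisitor(
                new VisitHandler<Text>(Text.class, t -> out.append(t.text)));
        visitor.visit(root);
        return out.toString();
    }
}
```

Because dispatch goes through the map, a custom node class is handled exactly like a core one: it only needs its own `VisitHandler` entry.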
-
Change
LinkResolver
andLinkResolverFactory
interfaces and registration in HtmlRenderer to handle resolving of URLs for links. -
ResolvedLink
represents the link being resolved.ResolvedLink.getUrl()
will initially return the raw link value from the markdown element.LinkResolvers
can modify this value according to their understanding of the link type and link format. They may or may not change the link type and status. -
LinkType
specifies type of link. Core definesLinkType.LINK
andLinkType.IMAGE
, extensions can define other types that use different link resolving logic. Wiki link extension definesWikiLinkExtension.WIKI_LINK
type and provides a custom link resolver that will convert the wiki link text to a URL and the type toLinkType.LINK
. It also changes the status toLinkStatus.UNCHECKED
-
`LinkStatus` holds the result of the resolving process. Initial link status is `LinkStatus.UNKNOWN`, resolvers are called until status changes to another value.
- `LinkStatus.UNKNOWN` link has not been resolved yet
- `LinkStatus.VALID` link is resolved and valid
- `LinkStatus.UNCHECKED` link is resolved, validity not verified
- `LinkStatus.NOT_FOUND` link is resolved and its target is not found
-
Link resolvers are tried until one reports success. They can modify the URL, if available the Text, and attributes. The latter is still modifiable by attribute providers at two points: right after all resolvers have passed and before final rendering of the link.
-
like other processors they have before/after dependencies.
-
Encoding is done by the context as the last step if it is requested in options. No URL encoding of links which are passed through resolving process.
-
Any unresolved link's url is rendered as is.
-
Results of resolving a link are cached based on
LinkType
and the initial url text. Subsequent requests to resolve the same type and url will return the same instance ofResolvedLink
. -
Add
AttributablePart
that nodes provide when marking a tagHtmlWriter.withAttr(AttributablePart)
so that an attribute provider has information about the exact HTML element the node is requesting attributes for. Core only defines: -
AttributablePart.NODE
a generic placeholder when the node does not specify one -
AttributablePart.ID
a node's id attribute is being requested -
AttributablePart.LINK
a node is rendering a link, theAttributes
parameter will hold an attribute namedAttribute.LINK_STATUS
whose value represents the name of theLinkStatus
of the resolved link. Attribute providers can use this value to set specific attributes based on the resolved link status. This attribute does not render in the final HTML.
Extensions can and should define parts for specific elements they allow to modify with extensions.
-
Change
AttributeProvider.setAttributes(Node, AttributablePart, Attributes)
to now get an attributable part that pinpoints the exact element of the node being rendered, for nodes that have many elements capable of having attributes. -
Change `LinkResolver.resolveLink(NodeRendererContext, ResolvedLink)`: the context allows the resolver to get the node for which this link is being resolved via `NodeRendererContext.getCurrentNode()`.
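The resolving flow described in the entries above (start at `UNKNOWN`, try resolvers until the status changes, cache by link type and initial url) might look roughly like the following sketch. All classes here are hypothetical simplifications written for illustration. Link types are plain strings, the resolver takes no context argument, and the space-to-dash conversion in the wiki resolver is an assumed rule, not the extension's actual behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the resolving flow; not flexmark-java code.
final class LinkResolvingSketch {
    enum LinkStatus { UNKNOWN, VALID, UNCHECKED, NOT_FOUND }

    static final class ResolvedLink {
        final String linkType;
        final String url;
        final LinkStatus status;
        ResolvedLink(String linkType, String url, LinkStatus status) {
            this.linkType = linkType;
            this.url = url;
            this.status = status;
        }
    }

    interface LinkResolver {
        /** Returns the link unchanged if this resolver does not handle it. */
        ResolvedLink resolveLink(ResolvedLink link);
    }

    /** Assumed example: turns wiki link text into a URL, changes type and status. */
    static final LinkResolver WIKI_RESOLVER = link -> {
        if (!link.linkType.equals("WIKI_LINK")) return link;
        return new ResolvedLink("LINK", link.url.replace(' ', '-'), LinkStatus.UNCHECKED);
    };

    static final class Context {
        private final LinkResolver[] resolvers;
        private final Map<String, ResolvedLink> cache = new HashMap<>();

        Context(LinkResolver... resolvers) { this.resolvers = resolvers; }

        ResolvedLink resolve(String linkType, String url) {
            // Cache key is the link type plus the initial (raw) url text.
            String key = linkType + '\n' + url;
            ResolvedLink link = cache.get(key);
            if (link != null) return link;

            link = new ResolvedLink(linkType, url, LinkStatus.UNKNOWN);
            for (LinkResolver resolver : resolvers) {
                link = resolver.resolveLink(link);
                if (link.status != LinkStatus.UNKNOWN) break; // first success wins
            }
            cache.put(key, link); // unresolved links keep their raw url
            return link;
        }
    }
}
```

A second request for the same type and url returns the cached `ResolvedLink` instance, so the resolvers run only once per distinct link.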
-
Add `Attributes` and `Attribute` dedicated classes to handle attributes instead of relying on `Map<>`. Allows easier replacing, adding and removing of values in an attribute's value, which is a space separated list of strings, and determining which ones should not be rendered and when.
-
Add
LinkResolver
andLinkResolverFactory
interfaces and registration in HtmlRenderer to handle resolving of URLs for links, including adding attributes. -
Link resolvers are tried until one reports success. They can modify the URL, if available the Text, and attributes. The latter is still modifiable by attribute providers at two points: right after all resolvers have passed and before final rendering of the link.
-
like other processors they have before/after dependencies.
-
After all resolvers have handled the link it is passed to AttributeProviders to possibly add/remove/change attributes via `AttributeProvider.setAttributes(LinkRendering)`. At this point `LinkRendering.getIsResolved()` will return true if the link was resolved, null if no resolver handled it (it will render as is), and false if it does not resolve.
-
The attribute providers will be invoked again on the final link rendering, but at this point there is no information on whether the link resolved or not, only the final attributes that can be manipulated.
-
Encoding is done by the context as the last step if it is requested in options. No URL encoding of links which are passed through resolving process.
-
Any unresolved link is rendered as is.
-
Add dependencies to CustomBlockParserFactory and core factories to eliminate the need to order factories manually. Custom factories that have no specific dependencies will still run before core factories. Core factories now define dependencies between each other to ensure correct processing.
-
Add
BlockParser.isPropagatingLastBlankLine(BlockParser)
to removeinstanceOf
tests inDocumentParser
, making it agnostic to specific block parsers. -
Add
Node.getLastBlankLineChild()
to removeinstanceOf
tests inDocumentParser
, making it agnostic to specific node types. -
Add Sim TOC syntax as per Markdown Navigator simulated TOC element, with parse and rendering options.
-
Add Flexmark Spec Example Extension to parse flexmark spec files, same as Markdown Navigator.
-
Change Sim TOC to be a container and accept only a single HTML block without blank lines or a heading and a list without blank lines.
-
Made rendering test case classes usable for other spec based testing, not just flexmark. Too useful for testing other parsing implementations to leave it just for flexmark.
-
Add Zzzzzz module to test suite so that the archetype also gets to run for sanity testing of basic extension module.
-
Change spaces in example lead line to NB SP so GitHub can display it as a code fence.
-
Add HR after spec front matter so GitHub can think it is Jekyll or YAML front matter.
-
Fix all examples were not being recognized due to NB SP change. Now will accept first line with NB SP or SP or TABs for whitespace.
-
Add exception if an example file results in no examples
-
Add FAIL example option to allow sanity tests for failure in specs
-
More API rework to make core rendering and extension rendering identical in performance and ease of implementation. The `NodeRenderer` interface changed. Now it provides a map of node classes to `NodeRenderingHandler` instances. In the implementation it becomes a new instance creation with class and lambda method reference. The method already has the right node class, eliminating the need to have a slew of `if (ast instanceof ...) do((cast)...)`. Just implement a `NodeRenderHandler.render(YourNode, NodeRendererContext, HtmlWriter)` and add an entry into the map.
-
Add
AbstractCustomVisitor
constructor takes a collection or vararg ofNodeVisitingHandler
instances and uses these to map visitation ofCustomNode
andCustomBlock
to specific classes of nodes and their corresponding visit handlers, like custom node rendering allows to create custom node class specificvisit()
methods. Similar toAbstractVisitor
any node specific methods not overridden willvisitChildren()
of that node. -
Add
AbstractCustomBlockVisitor
similar toAbstractCustomVisitor
but will only visit children of block elements exceptParagraph
.
-
Add ability to prioritize dependents, affects only dependents not constrained by dependencies.
-
Change document post processors are run after node post processors unless constrained by dependencies provided.
-
Add AbstractBlockVisitor, that does not visit children of
Paragraph
or any of the inline nodes to allow efficient block collection while ignoring non-blocks. -
Add do not render links condition to
NodeRendererContext
to allow disabling of link rendering in children. -
Add
NodeRendererContext
andHtmlWriter
arguments toNodeRenderer.render()
method so that node renderers are not linked to a specific context or html writer, only document. -
Add
RenderingVisitor
class which handles passing extra rendering parameters to overloaded node methods while using theVisitor.visit()
plumbing. -
Add
NodeRendererContext.getSubContext()
allowing a renderer to get html rendered to be processed or inserted as needed. The sub-context has its own html writer and do not render links state. -
Add TOC extension as per pegdown, with options to not claim the
[TOC level=#]
line when # is not valid: 0, 1, 2. Another option to only use header text for rendering the links, default will use inline emphasis and other non-link generated html. -
Change header id generation is now part of the core that can be turned on and used by the extensions via
NodeRenderingContext.getNodeId(Node)
, can be null if no id was generated for the node in question. -
Add
HtmlIdGenerator
andHtmlIdGeneratorFactory
to allow extending/replacing the header id generator in the core. Default is not to add id's to headers. With appropriateHtmlRenderer.Builder.htmlIdGeneratorFactory()
method to register a custom generator. If id generation is enabled but no custom generator is registered then GitHub compatible rules are used to generate header ids. These can be overridden byAttributeProvider
by changing or removing theid
from the map. -
Add
HtmlRenderer.RENDER_HEADER_ID
option, default false. When enabled will render a header id attribute for headers using the configuredHtmlIdGenerator
-
Add
HtmlRenderer.GENERATE_HEADER_ID
option, default false. When enabled will generate a header id attribute using the configuredHtmlIdGenerator
but not render it. Use this when an extension needs a header id, like AnchorLinksExtension and TocExtension. -
Add
HtmlRenderer.DO_NOT_RENDER_LINKS
option, default false. When enabled will disable link rendering in the document. This will cause sub-contexts to also have link rendering disabled.
-
Add AnchorLinks extension to automatically generate anchor links for headers.
-
Add
NodeIterator
andNodeIterable
andDescendantNodeIterator
andDescendantNodeIterable
for quick and easy traversal of nodes depth first, both can be reversed. Descendant iterator processes parent then children. -
Add `OrderedSet`, `OrderedMap` and `OrderedMultiMap` which combine the functions of set, map and the latter of a one to one bi-directional map, i.e. key->value and value->key, allowing for either key or value to be null. All of these are iterable and have iterators for indices, values, keys, and entries, including reversed and reversible iterators.
Additionally, these can be in hosted mode, in which they make callbacks on changes, allowing tandem structures to be kept in sync. `OrderedMap` and `OrderedMultiMap` use `OrderedSet` in this way.
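As a rough illustration of "hosted mode", the sketch below shows an insertion-ordered set that reports each addition to a host callback, so the host can keep a parallel structure in step. It is a hypothetical, heavily reduced class, not the library's `OrderedSet`. It supports only adding and index lookup, and the callback signature is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch of an ordered set with a host callback; not flexmark-java code.
final class OrderedSetSketch<E> {
    private final List<E> values = new ArrayList<>();        // index -> element
    private final Map<E, Integer> indices = new HashMap<>(); // element -> index
    private final BiConsumer<Integer, E> onAdd;              // host callback, may be null

    OrderedSetSketch(BiConsumer<Integer, E> onAdd) {
        this.onAdd = onAdd;
    }

    /** Adds the element if absent and notifies the host with its new index. */
    boolean add(E element) {
        if (indices.containsKey(element)) return false;
        int index = values.size();
        values.add(element);
        indices.put(element, index);
        if (onAdd != null) onAdd.accept(index, element);
        return true;
    }

    int indexOf(E element) {
        Integer index = indices.get(element);
        return index == null ? -1 : index;
    }

    E get(int index) { return values.get(index); }

    int size() { return values.size(); }
}
```

A map built on top of such a set could use the callback to grow its value list whenever a key is added, which is one way to read "tandem structures" in the entry above.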
-
Change document parser to use helper classes for block parser and block pre-processor optimizations. The performance gain for large files is significant: on a 500k file, flexmark-java parse time was 1.39x that of commonmark-java and is now 1.02x to 1.05x.
-
-
Add separate abstract classes for node post processors that work on certain nodes and document post processors that will traverse the whole AST and potentially return a new document node. These also specify which nodes to skip based on class list of ancestors. That way processors that should not process text nodes that are part of links can just add
DoNotDecorate.class
to the exclusion list. -
Add `` to handle dependency resolution for document post processors and node post processors.
-
Add `NodePostProcessor` to support more efficient text node post processing, instead of each extension traversing the full document tree to post process text nodes.
-
Change
AbstractBlockParserFactory
to require a data holder argument in the constructor so that options can be instantiated inAbstractBlockParser
. -
Change builder to use
PostProcessorFactory
which takes aDocument
argument in create allowing creation of document specific post processors that can be re-used on any node of that document.
-
Add ext-zzzzzz module to hold a skeleton of an extension module, until I complete the plugin to handle all the extension configuration and stabilize the API.
-
Add project icon
-
Add `Parser.LISTS_RELAXED_START` option; when false, lists can start only if preceded by a blank line. Default true.
-
Add `Parser.THEMATIC_BREAK_RELAXED_START` option; when false, thematic breaks are allowed only if preceded by a blank line. Default true.
-
Add Html comment nodes for blocks and inline
-
Add separate options for escaping and suppressing comments
-
Fix List Option no break out of lists on two blank lines.
-
Fix List Option no bullet match for starting a new list.
-
Add
blockAdded()
andremoveBlock()
toParserState
to allow custom processor created blocks to be included in optimization structures used by block pre-processing handling. -
Change ext-table
TableBlock
to useblockAdded()
method to allow preProcessing of table blocks by custom block pre-processors.
-
Change built in ignore test option to
IGNORE
.Parser.THEMATIC_BREAK_RELAXED_START.
-
Change spec files to
.md
extension so that Markdown Navigator could be used to edit it to add completions of options, annotations of options and option declarations, structure view and formatting.
- Fix Optimized `Escaping.unescape()` to pre-search for `\` or `&` using a dedicated character search instead of regex and, if found, start processing from that position, to save time. Approximate gain of 22% reduction in processing time for a large file (500k) of mostly text.
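The pre-search idea can be sketched as follows. This is an illustrative stand-in, not the body of `Escaping.unescape()`: the class and method names are hypothetical, and the unescaping itself is deliberately reduced to backslash escapes of ASCII punctuation plus the single entity `&amp;`.

```java
// Hypothetical sketch of a "scan first, process only if needed" unescape.
final class UnescapeSketch {
    /** Plain character scan for the first '\' or '&'; returns -1 if there is none. */
    static int firstSpecial(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '&') return i;
        }
        return -1;
    }

    static String unescape(String text) {
        int start = firstSpecial(text);
        if (start < 0) return text; // fast path: nothing to unescape, no copying

        StringBuilder out = new StringBuilder(text.length());
        out.append(text, 0, start); // everything before the first special char is copied as is
        int i = start;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length() && isAsciiPunctuation(text.charAt(i + 1))) {
                out.append(text.charAt(i + 1)); // drop the backslash, keep the escaped char
                i += 2;
            } else if (c == '&' && text.startsWith("&amp;", i)) {
                out.append('&'); // simplified: only this one entity is handled
                i += 5;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    static boolean isAsciiPunctuation(char c) {
        return (c >= '!' && c <= '/') || (c >= ':' && c <= '@')
                || (c >= '[' && c <= '`') || (c >= '{' && c <= '~');
    }
}
```

For text that is mostly plain prose, most calls take the fast path and return the input unchanged, which is where a saving of the kind reported above would come from.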
-
Add
BlockPreProcessorFactory
andBlockPreProcessor
interfaces for block node substitution before inline processing. -
Add generic
Dependent
,DependencyHandler
andResolvedDependencies
to handle resolving dependencies between processors. Need to addPostProcessor
dependency declaration so that interdependent extensions can automatically control execution order where it matters. -
Change ext-autolink to use
ComboSpecTestCase
, was the last one left using two test classes. This extension needs attention, it has serious impact on parsing performance due to un-escaping text and mapping it back to source positions. -
Add separate nodes for
BulletListItem
andOrderedListItem
-
Fix bug in
ReplacedTextMapper
forgetting to return original index as offset from the start of the string. Would return offset into original source.
-
Change spec test
options()
to take a comma separated list of option set names and use the combined option set of individual sets to run the test case. This is implemented in theRenderingTestCase
and does not require modification of the tests. -
Add `ignore` as one of the `options()` passed. If present the test case will be ignored by throwing `AssumptionViolatedException()`. This allows future compatibility tests to remain in the spec but be ignored so they do not pollute the test results.
-
Add heading options to allow no space after # for ATX headings and to disallow non-indent spaces before a heading:
-
Parser.HEADERS_NO_ATX_SPACE
-
Parser.HEADERS_NO_LEAD_SPACE
-
Add Relaxed inline emphasis parsing option. No code behind the option yet.
-
Parser.INLINE_RELAXED_EMPHASIS
-
Add footnote link class options for footnote ref link and footnote back link:
-
FootnoteExtension.FOOTNOTE_LINK_REF_CLASS
-
FootnoteExtension.FOOTNOTE_BACK_LINK_REF_CLASS
-
Update integration test to check some basic file parsing
-
Move wrap.md parsing from ext-table to integration test so that all extensions can be enabled.
-
Add unimplemented inline parser option to use GFM emphasis parsing rules, `Parser.RELAXED_INLINE_EMPHASIS` boolean option, default false.
-
Add: ability for block parsers to store the line indents on a per line basis in the block content and optionally the block node. Needed for proper
ParagraphPreProcessor
handling of some elements, like tables. -
Remove: storage of end of line length in content blocks. Easily done with
.eolLength()
which counts trailing\r
and\n
. -
Fix: flexmark-ext-table extension to include each line's indents which exceed the first line's indent as part of the table row.
- Change: flexmark-ext-table extension to use paragraph pre-processor interface and perform inline parsing on a line of a table before splitting into columns so that pipes embedded in inline elements will not be treated as column breaks. Partially complete. Need to preserve original line's indentation information in a paragraph block.
- Change: `BlockPreProcessor` to `ParagraphPreProcessor` interface and paragraph pre-processing to allow dependencies between paragraph pre-processors and to have some pre-processors executed against all paragraph blocks before running their dependents. That way document global properties like `ReferenceRepository` will be updated before other pre-processors dependent on them are run.
- Fix parser and renderer to store builder copies for re-use in
withOptions()
methods without possible side effects from original builder being modified after assignment.
-
Add `ComboSpecTestCase` to flexmark-test-util that combines `FullSpecTestCase` and `SpecTestCase` functionality so that only one test class needs to be created to get both individual spec example results and the full file result.
-
Change all tests that used
FullSpecTestCase
andSpecTestCase
to useComboSpecTestCase
and modify their ast_spec.txt to use options to test for extension options handling. -
Fix table extension factory to be stateless and use
ParserState::getProperties()
for its state storage. -
Update Writing Extensions wiki
-
Add
Parser.EXTENSIONS
data key to hold a list of Extensions so extensions can be part of the universal options passing. Now there is no need to explicitly callbuilder().extensions()
if you set the extensions key to the list of desired extensions. -
Change spec test related classes to handle option sets specified in the spec per example. That way multiple extension/option combinations can be validated within the same spec file, eliminating the need to have separate spec file and test classes for each extension/option combination.
-
Change more tests using multiple test classes and spec files to use the options mechanism. On the todo list:
-
Wiki link extension with creole vs gfm syntax tests
-
Table extension for gfm and non-gfm table testing, need to add tests for various options.
-
Remove options argument from flexmark-ext-tables
TablesExtension::create()
-
Fix
ReferenceRepository
to useEscaping::normalizeReference
for reference normalization. -
Fix
CoreNodeRenderer
to useSUPPRESS_HTML_BLOCK
instead ofSUPPRESS_INLINE_HTML
-
Add
HtmlWriter::withCondLine()
to make next tag output an EOL only if there is child text between the opening and closing tags. Only works for methods that take aRunnable
argument for output child text. -
Fix ext-gfm-tables for incorrect separator line parsing, introduced right before ext-gfm-tables and ext-tables split.
-
Fix ext-tables to handle multi-line headers or no line headers with options MIN_HEADER_ROWS, MAX_HEADER_ROWS.
-
Add ext-tables option HEADER_SEPARATOR_COLUMNS, if true will only recognize tables whose headers contain no more columns than the separator line. Default false, any number of columns in header lines will be accepted.
-
Update Emoji extension to use the universal options mechanism.
-
Add WikiLink file link extension string option to append to generated links
-
Add Suppress HTML in addition to escape option, with separate options for blocks and inline html
-
Add
MutableData
to be used as a building set for various Parser/Renderer options that all extensions can hook into usingPropertyKey
instances. Moved all builder options to use this mechanism. Now can set options for all extensions in one place and the extensions can query these to get their configuration parameters. -
Add
flexmark-ext-tables
to implement tables per pegdown with GFM limitations configurable in options. -
Add spanning columns parsing for flexmark-ext-tables
-
Make all core block parser processors optional to allow disabling core functionality.
-
Make all core delimiter processors optional to allow disabling core functionality.
-
Add `LinkRefProcessor` and associated registration methods for extensions to allow flexible parsing of elements that start from link references, such as footnotes `[^]` and wiki links `[[]]`. Otherwise, there is no way to properly control generation of link refs and custom nodes. Additionally, this allows processing of these nodes during inline parsing instead of each extension having to traverse the AST and transform it.
-
Change Attribute provider interface to include
tag
being generated since custom nodes can have more than one tag which require attributes, for example footnotes. -
Remove previous attribute handling which did not take a
tag
and was done in the node rendering code. -
Change
WikiLinkExtension
andFootnoteExtension
to useLinkRefProcessor
instead ofPostProcessor
. -
Change
SpecReader
to recognize lines starting with EXAMPLE_START, that way each example start line can be augmented with section and example number for cross reference to failed tests. -
Add section and example number printing to
FullSpecTestCase
for cross referencing to test run results. -
Add output of text for delimited nodes to make visual validation easier: the full string if it is 10 characters or fewer, otherwise 5 characters from the start and 5 from the end.
-
-
Fix parsing to make undefined link refs tentative which are replaced by equivalent text if they are included in a label of a defined ref or link.
-
Add
PhasedNodeRenderer
which is an extension ofNodeRenderer
with additional methods to allow rendering for specific parts of the HTML document. -
Add html rendering phases to allow generating for different parts of the document.
-
HEAD_TOP
-
HEAD
-
HEAD_CSS
-
HEAD_SCRIPTS
-
HEAD_BOTTOM
-
BODY_TOP
-
BODY
-
BODY_BOTTOM
-
BODY_LOAD_SCRIPTS
-
BODY_SCRIPTS
-
Add
FootnoteExtension
which converts[^footnote]
to footnote references and[^footnote]: footnote text
footnote definitions. With referenced footnotes added to the bottom of the generated HTML. -
Add a few HtmlWriter methods and enhancements to allow:
-
indenting HTML
-
methods return
this
so methods could be chained -
invocation with lambda to eliminate the need to close a tag
-
renderChildren()
to eliminate the need for each renderer to roll its own -
attr(String, String)
method to accumulate attributes to be used on the nexttag()
invocation. Eliminating the need to roll your own attribute methods. Accumulated attributes are merged, or overwritten by ones passed in as an argument totag()
-
Fix `SegmentedSequence::getEndOffset()` for sub-sequences would return the end offset of the full sequence and not of its `subSequence()`.
-
Fix footnotes embedded in other footnotes would not be assigned the right source offsets nor character sequences.
-
Add characters to be dumped as part of the AST for opening and closing sequences for easy validation. For some nodes also dump the text.
-
Fix table extension to include open/close pipe as part of TableCell node markers
- AST is built based on Nodes in the source not nodes needed for HTML generation. New nodes:
- Reference
- Image
- LinkRef
- ImageRef
- AutoLink
- MailLink
- Emphasis
- StrongEmphasis
- HtmlEntity
Each node has a `getChars()` property which returns a `BasedSequence` character sequence which spans the contents of the node, with start/end offsets into the original source.
Additionally, each node can provide other `BasedSequence` properties that parcel out pieces of the node's characters, independent of child node breakdown.
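The idea of a character sequence that remembers its position in the original source can be shown with a minimal sketch. The class below is a hypothetical stand-in, not the library's `BasedSequence`. It only keeps a reference to the base string plus start and end offsets, and derives sub-sequences without copying characters.

```java
// Hypothetical sketch of a source-anchored character sequence; not flexmark-java code.
final class BasedSequenceSketch implements CharSequence {
    private final String base; // the full original source
    private final int start;   // inclusive offset into base
    private final int end;     // exclusive offset into base

    BasedSequenceSketch(String base, int start, int end) {
        if (start < 0 || end > base.length() || start > end) {
            throw new IndexOutOfBoundsException("[" + start + ", " + end + ")");
        }
        this.base = base;
        this.start = start;
        this.end = end;
    }

    static BasedSequenceSketch of(String source) {
        return new BasedSequenceSketch(source, 0, source.length());
    }

    int getStartOffset() { return start; }

    int getEndOffset() { return end; }

    @Override
    public int length() { return end - start; }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length()) throw new IndexOutOfBoundsException(String.valueOf(index));
        return base.charAt(start + index);
    }

    /** Offsets of the result are still relative to the original source. */
    @Override
    public BasedSequenceSketch subSequence(int from, int to) {
        if (from < 0 || to > length() || from > to) {
            throw new IndexOutOfBoundsException("[" + from + ", " + to + ")");
        }
        return new BasedSequenceSketch(base, start + from, start + to);
    }

    @Override
    public String toString() { return base.substring(start, end); }
}
```

For example, a node spanning `*world*` in `Hello *world*!` could expose the whole span as its characters and a nested sub-sequence for the text between the markers, each still reporting offsets into the source.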
-
Add
PropertyHolder
interface to store document global properties for things like references, abbreviations, footnotes, or anything else that is parsed from the source. -
Add NodeRepository<T> abstract class to make it easier to create collections of nodes indexed by a string, like the one used for references.
-
Add a few new ParserState methods:
-
getPropertyHolder() returns the property holder for the parse session. This is the current document parser. After parsing, the property holder is the Document node, which can be obtained via the Node::getDocument() method. Implementation is to traverse node parents until a Document node is reached.
-
getInlineParser() returns the current parse session's inline processor
-
getLine() and getLineWithEOL() return the current line being parsed, without and with the EOL respectively.
-
getLineEolLength() returns the current line's EOL length, usually 1 but can be 2 if "\r\n" is the current line's EOL sequence.
-
Implement BlockPreProcessor interface to handle pre-processing of blocks, as was done in paragraph blocks to remove reference definitions from the beginning of a paragraph block.
-
AbstractBlockParser::closeBlock() now takes a ParserState argument so that any block can do processing similar to Paragraph processing of leading References by using the BlockPreProcessor::preProcessBlock() method.
-
Add Builder::customInlineParserFactory() method to allow switching of the inline parser.
-
Add Builder::blockPreProcessor() method to allow adding custom processing similar to the Reference processing done previously in ParagraphParser.
-
InlineParserImpl has all previously private fields and methods set to protected so that it can be sub-classed for customizations.
-
Special processing in the document parser for ParagraphParser was removed; it can now be done by each parser, since ParserState is passed to the closeBlock() method.
-
Special processing of references was removed from ParagraphParser::closeBlock(); it is now done by a call to ParserState::preProcessBlock()
-
Special processing in the document parser for ListParser was removed; it is now done in the ListParser so that it can be customized.
-
spec.txt is now ast_spec_txt, with an added section to each example that contains the expected AST so that the generated AST can be validated:
```text
[[*foo* bar]]

[*foo* bar]: /url "title"
.
<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
.
Document[0, 41]
  Paragraph[0, 14]
    Text[0, 1]
    LinkRef[1, 12] textOpen:[0, 0] text:[0, 0] textClose:[0, 0] referenceOpen:[1, 2, "["] reference:[2, 11, "*foo* bar"] referenceClose:[11, 12, "]"]
      Emphasis[2, 7] textOpen:[2, 3, "*"] text:[3, 6] textClose:[6, 7, "*"]
        Text[3, 6]
      Text[7, 11]
    Text[12, 13]
  Reference[15, 40] refOpen:[15, 16, "["] ref:[16, 25, "*foo* bar"] refClose:[25, 27, "]:"] urlOpen:[0, 0] url:[28, 32, "/url"] urlClose:[0, 0] titleOpen:[33, 34, """] title:[34, 39, "title"] titleClose:[39, 40, """]
```
* Convert all extension tests to spec.txt style driven testing to make generating tests easier
and to also test for the generated AST
* Add `FullSpecTestCase` to be sub-classed. It takes the original spec text file and replaces
the expected HTML and AST sections with the actual ones, then compares the full text of the
original file to the generated one. This makes it easy to copy the actual output and do a diff,
or to paste it into the expected document to update the expected values.
* Add `FullSpecTestCase` derived tests to all extensions
* Add `flexmark-ext-abbreviation` to implement abbreviations processing. The extension `create`
function can take an optional `boolean`, which if true will generate HTML abbreviations as
`<a href="#" title="Abbreviation Expansion Text">Abbreviation</a>`, otherwise will generate
`<abbr title="Abbreviation Expansion Text">Abbreviation</abbr>`.
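The two output forms for abbreviations can be illustrated with a small sketch. `AbbreviationHtmlSketch` and its `render` method are hypothetical names used only to show the effect of the boolean; they are not the extension's actual renderer, and the sketch does no HTML escaping.

```java
// Hypothetical helper for illustration; not the extension's actual renderer.
// Values are inserted verbatim, so callers would need to escape them in real use.
class AbbreviationHtmlSketch {
    // useLinks mirrors the optional boolean described above.
    static String render(String abbreviation, String expansion, boolean useLinks) {
        return useLinks
                ? "<a href=\"#\" title=\"" + expansion + "\">" + abbreviation + "</a>"
                : "<abbr title=\"" + expansion + "\">" + abbreviation + "</abbr>";
    }

    public static void main(String[] args) {
        // Link form, produced when the flag is true.
        System.out.println(render("HTML", "Hypertext Markup Language", true));
        // abbr form, produced otherwise.
        System.out.println(render("HTML", "Hypertext Markup Language", false));
    }
}
```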