flexmark-java is a Java implementation of a CommonMark (spec 0.28) parser using the blocks first, inlines after Markdown parsing architecture.
Its strengths are speed, flexibility, a Markdown source element based AST with details of the source position down to individual characters of the lexemes that make up each element, and extensibility.
The API allows granular control of the parsing process and is optimized for parsing with a large number of installed extensions. The parser and extensions come with plenty of options for parser behavior and HTML rendering variations. The end goal is to have the parser and renderer be able to mimic other parsers with a great degree of accuracy. This is now partially complete with the implementation of Markdown Processor Emulation.
Motivation for this project was the need to replace the pegdown parser in my Markdown Navigator plugin for JetBrains IDEs. pegdown has a great feature set, but its speed in general is less than ideal and for pathological input it either hangs or practically hangs during parsing.
Java 8 or above, Java 9+ compatible
Android compatibility to be added
The project is on Maven: com.vladsch.flexmark
The core has no dependencies other than org.jetbrains:annotations:15.0. For extensions, see the extension description below.
The API is still evolving to accommodate new extensions and functionality.
For Maven, add flexmark-all as a dependency, which includes the core and all modules, to the following sample:
<dependency>
    <groupId>com.vladsch.flexmark</groupId>
    <artifactId>flexmark-all</artifactId>
    <version>0.62.2</version>
</dependency>
Source: BasicSample.java
package com.vladsch.flexmark.samples;

import com.vladsch.flexmark.util.ast.Node;
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.data.MutableDataSet;

public class BasicSample {
    public static void main(String[] args) {
        MutableDataSet options = new MutableDataSet();

        // uncomment to set optional extensions
        //options.set(Parser.EXTENSIONS, Arrays.asList(TablesExtension.create(), StrikethroughExtension.create()));

        // uncomment to convert soft-breaks to hard breaks
        //options.set(HtmlRenderer.SOFT_BREAK, "<br />\n");

        Parser parser = Parser.builder(options).build();
        HtmlRenderer renderer = HtmlRenderer.builder(options).build();

        // You can re-use parser and renderer instances
        Node document = parser.parse("This is *Sparta*");
        String html = renderer.render(document);  // "<p>This is <em>Sparta</em></p>\n"
        System.out.println(html);
    }
}
compile 'com.vladsch.flexmark:flexmark-all:0.62.2'
Additional settings due to duplicate files:
packagingOptions {
    exclude 'META-INF/LICENSE-LGPL-2.1.txt'
    exclude 'META-INF/LICENSE-LGPL-3.txt'
    exclude 'META-INF/LICENSE-W3C-TEST'
    exclude 'META-INF/DEPENDENCIES'
}
More information can be found in the documentation:
Wiki Home, Usage Examples, Extension Details, Writing Extensions
The PegdownOptionsAdapter class converts pegdown Extensions.* flags to flexmark options and an extensions list. Pegdown Extensions.java is included for convenience and for new options not found in pegdown 1.6.0. These are located in the flexmark-profile-pegdown module, but you can grab the source from this repo (PegdownOptionsAdapter.java, Extensions.java) and make your own version, modified to your project's needs.
You can pass your extension flags to the static PegdownOptionsAdapter.flexmarkOptions(int), or you can instantiate PegdownOptionsAdapter and use convenience methods to set, add and remove extension flags. PegdownOptionsAdapter.getFlexmarkOptions() will return a fresh copy of DataHolder every time, with the options reflecting the pegdown extension flags.
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.profile.pegdown.Extensions;
import com.vladsch.flexmark.profile.pegdown.PegdownOptionsAdapter;
import com.vladsch.flexmark.util.data.DataHolder;

public class PegdownOptions {
    private static final DataHolder OPTIONS = PegdownOptionsAdapter.flexmarkOptions(
            Extensions.ALL
    );

    static final Parser PARSER = Parser.builder(OPTIONS).build();
    static final HtmlRenderer RENDERER = HtmlRenderer.builder(OPTIONS).build();

    // use the PARSER to parse and RENDERER to render with pegdown compatibility
}
Default flexmark-java pegdown emulation uses less strict HTML block parsing, which interrupts an HTML block on a blank line. Pegdown only interrupts an HTML block on a blank line if all tags in the HTML block are closed.
To get closer to the original pegdown HTML block parsing behavior, use the method which takes a boolean strictHtml argument:
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.profile.pegdown.Extensions;
import com.vladsch.flexmark.profile.pegdown.PegdownOptionsAdapter;
import com.vladsch.flexmark.util.data.DataHolder;

public class PegdownOptions {
    // first argument is strictHtml: true selects the stricter pegdown-like HTML block parsing
    private static final DataHolder OPTIONS = PegdownOptionsAdapter.flexmarkOptions(true,
            Extensions.ALL
    );

    static final Parser PARSER = Parser.builder(OPTIONS).build();
    static final HtmlRenderer RENDERER = HtmlRenderer.builder(OPTIONS).build();

    // use the PARSER to parse and RENDERER to render with pegdown compatibility
}
A sample with a custom link resolver is also available, which includes a link resolver for changing URLs or attributes of links and a custom node renderer if you need to override the generated link HTML.
Major reorganization and code cleanup of implementation in version 0.60.0, see Version-0.60.0-Changes
Flexmark Architecture and Dependencies Diagrams: thanks to great work by Alex Karezin you can get an overview of module dependencies with the ability to drill down to packages and classes.
Merge API to merge multiple markdown documents into a single document.
Extensible HTML to Markdown Converter module: flexmark-html2md-converter. Sample: HtmlToMarkdownCustomizedSample.java
Java9+ module compatibility
Compound Enumerated References: Enumerated References Extension for creating legal numbering for elements and headings.
Macros Extension to allow arbitrary markdown content to be inserted as block or inline elements, allowing block elements to be used where only inline elements are allowed by syntax.
Extensions: Gitlab Flavoured Markdown for parsing and rendering GitLab markdown extensions.
OSGi module courtesy Dan Klco (GitHub @klcodanr)
Extensions: Media Tags, a media link transformer extension courtesy of Cornelia Schultz (GitHub @CorneliaXaos), transforms links using custom prefixes to Audio, Embed, Picture, and Video HTML5 tags.
Translation Helper API to make translating markdown documents easier.
Admonition Extension to create block-styled side content. For complete documentation please see the Admonition Extension, Material for MkDocs documentation.
Enumerated Reference to create enumerated references for figures, tables and other markdown elements.
Attributes Extension to parse attributes of the form {name name=value name='value' name="value" #id .class-name}.
YouTube Embedded Link Transformer, thanks to Vyacheslav N. Boyko (GitHub @bvn13), transforms simple links to youtube videos to embedded video iframe HTML.
Docx Converter Module using the docx4j library. How to use: DocxConverter Sample, how to customize: Customizing Docx Rendering
Development of this module was sponsored by Johner Institut GmbH.
Update library to be CommonMark (spec 0.28) compliant and add ParserEmulationProfile.COMMONMARK_0_27 and ParserEmulationProfile.COMMONMARK_0_28 to allow selecting the options of a specific spec version.
Custom node rendering API with the ability to invoke standard rendering for an overridden node, allowing custom node renderers that only handle special cases and let the rest be rendered as usual. See PegdownCustomLinkResolverOptions.
Gfm Issues and Gfm Users extensions for parsing and rendering #123 and @user-name respectively.
Deep HTML block parsing option for better handling of raw text tags that come after other tags and for pegdown HTML block parsing compatibility.
flexmark-all module that includes: core, all extensions, formatter, JIRA and YouTrack converters, pegdown profile module and HTML to Markdown conversion.
Typographic Extension Module implemented
Markdown Formatter module to output AST as markdown with formatting options.
Table Extension for Markdown Formatter with column width and alignment of markdown tables.
I use flexmark-java as the parser for the Markdown Navigator plugin for JetBrains IDEs. I tend to use the latest, unreleased version to fix bugs or get improvements. So if you find a bug that is a show stopper for your project, or see a bug on the github issues page marked fixed for next release that is affecting your project, then please let me know and I may be able to promptly make a new release to address your issue. Otherwise, I will let bug fixes and enhancements accumulate, thinking no one is affected by what is already fixed.
There are many extension options in the API with their intended use. A good soft start is the flexmark-java-samples module, which has simple samples for the most asked-for extensions. The next best place is the source of an existing extension that has similar syntax to what you want to add.
If your extension lines up with the right API, the task is usually very short and sweet. If your extension uses the API in an unintended fashion or does not follow expected housekeeping protocols, you may find it an uphill battle with a rat's nest of if/else condition handling, where fixing one bug only leads to creating another one.
Generally, if it takes more than a few dozen lines to add a simple extension, then either you are going about it wrong or the API is missing an extension point. If you look at all the implemented extensions you will see that most are a few lines of code other than the boilerplate dictated by the API. That is the goal for this library: provide an extensible core that makes writing extensions a breeze.
The larger extensions are flexmark-ext-tables and flexmark-ext-spec-example; the meat of both is around 200 lines of code. You can use them as a guidepost for estimating the size of your extension.
My own experience adding extensions shows that sometimes a new type of extension is best addressed with an API enhancement to make its implementation seamless, or by fixing a bug that was not visible before the extension stressed the API in just the right way. Your intended extension may just be the one requiring such an approach.
The takeaway is: if you want to implement an extension or a feature please don't hesitate to open an issue and I will give you pointers on the best way to go about it. It may save you a lot of time by letting me improve the API to address your extension's needs before you put a lot of fruitless effort into it.
I do ask that you realize that I am chief cook and bottle washer on this project, without an iota of Vulcan Mind Melding skills. I do ask that you describe what you want to implement because I can't read your mind. Please do some reconnaissance background work around the source code and documentation because I cannot transfer what I know to you without your willing effort.
If you have a commercial application and don't want to write the extension(s) yourself or want to reduce the time and effort of implementing extensions and integrating flexmark-java, feel free to contact me. I am available on a consulting/contracting basis. All about me.
Despite its name, commonmark is neither a superset nor a subset of other markdown flavors. Rather, it proposes a standard, unambiguous syntax specification for the original, "core" Markdown, thus effectively introducing yet another flavor. While flexmark is by default commonmark compliant, its parser can be tweaked in various ways. The sets of tweaks required to emulate the most commonly used markdown parsers around are available in flexmark as ParserEmulationProfiles.
As the name ParserEmulationProfile implies, it's only the parser that is adjusted to the specific markdown flavor. Applying the profile does not add features beyond those available in commonmark. If you want to use flexmark to fully emulate another markdown processor's behavior, you have to adjust the parser and configure the flexmark extensions that provide the additional features available in the parser that you want to emulate.
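As a minimal sketch of applying a profile, the options can be seeded from a ParserEmulationProfile before building the parser and renderer. The class name EmulationProfileSample and the choice of KRAMDOWN are only illustrative, and the sketch assumes flexmark-all is on the classpath. No extensions are configured here, so only the parser behavior is adjusted.

```java
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.parser.ParserEmulationProfile;
import com.vladsch.flexmark.util.ast.Node;
import com.vladsch.flexmark.util.data.MutableDataSet;

// Hypothetical sample class, not part of the flexmark-java samples module
public class EmulationProfileSample {
    public static String render(String markdown) {
        MutableDataSet options = new MutableDataSet();

        // copy the parser tweaks of the chosen family into the options;
        // extensions for that processor still have to be added separately
        options.setFrom(ParserEmulationProfile.KRAMDOWN);

        Parser parser = Parser.builder(options).build();
        HtmlRenderer renderer = HtmlRenderer.builder(options).build();

        Node document = parser.parse(markdown);
        return renderer.render(document);
    }

    public static void main(String[] args) {
        System.out.println(render("This is *Sparta*"));
    }
}
```

To get closer to a specific processor, the same options object would also receive Parser.EXTENSIONS with the relevant extensions, as in the basic sample earlier.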
A rewrite of the list parser to better control emulation of other markdown processors as per Markdown Processors Emulation is complete. Addition of processor presets to emulate specific markdown processing behaviour of these parsers is on a short to-do list.
Some emulation families do a better job of emulating their target than others. Most of the effort was directed at emulating how these processors parse standard Markdown and list related parsing specifically. For processors that extend original Markdown, you will need to add those extensions that are already implemented in flexmark-java to the Parser/Renderer builder options.
Extensions will be modified to include their own presets for specific processor emulation, if that processor has an equivalent extension implemented.
If you find a discrepancy please open an issue so it can be addressed.
Major processor families are implemented and some family members also:
ParserEmulationProfile.COMMONMARK
ParserEmulationProfile.FIXED_INDENT
ParserEmulationProfile.COMMONMARK
ParserEmulationProfile.GITHUB_DOC
ParserEmulationProfile.KRAMDOWN
ParserEmulationProfile.MARKDOWN
ParserEmulationProfile.MULTI_MARKDOWN
PegdownOptionsAdapter in flexmark-profile-pegdown
ParserEmulationProfile.PEGDOWN
ParserEmulationProfile.PEGDOWN_STRICT
flexmark-java is a fork of the commonmark-java project, modified to generate an AST which reflects all the elements in the original source, full source position tracking for all elements in the AST and easier JetBrains Open API PsiTree generation.
The API was changed to allow more granular control of the parsing process and optimized for parsing with a large number of installed extensions. The parser and extensions come with many tweaking options for parser behavior and HTML rendering variations. The end goal is to have the parser and renderer be able to mimic other parsers with a great degree of accuracy.
Motivation for this was the need to replace the pegdown parser in the Markdown Navigator plugin. pegdown has a great feature set, but its speed in general is less than ideal and for pathological input it either hangs or practically hangs during parsing.
commonmark-java has an excellent parsing architecture that is easy to understand and extend. The goal was to ensure that adding source position tracking in the AST would not change the ease of parsing and generating the AST more than absolutely necessary.
Reasons for choosing commonmark-java as the parser are: speed, ease of understanding, ease of extending and speed. More detailed description in Pegdown - Achilles heel of the Markdown Navigator plugin. Now that I have reworked the core and added a few extensions I am extremely satisfied with my choice.
Another goal was to improve the ability of extensions to modify parser behavior so that any dialect of markdown could be implemented through the extension mechanism. An extensible options API was added to allow setting of all options in one place. Parser, renderer and extensions use these options for configuration, including disabling some core block parsers.
This is a work in progress with many API changes. No attempt is made to keep backward API compatibility to the original project and, until the feature set is mostly complete, not even to earlier versions of this project.
Feature | flexmark-java | commonmark-java | pegdown |
---|---|---|---|
Relative parse time (less is better) | | | |
All source elements in the AST | | | |
AST elements with source position | | | |
AST can be easily manipulated | | | |
AST elements have detailed source position for all parts | | | |
Can disable core parsing features | | | |
Core parser implemented via the extension API | | instanceOf tests for specific block parser and node classes | |
Easy to understand and modify parser implementation | | | |
Parsing of block elements is independent from each other | | | |
Uniform configuration across: parser, renderer and all extensions | | | int bit flags for core, none for extensions |
Parsing performance optimized for use with extensions | | | |
Feature rich with many configuration options and extensions out of the box | | | |
Dependency definitions for processors to guarantee the right order of processing | | | |
flexmark-java pathological input of 100,000 [ parses in 68ms, 100,000 ] in 57ms, 100,000 nested [ ] parse in 55ms
commonmark-java pathological input of 100,000 [ parses in 30ms, 100,000 ] in 30ms, 100,000 nested [ ] parse in 43ms
pegdown pathological input of 17 [ parses in 650ms, 18 [ in 1300ms
I am very pleased with the decision to switch to a commonmark-java based parser for my own projects. Even though I had to do major surgery on its innards to get full source position tracking and an AST that matches source elements, it is a pleasure to work with and is now a pleasure to extend. If you don't need a source level element AST or the rest of what flexmark-java added, and CommonMark is your target markdown parser, then I encourage you to use commonmark-java as it is an excellent choice for your needs and its performance does not suffer for the overhead of features that you will not use.
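The pathological inputs quoted in the timing notes above are easy to regenerate. The sketch below is Java 8 compatible and uses only the standard library. The class and method names are hypothetical, and it assumes that "nested" means a run of opening brackets followed by the same number of closing brackets. The timing helper takes any parser call as a Consumer, and the stand-in in main does no parsing at all.

```java
import java.util.function.Consumer;

// Hypothetical helper, not part of flexmark-java
public class PathologicalInputs {
    // builds a string made of `count` copies of `unit`
    public static String repeat(String unit, int count) {
        StringBuilder sb = new StringBuilder(unit.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(unit);
        }
        return sb.toString();
    }

    public static String openBrackets(int count) {
        return repeat("[", count);
    }

    public static String closeBrackets(int count) {
        return repeat("]", count);
    }

    // assumption: "nested" is `count` openers followed by `count` closers
    public static String nestedBrackets(int count) {
        return repeat("[", count) + repeat("]", count);
    }

    // runs the given parser call once and returns elapsed milliseconds
    public static long timeMillis(Consumer<String> parserCall, String input) {
        long start = System.nanoTime();
        parserCall.accept(input);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        String input = openBrackets(100_000);
        // stand-in consumer; replace with e.g. s -> parser.parse(s) for a real measurement
        long elapsed = timeMillis(s -> s.length(), input);
        System.out.println(input.length() + " characters, " + elapsed + "ms");
    }
}
```

A single timed run like this is only a rough check; it does not account for JIT warm-up the way a proper benchmark harness would.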
Latest, Jan 28, 2017: flexmark-java 0.13.1, intellij-markdown from CE EAP 2017, commonmark-java 0.8.0:
File | commonmark-java | flexmark-java | intellij-markdown | pegdown |
---|---|---|---|---|
README-SLOW | 0.420ms | 0.812ms | 2.027ms | 15.483ms |
VERSION | 0.743ms | 1.425ms | 4.057ms | 42.936ms |
commonMarkSpec | 31.025ms | 44.465ms | 600.654ms | 575.131ms |
markdown_example | 8.490ms | 10.502ms | 223.593ms | 983.640ms |
spec | 4.719ms | 6.249ms | 35.883ms | 307.176ms |
table | 0.229ms | 0.623ms | 0.800ms | 3.642ms |
table-format | 1.385ms | 2.881ms | 4.150ms | 23.592ms |
wrap | 3.804ms | 4.589ms | 16.609ms | 86.383ms |
Ratios of above:
File | commonmark-java | flexmark-java | intellij-markdown | pegdown |
---|---|---|---|---|
README-SLOW | 1.00 | 1.93 | 4.83 | 36.88 |
VERSION | 1.00 | 1.92 | 5.46 | 57.78 |
commonMarkSpec | 1.00 | 1.43 | 19.36 | 18.54 |
markdown_example | 1.00 | 1.24 | 26.34 | 115.86 |
spec | 1.00 | 1.32 | 7.60 | 65.09 |
table | 1.00 | 2.72 | 3.49 | 15.90 |
table-format | 1.00 | 2.08 | 3.00 | 17.03 |
wrap | 1.00 | 1.21 | 4.37 | 22.71 |
overall | 1.00 | 1.41 | 17.47 | 40.11 |
File | commonmark-java | flexmark-java | intellij-markdown | pegdown |
---|---|---|---|---|
README-SLOW | 0.52 | 1.00 | 2.50 | 19.07 |
VERSION | 0.52 | 1.00 | 2.85 | 30.12 |
commonMarkSpec | 0.70 | 1.00 | 13.51 | 12.93 |
markdown_example | 0.81 | 1.00 | 21.29 | 93.66 |
spec | 0.76 | 1.00 | 5.74 | 49.15 |
table | 0.37 | 1.00 | 1.28 | 5.85 |
table-format | 0.48 | 1.00 | 1.44 | 8.19 |
wrap | 0.83 | 1.00 | 3.62 | 18.83 |
overall | 0.71 | 1.00 | 12.41 | 28.48 |
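The ratio tables follow from the timing table by simple division: each cell is the parser's time divided by the baseline parser's time for the same file. The "overall" row appears to be the ratio of the summed times rather than an average of the per-file ratios; that is an inference from the numbers, not something the tables state. A small sketch using the commonmark-java and flexmark-java columns (the class name is hypothetical):

```java
import java.util.Locale;

// Hypothetical helper that recomputes two columns of the ratio tables
public class BenchmarkRatios {
    // parse times in ms, in table order: README-SLOW, VERSION, commonMarkSpec,
    // markdown_example, spec, table, table-format, wrap
    public static final double[] COMMONMARK = {0.420, 0.743, 31.025, 8.490, 4.719, 0.229, 1.385, 3.804};
    public static final double[] FLEXMARK = {0.812, 1.425, 44.465, 10.502, 6.249, 0.623, 2.881, 4.589};

    public static double sum(double[] values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // formats a ratio with two decimals, as in the tables
    public static String ratio(double time, double baseline) {
        return String.format(Locale.ROOT, "%.2f", time / baseline);
    }

    public static void main(String[] args) {
        // per-file ratio with commonmark-java as the baseline
        System.out.println("README-SLOW: " + ratio(FLEXMARK[0], COMMONMARK[0]));
        // overall ratios computed from the summed times, in both normalizations
        System.out.println("overall (commonmark-java = 1.00): " + ratio(sum(FLEXMARK), sum(COMMONMARK)));
        System.out.println("overall (flexmark-java = 1.00): " + ratio(sum(COMMONMARK), sum(FLEXMARK)));
    }
}
```

With these inputs the three printed values are 1.93, 1.41 and 0.71, matching the corresponding table cells.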
Because these two files represent the pathological input for pegdown, I no longer run them as part of the benchmark to prevent skewing of the results. The results are here for posterity.
File | commonmark-java | flexmark-java | intellij-markdown | pegdown |
---|---|---|---|---|
hang-pegdown | 0.082ms | 0.326ms | 0.342ms | 659.138ms |
hang-pegdown2 | 0.048ms | 0.235ms | 0.198ms | 1312.944ms |
Ratios of above:
File | commonmark-java | flexmark-java | intellij-markdown | pegdown |
---|---|---|---|---|
hang-pegdown | 1.00 | 3.98 | 4.17 | 8048.38 |
hang-pegdown2 | 1.00 | 4.86 | 4.10 | 27207.32 |
overall | 1.00 | 4.30 | 4.15 | 15151.91 |
File | commonmark-java | flexmark-java | intellij-markdown | pegdown |
---|---|---|---|---|
hang-pegdown | 0.25 | 1.00 | 1.05 | 2024.27 |
hang-pegdown2 | 0.21 | 1.00 | 0.84 | 5594.73 |
overall | 0.23 | 1.00 | 0.96 | 3519.73 |
hang-pegdown: [[[[[[[[[[[[[[[[[ which causes pegdown to go into a hyper-exponential parse time.
hang-pegdown2: [[[[[[[[[[[[[[[[[[ which causes pegdown to go into a hyper-exponential parse time.
Pull requests, issues and comments welcome.
Copyright (c) 2015-2016 Atlassian and others.
Copyright (c) 2016-2020, Vladimir Schneider,
BSD (2-clause) licensed, see LICENSE.txt file.