Python-Markdown Syntax

向安福
2023-12-01

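Python-Markdown's extensions are fairly feature-rich and even include some rST-style directives, which greatly extends the expressive power of a document.

Third-party extensions exist as well.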

Officially Supported Extensions

Extension               “Name”
Extra                   markdown.extensions.extra
    Abbreviations       markdown.extensions.abbr
    Attribute Lists     markdown.extensions.attr_list
    Definition Lists    markdown.extensions.def_list
    Fenced Code Blocks  markdown.extensions.fenced_code
    Footnotes           markdown.extensions.footnotes
    Tables              markdown.extensions.tables
    Smart Strong        markdown.extensions.smart_strong
Admonition              markdown.extensions.admonition
CodeHilite              markdown.extensions.codehilite
HeaderId                markdown.extensions.headerid
Meta-Data               markdown.extensions.meta
New Line to Break       markdown.extensions.nl2br
Sane Lists              markdown.extensions.sane_lists
SmartyPants             markdown.extensions.smarty
Table of Contents       markdown.extensions.toc
WikiLinks               markdown.extensions.wikilinks

Python-Markdown Extra

This is a bundle of Python-Markdown extensions that imitates PHP Markdown Extra.

It includes the following extensions:

Abbreviations
Attribute Lists
Definition Lists
Fenced Code Blocks
Footnotes
Tables
Smart Strong

Abbreviations

Adds the ability to define abbreviations. Each defined abbreviation is wrapped in an <abbr> tag.

The HTML specification 
is maintained by the W3C.

*[HTML]: Hyper Text Markup Language
*[W3C]:  World Wide Web Consortium

is converted to:

<p>The <abbr title="Hyper Text Markup Language">HTML</abbr> specification 
is maintained by the <abbr title="World Wide Web Consortium">W3C</abbr>.</p>

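Thoughts:

Besides abbreviations, this can also be used to attach short hints directly to a word, which saves the trouble of a link, a footnote, or a parenthetical remark.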
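
As a minimal sketch of the conversion above, assuming the `markdown` package is installed, the abbreviation example can be run from Python like this:

```python
import markdown

# The same source text as in the example above.
text = """The HTML specification
is maintained by the W3C.

*[HTML]: Hyper Text Markup Language
*[W3C]: World Wide Web Consortium
"""

# Enable only the abbreviation extension, using its full dotted name.
html = markdown.markdown(text, extensions=["markdown.extensions.abbr"])
print(html)
```

The definition lines themselves do not appear in the output; only the wrapped abbreviations do.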

Attribute Lists

Attribute lists add a syntax for setting attributes on various HTML elements.

Example of the format:

{: #someid .someclass somekey='some value' }

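A word starting with # sets the element's id, a word starting with . is added to the element's list of classes, and a key=value pair is assigned to the element as an attribute.

Note that the . syntax appends to the class list, whereas a key=value pair overrides any value defined earlier: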

{: #id1 .class1 id=id2 class="class2 class3" .class4 }

is converted to:

id="id2" class="class2 class3 class4"

Block-level elements

The attribute list goes on the last line of the block it applies to.

This is a paragraph.
{: #an_id .a_class }

gives:

<p id="an_id" class="a_class">This is a paragraph.</p>

Headers

For headers, the attribute list must be on the same line as the header text:

A setext style header {: #setext}
=================================

### A hash style header ### {: #hash }

is converted to:

<h1 id="setext">A setext style header</h1>
<h3 id="hash">A hash style header</h3>

Inline elements

For inline elements, the attribute list goes immediately after the element, with no whitespace in between:

[link](http://example.com){: class="foo bar" title="Some title!" }

is converted to:

<p><a href="http://example.com" class="foo bar" title="Some title!">link</a></p>

Personal thoughts:

This does not seem very useful to me. It is more about configuring the HTML elements themselves.
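
The override rule described above can be checked with a short sketch. It assumes the `markdown` package is installed; the paragraph text is only a placeholder.

```python
import markdown

# A paragraph followed by an attribute list that sets id and class twice.
text = """This is a paragraph.
{: #id1 .class1 id=id2 class="class2 class3" .class4 }
"""

html = markdown.markdown(text, extensions=["markdown.extensions.attr_list"])
print(html)
```

The later id=id2 replaces #id1, class="..." replaces .class1, and .class4 is appended afterwards. The order of the attributes in the printed tag may vary.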

Definition Lists

This extension makes it possible to create definition lists.

Apple
:   Pomaceous fruit of plants of the genus Malus in 
    the family Rosaceae.

Orange
:   The fruit of an evergreen tree of the genus Citrus.

is converted to:

<dl>
<dt>Apple</dt>
<dd>Pomaceous fruit of plants of the genus Malus in 
the family Rosaceae.</dd>

<dt>Orange</dt>
<dd>The fruit of an evergreen tree of the genus Citrus.</dd>
</dl>

Personal thoughts:

This seems well suited to explaining and defining terms.
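
A minimal sketch of the same conversion from Python, assuming the `markdown` package is installed:

```python
import markdown

# Each term is followed by a line starting with a colon and the definition.
text = """Apple
:   Pomaceous fruit of plants of the genus Malus in
    the family Rosaceae.

Orange
:   The fruit of an evergreen tree of the genus Citrus.
"""

html = markdown.markdown(text, extensions=["markdown.extensions.def_list"])
print(html)
```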

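Fenced Code Blocks

Code blocks delimited by fence characters, which overcome some limitations of indented code blocks.

Fenced code blocks cannot be nested inside other blocks. They are only supported at the outermost level of the document, starting at the left margin.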

This is a paragraph introducing:

~~~~~~~~~~~~~~~~~~~~ (at least three tildes; ideally the same number on both fences)
a one-line code block
~~~~~~~~~~~~~~~~~~~~ (at least three, otherwise the block is not closed)

Rendered:

This is a paragraph introducing:

~
a one-line code block
~

Specifying a language:

~~~~{.python}
# python code
~~~~

~~~~.html
<p>HTML Document</p>
~~~~

is converted to:

<pre><code class="python"># python code
</code></pre>

<pre><code class="html">&lt;p&gt;HTML Document&lt;/p&gt;
</code></pre>

The GitHub-style fence with backticks (```) is also supported and is the recommended form:

```python
# more python code
```

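Personal thoughts:

Backticks (```) are still the more convenient choice.

If the CodeHilite extension is enabled and Pygments is installed, fenced code blocks are highlighted accordingly.

As with the colon syntax of CodeHilite, specific lines can be emphasized.

Use: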

~~~~{.python hl_lines="1 3"}
# This line is emphasized
# This line isn't
# This line is emphasized
~~~~

or:

```python hl_lines="1 3"
# This line is emphasized
# This line isn't
# This line is emphasized
```
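
A minimal sketch of converting a fenced block from Python, assuming the `markdown` package is installed. It uses the tilde form and no highlighting. The exact class attribute on the code tag (for example python versus language-python) depends on the installed version, so the sketch only prints the result.

```python
import markdown

# A paragraph followed by a tilde-fenced block with a language label.
text = """This is a paragraph introducing:

~~~~{.python}
# python code
~~~~
"""

html = markdown.markdown(text, extensions=["markdown.extensions.fenced_code"])
print(html)
```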

Footnotes

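Inside the square brackets only the first caret has special meaning. The rest of the label may contain any characters, including spaces, as long as the reference and the definition match.

Basic usage:

Footnotes[^1] have a label[^@#$%] and the footnote's content.

[^1]: This is a footnote content.
[^@#$%]: A footnote on the label: "@#$%".

A special case: the content of a definition may start on the second line. The indented content is then treated as if it started at the left margin, so Markdown markup inside it is processed. In the example below, the blockquote and the code block are both rendered.

[^1]: 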
    The first paragraph of the definition.

    Paragraph two of the definition.

    > A blockquote with
    > multiple lines.

        a code block

    A final paragraph.

Configuration options:

The following options are provided to configure the output:

  • PLACE_MARKER: A text string used to mark the position where the footnotes are rendered. Defaults to ///Footnotes Go Here///.

    If the place marker text is not found in the document, the footnote definitions are placed at the end of the resulting HTML document.

    In my test, the whole block containing the marker was replaced, including a sentence that followed the marker in the same block.

  • UNIQUE_IDS: Whether to avoid collisions across multiple calls to reset(). Defaults to False.

  • BACKLINK_TEXT: The text string that links from the footnote definition back to the position in the document. Defaults to ↩. This sets the text of the small icon that jumps back to the main text.

Personal thoughts:

These settings look very useful.
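
The options above can be passed through extension_configs. The following is a sketch, assuming the `markdown` package is installed; the backlink text "return" is an arbitrary value chosen for illustration.

```python
import markdown

text = """Footnotes[^1] have a label and the footnote's content.

[^1]: This is a footnote content.
"""

html = markdown.markdown(
    text,
    extensions=["markdown.extensions.footnotes"],
    extension_configs={
        # Replace the default backlink symbol with a plain word.
        "markdown.extensions.footnotes": {"BACKLINK_TEXT": "return"},
    },
)
print(html)
```

Since the text contains no place marker, the footnote block ends up at the end of the output.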

Tables

Example:

First Header  | Second Header
------------- | -------------
Content Cell  | Content Cell
Content Cell  | Content Cell

is converted to:

<table>
  <thead>
    <tr>
      <th>First Header</th>
      <th>Second Header</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Content Cell</td>
      <td>Content Cell</td>
    </tr>
    <tr>
      <td>Content Cell</td>
      <td>Content Cell</td>
    </tr>
  </tbody>
</table>

Special cases (column alignment and inline markup inside cells):

| Item      | Value |
| --------- | -----:|
| Computer  | $1600 |
| Phone     |   $12 |
| Pipe      |    $1 |


| Function name | Description                    |
| ------------- | ------------------------------ |
| `help()`      | Display the help window.       |
| `destroy()`   | **Destroy your computer!**     |
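
A minimal sketch of the table conversion from Python, assuming the `markdown` package is installed. How the right alignment is written in the output (an align attribute or an inline style) depends on the installed version, so the sketch only prints the result.

```python
import markdown

# The colon at the right end of the delimiter row right-aligns the column.
text = """| Item      | Value |
| --------- | -----:|
| Computer  | $1600 |
| Phone     |   $12 |
"""

html = markdown.markdown(text, extensions=["markdown.extensions.tables"])
print(html)
```
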

Smart Strong

Handles double underscores inside words.

Text with double__underscore__words.
__Strong__ still works.
__this__works__too__.

->

<p>Text with double__underscore__words.</p>
<p><strong>Strong</strong> still works.</p>
<p><strong>this__works__too</strong>.</p>

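As shown above, paired double underscores inside a word have no effect. When several double underscores appear, only the outermost pair produces bold text.

Personal thoughts: paired single underscores behave similarly. This feels much like using paired asterisks.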

Admonition

Adds rST-style admonitions.

!!! type "optional explicit title within double quotes"
    Any number of other indented markdown elements.

    This is the second paragraph.

Here type is used as the CSS class name and as the default title. It must be a single word, but it can be any word you like.

!!! note
    You should note that the title will be automatically capitalized.

!!! danger "Don't try this at home"
    ...

!!! important ""
    This is a admonition box without a title.

is converted to:

<div class="admonition note">
<p class="admonition-title">Note</p>
<p>You should note that the title will be automatically capitalized.</p>
</div>

<div class="admonition danger">
<p class="admonition-title">Don't try this at home</p>
<p>...</p>
</div>

<div class="admonition important">
<p>This is a admonition box without a title.</p>
</div>

Personal thoughts:

Very practical.
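
A minimal sketch of the first admonition above, assuming the `markdown` package is installed:

```python
import markdown

# The body of the admonition must be indented by four spaces.
text = """!!! note
    You should note that the title will be automatically capitalized.
"""

html = markdown.markdown(text, extensions=["markdown.extensions.admonition"])
print(html)
```

The extension only emits the div and its classes. Styling the box is left to your own CSS.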

CodeHilite

Uses Pygments to highlight code blocks in Python-Markdown documents.

Input:

    #!/usr/bin/python
    # Code goes here ...

Rendered:

#!/usr/bin/python
# Code goes here ...

Input:

    #!python
    # Code goes here ...

Rendered:

# Code goes here ...

A special case. Input:

    :::python
    # Code goes here ...

Rendered:

# Code goes here ...

Input:

    :::python hl_lines="1 3"
    # This line is emphasized
    # This line isn't
    # This line is emphasized

hl_lines is named for Pygments’ option meaning “highlighted lines”.
When no language is defined, the Pygments highlighting engine will try to guess the language (unless guess_lang is set to False).

The following options are provided to configure the output:

  • linenums: Use line numbers. Possible values are True for yes, False for no and None for auto. Defaults to None.

    Using True will force every code block to have line numbers, even when using colons (:::) for language identification.

    Using False will turn off all line numbers, even when using shebangs (#!) for language identification.

  • guess_lang: Automatic language detection. Defaults to True.

    Using False will prevent Pygments from guessing the language, and thus highlighting blocks only when you explicitly set the language.

  • css_class: Set CSS class name for the wrapper <div> tag. Defaults to codehilite.

  • pygments_style: Pygments HTML Formatter Style (ColorScheme). Defaults to default.

    This is useful only when noclasses is set to True, otherwise the CSS styles must be provided by the end user.

  • noclasses: Use inline styles instead of CSS classes. Defaults to False.

  • use_pygments: Defaults to True. Set to False to disable the use of Pygments. If a language is defined for a code block, it will be assigned to the <code> tag as a class in the manner suggested by the HTML5 spec (alternate output will not be entertained) and might be used by a JavaScript library in the browser to highlight the code block.

Personal thoughts:

This one is mainly about code highlighting.
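
The use_pygments option can be tried without installing Pygments. The following is a sketch, assuming the `markdown` package is installed. With highlighting disabled, the language from the colon line should end up as a class on the code tag, and the colon line itself is dropped.

```python
import markdown

# An indented code block whose first line names the language.
text = """    :::python
    # Code goes here ...
"""

html = markdown.markdown(
    text,
    extensions=["markdown.extensions.codehilite"],
    extension_configs={
        # Skip Pygments and leave highlighting to the browser.
        "markdown.extensions.codehilite": {"use_pygments": False},
    },
)
print(html)
```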

HeaderId

Automatically generates id attributes for headers (the # syntax) and can shift their levels.

The following options are provided to configure the output:

  • level: Base level for headers.

    Default: 1

    The level setting allows you to automatically adjust the header levels to fit within the hierarchy of your HTML templates. For example, suppose the markdown text for a page should not contain any headers higher than level 3 (<h3>). The following will accomplish that:

>>> import markdown
>>> from markdown.extensions.headerid import HeaderIdExtension
>>> text = '''
... #Some Header
... ## Next Level'''
>>> html = markdown.markdown(text, extensions=[HeaderIdExtension(level=3)])
>>> print(html)
<h3 id="some-header">Some Header</h3>
<h4 id="next-level">Next Level</h4>

  • forceid: Force all headers to have an id.

    Default: True

    The forceid setting turns on or off the automatically generated ids for headers that do not have one explicitly defined (using the Attribute List extension).

>>> text = '''
... # Some Header
... # Header with ID # { #foo }'''
>>> html = markdown.markdown(text,
...                          extensions=['markdown.extensions.attr_list',
...                                      HeaderIdExtension(forceid=False)])
>>> print(html)
<h1>Some Header</h1>
<h1 id="foo">Header with ID</h1>

  • separator: Word separator. Character which replaces white space in id.

    Default: -

  • slugify: Callable to generate anchors.

    Default: markdown.extensions.headerid.slugify

    If you would like to use a different algorithm to define the ids, you can pass in a callable which takes two arguments:

    • value: The string to slugify.
    • separator: The Word Separator.

The HeaderId extension also supports the Meta-Data extension. Please see the documentation for that extension for specifics. The supported meta-data keywords are:

header_level
header_forceid

For example:

header_level: 2
header_forceid: Off

# A Header

is converted to:

<h2>A Header</h2>

Meta-Data

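Lets you define metadata for a document.

The metadata consists of a series of keywords and values defined at the beginning of the document, like this: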

Title:   My Document
Summary: A brief description of my document.
Authors: Waylan Limberg
         John Doe
Date:    October 2, 2007
blank-value: 
base_url: http://example.com

This is the first paragraph of the document.

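Keywords are case-insensitive and may consist of letters, numbers, underscores, and dashes, and must end with a colon. The value may be anything, including blank, and follows the colon on the same line.

A value may span multiple lines, but each additional line must be indented by at least four spaces.

The first blank line ends the metadata section, so the first line of the document must not be blank.

YAML-style delimiters can also be used to mark the start and end of the metadata. In that case the first line of the document must be ---, and the metadata ends at a blank line or at a line containing --- or ... (three dots).

All metadata is stripped from the document before any further processing.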
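
The YAML-style delimiters and the multi-line value rule can be sketched as follows, assuming the `markdown` package is installed. The document text is a shortened version of the example above.

```python
import markdown

# Metadata wrapped in --- lines; the second author is a continuation
# line indented by four spaces.
text = """---
Title: My Document
Authors: Waylan Limberg
    John Doe
---

This is the first paragraph of the document.
"""

md = markdown.Markdown(extensions=["markdown.extensions.meta"])
html = md.convert(text)

print(md.Meta)  # keys are lowercased, every value is a list of strings
print(html)     # the metadata lines are not part of the output
```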

The metadata can then be accessed:

>>> import markdown
>>> md = markdown.Markdown(extensions=['markdown.extensions.meta'])
>>> html = md.convert(text)  # text holds the document shown above
>>> # Meta-data has been stripped from output
>>> print(html)
<p>This is the first paragraph of the document.</p>

>>> # View meta-data
>>> print(md.Meta)
{
'title' : ['My Document'],
'summary' : ['A brief description of my document.'],
'authors' : ['Waylan Limberg', 'John Doe'],
'date' : ['October 2, 2007'],
'blank-value' : [''],
'base_url' : ['http://example.com']
}

The following extensions are currently known to work with the Meta-Data extension. The keywords they are known to support are also listed.

  • HeaderId
    • header_level
    • header_forceid
  • WikiLinks
    • wiki_base_url
    • wiki_end_url
    • wiki_html_class

Personal thoughts:

Posts written for Pelican start with exactly this kind of metadata. As far as I remember, Jekyll has something similar.

New-Line-to-Break Extension

Turns every newline into a line break.

Line 1
Line 2

->

<p>Line 1<br />
Line 2</p>

Personal thoughts:

This extension basically saves you the extra Enter keypress you would normally need.
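
A minimal sketch comparing the output with and without the extension, assuming the `markdown` package is installed:

```python
import markdown

text = "Line 1\nLine 2"

# Without the extension the newline stays inside one paragraph.
plain_html = markdown.markdown(text)

# With nl2br the newline becomes a <br /> element.
br_html = markdown.markdown(text, extensions=["markdown.extensions.nl2br"])

print(plain_html)
print(br_html)
```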

Sane Lists

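An extension that changes how ordered and unordered lists are handled.

It does not allow list types to be mixed.

With the default Markdown behavior, the following: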

1. Ordered item 1
2. Ordered item 2

- Unordered item 1
- Unordered item 2

is rendered as:

1. Ordered item 1

2. Ordered item 2

3. Unordered item 1

4.  Unordered item 2

With this extension enabled, the output becomes:

<ol>
  <li>Ordered item 1</li>
  <li>Ordered item 2</li>
</ol>

<ul>
  <li>Unordered item 1</li>
  <li>Unordered item 2</li>
</ul>

Both the ordered and the unordered list are produced, and they do not affect each other.

A list must be preceded by a blank line.
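
The difference can be reproduced with a short sketch, assuming the `markdown` package is installed. It converts the same text once with the default behavior and once with the extension.

```python
import markdown

text = """1. Ordered item 1
2. Ordered item 2

- Unordered item 1
- Unordered item 2
"""

# Default behavior: the unordered items are absorbed into the ordered list.
default_html = markdown.markdown(text)

# With sane_lists: two separate lists, one <ol> and one <ul>.
sane_html = markdown.markdown(text, extensions=["markdown.extensions.sane_lists"])

print(default_html)
print(sane_html)
```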

SmartyPants

This extension converts ASCII quotes, dashes, and ellipses to their HTML entity equivalents.

|  ASCII symbol |   Replacements  |  HTML Entities  |  Substitution Keys  |
|---|---|---|---|
|'  | ‘ ’  |   &lsquo; &rsquo;   |  'left-single-quote', 'right-single-quote'|
|"  | “ ”  |   &ldquo; &rdquo;   |  'left-double-quote', 'right-double-quote'|
|<< >>  | « »   |  &laquo; &raquo;  |   'left-angle-quote', 'right-angle-quote'|
|...  |   … |  &hellip;  |  'ellipsis'|
|-- | – |  &ndash;  |   'ndash'|
|---  |   —  | &mdash;   |  'mdash'|

Note
This extension re-implements the Python SmartyPants library by integrating it into the markdown parser. While this does not provide any additional features, it does offer a few advantages. Notably, it will not try to work on highlighted code blocks (using the CodeHilite Extension) like the third party library has been known to do.

The following options are provided to configure the output:

| Option | Default value | Description |
|---|---|---|
| smart_dashes | True | whether to convert dashes |
| smart_quotes | True | whether to convert straight quotes |
| smart_angled_quotes | False | whether to convert angled quotes |
| smart_ellipses | True | whether to convert ellipses |
| substitutions | {} | overwrite default substitutions |
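
A minimal sketch of the default substitutions, assuming the `markdown` package is installed. The sample sentence is made up for illustration.

```python
import markdown

# Straight double quotes, a double dash, and three dots.
text = '"Hello" -- world...'

html = markdown.markdown(text, extensions=["markdown.extensions.smarty"])
print(html)
```

The output contains HTML entities such as &ldquo; and &hellip; rather than the literal Unicode characters.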

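Personal thoughts:

Note that in Pelican's nest theme, for some unknown reason, the first row of a table is not constrained by the format given in the second row, although it is constrained in a normal preview.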

Table of Contents

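Collects the headers of all levels into a table of contents that gives quick access to each header. Wherever the marker is placed in the document, as long as it sits on a line of its own with blank lines before and after it, it is replaced by the list of all headers in the document.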

[TOC]

# Header 1

## Header 2

->

<div class="toc">
  <ul>
    <li><a href="#header-1">Header 1</a>
      <ul>
        <li><a href="#header-2">Header 2</a></li>
      </ul>
    </li>
  </ul>
</div>
<h1 id="header-1">Header 1</h1>
<h2 id="header-2">Header 2</h2>

The following options are provided to configure the output:

  • marker: Text to find and replace with the Table of Contents. Defaults to [TOC].

    Set to an empty string to disable searching for a marker, which may save some time, especially on long documents.

  • title: Title to insert in the Table of Contents’ <div>. Defaults to None.

  • anchorlink: Set to True to cause all headers to link to themselves. Default is False.

  • permalink: Set to True or a string to generate permanent links at the end of each header. Useful with Sphinx style sheets.

    When set to True the paragraph symbol (¶ or “&para;”) is used as the link text. When set to a string, the provided string is used as the link text.

  • baselevel: Base level for headers. Defaults to 1.

    The baselevel setting allows the header levels to be automatically adjusted to fit within the hierarchy of your HTML templates. For example, suppose the Markdown text for a page should not contain any headers higher than level 3 (<h3>). The following will accomplish that:

>>> import markdown
>>> from markdown.extensions.toc import TocExtension
>>> text = '''
... #Some Header
... ## Next Level'''
>>> html = markdown.markdown(text, extensions=[TocExtension(baselevel=3)])
>>> print(html)
<h3 id="some-header">Some Header</h3>
<h4 id="next-level">Next Level</h4>

  • slugify: Callable to generate anchors.

    Default: markdown.extensions.headerid.slugify

    In order to use a different algorithm to define the id attributes, define and pass in a callable which takes the following two arguments:

    • value: The string to slugify.
    • separator: The Word Separator.

    The callable must return a string appropriate for use in HTML id attributes.

  • separator: Word separator. Character which replaces white space in id. Defaults to “-”.

Personal thoughts:

The permalink setting is interesting: clicking the link scrolls the page so that the header becomes the top line shown.
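
The slugify, separator and baselevel options can be combined as in the following sketch, which assumes the `markdown` package is installed. The function underscore_slugify is a hypothetical helper written for this example, not part of the library.

```python
import re

import markdown
from markdown.extensions.toc import TocExtension


def underscore_slugify(value, separator):
    """Hypothetical slugify: lowercase, drop punctuation, join words."""
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    return re.sub(r"[\s-]+", separator, value)


text = "[TOC]\n\n# Some Header\n\n## Next Level\n"

html = markdown.markdown(
    text,
    extensions=[
        TocExtension(slugify=underscore_slugify, separator="_", baselevel=3),
    ],
)
print(html)
```

With these settings the first header is rendered as an h3 with the id some_header, and the table of contents links to that id.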

WikiLinks

Adds support for WikiLinks: any word or phrase in [[double brackets]] is converted to a link.

[[wikilink]]

The content may consist of any upper- or lower-case letters, numbers, dashes, underscores, and spaces.

[[Bracketed]]
[[Wiki Link]]

->

<a href="/Bracketed/" class="wikilink">Bracketed</a>
<a href="/Wiki_Link/" class="wikilink">Wiki Link</a>

The default behavior is to point each link to the document root of the current domain and close with a trailing slash. Additionally, each link is assigned to the HTML class wikilink.

The following options are provided to change the default behavior:

  • base_url: String to append to beginning of URL.

    Default: ‘/’

  • end_url: String to append to end of URL.

    Default: ‘/’

  • html_class: CSS class. Leave blank for none.

    Default: ‘wikilink’

  • build_url: Callable which formats the URL from its parts.

An example using these options:

Suppose the links should point to the /wiki/ subdirectory and end with .html:

>>> import markdown
>>> from markdown.extensions.wikilinks import WikiLinkExtension
>>> html = markdown.markdown(text,
...     extensions=[WikiLinkExtension(base_url='/wiki/', end_url='.html')]
... )

[[WikiLink]] is then converted to:

<a href="/wiki/WikiLink.html" class="wikilink">WikiLink</a>

The class attribute can also be changed or removed:

>>> html = markdown.markdown(text,
...     extensions=[WikiLinkExtension(html_class='myclass')]
... )

->

<a href="/WikiLink/" class="myclass">WikiLink</a>
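
The build_url option listed above takes a callable that receives the link label, the base URL and the end URL. The following is a sketch, assuming the `markdown` package is installed; lowercase_url is a hypothetical helper written for this example.

```python
import markdown
from markdown.extensions.wikilinks import WikiLinkExtension


def lowercase_url(label, base, end):
    """Hypothetical URL builder: lowercase the label, join words with dashes."""
    slug = "-".join(label.lower().split())
    return "{}{}{}".format(base, slug, end)


html = markdown.markdown(
    "See [[Wiki Link]] for details.",
    extensions=[
        WikiLinkExtension(
            base_url="/wiki/", end_url=".html", build_url=lowercase_url
        ),
    ],
)
print(html)
```

The link text stays "Wiki Link"; only the href is produced by the callable.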

Metadata can be used as well:

The supported meta-data keywords are:

wiki_base_url
wiki_end_url
wiki_html_class

When used, the meta-data will override the settings provided through the extension_configs interface.

This document:

wiki_base_url: http://example.com/
wiki_end_url:  .html
wiki_html_class:

A [[WikiLink]] in the first paragraph.

would result in the following output (notice the blank wiki_html_class):

<p>A <a href="http://example.com/WikiLink.html">WikiLink</a> in the first paragraph.</p>

Personal thoughts:

I do not find this one very useful. The basic link syntax, []( ), is still better.
