Details
-
Change Request
-
Resolution: Persuasive with Modification
-
Very High
-
FHIR Core (FHIR)
-
R4
-
FHIR Infrastructure
-
Datatypes
-
-
Eric Haas/Grahame Grieve: 10-0-1
-
Clarification
-
Compatible, substantive
-
R5
Description
Rationale:
The markdown datatype was intended to use a very conservative version of markdown, consisting of CommonMark with no inline HTML plus GFM tables with no inline HTML, to ensure interoperability and safety. The definition currently refers to the GFM CommonMark page, which shows the table extension but also allows inline HTML, as does CommonMark itself. This definition is incorrect and needs to be updated to reflect the stricter set of “FHIR markdown” rules.
Some Background
See this chat: https://chat.fhir.org/#narrow/stream/179252-IG-creation/topic/HTML.20tags.20information.20message
from MarkDown spec (https://spec.commonmark.org/0.30/#raw-html):
Text between < and > that looks like an HTML tag is parsed as a raw HTML tag and will be rendered in HTML without escaping. Tag and attribute names are not limited to current HTML tags, so custom tags (and even, say, DocBook tags) may be used.
from pandoc’s doco:
Extension: raw_html
Markdown allows you to insert raw HTML (or DocBook) anywhere in a document (except verbatim contexts, where <, >, and & are interpreted literally). (Technically this is not an extension, since standard Markdown allows it, but it has been made an extension so that it can be disabled if desired.)
The raw HTML is passed through unchanged in HTML, S5, Slidy, Slideous, DZSlides, EPUB, Markdown, CommonMark, Emacs Org mode, and Textile output, and suppressed in other formats.
In the CommonMark format, if raw_html is enabled, superscripts, subscripts, strikeouts and small capitals will be represented as HTML. Otherwise, plain-text fallbacks will be used. Note that even if raw_html is disabled, tables will be rendered with HTML syntax if they cannot use pipe syntax.
The Markdown linter rule MD033 - inline HTML (source?: see https://openbase.com/js/markdownlint)
MD033 - Inline HTML Tags: html
Aliases: no-inline-html
Parameters: allowed_elements (array of string; default empty)
This rule is triggered whenever raw HTML is used in a Markdown document:
<h1>Inline HTML heading</h1>
To fix this, use 'pure' Markdown instead of including raw HTML:
Proposal
- change the markdown datatype definition
markdown: A FHIR string (see above) that may contain markdown syntax for optional processing by a markdown presentation engine, in the CommonMark format plus the GFM table extension, with no inline raw HTML. XML type: xs:string. JSON type: string. Regex: \s*(\S|\s)* (a size limit can't be put in the regex because it would be too large)
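As a rough illustration of why the regex alone cannot enforce the proposed rule, the sketch below applies the quoted pattern with Python's `re` module. The function name `matches_markdown_regex` is hypothetical, and Python's regex dialect is assumed to be close enough to the intended one for this purpose. Because the pattern accepts any sequence of characters, a string containing raw HTML still matches, so the "no inline raw HTML" constraint has to be checked separately.

```python
import re

# Regex quoted in the proposed markdown datatype definition.
FHIR_MARKDOWN_REGEX = re.compile(r"\s*(\S|\s)*")


def matches_markdown_regex(value: str) -> bool:
    """Return True if the whole string matches the quoted regex.

    Hypothetical helper for illustration only; it is not part of any
    FHIR library.
    """
    return FHIR_MARKDOWN_REGEX.fullmatch(value) is not None


if __name__ == "__main__":
    # Both plain markdown and markdown with raw HTML pass the regex.
    print(matches_markdown_regex("| a | b |\n|---|---|\n| 1 | 2 |"))
    print(matches_markdown_regex("Some <b>bold</b> text"))
```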
- change the markdown discussion
- About the markdown datatype:
- This specification requires and uses the GFM (GitHub Flavored Markdown) table extension on the CommonMark format (replaces: "the GFM (Github Flavored Markdown) extensions on CommonMark format")
- Markdown content SHALL NOT contain inline raw html (see rule MD033) TODO: get source
- Markdown content SHALL NOT contain Unicode code points below 32, except for U+0009 (horizontal tab), U+000A (line feed) and U+000D (carriage return)
- Systems are not required to have markdown support, so the content of a string should be readable without markdown processing, per markdown philosophy
- Markdown is a string, and subject to the same rules (e.g. length limit)
- Converting an element that has the type string to markdown in a later version of this FHIR specification is not considered a breaking change (neither is adding markdown as a choice to an optional element that already has a choice of data types)
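The two SHALL NOT rules above could be checked along the following lines. This is a minimal sketch, not a conformant CommonMark parser: the function name `find_problems` and the tag-detection regex are assumptions made for illustration, the allowed control characters are read as decimal 9, 10 and 13 (tab, line feed, carriage return), and the raw HTML check is a line-based heuristic that skips fenced code blocks and inline code spans. A real implementation would more likely rely on a CommonMark parser with raw HTML disabled, or on a linter rule such as MD033.

```python
import re

# Assumption: the permitted control characters are decimal 9, 10 and 13.
ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return

# Heuristic: something that looks like an HTML open/close tag or a comment.
TAG_LIKE = re.compile(r"</?[A-Za-z][A-Za-z0-9-]*(\s[^<>]*)?/?>|<!--")

# Start or end of a fenced code block (backticks or tildes).
FENCE = re.compile(r"^(```|~~~)")

# Inline code span delimited by single backticks (simplified).
CODE_SPAN = re.compile(r"`[^`]*`")


def find_problems(text: str) -> list[str]:
    """Return human-readable findings for the two SHALL NOT rules."""
    problems = []

    # Rule: no code points below 32 other than the allowed three.
    for offset, ch in enumerate(text):
        if ord(ch) < 32 and ord(ch) not in ALLOWED_CONTROL:
            problems.append(
                f"control character U+{ord(ch):04X} at offset {offset}"
            )

    # Rule: no inline raw HTML (heuristic, outside code only).
    in_fence = False
    for lineno, line in enumerate(text.split("\n"), start=1):
        if FENCE.match(line.lstrip()):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        without_code = CODE_SPAN.sub("", line)
        if TAG_LIKE.search(without_code):
            problems.append(f"possible raw HTML on line {lineno}")

    return problems


if __name__ == "__main__":
    for finding in find_problems("Hello <b>world</b>\x07"):
        print(finding)
```

Autolinks such as `<https://example.com>` and comparisons such as `a < b` are not flagged by this pattern, but edge cases (indented code blocks, multi-backtick code spans, tags split across lines) are not handled.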