  1. FHIR Specification Feedback
  2. FHIR-38714

Correct markdown datatype definition

Details

    • Change Request
    • Resolution: Persuasive with Modification
    • Very High
    • FHIR Core (FHIR)
    • R4
    • FHIR Infrastructure
    • Datatypes

      About the markdown datatype:

      • Markdown is a string, and subject to the same rules (e.g. length limit)
      • This specification requires and uses the GFM (GitHub Flavored Markdown) extensions on CommonMark format, with the exception of support for inline HTML.
      • This specification requires and uses the GFM (GitHub Flavored Markdown) table extension on the CommonMark format
      • Processors SHALL treat embedded XML tags as string content, not as tags. This may be done by pre-processing and escaping any "<" characters preceding character content ("<" becomes "\<") before processing the content, or by using a markdown processor-specific flag that accomplishes the same effect. This means that HTML content cannot be embedded in markdown to influence rendering.
      • Systems are not required to have markdown support, so the content of a string should be readable without markdown processing, per markdown philosophy
      • Converting an element that has the type string to markdown in a later version of this FHIR specification is not considered a breaking change (neither is adding markdown as a choice to an optional element that already has a choice of data types)
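The pre-processing option described in the list above could look roughly like the following. This is a minimal sketch, not the publisher's implementation: the function name `escape_embedded_tags` is hypothetical, and it assumes that "preceding character content" means a "<" immediately followed by a non-whitespace character.

```python
import re

# Assumption: a "<" counts as "preceding character content" when the next
# character is not whitespace. A "<" already preceded by a backslash is
# treated as escaped and left alone.
_UNESCAPED_LT = re.compile(r"(?<!\\)<(?=\S)")


def escape_embedded_tags(markdown_text: str) -> str:
    """Return the text with each qualifying "<" rewritten as "\\<"."""
    # A function is used as the replacement so the backslash is inserted
    # literally, without regex replacement-template processing.
    return _UNESCAPED_LT.sub(lambda match: "\\<", markdown_text)


if __name__ == "__main__":
    print(escape_embedded_tags("Use <b>bold</b> when 1 < 2"))
```

The sketch does not special-case code spans, autolinks such as `<https://example.org>`, or a "<" that follows an escaped backslash; a real processor would need to decide how to treat those.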

      In addition, the publisher will be updated to emit a warning when it finds non-escaped XML tags, indicating that they will be escaped.
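A check of this kind might be sketched as follows. The function name `find_unescaped_tags` and the tag pattern are assumptions made for illustration; the pattern is only a rough approximation of an XML or HTML tag and is not taken from the publisher.

```python
import re

# Rough approximation of an opening, closing, or self-closing tag that is
# not preceded by a backslash. Hypothetical pattern, for illustration only.
_TAG = re.compile(r"(?<!\\)</?[A-Za-z][A-Za-z0-9:-]*(?:\s[^<>]*)?/?>")


def find_unescaped_tags(markdown_text: str) -> list[tuple[int, str]]:
    """Return (line number, tag text) for each tag-like span found."""
    findings = []
    for line_no, line in enumerate(markdown_text.splitlines(), start=1):
        for match in _TAG.finditer(line):
            findings.append((line_no, match.group(0)))
    return findings


if __name__ == "__main__":
    sample = "First line\nA <br/> break and an escaped \\<b> tag"
    for line_no, tag in find_unescaped_tags(sample):
        print(f"warning: line {line_no}: {tag} will be escaped")
```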

      A Task will be created to review all current 'string' elements and verify that none of them are used in places where multiple paragraphs' worth of content is potentially appropriate.

    • Eric Haas/Grahame Grieve: 10-0-1
    • Clarification
    • Compatible, substantive
    • R5

    Description

      Rationale:

      The markdown datatype was intended to use a very conservative version of markdown, consisting of CommonMark with no inline HTML plus GFM tables with no inline HTML, to ensure interoperability and safety. The definition currently refers to the GFM CommonMark page, which shows the table extension but also allows inline HTML, as does CommonMark itself. This definition is incorrect and needs to be updated to reflect the stricter set of “FHIR markdown” rules.

      Some Background

      See this chat: https://chat.fhir.org/#narrow/stream/179252-IG-creation/topic/HTML.20tags.20information.20message

      From the CommonMark spec (https://spec.commonmark.org/0.30/#raw-html):

      Text between < and > that looks like an HTML tag is parsed as a raw HTML tag and will be rendered in HTML without escaping. Tag and attribute names are not limited to current HTML tags, so custom tags (and even, say, DocBook tags) may be used.

      From pandoc’s documentation:

      Extension: raw_html

      Markdown allows you to insert raw HTML (or DocBook) anywhere in a document (except verbatim contexts, where <, >, and & are interpreted literally). (Technically this is not an extension, since standard Markdown allows it, but it has been made an extension so that it can be disabled if desired.)

      The raw HTML is passed through unchanged in HTML, S5, Slidy, Slideous, DZSlides, EPUB, Markdown, CommonMark, Emacs Org mode, and Textile output, and suppressed in other formats.

      In the CommonMark format, if raw_html is enabled, superscripts, subscripts, strikeouts and small capitals will be represented as HTML. Otherwise, plain-text fallbacks will be used. Note that even if raw_html is disabled, tables will be rendered with HTML syntax if they cannot use pipe syntax.

      The Markdown linter rule MD033 - inline HTML (source?: see https://openbase.com/js/markdownlint)

      MD033 - Inline HTML Tags: html

      Aliases: no-inline-html

      Parameters: allowed_elements (array of string; default empty)

      This rule is triggered whenever raw HTML is used in a Markdown document:

      <h1>Inline HTML heading</h1>

      To fix this, use 'pure' Markdown instead of including raw HTML:

      # Markdown heading

      Proposal

      1. Change the markdown datatype definition

      markdown: A FHIR string (see above) that may contain markdown syntax for optional processing by a markdown presentation engine, in the CommonMark format plus the GFM CommonMark table extension with no inline raw HTML.
      xs:string; JSON string; Regex: \s*(\S|\s)* (the size limit can’t be put in the regex because it is too large)

      2. Change the markdown discussion
        • About the markdown datatype:
          • This specification requires and uses the GFM (GitHub Flavored Markdown) extensions on CommonMark format
          • This specification requires and uses the GFM (GitHub Flavored Markdown) table extension on the CommonMark format
          • Markdown content SHALL NOT contain inline raw HTML (see rule MD033) TODO: get source
          • Markdown content SHALL NOT contain Unicode code points below 32, except for U+0009 (horizontal tab), U+000A (line feed) and U+000D (carriage return)
          • Systems are not required to have markdown support, so the content of a string should be readable without markdown processing, per markdown philosophy
          • Markdown is a string, and subject to the same rules (e.g. length limit)
          • Converting an element that has the type string to markdown in a later version of this FHIR specification is not considered a breaking change (neither is adding markdown as a choice to an optional element that already has a choice of data types)
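The code point rule in the list above lends itself to a simple check. The following is a minimal sketch under the assumption that the three permitted characters are horizontal tab, line feed and carriage return; the function name `find_disallowed_control_chars` is hypothetical.

```python
# Code points below 32 that the proposal permits: horizontal tab,
# line feed, and carriage return.
ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}


def find_disallowed_control_chars(text: str) -> list[tuple[int, int]]:
    """Return (index, code point) for each disallowed character below 32."""
    return [
        (index, ord(char))
        for index, char in enumerate(text)
        if ord(char) < 32 and ord(char) not in ALLOWED_CONTROL
    ]


if __name__ == "__main__":
    for index, code_point in find_disallowed_control_chars("ok\x00 not ok\x1f"):
        print(f"disallowed U+{code_point:04X} at index {index}")
```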

       
       

          People

            Unassigned
            ehaas Eric Haas