  1. FHIR Specification Feedback
  2. FHIR-4026

XML resource parsing with contained resource(s) fails when XML formatting is compressed (no white space characters)


    • Type: Change Request
    • Resolution: Persuasive
    • Priority: Medium
    • FHIR Core (FHIR)
    • DSTU1 [deprecated]
    • FHIR Infrastructure
    • Bundle
    • Correction
    • Non-substantive
    • DSTU1 [deprecated]

      Problem Statement

      Tool: FHIR Java Reference Implementation
      Attachment: bundle-example-message.xml - compressed XML format (no white space characters)

      While attempting to parse a Bundle resource with multiple entries, each with a contained resource, the Bundle parsed correctly when the XML was pretty-printed. However, when the Bundle XML was compressed (i.e. all white space characters removed), the XmlParser threw the following java.lang.Exception:

      java.lang.Exception: Unknown Content entry @ START_TAG seen ...</resource></entry><entry>... @1:1300
      at org.hl7.fhir.instance.formats.XmlParserBase.unknownContent(XmlParserBase.java:268)
      at org.hl7.fhir.instance.formats.XmlParser.parseBundleBundleEntryComponent(XmlParser.java:1386)
      at org.hl7.fhir.instance.formats.XmlParser.parseBundle(XmlParser.java:1338)
      at org.hl7.fhir.instance.formats.XmlParser.parseResource(XmlParser.java:9664)
      at org.hl7.fhir.instance.formats.XmlParserBase.parse(XmlParserBase.java:94)
      at org.hl7.fhir.instance.formats.XmlParserBase.parse(XmlParserBase.java:82)
      at org.hl7.fhir.instance.test.ToolsHelper.executeCanonicalXml(ToolsHelper.java:303)
      at org.hl7.fhir.instance.test.ToolsHelper.main(ToolsHelper.java:74)

      Additional testing revealed that this same exception is thrown when parsing any resource with contained resource(s) when the XML is formatted in this compressed manner.

      Corrective Action

      The org.hl7.fhir.instance.formats.XmlParserBase class contains two methods for parsing contained resources:

      parseDomainResourceContained(XmlPullParser xpp)
      parseResourceContained(XmlPullParser xpp)

      Each of these methods was found to exhibit the same issue: when the XML did not contain any white space characters, the XmlPullParser's current tag was not set properly after the parsed DomainResource or Resource was returned. The tag position returned from the parse was one tag further ahead than expected. As a result, the subsequent second call to next(xpp) advanced one tag past where it should have been and caused the entire Bundle parse to fail.
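
      The position shift described above can be illustrated with a small, self-contained sketch. It is not the reference implementation's code: it uses the JDK's built-in StAX reader (javax.xml.stream) as a stand-in for XmlPullParser, and the class name WhitespaceEvents and the sample XML are made up for illustration. It lists the events a pull-style parser reports for the same fragment in compact and pretty-printed form.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only (StAX stands in for XmlPullParser).
class WhitespaceEvents {

    // Returns the parser events for the given XML in document order.
    // Whitespace-only text is reported as "WS", other text as "TEXT".
    static List<String> events(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Merge adjacent character data so each text run is one event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader r = factory.createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                switch (r.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        out.add("<" + r.getLocalName() + ">");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        out.add("</" + r.getLocalName() + ">");
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.SPACE:
                        out.add(r.isWhiteSpace() ? "WS" : "TEXT");
                        break;
                    default:
                        break; // ignore END_DOCUMENT, comments, etc.
                }
            }
        } catch (XMLStreamException ex) {
            throw new IllegalArgumentException("Malformed XML", ex);
        }
        return out;
    }

    public static void main(String[] args) {
        String compact = "<entry><resource><Patient/></resource></entry>";
        String pretty = "<entry>\n  <resource>\n    <Patient/>\n  </resource>\n</entry>";
        System.out.println("compact: " + events(compact));
        System.out.println("pretty:  " + events(pretty));
    }
}
```

      In the pretty-printed input, whitespace-only text events sit between the tags; in the compact input they are absent. Code that advances a fixed number of events after a nested parse therefore lands on a different tag depending on the formatting, which is consistent with the failure reported here.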

      I implemented the following code change in my local environment in both of the above methods (shown here for the parseResourceContained method only). The doSecondNext boolean is set from the XmlPullParser's current text immediately after the parsed DomainResource or Resource is returned: the second advance is made only when that text is non-null and does not contain the </resource> end tag.

      Code example

      protected Resource parseResourceContained(XmlPullParser xpp) throws Exception {
        next(xpp);
        int eventType = nextNoWhitespace(xpp);
        if (eventType == XmlPullParser.START_TAG) {
          Resource r = (Resource) parseResource(xpp);
          // Advance a second time only when the current text is non-null and
          // does not contain "resource" (i.e. the </resource> end tag).
          String text = xpp.getText();
          boolean doSecondNext = text != null && text.indexOf("resource") < 0;
          next(xpp);
          if (doSecondNext) {
            xpp.next();
          }
          return r;
        } else {
          unknownContent(xpp);
          return null;
        }
      }
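
      As a possible alternative to inspecting the parser's current text, the advance after a contained resource could be made independent of white space by moving to a named end tag instead of counting calls to next(). The sketch below shows that idea only; it is not the change applied to the reference implementation. It again uses the JDK's StAX reader as a stand-in for XmlPullParser, the class and method names are hypothetical, and skipping to the contained element's end tag stands in for real resource parsing (it assumes the contained element has no nested element of the same name).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, for illustration only (StAX stands in for XmlPullParser).
class ContainedResourceSketch {

    // Moves the reader forward until it rests on the end tag </name>.
    static void advanceToEnd(XMLStreamReader r, String name) throws XMLStreamException {
        while (!(r.getEventType() == XMLStreamConstants.END_ELEMENT
                && name.equals(r.getLocalName()))) {
            if (!r.hasNext()) {
                throw new XMLStreamException("Missing </" + name + ">");
            }
            r.next();
        }
    }

    // Expects the reader on <resource>. Returns the contained element's name
    // and leaves the reader on </resource>, with or without white space.
    static String readContained(XMLStreamReader r) throws XMLStreamException {
        r.nextTag();                     // skips white space, lands on the next tag
        String contained = r.getLocalName();
        advanceToEnd(r, contained);      // stand-in for parsing the resource
        advanceToEnd(r, "resource");     // position by tag name, not by call count
        return contained;
    }

    // Collects the names of all contained resources in document order.
    static List<String> containedNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && "resource".equals(r.getLocalName())) {
                    names.add(readContained(r));
                }
            }
        } catch (XMLStreamException ex) {
            throw new IllegalArgumentException("Malformed XML", ex);
        }
        return names;
    }

    public static void main(String[] args) {
        String compact = "<feed><entry><resource><Patient/></resource></entry>"
                + "<entry><resource><Observation/></resource></entry></feed>";
        String pretty = "<feed>\n <entry>\n  <resource>\n   <Patient/>\n  </resource>\n </entry>\n"
                + " <entry>\n  <resource>\n   <Observation/>\n  </resource>\n </entry>\n</feed>";
        System.out.println(containedNames(compact));
        System.out.println(containedNames(pretty));
    }
}
```

      Because the reader is positioned by tag name, both inputs should be traversed the same way regardless of formatting. Whether an equivalent approach suits the generated XmlParser code is a separate question for the maintainers.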

            Assignee: Unassigned
            Reporter: Richard Ettema (richard.ettema)