EuroMISE centrum – Cardio, co-holder organisation University of Economics, Prague

XML Knowledge Block Transformation (XKBT)

Version 2.0

Author: Marek Růžička (ruza.m@volny.cz)


1.12.2002

Abstract

This document defines the syntax and semantics of the XKBT language (XML Knowledge Block Transformation), which is used for the directed transformation of one XML document into another.

XKBT was created as part of the methodology of a step-by-step approach to text document formalization. It was designed to fulfill all requirements of the Stepper system, which is based on that methodology.

XKBT explicitly expresses all possible transformations of a knowledge block from one level (one XML document) to another.

Changes from previous version (1.0)

New features:


1.  Introduction

During the development of the step-by-step methodology and its support tool Stepper, we gradually found that we could not continue until we had explicitly specified all possible transformations of individual knowledge blocks/types between the source and destination documents. We therefore tried to work out the necessary system of general transformation rules (XKBT).

A theoretical overview and the possibilities for practical use of XKBT are given here (Czech only).

This document is devoted primarily to the syntax and semantics of XKBT. XKBT defines four main rule types: 1:1 relation, decomposition, aggregation and death-end rule. Each rule identifies a source and a destination set of knowledge blocks and describes the way of transformation.

One could ask why XKBT is needed when XSLT already exists. First of all, XKBT supports directed transformations: the user can influence the processing of a rule at certain points and therefore the result of the transformation. XKBT also allows several suitable rules to be defined for the same set of source knowledge blocks (elements); the user then simply picks one of the offered rules. Finally, we hope that XKBT is much easier to understand than XSLT.

2.  General record of XKBT document

The following declaration shows how the root element of an XKBT document is defined.

<!ELEMENT xkbt (comment?, (one-to-one | decomposition | aggregation | death-end)*)>
<!ATTLIST xkbt author CDATA #IMPLIED >
<!ATTLIST xkbt date CDATA #IMPLIED >

Elements:

xkbt – root element; it contains an optional comment element (a comment on the whole file) and an arbitrary number of rule elements.

Attributes:

author – optional attribute for the author's name

date – optional attribute for the date
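
As an illustration of the declaration above, a minimal XKBT file might look like the following sketch. The author, the date format and the rule names are placeholders, and the comment element is assumed to hold plain text, since its content model is not given in this document. The rule elements used here are described in section 4.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative skeleton only; all names and values are placeholders -->
<xkbt author="J. Doe" date="1.12.2002">
      <!-- Assumption: comment carries free text -->
      <comment>Rules for the transition from level 1 to level 2</comment>
      <one-to-one name="keep_hypothesis" type="direct">
            <source element="hypothesis" />
            <dest element="hypothesis" />
      </one-to-one>
      <death-end name="drop_other">
            <source element="other" />
      </death-end>
</xkbt>
```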

3.  Identification of knowledge blocks/elements and their sets

In general, each rule has to identify all source blocks that will be transformed into destination blocks. We therefore need a way to identify those blocks in the relevant DTDs.

Definition of elementary knowledge block:

<!ELEMENT source (cond*, exception*, applyXSLT?) >
<!ATTLIST source element CDATA #REQUIRED >

<!ELEMENT dest (applyXSLT?) >
<!ATTLIST dest element CDATA #REQUIRED >

Elements:

source – identifies the source element in the DTD of the source document

dest – identifies the destination element in the DTD of the destination document

Attributes:

element – name of the element in the DTD file

Note – all other declared sub-elements are described later.

In some situations we need to identify not just a single knowledge block but a compound set of knowledge blocks. Such a set is specified with the element compound-source (or compound-dest).

Definition of compounding elements:

<!ELEMENT compound-source (source | compound-source)*>
<!ATTLIST compound-source type (iteration | selection | sequence) "selection">
<!ATTLIST compound-source minOccures CDATA #IMPLIED>
<!ATTLIST compound-source maxOccures CDATA #IMPLIED>
<!ATTLIST compound-source occurs CDATA #IMPLIED>
<!ATTLIST compound-source order (fixed | free) "fixed">
<!ATTLIST compound-source conditions (all | any | none) "all">

<!ELEMENT compound-dest (dest | compound-dest)*>
<!ATTLIST compound-dest type (iteration | selection | sequence) "selection">
<!ATTLIST compound-dest minOccures CDATA #IMPLIED>
<!ATTLIST compound-dest maxOccures CDATA #IMPLIED>
<!ATTLIST compound-dest occurs CDATA #IMPLIED>
<!ATTLIST compound-dest order (fixed | free) "fixed">

Elements:

compound-source – source set of knowledge blocks compounded from elementary blocks and other sets

compound-dest – destination set of knowledge blocks, similar to compound-source

Attributes:

type – one of three allowed types: iteration, selection and sequence

minOccures – used only for iteration; minimal count of the iterated block

maxOccures – used only for iteration; maximal count of the iterated block

occurs – used only for iteration; exact count of the iterated block

order – used only for sequence. The value fixed means that the order of blocks within the sequence is strictly given; the value free allows any order of blocks.

conditions – this attribute is explained later, in the section on conditions
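
The following sketch shows how the three compounding types might be nested on the source side. The element names symptom and measurement are hypothetical, the exact-count attribute is spelled occurs as in the ATTLIST declaration, and reading selection as a choice of one of the listed blocks is an assumption rather than something this document states.

```xml
<!-- Illustrative only: a fixed-order sequence of three parts -->
<compound-source type="sequence" order="fixed">
      <source element="symptom" />
      <!-- exactly two measurement blocks -->
      <compound-source type="iteration" occurs="2">
            <source element="measurement" />
      </compound-source>
      <!-- assumed meaning: either diagnose or hypothesis -->
      <compound-source type="selection">
            <source element="diagnose" />
            <source element="hypothesis" />
      </compound-source>
</compound-source>
```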

4.  Rule types

4.1.  1:1 relation

This rule type transforms one source block into exactly one destination block. It is used especially when the inner structure of a knowledge block is being refined.

Definition of 1:1 relation rule:

<!ELEMENT one-to-one (comment?, source, dest*) >
<!ATTLIST one-to-one name CDATA #REQUIRED >
<!ATTLIST one-to-one type (direct | selection) "direct" >

Elements:

one-to-one – this rule type transforms the source block identified in the source element into the destination block identified in dest.

Attributes:

name – name (identification) of the rule

type – sub-type of the rule. The value direct should be used when only one dest element is defined. Otherwise the value selection should be used; it means that, when the rule is applied, one of the offered dest elements has to be selected and the transformation is directed to that dest element. Examples 1 and 2 show the 1:1 rule in both variants.

Example 1 - 1:1 relation - direct relation

<one-to-one name="example_1" type="direct">
      <source element="hypothesis" />
      <dest element="hypothesis" />
</one-to-one>

Example 2 - 1:1 relation - selection of destination element

<one-to-one name="example_2" type="selection">
      <source element="diagnose" />
      <dest element="treatment" />
      <dest element="disease" />
      <dest element="other" />
</one-to-one>

4.2.  Decomposition rules

Decomposition rules have an elementary knowledge block on the source side and a compound set of blocks on the destination side. This rule type is written in the element decomposition.

Definition of decomposition rule:

<!ELEMENT decomposition (comment?, source, compound-dest) >
<!ATTLIST decomposition name CDATA #REQUIRED >

Elements:

decomposition – similar to the one-to-one element; the only difference is that a compound set of blocks has to be specified on the destination side of the rule

Attributes:

name – name (identification) of rule

An essential part of a decomposition rule is the compound set of destination elements. The next example shows all three ways of compounding (selection, sequence, iteration). With these three operations almost any set of blocks can be described.

Example 3 – Decomposition rule – destination set

<decomposition name="example_3 - simple decomposition">
      <source element="drug treatment" />
      <compound-dest type="iteration" minOccures="1" maxOccures="5">
            <dest element="drug class" />
            <compound-dest type="sequence" order="fixed">
                  <dest element="treatment_start" />
                  <dest element="treatment_duration" />
            </compound-dest>
            <compound-dest type="selection">
                  <dest element="comment" />
            </compound-dest>
      </compound-dest>
</decomposition>

4.3.  Aggregation rules

The next group of rules is aggregation rules. In contrast to decomposition rules, this type has a single elementary block on the destination side and a set of blocks on the source side.

Aggregation rules are divided into three categories: clustering, integration and plain text transformation.

Definition of aggregation rule:

<!ELEMENT aggregation (comment?, compound-source?, dest) >
<!ATTLIST aggregation name CDATA #REQUIRED >
<!ATTLIST aggregation type (cluster | integration | text) "cluster" >

Elements:

aggregation – this rule transforms the source set of blocks defined in the compound-source element into a single destination element identified in the dest sub-element.

Attributes:

name – name (identification) of rule

type – sub-type of aggregation rule (cluster – clustering of blocks, integration – integration of blocks, text – transformation of plain text to primary knowledge block).

The first variant of aggregation (cluster) takes all source blocks and copies them as sub-elements into the destination block. The substructure and attribute values of all source blocks remain unchanged.

Example 4 - Aggregation rule - clustering of element

<aggregation name="example_4" type="cluster">
      <compound-source type="sequence" order="free">
            <source element="diagnose" />
            <source element="treatment" />
            <compound-source type="iteration" minOccures="0">
                  <source element="other" />
            </compound-source>
      </compound-source>
      <dest element="conclusion" />
</aggregation>

The second variant (integration) is used for the actual integration of source blocks. The aggregating element can have a completely different substructure, independent of the source blocks.
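
An integration rule could be written as in the following sketch, which reuses the element names of Example 4. Only the notation is shown; how the substructure of the destination element is filled from the source blocks is not specified here, and the rule name is hypothetical.

```xml
<!-- Illustrative only: same source set as Example 4, but integrated instead of clustered -->
<aggregation name="integrate_findings" type="integration">
      <compound-source type="sequence" order="free">
            <source element="diagnose" />
            <source element="treatment" />
      </compound-source>
      <dest element="conclusion" />
</aggregation>
```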

An alternative use of aggregation rules is to mark up a fragment of plain text (in the source document, in XHTML format) and turn it into a knowledge block (an XML element at the first level). This is the way new elementary knowledge blocks are created, and it can be used only during the first step. In this case the sub-element compound-source is omitted.

Example 5 - Aggregation rule - Markup of element in source XHTML document

<aggregation name="example 5 - knowledge block goal" type="text">
      <dest element="goal" />
</aggregation>

4.4.  Death-end rules

Death-end rules are used for knowledge blocks that seem to be useless for further formalization. After a death-end rule is applied to a knowledge block, the block is excluded from the destination document.

Definition of death-end rule:

<!ELEMENT death-end (comment?, (source | compound-source)) >
<!ATTLIST death-end name CDATA #REQUIRED >

The notation of this rule is similar to that of the rule types described above.

Example 6 - Death end rule 

<death-end name="example 6 - death-end ">
      <source element="other" />
</death-end>
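
Because the declaration also allows compound-source, a death-end rule can name a set of blocks instead of a single one. The following is a sketch under that reading; the element name remark and the rule name are hypothetical.

```xml
<!-- Illustrative only: exclude a block if it is either "other" or "remark" -->
<death-end name="drop_side_notes">
      <compound-source type="selection">
            <source element="other" />
            <source element="remark" />
      </compound-source>
</death-end>
```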

5.  Copying of knowledge block content

Copying of knowledge block content is defined in the element exception. An exception is always a sub-element of a source element, and all information given in it relates only to that source element.

<!ELEMENT exception (attrException*) >
<!ATTLIST exception contendCopy (true | false) "true" >
<!ATTLIST exception allAttrCopy (true | false) "true" >

<!ELEMENT attrException EMPTY >
<!ATTLIST attrException attribute CDATA #REQUIRED >
<!ATTLIST attrException copy (true | false) "false" >
<!ATTLIST attrException copyTo CDATA #IMPLIED >

Elements:

exception – declares copying of all attributes and of the substructure

attrException – sub-element of the exception element; declares copying of one individual attribute

Attributes:

contendCopy – copying of the substructure; if set to true, all compatible content of the source element is copied to the destination element (compatible content means those sub-elements that are defined in the same way in both the source and the destination DTD)

allAttrCopy – copying of all attributes; if set to true, all compatible attributes of the source element are copied to the destination element

attribute – identification of one individual attribute

copy – if set to true, the value of the attribute is copied to the destination element

copyTo – used when an attribute should be copied into an attribute with a different name (the value of copyTo is the name of the destination attribute)

Example 7 shows copying of attributes. In this example all compatible attributes of the element diagnose are copied to the destination element and, in addition, the attribute name is copied to the attribute ID. Only the attribute reference is not copied.

Example 7 - Copying of knowledge block content

<source element="diagnose">
      <exception allAttrCopy="true">
            <attrException attribute="name" copy="true" copyTo="ID" />
            <attrException attribute="reference" copy="false" />
      </exception>
</source>
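
The opposite case, where almost nothing is copied, can be sketched as follows. Both general switches are turned off and a single attribute is copied under a new name. The rule name is hypothetical, and copy="true" is written out explicitly because the declared default of copy is false.

```xml
<!-- Illustrative only: copy neither content nor attributes, except name, which goes to ID -->
<one-to-one name="rename_only" type="direct">
      <source element="diagnose">
            <exception contendCopy="false" allAttrCopy="false">
                  <attrException attribute="name" copy="true" copyTo="ID" />
            </exception>
      </source>
      <dest element="disease" />
</one-to-one>
```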

6.  Conditioned transformation rules

Each rule can be enriched with conditions. Conditions apply to the source block or set of blocks and test its attribute values and substructure. They are written in the sub-element cond; each cond contains exactly one condition.

<!ELEMENT cond (comment?)>
<!ATTLIST cond attribute CDATA #IMPLIED>
<!ATTLIST cond child_element CDATA #IMPLIED>
<!ATTLIST cond parent_element CDATA #IMPLIED>
<!ATTLIST cond previous_element CDATA #IMPLIED>
<!ATTLIST cond next_element CDATA #IMPLIED>
<!ATTLIST cond exist (yes | no) "yes">
<!ATTLIST cond empty (yes | no) "yes">
<!ATTLIST cond operator (equal | not_equal) "equal">
<!ATTLIST cond value CDATA #IMPLIED>

Attributes of cond are used in several combinations depending on the condition type. There are 8 main condition types; they are described in detail in Table I.

Table I
Condition type | Used attributes | Description
Attribute exist | attribute="attribute name"; exist="yes / no" | Condition is accepted if an attribute of the given name exists.
Attribute value | attribute="attribute name"; operator="equal / not_equal"; value="expected value" | Tests whether the attribute of the given name has (or does not have) the expected value.
Text element | empty="yes / no" | Tests whether the text element is empty or not.
Sub-element exist | child_element="name of sub-element"; exist="yes / no" | Similar to Attribute exist.
Sub-element count | child_element="name of sub-element"; operator="equal / not_equal / higher_than / lower_than"; value="number of sub-elements" | Compares the actual number of sub-elements of the given name (child_element) with the expected value. The keyword "&ALL" can be used for the attribute child_element to count all sub-elements.
Parent element | parent_element="name of parent element" | Compares the name of the actual parent element with the expected name of the parent element.
Previous element | previous_element="name of previous element" | Similar to Parent element. The keywords "&ANY" and "&NONE" can be used for the attribute previous_element to test whether a previous element exists (&ANY) or not (&NONE).
Next element | next_element="name of next element" | Similar to Parent element. The keywords "&ANY" and "&NONE" can be used for the attribute next_element to test whether a next element exists (&ANY) or not (&NONE).
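
The following sketch shows how several of the condition types from Table I might be written. Attribute and element names are hypothetical. It assumes that the keywords are the literal strings &ANY and &NONE; in a well-formed XML file a bare ampersand inside an attribute value has to be escaped as &amp;amp;, so the keywords appear as &amp;amp;ANY and &amp;amp;NONE in the markup. The conditions attribute on source is explained in the paragraph that follows.

```xml
<source element="diagnose" conditions="all">
      <!-- Attribute exist: the attribute "reference" must be absent -->
      <cond attribute="reference" exist="no" />
      <!-- Attribute value: is_treatable must differ from "no" -->
      <cond attribute="is_treatable" operator="not_equal" value="no" />
      <!-- Sub-element count: exactly two "symptom" children -->
      <cond child_element="symptom" operator="equal" value="2" />
      <!-- Parent element: the parent must be "conclusion" -->
      <cond parent_element="conclusion" />
      <!-- Previous element: some previous element must exist -->
      <cond previous_element="&amp;ANY" />
      <!-- Next element: there must be no next element -->
      <cond next_element="&amp;NONE" />
</source>
```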

To use several conditions at the same time, the attribute conditions of the element source has to be filled in, because the evaluation of the conditions depends on its value. There are two standard values for the attribute conditions: all (all conditions have to be accepted at the same time) and any (at least one condition has to be accepted). Example 8 shows a rule with conditions.

Example 8 - Evaluation of conditions

<one-to-one name="example_8 - conditions" type="direct">
      <source element="diagnose" conditions="all">
            <cond attribute="disease_type" exist="yes" />
            <cond attribute="is_treatable" operator="equal" value="yes" />
      </source>
      <dest element="disease" />
</one-to-one>
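
For comparison, the same rule with conditions="any" would, under the description above, apply as soon as one of the two conditions is accepted. This is a sketch only; the rule name is hypothetical.

```xml
<!-- Illustrative only: applies if disease_type exists OR is_treatable equals "yes" -->
<one-to-one name="typed_or_treatable" type="direct">
      <source element="diagnose" conditions="any">
            <cond attribute="disease_type" exist="yes" />
            <cond attribute="is_treatable" operator="equal" value="yes" />
      </source>
      <dest element="disease" />
</one-to-one>
```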

7.  Integration of XSLT

XSLT (XSL Transformations) is a very powerful XML-based transformation language, and in some situations it is easier to integrate a piece of XSLT than to develop a similarly complex transformation system. At the moment, XSLT is integrated through a link to an external XSL file. The next version of XKBT will probably allow XSLT code to be written directly in XKBT files. The link to the external file is placed in the attribute externalLocation of the sub-element applyXSLT of the element dest (see Example 9).

Example 9 - Integration of XSLT

<one-to-one name="example_9" type="direct">
      <source element="hypothesis" />
      <dest element="hypothesis">
            <applyXSLT externalLocation="\step2\xslt_rules\hypothesis.xsl" />
      </dest>
</one-to-one>
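
The content of the external file is ordinary XSLT. As a sketch of what a file such as hypothesis.xsl might contain, the stylesheet below copies its input unchanged and drops one attribute. The attribute name reference is hypothetical, and how Stepper passes the knowledge block to the stylesheet is not specified in this document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- Identity template: copy every node and attribute unchanged -->
      <xsl:template match="@*|node()">
            <xsl:copy>
                  <xsl:apply-templates select="@*|node()" />
            </xsl:copy>
      </xsl:template>
      <!-- Hypothetical refinement: omit the reference attribute of hypothesis -->
      <xsl:template match="hypothesis/@reference" />
</xsl:stylesheet>
```

The second template wins over the identity template for the matched attribute because its pattern has a higher default priority, so no explicit priority attribute is needed.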

Date: November 23, 2017
