5.1 Normalizing mark-up

Transformation of XML source ...

<document-like>
   <p>That means that someone (actually it was
   <name type="person">Sigfrid Lundberg</name>)
   cheated when he late autumn <date>2001</date>
   pushed a lot of XML data into a full-text Z39.50 server</p>
</document-like>

... into something more searchable

<record-like>
   <text>That means that someone (actually it was</text>
   <name-person>Sigfrid Lundberg</name-person>
   <text>) cheated when he late autumn</text>
   <date>2001</date>
   <text>pushed a lot of XML data into a
        full-text Z39.50 server</text>
</record-like>
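
A minimal sketch of how such a flattening could be done, assuming Python's standard xml.etree.ElementTree. The helper names (squeeze, field_name, normalize) are hypothetical and not from the original slides; the rule that a type attribute is appended to the element name (name + person = name-person) is inferred from the example above.

```python
import xml.etree.ElementTree as ET

SOURCE = """<document-like>
   <p>That means that someone (actually it was
   <name type="person">Sigfrid Lundberg</name>)
   cheated when he late autumn <date>2001</date>
   pushed a lot of XML data into a full-text Z39.50 server</p>
</document-like>"""


def squeeze(s):
    """Collapse runs of whitespace; return '' for None or blank strings."""
    return " ".join(s.split()) if s else ""


def field_name(elem):
    """Assumed naming rule: <name type="person"> becomes name-person."""
    kind = elem.get("type")
    return f"{elem.tag}-{kind}" if kind else elem.tag


def normalize(doc):
    """Turn mixed content into a flat sequence of fields, in document order."""
    record = ET.Element("record-like")

    def add(tag, value):
        if value:  # skip whitespace-only fragments
            ET.SubElement(record, tag).text = value

    for block in doc:  # e.g. each <p>
        add("text", squeeze(block.text))
        for inline in block:  # e.g. <name>, <date>
            add(field_name(inline), squeeze("".join(inline.itertext())))
            add("text", squeeze(inline.tail))
    return record


record = normalize(ET.fromstring(SOURCE))
fields = [(f.tag, f.text) for f in record]
for tag, text in fields:
    print(f"{tag}: {text}")
```

The sketch only looks one level below each block element. Inline elements nested inside other inline elements are reduced to their text.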

... but there are problematic combinations of syntax and semantics

<p>The search engine in
<title><name type="person">Sigfrid Lundberg</name>'s
database</title> is using Z39.50 for access</p>
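
One reading of the problem, sketched with Python's standard xml.etree.ElementTree: the personal name is part of the title, so a flat record cannot keep both fields intact without either splitting the title or repeating the name. The variable names are hypothetical and the interpretation is an assumption, not something the slide states.

```python
import xml.etree.ElementTree as ET

SOURCE = """<p>The search engine in
<title><name type="person">Sigfrid Lundberg</name>'s
database</title> is using Z39.50 for access</p>"""

p = ET.fromstring(SOURCE)
title = p.find("title")

# The title as a reader sees it: all text inside <title>, nested markup ignored.
full_title = " ".join("".join(title.itertext()).split())

# The person name, which lives inside the title.
name = title.find("name").text

print("title:      ", full_title)
print("name-person:", name)
```

Emitting the name as its own field leaves only the fragment "'s database" for the title, while emitting the whole title as one field hides the name from a person-name search.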
