Last Release: 8 February 2003
The Cardinal XFML Parser is an XFML Core compatible XFML processor implemented in Visual Basic 6.0 and built upon the MSXML 4 DOM implementation. Cardinal provides an XFML abstraction to simplify the development of tools to create and consume XFML documents. This article gives an overview of the parser, a detailed specification of its API, and notes on its use.
This document and the Cardinal XFML Parser software are copyright © 2003 by Jeremy Shantz. Some rights reserved.
eXchangeable Faceted Metadata Language (XFML) is a lightweight language for faceted metadata. Cardinal XFML Parser is an XFML Core compatible, COM-based API for creating and processing XFML files.
Although it is possible to create and process XFML documents using standard XML parsers, Cardinal XFML Parser provides a convenient abstraction that simplifies working with XFML. It models the XFML specification and its concepts and provides built-in ways to traverse a document's network cloud, import occurrences, and work with occurrence strength. For example, the XFMLDocument's ImportOccurrences method downloads the remote XFML files indicated by each <topic> element's <connect> children, selects and processes all occurrences from connected topics, and adds them to the output. It can traverse the XFML document's entire network cloud by doing this for each relevant <connect> element in all of the downloaded XFML files. All of this takes one line of code. That kind of ease of use makes creating XFML tools faster by letting you work directly with a model of the XFML language, without the need to write helper code. We hope that this leads to increased development of XFML.
Cardinal XFML Parser was implemented in Visual Basic 6.0 and is built upon Microsoft's MSXML 4.0 DOM implementation.
Cardinal XFML Parser implements the XFML 1.0 specification in its entirety. Its design was also guided by the implementation checklist and the processing instructions offered by XFML's creator.
Where the specification is ambiguous or silent, Cardinal XFML Parser tends towards the permissive. For example, while the <managingEditor> node is optional, it is not clear whether including the <name> child requires one to include all of the other children of <managingEditor>. Cardinal permits setting one or none of the child elements for all of the subsections of <mapInfo>.
Since one of the main value propositions of XFML is to allow faceted metadata to be exchanged and reused, we think it makes sense to provide a means to separate the taxonomy from any particular use of it; that is, to separate <facet> and <topic> elements from <page> elements. Cardinal XFML Parser allows this separation by implementing the <pages> tag, which is a child element of the root <xfml> element.
As mentioned on the XFML mailing list, the specification does not mandate how URLs should be resolved.
The Cardinal XFML Parser API is a simple collection of objects that allows you to create valid XFML documents without the complexity of working with the XML object model. With the Cardinal XFML Parser, you generally do not have to create elements, set their value, and append them to parent nodes, although with TopicNodes and PageNodes you can do this. However, you still have all the processing power you need to find exactly the topic or page you need to update.
To use Cardinal XFML Parser, reference the library in the same way you would reference any COM object.
The PROGIDs are:
After TopicNode and PageNode objects have been added to an XFMLDocument, their properties can be updated directly without further appending or replacing the node in the XFMLDocument. You can add Connections and Occurrences to existing TopicNode and PageNode objects. These changes will be saved to the document without you having to reattach the node to the XFMLDocument. For example, there is no need to remove and re-add a TopicNode to the XFMLDocument to update its description property.
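As a rough illustration of that point, the following Visual Basic sketch changes a topic's description after the topic is already part of the document. The ids, text, URL, and file path are invented for the example, and the property name Description is taken from the prose rather than checked against the type library.

```vb
' Assumes a project reference to the Cardinal XFML Parser library.
Public Sub UpdateTopicInPlace()
    Dim XFML As New XFMLDocument
    Dim theTopic As TopicNode

    ' Add the facet that the topic's facetid refers to.
    XFML.AddFacet "breed", "Breed"

    ' CreateTopic returns a reference to the node it just added.
    Set theTopic = XFML.CreateTopic("siamese", "breed", "Siamese")

    ' Update the node through that reference; it is not re-appended.
    theTopic.Description = "Short-haired cats originating in Thailand."

    XFML.Url = "http://example.org/xfml.xml"   ' hypothetical map url
    XFML.Save "C:\cats.xml"                    ' hypothetical path

    Set theTopic = Nothing
    Set XFML = Nothing
End Sub
```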
Cardinal does not ensure that you have set a URL, but the specification mandates that one be present.
The XFMLDocument object holds a representation of the XFML Document.
Using the properties and methods of the XFMLDocument, you can create, load, and save XFML files; transform XFML documents with XSLT style sheets; set the document's URL and language; create, update, and remove all of the attributes and children of <mapInfo>; add and remove facets; add, update, and remove topics and pages; and import occurrences of connected topics from remote XFML documents.
All of the properties of the <mapInfo> node can be set with the following properties and methods: nextUpdate, lastUpdate, setManagingEditor(), setEditor(), setPublisher(), setWebMaster(), setLicense(), setGenerator().
If none of these values are set, the <mapInfo> node will not be created. Child elements of the <mapInfo> node can be deleted by passing null values to the appropriate method.
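A minimal sketch of filling in <mapInfo>, assuming the setter arguments are positional in the order name, email, url as listed in the API reference. The names, addresses, and file path are placeholders.

```vb
' Assumes a project reference to the Cardinal XFML Parser library.
Public Sub SetMapInfo()
    Dim XFML As New XFMLDocument

    XFML.Url = "http://example.org/xfml.xml"

    ' Arguments: name, email, url (all optional per the reference).
    XFML.setManagingEditor "Jane Doe", "jane@example.org", "http://example.org/jane/"
    XFML.setPublisher "Example Press", "info@example.org", "http://example.org/"

    ' Any valid Date is accepted; the parser converts it to the
    ' date format that the specification requires.
    XFML.lastUpdate = Now
    XFML.nextUpdate = DateAdd("d", 7, Now)

    XFML.Save "C:\example.xml"   ' hypothetical path
    Set XFML = Nothing
End Sub
```

Because at least one value is set here, the saved file should contain a <mapInfo> node.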
A facet represents one <facet> element in an XFML document.
For the Cardinal XFML Parser, <facet> elements are not standalone objects, but properties of the XFMLDocument.
Facet elements are added to and removed from XFML documents by using the AddFacet(), RemoveFacet() and RemoveAllFacets() methods of the XFMLDocument object. You can check the properties of any <facet> element by passing the appropriate index value to the FacetId and FacetValue properties. FacetCount returns the number of facets contained within the XFMLDocument.
FacetIds cannot begin with a number. Cardinal will raise an error if an attempt is made to add a facet with an id that begins with a number.
Each facet must have an id that is unique within topics and facets. Cardinal enforces this requirement by raising an error if you attempt to add a facet with an id that is already taken.
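The two id rules above can be exercised with a short sketch like the one below. The error numbers and messages Cardinal raises are not documented here, so the code only checks that some error occurred and prints whatever description comes back; the facet ids and values are invented.

```vb
' Assumes a project reference to the Cardinal XFML Parser library.
Public Sub AddFacets()
    Dim XFML As New XFMLDocument

    XFML.AddFacet "breed", "Breed"
    XFML.AddFacet "colour", "Colour"

    ' Both calls below are expected to raise an error: the first id
    ' begins with a number, the second is already taken.
    On Error Resume Next

    Err.Clear
    XFML.AddFacet "1st", "First"
    If Err.Number <> 0 Then Debug.Print "Rejected '1st': " & Err.Description

    Err.Clear
    XFML.AddFacet "breed", "Breed again"
    If Err.Number <> 0 Then Debug.Print "Rejected duplicate 'breed': " & Err.Description

    On Error GoTo 0

    Debug.Print "Facets in document: " & XFML.FacetCount
    Set XFML = Nothing
End Sub
```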
A TopicNode represents one <topic> element in an XFML document.
Topics can be created by using the CreateTopic() method of the XFMLDocument or by creating a TopicNode object variable and using the XFMLDocument's AppendTopic() method. Both methods return a reference to the TopicNode just added. TopicNodes must have their id, facetid, and name properties set before they are created or appended.
TopicNodes are removed by using the RemoveTopic() and RemoveAllTopics() methods of the XFMLDocument.
The number of topics in an XFMLDocument can be retrieved using the XFMLDocument's TopicCount property.
Existing topics can be accessed in two ways: by passing an index to the Topics property of the XFMLDocument, or by passing an XPath query to the XFMLDocument's SelectTopic() method. Both return a TopicNode object.
When using the CreateTopic() method of the XFMLDocument, only one connect url can be passed in. Use the returned TopicNode's AddConnect() method to add more <connect> children. A TopicNode can have zero or more connections.
Each topic must have an id that is unique within topics and facets. Cardinal enforces this requirement by raising an error if you attempt to create or append a topic with an id that is already taken.
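The two ways of adding topics might look like this in practice. This is a sketch under assumptions: the TopicNode property names Id, FacetId, and Name follow the prose above, and the ids and connect URLs are made up.

```vb
' Assumes a project reference to the Cardinal XFML Parser library.
Public Sub AddTopics()
    Dim XFML As New XFMLDocument
    Dim created As TopicNode
    Dim appended As TopicNode
    Dim draft As New TopicNode

    XFML.AddFacet "breed", "Breed"

    ' Route 1: CreateTopic with the required id, facetid, and name,
    ' then add as many <connect> children as needed.
    Set created = XFML.CreateTopic("siamese", "breed", "Siamese")
    created.AddConnect "http://example.org/other.xml#siamese"
    created.AddConnect "http://example.net/cats.xml#siamese"

    ' Route 2: fill in a TopicNode first, then append it.
    draft.Id = "persian"
    draft.FacetId = "breed"
    draft.Name = "Persian"
    Set appended = XFML.AppendTopic(draft)

    Debug.Print "Topics in document: " & XFML.TopicCount

    Set created = Nothing
    Set appended = Nothing
    Set draft = Nothing
    Set XFML = Nothing
End Sub
```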
A PageNode represents a <page> node in an XFMLDocument.
Pages can be created by using the CreatePage() method of the XFMLDocument or by creating a PageNode object variable and using the XFMLDocument's AppendPage() method. Both methods return a reference to the PageNode just added. PageNodes must have their url property set before they are created or appended.
PageNodes are removed by using the RemovePage() and RemoveAllPages() methods of the XFMLDocument.
The number of pages in an XFMLDocument can be retrieved using the XFMLDocument's PageCount property.
Existing pages can be accessed in two ways: by passing an index to the Pages property of the XFMLDocument, or by passing an XPath query to the XFMLDocument's SelectPage() method. Both return a PageNode object.
When using the CreatePage() method of the XFMLDocument, only one occurrence object can be passed in. Use the returned PageNode's CreateOccurrence() or AppendOccurrence() methods to add more <occurrence> children. A PageNode can have zero or more occurrences.
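A sketch of both ways to add pages and occurrences follows. The class name Occurrence and the property names TopicId and Strength are taken from the API reference; the URLs, ids, and the strength value of 1 are illustrative only.

```vb
' Assumes a project reference to the Cardinal XFML Parser library.
Public Sub AddPages()
    Dim XFML As New XFMLDocument
    Dim thePage As PageNode
    Dim draft As New PageNode
    Dim occ As New Occurrence

    XFML.AddFacet "breed", "Breed"
    XFML.CreateTopic "siamese", "breed", "Siamese"
    XFML.CreateTopic "persian", "breed", "Persian"

    ' Route 1: create the page, then attach occurrences through the
    ' returned reference (topicid, strength).
    Set thePage = XFML.CreatePage("http://example.org/cats.html")
    thePage.CreateOccurrence "siamese", 1
    thePage.CreateOccurrence "persian", 1

    ' Route 2: fill in a PageNode and an Occurrence, then append them.
    draft.Url = "http://example.org/persians.html"
    occ.TopicId = "persian"
    occ.Strength = 1
    Set thePage = XFML.AppendPage(draft)
    thePage.AppendOccurrence occ

    Debug.Print "Pages in document: " & XFML.PageCount

    Set occ = Nothing
    Set draft = Nothing
    Set thePage = Nothing
    Set XFML = Nothing
End Sub
```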
The ImportOccurrences() method of the XFMLDocument imports <page> elements from remote XFML documents into the local document. The method processes the <connect> children of every <topic> element, importing page occurrences of that topic from the remote document into the local document. Imported <page> elements are assigned <occurrence> children with topicids equal to local topics, not topics in the document from which they were imported.
The XFMLDocument's ImportedOccurrencesStrength property sets the value assigned to the strength attribute of imported occurrences. All remote topics connected directly to local topics will have their occurrences imported with a strength equal to ImportedOccurrencesStrength (default is 2). If the XFMLDocument's IncrementOccurrenceStrength property is set to True, the strength value assigned to imported occurrences is incremented by one for each level the remote document is removed from the local document. For example, if our local DocA has a topic that links to a topic in DocB, all pages imported from DocB will have an occurrence strength equal to ImportedOccurrencesStrength. If that topic in DocB is connected to a topic in DocC, pages imported from DocC will have an occurrence strength equal to ImportedOccurrencesStrength + 1 (default would be 3), and so on. This implements the specification's occurrencestrength concept.
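Put together, an import over the whole map network could be sketched as follows. The file paths are placeholders, and the example needs network access to whatever documents the local <connect> elements point to.

```vb
' Assumes a project reference to the Cardinal XFML Parser library.
Public Sub ImportFromNetwork()
    Dim XFML As New XFMLDocument

    If Not XFML.Load("C:\xfml.xml") Then
        Debug.Print "Could not load the document."
        Exit Sub
    End If

    ' Directly connected documents import at strength 2, the next
    ' level out at 3, and so on.
    XFML.ImportedOccurrencesStrength = 2
    XFML.IncrementOccurrenceStrength = True

    ' True asks for the entire map network, not only direct connections.
    If XFML.ImportOccurrences(True) Then
        Debug.Print "Pages after import: " & XFML.PageCount
        XFML.Save "C:\xfml-imported.xml"
    Else
        Debug.Print "Import did not complete."
    End If

    Set XFML = Nothing
End Sub
```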
All sample code uses Visual Basic.
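' Rewrite every page url that points at xfml.org to its PURL equivalent.
Dim XFML As New XFMLDocument
XFML.Load "C:\xfml.xml"
Dim thePage As New PageNode
Do
    ' Match only urls that the Replace below will actually change;
    ' otherwise an unchanged page would be selected forever.
    Set thePage = XFML.SelectPage("//page[starts-with(@url,'http://xfml.org/')]")
    ' SelectPage returns a new, empty PageNode when nothing matches.
    If Len(thePage.Url) Then
        thePage.Url = Replace(thePage.Url, "http://xfml.org/", "http://purl.oclc.org/NET/xfml/")
    End If
Loop While Len(thePage.Url)
XFML.Save "C:\xfml.xml"
Set thePage = Nothing
Set XFML = Nothing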
This work is licensed under a Creative Commons License. You may copy, use, and distribute the component, and you may use it as part of a larger work. In return, somewhere in your work, include a line like "This program makes use of the Cardinal XFML Parser, available at http://www.jeremyshantz.com/software/xfml/cardinal/."
The standard library; uses MSXML 4.
This version uses MSXML 3, but due to problems with MSXML 3's XPath handling, Cardinal lacks the following methods and properties: SelectTopic(), SelectPage(), ImportOccurrences(), ImportedOccurrencesStrength, IncrementOccurrenceStrength.
Cardinal XFML Parser depends on the following components
Send me an email. I'd be happy to respond.
This is what remains to be done:
Method. Loads an XFML document from the specified location. Any valid XML file with a root node name of 'xfml' will be loaded.
Parameters: xmlSource. A string containing a URL that specifies the location of the XFML file.
Return Value: Boolean. Returns True if the load succeeded; False if the load failed.
Method. Saves an XFML document to the specified location.
Parameters: destination. A string containing a path that specifies the location to save the XFML file. If the file already exists, it will be overwritten.
Return Value: None.
Method. Processes the XFML document using the supplied XSL Transformations (XSLT) style sheet and returns the resulting transformation.
Parameters: Required. stylesheetpath. A string containing the location of an XSL Transformations (XSLT) style sheet.
Parameters: Optional. destination. A string containing a path that specifies the location to save the transformation output. If the file already exists, it will be overwritten.
Return Value: String. Returns a string that contains the product of the transformation of this XFML document based on the XSLT style sheet. This value will still be returned if a destination parameter is specified.
Method. Processes each <connect> url of <topic> elements and imports the <page> elements matching that topic into the current XFMLDocument.
Parameters: Required. deep. Boolean. If True, the XFMLDocument's entire map network will be traversed. If False, only topics linked directly from the current XFMLDocument will have occurrences imported. Default value is False.
Return Value: Boolean. Returns True if the operation completed successfully, False otherwise.
This implements the Map Network Concept.
Property. Integer. Read. Write.
Sets and retrieves the strength value of <occurrence> elements attached to imported <page> elements.
Default value is 2.
Allows incrementing of the strength value of indexing work other than your own, in accordance with the specification's occurrencestrength concept.
Property. Boolean. Read. Write.
If True, each level of imported occurrences will have its occurrence strength incremented by one, starting from the value of ImportedOccurrencesStrength. If False, all imported occurrences will have an occurrence strength equal to ImportedOccurrencesStrength.
Default is False.
Allows occurrences with greater proximity to the original document to have a lower occurrence strength than occurrences more removed, in accordance with the specification's occurrencestrength concept.
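The arithmetic behind the two properties can be written out as a small helper. This function is purely illustrative and is not part of Cardinal; the name ExpectedStrength and its parameters are invented for the example.

```vb
' Illustrative helper, not part of the Cardinal API.
' Returns the strength an imported occurrence would receive.
'   baseStrength : the value of ImportedOccurrencesStrength
'   level        : 1 for a document connected directly to the local
'                  one, 2 for a document connected to that one, etc.
'   increment    : the value of IncrementOccurrenceStrength
Public Function ExpectedStrength(ByVal baseStrength As Integer, _
                                 ByVal level As Integer, _
                                 ByVal increment As Boolean) As Integer
    If increment Then
        ExpectedStrength = baseStrength + (level - 1)
    Else
        ExpectedStrength = baseStrength
    End If
End Function
```

With the default base of 2, ExpectedStrength(2, 1, True) gives 2 and ExpectedStrength(2, 2, True) gives 3, while ExpectedStrength(2, 3, False) stays at 2.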
Property. String. Read.
Retrieves the <xfml> element's version property.
Future development of the XFML specification will require this property to be read/write.
Property. String. Read. Write.
Sets and retrieves the <xfml> element's url property.
Default value is null.
Url is the XFMLDocument's default property.
Property. String. Read. Write.
Sets and retrieves the <xfml> element's language property.
Default value is 'en'.
Property. Date. Read. Write.
Sets and retrieves the <mapInfo> element's nextUpdate property.
The value passed in must be a valid date in any format. It will be converted to RFC 822 format, as mandated by the specification.
Property. Date. Read. Write.
Sets and retrieves the <mapInfo> element's lastUpdate property.
The value passed in must be a valid date in any format. It will be converted to RFC 822 format, as mandated by the specification.
Method. Sets the <managingEditor> element's name, email, and url values.
Parameters: name,email,url. All optional.
Return Value: None.
Pass in null values for all three parameters to delete <managingEditor>.
Method. Sets the <editor> element's name, email, and url values.
Parameters: name,email,url. All optional.
Return Value: None.
Pass in null values for all three parameters to delete <editor>.
Method. Sets the <publisher> element's name, email, and url values.
Parameters: name,email,url. All optional.
Return Value: None.
Pass in null values for all three parameters to delete <publisher>.
Method. Sets the <webMaster> element's name, email, and url values.
Parameters: name,email,url. All optional.
Return Value: None.
Pass in null values for all three parameters to delete <webMaster>.
Method. Sets the <license> element's text, name, email, and url values.
Parameters: text,name,email,url. All optional.
Return Value: None.
Pass in null values for all four parameters to delete <license>.
Method. Sets the <generator> element's name, email, and url values.
Parameters: name,email,url. All optional.
Return Value: None.
Pass in null values for all three parameters to delete <generator>.
Property. String. Read.
Retrieves the id attribute of the <facet> element indicated by the index.
Parameters: index. An integer which indicates the position of the <facet>.
Property. String. Read.
Retrieves the value of the <facet> element indicated by the index.
Parameters: index. An integer which indicates the position of the <facet>.
Property. Integer. Read.
Retrieves the number of <facet> elements in the XFML document.
Method. Adds a <facet> element to the XFML document.
Parameters: id. A string containing the id for the facet. value. A string containing the value of the facet.
Return Value: None.
The first character of the id parameter cannot be a number, as mandated by the specification.
Method. Removes a <facet> element from the XFML document.
Parameters: index. An integer which indicates the position of the <facet> to be removed.
Return Value: None.
Method. Removes all <facet> elements from the XFML document.
Parameters: None.
Return Value: None.
Property. Retrieves a reference to the <topic> element specified by the index.
Parameters: index. An integer which indicates the position of the desired TopicNode.
Return Value: TopicNode. Returns a reference to a TopicNode object.
Property. Integer. Read.
Retrieves the number of <topic> elements in the XFML document.
Method. Creates a new <topic> element and adds it to the XFML document.
Parameters: Required: id, facetid, name.
Parameters: Optional: parentTopicid , connect, psi, description.
Return Value: TopicNode. Returns a reference to the TopicNode.
Only one connect url can be passed in. Use the AddConnect method of the returned TopicNode to add more.
Method. Appends a <topic> element to the XFML document.
Parameters: TopicNode. A TopicNode object. The TopicNode's id, facetid, and name properties must be set.
Return Value: TopicNode. Returns a reference to the TopicNode.
Method. Removes a <topic> element from the XFML document.
Parameters: index. Variant. The value can be either an integer which indicates the position of the <topic> to be removed, or a valid TopicNode.
Return Value: None.
Method. Removes all <topic> elements from the XFML document.
Parameters: None.
Return Value: None.
Method. Retrieves the <topic> element specified by the XPath query from the XFML document.
Parameters: XPathQuery. A string containing an XPath query that selects a <topic> element.
Return Value: TopicNode.
This method always returns a TopicNode. If the XPath query does not match any <topic> element, the return value is a new, empty TopicNode.
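Since a miss does not come back as Nothing, calling code has to recognise the empty node some other way. One possible check, assuming a new TopicNode has an empty Id, is sketched below; the file path and topic id are placeholders.

```vb
' Assumes a project reference to the Cardinal XFML Parser library.
Public Sub FindTopic()
    Dim XFML As New XFMLDocument
    Dim theTopic As TopicNode

    If Not XFML.Load("C:\xfml.xml") Then Exit Sub

    Set theTopic = XFML.SelectTopic("//topic[@id='siamese']")

    ' Test a required property instead of testing for Nothing.
    If Len(theTopic.Id) = 0 Then
        Debug.Print "No such topic."
    Else
        Debug.Print "Found: " & theTopic.Name
    End If

    Set theTopic = Nothing
    Set XFML = Nothing
End Sub
```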
Property. Retrieves a reference to the <page> element specified by the index.
Return Value: PageNode. Returns a reference to a PageNode object.
Parameters: index. An integer which indicates the position of the desired PageNode.
Property. Integer. Read.
Retrieves the number of <page> elements in the XFML document.
Method. Creates a new <page> element and adds it to the XFML document.
Parameters: Required: url. A string.
Parameters: Optional: title, description, occurrence. Title and description are strings. Occurrence is an Occurrence object.
Return Value: PageNode. Returns a reference to the PageNode.
If the Occurrence object is present but is missing either its topicid or strength property, it will be ignored.
Only one Occurrence can be passed in. Use the CreateOccurrence method of the returned PageNode to add more.
Method. Appends a <page> element to the XFML document.
Parameters: PageNode. A PageNode object. The PageNode's url property must be set.
Return Value: PageNode. Returns a reference to the PageNode.
Method. Removes a <page> element from the XFML document.
Parameters: index. Variant. The value can be either an integer which indicates the position of the <page> to be removed, or a valid PageNode.
Return Value: None.
Method. Removes all <page> elements from the XFML document.
Parameters: None.
Return Value: None.
Method. Retrieves the <page> element specified by the XPath query from the XFML document.
Parameters: XPathQuery. A string containing an XPath query that selects a <page> element.
Return Value: PageNode.
This method always returns a PageNode. If the XPath query does not match any <page> element, the return value is a new, empty PageNode.
Property. String. Read. Write.
Sets and retrieves the <topic> element's id attribute.
Id is the TopicNode's default property.
Property. String. Read. Write.
Sets and retrieves the <topic> element's facetid attribute.
Property. String. Read. Write.
Sets and retrieves the <topic> element's parentTopicid attribute.
Property. String. Read. Write.
Sets and retrieves the <topic> element's <name> child element.
Property. String. Read. Write.
Sets and retrieves the <topic> element's <psi> child element.
Property. String. Read. Write.
Sets and retrieves the <topic> element's <description> child element.
Property. String. Read.
Retrieves the url from the <connect> element specified by the index.
Parameters: index. An integer containing the index of the <connect> element to be retrieved.
Indexed from 0.
Property. Integer. Read.
Retrieves the <topic> element's number of <connect> child elements.
Method. Adds a <connect> child element to the <topic>.
Parameters: url. A string containing a valid url.
Return Value: None.
Method. Removes a <connect> child element from the <topic>.
Parameters: index. An integer containing the index of the <connect> element to be removed.
Return Value: None.
Method. Removes all <connect> child elements from the <topic>.
Parameters: None.
Return Value: None.
Property. String. Read. Write.
Sets and retrieves the <page> element's url attribute.
Url is the PageNode's default property.
Property. String. Read. Write.
Sets and retrieves the <page> element's title attribute.
Property. String. Read. Write.
Sets and retrieves the <page> element's description attribute.
Property. Retrieves a reference to the <occurrence> element specified by the index.
Return Value: Occurrence. Returns a reference to an Occurrence object.
Parameters: index. An integer which indicates the position of the desired Occurrence.
Property. Integer. Read.
Retrieves the <page> element's number of <occurrence> child elements.
Method. Creates a new <occurrence> element and adds it to the current PageNode.
Parameters: Required: topicid, strength.
Return Value: Occurrence. Returns a reference to the Occurrence.
Method. Appends an <occurrence> element to the current PageNode.
Parameters: Occurrence. An Occurrence object. The Occurrence object's topicid and strength properties must be set.
Return Value: Occurrence. Returns a reference to the Occurrence.
Method. Removes an <occurrence> child element from the <page>.
Parameters: index. An integer containing the index of the <occurrence> element to be removed.
Return Value: None.
Method. Removes all <occurrence> child elements from the <page>.
Parameters: None.
Return Value: None.
Property. String. Read. Write.
Sets and retrieves the <occurrence> element's topicid attribute.
TopicId is the Occurrence object's default property.
Property. Integer. Read. Write.
Sets and retrieves the <occurrence> element's strength attribute.
