Wednesday, 7 October 2015
Change - because things don't stand still
Monday, 14 September 2009
XProc and SMIL: Orchestrating Pipelines
Friday, 4 September 2009
Serving XML
Abstract
The stage has been set for a new kind of application development that does away with the impedance mismatch that occurs between programming languages like Java and the XML data they seek to process. The ability to query and update XML content in a consistent and logical manner is being provided by technologies like Native XML Databases, XQuery and the XRX architecture.
Introduction
This article takes a broad and high-level view of the current state of Native XML Databases, XML Querying, Processing and their future. Starting with a brief review of the way XML used to be stored, we then look at what Native XML databases are today and what is on offer. With XQuery now firmly established as a W3C Recommendation, we see how it is being employed for search, aggregation and the delivery of XML content. As a part of this we make a quick study of the XRX architecture, ask how XSLT fits in, look at XML Pipelines and finally cast our gaze to the future.
The Way Things Were
XML documents have, traditionally, been stored in Relational Database Management Systems (RDBMS) as 'blobs' of character data; usually broken into chunks that best fit the publisher's requirements for reuse, and then indexed accordingly. If the granularity of the 'chunks' is a good fit for your purposes then the system will be well balanced. The balancing act has to consider the size of the chunks e.g. article, chapter, section or paragraph; against how you might want to search, aggregate and reuse those chunks. Too small and it will be like trying to knit sawdust, too large and you'll be juggling bowling balls.
Native XML Databases
You no longer have to put up with disgruntled application developers bashing XML's square pegs into their roundish object-orientated holes. What you can have is a confident group of XML developers using a range of tools that are built upon the XML stack of technologies (UTF-8, URI, Namespaces, XML, XPath, XQuery and XSLT) working in a far more consistent environment.
XQuery: A New Beginning
A new beginning indeed, but one that's been a long time coming. XQuery only became a W3C Recommendation in 2007 but has been in development for many years. Its promise is to provide a means of querying large stores of XML content, transforming and then delivering finished XML documents. It is this ability to not only query but create and transform document structure at the same time that sets it above other languages or combinations of languages. The sheer convenience of being able to use the same language to do both and in a way that's transparent to the underlying data model is undeniable.
XQuery can be regarded as a super-set of XPath 2.0, where additional instructions allow the creation of new nodes (the transformation) and the ability to sort the sequences of nodes returned by a query. With a query language that is built upon the logical model with which the content has been stored, Native XML Databases (NXDs) are optimised for the task of storing, querying, retrieving and updating XML content. This enables them to operate with efficiency and speed. It hasn't taken long for developers to realise that an entire web application can be built on top of an NXD by using XQuery as the programming language.
XRX: An End-to-End XML Solution
XForms
XForms Implementations
| Client-side | Server-side | Browser Plug-in |
| ubiquity xforms | Orbeon | Mozilla XForms |
| XSLTForms | Chiba | formsPlayer |
| XForms Engine |
REST
XQuery
Where Does XSLT Fit In?
XProc: An XML Pipeline Language
Serving XML
- Elsevier
- McGraw-Hill Education
- O'Reilly Media
- Oxford University Press
- Springer Science+Business Media
- Wiley
- Wolters Kluwer
What Does the Future Hold?
The scene has quite definitely been set for an exciting future!
| Serving XML | Philip Fennell |
Saturday, 13 January 2007
A few clever tricks with XSLT 2
1) Resolving local fragment identifiers within a document. SVG has a mechanism for declaring pieces of reusable mark-up that can be referenced with a <use xlink:href="#someID"/> fragment, where #someID is a fragment identifier URL that is resolved within the parent document. If you are using XSLT 2 to process a document and want to resolve that reference, you could try the following.

<xsl:sequence select="doc(resolve-uri(@xlink:href, base-uri(root())))"/>

This example treats the fragment identifier as a URI (which it is), resolves it with respect to the URI of the source document's root and then opens that document and extracts the fragment the identifier points to. It's short and sweet and doesn't require any tedious messing around with substring-after(@xlink:href, '#'). For that matter you could define your own wrapper function called my:resolve-fragment-identifier() that takes a single argument that is the fragment identifier.

2) Now, in saying all that, you could regard the previous reference as something akin to an ID/IDREF, but in examples where you have used the id attribute but don't have a schema or DTD you won't be able to use the XPath id() function.

Or can you?

In a previous post entitled 'Who knows what a node ID is?' I talked about the xml:id attribute. Saxon 8 understands this attribute's intent to uniquely identify its owner element within the scope of the parent document. So put simply, if you don't or won't have a schema/DTD but you do want ID/IDREFs then use xml:id and the id() function.

3) Here's a nice little tip: you have a path expression that must match, for example, one of two attributes.

<xforms:input bind="date" ref="shipping/@date">....
</xforms:input>

In this case the bind attribute has priority over the ref attribute so:

<xsl:value-of select="(@bind, @ref)[1]"/>

The brackets are a sequence constructor and the [1] predicate states that the first item in the sequence will be selected. Where both are present then @bind is selected but in the absence of the bind attribute, @ref will be selected.
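To see tip 3 in context, here is a minimal sketch of a complete stylesheet built around that expression. It assumes the input contains xforms:input elements in the standard XForms namespace; the text output method and the newline separator are only there to make the result easy to read.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xforms="http://www.w3.org/2002/xforms">

  <xsl:output method="text"/>

  <!-- For every XForms input, report which binding expression wins. -->
  <xsl:template match="xforms:input">
    <!-- (@bind, @ref) builds a sequence in the order written; [1] takes the
         first item that actually exists, so @bind wins and @ref is the fallback. -->
    <xsl:value-of select="(@bind, @ref)[1]"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress the text nodes that the built-in templates would otherwise copy. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

The comma operator matters here. A union such as (@bind | @ref)[1] would return its items in document order, and the relative order of attributes on an element is implementation-dependent, so it could not guarantee that @bind takes priority.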
Of Schematron, Unit Testing and oXygen...
- Schematron - a rules based XML validation language
- Unit testing - with respect to the above
- <oxygen/> - a most excellent XML IDE
Unit Testing is something I'm sure we've all had some involvement with one way or another; for some people it's necessary and for others it's a necessary evil. Now, considering what I've just been saying about Schematron, a rules based language for validating XML documents, or fragments thereof, the light should be coming on about now. Yes, you can use Schematron to validate the results of your XSLT transformations. I could bang on for ages about this but I won't. Have a think and a play.
<oxygen/>, as I've already stated, is a most excellent XML IDE, which comes chock full of some wonderful tools including the very good source editor with superb code completion, abbreviations and support for all the main validation languages (including Schematron). It has a sensational debugger that you've got to experience to believe, a profiler that I haven't touched yet but that others I know have found useful, and it supports XSLT 2 and XQuery via Saxon 8. I'm not kidding when I say - It rocks!!!
Tuesday, 3 October 2006
XSLT is an XML application so why not transform it
It must be about three years ago that I had what can only be described as an epiphany with respect to seeing XSLT for what it is, an XML application and as such can be generated by XSLT and for that matter transformed by XSLT into another XSLT.
If you are using a framework like Apache's Cocoon, which allows XSLT transforms to be referenced as the product of another pipeline, then that's one way to employ meta-stylesheets. Another and potentially more interesting approach, which I'm sure Mr. Kay will bring up, is the use of the Saxon 8 / XSLT 2 extension functions saxon:compile-stylesheet() and saxon:transform(). These two together allow you to load a stylesheet into your running stylesheet and apply it to a node-set or sequence that you are working on.
But why stop there when you could build a transform at run-time based on some aspect of your source document then apply that transform to either the source or some node-set derived from the source to produce the desired result.
All very wonderful stuff and I'd love to be there when he presents his paper but alas I will not. So I hope it will be available post conference in some way shape or form.
I initially used XSLT transforms on XSLT stylesheets to map some XHTML generating XSLT into XSL-FO generating XSLT. The end result of that was to simplify the maintenance of a website that published to both XHTML and PDF. Structure and style changes to the XHTML were propagated to the PDF output automatically... Sweet :)
More recently I have been looking at Schematron with a mind to using it for unit testing. More on that will follow in due course.
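As a rough illustration of that kind of meta-stylesheet (not the one used on the website, which isn't shown here), the sketch below takes an XHTML-generating stylesheet as its input document and rewrites two of its literal result elements into XSL-FO blocks. The choice of p and h1, and the formatting attributes, are assumptions made purely for the example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    exclude-result-prefixes="xhtml">

  <!-- Identity template: the xsl:* instructions of the input stylesheet are
       ordinary data here, so they are copied through unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An XHTML paragraph in the input stylesheet becomes an fo:block.
       Attributes on the original element (class, id, ...) are dropped. -->
  <xsl:template match="xhtml:p">
    <fo:block space-after="6pt">
      <xsl:apply-templates select="node()"/>
    </fo:block>
  </xsl:template>

  <!-- A top-level heading becomes a larger, bold fo:block. -->
  <xsl:template match="xhtml:h1">
    <fo:block font-size="18pt" font-weight="bold">
      <xsl:apply-templates select="node()"/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

The output is itself a stylesheet: every xsl:template, xsl:apply-templates and xsl:value-of from the input survives, and only the mark-up it emits has changed. A real mapping would also need to produce the fo:root and page-master scaffolding, which is left out of this sketch.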
Wednesday, 2 August 2006
Who knows what a node ID is?
Up until 9th September 2005 you could rely upon two things:
1) If there was a DTD available, then your parser would identify id attributes as being of type ID if they were declared so in the DTD.
2) Your application assumed that when you use 'id' as an attribute name then it most probably was.
Either way you could use document.getElementById(id) to locate the uniquely identified node in the document tree.

Now with respect to SVG viewers, both Batik and ASV3 make an assumption about nodes in the source tree as being SVGElement nodes. As a result you can use document.getElementById(id) to locate metadata nodes that are not in the SVG namespace.

But, if they are not in the SVG namespace and not declared in the SVG DTD then they should not be accessible by this method. The two new implementations of SVG, Firefox 1.5 and Opera 9, have made this distinction. Neither of these browsers will allow you to use the DOM to retrieve nodes by their ID if they are not SVG elements.

However, there is light at the end of the tunnel. On 9th September 2005 a new W3C recommendation was published that covers this exact problem. The xml:id recommendation identifies a new XML attribute, like xml:lang, that has a special meaning. If you use an xml:id attribute, the application processing your XML should interpret this as a unique identifier for the owner element regardless of the presence of a DTD or schema. I'm happy to say that Opera 9 supports xml:id but unfortunately, at this moment in time, Firefox 1.5 does not.

So, once again I find myself being bruised and bumped by web standards support.
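The same attribute is useful outside the browser DOM. As a minimal sketch, assuming an XSLT 2 processor that recognises xml:id (support varies between processors), the stylesheet below uses the XPath id() function without any DTD or schema. The link element and its idref attribute are hypothetical names invented for this example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical input: <link idref="x"/> pointing at some element that
       carries xml:id="x". Replace the link with a copy of its target. -->
  <xsl:template match="link[@idref]">
    <!-- id() finds the element whose ID-typed attribute equals the value;
         with xml:id no DTD or schema is needed to declare that type. -->
    <xsl:copy-of select="id(@idref)"/>
  </xsl:template>

  <!-- Identity template for everything else. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If the processor does not treat xml:id as an ID, id() simply returns an empty sequence and the link elements disappear from the output, which makes missing support easy to spot.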
