bestkungfu weblog

XML 2003 wrap-up

Filed in: XML2003, Sat, Dec 13 2003 02:11 PT

I’m always exhausted after conferences. This one is no exception. It’s a charge to the creative part of my brain, a change of scenery, the shock of rapid-fire human interaction. The chance to dream of cool new stuff. XML gave me all of that, and yet, I feel like something’s hiding.

It feels to me not like the air has been let out, as I’ve seen in other conference circles, but that the business side has taken over. And I can see the toll that can take on the real visionaries in the XML world, the folks who got us where we are today. Clearly, the idealists behind this standard are paying the bills (most of them, at least). But, as Adam Bosworth said in his keynote, there were the visionaries who had the vision, and sold the companies on it so they could get their way. It feels like they’re all looking for a way to sell them on the good stuff again.

There are a lot of topics with some buzz around them. Topic maps, for example. RDF and Semantic Web applications had a few followers. XQuery and XForms are catching on. And there are some things, like Web Services, that have crossed the threshold into the domain of business. There are still border skirmishes around SOAP versus REST, and so on, but the upshot of it all is that servers and clients are talking XML to each other, which gives the idealists one more thing to be happy about.

I have to say, I want to figure out what I’m going to be demanding in ten years, and my impression is that XML is only going to be the glue for that. (It’s also my hardcore geek media fantasy, so I won’t bother with any great detail.) There will be something for the idealists to cling to, coming soon, and it’s not that quick pop you’ll get out of Atom, for example. It’s got to be something big, something that captivates the user. Microsoft, at least, knows that. It’s going to be up to the rest of the world to realize that and act in the user’s interest. I could go another several pages railing against current user experiences, and so on, but I’ll leave that for another day. For now, it’s another six hours on a plane, another hour in the car, another brass ring, another dream.

Keynote: Ludo van Vooren

Filed in: XML2003, Fri, Dec 12 2003 19:00 PT

Ludo van Vooren is, from what I understand, the wonder boy of the SGML world. At around 16, he was well-known in markup circles. He presented a short closing keynote on the things he’s learned in his years in the industry, titled “Gullible’s Travels”. He broke the lesson into three parts:

The Technology Sweet Spot

Technology moves much faster than the users can. It’s best to be in the “sweet spot”: the place the users are just starting to occupy, and where the developers already are.

Economic Drivers

It takes a leap of faith to invest in technology, “and that’s what makes our job so difficult.” It’s very hard to prove return on investment up front. Two examples:

Inter-company Connections

Companies aren’t going to argue on implementing compatibility layers, “they’re going to argue over who will pay for it.” Even though things may not be difficult technically, there are still economics that are holding things back.

Web Services Fees

Web Services work well for free services and public services. But transactional models don’t work for business, because if you do a transaction model, users can’t budget for it, or control costs. (Which is actually why providers are in love with the transaction model.) He says not to underestimate security and authentication issues. “You have to kiss the frog sometimes”: don’t just do the cool stuff, cover the stuff that pays the bills.

Extending the Vision

The complexity of SGML came out of reducing the number of keystrokes for the developer. This was an artifact of the era, where bandwidth was costly. They were developing for their constraints. The goal is to think of what you’re going to need tomorrow.

The upshot is, remember where the money comes from, and where you’re going, then be conscious of where you are.

Ruby: Atom

Filed in: XML2003, 17:30 PT

I joined the session 10 minutes late after answering a question as I walked out of my session.

Core model

Dave Winer has said recently that links should point to solid URIs based off of the top-level domain. Ruby says, why not just use the existing link element in HTML? Start, previous and next, etc., are standard, and link is extensible.

RSS 2.0’s guid and RSS 1.0’s item seem to work everywhere. In Atom, it must be a URI, but not a URL (i.e., not an http:)

Descriptions: Sam, like Tim Bray, is frustrated with the informality in RSS 2.0. Markup often contains relative URLs. Fixes have been sketchy. So Atom will adopt xml:base. A number of blogging tools put the full content of the blog entry in the element. Atom’s solution is to split things into summary and content elements. Content is not supposed to be markup (i.e., only plain text) unless denoted as “encoded”.
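To make the relative-URL point concrete, here is a minimal sketch of how a consumer might resolve links against xml:base. The entry markup, the example.org URLs, and the resolve_links helper are all made up for illustration; this is not the Atom format itself, just the inheritance rule that xml:base provides.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical entry; element names are illustrative, not real Atom.
ENTRY = """<entry xml:base="http://example.org/2003/12/">
  <content><a href="photos/keynote.jpg">photo</a></content>
</entry>"""


def resolve_links(element, base=""):
    """Yield absolute URLs for every href, honoring inherited xml:base."""
    # A nested xml:base is itself resolved against the inherited base.
    base = urljoin(base, element.get(XML_BASE, ""))
    href = element.get("href")
    if href is not None:
        yield urljoin(base, href)
    for child in element:
        yield from resolve_links(child, base)


links = list(resolve_links(ET.fromstring(ENTRY)))
print(links)
```

Without the xml:base attribute, the aggregator would have no reliable way to know what photos/keynote.jpg is relative to.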

Syndication

author: RSS 2.0 says the author’s email address must be there. Some people don’t like that. Atom uses dc.creator, which has name, email, address.

Dates: Issued date, last modified, and initial creation date (optional)

Extensibility: Atom is extensible by namespace, or by linking to data.

Ruby had a scenario involving events. Somebody you know mentions they’re going to a conference. Wouldn’t it be nice to add that info to your calendar? A solution is to point to vCal with: <atom:link type="text/calendar" href="…">

Required elements: id and modified. That’s it.

Bandwidth is a problem. It’s possible that one could use link to point to the alternate content, to help alleviate repetitive downloading of the same content.

APIs

Approached in “a RESTful fashion.” POST, GET, PUT, and DELETE. SOAP is optional, and has the same functionality. There is a way to use POST and GET to emulate the PUT and DELETE features missing in non-HTTP/1.1 servers. Resources are identified like: <atom:link rel="service.post" href="…">, etc.

Summary: Content is a rich source of metadata. Clients should be able to rely on required elements, and escaping and markup should be okay, if (emphasis his) it’s annotated. And links are cool.

The moderator asked how to find the feed from an arbitrary Web site. Ruby said Mark Pilgrim suggested the link element. Dare Obasanjo talked about his RSS Bandit which he uses for discovery of RSS. It starts with the link, then scans the links for something with “rss” in it. Failing that, he looks to Syndic8.
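The discovery order described above (declared link elements first, then links with “rss” in them) could be sketched roughly like this. It is not RSS Bandit’s code: the FeedFinder class, the find_feeds function, the MIME types in FEED_TYPES, and the sample page are my assumptions, and the Syndic8 lookup is left out entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed MIME types for feeds; adjust to whatever the tools actually emit.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedFinder(HTMLParser):
    """Collects declared feeds and, separately, links that look like feeds."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.declared = []  # feeds announced via <link rel="alternate">
        self.guessed = []   # <a href> values that merely contain "rss"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        if tag == "link":
            rels = (attrs.get("rel") or "").lower().split()
            mime = (attrs.get("type") or "").lower()
            if "alternate" in rels and mime in FEED_TYPES:
                self.declared.append(urljoin(self.base_url, href))
        elif tag == "a" and "rss" in href.lower():
            self.guessed.append(urljoin(self.base_url, href))


def find_feeds(html, base_url):
    """Prefer declared feeds; fall back to guessed ones; else an empty list."""
    finder = FeedFinder(base_url)
    finder.feed(html)
    return finder.declared or finder.guessed


# Hypothetical page used only to exercise the sketch.
PAGE = """<html><head>
<link rel="alternate" type="application/rss+xml" href="/index.rdf">
</head><body><a href="/feeds/rss.xml">Syndicate</a></body></html>"""

feeds = find_feeds(PAGE, "http://example.org/blog/")
print(feeds)
```

The fallback only kicks in when no link element declares a feed, which matches the “starts with the link, then scans” order from the session.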

Ruby says (in response to a question from W3C’s Martin Dürst) that his biggest problem with using RDF in this space is that it “ignores the fact that the most interesting data is (not in metadata but) in your data.” He says he and Pilgrim started with Atom as a list of best practices for RSS, but RSS has been “resistant” to change, so Atom-the-format was born.

May: Accessibility and CMS

Filed in: XML2003, 16:45 PT

(This is kind of artificial, me blogging my own XML session, but at least everybody knows I’m not going to disagree with the speaker. I thought he was very well-dressed.)

I gave a basic overview of the Web Accessibility Initiative, what we produce in terms of guidelines, education and specification review. (I left out the Protocols and Formats paper on CAPTCHA, because I’ll be presenting on that at CSUN, and I’d hate to have nothing to talk about between now and then.)

I did a quick introduction on the Web Content Accessibility Guidelines, and how it’s been adopted by governments and corporations around the world. I then focused on the range of authoring tools, including conversion tools, WYSIWYG tools, multimedia tools, and, of course, CMS. I said that the one I’ve been highlighting everywhere I’ve been speaking this year has been CMS, given the comparatively large quantity of content locked up in them, and the relatively few templates that could be made accessible. (Yes, blogging software developers, I’m talking about you in the set of content management system vendors.)

The Authoring Tool Accessibility Guidelines 1.0 document has been a W3C Recommendation since the end of 2000. Conforming to ATAG indicates that the tool has done just about everything a tool can do to ensure the corresponding level of WCAG conformance. It breaks down to four categories for CMS vendors and developers to focus on:

Accessible content

Valid content is the starter. Valid’s good. You should be valid. And semantically-rich, too: screen readers and other assistive technologies understand and represent lists to users meaningfully. But if you’re just doing line breaks between links, you’re probably already broken. CMS tools should know how to fix common accessibility problems as they arise. But they should also be smart enough not to strip out markup they don’t understand. That’s been a problem with lots of tools to date, and has resulted in lots of wasted effort.

Accessible collateral

Most CMS products have databases for binary content (images, video, etc.), and that kind of data store needs to have room to attach, retrieve and edit metadata. Capture alt text on images, but don’t tie it one-to-one to the image. Sometimes, images mean different things in different contexts, and authors should be able to represent that to the user at will.

Accessible templates and documentation

Guide the user with your educational and bundled content. Anybody who’s designed a system that comes with templates knows that lots of users leave most of what’s there in place when they develop on top of it. (Submitted for your approval: most Movable Type blogs.) If the templates are inaccessible, not only did you break potentially thousands of sites, you don’t get to pull it back. And worse, your template is now considered to be best practice, so even as people evolve, they’re still apt to screw it up based on your example. Fix your broken markup before you ship. Then, in your documentation, point out the accessibility-related features of your publishing process, and make sure all of your code samples implement good accessible design.

Accessible interface

Did you know that people with disabilities even write Web content? It’s true! The interface to your CMS’s business end needs to be accessible to users with disabilities, too. When dealing with ActiveX, Java, or script-based HTML editing tools, make sure that users with script turned off can add or manipulate content. Make sure that your JavaScript calls don’t infect the href attribute: put that stuff in the events, and link to usable content with the href. If the CMS has a Web interface, it should conform to WCAG. If it’s standalone, it should conform to the accessibility guidelines of its host operating system. Every major windowing system has a list of guidelines for accessibility. Use ‘em. They’re good for you, and they just may make you rich.

It was a fun session. I also do weddings and bar mitzvahs, and if you have a roomful of interested developers, who knows, maybe I’ll come to your house or place of business. Especially if it’s in Hawaii. Please, let there be a cache of authoring tool developers in Hawaii.

My slides are public, as is the paper on accessible content management systems I wrote for the conference.

Special thanks to Tim Bray for lending me the stupid DVI-to-VGA adapter I always lose when I go out giving presentations.

Ogbuji: Python and XML

Filed in: XML2003, 15:30 PT

Uche Ogbuji presented on processing XML with Python. He says Python has been called “readable pseudocode.” He said that he was “hooked” on Python once he discovered its expressiveness and readability relative to, for example, Perl.

Python has well-designed Unicode support, built-in support for text processing, Internet, and XML. Recently added have been generators and iterators, which help for working with lists and program control. He went over a large number of the implementations, many with code examples.

xmllib

Not highly recommended: out of date.

xml.parsers.expat

Low-level interface to James Clark’s expat. Very fast, SAX-like interface.

xml.sax

SAX implementation. The dominant push model for XML. The parser streams events to a custom handler module. Methods are invoked as callbacks. To get SAX working, you set up a custom class built on sax.ContentHandler. You handle your own depth with your own implementations of startDocument, startElement, and endElement methods. You instantiate the parser with sax.make_parser(), and set the handler to the custom class you created. SAX is low-memory, is portable and reusable, but can require sophisticated state management code for big tasks, and has some syntactical hooks that are noticeably non-Pythonesque. (I didn’t write down the code sample. Sorry. It’s easy to find.)
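Since I didn’t capture the sample, here is a minimal sketch of the pattern as described, not Ogbuji’s actual code. The DOC sample and the LineHandler class are made up for illustration; the library calls (ContentHandler, make_parser, setContentHandler, parse) are the standard xml.sax ones.

```python
import io
import xml.sax

# Hypothetical sample document; the talk's actual input was not recorded.
DOC = b"""<verse>
  <line>First line</line>
  <line>Second line</line>
  <line>Third line</line>
</verse>"""


class LineHandler(xml.sax.ContentHandler):
    """Collects the text of every <line> element, tracking state by hand."""

    def __init__(self):
        super().__init__()
        self.in_line = False  # are we currently inside a <line>?
        self.buffer = []      # text chunks for the current <line>
        self.lines = []       # finished line texts

    def startElement(self, name, attrs):
        if name == "line":
            self.in_line = True
            self.buffer = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks.
        if self.in_line:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "line":
            self.lines.append("".join(self.buffer))
            self.in_line = False


handler = LineHandler()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(DOC))

third_line = handler.lines[2]
print(third_line)
```

The in_line flag and the buffer are the “state management” part: SAX hands you events and nothing else, so anything you need to remember between callbacks is yours to track.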

xml.dom

DOM implementation. There are actually many implementations of DOM in Python (Ogbuji suggests trying pxdom, which he says is a rigorous implementation of DOM level 3).

    from xml.dom import minidom

    document = minidom.parseString(doc)
    root = document.documentElement  # an attribute, not a method
    child = root.childNodes[1]
    print(child.attributes)
    lines = [node for node in root.childNodes
             if node.nodeName == u'line']
    # Note the 'u' before 'line': it denotes Unicode.
    third_child = lines[2]
    third_child.normalize()
    print(third_child.firstChild.data)
    print(root.toxml(encoding="utf-8"))

There’s no complex state management needed, like there is with SAX, decent interoperability, and Python generators can make things fun. It is, on the other hand, memory-heavy, and also has a somewhat non-Pythonesque implementation.

xml.dom.pulldom

Says Ogbuji: “It’s good for what it’s good for”: pulling bits out of large files. He says it’s easier than SAX, but it’s more of a theoretical ease than a practical one.

    from xml.dom import pulldom

    events = pulldom.parseString(doc)
    line_counter = 0
    for (event, node) in events:
        if event == pulldom.START_ELEMENT:
            if node.tagName == "line":
                line_counter += 1
                if line_counter == 3:
                    # start processing
                    events.expandNode(node)
                    # do the other stuff you want at that level
                    print(node.firstChild.data)

xmlTextReader

Pull API similar to .NET’s XmlTextReader interface, built into libxml2 from the GNOME project. The core is implemented in C.

    import cStringIO
    import libxml2

    XMLREADER_START_ELEMENT_NODE_TYPE = 1  # hacked in from C library

    stream = cStringIO.StringIO(doc)
    input_source = libxml2.inputBuffer(stream)
    reader = input_source.newTextReader("urn:bogus")
    line_counter = 0
    while reader.Read():
        if reader.NodeType() == XMLREADER_START_ELEMENT_NODE_TYPE:
            if reader.Name() == "line":
                line_counter += 1
                if line_counter == 1:
                    node = reader.Expand()
                    print(node.children.content)
                    # skip what you just expanded so you don't see it twice
                    if reader.Next() != -1:
                        break

ElementTree

“Designed largely out of frustration with DOM’s lack of Python idiom.”

    import cStringIO
    from elementtree.ElementTree import ElementTree

    stream = cStringIO.StringIO(doc)
    tree = ElementTree(file=stream)  # parse from the file-like object
    third_child = tree.findall('line')[2]

gnosis.xml.objectify

Maps from XML to Python objects. Part of the gnosis tool set.

    import cStringIO
    import gnosis.xml.objectify

    stream = cStringIO.StringIO(doc)
    xml_obj = gnosis.xml.objectify.XML_Objectify(stream)
    verse = xml_obj.make_instance()  # root element becomes a Python object
    verse.line[2].PCDATA

Anobind

Requires Ogbuji’s 4Suite. Uses declarative rules to bind to Python better. Gives extra tools like XPath, RELAX NG, XML Catalogs and XInclude.

    import anobind
    from Ft.Xml import InputSource

    isrc_factory = InputSource.DefaultFactory
    isrc = isrc_factory.fromString(doc, "urn:bogus")
    binder = anobind.binder()
    binding = binder.read_xml(isrc)
    print(binding.verse.line[2].text_content())

Ogbuji has an article on xml.com listing all of the 15 million XML tools for Python.

Town Hall: W3C and OASIS

Filed in: XML2003, 02:00 PT

The W3C’s Philippe Le Hégaret went first. (Full disclosure: I work for W3C, and have been chatting all week with Philippe and others, but he yelled at us for leaving food around the booth, so I should be pretty close to impartial. I’ll note when I’m not.) He emphasized the position of W3C that the standards created in Web Services need to be IP-unencumbered. He enumerated the current overview of Web Services, and new work going beyond Web Services, including XML Key Management, XML Signature, XML Encryption, and a P3P implementation for Web Services.

OASIS’s Jamie Clark followed. Most of the OASIS slide on standards was the same, with one notable exception: he believes standards should be subject to “explicit, disclosed” IP terms. “Anything else,” says the slide, “is proprietary.”

“Harmonization is hard.” Instability of specs (e.g., some owner rescinds a standard, or changes it each year) causes formats to be of less value than open standards.

He noted that OASIS has submitted a UDDI namespace document to IETF for consideration as an RFC, and does so with other groups as well. They have thirteen documents that are final, and several dozen that have provisional approval.

He says everyone has pretty much agreed on SOAP for messaging, and that lots of people are using the UDDI and ebXML registries, so service discovery is good. On service description, W3C is winning with WSDL, though there are hints that something else is coming. “We haven’t found anybody who can’t use WSDL yet.”

Audience question to Philippe: “What does WS-Inspection have to do with standards?” Questioner specifically noted ebXML registry. Philippe answered that he was thinking of a system where “you don’t need a central registry.” Questioner said that there are already two competing standards in the space, and convergence has to happen before good things happen. Jamie said there are people working in those groups who disagree.

Jamie says that orchestration and management are in a “rap singer” phase: everybody who has the mic says they’re the greatest. “We’re all developing these systems trying to solve this problem.” OASIS has several groups working on this, including the BPEL technical committee. W3C has one as well, the Web Services Choreography Interface working group.

Proposed questions:

  • Should users implement web services with proprietary products or wait for the standards?
  • Are conformance and interoperability part of the lifecycle of a standard?
  • How can we move from enterprise services to web services?

Mark Palmer: “There’s a level of frustration” with the moving around of data payloads in the implementer community.

Joe Chiusano, OASIS: “In some cases, it can actually help a standard” to start out proprietary, get shaken out, then get brought to a standards body. Philippe mentioned that W3C is learning and getting feedback from WSDL and SOAP being in everyday use, and is using that feedback to work on it.

Question on the state of content management standards: Philippe says W3C is using that kind of feedback in WSDL 2.0. Questioner (Farouk? from Sun) said ebXML is on trajectory to becoming a content management standard. Sun has several CMS products, and none of them talk to each other. In the absence of standards in this space, he proposes using ebXML as the basis for a CMS standard. Jamie mentioned that there are products one can buy that allow content migration. (My impression is that the question centers on mergers, where existing databases collide. Personally, I think that CMS vendors are still aiming more for lock-in than open content exchange, and they’re going to have to grow up — er, mature — and stop holding corporate content hostage when someone buys into their system. But I won’t say that, because I want people to come to my session on content management accessibility tomorrow.)

A Sun consultant says that in his experience in the field, customers are now seeing the point in developing a single architecture based on standards. He says that customers are getting the religion of waiting until “the vendors get done fighting it out.” And they want it fast. Philippe says that it’s still important to do it well, even if that takes time with, for example, WSDL 2.0. Eve Maler from Sun says there’s a lot of moving to the middle to be done, particularly with things like security.

Joe Chiusano: It’s more critical the lower down the stack to agree on a singular standard, and perhaps less important at higher levels, especially with things like XSLT. Michael Sperberg-McQueen of W3C says that, expanding on this, the lowest layer is designed to allow more than one mechanism, which would contradict the assertion. Our colleague Martin Dürst mentioned that in IETF they look at this more like an hourglass. (This makes sense if you understand the OSI model. Honest.)

Farouk addresses the second question by submitting the Java Community Process, which requires conformance testing in order for specs to move forward. Philippe agrees, citing the W3C Candidate Recommendation process, where two implementations of each part of the specification must be found. Additionally, CR requires a test suite to ensure interop and conformance. Jamie says that OASIS requires groups to certify that they conform to a standard, but they don’t require evidence of that conformance. “Frankly, it extends the process quite a bit.” Michael interjects: “Conformance tests cannot prove conformance, because tests cannot prove correctness.” You can only scientifically disprove something. (This is why I don’t get into arguments with logicians.)

Eve Maler of Sun: “Once a specification starts to show some traction,” it’s important to have test suites, etc., but before then, it’s a lot of work to impose on a group. Conformance clauses and testable assertions are sufficient before then. Mark Palmer adds that two interoperable implementations is the minimum, and OASIS’s approach of implementability is insufficient. Jamie says that at the final stage, they actually require three implementations, but still don’t require interop. WS-Security had “several big” interoperability tests, but it isn’t documented in the process. Janet Daly, head of communications for W3C, notes that a publicly-viewable implementation report is required to exit Candidate Recommendation. Also, comments on the spec must have some disposition in public.

Moore: Semantic Web servers

Filed in: XML2003, Thu, Dec 11 2003 22:30 PT

Thus far, the Semantic Web is “unplugged” — it’s missing connectivity of SemWeb objects to each other. The tools and standards in existence are great for standalone servers pushing out content, but only HTML is going out over the wire. We’re missing the part where servers talk to each other. The famous SemWeb layer cake is missing a protocol layer.

A SemWeb protocol would be useful for clients (e.g., IE) that want to explore metadata or other relationships with the content they’ve received; client apps that want to aggregate content to share it with other clients (ooh. Want that.); or to query into multiple SemWeb applications and unify the results for the user.

An app using this protocol needs to be able to update, query, be easy to implement and deploy, support transactions, allow server introspection, resolve identity, be secure, support auditing, and have a small footprint. (I’d like to add that it should clean my toilet and make me cookies. Or at least contact my toilet-cleaning and cookie-making bots, discover what they do, and make them do it.)

HTTP and URIQA are two existing protocols that don’t fit all the requirements. HTTP doesn’t have query capability, a way to perform updates, etc., and that is keeping the SemWeb from gaining traction. URIQA, on the other hand, does these things, or does them better.

Moore and Andy Seaborne of HP wrote the RDF Net API and submitted it to the W3C. The goal was a way to remotely update and query RDF models. It was designed as an abstract protocol because they didn’t want to design an XML-based language, or do the syntax first. An abstract protocol allows different implementations. The API consists of:

query

Allows queries to be written in several languages, with a choice of result formats (RDF, XML, etc.)

getStatements

Evaluate contents.

insertStatements

Add new knowledge into knowledge base.

removeStatements

Delete stuff.

putStatements

Replace the knowledge base.

updateStatements

Batch job to insert and remove in a transaction.

options

What query languages the server supports, etc.
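To get a feel for the shape of those operations, here is a toy, in-memory sketch. It is emphatically not the RDF Net API: the ToyStatementStore class, its method signatures, and the ex: identifiers are all hypothetical, statements are plain (subject, predicate, object) tuples, and the query operation is left out because this toy supports no query language.

```python
class ToyStatementStore:
    """In-memory stand-in; each statement is a (subject, predicate, object) tuple."""

    def __init__(self):
        self.statements = set()

    def get_statements(self, subject=None, predicate=None, obj=None):
        """Return statements matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return {
            stmt for stmt in self.statements
            if all(want is None or want == have
                   for want, have in zip(pattern, stmt))
        }

    def insert_statements(self, statements):
        """Add new knowledge to the knowledge base."""
        self.statements |= set(statements)

    def remove_statements(self, statements):
        """Delete the given statements if present."""
        self.statements -= set(statements)

    def put_statements(self, statements):
        """Replace the whole knowledge base."""
        self.statements = set(statements)

    def update_statements(self, remove=(), insert=()):
        """Apply removals and insertions as one batch."""
        # Build the new set first, then swap it in with a single assignment,
        # so an error while building never leaves a half-applied update.
        self.statements = (self.statements - set(remove)) | set(insert)

    def options(self):
        """Describe server capabilities; this toy has no query languages."""
        return {"query_languages": []}


store = ToyStatementStore()
store.insert_statements([
    ("ex:moore", "ex:wrote", "ex:rdf-net-api"),
    ("ex:seaborne", "ex:wrote", "ex:rdf-net-api"),
])
store.update_statements(
    remove=[("ex:moore", "ex:wrote", "ex:rdf-net-api")],
    insert=[("ex:moore", "ex:presented", "ex:rdf-net-api")],
)
authors = store.get_statements(predicate="ex:wrote")
print(authors)
```

Because the real thing is an abstract protocol, the same operations could sit behind a SOAP binding or an HTTP binding without changing what they mean.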

SOAP and HTTP bindings are available. RDF Net API meets the aforementioned design guidelines, with some limitations. There’s no security model, and there are some issues with queries. Also missing is a Topic Map protocol. Moore describes topic map servers as “SemWeb servers in disguise.” Some of the approach with RDF Net API is reusable with topic maps. Among the issues is how to get small portions of a given topic map. (After he went from RDF Net to topic maps, I went from treading water to drowning. I’m not what you’d call a subject matter expert on topic maps. I tried to hang in there, folks. I really did.)

Robie: XQuery

Filed in: XML2003, 17:30 PT

XML is not a relational database, so using existing relational approaches doesn’t always apply. In moving relational data to XML via SQL and the DOM, you write a lot of uninteresting code. Robie goes over three approaches:

Code it yourself

It’s portable, but sucks to write.

SQL/XML

SQL 2003 includes a part called SQL/XML that is easy to use for existing SQL programmers. It includes a set of SQL functions to bridge relational data to XML. Support is available for DB2 and Oracle, and an independent implementation is available from DataDirect.

Native XML programming

XML is fundamentally hierarchical; object-oriented systems are fundamentally graph-oriented. It’s also more than just text, but a description of relationships of data. Native XML programming abstracts away the issues with parsing, etc. XSLT, XQuery, etc., are among those things he considers to be native.

Robie offers a fourth way:

XQuery for Java

The “JDBC for XQuery” is XQuery for Java. It doesn’t exist yet. Robie’s company has submitted a proposal under the Sun Java Community Process to meet this need. It needs to deal with relational and XML data equally, but ultimately must be XML-centric.

Question (well, hypothesis) from the audience: “Logical progression that says that XSLT should go away.” Robie says, “Not necessarily,” and notes that there are some applications where XSLT is just easier to do. XSLT will be used for design, and XQuery will be used for data management, as designed.

XQuery seems to be the subject of a lot of buzz this week. Paul Cotton, who presented before Robie, offered two pages in small print of XQuery implementations in his slides. They’re not all complete, and most probably never will be, but by all appearances it should be mentioned as commonly as XSLT about two XML conferences from now.

Keynote: Dave Thomas

Filed in: XML2003, 14:45 PT

Dave Thomas of the Open Augment Consortium keynoted today, and — auigh! — his slides are in Comic Sans. (Sidebar for people who don’t regularly read me: I hate Comic Sans.)

We’ve heard the opening before: In the beginning, there was Vannevar Bush’s As We May Think. Then Doug Engelbart’s “Augmenting Human Intellect” in 1962. Then Ted Nelson’s “Hypertext” in 1965.

The Augment system is 35 years old. This is the markup of an Augment document:

<ADDRESS : VIEWSPECS ; CAPATTERN;>

He says he doubts that XPath, XQuery and XPointer combined could allow the kind of introspection that Augment does. He went over the innovations in Augment, such as the mouse and chording keypad, multiple windows and fonts, object-oriented editing, video conferencing, the log-based filesystem, among other things that have been picked up on since. (I wonder what Jef Raskin would be saying here if he were invited to talk about the Canon Cat.)

Augment takes a “non-embedded approach”, as opposed to the current edit-and-annotate process with, say, email. Each document, then, stands alone, linked by reference to each other. Quotable: “Most of our technology sucks compared to this system.”

“You have to be willing to make an investment” in a five-finger chording mouse to really get into Augment, he says. “You’re not going to get there with a point-and-click interface.” They’re working on three-button mice and chording keyboards with Logitech as collector’s items and for use with Open Augment.

On to Open Augment: a non-profit corporation formed in 2003, with a mission to preserve the Augment legacy on the Web. It’s looking to reverse-engineer the “essence” of Augment, and redesign it using open standards and Web technologies. They’re using modern interfaces (i.e., the browser) using the MVC and HTTP interfaces, and it should integrate with things like instant messaging.

Thomas complains that the core standards (XLink, SVG, XUL, etc.) are not commonly implemented. He says the vendors say (and I paraphrase) yeah, welcome to XML utopia, linking not supported. (Good point.)

On the first release of Open Augment: “We’ve all agreed on the wrong way to do it.”

Thomas took a shot at SOAP as a technology that was designed to be so complex that only Microsoft and IBM could implement it. (Another satisfied XML-RPC customer?)

He asks, “Why is it so complicated? Why is there so much stuff?”, when the Augment system took seven people three years to implement. (I think this is an easy one to answer. Don’t you?)

Open Augment is selling a DVD of the Engelbart demo of Augment from lo these many years ago. They’re looking for a number of people to review and help author for Open Augment, integrate subsystems and file formats, alternative clients, etc., and to contribute. He appealed to new developers: “We’re losing the people who made our industry,” he says, enumerating the people who have died recently. Engelbart is still around (and has an office at Logitech), but is in less-than-perfect health. He wants to get more people involved to continue the work.

I’ve seen a few of this style of presentation now. Someone comes up and says, “I don’t know much about your technology, but it can’t do what this stuff does.” While there’s a lot of value to evangelizing the work of a pioneer one idolizes, I’m really tired of going to conferences to hear “all the stuff you’re doing sucks.” It’s a very divisive approach, often causing others who are otherwise sympathetic to tune out. Yes, we know that Ted Nelson-esque hypertext advocates aren’t happy with the state of the world. Now go out there and offer Open Augment as a bridge to better hypertext, without going over the top and proposing a world-changer.

But that’s just my impression as an infrequent follower of the hypertext world who doesn’t really have any vested interest to speak of in the outcome. Go check out Open Augment and figure it out for yourselves.

Paoli: XML on the desktop

Filed in: XML2003, Wed, Dec 10 2003 21:45 PT

Jean Paoli started by echoing Jon Udell’s opening keynote. Udell said we are at the point where the language used by people to communicate is the same language used by computers. He says InfoPath heralds “a new era for XML on the desktop.”

Everyday XML documents are broken up into four parts: the document, the end user, the back end, and the process (workflow).

North Carolina state patrol troopers are using XML forms on mobile units to submit information directly to a mainframe in a pilot program. Troopers and administrators had been using more than 500 forms, with lots of manual work, etc. So they moved to InfoPath, with units for the end users, on their existing IBM mainframe with Web Services, wrote an app in “a couple of months”, and got more efficient.

GOL Linhas Aéreas, an airline in Brazil, uses an XML spreadsheet representing the Flight Timeline Board. Everyone from the front office to the catering folks uses the board for analysis and logistics. Analysts used to work with data that was three days old by the time it was all gathered and scrubbed. They now use an Excel 2003 template that generates current data, over an XML-based reservation system. (From my hazy memory of the systems at Expedia, having an XML-based CRS would be a godsend. The major ones, like Sabre, are (or were until recently) still on ancient mainframe systems.)

Continuing Legal Education Society of British Columbia marks up its documents in XML and has a repository to link to documents and statutes. They used Word 2003 with their own schema, and a CMS, and converted manuals, etc., from print to online.

Merck, a small pharmaceutical concern, is doing a pilot on clinical trials. They have reporting requirements of 24- to 48-hour turnaround in late-stage trials when incidents arise. They use an InfoPath form with their own schema, and their existing databases.

Office 2003 XML Reference Schema “Open, royalty-free license program”. WordprocessingML, SpreadsheetML, etc., published, including documentation, in the open (though if I remember right, InfoPath isn’t available in all versions of Office, no?)

Various arguments in one slide for end-user XML, such as interop, ubiquity, semantics, tool availability, and standardized schemata.

Paoli advocates “everyday XML documents” which are lightweight structure, and which he says “are running the world.”

Someone asked if there’s an Office OS X version with similar functionality. He said he couldn’t answer.

He compared InfoPath again to Notepad, saying it’s low-level enough to do simple reads and writes, and layered enough to include APIs and so forth, including XML Digital Signatures as a security layer.

When asked, “Are DTDs dead?” he responds: “I never use the word ‘dead’.” It just happens that XML Schema makes their lives easier.
