This White Paper is intended for web developers: it describes a design and development style well-suited to creating customizable Web applications. While this paper discusses development in the context of the PIA technology, the fundamental philosophy, which can be roughly characterized as "Web application maintenance is mostly a task of document management and should not require specialized programming skills or tools," applies equally well to other platforms. Several other technologies, such as Meta-HTML, PHP, and the proposed Apache XML initiative, support some of the "active XML" features of the PIA. We encourage the evolution and convergence of these technologies to a standard, platform-independent design for Web applications that facilitates shared development and ongoing maintenance.
The World Wide Web has given rise to a new category of applications: programs whose primary user interface is a web browser, and which consist of a large amount of information or ``documents'' mixed in with a small amount of behavior expressed as ``code.'' In effect, applications have become specialized web servers.
Given the ever-changing nature of information and the Web environment, easy customization and long-term maintenance should be considered key factors for Web application development. This paper describes an XML-based design approach and development methodology that supports ongoing maintenance while minimizing the need for special technical or programming skills. These design principles, most appropriate for applications deployed by smaller organizations without significant IT support, are embodied by the Platform for Information Applications (PIA).
PIA-based Web applications consist primarily of XML documents written using a set of domain-specific tags. (<h1> and <a> are examples of HTML tags. XML, which stands for eXtensible Markup Language, is similar to HTML but allows tags to be defined and used as needed.) Maintenance means editing these documents using any standard XML editor. The flexible PIA engine serves pages in response to client requests by dynamically processing these documents in accordance with developer-defined semantics. This processing may include simple tag substitution, page transformation, database lookups and insertions, or any other functions appropriate to the application domain. In essence, the tags provide a specialized vocabulary for customizing the application.
This approach promises platform-independent, easily customizable Web applications. XML support in the form of editors and other tools already exists on essentially all platforms and continues to grow. Specifying the processing in XML not only makes the logic more accessible but also restricts the dependence on a particular computing environment to the implementation of a few tags. In the PIA, semantics for the small set of primitive or "basic" tags are defined in Java, while the majority of application-specific tags (those defined by developers) are specified in terms of these primitives.
Freely available as open source, the PIA software, written in pure Java, is available from RiSource.org and includes interfaces that conform to all relevant standards, including the W3C's Document Object Model (DOM) and the Simple API for XML (SAX). RiSource.org also distributes and helps coordinate the development of open source Web applications, including a workflow system, a shared calendar, a Web site management tool, and a personal ``browsing assistant.'' As the technology for developing and deploying XML-based Web applications becomes standardized, RiSource.org will help facilitate the collaboration of the expanding pool of developers who create and maintain Web applications.
The World Wide Web as we know it today represents not one but two revolutions in the world of computing. The first was to transform the entire internet into one huge collection of documents, each potentially no more than a mouse-click away from any other. The second, less obvious revolution was to transform services into collections of active documents.
A web application is a computing service accessed through the web. It consists of a web server -- the ``engine'' that makes documents available to browsers -- and a collection of documents, some of which are ``active,'' i.e. they make the engine ``do things'' like order a book or buy stocks for the client.
This highly efficient method of delivering services represents a potentially huge productivity gain for both providers and customers. Amazon's book recommendations and 1-Click ordering can literally pay for the cost of a book in the amount of time and effort it saves.
Unfortunately, the overhead costs of developing and maintaining Web applications limit them to large organizations well supported by an IT infrastructure. Even though most Web applications consist primarily of (unstructured) information with small amounts of embedded software (what Tim O'Reilly has called Infoware), they tend to require the same development tools and skills as ``traditional'' software applications. Development and modification of an application requires a programmer, someone whose primary job is translating the desired behavior into a machine language. This is not unlike the state of numeric computing before the advent of spreadsheets, when financial analysts required assistance to translate their formulas into Fortran programs. This paper describes an approach to building Web applications that reduces the overhead costs, making it possible for smaller organizations and individuals to realize the efficiency gains of providing their services via the Web.
Consider five different types of web applications: public, enterprise, group, personal, and embedded.
Each type of application has different requirements, support, and performance characteristics. Public and enterprise applications must handle large numbers of simultaneous users with 24/7 accessibility. These applications generally have full-time staff devoted to their maintenance. Group applications generally have a relatively small number of users but must constantly evolve as the information and group needs change. Embedded applications must be very robust and may be deployed in harsh environments with no local support.
Existing Web application designs and development tools are most appropriate for public and enterprise applications. These applications generally have well-defined software components that are developed and maintained by a support group. In contrast, our design methodology seems more appropriate for group applications intended to be customized and maintained by the people who use them. Furthermore, this XML based approach provides a safe mechanism for extending and customizing applications in the field, a useful feature for embedded applications.
Before describing our approach in detail, we first review the standard technologies and life-cycle for Web applications.
There are three main ways of making documents on a web site ``active:''
In this section we will examine the development ``life cycle'' of a web application, and touch briefly on how it differs from that of a software application.
Web applications require a different kind of development than traditional software applications. Software applications have always (of necessity) been developed by programmers, sometimes with the assistance of a few user-interface designers (all too often called in at the last minute to figure out why nobody wants to use the new software). Most interactive applications end up being about 70% user interface.
Web applications, on the other hand, typically exceed 95% ``content'' -- as seen by the user, a web application consists entirely of documents: HTML ``pages'' viewed in a browser. Scattered through these documents are the buttons and text boxes of a more traditional user interface.
The developer's perspective depends in part on which technology they choose. In the early days of the Web, applications consisted primarily of standard HTML documents with a few CGI scripts to handle <form> submissions. Initial development consisted of creating HTML pages and then creating the CGI scripts to handle specific forms.
It soon became obvious that this approach was too restrictive, since the HTML documents were static and could not reflect updated information. Moreover, developers grew tired of maintaining the correspondence between the HTML forms and CGI processing. Servlets and similar technologies solved these problems by generating every page programmatically. In essence, a method or set of methods is written in your favorite programming language to generate all of the pages that constitute the application. Applications could be arbitrarily powerful, and all the information was contained in a single set of (source code) files, making it easier to maintain the correspondence between the (generated) HTML and processing. Development fit very well with traditional software development -- generate some specifications and then write software to meet the specifications.
Eventually developers realized that modifying source code to fix a missing </table> tag or modify the navigation bar was a tedious job. Besides that, programming tools are a poor match for the linking structure of Web applications. Server pages provide a compromise solution: regular HTML pages containing bits and pieces of embedded code that the server interprets dynamically. Developers who are comfortable with the syntax for both HTML and the embedded language can freely intermix the two. Page layouts and prototype applications can be first developed with static HTML and then augmented with embedded code to provide the desired functions.
Server pages work reasonably well assuming that all of the developers have similar skill sets, which include the ability to write in the pages' embedded programming language, and a standard development environment. Often, though, the embedded programming language makes the documents incompatible with standard structured editors (or the editors produce HTML/XML which breaks the embedded language) and inaccessible to non-programmers.
As described in detail below, the PIA approach uses a type of extensible server page. In contrast with current systems, the embedded ``language'' is pure XML which has several key advantages, such as widespread support from designers and development tools.
Once a web application has been developed, it has to be deployed: uploaded to a public server and integrated with the company's existing web site. Workgroup applications are deployed on an intranet server, but the principle is essentially the same.
Deployment of a web application is a much more traumatic event than the deployment of a piece of software. Software's availability can be controlled by controlling its distribution. It goes first to a select group of beta testers, and only then (after a few rounds of bug fixing and refinement) to wide distribution. Even after the software ``hits the shelves'' it will take a long time to ``ramp up'' to full deployment.
A web application is different. After it is installed on a public server it may be only a matter of minutes -- days at the most -- before it is found by the search engines, and shortly thereafter by a horde of eager users. If a public announcement is made, the ``Slashdot effect'' may inundate the server with a ``flash crowd'' -- Britannica's web site was all but inaccessible for a week after its introduction.
This means that the maintenance cycle is much shorter for a web application than for a software application. The application's designers may have to respond to problems within minutes, rather than months. Support for rapid customization is essential. In effect, what software developers call ``rapid prototyping'' goes on even after a web application is released.
Once a web application has been deployed, the maintenance begins. It has been said that ``software is the only field in which adding a new wing to a building is considered maintenance.'' This is even more true of web applications: a web site may be completely redesigned and rebuilt several times over its lifetime.
There are three aspects to maintaining a web application: monitoring the server to ensure that it's operating properly, editing and updating the documents, and modifying the software to keep up with the changes in the documents.
One of the unpleasant facts of maintenance is that at some point the original developers usually move on to other projects. This is not a significant problem for the ``content'' portion of a web site: professional writers and designers are good at maintaining a consistent style at an organization. It is a problem for the application's software -- this is often obscure and poorly documented, and the programmers are usually the first people to move on.
The software portion of a web application, because it is usually written in ``scripting languages'' like Perl, or is broken up into tiny fragments embedded in the site's documents, is almost always harder to maintain than a traditional all-software application. Sometimes it's easier to scrap it and have a new programming team rewrite large portions, than to figure out some earlier programmer's tricks.
The PIA's document-processing framework, which allows the designer to define special-purpose tags that are shared among many documents, simplifies maintenance in several ways:
<header> and <footer> tags can replace many lines of complex HTML coding.
The line between ``application maintenance'' and ``customization'' is extremely fuzzy for web applications, especially for group or personal applications. The PIA helps make many maintenance tasks more like the kind of customization that users and web developers are familiar with: modifying documents rather than software.
One of the major differences between software and infoware shows up when a design and development team moves on to its next project. A software team's next project is almost certain to be a variation on its previous one -- another compiler, say, or another printer controller. The team accumulates knowledge, a suite of tools, and a library of re-useable code that grows with each new project.
A web design team, on the other hand, is more likely to move on to something very different. A few basic tools and scripts may be carried over from one project to the next, but most of the content -- the information -- will be new. Many server-side scripting languages, in fact, were designed by web design consulting companies in order to provide a framework for code re-use.
The PIA's tagsets and configuration files give web designers the equivalent of the software team's code library. Entire sub-applications (for example, a calendar) are also easily portable to new projects, and very easily customized.
Software development is situated somewhere between a craft and an engineering discipline, and many design and development tools and methodologies exist to assist the process in all of its stages. Software designers may use CASE (Computer Aided Software Engineering) tools; programmers can count on syntax-checking compilers, interactive debuggers, and even automatic documentation extractors (Javadoc being a recent example). For maintenance there are tools like profilers for improving the software's performance, version control systems such as CVS for archiving changes, and bug-tracking systems for managing requests.
Even though the linear ``waterfall model'' of the software development lifecycle has been partially abandoned, software development is still fairly straightforward. There may be a flurry of ``rapid prototyping'' while designing the user interface, and major additions may be made during maintenance, but on the whole the picture is one of steady progress and gradual evolution.
Even in the design phase software is relatively straightforward. Whatever specific methodology is being followed, the application at the end is usually fairly close to what the original requirements specified.
Web application design and development are significantly more chaotic. Whereas it's almost inconceivable for a software application to start out as a compiler and end up as a web browser (Emacs may be one of the few exceptions), it's not unusual to find a simple search engine that has transformed itself overnight into a ``portal site.'' Unlike software development, few (if any) methodologies exist to guide this process.
Then again, the tools available for building web applications are still very primitive. WYSIWYG editors are good for documents whose ultimate destination is ink on paper, but they fail miserably when applied to a web page that may be viewed on anything from a Palm Pilot to a 21-inch monitor. There is little if any support for complex operations like changing the header and footer of every document on a web site. There's no support at all for editing documents that may contain bits of scripting mixed in (especially if some of it is meant for the browser and some for the server, in two different languages).
Some proprietary Web development platforms, such as Cold Fusion, do support some aspects of application development and maintenance. Unfortunately these tend to be very expensive, have their own nonstandard programming component, and of course lock the designers in to a single vendor.
The largest problem facing a workgroup or small business is customizing their web applications. An enormous amount of effort can go into designing and maintaining a public or enterprise-level web site, but the project has the full support of the company's IT department, a large budget for designers and consultants, and so on. Since the revenues from a public web application are likely to be large (they may even be the company's only revenue stream, as in the case of an Internet-based business), it's easy to justify a large expenditure. Similarly, an enterprise-wide intranet site is going to be run by the IT and HR departments, which can easily justify its cost.
The situation is far different in a workgroup or small office. The software complexity of the application is likely to be almost as great as that of a large public web site, but the amount of information associated with it is far less, and there may not be even a single full-time person dedicated to maintaining the entire network, let alone the web applications that make it useful.
As a result, small-scale web applications are usually ``home-grown'' in somebody's spare time. (They might be bought ``off the shelf,'' but shrink-wrapped web applications are exceedingly rare at this point.) They will tend to grow haphazardly, as the result of a series of customizations to meet the group's changing requirements. Of course, it's almost trivial to customize the information part of a small web site. The software is another matter, and will often consist of small CGI scripts downloaded from the Net and changed as little as possible.
Web applications that are built using extensible server pages (for example, ASP, JSP, PHP3, and Meta-HTML) tend to be easier to customize than those based on standard programming techniques such as CGI scripts or servlets. The PIA's approach to extensible server pages is particularly simple because, being XML-based, it is well adapted to existing authoring tools and techniques.
The PIA is a highly versatile platform for web applications: it can function as a ``traditional'' web server, a client, or a proxy.
Furthermore, the PIA can combine these aspects, enabling totally new kinds of web applications. The PIA approach to web applications is based on three main ideas, which we will examine in more detail below:
The PIA's server-side document processing system (DPS) is essentially a form of extensible server pages, with three main differences from other systems.
Let's examine some of the consequences of these features in more detail:
The PIA is XML ``all the way down'' -- there are no other programming constructs that have to be learned. This means that existing XML and HTML editing tools can be used, and will be able to help an author deal with the extensions rather than getting in the way or, worse, rejecting the extensions because they don't follow correct XML syntax.
Moreover, the PIA's document processing uses the nested tag structure of documents, rather than ignoring it. It is impossible for the PIA to create a syntactically incorrect document. Also, XML can be used for its intended purpose as structured data: the PIA's tags include powerful data extraction and tree transformation operations.
The end result is that all of a web application's data can be kept in the form of XML files, if desired. And because XML can be embedded directly in ordinary HTML pages, there is no artificial barrier between information and processing: a single document can include both, with the same syntax in the same familiar markup-language style.
A tagset is an XML document that assigns meanings to tags. In most cases, these meanings are similar to ``macros:'' they simply specify a piece of text (and tags, of course) that replaces the defined tag. Tagsets allow the designer to collect all of an application's special-purpose tags in a single document; imposing a uniform style by ``cut-and-paste'' becomes a thing of the past.
A small number of tags are pre-defined as ``primitives.'' They define a Turing-complete scripting language on which a user can build complete applications.
Several pre-defined extensions of the basic tagset are provided, for applications that include document formatting, calendaring, office form processing, and web site content management. Tagsets provide applications with a domain-specific vocabulary for document structuring and processing.
Because the tagset is separate from the document being processed, it is possible to process the same document in different ways. This can be as simple as changing the appearance of a document's page headers and footers in different parts of a web site, or as complex as a site indexing package.
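The macro-like behavior of a tagset, and the effect of processing one document with two different tagsets, can be sketched in a few lines of Python. This is only an illustration of the idea, not the PIA's implementation (which is written in Java); the tag names, the replacement markup, and the `expand` and `render` helpers are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# A "tagset" here is just a mapping from a tag name to the XML that replaces it.
# Both tagsets and the page below are hypothetical examples.
PLAIN = {"header": "<h1>My Site</h1>", "footer": "<hr/>"}
FANCY = {"header": '<div class="banner">My Site</div>',
         "footer": "<address>webmaster</address>"}

PAGE = "<body><header/><p>Hello</p><footer/></body>"


def expand(elem, tagset):
    """Return a copy of `elem` in which every tag defined in `tagset` is replaced."""
    if elem.tag in tagset:
        replacement = ET.fromstring(tagset[elem.tag])
        replacement.tail = elem.tail  # keep any text that followed the original tag
        return replacement
    copy = ET.Element(elem.tag, dict(elem.attrib))
    copy.text, copy.tail = elem.text, elem.tail
    for child in elem:
        copy.append(expand(child, tagset))
    return copy


def render(source, tagset):
    """Parse `source`, expand it with `tagset`, and serialize the result."""
    return ET.tostring(expand(ET.fromstring(source), tagset), encoding="unicode")


plain_page = render(PAGE, PLAIN)
fancy_page = render(PAGE, FANCY)
```

Because the output is built as a tree and only then serialized, the sketch cannot emit unbalanced tags, which mirrors the point made earlier about working with the document's structure rather than with strings.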
It is also possible to obtain documents from other sites and process them to obtain information. For example, a workgroup site might display the current price of the parent company's stock. A personal productivity application might download and merge e-mail from multiple web-based mail sites.
Many web sites and applications use specialized XML documents for transferring information; it is easy to process these using a special-purpose tagset.
Tags have also been designed that allow HTML documents to be ``updated in place'' in response to user input. This technique is especially useful for discussion forums, as well as in certain kinds of form-based applications.
XML defines a construct called an ``entity'' (the familiar &amp; construct for coding an ampersand in HTML is one example) that allows arbitrary text to be effectively ``included'' in a document. The text of an entity can be defined as part of a document type declaration, or it can be specified as coming from a file or even an arbitrary URL.
The PIA permits entities to be treated the way other programming languages treat ``variables'' -- they can be written into as well as inserted, and so can change dynamically while a document is being processed. Entities that reside in files provide a simple mechanism for making data ``persistent'' and for sharing data among the documents that make up an application.
The PIA also extends XML's notion of ``namespaces'' to impose a directory-like hierarchical structure on entities. This gives rise to entity names like ``&FORM:inputVar;'' and ``/Samples/TagsetDemo/test2.xh'' -- the value of the ``inputVar'' variable submitted as part of a form, and the path from the server's URL to the document itself.
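To make the ``entities as variables'' idea concrete, here is a minimal Python sketch of a table of namespaced entities that can be both written and expanded. It is an illustration only: the `Entities` class, its method names, and the regular expression are our assumptions, and only the ``FORM:inputVar'' name is taken from the text above.

```python
import re

# Matches references such as &name; or &FORM:inputVar; (colon-separated namespaces).
ENTITY = re.compile(r"&([A-Za-z_][\w.-]*(?::[A-Za-z_][\w.-]*)*);")


class Entities:
    """A toy table of entities grouped into namespaces (hypothetical API)."""

    def __init__(self):
        self.spaces = {}

    def set(self, name, value):
        # "FORM:inputVar" -> namespace "FORM", key "inputVar"; no colon -> namespace "".
        space, _, key = name.rpartition(":")
        self.spaces.setdefault(space, {})[key] = value

    def get(self, name):
        space, _, key = name.rpartition(":")
        return self.spaces[space][key]

    def expand(self, text):
        """Replace every known entity reference in `text`; leave unknown ones alone."""
        def substitute(match):
            try:
                return str(self.get(match.group(1)))
            except KeyError:
                return match.group(0)
        return ENTITY.sub(substitute, text)


entities = Entities()
entities.set("FORM:inputVar", "42")
first = entities.expand("You typed &FORM:inputVar; &amp; nothing else.")
entities.set("FORM:inputVar", "43")  # entities can be rewritten, like variables
second = entities.expand("You typed &FORM:inputVar; &amp; nothing else.")
```

Unknown references such as the standard ampersand entity are passed through untouched in this sketch, so ordinary markup survives expansion.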
The tags available for use while processing a document are limited to those provided by the tagset. Although the PIA's tags include operations for accessing other web sites, for reading and writing files, and for controlling the application, it is easy to remove any subset of these tags before processing a document that might be ``suspect'' in any way (for example, when processing a document from a foreign site or an incoming e-mail message).
Of course, since a PIA-based application is a web server, it is also possible to use standard web authentication techniques such as passwords. The PIA's tags also include operations for computing and verifying digital signatures.
The ability of the application designer to specify exactly what set of operations an active document can perform makes it possible to be much more secure than a system in which arbitrary code can be embedded in a document, or (horrors!) even downloaded on request (as applets and ActiveX controls are on the browser side).
Because the extensions in a document are XML, and hence visible as part of the document's structure, it is also possible to ``audit'' a document, or even an entire new application, to ensure that it does nothing unsafe or unexpected. Naturally, this can easily be done by means of a tagset.
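Both ideas, removing risky tags before processing a suspect document and auditing a document for tags it should not contain, reduce to simple set operations once the document is available as a tree. The following Python sketch illustrates this under stated assumptions: the tag names (including ``write-file'') and the `restrict` and `audit` helpers are hypothetical and do not correspond to actual PIA tags or APIs.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the tags an application is willing to process.
SAFE_TAGS = {"html", "body", "p", "if", "header", "footer"}


def restrict(tagset, forbidden):
    """Return a copy of `tagset` (a dict keyed by tag name) without the forbidden tags."""
    return {name: meaning for name, meaning in tagset.items() if name not in forbidden}


def audit(source, allowed):
    """Return the sorted names of all tags in `source` that are not in `allowed`."""
    root = ET.fromstring(source)
    return sorted({element.tag for element in root.iter()} - set(allowed))


suspect = '<body><p>hi</p><write-file name="x">data</write-file></body>'
problems = audit(suspect, SAFE_TAGS)

full_tagset = {"header": "<h1>Site</h1>", "write-file": "(file-writing operation)"}
mail_tagset = restrict(full_tagset, {"write-file"})
```

An empty result from `audit` means that the document uses only the permitted vocabulary.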
The operation of the PIA's document-processing system is described further in a companion White Paper, Document Processing in the PIA. The PIA's documentation can be found online at www.RiSource.org/PIA/Doc. Several features of the PIA, including its flow-through architecture and its use of open APIs, are of interest in areas other than customizable Web applications.
Like most web servers, the PIA has a configuration file in which all of its many options and parameters can be specified. Also like most servers, additional configuration files can be supplied in any directory to supply local options.
Unlike other servers, however, the PIA's configuration files are pure XML (which should come as no surprise), and any of the standard document-processing tags can be used in them. In particular, the <include> tag can be used to include other files (for example, the standard mappings for filename extensions), and the <if> tag can be used to make parts of the configuration optional (for example, setting up authorization only if a password file can be found).
Three aspects of the site configuration mechanism are particularly interesting:
The PIA has a mechanism that allows two directories to be ``overlaid'' on top of one another. The one ``on top'' is the real directory, and any files the PIA writes go into it. The one ``underneath'' is called the virtual directory, and the PIA looks there for any file it can't find in the real one.
Although this seems confusing at first, it makes local customization enormously easier. For one thing, it gives greatly increased protection to the PIA's own files, and to an application's documents. Usually an application is shipped with a configuration file that puts all of its documents in a virtual directory. Any local customizations then go into the real directory. The virtual directory might even be on a CD-ROM, or in some location shared by many users, each with their own real directory full of personal data and customizations.
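The lookup rule for overlaid directories is short enough to sketch directly. The Python below is an illustration of the rule as described above, not the PIA's code; the function names, the file name ``home.xh'', and the temporary directories are assumptions for the example.

```python
import tempfile
from pathlib import Path


def resolve(name, real_dir, virtual_dir):
    """Look for `name` in the real directory first, then in the virtual one."""
    real = Path(real_dir) / name
    if real.exists():
        return real
    shadow = Path(virtual_dir) / name
    return shadow if shadow.exists() else None


def write(name, data, real_dir):
    """All writes go to the real directory; the virtual directory is never modified."""
    target = Path(real_dir) / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(data)
    return target


def demo():
    """Ship a document in the virtual directory, then customize it locally."""
    with tempfile.TemporaryDirectory() as tmp:
        real, virtual = Path(tmp, "real"), Path(tmp, "virtual")
        real.mkdir()
        virtual.mkdir()
        (virtual / "home.xh").write_text("shipped")
        before = resolve("home.xh", real, virtual).read_text()
        write("home.xh", "customized", real)
        after = resolve("home.xh", real, virtual).read_text()
        untouched = (virtual / "home.xh").read_text()
        return before, after, untouched
```

In the demo, the customized copy hides the shipped one without changing it, which is why the virtual directory can safely live on read-only or shared media.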
The same configuration mechanism that allows for shadow directories can also be used to make ``virtual documents'' appear in a directory. The main use of this is to create ``aliases'' or ``symbolic links'' in an OS-independent way: the virtual document or directory can be brought in from anyplace on the system.
The configuration file also defines the mapping between filename extensions and both MIME types and tagsets. It is also possible to hide files from the client; hidden files can still be accessed from inside the application.
The extension mapping also defines a ``search order'' -- a URL without an extension causes the PIA to try each of the listed extensions in order until a document is found. Among other advantages, this means that document names in URLs don't need extensions, making them shorter and easier to type and remember.
A further advantage of omitting extensions is that a document's extension, and hence the tagset that processes it, can be changed at any time by the application designer without invalidating a user's bookmarks.
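The extension search can be sketched as follows. This is a simplified Python illustration, assuming the set of existing documents is known in advance; the `find_document` function and the particular search order are hypothetical (only the ``.xh'' extension appears elsewhere in this paper).

```python
def find_document(url_path, existing, search_order=(".xh", ".html", ".txt")):
    """Map a URL path to a stored document name, trying extensions in order.

    `existing` is the set of document names available on the server.
    """
    last_segment = url_path.rsplit("/", 1)[-1]
    if "." in last_segment:
        # The URL already names an extension: no search is performed.
        return url_path if url_path in existing else None
    for extension in search_order:
        candidate = url_path + extension
        if candidate in existing:
            return candidate
    return None


documents = {"/docs/intro.html", "/docs/intro.txt"}
served_before = find_document("/docs/intro", documents)

# The designer later converts the page to an active document;
# the bookmarked URL "/docs/intro" does not change.
documents.add("/docs/intro.xh")
served_after = find_document("/docs/intro", documents)
```

The same URL resolves to a different file after the conversion, so a user's bookmark keeps working while the processing behind it changes.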
The PIA also serves as a platform for running software agents. These are small XML documents that specify actions to be performed in response to events rather than specific client requests. In this role the PIA is sometimes referred to in its documentation as an ``agency.''
There are several uses for agents, corresponding to different kinds of events that activate them.
Some agents are activated at a particular time, or repeated at a particular interval. For example, one can specify that a particular agent is supposed to run daily at 1am, or every hour on the half-hour. This kind of agent is similar to a ``cron job'' or ``at job'' in Unix; such agents can be used for reminders, periodic updates, cache management, and so on.
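The two schedules mentioned above (daily at 1am, and every hour on the half-hour) can be computed with a small helper. The Python sketch below is an assumption-laden illustration of the scheduling arithmetic only; the `next_run` function and its parameters are hypothetical and say nothing about how the PIA actually specifies or stores agent schedules.

```python
from datetime import datetime, timedelta


def next_run(now, hour=None, minute=0):
    """Return the next activation time strictly after `now`.

    With `hour` given, the agent runs daily at hour:minute;
    without it, the agent runs every hour at the given minute.
    """
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if hour is not None:
        candidate = candidate.replace(hour=hour)
    step = timedelta(days=1) if hour is not None else timedelta(hours=1)
    while candidate <= now:
        candidate += step
    return candidate


now = datetime(2000, 1, 1, 12, 45)
hourly = next_run(now, minute=30)  # every hour on the half-hour
daily = next_run(now, hour=1)      # daily at 1am
```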
Transaction agents are activated by specified features of web requests or responses. They can then operate on the transaction and, for responses, on the document itself before it gets to the client. One of the earliest uses of a transaction agent was one that maintains a permanent browsing history. A closely related one puts a small ``toolbar'' at the front of every HTML page requested using the PIA as a proxy. A more recent application was an ``ad-buster'' agent that recognizes requests to banner-ad sites and redirects them.
Some of the things that can be done with transaction agents resemble what is done by other innovative proxy-based web applications (for example, IBM's WBI) or by specialized web caching software.
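The matching-and-acting pattern behind transaction agents can be sketched as a list of agents, each with a test and an action, applied to every transaction that passes through a proxy. The Python below is a minimal illustration under our own assumptions: the `Transaction` and `Agent` classes, the host name ``ads.example.com'', and the redirect target are all hypothetical, and real PIA agents are XML documents rather than Python objects.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transaction:
    """A much-simplified web request or response passing through the proxy."""
    url: str
    is_response: bool = False
    body: str = ""


@dataclass
class Agent:
    name: str
    matches: Callable[[Transaction], bool]  # which transactions activate the agent
    act: Callable[[Transaction], None]      # what the agent does to them


HISTORY = []


def record(transaction):
    """History agent: remember every requested URL."""
    HISTORY.append(transaction.url)


def redirect_ad(transaction):
    """Ad-buster agent: point banner-ad requests at a local placeholder."""
    transaction.url = "/blank.html"


AGENTS = [
    Agent("history", lambda t: not t.is_response, record),
    Agent("ad-buster", lambda t: "ads.example.com" in t.url, redirect_ad),
]


def dispatch(transaction, agents):
    """Run every agent whose test matches; return the names of those that fired."""
    fired = []
    for agent in agents:
        if agent.matches(transaction):
            agent.act(transaction)
            fired.append(agent.name)
    return fired


request = Transaction("http://ads.example.com/banner.gif")
fired = dispatch(request, AGENTS)
```

In this sketch the agents run in list order, so the history agent records the original URL before the ad-buster rewrites it.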
Marker agents don't really respond to any events; they just ``bookmark'' their ``home URL,'' which is typically an application. This gives the user a ``short cut'' URL for that application, for example ``/~Calendar'' instead of the more verbose ``/Agents/SimpleCalendar''. The use of a tilde on the agent's name is meant to resemble the common convention of using a tilde in the URL for a user's home directory on Apache and similar web servers. In some cases, marker agents might be set up for real users on a group site.
In this section we will compare the PIA's approach to constructing web applications to that of other systems currently in use.
The conventional platforms for web applications break down into two broad categories: separate code and embedded code.
It is difficult, using embedded languages, to perform processing on arbitrary documents. It is almost impossible to use them to define new tags, or new meanings for old tags.
Server-side extension languages are complete programming languages that are designed to generate web pages; in general ordinary text and HTML tags are treated as ``constants'' and passed through to the client. The PIA falls squarely in this category, but with some significant differences which we will examine in more detail below. For the moment, let's compare the PIA to two well-known server-side languages, Meta-HTML and PHP.
PHP constructs are enclosed in <?php ... ?> brackets that make them look like XML ``processing instructions.'' (Other bracketing methods exist, but this is the most ``XML-friendly.'') PHP, then, has all the usual disadvantages of embedded code with respect to editing tools, but at least the embedded constructs are designed so that they can be properly ignored or passed through by XML tools. It would not be difficult for PHP and the PIA to co-exist; for example, PHP could be used inside of tagsets, or an application could consist of a mixture of XML and PHP documents.
Feature | Meta-HTML | PHP3 | PIA |
---|---|---|---|
Syntax | HTML-like | C-like | pure XML |
Embedded processing | yes | yes | yes |
Tagsets | yes | no | yes |
redefine document tags? | yes | no | yes |
treats documents as: | strings | strings | parse trees |
Neither Meta-HTML nor PHP, nor any other server-side scripting system that we know of, shows any awareness of the structure of the underlying document. The document is simply treated as a ``character string;'' it is possible to generate syntactically incorrect documents, and it is not particularly easy to manipulate the document's parse tree. This makes it difficult to take advantage of the document's structure. (Of course, XSL does explicitly manipulate the documents structure but it does not provide the traditional scripting capabilities.)
Note that the latest version of PHP does include an XML parser, which can be used for processing documents, but it cannot handle ordinary HTML. Unlike Meta-HTML, PHP lacks the notion of tagsets.
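To make the notion of a tagset concrete, here is a rough sketch in Python. It assumes that a tagset can be modeled as a mapping from tag names to handler functions applied to a parse tree; the `<greeting>` tag, the `TAGSET` dictionary, and `apply_tagset` are all hypothetical names, and real PIA tagsets are themselves XML documents rather than Python code.

```python
import xml.etree.ElementTree as ET

def expand_greeting(elem):
    """Handler for a hypothetical <greeting name="..."/> tag."""
    p = ET.Element("p")
    p.text = "Hello, " + elem.get("name", "world") + "!"
    return p

# Assumption: a "tagset" is modeled as a tag-name -> handler mapping.
TAGSET = {"greeting": expand_greeting}

def apply_tagset(parent, tagset):
    """Replace, in place, every descendant whose tag has a handler."""
    for index, child in enumerate(list(parent)):
        handler = tagset.get(child.tag)
        if handler is not None:
            replacement = handler(child)
            replacement.tail = child.tail  # keep text that followed the tag
            parent[index] = replacement
        else:
            apply_tagset(child, tagset)
    return parent

doc = ET.fromstring('<body><greeting name="Ada"/><div><greeting/></div></body>')
result = ET.tostring(apply_tagset(doc, TAGSET), encoding="unicode")
print(result)
```

Because the vocabulary lives in the mapping rather than in each page, the same document could be given a different meaning simply by applying a different tagset.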
The available XML-based approaches to server-side document processing fall into two broad categories: embedded code inside of XML constructs (like PHP), and ``style-sheet'' languages. As we will see, the PIA, which is also XML-based, has aspects of both.
The main disadvantage of the style-sheet languages is that they cannot be embedded in a document. In addition, they tend to be difficult to learn, and cannot easily be manipulated as data. XSL, and the closely related Cascading Style Sheets of HTML, are not complete programming languages. We feel that both expressive power and embeddability are important. It's worth noting, though, that style-sheet processing could easily be added to the PIA's document processor.
Feature | XSL | PIA |
---|---|---|
local transformations | no: assumes a complete parse tree | yes: documents can be streamed |
arithmetic operations | counters only | yes |
text manipulation | sorting and concatenation | sort, split, join, trim, subst |
iteration | only over node sets | general |
tests | tree matching | numeric, string |
embedded expressions | {xpointer} | &entity; |
definitions | only in stylesheet | tagset or document |
processing | only in stylesheet | tagset or document |
native code extensions | no | yes, through tag handlers |
interface to files | read-only | read/write |
interface to web docs | read-only | read/write/query |
interface to server | no: documents only | yes: can operate on transactions |
interface to database | no | yes |
Learnability | complex | simple |
Security | N/A | flexible |
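The ``local transformations'' row in the table above refers to rewriting a document as it streams past, without first building a complete parse tree. The sketch below shows the general idea using Python's standard SAX classes. It is not the PIA's processor, and the `<shout>` tag and `ShoutFilter` class are invented for the example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

class ShoutFilter(XMLGenerator):
    """Pass events through, expanding a hypothetical <shout> tag into <b>."""

    def __init__(self, out):
        super().__init__(out, encoding="utf-8")
        self._depth = 0  # number of enclosing <shout> elements

    def startElement(self, name, attrs):
        if name == "shout":
            self._depth += 1
            super().startElement("b", {})
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if name == "shout":
            self._depth -= 1
            super().endElement("b")
        else:
            super().endElement(name)

    def characters(self, content):
        # Text inside <shout> is upper-cased; everything else is untouched.
        super().characters(content.upper() if self._depth else content)

buffer = io.StringIO()
xml.sax.parseString(b"<p>hello <shout>world</shout></p>", ShoutFilter(buffer))
streamed = buffer.getvalue()
print(streamed)
```

Each event is written out as soon as it is read, so in a server setting the first part of a page could be sent while the rest is still being parsed.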
A full technical analysis of the PIA is beyond the scope of this paper. Here we mention a few of the technical aspects of the PIA's implementation that have direct consequences for the designer of web applications.
Currently the PIA is highly portable, but fairly large and not particularly fast; this makes it most suitable for personal and group applications. We are actively working on the size and speed problems, with the ultimate goal of developing a much faster and smaller version that can easily be integrated with Apache and other high-end web servers.
The PIA is currently written in Java, which makes it highly portable; it is known to run under Sun's JDK 1.1 and 1.2 on Linux, Solaris, and Windows 98 and NT, as well as under Kaffe on Linux. Adding to the PIA's portability is the fact that the user interface is completely web-based, avoiding Java's user interface classes.
Unfortunately, Java is an interpreted language, which imposes a performance penalty. There are several factors in the PIA's design which partially compensate for this:
makefile to process documents ``offline.'' Only the truly dynamic pages need to be processed by the server.
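The offline approach amounts to a make-style rule: regenerate a static page only when its source document is newer than the existing output. The following Python sketch shows that rule under stated assumptions. The `needs_rebuild` and `build_static` helpers are hypothetical, and the `process` argument stands in for whatever document processor is actually used.

```python
import os

def needs_rebuild(source_path, output_path):
    """True if the output is missing or older than its source."""
    if not os.path.exists(output_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(output_path)

def build_static(source_path, output_path, process):
    """Run `process` on the source text and write the result, if stale.

    Returns True when the output was regenerated, False when it was
    already up to date. `process` is a placeholder for a real processor.
    """
    if not needs_rebuild(source_path, output_path):
        return False
    with open(source_path, encoding="utf-8") as src:
        result = process(src.read())
    with open(output_path, "w", encoding="utf-8") as out:
        out.write(result)
    return True
```

Pages built this way can be served as plain files, leaving the server to process only the documents whose content depends on the request.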
In its present Java implementation, the PIA is not well suited for large public websites or other applications where extremely high performance is needed. At least, not yet. There are three possible ways of improving the PIA's performance dramatically:
In order to integrate PIA-based applications into an existing web site, it is usually necessary to integrate the PIA with the server that is already present. There are three ways of doing this:
It is just as easy to integrate the PIA with existing applications, web-based and otherwise, as it is to integrate it with a web server. As usual, there are several ways of doing this:
We expect one of the major uses of the PIA to be ``embedded'' applications, with the PIA providing the primary user interface for some piece of equipment that is not a general-purpose computer. There are several features of the PIA that make it suitable for embedded applications.
The PIA approach offers a way to build Web applications that can be easily customized and maintained. It leverages existing and future XML tools and allows developers to create application-specific vocabularies, so that the documents that make up a Web application can be created and modified without specialized programming skills or tools.
Server-side Document Processing | |
---|---|
XML-compliant | Standard tools (editors, parsers, etc.) can be used for development. Processing can be separate or embedded in documents. |
Embeddable | Documents can contain their own processing, embedded in the portion of the document that has to be processed. |
Separable | Separate tagsets can be shared among documents. |
User-extensible | User-Defined tags have the same syntax as built-in operations. |
Operates on parse trees | Impossible to generate a syntactically-incorrect output document. |
Small set of primitives | Easily ported to other programming languages. Easy to learn. |
Turing complete | It is possible to write arbitrary programs in this XML language (not that you would want to, but it is nice not to have to worry about running into brick walls). |
Efficient | Documents can be streamed through, meaning that the browser can get the first part of a page while the rest is still being generated. |
Web-based application platform | |
Flexible configuration | Active documents and data easily separated, but can be shown in the same URL tree. |
XML configuration files | Full power of document-processing language available at configuration time. |
Java-based | Platform-agnostic. Runs on Linux, Unix, Windows. |
Agent-based event handling | Processing can occur based on transaction features (request and response headers) or time. Agents can modify transactions. |
Client / Server / Proxy | Web engine operates in multiple modes for maximum flexibility. |
Copyright © 1999 Ricoh Innovations, Inc.
$Id: wp-webapp.html,v 1.12 2001/01/12 01:45:43 steve Exp $