PIA: An Open-Source Web-Based Document Processing System

CRC-TR-9815

This technical report contains the slides that were presented in the session on the PIA at the LinuxWorld Conference and Expo, March 4th, 1999. The PIA has been released as open source software; the source code and documentation, as well as a copy of these slides, can be found on Ricoh's open source website, www.RiSource.org.

Abstract

Information Appliances replace traditional software applications with web sites -- special-purpose "thin" servers. Traditionally these are built on top of a free Unix such as Linux, which provides a stable, feature-rich platform and an excellent development environment. Implementing an appliance, however, is still a matter of custom programming.

Our Platform for Information Appliances is a framework which promises to make it significantly easier to build, maintain, and customize Web-based applications. The main component of this platform is a core engine for processing SGML (including XML and HTML) documents, which makes it very easy to transform, compose, or otherwise modify documents. It can also retrieve and operate on information from other web sites as well as local files.

This engine makes it possible to create Web-based applications that consist mainly (or solely) of a set of XML pages. Upon request, the processor converts these pages into a desired format (e.g. HTML). Unlike other scripting languages or active server pages, our source documents strictly adhere to the SGML standards, and the semantics of the elements (tags) are fully under the control of the developer. Actions associated with elements can be specified in XML or a traditional programming language. This makes for a seamless interface between content, specification, and implementation. The resulting application is very easy to maintain and build upon, with most customizations requiring little or no programming.
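To make the idea of developer-defined element semantics concrete, the following is a minimal sketch in Python, not the PIA's own implementation or API. It assumes two hypothetical custom tags, greeting and note, and a hypothetical HANDLERS table that maps each tag name to an action; the processor walks the XML page and replaces each handled element with the HTML its action produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical actions: each takes a custom element and returns its HTML expansion.
def handle_greeting(elem):
    p = ET.Element("p")
    p.text = "Hello, " + elem.get("name", "world") + "!"
    return p

def handle_note(elem):
    div = ET.Element("div", {"class": "note"})
    div.text = elem.text
    div.extend(list(elem))  # keep any (already processed) children
    return div

# Hypothetical table binding tag names to actions.
HANDLERS = {"greeting": handle_greeting, "note": handle_note}

def process(elem, handlers=HANDLERS):
    """Return a processed copy of elem; handled tags are replaced by their expansion."""
    copy = ET.Element(elem.tag, dict(elem.attrib))
    copy.text = elem.text
    copy.tail = elem.tail
    # Children are processed first, so an action sees expanded content.
    copy.extend(process(child, handlers) for child in elem)
    handler = handlers.get(elem.tag)
    if handler is None:
        return copy  # unknown tags pass through unchanged
    result = handler(copy)
    result.tail = elem.tail  # preserve text that followed the original element
    return result

def render(source):
    """Convert an XML page (as a string) into its expanded form."""
    return ET.tostring(process(ET.fromstring(source)), encoding="unicode")
```

In this sketch the page itself stays pure, well-formed XML; changing what a tag means only requires changing its entry in the table, which is the kind of separation between content and implementation the abstract describes.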

In addition to the core processing engine, the platform includes templates for creating Web applications, the infrastructure needed for acting as a server, client, and processing proxy, and some other features (including sample applications and configuration files) that make it easy to bundle as a stand-alone information appliance. The entire development environment, including its documentation, is organized as a web site; this makes it easy for a new user or developer to get up to speed.

As an Open Source project, this platform should appeal to developers working on a range of Web-based applications, from stand-alone devices to personal proxy servers. The document processing capabilities make it especially attractive for applications which must work in conjunction with other Web sites. Example applications which have been built on this platform include a "digital photo album" for managing a collection of photographs taken with a digital camera, and a personal proxy server which maintains a browser-independent, permanent history database, and can be customized to remove certain types of (unsolicited) images from retrieved pages.
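As an illustration of the image-removing proxy customization, here is a rough sketch in Python of filtering a retrieved HTML page. It is not how the PIA does it; the ImageFilter class, the strip_images function, and the rule of matching substrings in the src attribute are all assumptions made for the example.

```python
from html.parser import HTMLParser

class ImageFilter(HTMLParser):
    """Re-emit an HTML page, dropping <img> tags whose src matches a blocked substring."""

    def __init__(self, blocked_substrings):
        super().__init__(convert_charrefs=False)
        self.blocked = blocked_substrings
        self.out = []

    def _is_blocked(self, tag, attrs):
        if tag != "img":
            return False
        src = dict(attrs).get("src") or ""
        return any(s in src for s in self.blocked)

    def handle_starttag(self, tag, attrs):
        if not self._is_blocked(tag, attrs):
            self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        if not self._is_blocked(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

def strip_images(html, blocked_substrings):
    """Return html with blocked images removed (hypothetical helper)."""
    f = ImageFilter(blocked_substrings)
    f.feed(html)
    f.close()
    return "".join(f.out)
```

A proxy built this way would apply strip_images to each page body before passing it on to the browser, with the list of blocked substrings supplied by the user.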


Contents:

High-level Overview:

Slide List:

The list of slides which follows on the next page was automatically generated from the slide titles; unfortunately it lacks the section information given above. Note that the numbers in it refer to slide numbers, not page numbers.


Copyright © 1999 Ricoh Silicon Valley
$Id: tech-report.html,v 1.3 1999/03/22 19:51:52 steve Exp $
Stephen R. Savitzky <steve@rsv.ricoh.com>