About the PIA Core Engine

This document contains notes about the PIA's core server engine, as implemented in the Java module org.risource.pia.

Recent Changes

As of 9/9/99 (End Of File Day -- seems appropriate) I started to make major changes in the implementation of Agents and the PIA core server. As of 9/22 they are checked in on the internal CVS server and open for general hacking. Nevertheless some of the former Agents are very much ``under construction,'' and much of the documentation is out of date.

The main change is that agents no longer handle Transactions directly under most circumstances. (The old handle method will still be there so that a specialized agent can handle transactions if it has to; a cache agent might do this, for example.) Agents do not handle transactions directed at their ``home URL'' -- these are handled by a new object, the ``Site Machine'' (class org.risource.pia.site.SiteMachine), which uses the Site Resource Package.

If a Transaction is not proxied, its default action is to go to the PIA's ``Site'' (Pia.getSite()), locate the Resource specified by the URL, and return the associated document. This is done by means of a one-line change in HTTPRequest.toMachine, which now defaults to Pia.getSiteMachine() when no toMachine has been specified.
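The defaulting behavior can be sketched roughly as follows. The classes here (Machine, SiteMachineStub, PiaStub, RequestStub) are hypothetical stand-ins invented for illustration, not the real org.risource.pia classes; only the idea of falling back to the site machine comes from the notes above.

```java
// Hypothetical stand-ins; the real classes live in org.risource.pia.
interface Machine {
    String name();
}

final class SiteMachineStub implements Machine {
    public String name() { return "site"; }
}

final class PiaStub {
    private static final Machine SITE_MACHINE = new SiteMachineStub();

    // Plays the role that the text assigns to Pia.getSiteMachine().
    static Machine getSiteMachine() { return SITE_MACHINE; }
}

final class RequestStub {
    private Machine toMachine; // stays null unless something sets a destination

    void setToMachine(Machine machine) { toMachine = machine; }

    // The "one-line change": fall back to the site machine when no
    // destination machine has been specified.
    Machine toMachine() {
        return (toMachine != null) ? toMachine : PiaStub.getSiteMachine();
    }
}
```

With this shape, a request that nobody has claimed ends up at the site machine, while a proxied request keeps whatever machine was set on it.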

As a result of this change, a ``Root'' agent is no longer necessary; its function was simply to map URL paths onto agents. It is also no longer necessary to have an Agent for each top-level directory in URL space; the net effect is to speed up transaction processing in the Resolver. Mapping URLs to files is also sped up considerably because the classes in the org.risource.site package do the lookup more efficiently than GenericAgent used to.

It is possible to register a ``home directory'' in URL space that is web-accessible as ``/~name'' following the usual web/Unix convention for home directories. (Note that in the Reader's Helper, for example, it would make a great deal of sense to give each user a home directory.) These directories will usually be associated with agents, though this need not be the case. The ability to have unregistered agents and non-agent home directories gives the site designer full control over the top-level namespace.
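As an illustration of the ``/~name'' convention, here is a minimal sketch of a registry that maps such URL paths onto directories. HomeDirectoryRegistry and its methods are hypothetical names, and the directory strings in the usage note are made up; the actual registration mechanism in org.risource.site may look quite different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical registry; not part of the PIA code base.
final class HomeDirectoryRegistry {
    private final Map<String, String> homes = new HashMap<>();

    // Associate a name with a directory so that "/~name" resolves to it.
    void register(String name, String directory) {
        homes.put(name, directory);
    }

    // Resolve "/~name/rest" to "<directory>/rest".
    // Returns empty for paths that are not "/~name" paths or whose
    // name has not been registered.
    Optional<String> resolve(String urlPath) {
        if (!urlPath.startsWith("/~")) {
            return Optional.empty();
        }
        int slash = urlPath.indexOf('/', 2);
        String name = (slash < 0) ? urlPath.substring(2) : urlPath.substring(2, slash);
        String rest = (slash < 0) ? "" : urlPath.substring(slash);
        String directory = homes.get(name);
        return (directory == null) ? Optional.empty() : Optional.of(directory + rest);
    }
}
```

For instance, after register("History", "/Agents/History"), the path "/~History/index.html" would resolve to "/Agents/History/index.html", and an unregistered name would resolve to nothing, leaving the rest of the top-level namespace to the site designer.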

Even if an Agent has not registered its home directory at the top level it will still have one: the directory in which the forms that control the agent reside. An Agent used to be a namespace in its own right; this function is now taken over by the ``properties'' of the agent's home directory. In fact, an Agent is now simply an <AGENT> element in the configuration of its home Resource.

Views

Because the top-level configuration file of a Site does not have to be part of the Site, it's easy to construct multiple views of a directory such as the PIA. For example, simply making the PIA home directory the root should give a purely passive view of the tree. For a specialized appliance server with a limited set of agents, the root might be PIA/Agents/ROOT, with the actual agent directories virtualized, and the documentation virtualized read-only.

Environment Variables and the Command Line

In order to initialize the PIA, we need a site configuration file and two directories:

PIA_HOME -- the PIA's install directory. Formerly called PIA_DIR.
PIA_ROOT -- the real root of the PIA's site, into which things can be written. Formerly called USR_DIR.

In general we also need a virtual root, but this can be specified in the configuration file either relative to PIA_HOME or as an absolute path. In almost all standard cases it is PIA_HOME, so there's little point in creating an environment variable for it.

The directories can be specified on the command line, in environment variables, or as attributes of the top-level Container element in the site-configuration file. Other configuration attributes, including host name and port, can only be specified in the configuration file or on the command line (with the name=value syntax usual for attributes).

The last item on the command line is either a filename (the configuration file) or a directory (the real root). In the latter case, the configuration is read from _subsite.xcf in the root.
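The command-line rules above could be handled along these lines. This is only a sketch: CommandLineSketch is a hypothetical class, the attribute names in the usage note are illustrative, and the choice to ignore stray non-attribute arguments before the last one is an assumption, since the notes do not say what should happen to them.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser; not the PIA's actual startup code.
final class CommandLineSketch {
    final Map<String, String> attributes = new LinkedHashMap<>();
    File configFile; // the configuration file to read
    File realRoot;   // set only when the last argument names a directory

    static CommandLineSketch parse(String[] args) {
        CommandLineSketch result = new CommandLineSketch();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            int eq = arg.indexOf('=');
            boolean isLast = (i == args.length - 1);
            if (eq > 0) {
                // name=value: a configuration attribute
                result.attributes.put(arg.substring(0, eq), arg.substring(eq + 1));
            } else if (isLast) {
                File f = new File(arg);
                if (f.isDirectory()) {
                    // A directory is the real root; its configuration
                    // is read from _subsite.xcf inside it.
                    result.realRoot = f;
                    result.configFile = new File(f, "_subsite.xcf");
                } else {
                    // Otherwise the argument is the configuration file itself.
                    result.configFile = f;
                }
            }
            // Assumption: other arguments without '=' are silently ignored.
        }
        return result;
    }
}
```

Calling parse with something like {"port=8888", "/some/real/root"} would record the port attribute and, if the last argument is an existing directory, point configFile at the _subsite.xcf inside it. One limitation of this sketch is that a final filename containing '=' would be mistaken for an attribute.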

A small number of other things have to be dragged in from the environment, including the user's name and home directory, PATH, the X DISPLAY, proxy variables, and a few other items that Java doesn't handle properly.

Design Decisions

Agent Alternatives

There are four plausible implementations for Agents:

1. Each Agent has a ``state document'' (e.g. AGENT.xml) in its home Resource. Saving an Agent's state is fairly fast and simple. Loading an Agent is also simple. There are complications involved in making the agent's namespace available to all documents in its home, especially if multiple agents share the same home.
   In this scheme it becomes the SiteMachine's responsibility to track down the agent associated with a directory URL and make it available as a Namespace when executing the documents in that directory. Multiple agents may share a home directory; their namespaces will all be available under the agents' names (e.g., History: as well as the default AGENT:).
2. An Agent is a Namespace in its home Resource's configuration. Loading the agent becomes trivial. Saving it requires nothing more than synchronize, but will take longer if there's lots of other metadata to save. The <AGENT> tag would be defined in the configuration tagset pia-config; AgentBuilder's existing tagset-switching method will be sufficient to switch to the container's local tagset for the content. There may, again, be naming issues if there are multiple agents; it also means that the agent name and the namespace name will usually be different, with the namespace name being AGENT and the agent name being the name of the parent Container.
3. An Agent is its home Resource's configuration. This totally eliminates naming issues. We can preserve the <AGENT> tag, but it no longer needs to have a name attribute -- in fact it no longer even needs to be a namespace! The act-on and criteria could be separate properties, or we could make act-on the content of <AGENT> and leave criteria as an attribute. This leads to:
4. An Agent's namespace is its home Resource's properties and documents. A Container (say, ``.../Toolbar'') can easily have multiple agents, because all they are is <AGENT> elements in the configuration. An Agent becomes nothing but a single XML element with some contents that get expanded, and attributes that say when it gets expanded. It can have optional sub-elements <initialize> and <action>.
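To make the last alternative a little more concrete, here is a rough model of a Container whose configuration holds several agents, each reduced to a criteria test plus some content to expand. AgentElement and ContainerConfigSketch are hypothetical names, and representing criteria as a predicate over a URL string is a simplifying assumption; real criteria would be matched against a Transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of an <AGENT> element: attributes that say when it
// gets expanded, and contents that get expanded.
final class AgentElement {
    final Predicate<String> criteria; // stands in for the criteria attribute
    final String actOn;               // stands in for the element's content

    AgentElement(Predicate<String> criteria, String actOn) {
        this.criteria = criteria;
        this.actOn = actOn;
    }
}

// Hypothetical model of a Container's configuration holding its agents.
final class ContainerConfigSketch {
    private final List<AgentElement> agents = new ArrayList<>();

    void addAgent(AgentElement agent) {
        agents.add(agent);
    }

    // Return, in configuration order, the agents whose criteria match.
    List<AgentElement> agentsFor(String url) {
        List<AgentElement> matched = new ArrayList<>();
        for (AgentElement agent : agents) {
            if (agent.criteria.test(url)) {
                matched.add(agent);
            }
        }
        return matched;
    }
}
```

Because the agents are just entries in the configuration, a single Container can carry any number of them without naming conflicts; nothing in this model needs an agent name at all.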