The Site Resource Package implements the mapping from URLs passed to the PIA server to documents returned to the browser. This breaks down into the following steps:
Additionally, the Site Resource Package is able to manage an arbitrary amount of additional XML metadata associated with documents, including WebDAV properties, DPS entities, PIA agents, and so on.
The information that defines the structure (configuration) of a site is also expressed in XML, but configuration information is kept (at least conceptually) separate from other metadata in order to simplify the implementation and increase versatility. The XML used for configuration files is similar to the W3C's Resource Description Framework (RDF) and the IETF's WebDAV properties, and is perhaps a little closer to the former.
There are four main interfaces in the org.risource.site package:
The interfaces follow the Composite pattern as described in the Gang-of-Four book; in this pattern the main interface (Resource) includes all methods needed for accessing sub-Resources. This imposes no burden on implementations of Document, which are free to return null results.
However, since every container (directory) Resource has an associated Document (for example, home.xh or index.html) that may be accessible on its own, it makes sense for Resource and Document to be different. A method, getDocument, gets the Document associated with a Resource (even if it is the same object).
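The Composite arrangement described above can be sketched roughly as follows. Only the names Resource, Document, and getDocument come from the text; the Sketch suffixes, getName, getChild, the two toy classes, and the assumption that Document extends Resource are hypothetical and are not the actual org.risource.site signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified interfaces; not the real org.risource.site API.
interface ResourceSketch {
    String getName();
    // Composite pattern: every Resource offers child access; leaves return null.
    ResourceSketch getChild(String name);
    // The Document associated with this Resource (possibly the Resource itself).
    DocumentSketch getDocument();
}

// Assumption: a Document is itself a Resource.
interface DocumentSketch extends ResourceSketch { }

// A leaf: it has no children and is its own Document.
final class LeafDocumentSketch implements DocumentSketch {
    private final String name;
    LeafDocumentSketch(String name) { this.name = name; }
    public String getName() { return name; }
    public ResourceSketch getChild(String childName) { return null; }
    public DocumentSketch getDocument() { return this; }
}

// A container: it holds children and delegates getDocument to one of them.
final class ContainerSketch implements ResourceSketch {
    private final String name;
    private final String documentName;
    private final Map<String, ResourceSketch> children = new LinkedHashMap<>();

    ContainerSketch(String name, String documentName) {
        this.name = name;
        this.documentName = documentName;
    }
    void add(ResourceSketch child) { children.put(child.getName(), child); }
    public String getName() { return name; }
    public ResourceSketch getChild(String childName) { return children.get(childName); }
    public DocumentSketch getDocument() {
        ResourceSketch child = children.get(documentName);
        return child == null ? null : child.getDocument();
    }
}
```

Callers can treat both kinds of object uniformly: asking a leaf for a child simply yields null, and asking either kind for its Document always goes through getDocument.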
There are two parallel sets of implementation classes:
Base classes: AbstractResource, ConfiguredResource
Interface | Configured classes | File classes |
Resource | Subsite (container) | FileResource |
Document | SiteDocument | FileDocument |
Root | Site | (FileRoot) |
The ``File'' classes are (comparatively) lightweight objects that contain no configuration information or associated XML metadata -- everything is derived from the underlying file or directory. (FileRoot is shown in parentheses because it is not presently implemented, but we will need it eventually.)
One may well ask why the interface is called Resource and the implementation is called Subsite rather than the other way around, or perhaps something like BasicResource. The main reason is that ``site.Resource'' simply sounds better than ``site.Subsite''. Also, ``resource'' (and to a lesser extent, ``document'') match the terminology used in, for example, WebDAV and most other web-related specifications. (URL, after all, stands for ``Uniform Resource Locator''.)
Subsite caches a large amount of information: the virtual search path for defaults, which virtual directory each child is in, timestamps, tagsets, configuration information for child documents, and so on. In fact, it is possible to build an entire virtual Site out of nothing but a configuration file.
For this reason, Subsite objects are normally kept in memory as a tree. FileResource objects are not, since they are easy to reconstruct from the available filesystem information. Similarly, FileDocument objects are easily reconstructed from a combination of filesystem information and the configuration information cached by their parent Subsite.
A Resource normally has a ``real'' location in the filesystem, which is a direct descendant of the directory that corresponds to the Root resource. A container resource may also have a ``virtual search path'' of directories in which to look for default children. All writing is done in the real location.
In most cases the real location of a resource will not exist at first; in that case the resource has to be ``realized'' before it can be written.
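The lookup and write rules above might be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the class and method names are hypothetical, the real location is assumed to take precedence over the virtual search path, and ``realizing'' is modelled simply as creating the real directory on first write.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch of a container with a real location and a search path.
final class SearchPathSketch {
    private final Path realLocation;
    private final List<Path> virtualSearchPath;

    SearchPathSketch(Path realLocation, List<Path> virtualSearchPath) {
        this.realLocation = realLocation;
        this.virtualSearchPath = virtualSearchPath;
    }

    // Reads: try the real location first (an assumption), then each
    // search-path directory in order. Returns null if nothing is found.
    Path locate(String childName) {
        Path real = realLocation.resolve(childName);
        if (Files.exists(real)) return real;
        for (Path dir : virtualSearchPath) {
            Path candidate = dir.resolve(childName);
            if (Files.exists(candidate)) return candidate;
        }
        return null;
    }

    // Writes always go to the real location, creating it first if needed.
    Path write(String childName, String content) {
        try {
            Files.createDirectories(realLocation);
            Path target = realLocation.resolve(childName);
            Files.write(target, content.getBytes(StandardCharsets.UTF_8));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this arrangement a default document found in a prototype directory is never modified in place; the first write produces a copy in the real location, which then shadows the default.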
Typically the virtual search path of a resource has only one or two elements: a ``prototype'' directory under the source-controlled PIA directory, and possibly a ``defaults'' directory that provides a fallback for documents like home.xh, which most directories are expected to have. The prototype directory corresponds roughly to PIA/Agents, and most or all actual agents will have their prototype directories in PIA/Agents. The prototype for the standard, out-of-the-box configuration is PIA itself.
The real location of the PIA's root corresponds rather closely to the current .pia directory. It is created in the first place by specifying a ``configuration document'' for the Site (see below) and then ``realizing'' it. A command-line utility will be provided for this purpose.
There will be multiple sample configuration files in the standard distribution, corresponding to, e.g., an appliance server, a personal proxy, and so on. A distribution of the PIA could ship with a real, non-CVS-controlled Site directory created by realizing a default configuration as part of the release process. It might be best if this were a sibling of the PIA directory rather than a child; another possibility is to create it on installation (which would allow the user to select their preferred configuration). Most Unix users will, of course, want to use ~/.pia as the real location of their personal PIA.
A Resource's ``configuration'' is specified using an XML element with node type ``Resource''. Attributes specify all of the String, boolean, and integer fields of the underlying object (of class Subsite or Document). XML metadata is contained in namespace elements in the content.
The configuration of a Container resource may also contain Resource sub-elements in its content, corresponding to documents and virtual containers that have no corresponding configuration file. A Subsite will normally have its configuration loaded from a file called, by default, ``_subsite.xcf''.
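As a rough illustration of the shape such a configuration could take, the sketch below parses a small Resource element with the JDK's DOM parser and lists its Resource sub-elements. Only the element name ``Resource'' comes from the text; the attribute names name and hidden, the sample content, and the ConfigSketch class are hypothetical and do not reflect the actual _subsite.xcf vocabulary.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class ConfigSketch {
    // Hypothetical configuration: attribute names are invented for illustration.
    static final String SAMPLE =
        "<Resource name='docs' hidden='false'>"
      + "<Resource name='home.xh'/>"
      + "<Resource name='notes' hidden='true'/>"
      + "</Resource>";

    // Parses a configuration string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("unreadable configuration", e);
        }
    }

    // Collects the "name" attribute of each direct Resource sub-element.
    static List<String> childNames(Element config) {
        List<String> names = new ArrayList<>();
        NodeList nodes = config.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node instanceof Element && "Resource".equals(node.getNodeName())) {
                names.add(((Element) node).getAttribute("name"));
            }
        }
        return names;
    }
}
```

Simple fields map naturally onto attributes here (for example, Boolean.parseBoolean on the hypothetical hidden attribute), while nested Resource elements describe children that have no configuration file of their own.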
The configuration file of the Root resource may be specified separately; if such a configuration file is provided, the _subsite.xcf file in the Root directory is ignored. Alternative configuration files for the PIA are provided in the PIA/Config/Site directory.
_subsite.xcf file itself, but a copy of the ``property Namespace'' that is derived from it. On initialization we first load the _subsite.xcf file, then override properties from the property file to restore any changes.
One objective of the site package is to provide the machinery necessary to support ``agents,'' but without placing any constraints on their implementation. All that the Root needs to do is to map names that start with a ``~'' (tilde) character into the ``home Resource'' (typically a container) for the named agent. It is then up to the documents in that Resource to provide the agent's user interface.
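The tilde mapping described above could be sketched along these lines. The class, its methods, and the use of plain path strings in place of Resource objects are all assumptions made for illustration; the text only specifies that names beginning with ``~'' are mapped to the named agent's home Resource.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: maps "/~agent/..." paths onto the agent's home path.
final class AgentHomeSketch {
    private final Map<String, String> agentHomes = new HashMap<>();

    // Registers an agent that needs a web-accessible home.
    void register(String agentName, String homePath) {
        agentHomes.put(agentName, homePath);
    }

    // Rewrites "/~name/rest" to "<home>/rest". Paths that do not start with
    // "/~" are returned unchanged; an unregistered agent yields null.
    String resolve(String path) {
        if (!path.startsWith("/~")) return path;
        int slash = path.indexOf('/', 2);
        String agent = slash < 0 ? path.substring(2) : path.substring(2, slash);
        String rest = slash < 0 ? "" : path.substring(slash);
        String home = agentHomes.get(agent);
        return home == null ? null : home + rest;
    }
}
```

For instance, after register("History", "/Agents/History") (an illustrative agent name and path), resolving "/~History/home.xh" would give "/Agents/History/home.xh". Everything after that point is ordinary document lookup, which is why the Root needs no other knowledge of agents.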
Note that not all agents need to be registered in this way, only the ones that need web-accessible user interfaces. Similarly, nothing prevents a Resource from being the home of several agents, as long as some mechanism exists for sorting them out. One way of doing this might be to make an Agent's ``home Resource'' a document rather than a container, but this may complicate things unnecessarily. For the moment we can ignore the problem, and simply make sure that every registered agent has its own home.
In the new PIA, then, agents will be considerably simpler than in the old scheme, because they will no longer have anything to do with interpreting URLs or processing documents. Essentially, an agent will be nothing but an XML Element with an action sub-element in its content that provides the hook. In most cases an agent's definition is simply a sub-element of its home Subsite's configuration.
Note that in this scheme, an agent no longer needs a state document! All of its state is contained in ordinary documents in its home Subsite, which can be accessed in the usual way via entities or <include> tags.
Normally the document associated with a container resource is its home file (with any of several extensions taken from a standard list). If no home file exists, an index is searched for. Finally, if none of those exist, a standardized listing is created (using the Listing class, which implements the Document interface).
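The fallback order above can be sketched as a small selection function. The extension list shown is hypothetical (the text says only that extensions come from a standard list), applying the same extensions to index files is an assumption, and the LISTING constant merely stands in for a generated Listing document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of choosing the document for a container.
final class ContainerDocumentSketch {
    // Assumed extension list, in preference order.
    static final List<String> EXTENSIONS = Arrays.asList("xh", "html");
    // Placeholder for a generated listing.
    static final String LISTING = "(generated listing)";

    // Tries every "home" file, then every "index" file, then falls back
    // to the generated listing.
    static String choose(Set<String> childNames) {
        for (String base : Arrays.asList("home", "index")) {
            for (String ext : EXTENSIONS) {
                String candidate = base + "." + ext;
                if (childNames.contains(candidate)) return candidate;
            }
        }
        return LISTING;
    }
}
```

Note that in this sketch any home file beats any index file regardless of extension, which is one reading of the order given in the text.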
The standardized listing is always available (unless hidden) under the name ``.'', which is the Unix shorthand for the current directory. The period is only recognized as a listing file when it is the last filename in a path; otherwise it simply refers to the current Container, so that ``/./'' is equivalent to ``/'', as Unix users expect. (This feature can be very useful when constructing paths automatically, for example in a Makefile.)
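A minimal sketch of this rule, assuming paths are handled as plain strings: a ``.'' that ends the path requests the listing, while any other ``.'' segment is dropped. The class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of interpreting "." segments in a request path.
final class DotPathSketch {
    final List<String> containerPath; // segments with "." and empties removed
    final boolean wantsListing;       // true only when "." is the last filename

    DotPathSketch(String path) {
        // "." counts as a listing request only if nothing follows it,
        // so "/docs/." qualifies but "/docs/./" does not.
        wantsListing = path.equals(".") || path.endsWith("/.");
        List<String> segments = new ArrayList<>();
        for (String segment : path.split("/")) {
            if (!segment.isEmpty() && !segment.equals(".")) {
                segments.add(segment);
            }
        }
        containerPath = segments;
    }
}
```

Under these assumptions "/docs/." names the listing of docs, "/docs/./notes" is the same as "/docs/notes", and "/./" is the same as "/".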
This convention replaces the PIA's previous mechanism, which involved distinguishing between paths ending and not ending with ``/''. This idea came from the original CERN httpd, but since no other servers picked it up it proved quite confusing for users.