Overview (WD)
"Coordination language" and "generative communication" are terms usually associated with the Linda paradigm for distributed and parallel computation. In the course of our discussions of REST, we've referred to HTTP as a CoordinationLanguage. We've also talked about the heretical concept of GenerativeNaming using URI, which sounds kind of like but is mostly unrelated to generative communication. This leads one to ask the questions: "are the Web and Linda related? Are they in some subtle way equivalent? Can I do Linda on the Web in a Web-friendly way?" (WF)
This document attempts to address some of those questions. We start out with a brief review of Linda. We discuss how Linda is used to structure a simple distributed computation. We discuss how to map Linda tuplespaces into HTTPspace and the Linda operations into HTTP operations. A few random thoughts are offered, and then we conclude with a few pointers to related work. (WG)
A Brief Review of Linda (WH)
The central abstraction in Linda is a "tuplespace." A tuplespace is a form of distributed shared memory; it may be literally distributed or may be served from a single server to which remote processes may connect. A tuplespace is a multiset containing tuples. By multiset, we mean that a tuplespace may contain an arbitrary number of identical tuples which are treated as discrete objects. Some implementations of Linda provide the ability to nest tuplespaces, but this is not essential to the model or to this discussion. (WI)
A tuple is an ordinal collection (sequence) of typed values. The stored values in a tuplespace are all actual, that is, they are reified values. A Linda program has a concept of formal values in tuples, but these only show up in the processes interacting with a tuplespace rather than in the tuplespace itself. Tuples are distinguished in two ways: by the types of their fields and by the values of those fields. (WJ)
Linda has three essential operations which may be "embedded" (e.g., implemented by a library) in any language to form a Linda dialect of that language. The operations are traditionally (WK)
tuple = rd(template) tuple = in(template) out(tuple) (WL)
These operations respectively read a tuple that matches a template pattern w/o removing it from the tuplespace, atomically read and remove a tuple from the tuplespace, and insert a tuple containing actual values into the tuplespace. The rd() and in() operations are defined to block if no actual tuple exists in the tuplespace that matches. Various implementations and derivatives provide non-blocking analogs of these operations; the JavaSpaces variant provides asynchronous callbacks, as well as "leases" on tuples that constrain and control the longevity of tuples inserted into the space, preventing persistent garbage from accumulating. (WM)
(NB: Early versions and descriptions of Linda had a fourth operation, eval(), which has been removed from most implementations; it proved semantically difficult to define and practically difficult to implement. It's non-essential to the model; it offers a convenient way to do things that can be done otherwise using just the first three operations.) (WN)
A Brief Linda Example (WO)
Process 1: (WP)
ping() { forever { out("pingpong", "ping") in("pingpong", "pong") } } (WQ)
Process 2: (WR)
pong() { forever { in("pingpong", "ping") out("pingpong", "pong") } } (WS)
These two processes communicate through the tuplespace, passing a tuple back and forth while changing a field of the tuple back and forth between two values. In doing so, they demonstrate how they coordinate their activity through the tuplespace in a loosely coupled way; they know how each effects the data in the shared space, but they know nothing directly about each other. (nb: this is not precisely right because it does not accurately demonstrate the matching of formals to actuals, but it's close enough for the purpose of this illustration --- which is about the coordination behavior of Linda --- and avoids the complexities and syntactic details of formal matching, which is generally API and often language specific. (WT)
Mapping to HTTP (WU)
NB: the following section describes an early version of a toy Web-Linda system called 'Cantrip' (totally contrived acronym for "Coherent Architecture for a Network Transparent Internet Platform --- ick, and not very accurate either. ;-) I implemented Cantrip as an experiment at Activerse, though it never had any product impact. Unfortunately I evolved away from this to a more sophisticated but Java-only sort of system, which was itself never written. In hindsight, I'm surprised and pleased by the original design; I wish I'd had my "gestalt" about using rather than abusing the Web about 5 years ago. ;) It seems clear that this system could actually be used to implement some interesting things --- I wish I still had the code, but AFAIK it has disappeared into the bowels of CMGI. Perhaps someone, somewhere in a CMGI-based company has found it useful. (WV)
The most straightforward way to map Linda into HTTPspace is to assume that a tuplespace is a resource. The alternative would be to represent a tuplespace as a host / authority and map individual tuples as resources w/o using query strings, but this leads to various complications and, generally, breaks the axiom of opacity. (Oh no!) ;) In addition, tuples are ephemeral things so it's perhaps not really a good idea to give them names that imply persistence, i.e., names w/o query strings that crawlers or proxies might be tempted to hang on to. So let's assume that a tuplespace is represented by a resource (WW)
http://lindaweb.com/joespace (WX)
This is a tuplespace named "joespace" and hosted "on" a hypothetical service called lindaweb.com. This tuplespace holds 0-N tuples, that is, ordinal sequences of typed values. These tuples are accessed by way of a query interface offered by the resource; the query string is of the form (WY)
http://lindaweb.com/joespace?<p1Type>=<p1Val>&<p2Type>=<p2Val>...<pnType>=<pnVal> (WZ)
This may look a little weird; what does ?int=1&int=2&... mean? Many different cgi frameworks, application servers, and so forth might misinterpret this, assuming that we're providing multiple values for a single parameter variable. That's true --- this was in fact chosen to illustrate a point, that query strings conceptually may have both ordinal and keyword semantics, that the keyword namespace isn't necessarily interpreted such that multiply supplying a given keyword is an error. (X0)
Linda's rd() operation has a trivial mapping to GET. One conceptual problem exists with caches; since the state of the tuplespace can meaningfully change between reads, we must make representations returned from tuplespaces non-cacheable. The major difference is that GET doesn't block if the resource doesn't exist, it returns a 404. This is a problem for many Linda-based algorithms --- the alternative is polling. An asynchronous communication mechanism that could notify you when a previously non-existant resource came into existance would be helpful; perhaps HttpEvents or similar could be extended to accomodate this usage. At any rate, the query string constructed for this purpose is an ordered, typed collection of actual values or formals, i.e. wildcards, such as (X1)
http://lindaweb.com/joespace?str=pingpong&str=ping http://lindaweb.com/joespace?str=pingpong&str=pong http://lindaweb.com/joespace?str=pingpong&str=* (X2)
There are several questions still to be answered. In order for clients and servers to interoperate effectively, they would have to agree on a set of types and representations of those types. A standard encoding of the tuple content to be passed back as the response body for in() and rd() would be necessary, in order to allow binding. Details, details... ;) (X3)
The in() operation is somewhat more problematic. In is a kind of atomic GET+DELETE, in an HTTP sense. Unfortunately, HTTP doesn't atomic groupings of methods, and without atomicity in this Linda doesn't work reliably. A single DELETE could be used, *if* you return the content body in the result of the DELETE. It's not clear how "correct" this usage of delete is. As in the above case, the query string for this operation is a mixture of actual and formal (wildcarded) typed values. (X4)
The out() operation is just a POST operation. It's altering the state of the tuplespace, not replacing it, so POST is appropriate and PUT is not. (Note there is a subtle implication here: as is intuitively the case, this assumes that in fact the query string does not distinguish separate resources, but parameterizes a single resource. This is controversial. If otherwise-identical URIs with slightly different query strings are actually considered to identify different resources, then it's not clear that POST is preferred to PUT. Indeed, that belief creates a rather confusing situation.) The URI used with this operation must contain only actual values, no wildcards. (X5)
One important thing to note is that the tuplespace may contain *many* identical tuples. If I POST to the same URI 5 times, I've inserted 5 identical tuples. If I then DELETE the same URI 5 times, I'll get back content 5 times. The 6th time I DELETE that URI, I'll get a 404. It's also important to note that there is no particular guarantee on which among a set of matching tuples for a given "template" (wildcarded query string) may be returned on a GET or DELETE. Some actual Linda implementations provide certain semantics like "first tuple in, first one out among a set that matches some input template." In general, it's best to assume that there's no particular semantics like this, and build your tuples and templates accordingly. (E.g., you can achieve ordering simply by embedding counters, etc.) (X6)
The above provides a minimal but useful sketch of a "Webby" Linda implementation that could actually be used for real applications. It's limited; tuples can only contain whatever can be usefully and practically encoded in the query string. A more robust implementation might use an XML format for actuals and pass this in the content body of a POST; but unless HTTP were extended to include passing content bodies in i.e. GET and DELETE, POST would have to be overloaded to accomodate in() and rd(). This begins to smack of tunneling, which is a good sequeway to... (X7)
Is this RESTful? I dunno, you tell me. On the one hand, it's a much closer mapping to HTTP methods than tunneling everything through POST. The tuplespace-to-resource mapping is probably correct: a tuplespace is an encapsulation of a collection of tuples which exposes a simple but powerful interface, easily mapped to HTTP operations and semantics. OTOH, query strings worry me in general. ;) But on yet another hand, it's clear that the "shape" of a tuplespace more resembles the "shape" of the parameter space of a query string, as opposed to the hierarchially mapped "shape" of the space described by a URI's path segment. (X8)
Related Work (X9)
There are many Linda-related Web-oriented technologies, projects, products, and so forth. While not exactly Web-based, Sun's JavaSpaces is without a doubt the most common and popular Linda version or derivative ever. There are numerous efforts which attempt to provide generative / associative access to Web-based XML spaces: Ruple [1], XMLSpaces, [2] and XML-Tuples and XML-Spaces [3] are examples of this. (There's another really good one that operates as a generative XML-based business document exchange, think XML/Linda?-driven EDI. I seem to have lost my pointer to it, it's called "Forum" or "Nova" or something combining those concepts.) There's lots of old and current work around these concepts, and any pointers are always appreciated. (XA)
References (XB)
- JavaSpaces (XC)
- Ruple (XD)
- Coordinating Web-based Systems with Documents in XMLSpaces (XE)
- XML-Tuples and XML-Spaces, V0 (XF)
See also Triple-Based Computing, about nested tuple spaces for decoupling `web services'. -- LaurianGridinoc (5DU)
See also TSC State of the Art and Requirements Analysis, for a fairly comphrensive survey of Tuple Space implementations. -- NickGall (5T3)