Process Interaction Models
This page is dedicated to my doctoral work
begun at Yale University while doing research work with
Professor David Gelernter and completed at
Universite Pierre et Marie Curie, PARIS (VI)
- traditionally the science department of the Sorbonne -
and the Ecole NS des Mines de Paris.
I successfully defended Process Interaction Models in July 1992.
The work provides a critical review of existing process interaction models and
makes a new proposal, called Ease, designed to provide a general purpose
parallel programming model that addresses the issues raised.
This work is strongly influenced by my industry experience with the
UK semiconductor company INMOS (now
STMicroelectronics)
and the development of the transputer microprocessor and Occam.
At INMOS I worked mainly in David May's computer architecture group and on occasion in Bob Krysiak's microprocessor design group (T810 and T9000).
A version of the book I wrote for INMOS about Occam in 1987, published originally
by Prentice Hall and revised by INMOS in 1995 -
the Occam
Reference Manual - is now also available in the public domain.
In the history relevant to this subject I was involved in the original development
of MPI though my
contribution was minor. In addition to my contributions to the early discussions I was the Chairman of the unappreciated Formal Methods Subcommittee.
I am also apparently a previous author of
the Linda Manual, though
to be completely honest, while I have written about Linda over the years and I was briefly a
member of the Linda Group at Yale, I have no clear recollection of writing a document
for the Linda company SCIENTIFIC.
You can find the PDF version of my thesis Process Interaction Models in the following downloads section.
With the renewed interest in parallel processing I have also extracted
an overview chapter and an appendix that illustrates how to add the proposed model to existing languages.
Although the book begins with a French introduction the body
of the work is in English.
Briefly, the proposal moves beyond message passing and Linda shared data, addressing difficiencies that effect and affect
the behavior of engineers programming parallel machines, while optimizing performance and aiding portability
independent of the underlying memory architecture. Simplistically, the proposal allows for the encapsulation of pointers and
access synchronization. It extends the typing system of parallel machines to provide strictly typed shared
data structures called contexts and introduces the primitive copy and binding operators put, get, read and write.
The proposal has the advantage of a formal basis (CSP).
If you have interest in my proposal then I hope that you will take the time to look beyond the model itself, learn about the issues and read the
rational that led there. Consideration was given to several aspects of programming language design that are novel, including the effect that the
language has upon the problem solving behavior of programmers using the language. This is an important consideration especially when
dealing with the complexities of designing solutions for parallel processing computers.
This result came from my years of experience with Occam, Linda and the Transputer and my working with a large international engineering
community all trying to apply parallel programming techniques to a wide range of real world problems - including graphics pipelines, image processing, recognition of
various kinds, artificial intelligence and (unfortunately) missile control. For a long time I was the Chairman of the Occam User Group AI SIG.
Several projects to implement Ease were started before the commercial and academic interest in parallel processing imploded in 1992/1993
with the collapse of companies like Thinking Machines and Kendall Square Research
- with whom I was associated through Kuck and Associates at the time.
The only independent project that continued, as far as I am aware, is
Tim Kleingeld-MacKenzie's C-with-Ease and C++-with-Ease compiler
(based on the language and primitives specified in my thesis)
developed at Monash in the later 1990s, and available for a variety of Unix
multi-processors. It can be found
here and an IEEE paper published in 1998 on
the subject
here.
I wrote papers on Ease in 1990 that are published by Yale University and may still be available.
You can find detailed references to them here.
In particular, Yale University technical report TR809 Programming with Ease: A Semiotic definition of the language was widely distributed
and, in essence, is the foundation of my Ph.D. thesis.
You can see the original announcement of Ease on July 12th, 1990 in the record of comp.arch.
An intermediate paper I wrote while in France A Rationale for Programming with Ease can be found in Springer's
Lecture Notes In Computer Science; Vol. 574,
Research Directions in High-Level Parallel Programming Languages. This is a good volume that captures a snapshot of
most of the work going on at the time. You can Google Book this paper too.
Post thesis, Ease; the model and its implementation
was published in the
January 1993 edition of
ACM SIGPLAN Notices, Volume 28 Issue 1.
If I can find someone to read some CT600 tapes I have - and they are still readable - I will make the Ease parser that I wrote available.
The Bison/Yacc grammar for Ease is available in the thesis appendix.
Finally, my interest in the resurgence of massively parallel computing directly relates to my current research where we are dealing with
massively parallel models of protein conformance in sensory and motile architectures,
in addition to the well-known massive synchronized parallelism of neural architectures.
The recognition devices we predict that will be a hybrid of silicon (or a substrate put to similar ends) and biological physics will inevitablely be massively parallel
and a new understanding of natural interaction models will be essential.
Downloads
The full thesis: Process
Interaction Models, THESE de DOCTORAT de l.UNIVERSITE PARIS VI,
Steven Ericsson Zenith,
Universite Pierre et Marie Curie - PARIS VI, soutenue le 1er Julliet 1992
Chapter 7: Writing Programs with Ease (an overview of the proposal)
Appendix: C-with-Ease (the proposal added to a conventional language)
Ease Grammar
|
Berkeley Presentation
I am giving a lecture to Dave Paterson's group at Berekeley on this work and its relevance to my current research at IASE - on Tuesday February 27th, 2007.
Read the Berkeley report on why the work that has languished here is suddenly relevant:
The Landscape of Parallel Computing Research: A View from Berkeley
Abstract
The Ease parallel processing model was developed at Yale University in 1990. Designed to ease the programming of parallel machines in the wake of experience with the transputer, Occam and Linda. In this talk I will outline the work and describe its current relevance.
Relevant Biography
From 1985 to 1990 Steven was a principal member of David May's computer architecture team at INMOS in Bristol, England. He made individual contributions to two transputer designs and the language Occam. He is the principal author of the Occam Reference Manual, published in 1987 by Prentice Hall.
Steven has extensive experience working with the international community then attempting to apply transputers using the Occam and Linda models to a variety of scientific, commercial and space/military applications - including graphics pipelines, image processing, recognition problems of all kinds, artificial intelligence, neural networks, scientific and commercial modeling, workstations, embedded and control systems.
In 1989, Steven was invited to Yale University by David Gelernter to do research work on the development of Linda and subsequently went to the Ecole NS des Mines de Paris to complete this research.
His PhD thesis "Process Interaction Models" (Paris VI), published in 1992, presents Ease as a complete model and language, it discusses the variety of process interaction models and presents a review of the problems found in practice. A web site dedicated to this work can be found at
http://process-interaction-models.info
Several aspects of the work are novel. In particular, the thesis notes the effect that engineering languages have upon the problem solving behavior of engineers and identifies "performance semantics" that force engineers to break the concurrency model. To address these problems the thesis introduces parallel data types that make it easier to reason about large scale distributed data models constructed and refined by parallel processes. The proposal introduces simple operators that encapsulate pass-by-reference and access synchronization. The model's pragmatics promise to enable applications that are readily portable and maintain efficiency across a range of parallel architectures with diverse characteristics; including the number and performance of processors, cores, communication and memory architectures.
Steven is currently principal investigator at the Institute for Advanced Science & Engineering ( http://iase.info ). His basic research mission there is the explanation of experience in nature - the foundations of logic and apprehension. This is a long term research project that leverages new data from the life sciences and ultimately addresses the recognition problems, sensory modality and motility. An outline of this work can be found at http://senses.info
Steven is also a founding contributor to the MPI standard.
Process Interaction Models - Abstract
The construction of programs as a collection of behavior patterns called processes and some
mechanism by which they interact is evolving as a positive engineering solution to programming
parallel machines. While much work has been done on understanding process models
(e.g., Hoare, Milner, Pratt and others), the models by which processes interact remain today
a subject of debate.
Process interaction models are an integral component of parallel processing theory and
practice, defining the means by which concurrent processes interact; where "interaction"
means not only the exchange of data but also synchronization between processes. In searching
for a general purpose model to program parallel machines, it is desirable to provide portability,
an expressiveness which does not distract the programmer from the task in hand and efficiency
independent of the memory architecture of the machine.
Message passing has, for sometime, been a favored interaction model on the basis that
it is simple and readily understood. However, Generalized Message Passing, as typified by
the language Occam[INM88], is an unsuitable model for general purpose programming of
parallel computers. It causes programmers to be preoccupied with complex issues of data
distribution and implementations are compelled to copy data that might usefully be exchanged
by reference.
The Linda[Gel85] model promised a solution to the problems of data distribution complexity
introduced by generalized message passing. Linda programmers need not be concerned
by issues of data distribution. However, the Linda model is flawed; performance semantics in
the model are unpredictable, leading programmers to write code based on an understanding of
how a particular optimizer or underlying matching protocol behaves - subverting any meaningful
portability, and data structures must be contrived to develop an uncertain construction
of distributed data structures.
This thesis discusses these issues and presents a new process interaction model. Ease is
described as a general purpose, high level, imperative programming language. A program is
a collection of processes which execute concurrently, constructing and interacting via strictly
typed shared data structures called "Contexts".
Ease is novel in the following regard: a Context provides a priority oriented and strictly
typed intermediary in which distributed data structures are constructed and by which processes
may interact. Ease provides simple and symmetric binding operators which allow complex data
structures to be constructed and exchanged efficiently, constructions for both cooperative and
subordinate concurrency, and a mechanism for constructing statically reusable
and virtual resources.
The language definition is used as a reference vehicle for the model but the model is
sufficiently distinct to be added to conventional practices in the way message passing and
Linda have been in the past; a definition of C-with-Ease is presented. This illustrates how
hybrid languages can be constructed. A prototype compiler is described.
Sample Chapter: Chapter 5
Semiotics
Conventional programming language definitions have been primarily
concerned with the development of Syntax and Semantics in the context
of some infinite abstract machine model. For sequential languages
this abstraction is the Turing Machine (an infinite resource
uniprocessor). For newly evolving languages dealing with concurrency
based on interacting sequential processes, the abstract machine has
been an extension of this model - an infinite number of
interconnected Turing Machines.
Abstract, infinite resource, machines enable the application of
many mathematical techniques to computer programs. Unfortunately, the
programmer finds herself in the classic engineering predicament of
applying theory to practice. The programmers universe is a finite
one, yet when seeking guidance in this universe none can be found in
the formal language definition. The programmer is alone, or at best
clutching a poorly specified and informally described system manual
which has poor relation to the language used for programming, and is
often in conflict with it. No mention is made of the effects that
resource limitations may have on program behavior, nor is mention
made of the pragmatics that exist to caused the invention of some
language feature.
In the past pragmatic issues have been left to an uncertain
tutorial style which usually takes the form of an informal
introduction to the language syntax and semantics, and some sketchy
detail of implementation restrictions.
5.1 Performance semantics
In many conventional programming languages experienced programmers
adopt a hidden semantics based on their understanding of the behavior
of a particular or general implementation issue. These semantics I
name performance semantics, since they derive from a particular use
of the language in anticipation that such use will result in some
performance benefit.
Examples of such usage in conventional sequential languages1 are
use of shift operators for positive power of two multiplication and
division, use of in line code in preference to procedure or function
calls, and use of a certain loop construct on the understanding that
an optimizer will vectorize the code and use an available vector
processor.
Examples of performance semantics in parallel programming are
prolific. Indeed, the very nature of some explicit parallel
constructions is pragmatic. Replication, for example, is utilized
when the programmer has an understanding that to use the construction
will provide some performance benefit; i.e., the use of the
construction does not contribute to the solution (is not an intrinsic
of the algorithm).
None-the-less, general parallel construction need not be seen this
way. Algorithmic decomposition using parallel constructions is a
modular programming methodology not dissimilar from such
methodologies as structured programming and object oriented
programming.
Thus general parallel construction need not be regarded as
introducing performance semantics but rather be seen as a modular method of program construction;
i.e., that such a program may be run on a machine with multiple
processors is incidental.
The increasing introduction of performance optimizing compilers
can be a source of great semantic abuse. Where the style and actual
form of a program is directly affected by the programmer’s
understanding of the optimizing behavior of a particular
implementation.
5.2 Semiotics
The term semiotic in linguistics and philosophy is used in
reference to the complete study of signs. The study in these
disciplines consists primarily of the three forms syntax, semantics
and pragmatics. In this thesis I adopt the term in a particular
sense. More usually found in linguistics or philosophy the term
refers to “a general philosophical theory of signs and symbols
that deals esp. with their function in both artificially constructed
and natural languages and comprises the three branches of syntactics,
semantics, and pragmatics”(Webster’s Third New
International Dictionary). In the manner I use the term here
semiotics, in addition to considering the semantics and syntax of a
programming language considers the effect the language has upon the
behavior of the programmer, and, in particular, the pragmatic
statements required for the programmer to make consistent and
efficient use of the language.
The syntax and semantics of existing programming languages have
inherited much from the development of mathematical formalism. In
pure mathematics there are simple pragmatics. These pragmatics
consist of
- knowledge that use of the formalism is correctly applied
(i.e. Degree of confidence in the ability of the mathematician), and
- interpretation of the expressions by the human observer.
In mathematics both these pragmatics are dealt with by cross
referencing with the competence of other mathematicians. However, it is important to observe that
without the application of these pragmatics the validity of any
mathematical result is in question.
The role of pragmatics in mathematics is an important one. The
incorrect application of a formalism produces incorrect or
meaningless results. A misconception means to interpret the behavior
of an expression incorrectly. Incorrect application of the formalism
implies the results are incorrect or unexpected. Calculus, geometry,
whatever mathematical form, is just so much scribble without these
pragmatics.
The pragmatics in mathematics rests on the integrity of the
mathematician doing the work, the ability to cross reference with
other mathematicians who can verify the correctness of the
application and interpretation. This is an entirely reasonable
assumption among mathematicians.
Plato formalized these mathematical pragmatics in a single
statement written over the door of the Academy
“Let only geometers enter.”
Indeed, these pragmatics are a pragmatic of the scientific
community.
5.3 Semiotics in Computer Science
For the programmer dealing with large, non-trivial, problems a
central and unavoidable issue is performance. It is this pragmatic of
performance that most distinguishes the engineering sciences
(including Computer Science) from Mathematics.
Syntax and semantics are becoming well understood in the Computer
Science community, they are issues of great complexity and form. I am
concerned here not with those issues but rather the effect that the
language and its implementation have upon the behavior of the
programmer. As an example imagine a language where the use of
procedural abstraction incurred a significant performance cost. This
will effect the behavior of the programmer in time. The programmer
learns through experience to avoid using procedural abstraction for
trivial pieces of code.
We have seen mentioned in the earlier discussion several instances
of this semiotic effect.
In Occam, programmers soon learn that to over use communication in
programs incurs a significant penalty due to an increase in the copy
penalty, programmers begin programming with the free expression of
parallelism but pragmatics soon bring constraint and the programmer’s
behavior is modified; modified by the performance semantics of the
language.
In generalized message passing a preoccupation with routing and
multiplexing is a semiotic effect that directly changes the way the
programmer behaves. In this case the effect serves to distract the
programmer from the real task: algorithmic development.
In Linda the performance semantics of each primitive operation is
not uniformly unpredictable because of the requirement for run time
matching and optimization, and again the programmer’s behavior
is modified (in the ways discussed in the introduction).
Performance semantics are less of an issue in sequential
programming since a sequential machine generally has a balanced
nature with consistent performance semantics. However, for parallel
and distributed programming no such balance exists in current and
foreseeable technology. The more computer technology becomes parallel
the greater this issue will become. Performance is an architectural
complexity dependent on the balanced nature of the machine.
So what are we to do? We must come to understand these issues.
This understanding will itself have an effect on the behavior of
programming language designers.
In the following proposal I have made a first attempt at providing
a truly semiotic definition by providing a section of explicit
pragmatics that will enable the programmer to make efficient and
consistent use of the language described.
continued in the thesis
|