Process Interaction Models

 

Institute for Advanced Science & Engineering

semeiosis research

semeiosis = the process in nature that constructs metaphysical marks. [Zenith]

 

 

 

 

{ } Updated March 27th, 2007

Process Interaction Models

This page is dedicated to my doctoral work begun at Yale University while doing research work with Professor David Gelernter and completed at Universite Pierre et Marie Curie, PARIS (VI) - traditionally the science department of the Sorbonne - and the Ecole NS des Mines de Paris. I successfully defended Process Interaction Models in July 1992.

The work provides a critical review of existing process interaction models and makes a new proposal, called Ease, designed to provide a general purpose parallel programming model that addresses the issues raised.

This work is strongly influenced by my industry experience with the UK semiconductor company INMOS (now STMicroelectronics) and the development of the transputer microprocessor and Occam. At INMOS I worked mainly in David May's computer architecture group and on occasion in Bob Krysiak's microprocessor design group (T810 and T9000). A version of the book I wrote for INMOS about Occam in 1987, published originally by Prentice Hall and revised by INMOS in 1995 - the Occam Reference Manual - is now also available in the public domain.

In the history relevant to this subject I was involved in the original development of MPI though my contribution was minor. In addition to my contributions to the early discussions I was the Chairman of the unappreciated Formal Methods Subcommittee. I am also apparently a previous author of the Linda Manual, though to be completely honest, while I have written about Linda over the years and I was briefly a member of the Linda Group at Yale, I have no clear recollection of writing a document for the Linda company SCIENTIFIC.

You can find the PDF version of my thesis Process Interaction Models in the following downloads section. With the renewed interest in parallel processing I have also extracted an overview chapter and an appendix that illustrates how to add the proposed model to existing languages. Although the book begins with a French introduction the body of the work is in English.

Briefly, the proposal moves beyond message passing and Linda shared data, addressing difficiencies that effect and affect the behavior of engineers programming parallel machines, while optimizing performance and aiding portability independent of the underlying memory architecture. Simplistically, the proposal allows for the encapsulation of pointers and access synchronization. It extends the typing system of parallel machines to provide strictly typed shared data structures called contexts and introduces the primitive copy and binding operators put, get, read and write. The proposal has the advantage of a formal basis (CSP).

If you have interest in my proposal then I hope that you will take the time to look beyond the model itself, learn about the issues and read the rational that led there. Consideration was given to several aspects of programming language design that are novel, including the effect that the language has upon the problem solving behavior of programmers using the language. This is an important consideration especially when dealing with the complexities of designing solutions for parallel processing computers. This result came from my years of experience with Occam, Linda and the Transputer and my working with a large international engineering community all trying to apply parallel programming techniques to a wide range of real world problems - including graphics pipelines, image processing, recognition of various kinds, artificial intelligence and (unfortunately) missile control. For a long time I was the Chairman of the Occam User Group AI SIG.

Several projects to implement Ease were started before the commercial and academic interest in parallel processing imploded in 1992/1993 with the collapse of companies like Thinking Machines and Kendall Square Research - with whom I was associated through Kuck and Associates at the time. The only independent project that continued, as far as I am aware, is Tim Kleingeld-MacKenzie's C-with-Ease and C++-with-Ease compiler (based on the language and primitives specified in my thesis) developed at Monash in the later 1990s, and available for a variety of Unix multi-processors. It can be found here and an IEEE paper published in 1998 on the subject here.

I wrote papers on Ease in 1990 that are published by Yale University and may still be available. You can find detailed references to them here. In particular, Yale University technical report TR809 Programming with Ease: A Semiotic definition of the language was widely distributed and, in essence, is the foundation of my Ph.D. thesis. You can see the original announcement of Ease on July 12th, 1990 in the record of comp.arch. An intermediate paper I wrote while in France A Rationale for Programming with Ease can be found in Springer's Lecture Notes In Computer Science; Vol. 574, Research Directions in High-Level Parallel Programming Languages. This is a good volume that captures a snapshot of most of the work going on at the time. You can Google Book this paper too. Post thesis, Ease; the model and its implementation was published in the January 1993 edition of ACM SIGPLAN Notices, Volume 28 Issue 1.

If I can find someone to read some CT600 tapes I have - and they are still readable - I will make the Ease parser that I wrote available. The Bison/Yacc grammar for Ease is available in the thesis appendix.

Finally, my interest in the resurgence of massively parallel computing directly relates to my current research where we are dealing with massively parallel models of protein conformance in sensory and motile architectures, in addition to the well-known massive synchronized parallelism of neural architectures. The recognition devices we predict that will be a hybrid of silicon (or a substrate put to similar ends) and biological physics will inevitablely be massively parallel and a new understanding of natural interaction models will be essential.

Downloads

  • The full thesis: Process Interaction Models, THESE de DOCTORAT de l.UNIVERSITE PARIS VI,
    Steven Ericsson Zenith, Universite Pierre et Marie Curie - PARIS VI, soutenue le 1er Julliet 1992
  • Chapter 7: Writing Programs with Ease (an overview of the proposal)
  • Appendix: C-with-Ease (the proposal added to a conventional language)
  • Ease Grammar
  •  

     

     

    Berkeley Presentation

    I am giving a lecture to Dave Paterson's group at Berekeley on this work and its relevance to my current research at IASE - on Tuesday February 27th, 2007.

    Read the Berkeley report on why the work that has languished here is suddenly relevant: The Landscape of Parallel Computing Research: A View from Berkeley

    Abstract

    The Ease parallel processing model was developed at Yale University in 1990. Designed to ease the programming of parallel machines in the wake of experience with the transputer, Occam and Linda. In this talk I will outline the work and describe its current relevance.

    Relevant Biography

    From 1985 to 1990 Steven was a principal member of David May's computer architecture team at INMOS in Bristol, England. He made individual contributions to two transputer designs and the language Occam. He is the principal author of the Occam Reference Manual, published in 1987 by Prentice Hall.

    Steven has extensive experience working with the international community then attempting to apply transputers using the Occam and Linda models to a variety of scientific, commercial and space/military applications - including graphics pipelines, image processing, recognition problems of all kinds, artificial intelligence, neural networks, scientific and commercial modeling, workstations, embedded and control systems.

    In 1989, Steven was invited to Yale University by David Gelernter to do research work on the development of Linda and subsequently went to the Ecole NS des Mines de Paris to complete this research.

    His PhD thesis "Process Interaction Models" (Paris VI), published in 1992, presents Ease as a complete model and language, it discusses the variety of process interaction models and presents a review of the problems found in practice. A web site dedicated to this work can be found at

    http://process-interaction-models.info

    Several aspects of the work are novel. In particular, the thesis notes the effect that engineering languages have upon the problem solving behavior of engineers and identifies "performance semantics" that force engineers to break the concurrency model. To address these problems the thesis introduces parallel data types that make it easier to reason about large scale distributed data models constructed and refined by parallel processes. The proposal introduces simple operators that encapsulate pass-by-reference and access synchronization. The model's pragmatics promise to enable applications that are readily portable and maintain efficiency across a range of parallel architectures with diverse characteristics; including the number and performance of processors, cores, communication and memory architectures.

    Steven is currently principal investigator at the Institute for Advanced Science & Engineering ( http://iase.info ). His basic research mission there is the explanation of experience in nature - the foundations of logic and apprehension. This is a long term research project that leverages new data from the life sciences and ultimately addresses the recognition problems, sensory modality and motility. An outline of this work can be found at http://senses.info

    Steven is also a founding contributor to the MPI standard.

    Process Interaction Models - Abstract

    The construction of programs as a collection of behavior patterns called processes and some mechanism by which they interact is evolving as a positive engineering solution to programming parallel machines. While much work has been done on understanding process models (e.g., Hoare, Milner, Pratt and others), the models by which processes interact remain today a subject of debate.

    Process interaction models are an integral component of parallel processing theory and practice, defining the means by which concurrent processes interact; where "interaction" means not only the exchange of data but also synchronization between processes. In searching for a general purpose model to program parallel machines, it is desirable to provide portability, an expressiveness which does not distract the programmer from the task in hand and efficiency independent of the memory architecture of the machine.

    Message passing has, for sometime, been a favored interaction model on the basis that it is simple and readily understood. However, Generalized Message Passing, as typified by the language Occam[INM88], is an unsuitable model for general purpose programming of parallel computers. It causes programmers to be preoccupied with complex issues of data distribution and implementations are compelled to copy data that might usefully be exchanged by reference.

    The Linda[Gel85] model promised a solution to the problems of data distribution complexity introduced by generalized message passing. Linda programmers need not be concerned by issues of data distribution. However, the Linda model is flawed; performance semantics in the model are unpredictable, leading programmers to write code based on an understanding of how a particular optimizer or underlying matching protocol behaves - subverting any meaningful portability, and data structures must be contrived to develop an uncertain construction of distributed data structures.

    This thesis discusses these issues and presents a new process interaction model. Ease is described as a general purpose, high level, imperative programming language. A program is a collection of processes which execute concurrently, constructing and interacting via strictly typed shared data structures called "Contexts".

    Ease is novel in the following regard: a Context provides a priority oriented and strictly typed intermediary in which distributed data structures are constructed and by which processes may interact. Ease provides simple and symmetric binding operators which allow complex data structures to be constructed and exchanged efficiently, constructions for both cooperative and subordinate concurrency, and a mechanism for constructing statically reusable and virtual resources.

    The language definition is used as a reference vehicle for the model but the model is sufficiently distinct to be added to conventional practices in the way message passing and Linda have been in the past; a definition of C-with-Ease is presented. This illustrates how hybrid languages can be constructed. A prototype compiler is described.

    Sample Chapter: Chapter 5

    Semiotics

    Conventional programming language definitions have been primarily concerned with the development of Syntax and Semantics in the context of some infinite abstract machine model. For sequential languages this abstraction is the Turing Machine (an infinite resource uniprocessor). For newly evolving languages dealing with concurrency based on interacting sequential processes, the abstract machine has been an extension of this model - an infinite number of interconnected Turing Machines.

    Abstract, infinite resource, machines enable the application of many mathematical techniques to computer programs. Unfortunately, the programmer finds herself in the classic engineering predicament of applying theory to practice. The programmers universe is a finite one, yet when seeking guidance in this universe none can be found in the formal language definition. The programmer is alone, or at best clutching a poorly specified and informally described system manual which has poor relation to the language used for programming, and is often in conflict with it. No mention is made of the effects that resource limitations may have on program behavior, nor is mention made of the pragmatics that exist to caused the invention of some language feature.

    In the past pragmatic issues have been left to an uncertain tutorial style which usually takes the form of an informal introduction to the language syntax and semantics, and some sketchy detail of implementation restrictions.

    5.1 Performance semantics

    In many conventional programming languages experienced programmers adopt a hidden semantics based on their understanding of the behavior of a particular or general implementation issue. These semantics I name performance semantics, since they derive from a particular use of the language in anticipation that such use will result in some performance benefit.

    Examples of such usage in conventional sequential languages1 are use of shift operators for positive power of two multiplication and division, use of in line code in preference to procedure or function calls, and use of a certain loop construct on the understanding that an optimizer will vectorize the code and use an available vector processor.

    Examples of performance semantics in parallel programming are prolific. Indeed, the very nature of some explicit parallel constructions is pragmatic. Replication, for example, is utilized when the programmer has an understanding that to use the construction will provide some performance benefit; i.e., the use of the construction does not contribute to the solution (is not an intrinsic of the algorithm).

    None-the-less, general parallel construction need not be seen this way. Algorithmic decomposition using parallel constructions is a modular programming methodology not dissimilar from such methodologies as structured programming and object oriented programming.

    Thus general parallel construction need not be regarded as introducing performance semantics but rather be seen as a modular method of program construction; i.e., that such a program may be run on a machine with multiple processors is incidental.

    The increasing introduction of performance optimizing compilers can be a source of great semantic abuse. Where the style and actual form of a program is directly affected by the programmer’s understanding of the optimizing behavior of a particular implementation.

    5.2 Semiotics

    The term semiotic in linguistics and philosophy is used in reference to the complete study of signs. The study in these disciplines consists primarily of the three forms syntax, semantics and pragmatics. In this thesis I adopt the term in a particular sense. More usually found in linguistics or philosophy the term refers to “a general philosophical theory of signs and symbols that deals esp. with their function in both artificially constructed and natural languages and comprises the three branches of syntactics, semantics, and pragmatics”(Webster’s Third New International Dictionary). In the manner I use the term here semiotics, in addition to considering the semantics and syntax of a programming language considers the effect the language has upon the behavior of the programmer, and, in particular, the pragmatic statements required for the programmer to make consistent and efficient use of the language.

    The syntax and semantics of existing programming languages have inherited much from the development of mathematical formalism. In pure mathematics there are simple pragmatics. These pragmatics consist of

    • knowledge that use of the formalism is correctly applied (i.e. Degree of confidence in the ability of the mathematician), and
    • interpretation of the expressions by the human observer.

    In mathematics both these pragmatics are dealt with by cross referencing with the competence of other mathematicians. However, it is important to observe that without the application of these pragmatics the validity of any mathematical result is in question.

    The role of pragmatics in mathematics is an important one. The incorrect application of a formalism produces incorrect or meaningless results. A misconception means to interpret the behavior of an expression incorrectly. Incorrect application of the formalism implies the results are incorrect or unexpected. Calculus, geometry, whatever mathematical form, is just so much scribble without these pragmatics.

    The pragmatics in mathematics rests on the integrity of the mathematician doing the work, the ability to cross reference with other mathematicians who can verify the correctness of the application and interpretation. This is an entirely reasonable assumption among mathematicians.

    Plato formalized these mathematical pragmatics in a single statement written over the door of the Academy

    “Let only geometers enter.”

    Indeed, these pragmatics are a pragmatic of the scientific community.

    5.3 Semiotics in Computer Science

    For the programmer dealing with large, non-trivial, problems a central and unavoidable issue is performance. It is this pragmatic of performance that most distinguishes the engineering sciences (including Computer Science) from Mathematics.

    Syntax and semantics are becoming well understood in the Computer Science community, they are issues of great complexity and form. I am concerned here not with those issues but rather the effect that the language and its implementation have upon the behavior of the programmer. As an example imagine a language where the use of procedural abstraction incurred a significant performance cost. This will effect the behavior of the programmer in time. The programmer learns through experience to avoid using procedural abstraction for trivial pieces of code.

    We have seen mentioned in the earlier discussion several instances of this semiotic effect.

    In Occam, programmers soon learn that to over use communication in programs incurs a significant penalty due to an increase in the copy penalty, programmers begin programming with the free expression of parallelism but pragmatics soon bring constraint and the programmer’s behavior is modified; modified by the performance semantics of the language.

    In generalized message passing a preoccupation with routing and multiplexing is a semiotic effect that directly changes the way the programmer behaves. In this case the effect serves to distract the programmer from the real task: algorithmic development.

    In Linda the performance semantics of each primitive operation is not uniformly unpredictable because of the requirement for run time matching and optimization, and again the programmer’s behavior is modified (in the ways discussed in the introduction).

    Performance semantics are less of an issue in sequential programming since a sequential machine generally has a balanced nature with consistent performance semantics. However, for parallel and distributed programming no such balance exists in current and foreseeable technology. The more computer technology becomes parallel the greater this issue will become. Performance is an architectural complexity dependent on the balanced nature of the machine.

    So what are we to do? We must come to understand these issues. This understanding will itself have an effect on the behavior of programming language designers.

    In the following proposal I have made a first attempt at providing a truly semiotic definition by providing a section of explicit pragmatics that will enable the programmer to make efficient and consistent use of the language described.

    continued in the thesis