CS 358. Concurrent Object-Oriented Programming
Spring 1996

Lecture 1. Introduction

Mechanics of the course

This course will examine issues in the design of concurrent object-oriented programming languages. The only assignment is a term project, which may be an oral presentation in class or a written term paper. The project may either be a survey presentation, with some analysis of a specific problem, or a small piece of original research.

Students may register for CS 358 and do a "theory" project or CS 342 and do a "systems" project.

Objects and processes

A basic issue in the design of concurrent object-oriented languages is the relation between objects and processes. More specifically, the object metaphor is easily adapted to distributed computing, since the notion of "sending a message" does not involve any assumptions about locality. Therefore, a natural approach to distributed object systems is simply to allow objects to be distributed across a system.

Processes and objects are both relatively complex entities that are used to structure the control and data flow of computation. It seems reasonable to expect some form of connection between the two constructs. In addition, it seems difficult to avoid design considerations that involve interactions between objects and processes.

There have been some designs, such as COOL at Stanford, that do not seem to combine objects and processes in any interesting way. However, it might be an interesting project to analyze this system, or try writing some code, in light of the ideas that arise in this class.

Some design possibilities:

Example issue: method concurrency

One simple design decision is to serialize method execution. More specifically, to avoid interference between the commands that appear in method bodies, each object may be associated with a single thread, with one method completed before another message is received. We can see why this might be useful by considering the possible concurrent execution of push and pop in this stack implementation.
class stack 
      contents : array[1..n] of int
      top : int
      stack () =   top := 0
      push (x:int) : unit = 
          if top < n then 
              top := top + 1;
              contents(top) := x
          else raise stack_full;
      pop ( ) : int =
          if top > 0 then 
              top := top - 1;
              return contents(top+1)
          else raise stack_empty;
end stack
Using interleaving to explain the result, we could have this sequence of operations (# denotes an uninitialized value):

Initial state
             contents(1) = 3,  top = 1

Interleaved execution of push(4) and pop():
             push:   top := 2
             pop:    top := 1
             push:   contents(top) := 4        -- writes contents(1)
             pop:    return contents(top+1)    -- returns contents(2) = #

Final state
             contents(1) = 4
             contents(2) = #
A general correctness condition for parallel execution of parts of a program is serializability: the overall effect should be indistinguishable from that of some serial ordering of the program parts. (This idea comes from distributed databases, say, where requests to the database are not inherently serialized.) However, the result of the interleaved sequence of commands above is not consistent with any sequential ordering of push and pop: a serial push(4); pop() would return 4 and leave contents(1) = 3, while pop(); push(4) would return 3 and leave contents(1) = 4.
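
To make the serialization idea concrete, here is a sketch (not from the lecture) of the stack in Python, with each method body protected by a per-object lock so that any concurrent push and pop is equivalent to some serial order:

```python
import threading

class Stack:
    """Bounded stack whose methods are serialized by a per-object lock.

    A hypothetical illustration: holding the lock for the whole method
    body makes concurrent calls behave like some serial ordering.
    """
    def __init__(self, n):
        self.contents = [None] * n
        self.top = 0                    # number of elements currently stored
        self._lock = threading.Lock()   # one lock per object serializes methods

    def push(self, x):
        with self._lock:                # whole body is a critical section
            if self.top >= len(self.contents):
                raise OverflowError("stack_full")
            self.contents[self.top] = x
            self.top += 1

    def pop(self):
        with self._lock:
            if self.top <= 0:
                raise IndexError("stack_empty")
            self.top -= 1
            return self.contents[self.top]
```

With the lock, the anomalous interleaving above cannot occur: whichever thread acquires the lock first runs its whole method body before the other begins.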

Process calculus models of objects such as \cite{Walker,PierceTurner94:COPC} model objects as records, each accessed through a single channel. Since messages along a channel arrive serially, messages to objects are also serialized. We will investigate whether this approach is likely to yield useful parallelism for realistic programs.

In Cliff Jones's $\pi o\beta\lambda$ language, for example, only one method can be invoked at a time, resulting in a rendez-vous that causes the caller (client) to block until the object (server) exits the rendez-vous. An object may continue executing the method body after the rendez-vous, allowing the caller to proceed in parallel, but the object cannot accept another message until that computation is complete.
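
A minimal sketch of this rendez-vous discipline, using Python queues as channels (the server loop, the doubling body, and all names are hypothetical):

```python
import queue
import threading

def server(requests):
    """One-at-a-time server in the rendez-vous style (a sketch).

    The caller blocks until the reply is sent; the server may keep
    computing after the reply, but it does not accept the next request
    until that remaining work is finished.
    """
    while True:
        x, reply = requests.get()   # accept exactly one message
        if x is None:               # shutdown sentinel (hypothetical)
            return
        reply.put(x * 2)            # end of rendez-vous: caller resumes
        # Work placed here still runs before the next requests.get(),
        # so the object excludes other messages until it is complete.

def call(requests, x):
    """Client side of a synchronous invocation: send, then block for the reply."""
    reply = queue.Queue(maxsize=1)
    requests.put((x, reply))
    return reply.get()              # caller blocks until the server replies
```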

An intermediate design between fully serialized and fully parallel methods is to serialize method invocation but allow each method body to decide when it is "safe" for the next method to begin. In other words, each method body would be structured into a critical section and a non-critical section. The critical section would be executed first, excluding the execution of other methods. When the critical section completes, the non-critical section could continue in parallel with the critical section (or other parts) of another method.
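
The critical/non-critical split can be sketched as follows, with releasing the object's lock playing the role of end_critical_region (the class and its method are hypothetical, not from the lecture):

```python
import threading

class PartiallySerialized:
    """Sketch of the critical/non-critical split for one method.

    The object's lock is held only for the critical prefix, which
    touches shared state; after the release (the analogue of
    end_critical_region), the remainder of the body may overlap with
    the next method's critical section.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self.state = 0

    def method(self, x):
        self._lock.acquire()
        self.state += x             # critical: reads and writes shared state
        snapshot = self.state       # copy what the rest of the body needs
        self._lock.release()        # end_critical_region
        return snapshot * snapshot  # non-critical: uses only local data
```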

Exercise: Structure the implementations of push and pop into critical and non-critical sections so that concurrent execution of these methods would be correct. Does this approach seem useful? Try rewriting the following implementation of queues using critical regions. Explain why your implementation is correct. You may assume that after insert passes the end of its critical region, a call to remove may be processed in parallel, but another call to insert may not. Mark the end of the critical region of each method with a statement like end_critical_region. Would you be able to do better if you could explicitly lock a part of the array without locking all of it?

class queue 
      contents : array[0..n-1] of int
      front, back : int
      queue() =  front := back := 0
      insert (x:int) : unit = 
          if (back+1) mod n != front then 
             back := (back+1) mod n;
             contents(back) := x
          else raise queue_full;
      remove ( ) : int =
          if front != back then 
             front := (front+1) mod n;
             return contents(front)
          else raise queue_empty;
end queue

Atomicity of methods

Another general issue is whether method invocation is atomic. In simple terms, an operation is atomic if any execution is guaranteed either to complete successfully or to roll back to a state in which it is consistent to assert that the operation was never initiated. From a theoretical point of view, atomicity is useful for reasoning about program correctness. However, it may be difficult to achieve in practice.
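
As a rough illustration (the account/transfer example is hypothetical, not from the lecture), an all-or-nothing operation can be approximated in a sequential setting by saving state before the operation and restoring it on failure:

```python
class Account:
    """Minimal mutable state for illustrating atomic rollback."""
    def __init__(self, balance):
        self.balance = balance

def transfer(src, dst, amount):
    """All-or-nothing transfer: either it completes, or the saved state
    is restored so that it appears never to have been initiated."""
    saved_src, saved_dst = src.balance, dst.balance  # record state for rollback
    try:
        src.balance -= amount
        if src.balance < 0:
            raise ValueError("insufficient funds")
        dst.balance += amount
    except ValueError:
        src.balance, dst.balance = saved_src, saved_dst  # roll back
        raise
```

In a genuinely concurrent setting this is much harder: other threads may observe the intermediate state unless rollback is combined with mutual exclusion, which is one reason atomicity is difficult to achieve in practice.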

Communication and locality

Some other important issues are those surrounding locality and communication between objects. In the Spring object-oriented operating system project at Sun, the difference between local (synchronous and reliable) and remote (possibly asynchronous and/or possibly unreliable) communication became an important issue. With large and small entities uniformly treated as objects, many messages are involved in even the simplest computation. In order to achieve reasonable efficiency, it therefore seems important to distinguish straightforward local communication from remote method invocation, since the latter would involve more complicated communication protocols.

It is also not clear how much programmer attention is required. When objects are known to be local, it is tiresome to specify the action to take when communication fails or times out. However, some programmer effort may be necessary in the case of remote communication.

Processes and functions

A foundational question is the connection between processes and functions. Intuitively, a function is a process that (i) runs in the caller's thread or, equivalently, causes the caller to block until it returns, and (ii) has specific input and output channels, so that the caller passes an argument (once) and the function may return a result (once). This can be realized fairly simply in $\pi o\beta\lambda$, say, since a method that returns one value, terminating the rendez-vous only when computation is complete, has essentially the same behavior as a function, except for the problem with recursive calls. However, with other conventions for method invocation, the connection seems less clear. The relationship between functions and processes in the $\pi$-calculus has been studied by Milner.
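
A minimal sketch of the function-as-process view, with Python queues standing in for the input and output channels (all names here are hypothetical):

```python
import queue
import threading

def function_as_process(arg_channel, result_channel):
    """A 'function' as a process: it reads its argument exactly once
    from the input channel and writes its result exactly once to the
    output channel."""
    x = arg_channel.get()        # receive the (single) argument
    result_channel.put(x + 1)    # send the (single) result

def apply_function(x):
    """Caller side: pass the argument, then block until the result
    arrives -- this blocking is exactly the caller waiting for an
    ordinary function call to return."""
    args = queue.Queue(maxsize=1)
    results = queue.Queue(maxsize=1)
    threading.Thread(target=function_as_process, args=(args, results)).start()
    args.put(x)                  # argument is passed once
    return results.get()         # caller blocks, as in a direct call
```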

Language frameworks

P. America "Issues" paper

This is a description of POOL2, with some useful general discussion in the early sections. Read sections 1-3 for next lecture.