CS 358. Concurrent Object-Oriented Programming
Spring 1996

Lectures 6-8. The Actor approach to concurrent objects

References: G. Agha and C. Hewitt, Concurrent programming using Actors, in Yonezawa and Tokoro (eds), Object-Oriented Concurrent Programming.
G. Agha, ACTORS: A Model of Concurrent Computation in Distributed Systems, MIT Press, 1986.
G. Agha, Concurrent object-oriented programming, CACM 33, 9, Sept 1990, pages 125--141.
G.A. Agha, I.A. Mason, S.F. Smith, C.L. Talcott, A foundation for Actor computation, J. Functional Programming, 1(1)??, 1993.

History

Actors originated in Carl Hewitt's work on the artificial intelligence system Planner in the early 1970s (e.g., Hewitt 1971, cited in [Agha, Mason, Smith, Talcott]). The ideas of Hewitt and his collaborators evolved over subsequent years. At present, there is no one Actor system that embodies all of the concepts that have been developed in the literature. As a result, our discussion of Actor systems is more a discussion of a point of view, with possible alternatives and embellishments, than an evaluation of a specific, concrete system. Nonetheless, the Actor ideas presented in twenty years of research papers represent an interesting and influential point of view on concurrent, object-oriented computing. The Actor approach was formulated around three main design objectives:
  1. Shared, mutable data. The actor model is designed to deal with shared resources that may change state. An example is a bank account whose balance may change.
  2. Reconfigurability. New objects may be created and it is possible to communicate with new objects after they are created.
  3. Inherent concurrency. It should be possible to understand the "inherent concurrency" of a program by examining it.
The phrase "inherent concurrency" refers to the number of activities that could be carried out in parallel if there were an unbounded number of processes available. In practice, a program may be executed with less than its inherent concurrency due to resource limitations. However, such deviation from maximally parallel execution need not be apparent from the text of the program since it depends on the resources available at the time the program is executed, not the tasks described by the program. The importance of this design goal seems to be that Actors are an explicitly concurrent notion, not an implicitly concurrent one as arises in concurrent implementations of sequential languages.

Basic ideas in the actor model

An actor is an object that carries out its actions in response to communications it receives. There are three basic actions that an actor may perform: send communications to other actors, create new actors, and specify a replacement behavior for handling subsequent communications. Actor computation is reactive, which means that computation is performed as a "reaction" to communication. An actor is "dormant" until it receives a communication; then the script of the actor may specify communications to send and a set of new actors to create. After this, the actor returns to its "dormant" or "sleep" state. The replacement behavior determines the response to subsequent communication. For example, if an actor representing the number 3 receives a communication "increment by 1," its replacement behavior will be to act like the number 4.

In any computation, each actor receives a linearly ordered sequence of communications. However, messages are not guaranteed to arrive in the order in which they are sent. If one actor, A, sends two communications to another, B, then B may not receive them in the same order that A sent them. This is consistent with the idea that actors may be physically located in different places, connected by a communication network that might route different messages along different paths, sometimes send short communications faster than long ones, and so on.

There is no assignment to local variables in the basic actor model; the only side effects of an actor are to send communication and to specify a replacement behavior, which is a form of atomic change in the state of the actor performed only after the other activities are complete. This establishes a specific granularity to actor computation and allows us to consider one execution of an actor script as an atomic computation.

An important part of the model is the mail system, which routes and buffers communication between actors. Every communication must be sent to a mail address and an actor may communicate with any actor if it knows its mail address. When an actor A specifies a "replacement behavior," the replacement behavior will be the script for a new actor that receives subsequent communication addressed to the mail address of A.
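To make these ideas concrete, here is a minimal Python sketch of an actor as a thread with a mail queue and a replaceable behavior. This is only an illustration of the model, not code from any Actor system; all of the names (Actor, send, become, number) are ours.

    import queue, threading

    class Actor:
        def __init__(self, behavior):
            self.mailbox = queue.Queue()   # the mail system buffers communications
            self.behavior = behavior
            threading.Thread(target=self._run, daemon=True).start()

        def send(self, message):
            self.mailbox.put(message)      # asynchronous: the sender does not wait

        def become(self, behavior):
            self.behavior = behavior       # replacement behavior for the NEXT message

        def _run(self):
            while True:
                message = self.mailbox.get()  # dormant until a communication arrives
                self.behavior(self, message)  # one atomic "reaction" per message

    # The actor for 3 becomes the actor for 4 when told to increment.
    def number(n):
        def behavior(self, message):
            if message == "increment by 1":
                self.become(number(n + 1))
        return behavior

    three = Actor(number(3))
    three.send("increment by 1")   # afterwards, this actor behaves like the number 4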

Components of an Actor system

A communication event is called a task. A task has three parts:
  1. a tag, distinguishing it from all other tasks in the system;
  2. a target, the mail address to which the communication is to be delivered;
  3. a communication, the data conveyed to the actor at the target address.

The configuration of an actor system consists of a finite set of actors and a finite set of "pending" or undelivered tasks. Since each task has a single, specified destination, [Agha 1986] says that we may visualize the system as a set of actors and task queues, with one queue per actor and each task on exactly one queue. (As mentioned above, tasks arrive in indeterminate order -- if one actor sends two communications to another, they may be placed on the queue in any order. However, each actor receives a linearly ordered sequence of communications.)

This does not actually seem right -- task queues imply that if a stack object receives an "enqueue" message, one that it is not prepared to respond to in any way, that message might sit in the queue in front of a push or pop message and keep the actor from processing a message that it is prepared to respond to. For this reason, it seems more reasonable to think of each actor as having an unordered "pool" of messages that have arrived. When dormant, an actor may react to any of the messages in its pool. (* Do actor experts think this is correct??? *)
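Under this pool reading, a dormant actor scans its pool for any message its current behavior accepts. A hypothetical Python sketch of one step (the names PooledActor, accepts, and handle are ours, not from any Actor system):

    class PooledActor:
        def __init__(self, accepts, handle):
            self.pool = []          # unordered "pool" of arrived messages
            self.accepts = accepts  # predicate: can the current behavior handle this?
            self.handle = handle    # reaction to an accepted message

        def deliver(self, message):
            self.pool.append(message)

        def step(self):
            # React to any acceptable pooled message; e.g., a stack behavior
            # accepts push and pop but leaves an enqueue sitting in the pool.
            for message in list(self.pool):
                if self.accepts(message):
                    self.pool.remove(message)
                    self.handle(message)
                    return True
            return False            # nothing acceptable: remain dormant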

General Properties of Actors

Synchronous vs. asynchronous computation

Actors are asynchronous, with no global clock.

Rationale: Distributed agents may have local clocks that proceed at different rates. This may be due to differences in processor speed, or to the way that different logical processes are scheduled on physical processors. In addition, there may be different time delays in communication between different processors. Therefore, processes may observe different orders of events. For example, processes A and B may both receive communication from processes C and D indicating the completion of certain tasks. However, if communication from C to A proceeds more quickly than from C to B, and conversely communication from D to B is quicker than from D to A, then process A may "see" that C completes before D while B "sees" that D completes before C.

Synchronous computation may be achieved in an asynchronous environment, using a "clock" process that notifies all other processes when to proceed. However, the costs of communicating with the clock process and waiting for acknowledgment may be high. Therefore, the costs of maintaining synchronization should not be hidden by the basic programming model. It is more realistic to assume that the basic computing environment is asynchronous.

The rationale is based on assumptions that may not hold in all systems -- consider, for example, multiprogramming on a uniprocessor, where all processes share a single physical clock and communication delays are negligible.

Communication mechanisms

Shared variables

Shared variables are not allowed in the actor model.

Rationale: Shared variables are a low-level primitive, meaning that they provide data transfer without any coordination mechanism or concurrency control. Users of shared variables must implement and use some agreed-upon access protocol, in order to transfer information properly, but this is not enforced in any way. It seems better (to the proponents of the actor model, anyway) to adopt a communication primitive that includes a communication protocol.

Latency and timing issues make it awkward to use shared variables in an asynchronous environment. Therefore, a more abstract communication model is preferred.

Synchronous vs. asynchronous communication

Communication is asynchronous, without any guaranteed arrival order.

Rationale: Advantages might be cited for synchronous communication, for example that the sender knows when its communication has been received.

On the other hand, asynchronous communication has several advantages. The main one is that it is a more realistic assumption in practice.

Message order: if actor A sends a sequence of communications to actor B, then B may not receive them in the same order that A sent them. (Intuition: think of email from SF to London, with different messages routed in different ways; delays may also depend on message size.) However, it is possible for A to tag each message with a sequence number, so that B may rearrange messages into the correct order.
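To make the tagging idea concrete, here is a Python sketch of the receiving end (the class and field names are ours): B buffers out-of-order arrivals and releases messages in A's sending order.

    class OrderedReceiver:
        def __init__(self):
            self.expected = 0
            self.buffer = {}     # sequence number -> message

        def receive(self, seq, message):
            # Arrivals may come in any order; deliver in sequence-number order.
            self.buffer[seq] = message
            delivered = []
            while self.expected in self.buffer:
                delivered.append(self.buffer.pop(self.expected))
                self.expected += 1
            return delivered     # possibly empty, if a gap remains

The exercise below asks for the same idea expressed as a pair of actors.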

Exercise: Write actors A and B that implement the send and receive portions of an order-preserving communication channel. (Agha Sec 6.3)

Buffered communication

Rationale: It's hard to imagine a reasonable system built on asynchronous, unbuffered communication. Too much would get lost.

               synchronous      asynchronous
            ---------------------------------
            |               |               |
 buffered   |     OK??      |    Actors     |
 commun.    |               |               |
            |               |               |
            |-------------------------------|
            |               |               |
 unbuffered |   CCS, CSP,   |   Useless??   |
 commun.    |  Ada, Occam?  |               |
            |               |               |
            ---------------------------------
           
Synchronous communication does not require synchronous computation. Ada, for example, has asynchronous computation of independent processes, with a synchronous "rendezvous" communication. (Processes synchronize for the purpose of communication.)

These models are all abstractions built on more complicated underlying protocols that generally involve transmission, acknowledgments, retransmission, and so on.

Fairness

There are two fairness assumptions, one for the mail system and one for computation steps of active actors:
  1. Mail delivery: every communication sent is eventually delivered to its destination.
  2. Scheduling: every actor that has a computation step to perform eventually performs it.

Rationale: Eventual mail delivery is a convenient programming assumption. Since the mail system is implicit, rather than an explicit part of actor programs, it seems necessary to make some assumptions about its behavior. It is not entirely feasible for an implementation to satisfy this assumption, however. For example, suppose a stack actor (C in the picture below) receives push and enqueue messages, where push can be handled by the stack but enqueue cannot. Then it seems that the enqueue messages must just pile up, in case the actor eventually becomes a queue. This leads to arbitrarily large message pools.


       --------
       |      |
       |  A   |-----| mail sys |-----+
       |      |                      |     ----------     --------
       --------                      +---> |  mail  |     |      |
                                           |  pool  |---->|  C   |
       --------                      +---> |        |     |      |
       |      |                      |     ----------     --------
       |  B   |-----| mail sys |-----+
       |      |
       --------

The general point of view here is like assuming an infinite run-time stack in the semantics of a procedural language. This cannot be provided in practice, but the "first pass" in establishing program correctness would be to show that a program computes the correct result, assuming an infinite stack. If the run time stack overflows, then a "correct" program may not terminate correctly. However, since stack size is dependent on the implementation, this is a useful idealization.


Some reasons for assuming scheduler fairness have to do with actor system composition and equivalence, discussed below.

We can compare several forms of fairness: weak fairness (an action that remains continuously enabled eventually occurs), strong fairness (an action that is enabled infinitely often eventually occurs), bounded-delay delivery, and real-time delivery guarantees. For message delivery and progress of actor computation, only weak fairness is assumed.

The last two involve some notion of global time.

Actor Programs

A program consists of a collection of behavior definitions together with commands that create initial actors and send them initial communications. Behaviors may be defined as a function of "instantiation parameters," which are called acquaintances.

Example 1

We can get some intuition for actor programs by looking at a stack node actor.
a stack-node with acquaintances content and link
    if operation requested is pop(customer) and content != nil then
         become forwarder to link
         send content to customer
    if operation requested is push(new-content) then
         let P = new stack-node with current acquaintances
         become stack-node with acquaintances new-content and P
Discussion in class
We discussed the behavior of stacks implemented in this way and the issue of "delivered" messages. A pop operation changes the stack as follows:
      ---------      ---------      ---------
 --->| 3  |  --|--->| 4  |  --|--->|    |    |
      ---------      ---------      ---------

BECOMES

      ----------     -------      ---------
 --->|forwarder|--->| 4 | --|--->|    |    |
      ----------     -------      ---------
One implementation issue is how the "forwarder-to" node is implemented. Perhaps it sits there until garbage collection, then is eliminated (?).
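Here is a sequential Python sketch of the stack-node script, with replacement behavior (including becoming a forwarder) represented by mutating the node. The names are ours, and message delivery here is synchronous, unlike a real actor system.

    class StackNode:
        def __init__(self, content=None, link=None):
            self.content, self.link = content, link
            self.forward_to = None       # set when this node becomes a forwarder

        def push(self, new_content):
            if self.forward_to:          # a forwarder passes communications along
                return self.forward_to.push(new_content)
            copy = StackNode(self.content, self.link)    # node with current acquaintances
            self.content, self.link = new_content, copy  # become the new top

        def pop(self, customer):
            if self.forward_to:
                return self.forward_to.pop(customer)
            if self.content is not None:
                content, link = self.content, self.link
                self.forward_to = link   # become forwarder to link
                customer(content)        # send content to customer

    s = StackNode()
    s.push(3); s.push(4)
    s.pop(print)                         # prints 4, leaving a forwarder behind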

A more general issue is the behavior of concurrent stacks. If concurrent processes push and pop from the same stack, then since the order of messages is indeterminate, there are no exact guarantees about the order of data received back from pop operations. The situation with Actors is more complicated than in other languages, since even with one actor doing all of the push and pop operations, the order of transmission is indeterminate. Therefore, it is possible to get the following sequence of communications:

   
        Actor A       Stack input pool                     Stack actor
      
        push(3)
        push(4)
        push(5)
        pop
                     push(3), push(4), push(5), pop

                                                           sees pop message
                                                           sends "empty stack"
                                                           to actor A


If actor A hasn't been counting how many things it has placed on the stack (and why should it; that's what a stack is supposed to do), then actor A cannot draw any conclusions from the "empty stack" message it receives. If all of the numbers on the stack must be processed in some way, then the actor must keep issuing "pop" messages to make sure that push messages still sitting somewhere in the mail system are eventually taken care of. This example shows some of the reasons why synchronous communication simplifies some kinds of programming, or at least why it might be desirable to have some guarantees of message order. (This can be handled in actor systems using transmission numbers, as discussed above.)

----------- comments from Amit Patel --------------

The picture of "pushing" onto a stack by moving the current top of the stack to the new stack actor doesn't seem too strange. In C/C++ or ML, you'd have to do the same thing if you wanted all pointers to the stack to remain valid. The way it's usually done in C/C++ or ML is to have an extra Stack object at the top that doesn't contain any stack values. Then there's a list of StackValues. When a StackValue is added or removed, all pointers to the Stack remain valid.

The same thing should work in the Actor model:

	[Stack] ---> [5] ---> [3]
(pop)  ----------------------------
	[Stack] ------------> [3]
(push) ----------------------------
        [Stack] ---> [1] ---> [3]
(pop)  ----------------------------
        [Stack] ------------> [3]
(pop)  ----------------------------
        [Stack]->nil
The extra element would eliminate the moving around of values during a push, and also the strange empty stack with contents and next set to nil. So far, I'm just convinced that their example is bad. The thing about pushing then popping could be solved by having every operation produce a reply (for synchronization purposes). If you wanted to make sure the element was pushed before you try to pop, you'd wait for the push reply. If not, then you push, pop, then wait for the two replies. If two different actors are doing the push and pop, then without any synchronization, they get what they deserve. :*)

-------------------------
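A runnable Python sketch of this reply-per-operation idea, using a queue-based toy stack process in place of an actor (the names and protocol are ours):

    import queue, threading

    def stack_server(inbox):
        items = []
        while True:
            op, arg, reply = inbox.get()   # every request carries a reply queue
            if op == "push":
                items.append(arg)
                reply.put("ok")            # reply used only for synchronization
            elif op == "pop":
                reply.put(items.pop() if items else "empty stack")

    inbox = queue.Queue()
    threading.Thread(target=stack_server, args=(inbox,), daemon=True).start()

    ack = queue.Queue()
    inbox.put(("push", 3, ack))
    ack.get()                              # wait for the push reply ...
    result = queue.Queue()
    inbox.put(("pop", None, result))
    print(result.get())                    # ... so this pop is sure to see the 3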

Example 2

In his book on Actors, Agha mentions the problem of adding a list of numbers in parallel. If we have 2^n numbers
    a_0, a_1, ...,  a_{2^n -1}
Then the sum may be computed in time n (the logarithm of the length of the list) by evaluating all nodes at the same level of the following tree in parallel:
     a_0  a_1  a_2  a_3  a_4  a_5  a_6  a_7
       \  /      \  /      \  /      \  /
        +         +         +         +
         \       /           \       /
          \     /             \     /
           \   /               \   /
            \ /                 \ /
             +                   +
              \                 /
                \             /
                  \         /
                    \     /
                      \ /
                       +

  (drawn for 2^3 = 8 numbers; in general the leaves are a_0, ..., a_{2^n - 1}
   and the tree has depth n)
The problem is to write a program that does this efficiently and cleanly. Agha cites an article by van Emden and Filho (in a book, I believe, called Logic Programming, Academic Press, 1982) saying that this cannot be done by any fixed network of processes.

Occam, a language based on CSP, apparently allows only a fixed number of processes. (For each program, there is a fixed upper bound on the number of processes defined by the program, independent of the input.) Therefore, it seems (?) impossible to compute the sum of 2^n numbers in time n, for arbitrarily large n, by a single Occam program. (Anyone have more detailed evidence or counter-evidence??)

PRAM algorithm: The Parallel Random Access Machine (PRAM) model is a simple and partly unrealistic framework for describing synchronous parallel computation with shared variables. We assume that a PRAM has as many processors as we need, then count the number of processors used. There is no specific programming language associated with the PRAM model, but there are various conventions for expressing the behavior of a machine.

We can write an addition algorithm by saying what each processor does at step k. We assume that the input is stored in registers 0, ... ,2^n -1.

Step 1: Processor i 
           adds a_{2i} + a_{2i +1}
           stores result in register a_{2i}
           
Step k: Processor i 
           adds a_{i 2^k} + a_{i 2^k + 2^{k-1}}
           stores result in register a_{i 2^k}
Halt after n steps
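To check the indexing, here is a small sequential Python simulation of this algorithm (a sketch: on a real PRAM the inner loop runs in parallel, one iteration per processor):

    def pram_sum(a):
        a = list(a)                     # registers 0 .. 2^n - 1
        n = len(a).bit_length() - 1     # len(a) == 2**n
        for k in range(1, n + 1):
            # Step k: processor i, for i = 0 .. 2^(n-k) - 1
            for i in range(2 ** (n - k)):
                a[i * 2**k] += a[i * 2**k + 2**(k - 1)]
        return a[0]                     # the total, after n steps

    print(pram_sum([1, 2, 3, 4, 5, 6, 7, 8]))   # 36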
Exercise: Can you write this algorithm using a sequential Algol-like language, extended with only cobegin ... coend? I think maybe so, using a recursive procedure that calls itself inside the cobegin block. Can you do this with only iteration and cobegin ... coend?

Note on parallel algorithms: it is traditional not to count the cost of reading the input as part of the computation time. There are two reasons. The first is that otherwise we could never consider running times that are less than linear in the length of the input; this would obscure the fact that parallel algorithms like the one above are substantially faster than sequential algorithms (logarithmic time instead of linear time). The second reason is that often a program consists of a series of algorithms. In this case, it is perfectly reasonable for one parallel algorithm to produce, as input to a second algorithm, a list of length 2^n in time proportional to n rather than 2^n. (This could be a parallel input procedure, or some other algorithm.)

Actor algorithm: My first guess was to follow the PRAM algorithm, using actors that each have data and neighbor fields. This algorithm would have two phases:

Initialization:
     Iterate through input array, assigning data to 2^n actors.
Iteration:
     Each actor repeats the following steps  
         data :=  data + neighbor.data
         neighbor :=  neighbor.neighbor
However, this has two problems: the initialization takes linear time, and the iteration phase isn't demand-driven the way actor computation is supposed to be. There are no obvious communication events to trigger each addition.

A reasonable actor approach seems to involve recursively creating a binary tree of actors, one for each node in the evaluation tree drawn above, then adding data and passing values back up.

create fork-node with acquaintance external receiver
    send fork(0, 2^n - 1) to fork-node

a fork-node with acquaintance parent
    if operation requested is fork(i,j) then
        if (i+1)<j then
            create fork-node with acquaintance this actor
                send fork(i, (i+j)/2) to fork-node
            create fork-node with acquaintance this actor
                send fork((i+j)/2 + 1, j) to fork-node
            become receive-first with acquaintance parent
        else
            if i=j then send sum(a_i) to parent
                   else send sum(a_i+a_j) to parent


a receive-first with acquaintance parent
    if operation requested is sum(k) then
         become receive-second with acquaintances k and parent

a receive-second with acquaintances k and parent
    if operation requested is sum(j) then
          send sum(j+k) to parent
This example illustrates a general approach to simulating recursive calls with actors. More specifically, consider the recursive function
   fun sum(i,j) =
       if (i+1)<j
          then sum(i, (i+j)/2) + sum((i+j)/2 + 1, j)
          else
            if i=j then a_i
                   else a_i + a_j
If we execute the actor program above sequentially, with the computation initiated by the first "create fork-node ..." completing before the second is started, we get a pattern of actor creation that exactly follows the pattern of creation of activation records on a stack during execution of the recursive procedure "sum". Conversely, if we execute the recursive function in parallel, the computation is essentially the same as that defined by the actor program.
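The pattern can also be simulated directly in Python, with a thread per fork node and queues standing in for mail addresses (a sketch with names of our choosing, not an actor implementation):

    import queue, threading

    def fork_node(a, i, j, parent):
        if i + 1 < j:
            m = (i + j) // 2
            replies = queue.Queue()     # plays the receive-first/receive-second roles
            for lo, hi in ((i, m), (m + 1, j)):
                threading.Thread(target=fork_node,
                                 args=(a, lo, hi, replies)).start()
            parent.put(replies.get() + replies.get())
        elif i == j:
            parent.put(a[i])
        else:
            parent.put(a[i] + a[j])

    def tree_sum(a):
        root = queue.Queue()            # the external receiver
        threading.Thread(target=fork_node,
                         args=(a, 0, len(a) - 1, root)).start()
        return root.get()

    print(tree_sum(list(range(8))))     # 28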

Syntax of kernel language Act


<act program>           ::=  <behavior definition>* (<command>*)

<behavior definition>   ::=  (define (id {(with identifier <pattern>)}*)
                                  <communication handler>*)
                                
<communication handler> ::=  (Is-communication <pattern> do <command>*)                               

<command>        ::=    <let command>   |  <conditional command>  
                   |    <send command>  |  <become command>
             
<let command>    ::=    (let (<let binding>*) do <command>*)

<conditional command>  
           ::=    (if <expression>  (then do <command>*) (else do <command>*))

<send command>   ::=    (send <expression>  <expression>)

<become command> ::=    (become <expression>)
(* Need to explain these *)

Fairness, Equivalence, Composition

Observational equivalence: Two actor systems are observationally equivalent if the same observable events (messages to external receivers) are produced when the two are inserted in any larger system.

The Brock-Ackerman anomaly shows that observational equivalence is not the same as equivalence of the mappings from sequences of input communication events to sequences of output communication events.

Consider an actor system composed of an actor A and an actor B that receives one communication and then proceeds indefinitely without halting. This system is observationally equivalent to A alone if we assume fairness, but not otherwise.

We can contrast this to process algebras like CCS, where no fairness is assumed.

Composition of actor systems may be defined using configurations that consist of a set of actors, a set of messages (tasks) en route, a set of receptionists, and a set of external receivers. Note: the composition of two actors cannot itself be an actor, because an individual actor is sequential.

Can we define an "actor algebra"? Can we interpret process algebra connectives as operations on actors? How would we define, say, parallel composition, choice, or restriction for actor systems?

For composition, remember that the interface to an actor system may change dynamically. (** These questions might be a good starting point for a term project **)

Garbage Collection?: How is "become forwarder to ..." implemented? If each actor has a unique mail address, then the actor executing this statement should immediately become garbage.

More generally, how do you detect garbage or termination, for that matter, in an asynchronous system like this?

Conclusions ?

Replacement behavior: A principal feature of Actors is the granularity of state change: an actor only changes state by becoming another behavior after it completes processing an input. This is a very clean idea. On the other hand, it leads to a large number of different behaviors in a single system. In some cases, it might be simpler to write "larger" actor definitions with explicit internal state.

It seems advantageous, in an Actor-based programming language, to have some module construct for organizing behavior definitions into independent sets.

Communication: Asynchronous buffered communication is useful in many cases, but the indeterminate order of message arrival might be inconvenient in a number of programs. For example, consider an actor traversing a graph, using another stack actor to keep track of graph nodes to revisit. If there is no guarantee on order of messages, the algorithm will not necessarily traverse the graph in depth-first order. More importantly, it seems, the graph search algorithm cannot be sure it has explored the whole graph, since a message from the stack actor indicating that it is empty might only indicate that some "push" message is still in the mail system somewhere and has not yet been delivered to the stack. We might like some guarantee of message order, or a synchronous form of communication, for such programs.

In a theoretical sense, ordered channels and synchronous communication are definable using actors, but there are also pragmatic advantages to adopting these as system-provided primitives. Specifically, if the system can determine that an actor is waiting on delivery of a communication, it can route the message with higher priority. This might cause the entire actor system to run more efficiently.

There is no way to cancel messages. If a rectangle is moved on the screen, say, then it would be appropriate to cancel pending messages that ask for the rectangle to be redrawn at its previous position.

There is a very weak notion of class, in the form of behavior definitions. Objects can change their interfaces, however, by becoming arbitrarily different behaviors. This makes static analysis, optimization of the mail system, etc., potentially difficult.

There is no inheritance, nor other standard object-oriented features.