Chapter 1 Concurrent programming

This part of the manual is a tutorial introduction to the join-calculus. This chapter presents small, local, examples. Chapter 2 deals with the distributed features.

1.1 Conventions

Examples are given as join-calculus source, followed by the output of the compiler when enabled to print types. For instance, here is the definition of a simple top-level value binding:

# let x = 1
# ;;
val x : int

An example may span over several programs chunks. Here is another binding:

# let y = x+1
# ;;
val y : int

In the end, the output generated by the execution of the example is shown:

# spawn echo(y)
# ;;
-> 2

Therefore, to execute the example above, it suffices to concatenate the three source chunks above in some file a.j (erasing the ``#'' characters), then to compile a.j by the command jcc -i a.j, and finally to run the produced code by the command ./a.out. (Option -i enables the output of inferred types).

1.2 Basics

Join-calculus programs are made of processes and expressions. Roughly, processes are executed asynchronously and produce no result, whereas expressions are evaluated synchronously and their evaluation produces values. Processes communicate by sending messages on channels or port names. Messages carried by channels are made of zero or more values and channels are values themselves. By contrast with other process calculi (such as the pi-calculus and its derived programming language Pict), channels and the processes that listen on them are defined by a single language construct. This feature allows considering (and implementing) channels as functions, when channels are used as functions.

Join-calculus programs are first organized as a list of top-level statements. A Top-level statements is a binding (such as let x = 1 ;;), a process spawnings or an expression execution (see below).

1.2.1 Processes

Processes are the core syntactic class in the join-calculus. The most simple process sends a message on a channel. For instance, the standard library provides the channel echo, sending an integer i on echo prints i on the console. At top-level, processes are introduced by the keyword ``spawn''. Top-level phrases are terminated by an optional ;;.

# spawn echo(1)
# ;;
# 
# spawn echo(2)
# ;;
-> 12

Processes introduced by ``spawn'' are executed concurrently. The program above may either echo 1 then 2, or echo 2 then 1. Thus, the output above may be 12 or 21, depending on the implementation.

Concurrent execution also occurs inside processes, using the parallel composition operator ``|''. This provides a more concise, semantically equivalent, alternative to the previous example:

# spawn echo(1) | echo(2)
# ;;
-> 21

Composite processes also include conditionals (if's) and local binding (let...in's). Process grouping is done by using curly braces ``{'' and ``}''.

# spawn
#   let x = 1 in
#   {let y = x+1 in echo(y) | echo(y+1)} | echo(x)
# ;;
-> 231

Once again, the program above may echo the integers 1, 2 and 3 in any order.

1.2.2 Expressions

The other important syntactic class of the join-calculus is the class of expressions. By contrast with processes, expressions evaluate to some results.

At top-level, expressions are introduced by the keyword ``do''. Expressions also occur in the right-hand side of value bindings or as arguments to message sending. Apart from variables and constants, such as integers, the most simple expression sends some values on a synchronous channel. Synchronous channels return an answer, which is made of zero or more results. For instance, the standard library provides the synchronous channel print. When given a message consisting of one integer i, print outputs i on the console and returns an empty result, consisting of zero values, to its sender.

# let x = 1
# ;;
# do print(x)
# ;;
# do print(x+1)
# ;;
val x : int
-> 12

In the program above 1, x and x+1 are expressions whose evaluation returns a single value (here an integer). The expressions print(x) and print(x+1) return empty results. This makes sense with respect to synchronization: as top-level phrases are evaluated by following textual ordering, the program above first outputs 1 and then outputs 2.

Composite expressions are made of conditionals, value bindings and sequence. Thus one may write:

# do
#   let x = 1 in
#   print(x) ; print(x+1)
# ;;
-> 12

Sequences may also occur inside processes. The general form of a sequence inside a process is expression ; process , where expression must return an empty result. As expression can itself be a sequence, composite sequences appear inside processes naturally. Thus, one may write:

# spawn
#   print(1) ; print(2) ; echo(3)
# ;;
-> 123

A sequence may terminate with the empty process that does nothing and is denoted by ``'', the empty sequence of characters. Thus, an alternative to the previous example is as follows:

# spawn
#   print(1) ; print(2) ; print(3) ;
# ;;
-> 123

Concrete syntax for processes and expressions are purposely similar, when not the same. A noticeable exception is grouping, which is expressed by curly braces ``{'' and ``}'' in the case of processes and by ordinary parenthesis ``('' and ``)'' in the case of expressions. Due to parser limitations, grouping is mandatory in the branches of an if process, i.e., one must write if expression then { process ₁ } else { process ₂ }.

1.2.3 Channels

Channels or port names are the main values of the join-calculus. The standard library already provides many port names, such as echo and print.

There are two important categories of port names: asynchronous and synchronous port names. Synchronous names (such as print) return values, whereas asynchronous channels (such as echo) do not. Therefore, message sending on synchronous channels occur inside expressions, whereas the message sending on asynchronous channels occur inside processes. The type-checker flags an error whenever a channel is used in the wrong context.

# do echo(1)
# ;;
File "ex11.j", line 1, characters 3-10:
Type error: This is not an expression with an empty result

jcc: Compilation failed for ex11.j

Users can create new channels with a new kind of let binding, which is not to be confused with ordinary value binding. The right hand-side of the definition of a channel a is a process that will be spawned whenever some message is sent on a. Additionally the message contents are bound to formal parameters. For instance, here is the definition of a stuttering echo channel:

# let echo_twice(x) = echo(x) | echo(x)
# ;;
val echo_twice : <int>

The new channel echo_twice has type <int>, this means that it accepts messages made of one integer. Sending an integer i on echo_twice fires an instance of the guarded process echo(x) | echo(x), where x is replaced by i. Such message sending is asynchronous in the sense that it is not possible to know when console output actually takes place.

The definition of a stuttering print is as follows:

# let print_twice(x) = print(x) ; print(x) ; reply
# ;;
val print_twice : <int> -> <>

As exposed by its type, print_twice is a synchronous port name that takes one integer as argument and returns an empty result. The type checker infers that print_twice is synchronous, thanks to the new reply construct whose semantics is to send back some (here zero) values as result. Message sending on print_twice is synchronous, in the sense that one knows that console output has occurred at the time when print_twice answers.

# spawn echo_twice(1)
# ;;
# 
# do print_twice(2)
# ;;
# 
# spawn echo_twice(3)
# ;;
-> 221133

Observe that, since processes execute concurrently, 221331 would also be a perfectly valid output. Printing both 2's before any 3 is the only constraint here, by the synchronous character of print_twice.

Channels are polyadic with respect to both arguments and results (when they exist). For instance, print accepts one argument and returns an empty result. Another example is the standard library channel +, that accepts two arguments and returns one result. The following channel f has arity two, both for argument and result, as shown by its type.

# let f(x,y) = reply x+y,y-x
# ;;
val f : <int * int> -> <int * int>

The type for f clearly shows its arities. Polyadic results are exploited by using polyadic value bindings. For instance the following program should print 3 on the console:

# let x,y = f(1,2)
# ;;
# 
# do print(x)
# ;;
val x : int
val y : int
-> 3

Synchronous names can be used to support a functional programming style. A traditional example is the Fibonacci function.

# let fib(n) =
#   if n <= 1 then {reply 1}
#   else {reply fib(n-1) + fib(n-2)}
# ;;
# 
# do print(fib(10))
# ;;
val fib : <int> -> <int>
-> 89

Note that, by contrast with value bindings, channel definitions are always potentially recursive.

Port names are first-class values in the join-calculus. They can be sent as messages on other port names and returned as results. As a result, higher order ``functions'' can be written.

# let twice(f) =
#   let r(x) = reply f(f(x)) in
#   reply r
# ;;
val twice : <<'a> -> <'a>> -> <<'a> -> <'a>>

The type for twice is polymorphic: it includes a type variable 'a that can be replaced by any type. Thus twice is a synchronous channel that takes a synchronous channel of type <'a> -> <'a> as argument and returns one result of the same type.

For instance, 'a can be the type of integers or the type of strings (^ is string concatenation):

# let succ(x) = reply x+1
# ;;
# 
# let double(s) = reply s^s
# ;;
# 
# do
#   let f = twice(succ)
#   and g = twice(double) in
#   print(f(0)) ; print_string(g("X"))
# ;;
val succ : <int> -> <int>

val double : <string> -> <string>
-> 2XXXX

1.2.4 Threads

Threads are not part of the join-calculus language, but they are part of the implementation. Threads are some kind of execution units at the implementation level. Threads are created, may suspend then resume, and finally die. Threads match the intuition of a ``direct'' causality. For instance, all the printing actions in a sequence are to be executed one after another and clearly belong to an unique thread. By contrast, the parallel composition operator ``|'' separates two threads that execute independently.

More precisely, the following program creates three threads:

# spawn
#   {print(1) ; print(2) ; print(3) ;} |
#   {{print(4) ;} |
#     print(5) ; print(6) ;}
# ;;
-> 561234

The output of the program is a mix of the output of its three threads: 123, 4 and 56. An output such as 12, then 4, then 356 would reveal that the thread printing 1, 2 and 3 has been preempted before printing 3.

Sending messages on channels ultimately fires a new process and should thus create a new thread. However, a new thread is not always created. For instance, compare the following definitions that create infinitely many threads and one, never dying, thread:

# let rabbit() = print_string("+") ; rabbit() | rabbit ()
# ;;
# 
# let forever() = print_string("-") ; forever() ; forever() ; reply
# ;;
val rabbit : <>

val forever : <> -> <>

Sending a message on asynchronous rabbit, fires a small thread that terminates spawning two new instances of rabbit, thus creating two other thread. By contrast, sending a message on synchronous forever blocks the sender thread until the thread fired by the message completes. The implementation takes advantage of this: it does not fire a new thread and executes the process guarded by forever() on the sender thread. Here, message sending results in an ordinary function call.

This implementation behavior is exposed by the following program:

# spawn
#   {rabbit()} |
#   {print_newline () ; failwith("over !!!")} |
#   {forever() ;}
# ;;
-> -------------------
-> jcrun a.out: failwith over !!! (from runtime.ml)

The program concurrently starts three threads: first, a failwith(...) thread that flushes pending output and kills the system; then, two crazy threads, rabbit() and forever().

Considering processes only, the program may terminate after an unspecified number of -'s and +'s has been printed. Considering threads, more -'s than +'s should appear on the console, since all -'s are printed by the same thread, whereas +'s are printed by different threads. All these printing threads compete for scheduling with the lethal thread and the forever() thread has to be preempted to stop outputting.

As a conclusion, informal reasoning about threads helps predicting program output, taking the implementation into account. What is predicted is likely output and not correct output.

1.3 Modules

The current implementation of the join-calculus supports a simple module system. This provides a clean access to library routines. Additionally, users may structure their programs in several modules, which can be written and compiled separately.

1.3.1 Names from the library

The standard library provides a lot of pre-defined names. The library is made of modules. Inside programs the name x from the module mod is accessed as a qualified name mod.x. For instance, the library module ml gives access to basic operations from the implementation, including operations on floating point numbers (such as conversion, output functions and the exponentiation exp).

# let one = ml.float_of_string("1.0")
# ;;
# 
# do ml.print_float(ml.exp(one))
# ;;
val one : ml.float
-> 2.71828182846

An alternative to qualifying names from modules is ``opening'' modules. Opened modules are searched by the compiler when it discovers a name that is not bound locally.

# open ml
# ;;
# 
# let one = float_of_string("1.0")
# ;;
# 
# do print_float(exp(one))
# ;;
val one : ml.float
-> 2.71828182846

The compiler starts with module sys opened, so that the primitives print, print_string, +, ...are in fact sys.print, sys.print_string, sys.+, ...

Primitives can be seen as (synchronous) port names in one important aspect: users send messages to them and get results from them. However, there is one important difference: primitives are not first class values. As such, primitives cannot be used on their own, they must be applied to their arguments.

# let duplicate_arg(f,x) = reply f(x,x)
# ;;
# 
# do print(duplicate_arg(+,2))
# ;;
File "ex25.j", line 4, characters 23-24:
this is an external primitive: sys.+
jcc: Compilation failed for ex25.j

Primitive are not first class values, mainly for implementation reasons : the execution of primitive is compiled as an instruction and there is no port name here. A simple workaround is to create a port name that simply calls the primitive.

# let duplicate_arg(f,x) = reply f(x,x)
# ;;
# 
# let plus(x,y) = reply x+y
# ;;
# 
# do print(duplicate_arg(plus,2))
# ;;
val duplicate_arg : <<'a * 'a> -> <'b> * 'a> -> <'b>

val plus : <int * int> -> <int>
-> 4

1.3.2 User defined modules

Modules are not restricted to the standard library. Users can create their own modules, compile them separately and link them together into an executable program. For instance, users may write a module stutter, that exports two channels echo and print. Ideally, users first specify the names exported by module stutter and their types, by writing an interface file stutter.ji.

# val echo : <int>
# val print : <int> -> <>

The interface stutter.ji is compiled by issuing the command jcc stutter.ji. This produces an object interface file stutter.jio.

Then, the implementation file stutter.j contains the actual definitions for stutter.echo and stutter.print.

# let echo(x) = async.echo(x) | async.echo(x)
# 
# let print(x) = sys.print(x) ; sys.print(x) ; reply

The implementation file stutter.j is compiled by issuing the command jcc -c stutter.j (-c is the compile-only, do-not-link option). This produces an object implementation file stutter.jo.

Now that the module stutter is properly compiled, some other implementation file user.j can use it.

# do stutter.print(1)
# ;;
# 
# spawn stutter.echo(2) | stutter.echo(3)
# ;;

The implementation file user.j can be compiled into user.jo by issuing the command jcc -c user.j. An executable a.out is produced by the command jcc stutter.jo user.jo that links the modules stutter and user together. Alternatively, a.out can be produced in one step by the command jcc stutter.jo user.j.

Running a.out may produce the following output:

-> 112233

See sections 3.9 and 7.3 for a more thorough description of modules.

1.4 Join-patterns

Join patterns significantly extend port name definitions. A join-pattern defines several ports simultaneously and specifies a synchronization pattern between these co-defined ports. For instance, the following source fragment defines two synchronizing port names fruit and cake:

# let fruit(f) | cake(c) =
#   {print_string(f^" "^c) ; print_newline() ; }
# ;;
val fruit : <string>
val cake : <string>

To trigger the guarded process {print_string(f^" "^c) ; print_newline() ;}, messages must be sent on both fruit and cake.

# spawn fruit("apple") | cake("pie")
# ;;
-> apple pie

The parallel composition operator ``|'' appears both in join-patterns and in processes. This highlights the kind of synchronization that the pattern matches.

Join-definitions, such as the one for fruit and cake, provide a simple mean to express non-determinism.

# spawn fruit("apple") | fruit("raspberry") | cake("pie") | cake("crumble")
# ;;
-> raspberry pie
-> apple crumble

Two cake names must appear on the console and both possible combinations of different fruits and cake types are correct.

Composite join-definitions specify several synchronization patterns.

# let apple() | pie() = print_string("apple pie") ;
# and raspberry() | pie() = print_string("raspberry pie") ;
val apple : <>
val raspberry : <>
val pie : <>

Observe that only three names apple, raspberry and pie are defined. Thus, pie potentially takes part in two synchronizations.

Again, internal choice is performed when only one invocation of pie is present:

# spawn apple() | raspberry() | pie()
# ;;
-> apple pie

Join-patterns are the key programming paradigm in the join-calculus. They allow the encoding of many data structures, which can be used concurrently. For instance, the following source fragment defines a counter:

# let count(n) | inc() = count(n+1) | reply to inc
# and count(n) | get() = count(n) | reply n to get
# ;;
# 
# spawn count(0)
# ;;
val inc : <> -> <>
val count : <int>
val get : <> -> <int>

This definition calls for two remarks. First, when a join-pattern defines several names, some of which are synchronous, the reply construct must now specify the name to which it replies. Hence the new reply ... to name construct.

Second, the usage of the name count above is a typical way of ensuring mutual exclusion. For the moment, assume that there is at most one active invocation on count . When one invocation is active, count holds the counter value as a message and the counter is ready to be incremented or examined. Otherwise, some operation is being performed on the counter and pending operations are postponed until the operation being performed has left the counter in a consistent state. As a consequence, the counter may be used consistently by several threads.

# spawn {inc () ; inc () ;} | {inc() ;}
# ;;
# 
# let wait() =
#   let x = get() in
#   if x < 3 then {wait()} else {
#     print_string("three is enough !!!") ; print_newline() ;
#   }
# ;;
# 
# spawn wait()
# ;;
val wait : <>
-> three is enough !!!

Ensuring the correct counter behavior in the example above requires some programming discipline: only one initial invocation on count has to be made. If there are more than one simultaneous invocations on count, then mutual exclusion is lost. If there is no initial invocation on count, then the counter will not work at all. This can be avoided by making the count, inc and get names local to a create_counter definition and then by exporting inc and get while hiding count, taking advantage of lexical scoping rules.

# let create_counter() =
#   let count(n) | inc0() = count(n+1) | reply to inc0
#   and count(n) | get0() = count(n) | reply n to get0 in
#   count(0) | reply inc0,get0
# ;;
# 
# let inc,get = create_counter()
# ;;
val create_counter : <> -> <<> -> <> * <> -> <int>>

val inc : <> -> <>
val get : <> -> <int>

This programming style is reminiscent of ``object-oriented'' programming: a counter is a thing called an object, it has some internal state (count and its argument) and exports some methods to the external world (here, inc and get). The constructor create_counter creates a new object, initializes its internal state and returns the exported methods. As a consequence, several counters may now coexist.

1.5 Control structures

Join-pattern synchronization expresses many programming paradigms, either concurrent or sequential.

1.5.1 Control structures for concurrency

Locks

Join-pattern synchronization can be used to emulate simple locks.

# let new_lock() =
#   let free() | lock() = reply to lock
#   and unlock() = free() | reply to unlock in
#   free() | reply lock,unlock
# ;;
val new_lock : <> -> <<> -> <> * <> -> <>>

Threads try to acquire the lock by performing a synchronous call on channel lock. Due to the definition of lock(), this consumes the name free and only one thread can get a response at a time. Another thread that attempts to acquire the lock is blocked, until the thread that got the lock releases it by the synchronous call unlock that fires another invocation of free.

To give an example of lock usage, we introduce channels that output their string arguments several times:

# 
# let double(p) =
#   let r(s) = p(s) ; p(s) ; reply in
#   reply r
# ;;
# 
# let print_port(s) = print_string(s) ; reply
# ;;
# 
# let print16 = double(double(double(double(print_port))))
# ;;
val double : <<'a> -> <>> -> <<'a> -> <>>

val print_port : <string> -> <>

val print16 : <string> -> <>

Now consider two threads, one printing -'s, the other printing +'s.

# spawn {print16("-") ;} | {print16("+") ;}
-> ++++++----------------++++++++++

As threads execute concurrently, their outputs may mix, depending upon scheduling.

Using a lock to delimit a critical section, -'s and +'s do not mix anymore.

# let lock,unlock = new_lock()
# ;;
# 
# spawn
#   {lock() ; print16("-") ; unlock() ;} |
#   {lock() ; print16("+") ; unlock() ;} 
# ;;
# 
val lock : <> -> <>
val unlock : <> -> <>
-> ++++++++++++++++----------------

Barriers

A barrier is another synchronization mechanism. Basically, barriers define synchronization points in the execution of concurrent threads. Here is a simple barrier that synchronizes two threads only:

# let join1() | join2() = reply to join1 | reply to join2
# ;;
val join1 : <> -> <>
val join2 : <> -> <>

Note that the definition above includes two reply constructs.

The following two threads print ba or ab between matching parenthesis:

# spawn
#   {print_string("(") ; join1() ; print_string("a") ; join1() ;
#    print_string(")") ;} |
# 
#   {join2() ; print_string("b") ; join2() ;}
# ;;
-> (ba)

Bi-directional channels

Bi-directional channels are found in many process calculi, such as the pi-calculus for instance. In the pi-calculus, given a channel c, a value v can be sent asynchronously on c (written c![v]) or received from c and bound to some variable x in some guarded process P (written c?x.P). All processes can both send and receive on the channels they know. By contrast, join-calculus processes that know a channel can only send messages on it. The receiver process is defined at the same time as the channel and is therefore unique. Finally, the scope of a pi-calculus channel name c is defined by the new c in P operator. Such an operator does not exist in the join-calculus, since join-definitions are binding constructs.

Bi-directional channels can be defined in the join calculus as follows:

# let new_pi_channel() =
#   let send(x) | receive() = reply x to receive in
#   reply send,receive
# ;;
val new_pi_channel : <> -> <<'a> * <> -> <'a>>

A pi-calculus channel is implemented by a join definition with two port names. The port name send is asynchronous and is used to send messages on the channel. One of these messages is received by making a synchronous call to the other port name receive. Now let us ``translate'' the following pi-calculus process:

new c,d in c![1] | c![2] | c?x.d![x+x] | d?y. print(y))

We get:

# spawn
#   let sc,rc = new_pi_channel()
#   and sd,rd = new_pi_channel() in
#   sc(1) |  sc(2) |  {let x = rc() in sd(x+x)} |
#   {let y = rd() in print(y) ;}
# ;;
-> 4

Observe that, by contrast with the join-calculus semantics of receptors, the process that performs x+x runs only once.

Synchronous pi-calculus channels are encoded just as easily as asynchronous ones: it suffices to make send synchronous:

# let new_pi_synch_channel() =
#   let send(x) | receive() = reply x to receive | reply to send in
#   reply send,receive
# ;;
val new_pi_synch_channel : <> -> <<'a> -> <> * <> -> <'a>>

1.5.2 Loops

Join-patterns are also useful for expressing various programming control structures. A simple control structure is iteration on some integer interval.

Simple loops

Asynchronous loops can be used when the execution order of iterated actions is irrelevant.

# let loop(a,x) = if x > 0 then {a() | loop(a,x-1)}
# ;;
# 
# let echo_star() = print_string("*") ;
# ;;
# 
# spawn loop(echo_star,5)
# ;;
val loop : <<> * int>

val echo_star : <>
-> *****

When execution order matters, asynchronous loops do not work. Since iterated actions are performed concurrently. Here, a synchronous loop is preferred.

# let loop(a,x) = if x > 0 then {a(x) ; loop(a,x-1) ; reply} else {reply}
# ;;
# 
# let print_int(x) = print(x) ; print_string(" ") ; reply
# ;;
# 
# do loop(print_int,5)
# ;;
val loop : <<int> -> <> * int> -> <>

val print_int : <int> -> <>
-> 5 4 3 2 1

When a synchronous loop produces a result, this result can be computed inside reply constructs and returned. The following example computes the sum of the squares of the integer between 1 and 32:

# let sum(i0,f) =
#   let iter(i) =
#     if i > 0 then {reply f(i)+iter(i-1)} else {reply 0} in
#   reply iter(i0)
# ;;
# 
# let square(x) = reply x*x
# ;;
val sum : <int * <int> -> <int>> -> <int>

val square : <int> -> <int>

# do print(sum(32,square))
# ;;
-> 11440

Port name definitions such as the one for iter above belong to a functional programming style. In particular, the various iterations of the loop body (computing f(i) here) never execute concurrently, since iterations are performed one after the other.

However, since integer addition is associative and commutative, the summing order of the various f(i) does not matters. Thus, asynchronous iteration can be used here. Then, we get a program with much more opportunities for concurrent execution:

# let sum(i0,f) = 
# 
#   let add(dr) | total(r,i) =
#     let r' = r+dr in
#     if i > 1 then {reply total(r',i-1) to total}
#     else  {reply r' to total} in
# 
#   let loop(i) = if i > 0 then {add(f(i)) | loop(i-1)} in
#   loop(i0) | reply total(0,i0)
# ;;
# 
# do print(sum(32,square))
# ;;
val sum : <int * <int> -> <int>> -> <int>
-> 11440

Observe how loop result is accumulated using the synchronous name total. The argument to the reply to sum consists in one call to total. This trick enables the synchronous sum to return its result when the asynchronous loop is over. In fact, replying directly to sum from inside the sub-definition for add(dr) | total(r,i) is rejected by the compiler:

# let sum(i0,f) = 
# 
#   let add(dr) | total(r,i) =
#     let r' = r+dr in
#     if i > 1 then {total(r',i-1)}
#     else  {reply r' to sum} in
# 
#   let loop(i) = if i > 0 then {add(f(i)) | loop(i-1)} in
#   loop(i0)
# ;;
# 
File "ex49.j", line 6, characters 23-26:
continuation outside of its scope: sum
jcc: Compilation failed for ex49.j

The current compiler enforces strong restriction on where to place reply constructs, see section 7.4.

Distributed loops

Sharing a loop between several ``agents'' requires more work. Let us informally define an agent as some computing unit. In this section, an agent is represented by a synchronous channel. In a more realistic setting, different agents would reside on different computers. The agent paradigm then serves to allocate computing resources (see section 2.3.3 in the next chapter).

For instance, here are two agents square1 and square2. The agent square1 modelizes a fast machine, whereas square2 modelizes a slow machine by computing squares in a uneficient way. Additionally, square1 and square2 differ marginally by their console output: square1 outputs a +, while square2 outputs a * when it starts and a - just before answering.

# let square1(i) = print_string("+") ; reply i*i
# ;;
# 
# let square2(i) =
#    print_string("*") ;
#    let total(r) | wait() = reply r to wait in
#    let mult(r,j) =
#      if j <= 0 then {total(r)} else {mult(r+i,j-1)} in
#    mult(0,i) |
#    let r =  wait() in
#    print_string("-") ; reply r
# ;;
val square1 : <int> -> <int>

val square2 : <int> -> <int>

Sharing a loop between several agents is allocating the iterations to be performed to them. The following channel make_sum, returns a register channel and a wait channel. An agent registers by sending its computing channel on register. The final loop result is returned on wait.

# let make_sum(i0) = 
# 
#   let add(dr) | total(i) =
#     if i > 1 then {reply dr+total(i-1) to total}
#     else  {reply dr to total}
#   and  wait() = reply total(i0) to wait in
# 
#   let loop(i) | register(f) =
#     if i > 0 then {add(f(i)) | loop(i-1) | register(f)} in
# 
#   loop(i0) | reply register,wait
# ;;
val make_sum : <int> -> <<<int> -> <int>> * <> -> <int>>

The only difference with the asynchronous sum loop from the previous section resides in the replacement of the definition let loop(i) = ... by the join-pattern definition let loop(i) | register(f) = .... As a consequence, the agents square1 and square2 may now compete for loop iterations, provided two invocations register(square1) and register(square2) are active.

# let register,wait = make_sum(32)
# ;;
# 
# spawn register(square1) | register(square2)
# ;;
# 
# do print(wait())
# ;;
val register : <<int> -> <int>>
val wait : <> -> <int>
-> +*+*+*+*+*+*+*+*+*+*+*+*++**+*+*----------------11440

The distributed loop above is not satisfactory, since it does not take the relative computing speed of square1 and square2 into account while allocating iterations. The add(f(i)) jobs are spawned asynchronously, so that the different iterations performed by a given agent are executed concurrently. As a result, the iteration space is partitioned evenly between square1 and square2, as illustrated by the output above. This leads to a poor load balance, since the fast square1 stands idle at the end of the loop, while the slow square2 is overwhelmed.

A better solution is for an an agent to execute its share of work in sequence rather than concurrently. This is achieved by the slightly modified definition for loop(i) | register(f) below:

# let make_sum(i0) =
# 
#   let add(dr) | total(i) =
#     if i > 1 then {reply dr+total(i-1) to total}
#     else  {reply dr to total} in
# 
#   let  wait() = reply total(i0) in
# 
#   let loop(i) | register(f) =
#     if i > 0 then
#       {loop(i-1) |
#       let r = f(i) in
#       add(r) | register(f)} in
# 
#   loop(i0) | reply register,wait
# ;;
val make_sum : <int> -> <<<int> -> <int>> * <> -> <int>>

In the new definitions, register(f) is launched again only once f(i) is computed. By contrast, loop(i-1) is launched immediately, for another agent to grab the next iteration as soon as possible.

# let register,wait = make_sum(32)
# ;;
# 
# spawn register(square1) | register(square2)
# ;;
# 
# do print(wait())
# ;;
val register : <<int> -> <int>>
val wait : <> -> <int>
-> +*++++++++++++++++++++++++++++++-11440

1.6 Data structures in the join-calculus

1.6.1 Simple data structures

Polymorphic pairs can be encoded quite easily in the join-calculus by taking advantage of port name arity.

# let create(x,y) =
#   let get() = reply x,y in
#   reply get
# ;;
val create : <'a * 'b> -> <<> -> <'a * 'b>>

As exposed by the type above, a pair is a synchronous port name that takes zero arguments and returns two values. The synchronous name create returns such a name when given the values to store in the pair.

The content of a pair can be retrieved by sending a message of arity zero on it:

# let fst(p) = let x,y = p() in reply x
# and snd(p) = let x,y = p() in reply y
# ;;
val fst : <<> -> <'a * 'b>> -> <'a>
val snd : <<> -> <'a * 'b>> -> <'b>

Pairs can now be used in an ``abstract'' fashion, by considering only the constructor create and the two destructors, fst and snd:

# let p = create(1,"coucou")
# ;;
# let x = fst(p) and s = snd(p)
# ;;
# do print(x+string.length(s))
# ;;
val p : <> -> <int * string>

val x : int
val s : string
-> 7

An alternative to encoding data structures in the join-calculus is using data structures from Objective Caml. For instance, the standard library provides a module pair:

# open pair
# ;;
# 
# let p =  create(1,"coucou")
# ;;
# let x = fst(p) and s = snd(p)
# ;;
# do print(x+string.length(s))
# ;;
val p : (int, string) ml.pair

val x : int
val s : string
-> 7

The type of pairs is now ('a,'b) ml.pair, emphasizing the abstract character of pairs. Nothing is known about the concrete structure of pairs, implementation details for pairs being local to the implementation of the standard library module pair. This is in sharp contrast with the type of encoded pairs: <> -> <'a * b'>, which unveils much of the encoding.

1.6.2 Data structure encoding by object-oriented style

A convenient programming style for expressing data structures in the join-calculus is the so-called object-oriented style. An object has an internal, hidden, state and exports some methods, (here some synchronous names) that operate on its internal state. Consider for instance the following encoding for lists:

# let cons (h,t) =
#   let head () = reply h
#   and tail () = reply t
#   and self () = reply false,head,tail in
#   reply self
# and nil () =
#   let  head () = reply failwith ("head of nil")
#   and  tail () = reply failwith ("tail of nil")
#   and self () = reply true,head,tail in
#   reply self
# ;;
val cons : <'a * 'b> -> <<> -> <bool * <> -> <'a> * <> -> <'b>>>
val nil : <> -> <<> -> <bool * <> -> <'a> * <> -> <'b>>>

The internal state of a list cell is an emptiness status (true or false), plus, when appropriate, two subcomponents (h and t). The emptiness status is exposed directly to the external world. The two methods head and tail give access to the subcomponents of a non-empty list cell or fail. Observe how the name self is introduced to pack methods together in the reply to cons and nil. Finally, the types for the results of nil and cons are the same. Hence, both kinds of list cells are values of the same type.

List can now be used by directly retrieving methods:

# let list_concat(l) =
#   let n,h,t = l() in
#   if n then {reply ""}
#   else {
#     reply h()^list_concat(t())
#   }
# ;;
val list_concat :
  <'a> -> <string> where 'a = <> -> <bool * <> -> <string> * <> -> <'a>>

The above type is recursive. This reflects the fact that lists are recursive data structures. More precisely, the type of the cons-cell creator cons was not recursive. However, the recursive type for lists of strings appears naturally when writing a function that traverses lists.

Finally, the following source fragment shows how to create and use string lists:

# let replicate(elem,n) =
#   if n <= 0 then {reply nil()}
#   else {
#    reply cons(elem,replicate(elem,n-1))
#   }
# ;;
# 
# do print_string(list_concat(replicate("X",16)))
# ;;
val replicate :
  <'a * int> -> <'b> where 'b = <> -> <bool * <> -> <'a> * <> -> <'b>>
-> XXXXXXXXXXXXXXXX

1.6.3 Mutable data structures

Object states, represented as join-patterns, can be altered by invoking the appropriate methods. Here is a definition for a reference cell. One method (get) examines the content of the cell, while another (put) alters it.

# let create_ref(y0) =
#   let state(y) | get() = state(y) | reply y to get
#   and state(y) | put(new_y) = state(new_y) | reply to put in
#   state(y0) | reply get,put
# ;;
val create_ref : <'a> -> <<> -> <'a> * <'a> -> <>>

Here, the internal state of a cell is its content, its is stored as a message y on the channel state. Lexical scoping is used to keep the state internal to a given cell.

# let gi,pi = create_ref(0)
# and gs,ps = create_ref("")
# 
val gi : <> -> <int>
val pi : <int> -> <>
val gs : <> -> <string>
val ps : <string> -> <>

1.6.4 Elaborate data structure: a concurrent fifo

We are now ready for a more sophisticated example of data structure encoding in the join-calculus. First we define a new kind of list cells. Such a cell always holds an element (x below). When created, it stands in the first position of a list, and a name (see first below) reflecting that status is activated. Then, when another element is cons-ed is front of the list, a cell additionally holds a pointer to this previous element (see the pattern first() | set_prev(prev) below). In the end, a cell can be destroyed (see kill below), it then returns both its content, a pointer to the previous cell (when it exists), and a boolean, which is set to true, when the destroyed cell is the only one in the list.

# let new_cell (x) =
#   let first() | set_prev(prev) = inside(prev) | reply to set_prev
#   and inside(prev) | kill () = reply x,prev,false to kill
#   and first()      | kill () = reply x,self,true to kill
#   and self() = reply set_prev,kill in
#   first() | reply self to new_cell
# ;;      
val new_cell :
  <'a> -> <'b> where 'b = <> -> <<'b> -> <> * <> -> <'a * 'b * bool>>

A fifo is a data structure that provides two methods put and get. The method put stores its argument in the fifo, while the method get retrieves one element from the fifo. The internal state of the fifo is either empty (and then the name Empty is activated), or it contains some elements (internally, non-empty fifos have the name state activated). The name state holds two arguments: fst is a pointer to the first cell of the element list, while lst is a pointer to the last cell of the element list. Thus, elements stored in the fifo are cons-ed in front of fst (see put below), while retrieved elements are taken from the end of the element list (see get below). There is no Empty() | get() pattern below. As a consequence, an attempt to retrieve an element from an empty fifo is not an error: answering to get is simply postponed until the fifo fills in.

# let fifo() =
#   let Empty() | put(x) =
#     let fst = new_cell(x) in
#     state(fst,fst) | reply to put
#   and state(fst,lst) | put(x) =
#      let new_fst = new_cell(x) in
#      let set,rem = fst() in
#      set(new_fst) ;
#      state(new_fst,lst) |
#      reply to put
#   and state(fst,lst) | get () =
#      let set,rem = lst() in
#      let x,prev,last_cell = rem () in
#      if last_cell then {Empty()} else {state(fst,prev)} |
#      reply x to get in
#   Empty() | reply put,get to fifo
# ;;   
val fifo : <> -> <<'a> -> <> * <> -> <'a>>

From the fifo point of view, elements are retrieved in the order they are stored. In a concurrent setting this means that when a thread performs several get in a row, the retrieved elements come in an order that is compatible with the order in which another thread feeds the fifo.

# spawn
#   let put,get = fifo() in
#  {print(get()) ; print(get()) ; print_newline() ;} |
#  {let x = get () and y = get () in } |
#  {put(1) ; put(2) ; put(3) ; put(4);}
# ;;
-> 34

Therefore, the program above prints two integers from the set {1, 2, 3, 4} in increasing order.

1.7 A word on typing

The join-calculus type system is derived from the ML type system and it should be no surprise to ML programmers. The key point in typing à la ML is parametric polymorphism. For instance, here is a polymorphic identity function:

# let id(x) = reply x
# ;;
val id : <'a> -> <'a>

The type for id contains a type variable ``'a'' that can be instantiated to any type each time id is actually used. Such a type variable is a generalized type variable.

For instance, in the following program, variable ``'a'' is instantiated successively to int and string:

# let i = id(1) and s = id("coucou")
# ;;
# 
# do print(i) ; print_string(s)
# ;;
val i : int
val s : string
-> 1coucou

In other words, the first occurrence of id above has type <int> -> <int>, while the second has the type <string> -> <string>.

Experienced ML programmers may wonder how the join-calculus type system achieves mixing parametric polymorphism and mutable data structures. There is no miracle here. Consider, again, the join-calculus encoding of a reference cell:

# let state(x) | get() = state(x) | reply x to get
# and state(x) | set(new_x) = state(new_x) | reply to set
val get : <> -> <'_a>
val state : <'_a>
val set : <'_a> -> <>

The type variable ``'_a'' that appears inside the types for state, get and set is prefixed by an underscore ``_''. Such type variables are non-generalized type variables that are instantiated only once. That is, all the occurrences of state must have the same type. Moreover, once ``'_a'' is instantiated with some type, this type replaces ``'_a'' in all the types where ``'_a'' appears (here, the types for get and set). This wide-scope instantiation guarantees that the various port names whose type contains ``'_a'' (state, get and set here) are used consistently.

More specifically, if ``'_a'' is instantiated to some type int, by sending the message 0 on state. Then, the type for get is <> -> <int> in the rest of the program, as shown by the type for x below. As a consequence, the following program does not type-check and a runtime type-error (printing an integer, while believing it is a string) is avoided:

# let state(x) | get() = state(x) | reply x to get
# and state(x) | set(new_x) = state(new_x) | reply to set
# ;;
# 
# spawn state(0)
# ;;
# 
# let x = get()
# ;;
# 
# do print_string(x)
# ;;
val get : <> -> <'_a>
val state : <'_a>
val set : <'_a> -> <>

val x : int

File "ex70.j", line 12, characters 3-15:
Type error (Call): this expression has type
<string> -> <>
but is used with type
<int> -> '_b

jcc: Compilation failed for ex70.j

Non generalized type variables appear when the type of several co-defined port names share a type variable. Such a type variable is not generalized.

# let port(p) | arg(x) = p(x)
val port : <<'_a>>
val arg : <'_a>

A workaround is to encapsulate the faulty names into another port name definition that defines only one name. This restores polymorphism.

# let make_it() =
#   let port(p) | arg(x) = p(x) in reply port,arg
# ;;
val make_it : <> -> <<<'a>> * <'a>>

Non-generalized type variables also appear in the types of the identifiers defined by a value binding.

# let p1,a1 = make_it()
# ;;
# 
# let p2,a2 = make_it()
# ;;
# 
# spawn p1(echo) | p2(echo_string) | a1(1) | a2("coucou")
# ;;
val p1 : <<'_a>>
val a1 : <'_a>

val p2 : <<'_b>>
val a2 : <'_b>
-> 1coucou

It is interesting to notice that invoking make_it() twice produces two different sets of port and arg port names, whose types contain different type variables. Thereby, programmers make explicit the different type instantiations that are performed silently by the compiler in the case of generalized type variables.

Non-generalized type variable sometimes are a source of nuisance, when the compiler unexpectedly rejects programs, see section 7.4 for details.