MPRI 2004/2005 Internship in the MOSCOVA group, INRIA Rocquencourt:
``High level languages for distributed programming --- middleware and subtyping''

Jean-Jacques LÚvy and James J. Leifer

Introduction

Acute is a prototype of a new language designed in INRIA Rocquencourt (by James J. Leifer and Francesco Zappa Nardelli) and in the University of Cambridge Computer Lab (by Peter Sewell, Keith Wansbrough, and others). It combines the simplicity and safety of ML-like languages (such as Ocaml) with key features needed for programming distributed systems.

Communication channels are the basis of much distributed programming. They come with varying properties of synchrony, fault tolerance, security, lifespan, and robustness in the presence migration. Most distributed languages fix on a particular channel implementation. By contrast, our aim in Acute is to express this entire range of potential behavior. It is for this reason that Acute does not bake in a particular channel abstraction but rather provides easy access to low-level primitives, such as the Unix TCP sockets library, and to high-level language constructs for controlling how language values are converted to byte strings (via marshalling) when building channel infrastructure.

Beyond distributed execution, Acute also supports distributed development and deployment of programs. As it is often impossible to synchronize distributed updates, several versions of an application and its libraries may well need to interoperate. Acute provides precise control over code shared between intercommunicating program, including ones using different versions of shared libraries.

The design motivation and semantics of Acute are given in the following reports:

Tasks

We propose two alternative tasks which may be adapted and combined according to the taste and capacities of the student. The first task has a practical orientation while the second is type theoretic.

  1. Existing distributed systems tend to be coded using large industrial middleware, such as Enterprise Java Beans and Corba which provide communication (via remote procedure call (RPC)), object replication, object leasing, concurrency control, etc. Microsoft's BizTalk supports reliable queueing for glueing together the long running transactions involving multiple databases. On the language level, Erlang has been successfully used for writing telecommunications switch software by exploiting its communication, failure monitoring, and hot code swapping capabilities.

    Can we do ``better'' in Acute? More precisely, can we take advantage of the Acute's features to write simple, lightweight code to provide many of the same features as the industrial systems? We are optimistic because Acute benefits from many of the qualities that makes ML-like languages so concise and expressive (such as higher order functions) and easy to debug (due static type inference) --- and has additional features concerned with marshalling that explicitly support distributed communication protocols.

    This task would thus involve the student learning about existing middleware, identifying interesting services and algorithms, and coding these services in Acute. Depending on the difficulties encountered, it may be possible to hack up an interface with industrial protocols so that the Acute code can be directly integrated in a network running existing middleware.

  2. The second line of work centers around adding subtyping to Acute. Subtyping is well suited to rich XML-like type systems for semi-structured data and has been integrated into ML languages such as Cduce and Xduce. Subtyping is the main motivation for using XML for inter-machine communication: the receiver may still use data sent to it even when it has only an approximation of the type information known by the sender.

    Adding subtyping to Acute will be a significant challenge. The subtyping required may not marry easily with ML-style abstract types, which are an important way of describing data with user-level invariant properties, and a key feature Acute inherits from Ocaml. Our experience with abstract types for distributed computing (see reports cited above) suggests that an effective way of maintaining abstraction safety is by comparing global type names generated from hashing the code implementing the underlying abstract type. Subtyping and hashing interact in complex ways that need to be overcome.

    This task involves understanding the subtleties of Acute's hashing mechanism that ensures type and abstraction safety of marshalled data; then proposing new typing rules and operational semantics; and, finally implementing these changes in the type inference and checking engine and the run-time system.

Prerequisites

This internship requires a knowledge of the following areas: Ocaml, Unix programming, operational semantics, concurrent programming, type theory.

Laboratory and supervisors

The internship takes place at INRIA Rocquencourt, in the MOSCOVA group. The work will be supervised by James J. Leifer and Jean-Jacques LÚvy.

Contact

Interested students may contact James.Leifer@inria.fr for any further information. For administrative details, please see the main page for all proposed MPRI internships.


This page: <http://pauillac.inria.fr/~leifer/jobs/mpri-2005.html>.
Last updated: "2005/01/25 18:00:37 UTC leifer".
Validez cette page.