An occam2 To Modula-3 Translator






Martin Stuart Mamo
Magdalene College

An occam2 To Modula-3 Translator

Diploma In Computer Science 1995

Approximately 9800 words (sample code not included)

Project Originator: Michael Hinchey

Project Supervisor: Stephen Hand

Original Aims:

The aims were to write a translator in the programming language C, to translate from occam2 to Modula-3. occam2 is a comparatively small language and it was hoped that a fair portion of the language could be translated. The occam input would be assumed to be correct.

Until relatively recently, the only occam compilers have been available for transputers. This translator would therefore allow portability of software from transputers to any platform supporting a Modula-3 compiler, as well as providing concise constructs for the support of parallelism and synchronous communication not directly available in Modula-3.

Work Completed:

I have written a lexer, parser (using the tools flex and yacc) and code generator to translate a subset of occam2 to Modula-3. Integer, real and Boolean data types are supported as well as channels for synchronous communication. The translator can successfully translate occam's PAR, SEQ, IF and WHILE constructs and/or replicators.

No support is included for the ALT construct/replicator, procedures and functions, nor for the record and timer data types. PLACED PAR is inappropriate for a single processor machines and PRI PAR cannot be handled by Modula-3.

Special Difficulties:

none


CONTENTS

1
Introduction

1.1
occam the programming language and CSP
1.2
occam and the transputer
1.3
Previous work
1.4
Motivation and outline of work

2
Preparation

2.1
Structure of the translator
2.2
Lexical analysis and the use of indentation in occam
2.3
Parsing and syntax tree representation
2.4
Code generation
2.4.1
Assignments and channels
2.4.2
Mapping of data types, scopes and the declaration vector
2.4.3
The unravelling of nested PARs and variable passing

3
Implementation

3.1
Distribution of source files
3.2
Error Handling
3.3
Lexical analysis
3.4
Parsing and parse tree generation
3.5
Code generation
3.5.1
SKIP and STOP
3.5.2
Channels
3.5.3
PAR constructs and replicators
3.5.4
PLACED PAR and PRI PAR
3.5.5
SEQ and loop constructs
3.5.6
IF constructs and replicators
3.5.7
Simple and multiple assignments
3.5.8
Declarations and types
3.5.9
Operators, numbers and variable names
3.5.10
Indentation

4
Evaluation

4.1
Production of correct Modula-3 code
4.2
Error handling
4.3
occam constructs and types yet to be implemented
4.4
Natural support for concurrency and communication

5
Conclusions

Bibliography

Appendix

Project Proposal


1

INTRODUCTION

occam was developed as a programming language for the transputer and was based on concepts found in the process algebra CSP (Communicating Sequential Processes). This work will describe the construction of an occam2 to Modula-3 translator, allowing the portability of software from transputer to non-transputer system, as well as providing concise constructs for the specification of concurrency.

1.1 occam THE PROGRAMMING LANGUAGE AND CSP

occam[4,5,6] arose from concepts founded by Tony Hoare in CSP[7] and was intended as a low-level assembly language for the transputer. occam was developed with a minimalist approach that avoids unnecessary duplication of language constructs and as such was named after the 14th century philosopher William of Occam, who proposed entities should not be duplicated beyond necessity.

occam2 later extended the occam language with the introduction of data types. occam can be used to program a single transputer or a network of transputers. For the former, inter-process communication will be via shared memory and the processes will time-share on the single processor. At the moment INMOS is working on occam3[11], which will provide support for large programs not found in occam2.

occam's basis in CSP, and often one-to-one equivalence, gives it a strong formal basis. CSP specifications may be viewed as a system of processes executing independently, communicating over unbuffered, unidirectional channels. These specifications may be manipulated by algebraic laws. Various semantic models allow proposed properties to be proven. Finally the CSP system may be transformed almost directly to an occam program for execution on a transputer.

The occam2 programming language enables an application to be described as a collection of concurrent processes, where each one can communicate with other processes through channels. occam's distinguishing feature is that concurrent processes can be composed as easily as sequential ones. In conventional languages statements which lexically follow one another are executed in sequence. In occam statements are replaced by processes, which can operate in sequence or concurrently, depending on whether they are preceded by a SEQ or PAR construct or replicator. SEQ and PAR constructs are themselves processes and this captures the hierarchical structure of a system by allowing an interconnected system of processes to be regarded as a single process.

A process performs a sequence of actions and then terminates. The simplest process is an action, which can be any one of the following:

A channel is a unidirectional, unbuffered point-to-point link from one process to another. Communication can only take place when both input and output processes are ready to engage in the communication event. If either process is not ready, the other process will remain suspended.

1.2 occam AND THE TRANSPUTER

In 1977, a collaborative project between Oxford University and INMOS Limited began work to design a microprocessor with input and output channels, making them easier to connect together to produce networks of processors of arbitrary size. In 1982, the transputer (TRANSistor comPUTER) emerged. For the T414 model the architecture consists of:

The T414 operates at 10 MIPs. Each of the INMOS links provides two occam channels - one in each direction. Communication may occur concurrently with communication on all other links. The synchronisation of processes at each end of the link is automatic and needs no explicit programming.

The transputer reflects the occam model and can be considered to be an occam machine and it provides a direct implementation of the occam process in its hardware. Although occam2 is not an assembly language, it provides full control of the transputer with regards system configuration, input/output and real-time interrupts and as such makes programming in native assembler unnecessary.

Until recently, the only complete occam compilers were for transputers. This has restricted the use of the occam language as transputers are not readily accessible. Compilers to allow occam to be run to Unix work-stations would facilitate portability between transputer and non-transputer systems.

1.3 PREVIOUS WORK

In 1994, the Department of Electronic and Computer Science, University of Southampton released Southampton's Portable Occam Compiler[2] (SPOC), which allows occam2 programs to be supported on Unix work-stations. Its compilation is achieved by translating occam source into target-independent C source and then using native C compilers to generate the object code. The execution of an occam program on a single processor is achieved by multi-tasking. This multi-tasking can be considered as non-local flow control - when a process deschedules on a synchronisation point the control flows out of that process, through the scheduler and into another process ready to execute. This flow of control cannot be determined at compile time and so is handled by a run-time system which maintains a queue of processes that are ready to execute and switches between them.

They decided not to use available thread libraries to implement concurrency from C as they are not a part of the ANSI C standard to so would compromise portability. Instead they decided to implement the non-local control flow using ordinary C constructs.

Parallel components of a PAR construct or replicator are moved out into separate procedures. The scheduling control flow was achieved by implementing the body of each procedure as a case statement. The case statement is switched by an instruction label to denote which atomic code section is to execute next. The labels are represented by a set of contiguous integers. Every atomic section starts with a case label and terminates with a descheduling point.

More recently, a beta version of the Kent Retargetable Occam Compiler[1] (KROC) has been released by the University of Kent at Canterbury (May 1995). Like SPOC, KROC enables occam to run on non-transputer platforms. The beta version runs on SunOS/SPARC systems.

KROC works by translating code produced by an INMOS occam Toolset compiler into native SPARC assembler and linking it with a small kernel that provides the process scheduling and inter-process communication. Therefore an occam Toolset compiler is also needed. INMOS have released the sources for their Toolset compiler, which compiles on most Unix systems.

1.4 MOTIVATION AND OUTLINE OF WORK

In comparison to Modula-3[9], occam2 is a small language. It does not allow recursion, dynamic memory allocation and has no support for object-orientated programming.

Modula-3 is a modern, general purpose programming language with dynamic memory, objects, garbage collection, exception handling and also basic support for concurrency. Modula-3's ancestors include Modula-2 and Pascal. However Modula-3 lacks occam's natural and concise way of specifying concurrency and communication.

Modula-3 is a more natural choice for translating occam into rather than C. Although Modula-3 has no direct equivalents to occam's PAR constructs and channels, it does allow the creation of any number of threads. The implementation of a scheduler for concurrency would have greatly added to my work. As for SPOC, the use of thread libraries would have decreased portability. Modula-3 can also control thread access to shared variables, suggesting a means by which channels can be implemented. C does not have a static chain and as such disallows nested procedure and function definitions. This is not the case for Modula-3.

An occam2 to Modula-3 translator would allow the portability of software from transputer to non-transputer systems as well as providing a simpler means to implement concurrency than is provided by a Modula-3 compiler alone.

I have written an occam2 to Modula-3 translator on Unix, which supports:

I chose to write the translator in a third language, namely C, because of the ready availability of the Unix tools flex and yacc, which generate C source code for lexers and parsers respectively. Flex and yacc require any embedded code in their input files to be in C. My flex and yacc input files and code generator were all written de novo.

The following chapters present the design and implementation of my occam2 to Modula-3 translator.


2

PREPARATION

2.1 STRUCTURE OF THE TRANSLATOR

One.gif - 2.4 K

Operation of the occam2 to Modula-3 translator

A compiler translates from a program written in one language, the source language, to an equivalent program in a second language, the object or target language. Typically the source language will be C or Pascal and the object language will be machine code for the computer being used.

For our translator the source language is occam2 and the target language is Modula-3, rather than machine code.

The translation of one language to another breaks down into a number of distinct phases. At the simplest level, we have the front end responsible for the analysis of the structure and meaning of the source test and the back end responsible for generating the target language.

The front end can be further subdivided into the lexical analyser (scanner), the syntax analyser (parser) and the semantic analyser. The scanner groups individual characters found in the source language into logical entities e.g. 'P', 'A' and 'R' would be grouped together as the word 'PAR'. The parser analyses the overall structure of the program and groups token output from the lexer into larger constructs, such as statements, loops and assignments. Once the structure has been determined, its semantics can be analysed i.e. which variables are real and which ones are integers. (For my translator semantic analysis will be limited and will not form a distinct phase.) At the this stage the program is in the form of some intermediate representation - typically a tree or a three-address code.

The back end takes the intermediate representation and generates an equivalent program in the target language. This often involves more than one phase if efficient translation is desired. An intermediate code optimiser may transform the intermediate representation into a more efficient equivalent. For SPOC, occam constructs that are difficult to map into C, were subject to abstract syntax tree transformation - all component PAR processes were pushed out into procedures. (For my translator, no intermediate phase is intended - any transformations will be done by the code generator.)

2.2 LEXICAL ANALYSIS AND THE USE OF INDENTATION IN occam

Diagram (Lexical Analysis)

The lexer returns tokens as small integers/lexemes are passed via a C union
The lexer also converts indentation into BEGIN-END blocks

The tool flex was chosen to generate the lexical analyser. In addition to returning a stream of tokens, along with their associated attributes/lexemes if they have one, the lexer will have to have a number of other roles. Most obviously the lexer will have to strip away white space and comments. Comments in occam are preceded by the character pair '--' and extend to the end of the line.

In addition the lexer should create a linked list of all variable names in the occam2 source code. This is for two reasons:

a) So the lexer can pass the actual variable name to the parser as a pointer into the list.
b) A complete list of variable names will be helpful to the code generator when it has to print out a list of all variables in scope along with type information contained within its own data structure.

Unlike some languages where the level of indentation is optional, in occam the indentation of the program lines forms an intrinsic part of the language's syntax. For example the syntax for the constructor SEQ, takes the form:

	sequence = SEQ
	             {process}
This syntax denotes a SEQ construct followed by zero or more component processes, each on a separate line and all indented by the same amount - two spaces beyond the SEQ. This is to indicate each process is a component of the SEQ. Deindentation by two space will signify the end of the SEQ construct.

As such the lexer will have to convert changes in indentation into BEGIN and END tokens where appropriate.

2.3 PARSING AND PARSE TREE GENERATION

The tool yacc was chosen to generate the parser. A Backus-Naur Form-like description of the occam2 language suitable for yacc was found to be available[6].

Hand-coded semantic actions will generate a tree representation of the occam2 input suitable for later code generation.

2.4 CODE GENERATION

2.4.1 Assignments and channels

As mentioned in the introduction, in occam all statements are processes and the simplest ones are of assignment, input and output.

Assignments change the value of variables as in any conventional language. However in occam, multiple assignments are also possible:

	a, b := b,a
The above will swap the values of the variables a and b. As such our translator will have to use explicit temporary variables to accomplish this.

Communication can only take place over a channel if both input and output processes are ready to engage in the communication event. If either process is not ready, the other process will remain in a suspended state. Channels are unbuffered, unidirectional and synchronised. The below will assign the 5 to the integer variable x.

	CHAN OF INT channel:
	INT32 x:
	PAR
	  channel ! 5
	  channel ? x
In our translation, communication can be accomplished via a shared variable. The shared variable should be an object field in a subtype of a Modula-3 Thread.Mutex so as access can be synchronised. In addition there should be a Boolean value to specify whether the channel is full (TRUE) or empty (FALSE), as well as condition variables to allow a thread to be suspended whilst it is waiting for a channel to either be emptied or filled.

2.4.2 Mapping of data types, scopes and the declaration vector

The mapping of occam data-types into the equivalent or longer Modula-3 data-type is shown below. On the Unix system CUS, Modula-3's INTEGER is 32 bits wide and LONGREAL is 64 bits wide. For simplicity, REAL32 has been mapped to LONGREAL, even though REAL is more appropriate in terms of bit size on our Modula-3 implementation. Different types of floating-point numbers in Modula-3 use different letters for their exponent part.

occam type Modula-3 type
INT INTEGER
INT16 INTEGER
INT32 INTEGER
INT64 not supported
REAL32 LONGREAL
REAL64 LONGREAL
BOOL BOOLEAN

All occam variables and channels are local to the process that immediately follows their declaration. When that process ends, the variables are out of scope. That process can either be a multi-line construct (PAR, SEQ etc.) or a single-line action.

If inside the scope of a variable, another variable is declared with the same name, then within this second variable's scope this namesake replaces the original. The original variable is said to have had a 'hole' knocked in its scope. For example:

INT apple, peach: -- declare two integer variables 
SEQ
  apple := 500(INT)
  REAL64 apple: -- the real variable apple has knocked a hole
  SEQ           -- in the scope of the integer variable apple
    apple := 3.14(REAL64)
During the code generation phase, at various time the translator will have to know what variable names are in scope and what their type is. A linked list of variable name pointers (pointers into the lexer generated list) and type information is kept (DVEC). Any variables declared before a process are added to the top of the list. A subsequent declaration with a previously used variable name will get added to the top of the list. As the list is searched from the top downwards to the first occurrence of a variable name, holes are thus implemented. When a process ends its variables are removed from the linked list.

2.4.3 The unravelling of nested PARs and passing variables

The translator will be using Modula-3's Thread.Fork to help implement occam's PAR constructs and replicators.

The component processes of an occam PAR construct see all the variables currently in scope. However Modula-3's threads have their own stack and so are unable to access the local variables in the scope of their call. For example in the occam program below, processes 1 and 2 can access the variables apple and orange.

INT apple, orange:
SEQ
  apple, orange := 100, 200
  PAR
    process1
    process2
Threads can access global and dynamic variables though. Therefore for Modula-3 threads to have access to the variables in the scope of their call, the translator will have to implement all variables used as references (pointers) and then explicitly pass the references to each thread via a Thread.Closure subtype. If values were merely copied, then any alterations made to the variables by the thread would be lost, unless explicit measures are taken. Explicitly passing references to threads will require all the variables in scope to be printed out in several different forms:

For convenience this will require a single function that will go through DVEC (declaration vector) and print out all the variables in scope in different forms according to what argument it is passed.

There is also the further problem that PAR constructs can be nested indefinitely. Thread definitions (syntactically the same as procedures) all have to be top-level and cannot be nested. To unravel nested PAR constructs we either need to make use of an arbitrary number of temporary files during the code generation phase or to perform tree transformations just before code generation begins. I chose the former and it will be discussed further in Chapter 3.


3

IMPLEMENTATION

3.1 DISTRIBUTION OF SOURCE FILES

The source code for the translator is split up into five separate files as follows:

3.2 ERROR HANDLING

As specified in the proposal the occam source language is assumed to be correct and so comprehensive error messages and error recovery were not implemented. However the translator will report the following errors.

Upon encountering the first error the compiler will issue a message and quit.

3.3. LEXICAL ANALYSIS

The C source for the lexer was generated using the compiler construction tool flex distributed by the GNU project. From a description of the regular expressions (type 3 grammars) for the tokens in its input file, flex generates transition and output tables for a Finite State Machine and an ANSI C driving program.

The lexer creates a linked list of all variable names found in the occam source file. When a variable token is to be returned to the parser, a pointer to the actual variable name in the linked list is assigned to the global variable yylval, which has been defined as a C union in a yacc generated header file. The parser then has access to the token's attribute via the global variable.

In addition to the tokenisation of the input occam file, the lexer converts integer strings to a binary format. An integer will cause an integer token to be returned to the parser and assignment of the binary value to the C union.

Real numbers are left as a string to facilitate conversion between occam and Modula-3 formats later on by the code generator e.g. 33.3E6 (REAL64) to 33.3d6 and 22.0 (REAL64) to 22.0d0.

One grammar ambiguity caused by the addition of occam replicators to the parser, has been corrected by returning the token NAME instead of VARIABLE, if a variable immediately follows a PAR, SEQ or IF token on the same line. Tokens like PAR either have a replicator to their right on the same line or nothing.

	replicator = NAME '=' expression FOR expression
The lexer has to convert the level of indentation preceding the first token on a line found in the occam source file into the appropriate number of BEGIN or END token/s for the parser. In occam two space characters are the unit of indentation. A start of a new block is indicated by increasing the indentation by two spaces. The end of a block is marked by decreasing it by two spaces. A new line will either entail the generation of no BEGIN or END tokens, at most the addition of one BEGIN token or several END tokens i.e. several blocks can be ended at once.

For example,

IF
  x > 5
    process1
  x < 5
    process2
would be converted into the following token stream.
IF
BEGIN
  VARIABLE > INTEGER
  BEGIN
    process1
  END						{ end of process1 block
  VARIABLE < INTEGER
  BEGIN
    process2
  END						{ end of process2 block
END						{ end of IF block
The lexer counts the number of white spaces before the first token on each line and then works out the change in indentation level from the previous line. If it has increased by one, the lexer will fill a one-unit buffer with the current token and then return a BEGIN token to the parser. When the parser calls the lexer again, this buffer will have its contents passed to the parser.

If the indentation level has decreased by a even number of spaces, the lexer will make a note of the exact change, fill the one-unit buffer with the current token and then return a single END token to the parser. When the lexer is next called, it will sequentially return the right number of END tokens and then return the contents of the one-unit token buffer.

It is quite possible the lexer will reach the end of the occam source code file, with an unpaired number of BEGIN tokens having been returned to the parser. This is because several indentation levels can be stepped down in one go. As such the lexer keeps a record of how many unpaired BEGIN tokens there are at any one time. When an end of file character is detected, the appropriate number of END tokens will be returned to the parser, before the lexer signals the end of input. GNU's flex, unlike AT&T's lex, has a special pattern <<EOF>> that matches the end of file character. All comments and white space are removed by the lexer. Comments can either follow an occam statement on the same line. These are recognised with the regular expression "--".*. Alternatively they can take up a whole line of their own and are recognised with ^(" "*)"--".*\n.

3.4 PARSING AND PARSE TREE GENERATION

The C source for the parser was generated using the compiler construction tool yacc.

In the first section of the input file to yacc, I have defined which symbols (terminals and non-terminals) are associated with which type of value using %union. As I invoked yacc with the -d option, this was copied to the file y.tab.h as the C union defining type YYSTYPE. The file y.tab.h is included in the lexer and code generator files.

Symbol ANSI C Attribute
Integer int
Real number char *
Boolean short
Variable/Name VARLIST * (pointer into linked list of names)
Non-Terminal NODE * (pointer into parse tree structure)

Also I have declared what tokens are used in the grammar using %token. Each token has an associated integer value, which again are included in the file y.tab.h as a series of C #defines.

I have specified the precedence and associativity of the arithmetic operators to be parsed explicitly using yacc's %left, %right and %nonassoc. %left specifies that the operators on its line have the same precedence level and group from the left to the right i.e. 2 + 3 + 4 is parsed as (2 + 3) + 4. To specify operators with a higher precedence level another %left, %right or %nonassoc line is included below i.e.

Arrow pointing downwards

%left AND OR NOT
%left '-' '+' 
%left '*' '/' 
%nonassoc UMINUS
%nonassoc specifies that the pseudotoken UMINUS has a higher precedence than '*'. The symbol '-' as defined above has a lower precedence that '*'. Later on in the rules section, unary minus is parsed with '-' expression %prec UMINUS. %prec tells yacc to use the precedence of UMINUS for this rule. Otherwise unary minus would be parsed with a lower precedence than '*' and '/', which is incorrect.

%left, %right and %nonassoc also declare their operators to be tokens, if they are more than single characters.

The second section of the yacc input file contains the rules for the occam grammar using a Backus-Naur like form. Each rule in this section is followed by user C code, which is executed when the rule is reduced. This code will typically call a C function, defined in the third section of the yacc input file, to add a new node to the parse tree. These functions are passed the attributes of the terminals and the non-terminals contained in the reduced rules. The tree is therefore built up in a manner which reflects the syntax of the occam language. Each node is represented as a C structure of type NODE, which is given below.

typedef struct node {      
  NODETYPE nodetype;
  CONTTYPE conttype;
  union attrib_tag {
    int ival;
    char *fval;
    short bool;
    VARLIST var;            /* for variables  */
    int op_type;            /* for infix operators */
  } attribute;
  struct node *left;
  struct node *right;
  struct node *middle;
  struct node *void_pointer;
  } *NODE;
The field nodetype tells us what rule has been reduced. If the node has an attribute associated with it, the nodetype field will be set accordingly and the conttype field will tell us what the type of the attribute is, so it can be later selected from the C union attribute contained in the structure. This structure has room to point to up to three subtrees if necessary. As an example the parse tree representation of the following occam2 code is given on the next page.
INT x, y:
SEQ
  x := y + 1
With reference to the parse tree, an occam program is defined as a process list. As this program only contains one process, the right branch of the node process list has had a null pointer assigned to it. If there were two processes, the right branch would point to another process list.

In occam, variable declarations can optionally proceed a process. In this case two integer variables x and y have been declared. If a there was a further declaration of say, real variables, on the next line, the node declaration list would have its right branch pointing to yet another declaration list node instead of to a null pointer. In this way arbitrary length lists can be represented.

The one process in this program is a SEQ construct represented as the node SEQ. SEQ in turn has one component process. The process x:=y+1 does not have any variable declarations preceding it and so the node process, beneath the nodes SEQ and process list, in the tree, has its left branch assigned a null pointer.

As previously stated, assignments in occam can assign multiple variables at once. That is why the parse tree has capacity for an assignment node to point to a list of variables and also to a list of expressions.

Diagram (occam parse tree)

Representation of an occam parse tree

3.5 CODE GENERATION

The code generator walks the parse tree and outputs equivalent Modula-3 code. I will now individually describe the translation of each occam construct supported, along with some other details about how the translator works.

3.5.1 SKIP and STOP

In occam, SKIP starts a process, performs no action and then terminates. It is used to stand in for parts of a program that have not yet been written.

STOP starts a process, but does no work and never terminates. Like SKIP, STOP is used in program development.

These two processes are translated into Modula-3 procedure calls. The procedures are defined at the top of every translation.

3.5.2 Channels

Channels allow synchronous communication between two occam processes. One process executes output to a channel and the other executes an input from the same channel. An input will not complete execution until an output on the same channel is executed and vice-versa.

The type of data carried on channels is specified when the channel is declared. Channels have the same scope rules as any other occam variable. The format and data type of a channel is specified by its protocol.

	CHAN OF protocol mail:
There are three different types of protocol allowed. The simplest protocols consist of a primitive data type, such as an integer protocol. For example,
	CHAN OF [10]INT32 mail:
will define a channel that passes arrays of integers.

occam also allows a name to be given to a particular protocol e.g.

	PROTOCOL myname IS REAL64:
Then a channel with protocol myname can be declared as:

	CHAN OF myname mail:
The next type is the sequential protocol. This allows values to be passed along a channel in groups. For example,
	CHAN OF INT32; REAL64 mail:
will define a channel that passes values in groups of two.

Variant type protocols allow several different types of message to be passed along the same channel. A tag specifies what type of message will be passed.

Due to time constraints, the translator will only support simple protocols. Also occam's method of naming protocols is not supported.

As mentioned previously, channels are implemented as a shared variable between two Modula-3 threads. To avoid race conditions the shared variable will be created as a field in an object descended from Thread.Mutex. Modula-3's LOCK mutex DO stmts END ; will prevent the two communicating threads from trying to access the shared variables at the same time, as the code for reading/writing to the variable will be placed in the critical section defined by the LOCK statement.

In addition a thread should suspend itself if it is waiting to read or write to the shared variable. To do this, two condition variables are added to the Thread.Mutex subtype. A thread can be suspended on a condition variable until another thread signals that condition variable.

Thus the occam declaration

	CHAN OF INT chan1:
will correspond to the Modula-3 code below shown below.
TYPE

  chan1_type1 = Thread.Mutex OBJECT
    notFull, notEmpty : Thread.Condition ;
    channel_status : REF BOOLEAN ;
    channel_value : REF INTEGER ;
  END ;

...

VAR
  chan1 := NEW(chan1_type1) ;
The field channel_value is the shared variable used to pass data and channel_status indicates whether the channel has had a value placed in it or removed.

A Modula-3 translation of occam may result in several different "channel" variables all named chan1 existing with different scopes. As the type definitions for Modula-3 "channel" variables are all placed at the beginning of the code and so are global, an integer number is added on to the end of the type name. For example if there are two chan1 variables with different scopes, the first one declared with be of type chan1_type1 and the second will be of type chan1_type2.

The occam for reading value from a channel, namely

	chan1 ? y
is translated into:
   LOCK chan1 DO
      WHILE chan1.channel_status^ = FALSE DO
        Thread.Wait(chan1, chan1.notEmpty) ;
      END ;
      y^ := chan1.channel_value^ ;
      chan1.channel_status^ := FALSE ;
      Thread.Signal(chan1.notFull) ;
    END ;
The LOCK construct will cause the reader to hold the mutex and thus the any writers will not be able to access the shared variable. The loop
      WHILE chan1.channel_status^ = FALSE DO
        Thread.Wait(chan1, chan1.notEmpty) ;
      END ;
will suspend the thread until the channel has been filled with a value and release the mutex.

When the writer executes

      Thread.Signal(chan1.notEmpty) ;
after placing a value in the channel, the reader thread will resume and relock the mutex. Then the value will be read and the channel marked as being empty.

See example 10 in the appendix for a full translation of a simple occam program using channels.

3.5.3 PAR constructs and replicators

The PAR construct is followed by zero or more processes which are indented by two spaces. For example,

	PAR
	  p1
	  p2
	  p3
specifies the concurrent execution of three processes. The construct only terminates after all the component processes have terminated.

As discussed previously, Modula-3 does provides support for concurrency. Calls to Thread.Fork allow multiple points of execution in a program.

However there are two problems in the translation of occam's PAR constructs or replicators into Modula-3 code.

a) In occam all variables in scope are seen by the component processes of a PAR. This is not the case for Modula-3, although it does allow variables to be explicitly passed to threads.

b) In occam PAR constructs and replicators can be nested. In Modula-3 thread definitions all have to be top-line procedures and nesting is not allowed for thread definitions (it is allowed however for normal procedures and functions).

From point b), we can see that each component of a PAR construct should be translated into a top-line Modula-3 procedure. This procedure will be written to a separate file that at the end of code generation is concatenated to the start of the main file. However it is not quite as simple as that, nested PAR constructs have to be "unravelled" first if code is to the translated in the correct order (see example 8 in the appendix). To "unravel" nested PAR constructs we can do one of two things.

For each level of PAR nesting the translator needs one extra temporary file to write to. Thus the translator needs to have the ability to open up an arbitrary number of temporary files to write the translated PAR component processes to. Each time the translator encounters a PAR construct, it either move up from the main file to a temporary file or from a temporary file to another temporary file next up in a linked list. If there is not a temporary file higher up in the list, it creates one. If no PAR processes are to be translated it creates no temporary files. If there are 9 levels of PAR nesting it will create 9 temporary files called proc.tmp1 to proc.tmp9. After concatenation these files are removed by the translator.

Let's consider an example as to why the translator needs to open up an arbitrary number of temporary files instead of just one to store the translated PAR component processes. See example 8 in the appendix for the Modula-3 translation of the occam program below.

occam
INT apple:
PAR
  SEQ
    apple := 56
    PAR
      apple := apple * 2
      PUTINT(apple)
    PUTINT(apple*7)
When the first PAR is encountered, the translator switches writing from the main Modula-3 output file to a temporary file, say proc.tmp1. The translator successfully writes apple^ := 56. Then it encounters another PAR. But say it does not open up another file, it will still be writing to proc.tmp1. So then it correctly writes the Modula-3 code to fork off the two processes apple:=apple*2 and PUTINT(apple*7). Next it encounters PUTINT(apple*7), after writing the thread definitions for apple:=apple*2 and PUTINT(apple). PUTINT(apple*7) belongs to the first PAR in the occam program, but the Modula-3 code for it will be added incorrectly to the thread definition of apple:= apple*2.

What it needed to do was to switch from writing to proc.tmp1 to proc.tmp2. In proc.tmp1, we would initially write the Modula-3 code for apple:=56 and then for forking off two processes. Then it moves up to writing to proc.tmp2 and there writes the thread definitions for the two occam processes apple:=apple*2 and PUTINT(apple). Then the code generator would encounter the end of the nested PAR process and then move down from proc.tmp2 to proc.tmp1. In proc.tmp1 the translator was writing the code for the single SEQ process of the first PAR. It will then write the code for PUTINT(apple*7) in the correct place. Then it encounters the end of the PAR block and closes the thread definition with RETURN NIL; END pthread1_1;.

The number of files that an application can have open at once is no doubt large on Unix. However whenever a temporary file is not being using it is closed. When it is needed again it is opened for appending.

The second problem concerns the fact that Modula-3 threads do not see all the variables that are in the scope of their call. Each Modula-3 thread has its own stack. During code generation, the translator has to print out all the variables in scope in several different forms in different places. To simplify this matter, we have written one C function (varsinscope) that will print out all the variables in scope in four different Modula-3 formats according to the arguments it is called with i.e. type definitions, assignments etc. Basically this function runs through the linked list of all variable names created by the lexer and will use the most recently added variable of the same name found in the linked list DVEC.

Having discussed the two problems involving in translation, we will now go through the steps in the translation of a PAR construct to Modula-3 code.

For each PAR construct, the translator generates code into two separate files. It first outputs code for forking off the Modula-3 threads to the current file, be that the main file or a temporary file. It then generates code for the thread definitions to a temporary file, be that the first temporary file (proc.tmp1) or a temporary file one up from the current file in the linked list of proc.tmpX files.

Each PAR construct encountered is assigned a unique id number, say x. The processes of a PAR construct are numbered from one onwards. Thus each process will have an unique number e.g. x_1 or x_2 or x_3 etc.

It then counts up the number of processes that make up the PAR construct. It requires this information a little later when it needs to know how many threads to fork off.

It calls varsinscope to generate a record type that has fields corresponding to each variable in scope. It then makes a second call, to generate a Thread.Closure subtype that has this record as a data field. These types are written to a separate file that is concatenated to the front of the Modula-3 code at the end of code generation.

It then writes code to create to new variable of the aforementioned record type and a Modula-3 handle for each thread to be forked off. A third call to varsinscope generates code to set the fields of this record to the values of the variables in scope.

All that is left to do now is the fork off the appropriate number of threads and wait for their completion.

The replicated PAR construct produces an array of structurally identical parallel processes. It is used in occam to implement pipelines, buffers and queues (see example 7 in the appendix for its use in a pipeline). Unlike PAR, replicated PAR can only be followed by one process - not zero or more processes. However that one process can be a constructor like PAR or SEQ. The same applies to other replicators.

For example,

	PAR i = 0 FOR n
	  p1
will construct an array of n similar processes. The value of i for the first replication here will be 0 and for the last replication will be n - 1. The replications, will of course, proceed in parallel.

The generation of code for PAR replicators is similar to PAR constructs described above. Instead of forking off several different processes, here the translator has to fork off the same process a certain number of times. The processes have successive integer values substituted for the replicator index. In occam the replicator index may not be changed by the component process. However in our translation, this is no such restriction as each replicator index points to a unique variable.

As for PAR constructs, the translation starts with generating a record type containing fields for all variables in scope. However in addition the replicator index is included, which is an integer variable.

Instead of creating just one record variable to hold the values of all the variables in scope, the translator creates an array of them. Each record will differ by having one field pointing to a different integer variable. These integer variables will hold successive integers corresponding to the replicator index. See example 6 in the appendix for an example translation of a PAR replicator.

3.5.4 PLACED PAR and PRI PAR

PLACED PAR assigns a process for execution on a specified processor. This construct is inappropriate for single processor work-stations and has no equivalent in Modula-3.

PRI PAR is used to specify the ordering of priority in a series of processes. On a transputer only two priority levels are supported. This construct cannot be implemented using Modula-3's threads.

Unsurprisingly these two constructs are not supported by the translator.

3.5.5 SEQ and loop constructs

The SEQ construct is just used to group occam processes together into one sequentially executed block and so does not need any explicit translation. As a SEQ is defined as a process by the code generator any variables declared immediately before it will exist only within the SEQ construct. Its translated component processes will be in a VAR BEGIN...END block if any variables were declared.

The replicated SEQ construct is equivalent to Modula-3's FOR statement. For example,

	SEQ i = 0 FOR n
	  p1
will cause the process p1 to be repeated n times. The value of i for the first replication is 0 and for the last replication is n - 1 for the above case.

The translation of a SEQ replicator is given in example 3 the appendix.

All translated variables have to be Modula-3 references (pointers) so they can be accessed by threads. The effect of this is that in example 3, the integer index of the Modula-3 FOR statement is given a temporary name (zzz_temp) that differs from the index of the occam SEQ replicator. This is because the index of a Modula-3 FOR statement cannot be a reference. In the next line, the value of zzz_temp is passed to the appropriate variable.

occam's WHILE loop semantically corresponds to Modula-3's WHILE loop and its translation is trivial. See example 2 in the appendix.

3.5.6 IF constructs and replicators

The syntax for the IF conditional in occam2 is:

conditional = IF
                {choice}
         |    IF replicator
                choice

choice = guarded.choice | conditional

guarded.choice = expression process

replicator = name = index FOR count

Thus in occam2, the IF construct can followed by further IFs, IF replicators as well as Boolean expressions and their associated process, which will be executed if they yield a true value. If an IF construct or replicator contains no conditions that are found to be true, the construct will behave like STOP.

For example,

IF x = 0 FOR 4
  array[x] = 0
    array[x] := 99
will check the first four elements of the array for a zero. The first element which it finds to be zero, will have its value changed to 99 and then the construct will terminate

In Modula-3 the IF statement is very simple. The IF statement is followed by Boolean expressions each one with an associated statement. The first expression that is found to be true will have its associated statement executed and then the construct will terminate. If no expressions are true, the IF construct will terminate. The optional "catch all" of ELSE is provided.

The translation of occam's nested IFs is simple. They simply flatten out into one IF. See the translation below.

Nested IFs

occam

INT apple:
SEQ
  apple := 100
  IF
    apple = 50
      PUTINT(apple)
    IF
      apple = 60
        PUTINT(apple)
      IF
        apple = 100
          PUTINT(apple)
Modula-3
  VAR
    apple := NEW(REF INTEGER) ;
  BEGIN
    apple^ := 100 ;
    BEGIN
      IF apple^ = 50 THEN
        Wr.PutText(Stdio.stdout, Fmt.Int(apple^) & "\n") ;
        Wr.Flush(Stdio.stdout) ;
      ELSIF apple^ = 60 THEN
        Wr.PutText(Stdio.stdout, Fmt.Int(apple^) & "\n") ;
        Wr.Flush(Stdio.stdout) ;
      ELSIF apple^ = 100 THEN
        Wr.PutText(Stdio.stdout, Fmt.Int(apple^) & "\n") ;
        Wr.Flush(Stdio.stdout) ;
      ELSE Stop() ;
      END ; (* END IF *)
    END ;
  END ; (* END PROCESS DECLARATION *)
Now we will consider the translation of an IF replicator nested inside an IF construct. As mentioned previously an IF evaluates its conditions in sequence (conditions being IF constructs, IF replicators or simple Boolean/associated statements), looking for the first one that is true. If none are true it will act like STOP.

In the translation of the occam code, nested IF replicators will have to be replaced by functions that return TRUE if one of their conditions is true. Otherwise they return FALSE. See the translation below.

An IF Replicator Nested Inside An IF

occam

INT apple:
SEQ
  apple := 50
  IF
    apple = 40
      PUTINT(40)
    IF index = 5 FOR 10
      apple = 40 + index
        PUTINT(apple) 
Modula-3
  VAR
    apple := NEW(REF INTEGER) ;
  BEGIN
    apple^ := 50 ;
    PROCEDURE if_replicator1() : BOOLEAN =
      VAR index := NEW(REF INTEGER) ;
      VAR zzz_temp : BOOLEAN := FALSE ;
      BEGIN
        FOR zzz_temp2 := 5 TO (5 + 10) - 1 DO
          index^ := zzz_temp2 ;
          IF apple^ = 40 + index^ THEN
            zzz_temp := TRUE ;
            Wr.PutText(Stdio.stdout, Fmt.Int(apple^) & "\n") ;
            Wr.Flush(Stdio.stdout) ;
            EXIT ;
          END ; (* END IF *)
        END ; (* END FOR *)
        RETURN zzz_temp ;
      END if_replicator1 ;
    BEGIN
      IF apple^ = 40 THEN
        Wr.PutText(Stdio.stdout, Fmt.Int(40) & "\n") ;
        Wr.Flush(Stdio.stdout) ;
      ELSIF if_replicator1() THEN Skip() ;
      ELSE Stop() ;
      END ; (* END IF *)
    END ;
  END ; (* END PROCESS DECLARATION *)
The case where a construct begins with an IF replicator and has a nested IF construct also has been considered. See the translation of IF hello = 16 FOR 10, which is in example 9, Appendix.

The code internal to the IF replicator, when translated is placed in a loop that will terminate either on natural completion of the loop or on the first condition being found true. If no conditions are found to be true, the Boolean zzz_temp will remain false and the translation of the occam code will act like STOP.

3.5.7 Simple and multiple assignments

To translate multiple assignments, Modula-3 code is first generated to assign the right hand side expressions to arrays of temporary variables of the appropriate type. Then the actual variables are assigned the values of temporary variables.

The list of variables to be assigned to is transversed and a count is made of each type. The numbers are then summed and a choice made between generating code for a simple assignment or for a multiple assignment. The latter being more verbose.

For multiple assignments, temporary one-dimensional arrays of variables are defined, with their dimension corresponding to the number of variables of their type in the variable list. The temporary variables are then assigned to the expressions according to type. Finally the actual variables are assigned to the temporary variables. Whilst the temporary variables are being assigned to the right hand side expressions a check is made that the number of variables assigned to is the same as the number of expressions. If not an error is generated.

See example 4 in the appendix for a Modula-3 translation of an occam multiple assignment.

3.5.8 Declarations and types

occam declaration lists immediately precede a process. For example,

	INT32 apple, pear:
	REAL64 frog, bat:
	CHAN OF INT32 message:
	PAR
	  process1
	  process2
The variables declared only exist for the scope of that process. When the code generator encounters a process, it passes the associated declaration list to a function. That function will output equivalent Modula-3 declarations and will return the number of variables found in the declaration list. If variables were declared then a Modula-3 BEGIN token is output. It will also add the variable names and type information to the head of a linked list (DVEC). DVEC is a list of all variables in scope.

On returning, the number of variables declared is copied to a local variable for safe keeping. Then the code generator will output the translation of the process. Once this translation is complete, the variables are removed from DVEC from the knowledge of how many where originally added at the start of the process. If a BEGIN token was initially output, then a matching END token will be output at this stage.

As mentioned in chapter 2, 'holes' in the scope of variables are supported, as when type information is required for a given variable name, DVEC is searched from the head downwards.

The translator will map occam's integer (INT, INT16, INT32), real (REAL32, REAL64) and Boolean (BOOL) types to appropriate Modula-3 ones. However occam's BYTE and TIMER types are not supported. BYTE simply holds positive integers less than 256. The type TIMER allows the creation of integer variables, whose value is incremented at regular intervals. The value of these 'clocks' are cyclic.

3.5.9 Operators, numbers and variable names

Most of the standard arithmetic, relational and Boolean operators are the same in both occam and Modula-3. However there are a few differences that are shown below and the translator will map one to the other.

occam Modula-3
<> #
REM or \ MOD
/ (floating-point) /
/ (integer) DIV

From the above, we can see occam uses one operator for both floating point and integer division, whereas Modula-3 uses different ones. To allow the correct translation, a flag is set according to whether an expression contains integer or floating point numbers or variables during code generation. A / or DIV is output accordingly.

occam and Modula-3 have slightly different formats for representing floating-point numbers. The two differences are: a) In occam the exponent (e or E) is optional, whereas for Modula-3 it is compulsory. b) Modula-3 has three different letters for its REAL, LONGREAL and EXTENDED floating-point types (e, d and x respectively). occam only uses one (e). As we translate all occam's floating-point types into Modula-3's LONGREAL types, things are slightly simplified.

To aid translation, occam floating-point numbers are stored as strings in the parse tree. A check is first made if the number contains an 'e' or 'E'. If it does not, the string is output followed by "d0". If the string does contain an 'e' or 'E' then it is substituted for a 'd' before output.

In contrast occam integers are stored as binaries in the parse tree and need no special consideration.

Variables names are unaltered by code generation. Variable names in occam are allowed to include the period e.g red.colour. However Modula-3 variable names are not and so such a variable name would cause the translated Modula-3 code to be rejected by a Modula-3 compiler. This deficiency could be corrected by a simple mapping, but due to time constraints was not.

3.5.10 Indentation

The translated Modula-3 code output is indented to help show its structure. This is achieved by having a global variable, which is set to the number of spaces that should be printed in front of every statement produced. Function calls increment and decrement this value in steps of two. A function call has to be made to output the correct number of spaces at the start of each new statement.

However the translator may well switch from writing from one file to another when generating the thread definitions for the component PAR processes. When this happens the global variable's value is saved into a local variable.


4

EVALUATION

4.1 PRODUCTION OF CORRECT MODULA-3 CODE

The translator was tested throughout its development with four different classes of occam2 programs, as outlined in the proposal:

The four classes of occam2 programs together with the generated Modula-3 output can be seen in the appendix. All Modula-3 output that I have seen is visibly correct and can be compiled into executable code. Where output is generated it is in accordance to correct functioning of the translator (see example 7 in particular).

The first example shows us the translation of the occam SEQ construct containing no processes. We can see the translator's 'preamble' and 'post-amble' code that has to surround all its output. SKIP and STOP are implemented as procedure calls. SKIP does nothing, whereas STOP never terminates, but as it does no work its implementation yields to another thread so that it does not unnecessarily consume processor time.

The second example shows us the virtual correspondence between occam and Modula-3's WHILE and IF constructs. Note however Modula-3's IF construct ends with:

	ELSE Stop() ;
	END ; (* END IF *)
although this is is absent from the occam code.

This is because an IF conditional in occam behaves like STOP if none of its choices can proceed. This is why the "catch all"

	TRUE
	  SKIP
is often found at the end of occam IF constructs.

A minor optimisation could be that if an occam IF contains a

	TRUE 
	  process
at any stage, then this is implemented instead of the final ELSE Stop() ; in Modula-3.

The third example shows us that occam's equivalent to the usual FOR loop is a SEQ replicator.

Example 4 shows us the translation of occam's multiple assignments. Here the values of a and b are swapped without the need to explicit temporary variables in occam. Single assignments are treated differently and are translated without the need for temporary variables. However the translation of assignments could be further optimised. If a multiple assignment does not alter any variable that is also present in an expression on the right, then assignments via temporary variables can be discarded. Inspection of the Modula-3 source code will show that translation of a multiple assignment reveals what the compiler thinks a variable's type should be (a^ := zzz_INTEGER[2] ;). This has been used to check for correct functioning of the translator's implementation of 'holes' e.g.

	INT orange, apple, peach:
	SEQ
  	  REAL64 apple:
	  -- a multiple assignment here will reveal the translator
	  -- now treats the variable apple as a real number
	  -- instead of an integer
Example 5 is our first program here that uses input/output. A common programming problem is to find the first element of an array that obeys some rule. occam's IF replicator handles this matter elegantly. The user inputs an integer and the program will search the array for the first occurrence of that integer and it will then output the position in the array where it was found. The first IF evaluates its conditions in sequence, looking for the first one that is true. In this case, its first condition is the IF replicator IF i = 0 FOR 5. If this finds the integer in the array it will be true and then the first IF will then terminate. If the IF replicator returns false, then the first IF will evaluate its second condition, which is TRUE and sets the value to be output as -1 signifying the integer has not been found. Without the TRUE branch the IF construct would act like STOP if the integer was not found.

Example 6 is out first example here using parallelism. As with example 8, we can see that the use of temporary files to unravel nested PAR constructs/replicators into top-level procedures is working correctly.

The occam source code for example 7 was taken from the occam Programming Manual[4] and demonstrates the use of an array of channels.

To calculate the square root using the Newton-Raphson approximation method, a process needs the value and an initial estimate at the answer - in our case we use half the value. The formula (estimate + (x/estimate))/2 is applied to this initial estimate to produce a better estimate. The same formula is reapplied several times to get a good final result.

Example 7 applies this formula 49 times and treats each application of the formula as a separate occam process. We can consider this as a pipeline of processes with data flowing from the input to the output via channels. The example took x = 12345678 as its value and gave the correct answer of 3.513642E3 as its output.

Diagram (pipeline of processes)

To set up a pipeline of processes in occam we use the PAR replicator:
	PAR i = 0 FOR 49
This means replicate the same process in parallel 49 times. Each one of the processes receives a different value of the index i. The first one receives i = 0 and the last one receives i = 48.

Each of the 49 processes in the pipeline has the following code as its body:

Diagram (pipeline code snipet)

For the calculation of a single square root, as in the code for example 7, this method is no faster than one that makes use of iteration. However the parallelism of this pipeline will allow several square roots to be different stages of calculation simultaneously.

Example 9 shows that despite complex nesting of IF constructs and replicators, correct Modula-3 code is still generated.

4.2 ERROR HANDLING

The translator immediately stops on encountering a lexical or syntax error, an illegal change in indentation or an undeclared variable with one of three possible error messages. Future work could include recovery to a consistent state after an error is encountered, so that a single run will find all the errors. There is also no semantic checking allowing type errors to be undetected.

It is bad practise for a variable to be altered in any component process of a parallel construct if it is used in any other component of the same construct. Within the same construct no channel may be used for input (or output) in more than one component process. Further improvements could check for these conditions and issue warnings accordingly.

4.3 occam CONSTRUCTS AND TYPES YET TO BE IMPLEMENTED

Due to time constraints records, the data types TIMER and BYTE, ALT constructs/replicators, procedures and functions cannot be translated. This does somewhat limit the number of pre-written occam programs that can be translated.

4.4 NATURAL SUPPORT FOR COMMUNICATION AND CONCURRENCY

Examination of examples 6 and 7 (Appendix A), which use parallelism, shows that the Modula-3 produced is much larger than the occam from which it is generated. The expansion is tenfold. Perhaps parallel constructs containing processes with longer sequential bodies would decrease this expansion. When solely sequential programs are translated the expansion is only twofold. The implementation team for Southampton's Portable Occam Compiler noted an expansion by a factor of 4-5, when occam2 was translated into C.

This highlights occam's natural and concise syntax for describing concurrency and communication in comparison to Modula-3's primitive support.


5

CONCLUSIONS

The primary aims of the project have been achieved. The translator supports three primitive data types (integers, real numbers and Booleans), scope rules and 'holes', the specification of concurrent processes and synchronous communication. Difficult cases of nested constructs and replicators are successfully translated into Modula-3, which can be further compiled to produce executable code. However the addition of the construct ALT, procedures and functions would make the system able to translate a much wider range of occam code. Error recovery would also be a useful addition.

As well as allowing portability of occam program from transputer to non-transputer systems, the availability of this translator has additional benefits. It is obvious from the translations in the appendix, that specifying concurrency is far easier and more concise in occam than Modula-3. The translation of sample occam into Modula-3 may help teach how Modula-3 implements concurrency and how to use shared variables to communicate between Modula-3 threads. Furthermore Modula-3 programmers may wish to use the translator as an easier way to write some of their concurrent code.


BIBLIOGRAPHY

  1. Kent Retargetable Occam Compiler (KROC) at http://www.hensa.ac.uk/parallel/occam/projects/occam-for-all/kroc/index.html
  2. Southampton Portable Occam Computer (SPOC) at ftp://www.ecs.soton.ac.uk/pub/ occam/spoc1.1
  3. Aho A.V., Sethi R. and Ullman J.D. (1986), Compilers: Principles, Techniques and Tools, Addison-Wesley.
  4. INMOS Limited (1984), occam Programming Manual, Prentice-Hall.
  5. Brookes G.R. and Stewart A.J. (1989), Introduction to occam2 on the Transputer, Macmillan.
  6. Jones G. and Goldsmith M. (1988), Programming in occam2, Prentice-Hall.
  7. Hinchey M.G. and Jarvis S.A. (1995), Concurrent Systems: Formal Development in CSP, McGraw-Hill.
  8. Bennett J.P. (1990), Introduction To Compiling Techniques: A First Course using ANSI C, Lex and Yacc, McGraw-Hill.
  9. Harbison S.P. (1992), Modula-3, Prentice-Hall.
  10. Levine J.R., Mason T. and Brown D. (1992), lex & yacc, O'Reilly and Associates, Inc.
  11. Barrett G. (1992): occam3: A Programming Manual (Draft), available by anonymous ftp from ftp://inmos.co.uk/inmos/languages/occam/


APPENDIX

Examples of translation of occam2 source code into Modula-3

Example 1 - The Null Program:

occam

SEQ

Modula-3

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

BEGIN

END Mod3fromOccam.

----------------------------------------------------------------

Example 2 - Expressions, WHILE and IF:

occam

INT32 white: -- Declarations of primitive types supported
REAL64 pink:
BOOL orange:
SEQ
  white := 1000(INT32)/2(INT32)
  white := (500*2)+(60-2)
  pink := 55.2(REAL64) / 44.5(REAL64)
  orange := (TRUE OR FALSE) AND TRUE
  WHILE (white - 1000) > 0
    SEQ
      white := white - 1 
      IF
        white = 1000
          SKIP
        pink = 200.0(REAL64)
          STOP
        orange = FALSE
          STOP
        TRUE
          SKIP
Modula-3
MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

BEGIN

  VAR
    white := NEW(REF INTEGER) ;
    pink := NEW(REF LONGREAL) ;
    orange := NEW(REF BOOLEAN) ;
  BEGIN
    white^ := 1000 DIV 2 ;
    white^ := (500 * 2) + (60 - 2) ;
    pink^ := 55.2d0 / 44.5d0 ;
    orange^ := (TRUE OR FALSE) AND TRUE ;
    WHILE (white^ - 1000) > 0 DO
      white^ := white^ - 1 ;
      BEGIN
        IF white^ = 1000 THEN
          Skip() ;
        ELSIF pink^ = 200.0d0 THEN
          Stop() ;
        ELSIF orange^ = FALSE THEN
          Stop() ;
        ELSIF TRUE THEN
          Skip() ;
        ELSE Stop() ;
        END ; (* END IF *)
      END ;
    END ; (* END WHILE *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

----------------------------------------------------------------

Example 3 - The SEQ Replicator (another loop):

occam

INT yellow:
SEQ
  yellow := 100
  SEQ index = yellow FOR 50
    SEQ
      yellow := yellow + (index-2)
      yellow := yellow/1

Modula-3

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

BEGIN

  VAR
    yellow := NEW(REF INTEGER) ;
  BEGIN
    yellow^ := 100 ;
    VAR index := NEW(REF INTEGER) ;
    BEGIN
      FOR zzz_temp := yellow^ TO (yellow^ + 50) - 1 DO
        index^ := zzz_temp ;
        yellow^ := yellow^ + (index^ - 2) ;
        yellow^ := yellow^ DIV 1 ;
      END ; (* END SEQ REPLICATOR *)
    END ; (* END INDEX DECLARATION *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

----------------------------------------------------------------

Example 4 - Multiple Assignments:

occam

INT a, b:
REAL64 c:
SEQ
  a, b, c := b, a, c + 10.0(REAL64)

Modula-3

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

BEGIN

  VAR
    b := NEW(REF INTEGER) ;
    a := NEW(REF INTEGER) ;
    c := NEW(REF LONGREAL) ;
  BEGIN
    VAR
      zzz_INTEGER : ARRAY [1..2] OF INTEGER ;

      zzz_LONGREAL : ARRAY [1..1] OF LONGREAL ;
    BEGIN (* BEGIN assignment *)
      zzz_LONGREAL[1] := c^ + 10.0d0 ;
      zzz_INTEGER[1] := a^ ;
      zzz_INTEGER[2] := b^ ;
      c^ := zzz_LONGREAL[1] ;
      b^ := zzz_INTEGER[1] ;
      a^ := zzz_INTEGER[2] ;
    END ; (* END assignment *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

----------------------------------------------------------------

Example 5 - Input/Output And A Nested IF Replicator:

occam

[5]INT items: -- array of integers
INT firstnonzero, userinput:
SEQ
  items[0] := 10
  items[1] := 11
  items[2] := 12
  items[3] := 13
  items[4] := 14
  GETINT(userinput)
  IF
    IF i = 0 FOR 5
      items[i] = userinput
        firstnonzero := i
    TRUE
      firstnonzero := -1
  PUTINT(firstnonzero)

Modula-3

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

BEGIN

  VAR
    items := NEW(REF ARRAY OF INTEGER, 5) ;
    userinput := NEW(REF INTEGER) ;
    firstnonzero := NEW(REF INTEGER) ;
  BEGIN
    items[0] := 10 ;
    items[1] := 11 ;
    items[2] := 12 ;
    items[3] := 13 ;
    items[4] := 14 ;    
    userinput^ := Scan.Int(Rd.GetLine(Stdio.stdin)) ;
    PROCEDURE if_replicator1() : BOOLEAN =
      VAR i := NEW(REF INTEGER) ;
      VAR zzz_temp : BOOLEAN := FALSE ;
      BEGIN
        FOR zzz_temp2 := 0 TO (0 + 5) - 1 DO
          i^ := zzz_temp2 ;
          IF items[i^] = userinput^ THEN
            zzz_temp := TRUE ;
            firstnonzero^ := i^ ;
            EXIT ;
          END ; (* END IF *)
        END ; (* END FOR *)
        RETURN zzz_temp ;
      END if_replicator1 ;
    BEGIN
      IF if_replicator1() THEN Skip() ;
      ELSIF TRUE THEN
        firstnonzero^ :=  - 1 ;
      ELSE Stop() ;
      END ; (* END IF *)
    END ;
    Wr.PutText(Stdio.stdout, Fmt.Int(firstnonzero^) & "\n") ;
    Wr.Flush(Stdio.stdout) ;
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

----------------------------------------------------------------

Example 6 - The PAR construct and replicator:

occam

INT32 apple, apricot:
SEQ
  apple := 100
  apricot := 200
  PAR i = 0 FOR 10
    PAR
      SEQ
        apple := apple * 1
        PUTINT(apple+i)
      SEQ
        apricot := apricot * 1
        PUTINT(apricot+i)

Modula-3

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

  Environment1 = REF RECORD
    apple : REF INTEGER ;
    apricot : REF INTEGER ;
    i : REF INTEGER ;
  END ;

  Thread_Prototype1 = Thread.Closure OBJECT
    vars : Environment1 ;
  END ;

  Environment2 = REF RECORD
    apple : REF INTEGER ;
    apricot : REF INTEGER ;
    i : REF INTEGER ;
  END ;

  Thread_Prototype2 = Thread.Closure OBJECT
    vars : Environment2 ;
  END ;

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

PROCEDURE pthread1_1 (self : Thread_Prototype1) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
    apricot : REF INTEGER := self.vars.apricot ;
    i : REF INTEGER := self.vars.i ;
  BEGIN
    VAR
      env2 := NEW(Environment2) ;
      thread2_2 : Thread.T ;
      thread2_1 : Thread.T ;
    BEGIN
      env2.apple := apple ;
      env2.apricot := apricot ;
      env2.i := i ;
      thread2_2 := Thread.Fork(NEW(Thread_Prototype2, vars := env2, apply := pthread2_2)) ;
      thread2_1 := Thread.Fork(NEW(Thread_Prototype2, vars := env2, apply := pthread2_1)) ;
      EVAL Thread.Join(thread2_2) ; 
      EVAL Thread.Join(thread2_1) ; 
      env2 := NIL ;
    END ; (* END PAR *)
    RETURN NIL ;
  END pthread1_1 ;

PROCEDURE pthread2_1 (self : Thread_Prototype2) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
    apricot : REF INTEGER := self.vars.apricot ;
    i : REF INTEGER := self.vars.i ;
  BEGIN
    apricot^ := apricot^ * 1 ;
    Wr.PutText(Stdio.stdout, Fmt.Int(apricot^ + i^) & "\n") ;
    Wr.Flush(Stdio.stdout) ;
    RETURN NIL ;
  END pthread2_1 ;

PROCEDURE pthread2_2 (self : Thread_Prototype2) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
    apricot : REF INTEGER := self.vars.apricot ;
    i : REF INTEGER := self.vars.i ;
  BEGIN
    apple^ := apple^ * 1 ;
    Wr.PutText(Stdio.stdout, Fmt.Int(apple^ + i^) & "\n") ;
    Wr.Flush(Stdio.stdout) ;
    RETURN NIL ;
  END pthread2_2 ;

BEGIN

  VAR
    apricot := NEW(REF INTEGER) ;
    apple := NEW(REF INTEGER) ;
  BEGIN
    apple^ := 100 ;
    apricot^ := 200 ;
    VAR
      env1 := NEW(REF ARRAY OF Environment1, 10) ;
      thread_array := NEW(REF ARRAY OF Thread.T, 10) ;
      value_array := NEW(REF ARRAY OF REF INTEGER, 10) ;
      zzz_temp : INTEGER ;
    BEGIN
      FOR i := 0 TO (0) + (10) - 1 DO

        zzz_temp := i - (0) ;
        env1[zzz_temp] := NEW(Environment1) ;
        env1[zzz_temp].apple := apple ;
        env1[zzz_temp].apricot := apricot ;
        value_array[zzz_temp] := NEW(REF INTEGER) ;
        value_array[zzz_temp]^ := i ;
      END ;
      FOR f := 0 TO (10) - 1 DO
        env1[f].i := value_array[f] ;
        thread_array[f] := Thread.Fork(NEW(Thread_Prototype1, vars := env1[f], apply := pthread1_1)) ;
      END ;
      FOR g := 0 TO (10) - 1 DO
        EVAL Thread.Join(thread_array[g]) ;
      END ;
      env1 := NIL ;
      thread_array := NIL ;
      value_array := NIL ;
    END ; (* END PAR REPLICATOR *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

Output:

203
103
202
102
200
100
206
106
205
105
201
101
109
208
108
207
107
104
209
204

----------------------------------------------------------------

Example 7 - Channels:

occam

[50]CHAN OF REAL32 channel :
PAR
  PAR i = 0 FOR 49
    REAL32 x, estimate:
    SEQ
      channel[i] ? x
      channel[i] ? estimate
      channel[i+1] ! x
      channel[i+1] ! (estimate+(x/estimate))/2
  REAL32 x:
  SEQ
    x := 12345678.0(REAL32)  -- value to be square rooted
    channel[0] ! x
    channel[0] ! x/2     -- form initial estimate
  REAL32 dummy, root:
  SEQ
    channel[49] ? dummy
    channel[49] ? root   -- receive final estimate
    PUTREAL(root)

Modula-3

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

  channel_type1 = Thread.Mutex OBJECT
    notFull, notEmpty : REF ARRAY OF Thread.Condition ;
    channel_status : REF ARRAY OF REF BOOLEAN ;
    channel_value : REF ARRAY OF REF LONGREAL ;
  END ;

  Environment1 = REF RECORD
    channel : channel_type1 ;
  END ;

  Thread_Prototype1 = Thread.Closure OBJECT
    vars : Environment1 ;
  END ;

  Environment2 = REF RECORD
    channel : channel_type1 ;
    i : REF INTEGER ;
  END ;

  Thread_Prototype2 = Thread.Closure OBJECT
    vars : Environment2 ;
  END ;

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

PROCEDURE pthread1_1 (self : Thread_Prototype1) : REFANY =
  VAR
    channel : channel_type1 := self.vars.channel ;
  BEGIN
    VAR
      root := NEW(REF LONGREAL) ;
      dummy := NEW(REF LONGREAL) ;
    BEGIN    
      LOCK channel DO
        WHILE channel.channel_status[49]^ = FALSE DO
          Thread.Wait(channel, channel.notEmpty[49]) ;
        END ;
        dummy^ := channel.channel_value[49]^ ;
        channel.channel_status[49]^ :=  FALSE ;
        Thread.Signal(channel.notFull[49]) ;
      END ;
      LOCK channel DO
        WHILE channel.channel_status[49]^ = FALSE DO
          Thread.Wait(channel, channel.notEmpty[49]) ;
        END ;
        root^ := channel.channel_value[49]^ ;
        channel.channel_status[49]^ := FALSE ; 
        Thread.Signal(channel.notFull[49]) ;
      END ;
      Wr.PutText(Stdio.stdout, Fmt.LongReal(root^) & "\n") ;
      Wr.Flush(Stdio.stdout) ;
    END ; (* END PROCESS DECLARATION *)
    RETURN NIL ;
  END pthread1_1 ;

PROCEDURE pthread1_2 (self : Thread_Prototype1) : REFANY =
  VAR
    channel : channel_type1 := self.vars.channel ;
  BEGIN
    VAR
      x := NEW(REF LONGREAL) ;
    BEGIN
      x^ := 12345678.0d0 ;
      LOCK channel DO
        WHILE channel.channel_status[0]^ = TRUE DO
          Thread.Wait(channel, channel.notFull[0]) ;
        END ;
        channel.channel_value[0]^ := x^ ;
        channel.channel_status[0]^ := TRUE ;
        Thread.Signal(channel.notEmpty[0]) ;
      END ;
      LOCK channel DO
        WHILE channel.channel_status[0]^ = TRUE DO
          Thread.Wait(channel, channel.notFull[0]) ;
        END ;
        channel.channel_value[0]^ := x^/2.0d0 ;
        channel.channel_status[0]^ := TRUE ;
        Thread.Signal(channel.notEmpty[0]) ;
      END ;
    END ; (* END PROCESS DECLARATION *)
    RETURN NIL ;
  END pthread1_2 ;

PROCEDURE pthread1_3 (self : Thread_Prototype1) : REFANY =
  VAR
    channel : channel_type1 := self.vars.channel ;
  BEGIN
    VAR
      env2 := NEW(REF ARRAY OF Environment2, 49) ;
      thread_array := NEW(REF ARRAY OF Thread.T, 49) ;
      value_array := NEW(REF ARRAY OF REF INTEGER, 49) ;
      zzz_temp : INTEGER ;
    BEGIN
      FOR i := 0 TO (0) + (49) - 1 DO
        zzz_temp := i - (0) ;
        env2[zzz_temp] := NEW(Environment2) ;
        env2[zzz_temp].channel := channel ;
        value_array[zzz_temp] := NEW(REF INTEGER) ;
        value_array[zzz_temp]^ := i ;
      END ;
      FOR f := 0 TO (49) - 1 DO
        env2[f].i := value_array[f] ;
        thread_array[f] := Thread.Fork(NEW(Thread_Prototype2, vars := env2[f], apply := pthread2_1)) ;
      END ;
      FOR g := 0 TO (49) - 1 DO
        EVAL Thread.Join(thread_array[g]) ;
      END ;
      env2 := NIL ;
      thread_array := NIL ;
      value_array := NIL ;
    END ; (* END PAR REPLICATOR *)
    RETURN NIL ;
  END pthread1_3 ;

PROCEDURE pthread2_1 (self : Thread_Prototype2) : REFANY =
  VAR
    channel : channel_type1 := self.vars.channel ;
    i : REF INTEGER := self.vars.i ;
  BEGIN
    VAR
      estimate := NEW(REF LONGREAL) ;
      x := NEW(REF LONGREAL) ;
    BEGIN
      LOCK channel DO
        WHILE channel.channel_status[i^]^ = FALSE DO
          Thread.Wait(channel, channel.notEmpty[i^]) ;
        END ;
        x^ := channel.channel_value[i^]^ ;
        channel.channel_status[i^]^ := FALSE ;
        Thread.Signal(channel.notFull[i^]) ;
      END ;
      LOCK channel DO
        WHILE channel.channel_status[i^]^ = FALSE DO
          Thread.Wait(channel, channel.notEmpty[i^]) ;
        END ;
        estimate^ := channel.channel_value[i^]^ ;
        channel.channel_status[i^]^ := FALSE ;
        Thread.Signal(channel.notFull[i^]) ;
      END ;
      LOCK channel DO     
        WHILE channel.channel_status[i^ + 1]^ = TRUE DO
          Thread.Wait(channel, channel.notFull[i^ + 1]) ;
        END ;
        channel.channel_value[i^ + 1]^ := x^ ;
        channel.channel_status[i^ + 1]^ := TRUE ;
        Thread.Signal(channel.notEmpty[i^ + 1]) ;
      END ;    
      LOCK channel DO
        WHILE channel.channel_status[i^ + 1]^ = TRUE DO
          Thread.Wait(channel, channel.notFull[i^ + 1]) ;
        END ;
        channel.channel_value[i^ + 1]^ := (estimate^+(x^/estimate^))/2.0d0 ;
        channel.channel_status[i^ + 1]^ := TRUE ;
        Thread.Signal(channel.notEmpty[i^ + 1]) ;     
      END ;
    END ; (* END PROCESS DECLARATION *)
    RETURN NIL ;
  END pthread2_1 ;

BEGIN

  VAR
    channel := NEW(channel_type1) ;
  BEGIN
    channel.channel_status := NEW(REF ARRAY OF REF BOOLEAN, 50) ;
    channel.channel_value := NEW(REF ARRAY OF REF LONGREAL, 50) ;
    channel.notFull := NEW(REF ARRAY OF Thread.Condition, 50) ;
    channel.notEmpty := NEW(REF ARRAY OF Thread.Condition, 50) ;
    FOR zzz_temp := 0 TO (50) - 1 DO
      channel.channel_status[zzz_temp] := NEW(REF BOOLEAN) ;
      channel.channel_value[zzz_temp] := NEW(REF LONGREAL) ;
      channel.channel_status[zzz_temp]^ := FALSE ;
      channel.notFull[zzz_temp] := NEW(Thread.Condition) ;
      channel.notEmpty[zzz_temp] := NEW(Thread.Condition) ;
    END ;
    VAR
      env1 := NEW(Environment1) ;
      thread1_3 : Thread.T ;
      thread1_2 : Thread.T ;
      thread1_1 : Thread.T ;
    BEGIN
      env1.channel := channel ;
      thread1_3 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_3)) ;
      thread1_2 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_2)) ;
      thread1_1 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_1)) ;
      EVAL Thread.Join(thread1_3) ; 
      EVAL Thread.Join(thread1_2) ; 
      EVAL Thread.Join(thread1_1) ; 
      env1 := NIL ;
    END ; (* END PAR *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

Output:

3.513642D3

----------------------------------------------------------------

Example 8 - Why we need an arbitrary number of temporary files to unravel nested PAR constructs:

occam

INT apple:
PAR
  SEQ
    apple := 56
    PAR
      apple := apple * 2
      PUTINT(apple)
    PUTINT(apple*7)

Modula-3 translation using only one temporary File (incorrect)

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

  Environment1 = REF RECORD
    apple : REF INTEGER ;
  END ;

  Thread_Prototype1 = Thread.Closure OBJECT
    vars : Environment1 ;
  END ;

  Environment2 = REF RECORD
    apple : REF INTEGER ;
  END ;

  Thread_Prototype2 = Thread.Closure OBJECT
    vars : Environment2 ;
  END ;

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

PROCEDURE pthread1_1 (self : Thread_Prototype1) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
  BEGIN
    apple^ := 56 ;
    VAR
      env2 := NEW(Environment2) ;
      thread2_2 : Thread.T ;
      thread2_1 : Thread.T ;
    BEGIN
      env2.apple := apple ;
      thread2_2 := Thread.Fork(NEW(Thread_Prototype2, vars := env2, apply := pthread2_2)) ;
      thread2_1 := Thread.Fork(NEW(Thread_Prototype2, vars := env2, apply := pthread2_1)) ;
      EVAL Thread.Join(thread2_2) ;
      EVAL Thread.Join(thread2_1) ;
      env2 := NIL ;
    END ; (* END PAR *)
    RETURN NIL ;
  END pthread1_1 ;

PROCEDURE pthread2_1 (self : Thread_Prototype2) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
  BEGIN
    Wr.PutText(Stdio.stdout, Fmt.Int(apple^) & "\n") ;
    Wr.Flush(Stdio.stdout) ;
    RETURN NIL ;
  END pthread2_1 ;

PROCEDURE pthread2_2 (self : Thread_Prototype2) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
  BEGIN
    apple^ := apple^ * 2 ;Cross
    Wr.PutText(Stdio.stdout, Fmt.Int(apple^ * 7) & "\n") ; 
    Wr.Flush(Stdio.stdout) ;
    RETURN NIL ;
  END pthread2_2 ;

BEGIN

  VAR
    apple := NEW(REF INTEGER) ;
  BEGIN
    VAR
      env1 := NEW(Environment1) ;
      thread1_1 : Thread.T ;
    BEGIN
      env1.apple := apple ;
      thread1_1 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_1)) ;
      EVAL Thread.Join(thread1_1) ;
      env1 := NIL ;
    END ; (* END PAR *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

Modula-3 translation using an arbitrary number of temporary files (correct)

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

  Environment1 = REF RECORD
    apple : REF INTEGER ;
  END ;

  Thread_Prototype1 = Thread.Closure OBJECT
    vars : Environment1 ;
  END ;

  Environment2 = REF RECORD
    apple : REF INTEGER ;
  END ;

  Thread_Prototype2 = Thread.Closure OBJECT
    vars : Environment2 ;
  END ;

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

PROCEDURE pthread1_1 (self : Thread_Prototype1) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
  BEGIN
    apple^ := 56 ;
    VAR
      env2 := NEW(Environment2) ;
      thread2_2 : Thread.T ;
      thread2_1 : Thread.T ;
    BEGIN
      env2.apple := apple ;
      thread2_2 := Thread.Fork(NEW(Thread_Prototype2, vars := env2, apply := pthread2_2)) ;
      thread2_1 := Thread.Fork(NEW(Thread_Prototype2, vars := env2, apply := pthread2_1)) ;
      EVAL Thread.Join(thread2_2) ;
      EVAL Thread.Join(thread2_1) ;
      env2 := NIL ;
    END ; (* END PAR *) Tick
    Wr.PutText(Stdio.stdout, Fmt.Int(apple^ * 7) & "\n") ;
    Wr.Flush(Stdio.stdout) ;
    RETURN NIL ;
  END pthread1_1 ;

PROCEDURE pthread2_1 (self : Thread_Prototype2) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
  BEGIN
    Wr.PutText(Stdio.stdout, Fmt.Int(apple^) & "\n") ;
    Wr.Flush(Stdio.stdout) ;
    RETURN NIL ;
  END pthread2_1 ;

PROCEDURE pthread2_2 (self : Thread_Prototype2) : REFANY =
  VAR
    apple : REF INTEGER := self.vars.apple ;
  BEGIN
    apple^ := apple^ * 2 ;
    RETURN NIL ;
  END pthread2_2 ;

BEGIN

  VAR
    apple := NEW(REF INTEGER) ;
  BEGIN
    VAR
      env1 := NEW(Environment1) ;
      thread1_1 : Thread.T ;
    BEGIN
      env1.apple := apple ;
      thread1_1 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_1)) ;
      EVAL Thread.Join(thread1_1) ;
      env1 := NIL ;
    END ; (* END PAR *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.
----------------------------------------------------------------

Example 9 - A difficult case of nesting for IF constructs/replicators:

occam

INT cake, bat:
SEQ
  cake := 22
  bat := 44
  IF
    IF
      IF i = 0 FOR 3
        cake = 33
          PUTINT(99)
    IF
      bat = 100
        bat := 900
      IF index = 10 FOR 12
        IF
          bat = 99
            PUTINT(100)
          bat = 45
            PUTINT(995)
          cake = 22
            PUTINT(cake)
  IF hello = 16 FOR 10 -- **** IF REPLICATOR ****
    IF
      cake = 23 + 1
        PUTINT(9953420)
      cake = 9999
        PUTINT(-9953420)
      TRUE
        PUTINT(99)

Modula-3

  VAR
    bat := NEW(REF INTEGER) ;
    cake := NEW(REF INTEGER) ;
  BEGIN
    cake^ := 22 ;
    bat^ := 44 ;
    PROCEDURE if_replicator1() : BOOLEAN =
      VAR index := NEW(REF INTEGER) ;
      VAR zzz_temp : BOOLEAN ;
      BEGIN
        FOR zzz_temp2 := 10 TO (10 + 12) - 1 DO
          zzz_temp := TRUE ;
          index^ := zzz_temp2 ;
          IF bat^ = 99 THEN
            Wr.PutText(Stdio.stdout, Fmt.Int(100) & "\n") ;
            Wr.Flush(Stdio.stdout) ;
          ELSIF bat^ = 45 THEN
            Wr.PutText(Stdio.stdout, Fmt.Int(995) & "\n") ;
            Wr.Flush(Stdio.stdout) ;
          ELSIF cake^ = 22 THEN
            Wr.PutText(Stdio.stdout, Fmt.Int(cake^) & "\n") ;
            Wr.Flush(Stdio.stdout) ;
          ELSE zzz_temp := FALSE ;
          END ;
          IF zzz_temp THEN EXIT ; END ;
        END ;
        RETURN zzz_temp ;
      END if_replicator1 ;
    PROCEDURE if_replicator2() : BOOLEAN =
      VAR i := NEW(REF INTEGER) ;
      VAR zzz_temp : BOOLEAN := FALSE ;
      BEGIN
        FOR zzz_temp2 := 0 TO (0 + 3) - 1 DO
          i^ := zzz_temp2 ;
          IF cake^ = 33 THEN
            zzz_temp := TRUE ;
            Wr.PutText(Stdio.stdout, Fmt.Int(99) & "\n") ;
            Wr.Flush(Stdio.stdout) ;
            EXIT ;
          END ; (* END IF *)
        END ; (* END FOR *)
        RETURN zzz_temp ;
      END if_replicator2 ;
    BEGIN
      IF if_replicator2() THEN Skip() ;
      ELSIF bat^ = 100 THEN
        bat^ := 900 ;
      ELSIF if_replicator1() THEN Skip() ;
      ELSE Stop() ;
      END ; (* END IF *)
    END ;
    VAR hello := NEW(REF INTEGER) ;
    VAR zzz_temp : BOOLEAN ;
    BEGIN
      FOR zzz_temp2 := 16 TO (16 + 10) - 1 DO 
        zzz_temp := TRUE ;
        hello^ := zzz_temp2 ;
        IF cake^ = 23 + 1 THEN
          Wr.PutText(Stdio.stdout, Fmt.Int(9953420) & "\n") ;
          Wr.Flush(Stdio.stdout) ;
        ELSIF cake^ = 9999 THEN
          Wr.PutText(Stdio.stdout, Fmt.Int( - 9953420) & "\n") ;
          Wr.Flush(Stdio.stdout) ;
        ELSIF TRUE THEN
          Wr.PutText(Stdio.stdout, Fmt.Int(99) & "\n") ;
          Wr.Flush(Stdio.stdout) ;
        ELSE zzz_temp := FALSE ;
        END ;
        IF zzz_temp THEN EXIT ; END ;
      END ; (* END FOR *)
      IF zzz_temp = FALSE THEN Stop() ; END ;
    END ; (* END zzz_temp DECLARATION *)
  END ; (* END PROCESS DECLARATION *)

Translation of IF hello = 16 FOR 10 starts here.

Output:

22
99
----------------------------------------------------------------

Example 10 - How occam's channels map into Modula-3:

occam

INT x, y, z:
CHAN OF INT chan1, chan2:
PAR
  SEQ
    x := 5
    chan1 ! x
  SEQ
    chan1 ? y
    chan2 ! y
  SEQ
    chan2 ? z
    PUTINT(z*z)

Modula-3

MODULE Mod3fromOccam EXPORTS Main ;

IMPORT Thread, Wr, Stdio, Fmt, Rd, Scan ;

TYPE

  chan1_type1 = Thread.Mutex OBJECT
    notFull, notEmpty : Thread.Condition ;
    channel_status : REF BOOLEAN ;
    channel_value : REF INTEGER ;
  END ;

  chan2_type2 = Thread.Mutex OBJECT
    notFull, notEmpty : Thread.Condition ;
    channel_status : REF BOOLEAN ;
    channel_value : REF INTEGER ;
  END ;

  Environment1 = REF RECORD
    x : REF INTEGER ;
    y : REF INTEGER ;
    z : REF INTEGER ;
    chan1 : chan1_type1 ;
    chan2 : chan2_type2 ;
  END ;

  Thread_Prototype1 = Thread.Closure OBJECT
    vars : Environment1 ;
  END ;

PROCEDURE Skip() =
  BEGIN
  END Skip ;

PROCEDURE Stop() =
  BEGIN
    LOOP Thread.Yield() ; END ;
  END Stop ;

PROCEDURE pthread1_1 (self : Thread_Prototype1) : REFANY =
  VAR
    x : REF INTEGER := self.vars.x ;
    y : REF INTEGER := self.vars.y ;
    z : REF INTEGER := self.vars.z ;
    chan1 : chan1_type1 := self.vars.chan1 ;
    chan2 : chan2_type2 := self.vars.chan2 ;
  BEGIN
    LOCK chan2 DO
      WHILE chan2.channel_status^ = FALSE DO
        Thread.Wait(chan2, chan2.notEmpty) ;
      END ;
      z^ := chan2.channel_value^ ;
      chan2.channel_status^ := FALSE ;
      Thread.Signal(chan2.notFull) ;
    END ; 
    Wr.PutText(Stdio.stdout, Fmt.Int(z^ * z^) & "\n") ;
    Wr.Flush(Stdio.stdout) ;
    RETURN NIL ;
  END pthread1_1 ;

PROCEDURE pthread1_2 (self : Thread_Prototype1) : REFANY =
  VAR
    x : REF INTEGER := self.vars.x ;
    y : REF INTEGER := self.vars.y ;
    z : REF INTEGER := self.vars.z ;
    chan1 : chan1_type1 := self.vars.chan1 ;
    chan2 : chan2_type2 := self.vars.chan2 ;
  BEGIN
    LOCK chan1 DO
      WHILE chan1.channel_status^ = FALSE DO
        Thread.Wait(chan1, chan1.notEmpty) ;
      END ;
      y^ := chan1.channel_value^ ;
      chan1.channel_status^ := FALSE ;
      Thread.Signal(chan1.notFull) ;
    END ;
    LOCK chan2 DO
      WHILE chan2.channel_status^ = TRUE DO
        Thread.Wait(chan2, chan2.notFull) ;
      END ;
      chan2.channel_value^ := y^ ;
      chan2.channel_status^ := TRUE ;
      Thread.Signal(chan2.notEmpty) ;
    END ;
    RETURN NIL ;
  END pthread1_2 ;

PROCEDURE pthread1_3 (self : Thread_Prototype1) : REFANY =
  VAR
    x : REF INTEGER := self.vars.x ;
    y : REF INTEGER := self.vars.y ;
    z : REF INTEGER := self.vars.z ;
    chan1 : chan1_type1 := self.vars.chan1 ;
    chan2 : chan2_type2 := self.vars.chan2 ;
  BEGIN
    x^ := 5 ;
    LOCK chan1 DO
      WHILE chan1.channel_status^ = TRUE DO
        Thread.Wait(chan1, chan1.notFull) ;
      END ;
      chan1.channel_value^ := x^ ;
      chan1.channel_status^ := TRUE ;
      Thread.Signal(chan1.notEmpty) ;
    END ;
    RETURN NIL ;
  END pthread1_3 ;

BEGIN

  VAR
    z := NEW(REF INTEGER) ;
    y := NEW(REF INTEGER) ;
    x := NEW(REF INTEGER) ;
    chan1 := NEW(chan1_type1) ;
    chan2 := NEW(chan2_type2) ;
  BEGIN
    chan1.notFull := NEW(Thread.Condition) ;
    chan1.notEmpty := NEW(Thread.Condition) ;
    chan1.channel_status := NEW(REF BOOLEAN) ;
    chan1.channel_value := NEW(REF INTEGER) ;
    chan1.channel_status^ := FALSE ;
    chan2.notFull := NEW(Thread.Condition) ;
    chan2.notEmpty := NEW(Thread.Condition) ;
    chan2.channel_status := NEW(REF BOOLEAN) ;
    chan2.channel_value := NEW(REF INTEGER) ;
    chan2.channel_status^ := FALSE ;
    VAR
      env1 := NEW(Environment1) ;
      thread1_3 : Thread.T ;
      thread1_2 : Thread.T ;
      thread1_1 : Thread.T ;
    BEGIN
      env1.x := x ;
      env1.y := y ;
      env1.z := z ;
      env1.chan1 := chan1 ;
      env1.chan2 := chan2 ;
      thread1_3 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_3)) ;
      thread1_2 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_2)) ;
      thread1_1 := Thread.Fork(NEW(Thread_Prototype1, vars := env1, apply := pthread1_1)) ;
      EVAL Thread.Join(thread1_3) ; 
      EVAL Thread.Join(thread1_2) ; 
      EVAL Thread.Join(thread1_1) ; 
      env1 := NIL ;
    END ; (* END PAR *)
  END ; (* END PROCESS DECLARATION *)

END Mod3fromOccam.

Diploma In Computer Science
Project Proposal

An occam To Modula-3 Translator







Martin Mamo
Magdalene College


Project Supervisor:
Stephen Hand

Project Originator:
Michael Hinchey

Overseers:
Dr Paulson and Dr Sparck Jones

Introduction

The need for increased computer power led to the development of the transputer by INMOS. Each transputer is capable of supporting truly parallel computing by means of four communication links for point-to-point inter-transputer connections. The programming language occam was closely associated with the transputer's development and supports concurrency. Occam is used for parallel algorithms that exploit a multiprocessor environment as well as for applications dealing with concurrent objects in an environment. Occam is not an assembly language, although it does completely provide for hardware control, negating the need to use assembly language on transputers.

I wish to write a program to translate from occam to Modula-3. The generation of Modula-3 code will allow occam programs to be run on any platform supporting a Modula-3 compiler, facilitating portability between transputer and non-transputer systems.

Outline Of Work

C is a good choice of language to write the translator in, as the tools lex and yacc are provided, which respectively handle lexical and syntactic analysis. Lexical analysis consists of reading characters from the source code and recognising the tokens they represent. Syntactic analysis consists of recognising the syntactic structure of a stream of tokens.

I will write a compiler to take occam source code and to turn in into Modula-3. This will be done in a number of stages. I will assume that the input is valid. The first two stages (the front-end) will be lexical and syntactic analysis, which will generate a parse tree. At most there will be limited type and semantic checking.

The back-end will be the generation of Modula-3 code from the tree. I will first be concerned with the Modula-3 code generation for a null occam program, so that the appropriate prologue and epilogue can be dealt with. Then expressions will be handled. The next stage of development will deal with statements for a sequential only program. Input/output will be then be added. Lastly parallel constructs will be handled.

As occam is not a complex language, I hope the translator will deal with most of it. The new facilities added by occam2 may not be dealt with.

Occam's processes will be emulated by Modula-3's threads. It should be noted that occam processes do not share common memory, whereas threads do. Occam's processes communicate by use of channels and these too will be provided for. It is hoped that at least a good proportion of occam can be successfully emulated by use of threads.

A series of occam source programs will be used to test the translator through it's development. The first class of source program, with no input/output included, will simply be a null program. The second class will be sequential and only contain conditional statements, one type of loop statement and expressions. The third class will include input/output. The fourth class of test program will handle parallelism. Each of these classes is a superset of the preceding one.

An occam2 to ANSI C translator (binary file only) is freely available from the University of Southampton. Testing can be carried out by executing compiled Modula-3 output from my translator and comparing sample program results with those produced by executing compiled C from Southampton's translator.

Formal correctness is not at this point anticipated to be dealt with.

Special Resources

This project does not require any special resources.

languages:
ANSI C available on cus will be used to write the translator and will facilitate use of lex and yacc Modula-3 to run the translator's output

Starting Point

I have no previous knowledge of writing translators or the languages occam2 and ANSI C. My knowledge of Modula-3 comes from this term's work. Modula-3 supports threads, which will simplify matters. The courses on compilers and C are not until next term.

Timetable

5 Dec - 12 Jan:
Read about:

Trial use of lex and yacc. Obtain simple, pre-written occam programs.

15 Jan - 15 Feb:
Parser with a reduced subset of occam (if, loops etc).
Generate parse tree.
Equate occam and Modula-3 types (INT, BYTE, REAL etc) and operators.

15 Feb - 20 March:
Development of parser.
Modula-3 code generator.
Emulate calls for standard input/output.
Test with simple occam programs.
Write progress report (20 March).

20 March - 20 April:
Complete above and further testing.
Addition of limited type and syntax checking if time.

May - June :
Exams

June - July:
Dissertation





<---Go back