

		      The P4 Parallel Processing Library

p4 is a package of procedures that support development of parallel programs
that are portable across a wide variety of architectures.  For the
shared-memory model of parallel computation, p4 provides a set of primitives
from which monitors can be constructed, as well as a set of useful monitors.
For the distributed-memory model, p4 provides typed send and receive
operations, and creation of processes according to a text file describing
files to be executed by created processes and the machines to execute them
on.  These two models can be combined to form the "cluster" model.

Authorship:  This system owes much to many people (see History, below) but the
current version was written by Ralph Butler of the University of North Florida
and Argonne National Laboratory, and Rusty Lusk at Argonne National Laboratory.

History: The system is directly decended from the "Argonne macros", or
"monmacs", or "parmacs", the system built at Argonne by Rusty Lusk and Ross
Overbeek using m4 macros, and documented in the book "Portable Programs for
Parallel Processors", by Boyle, Butler, Disz, Glickfeld, Lusk, Overbeek,
Patterson, and Stevens, all of whom contributed to that version of the system.
Major contributions were also made by Bob Olson, now at the University of
Illinois, and Remy Evard, now at the University of Oregon.  The book is
published by Holt, Rinehart, and Winston (ISBN 0-03-014404-3).  p4 does not
change any of the basic concepts presented there, but is a completely new
implementation.  Major differences are:

  The m4 macro processor is not part of the system; it is now constructed 
    entirely with C macros and functions.
  Processes on various machines are specified in a text file that is read at
    run time, rather than in macros that are compiled into the executable
    file.
  More machines are supported, notably the BBN Butterflies, the Intel
    IPSC/860, and the IBM RS/6000.
  Machine dependent code has been isolated in two files, making the system
    easier to port.


The files and subdirectories in this directory are:

  README     - this file
  alog       - the logging package for debugging/analyzing p4 pgms
  disclaimer - the "no warranty" statement
  doc	     - a draft user manual, man page, and other documentation
  lib	     - the p4 system C source code
  lib_f	     - Fortran interface routines
  makefile   - the top-level makefile for installing p4
  messages   - distributed-memory examples
  messages_f - distributed-memory examples in Fortran
  monitors   - shared-memory examples
  contrib    - miscellaneous other examples
  contrib_f  - miscellaneous fortran examples
  util	     - installation files, especially defs.all and makefile.proto
  bin	     - a few shell scripts and executable version of server daemon(s)

See the manual in the doc subdirectory for full details on how to install
and use p4.  Following is a brief description of how to do the
installation.


To build p4, position yourself in this directory and type:

   make all P4ARCH=<machine>

where <machine> is one of the machine names listed in util/machines.

For example:

  make all P4ARCH=SYMMETRY


The 'all' is optional, for example

  make P4ARCH=IPSC860_HOST


To save disk space, various intermediate files can be removed with

  make clean


The system can be restored to its original, machine-independent state with

  make realclean

Note that this removes the machine-dependent Makefiles in each directory, so
the operation is not idempotent.

It is also possible to install (or clean) only some of the directories:

  make all P4ARCH=SUN DIRS=messages

  make clean DIRS='monitors messages'


To install only the Makefiles in all subdirectories, use:

  make makefiles P4ARCH=<machine>

You can have p4 build a makefile in your own directory by copying one of
the makefile.proto files from an example directory (e.g. messages or
monitors).  Then edit the proto file to include information about your
own programs and the location of the p4 library (LIBDIR).
Finally, you build the Makefile by getting into the 
top-level p4 directory and typing:

    make makefiles P4ARCH=<machine> DIRS=<your_full_pathname>

To add a new machine, or to change the characteristic parameters associated
with an existing one, edit the file util/defs.all

To add a new example program in one of the application directories, edit the
file makefile.proto in the appropriate directory.

p4 currently traps the signals SIGINT and SIGUSR1.  The SIGINT interrupt is
trapped in order to do some cleanup especially on System V machines where
IPC facilities must be returned to the OS.  If the user decides to trap
this interrupt for his own cleanup, he should invoke p4_error as part of
any termination processing.  p4 uses the SIGUSR1 interrupt internally by a
listener process for socket communications.  The only time in which p4
would not need this interrupt would be in those instances in which no
socket communications are to be performed.



/******************  OLD STUFF  *****************************************/


Status of version 0.2 (2/1/1992):
  
  As in version 0.1 (see below), processes on the Butterfly may not be 
    created in the cluster you are expecting.  In particular, processes
    which are started via remote shell from another machine may start in the
    public cluster.  We will explore this further as time permits.
  
  Also on the butterfly, we have implemented quite a bit of code for the 
    TCMP package planned for that machine, but do not yet have it stable.

  Shared memory operations in Fortran are not supported on the Sequents.
    This implies that you cannot have a procgroup file entry such as 
    "local 2" for a Fortran program on the Sequent, because such an entry
    would imply that messages would be passed thru shared memory.

  We do not currently support System V IPC on any machine although we are
    exploring it.  The major reason that we have not supported it is that
    in some instances the operating system may not reclaim IPC resources if a
    program abnormally terminates, and the user must explicitly do it.  We 
    are planning to make p4 handle such cases when possible.
  
  The alog/upshot packages are available as graphical debugging tools.  In
    the near future, we hope to instrument the internals of p4 with logging
    calls for these packages.
  
  The new machine type IBM_3090_AIX does not support Fortran at this point.
    We have encountered one or two "standard" Fortran functions such as getarg
    that are not available on the machine which are using.  We will explore
    this problem more as time permits.

  The Fortran on the NeXT has the same problem as the IBM_3090_AIX.

  The Fortran on the Titan is not yet working correctly.

  If your machine is running an old socket implementation (e.g. 4.2),
    then you may experience very slow communications via sockets.
    Also, in the current release, we are only using the socket option
    TCP_NODELAY on BSD systems to boost performance.  We have found
    that the setsockopt call is not available on some SYSV machines 
    that we have ported to; we plan to fix this for those SYSV
    environments that do support it in the future.  If you wish to 
    turn that option on, the relevant code is protected by ifdef P4BSD
    lines in p4_sock_util.h and can easily be changed.
  
  On an SGI machine, the make facility will only work correctly if you
    set your environment variable SHELL to the Bourne shell, since our 
    makefiles contain Bourne shell code at this point.



Status of version 0.1 (2/1/1991):

  The C version is stable.  No significant changes are planned.

  The examples all work, but could stand some cleaning up.  Subsequent
    releases may have different examples.  The grid example in the messages
    directory has a compile-time configuration of 4 processes.

  The p4_logging routines have not been tested.  They are in the process of
    being replaced by the atrace package.

  The Fortran version is included, but it is not guaranteed to compile and
    link on any of the given machines without local modifications.  It is
    really a place-holder for a new version to come real soon now, which may
    have different procedure names and parameters but the same idea.

  The tutorial is in very rough draft form.  It is expected to be extended in
    subsequent releases.

  It is not possible to create node processes on the IPSC860 from machines
    other than the IPSC host machine, nor is it possible to create node
    processes excuting different files.  The only model that works on this
    machine has a master on the srm and the same executable file loaded on all
    the nodes.  We hope to change this restriction soon.

  The clock on the ATT_3B2 is coarse-grained.

  The socket-pair system routine on the 3B2 has a bug in it which causes an
    error message to be reported at the end of a run.  It should not effect
    the job as it is running.

  The Butterflies are treated as shared-memory machines, with primitive memory
    management.  The next release should have BBN native memory management 
    with support for interleaved memory, and an optimized message-passing
    system written by Ken Sedgewick.

  On the butterfly be alert to the fact that processes may not be created in
    the cluster you were expecting.  Automatic cluster management is being
    explored.

  On the IPSC860, mixing p4 message passing with Intel message passing, even
    the system messages that are associated with global data operations, may
    lead to unexpected (and bad) results.  We are working on this now.
