From jim@meiko.co.uk Mon Oct 31 09:30:20 1994

I believe that one of Tony's students is looking at this. (Possibly
also some people at Southampton). I'd like to do it myself, but I'm in
the middle of some work on TotalView...

Dropping directly to a single tport (i.e. inlining the csend, etc) is
probably actually rather easy. For instance the meat of csend looks
like this, after you've worked out that it isn't really a broadcast !

#pragma weak csend = _csend
int             _csend (int mtype, void *mbuf, int len, int node, int pid)
{
   /* Elided tests for broadcast stuff ... */

   {   /* Non broadcast case */
       TRACE ("csend start", node);
       TRACE ("csend start", mtype);
       TRACE ("csend start", len);

       ew_tportTxWait (ew_tportTxStart (mpsc_state.tport, 0,
					MPSC2VP (node), mpsc_state.tport,
					mtype, mbuf, len));
       TRACE ("csend done", mtype);
   }
   
   return (0);
}

Similarly the isend and msgwait are equally trivial. Of course you'd
still need all of the code MPI datatype code to handle non-contiguous.

I think the main gains would come from 
1) using the existing group and in particular collective ops in
   preference to bulding them from point to point.
2) going over to the one tport/context scheme which I outlined at the meeting.

-- Jim 


------------------------------------------------------------------------------

The current view of the ADI underlayer based on channels has an easy match to 
tports (I believe).  A different tport for each context might require some
small changes, since there would now be one channel for each context.  




