
Comments on commingling non-isomorphic arrays in vectorized wrappers
of functions accepting rank 0 arguments.  5/7/2007


Consider these commands executed at the ISIS interactive command line:

arr1 = [3,30,300]
arr2 = [4, 40, 400, 4, 40, 400]
arr2 = _reshape(arr2, [2,3])

print(hypot(arr1, arr2[0,*]))
5.000000e+00  
5.000000e+01  
5.000000e+02

and then compare with a vectorized version generated by SLIRP:

import("phypot")
print(hypot(arr1, arr2[0,*]))
5.000000e+00  
5.000000e+01  
5.000000e+02

This works fine ... but this

print(hypot(arr1, arr2))
5.000000e+00  4.011234e+01  4.000112e+02
5.000000e+00  4.011234e+01  4.000112e+02

while correct under the current SLIRP implementation, will surprise
users.

It happens because

	arg2 must have a stride of 1 -- even though its actual rank
	is 2 -- because its expected rank is 0
and
	arg1 must have a stride of 0, because it's non-isomorphic to arg2,
	so every iteration pairs arr1[0] with the next element of arr2
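
The stride behavior described above can be sketched in C.  This is a
minimal illustration, not SLIRP's generated wrapper code: the names
vec_fast and hypot_sq are assumptions, and hypot_sq (which returns
x*x + y*y) is only a stand-in for hypot() so that the sketch needs no
math library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a rank-0 function such as hypot():
   returns x*x + y*y, i.e. the square of the hypotenuse. */
double hypot_sq(double x, double y)
{
    return x * x + y * y;
}

/* Assumed shape of a stride-based vectorization loop: each argument
   pointer advances by its own fixed stride, and nothing else is
   tested inside the loop. */
void vec_fast(const double *arg1, size_t stride1,
              const double *arg2, size_t stride2,
              double *out, size_t n)
{
    size_t i;

    assert(arg1 != NULL && arg2 != NULL && out != NULL);

    for (i = 0; i < n; i++) {
        out[i] = hypot_sq(*arg1, *arg2);
        arg1 += stride1;   /* stride 0: keeps re-reading arg1[0] */
        arg2 += stride2;   /* stride 1: walks the flattened array */
    }
}
```

Called with arr1 at stride 0 and the six elements of arr2 at stride 1,
every output is computed from arr1[0] = 3.  The square roots of those
outputs are the values in the two surprising rows printed above.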

It is not possible to "do the natural thing" here, which loosely is

		apply hypot to arr1 and arr2[0,*], then
		apply hypot to arr1 and arr2[1,*]

without slowing down vectorization in the general case.  To wit, it
would require the following:

   assign both arguments strides of 1, so that the first 3 iterations
   through the vectorization loop execute as

	double *arg1, *arg2;

	hypot(arg1[0], arg2[0])
	hypot(arg1[1], arg2[1])
	hypot(arg1[2], arg2[2])

   then the next three execute as

	hypot(arg1[0], arg2[3])
	hypot(arg1[1], arg2[4])
	hypot(arg1[2], arg2[5])

   This can only be done by explicitly checking whether the loop index
   is a multiple of the length of arg1 (here 3), and THIS MUST BE DONE
   AFTER EVERY ITERATION OF THE VECTORIZATION LOOP.  When the test
   passes, the arg1 pointer is reset to its original value (pointing
   at arg1[0]).  This makes vectorization loops more expensive to
   traverse, as the number of conditionals and modulus operations (or
   integer divisions, at least) scales linearly with the number of
   iterations.
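
The wrap-around approach just described might look like the following
in C.  This is a sketch under assumptions, not code SLIRP emits: the
names vec_flexible and hypot_sq are hypothetical, and hypot_sq (which
returns x*x + y*y) again stands in for hypot() so that no math library
is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a rank-0 function such as hypot():
   returns x*x + y*y, i.e. the square of the hypotenuse. */
double hypot_sq(double x, double y)
{
    return x * x + y * y;
}

/* Assumed shape of the "flexible" loop: both arguments advance with
   stride 1, and the shorter one (len1 elements) is rewound whenever
   the iteration count reaches a multiple of its length. */
void vec_flexible(const double *arg1, size_t len1,
                  const double *arg2, double *out, size_t n)
{
    const double *p1 = arg1;
    size_t i;

    assert(arg1 != NULL && arg2 != NULL && out != NULL && len1 > 0);

    for (i = 0; i < n; i++) {
        out[i] = hypot_sq(*p1, arg2[i]);
        p1++;

        /* The extra cost: one modulus and one branch per iteration. */
        if ((i + 1) % len1 == 0)
            p1 = arg1;   /* reset to the original value */
    }
}
```

With arr1 of length 3 and the six elements of arr2, the outputs repeat
row by row; their square roots are 5, 50, 500 for each row, which is
the "natural" result.  The price is the test inside the loop body,
which runs on every iteration whether or not a reset is needed.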

If this ever becomes a real concern to users, perhaps it would make
sense to offer two options:

  -fast     (the default, and current implementation)
  -flexible (the approach described in the previous paragraph)

to allow for the greatest flexibility in handling non-isomorphic args.
