•Optimise the serial code so that it makes the best possible use of the shared memory features within a node. Then run MPI with one processor per node.
•Run MPI with as many processors as required. From the mynode assignments, create new communicators as follows: group all processors on a given node into one communicator, then create a collective communicator containing one processor from each node.
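
The grouping in the second option can be sketched as follows. This is a minimal illustration in plain Python, not actual MPI code: the helper `split` only mimics the grouping behaviour of `MPI_Comm_split` (ranks with the same colour end up in the same communicator, ordered by world rank). The names `split`, `build_communicators`, and the sample `mynode` list are hypothetical, the value `-1` is merely a stand-in for MPI's `MPI_UNDEFINED` constant, and choosing the lowest rank on each node as its representative is an assumption, since the text only asks for one processor per node.

```python
from collections import defaultdict

# Stand-in for MPI_UNDEFINED (the real value is implementation-defined).
# Assumes node ids are non-negative so they never collide with it.
MPI_UNDEFINED = -1


def split(colors):
    """Mimic the grouping done by MPI_Comm_split.

    colors[rank] is the colour chosen by that world rank. Ranks with the
    same colour share a communicator, listed in world-rank order. Ranks
    coloured MPI_UNDEFINED are left out of every communicator.
    """
    groups = defaultdict(list)
    for rank, color in enumerate(colors):
        if color != MPI_UNDEFINED:
            groups[color].append(rank)
    return dict(groups)


def build_communicators(mynode):
    """mynode[rank] is the id of the node that world rank runs on."""
    # Step 1: one communicator per node, using the node id as the colour.
    node_comms = split(mynode)

    # Step 2: pick one representative per node (assumption: the lowest
    # world rank on that node) and group the representatives together.
    leaders = {members[0] for members in node_comms.values()}
    leader_colors = [
        0 if rank in leaders else MPI_UNDEFINED
        for rank in range(len(mynode))
    ]
    leader_comm = split(leader_colors).get(0, [])
    return node_comms, leader_comm


# Hypothetical assignment of 8 ranks to 3 nodes.
mynode = [0, 0, 1, 1, 2, 2, 0, 1]
node_comms, leader_comm = build_communicators(mynode)
print(node_comms)   # {0: [0, 1, 6], 1: [2, 3, 7], 2: [4, 5]}
print(leader_comm)  # [0, 2, 4]
```

In a real program each rank would pass its own mynode value as the colour to `MPI_Comm_split` to obtain the per-node communicator, and the representatives would perform a second split to obtain the collective communicator across nodes.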