ICS 632, Fall 2008: Homework #2

 What to turn in

Send a .tar.gz file to henric@hawaii.edu that contains all your source code and your report. The top-level directory should be called ics632_<lastname>_hw2. In that directory there should be sub-directories called exercise_1, exercise_2, etc. Each such directory should itself have sub-directories called question_1, question_2, etc. These directories contain your code with Makefiles and reports whenever applicable.

 Exercise #1: Your own MPI Broadcast

Question #1: A Simple broadcast

MPI provides a broadcast function (MPI_Bcast()), which we described in class. Implement an MPI program that takes one integer command-line argument, let's call it N. Each MPI process allocates an array of N integers. The process with rank 0 (i.e., the "master") fills its array with random elements and broadcasts it to all other processes by calling a function with the following prototype:
void MPI_MyBcast(int *buffer, int count, int root, MPI_Comm comm);
You can see that this function is essentially a simplified version of the original MPI_Bcast(), which handles multiple data types, returns various error codes, etc. It can be implemented easily as a sequence of send and receive operations.

Each process that has received the array must send it back to the master so that the master can check the data (making sure that it received the same data it sent out in the first place).
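The master's check can be a plain element-by-element comparison. The following is a minimal sketch, not a required interface: the function name first_mismatch and its parameters are hypothetical and do not come from the assignment. It reports where the echoed array first differs from the array that was sent, which is more useful when debugging than a yes/no answer.

```c
#include <assert.h>

/* Hypothetical helper (name and signature are assumptions, not part of
 * the assignment). Returns the index of the first element at which
 * echoed[] differs from sent[], or -1 if all `count` elements match. */
int first_mismatch(const int *sent, const int *echoed, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (sent[i] != echoed[i]) {
            return i;   /* first position where the data differs */
        }
    }
    return -1;          /* arrays are identical over `count` elements */
}
```

The master would call this once per worker, after receiving that worker's copy of the array, and print the worker's rank and the returned index whenever the result is not -1.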

The program then does the same thing calling the original MPI_Bcast() function.

Question #2: Performance Comparison

Modify your code from Question #1 so that MPI_Bcast() and MPI_MyBcast() are each called 5,000 times (10,000 broadcasts in total). The program also times, separately, how long the 5,000 broadcasts take with your function and with the one provided by MPI.

Run your code using 7 nodes, for the following values of N: 1000, 50000, 100000, 200000.

In a README file, discuss the results and the performance difference between your code and the MPICH implementation of MPI_Bcast().

Question #3 [EXTRA CREDIT]: A Better Broadcast

Modify your implementation of the MPI_MyBcast() function so that it uses a broadcast tree, which allows several sends to happen in parallel. Report on performance improvements (or lack thereof) for the same experiment as in Question #2. Note that the number 7 is conveniently chosen: 7 nodes form a complete binary tree (1 + 2 + 4), so it's easy to have a binary broadcast tree.
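One possible way to lay out such a tree is sketched below; it is only an illustration, and other layouts are equally valid. The function names tree_parent and tree_children are hypothetical, and the heap-style numbering (node r has children 2r+1 and 2r+2) is an assumption, not something the assignment prescribes. Ranks are first renumbered relative to the root so that any rank can serve as the root. The sketch contains no MPI calls; it only computes which ranks a process would receive from and send to.

```c
#include <assert.h>

/* Hypothetical helpers (names and tree layout are assumptions).
 * Ranks are shifted so that `root` becomes node 0 of a heap-ordered
 * binary tree: node r has children 2r+1 and 2r+2. */

/* Returns the rank that `rank` receives from, or -1 if `rank` is the root. */
int tree_parent(int rank, int root, int size)
{
    assert(size > 0 && root >= 0 && root < size);
    int rel = (rank - root + size) % size;   /* position relative to root */
    if (rel == 0) {
        return -1;                           /* the root has no parent */
    }
    return ((rel - 1) / 2 + root) % size;    /* map back to a real rank */
}

/* Stores the ranks that `rank` forwards to in children[] and returns
 * how many there are (0, 1, or 2). */
int tree_children(int rank, int root, int size, int children[2])
{
    assert(size > 0 && root >= 0 && root < size);
    int rel = (rank - root + size) % size;
    int n = 0;
    for (int k = 1; k <= 2; k++) {
        int child = 2 * rel + k;             /* heap-style child index */
        if (child < size) {
            children[n++] = (child + root) % size;
        }
    }
    return n;
}
```

With 7 processes and root 0, this gives rank 0 the children 1 and 2, rank 1 the children 3 and 4, rank 2 the children 5 and 6, and leaves ranks 3 through 6 as leaves. In a tree-based broadcast, each non-root process would first receive from tree_parent() and then send to each rank returned by tree_children().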

 Exercise #2: The Return of the Mystery Function

Question #1: Parallel Code

Implement a Master/Worker version of the mystery function minimization from homework #1, with 100 trials and a step size of 1 (or whatever step size makes the code run in a reasonable amount of time!). Your code should have a master process that gives out work to the workers dynamically, i.e., only when a worker process is idle. Your code should print out the load imbalance as defined in homework #1 and the best local minimum found.