ICS 632: List of Projects (Fall 2008)

 Things to Note

  1. Most of the projects have an "open-ended" flavor to them. The idea is to approach them like small research projects in which you define your own methodology and (some of) your objectives. What I am looking for is "mature" approaches expected from graduate students. In particular, your first task should be to come up with a precise formulation of the project.
  2. Some projects may be harder than others, and expectations will be tailored to the projects.
  3. At most 2 students can pick the same project.
  4. Group projects involving 2 students are possible, but expectations will be higher for the end result and I refuse to get involved in "I did everything and my partner did nothing because he/she was out surfing all the time" arguments.
  5. Students are strongly encouraged to define their own projects, but these projects have to be approved by me beforehand and approval will be contingent on the project being sufficiently involved.
  6. Looking at previous work, papers, and results in the literature is encouraged. Downloading actual code that does the project is OK for comparison with your own implementation, but using that code instead of your own implementation constitutes grounds for getting a zero.
  7. For the projects that require that you write a parallel application, it is understood that you should write a sequential version (if need be) and that you compute speedups with respect to the sequential version. It is also understood that you will perform in-depth performance measurements. It is your responsibility to come up with interesting things to say in your report! One way to do this is to come up with multiple versions of your implementation so that you can study what worked and what didn't in terms of performance. Producing only one implementation doesn't really give you anything interesting to talk about. Performance analysis/modeling is a plus.
IMPORTANT: You should discuss progress with me, and not hesitate to ask me questions and for directions!
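
As a reminder of what item 7 asks you to report, here is a minimal sketch of the usual speedup and efficiency calculations. The function names and the timings are hypothetical placeholders, not measurements from our cluster.

```python
def speedup(t_seq, t_par):
    """Speedup of a parallel run relative to the sequential baseline."""
    return t_seq / t_par


def efficiency(t_seq, t_par, num_procs):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup(t_seq, t_par) / num_procs


# Hypothetical wall-clock times in seconds (made up for illustration).
t_sequential = 120.0
parallel_times = {2: 64.0, 4: 36.0, 8: 24.0}

for procs, t_par in sorted(parallel_times.items()):
    s = speedup(t_sequential, t_par)
    e = efficiency(t_sequential, t_par, procs)
    print(f"p={procs}: speedup={s:.2f}, efficiency={e:.2f}")
```

Note that the baseline is the sequential program, not the parallel program run on one processor; the two can differ noticeably.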

 What to Turn in

You must turn in your code and a report (PDF) of around 10 pages, and be prepared to present your project to the class and answer questions (about 15 to 20 minutes).

Projects are due on 12/8, with presentations starting around 12/1.

 List of Possible Projects


 Project #1: Parallel Sorting

Sorting a list of numbers is a key operation in many applications, and it is well known that it is difficult to parallelize efficiently. For instance, because the amount of data is large compared to the amount of computation, the cost of I/O may be overwhelming. In this project you will consider the following problem:
Problem Statement: You have a binary file in your home area that contains a list of N random integers. You must write another binary file in your home area that contains the sorted list. Your goal is to do this on our cluster as fast as possible. This is a difficult problem because sorting is not very compute intensive, and so the result may be disappointing. The point is to see whether some speedup can indeed be achieved and to see what matters for performance.
You will develop several parallel sorting algorithms (we saw one in class; you can probably come up with a few of your own, or research existing algorithms), and most likely several versions of each algorithm. For your performance evaluation, focus on measuring I/O and computing costs. You will probably need to spend some time thinking about how the data is initially given to the processors. This could be done by a script before the call to mpirun in your PBS script, or in the MPI code itself. Note that each node has its own local disk, to which I/O is of course faster than to the NFS-mounted home areas. Comparing different ways of doing the data distribution is most likely in order. And of course you should vary the value of N in your experiments. Report performance both excluding file I/O and including file I/O.
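
One existing algorithm you may run into when researching is sample sort. The sketch below simulates its three phases sequentially in plain Python, with no MPI and no file I/O; it is not necessarily the algorithm we saw in class. The name sample_sort and the parameters num_procs, oversample, and seed are made up for illustration.

```python
import random
from bisect import bisect_right


def sample_sort(values, num_procs, oversample=8, seed=0):
    """Sequentially simulate sample sort over num_procs hypothetical processors."""
    if num_procs < 2 or len(values) < num_procs:
        return sorted(values)
    rng = random.Random(seed)
    # Phase 1: draw a random sample and pick num_procs - 1 splitters from it.
    sample_size = min(len(values), oversample * num_procs)
    sample = sorted(rng.sample(values, sample_size))
    step = len(sample) // num_procs
    splitters = [sample[i * step] for i in range(1, num_procs)]
    # Phase 2: route every value to the bucket (processor) owning its range.
    buckets = [[] for _ in range(num_procs)]
    for v in values:
        buckets[bisect_right(splitters, v)].append(v)
    # Phase 3: each processor sorts its bucket; buckets are already ordered
    # relative to each other, so concatenating them gives the sorted list.
    result = []
    for bucket in buckets:
        bucket.sort()
        result.extend(bucket)
    return result


data = [random.Random(1).randrange(1000) for _ in range(10)]
print(sample_sort(data, num_procs=4))
```

In an MPI version, phase 2 would become an all-to-all exchange between processors, and the bucket sizes (which depend on how good the splitters are) determine the load balance, so this is one place where several versions are worth comparing.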


 Project #2: Smith-Waterman Algorithm

The Smith-Waterman algorithm is a dynamic programming algorithm used to compute local alignments of biological sequences, e.g., between full genomes of prokaryotic organisms (such as bacteria). Research this algorithm (it's famous and information on it is readily available) and write a fast parallel implementation (using existing or random DNA sequences of arbitrary lengths). The input sequences are stored in files in the user's home area and the resulting alignment should be stored to an output file. Come up with reasonable ways of distributing the data. You can also find parallel implementations of the SW algorithm and compare them with your own.
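
As a starting point for your sequential version, here is a minimal sketch of the Smith-Waterman scoring recurrence. It assumes a linear gap penalty and simple match/mismatch scores (the default values are arbitrary), and it only returns the best score; recovering the alignment itself requires keeping the full table and tracing back. The function name is hypothetical.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Return the best local alignment score of a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are clamped at 0 so an alignment can start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("GTTAC", "GTTGAC"))
```

Each cell depends only on its left, upper, and upper-left neighbors, so all cells on the same anti-diagonal are independent of each other. That dependency pattern is what most parallelization strategies for this algorithm build on.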

 Project #3: Matrix Multiplication

In this project you'll try to implement the fastest possible matrix multiplication on our cluster. You should implement the outer-product algorithm using both a 2-D non-cyclic distribution and a 2-D cyclic distribution (you can assume nicely divisible matrix dimensions in this project). Your code should be self-checking (for instance using the same scheme as used in HW #3). Perform an in-depth performance analysis of your implementations and converge towards the fastest implementation, explaining the steps you followed and the behaviors you observed. Your programs should all work on a rectangular grid of processors (e.g., 2x4). One of your goals should be to find out how fast you can multiply a matrix of total size ~8GB.
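
To make the two distributions concrete, the sketch below maps a global matrix element (i, j) to the coordinates of the processor that owns it on a rectangular grid. It assumes nicely divisible dimensions and shows the simplest element-wise cyclic case; a block-cyclic distribution generalizes it by cycling over blocks of rows and columns instead of single ones. All function names are hypothetical.

```python
def block_owner(i, n, p):
    """Owner of global index i when n indices are split into p contiguous blocks."""
    return i // (n // p)


def cyclic_owner(i, p):
    """Owner of global index i under a cyclic (round-robin) distribution."""
    return i % p


def owner_2d(i, j, n, grid_rows, grid_cols, cyclic):
    """Grid coordinates of the processor owning element (i, j) of an n x n matrix."""
    if cyclic:
        return cyclic_owner(i, grid_rows), cyclic_owner(j, grid_cols)
    return block_owner(i, n, grid_rows), block_owner(j, n, grid_cols)


# Print the ownership map of an 8 x 8 matrix on a 2 x 4 processor grid.
n, grid_rows, grid_cols = 8, 2, 4
for cyclic in (False, True):
    print("cyclic" if cyclic else "non-cyclic")
    for i in range(n):
        owners = [owner_2d(i, j, n, grid_rows, grid_cols, cyclic) for j in range(n)]
        print(" ".join(f"{r}{c}" for r, c in owners))
```

Printing the map for a small matrix is a cheap way to check your index arithmetic before running on the cluster.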

 Project #4: N-body Simulation

An N-body simulation is one that studies how bodies (represented by a mass, a location, and a velocity) move in space (in our case a 2-D space), according to the laws of (Newtonian) physics. Here is an implementation of the N-body problem in Matlab (runnable using the free implementation Octave on any good Linux system):
nbod.m and mkbod.m (courtesy of Howard Motteler at UMBC). Implement an MPI version of this sequential program (your MPI code should read a file that specifies the problem's parameters). Describe your data distribution strategies. You'll probably need to implement adaptive load-balancing. All these should be part of an in-depth performance analysis of subsequent versions of the code. Design a way in which your program will output the results in a verifiable and viewable format (best would probably be a sequence of bitmap images that can be animated as an animated gif, but don't spend all your time doing this if you have no idea how to do it!).
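
For orientation, here is a minimal sketch of one time step of a 2-D all-pairs N-body simulation. It is not a translation of nbod.m: the softening term eps, the gravitational constant g = 1, and the semi-implicit Euler integrator are all assumptions made for this illustration, and the function names are hypothetical.

```python
import math


def accelerations(pos, mass, g=1.0, eps=1e-3):
    """All-pairs gravitational accelerations in 2-D (O(N^2) work)."""
    n = len(pos)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            # Softened squared distance avoids division by zero.
            r2 = dx * dx + dy * dy + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # Equal and opposite forces on bodies i and j.
            acc[i][0] += g * mass[j] * dx * inv_r3
            acc[i][1] += g * mass[j] * dy * inv_r3
            acc[j][0] -= g * mass[i] * dx * inv_r3
            acc[j][1] -= g * mass[i] * dy * inv_r3
    return acc


def step(pos, vel, mass, dt):
    """Advance positions and velocities in place by one time step dt."""
    acc = accelerations(pos, mass)
    for i in range(len(pos)):
        vel[i][0] += dt * acc[i][0]
        vel[i][1] += dt * acc[i][1]
        pos[i][0] += dt * vel[i][0]
        pos[i][1] += dt * vel[i][1]


pos = [[0.0, 0.0], [1.0, 0.0]]
vel = [[0.0, 0.0], [0.0, 0.0]]
mass = [1.0, 2.0]
for _ in range(10):
    step(pos, vel, mass, dt=0.01)
print(pos)
```

The double loop over pairs is where the work is, so how you split the bodies (and the pairs) among processors is the main data distribution question. Checking that total momentum stays constant is a simple sanity test for a parallel version.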

 Project #5: Intersecting Lines

Consider a 2-D rectangular area and a number n of random line segments of some maximum length l in this area. Write the fastest possible MPI program that, for large n, returns the list of all intersections, each as a triplet (segment1, segment2, intersection point). Do the usual performance analysis, describing what your different optimizations were, what worked, and what didn't.
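
A brute-force sequential baseline for this problem can be sketched as follows. It tests every pair of segments, so it does O(n^2) work, and it treats parallel and collinear segments as non-intersecting, which is a simplifying assumption you may want to revisit. The function names and the tolerance tol are hypothetical.

```python
from itertools import combinations


def intersect(s1, s2, tol=1e-12):
    """Return the intersection point of two segments, or None.

    Parallel and collinear segments are reported as non-intersecting.
    """
    (x1, y1), (x2, y2) = s1
    (x3, y3), (x4, y4) = s2
    rx, ry = x2 - x1, y2 - y1  # direction of segment 1
    sx, sy = x4 - x3, y4 - y3  # direction of segment 2
    denom = rx * sy - ry * sx  # 2-D cross product of the directions
    if abs(denom) < tol:
        return None
    # Solve p1 + t * r = p3 + u * s for the parameters t and u.
    t = ((x3 - x1) * sy - (y3 - y1) * sx) / denom
    u = ((x3 - x1) * ry - (y3 - y1) * rx) / denom
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (x1 + t * rx, y1 + t * ry)
    return None


def all_intersections(segments):
    """Return (index1, index2, point) for every intersecting pair of segments."""
    found = []
    for (i, a), (j, b) in combinations(enumerate(segments), 2):
        point = intersect(a, b)
        if point is not None:
            found.append((i, j, point))
    return found


segments = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((5, 5), (6, 5))]
print(all_intersections(segments))
```

Because segments have a maximum length l, two segments that are far apart cannot intersect. Exploiting that locality, for example by partitioning the area among processors, is one natural direction for your optimizations.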

 Project #6: DAG Scheduling

In this project you will verify the notion that DAG-scheduling heuristics that base their scheduling decisions on the critical path are indeed more effective than the standard MaxMin, MinMin, and Sufferage heuristics. This will be done entirely in simulation (i.e., by constructing Gantt charts, etc.). You will have to define a DAG generator that generates random DAGs with given characteristics, and perhaps synthetic DAGs with idiosyncratic properties to highlight the behavior of the different algorithms. The end result will be a set of graphs plotting the performance of relevant scheduling algorithms versus important DAG characteristics and perhaps the number of available processors. An important component of this project is defining the experimental framework. Trying your own heuristics, or trying some available in the literature, is obviously a great idea.
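
Critical-path heuristics typically prioritize tasks by their bottom level, i.e., the length of the longest path from a task to an exit task of the DAG. The sketch below computes bottom levels for a tiny hand-made DAG. It assumes tasks have fixed execution times and ignores communication costs; the DAG representation (a weights dictionary plus a successors dictionary) and the function name are hypothetical choices made for this illustration.

```python
from functools import lru_cache


def bottom_levels(weights, succs):
    """Bottom level of each task: longest weighted path from it to an exit task."""

    @lru_cache(maxsize=None)
    def bl(task):
        children = succs.get(task, ())
        return weights[task] + max((bl(c) for c in children), default=0)

    return {task: bl(task) for task in weights}


# A small diamond-shaped DAG: A precedes B and C, which both precede D.
weights = {"A": 2, "B": 3, "C": 1, "D": 4}
succs = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}

levels = bottom_levels(weights, succs)
critical_path_length = max(levels.values())
priority_order = sorted(weights, key=lambda t: -levels[t])
print(levels, critical_path_length, priority_order)
```

The critical path length is also a lower bound on the makespan of any schedule, which makes it a useful reference line in your graphs. For large random DAGs, an iterative traversal in reverse topological order avoids Python's recursion limit.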


henric@hawaii.edu