## Overview

The final project for COSC 273: Parallel and Distributed Computing gives you an opportunity to synthesize all of the skills you’ve developed throughout the semester into two ridiculously fast programs. Your programs will exploit parallelism to perform fundamental tasks in computer science, utilizing the full power of the college’s HPC cluster. As you refine your programs, you will document the changes you make as well as their effects on performance. You may work in groups of up to 3 for your final project.

## The Options

For the final project, you will submit two programs chosen from the options below.

#### Option 1: Computing Primes

A prime number is a natural number $$p > 1$$ whose only divisors are $$1$$ and itself. The sequence of prime numbers begins $$2, 3, 5, 7, 11, \ldots$$. Prime numbers arise in many contexts in computer science, such as cryptography and error-correcting codes. Thus, computing prime numbers is a fundamental computational problem. For this project, you will write a program that produces an array containing every prime int in increasing order. Given an initialized int[] primes of size $$105,097,565$$ (the number of primes up to the maximum int value in Java), your program should fill primes with consecutive primes as quickly as possible.
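To make the task concrete, here is a minimal sequential sketch of the sieve of Eratosthenes, one standard way to list primes in increasing order. It is not the provided baseline; the class name `PrimeSieveSketch` and the method `primesUpTo` are hypothetical, and the sketch is only meant for small bounds. Sieving all the way to the maximum int value this way would need a flag array larger than Java allows, so a full solution would presumably need something like a segmented or bit-packed sieve.

```java
import java.util.Arrays;

class PrimeSieveSketch {
    // Returns every prime <= limit in increasing order.
    // Sequential sieve of Eratosthenes; intended for small limits only.
    static int[] primesUpTo(int limit) {
        if (limit < 2) {
            return new int[0];
        }
        boolean[] composite = new boolean[limit + 1];
        int count = 0;
        for (int i = 2; i <= limit; i++) {
            if (!composite[i]) {
                count++;
                // Start crossing off at i * i; use long to avoid int overflow.
                for (long j = (long) i * i; j <= limit; j += i) {
                    composite[(int) j] = true;
                }
            }
        }
        // Second pass: copy the surviving numbers into the output array.
        int[] primes = new int[count];
        int k = 0;
        for (int i = 2; i <= limit; i++) {
            if (!composite[i]) {
                primes[k++] = i;
            }
        }
        return primes;
    }

    public static void main(String[] args) {
        // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
        System.out.println(Arrays.toString(primesUpTo(30)));
    }
}
```

The inner loop over multiples of each prime is independent across disjoint ranges of the array, which is one natural place to look for parallelism.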

#### Option 2: Sorting

Sorting a list of elements is one of the most common tasks performed by computers. For this project, you will write a program that sorts a large array of floats as quickly as possible. Your program will take as input (a reference to) an array float[] values and sort it in place. The size of the array will be $$2^{20} = 1,048,576$$.
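As a rough illustration of how a sorting routine might be checked for correctness, the sketch below compares a candidate sort against `java.util.Arrays.sort` on copies of the same input. The class `SortCheckSketch` and the method `candidateSort` are hypothetical placeholders; here the candidate simply calls the standard library's `Arrays.parallelSort`, which stands in for whatever implementation you write. This is not the provided tester.

```java
import java.util.Arrays;
import java.util.Random;

class SortCheckSketch {
    // Hypothetical stand-in for your own implementation.
    static void candidateSort(float[] values) {
        Arrays.parallelSort(values);
    }

    // Sorts two copies of the input, one with the reference sort and one
    // with the candidate, and reports whether the results agree.
    static boolean matchesReference(float[] input) {
        float[] expected = input.clone();
        float[] actual = input.clone();
        Arrays.sort(expected);
        candidateSort(actual);
        return Arrays.equals(expected, actual);
    }

    public static void main(String[] args) {
        float[] values = new float[1 << 20]; // 2^20 elements, as in the task
        Random rng = new Random(273);        // fixed seed for repeatable runs
        for (int i = 0; i < values.length; i++) {
            values[i] = rng.nextFloat();
        }
        System.out.println("matches reference: " + matchesReference(values));
    }
}
```

Checking against a trusted reference on the same input catches both ordering mistakes and lost or duplicated elements, which a simple "is it non-decreasing?" check would miss.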

If you’d like, you may elect to write a program that performs another computational task in place of one of the tasks listed above (Option 3). The only requirement is that the task should admit a relatively simple baseline procedure that you will optimize for your project. If you choose this option, you will need to submit a baseline implementation of your procedure by the end of Week 11, which will be the point of comparison for the performance of your optimized procedure.

## Testing Performance

The performance of your programs will be tested on the HPC cluster. For Options 1 and 2, you will be provided with both a baseline implementation and a tester that checks both performance and correctness. Your submission will receive a score of 0 for performance if it does not pass the correctness test. If you elect Option 3, you should submit a program that compares both the outputs and running times of your baseline and optimized implementations.
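For Option 3, a comparison program along these lines might be a starting point. This is only a sketch under assumptions: the class `TimingSketch` and method `bestNanos` are hypothetical, sorting is used purely as a stand-in task, and the provided tester for Options 1 and 2 may measure things differently. It runs a few untimed warm-up rounds (so the JIT compiler has a chance to optimize the code), times several trials with `System.nanoTime`, and reports the fastest trial.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

class TimingSketch {
    // Runs `task` on a fresh copy of `input` each time and returns the
    // fastest observed time in nanoseconds. Copying is not timed.
    static long bestNanos(float[] input, Consumer<float[]> task,
                          int warmups, int trials) {
        for (int i = 0; i < warmups; i++) {
            task.accept(input.clone()); // untimed warm-up run
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < trials; i++) {
            float[] copy = input.clone();
            long start = System.nanoTime();
            task.accept(copy);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        float[] input = new float[1 << 20];
        Random rng = new Random(273);
        for (int i = 0; i < input.length; i++) {
            input[i] = rng.nextFloat();
        }

        // Stand-ins for a baseline and an optimized implementation.
        Consumer<float[]> baseline = a -> Arrays.sort(a);
        Consumer<float[]> optimized = a -> Arrays.parallelSort(a);

        // Compare outputs first: speed is irrelevant if the results differ.
        float[] expected = input.clone();
        float[] actual = input.clone();
        baseline.accept(expected);
        optimized.accept(actual);
        System.out.println("outputs match: " + Arrays.equals(expected, actual));

        long baseNs = bestNanos(input, baseline, 3, 10);
        long optNs = bestNanos(input, optimized, 3, 10);
        System.out.printf("baseline:  %.3f ms%n", baseNs / 1e6);
        System.out.printf("optimized: %.3f ms%n", optNs / 1e6);
    }
}
```

Reporting the fastest of several trials is one reasonable choice for reducing noise; the median is another, and whichever you use is worth stating in your write-up.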

Starting in Week 11, you should submit weekly updates to your optimized program for Options 1 and/or 2. These updates will be tested on the HPC cluster, and a leaderboard of running times will be shared with the class.

## Timeline

• Week 10: group and project selection
• Weeks 11–13: leaderboard submissions for Options 1 and 2
• Last day of classes: final submissions due by 5:00 pm

## Final Submission

For each project (option) you must submit:

1. all of the code required to run and test your program on the HPC cluster, and
2. a PDF document describing your program, the optimizations you tested/implemented, and the results of your tests.

## Assessment

#### Performance (50%)

For Options 1 and 2, performance grades will be determined both by absolute performance relative to the provided baseline implementations and by performance relative to other groups’ submissions. For the relative performance:

• the fastest submission will receive extra credit;
• the next two fastest submissions are guaranteed full marks for performance;
• any submission whose running time is less than 120% of the fastest running time is also guaranteed full marks for performance.

For open-ended projects (Option 3), the performance grade will be based on your optimized performance relative to the baseline performance (submitted in Week 11), as well as the incremental performance improvements described in the documentation you submit.

#### Documentation (40%)

Along with your program, you must submit a write-up that documents both your final submission and its development. The documentation should include details on the different optimizations you implemented, as well as their effects on performance in isolation and in conjunction with other optimizations. If applicable, you should document optimizations for multithreading (e.g., comparing different strategies for partitioning the problem), memory access optimizations, vector operations, and any other strategies you used to optimize performance. The documentation should include concrete data, i.e., running times of versions of your program with different optimizations applied. Including tables, etc., your documentation should probably be 3+ pages long for each project (option).

If you do Option 3 for one of your projects, the documentation must also include a thorough description of the problem being solved as well as the baseline implementation.

#### Style (10%)

Your final program submission should be well-organized and thoroughly (though not excessively) commented. Given a high-level description of the program,