Due by 11:59 pm on Tuesday 11/22
Objectives
- Practice designing an efficient parallel algorithm
- Analyze a parallel algorithm in terms of work and span
- Implement a parallel algorithm using OpenMP
Assignment
As many of you have realized, the problem from Project 1 is better
known as tthe RNA Secondary Structure problem, an important problem
in biology and genetics. The only difference is that rather than
sequences of COMIC-CON fans (G, H, W, T), you work with sequences of
RNA bases (A, C, G, U; C pairs with G, A pairs with U). Your task
is to design, analyze, and implement a multithreaded algorithm to
solve the RNA Secondary Structure problem. You may use a memoized
or bottom-up approach, but in either case, your goal is to design an
algorithm that makes efficient use of multiple processors.
Once you have designed your multithreaded algorithm, you will need
to analyze it. First compute the
work T1(n) and the
span T∞(n) where
n is the size of the input set X. You should
prove your expressions for T1
and T∞ as rigorously as possible. Finally,
compute the parallelism and estimate a range of parameters
(n and P) for which linear or near-linear speed-up
may be expected.
The last part of the project is to implement your RNA Secondary
Structure algorithm in C or C++ using OpenMP on UMBC's maya cluster
and measure it's performance empirically using 1, 2, 4, and 8
processors (optionally, you may test on 16 processors; most maya
nodes have eight cores, but some have 16). You will need to generate
test data with variety of different sized input sets and subsets to
demonstrate the performance characteristics of your algorithm.
Submission
Your project submission consists of the following:
-
A written report describing the design, analysis, and empirical
performance of your algorithm.
-
Multithreaded RNA Secondary Structure code and test data.
The written report must be turned-in on Blackboard as either a DOC,
DOCX, or PDF file. Code and test data should be archived in a
single zip file and submitted on Blackboard.
Grading
Points will be distributed among design, analysis, and
implementation as follows:
- Algorithm Design — 30 points
- Algorithm Analysis — 30 points
- Implementation — 40 points
Hints and Resources
-
Be sure to review the project policies.
-
Although accounts have already been created for each of you, you
will need to complete
the account
request form so that you can acknowledge the user agreement.
-
Your position is "Undergraduate Student"; your Department,
Affiliation, and Address are all "CSEE".
-
Your Role in the Resarch Project is "Member of the research
group".
-
The Information on UMBC Faculty/PI Account Sponsor section
should be completed with my information: "Chris Marron",
"CSEE", "cmarron@umbc.edu", and "x52417".
-
Use "CMSC 441 Project 2" for the Title of the Research Project
and "Multithreaded RNA Secondary Structure" for the Abstract.
-
Check the Accept Terms box and click Submit.
Once you've completed the user agreement, you can login to your
account on maya.umbc.edu.
-
You should use the Intel compilers (icc
and icpc) and the SLURM job management system. To make
these available, you will need to load a couple modules in your
Linux environment:
- module load intel/compiler
- module load slurm
The first module command loads the Intel compilers into
your environment; the second loads SLURM.
-
There is an excellent
OpenMP
Tutorial by Tim Mattson linked from
the openmp.org website. There are
also video
lectures on YouTube.
-
An introduction
to SLURM on maya is available from the UMBC HPCF website.
-
Although I prefer that you use OpenMP, I will accept projects
using pthreads so long as you use them in a way which is
consistent with the costing model (parralel loops, spawn, sync).
If you choose a top-down approach, pthreads may be easier than
OpenMP.
-
It may turn-out that eight processors are insufficient to fully
demonstrate the advanate of the parallel algorithm.