# Advanced MPI Tips and Tricks - University of Hawaii

An Introduction to Parallel Computing

Dr. David Cronk, Innovative Computing Lab, University of Tennessee

Distribution A: Approved for public release; distribution is unlimited.

## Outline

- Parallel architectures
- Parallel processing
  - What is parallel processing?
  - An example of parallel processing
  - Why use parallel processing?
- Parallel programming
  - Programming models
  - Message passing issues
    - Data distribution
    - Flow control

## Shared Memory Architectures

- Single address space
- All processors have access to a pool of shared memory
- Symmetric multiprocessors (SMPs)
  - Access time is uniform
  - [Figure: several CPUs attached to one bus that leads to a single main memory.]
- Non-Uniform Memory Access (NUMA)
  - [Figure: two groups of CPUs, each group on its own bus with its own main memory, joined by a network.]

## Distributed Memory Architectures

[Figure: processors (P), each paired with its own memory (M), connected by a network.]

## Networks

- Grid: processors are connected to 4 neighbors
- Cylinder: a closed grid
- Torus: a closed cylinder
- Hypercube: each processor is connected to n other processors, where n is the degree of the hypercube (a hypercube of degree n has 2^n processors in total)
- Fully connected: every processor is directly connected to every other processor

## Parallel Processing

What is parallel processing? Using multiple processors to solve a single problem.

- Task parallelism
  - The problem consists of a number of independent tasks
  - Each processor or group of processors can perform a separate task
- Data parallelism
  - The problem consists of dependent tasks
  - Each processor works on a different part of the data

An example of parallel processing: evaluate the integral

$$\int_0^1 \frac{4}{1 + x^2}\,dx$$

We can approximate the integral as a sum of rectangles:

$$\sum_{i=0}^{N} F(x_i)\,\Delta x$$

where $F(x) = 4/(1 + x^2)$, $x_i$ is the sample point of rectangle $i$, and $\Delta x$ is the rectangle width.
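
The rectangle sum above splits naturally across processors, since each rectangle can be evaluated independently. The following is a minimal sketch of that idea, not the code from the original slides: the function names, the cyclic assignment of rectangles to workers, and the choice of midpoints as sample points are all assumptions made for illustration. The workers are simulated in a single process; in a real program each call to `partial_sum` would run on its own processor and the final sum would be a reduction.

```python
import math


def f(x):
    """Integrand F(x) = 4 / (1 + x^2); its integral over [0, 1] is pi."""
    return 4.0 / (1.0 + x * x)


def partial_sum(rank, num_workers, n):
    """Sum the rectangles assigned to one worker.

    Rectangles are dealt out cyclically (an assumption): worker `rank`
    handles rectangles rank, rank + num_workers, rank + 2 * num_workers, ...
    """
    dx = 1.0 / n
    total = 0.0
    for i in range(rank, n, num_workers):
        x_mid = (i + 0.5) * dx  # midpoint sample point (an assumption)
        total += f(x_mid) * dx
    return total


def approximate_pi(n, num_workers):
    """Combine the partial sums; each call stands in for one processor."""
    return sum(partial_sum(rank, num_workers, n) for rank in range(num_workers))


if __name__ == "__main__":
    estimate = approximate_pi(100_000, 4)
    print("estimate:", estimate, "error:", abs(estimate - math.pi))
```

Because every worker touches a disjoint set of rectangles and only the final partial sums are combined, the amount of data exchanged stays small no matter how large `n` becomes.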

## Why Parallel Processing?

- Faster time to completion
  - Computation can be performed faster with more processors
- Able to run larger jobs or at a higher resolution
  - Larger jobs can complete in a reasonable amount of time on multiple processors
  - Data for larger jobs can fit in memory when spread out across multiple processors

## Parallel Programming

Outline:

- Programming models
- Message passing issues
  - Data distribution
  - Flow control

## Parallel Programming: Programming Models

- Shared memory
  - All processes have access to global memory
- Distributed memory (message passing)
  - Processes have access to only local memory
  - Data is shared via explicit message passing
- Combination shared/distributed
  - Groups of processes share access to local data, while data is shared between groups via explicit message passing

## Message Passing

- Message passing is the most common method of programming for distributed memory
- With message passing, there is an explicit sender and receiver of data
- In message passing systems, different processes are identified by unique identifiers
  - Simplify this to each process having a unique numerical identifier
  - Senders send data to a specific process based on this identifier
  - Receivers specify which process to receive from based on this identifier

## Parallel Programming: Message Passing Issues

- Data distribution
  - Minimize overhead
    - Latency (message start-up time): a few large messages are better than many small ones
    - Memory movement
  - Maximize load balance
    - Less idle time waiting for data or synchronizing
    - Each process should do about the same work
- Flow control
  - Minimize waiting

## Data Distribution

[Figure slides on data distribution; the images are not preserved in this text.]
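
One common way to meet the goal that each process should do about the same work is a block distribution, where every process owns a contiguous range of items and the range sizes differ by at most one. The sketch below is an illustration only; the function name `block_range` and its parameters are hypothetical and do not come from the slides.

```python
def block_range(rank, num_procs, n):
    """Return the half-open index range [start, stop) owned by `rank`.

    The n items are split into num_procs contiguous blocks whose sizes
    differ by at most one, so no process carries noticeably more work.
    """
    base, extra = divmod(n, num_procs)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    n, num_procs = 10, 4
    for rank in range(num_procs):
        start, stop = block_range(rank, num_procs, n)
        print(f"rank {rank} owns items {start}..{stop - 1}")
```

Contiguous blocks also help with the overhead goal: a process that needs its neighbor's boundary data can request it in one larger message instead of many small ones.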

## Flow Control

[Figure: flow-control diagram for processes 0 through 5. The surviving labels describe a chain: process 0 sends to 1; process 1 sends to 2 and receives from 0; process 2 sends to 3 and receives from 1; process 3 sends to 4 and receives from 2.]
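
A chain of this kind, where each process receives from the previous numerical identifier and sends to the next, can be sketched as follows. This is a simulation under stated assumptions, not the presenter's code: threads stand in for processes, and the `Mailboxes` class with its `send` and `recv` methods is a hypothetical stand-in for a message passing library such as MPI.

```python
import queue
import threading


class Mailboxes:
    """Hypothetical stand-in for a message passing library."""

    def __init__(self, num_procs):
        # One queue per (source, destination) pair, so that a receiver
        # can name the process it wants to receive from.
        self._boxes = {
            (src, dst): queue.Queue()
            for src in range(num_procs)
            for dst in range(num_procs)
        }

    def send(self, source, dest, data):
        self._boxes[(source, dest)].put(data)

    def recv(self, source, dest):
        # Blocks until the named sender has delivered a message.
        return self._boxes[(source, dest)].get()


def worker(rank, num_procs, comm, results):
    if rank == 0:
        value = 0  # rank 0 starts the chain and receives nothing
    else:
        value = comm.recv(source=rank - 1, dest=rank)
    value += rank  # local work: add this process's identifier
    if rank < num_procs - 1:
        comm.send(source=rank, dest=rank + 1, data=value)
    else:
        results.append(value)  # the last rank holds the final result


def run_chain(num_procs):
    comm = Mailboxes(num_procs)
    results = []
    threads = [
        threading.Thread(target=worker, args=(rank, num_procs, comm, results))
        for rank in range(num_procs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results[0]


if __name__ == "__main__":
    print(run_chain(6))  # sum of the ranks 0 through 5
```

Every rank except 0 sits idle inside `recv` until its predecessor has finished, which is exactly the waiting that good flow control tries to minimize.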

This presentation was made possible through support provided by DoD HPCMP PET activities through Mississippi State University (MSU) under contract No. N62306-01-D-7110.
