Author: Blaise Barney, Lawrence Livermore National Laboratory | UCRL-MI-133316
Table of Contents
1. What is Parallel Computing?
2. Why Use Parallel Computing?
3. Concepts and Terminology
  * von Neumann Computer Architecture
  * Flynn's Classical Taxonomy
  * Some General Parallel Terminology
4. Parallel Computer Memory Architectures
  * Shared Memory
  * Distributed Memory
  * Hybrid Distributed-Shared Memory
5. Parallel Programming Models
  * Shared Memory Model
  * Threads Model
  * Distributed Memory / Message Passing Model
  * Data Parallel Model
  * Hybrid Model
  * SPMD and MPMD
6. Designing Parallel Programs
  * Automatic vs. Manual Parallelization
  * Understand the Problem and the Program
  * Data Dependencies
  * Load Balancing
  * Limits and Costs of Parallel Programming
  * Performance Analysis and Tuning
7. Parallel Examples
  * Array Processing
  * PI Calculation
  * Simple Heat Equation
  * 1-D Wave Equation
8. References and More Information
This tutorial covers the basics of parallel computing and is intended for someone who is just becoming acquainted with the subject. It begins with a brief overview, including concepts and terminology associated with parallel computing. The topics of parallel memory architectures and programming models are then explored, followed by a discussion of several issues related to designing parallel programs. The tutorial concludes with several examples of how to parallelize simple serial programs.

Level/Prerequisites: None
What is Parallel Computing?
* Traditionally, software has been written for serial computation:
  * It runs on a single computer with a single Central Processing Unit (CPU).
  * A problem is broken into a discrete series of instructions.
  * Instructions are executed one after another.
  * Only one instruction may execute at any moment in time.
* In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
  * It runs on multiple CPUs.
  * A problem is broken into discrete parts that can be solved concurrently.
  * Each part is further broken down into a series of instructions.
  * Instructions from each part execute simultaneously on different CPUs.
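The contrast between the two descriptions above can be sketched in a few lines of Python. This is only an illustration, not code from the tutorial: the function names (`sum_of_squares`, `split_range`, `serial_sum`, `parallel_sum`) and the choice of problem (a sum of squares) are assumptions made for this example. The serial version runs one instruction stream over the whole problem; the parallel version breaks the problem into discrete parts and hands each part to a separate worker process.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(start, stop):
    """Serial work on one discrete part: sum i*i for start <= i < stop."""
    total = 0
    for i in range(start, stop):
        total += i * i
    return total


def split_range(n, parts):
    """Break range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for p in range(parts):
        # The first `extra` chunks each take one leftover element.
        stop = start + step + (1 if p < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def serial_sum(n):
    """A single instruction stream handles the whole problem."""
    return sum_of_squares(0, n)


def parallel_sum(n, workers=4, executor_factory=ProcessPoolExecutor):
    """Hand each chunk to a separate worker, then combine the partial sums."""
    chunks = split_range(n, workers)
    with executor_factory(max_workers=workers) as pool:
        futures = [pool.submit(sum_of_squares, a, b) for a, b in chunks]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    n = 2_000_000
    assert serial_sum(n) == parallel_sum(n)
    print("serial and parallel results agree")
```

Whether the parallel version actually finishes sooner depends on the number of available CPUs and on the cost of starting workers and collecting results; for small `n` the serial version may well be faster.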
* The compute resources might be:
  * A single computer with multiple processors;
  * An arbitrary number of computers connected by a network;
  * A combination of both.
* The computational problem should be able to:
  * Be broken apart into discrete pieces of work that can be solved simultaneously;
  * Execute multiple program instructions at any moment in time;
  * Be solved in less time with multiple compute resources than with a single compute resource.
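The last requirement in the list above is commonly expressed as a speedup greater than one. As a sketch, the symbols $T_1$ (wall-clock time on a single compute resource) and $T_p$ (wall-clock time on $p$ compute resources) are labels introduced here for illustration:

```latex
S(p) = \frac{T_1}{T_p}, \qquad \text{parallel execution pays off when } S(p) > 1
```

With purely hypothetical timings of $T_1 = 120$ s and $T_4 = 40$ s, the speedup would be $S(4) = 120 / 40 = 3$.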
The Universe is Parallel:
* Parallel computing is an evolution of serial computing that attempts to emulate what has always been the state of affairs in the natural world: many complex, interrelated events happening at the same time, yet within a sequence. For example:
  * Galaxy formation
  * Planetary movement
  * Weather and ocean patterns
  * Tectonic plate drift
  * Rush hour traffic
  * Automobile assembly line
  * Building a jet
  * Ordering a hamburger at the drive-through
* The real world is massively parallel.
Uses for Parallel Computing:
* Historically, parallel computing has been considered to be "the...