Sunday, 28 September 2014
Example of parallel computing
Array Processing
This example demonstrates calculations on the elements of a 2-dimensional array, where the computation on each element is independent of all the others.
The serial program calculates one element at a time in sequential order.
Serial code could be of the form:
do j = 1, n
   do i = 1, n
      a(i,j) = fcn(i,j)   ! each element is computed independently
   end do
end do
The calculation of each element is independent of the others, which leads to an embarrassingly parallel situation.
The problem should be computationally intensive.
(Figure: Embarrassingly parallel array calculation)
Parallel Solution 1
Array elements are distributed so that each processor owns a portion of the array (a subarray).
Independent calculation of array elements ensures there is no need for communication between tasks.
Because no communication between tasks is required, the distribution scheme is chosen by other criteria, e.g. unit stride (stride of 1) through the subarrays. Unit stride maximizes cache/memory usage.
Since it is desirable to have unit stride through the subarrays, the choice of a distribution scheme depends on the programming language: Fortran stores arrays in column-major order, so distributing blocks of columns keeps the inner loop at unit stride, whereas a row-major language such as C would distribute blocks of rows. See the Block - Cyclic Distributions Diagram for the options.
After the array is distributed, each task executes the portion of the loop corresponding to the data it owns.
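For a block distribution by columns in Fortran, each task's loop could be of the form (mystart and myend are assumed to hold the first and last column indices owned by that task):
do j = mystart, myend
   do i = 1, n
      a(i,j) = fcn(i,j)
   end do
end do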
Notice that only the outer loop variables are different from the serial solution.
(Figure: Embarrassingly parallel array calculation - data decomposition)
One Possible Solution:
Implement as a Single Program Multiple Data (SPMD) model.
Master process initializes array, sends info to worker processes and receives results.
Worker process receives info, performs its share of computation and sends results to master.
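A minimal sketch of this master/worker SPMD scheme, written in Fortran with MPI, follows. The element function fcn, the array size n, and the column-block layout are illustrative assumptions rather than code from the tutorial; n is assumed divisible by the number of workers, and at least 2 MPI processes are required:
program array_spmd
   use mpi
   implicit none
   integer, parameter :: n = 8
   integer :: ierr, rank, nprocs, nworkers, chunk, w, j0, i, j
   integer :: status(MPI_STATUS_SIZE)
   double precision :: a(n,n)

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
   nworkers = nprocs - 1
   chunk = n / nworkers                  ! columns per worker

   if (rank == 0) then
      ! Master: tell each worker which columns it owns ...
      do w = 1, nworkers
         j0 = (w - 1) * chunk + 1
         call MPI_Send(j0, 1, MPI_INTEGER, w, 0, MPI_COMM_WORLD, ierr)
      end do
      ! ... then collect each worker's block of columns.
      do w = 1, nworkers
         j0 = (w - 1) * chunk + 1
         call MPI_Recv(a(1,j0), n*chunk, MPI_DOUBLE_PRECISION, w, 1, &
                       MPI_COMM_WORLD, status, ierr)
      end do
      print *, 'sum of all elements =', sum(a)
   else
      ! Worker: receive my starting column, compute my block, return it.
      call MPI_Recv(j0, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, status, ierr)
      do j = j0, j0 + chunk - 1
         do i = 1, n
            a(i,j) = fcn(i,j)
         end do
      end do
      call MPI_Send(a(1,j0), n*chunk, MPI_DOUBLE_PRECISION, 0, 1, &
                    MPI_COMM_WORLD, ierr)
   end if
   call MPI_Finalize(ierr)

contains
   double precision function fcn(i, j)   ! placeholder element function
      integer, intent(in) :: i, j
      fcn = dble(i) * dble(j)
   end function fcn
end program array_spmd
Run with, e.g., mpirun -np 5 ./array_spmd: rank 0 acts as the master and ranks 1-4 each compute a block of 2 columns.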
Parallel Solution 2: Pool of Tasks
The previous array solution demonstrated static load balancing:
Each task has a fixed amount of work to do
There may be significant idle time for faster or more lightly loaded processors: the slowest task determines overall performance.
Static load balancing is not usually a major concern if all tasks are performing the same amount of work on identical machines.
If you have a load balance problem (some tasks work faster than others), you may benefit by using a "pool of tasks" scheme.
Pool of Tasks Scheme:
Two kinds of processes are employed:
Master Process:
Holds pool of tasks for worker processes to do
Sends worker a task when requested
Collects results from workers
Worker Process: repeatedly does the following
Gets task from master process
Performs computation
Sends results to master
Worker processes do not know before runtime which portion of the array they will handle or how many tasks they will perform.
Dynamic load balancing occurs at run time: the faster tasks will get more work to do.
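A sketch of this pool-of-tasks scheme in the same Fortran/MPI style follows. Here one task = one column of the array; the tag values, the one-column task size, and the element function fcn are illustrative assumptions, and at least 2 MPI processes are required:
program task_pool
   use mpi
   implicit none
   integer, parameter :: n = 8
   integer, parameter :: TAG_WORK = 1, TAG_STOP = 2
   integer :: ierr, rank, nprocs, nworkers, w, i, j, next, done, src, dummy
   integer :: status(MPI_STATUS_SIZE)
   double precision :: a(n,n), col(n)

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
   nworkers = nprocs - 1
   dummy = 0

   if (rank == 0) then
      ! Master: seed each worker with one column from the pool
      ! (workers beyond the pool size are stopped immediately).
      next = 0
      do w = 1, nworkers
         if (next < n) then
            next = next + 1
            call MPI_Send(next, 1, MPI_INTEGER, w, TAG_WORK, MPI_COMM_WORLD, ierr)
         else
            call MPI_Send(dummy, 1, MPI_INTEGER, w, TAG_STOP, MPI_COMM_WORLD, ierr)
         end if
      end do
      ! Collect n results; each result frees its sender for more work.
      done = 0
      do while (done < n)
         call MPI_Recv(col, n, MPI_DOUBLE_PRECISION, MPI_ANY_SOURCE, &
                       MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr)
         j = status(MPI_TAG)              ! the tag carries the column index
         src = status(MPI_SOURCE)
         a(:, j) = col
         done = done + 1
         if (next < n) then               ! hand out the next task, or stop
            next = next + 1
            call MPI_Send(next, 1, MPI_INTEGER, src, TAG_WORK, MPI_COMM_WORLD, ierr)
         else
            call MPI_Send(dummy, 1, MPI_INTEGER, src, TAG_STOP, MPI_COMM_WORLD, ierr)
         end if
      end do
      print *, 'sum of all elements =', sum(a)
   else
      ! Worker: repeatedly get a column index, compute it, send it back.
      do
         call MPI_Recv(j, 1, MPI_INTEGER, 0, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr)
         if (status(MPI_TAG) == TAG_STOP) exit
         do i = 1, n
            col(i) = fcn(i, j)
         end do
         call MPI_Send(col, n, MPI_DOUBLE_PRECISION, 0, j, MPI_COMM_WORLD, ierr)
      end do
   end if
   call MPI_Finalize(ierr)

contains
   double precision function fcn(i, j)   ! placeholder element function
      integer, intent(in) :: i, j
      fcn = dble(i) * dble(j)
   end function fcn
end program task_pool
Because a worker asks for its next column only after returning the previous one, faster workers automatically end up computing more columns, which is exactly the dynamic load balancing described above.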
Source: https://computing.llnl.gov/tutorials/parallel_comp/#Examples