Sunday 28 September 2014

example of parallel computer



Array Processing

    This example demonstrates calculations on 2-dimensional array elements, with the computation on each array element being independent from other array elements.

    The serial program calculates one element at a time in sequential order.

    Serial code could be of the form:


    do j = 1,n
    do i = 1,n
      a(i,j) = fcn(i,j)
    end do
    end do

    The calculation of elements is independent of one another - leads to an embarrassingly parallel situation.

    The problem should be computationally intensive.

    Embarrassingly parallel array calculation

Array Processing
Parallel Solution 1

    Arrays elements are distributed so that each processor owns a portion of an array (subarray).

    Independent calculation of array elements ensures there is no need for communication between tasks.

    Distribution scheme is chosen by other criteria, e.g. unit stride (stride of 1) through the subarrays. Unit stride maximizes cache/memory usage.

    Since it is desirable to have unit stride through the subarrays, the choice of a distribution scheme depends on the programming language. See the Block - Cyclic Distributions Diagram for the options.

    After the array is distributed, each task executes the portion of the loop corresponding to the data it owns.

    Notice that only the outer loop variables are different from the serial solution.

    Embarrassingly parallel array calculation data decomposition

One Possible Solution:

    Implement as a Single Program Multiple Data (SPMD) model.

    Master process initializes array, sends info to worker processes and receives results.

    Worker process receives info, performs its share of computation and sends results to master.

  

Array Processing
Parallel Solution 2: Pool of Tasks

    The previous array solution demonstrated static load balancing:
        Each task has a fixed amount of work to do
        May be significant idle time for faster or more lightly loaded processors - slowest tasks determines overall performance.

    Static load balancing is not usually a major concern if all tasks are performing the same amount of work on identical machines.

    If you have a load balance problem (some tasks work faster than others), you may benefit by using a "pool of tasks" scheme.

Pool of Tasks Scheme:

    Two processes are employed

    Master Process:
        Holds pool of tasks for worker processes to do
        Sends worker a task when requested
        Collects results from workers

    Worker Process: repeatedly does the following
        Gets task from master process
        Performs computation
        Sends results to master

    Worker processes do not know before runtime which portion of array they will handle or how many tasks they will perform.

    Dynamic load balancing occurs at run time: the faster tasks will get more work to do.

   

https://computing.llnl.gov/tutorials/parallel_comp/#Examples

No comments:

Post a Comment