There is a catch, however: in order to work mostly in place, our second quicksort sacrificed parallelism. Specifically, observe that the partitioning phase is now sequential. The span of this second quicksort is therefore linear in the size of the input, and its average parallelism is therefore only logarithmic in the size of the input.
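To make the trade-off concrete, here is a minimal sketch of an in-place quicksort with a sequential partition phase. The `fork2` fork-join helper is an assumption standing in for the text's actual primitive, implemented here with `std::async` purely for illustration; only the two recursive calls run in parallel, while the linear-time partition runs sequentially.

```cpp
#include <algorithm>
#include <future>

// Stand-in fork-join primitive (an assumption, not the text's actual
// interface): runs f and g as parallel tasks and waits for both.
template <typename F, typename G>
void fork2(F f, G g) {
  auto handle = std::async(std::launch::async, f);
  g();
  handle.get();
}

// In-place quicksort over [lo, hi). The partition phase is sequential,
// so each level of recursion contributes linear span.
void quicksort(int* lo, int* hi) {
  if (hi - lo < 2) return;
  int pivot = *(lo + (hi - lo) / 2);
  // Sequential three-way partition: < pivot, == pivot, > pivot.
  int* mid1 = std::partition(lo, hi, [=](int x) { return x < pivot; });
  int* mid2 = std::partition(mid1, hi, [=](int x) { return x == pivot; });
  // Only the recursive calls are parallel.
  fork2([=] { quicksort(lo, mid1); },
        [=] { quicksort(mid2, hi); });
}
```

Because the partition at the top level alone takes linear time on one "thread" of the computation, the span recurrence bottoms out at Θ(n), matching the claim above.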

Verify that our second quicksort indeed has linear span and logarithmic average parallelism. We therefore expect the second quicksort to be more work efficient but to scale poorly. To test the first hypothesis, let us run the second quicksort on a single processor.

The in-place quicksort is always faster on a single processor. However, the in-place quicksort starts slowing down significantly at 20 cores and stops improving beyond 30. So we have one solution that is observably work efficient but scales poorly, and another that scales but is not work efficient.

The question now is whether we can find a happy middle ground. We encourage students to look for improvements to quicksort independently. For now, we are going to consider parallel mergesort. This time, we are going to focus more on achieving better speedups. As a divide-and-conquer algorithm, mergesort is a good candidate for parallelization, because the two recursive calls that sort the two halves of the input are independent.

The final merge operation, however, is typically performed sequentially. It turns out not to be too difficult to parallelize the merge operation and obtain good work and span bounds for parallel mergesort. The resulting algorithm turns out to be a good parallel algorithm, delivering both asymptotic and observed work efficiency, as well as low span.

This process requires a "merge" routine which merges the contents of two specified subranges of a given array. The merge routine assumes that the two given subarrays are sorted in ascending order. The result is the combined contents of the items of the subranges, in ascending order. The precise signature of the merge routine appears below. In mergesort, every pair of ranges that are merged are adjacent in memory.

A temporary array tmp is used as scratch space by the merge operation. This merge implementation performs linear work and span in the number of items being merged. In our code, we use the STL implementation underneath the merge() interface that we described just above.
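The following is a hedged sketch of what such a merge() interface might look like on top of the STL; the exact names and argument order are assumptions, not the text's actual signature. It merges the adjacent ascending subranges [lo, mid) and [mid, hi) of xs back into xs, using tmp as scratch space, via std::merge.

```cpp
#include <algorithm>

// Hypothetical merge() interface (names and argument order are
// assumptions). Merges the ascending, adjacent subranges [lo, mid)
// and [mid, hi) of xs back into xs in ascending order, using tmp
// as scratch space. std::merge performs linear work (and, run
// sequentially, linear span) in the number of items merged.
void merge(int* xs, int* tmp, long lo, long mid, long hi) {
  std::merge(xs + lo, xs + mid, xs + mid, xs + hi, tmp + lo);
  std::copy(tmp + lo, tmp + hi, xs + lo);
}
```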

Now, we can assess our parallel mergesort with a sequential merge, as implemented by the code below. The code uses the traditional divide-and-conquer approach that we have seen several times already.
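In case the referenced code is not at hand, a minimal sketch of the idea follows; the names and the use of std::async as a stand-in fork-join primitive are assumptions, not the text's actual interface. The two recursive calls run in parallel on disjoint halves, and the merge afterwards is sequential.

```cpp
#include <algorithm>
#include <future>

// Sketch of parallel mergesort with a *sequential* merge (names and
// the std::async fork-join stand-in are assumptions). Sorts [lo, hi)
// of xs, using tmp as scratch space of the same size.
void mergesort(int* xs, int* tmp, long lo, long hi) {
  if (hi - lo < 2) return;
  long mid = (lo + hi) / 2;
  // The two recursive calls touch disjoint halves, so they are
  // independent and may run in parallel.
  auto left = std::async(std::launch::async,
                         [=] { mergesort(xs, tmp, lo, mid); });
  mergesort(xs, tmp, mid, hi);
  left.get();
  // Sequential merge: linear span, which dominates the span bound.
  std::merge(xs + lo, xs + mid, xs + mid, xs + hi, tmp + lo);
  std::copy(tmp + lo, tmp + hi, xs + lo);
}
```

Note that erasing the parallel annotation (the std::async call) leaves an ordinary sequential mergesort, which is why the work bound is unchanged.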

The algorithm is asymptotically work efficient, because nothing significant has changed between this parallel code and the serial code: just erase the parallel annotations and we have a textbook sequential mergesort.

Unfortunately, this implementation has a large span: it is linear in the input size, owing to the sequential merge operation performed after each pair of parallel recursive calls.
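Both bounds can be checked with standard recurrences: each level does a linear-time merge, but only the recursive calls run in parallel.

```latex
% Work: two half-size subproblems plus a linear-time merge.
W(n) = 2\,W(n/2) + \Theta(n) \quad\Rightarrow\quad W(n) = \Theta(n \log n)
% Span: the recursive calls run in parallel, but the merge is sequential,
% so the geometric sum n + n/2 + n/4 + \cdots is dominated by its first term.
S(n) = S(n/2) + \Theta(n) \quad\Rightarrow\quad S(n) = \Theta(n)
```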

Further...