Parallelism Gone Really Wrong

A couple of days ago a friend told me that his team had put a lot of effort into parallelising their computations for better performance. Strangely enough, performance got worse. Being a good friend, I offered to take a look at their code base. The problem was obvious: someone had been snoozing during the parallelism performance class.

A little background:

So what do we mean when we say parallelism? It seems like just another buzzword, doesn't it?

We all remember Moore's law: the observation that the number of transistors on a chip doubles roughly every 2 years, often quoted as chip performance doubling every 18 months. Well, while it held for several decades, it turns out that in recent years, starting around 2012, the advancement in chip performance has been slowing down (according to Intel).

As software developers we are looking for alternative ways to increase the performance of our applications, and the obvious path is to use more processor cores instead of waiting for a single core to get faster. In other words, we want to divide our problem into smaller independent pieces, execute each piece on a different core and then merge the results, hence achieving parallel processing.
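To make the divide, execute and merge idea concrete, here is a minimal sketch using the thread-local overload of Parallel.For. The class and method names (DivideAndMerge, SumOfSquares) are made up for illustration, and the sum of squares is just a stand-in for "some independent piece of work".

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DivideAndMerge
{
    // Hypothetical example: sums the squares of 0..count-1.
    // Each worker thread keeps a private subtotal (the "independent pieces"),
    // and the subtotals are merged into one result at the end.
    public static long SumOfSquares(int count)
    {
        long total = 0;
        Parallel.For(
            0, count,
            () => 0L,                                            // each worker starts with an empty subtotal
            (i, loopState, subtotal) => subtotal + (long)i * i,  // independent work, no shared state touched
            subtotal => Interlocked.Add(ref total, subtotal));   // merge step, runs once per worker
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(1000));
    }
}
```

The point of the per-thread subtotal is that the workers only touch shared state once, at the merge step, instead of on every iteration.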

(The code in this article is simple C# using the Task Parallel Library (TPL) for parallel support, and PerfMon for monitoring the CPUs' work.)

OK, so let's say we need 10 million iterations, where each iteration performs some simple operation.

// Sequential version: print every number from 0 to 9,999,999.
for (int i = 0; i < 10000000; i++)
{
    Console.WriteLine(i);
}

How long did it take? Well, I have a 64-bit quad-core machine.
It took roughly 5.5 minutes to complete.
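If you want to take a measurement like this yourself, one simple way is System.Diagnostics.Stopwatch. This is only a sketch: the LoopTimer helper is a hypothetical name, and the iteration count is deliberately smaller than the 10 million used here so the run stays short.

```csharp
using System;
using System.Diagnostics;

public static class LoopTimer
{
    // Hypothetical helper: runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        // Assumption: 100,000 iterations instead of 10 million, to keep the demo quick.
        TimeSpan elapsed = Measure(() =>
        {
            for (int i = 0; i < 100000; i++)
            {
                Console.WriteLine(i);
            }
        });
        Console.WriteLine($"Sequential loop took {elapsed.TotalSeconds:F2} s");
    }
}
```

The numbers you get will depend heavily on your machine and on where the console output goes, so treat them as relative rather than absolute.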

Here is what my CPU usage looked like while performing the loop:

[Figure "10mSync": per-core CPU usage during the sequential loop]

We can see that none of the CPUs is working too hard, some a little more, some a little less.

To improve performance I would rather have all my CPUs working at maximum capacity. I don't want them sitting around doing nothing, right? Well, not so fast! (no pun intended)

Here is the parallel code for the same loop:

Parallel.For(0, 10000000, (x) =>
{
   Console.WriteLine(x);
});

The CPU time monitor looked like this:

[Figure "10mASync": per-core CPU usage during the parallel loop]

Oh, great, you say, now all my cores are busy doing their job! It must have been blazing fast, right?

OK, I'll just say it: the parallel operation took roughly 12 minutes to complete! More than twice the time it took for the plain sequential loop!

What the hell is going on, you ask?

The answer is actually pretty simple: the "WriteLine" call was just not worth the overhead incurred by parallel processing. For the performance gain to be worth the trouble you need to parallelize heavy calculations on your CPUs; that's where they really shine.
In our case, the performance hit caused by parallel execution is much, much higher than the performance gain from running in parallel. Console.WriteLine writes to a single shared console whose output stream is synchronized, so the parallel iterations end up waiting on one another instead of running side by side. After all, synchronization of threads running in parallel is expensive. Very expensive.
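For contrast, here is a sketch of the kind of loop where parallelism has a chance to pay off: every iteration does real CPU work and touches no shared resource until the very end. The names (HeavyWork, IsPrime, CountPrimesSequential, CountPrimesParallel) and the limit of 2 million are my own choices for illustration, not something measured for this article, so run it on your own machine before drawing conclusions.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class HeavyWork
{
    // Deliberately naive trial division, so each iteration costs real CPU time.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Plain sequential count of the primes below limit.
    public static int CountPrimesSequential(int limit)
    {
        int count = 0;
        for (int i = 0; i < limit; i++)
        {
            if (IsPrime(i)) count++;
        }
        return count;
    }

    // Parallel count: each worker keeps a private counter and the
    // counters are merged once per worker, not once per iteration.
    public static int CountPrimesParallel(int limit)
    {
        int count = 0;
        Parallel.For(
            0, limit,
            () => 0,
            (i, loopState, local) => IsPrime(i) ? local + 1 : local,
            local => Interlocked.Add(ref count, local));
        return count;
    }

    public static void Main()
    {
        const int limit = 2000000; // arbitrary size chosen for the demo

        var stopwatch = Stopwatch.StartNew();
        int sequential = CountPrimesSequential(limit);
        Console.WriteLine($"Sequential: {sequential} primes in {stopwatch.ElapsedMilliseconds} ms");

        stopwatch.Restart();
        int parallel = CountPrimesParallel(limit);
        Console.WriteLine($"Parallel:   {parallel} primes in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

On a multi-core machine I would expect the parallel version to come out ahead here, because the work per iteration is large compared to the coordination cost. That is exactly the opposite of the WriteLine loop.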

Code on,
Shonn Lyga.
