Daniel Palme

.NET consultant from Germany.

.NET 4.0 - Performance of Task Parallel Library (TPL)

The upcoming .NET Framework 4.0 will contain the new Task Parallel Library (TPL). The TPL will be used by PLINQ, which is a parallel implementation of LINQ to Objects.
PLINQ makes it easier to take advantage of today's multicore CPUs without having to worry about thread synchronization.
In this post I will run some benchmarks to get an impression of the TPL's performance compared to current technologies.

The benchmark

My benchmark executes one task in a loop. The number of loop iterations is increased from 0 to 10,000,000.
Each measurement is repeated three times and the average execution time is taken as the result.
The overall benchmark is executed twice, first with a cheap task (an empty method), then with a more expensive one. The first run gives an impression of the overhead caused by managing multiple threads. The second one is more realistic, since you will probably use multiple threads mainly to execute expensive tasks.
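The measuring procedure described above could be sketched roughly as follows. This is only a minimal illustration, not the original benchmark code: the class name BenchmarkSketch, the helper AverageMilliseconds and both task methods are hypothetical, and the "expensive" task is just arbitrary CPU work chosen for the example.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical harness; all names are illustrative assumptions.
public static class BenchmarkSketch
{
    // Runs 'action' the given number of times and returns the average duration in ms.
    public static double AverageMilliseconds(Action action, int repetitions)
    {
        double total = 0;
        for (int i = 0; i < repetitions; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            total += watch.Elapsed.TotalMilliseconds;
        }
        return total / repetitions;
    }

    // Stand-in for the cheap task: does no real work.
    public static double CheapTask(int i)
    {
        return 0;
    }

    // Stand-in for the more expensive task (assumed workload, not the original one).
    public static double ExpensiveTask(int i)
    {
        double x = i;
        for (int k = 0; k < 100; k++)
        {
            x = Math.Sqrt(x + k);
        }
        return x;
    }

    // Single-threaded baseline.
    public static void RunSequential(int iterations, Func<int, double> task)
    {
        for (int i = 0; i < iterations; i++)
        {
            task(i);
        }
    }

    // TPL variant: Parallel.For decides how to partition the range across cores.
    public static void RunParallel(int iterations, Func<int, double> task)
    {
        Parallel.For(0, iterations, i => { task(i); });
    }

    public static void Main()
    {
        foreach (int iterations in new[] { 0, 1000000, 10000000 })
        {
            double seq = AverageMilliseconds(() => RunSequential(iterations, ExpensiveTask), 3);
            double par = AverageMilliseconds(() => RunParallel(iterations, ExpensiveTask), 3);
            Console.WriteLine("{0,10} iterations: sequential {1:F1} ms, Parallel.For {2:F1} ms",
                iterations, seq, par);
        }
    }
}
```

Swapping ExpensiveTask for CheapTask in Main gives the first, overhead-only run. The absolute numbers depend on the machine and the workload, so they will not match the charts of the original benchmark.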

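The benchmark compares the following technologies/implementations: a single-threaded loop, manually created multiple threads (synchronized with locks), the ThreadPool and the TPL.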


The benchmark has been executed on a dual core machine.
Let's have a look at the results:


As mentioned above, a very cheap task gets executed in the first run.
The TPL is a bit slower than the single-threaded implementation, but its overhead is very small compared to the multiple-threads implementation (which uses locks). It is quite surprising that the ThreadPool is even slower.
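To illustrate why a lock-based approach is costly for cheap tasks, here is a minimal sketch of how several threads could share one loop counter under a lock. The benchmark's actual implementation is not shown in this post, so the class LockedWorkersSketch and its Run method are assumptions made purely for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical lock-based work distribution; not the original benchmark code.
public static class LockedWorkersSketch
{
    // Runs 'task' once for every index in [0, iterations) on 'threadCount' threads.
    public static void Run(int iterations, int threadCount, Action<int> task)
    {
        object sync = new object();
        int next = 0;

        ThreadStart worker = () =>
        {
            while (true)
            {
                int index;
                // Every iteration pays for one lock acquisition to claim its index.
                lock (sync)
                {
                    if (next >= iterations)
                    {
                        return;
                    }
                    index = next++;
                }
                task(index);
            }
        };

        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(worker);
            threads[t].Start();
        }
        foreach (Thread thread in threads)
        {
            thread.Join();
        }
    }

    public static void Main()
    {
        int processed = 0;
        Run(1000000, Environment.ProcessorCount, i => Interlocked.Increment(ref processed));
        Console.WriteLine(processed); // prints 1000000
    }
}
```

With an empty task, almost all of the time in such a loop is spent acquiring and releasing the lock, which would be consistent with the large overhead seen in the first run.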


In the second run a more expensive method gets executed.
Now the single-threaded implementation is much slower. Compared to the cheap task, TPL and the multiple-threads implementation both take longer by nearly the same amount, but TPL is still faster. Again the ThreadPool is the slowest implementation.


The TPL seems to be quite fast in this benchmark.
The advantage of using the TPL is that it works effectively on different machines. You don't have to worry about how many threads to use for a specific task. The TPL will do this for you, depending on the number of cores. When you use PLINQ, simple queries may automatically get executed on a single thread, unless you force parallel execution by calling the WithExecutionMode extension method.
Another benefit of the TPL is that you don't have to deal with thread creation and synchronization.
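The WithExecutionMode call mentioned above could look like the following minimal sketch. The class PlinqSketch and its two methods are hypothetical names. AsParallel, AsOrdered, WithExecutionMode and ParallelExecutionMode.ForceParallelism are part of PLINQ in System.Linq.

```csharp
using System;
using System.Linq;

// Hypothetical example class; only the PLINQ calls themselves are framework APIs.
public static class PlinqSketch
{
    // Default mode: PLINQ may decide to run the query sequentially.
    public static int[] SquaresDefault(int[] numbers)
    {
        return numbers.AsParallel()
                      .AsOrdered()
                      .Select(n => n * n)
                      .ToArray();
    }

    // Forced mode: PLINQ is asked to parallelize the query regardless of its own heuristics.
    public static int[] SquaresForced(int[] numbers)
    {
        return numbers.AsParallel()
                      .AsOrdered()
                      .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
                      .Select(n => n * n)
                      .ToArray();
    }

    public static void Main()
    {
        int[] numbers = Enumerable.Range(1, 10).ToArray();
        Console.WriteLine(string.Join(", ", SquaresForced(numbers)));
    }
}
```

AsOrdered is used here only so that both variants return the results in input order. For a query this small, forcing parallelism is unlikely to pay off, and the call is shown only to demonstrate the API.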



Tags: .NET, C#











That bad ThreadPool performance shouldn't be a surprise. The ThreadPool creates new threads only on demand and, more importantly, only after certain intervals (0.5s is the default). In your case, with that very short runtime of only 1.5 - 2.5 seconds, it's effectively single-threaded plus the overhead.



I've also experimented with more iterations, where the runtime was between 30 and 60 seconds.
The result was the same: the ThreadPool was always the slowest implementation.



I made a MultipleThreads variant without the 'lock' (using Interlocked.Increment) and in my benchmark it came very close to the TPL.

Thread loop:

while (true)

int number = Interlocked.Increment(ref currentNumber);
if (number >= numberOfCalculations)