Sunday, August 10, 2008

Concurrent Programming

Mighty oaks from tiny acorns grow, or so an ancient proverb claims. Consider the microchip: smaller than a penny, it is the brain of every digital device in the world. Chips power circuits, computers, and handheld devices, as well as satellites and an endless list of other electronics. As the centerpiece of the information revolution, the chip has been the driving force behind innovation as it follows an empirical rule called ‘Moore’s Law.’

In 1965, Gordon Moore, a pioneer of the semiconductor industry who went on to co-found the Intel Corporation, wrote an article in which he noted that the density of components on semiconductor chips had doubled yearly since 1959. This annual doubling in component density amounts to an exponential growth rate, now widely known as Moore's Law.

While Moore’s Law is not a physical law like gravity, the empirical trend has held: the capacity of memory chips rose from one thousand bits in 1971 to one million bits in 1991 and to one billion bits by 2001. The billion-bit semiconductor memory chip represents an extraordinary six orders of magnitude of growth in three decades, and a similar growth rate has been seen in the data-processing capability of microprocessor chips.

While many have speculated on the future of Moore’s Law, some have concluded that instead of trying to extract greater speed from a single processor, innovators should develop multi-core processors. Rather than scaling clock speed, which pushes power usage and heat emission to unacceptable levels, chip manufacturers have begun increasing processing power by adding CPUs, or “cores,” to the microprocessor package. By working in parallel, the cores increase the total throughput of the device. Quad-core chips are already in commercial production. These advances in parallel hardware require similar advances in optimizing the execution of multiple tasks working in parallel, a discipline called concurrent programming.

While a single-core computer can use multiple threads to interleave processes, true parallelism does not occur without multiple CPUs. Distributed computing goes further, spreading parallel work units across numerous machines, but it incurs additional task-management requirements.

Concurrent programming rests on task management and task communication. The task manager distributes work units to available threads, while task communication uses shared state and memory to establish the initial parameters for a task and to collect the result of its work. Task communication requires locking mechanisms to prevent the subtle bugs that arise when multiple tasks overwrite the same memory locations, while preserving as much of the parallel performance gain as possible. State and shared-memory synchronization can be controlled with locks, monitors, and other techniques that block threads from altering state while another thread is making changes.
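As a minimal sketch of why locking matters, consider a counter shared by several threads; the class and member names below are illustrative, not from any particular library:

class SharedCounter
{
    private readonly object _sync = new object();
    private int _count;

    public void Increment()
    {
        // Without the lock, two threads could read the same value
        // and each write back count + 1, silently losing an update.
        lock (_sync)
        {
            _count++;
        }
    }
}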

Microsoft’s Parallel Computing Development Center provides support for parallelism, programming models, libraries, and tools, including F#, the Task Parallel Library (TPL), the Parallel Extensions assembly (PFX), and PLINQ.

F# is a typed functional programming language for the .NET Framework that does not directly support concurrent programming, although it does include asynchronous workflows for I/O. TPL is designed to assist in writing managed code for multiple processors. TPL and PLINQ ship together under the PFX umbrella. PLINQ is LINQ in which the query is run in parallel: it builds on TPL, taking the query's iterations and assigning the work units to threads (typically one per processor core).
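As a sketch of what PLINQ looks like in practice (assuming the AsParallel() extension method that Parallel Extensions adds to LINQ sources; the data and predicate here are made up for illustration):

using System.Linq;

int[] numbers = Enumerable.Range(1, 1000000).ToArray();

// AsParallel() partitions the source and lets the query body
// run on multiple cores, with TPL scheduling the work units.
var evens = numbers.AsParallel()
                   .Where(n => n % 2 == 0)
                   .ToArray();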

Collectively, these efforts, together with the .NET Parallel class, ease the development of threaded applications. Nevertheless, state and shared-memory issues are left for the programmer to solve. Adding concurrent programming capabilities to queries therefore seems a natural extension of LINQ.
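For instance, a loop whose iterations are independent can be handed to the Parallel class; this sketch assumes the Parallel.For overload that takes a range and a body delegate:

using System;
using System.Threading;   // home of Parallel in the CTP builds; later releases move it

double[] results = new double[100];

// Each iteration is independent, so the scheduler is free
// to spread them across the available cores.
Parallel.For(0, results.Length, i =>
{
    results[i] = Math.Sqrt(i);
});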

As a result, the next series of technological advances in the information revolution will be strongly dependent on concurrent programming.

Threads

Today, work items are run by creating Thread objects, such as:

// DoSomeWorkMethod must match ParameterizedThreadStart: void DoSomeWorkMethod(object state)
Thread t = new Thread(DoSomeWorkMethod);
t.Start(someInputValue);

For 10 work items we could create 10 threads, but this is not ideal: each extra thread brings context-switching overhead, cache invalidation, and the memory consumed by its stack. An alternative is to use the .NET ThreadPool class:

ThreadPool.QueueUserWorkItem(DoSomeWorkMethod, someInputValue);

However, this lacks the richness of the full Thread API: we do not get a reference to the work item, and there is no explicit support for knowing when it has completed.
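A common workaround is to signal completion ourselves; the sketch below wraps the same placeholder work method as above with a ManualResetEvent:

using System.Threading;

ManualResetEvent done = new ManualResetEvent(false);

ThreadPool.QueueUserWorkItem(state =>
{
    DoSomeWorkMethod(state);
    done.Set();      // signal that the work item has finished
}, someInputValue);

done.WaitOne();      // block until the work item signals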

Parallel Extensions introduces a new Task class, similar to Thread but with semantics close to ThreadPool. A code snippet for the new Task class is:

Task t = Task.Create(DoSomeWorkMethod, someInputValue);
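Unlike the ThreadPool version, the returned Task object is a handle to the work item; assuming the CTP's Wait method, the caller can block until it completes:

t.Wait();    // block until the task's work is done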

See References for further material.


REFERENCES

[1] Alesso, H. P. and Smith, C. F., Connections: Patterns of Discovery, John Wiley & Sons Inc., New York, NY, 2007.

[2] Clifton, M., “Concurrent Programming - A Primer,” 3 January 2008.

[3] Microsoft, Concurrent Programming, 2008.

[4] Moth, D., Parallel Extensions to the .NET Framework, 28 February 2008.
