Friday, April 3, 2009

4 cores? 8 cores? How about 240 processor cores?

NVIDIA has been making some pretty heavy-duty display cards for professionals for a number of years now. I use two NVIDIA Quadro dual-head cards to drive my four-monitor post production workstation. The acceleration provided for visual effects preview is extremely helpful in getting more work done in less time.
With all the focus on bigger and badder CPUs in our workstations, one of the more intriguing advancements in computer muscle is happening somewhat quietly. That would be the advent of parallel processing over a much larger group of processor cores.
While I'm absolutely positive we'll continue to see advancements in CPU power, one CPU core can only process one operation at a time...at an incredible speed, of course. When you add more physical processors, you gain processing power, but your limitation becomes how well the math can be sectioned up between the processors, the energy expended figuring out how to divide the operations up, and, on the back side, reassembling the results into one unified dataset.
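To make that divide-and-reassemble overhead concrete, here's a toy sketch in Python (my own illustration, not tied to any particular workstation software): the data gets sectioned up, each worker processes its share, and the partial results have to be stitched back into one answer.

```python
# Toy illustration: splitting work across processors carries its own cost.
# We sum a list serially, then divide it among workers and reassemble.
from multiprocessing import Pool

def partial_sum(chunk):
    """Each worker handles one slice of the data."""
    return sum(chunk)

def parallel_sum(data, workers=2):
    # Divide the operations up...
    size = len(data) // workers
    chunks = [data[i * size:(i + 1) * size] for i in range(workers)]
    chunks[-1].extend(data[workers * size:])  # leftover items go to the last worker
    with Pool(workers) as pool:
        partials = pool.map(partial_sum, chunks)
    # ...and on the back side, reassemble the results into one unified answer.
    return sum(partials)

if __name__ == "__main__":
    data = list(range(1000))
    print(parallel_sum(data, workers=2))  # same answer as sum(data)
```

The slicing and the final reassembly are pure bookkeeping: work the serial version never has to do, which is exactly why two processors don't simply double your speed.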
When you have multiple logical cores on one physical chip, you now have the ability to do multiple operations using each logical core, limited by the pipeline that gets the operations on and off the chip as well as the efficiency of the code that divides and reassembles the data. Multiple physical processors would involve multiple "pipes" to get data on and off each processor, gaining some extra torque over an equal configuration utilizing the same number of logical cores on one physical processor.
For operations like 3D animation or complex visual effects, where the data that needs to be streamed onto the processor is small in relation to the processing necessary, multiple physical or logical cores are of immediate benefit. In video editing applications, adding processor cores can be helpful where large amounts of decode and encode operations are necessary, say when editing highly compressed HDV or AVCHD footage. In applications where the material is less compressed, or even uncompressed, multiple processor cores become less of an advantage: the dataset that needs to be moved becomes larger but requires less processing, moving the speed burden to hard drives and bus speed.
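A back-of-the-envelope way to see why the bottleneck moves (the numbers below are made up purely for illustration, not measurements of any real codec, bus, or drive): each frame takes the longer of its transfer time and its compute time, so the slowest stage dominates.

```python
# Rough model: a frame takes max(transfer time, compute time).
# All figures are invented for illustration only.
def frame_time(frame_bytes, bus_bytes_per_s, ops, ops_per_s):
    transfer = frame_bytes / bus_bytes_per_s
    compute = ops / ops_per_s
    return max(transfer, compute)  # slowest stage sets the pace

# Highly compressed footage: small frames, heavy decode math.
# Compute dominates, so extra cores pay off.
compressed = frame_time(0.5e6, 1e9, 2e9, 1e9)

# Uncompressed footage: huge frames, trivial math.
# The bus and drives dominate, and extra cores sit idle.
uncompressed = frame_time(120e6, 1e9, 1e6, 1e9)

print(compressed)    # compute-bound: 2e9 ops / 1e9 ops per s = 2.0 s
print(uncompressed)  # transfer-bound: 120e6 bytes / 1e9 bytes per s = 0.12 s
```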
NVIDIA has recently started focusing on its CUDA technology. CUDA gives software manufacturers a way to tap into the processing architecture of NVIDIA's powerful graphics cards to complete processes that may or may not be display or graphics related. NVIDIA uses parallel processing to get the speed from this configuration. While the cores may be smaller, there are a LOT more of them. The Quadro 5800 card, for instance, has 240 processor cores. One example of this kind of processing in use is the CUDA-enabled RapiHD™ H.264 encoding plug-in for NVIDIA Quadro cards.
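Actual CUDA code is C-like and runs on the card itself, but the shape of the idea can be sketched in plain Python: you write one tiny "kernel" that handles a single element with no dependency on its neighbors, and the hardware runs one copy per element across its hundreds of cores. (The brighten example below is my own toy, not anything from NVIDIA's toolkit.)

```python
# Serial thinking: one core walks every pixel in turn.
def brighten_serial(pixels, amount):
    out = []
    for p in pixels:
        out.append(min(p + amount, 255))
    return out

# Data-parallel thinking: the "kernel" is the per-element work one
# small GPU core would do. It touches only its own pixel.
def brighten_kernel(p, amount):
    return min(p + amount, 255)

def brighten_parallel(pixels, amount):
    # On a GPU, the runtime would launch one thread per element.
    # Here we just map the kernel to show there is no cross-element
    # dependency -- which is what makes the job spread across 240 cores.
    return [brighten_kernel(p, amount) for p in pixels]
```

Graphics work is full of jobs shaped like this, which is why the same architecture turns out to be useful for non-display math too.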
(Wikipedia's take on parallel processing: good general info.)
A way to picture the relative capability of an 8-core CPU and a 240-core GPU might be to picture five decks of cards being dealt out. With eight dealers, each dealer has 32.5 cards to distribute...with 240 dealers, each one distributes 1.083 cards. Even when we take into account that the 240 processor cores are smaller, the share of the load they have to carry is MUCH smaller and the processing is all happening at the same time. The 8 dealers may be very fast, but they can't throw 32 cards out at the same time and expect them to fall neatly in front of each player in the proper rotation...they have to go one at a time. They may be dealing cards out of a pitching machine at a velocity that could sever a human limb, but the cards still have to be handled one at a time, serially. In the case of the 240 dealers in parallel, they also handle the cards one at a time, with the one caveat that all but 20 of them are only handling one card with one destination. Which way do you think would be faster?
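For the skeptics, the arithmetic in the analogy checks out:

```python
# Five decks dealt out: 8 CPU-style dealers vs. 240 GPU-style dealers.
cards = 5 * 52                 # 260 cards total
print(cards / 8)               # 32.5 cards per dealer
print(round(cards / 240, 3))   # 1.083 cards per dealer
print(cards - 240)             # 20 dealers end up handling a second card
```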
I think that GPU-based processing is one of the most interesting areas of computer processing to keep an eye on... With GPUs becoming available to handle general instructions alongside ever more powerful CPUs, I don't think that the exponential growth in computing speed and power will be leveling off anytime soon. This technology is even being deployed as the primary processor in specialized workstations...learn more about Tesla here.

TimK