It’s spring and a new crop of graphics processors is being readied for market
and I’m very excited about what’s coming. I was going to say graphics
processors are so cool—but then I would have had to add seventeenleven
sentences explaining that I was using the vernacular convention of the
word and not the thermal usage of the word, for heaven knows these puppies
ain’t cool thermally. But they are damn cool, and as I’ve been saying
for a few years now, the CPU is the co-processor, and the GPU-VPU is
the processor, period, end of discussion.
You can see the outpacing of the CPU in so many ways. First of all, the
GHz race is now officially over—Intel said so. Never mind that
AMD, Transmeta, and VIA already abandoned that silly metric (it’s like
gauging a car’s performance on how many RPMs the engine can hit); Intel
said it now, so it’s official. Intel said it for a couple of reasons:
for one, they can’t hit the high notes as easily anymore, and for
another, the users are bored with it. It was meaningless when it was
started and it has even less meaning to the user today. (“Ah, how many
GHz do I need to run a 200 by 1280 spreadsheet?”) As for the high
notes, as you know, and I’ve commented on this in other parts of this
week’s issue, the shrinkage scaling process of CMOS transistors is about
to come to an end and with it the increases in GHz.
What we’re doing with transistors to get the game rolling is making
them taller, and that will work for a while. What we’re doing with processors
to make them more powerful is run them in parallel, and Intel’s Hyper-Threading
was a half step in that direction. But parallel is the universe graphics
lives in—it naturally scales in parallel, and you can see that
with the first GPUs that bragged about multiple pipes, and with 3Dlabs’s
P10.
The difference between the lowly but noble CPU and the mighty and
regal GPU is that GPUs scale naturally with the apps, whereas CPUs require
the apps to be rewritten and recompiled. I can hear Craig Barrett calling
Larry Ellison, “Hey, Lar, we got a new gizmo and we need you to recompile
all your code for it. When can you have that ready?”
With its parallel architectures, the GPU—maybe I’ll start referring
to it as the MP (the Main Processor)—scales out and up, so to speak,
giving it an exponential curve that exceeds the Moore’s Law curve.
But wait—it gets even better. When Longhorn comes with WinFS all
those full IEEE 32-bit floating-point processors in the GPU/VPUs—i.e.,
the “MPs”—can be put to other uses besides just crunching and enhancing
pixels. One of Longhorn’s big goals is a database engine that will find
things for you like a really smart agent. You say, Where’s that file
that I sent to Jim, or was it Jerry, last month, or maybe last year,
about the whatchamacallit? Now to do a search for that kind of abstraction
you need a lot of horsepower (not to mention one hell of a meta file
on each file), and with 32 or 48 floating-point processors sitting on a PCI
Express gateway you’ll have processing capabilities only envisioned
in science fiction stories.
There have already been some such uses: national laboratories and universities
have used the floating-point processors in GPUs as parallel processors.
Those guys are going to be ecstatic when they see the new crop. And
they are cheap, cheap, cheap! Four to six scalar units, four to six
vector units, and 16 FPUs plus miscellaneous other little thingies to
crunch special numbers, for what, maybe $300? And what does a Prescott
sell for with its lowly single processor and make-believe HT stuff?
The GPU/VPUs are also much more intimate with their memory than a CPU,
and they have as much as a CPU (if not more), and the GPU/VPU’s memory
runs faster. And what makes a computer fast and useful, class? That’s
right, a lot of fast memory.
So the core processor on a CPU may run like hell, 3.5 GHz, and a GPU
core may only run at 550 MHz, but it’s like a VLIW device in that it’s
doing roughly 16 x 32 x 500 million bits per second as compared to 32 x 3,500
million a second, or the equivalent of about 8 GHz. And next year when the CPU gets
up to 4 GHz, the GPU will be the equivalent of 19 GHz (assuming a 600-MHz
core and 32 pipes).
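
For anyone who wants to check that back-of-the-envelope math, here is a minimal Python sketch. It assumes, as the paragraph above does, that every pipe moves 32 bits per clock and that the CPU is a single 32-bit-wide unit. The function name equivalent_ghz and its parameters are made up for illustration, and the number it produces is a raw bit-rate comparison, not a benchmark.

```python
def equivalent_ghz(pipes, core_mhz, bits_per_pipe=32, cpu_width_bits=32):
    """Clock (in GHz) that a single cpu_width_bits-wide processor would
    need in order to match the GPU's raw bits-per-second figure.

    Assumption: each pipe handles bits_per_pipe bits every clock cycle.
    """
    # Raw GPU throughput in millions of bits per second.
    gpu_mbits_per_sec = pipes * bits_per_pipe * core_mhz
    # Divide by the CPU's width to get an equivalent clock in MHz, then GHz.
    return gpu_mbits_per_sec / cpu_width_bits / 1000.0


# This year's part: 16 pipes, clock rounded to 500 MHz as in the text.
this_year = equivalent_ghz(pipes=16, core_mhz=500)

# Next year's part: 32 pipes at 600 MHz.
next_year = equivalent_ghz(pipes=32, core_mhz=600)

print(f"this year: {this_year:.1f} GHz equivalent")
print(f"next year: {next_year:.1f} GHz equivalent")
```

With the round 500-MHz number the 16-pipe part works out to 8 GHz; plug in 550 MHz instead and it is closer to 8.8 GHz. The 32-pipe, 600-MHz case lands at 19.2 GHz, which is where the 19 GHz above comes from.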
Of course, we’ll have to use cryogenic cooling to get these things
to work, but that just adds to the excitement. “What’sa matter, Ralph?”
“My damn CO2 tank ran out and my machine is shutting down—why can’t
IT keep these things filled?”
I’m telling ya, these graphics thingies are cool, cool, cool!