Re: A 30p faster than A 31 in July PC World?


From: Mark Bell (electrosoft_at_earthlink.net)
Date: Sat Jun 22 2002 - 01:20:38 EDT


At 03:38 PM 6/21/2002 -0700, Peter Machule wrote:
>easy ...mhz does not mean a whole lot .
>http://www.pcmag.com/article2/0,4149,2298,00.asp
>
>
>
>Just because they say something is faster doesn't mean it always is. During
>our tests, the Mobile Intel Pentium 4 Processor-M, at 1.6 to 1.8 GHz, did
>not always garner as much performance on Business Winstone 2001 as the
>1.2-GHz PIII-M. The P4-M's deeper pipeline leads to an increased latency as
>disparate tasks are flushed out of the pipeline.

This means it chokes on software that isn't optimized for the P4: those long
pipes fill up with mispredicted branches, and the chip has to flush and start
over again. If the world were sequential, the P4 would rule. That deep
pipeline is also one of the reasons the P4 scales so well for sheer MHz, but
sheesh, its IPC and FPU are atrocious. Even my desktop P4 1.8 @ 2.7 GHz
(150 MHz FSB) gets manhandled by a 1.2 GHz P3's FPU.
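
A back-of-the-envelope way to see why a deeper pipe hurts more on branchy
code is the usual textbook estimate: average cycles per instruction equals
the base CPI plus (fraction of branches) x (mispredict rate) x (cycles lost
per flush). The sketch below just evaluates that formula. Every number in it
(the 20% branch fraction, the 10- and 20-cycle penalties, the base CPI of
1.0) is an illustrative assumption, not a measurement of any real chip.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once mispredicted-branch flushes are included."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# All figures below are illustrative assumptions, not measurements.
BRANCH_FRACTION = 0.20    # assume 1 in 5 instructions is a branch
SHORT_PIPE_PENALTY = 10   # assumed flush cost for a shorter (P3-like) pipeline
LONG_PIPE_PENALTY = 20    # assumed flush cost for a deeper (P4-like) pipeline

for rate in (0.02, 0.10, 0.25):
    short_cpi = effective_cpi(1.0, BRANCH_FRACTION, rate, SHORT_PIPE_PENALTY)
    long_cpi = effective_cpi(1.0, BRANCH_FRACTION, rate, LONG_PIPE_PENALTY)
    print(f"mispredict {rate:.0%}: short pipe CPI {short_cpi:.2f}, "
          f"long pipe CPI {long_cpi:.2f}")
```

The gap between the two columns widens as the mispredict rate climbs, which
is the "flush and start over" cost in numeric form. A higher clock can still
win overall; this only models the per-instruction penalty.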

>In particular, the Dell
>Latitude C610, with its 1.2-GHz PIII-M processor, beat its mainstream
>notebook rivals, including the 1.7-GHz P4-M-based HP Omnibook vt6200 and
>Toshiba Tecra 9100, on Business Winstone.

I wrote a review of the Inspiron 8100/8200 at:

http://home.earthlink.net/~electrosoft/i8200.htm

Besides my pet peeve with the GF4 **MX** chipset in their 440, my greater peeve
was the P4's poor performance, and the fact that the P3 never got support for
the superior subsystems that were given to the P4.

The P4 is a dog on its own. Unless your software is optimized for SSE2 and
accounts for the P4's superscalar design and its deep pipeline, you have a
CPU that is inferior in raw IPC efficiency. Luckily, it is a CPU that is
bolstered by DDR or RDRAM and, in laptops, by better video subsystems (even
though users have reported swapping video modules in their 8100s for the
GF 440 Go).

>On the other hand, we could see that the P4-M, like the desktop P4,
>generally does well at tasks that involve sequential processing, like those
>in the Content Creation Winstone 2002 and Adobe Photoshop tests.

AKA, predictable, sequential streams of instructions that are rarely
mispredicted and tossed out of the pipe. If you give the P4 data it can
predict and work with in a knowing fashion, it does show some zip.
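
To make "data it can predict" concrete, here is a minimal sketch of a
textbook 2-bit saturating-counter branch predictor run against two made-up
branch histories. This is a classroom scheme chosen for illustration, not a
description of the P4's actual predictor, and the two input patterns are
assumptions picked to show the extremes.

```python
import random


def two_bit_accuracy(outcomes):
    """Fraction of branches a single 2-bit saturating counter predicts correctly."""
    counter = 2  # states 0-1 predict not taken, 2-3 predict taken; start weakly taken
    correct = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken == taken:
            correct += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return correct / len(outcomes)


# Loop-style branch: taken 99 times, then falls through once, repeated.
loop_like = ([True] * 99 + [False]) * 100
# Data-dependent branch: effectively a coin flip (seeded for repeatability).
rng = random.Random(0)
coin_flip = [rng.random() < 0.5 for _ in range(10_000)]

print(f"loop-like branch: {two_bit_accuracy(loop_like):.1%}")
print(f"coin-flip branch: {two_bit_accuracy(coin_flip):.1%}")
```

The loop-like history is guessed right almost every time, while the coin-flip
history lands near half, and every wrong guess is a pipeline flush.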

A gent on the comp.sys.laptops group specifically tested sequential versus
random memory reads (thrashing). The P4 with DDR takes a serious hit on
random reads, and the A30p was ~10% faster overall, if I remember correctly.

Because I was curious, I coded a small test procedure. If you run a large
query of sequential reads and scaled processing (for example, in PL/SQL, a
query that is sequential and does not rely on random hits), the P4 zooms
through it ~30% faster than my P3 (A30p versus A31p here, folks). But take
that same test and code it for random hits, and the A30p was ~25% faster.
This was using a replicated database from a client that was chock full of
data from Access and then migrated to Oracle (eventually put on a back-end
server), so it had plenty of data to keep it busy. I just wanted to make sure
I had some very large datasets to work with. I took it a step further and
used a coded trigger based on user input (which was conveniently put into a
data file to simulate 100+ user inputs), and it bogged down ~9% more on the
P4, but only ~4% on the P3. I will look into this even more when I have time.
But it is clear that the P4 really gets caught up with random data
processing.
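
For anyone who wants to try the same kind of comparison, here is a minimal
sketch of the harness shape: do identical work twice, once walking the keys
in order and once in shuffled order, and time both. It is not the Oracle test
described above. It uses Python's built-in sqlite3 module as a stand-in, and
the `orders` table, its columns, and the row count are all hypothetical.
Interpreter overhead will swamp most of the hardware effect, so treat the
numbers as a starting point only.

```python
import random
import sqlite3
import time


def build_db(rows):
    """Create an in-memory table with `rows` rows keyed by an integer id."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    db.executemany("INSERT INTO orders VALUES (?, ?)",
                   ((i, i % 97) for i in range(rows)))
    db.commit()
    return db


def total_by_ids(db, ids):
    """Sum `amount` by fetching one row per id, in the order given."""
    total = 0
    for row_id in ids:
        (amount,) = db.execute(
            "SELECT amount FROM orders WHERE id = ?", (row_id,)).fetchone()
        total += amount
    return total


def timed(fn, *args):
    """Return (result, elapsed seconds) for one call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


ROWS = 50_000  # hypothetical size; raise it until the data outgrows the CPU cache
db = build_db(ROWS)
sequential_ids = list(range(ROWS))
random_ids = sequential_ids[:]
random.Random(0).shuffle(random_ids)  # same keys, shuffled order

seq_total, seq_secs = timed(total_by_ids, db, sequential_ids)
rnd_total, rnd_secs = timed(total_by_ids, db, random_ids)
print(f"sequential: {seq_secs:.3f}s  random: {rnd_secs:.3f}s  "
      f"ratio: {rnd_secs / seq_secs:.2f}")
```

Both passes touch exactly the same rows and must produce the same total, so
any timing difference comes from access order alone. Run it on each machine
and compare the ratios rather than the raw seconds.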

In its current state, the P4 is an evolutionary step rather than a
revolutionary one, and in a lot of respects even that is questionable.
Software that is recompiled for the P4's design and SSE2 will run nicely, but
SSE2-optimized software is the exception, not the norm. Look how long we
waited for MMX and SSE software.

I had really wished that Intel, or a third party, had introduced an optimized
DDR chipset so we could see what the P3 could do, especially with the
Tualatin cores out there with 512 KB of cache and 1400 MHz speeds. :)

Only now, with the Northwoods (.13 micron fab), their 512 KB cache, and
2.2 GHz+ speeds, is the P4 starting to show its somewhat faster but tainted
colors. The 133 MHz FSB models, coupled with the better DDR and RDRAM
chipsets, are overall outpacing everything out there, even AMD's chips now.
Competition is good... yeah, good!

Mark
(Who talks trash on the P4, yet uses a desktop P4 1.8 @ 2.7 and an A31p)
(I'm not biting the hand that feeds me... just nibbling at it a little <g>)



This archive was generated by hypermail 2.1.3 : Thu Jan 23 2003 - 09:59:01 EST