CPU benchmark can't arrive at the same result twice in a row

Profile Dorsai
Joined: 7 Sep 04
Posts: 474
Credit: 4,504,838
RAC: 0
United Kingdom
Message 149302 - Posted: 9 Aug 2005, 21:46:32 UTC

Log copy...

--- - 2005-08-09 22:22:36 - Running CPU benchmarks
--- - 2005-08-09 22:22:36 - Suspending computation and network activity - running CPU benchmarks
--- - 2005-08-09 22:23:37 - Benchmark results:
--- - 2005-08-09 22:23:37 - Number of CPUs: 2
--- - 2005-08-09 22:23:37 - 1560 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:23:37 - 940 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:23:37 - Finished CPU benchmarks
--- - 2005-08-09 22:23:38 - Resuming computation and network activity
--- - 2005-08-09 22:24:25 - Running CPU benchmarks
--- - 2005-08-09 22:24:25 - Suspending computation and network activity - running CPU benchmarks
--- - 2005-08-09 22:25:26 - Benchmark results:
--- - 2005-08-09 22:25:26 - Number of CPUs: 2
--- - 2005-08-09 22:25:27 - 1565 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:25:27 - 2734 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:25:27 - Finished CPU benchmarks
--- - 2005-08-09 22:25:28 - Resuming computation and network activity
--- - 2005-08-09 22:26:42 - Running CPU benchmarks
--- - 2005-08-09 22:26:43 - Suspending computation and network activity - running CPU benchmarks
--- - 2005-08-09 22:27:44 - Benchmark results:
--- - 2005-08-09 22:27:44 - Number of CPUs: 2
--- - 2005-08-09 22:27:44 - 1563 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:27:44 - 1312 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:27:44 - Finished CPU benchmarks
--- - 2005-08-09 22:27:45 - Resuming computation and network activity
--- - 2005-08-09 22:27:48 - Running CPU benchmarks
--- - 2005-08-09 22:27:49 - Suspending computation and network activity - running CPU benchmarks
--- - 2005-08-09 22:28:50 - Benchmark results:
--- - 2005-08-09 22:28:50 - Number of CPUs: 2
--- - 2005-08-09 22:28:50 - 1563 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:28:50 - 1316 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:28:50 - Finished CPU benchmarks
--- - 2005-08-09 22:28:51 - Resuming computation and network activity
--- - 2005-08-09 22:34:52 - Running CPU benchmarks
--- - 2005-08-09 22:34:52 - Suspending computation and network activity - running CPU benchmarks
--- - 2005-08-09 22:35:53 - Benchmark results:
--- - 2005-08-09 22:35:53 - Number of CPUs: 2
--- - 2005-08-09 22:35:53 - 1562 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:35:53 - 2806 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:35:53 - Finished CPU benchmarks

Copied from BOINC, version 4.19.

I did not use the PC for anything.

Five benchmarks in a row, and the Dhrystone number varies between 940 and 2806.

Big difference from one to the next. No overclocking. No change to the system. Two AMD MP2000 CPUs.

My other PC, with an XP2000 CPU (only one CPU), gets 3300, plus or minus 200.

Why the mad variance on the dual-CPU PC?





Foamy is "Lord and Master".
(Oh, + some Classic WUs too.)
ID: 149302 · Report as offensive
Profile tekwyzrd
Volunteer tester
Joined: 21 Nov 01
Posts: 767
Credit: 30,009
RAC: 0
United States
Message 149305 - Posted: 9 Aug 2005, 21:50:35 UTC

Looks like I'm not the only one having this kind of problem.
Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws.
Douglas Adams (1952 - 2001)
ID: 149305 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 149343 - Posted: 10 Aug 2005, 0:10:55 UTC - in response to Message 149302.  


I know you said you aren't using this for anything but SETI, but what else is running on this machine?
ID: 149343 · Report as offensive
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 149694 - Posted: 10 Aug 2005, 13:36:52 UTC

You can effectively have "nothing" else running under Windows and see this very thing. We did tests back in Beta and nothing has changed since. The benchmarks are unstable and do not reflect actual capabilities.
ID: 149694 · Report as offensive
STE\/E
Volunteer tester

Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 149695 - Posted: 10 Aug 2005, 13:40:32 UTC

The benchmarks are unstable and do not reflect actual capabilities
==========

Then why are they used, Paul? Why can't all the projects be like CPDN, where after so much processing you get so much credit? It seems to me it would be a whole lot simpler that way, and a lot less confusing about why the credits jump around so much.
ID: 149695 · Report as offensive
Profile Kajunfisher
Volunteer tester
Joined: 29 Mar 05
Posts: 1407
Credit: 126,476
RAC: 0
United States
Message 149707 - Posted: 10 Aug 2005, 14:20:39 UTC

Any service or process that is running in the background, even moving the mouse, will make the benchmarks erratic.

CPDN uses "timesteps" or "trickles", meaning that after approximately every 1.39% processed the WU will attempt to contact the server to report progress.

If your benchmark values are lower it will certainly take longer between each step or trickle to report.

Running your computer with the fewest services necessary will increase your benchmark scores. They will not be exactly the same each time they are run, but the scores will be a lot closer if you run them 2-3 times in a row.
ID: 149707 · Report as offensive
Profile Dorsai
Joined: 7 Sep 04
Posts: 474
Credit: 4,504,838
RAC: 0
United Kingdom
Message 149745 - Posted: 10 Aug 2005, 16:28:52 UTC

The only program running at the time of the benchmarks, apart from Windows services and BOINC, was AVG antivirus.
I did not move the mouse, touch the keyboard, etc.
The PC was built solely for SETI crunching. The only software installed is what is needed to do this: XP Pro, USB networking, BOINC, and antivirus, as the PC has net access.

Checks with CPU-Z after the benchmarks had run showed both CPUs running at 1.667 GHz, so neither CPU was throttling, and temperature reports taken from the on-die sensor showed temps below 60 °C for both CPUs.

I thought it strange that the Whetstone value varied from 1560 to 1565, a spread of 5 (i.e. nothing to write home about), while the Dhrystone varied from 940 to 2806, a spread of 1866 (loads).
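
To put numbers on that contrast, here is a minimal Python sketch that pulls the scores out of the log lines from the first post and reports the spread and the coefficient of variation for each benchmark. The regular expression and the function name `summarize` are illustrative assumptions based only on the log format shown in this thread; they are not part of BOINC.

```python
import re
from statistics import mean, pstdev

# Result lines copied from the log in the first post (BOINC 4.19 message format).
LOG = """\
--- - 2005-08-09 22:23:37 - 1560 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:23:37 - 940 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:25:27 - 1565 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:25:27 - 2734 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:27:44 - 1563 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:27:44 - 1312 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:28:50 - 1563 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:28:50 - 1316 integer MIPS (Dhrystone) per CPU
--- - 2005-08-09 22:35:53 - 1562 double precision MIPS (Whetstone) per CPU
--- - 2005-08-09 22:35:53 - 2806 integer MIPS (Dhrystone) per CPU
"""

# Assumed pattern: "- <score> <kind> MIPS (<benchmark name>)".
PATTERN = re.compile(
    r"- (\d+) (?:double precision|integer) MIPS \((Whetstone|Dhrystone)\)"
)


def summarize(log_text):
    """Return {benchmark: (min, max, spread, coefficient_of_variation)}."""
    values = {}
    for score, name in PATTERN.findall(log_text):
        values.setdefault(name, []).append(int(score))
    summary = {}
    for name, scores in values.items():
        spread = max(scores) - min(scores)
        cv = pstdev(scores) / mean(scores)  # population std dev relative to the mean
        summary[name] = (min(scores), max(scores), spread, cv)
    return summary


if __name__ == "__main__":
    for name, (lo, hi, spread, cv) in summarize(LOG).items():
        print(f"{name}: min={lo} max={hi} spread={spread} cv={cv:.1%}")
```

The coefficient of variation is used here only because it makes the two benchmarks comparable despite their different magnitudes; any other relative measure would serve the same purpose.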

I wondered if this might be indicative of some form of problem?

So far all WUs take about the same time to crunch, and those that have made it through the validators (LOL) have validated OK, bar two "client errors" caused by user error (me)...

I'll just put it down to the "vagaries" of BOINC, and forget it.


Foamy is "Lord and Master".
(Oh, + some Classic WUs too.)
ID: 149745 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 149760 - Posted: 10 Aug 2005, 22:27:09 UTC - in response to Message 149695.  

The benchmarks are unstable and do not reflect actual capabilities
==========

Then why are they used, Paul? Why can't all the projects be like CPDN, where after so much processing you get so much credit? It seems to me it would be a whole lot simpler that way, and a lot less confusing about why the credits jump around so much.

Because the only way to fix the problem is to suspend all other tasks on the machine, and that isn't very practical.

It's also reasonable, because anything that throws off the benchmarks will probably affect processing even more.
ID: 149760 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 149762 - Posted: 10 Aug 2005, 22:28:18 UTC - in response to Message 149745.  

The only program running at the time of the benchmarks, apart from Windows services and BOINC, was AVG antivirus.

Most antivirus programs present a big load...

ID: 149762 · Report as offensive
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 150522 - Posted: 12 Aug 2005, 14:03:10 UTC - in response to Message 149695.  

Then why are they used, Paul? Why can't all the projects be like CPDN, where after so much processing you get so much credit? It seems to me it would be a whole lot simpler that way, and a lot less confusing about why the credits jump around so much.

They are used because they are faster. Again, we have beaten this to death elsewhere ... some of our notes are in the Wiki, at least as far as the conclusions. Also, if you go to the base site (leave off the BOINC wiki part) you will see a lecture notes entry; read down there for a whole 4-hour lecture on benchmarks and why they stink.

The best benchmark would be for us to run the actual application with a test work unit. I did a long thread where we discussed the ins and outs of this ... in general, the idea would be to class machines based on a reference work unit for which there is a "known" answer. The downside is that this takes an hour to a day (or longer on really slow machines), so it would be a two-step process: bench the machine and, if it is too slow, don't use the reference work unit; instead, compare its times with a "calibrated" machine on live results. The simplification is that we would only calibrate a percentage of the machines out there with the reference work unit. All others would have a calibration against calibrated machines using the actual processing times.

Over time, all machines would eventually be calibrated directly, or with a single indirect step. Random machines would be selected daily and would process the reference work unit, returning the answer, for which they would be granted credit. So, not only would we be testing the speeds, we also would be testing the accuracy of the returned data.
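
A rough Python sketch of the two calibration routes described above, under loud assumptions: nothing like this exists in BOINC as far as this thread shows, the function names `direct_factor` and `indirect_factor` are hypothetical, and the simple ratio-of-run-times model is just one plausible reading of the proposal.

```python
def direct_factor(reference_seconds, machine_seconds):
    """Speed of a machine relative to a reference host, from the reference work unit.

    2.0 means the machine finished the reference work unit in half the
    time the reference host needed.
    """
    if reference_seconds <= 0 or machine_seconds <= 0:
        raise ValueError("run times must be positive")
    return reference_seconds / machine_seconds


def indirect_factor(calibrated_factor, calibrated_seconds, machine_seconds):
    """Estimate a factor for a machine that never ran the reference work unit.

    Both machines are assumed to have processed the same live work unit;
    the calibrated machine's known factor is scaled by the ratio of run times.
    """
    if calibrated_seconds <= 0 or machine_seconds <= 0:
        raise ValueError("run times must be positive")
    return calibrated_factor * calibrated_seconds / machine_seconds


if __name__ == "__main__":
    # Made-up run times, purely for illustration.
    a = direct_factor(reference_seconds=3600, machine_seconds=7200)
    b = indirect_factor(a, calibrated_seconds=8000, machine_seconds=4000)
    print(f"machine A (direct): {a:.2f}, machine B (indirect via A): {b:.2f}")
```

With these invented numbers, machine A comes out at half the reference speed, and machine B, which ran the same live work unit twice as fast as A, comes out equal to the reference.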

The only negative is that we would be calibrating, at least initially, with only SETI@Home work. This is less "cross-project friendly" than the completely neutral benchmarks. However, the benchmarks stink so badly, and have such poor characteristics, that I think that should be a minor consideration. Long-running projects like CPDN might not ever have a direct calibration ... but LHC@Home and Predictor@Home have short-running work units that could also be great candidates for calibration. New projects could test in the lab with the current calibration "constants" to develop their own ...

All-in-all, a sound concept ... I don't know what chances it has to be incorporated, or when, or even if ... like the rest of you, I have no insight or access that convinces me that this proposal has a chance ... though I did post the concept on the developer's mailing list ... which was received with such enthusiasm that there was not a single reply (as my memory serves) ... so, make of *THAT* what you will ...
ID: 150522 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20304
Credit: 7,508,002
RAC: 20
United Kingdom
Message 150530 - Posted: 12 Aug 2005, 14:16:31 UTC - in response to Message 150522.  

Then why are they used ...

... All-in-all, a sound concept ... I don't know what chances it has to be incorporated, or when, or even if ...

In short, it needs someone to volunteer to code it up.

Compared to other current issues, I would guess that the current 'fiddle-factor' fix has pushed the benchmarking issue right down to the bottom of the pile.

(Credits are, after all, just a bit of fun to entice people along a little :) )

Cheers,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 150530 · Report as offensive
Profile SunMicrosystemsLLG

Joined: 4 Jul 05
Posts: 102
Credit: 1,360,617
RAC: 0
United Kingdom
Message 150569 - Posted: 12 Aug 2005, 16:30:11 UTC

In my experience it doesn't do a bad job ... but I guess the hardware type might play a part

I am currently running 32 machines that are effectively clones of each other.
I don't just mean they are installed in the same way - they are all booted over a SAN from the SAME OS image (all the machines are diskless).
They are all exactly the same hardware config (2 x 2.6 GHz Opteron 252s with 8 GB DDR400 RAM) and exactly the same software config (Solaris 10 x86-64 running Grid Engine execution daemon and a BOINC 4.32 Solaris client).


Although the benchmark figures range from:

Measured floating point speed 2491.53 million ops/sec
Measured integer speed 6762.64 million ops/sec

up to:

Measured floating point speed 2477.11 million ops/sec
Measured integer speed 7288.85 million ops/sec

only 1 or 2 of the machines are outside of:

Measured floating point speed 2400-2500 million ops/sec
Measured integer speed 6800-6900 million ops/sec



There is absolutely nothing else on these machines - the only login session is Grid Engine's scheduler login.
It seems the majority of the variance IS down to the benchmark itself rather than what else the machine might be doing at the time, but for the most part the benchmark is coming back with near identical results for identical machines ...
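
As a small worked check of the figures quoted above, the following Python sketch tests the two reported results against the stated bands and computes the relative spread between them. Only the two results given in this post are used; the labels and function names are illustrative assumptions, and the other 30 machines are not represented.

```python
# (floating point, integer) speeds in million ops/sec, as quoted in the post.
REPORTED = {
    "first quoted result": (2491.53, 6762.64),
    "second quoted result": (2477.11, 7288.85),
}

# Bands that most machines are said to fall inside.
FP_BAND = (2400.0, 2500.0)
INT_BAND = (6800.0, 6900.0)


def in_band(value, band):
    """True if value lies within the inclusive (low, high) band."""
    low, high = band
    return low <= value <= high


def classify(fp, integer):
    """Report whether each score falls inside its band."""
    return {
        "fp_in_band": in_band(fp, FP_BAND),
        "int_in_band": in_band(integer, INT_BAND),
    }


def relative_spread(low, high):
    """Spread between two values as a fraction of the lower one."""
    return (high - low) / low


if __name__ == "__main__":
    for label, (fp, integer) in REPORTED.items():
        print(label, classify(fp, integer))
    print(f"integer spread: {relative_spread(6762.64, 7288.85):.1%}")
    print(f"floating point spread: {relative_spread(2477.11, 2491.53):.1%}")
```

Both quoted results sit inside the floating point band but outside the integer band, and the integer scores differ by about 7.8% against roughly 0.6% for floating point, which matches the picture of a few integer outliers among otherwise near-identical machines.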
ID: 150569 · Report as offensive
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 151043 - Posted: 13 Aug 2005, 17:08:38 UTC

I can buy variation ... but in the third significant position? I could buy plus or minus 10 ... but even that is hard to swallow when you realize that you are talking about a spread of ONLY 10 MILLION operations a second ... hmmm ... that's a lot of operations ...
ID: 151043 · Report as offensive



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.