2 identical computers but one gives low credits

SethDK
Joined: 10 Dec 05
Posts: 4
Credit: 100,220
RAC: 0
Denmark
Message 523631 - Posted: 26 Feb 2007, 13:37:46 UTC
Last modified: 26 Feb 2007, 13:50:59 UTC

I have two computers (AuthenticAMD AMD Athlon(tm) XP 2200+ [x86 Family 6 Model 8 Stepping 0]), and both run BOINC 5.8.11.

The first one is producing the following results:
Outcome  Client state  CPU time (sec)  Claimed credit  Granted credit
Success  Done          9,455.50        12.52           12.52
Success  Done          19,601.91       26.92           26.92
Success  Done          39,558.92       61.41           61.41
Success  Done          35,258.11       60.65           pending
Success  Done          20,070.09       27.65           27.65
(Uncut version of computer results at First computer)

The second is giving me this:
Outcome  Client state  CPU time (sec)  Claimed credit  Granted credit
Success  Done          13,334.48       0.10            pending
Unknown  New           ---             ---             ---
Success  Done          7,050.52        0.11            pending
Success  Done          291.02          0.18            0.00
Success  Done          3,672.97        0.01            pending
Success  Done          211.61          0.11            pending
Success  Done          180.89          0.05            pending
Unknown  New           ---             ---             ---
Success  Done          7,878.38        0.12            0.12
Success  Done          9,455.50        12.52           12.52
(Uncut version of computer results at Second computer)

Why is there such a big difference between the two computers? Shouldn't they be getting the same kind of WU?
Does anyone have the same problem, or can someone tell me what the problem is (or whether it is a problem at all)?
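
For anyone who wants to compare two result lists like these numerically, here is a rough sketch. It normalises claimed credit to credit per CPU hour and flags results that fall well below the healthy machine's median. The rows are copied from the tables above ("New" rows are left out); the half-of-median cutoff and all function names are assumptions made up for illustration, not anything defined by BOINC or SETI@home.

```python
from statistics import median

# (CPU time in seconds, claimed credit), copied from the result tables in this post.
first_computer = [
    (9455.50, 12.52),
    (19601.91, 26.92),
    (39558.92, 61.41),
    (35258.11, 60.65),
    (20070.09, 27.65),
]
second_computer = [
    (13334.48, 0.10),
    (7050.52, 0.11),
    (291.02, 0.18),
    (3672.97, 0.01),
    (211.61, 0.11),
    (180.89, 0.05),
    (7878.38, 0.12),
    (9455.50, 12.52),
]


def credit_per_hour(cpu_seconds, claimed_credit):
    """Claimed credit normalised to one hour of CPU time."""
    return claimed_credit * 3600.0 / cpu_seconds


# Use the first (healthy) computer as the baseline.
baseline = median(credit_per_hour(*row) for row in first_computer)

# Illustrative assumption: anything under half the baseline rate is suspicious.
cutoff = 0.5 * baseline


def flagged(rows):
    """Return the rows whose credit per CPU hour is below the cutoff."""
    return [row for row in rows if credit_per_hour(*row) < cutoff]


flagged_first = flagged(first_computer)
flagged_second = flagged(second_computer)
print(len(flagged_first), "suspicious results on the first computer")
print(len(flagged_second), "suspicious results on the second computer")
```

Credit per hour naturally varies from one WU to the next, so this is only a coarse filter; it is meant to make a gap as large as the one above obvious at a glance.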
ID: 523631
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 523635 - Posted: 26 Feb 2007, 13:46:19 UTC
Last modified: 26 Feb 2007, 13:53:42 UTC

Something seems to have happened between the 21st and 22nd of Feb that causes every result from your second computer to be reported as "-9 result overflow" and, consequently, marked "invalid". Notice that the others who crunched these WUs were NOT reporting them as -9, so it seems to be limited to your puter.

Run the benchmarks on both systems and compare them. I suspect something on the second computer is wrong: overheating, bad overclocking (especially this), bad RAM, etc.
ID: 523635
SethDK
Joined: 10 Dec 05
Posts: 4
Credit: 100,220
RAC: 0
Denmark
Message 523639 - Posted: 26 Feb 2007, 14:02:08 UTC

First Computer;
CPU type AuthenticAMD
AMD Athlon(tm) XP 2200+ [x86 Family 6 Model 8 Stepping 0] [fpu tsc sse 3dnow mmx]
Number of CPUs 1
Operating System Microsoft Windows XP
Professional Edition, Service Pack 2, (05.01.2600.00)
Memory 1023.48 MB
Cache 976.56 KB
Swap space 2461.71 MB
Total disk space 50 GB
Free Disk Space 42.62 GB
Measured floating point speed 1474.19 million ops/sec
Measured integer speed 2469.94 million ops/sec
Average upload rate 0.23 KB/sec
Average download rate Unknown
Average turnaround time 0.73 days
Maximum daily WU quota per CPU 100/day

Second Computer;
CPU type AuthenticAMD
AMD Athlon(tm) XP 2200+ [x86 Family 6 Model 8 Stepping 1] [fpu tsc sse 3dnow mmx]
Number of CPUs 1
Operating System Microsoft Windows XP
Professional Edition, Service Pack 2, (05.01.2600.00)
Memory 1023.48 MB
Cache 976.56 KB
Swap space 2462.18 MB
Total disk space 16.83 GB
Free Disk Space 3.9 GB
Measured floating point speed 1523.81 million ops/sec
Measured integer speed 2556.85 million ops/sec
Average upload rate 3.16 KB/sec
Average download rate Unknown
Average turnaround time 0.12 days
Maximum daily WU quota per CPU 100/day

From what I can see there is almost no difference.

The first computer is running at a temp of 30°C/86°F
The second computer is running at a temp of 32°C/89.6°F

There is no indication of bad RAM in either computer.

Neither computer is overclocked.
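
To put a number on "almost no difference", here is a small sketch that computes the relative gap between the two sets of benchmark figures listed above. The dictionary keys and the function name are made up for illustration.

```python
# Benchmark figures (million ops/sec) copied from the two host listings above.
first = {"floating point": 1474.19, "integer": 2469.94}
second = {"floating point": 1523.81, "integer": 2556.85}


def percent_difference(baseline, other):
    """Relative difference of `other` versus `baseline`, in percent."""
    return (other - baseline) / baseline * 100.0


differences = {
    name: percent_difference(first[name], second[name]) for name in first
}

for name, diff in differences.items():
    print(f"{name}: second computer is {diff:+.1f}% relative to the first")
```

Both gaps come out at roughly 3 to 4 percent in favour of the second computer, so the benchmarks alone do not point to a slow or throttled CPU.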
ID: 523639
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 523640 - Posted: 26 Feb 2007, 14:11:14 UTC
Last modified: 26 Feb 2007, 14:16:23 UTC

Since both benchmarks are the same (for all intents and purposes), the on-die thermal diode isn't limiting the CPU frequency (note: this is not viewable with a frequency monitor like CPU-Z). It also shows Cool'n'Quiet isn't the problem. The temps also seem acceptable, ruling out overheating. I'd recommend you run memtest86+ and take it for a couple of spins.

Since yours is the only one coming back "invalid" of the initial replication, and it's consistent (i.e., every single WU since 22 Feb), something is happening on your end. We just need to figure out what.

hope this helps.

tony

Are both computers running the standard SETI application and the standard BOINC version? (Note: this is a less likely, grasping-at-straws question.)
ID: 523640
SethDK
Joined: 10 Dec 05
Posts: 4
Credit: 100,220
RAC: 0
Denmark
Message 523645 - Posted: 26 Feb 2007, 14:28:30 UTC

I don't think bad RAM is the problem.
I ran memtest on the first computer 9 months ago with no errors
and memtest on the second 7 months ago, again with no errors.
But I'll do it again on the second computer soon just to rule that out.

And both are running the standard SETI app and the standard BOINC version.
ID: 523645
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 523650 - Posted: 26 Feb 2007, 14:43:04 UTC
Last modified: 26 Feb 2007, 14:46:04 UTC

If it's not RAM then I'm running out of ideas. PSU maybe???

It's obvious to me that whatever it is, the problem began and continues to plague your machine, and only your machine. The problem is not even intermittent, so finding it should be easier.

If others who ran the same wus as you reported -9 or invalid, then I might suspect it to be a wu specific error, but this isn't the case. If it were only a few of them, then I might be able to dismiss an error on that machine, but it's EVERY wu since 22 Feb (more than 40 in a row).

Anyone else think of something I might be missing?
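
The elimination reasoning above can be sketched roughly as follows. This is only an illustration: the function names, the `min_streak` value and the sample data are hypothetical, and none of it comes from BOINC itself.

```python
def trailing_failures(host_valid):
    """Count how many of the most recent results in a row were invalid."""
    streak = 0
    for ok in reversed(host_valid):
        if ok:
            break
        streak += 1
    return streak


def diagnose(host_valid, others_valid, min_streak=10):
    """Classify a host's failures.

    host_valid:   list of bools, oldest first, True if this host's result validated.
    others_valid: list of bools in the same order, True if the other hosts that
                  ran the same WU got a valid result.
    min_streak:   hypothetical number of consecutive failures that counts as
                  "every WU fails" rather than an occasional glitch.
    """
    failed = [i for i, ok in enumerate(host_valid) if not ok]
    if not failed:
        return "no problem"
    # If everyone else failed on the same WUs, blame the WUs.
    if all(not others_valid[i] for i in failed):
        return "work-unit specific"
    # A long unbroken run of failures points at the machine itself.
    if trailing_failures(host_valid) >= min_streak:
        return "host problem"
    return "possibly a one-off"


# Made-up history shaped like the case in this thread:
# fine at first, then 40 invalid results in a row while other hosts validate.
host_valid = [True] * 5 + [False] * 40
others_valid = [True] * 45
verdict = diagnose(host_valid, others_valid)
print(verdict)
```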

ID: 523650
keyboards
Volunteer tester
Joined: 14 Jul 00
Posts: 66
Credit: 492,766
RAC: 0
United States
Message 523651 - Posted: 26 Feb 2007, 14:45:23 UTC

Any possibility of corruption on the HD?

What about trying a detach and re-attach (forcing all files to update)? Just a crazy suggestion since nothing "sane" seems to be the issue ;-)
!!Stupidity should be PAINFUL!!
ID: 523651
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 523652 - Posted: 26 Feb 2007, 14:48:20 UTC
Last modified: 26 Feb 2007, 14:48:43 UTC

I know Hitachi has the "Drive Fitness Test" program you can download to test their products, and I know Maxtor and Western Digital have equivalents as well; I just don't know their names.
ID: 523652
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 523654 - Posted: 26 Feb 2007, 14:57:19 UTC

Did you change anything on Wed/Thu the 21st/22nd of Feb? Install Vista? Add some program, do some update? I'm stretching here, of course.
ID: 523654
Ocean Archer
Volunteer tester
Joined: 11 Jun 04
Posts: 148
Credit: 504,219
RAC: 0
United States
Message 523656 - Posted: 26 Feb 2007, 15:04:44 UTC

I have to agree with ASTRO --

There's something about your second computer that just is not right. By chance, is your second computer the one you use more regularly to do your Internet browsing?

Second thought - are both machines configured to run BOINC in the same manner? (e.g. as a service, or as a screensaver option?)


ID: 523656
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 523667 - Posted: 26 Feb 2007, 15:32:50 UTC

The only other thing that comes to mind at the moment, if it passes all the other hardware tests, would be to try throttling the flaky one back a little in BOINC. If the failures stop, it might indicate it's time to freshen up the heatsink thermal interface (I assume you already checked that everything is clean externally, all fans are running properly, etc.).

Alinator

ID: 523667
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 523669 - Posted: 26 Feb 2007, 15:35:40 UTC
Last modified: 26 Feb 2007, 15:36:54 UTC

Hi, I'm also running an AMD Athlon(tm) XP 2200+

Created 6 Jan 2007 23:07:37 UTC
Total Credit 5,846.09
Recent average credit 151.83
CPU type AuthenticAMD
AMD Athlon(tm) XP 2200+ [x86 Family 6 Model 8 Stepping 1] [fpu tsc sse 3dnow mmx]
Number of CPUs 1
Operating System Microsoft Windows XP
Professional Edition, Service Pack 2, (05.01.2600.00)
Memory 255.48 MB
Cache 976.56 KB
Swap space 617.85 MB
Total disk space 38.33 GB
Free Disk Space 10.38 GB
Measured floating point speed 1662.99 million ops/sec
Measured integer speed 2798.79 million ops/sec
Average upload rate 0.1 KB/sec
Average download rate Unknown
Average turnaround time 0.94 days
Maximum daily WU quota per CPU 100/day

I'm running the Optimized app from KWSN.

As you can see, my RAC for this one PC is higher than that of your two, which each have 4x more RAM.

If you need help installing the Optimized app, there is plenty in these forums.
ID: 523669
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 523670 - Posted: 26 Feb 2007, 15:39:04 UTC
Last modified: 26 Feb 2007, 15:40:14 UTC

You know, if the RAM is the same, you could just swap the RAM between the two and see if the problem moves with it. Or just test with one stick of RAM at a time, but then we're getting ahead of ourselves lol
ID: 523670
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 523671 - Posted: 26 Feb 2007, 15:42:59 UTC
Last modified: 26 Feb 2007, 15:52:25 UTC

That's a thought; just reseating everything on the flaky one might be something to try before swapping anything out.

<edit> FWIW, the reason I'm thinking thermal here is that I have a friend who crunches on a laptop. He decided he didn't like the dust collecting on the keyboard and started closing the lid, and his symptoms were almost identical to Seth's, caused by what I figure were FPU errors when it ran hotter with the lid closed. He went back to leaving it open when running and the failures went away.

Alinator
ID: 523671
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 523673 - Posted: 26 Feb 2007, 15:47:44 UTC - in response to Message 523639.  
Last modified: 26 Feb 2007, 16:14:08 UTC

When were they last defragmented? You have plenty of RAM. Are the PCs used in a similar way? What other tasks are running regularly?

[EDIT]
What does Stepping 0 vs. Stepping 1 mean?
[/EDIT]
ID: 523673
Bob Guy
Volunteer tester
Joined: 7 Sep 00
Posts: 126
Credit: 213,429
RAC: 0
United States
Message 523769 - Posted: 26 Feb 2007, 21:05:02 UTC - in response to Message 523673.  

What does Stepping 0 vs. Stepping 1 mean?

CPU stepping is a hardware revision in the manufacturing process. It could result in your kind of problem, in that the CPU might be less stable under certain conditions. So even though you think your CPUs are identical, they are not, and you might need to tune the systems differently.
ID: 523769
SethDK
Joined: 10 Dec 05
Posts: 4
Credit: 100,220
RAC: 0
Denmark
Message 524546 - Posted: 28 Feb 2007, 10:12:04 UTC

It turned out that the error was quite simple.
I took my second computer offline for an hour to check the hardware and discovered that I had installed the CPU temp. monitor wrong! So my CPU was running at a temp. of 79°C/174.2°F :(
After I fixed the heat problem (heatsink fan not mounted properly) the computer is running at 40°C/104°F and BOINC/SETI seems to be running fine now.

Thanks Astro and all the others for the help and quick replies.
And sorry for being a n00b and not checking my hardware 100% first.
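
As a quick sanity check on the Celsius/Fahrenheit pairs quoted in this thread, here is a small sketch using the standard conversion formula. The function name and labels are just for illustration.

```python
def celsius_to_fahrenheit(celsius):
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9.0 / 5.0 + 32.0


# Temperatures reported for the second computer in this thread.
readings = [
    ("reported by the wrongly installed monitor", 32.0),
    ("actual, before the fix", 79.0),
    ("after the fix", 40.0),
]

for label, celsius in readings:
    fahrenheit = celsius_to_fahrenheit(celsius)
    print(f"{label}: {celsius:.0f} C = {fahrenheit:.1f} F")
```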




ID: 524546
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 524552 - Posted: 28 Feb 2007, 12:14:16 UTC

Great, glad it's all OK now. My Mobile AMD Athlon XP 2200 laptop went up in smoke at 79°C. Actually, the power section went out. I still have the memory and CPU in the computer parts storage location (closet).

I guess BOINC is an early warning system. LOL
ID: 524552
mtylive
Joined: 23 Feb 07
Posts: 1
Credit: 906
RAC: 0
Mexico
Message 524909 - Posted: 1 Mar 2007, 9:20:02 UTC

It seems to me that it's because of the speed of your second hard drive. It's a 20 GB drive, which is older technology than the higher-capacity drive in the first computer.

That counts for a lot in performance.


Sorry, I only write in Spanish.

Regards
ID: 524909
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 524925 - Posted: 1 Mar 2007, 11:13:38 UTC - in response to Message 524546.  

Glad that you are getting some good results again Seth.

If you really want to boost your credits, get an Optimized SETI app. The one you need is the SSE version.

For more information see this thread. See my Athlon XP 2200+ results compared to yours. On a ~ 60 credit WU my average crunching time is ~ 18600 seconds or just over 5 hours; your average for similar ~ 60 credit WUs is ~ 37750 seconds or nearly 10.5 hours.
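
The arithmetic behind that comparison, as a small sketch. The two averages are the approximate figures quoted above; the variable and function names are made up for illustration, and the ratio is only a ballpark since both inputs are rounded averages.

```python
# Approximate average CPU times for a ~60 credit WU, as quoted above.
optimized_seconds = 18600   # Athlon XP 2200+ with the optimized app
standard_seconds = 37750    # Athlon XP 2200+ with the standard app


def to_hours(seconds):
    """Convert seconds to hours."""
    return seconds / 3600.0


optimized_hours = to_hours(optimized_seconds)
standard_hours = to_hours(standard_seconds)

# How many times faster the optimized app finishes a comparable WU.
speedup = standard_seconds / optimized_seconds

print(f"optimized: {optimized_hours:.2f} h, standard: {standard_hours:.2f} h")
print(f"speedup: about {speedup:.2f}x")
```

So on these figures the optimized app roughly halves the crunching time for a comparable WU.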

Keith.
ID: 524925