The Attack of the Killer 58.7s

zombie67 [MM]
Volunteer tester
Joined: 22 Apr 04
Posts: 758
Credit: 27,771,894
RAC: 0
United States
Message 483893 - Posted: 17 Dec 2006, 2:50:19 UTC - in response to Message 483875.  

just checked, chipset is a 5000X, configured with 4x1 GB FB-DIMMs (Memory set at 5/5/5/15/20 @ 667MHz) and 2 5150 Xeons.


If there are more than 4 DIMM slots, I think it matters which slots are populated to get the full bandwidth.


Dublin, California
Team: SETI.USA
ID: 483893 · Report as offensive
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 483895 - Posted: 17 Dec 2006, 2:53:08 UTC - in response to Message 483857.  

...(and I'll have to pester a Dell rep).


In order to get the interleaved dual-channel memory working right, the right sockets must be used, and of course, that depends on the manufacturer of the motherboard.

My Supermicro X7DAE needs to have sockets 1, 3, 5 and 7 populated in order to get interleaved dual-channel (any other arrangement, other than all 8 sockets, and it will revert to regular dual-channel or even straight access).

If I can get my copy of SiSoft working again, I'll see what my board reports. As for the total bandwidth efficiency, that's simply calculated by seeing how close the RAM ratio is running to the FSB. If you notice, even dual-channel DDR400 (which can technically fill the 6.4GB/s bandwidth of P4s and PDs) is labeled as only having an efficiency of 55 to 65%, because it's running at 200MHz double pumped while the FSB of most P4s and PDs is 200MHz quad pumped. That's one of the problems with synthetic benchmarks such as SiSoft Sandra.
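
For anyone who wants to check the peak figures behind that comparison, here is a minimal sketch. It assumes a 64-bit (8-byte) data path for both a DDR channel and the FSB, and it only gives theoretical peaks; it does not reproduce the efficiency percentage Sandra reports, which is a measured value. The function name is made up for illustration.

```python
def peak_bandwidth_gbs(base_clock_mhz, transfers_per_clock, channels=1, bus_width_bytes=8):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_second = (
        base_clock_mhz * 1e6 * transfers_per_clock * bus_width_bytes * channels
    )
    return bytes_per_second / 1e9


# DDR400: 200 MHz base clock, double pumped (2 transfers per clock)
ddr400_single = peak_bandwidth_gbs(200, 2)
ddr400_dual = peak_bandwidth_gbs(200, 2, channels=2)

# P4/PD front-side bus: 200 MHz base clock, quad pumped (4 transfers per clock)
fsb_quad_pumped = peak_bandwidth_gbs(200, 4)

print(f"DDR400 single channel: {ddr400_single:.1f} GB/s")
print(f"DDR400 dual channel:   {ddr400_dual:.1f} GB/s")
print(f"FSB, quad pumped:      {fsb_quad_pumped:.1f} GB/s")
print(f"Single channel / FSB:  {ddr400_single / fsb_quad_pumped:.0%}")
```

Under these assumptions a single DDR400 channel can supply half of what the quad-pumped FSB can carry, and two channels together match it.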
ID: 483895 · Report as offensive
Sisyfos

Joined: 22 Jul 00
Posts: 7
Credit: 2,796,632
RAC: 0
Denmark
Message 484148 - Posted: 17 Dec 2006, 13:00:27 UTC

I was fiddling with a large cache (10 days), just at the time these units were sent out.
I think I got 200 of them on my C2D :-(

Anyways, just for kicks I tried Chicken's 2.0 Generic SSE2...
My C2D crunch times went from ~5400s to ~4200s

http://setiathome.berkeley.edu/result.php?resultid=434671531

http://setiathome.berkeley.edu/result.php?resultid=434827238

I have a lot of 58.7s to cover yet, so I don't know how other types of WU would fare.

Current setup E6600@3456(9x384), DDR2@960 5-5-5-15.
ID: 484148 · Report as offensive
Profile Pooh Bear 27
Volunteer tester
Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 484170 - Posted: 17 Dec 2006, 13:41:27 UTC

All I say is for those aborting them, someone has to crunch them, so why not take the bad with the good? Aborting just slows down the science.

Just an observation. It probably will not make a difference, but someone had to say it.

I'm in it for the science, and will not abort the units. So, I get less credit per hour on a few units. It's how it goes.


My movie https://vimeo.com/manage/videos/502242
ID: 484170 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 484171 - Posted: 17 Dec 2006, 13:41:38 UTC - in response to Message 483895.  

...(and I'll have to pester a Dell rep).


In order to get the interleaved dual-channel memory working right, the right sockets must be used, and of course, that depends on the manufacturer of the motherboard.

My Supermicro X7DAE needs to have sockets 1, 3, 5 and 7 populated in order to get interleaved dual-channel (any other arrangement, other than all 8 sockets, and it will revert to regular dual-channel or even straight access).

If I can get my copy of SiSoft working again, I'll see what my board reports. As for the total bandwidth efficiency, that's simply calculated by seeing how close the RAM ratio is running to the FSB. If you notice, even dual-channel DDR400 (which can technically fill the 6.4GB/s bandwidth of P4s and PDs) is labeled as only having an efficiency of 55 to 65%, because it's running at 200MHz double pumped while the FSB of most P4s and PDs is 200MHz quad pumped. That's one of the problems with synthetic benchmarks such as SiSoft Sandra.

Good point!

CPU-Z tells me my modules are in slots 1-4, which seems stupid (unless they do some funky numbering) and may be holding performance back quite a bit.

Would be nice to get a basically free performance upgrade, anyway.

Regards,
Simon.
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 484171 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 484202 - Posted: 17 Dec 2006, 14:42:35 UTC - in response to Message 484171.  

...(and I'll have to pester a Dell rep).

CPU-Z tells me my modules are in slots 1-4, which seems stupid (unless they do some funky numbering) and may be holding performance back quite a bit.

Would be nice to get a basically free performance upgrade, anyway.

Regards,
Simon.

My Dell (workstation motherboard) is also a 5000X, with the same memory timings, but so far only 2 x 1GB DIMMS in slots 1 and 2 (as supplied by Dell).

I've just downloaded and run SiSoft Sandra Lite 11.17b: it reports Int 3362, Float 3373, 39% efficiency :-( - I hope the DIMMs I've ordered aren't stuck in the Christmas mail for too much longer.

One advantage of Dell systems is that you can use the system tag (serial number) to get quite detailed technical information which is supposedly right for that exact machine. For mine, they're saying I should install DIMMs in pairs, starting with the lowest numbered slots - so (1&2), (3&4), (5&6), (7&8) in that order. But if your Dell rep says anything different, please let us know.
ID: 484202 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 484212 - Posted: 17 Dec 2006, 15:16:25 UTC - in response to Message 484170.  

All I say is for those aborting them, someone has to crunch them, so why not take the bad with the good? Aborting just slows down the science.

Just an observation. It probably will not make a difference, but someone had to say it.

I'm in it for the science, and will not abort the units. So, I get less credit per hour on a few units. It's how it goes.

I fully agree, and will do likewise.

FWIW, I haven't seen any sign of any units being aborted. I think what happens is that on some machines (like msattler's), they take much longer than expected, which bumps up the RDCF. So that machine's cache registers as being much fuller than it really is. Which inhibits work fetch for a while. So nothing gets reported. So everyone else's pending credit goes up, and RAC goes down.

Does that make sense to anyone?
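
To make that chain of events concrete, here is a toy model. It is a sketch under stated assumptions, not BOINC's actual scheduler code: it assumes the correction factor jumps up immediately when a task overruns its estimate and only drifts back down by 10% of the gap per task, and that the client stops asking for work once the projected queue time reaches the cache target. The function names and all the numbers are illustrative.

```python
def update_dcf(dcf, estimated_s, actual_s, smoothing=0.1):
    """Return the new duration correction factor after one finished task.

    Assumed behaviour: jump up at once on an overrun, ease down slowly otherwise.
    """
    ratio = actual_s / estimated_s
    if ratio > dcf:
        return ratio
    return dcf + smoothing * (ratio - dcf)


def wants_more_work(queued_estimates_s, dcf, cache_target_s):
    """True if the projected queue time is still below the cache target."""
    projected_s = sum(queued_estimates_s) * dcf
    return projected_s < cache_target_s


# Illustrative queue: 20 tasks estimated at 4000 s each, one-day cache target.
queue = [4000] * 20
one_day_s = 24 * 3600

dcf = 1.0
print(wants_more_work(queue, dcf, one_day_s))  # queue looks short of the target

# One slow unit: estimated 1h 07m (4020 s), actually took 1h 33m (5580 s).
dcf = update_dcf(dcf, estimated_s=4020, actual_s=5580)
print(round(dcf, 3))
print(wants_more_work(queue, dcf, one_day_s))  # same queue now looks overfull
```

In this model a single slow unit is enough to make the same 20 queued tasks look like more than a day of work, so work fetch pauses until the factor eases back down.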
ID: 484212 · Report as offensive
Profile KWSN - Chicken of Angnor
Volunteer developer
Volunteer tester
Joined: 9 Jul 99
Posts: 1199
Credit: 6,615,780
RAC: 0
Austria
Message 484213 - Posted: 17 Dec 2006, 15:17:16 UTC
Last modified: 17 Dec 2006, 15:17:50 UTC

Enabling Quad-Channel operation on the Dell may also be a BIOS option - I'll have to check it out in more detail when I have physical access to that box again.

Regards,
Simon.
Donate to SETI@Home via PayPal!

Optimized SETI@Home apps + Information
ID: 484213 · Report as offensive
Sisyfos

Joined: 22 Jul 00
Posts: 7
Credit: 2,796,632
RAC: 0
Denmark
Message 484241 - Posted: 17 Dec 2006, 15:58:20 UTC - in response to Message 484148.  

I was fiddling with a large cache (10 days), just at the time these units were sent out.
I think I got 200 of them on my C2D :-(

Anyways, just for kicks I tried Chicken's 2.0 Generic SSE2...
My C2D crunch times went from ~5400s to ~4200s

http://setiathome.berkeley.edu/result.php?resultid=434671531

http://setiathome.berkeley.edu/result.php?resultid=434827238

I have a lot of 58.7s to cover yet, so I don't know how other types of WU would fare.

Current setup E6600@3456(9x384), DDR2@960 5-5-5-15.


Hmmm...I thought the Core 2 people would be excited about this, but I just noticed that I forgot to mention that the ~20% decrease in crunch time for the 58.7-credit WUs came from going from Chicken SSSE3 1.41 to Chicken SSE2 2.0 Generic.
ID: 484241 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 484245 - Posted: 17 Dec 2006, 16:05:00 UTC - in response to Message 484241.  

Hmmm...I thought the Core 2 people would be excited about this, but I just noticed that I forgot to mention that the ~20% decrease in crunch time for the 58.7-credit WUs came from going from Chicken SSSE3 1.41 to Chicken SSE2 2.0 Generic.

Well, there's a challenge for Simon and the coop [I like that formulation - pun on chicken, or you could read it as co-op(erative), which is a good description]

Could you/they write a little app which would examine the headers of downloaded WUs, and dynamically choose which optimised app to crunch them with???!!! That should keep you out of mischief for an hour or two, LOL
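
A rough sketch of what such a chooser could look like, purely as an illustration. The tag name `true_angle_range`, the app labels, and the range boundaries below are all assumptions made for the example; they are not measurements from this thread or the real layout of any optimised app.

```python
import re


def read_angle_range(header_text, tag="true_angle_range"):
    """Pull a numeric value out of <tag>...</tag>; return None if it is absent."""
    match = re.search(rf"<{tag}>\s*([0-9.eE+-]+)\s*</{tag}>", header_text)
    return float(match.group(1)) if match else None


def choose_app(angle_range, rules, default_app):
    """Pick the first app whose [low, high) range contains the angle range."""
    if angle_range is None:
        return default_app
    for low, high, app in rules:
        if low <= angle_range < high:
            return app
    return default_app


# Placeholder rule table: (low, high, app label). Not benchmark results.
RULES = [
    (0.7, 1.12, "sse2_generic"),
]

sample_header = "<workunit_header><true_angle_range>0.95</true_angle_range></workunit_header>"
angle = read_angle_range(sample_header)
print(angle, choose_app(angle, RULES, default_app="ssse3"))
```

The rule table is the part that would need real benchmark data; the lookup itself is the easy bit.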
ID: 484245 · Report as offensive
Sisyfos

Joined: 22 Jul 00
Posts: 7
Credit: 2,796,632
RAC: 0
Denmark
Message 484254 - Posted: 17 Dec 2006, 16:20:13 UTC - in response to Message 484245.  

Could you/they write a little app which would examine the headers of downloaded WUs, and dynamically choose which optimised app to crunch them with???!!! That should keep you out of mischief for an hour or two, LOL


Not a bad idea
I have started 2 58.7s with SSE3 Intel 2.0, just to see if there's more to be gained. I certainly have enough of those pesky WUs to try all the different apps for effectiveness (is that a word?).
ID: 484254 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 484256 - Posted: 17 Dec 2006, 16:24:01 UTC - in response to Message 484254.  

(is that a word?).

Yes, the word "that" is a word. <giggle> <chuckle><going back into my corner now>
ID: 484256 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20291
Credit: 7,508,002
RAC: 20
United Kingdom
Message 484284 - Posted: 17 Dec 2006, 17:19:13 UTC - in response to Message 484256.  
Last modified: 17 Dec 2006, 17:21:04 UTC

(is that a word?).

Yes, the word "that" is a word. <giggle> <chuckle><going back into my corner now>

Are you going over there for a byte or a nibble also?

Happy crunchin',
;-)
Martin


( ps: Sorry for inflicting multiple groans :-/ )
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 484284 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 484301 - Posted: 17 Dec 2006, 17:43:46 UTC - in response to Message 484245.  
Last modified: 17 Dec 2006, 17:43:55 UTC

Hmmm...I thought the Core 2 people would be excited about this, but I just noticed that I forgot to mention that the ~20% decrease in crunch time for the 58.7-credit WUs came from going from Chicken SSSE3 1.41 to Chicken SSE2 2.0 Generic.

Well, there's a challenge for Simon and the coop [I like that formulation - pun on chicken, or you could read it as co-op(erative), which is a good description]

Could you/they write a little app which would examine the headers of downloaded WUs, and dynamically choose which optimised app to crunch them with???!!! That should keep you out of mischief for an hour or two, LOL


I have suggested this at least a couple of times in Simon's forums. Add both types of code to the app, and branch to the one that would be most effective for the WU to be processed.

"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 484301 · Report as offensive
Profile Clyde C. Phillips, III

Joined: 2 Aug 00
Posts: 1851
Credit: 5,955,047
RAC: 0
United States
Message 484305 - Posted: 17 Dec 2006, 17:51:10 UTC - in response to Message 483315.  

I think he's talking about lots of high-noise WUs that are taking 58.7 seconds.
These can take less time to process than to download another, resulting in an empty cache, an idle machine and only .01 credits for what does get done.

It's to be expected though as Berkeley are currently loading tapes not previously run due to suspicion they may be noisy.
/edit
Not that I've seen any significant number of these.


Not quite....
They are a breed of WU that, on my quad cruncher, take about 1h 33m to crunch for 58.7 credits instead of about 1h 7m for about 62 credits for the more common type of WU. Had about 1 1/2 days worth of them in my cache. Killed my RAC and drove up my pending credits. (Must take forever to crunch on a slower rig). I think I may have chewed up the majority of them (knock on silicon) and can get back to business. Oh well, ya' gotta crunch wot you got.


MSattler, I took a look at one of your results and it looks like you may not be using the latest version of Simon's cruncher. I saw "V1.41" which may be the previous edition. I used that version, too, and those 58-credit units crunched slowly. If you update your cruncher you might (I repeat MIGHT) get slightly better results. In my case the later version does better on the most plentiful units but does worse on others. It handles some VLARS very well but does more poorly on the 0.7-to-1.12 and the ones above three degrees. Overall my PD950s do slightly better (maybe just a couple percent) with Version 2.0.
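
For scale, the two cases quoted above can be turned into credits per hour. This is only arithmetic on the approximate figures in the quote (1h 33m for 58.7 credits versus 1h 7m for about 62 credits); the function name is made up for the example.

```python
def credits_per_hour(credits, hours, minutes):
    """Credit rate for one work unit given its run time."""
    return credits / (hours + minutes / 60)


slow_units = credits_per_hour(58.7, 1, 33)    # the 58.7-credit units
common_units = credits_per_hour(62, 1, 7)     # the more common units

print(f"58.7-credit units: {slow_units:.1f} credits/hour")
print(f"Common units:      {common_units:.1f} credits/hour")
print(f"Drop in rate:      {1 - slow_units / common_units:.0%}")
```

That works out to roughly 37.9 against 55.5 credits per hour on that machine, a drop of about 32%.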

ID: 484305 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 484306 - Posted: 17 Dec 2006, 17:51:37 UTC - in response to Message 484148.  

I was fiddling with a large cache (10 days), just at the time these units were sent out.
I think I got 200 of them on my C2D :-(

Anyways, just for kicks I tried Chicken's 2.0 Generic SSE2...
My C2D crunch times went from ~5400s to ~4200s

http://setiathome.berkeley.edu/result.php?resultid=434671531

http://setiathome.berkeley.edu/result.php?resultid=434827238

I have a lot of 58.7s to cover yet, so I don't know how other types of WU would fare.

Current setup E6600@3456(9x384), DDR2@960 5-5-5-15.


Thanx for the idea!!! I am in the same boat as you are. I had a RAM crash on my rig a while back, errored out a bunch of WUs, and downloaded a bunch of these nasties which are now in my cache waiting to be processed like little land mines.
I'm gonna switch to the SSE2 app for a while, see if it helps, and maybe stick with it until I get the little buggers cleared out.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 484306 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 484308 - Posted: 17 Dec 2006, 17:55:24 UTC - in response to Message 484305.  

I think he's talking about lots of high-noise WUs that are taking 58.7 seconds.
These can take less time to process than to download another, resulting in an empty cache, an idle machine and only .01 credits for what does get done.

It's to be expected though as Berkeley are currently loading tapes not previously run due to suspicion they may be noisy.
/edit
Not that I've seen any significant number of these.


Not quite....
They are a breed of WU that, on my quad cruncher, take about 1h 33m to crunch for 58.7 credits instead of about 1h 7m for about 62 credits for the more common type of WU. Had about 1 1/2 days worth of them in my cache. Killed my RAC and drove up my pending credits. (Must take forever to crunch on a slower rig). I think I may have chewed up the majority of them (knock on silicon) and can get back to business. Oh well, ya' gotta crunch wot you got.


MSattler, I took a look at one of your results and it looks like you may not be using the latest version of Simon's cruncher. I saw "V1.41" which may be the previous edition. I used that version, too, and those 58-credit units crunched slowly. If you update your cruncher you might (I repeat MIGHT) get slightly better results. In my case the later version does better on the most plentiful units but does worse on others. It handles some VLARS very well but does more poorly on the 0.7-to-1.12 and the ones above three degrees. Overall my PD950s do slightly better (maybe just a couple percent) with Version 2.0.


I just switched to the SSE2 app for a while (see other posts in this thread), but Simon has always maintained that the 1.41 was the best to use only on core 2 rigs. It obviously has one Achilles' heel.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 484308 · Report as offensive
Sisyfos

Joined: 22 Jul 00
Posts: 7
Credit: 2,796,632
RAC: 0
Denmark
Message 484336 - Posted: 17 Dec 2006, 18:38:06 UTC - in response to Message 484306.  
Last modified: 17 Dec 2006, 18:42:01 UTC

...Anyways, just for kicks I tried Chicken's 2.0 Generic SSE2...
My C2D crunch times went from ~5400s to ~4200s...


Thanx for the idea!!! I am in the same boat as you are. I had a RAM crash on my rig a while back, errored out a bunch of WUs, and downloaded a bunch of these nasties which are now in my cache waiting to be processed like little land mines.
I'm gonna switch to the SSE2 app for a while, see if it helps, and maybe stick with it until I get the little buggers cleared out.


You're welcome :-)

These just in...
SSE3 Intel 2.0 App
http://setiathome.berkeley.edu/result.php?resultid=434827413 4221s
http://setiathome.berkeley.edu/result.php?resultid=434827417 4179s

Not much of a difference from SSE2 Generic 2.0, but still a 20% improvement from SSSE3 1.41.

Off to try SSE2 Intel 2.0...
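
A quick tally of the numbers reported so far, as a sketch. The times are the approximate ones posted in this thread (about 5400 s with SSSE3 1.41, about 4200 s with SSE2 Generic 2.0, and the two SSE3 Intel 2.0 results above); with so few samples this is indicative only.

```python
from statistics import mean


def percent_decrease(old_s, new_s):
    """Percentage reduction in crunch time going from old_s to new_s."""
    return 100 * (old_s - new_s) / old_s


baseline_s = 5400  # approximate time with SSSE3 1.41

runs_s = {
    "SSE2 Generic 2.0": [4200],        # approximate figure
    "SSE3 Intel 2.0": [4221, 4179],    # the two results linked above
}

for app, times in runs_s.items():
    avg = mean(times)
    print(f"{app}: {avg:.0f} s average, {percent_decrease(baseline_s, avg):.1f}% faster than 1.41")
```

Both 2.0 builds average about 4200 s here, a little over 22% below the 1.41 baseline.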
ID: 484336 · Report as offensive
Sisyfos

Joined: 22 Jul 00
Posts: 7
Credit: 2,796,632
RAC: 0
Denmark
Message 484409 - Posted: 17 Dec 2006, 21:01:55 UTC

ID: 484409 · Report as offensive
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 484425 - Posted: 17 Dec 2006, 21:15:25 UTC - in response to Message 483635.  

Well, I think it's more than just the lack of interleaved dual-channel memory, as even AnandTech was having low throughput with the FB-DIMMs, and the problem could be attributed to many things, including (but not limited to): available BIOS settings (differs between manufacturers), immature MCH (this is Intel's first foray into FB-DIMM whereas they've had some time to work with DDR2), immature chipset drivers, etc.


A colleague of mine who was reading my post pointed out a different perspective, one that I thought I'd share.

He pointed out that even if it's a BIOS setting/limitation or an immature MCH/driver, it doesn't change the fact that the Core 2 Duos/Quads are performing better with their DDR2 RAM right now. That would technically make the Core 2 Duos/Quads a better choice of cruncher than the equivalently clocked Xeons for the time being. Until Intel can fix whatever is preventing the Xeons from using the FB-DIMMs to their fullest potential, it's going to hold their performance back.

I still hold to my theory that something is preventing the Xeons from performing at their fullest potential, but I must concede to the point that as it stands, it appears the Core 2 chips crunch better overall.

Also, as a follow-up, I can't seem to get SiSoft to run on my new system. I have a legal full version of Sandra 2005 SR3 that I can install, but as soon as I launch the app, it simply exits without an error message. I'll have to see if I can upgrade to the latest version (I'll have to check with my accountant [girlfriend] on that matter) and see if that fixes the issue. I'm thinking that Sandra doesn't like 32-bit Windows running with 4GB RAM or something.
ID: 484425 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.