Lunatics Windows Installer v0.39 release notes

Message boards : Number crunching : Lunatics Windows Installer v0.39 release notes

BigBruce
Joined: 7 Feb 03
Posts: 11
Credit: 13,601,678
RAC: 0
United Kingdom
Message 1183235 - Posted: 6 Jan 2012, 15:51:46 UTC - in response to Message 1183206.  

Thanks,

I removed the extra lines and now it is asking for both CPU and GPU units (though it isn't getting any GPU ones).

Bruce
ID: 1183235
shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1184317 - Posted: 10 Jan 2012, 11:34:41 UTC

Hi guys! First of all thank you Richard and Jason and Co for helping me out in this thread:
http://setiathome.berkeley.edu/forum_thread.php?id=66037&nowrap=true#1178524

Richard your suspicions were (of course) correct, since I installed Lunatics on the 31st and everything is back to (or better than) normal. Sten helped me out with a bunch of questions so a special thanx to him:) But I do have one more question that is more curiosity than anything else...

While CPU tasks are at least 50% faster than stock, GPU times look like they are the same as stock (when "stock" was running well for me that is). Sten says this is ION (my mobile GPU) specific, and of course I believe him. So my real question is what's the Tipping Point in nVidia specs that helps Lunatics run faster than stock? Just asking in case I ever buy a new laptop/desktop:)

Thanx again, and here's my little cruncher:
http://setiathome.berkeley.edu/show_host_detail.php?hostid=5550773
ID: 1184317
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1184322 - Posted: 10 Jan 2012, 12:23:17 UTC - in response to Message 1184317.  
Last modified: 10 Jan 2012, 12:58:29 UTC

...
While CPU tasks are at least 50% faster than stock, GPU times look like they are the same as stock (when "stock" was running well for me that is). Sten says this is ION (my mobile GPU) specific, and of course I believe him. So my real question is what's the Tipping Point in nVidia specs that helps Lunatics run faster than stock? Just asking in case I ever buy a new laptop/desktop:)...


Hi Alex,
Glad you've got things up and running. As far as a performance tipping point goes, I would suggest at this time the most obvious visible improvements apply to Fermi class GPUs of any description, and to some extent all earlier GPUs with lesser margin as capability reduces. Having said that, the more subtle benefits from increased accuracy & reliability over time should gradually add up even on the smallest GPUs.

Aside from the more obvious performance difference from scaling up the GPU, and the extra associated gains from using newer builds on them (some GPUs getting a large boost, some not), development itself is at several 'tipping points'. What I mean is, firstly, that Cuda itself is poised to receive a rebuilt compiler with Cuda 4.1, which looks set to improve performance on many GPUs... Secondly, now that major reliability, precision, and optimisation techniques are understood, building a solid foundation, a lot of focus can return to performance optimisation in its purest throughput related sense, rather than mixed with base functionality refinement.

How far each GPU has to go will have a lot to do with the machine it's in as well, and the underlying machine configuration, perhaps factors such as power consumption & other issues. What I would suggest would be to pick an initial price range & running costs, and optimise for performance within that class. That's how my mother's i5 w/560ti was built, and is romping in ~34K RAC at this time, for what is a fairly low cost to run for its throughput, compared to my main dev machine with the 480, which was more expensive, is hotter, louder and uses more power, yet achieves lower throughput.

Even with the humble ION, though it may seem close to its limits, there are things you can do, such as thoroughly check for any driver latency issues using DPC latency Checker, remove unneeded background software & services, and make sure all device drivers are up to date. Things like internal modems & network/Wifi are notorious for making the whole system wait, and in some cases modified drivers, or driver tweaks are available, since the same factors that affect multimedia video/audio dropouts indeed affect crunching performance over the long haul as well.

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1184322
DMMD
Joined: 14 Feb 00
Posts: 118
Credit: 71,564,960
RAC: 0
Message 1184327 - Posted: 10 Jan 2012, 13:02:27 UTC

Then, again, Jason, it is convenient to throw the 'important' stuff to AVX, and let the riff raff settle as it may



[url][/url][url][/url][url][/url]http://www.youtube.com/watch?v=_f_p0CgPeyA

shall we see what UBB does with this, I wrote the ...ah spit
ID: 1184327
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1184328 - Posted: 10 Jan 2012, 13:19:18 UTC - in response to Message 1184327.  

Hah, there's Bruce! Yup, that pretty much resembles the Australian office of Lunatics & Beer related arts. Probably fewer English visitors & more beer here though :P
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1184328
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1185221 - Posted: 14 Jan 2012, 12:47:05 UTC

periodically I'm finding gpu apps running but no progress. i can usually suspend it and the next work unit will load or sometimes it will take a reboot. any ideas?
heat related, or perhaps power related??? i see this on my 3x gtx295 box.
ID: 1185221
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1185224 - Posted: 14 Jan 2012, 13:01:16 UTC - in response to Message 1185221.  

periodically I'm finding gpu apps running but no progress. i can usually suspend it and the next work unit will load or sometimes it will take a reboot. any ideas?
heat related, or perhaps power related??? i see this on my 3x gtx295 box.


Definitely do the normal checks with that stuff, then later if you could grab a screenshot of DPC Latency Checker running, both with the system idle for a while, then during full crunch before, during & after these slowdowns, it might help pin down whether something is going on with some drivers on the system, or put those in the clear. One more obscure thing to do, if your BIOS has it, might be to try winding out the PCI latency timer from the usual default of 32 cycles to 128 or 256 if available. Might not make a difference, though certainly modern drivers make heavy use of the bus for turbocache, which can fragment & time out transfers.

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1185224
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1185227 - Posted: 14 Jan 2012, 13:05:42 UTC - in response to Message 1185221.  
Last modified: 14 Jan 2012, 13:23:03 UTC

These are probably three examples:

http://setiathome.berkeley.edu/result.php?resultid=2261033342

http://setiathome.berkeley.edu/result.php?resultid=2261138938

http://setiathome.berkeley.edu/result.php?resultid=2261033301

Edit: these were the only examples I could find; the other GPU tasks that took >1000 secs were low AR tasks.

Claggy
ID: 1185227
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1185228 - Posted: 14 Jan 2012, 13:18:47 UTC - in response to Message 1185227.  
Last modified: 14 Jan 2012, 13:33:13 UTC

Yeah, those unspecified launch failures do seem to point to possible driver (not necessarily the video one!) issues, along with the usual possible crashing due to power or thermal issues. It's a shame the errors aren't more descriptive, but it could be a roundabout indication that adding more 295s under the newer driver models complicates timing & synchronisation in ways Fermis don't seem to suffer from as much (parts of which we already knew: they have special acceleration hardware for WDDM... which even applies under later extended XPDM).

Definitely after the power & thermal checks, getting the system contention down (perhaps freeing a CPU core per card), along with looking at those latency observations, should paint a clearer picture of whether there are ongoing issues with the video (or other) drivers. These cards are always a suspect from past history, though it should be possible to work around this even if a special build is needed, or a better understanding of what causes the time-outs in the first place could lead to specific tweaks.

[Edit:] Try turning off Hyperthreading!

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1185228
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1185258 - Posted: 14 Jan 2012, 16:20:02 UTC

i ran memtestCL and i'm getting errors on the last two checks which are "logic <local memory, one iteration>" and "logic <local memory, 4 iterations>", i wonder what that means. it is on the secondary card, could it be bus errors? do i need to clean the cable again....
ID: 1185258
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1185260 - Posted: 14 Jan 2012, 16:41:52 UTC - in response to Message 1185258.  

i ran memtestCL and i'm getting errors on the last two checks which are "logic <local memory, one iteration>" and "logic <local memory, 4 iterations>", i wonder what that means. it is on the secondary card, could it be bus errors? do i need to clean the cable again....


The tester itself could be crashing kernels in the same way. Could be a hardware fault of some sort, or something else going on as described earlier. If it is consistently on the same GPU, and with the same reported issue, then it does point to something in the direction of hardware though. I'd swap cards in slots, reseating carefully, then run the tests again. If the same card/GPU fails with the same error in a different slot, and the others still pass, you may have found at least one culprit. If the problem doesn't move with the card though, then it's the slot itself, or something driver related going on. Either way, it'll take some juggling to isolate.
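
As a rough sketch of the swap-test reasoning above (this is not part of memtestCL or any tool mentioned in this thread; the function name, slot names and card names are all hypothetical), the decision could be written down in Python like this:

```python
def diagnose_swap(before, after):
    """Compare memory-test results before and after swapping cards.

    `before` and `after` map a slot name to a (card name, passed) tuple.
    Returns a short verdict string. Purely illustrative.
    """
    failed_cards_before = {card for card, ok in before.values() if not ok}
    failed_cards_after = {card for card, ok in after.values() if not ok}
    failed_slots_before = {slot for slot, (_, ok) in before.items() if not ok}
    failed_slots_after = {slot for slot, (_, ok) in after.items() if not ok}

    if not failed_cards_before:
        return "no failure observed"
    # Same card fails in its new slot: the fault moved with the card.
    if failed_cards_before == failed_cards_after:
        return "fault follows the card: suspect that card/GPU"
    # Same slot fails with a different card in it: the fault stayed put.
    if failed_slots_before == failed_slots_after:
        return "fault stays with the slot: suspect slot or drivers"
    return "inconclusive: retest"


# Hypothetical example: card B fails in slot2, then fails again in slot1.
before = {"slot1": ("A", True), "slot2": ("B", False)}
after = {"slot1": ("B", False), "slot2": ("A", True)}
print(diagnose_swap(before, after))
```

If every card fails both times, the first branch still reports that the fault follows the card, so treat the verdict as a hint for the next test rather than a conclusion.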

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1185260
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1185266 - Posted: 14 Jan 2012, 17:36:33 UTC - in response to Message 1185260.  
Last modified: 14 Jan 2012, 18:18:25 UTC

agreed, I've already pulled the primary card and put it in its place and the problem followed the affected gpu. i tried underclocking as much as precision would allow and the problem was still there.

so i'm cleaning the bridge and sli connector... see if it helps.
ID: 1185266
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1185285 - Posted: 14 Jan 2012, 18:47:33 UTC - in response to Message 1185266.  
Last modified: 14 Jan 2012, 18:56:13 UTC

didn't work, perhaps i will try the oven trick tonight on the daughter card...
ID: 1185285
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1185286 - Posted: 14 Jan 2012, 18:49:11 UTC - in response to Message 1185285.  

didn't work, perhaps i will try the oven test tonight...


Ugh, sounds drastic. Is it common for these particular models to go out this way?

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1185286
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1185335 - Posted: 14 Jan 2012, 22:38:41 UTC - in response to Message 1185286.  

And I experienced some weird symptoms on my 2 ATI HD5870 GPUs with an I7-2600 (runs slightly faster with a 102MHz base clock): it produced all kinds of errors, -177, -12 and crashes. The whole system behaves more stably with a 102MHz base clock: no crashes and no -177 (max time exceeded).

Also changed period_iterations from 1 to 2 and instances_per_device from 2 to 3: no real improvement, so I'll change this again. Screen lag is too much as well, but no worse compared to the last setting.

Pausing BOINC won't stop the GPUs immediately (!?); they stop after a few minutes, up to about 5 minutes, resulting in an error.

Starting BOINC takes quite some time and CPU load doesn't reach 100% right away, sometimes only after a few minutes. (CPU setting in BOINC = 0; use = 100% of CPU and use 100% of CPUs).

Using: CPU use less than 16%, run, results in a quick start of BOINC, and load rises to 100% as soon as the BOINC Manager has connected to the local host!

Second GPU always has a (slightly) lower load compared to the first one.
This host.
GPUs have a very jumpy load; period_iterations(_for_pulse_find) 1 and 2 instances_per_device give the best results, but also a noticeable screen lag.

Has anyone tried the latest cat. 11.12 drivers (with OpenCL)?
Now using cat. 11.6 and AMD-APP-SDK 2.4.
(Running BOINC 6.12.34 x64 on a separate hard disk partition, excluding K:\BOINC_DATA from AV scanning and backup; used the latest installer and added AstroPulse files and app_info.xml entries.)

Anybody with similar issues on a comparable system and drivers? And can I update to cat. 11.12?



ID: 1185335
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1185341 - Posted: 14 Jan 2012, 22:51:13 UTC

Fred

11.6. is a buggy driver.
11.12. is at least bug free.

Also change period_iterations_num to 15-20 and you will experience no more lags.

Mike



With each crime and every kindness we birth our future.
ID: 1185341
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1185373 - Posted: 15 Jan 2012, 1:43:38 UTC - in response to Message 1185341.  
Last modified: 15 Jan 2012, 1:45:44 UTC

Fred

11.6. is a buggy driver.
11.12. is at least bug free.

Also change period_iterations_num to 15-20 and you will experience no more lags.

Mike


OK, I'll try 10 for a start, doing 3 per GPU now, gives far better load.
Also try the cat. 11.12 driver.
I don't like to try 4 on each of the two 1024MByte HD5870 GPUs.
Thanx for your reply, Mike
ID: 1185373
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1185381 - Posted: 15 Jan 2012, 1:55:23 UTC - in response to Message 1185373.  

IIRC the 5XXX series cards didn't do well with anything more than 2 WUs at a time. 3 or more doesn't actually get more work done.
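
One way to check this on your own card is to compare tasks completed per hour rather than GPU load. A minimal sketch in Python, where the run times are made-up placeholders and not measurements from any host in this thread:

```python
def tasks_per_hour(concurrent_tasks, avg_elapsed_seconds):
    """Throughput when `concurrent_tasks` run side by side on one GPU,
    each taking `avg_elapsed_seconds` of wall-clock time on average."""
    if concurrent_tasks <= 0 or avg_elapsed_seconds <= 0:
        raise ValueError("both arguments must be positive")
    return concurrent_tasks * 3600.0 / avg_elapsed_seconds


# Hypothetical timings: 3 at a time only pays off if each task slows
# down by less than 1.5x compared with running 2 at a time.
two_up = tasks_per_hour(2, 1200)
three_up = tasks_per_hour(3, 1800)
print(two_up, three_up)
```

With these placeholder numbers both settings finish the same number of tasks per hour, so the higher load at 3 per GPU would not translate into more work done.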


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1185381
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1185389 - Posted: 15 Jan 2012, 2:27:50 UTC - in response to Message 1185285.  
Last modified: 15 Jan 2012, 2:28:34 UTC

didn't work, perhaps i will try the oven trick tonight on the daughter card...


it is probably early to call success, but it passed memtestCL!

395F for 10 min in the oven on convection. I just couldn't resist...
ID: 1185389
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65745
Credit: 55,293,173
RAC: 49
United States
Message 1185392 - Posted: 15 Jan 2012, 3:04:08 UTC - in response to Message 1185389.  

didn't work, perhaps i will try the oven trick tonight on the daughter card...


it is probably early to call success, but it passed memtestCL!

395F for 10 min in the oven on convection. I just couldn't resist...

Oh no! Torture... ;)

Hope It works for Ya RM.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1185392
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.