V10 of modified SETI MB CUDA + opt AP package for full multi-GPU+CPU use

RottenMutt Joined: 15 Mar 01 Posts: 1011 Credit: 230,314,058 RAC: 0
Message 870758 - Posted: 1 Mar 2009, 4:01:31 UTC
BOINC should be fixed so that it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
Message 870769 - Posted: 1 Mar 2009, 4:17:09 UTC - in response to Message 870758.
> BOINC should be fixed so that it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.
Yeah, I gather that in fine-tuning for stock application work-fetch behaviour, the newer development versions currently don't seem to play well with opt-apps (for me anyway). Probably when they get around to fitting the Anonymous platform mechanism into the new work-fetch and process control methods, things might look a bit better from our perspective. Until then I've had to go back to 6.5.0 (which still had single work fetch) to get any MB work.
Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121
Message 870972 - Posted: 1 Mar 2009, 15:39:28 UTC - in response to Message 870769.
A version of CUDA MB without the affinity lock mod is attached here:
http://lunatics.kwsn.net/gpu-crunching/v10-of-modified-seti-mb-cuda-opt-ap-package-for-full-multi-gpucpu-use.msg15058.html#msg15058
I hope it removes the performance loss during CUDA task start on multi-GPU configs. But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.

Misfit Volunteer tester Joined: 21 Jun 01 Posts: 21804 Credit: 2,815,091 RAC: 0
Message 871102 - Posted: 1 Mar 2009, 19:53:29 UTC - in response to Message 870972.
> But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.
Sometimes it appears to be the other way around.
me@rescam.org

Simon Joined: 13 Aug 99 Posts: 11 Credit: 18,874,447 RAC: 0
Message 871132 - Posted: 1 Mar 2009, 22:02:17 UTC - in response to Message 871102.
Hi,
Just wanted to say thanks, your V10 Multi GPU pack is exactly what I needed. Since regular AP units are sparse, I was always stuck with the dilemma of either running four GPUs and leaving the CPU out of work (other than scheduling) or using your V7 pack to use all the cores but only one of the GPUs.
Thanks, Simon.

Bob Mahoney Design Joined: 4 Apr 04 Posts: 178 Credit: 9,205,632 RAC: 0
Message 871322 - Posted: 2 Mar 2009, 13:52:31 UTC - in response to Message 870972.
> I hope it removes the performance loss during CUDA task start on multi-GPU configs. But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.
The "no_affinity_lock" version of V10 fixed the issue. I've seen multiple GPU task loads happen while the other GPU tasks continue to process as normal. I'm testing it right now with 4 CPU AP V5 tasks running and all GPUs loaded with MB CUDA. Looks perfect. Next I'll test with 4 CPU AK MB tasks plus all GPUs loaded with MB CUDA. Nice work!
Bob Mahoney

Bob Mahoney Design Joined: 4 Apr 04 Posts: 178 Credit: 9,205,632 RAC: 0
Message 871364 - Posted: 2 Mar 2009, 16:20:34 UTC - in response to Message 870972.
> I hope it removes the performance loss during CUDA task start on multi-GPU configs. But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.
The fix is also confirmed for MB on CPU and GPU. I can't break it or find any slowdowns.
Bob Mahoney

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121
Message 871368 - Posted: 2 Mar 2009, 16:25:22 UTC - in response to Message 871364.
>> I hope it removes the performance loss during CUDA task start on multi-GPU configs. But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.
> The fix is also confirmed for MB on CPU and GPU. I can't break it or find any slowdowns. Bob Mahoney
Fine, thanks. So it will be the default build now.

RuthlessRufus Joined: 18 Oct 07 Posts: 11 Credit: 70,386,101 RAC: 28
Message 871646 - Posted: 3 Mar 2009, 6:48:23 UTC
Is it possible to declare which GPU you want to run Seti on with this mod? I have a core on my pair of GTX295s which will not fold.
Configuration: WinXP 64, 2x GTX295, 182.06 drivers

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121
Message 871647 - Posted: 3 Mar 2009, 6:52:37 UTC - in response to Message 871646. Last modified: 3 Mar 2009, 6:53:17 UTC
> Is it possible to declare which GPU you want to run Seti on with this mod? I have a core on my pair of GTX295s which will not fold. Configuration: WinXP 64, 2x GTX295, 182.06 drivers
Not quite, but you can restrict work to the first N GPUs (in CUDA detection order). That is, for a dual-GPU config you can use both or only the first one. There is no way to enable only the second one (in the current V10a) other than physically swapping the cards in the PCI-E slots (which is not an option for a dual-core GPU, of course).
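For readers following along, the "first N GPUs" rule described above can be pictured with a small Python sketch. This is only an illustration of the rule as stated in the post, not code from the V10a mod; the function name `enabled_gpus` and the device labels are hypothetical.

```python
def enabled_gpus(detected, first_n):
    """Return the devices allowed to run work.

    Assumption: the allowed set is always a prefix of the list of
    devices in CUDA detection order, as described in the post.
    """
    if first_n < 0:
        raise ValueError("first_n must be non-negative")
    return detected[:first_n]


# Hypothetical labels for a dual-GPU setup, in detection order.
detected = ["GPU 0", "GPU 1"]

both = enabled_gpus(detected, 2)        # both devices enabled
first_only = enabled_gpus(detected, 1)  # only the first device enabled

# Every possible setting yields a prefix, so "second device only"
# can never be selected this way.
all_settings = [enabled_gpus(detected, n) for n in range(len(detected) + 1)]
```

Because the selection is always a prefix of the detection-order list, no value of N produces "second GPU only", which matches the limitation described in the post.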

Bob Mahoney Design Joined: 4 Apr 04 Posts: 178 Credit: 9,205,632 RAC: 0
Message 871798 - Posted: 3 Mar 2009, 17:30:14 UTC
I have seen a few VLAR WUs start on the CPU and then get moved to a GPU. This example, WU 23ja09aa.18774.2526.9.8.9_0, was running very slowly on a GPU. I aborted it, then looked it up here in my completed tasks:
http://setiathome.berkeley.edu/result.php?resultid=1177059133
I think I have seen this happen a few times now. V10 realized it was a VLAR, but still moved it from CPU to GPU, as in this example.
Bob Mahoney

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121
Message 871812 - Posted: 3 Mar 2009, 22:26:04 UTC - in response to Message 871798.
> I have seen a few VLAR WUs start on the CPU and then get moved to a GPU. This example, WU 23ja09aa.18774.2526.9.8.9_0, was running very slowly on a GPU. I aborted it, then looked it up here in my completed tasks: http://setiathome.berkeley.edu/result.php?resultid=1177059133 I think I have seen this happen a few times now. V10 realized it was a VLAR, but still moved it from CPU to GPU, as in this example. Bob Mahoney
It's another tradeoff between saving work already done by the CPU and the slowness of CUDA VLAR processing. That is, if a VLAR task is already partially done, CUDA MB will not abort it but will continue the computation.
As for why the CPU app decided to reschedule the VLAR to the GPU: one of the GPUs was idle, and one of the CPU apps did a checkpoint at just that moment. So it detected the idle GPU (probably before BOINC started a new task) and rescheduled its own task to the idle GPU.
The simple solution is to forbid re-scheduling of VLAR tasks. It can be done. I will post a new mod soon.
For now you can just increase the time between checkpoints (the CPU app searches for an idle GPU only after a completed checkpoint). That will reduce the chance of re-scheduling.
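The decision described above (look for an idle GPU only right after a checkpoint, and, as the proposed fix, never move a VLAR task) could be sketched roughly as follows in Python. This is an assumption-laden illustration of the logic in the post, not the mod's actual code; `Task`, `should_move_to_gpu`, and the `forbid_vlar` flag are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str                # hypothetical task label
    is_vlar: bool            # True for a very-low-angle-range work unit
    just_checkpointed: bool  # True right after a completed checkpoint


def should_move_to_gpu(task, gpu_idle, forbid_vlar):
    """Decide whether a CPU task hands itself over to an idle GPU."""
    # Per the post, the CPU app searches for an idle GPU only after
    # a completed checkpoint.
    if not task.just_checkpointed:
        return False
    if not gpu_idle:
        return False
    # Proposed fix from the post: forbid re-scheduling of VLAR tasks.
    if forbid_vlar and task.is_vlar:
        return False
    return True


vlar = Task(name="example_vlar_wu", is_vlar=True, just_checkpointed=True)

# Behaviour reported in the thread: the VLAR task moves to the idle GPU.
moved_without_fix = should_move_to_gpu(vlar, gpu_idle=True, forbid_vlar=False)

# Behaviour with the proposed fix: the VLAR task stays on the CPU.
moved_with_fix = should_move_to_gpu(vlar, gpu_idle=True, forbid_vlar=True)
```

The sketch also shows why longer checkpoint intervals help as a stopgap: the first check passes less often, so there are fewer moments at which a task can be moved at all.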

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5
Message 872401 - Posted: 5 Mar 2009, 11:38:34 UTC
A little 'new member question'.. ;-D
I downloaded 'app_info.xml 608 CUDA WUs'. [Raistmer's V7 mod with a self-made app_info.xml mod for CUDA only]
What will happen if I delete the app_info.xml, to get some experience with the stock app?
Will I crash/trash/delete all downloaded WUs on my rig? (My 'wingmen' will be very happy.. ;-)
Or will BOINC download the stock CUDA app?

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874
Message 872404 - Posted: 5 Mar 2009, 12:04:09 UTC - in response to Message 872401.
> A little 'new member question'.. ;-D I downloaded 'app_info.xml 608 CUDA WUs'. [Raistmer's V7 mod with a self-made app_info.xml mod for CUDA only] What will happen if I delete the app_info.xml, to get some experience with the stock app? Will I crash/trash/delete all downloaded WUs on my rig? (My 'wingmen' will be very happy.. ;-) Or will BOINC download the stock CUDA app?
You'll crash/trash/delete all downloaded WUs on your rig, and then BOINC will download the stock CUDA app with new WUs to replace them.
Set "No new tasks" and wait for them to flush out first.

Jord Volunteer tester Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3
Message 872411 - Posted: 5 Mar 2009, 12:14:19 UTC - in response to Message 872401. Last modified: 5 Mar 2009, 12:14:55 UTC
> What will happen if I delete the app_info.xml, to get some experience with the stock app?
A more complete answer is this: when you delete only the app_info.xml file and restart BOINC, it will try to download the modified executable from the Seti servers, while crashing all your remaining tasks.
Even deleting both the app_info.xml and the executable files will not download the stock application, as there are entries in the client_state.xml file that tell BOINC to use this modified application.
So the best thing to do is to delete both the app_info.xml file and the executable, then, after you have restarted BOINC, do a project reset. Since this will delete all work in progress, you may want to wait until all that work is done, setting Seti to No New Tasks in the meantime.
If you don't want to wait, just detaching and re-attaching is easier, as that will tell Seti that the work on your system was flushed and that it can be reassigned immediately.

MarkJ Volunteer tester Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5
Message 872414 - Posted: 5 Mar 2009, 12:19:25 UTC - in response to Message 870769.
>> BOINC should be fixed so that it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.
> Yeah, I gather that in fine-tuning for stock application work-fetch behaviour, the newer development versions currently don't seem to play well with opt-apps (for me anyway). Probably when they get around to fitting the Anonymous platform mechanism into the new work-fetch and process control methods, things might look a bit better from our perspective. Until then I've had to go back to 6.5.0 (which still had single work fetch) to get any MB work. Jason
You mean like this message?
BOINC blog

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0
Message 872443 - Posted: 5 Mar 2009, 13:34:08 UTC - in response to Message 872414. Last modified: 5 Mar 2009, 13:34:52 UTC
>>> BOINC should be fixed so that it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.
>> Yeah, I gather that in fine-tuning for stock application work-fetch behaviour, the newer development versions currently don't seem to play well with opt-apps (for me anyway). Probably when they get around to fitting the Anonymous platform mechanism into the new work-fetch and process control methods, things might look a bit better from our perspective. Until then I've had to go back to 6.5.0 (which still had single work fetch) to get any MB work. Jason
> You mean like this message?
Yes, that's the dark depths where our experiments led. The situation had only become a little clearer by the time Richard posted on the 4th, compared with my post on the 1st that you quoted. We were already experimenting to try to make it work ... which of course it wouldn't... Fun Week :D
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5
Message 872626 - Posted: 5 Mar 2009, 21:16:51 UTC Last modified: 5 Mar 2009, 21:23:11 UTC
Is this worst case possible?
AFAIK.. max. 10 'copies' will be sent out for every WU.
The VLAR kill mod -> 'bad WU header' -> client error
Is it possible that 10 CUDA apps will report 'client error' and the WU will be deleted?
Or will this 'client error' count as the 10th alongside other possible errors (no reply, validate error, or others)?
And because of this we wouldn't see the WOW signal that was maybe in it?
Or wouldn't a 'client error' count?
[Validate errors were very common in the past. Keywords: 'RRI' and '~ 60 sec'. Currently I see a 1 sec. gap between upload and report and no 'validate error'. Was the server software updated, so that we'll never again see 'validate errors' like a long, long time (years) ago? Or is it maybe 'only' a side effect of some other update? And after the next update of the server software, will we have the 'validate errors' again?]

Josef W. Segur Volunteer developer Volunteer tester Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0
Message 872645 - Posted: 5 Mar 2009, 22:18:32 UTC - in response to Message 872626.
> Is this worst case possible? AFAIK.. max. 10 'copies' will be sent out for every WU.
Up to 12 can be sent. "max # of error/total/success tasks 5, 10, 10" are the allowed counts, and the max on total isn't really checked until tasks are reported.
> The VLAR kill mod -> 'bad WU header' -> client error Is it possible that 10 CUDA apps will report 'client error' and the WU will be deleted?
It only takes 6 errors to kill a WU. Although mathematically possible, it's extremely unlikely the VLAR kill mod will affect that many for one WU. It's only intended as a temporary workaround until the CUDA app can be improved or a better method of getting VLARs on CPU apps is available.
Joe
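The arithmetic in the reply above (an error limit of 5, yet 6 errors to kill a WU) makes sense if the work unit fails only once the error count exceeds the limit. A minimal Python sketch under that assumption follows; the constant and function names are hypothetical, and the sketch deliberately takes a single error count, since the thread does not say how different kinds of client error are tallied.

```python
# Limits as quoted in the post: "max # of error/total/success tasks 5, 10, 10".
MAX_ERROR_TASKS = 5
MAX_TOTAL_TASKS = 10
MAX_SUCCESS_TASKS = 10


def workunit_killed_by_errors(error_count, max_errors=MAX_ERROR_TASKS):
    """Return True once the error count exceeds the allowed maximum.

    Assumption: the comparison is "greater than", which is what makes
    a limit of 5 consistent with "it only takes 6 errors to kill a WU".
    """
    return error_count > max_errors


survives_five = not workunit_killed_by_errors(5)  # still within the limit
killed_at_six = workunit_killed_by_errors(6)      # one past the limit
```

With these numbers, a work unit would need six erroring tasks before it is abandoned, which is why a handful of VLAR-kill client errors on one WU does not by itself end it.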

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5
Message 872751 - Posted: 6 Mar 2009, 5:15:23 UTC Last modified: 6 Mar 2009, 5:32:29 UTC
@ Josef W. Segur
O.K., I had a little bit of time.. ;-) And searched through all my WUs/results..
I found 1 or 3 negative highlights.. with 3 'client errors' side by side on one WU:
[EDIT #2: Of course, I found many WUs with 2 'CUDA kill mod errors' side by side. There wouldn't be enough space here to post all the URLs.. ;-) AND I don't know whether a 3rd or further 'CUDA kill error' will still be added.]
'Real CUDA kill mod errors':
http://setiathome.berkeley.edu/workunit.php?wuid=418339880
And here is something 'funny'.. Maybe with the 1st version of the 'CUDA kill mod'?
http://setiathome.berkeley.edu/workunit.php?wuid=414645638 with http://setiathome.berkeley.edu/result.php?resultid=1170589427
[EDIT: Ooppsss.. here it's app V6.03, so not CUDA..]
http://setiathome.berkeley.edu/workunit.php?wuid=418166229 with http://setiathome.berkeley.edu/result.php?resultid=1171682588
If I understood you correctly.. do different kinds of 'client error' not count together toward the max. 6 errors?
So, maybe three 'client errors' from 'CPU overheating' and three 'client errors' from the 'CUDA kill mod' don't add up to the 6 needed to delete a WU? ONLY 6 times the SAME error?
Thinking twice about the 'VLAR kill function', it would maybe be better to crunch VLARs with CUDA, although very veeery slowly..
Maybe soon we'll see the server send VLARs only to MB V6.03 [CPU app]? Until a version later than MB V6.08 CUDA [GPU app] is released?


SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.