V10 of modified SETI MB CUDA + opt AP package for full multi-GPU+CPU use

Message boards : Number crunching : V10 of modified SETI MB CUDA + opt AP package for full multi-GPU+CPU use

Profile RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 870758 - Posted: 1 Mar 2009, 4:01:31 UTC

BOINC should be fixed so it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.
ID: 870758 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 870769 - Posted: 1 Mar 2009, 4:17:09 UTC - in response to Message 870758.  

BOINC should be fixed so it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.


Yeah, I gather that in fine-tuning for stock application work-fetch behaviour, the newer development versions currently don't seem to play well with opt-apps (for me, anyway). Probably when they get around to fitting the Anonymous platform mechanism into the new work-fetch and process control methods, things might look a bit better from our perspective. Until then I've had to go back to 6.5.0 (which still had single work fetch) to get any MB work.

Jason

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 870769 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 870972 - Posted: 1 Mar 2009, 15:39:28 UTC - in response to Message 870769.  

A version of CUDA MB without the affinity lock mod is attached here:
http://lunatics.kwsn.net/gpu-crunching/v10-of-modified-seti-mb-cuda-opt-ap-package-for-full-multi-gpucpu-use.msg15058.html#msg15058

I hope it removes the performance loss during CUDA task start on multi-GPU configs.
But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.
ID: 870972 · Report as offensive
Profile Misfit
Volunteer tester
Avatar

Send message
Joined: 21 Jun 01
Posts: 21804
Credit: 2,815,091
RAC: 0
United States
Message 871102 - Posted: 1 Mar 2009, 19:53:29 UTC - in response to Message 870972.  

But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.

Sometimes it appears to be the other way around.
me@rescam.org
ID: 871102 · Report as offensive
Simon
Joined: 13 Aug 99
Posts: 11
Credit: 18,874,447
RAC: 0
United Kingdom
Message 871132 - Posted: 1 Mar 2009, 22:02:17 UTC - in response to Message 871102.  

Hi,

Just wanted to say thanks, your V10 Multi GPU pack is exactly what I needed.

Since regular AP units are sparse, I was always stuck with a dilemma: run four GPUs and leave the CPU out of work (other than scheduling), or use your V7 pack to run all the cores but only one of the GPUs.

Thanks, Simon.


ID: 871132 · Report as offensive
Profile Bob Mahoney Design
Joined: 4 Apr 04
Posts: 178
Credit: 9,205,632
RAC: 0
United States
Message 871322 - Posted: 2 Mar 2009, 13:52:31 UTC - in response to Message 870972.  

I hope it removes the performance loss during CUDA task start on multi-GPU configs.
But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.

The "no_affinity_lock" version of V10 fixed the issue. I've seen multiple GPU loads while the other GPU tasks continue to process as normal.

I'm testing it right now with 4 CPU AP V5 running and all GPUs loaded with MB CUDA. Looks perfect.

Next I'll test with 4 CPU AK MB plus all GPUs loaded with MB CUDA.

Nice work!

Bob Mahoney
ID: 871322 · Report as offensive
Profile Bob Mahoney Design
Joined: 4 Apr 04
Posts: 178
Credit: 9,205,632
RAC: 0
United States
Message 871364 - Posted: 2 Mar 2009, 16:20:34 UTC - in response to Message 870972.  

I hope it removes the performance loss during CUDA task start on multi-GPU configs.
But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.

The fix is also confirmed for MB on CPU and GPU.

I can't break it or find any slowdowns.

Bob Mahoney
ID: 871364 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 871368 - Posted: 2 Mar 2009, 16:25:22 UTC - in response to Message 871364.  

I hope it removes the performance loss during CUDA task start on multi-GPU configs.
But if you run non-BOINC CPU-intensive applications, they can slow down the CUDA MB apps.

The fix is also confirmed for MB on CPU and GPU.

I can't break it or find any slowdowns.

Bob Mahoney


Fine, thanks.
So it will be the default build now.
ID: 871368 · Report as offensive
RuthlessRufus
Joined: 18 Oct 07
Posts: 11
Credit: 70,386,101
RAC: 28
United States
Message 871646 - Posted: 3 Mar 2009, 6:48:23 UTC

Is it possible to declare which GPU you want to run Seti on with this mod? I have a core on my pair of GTX295s which will not fold.

Configuration:

WinXP 64
2x GTX295s
182.06 drivers
ID: 871646 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 871647 - Posted: 3 Mar 2009, 6:52:37 UTC - in response to Message 871646.  
Last modified: 3 Mar 2009, 6:53:17 UTC

Is it possible to declare which GPU you want to run Seti on with this mod? I have a core on my pair of GTX295s which will not fold.

Configuration:

WinXP 64
2x GTX295s
182.06 drivers

Not quite, but you can restrict it to the first N GPUs (in CUDA detection order). That is, for a dual-GPU config you can use both or only the first one.

There is no way to enable only the second one (in the current V10a) other than physically swapping the cards between PCI-E slots (which is not an option for a dual-GPU card, of course).
ID: 871647 · Report as offensive
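
As an illustration of the restriction Raistmer describes, here is a minimal Python sketch of "use only the first N GPUs in detection order". The device names and the `select_first_n` helper are hypothetical and are not taken from the V10a source. The point is only that a prefix of the detection-ordered list is selected, so the second device can never be chosen without the first.

```python
def select_first_n(detected, n):
    """Return the first n devices of a detection-ordered list (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return detected[:n]


# Hypothetical detection order for a dual-GPU setup.
gpus = ["GPU 0", "GPU 1"]
both = select_first_n(gpus, 2)        # both devices enabled
first_only = select_first_n(gpus, 1)  # only the first device enabled
```

With this kind of prefix rule, no value of N yields a selection containing only "GPU 1", which is why a different physical arrangement would be needed.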
Profile Bob Mahoney Design
Joined: 4 Apr 04
Posts: 178
Credit: 9,205,632
RAC: 0
United States
Message 871798 - Posted: 3 Mar 2009, 17:30:14 UTC

I have seen a few VLAR WUs start on the CPU and then get moved to a GPU. For example, WU 23ja09aa.18774.2526.9.8.9_0 was running very slowly on a GPU. I aborted it, then looked it up here in my completed tasks:

http://setiathome.berkeley.edu/result.php?resultid=1177059133

I think I've seen this happen a few times now. V10 realized it was a VLAR, but still moved it from CPU to GPU, as in this example.

Bob Mahoney
ID: 871798 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 871812 - Posted: 3 Mar 2009, 22:26:04 UTC - in response to Message 871798.  

I have seen a few VLAR WUs start on the CPU and then get moved to a GPU. For example, WU 23ja09aa.18774.2526.9.8.9_0 was running very slowly on a GPU. I aborted it, then looked it up here in my completed tasks:

http://setiathome.berkeley.edu/result.php?resultid=1177059133

I think I've seen this happen a few times now. V10 realized it was a VLAR, but still moved it from CPU to GPU, as in this example.

Bob Mahoney


It's another tradeoff, between saving work already done by the CPU and the slowness of CUDA VLAR processing.
That is, if a VLAR task is already partially done, CUDA MB will not abort it but will continue the computation.

Why did the CPU app decide to reschedule a VLAR task to the GPU? One of the GPUs was idle and one of the CPU apps wrote a checkpoint at just the same moment. So it detected the idle GPU (probably before BOINC started a new task) and rescheduled its own task to that GPU.

A simple solution for this is to forbid re-scheduling of VLAR tasks. It can be done. I will post a new mod soon.
For now you can just increase the time between checkpoints (the CPU app searches for an idle GPU only after completing a checkpoint); this will reduce the chance of re-scheduling.
ID: 871812 · Report as offensive
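
The rescheduling behaviour Raistmer describes can be sketched roughly as follows. This is an illustrative Python model, not the actual V10 code: the `Task` class, its field names, and both functions are assumptions made for the example. It captures three points from the post: the CPU app looks for an idle GPU only right after a checkpoint, the proposed fix forbids moving VLAR tasks, and the CUDA app continues a partially processed VLAR task rather than aborting it. What the CUDA app does with a VLAR task that has no progress yet is not stated in the post; the sketch assumes it is rejected.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    is_vlar: bool
    progress: float  # fraction done, 0.0 .. 1.0


def should_move_to_gpu(task, just_checkpointed, gpu_idle, forbid_vlar=True):
    """CPU-side decision: offer the task to an idle GPU only right after a checkpoint."""
    if not (just_checkpointed and gpu_idle):
        return False
    if forbid_vlar and task.is_vlar:
        return False  # proposed fix: never re-schedule VLAR tasks
    return True


def gpu_should_continue(task):
    """GPU-side decision: keep a partially done VLAR task so CPU work is not wasted."""
    if task.is_vlar and task.progress == 0.0:
        return False  # assumption: a fresh VLAR task is not run on CUDA
    return True


# Hypothetical task that was partly processed on the CPU before being moved.
vlar = Task(name="example_vlar_wu", is_vlar=True, progress=0.4)
moved_before_fix = should_move_to_gpu(vlar, True, True, forbid_vlar=False)
moved_after_fix = should_move_to_gpu(vlar, True, True, forbid_vlar=True)
```

Because the search for an idle GPU happens only after a checkpoint, lengthening the checkpoint interval simply makes the first condition true less often, which is the workaround suggested for now.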
Profile Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 872401 - Posted: 5 Mar 2009, 11:38:34 UTC


Little 'new member question'.. ;-D

I downloaded 'app_info.xml 608 CUDA WUs'.
[Raistmer's V7 mod with a self-made app_info.xml mod for CUDA only]

What will happen if I delete the app_info.xml?
..to get some experience with the stock app.

Will I crash/trash/delete all downloaded WUs on my rig? (My 'wingmen' will be very happy.. ;-)
Or will BOINC download the stock CUDA app?

ID: 872401 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 872404 - Posted: 5 Mar 2009, 12:04:09 UTC - in response to Message 872401.  


Little 'new member question'.. ;-D

I downloaded 'app_info.xml 608 CUDA WUs'.
[Raistmer's V7 mod with a self-made app_info.xml mod for CUDA only]

What will happen if I delete the app_info.xml?
..to get some experience with the stock app.

Will I crash/trash/delete all downloaded WUs on my rig? (My 'wingmen' will be very happy.. ;-)
Or will BOINC download the stock CUDA app?

You'll crash/trash/delete all downloaded WUs on your rig, and then BOINC will download the stock CUDA app with new WUs to replace them.

Set "No new tasks" and wait for them to flush out first.
ID: 872404 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 872411 - Posted: 5 Mar 2009, 12:14:19 UTC - in response to Message 872401.  
Last modified: 5 Mar 2009, 12:14:55 UTC

What will happen if I delete the app_info.xml?
..to get some experience with the stock app.

A more complete answer is this:

When you delete only the app_info.xml file and restart BOINC, it will try to download the modified executable from the Seti servers, while crashing all your remaining tasks. Even deleting both the app_info.xml and the executable files will not make BOINC download the stock application, as there are entries in the client_state.xml file that tell BOINC to use this modified application.

So the best thing to do is to delete both the app_info.xml file and the executable, then, after you have restarted BOINC, do a project reset. Since this will delete all work in progress, you may want to wait until all that work is done, setting Seti to No New Tasks in the meantime.

If you don't want to wait, just detaching and re-attaching is easier, as that'll tell Seti that the work on your system was flushed and that it can be reassigned immediately.
ID: 872411 · Report as offensive
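
For readers who prefer the command line, the two routes Jord describes can be sketched as lists of commands. This Python snippet only builds the command lists and does not run anything. It assumes a BOINC client that ships the `boinccmd` tool with its `--project <URL> nomorework`, `reset` and `detach` operations; check your own version before relying on it. The function name is made up for the example, and deleting app_info.xml and the executable, as well as restarting BOINC, are left as manual steps.

```python
PROJECT_URL = "http://setiathome.berkeley.edu/"


def stock_app_commands(wait_for_work=True, url=PROJECT_URL):
    """Build (but do not run) boinccmd invocations for returning to the stock app."""
    if wait_for_work:
        return [
            # 1. Stop fetching work and let the cached tasks finish and report.
            ["boinccmd", "--project", url, "nomorework"],
            # 2. Only after the cache is empty, the files are deleted and
            #    BOINC has been restarted: reset the project.
            ["boinccmd", "--project", url, "reset"],
        ]
    # Not waiting: detaching tells the project the cached work is gone.
    # Re-attaching afterwards is a separate manual step.
    return [["boinccmd", "--project", url, "detach"]]


patient_route = stock_app_commands(wait_for_work=True)
quick_route = stock_app_commands(wait_for_work=False)
```

The two return values mirror the choice in the post: wait for the work to flush out and then reset, or detach and re-attach straight away.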
MarkJ Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 17 Feb 08
Posts: 1139
Credit: 80,854,192
RAC: 5
Australia
Message 872414 - Posted: 5 Mar 2009, 12:19:25 UTC - in response to Message 870769.  

BOINC should be fixed so it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.


Yeah, I gather that in fine-tuning for stock application work-fetch behaviour, the newer development versions currently don't seem to play well with opt-apps (for me, anyway). Probably when they get around to fitting the Anonymous platform mechanism into the new work-fetch and process control methods, things might look a bit better from our perspective. Until then I've had to go back to 6.5.0 (which still had single work fetch) to get any MB work.

Jason


You mean like this message?
BOINC blog
ID: 872414 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 872443 - Posted: 5 Mar 2009, 13:34:08 UTC - in response to Message 872414.  
Last modified: 5 Mar 2009, 13:34:52 UTC

BOINC should be fixed so it does not run more AP work units than the number of logical CPUs. It should also be smart enough to get some MP apps to keep the GPU fed.


Yeah, I gather that in fine-tuning for stock application work-fetch behaviour, the newer development versions currently don't seem to play well with opt-apps (for me, anyway). Probably when they get around to fitting the Anonymous platform mechanism into the new work-fetch and process control methods, things might look a bit better from our perspective. Until then I've had to go back to 6.5.0 (which still had single work fetch) to get any MB work.

Jason


You mean like this message?


Yes, that's the dark depths where our experiments led. The situation had only become a little clearer by the time Richard posted on the 4th, since my post you quoted on the 1st. We were already experimenting to try to make it work ... which of course it wouldn't... Fun Week :D
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 872443 · Report as offensive
Profile Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 872626 - Posted: 5 Mar 2009, 21:16:51 UTC
Last modified: 5 Mar 2009, 21:23:11 UTC

Is this worst case possible?

AFAIK.. a max. of 10 'copies' will be sent out for every WU.


The VLAR kill mod -> 'bad WU header' -> client error


Is it possible that 10 CUDA apps will report 'client error' and the WU will be deleted?
Or will this 'client error' count as the 10th alongside other possible errors (no reply, validate error*, or others)?

And because of this we wouldn't see the WOW signal that was maybe in it?

Or wouldn't a 'client error' count?



[* Very often in the past. Keywords: 'RRI' and '~ 60 sec'.
Currently I see a 1 sec. gap between upload and report and no 'validate error'.
Was the server software updated, so that we'll never again see 'validate errors' like we did a long, long time (years) ago?
Or is it maybe 'only' a side effect of some other update?
And will we have the 'validate errors' again after the next update of the server software?]
ID: 872626 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 872645 - Posted: 5 Mar 2009, 22:18:32 UTC - in response to Message 872626.  

Is this worst case possible?

AFAIK.. a max. of 10 'copies' will be sent out for every WU.

Up to 12 can be sent. "max # of error/total/success tasks 5, 10, 10" are the allowed counts, and the max on total isn't really checked until tasks are reported.

The VLAR kill mod -> 'bad WU header' -> client error


Is it possible that 10 CUDA apps will report 'client error' and the WU will be deleted?

It only takes 6 errors to kill a WU. Although mathematically possible, it's extremely unlikely the VLAR kill mod will affect that many for one WU. It's only intended as a temporary workaround until the CUDA app can be improved or a better method of getting VLARs on CPU apps is available.
                                                                 Joe
ID: 872645 · Report as offensive
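
Joe's numbers can be turned into a small worked example. The Python sketch below is a simplified model, not the BOINC server code: the function and constant names are invented here, and it assumes a limit is tripped only when a count exceeds it, which matches the statement that a limit of 5 errors means the 6th error kills the WU.

```python
# Limits quoted in the post: "max # of error/total/success tasks 5, 10, 10".
MAX_ERROR_TASKS = 5
MAX_TOTAL_TASKS = 10


def wu_errored_out(n_errors, n_total_reported):
    """True once reported counts exceed the limits (assumed 'greater than' check)."""
    return n_errors > MAX_ERROR_TASKS or n_total_reported > MAX_TOTAL_TASKS


def errors_needed_to_kill():
    """Smallest number of errored tasks that trips the error limit under this model."""
    return MAX_ERROR_TASKS + 1


five_errors = wu_errored_out(5, 5)  # still alive under this model
six_errors = wu_errored_out(6, 6)   # WU is dropped
```

Since the total is checked against reported tasks rather than sent ones, more copies than the nominal maximum can be outstanding at once, which is consistent with "up to 12 can be sent".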
Profile Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 872751 - Posted: 6 Mar 2009, 5:15:23 UTC
Last modified: 6 Mar 2009, 5:32:29 UTC

@ Josef W. Segur

O.K., I had a little bit of time.. ;-)
And searched through all my WUs/results..

I found 1 or 3 negative highlights.. with three 'client errors' side by side on one WU:

[EDIT #2: Of course, I found many WUs with two 'CUDA kill mod errors' side by side. There wouldn't be enough space here to post all the URLs.. ;-)
AND I don't know whether a third (or further) 'CUDA kill error' will still be added to them.]


'Real CUDA kill mod errors':
http://setiathome.berkeley.edu/workunit.php?wuid=418339880



And here is something 'funny'..
Maybe from the 1st version of the 'CUDA kill mod'?


http://setiathome.berkeley.edu/workunit.php?wuid=414645638
with
http://setiathome.berkeley.edu/result.php?resultid=1170589427
[EDIT: Ooppsss.. here it's app V6.03, so not CUDA..]


http://setiathome.berkeley.edu/workunit.php?wuid=418166229
with
http://setiathome.berkeley.edu/result.php?resultid=1171682588



If I understood you correctly..
Do different kinds of 'client error' not count together toward the max. of 6 errors?

So, maybe three 'client errors' from 'CPU overheating' and three 'client errors' from the 'CUDA kill mod' would not add up to the 6 needed to delete a WU?
ONLY 6 of the SAME error?


Thinking twice about the 'VLAR kill function', maybe it would be better to crunch VLARs with CUDA after all, although very veeery slowly..

Maybe we'll soon see the server send VLARs only to MB V6.03 [CPU app]?
Until a version newer than MB V6.08 CUDA [GPU app] is released?
ID: 872751 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.