V10 of modified SETI MB CUDA + opt AP package for full multi-GPU+CPU use

Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 872769 - Posted: 6 Mar 2009, 6:50:29 UTC - in response to Message 872751.  

@ Josef W. Segur
...
If I understood you correctly..
Do different kinds of 'client error' not count together toward the max. of 6 errors?

So, maybe three 'client errors' from 'CPU overheating' and three 'client errors' from 'CUDA kill mod' do not add up to the 6 needed to delete a WU?
Is it ONLY 6 occurrences of the SAME error?

No, all client errors count together. Any 6 will kill a WU.
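
A minimal sketch of that counting rule, assuming the limit of 6 quoted above. The function name and the reason labels are made up for illustration; this is not the project's server code.

```python
from collections import Counter

# Assumed limit, taken from the "max. 6 errors" figure quoted in this thread.
MAX_TOTAL_ERRORS = 6


def workunit_is_killed(error_reasons, max_total_errors=MAX_TOTAL_ERRORS):
    """Return True once the combined number of client errors reaches the limit.

    error_reasons holds one (hypothetical) reason label per errored result.
    """
    per_reason = Counter(error_reasons)  # kept only to show the split by reason
    total = sum(per_reason.values())     # every reason counts toward the same total
    return total >= max_total_errors


mixed = ["cpu_overheating"] * 3 + ["vlar_kill"] * 3
print(workunit_is_killed(mixed))      # three plus three errors reach the limit
print(workunit_is_killed(mixed[:5]))  # five errors in total do not
```

The point is only that the reasons are summed, so three errors of one kind plus three of another are enough.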

Thinking twice about the 'VLAR kill function', maybe it would be better to crunch VLARs with CUDA, even though it is very, veeery slow..

Maybe we will soon see the server send VLARs only to MB V6.03 [CPU app]?
Until a version later than MB V6.08 CUDA is released [GPU app]?

I still think the risk of killing a WU is very small, but avoiding it by allowing them to crunch on your host is of course something you could do to almost eliminate the risk on WUs your host gets.

I don't have any information on what the project might try. AFAIK BOINC doesn't have any way to direct VLARs to CPU apps, though Eric could code changes for that purpose if he decided it was needed. I suspect he'll try to improve the CUDA code instead, if he has time. Or maybe Raistmer and Mimo/Devaster will find a way to improve the CUDA code.
Joe
ID: 872769 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 873332 - Posted: 7 Mar 2009, 13:09:39 UTC - in response to Message 872769.  
Last modified: 7 Mar 2009, 13:11:26 UTC

Here are the V10b updates for the CPU part of the team mods (x86 versions).
The CPU app will no longer attempt to re-schedule a VLAR task onto the GPU, even if the GPU is idle.
On multi-core systems this lets another instance of the CPU app, one that is running a non-VLAR task, do the re-schedule slightly later. So, total host performance should increase.
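
To illustrate the rule described above, here is a rough sketch. It is not the code of the mod itself; the function names and the (name, is_vlar) tuples are assumptions made up for this example.

```python
def cpu_app_may_reschedule_to_gpu(task_is_vlar, gpu_is_idle):
    """Hypothetical decision rule for one CPU app instance."""
    if task_is_vlar:
        # Behaviour described for V10b: a VLAR task stays on the CPU
        # even when the GPU has nothing to do.
        return False
    return gpu_is_idle


def pick_instance_to_reschedule(instances, gpu_is_idle):
    """Return the name of the first CPU instance allowed to move, or None.

    instances is a list of (instance_name, task_is_vlar) pairs.
    """
    for name, task_is_vlar in instances:
        if cpu_app_may_reschedule_to_gpu(task_is_vlar, gpu_is_idle):
            return name
    return None


# A dual-core host: one core holds a VLAR task, the other a normal task.
print(pick_instance_to_reschedule([("core0", True), ("core1", False)], True))  # core1
```

With the VLAR instance excluded, the idle GPU is left free for the instance that holds a non-VLAR task.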

Thanks to Bob Mahoney who inspired this mod update.
ID: 873332 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 873350 - Posted: 7 Mar 2009, 14:26:18 UTC - in response to Message 873332.  
Last modified: 7 Mar 2009, 14:26:51 UTC

I hope all the advanced users for whom these apps are intended remember the rules for app updates, but I'll repost them here once more anyway ;)

The recommended way to do any (not only this one) update of opt apps:

1) DISABLE network access in BOINC.
2) STOP BOINC.
3) Copy the whole BOINC data directory to another location as a backup.
4) Install the update.
5) Start BOINC.
6) If everything works fine, resume network access; otherwise, stop BOINC and restore the BOINC data folder from the backup.

If you don't, you risk trashing your whole cache and being blamed by a few hundred of your wingmen for that unfriendly action!
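
Steps 3 and 6 of the list above can be sketched in Python as follows. This is only an illustration: the function names are made up, the paths are whatever your own installation uses, and the other steps (network access, stopping and starting BOINC, installing the update) are still done by hand.

```python
import shutil
from pathlib import Path


def backup_data_dir(data_dir, backup_dir):
    """Step 3: copy the whole BOINC data directory to a backup location."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if backup_dir.exists():
        # Refuse to overwrite an earlier backup by accident.
        raise FileExistsError(f"backup target already exists: {backup_dir}")
    shutil.copytree(data_dir, backup_dir)
    return backup_dir


def restore_data_dir(data_dir, backup_dir):
    """Step 6, failure branch: replace the data directory with the backup copy."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"no backup found at: {backup_dir}")
    if data_dir.exists():
        shutil.rmtree(data_dir)  # drop the state left behind by the failed update
    shutil.copytree(backup_dir, data_dir)
    return data_dir
```

Call backup_data_dir only after BOINC has been stopped (step 2), and restore_data_dir only with BOINC stopped again, so that no files change while they are being copied.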
ID: 873350 · Report as offensive
Gnitter

Joined: 2 Jan 07
Posts: 26
Credit: 19,909,753
RAC: 0
Sweden
Message 873353 - Posted: 7 Mar 2009, 14:43:03 UTC - in response to Message 873332.  
Last modified: 7 Mar 2009, 14:44:28 UTC

Thank you so much for all the efforts from you guys who contribute to bringing CUDA crunching forward...

Just installed V10b on host, eagerly studying the WUs :)

For you multi-GPU guys who do not want to upgrade, please follow
Raistmer's advice in Message 871812 regarding cache and checkpoint. It improved the situation for V10a
greatly!

Edit:
Fixed badly formatted link to host
ID: 873353 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 873389 - Posted: 7 Mar 2009, 15:35:30 UTC
Last modified: 7 Mar 2009, 15:43:00 UTC

@ Raistmer

Is it possible to make a mod so that VLARs are crunched ONLY on the CPU instead of being 'killed'?


EDIT:
Or to make it possible that the VLARs are not reported as 'client errors'?
Keyword: worst case
ID: 873389 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 873413 - Posted: 7 Mar 2009, 16:19:40 UTC - in response to Message 873389.  

@ Raistmer

Is it possible to make a mod so that VLARs are crunched ONLY on the CPU instead of being 'killed'?

Keyword: worst case


Answered many times already: it would leave the GPU idle.
ID: 873413 · Report as offensive
gomeyer
Volunteer tester

Joined: 21 May 99
Posts: 488
Credit: 50,370,425
RAC: 0
United States
Message 873423 - Posted: 7 Mar 2009, 16:42:15 UTC

Advice for anyone still running v9: you must first update to v10a before applying this v10b update, since v10a contains other files needed to run. A link to v10a is at the beginning of this thread.

(Another reason to follow his advice in message 873350)
ID: 873423 · Report as offensive
elgar

Joined: 21 May 99
Posts: 69
Credit: 2,687,478
RAC: 0
United States
Message 873475 - Posted: 7 Mar 2009, 18:15:55 UTC - in response to Message 873423.  

I tried to use this optapp and ended up trashing all my work units, dangit. Had to back all the way out, removed boinc 6.4.5 and installed 6.4.7. Is there a step-by-stepper anywhere on how to get this version installed, by any chance? I'm sure with enough trial and error I could get it working but that seems pretty inefficient.
ID: 873475 · Report as offensive
Profile Edboard
Volunteer tester

Joined: 4 Jun 08
Posts: 9
Credit: 1,043,577
RAC: 0
Spain
Message 873522 - Posted: 7 Mar 2009, 20:31:24 UTC
Last modified: 7 Mar 2009, 20:32:05 UTC

I'm crunching three SETI CUDA WUs at a time on a PC with a GTX295 and an 8800GT, using Raistmer's opt. package v7, which uses MB_6.08_mod_VLAR_kill_CUDA.exe, and it works fine. I wonder whether I would get a better crunching rate with the newer ones (v10 and so on).
ID: 873522 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 873529 - Posted: 7 Mar 2009, 21:31:47 UTC
Last modified: 7 Mar 2009, 21:41:34 UTC

@ Raistmer

I meant handling VLARs without a 'client error':
Maybe it's possible to use another 'option' to 'kill' the VLARs?

For example:
Only delete the WU on the rig, so the deadline has to pass before it is sent to a 3rd PC [not very kind to 'wingmen', but safer for the project].
AFAIK this could happen 10 times (max. 12 copies of every WU)? Not 'only' 6 times..?

Or something else..

To reduce the possibility of the worst case..
ID: 873529 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 873544 - Posted: 7 Mar 2009, 22:51:24 UTC - in response to Message 873529.  

x64 SSSE3 & SSE3_Intel V10b updates added.
AMD users with x64 Windows should use SSE3_AMD x86 binary.
http://lunatics.kwsn.net/12-gpu-crunching/v10-of-modified-seti-mb-cuda-opt-ap-package-for-full-multi-gpucpu-use.msg15445.html#msg15445
ID: 873544 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 873634 - Posted: 8 Mar 2009, 4:04:18 UTC
Last modified: 8 Mar 2009, 4:13:25 UTC

I have just installed Vista x64 and V10 x64 apps (C2D with GF9500, 4GB). I don't get any more WU's...
The last messages of Boinc manager (latest version) are:
Fetching scheduler list
Master file download succeded

LE: Installed ver 10b and all's good. For now. Thanks.
Dang it, I got only AP v5.03 now... no nice CUDA for me I guess...
ID: 873634 · Report as offensive
Profile Gonad the Destroyer®©™
Joined: 6 Aug 99
Posts: 204
Credit: 12,463,705
RAC: 0
United States
Message 873724 - Posted: 8 Mar 2009, 13:51:38 UTC

Can the CUDA app be run with the cards in SLI yet, or does SLI still have to be disabled?
ID: 873724 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 873729 - Posted: 8 Mar 2009, 14:15:34 UTC - in response to Message 873724.  

Can the CUDA app be run with the cards in SLI yet, or does SLI still have to be disabled?

SLI has never prevented it running. It is a function of CUDA (go look at the nVidia site) that 2 cards in SLI function as 1 card (that is the purpose of SLI in the first place) so Boinc will "see" only one CUDA device. If you want to use the cards (or cores on a 295) independently then SLI must be disabled.

F.
ID: 873729 · Report as offensive
EPG

Joined: 3 Apr 99
Posts: 110
Credit: 10,416,543
RAC: 0
Hungary
Message 874350 - Posted: 10 Mar 2009, 16:26:59 UTC

Uff!

Raistmer, I would like to ask about the V8-V9 difference I mentioned in this post;
you said you would try to reproduce it. Any news on it?

Meanwhile I tried it with V10b and the problem still exists.

Thx!
ID: 874350 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 874562 - Posted: 11 Mar 2009, 8:12:40 UTC - in response to Message 874350.  

Uff!

Raistmer, I would like to ask about the V8-V9 difference I mentioned in this post;
you said you would try to reproduce it. Any news on it?

Meanwhile I tried it with V10b and the problem still exists.

Thx!

Neither of my CUDA-enabled hosts has encountered this situation.
ID: 874562 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 874654 - Posted: 11 Mar 2009, 17:18:16 UTC - in response to Message 874562.  
Last modified: 11 Mar 2009, 17:19:57 UTC

This update is ONLY for those who use BOINC solely for SETI MB crunching. No other projects, not even AstroPulse, ONLY SETI Enhanced (MultiBeam).
In this build CPU-GPU re-scheduling is disabled (it's not needed for a SETI MB-only configuration).
It should increase total performance for such configs.
Again, use it ONLY if you crunch SETI MB and nothing else.

And don't forget the RULES for updating opt apps:


1) DISABLE network access in BOINC.
2) STOP BOINC.
3) Copy the whole BOINC data directory to another location as a backup.
4) Install the update.
5) Start BOINC.
6) If everything works fine, resume network access; otherwise, stop BOINC and restore the BOINC data folder from the backup.
ID: 874654 · Report as offensive
Profile Toppie

Joined: 3 Apr 99
Posts: 31
Credit: 50,287,619
RAC: 0
South Africa
Message 875791 - Posted: 15 Mar 2009, 14:26:44 UTC - in response to Message 873729.  


SLI has never prevented it running. It is a function of CUDA (go look at the nVidia site) that 2 cards in SLI function as 1 card (that is the purpose of SLI in the first place) so Boinc will "see" only one CUDA device. If you want to use the cards (or cores on a 295) independently then SLI must be disabled.

F.


Umm...how does one disable the sli on a 295? Just bought one and not installed yet. Hope I needn't open the card.

Toppie.

ID: 875791 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 875792 - Posted: 15 Mar 2009, 14:29:46 UTC - in response to Message 875791.  


SLI has never prevented it running. It is a function of CUDA (go look at the nVidia site) that 2 cards in SLI function as 1 card (that is the purpose of SLI in the first place) so Boinc will "see" only one CUDA device. If you want to use the cards (or cores on a 295) independently then SLI must be disabled.

F.


Umm...how does one disable the sli on a 295? Just bought one and not installed yet. Hope I needn't open the card.

Toppie.

In the driver settings. No need to open the card.
ID: 875792 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 875983 - Posted: 15 Mar 2009, 23:35:43 UTC - in response to Message 875791.  


SLI has never prevented it running. It is a function of CUDA (go look at the nVidia site) that 2 cards in SLI function as 1 card (that is the purpose of SLI in the first place) so Boinc will "see" only one CUDA device. If you want to use the cards (or cores on a 295) independently then SLI must be disabled.

F.


Umm...how does one disable the sli on a 295? Just bought one and not installed yet. Hope I needn't open the card.

Toppie.

It's one of the options in the NVidia control panel. That is where you will need to use the controls to increase the fan speed too. The default setting of 45% is just not enough for crunching. I have to have it set at 75% to keep the temps around 70 deg C.

F.
ID: 875983 · Report as offensive