V10 of modified SETI MB CUDA + opt AP package for full multi-GPU+CPU use

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 886053 - Posted: 17 Apr 2009, 7:31:52 UTC - in response to Message 885851.  

A build without the VLAR kill mod has been added to the same post. It should load faster (use less CPU time) than the current stock app while doing every kind of work stock does.
http://lunatics.kwsn.net/12-gpu-crunching/v10-of-modified-seti-mb-cuda-opt-ap-package-for-full-multi-gpucpu-use.msg16715.html#msg16715
(Remember that VLAR and VHAR tasks use the GPU less effectively than midrange tasks; a complete data chart is in progress.)
ID: 886053
Voyager
Volunteer tester
Joined: 2 Nov 99
Posts: 602
Credit: 3,264,813
RAC: 0
United States
Message 886095 - Posted: 17 Apr 2009, 21:14:19 UTC
Last modified: 17 Apr 2009, 21:23:15 UTC

I just noticed my GPU temperature dropped quite a bit. Task Manager shows it's running, but in BOINC Manager the times went from ~14 min to 30 min, and now one task is running at about 2 hrs. I'll reboot now and see what happens.
Edit: after the reboot the temperature went back to normal, and the processing time looks regular again.
ID: 886095
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 886101 - Posted: 17 Apr 2009, 21:46:51 UTC - in response to Message 886095.  

ID: 886101
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 887306 - Posted: 22 Apr 2009, 15:51:46 UTC - in response to Message 886101.  

Bump: keeping this up so CUDA users can find it.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 887306
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 887319 - Posted: 22 Apr 2009, 16:34:48 UTC - in response to Message 885265.  


http://setiathome.berkeley.edu/result.php?resultid=1195495666
icfft=86040, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error

http://setiathome.berkeley.edu/result.php?resultid=1190297258
icfft=94365, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error

http://setiathome.berkeley.edu/result.php?resultid=1195160891
icfft=86665, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error

http://setiathome.berkeley.edu/result.php?resultid=1202303914
icfft=84509, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error

http://setiathome.berkeley.edu/result.php?resultid=1202338717
icfft=98384, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error


Two more:

http://setiathome.berkeley.edu/result.php?resultid=1205398993
icfft=84915, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error

http://setiathome.berkeley.edu/result.php?resultid=1205430734
icfft=86092, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error


---------------------------------------------------------------

If you would like to help Raistmer, please post your '-12 Unknown error' results as well. *thumb up*

It could look like this:
Exception detected inside cudaAcc_find_triplets, dumping client state
icfft=98384, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error
cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel
File: ..\analyzePoT.cpp
Line: 348


Only the line that starts with icfft= is needed.
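
For anyone collecting these by hand, here is a minimal Python sketch that pulls the icfft line out of a pasted stderr text. The field names come from the examples in this thread; the exact line format and the function name extract_state_lines are assumptions, not anything from the app itself.

```python
import re

# Pattern for the state line printed before the "-12" abort.
# Field names are copied from the examples above; the format is assumed.
STATE_LINE = re.compile(
    r"icfft=(?P<icfft>\d+),\s*"
    r"PoT_activity=(?P<activity>-?\d+),\s*"
    r"PoT_freq_bin=(?P<freq_bin>-?\d+)"
)


def extract_state_lines(stderr_text):
    """Return a list of (icfft, PoT_activity, PoT_freq_bin) tuples."""
    return [
        (int(m["icfft"]), int(m["activity"]), int(m["freq_bin"]))
        for m in STATE_LINE.finditer(stderr_text)
    ]


sample = """Exception detected inside cudaAcc_find_triplets, dumping client state
icfft=98384, PoT_activity=0, PoT_freq_bin=-1SETI@home error -12 Unknown error
cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel
"""

print(extract_state_lines(sample))  # [(98384, 0, -1)]
```

The pattern stops at the last digit of PoT_freq_bin, so it still works when the error text is glued directly onto the number, as it is in the outputs above.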


ID: 887319
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 887520 - Posted: 23 Apr 2009, 8:48:14 UTC


I'm curious.

I had a bunch of VLAR 'killed' WUs (52), and BOINC uploaded them as usual.

Then BOINC showed a countdown of about 24 hours in the project overview (BOINC V6.4.7).
Why?


BTW, why do the 'killed' VLARs need to be uploaded?

ID: 887520
Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,440,287
RAC: 0
Australia
Message 887524 - Posted: 23 Apr 2009, 9:08:51 UTC - in response to Message 887520.  
Last modified: 23 Apr 2009, 9:11:32 UTC

The answer is in This thread

BTW answer: to let the server know that you are aborting them; otherwise they would have to time out (no response).

Edited to point to the correct thread.
ID: 887524
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 888339 - Posted: 25 Apr 2009, 21:17:24 UTC
Last modified: 25 Apr 2009, 21:18:56 UTC


http://setiathome.berkeley.edu/forum_thread.php?id=52212&nowrap=true#884686

Regarding my 'Out Of Memory' errors:

Some examples:
http://setiathome.berkeley.edu/result.php?resultid=1206226503
http://setiathome.berkeley.edu/result.php?resultid=1211721515
http://setiathome.berkeley.edu/result.php?resultid=1212504106


All errors look like this:
-------------------------------------------------------
WU true angle range is : x.xxxxxx


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception) (0xe06d7363)* at address 0x7C81EB33

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.3.22
-------------------------------------------------------



My guess now is that it's not a RAM error; it looks like a compiler error.
A German page:
* http://support.microsoft.com/kb/185294/de

Or not?

ID: 888339
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 888353 - Posted: 25 Apr 2009, 22:30:45 UTC - in response to Message 888339.  

Compiler error? I don't think so.
It may be worth dumping the available system physical memory in addition to GPU memory. I will think about it.
For now, this kind of out-of-memory behavior could be triggered by the buggy stock 6.03 on beta (if I understood Pappa's report correctly).
ID: 888353
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 888363 - Posted: 25 Apr 2009, 22:56:49 UTC


Couldn't this '0x7C81EB33' be the address of a damaged spot in RAM?

As I said, the error output is the same every time.

ID: 888363
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 888369 - Posted: 25 Apr 2009, 23:26:25 UTC - in response to Message 888339.  

"at address 0x7C81EB33"
Is this address the same every time? The same exception code means the same error: out of memory.
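
As a side note on reading that record: 0xe06d7363 is the generic code the Microsoft Visual C++ runtime uses for any thrown C++ exception, and its low three bytes spell "msc" in ASCII. The "at address" value is the code address where the exception was raised, not a location in physical RAM. For a thrown C++ exception that is typically the same system routine every time, so an identical address from run to run is not surprising. A small Python sketch of the decoding (the helper name describe_exception_code is made up for illustration):

```python
def describe_exception_code(code):
    """Split a 32-bit exception code into its top byte and 3-byte ASCII tail."""
    top = (code >> 24) & 0xFF
    tail = (code & 0xFFFFFF).to_bytes(3, "big").decode("ascii", errors="replace")
    return top, tail


# The code from the error output in this thread.
print(describe_exception_code(0xE06D7363))  # (224, 'msc')
```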

ID: 888369
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 888417 - Posted: 26 Apr 2009, 6:43:10 UTC


Yes, the output is the same every time:

------------------------------------------------
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x7C81EB33

Engaging BOINC Windows Runtime Debugger...

------------------------------------------------


I'm confused. Am I the only one who sees this error in their own overview? ;-)


ID: 888417
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 888626 - Posted: 26 Apr 2009, 23:37:51 UTC


It looks like I get one Out Of Memory error per day.

http://setiathome.berkeley.edu/result.php?resultid=1213158748


The problem has been noted, so no more URLs are needed?


Maybe it's time for other members who have also seen these errors to post?

It can't be a problem with my rig, can it? Can we exclude that possibility?

ID: 888626
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 888678 - Posted: 27 Apr 2009, 2:56:55 UTC - in response to Message 888626.  



It can't be a problem with my rig, can it? Can we exclude that possibility?

No, we can't.
Check the "Performance" tab in Task Manager from time to time: how much memory is committed (Commit Charge (K) Total/Peak in WinXP), and whether these values grow over time.
It looks like there is a memory leak somewhere.
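
If you write the Commit Charge Total down every so often, a rough check like the Python sketch below can tell you whether the values only ever go up. This is a heuristic under stated assumptions: the function name looks_like_leak and both thresholds are hypothetical, and the sample numbers are made up for illustration, not measured on any rig in this thread.

```python
def looks_like_leak(samples_kb, min_samples=4, min_growth_kb=50_000):
    """Heuristic: True if the samples never decrease and total growth is large.

    samples_kb    -- Commit Charge Total readings in KB, oldest first
    min_samples   -- hypothetical minimum number of readings to judge
    min_growth_kb -- hypothetical growth threshold between first and last
    """
    if len(samples_kb) < min_samples:
        return False
    never_decreases = all(b >= a for a, b in zip(samples_kb, samples_kb[1:]))
    growth = samples_kb[-1] - samples_kb[0]
    return never_decreases and growth >= min_growth_kb


# Made-up readings, one per hour.
steady_growth = [812_000, 845_000, 901_000, 960_000]
normal_wobble = [812_000, 845_000, 801_000, 815_000]

print(looks_like_leak(steady_growth))  # True
print(looks_like_leak(normal_wobble))  # False
```

A steady climb is only a hint, since other programs on the machine also add to the commit charge, so it helps to take the readings while nothing else is starting or stopping.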
ID: 888678
Paul D Harris
Volunteer tester
Joined: 1 Dec 99
Posts: 1122
Credit: 33,600,005
RAC: 0
United States
Message 889278 - Posted: 28 Apr 2009, 23:08:53 UTC

How do I get CUDA WUs?
Here is my BOINC message log:


4/28/2009 7:04:41 PM Starting BOINC client version 6.6.20 for windows_intelx86
4/28/2009 7:04:41 PM log flags: task, file_xfer, sched_ops
4/28/2009 7:04:41 PM Libraries: libcurl/7.19.4 OpenSSL/0.9.8j zlib/1.2.3
4/28/2009 7:04:41 PM Data directory: C:\Documents and Settings\All Users\Application Data\BOINC
4/28/2009 7:04:41 PM Running under account Paul Harris
4/28/2009 7:04:41 PM SETI@home Found app_info.xml; using anonymous platform
4/28/2009 7:04:41 PM Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [x86 Family 6 Model 26 Stepping 4]
4/28/2009 7:04:41 PM Processor features: fpu tsc pae nx sse sse2 mmx
4/28/2009 7:04:41 PM OS: Microsoft Windows XP: Professional x86 Editon, Service Pack 3, (05.01.2600.00)
4/28/2009 7:04:41 PM Memory: 2.99 GB physical, 4.83 GB virtual
4/28/2009 7:04:41 PM Disk: 232.88 GB total, 222.33 GB free
4/28/2009 7:04:41 PM Local time is UTC -4 hours
4/28/2009 7:04:41 PM CUDA device: GeForce 9500 GT (driver version 18206, CUDA version 1.1, 1024MB, est. 16GFLOPS)
4/28/2009 7:04:41 PM Not using a proxy
4/28/2009 7:04:41 PM SETI@home URL: http://setiathome.berkeley.edu/; Computer ID: 4894451; location: home; project prefs: default
4/28/2009 7:04:41 PM SETI@home General prefs: from SETI@home (last modified 04-Feb-2009 23:14:44)
4/28/2009 7:04:41 PM SETI@home Computer location: home
4/28/2009 7:04:41 PM SETI@home General prefs: no separate prefs for home; using your defaults
4/28/2009 7:04:41 PM Preferences limit memory usage when active to 1531.02MB
4/28/2009 7:04:41 PM Preferences limit memory usage when idle to 2755.84MB
4/28/2009 7:04:41 PM Preferences limit disk usage to 100.00GB
4/28/2009 7:04:41 PM Preferences limit # CPUs to 9
4/28/2009 7:04:41 PM SETI@home Restarting task ap_13mr09ac_B0_P0_00091_20090427_00642.wu_0 using astropulse_v5 version 503
4/28/2009 7:04:41 PM SETI@home Restarting task ap_13mr09ac_B1_P1_00117_20090427_02248.wu_0 using astropulse_v5 version 503
4/28/2009 7:04:41 PM SETI@home Restarting task ap_13mr09ac_B1_P0_00366_20090427_00914.wu_0 using astropulse_v5 version 503
4/28/2009 7:04:41 PM SETI@home Restarting task ap_13mr09aa_B5_P0_00101_20090427_31302.wu_2 using astropulse_v5 version 503
4/28/2009 7:04:41 PM SETI@home Restarting task ap_13mr09ac_B0_P1_00391_20090427_01114.wu_0 using astropulse_v5 version 503
4/28/2009 7:04:41 PM SETI@home Restarting task ap_02ja09ab_B5_P0_00190_20090313_15480.wu_4 using astropulse_v5 version 503
4/28/2009 7:04:42 PM SETI@home Restarting task ap_13mr09ac_B3_P0_00053_20090427_09698.wu_0 using astropulse_v5 version 503
4/28/2009 7:04:42 PM SETI@home Restarting task ap_21fe09ac_B3_P1_00277_20090328_04805.wu_2 using astropulse_v5 version 503
4/28/2009 7:04:42 PM SETI@home Sending scheduler request: To fetch work.
4/28/2009 7:04:42 PM SETI@home Requesting new tasks
4/28/2009 7:04:52 PM SETI@home Scheduler request completed: got 0 new tasks
4/28/2009 7:04:52 PM SETI@home Message from server: No work sent
4/28/2009 7:04:52 PM SETI@home Message from server: No work is available for Astropulse
4/28/2009 7:05:07 PM SETI@home Sending scheduler request: To fetch work.
4/28/2009 7:05:07 PM SETI@home Requesting new tasks
4/28/2009 7:05:17 PM SETI@home Scheduler request completed: got 1 new tasks
4/28/2009 7:05:19 PM SETI@home Started download of ap_13mr09ad_B4_P0_00076_20090428_23576.wu
4/28/2009 7:05:31 PM SETI@home Temporarily failed download of ap_13mr09ad_B4_P0_00076_20090428_23576.wu: HTTP error
4/28/2009 7:05:31 PM SETI@home Backing off 1 min 0 sec on download of ap_13mr09ad_B4_P0_00076_20090428_23576.wu
4/28/2009 7:06:31 PM SETI@home [error] File ap_13mr09ad_B4_P0_00076_20090428_23576.wu has wrong size: expected 8392046, got 0
4/28/2009 7:06:31 PM SETI@home Started download of ap_13mr09ad_B4_P0_00076_20090428_23576.wu
ID: 889278
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 889708 - Posted: 30 Apr 2009, 1:28:39 UTC
Last modified: 30 Apr 2009, 1:29:44 UTC


@ PaulDHarris

It's not easy to tell.

But I see you have a cc_config.xml set to 9 CPUs?
This is no longer needed with BOINC V6.6.20.


Does your app_info.xml look similar to this one?

app_info for AP500, AP503, MB603 and MB608
http://setiathome.berkeley.edu/forum_thread.php?id=52589

ID: 889708
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 889711 - Posted: 30 Apr 2009, 1:35:12 UTC


@ Raistmer

I've now seen some 'funny' things. ;-)

My out-of-memory errors are 'normally' this error:
SETI@home error -12 Unknown error
cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel
File: c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu
Line: 236



I say this because I compared three 'out of memory' results from my rig with the results from other CUDA rigs:

http://setiathome.berkeley.edu/workunit.php?wuid=436194406
http://setiathome.berkeley.edu/workunit.php?wuid=437243195
http://setiathome.berkeley.edu/workunit.php?wuid=436422417

On the CPU rigs the WUs ran well.

ID: 889711
Paul D Harris
Volunteer tester
Joined: 1 Dec 99
Posts: 1122
Credit: 33,600,005
RAC: 0
United States
Message 889768 - Posted: 30 Apr 2009, 4:29:19 UTC - in response to Message 889708.  
Last modified: 30 Apr 2009, 5:08:58 UTC


@ PaulDHarris

It's not easy to tell.

But I see you have a cc_config.xml set to 9 CPUs?
This is no longer needed with BOINC V6.6.20.


Does your app_info.xml look similar to this one?

app_info for AP500, AP503, MB603 and MB608
http://setiathome.berkeley.edu/forum_thread.php?id=52589

@Sutaru Tsureku
OK, thanks, I will remove it.

I did a CUDA WU yesterday. I do not seem to get much CUDA work unless I run the standard app, and when I run Raistmer's V10 app I do not get any CUDA work at all. Is it the splitters?

My message log is now:

4/30/2009 12:32:12 AM Starting BOINC client version 6.6.20 for windows_intelx86
4/30/2009 12:32:12 AM log flags: task, file_xfer, sched_ops
4/30/2009 12:32:12 AM Libraries: libcurl/7.19.4 OpenSSL/0.9.8j zlib/1.2.3
4/30/2009 12:32:12 AM Data directory: C:\Documents and Settings\All Users\Application Data\BOINC
4/30/2009 12:32:12 AM Running under account Paul Harris
4/30/2009 12:32:12 AM SETI@home Found app_info.xml; using anonymous platform
4/30/2009 12:32:12 AM Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [x86 Family 6 Model 26 Stepping 4]
4/30/2009 12:32:12 AM Processor features: fpu tsc pae nx sse sse2 mmx
4/30/2009 12:32:12 AM OS: Microsoft Windows XP: Professional x86 Editon, Service Pack 3, (05.01.2600.00)
4/30/2009 12:32:12 AM Memory: 2.99 GB physical, 4.83 GB virtual
4/30/2009 12:32:12 AM Disk: 232.88 GB total, 222.96 GB free
4/30/2009 12:32:12 AM Local time is UTC -4 hours
4/30/2009 12:32:13 AM CUDA device: GeForce 9500 GT (driver version 18206, CUDA version 1.1, 1024MB, est. 16GFLOPS)
4/30/2009 12:32:13 AM Not using a proxy
4/30/2009 12:32:13 AM SETI@home URL: http://setiathome.berkeley.edu/; Computer ID: 4894451; location: home; project prefs: default
4/30/2009 12:32:13 AM SETI@home General prefs: from SETI@home (last modified 04-Feb-2009 23:14:44)
4/30/2009 12:32:13 AM SETI@home Computer location: home
4/30/2009 12:32:13 AM SETI@home General prefs: no separate prefs for home; using your defaults
4/30/2009 12:32:13 AM Preferences limit memory usage when active to 1531.02MB
4/30/2009 12:32:13 AM Preferences limit memory usage when idle to 2755.84MB
4/30/2009 12:32:13 AM Preferences limit disk usage to 100.00GB
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B2_P0_00202_20090429_18646.wu_1 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B2_P0_00297_20090429_18646.wu_0 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_06fe09ac_B2_P1_00296_20090330_25585.wu_2 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B2_P1_00342_20090429_16396.wu_0 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B4_P0_00129_20090429_26870.wu_1 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B4_P0_00155_20090429_26870.wu_0 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B4_P0_00189_20090429_26870.wu_1 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B5_P0_00215_20090429_24630.wu_1 using astropulse_v5 version 503
4/30/2009 12:34:13 AM SETI@home Sending scheduler request: To fetch work.
4/30/2009 12:34:13 AM SETI@home Requesting new tasks
4/30/2009 12:34:18 AM SETI@home Scheduler request completed: got 0 new tasks
4/30/2009 12:34:18 AM SETI@home Message from server: No work sent
4/30/2009 12:34:18 AM SETI@home Message from server: No work is available for Astropulse

PS
I also changed the app_info.xml file to the one listed in the link you supplied.
ID: 889768
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 889773 - Posted: 30 Apr 2009, 4:49:18 UTC - in response to Message 889711.  


@ Raistmer

I've now seen some 'funny' things. ;-)

My out-of-memory errors are 'normally' this error:
SETI@home error -12 Unknown error
cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel
File: c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_pulsefind.cu
Line: 236



I say this because I compared three 'out of memory' results from my rig with the results from other CUDA rigs:

http://setiathome.berkeley.edu/workunit.php?wuid=436194406
http://setiathome.berkeley.edu/workunit.php?wuid=437243195
http://setiathome.berkeley.edu/workunit.php?wuid=436422417

On the CPU rigs the WUs ran well.


Very interesting indeed.
It has at least two consequences:
1) The dumping mod does not work very well: under some conditions it leads to another error instead of dumping useful app state.
2) Tasks with this "out of memory" error could not be processed by CUDA MB (stock included) anyway, so this problem with the dumping code is not critical.

Still, invoking the BOINC debugger code is more costly than just aborting the task with a "-12" error. I will do a new build soon.
ID: 889773
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 889775 - Posted: 30 Apr 2009, 4:54:54 UTC - in response to Message 889768.  

I did a CUDA WU yesterday. I do not seem to get much CUDA work unless I run the standard app, and when I run Raistmer's V10 app I do not get any CUDA work at all. Is it the splitters?

Yes, "it's the splitters". BTW, your log shows a work request only for AstroPulse.
Without the "team" modded CPU apps, my CUDA MB build behaves exactly the same as the stock one with regard to work fetch, simply because work fetch is a function of the BOINC client and not of the science app itself.
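
To see at a glance which applications a BOINC message log mentions, something like the following Python sketch can help. It assumes the line formats seen in the logs posted in this thread ("Restarting task ... using ... version ..." and "Message from server: ..."); the function names are made up for illustration, and other BOINC versions may word these lines differently.

```python
import re
from collections import Counter

# Line formats are taken from the logs posted above; treat them as assumptions.
TASK_LINE = re.compile(r"Restarting task \S+ using (?P<app>\S+) version (?P<version>\d+)")
SERVER_LINE = re.compile(r"Message from server: (?P<text>.+)")


def apps_in_log(log_text):
    """Count restarted tasks per application name."""
    return Counter(m["app"] for m in TASK_LINE.finditer(log_text))


def server_messages(log_text):
    """Return the server messages in the order they appear."""
    return [m["text"].strip() for m in SERVER_LINE.finditer(log_text)]


log = """4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B2_P0_00202_20090429_18646.wu_1 using astropulse_v5 version 503
4/30/2009 12:32:13 AM SETI@home Restarting task ap_14mr09aa_B2_P0_00297_20090429_18646.wu_0 using astropulse_v5 version 503
4/30/2009 12:34:18 AM SETI@home Message from server: No work sent
4/30/2009 12:34:18 AM SETI@home Message from server: No work is available for Astropulse
"""

print(apps_in_log(log))      # Counter({'astropulse_v5': 2})
print(server_messages(log))  # ['No work sent', 'No work is available for Astropulse']
```

If only astropulse_v5 shows up among the tasks and the server only mentions Astropulse, the client is not asking for (or not being offered) MB work, which matches what the log above shows.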
ID: 889775