Modified SETI MB CUDA + opt AP package for full GPU utilization
Author | Message |
---|---|
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Raistmer, is the new BOINC version expected to be incompatible with all previous builds? I think not. So I see no reason not to use the same science apps with the new BOINC version. What GPU-related changes are planned in the new BOINC? A better scheduler? |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"What GPU-related changes are planned in new BOINC? Better scheduler?" It's got separate work-fetch modules for CPU and GPU, as well as a new client-side CPU scheduler. Amongst other things, at least. Also a very nasty bug where, when you have a CUDA card and set the project to no new tasks, it'll send you work anyway. ;-) |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
There is a new package version available: http://lunatics.kwsn.net/gpu-crunching/modified-seti-mb-cuda-opt-ap-package-for-full-gpu-utilize.msg12177.html#msg12177 http://lunatics.kwsn.net/gpu-crunching/modified-seti-mb-cuda-opt-ap-package-for-full-gpu-utilize.msg13120.html#msg13120 or (if you can't download from Lunatics) http://files.mail.ru/K2AFN0 It contains the same CUDA app (equivalent to stock 6.08) and the latest SSE3 opt AP release (r103), which has increased performance. |
ghost17 Joined: 15 Jun 04 Posts: 2 Credit: 3,590 RAC: 0 |
First of all, thanks for the package. I tried to run SETI CUDA on my laptop, which ran GPUGRID without any problems. The first thing that went wrong was yesterday's validation error, which resulted in a loop of client errors; that, by the way, should be fixed in my view. 90 client errors in 2 hours should stop BOINC from downloading anything more, but okay, that isn't a point to discuss here. Today, after my quota was re-enabled, I installed this package and everything worked fine for... 19 seconds. After that it aborts with a compute error. The graphics card is an nVidia 9500M GS with 512 MB and the newest notebook drivers from nVidia itself. BTW: the standard CUDA application works, BUT I can't use my computer then because it slows down everything, from moving windows to showing anything on the screen. That's a little bit crappy :-/ |
Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
I have the new modified + AP app installed. I had an AP already running and it made the transition just fine, as I'm sure was expected. I haven't had a chance to run the modified CUDA app yet; it will be a few hours before I finish the tasks for Beta that I'm running. What is to be expected as far as a performance increase from the new AP app? |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"What is to be expected as far as a performance increase from the new AP app?" Well, as Jason already wrote in the AP-dedicated thread, there are a few changes in the new AP release. They can be classified into 3 categories: 1) Attempts to work around BOINC's instant-termination bug/feature (a 0-byte state file when the app is terminated during a checkpoint). Unfortunately, we can only try to reduce the probability of this bug manifesting. As long as BOINC kills the science app's process without mercy, this bug is still possible. 2) Overall performance improvements, which will speed up processing of any AP task. 3) More effective handling of tasks with overflows. Some unneeded computations for these tasks are now skipped. This mod can give a very nice performance boost for some tasks, but the size of the speed increase is data-dependent and differs from task to task. |
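Editor's sketch for point 1: a zero-length state file is the typical result of a process being killed while it rewrites its checkpoint in place. One common mitigation, shown here in Python purely as an illustration and not as the actual AP r103 code, is to write the new state to a temporary file and then rename it over the old one. The function names and the JSON state format are assumptions made for this example.

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Write `state` to `path` without ever truncating the existing file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # The old checkpoint stays intact until the new one is complete.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the partial temporary file, then re-raise the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_checkpoint(path):
    """Load the last complete checkpoint."""
    with open(path) as f:
        return json.load(f)
```

If the process is killed before `os.replace` runs, the previous checkpoint is still the one on disk, so a restart loses at most the work done since that checkpoint. This reduces the window for the problem but, as the post says, cannot remove it entirely.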
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"At first thanks for the package." I looked at one of your results: http://setiathome.berkeley.edu/result.php?resultid=1129701614 This error can be attributed to driver incompatibility. Unfortunately, my build works only with driver version 180.48 and higher (I don't know why; it's just an experimental fact). If you can't install an 18x.xx driver on your notebook, you could try some other (maybe beta) driver. If that doesn't help, you can either stop using the CUDA app on this host or modify app_info.xml to use stock 6.08 along with opt AP r103. The stock app seems to have no driver limit. |
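Editor's sketch: the driver requirement above comes down to a version comparison. The snippet below is a minimal illustration in Python, assuming driver versions are plain "major.minor" strings such as 181.20; the function names are hypothetical and nothing here is part of the package itself.

```python
MIN_DRIVER = "180.48"  # minimum stated in the post above for the modified build


def parse_driver_version(text):
    """Turn a version string such as '181.20' into a tuple of ints: (181, 20)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supported(installed, minimum=MIN_DRIVER):
    """Return True if the installed driver is at or above the minimum."""
    return parse_driver_version(installed) >= parse_driver_version(minimum)


# Example values only: one older driver, the minimum itself, one newer driver.
for version in ("179.28", "180.48", "181.20"):
    status = "ok" if driver_supported(version) else "too old for the modified build"
    print(version, status)
```

Comparing tuples of integers avoids the trap of comparing the strings directly, which would give wrong answers once the number of digits differs.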
ghost17 Joined: 15 Jun 04 Posts: 2 Credit: 3,590 RAC: 0 |
OK, I'll try that later. I'm running on 179.28 notebook drivers. |
Dawn Raison Joined: 6 Jun 99 Posts: 3 Credit: 4,317,068 RAC: 0 |
So just what do I need to do to get a cache of CUDA workunits loaded? I've had two batches sent so far, each of about 15 or so; they crunched fine and went on to become the canonical result in a few cases. However, unless I go into BOINC and fiddle (e.g. suspend tasks), I can't get any CUDA WUs sent; instead I keep getting loads of Astropulse WUs. Surely I can have 3/4 Astropulses and Seti CUDA at the same time? Rgds. DavidR. Boinc 6.4.5, Astropulse v5.00, Seti CUDA 6.0.8 Card: 9800GTX+ oc Driver: v181.20, PC: XP x64, 8GiB ECC RAM, 790FX, Phenom II X4 940 Also running ClimatePrediction, Spinhenge |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
1) Try the opt package described in this thread (you listed stock apps, which are not the subject of this thread). 2) Try increasing the SETI project share. 3) Try increasing the cache size. |
Morten Ross Joined: 30 Apr 01 Posts: 183 Credit: 385,664,915 RAC: 0 |
Hi, I have now been monitoring for some time and still do not see any change in the following behaviour: CUDA crunching works like a charm for maybe one or two hours, then the WU simply stops progressing and the CUDA app consumes a full CPU. Closing BOINC and ensuring all executables are gone from the process list (sometimes I must kill boinc.exe, as processes are not released), then restarting BOINC, sometimes resolves it and the WU is completed OK. Other times more than one restart of BOINC is needed before the WU is processed. What happens next is one of two things: either the next WU is processed normally, or it is processed for a few seconds and then completed with success(!). The only option then is to restart the computer, which starts the cycle once again. OS is Windows Server 2008 x64, BOINC 6.6.0 x64. Video drivers tested: 185.20, which I downgraded to 181.22 as part of the ap_5.00r103_SSE3 and MB_6.08_mod_VLAR_kill_CUDA upgrade. To no avail, it seems. Below are the WUs from the abovementioned scenario: 1129308489 1129308493 1129308763 Perhaps this is an x64 issue? The following hosts are 32-bit and are seemingly working very well: 2650128 4242105 Morten Ross |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Hi. Well, most probably it is an x64 issue indeed. I have a second CUDA-enabled host with dual GPU cards (8500GT + 9400GT). OS: Windows Server 2003 x64. It manifested the same behavior. There was a memory leak (my build clearly demonstrated that, because it dumps the available GPU memory at the beginning of each task). 1-2 tasks on the 8500 and a few more on the 9400 could be completed (though they usually ended with a computational error), and after that the app fell back to CPU processing due to the low GPU memory condition. Only an OS restart helped. But when I upgraded to the 181.20 x64 drivers, this problem disappeared. Now it seems this host can crunch OK on both GPUs. Link to host on beta: http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=18439 |
Morten Ross Joined: 30 Apr 01 Posts: 183 Credit: 385,664,915 RAC: 0 |
... Additional note: boincmgr.exe is also always using 100% CPU when the abovementioned scenario occurs, so it seems the interaction between CUDA and boincmgr is a problem. Morten Ross |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"... Additional note: boincmgr.exe is also always using 100% CPU when the abovementioned scenario occurs, so it seems the interaction between CUDA and boincmgr is a problem." I haven't tried BOINC 6.6.0; I use 6.4.5. Maybe you could downgrade to it too, to rule BOINC out of this "equation"? |
Morten Ross Joined: 30 Apr 01 Posts: 183 Credit: 385,664,915 RAC: 0 |
I will try a downgrade and see what happens, although that might reintroduce other issues fixed in 6.6.0. There is also the difference in OS versions: you have Windows 2003 and I've got Windows 2008. Morten Ross |
Morten Ross Joined: 30 Apr 01 Posts: 183 Credit: 385,664,915 RAC: 0 |
Hi, I did a downgrade, but unfortunately that did not make any difference: the WU still hangs after some time of crunching. There is one change: boincmgr.exe is not using 100% CPU when this happens, and all WUs are released properly when I exit BOINC Manager. Morten Ross |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Well, some positive progress at last :) What happens if you use the 181.20 driver now? |
Morten Ross Joined: 30 Apr 01 Posts: 183 Credit: 385,664,915 RAC: 0 |
Hi, I have downgraded from 181.22 to 181.20 (from file version 7.15.11.8122 to 7.15.11.8120), but there is no change: boincmgr must be restarted for the hanging WU to be processed. The WU hangs always occur at the very beginning of the processing, after about 17 seconds (CPU time, not elapsed time) of processing, and between 0,000% and 0,500% progress. Morten Ross |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Well, it seems your host is a good candidate for using Maik's script. It handles such hang situations, AFAIK. Look for it here or on the Lunatics site. |
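Editor's sketch: the hang described above (time keeps passing while progress stays stuck below 0.5%) can be detected by a simple stall check. The Python below only illustrates that idea; it is not Maik's script, and the function name, thresholds and sample format are assumptions made for this example. How the progress samples are collected from BOINC is deliberately left out.

```python
def is_stalled(samples, min_elapsed_seconds=60.0, min_progress_delta=0.001):
    """Decide whether a task looks hung.

    `samples` is a list of (elapsed_seconds, fraction_done) pairs, oldest
    first. The task counts as stalled when at least `min_elapsed_seconds`
    have passed between the first and last sample while progress moved by
    less than `min_progress_delta`.
    """
    if len(samples) < 2:
        return False  # not enough history to judge
    first_time, first_progress = samples[0]
    last_time, last_progress = samples[-1]
    time_spent = last_time - first_time
    progress_made = last_progress - first_progress
    return time_spent >= min_elapsed_seconds and progress_made < min_progress_delta


# A task stuck at 0.2% for two minutes versus one that keeps moving.
print(is_stalled([(0, 0.002), (60, 0.002), (120, 0.002)]))
print(is_stalled([(0, 0.000), (60, 0.020), (120, 0.050)]))
```

A watchdog built on a check like this would still have to decide what to do with a stalled task, for example restarting BOINC as described in the posts above.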
Joined: 3 Apr 99 Posts: 104 Credit: 4,382,041 RAC: 2 |
"Well, some positive progress at last :)" @ Raistmer: Well, your V7 optimized "MB_6.08-mod_kill_VLAR_CUDA.exe" client is working well for me, using WinXP Pro SP3 32-bit with nVidia 181.20 drivers on my single 8800GT/512MB (G92 GPU) video card under BOINC 6.6.0 on the S@H main (non-Beta) project. My tasks are listed here: Tasks for User ID 7782289. Aside: as I reported here (Boinc 6.6.2 just released), my two attempts to use Boinc 6.6.2 with your V7 CUDA client gave me Boinc Manager problems, but that's for others to work out (I guess?). I haven't tried 6.6.2 with the stock Berkeley 6.08-Cuda client. I was really hoping 6.6.2 would work, because it reportedly should allow concurrent GPU MB crunching and CPU MB/AP crunching. Have you tried Boinc 6.6.2 yet? I am wondering whether the present VLAR kill feature in your V7 application is actually necessary, because when I was earlier running Boinc 6.4.5 with stock Berkeley 6.08-Cuda and nVidia 180.60 drivers, that stock 6.08-Cuda would successfully crunch MB WUs that had a very low angle range (VLAR). (Aside: as so many others reported, VLAR WUs crunched with stock Berkeley 6.06-Cuda and 6.07-Cuda would produce compute errors.) Another question: when your V7 application detects that a WU's angle range is below your VLAR limit, V7 terminates that WU with exit code -6, so the WU gets listed as a Client Error/Compute Error. For example, see: http://setiathome.berkeley.edu/result.php?resultid=1130457910 Could you instead terminate VLAR WUs with an error code that reports them as "Client Aborted"? That would help differentiate such WUs from WUs that terminate with error code -9 overflow (e.g. #spikes=30, or #spikes=15+#peaks=15). Just a thought.... Sabertooth Z77, i7-3770K@4.2GHz, GTX680, W8.1Pro x64 P5N32-E SLI, C2D E8400@3Ghz, GTX580, Win7SP1Pro x64 & PCLinuxOS2015 x64 |
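Editor's sketch: until the exit codes are reported differently, the two cases can be told apart on the user's side by the code itself. The Python below is a minimal illustration that uses only the two codes mentioned in the post (-6 for a VLAR kill, -9 for an overflow); the labels and function names are made up for this example and are not BOINC or SETI@home terminology.

```python
from collections import Counter

# Exit codes as described in the post above; the labels are illustrative.
EXIT_CODE_LABELS = {
    -6: "VLAR kill",
    -9: "overflow",
}


def classify_exit(code):
    """Map an application exit code to a short label."""
    if code == 0:
        return "success"
    return EXIT_CODE_LABELS.get(code, "other error")


def summarize(exit_codes):
    """Count how many results fall under each label."""
    return Counter(classify_exit(code) for code in exit_codes)


# Example with made-up exit codes from a batch of results.
print(summarize([0, -6, -9, -6, 0, 0]))
```

Any exit code not listed in the table is grouped under "other error", so genuine compute errors stay visible instead of being hidden among the VLAR kills.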
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.