Modified SETI MB CUDA + opt AP package for full GPU utilization
Author | Message |
---|---|
Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
Even though these are still pending, I'm pretty sure they're going to show the same validation problem I mentioned earlier, where I show an overflow and the other two wingmen have similar results. They all fall in the same time period as the crash I mentioned. One of them is right at the beginning of that period: 1101014721. The next result was the one mentioned in the earlier post. After that there are what I'm guessing are 3 more results that will probably have the same validation problem once a wingman reports: 1101015020, 1101015257, 1101015259. I'm now guessing the last one is the crash, or the one just prior to the crash, and the others are an effect leading up to it. As I mentioned, I had been crunching tasks at Beta for about a day beforehand without any restarts or freezing, and only had to abort a few VLARs. I then came right over to main and started tasks, so a restart at that point would probably have been a good idea on my part. I'll do one after I finish the couple of 6.06 tasks I'm doing at Beta, before I start any here. |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Ok, let's try a new version :) It's based on the latest revision of CUDA MB available to me. http://lunatics.kwsn.net/gpu-crunching/modified-seti-mb-cuda-opt-ap-package-for-full-gpu-utilize.0.html
0) Please watch carefully what results your CUDA-enabled host returns, and if you see an excessive rate of invalid results, stop using CUDA SETI MB until a new version is available.
1) This package (Raistmer's_opt_package_V2.rar) can be downloaded from http://files.mail.ru/5MI3EM (and from the post on the Lunatics forums, see link above). Target hosts: Windows x86, SSE3 support for AP, CUDA support for MB.
2) It consists of modified SETI MB CUDA and current SSE3 opt SETI AP binaries, with the corresponding app_info.xml file.
3) The modification I have made increases the CUDA worker thread priority in SETI MB CUDA, which allows fuller GPU usage while keeping all CPU cores busy too. That is, using this build can increase your host's total performance on BOINC tasks.
4) The MB binaries are based on CUDA MB sources received from Eric (with a small modification); opt AP is just a repack of the current Lunatics opt AP release (SSE3 build).
5) It's not an "official" Lunatics release, so you can blame only me (or yourself, or BOINC bugs, and so on and so forth) for any issues you encounter.
6) I still can't check AP+MB work (no AP tasks here), but it works just fine with the CUDA MB + einstein@home combination.
7) For best CPU and GPU usage I recommend setting the number of processors available to BOINC to real_number_of_cores+1. This works around the current BOINC bug with CPU+CUDA scheduling and allows the CPU and GPU to be fully loaded.
8) Installation instructions are the same as for any opt app: stop BOINC, decompress all files in the archive into the SETI project directory, restart BOINC.
Please report issues here too. |
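For anyone unsure how to apply the real_number_of_cores+1 recommendation, here is a minimal sketch. It assumes the BOINC client's cc_config.xml `<ncpus>` option is the mechanism used (the post does not say which setting is meant), and the helper name `cc_config_with_extra_cpu` is made up for illustration. Treat it as one possible approach, not the package's official setup.

```python
import os
from xml.etree import ElementTree as ET


def cc_config_with_extra_cpu(real_cores: int) -> str:
    """Build a cc_config.xml body that tells BOINC to act as if
    there were real_cores + 1 processors (assumed <ncpus> option)."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(real_cores + 1)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # os.cpu_count() reports logical processors, which may differ from
    # the number of physical cores; adjust by hand if needed.
    cores = os.cpu_count() or 1
    print(cc_config_with_extra_cpu(cores))
```

The generated text would go into cc_config.xml in the BOINC data directory, after which the client has to re-read its config file or be restarted.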
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
Ok, lets try new version :) It's based on latest available for me revision of CUDA MB. When you say "latest available", is that the 6.05 or 6.06 code base? If it looks to be more stable I might fire up Seti CUDA again. I've done a couple of GPUGRID WUs that take from 8 to 10 hours a pop. Another question I had was whether the CUDA app uses more shader units on the graphics card if it has them, or does it use a fixed number regardless of the card having more? Or maybe it uses the processors (which also vary in number)? My graphics card is a 9800GT running at "stock" speed, and temps seem to be around 50 (idle) and 60 (under load). Room temp is currently 29. It's summer in Australia at the moment. I will post some pics on my blog soon. I have 3 more graphics cards on order for the other machines, so hopefully we can iron out the bugs soon and get into full production. BOINC blog |
Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
|
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
I like the extra added info on the card. Cool 1102600941 Yeah, it does have a lot more info about the video card, doesn't it? All the GPUGRID WUs report about the card is: <stderr_txt> # Using CUDA device 0 # Device 0: "GeForce 9800 GT" # Clock rate: 1512000 kilohertz # Number of multiprocessors: 14 # Number of cores: 112 Which is why I was asking whether the app uses the extra cores (sometimes called shaders) or the processors. BOINC blog |
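To make the relationship between "multiprocessors" and "cores" in that GPUGRID output concrete, here is a small sketch that parses the quoted lines and divides one by the other. The function name `parse_gpu_info` is hypothetical, and the parsing assumes the `# key: value` layout shown in the post; other apps print the card details differently.

```python
import re

# The stderr lines quoted in the post above, one per line.
STDERR = """\
# Using CUDA device 0
# Device 0: "GeForce 9800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
"""


def parse_gpu_info(text: str) -> dict:
    """Collect '# key: value' lines into a dict; lines without a colon are skipped."""
    info = {}
    for line in text.splitlines():
        match = re.match(r"#\s*([^:]+):\s*(.+)", line)
        if match:
            info[match.group(1).strip()] = match.group(2).strip().strip('"')
    return info


info = parse_gpu_info(STDERR)
multiprocessors = int(info["Number of multiprocessors"])
cores = int(info["Number of cores"])
cores_per_mp = cores // multiprocessors
clock_mhz = int(info["Clock rate"].split()[0]) / 1000

if __name__ == "__main__":
    print(info["Device 0"], cores_per_mp, "cores per multiprocessor at", clock_mhz, "MHz")
```

For this card the numbers work out to 112 / 14 = 8 cores per multiprocessor, so the two figures describe the same hardware at different levels of grouping.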
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I like the extra added info on the card. Cool 1102600941 The shaders, or stream processors, are, from what I understand, basically dynamic RISC processors. Once they get told to do something, that's all they can do, and they do it very well and very efficiently until they are told to do something else. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving up) |
Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
That didn't take long. Had a restart while doing this task. The lack of information in it seems very odd. Will have to wait on the wingman to even know what AR it is. The next task afterwards was an overflow. Will have to wait on my wingman to see the outcome of that one too. Will also have to wait on wingmen for the tasks before these. I'm now doing another task, which appears to be doing OK. Edit: One of the pending tasks has been completed by my wingman and appears to be strongly similar, but has not been validated yet. No error message in Event Viewer for the reboot. |
Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
Turns out the wingman returned an overflow on the task I had the crash on. AR is 2.718469. Still waiting to see if it will be sent out to a third wingman, although I can't imagine it won't. The task I completed a couple of tasks after the crash had a similar AR of 2.714647, so it appears I can do the AR, but it remains to be seen whether it can be done on this app with a valid result. Kind of like a similar task I ran at Beta with this AR, where it blew my drivers out and I had to reinstall them, but later I was able to do the AR there too. |
Joined: 26 May 99 Posts: 9958 Credit: 103,452,613 RAC: 328 |
|
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
When you say "latest available" is that the 6.05 or 6.06 code base? I used rev380 (head at that moment) of Berkeley's CUDA branch in the repository. Whether that is 6.05 or 6.06, ask Eric & Co. The version file states 6.02, which is apparently a false value. Anyway, there is no more recent publicly accessible code.
CUDA uses huge numbers of threads. They are much more lightweight than CPU threads. Current cards differ in how many threads they can execute simultaneously, but it's recommended that an app have more threads than the GPU can run at once. The GPU schedules and swaps threads in groups called warps. So the answer is yes. (Not sure whether CUDA threads can be called shaders, though.) |
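As a rough illustration of the warp grouping mentioned above, the sketch below works out how many warps a thread block is split into. It assumes a warp size of 32 threads, which is what NVIDIA cards of this generation report; the function names are made up for the example and nothing here comes from the SETI CUDA sources.

```python
import math

WARP_SIZE = 32  # assumption: the warp size reported by current NVIDIA cards


def warps_per_block(threads_per_block: int, warp_size: int = WARP_SIZE) -> int:
    """Number of warps needed to hold a block; a partial warp still counts as one."""
    return math.ceil(threads_per_block / warp_size)


def idle_lanes(threads_per_block: int, warp_size: int = WARP_SIZE) -> int:
    """Thread slots left unused in the last, partially filled warp."""
    return warps_per_block(threads_per_block, warp_size) * warp_size - threads_per_block


if __name__ == "__main__":
    for n in (32, 100, 512):
        print(n, "threads ->", warps_per_block(n), "warps,", idle_lanes(n), "idle lanes")
```

A block of 512 threads divides evenly into 16 warps, while a block of 100 threads needs 4 warps and leaves 28 slots in the last one unused, which is one reason block sizes are usually chosen as multiples of the warp size.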
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
That didn't take long. Had a restart while doing this task The lack of information in it seems very odd. Will have to wait on the wingman to even know what AR it is. Strange indeed. No comments...
It's a VHAR unit. VHAR has already been correlated with overflows. Your result just supports that correlation (for the CUDA app, for sure). |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
The task completed a couple of tasks after the crash had a similar AR of 2.714647 so it appears I can do the AR, but remains to be seen if it can be done on this app with a valid result. Hm, AR=2.7 could be called VHAR too... It will be interesting to see whether you get a valid result here or not. |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I am running your V2 app and soon after I started I got these 2: WU true angle range is : 0.012528.... They are VLARs. Look here: http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1443 |
Joined: 7 Jun 99 Posts: 512 Credit: 148,746,305 RAC: 0 |
Got some results, but it's the same as with the previous version: sometimes frequent (at least every minute) video driver crashes. It doesn't hang the system in Vista, but it's hardly a stable situation... http://setiathome.berkeley.edu/result.php?resultid=1103499305 Almost all tasks that do run are valid. And a CUDA task sometimes takes up 50% of the CPU time ..... Sometimes it eats away 20 at a time, but I have my doubts about the validation system of SETI. I've seen results that are 100% in error and got 40 points for them. I see tasks done by 3 users where 1 is in error and all get the points of the 2 valid tasks. I'll wait for a more stable solution. |
Joined: 19 Mar 05 Posts: 551 Credit: 4,673,015 RAC: 0 |
I just had a task hang at 0:00 CPU time and zero GPU usage... Task details Tried to suspend it... Didn't suspend/start a new task. Tried to abort it... Didn't abort... So I shut it down in Task Manager... Also just found out that the CUDA app doesn't trigger performance 3D clocks on my GTX 260... Only low-power 3D clocks if I'm lucky... Used ATI tools to set the clocks to performance 3D clocks right across the board.. Now I'm seeing some CRAZY speed :) Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957 Or Good Shop? http://www.goodshop.com/?charityid=888957 |
Joined: 14 Mar 04 Posts: 357 Credit: 650,069 RAC: 0 |
I have had two tasks with too many results (waiting on wingmen to verify or not), and this one:
setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce 8500 GT
totalGlobalMem = 536543232
sharedMemPerBlock = 16384
regsPerBlock = 8192
warpSize = 32
memPitch = 262144
maxThreadsPerBlock = 512
clockRate = 918000
totalConstMem = 65536
major = 1
minor = 1
textureAlignment = 256
deviceOverlap = 1
multiProcessorCount = 2
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 8500 GT is okay
SETI@home using CUDA accelerated device GeForce 8500 GT
Rise priority modification by Raistmer based on rev380 of SETI@home sources
Priority of worker thread rised successfully
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file 'd:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 49 : out of memory.
setiathome_CUDA: CUDA runtime ERROR in plan FFT. Falling back to HOST CPU processing...
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22
Work Unit Info: ...............
WU true angle range is : 2.715856
Optimal function choices:
----------------------------------------------------- name -----------------------------------------------------
v_BaseLineSmooth (no other)
v_GetPowerSpectrum 0.00021 0.00000
v_ChirpData 0.01462 0.00000
v_Transpose4 0.00563 0.00000
FPU opt folding 0.00172 0.00000
Flopcounter: 5215429468138.598600
Spike count: 3
Pulse count: 0
Triplet count: 1
Gaussian count: 0
called boinc_finish
Boinc Button Abuser In Training >My Shrubbers< |
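If you want to check your own results for this kind of silent fallback, a small script along these lines could scan the stderr text of a task. It is only a sketch: the marker string and the error pattern are taken from the single output pasted above, other builds may word things differently, and the function name `check_cuda_fallback` is hypothetical.

```python
import re

# Marker and error layout copied from the stderr output in the post above.
FALLBACK_MARKER = "Falling back to HOST CPU processing"
CUDA_ERROR_RE = re.compile(
    r"Cuda error '(?P<call>.+?)' in file '(?P<file>.+?)' "
    r"in line (?P<line>\d+) : (?P<reason>[^.]+)\."
)


def check_cuda_fallback(stderr_text: str) -> dict:
    """Report whether the task fell back to the CPU and, if found, why."""
    match = CUDA_ERROR_RE.search(stderr_text)
    return {
        "fell_back": FALLBACK_MARKER in stderr_text,
        "reason": match.group("reason").strip() if match else None,
        "line": int(match.group("line")) if match else None,
    }


# Excerpt of the stderr quoted in the post.
SAMPLE = (
    "Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum], FftLen, CUFFT_C2C, "
    "NumDataPoints / FftLen)' in file "
    "'d:/BTR/seticuda/Berkeley_rep/client/cuda/cudaAcc_fft.cu' in line 49 : out of memory. "
    "setiathome_CUDA: CUDA runtime ERROR in plan FFT. "
    "Falling back to HOST CPU processing..."
)

if __name__ == "__main__":
    print(check_cuda_fallback(SAMPLE))
```

A task that fell back this way was crunched on the CPU, so its run time and result say nothing about how the GPU handles that angle range.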
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I just had a task hang at 0:00 CPU time and zero GPU usage... OMG... look at this result of yours. It's an absolute record for the number of errors in a single result %) It seems you should check your GPU stability before doing any OCing... |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I have had two tasks with too many results (waiting on wingmen to verify or not), and this one: Well, it seems this build can fall back to CPU processing if it encounters a CUDA error... nice ability :) Did this result validate? |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
I am running your V2 app and soon after I started I got these 2: Ah, but the second one has "WU true angle range is : 0.083363", which is a very helpful indication that the problem extends beyond the 0.05 true VLAR range. Anything with angle range 0.03 to 0.35 is quite rare, and there are variations in array sizes and other details of the computations for anything above 0.05. It's quite possible that an 0.079 might be OK even though the 0.083 is bad, for instance. I did spot some 0.147 range work which seemed OK a couple of days ago. Joe |
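The ranges Joe describes can be written down as two simple checks. This is a sketch under stated assumptions: the 0.05 VLAR limit and the 0.03 to 0.35 "rare" band are the figures from his post, whether the boundaries are inclusive is a guess, and the function names are invented for the example.

```python
VLAR_MAX = 0.05                 # "true VLAR range" limit quoted in the post
RARE_MIN, RARE_MAX = 0.03, 0.35  # band described as "quite rare"


def is_vlar(angle_range: float) -> bool:
    """True for angle ranges below the 0.05 VLAR limit (strict '<' is an assumption)."""
    return angle_range < VLAR_MAX


def is_rare_range(angle_range: float) -> bool:
    """True for angle ranges inside the rarely seen 0.03 to 0.35 band (inclusive by assumption)."""
    return RARE_MIN <= angle_range <= RARE_MAX


if __name__ == "__main__":
    # Angle ranges that appear in this thread.
    for ar in (0.012528, 0.083363, 2.715856):
        print(ar, "VLAR" if is_vlar(ar) else "not VLAR",
              "rare band" if is_rare_range(ar) else "common band")
```

Run on the values from this thread, 0.012528 is a VLAR outside the rare band, 0.083363 is above the VLAR limit but inside the rare band, and 2.715856 is neither.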
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Will collect statistics about VLAR crashes on my own GPU (it's still underclocked, so I'm pretty sure of the hardware's stability). I want to build a debug version that writes the AR of each overflowed WU to a text file. That way it will be much easier to get VHAR <-> overflow statistics. |
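Once such a text file exists, summarizing it could look roughly like the sketch below. The file layout (one angle range per line for each overflowed WU), the 0.5 bucket width, and the function name `overflow_histogram` are all assumptions for illustration, since the debug build's output format has not been described. The sample values are made up and are not real results.

```python
from collections import Counter


def overflow_histogram(lines, bucket_width: float = 0.5) -> Counter:
    """Count overflowed WUs per angle-range bucket.

    Each non-empty line is assumed to hold a single AR value; the key of
    each bucket is its lower edge (for example 2.5 covers 2.5 up to 3.0).
    """
    histogram = Counter()
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        bucket_start = int(float(text) / bucket_width) * bucket_width
        histogram[bucket_start] += 1
    return histogram


if __name__ == "__main__":
    # Made-up sample values, only to show the output shape.
    sample = ["2.70", "2.72", "0.40", ""]
    for start, count in sorted(overflow_histogram(sample).items()):
        print(f"AR {start:.1f}+ : {count} overflow(s)")
```

A file listing only overflowed WUs shows where overflows cluster but not how often they occur, so comparing against the AR distribution of all crunched tasks would still be needed for a real rate.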
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.