Message boards :
Number crunching :
s@h on GPU CUDA now?!
Author | Message |
---|---|
Byron S Goodgame Send message Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
can I fix it somehow? You could try a different driver for the card. That has been known to help others with similar problems. It's a bit hit and miss, though. What version are you using now, and which ones have you tried before? Also, there's a method in this thread that you can use to keep false results from reaching the server. |
Zoran Kirsic Send message Joined: 22 May 99 Posts: 34 Credit: 102,258 RAC: 0 |
I have nvidia drivers 178.24, but I'm wondering: is this a problem with SETI, or with me? These are my first results with CUDA processing, so maybe I did something wrong!? |
Byron S Goodgame Send message Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0 |
I have nvidia drivers 178.24, but I'm wondering: is this a problem with SETI, or with me? These are my first results with CUDA processing, so maybe I did something wrong!? It's probably not because you've done something wrong; more likely it's either the app or the driver. You could try doing some more tasks and see if you keep getting similar errors, though from my own experience I'd try a beta driver like one of the 180.xx series and see what results come from that. Whether you continue using the driver you have or go to a different one, it would be helpful to stop network communication and follow the instructions mentioned in the post from my previous message, since there's a good chance you'll continue getting false results. |
Zoran Kirsic Send message Joined: 22 May 99 Posts: 34 Credit: 102,258 RAC: 0 |
OK. Thanks. I'll try the beta drivers and let you know. |
Harper101 Send message Joined: 13 Jan 08 Posts: 3 Credit: 11,658,919 RAC: 0 |
Grrr.. they bring out a CUDA version of SETI 3 months after I upgrade my PC.. where I changed my GPU from Nvidia to ATI!! Anyone know if they are working on a version that I can run on my ATI card? (I know the Folding@home guys have versions available for ATI and Nvidia.. though I realise they have massively more funding than my beloved SETI@home!) |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Grrr.. they bring out a CUDA version of SETI 3 months after I upgrade my PC.. where I changed my GPU from Nvidia to ATI!! As far as I know, there are just rumors that ATI has said they are interested in doing something CUDA-like, but the problem is that CUDA is nvidia-specific. What would be smart for everyone to push for is something like OpenCL, which would let you program in one unified and consistent language that works across different "makes and models" of hardware. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving up) |
Zoran Kirsic Send message Joined: 22 May 99 Posts: 34 Credit: 102,258 RAC: 0 |
I have nvidia drivers 178.24, but I'm wondering: is this a problem with SETI, or with me? These are my first results with CUDA processing, so maybe I did something wrong!? I tried the new driver (180.84), but it still doesn't work. At 31 seconds the screen flickers, and the CPU time stops advancing. I got an answer from Raistmer: "There is bug in 6.06 sources so both stock and my builds will have this error overflows time to time. Debugging going.." I've gone back to processing WUs with AK_v8_win_SSE3.exe for now. Thx |
perryjay Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
Have you tried this? <cc_config> <options> <ncpus>5</ncpus> </options> </cc_config> Create it in Notepad, save as "all files" with the name cc_config.xml, in the C:\Documents and Settings\All Users\Application Data\BOINC folder. Then in the BOINC Manager Advanced menu click on "Read config file". I was having the flickering problem and getting the same messages you were, but this seemed to fix it. I had to change the number 5 to 3 since I only have a C2D. I had left it at 5 at first and was running 5 WUs at a time, but slowly. :) PROUD MEMBER OF Team Starfire World BOINC |
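Formatted for readability, the cc_config.xml perryjay describes looks like the sketch below (the <ncpus> value is illustrative; as noted above, set it to suit your own CPU, e.g. 3 for a Core 2 Duo):

```xml
<!-- cc_config.xml: save to C:\Documents and Settings\All Users\Application Data\BOINC -->
<cc_config>
  <options>
    <!-- Number of CPUs BOINC may use; perryjay changed 5 to 3 on his C2D -->
    <ncpus>3</ncpus>
  </options>
</cc_config>
```

After saving the file, use the BOINC Manager Advanced menu, "Read config file", to apply it without restarting the client.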
Zoran Kirsic Send message Joined: 22 May 99 Posts: 34 Credit: 102,258 RAC: 0 |
Have you tried this? My cc_config.xml: <cc_config> <log_flags> </log_flags> <options> <client_version_check_url>http://www.worldcommunitygrid.org/download.php?xml=1</client_version_check_url> <client_download_url>http://www.worldcommunitygrid.org/download.php</client_download_url> <network_test_url>http://www.ibm.com/</network_test_url> <start_delay>15</start_delay> <ncpus>3</ncpus> </options> </cc_config> I tried that change (<ncpus>3</ncpus>) at the beginning, but there was no change! |
John Send message Joined: 5 Jun 99 Posts: 30 Credit: 77,663,734 RAC: 236 |
I have changed preferences to not send any more CUDA units, as there seem to be stability problems on two of my Nvidia-supported computers. If there is any interruption of DSL service, the computer will cease to respond until some sort of timeout; then I get one chance to abort before waiting for another timeout. If I restart without rebooting, there are massive artifacts on the display that flash on and off like Christmas lights. On the other computer I could play games and run BOINC at the same time before CUDA; now the artifact thing seems to work itself in at times, and once started it does not stop until the computer is rebooted. This computer is for games and BOINC is secondary, so I removed the CUDA option under preferences. Will try again later, perhaps after the bugs are worked out. Later. John |
Zoran Kirsic Send message Joined: 22 May 99 Posts: 34 Credit: 102,258 RAC: 0 |
What about these new 181.20 drivers that came out 5 days ago? Any difference or what!?? |
skildude Send message Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
or what... Most of the posts on CUDA are finally finding their way to the Q & A section, but it's still very buggy and requires a lot of TLC to take care of freeze-ups, lock-ups and such. Clearly CUDA isn't for the inexperienced user. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
Tronic Send message Joined: 23 Mar 03 Posts: 8 Credit: 10,599,675 RAC: 0 |
(...) On the other computer I could play games and run BOINC at the same time before CUDA; now the artifact thing seems to work itself in at times, and once started it does not stop until the computer is rebooted. This computer is for games and BOINC is secondary, so I removed the CUDA option under preferences. Will try later perhaps after the bugs are worked out. Later. John Remember that by using CUDA, your GPU temperature will increase dramatically. I have two 8800GTS 640MB cards with watercooling in series (there is a single high-performance radiator between the CPU and GPUs and a triple radiator between the GPUs and CPU). The first card jumped from 49ºC to 58ºC, while the second one jumped from 49ºC to 62ºC. This increase is bigger than when playing Oblivion. (By the way, my E8400 CPU cores jumped from 44ºC to 52ºC with only one Astropulse process.) Remember this is with watercooling. Forced-air cooled video cards will see a much bigger increase in temperature, not to mention passively cooled ones. |
ML1 Send message Joined: 25 Nov 01 Posts: 20291 Credit: 7,508,002 RAC: 20 |
Remember that by using CUDA, your GPU temperature will increase dramatically. ... That all depends on your graphics card and how good your cooling is... I'm seeing a mere 10 deg C increase, which is still well below the limits set in the nVidia software. More of a problem is the slowdown for other graphics when BOINC-CUDA is running. We really do need a one-button "suspend all BOINC CUDA" option for when trying to do other graphics work! Happy crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
angler Send message Joined: 19 Oct 00 Posts: 33 Credit: 880,214 RAC: 0 |
been pretty stable but my first CUDA glitch http://setiathome.berkeley.edu/result.php?resultid=1159831901 <core_client_version>6.4.5</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> setiathome_CUDA: Found 1 CUDA device(s): Device 1 : GeForce 9600 GT totalGlobalMem = 536870912 sharedMemPerBlock = 16384 regsPerBlock = 8192 warpSize = 32 memPitch = 262144 maxThreadsPerBlock = 512 clockRate = 1625000 totalConstMem = 65536 major = 1 minor = 1 textureAlignment = 256 deviceOverlap = 0 multiProcessorCount = 8 setiathome_CUDA: CUDA Device 1 specified, checking... Device 1: GeForce 9600 GT is okay SETI@home using CUDA accelerated device GeForce 9600 GT setiathome_enhanced 6.03 Visual Studio/Microsoft C++ libboinc: 6.3.22 Work Unit Info: ............... WU true angle range is : 0.447866 Optimal function choices: ----------------------------------------------------- name ----------------------------------------------------- v_BaseLineSmooth (no other) v_GetPowerSpectrum 0.00024 0.00000 v_ChirpData 0.01821 0.00000 v_Transpose4 0.00930 0.00000 FPU opt folding 0.00460 0.00000 Cuda error 'cufftExecC2C' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_fft.cu' in line 63 : unknown error. Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error. Cuda error 'cudaAcc_GetPowerSpectrum_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_PowerSpectrum.cu' in line 56 : unknown error. Cuda error 'cudaAcc_summax32_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error. Cuda error 'cudaAcc_summax32_kernel' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_summax.cu' in line 147 : unknown error. 
Cuda error 'cudaMemcpy(PowerSpectrumSumMax, dev_PowerSpectrumSumMax, cudaAcc_NumDataPoints / fftlen * sizeof(*dev_PowerSpectrumSumMax), cudaMemcpyDeviceToHost)' in file 'c:/sw/gpgpu/seti/seti_boinc/client/cuda/cudaAcc_summax.cu' in line 160 : unknown error. </stderr_txt> ]]> |
Zydor Send message Joined: 4 Oct 03 Posts: 172 Credit: 491,111 RAC: 0 |
An 8800GTS will idle at 45C and go to 70C under full load in its stock state, so the temps given are way below any kind of hassle - which they should be if watercooled. I run a 9800GTX 24/7 on one of my machines under full load; it happily purrs away at around 62C with no water, well below the hassle point. If I go to "normal" use, not under full load, it drops to around 55C. Such rises of circa 10C are normal when a GPU is placed under full load. It's not "because" of BOINC or CUDA; any demanding application or game will have the same result. What can happen is that perceptions of the max allowable temp in a GPU under load are usually grossly understated - they are resilient beasts and designed for high temperatures. With friends' PCs I've looked at that are behaving "weirdly", my first reaction is to put my hand on the exit fan grill, and if it's warmer than a tepid cup of coffee, the cover comes off with the air can at the ready. 7 times out of 10 there's more dust inside than in a municipal dust cart; some have so much caked on the CPU fan that it has virtually stopped because it can't get through the piled-up clag! In those situations a 10C rise can be a killer if they start using the machine at full load, but that's not the fault of the software or hardware, it's the user not maintaining the machine. Too few realise that dust actually accumulates inside (and quickly!), let alone go to the trouble of opening the case up once every couple of months, cleaning the case fan filters, and using an air can on the CPU and PSU etc. Re drivers, I am using 181.22 (with 6.5.0), and have been since I returned to SETI crunching 10 days ago after an extended absence - I drifted away at the changeover to BOINC; CUDA tempted me back, since it meant I could use the spare capacity in the card when not gaming. I've not noticed any undue hassles apart from the work-fetch routine, which is definitely not right yet. I also use the "set for 10 days" work-around to get hold of CUDA WUs when I am short. 
I had one hassle that resulted in losing some WUs and using the abort button, but that was my fault, not SETI's. Frankly, I reckon the guys do a great job keeping the ship running; compared to 5/6 years ago they are serving and shovelling out a staggering volume of work on the same shoestring budget they have always had, relative to the results they are expected to achieve. Gets my vote ..... |
Tronic Send message Joined: 23 Mar 03 Posts: 8 Credit: 10,599,675 RAC: 0 |
My point was more intended for overclockers and passively cooled graphics cards. GPUs are designed to operate at higher temps than CPUs. My two 8800GTS cards are factory overclocked (by 10% and 13%) and I've been able to overclock them 20% with watercooling. I've seen people in game forums argue that a game has problems because they cannot overclock their GPU when playing that specific game, while they have no problem in other games. I specifically remember complaints in "The Elder Scrolls: Oblivion" forums. It turns out that the problem is not the game's stability, but that the game makes better use of the full GPU potential, making it rise to higher temps than other games. So if video artifacts or instability start to appear when using CUDA, I'd point to a GPU cooling problem rather than a CUDA or SETI problem. On another matter, I'm having some CUDA problems which seem to arise when accessing the PC through Windows XP "Remote Desktop". The nVidia control panel (181.22) doesn't show either of my two 8800GTS cards, so BOINC doesn't see them either and CUDA reverts to CPU processing, taking up both my CPU cores but not pausing the Astropulse workunit, which stays as an active process but is hardly assigned any CPU time. The problem seems to be resolved by exiting BOINC (with the stop-tasks option) and restarting it. If the session was (re)started through a remote desktop, the only solution seems to be restarting the session. But in both cases this has to be done on the PC itself and not through the remote desktop. I suppose it's a problem with the nvidia driver or Windows XP itself and has nothing to do with BOINC. Maybe this workunit shows the problem; it includes a manual stop and restart done by me after a remote desktop access a couple of hours earlier: http://setiathome.berkeley.edu/result.php?resultid=1157030215 I also had to disable SLI for BOINC to recognize both GPUs and run 2 CUDA tasks simultaneously. 
I know that enabling SLI uses only one of the GPUs (work is not distributed between the two) because only one of the GPUs shows a temperature rise. |
perryjay Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
The remote desktop uses its own drivers, circumventing the NVidia drivers. That's what the problem is there. And you are right about having to disable SLI to get both GPUs crunching. PROUD MEMBER OF Team Starfire World BOINC |
Geek@Play Send message Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0 |
On another matter, I'm having some CUDA problems which seem to arise when accessing the PC through Windows XP "Remote Desktop". The nVidia control panel (181.22) doesn't show either of my two 8800GTS cards, so BOINC doesn't see them either and CUDA reverts to CPU processing, taking up both my CPU cores but not pausing the Astropulse workunit, which stays as an active process but is hardly assigned any CPU time. perryjay is correct........ I had the same problems, and after a lot of messing around I found an MS page saying that MS Remote Desktop loads its own video driver, ignoring the installed driver. I then disabled MS Remote Desktop and installed VNC Viewer. No problems since doing this. Boinc....Boinc....Boinc....Boinc.... |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.