A growing problem with GT/GTX 6xx series cards.
Author | Message |
---|---|
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
One thing that I have noticed over the last few months is the ever-growing number of rigs with GT/GTX 6xx series cards that have not had the environment variable CUDA_GRID_SIZE_COMPAT set to 1, and they are trashing work at an ever-increasing and alarming rate. Some owners notice that they only produce errors, but instead of finding out why, they just stop crunching (likely removing BOINC) without aborting the work they still have on board. The combination of the two means that a lot of work now has to hang around in the database for much longer than it needs to. I don't mind taking the time on occasion to PM people with rigs that are throwing invalid work (and errors, when I stumble over them), but there is absolutely no way that I'd have the time to PM all those GT/GTX 6xx owners (there are just far too many of them). Yes, I got bored one day and started looking for them (while waiting for a job to turn up), but I gave up halfway through the finished tasks on just one of my rigs. Cheers. |
spitfire_mk_2 Joined: 14 Apr 00 Posts: 563 Credit: 27,306,885 RAC: 0 |
More work for the rest of us. |
Lionel Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597 |
Maybe so, but also unnecessary... |
Horacio Joined: 14 Jan 00 Posts: 536 Credit: 75,967,266 RAC: 0 |
Wouldn't it help to issue a notice through BOINC messages? |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
In about a week (give or take), the problematic 6.10 (Cuda 3.0) application will probably be supplanted by x41zc as stock on main, for V7 Cuda multibeam processing. These applications are not susceptible to the Cuda 3.0 library faults concerned. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22 |
So not a problem with the card itself, a problem with their users who won't RTFM. Honestly are we surprised? "Life is just nature's way of keeping meat fresh." - The Doctor |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"So not a problem with the card itself, a problem with their users who won't RTFM." Actually, a problem with NVidia (the company) - which supplied the project with a CUDA 3.0 stock application compatible with Fermis, but didn't come back and supply a CUDA 3.2 (or later) stock application to support their own Kepler release. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65709 Credit: 55,293,173 RAC: 49 |
"So not a problem with the card itself, a problem with their users who won't RTFM." Nvidia: I'm beginning to think that their support of CUDA isn't too serious anymore. That's my suspicion at least, with downclocks happening. I'm experimenting with driver 306.97 and a dummy plug on the 2nd DVI port. When I have the desktop extended to it, 772MHz becomes the minimum clock that my GTX580 will run at. I found this out by accident: I thought that when Boinc was done for the day the GPU clock would drop from 772MHz to 51MHz, only that didn't happen. To turn off this effect I just restrict the desktop to the 1st monitor in 'screen resolution', rather than extending it to the second monitor/dummy plug on the card. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Sleepy Joined: 21 May 99 Posts: 219 Credit: 98,947,784 RAC: 28,360 |
"GT/GTX 6xx series cards that have not had the environment variable setting" Thank you for this hint! I am not producing that many invalids with my 650Ti (the errors you see are 98% due to optimisation tests and failures during the card switch), but I had overlooked this issue, even though I constantly read this forum. Maybe it should be made more prominent. Now I have set the environment variable, and I will see how everything behaves and whether this brings the problems down to zero. Sleepy |
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
"Thank you for this hint!" If you see Jason's post, hopefully soon it will be a thing of the past! "In about a week (give or take), the problematic 6.10 (Cuda 3.0) application will probably be supplanted by x41zc as stock on main, for V7 Cuda multibeam processing. These applications are not susceptible to the Cuda 3.0 library faults concerned." Although, as you are using x41zc, you don't actually need the environment variable setting anyway. Just one is needed, not all three: Solution/workaround: a) Use an optimised Cuda application, where available. b) Downgrade to the 301.42 driver - only possible on GTX 670/680/690 cards, not on newer releases like the 650/660 or their Ti variants. c) Set an environment variable. |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
"In about a week (give or take), the problematic 6.10 (Cuda 3.0) application will probably be supplanted by x41zc as stock on main, for V7 Cuda multibeam processing. These applications are not susceptible to the Cuda 3.0 library faults concerned." That's good to hear. Now to see if my 2500K can break into the top 100 by RAC before all those cards stop producing errors and both my rigs get pushed well down the list. Cheers. |
FLAT ERIC Joined: 8 Dec 07 Posts: 4 Credit: 1,028,735 RAC: 0 |
I have been out of the loop for a few years and decided to come back with a new computer I just finished building. I am generating about 100-200 of these errors a day even after changing the environment variable. I just updated my driver from 314 to 320 and I have a GeForce GTX670. What exactly does it mean to prevent the computer from going to a blank screen? The Boinc screen saver has a "blank screen" option that I set to never. My power options are set to turn the monitor off after an hour. Is that what it is talking about? Is there anything else I should do to fix this? I'm running version 7.0.64 of Boinc. |
Daddiogrif Joined: 9 Apr 13 Posts: 5 Credit: 52,648 RAC: 0 |
Noob here. Just to be safe... and lazy, I suppose, sorry. Any problems with a GTX 260? The GPU takes up so much room that I can no longer fit my old GTX 8600 as a PhysX card, lol. Thanks, K |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65709 Credit: 55,293,173 RAC: 49 |
"I have been out of the loop for a few years and decided to come back with a new computer I just finished building. I am generating about 100-200 of these errors a day even after changing the environment variable." I use windows 7 Pro x64(a 64 bit operating system), you could go to Control Panel/Power Options, click on Power Options and then set the monitor to not turn off, as all such options on this PC are set to Never. The screen saver should be turned off as well: right click on the Desktop, click on Personalize, then look for the icon on the lower right that says screen saver and double click on it to turn the screen saver off. This only lasts until one upgrades Boinc, when the screensaver is activated again. And no, this won't hurt anything; you can always turn the monitor off manually, thereby allowing Boinc to crunch. I don't know if this will resolve any errors or not, of course, but it can't hurt. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
trader Joined: 25 Jun 00 Posts: 126 Credit: 4,968,173 RAC: 0 |
I've seen several posts on this issue and I use win7 x64 ult. i have a screen saver running. and power management for the monitor is turned on. never experienced this issue myself. could this issue be caused by low system memory, swapping between page file and ram? |
FLAT ERIC Joined: 8 Dec 07 Posts: 4 Credit: 1,028,735 RAC: 0 |
"I've seen several posts on this issue and I use win7 x64 ult. i have a screen saver running. and power management for the monitor is turned on. never experienced this issue myself. could this issue be caused by low system memory, swapping between page file and ram?" I hope that isn't my issue. I am running 16GB of RAM and 2GB on the video card. "I use windows 7 Pro x64(a 64 bit operating system), you could go to Control Panel/Power Options, click on Power Options and then set the monitor to not turn off, as all such options on this PC are set to Never." Man, I really enjoyed the screen saver. I will give it a try. Thanks! |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
"I have been out of the loop for a few years and decided to come back with a new computer I just finished building. I am generating about 100-200 of these errors a day even after changing the environment variable." Did you restart your PC after changing the environment variable? Your errors still show as if you haven't. Cheers. |
FLAT ERIC Joined: 8 Dec 07 Posts: 4 Credit: 1,028,735 RAC: 0 |
Thanks! I may not have restarted. I just restarted right now. How do you check that? |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
The reason that I asked that was because your Stderr outputs for your errored CUDA work still showed, "CUFFT error in file 'd:/Projects/SETI/seti_boinc/client/cuda/cudaAcc_fft.cu' in line 62", which is the error produced by not having that environment variable set when running the current stock CUDA app. You'll just have to wait now until you get some more CUDA work to see if it's fixed. Cheers. |
FLAT ERIC Joined: 8 Dec 07 Posts: 4 Credit: 1,028,735 RAC: 0 |
seems to be fixed :D |
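
For anyone wondering how to check the two things discussed in this thread, here is a minimal Python sketch. It is not part of BOINC or the SETI@home applications. It assumes only the variable name and value (CUDA_GRID_SIZE_COMPAT, 1) and the CUFFT error text quoted above; the function names `env_var_is_set` and `has_cufft_error` are made up for illustration. Note that it inspects the environment of the process running the script, so BOINC itself may still be running with an older environment until it (or the PC) is restarted.

```python
import os
import re

# Variable name and value as given in the thread.
ENV_VAR = "CUDA_GRID_SIZE_COMPAT"
EXPECTED_VALUE = "1"

# Matches the error line quoted in the thread. The directory part of the
# path is not matched literally, since it comes from the build machine.
CUFFT_ERROR = re.compile(
    r"CUFFT error in file '[^']*cudaAcc_fft\.cu' in line 62"
)


def env_var_is_set(environ=None):
    """Return True if CUDA_GRID_SIZE_COMPAT is present with the value 1.

    `environ` is a hypothetical hook for testing; by default the current
    process environment is inspected.
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR, "").strip() == EXPECTED_VALUE


def has_cufft_error(stderr_text):
    """Return True if the given Stderr output contains the CUFFT error."""
    return CUFFT_ERROR.search(stderr_text) is not None


if __name__ == "__main__":
    print(f"{ENV_VAR} set to {EXPECTED_VALUE}: {env_var_is_set()}")
    sample = (
        "CUFFT error in file "
        "'d:/Projects/SETI/seti_boinc/client/cuda/cudaAcc_fft.cu' in line 62"
    )
    print(f"Sample Stderr shows the CUFFT error: {has_cufft_error(sample)}")
```

To use it on a real task, paste the Stderr output from the task's result page into a string and pass it to `has_cufft_error`. As noted above, a fix only shows up in CUDA work processed after the change.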