Bug in server affecting older BOINC clients with NVIDIA GPUs.
Author | Message |
---|---|
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
We've identified a bug in the current BOINC server that is online at SETI@home. With older BOINC clients this bug results in running multiple SETI@home GPU applications simultaneously on a single GPU. While we debug and fix the problem we've suspended distribution of NVIDIA work. We hope that everything will be back to normal some time tomorrow. @SETIEric@qoto.org (Mastodon) |
robertmiles Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
Do the multiple workunits use some of the same resources of the GPU or interfere in some other way? Some BOINC projects seem to let two or more OpenCL GPU workunits, but not multiple CUDA GPU workunits, share a single GPU after checking that they are not assigned any of the same resources within the GPU. |
Francesco Forti Joined: 24 May 00 Posts: 334 Credit: 204,421,005 RAC: 15 |
Last week I had the same problem, but with BOINC 7.0.28 and Lunatics v0.40 32-bit. As soon as I tried to run two instances of a GPU task on a new GT 640 (driver 301.42) using count 0.5, I got 80 instances of GPU tasks running and had to uninstall and reinstall everything. Other hosts, with older GTX cards, are also able to run two instances. |
Shakir Joined: 14 Aug 99 Posts: 3 Credit: 90,452,595 RAC: 93 |
I have a notebook where this problem occurs. Starting more threads on the GPU isn't the problem; the biggest problem in my case is that the 10 SETI tasks on the GPU use around 2 GB of RAM and the notebook is 100% in swap mode. I can't deactivate the GPU calculation either. There are also 8 regular CPU SETI tasks working, so the system hits the 4 GB RAM wall. I have to deactivate calculation until this issue is fixed. Thanks for working on it. |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
"Do the multiple workunits use some of the same resources of the GPU or interfere in some other way?" The main problem is that the BOINC client, depending upon the relative speed of your GPU and CPU, could decide to run as many as 10 GPU apps per CPU core simultaneously. If you've got 4 CPU cores, that's 40 GPU apps running at once. So no, we're not talking about running 2 or even 4 apps simultaneously on the GPU. The possible results, in order of severity, could be: 1) The apps error out when the GPU runs out of memory. 2) Your GPU driver freezes, causing a reboot every time BOINC tries to run the apps. 3) Your GPU overheats and causes a reboot every time BOINC tries to run the apps. @SETIEric@qoto.org (Mastodon) |
Michael W.F. Miles Joined: 24 Mar 07 Posts: 268 Credit: 34,410,870 RAC: 0 |
I have BOINC client 7.0.31 installed on Windows 7 x64. I am also running the x41x multibeam GPU app. Since the outage my GTX 460 is taking 1.5 hours to complete two tasks; it usually takes 15 minutes. I was wondering what was happening, as I hope I did not blow my card. It does not heat up to its normal temps when crunching. My task manager is showing the six CPU apps running and the two x41x tasks running, but that is all. I am hoping it is the server doing this and not my poor GTX 460, which has been my main crunching unit. Michael Miles |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
We've installed a fix from David Anderson that we hope will solve the problem. If you have a BOINC version 7 client, the problem never affected you, and you can stop reading this now. If you use BOINC version 6, you are probably affected. The fix we installed will not fix workunits that have already been downloaded. For that, you've got four options. 1) Abort all your CUDA tasks. 2) Upgrade to BOINC v7 or 3) Exit BOINC, edit your client_state.xml to replace all the occurrences of " @SETIEric@qoto.org (Mastodon) |
Sunny129 Joined: 7 Nov 00 Posts: 190 Credit: 3,163,755 RAC: 0 |
there was a first-time poster named Cathy who posted a question about her GPU problems in this thread earlier, but her post has since mysteriously vanished. her post may have been slightly out of place in this thread, as it's not a troubleshooting thread, but rather a thread dedicated to the status of the BOINC server bug. regardless, i'm hoping that her post was not deleted altogether, and was at the very least moved to the appropriate sub-forum or thread so that her question can get answered... "I have BOINC client 7.0.31 installed on Windows 7 x64" this doesn't sound like a server-side issue to me, despite all the issues the server has right now. it sounds more like your video driver crashed and reset itself, leaving the GPU in limp/safe mode. are your GPU's core and memory clocks underclocked to approx. half of what they should be? you'll need to open Catalyst Control Center or some 3rd party utility like MSI Afterburner to confirm this. if so, you'll need to suspend all BOINC work and reboot the entire system to bring the GPU out of safe mode. but even if this isn't the solution, i highly doubt physical damage is responsible for the way your GPU is currently acting. |
Michael W.F. Miles Joined: 24 Mar 07 Posts: 268 Credit: 34,410,870 RAC: 0 |
I did think it was a driver crash but a reboot will clear this out. All OC programs say it is running okay, except that the temperature on my GPU is lower than usual. I am going to try a driver reinstall and see what happens. It started doing this right after the last outage, which makes me suspicious. |
Gatekeeper Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0 |
"I did think it was a driver crash but a reboot will clear this out." The units that are taking 1.5 hours aren't VLARs by chance, are they? |
rob smith Joined: 7 Mar 03 Posts: 22529 Credit: 416,307,556 RAC: 380 |
Or Astropulses? Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Michael W.F. Miles Joined: 24 Mar 07 Posts: 268 Credit: 34,410,870 RAC: 0 |
They are VLARs, I just noticed that. I am going to abort the VLARs and see what happens. |
Michael W.F. Miles Joined: 24 Mar 07 Posts: 268 Credit: 34,410,870 RAC: 0 |
That was it. Thank you all. Man, what a relief. I thought I cooked my card running it in this weather. VLARs ran up the time by a huge margin. Thank you, thank you, thank you GOD almighty. One thing I don't have right now is 200 dollars for a new card. Michael Miles |
alan soden Joined: 12 Feb 12 Posts: 2 Credit: 105,617 RAC: 0 |
Ha ha, maybe this bug is extraterrestrial??? Good luck with the fix. |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
|
Peter Joined: 4 May 12 Posts: 22 Credit: 26,746 RAC: 0 |
Hello. Thank you for finding the problem. I had it, and when I reported it I was told it was impossible. So you proved that I was right and am not crazy. All I can say to the ones who told me I was wrong: I told you so! THEY SEE YOU!! LOOK UP!! |
Horacio Joined: 14 Jan 00 Posts: 536 Credit: 75,967,266 RAC: 0 |
"Damn, now the vlars are broke again?" I don't think so... the VLARs he aborted were sent on Aug 11 (I guess before the fixes)... |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
"Damn, now the vlars are broke again?" Agreed. No new ones seen here, either. |
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
"If you use BOINC version 6, you are probably affected." I like option 3) most (the 'Replace All' is easy using just Notepad). I can think of another 'fix' (for those uncomfortable with any of 1) ... 4) above), but it involves 'hand' work: - Temporarily disable getting NVIDIA/CUDA tasks ('Use NVIDIA GPU' here: http://setiathome.berkeley.edu/prefs.php?subset=project) - Suspend all your current CUDA tasks (sort by Application column, click ... Shift+click to select all CUDA tasks) - Resume them one at a time (or 2 or 3 at a time if your GPU is good enough (Fermi++)) - When all 'old' CUDA tasks are done, enable getting NVIDIA/CUDA tasks again ('Use NVIDIA GPU' yes) - ALF - "Find out what you don't do well ..... then don't do it!" :) |
.clair. Joined: 4 Nov 04 Posts: 1300 Credit: 55,390,408 RAC: 69 |
"Damn, now the vlars are broke again?" I run an ATI GPU and have not seen a VLAR in days. If there are any out there, I cannot find them :( |
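
As a rough illustration of the numbers in Eric Korpela's reply earlier in the thread (up to 10 GPU apps per CPU core, so 40 on a 4-core host), the Python sketch below computes that worst-case count and checks it against a memory budget. The function names, the 200 MB per-app figure (loosely based on Shakir's report of 10 tasks using about 2 GB) and the 1024 MB budget are assumptions made for illustration; none of them come from BOINC itself.

```python
def worst_case_gpu_apps(cpu_cores: int, apps_per_core: int = 10) -> int:
    """Upper bound on simultaneous GPU apps: apps_per_core for each CPU core."""
    return cpu_cores * apps_per_core


def fits_in_memory(n_apps: int, mem_per_app_mb: int, budget_mb: int) -> bool:
    """True if n_apps, each using mem_per_app_mb, fit within budget_mb."""
    return n_apps * mem_per_app_mb <= budget_mb


if __name__ == "__main__":
    # 4 CPU cores at 10 GPU apps per core, as described in the thread.
    apps = worst_case_gpu_apps(cpu_cores=4)
    print(apps)

    # Hypothetical figures: 200 MB per app against a 1024 MB budget.
    print(fits_in_memory(apps, mem_per_app_mb=200, budget_mb=1024))
    print(fits_in_memory(2, mem_per_app_mb=200, budget_mb=1024))
```

With these assumed figures, two apps fit comfortably while the 40-app worst case does not, which is consistent with the out-of-memory errors listed as the mildest outcome.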