Seti cuda not suspending, PC extremely sluggish
Author | Message |
---|---|
Joined: 20 Jan 02 Posts: 5 Credit: 236,021 RAC: 0 |
For some reason, my running SETI CUDA workunits often do not suspend when I resume using the PC (more often than not), even though the BOINC manager says that the work is suspended due to user activity. The progress percentage of the CUDA workunit shown in the BOINC manager continues to rise, while the others I have running, like Climate Prediction or Rosetta, stop like they should. Whenever this happens, the PC is extremely sluggish. Switching between application windows takes several seconds, the mouse pointer gets jumpy and even typing text is slow. I've configured the BOINC manager not to keep the calculation apps in memory while the user is active, but whenever the problem occurs, the SETI process, and often the other apps too, remain loaded (shown in Task Manager). I have the latest BOINC manager and also the latest NVIDIA driver 182.50; the video card is a GeForce 9800 GT with 512 MB RAM, and the OS is XP Pro SP3 with all patches applied. The PC is an Intel Q8200 with 4 GB RAM. The system is properly cooled and not overclocked. BOINC is configured to use 50% of the CPUs (i.e. 2 cores) and 50% of CPU time. The system becomes responsive again a couple of seconds after I exit the BOINC manager and tell it to also stop running apps, or as soon as the current CUDA workunit has finished. Even if I select the CUDA app that is running despite user activity and manually suspend the task, it keeps running. I never had this kind of problem with the non-CUDA SETI apps. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
You've given us a lot of information, but unfortunately you've missed out one item which could help. You say you have "the latest Boinc manager", but you don't give its actual version number: there have been a lot of "latest" managers recently! And we can't check the exact version number because your computers are hidden from us on this website. My guess is that you are using v6.4.5 or v6.4.7, the latest "stable", "recommended" BOINC releases. They have fairly rudimentary support for CUDA. There is much better CUDA support in the v6.6.xx range (currently in the final stages of testing with v6.6.20). You should find that these test versions do suspend the CUDA application, and remove it from graphics memory, when your preferences require it. The other problem, the sluggish response when typing or switching application windows, is a well-known flaw with the current SETI application. It tends to affect only a limited number of SETI tasks (the ones we describe as 'Very Low Angle Range' or VLAR). This problem will be with us until the application developers find a new way of doing the necessary calculations that fits the CUDA environment better, or new tools are released by the hardware suppliers to fix the problem. Unfortunately, not much progress seems to be being made on either of these fronts at the moment. The best advice I can give for now is to install BOINC v6.6.20 (which is very close to being released as a 'recommended' version), and suspend any VLAR tasks which cause problems on your screen until you can finish them off at a less inconvenient time. [Someone will probably dive in and say that there are all sorts of third-party modified versions which try to alleviate the VLAR problem, but I'm deliberately holding back from giving that response to your very first post on these boards. If you've been reading the other threads, you'll have found that advice anyway.] |
Joined: 20 Jan 02 Posts: 5 Credit: 236,021 RAC: 0 |
Thank you for your reply. My bad, I was talking about the latest recommended version of the manager, v6.4.7; sorry for not including the version number. I've just installed v6.6.20 and will let you know if the problem persists. |
SMR Joined: 29 Jun 99 Posts: 13 Credit: 2,648,825 RAC: 0 |
I have a similar problem. I have a Mac mini (PPC 1.25 GHz, 1 GB memory, 95% time to BOINC) running BOINC 6.2.18 on OS X 10.4.11, and any time I have the BOINC manager open it takes around 2 seconds to respond to any typing or mouse clicks. When I click away from or close the BOINC window, the computer responds normally. My other Mac mini (also 1.25 GHz PPC with 1 GB RAM), set up the same way and with the same optimized app (I think), works fine. (That one has BOINC 5.10.45 and has been running for quite some time.) If anyone looks at this, mini3 is the one with problems; our_mini is the one that works. The PC versions all seem fine. I will try to look at the BOINC FAQs to see if anyone else has had this problem, and will probably try installing the same version of BOINC as on the other (working) mini after running SETI out of work. |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"I have a similar problem" Perhaps so, but definitely not CUDA related, as a) you need BOINC 6.4 for CUDA support to be enabled, while b) a CUDA application is not available yet for the Macintosh. So you're posting it in the wrong forum and the wrong thread. Best post about it in the Macintosh forum, starting a new thread. |
Joined: 20 Jan 02 Posts: 5 Credit: 236,021 RAC: 0 |
I've been running BOINC manager 6.6.20 for a few days now and I have to report that the situation has not significantly improved. Sometimes the work units do get suspended properly, but sometimes the CUDA app is totally unfazed and continues to number-crunch along after I come back to the PC. If that happens, the PC is still downright unusable until I manually end the BOINC manager along with all apps. |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"If that happens, the PC is still downright unusable until I manually end the Boinc manager along with all apps." That is most probably due to the very low angle range tasks, for which we hope a newer SETI application will eventually come that will dump those unceremoniously to the CPU, instead of trying to crunch them on the GPU. Not much that BOINC can do about that. |
Joined: 20 Jan 02 Posts: 5 Credit: 236,021 RAC: 0 |
Any clue why some of the CUDA WUs still defy the order to suspend themselves? |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
If BOINC 6.4.x, then that's the problem. Try updating to 6.6.20, which has a lot of fixes for CUDA included, including the unloading out of video memory when you suspend BOINC (something 6.4 doesn't do). |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"Well I wish they'd hurry up and fix this vlar problem" We all do, but I don't think it's going to happen any time soon. Eric Korpela has stated in this post that the official stance on this is to only use CUDA when you're not using the computer. Since you have that option and a fix for VLAR is difficult or far off, best start using it. The default preference is now also set to use the GPU only when the computer is idle. |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"I only wish I knew beforehand how to tell a VLAR from a non-VLAR. Oh well." Use the Lunatics modified package with VLAR killer. |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"I just get computation errors with those. I've tried V9-V11; they all produce error -5 or -6, and I've been told they're not interested in anything other than -12 right now." As far as I understand the workings of the VLAR killer, the -5 and -6 are normal errors for what you're doing: you are scanning the tasks as they come in and immediately killing them. So they will be returned as an error, to be sent out to someone else. I don't say this is the right thing to do though; others call it 'cherry picking', favoring only those tasks you like. |
Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
LOL, OK. :-) |
Joined: 20 Jan 02 Posts: 5 Credit: 236,021 RAC: 0 |
The problem I am having is that the PC starts working on a CUDA work unit while it is idle. Then I come back to the PC and the BOINC manager claims that all work is suspended, but often the CUDA WU is in fact not suspended. The progress percentage keeps going up and the PC is awfully sluggish. The only possible solutions for me right now are a) manually shut down BOINC when I come back to the PC and notice it's being sluggish (and hopefully not forget to restart it when I leave the computer again), or b) disable CUDA altogether (I'm currently using BOINC 6.6.23, no improvement compared to 6.6.20). Right now, I'm hanging in there and using option a), hoping for a solution to pop up. |
Joined: 19 Mar 05 Posts: 551 Credit: 4,673,015 RAC: 0 |
"No, the -5 and -6 errors were not 'cause I killed them, they happened by themselves. Killing VLARs is different, as the -5 and -6 errors were not VLARs at all. And they've happened by the hundreds, so I don't consider them normal like a -9 error, as they ain't normal to me." How do you know they were not VLARs? The VLAR kill mod kills the VLAR WUs automatically by outputting an error of -5 or -6. It does the killing for you. If it did not error out like that, then it would be possible for 2 CUDA machines to validate killed VLAR WUs and put invalid data into the science database. Bob |
Who Know's Joined: 12 Sep 99 Posts: 9 Credit: 10,923,514 RAC: 0 |
Hi, I am running 6.6.20 and widget 2.8.0, and I still have to stop the manager, i.e. not just tell it to exit: I have to go and kill it and the setiathome CUDA process in Task Manager to get my GPU back. After I kill it, I can start the manager again and it stays idle until its scheduled start time. After it starts back up, I have to do it all over again to get a working PC back. Yes, I have it all set to stop when in use and not to keep tasks in memory. So in other words, CUDA gets its start command from the manager but does not get a stop from it. If it did get the stop command, I would not have to turn it off. But the manager keeps it alive unless the manager is turned off too. So is the manager keeping the start command active on the CUDA app? I build my own rigs. This rig is WinXP SP2 64-bit, AMD 9850, 8 GB RAM, 280 GTX, with an Asus MB. |
Who Know's Joined: 12 Sep 99 Posts: 9 Credit: 10,923,514 RAC: 0 |
Hi, well, I switched to the beta 6.6.25 with widget 2.8.0. The CUDA app kept running after I came back to the PC, but it was not as unresponsive as before. Yes, it kept running for 2 minutes until it was finished, but I was able to use the PC without any major issues. It is going in the right direction. Now, the PC that I use for my desktop work has an Asus motherboard with an onboard NVIDIA 8300 graphics chip. It is WinXP SP3 32-bit, no external video card, 4 GB of memory, nothing fancy, just something for work documents and some publishing. On that one I upgraded to the 6.6.25 beta and the lag is still very noticeable, and yes, it kept running for 5 minutes until it finished out the work unit. So the fix still has a long way to go before the everyday person can use CUDA without killing the WU and sending a bad error to the server. I feel that this is important, because if users can run it without having to worry about not being able to use their PC, then they will be happy to do the research. The second reason is that the fewer errors the server has to receive, and the fewer WUs it has to reissue to others, the lighter the load on the servers, which saves server resources and bandwidth. Thank you. |
Jeremiah Huston Joined: 26 Nov 00 Posts: 6 Credit: 785,318 RAC: 0 |
I have this exact same problem... on 3 computers. Is there any update as to when this is going to be fixed? I don't mind CUDA making my machine slow when I am not using it, but I do not understand why it cannot suspend correctly (or why the devs won't add some code to check this and force it to halt). |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
"I have this exact same problem.... On 3 computers. Is there any update as to when this is going to be fixed? I don't mind cuda making my machine slow when I am not using it... but I do not understand why it cannot suspend correctly (or why the devs) wont add some code to check this and force it to halt." There is a new option in the preferences that disables GPU crunching while the computer is in use. Does that fix the problem? |
Jeremiah Huston Joined: 26 Nov 00 Posts: 6 Credit: 785,318 RAC: 0 |
Yes it does... I just don't like the idea of my GPU going to waste. So for now I just stop and restart BOINC when I want to use one of the PCs. It's just really annoying. |
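
The behaviour discussed in this thread comes down to two checks: whether GPU work should run at all given the "only when idle" preference, and whether a task that is reported as suspended is in fact still making progress. The following is only a rough Python sketch of that logic, not BOINC's actual implementation. All names here (`TaskSample`, `gpu_should_run`, `ignores_suspend`) and the threshold values are hypothetical, and the samples are assumed to be progress readings noted down from the manager's task list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskSample:
    """One observation of a task (hypothetical structure, not a BOINC type)."""
    seconds: float            # when the reading was taken
    progress_percent: float   # progress shown for the task
    reported_suspended: bool  # whether the manager says it is suspended


def gpu_should_run(user_idle_seconds: float,
                   idle_threshold_seconds: float,
                   use_gpu_while_in_use: bool) -> bool:
    """Decide whether GPU work may run under an 'only when idle' preference."""
    if use_gpu_while_in_use:
        return True
    return user_idle_seconds >= idle_threshold_seconds


def ignores_suspend(samples: List[TaskSample],
                    min_rise_percent: float = 0.1) -> bool:
    """Return True if progress rose between two consecutive readings
    that were both reported as suspended (the symptom in this thread)."""
    for earlier, later in zip(samples, samples[1:]):
        both_suspended = earlier.reported_suspended and later.reported_suspended
        rise = later.progress_percent - earlier.progress_percent
        if both_suspended and rise >= min_rise_percent:
            return True
    return False


if __name__ == "__main__":
    readings = [
        TaskSample(0, 42.0, True),
        TaskSample(60, 43.5, True),  # still climbing while "suspended"
    ]
    print("GPU allowed while user is active:", gpu_should_run(0, 180, False))
    print("Task ignores suspend:", ignores_suspend(readings))
```

Comparing only consecutive readings keeps legitimate progress, made while the task was allowed to run, from being counted as a failure to suspend.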