Seti cuda not suspending, PC extremely sluggish

Duncan Blues
Joined: 20 Jan 02
Posts: 5
Credit: 236,021
RAC: 0
Germany
Message 882682 - Posted: 6 Apr 2009, 7:50:45 UTC

For some reason, my running SETI CUDA work units often do not suspend when I resume using the PC (more often than not), even though BOINC Manager says that work is suspended due to user activity. The progress percentage of the CUDA work unit shown in BOINC Manager continues to rise, while the other projects I run, such as Climate Prediction or Rosetta, stop as they should.
Whenever this happens, the PC is extremely sluggish. Switching between application windows takes several seconds, the mouse pointer gets jumpy and even typing is slow.
I've configured BOINC not to keep the calculation apps in memory while the user is active, but whenever the problem occurs, the SETI process (and often the other apps too) remains loaded, as shown in Task Manager.
I have the latest BOINC Manager and the latest NVIDIA driver, 182.50. The video card is a GeForce 9800 GT with 512 MB RAM, and the OS is XP Pro SP3 with all patches applied. The PC is an Intel Q8200 with 4 GB RAM. The system is properly cooled and not overclocked.
BOINC is configured to use 50% of the CPUs (i.e. 2 cores) and 50% of CPU time.
The system becomes responsive again a couple of seconds after I exit BOINC Manager and tell it to stop the running apps as well, or as soon as the current CUDA work unit has finished.
Even if I select the CUDA task that is running despite user activity and suspend it manually, it keeps running.
I never had this kind of problem with the non-CUDA SETI apps.
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 882687 - Posted: 6 Apr 2009, 8:35:01 UTC - in response to Message 882682.  

You've given us a lot of information, but unfortunately you've missed out one item which could help.

You say you have "the latest Boinc manager", but you don't give its actual version number: there have been a lot of "latest" managers recently! And we can't check the exact version number because your computers are hidden from us on this website.

My guess is that you are using v6.4.5 or v6.4.7, the latest "stable", "recommended" BOINC releases. They have fairly rudimentary support for CUDA. There is much better CUDA support in the v6.6.xx range (currently in the final stages of testing with v6.6.20). You should find that these test versions do suspend the CUDA application, and remove it from graphics memory, when your preferences require it.

The other problem, the sluggish response when typing or switching application windows, is a well-known flaw in the current SETI application. It tends to affect only a limited number of SETI tasks (the ones we describe as 'Very Low Angle Range' or VLAR). This problem will be with us until the application developers find a new way of doing the necessary calculations that fits the CUDA environment better, or the hardware suppliers release new tools to fix the problem. Unfortunately, little progress seems to be happening on either front at the moment.

The best advice I can give at the moment is to install BOINC v6.6.20 (which is very close to being released as a 'recommended' version), and suspend any VLAR tasks which cause problems on your screen until you can finish them off at a less inconvenient time.

[Someone will probably dive in and say that there are all sorts of third-party modified versions which try to alleviate the VLAR problem, but I'm deliberately holding back from giving that response to your very first post on these boards. If you've been reading the other threads, you'll have found that advice anyway]
Duncan Blues
Joined: 20 Jan 02
Posts: 5
Credit: 236,021
RAC: 0
Germany
Message 882700 - Posted: 6 Apr 2009, 10:53:34 UTC

Thank you for your reply.

My bad: I was talking about the latest recommended version of the manager, v6.4.7. Sorry for not including the version number.
I've just installed v6.6.20 and will keep you updated on whether the problem persists.

SMR
Volunteer tester
Joined: 29 Jun 99
Posts: 13
Credit: 2,648,825
RAC: 0
United States
Message 883371 - Posted: 8 Apr 2009, 13:41:18 UTC - in response to Message 882700.  

I have a similar problem. I have a Mac mini (PPC 1.25 GHz, 1 GB memory, 95% of time given to BOINC) running BOINC 6.2.18 on OS X 10.4.11, and any time I have BOINC Manager open it takes around 2 seconds to respond to any typing or mouse clicks. When I click away from or close the BOINC window, the computer responds normally. My other Mac mini (also a 1.25 GHz PPC with 1 GB RAM), set up the same way and with the same optimized app (I think), works fine. (That one has BOINC 5.10.45 and has been running for quite some time.) If anyone looks at this, mini3 is the one with problems and our_mini is the one that works. The PC versions all seem fine.
I will try to look at the BOINC FAQs to see if anyone else has had this problem, and will probably try installing the same version of BOINC as on the other (working) mini after running SETI out of work.
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 883406 - Posted: 8 Apr 2009, 15:10:19 UTC - in response to Message 883371.  

I have a similar problem

Perhaps so, but it is definitely not CUDA-related: a) you need BOINC 6.4 or later for CUDA support to be enabled, and b) a CUDA application is not yet available for the Macintosh.

So you're posting it in the wrong forum and the wrong thread. Best post about it in the Macintosh forum, starting a new thread.
Duncan Blues
Joined: 20 Jan 02
Posts: 5
Credit: 236,021
RAC: 0
Germany
Message 883911 - Posted: 10 Apr 2009, 12:17:48 UTC

I've been running BOINC Manager 6.6.20 for a few days now, and I have to report that the situation has not significantly improved. Sometimes the work units do get suspended properly, but sometimes the CUDA app is totally unfazed and continues to number-crunch along after I come back to the PC. When that happens, the PC is still downright unusable until I manually shut down BOINC Manager along with all apps.
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 883916 - Posted: 10 Apr 2009, 12:30:43 UTC - in response to Message 883911.  

If that happens, the PC is still downright unusable until I manually end the Boinc manager along with all apps.

That is most probably due to the very low angle range (VLAR) tasks. We hope a newer SETI application will eventually come along that dumps those unceremoniously onto the CPU instead of trying to crunch them on the GPU. There is not much that BOINC can do about it.
Duncan Blues
Joined: 20 Jan 02
Posts: 5
Credit: 236,021
RAC: 0
Germany
Message 885330 - Posted: 14 Apr 2009, 16:00:17 UTC

Any clue why some of the CUDA WUs still defy the order to suspend themselves?

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 885338 - Posted: 14 Apr 2009, 16:22:24 UTC - in response to Message 885330.  

If you're on BOINC 6.4.x, then that's the problem. Try updating to 6.6.20, which includes a lot of CUDA fixes, including unloading the application from video memory when you suspend BOINC (something 6.4 doesn't do).
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 885749 - Posted: 16 Apr 2009, 6:51:16 UTC - in response to Message 885734.  

Well I wish they'd hurry up and fix this vlar problem

We all do, but I don't think it's going to happen any time soon. Eric Korpela has stated in this post that the official stance on this is to only use CUDA when you're not using the computer.

Since you have that option and a fix for VLAR is difficult or far off, best start using it. The default preference is now also set to use the GPU only when the computer is idle.
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 885762 - Posted: 16 Apr 2009, 7:44:55 UTC - in response to Message 885761.  

I only wish I knew before hand how to tell a VLAR from a non VLAR, Oh well.

Use the Lunatics modified package with VLAR killer.
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 885806 - Posted: 16 Apr 2009, 15:31:13 UTC - in response to Message 885799.  

I just get computation errors with those, I've tried V9-V11 they all produce error -5 or -6 and I've been told their not interested in anything other than -12 right now.

As far as I understand the working of the VLAR killer, the -5 and -6 are normal errors for what you're doing: You are scanning the tasks as they come in and immediately killing them. So they will be returned as an error, to be sent out to someone else.

I'm not saying this is the right thing to do, though. Others call it 'cherry picking': favoring only the tasks you like.
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 885815 - Posted: 16 Apr 2009, 16:10:55 UTC - in response to Message 885812.  

LOL, OK. :-)
Duncan Blues
Joined: 20 Jan 02
Posts: 5
Credit: 236,021
RAC: 0
Germany
Message 887551 - Posted: 23 Apr 2009, 11:36:13 UTC - in response to Message 885749.  


the official stance on this is to only use CUDA when you're not using the computer.

Since you have that option and a fix for VLAR is difficult or far off, best start using it. The default preference is now also set to use the GPU only when the computer is idle.


The problem I am having is that the PC starts working on a CUDA work unit while it is idle. Then I come back to the PC and BOINC Manager claims that all work is suspended, but often the CUDA WU is in fact not suspended. The progress percentage keeps going up and the PC is awfully sluggish.

The only possible solutions for me right now are

a) manually shut down Boinc when I come back to the PC and notice it's being sluggish (and hopefully not forget to restart it when I leave the comp again)

b) disable CUDA altogether

(I'm currently using Boinc 6.6.23, no improvement compared to 6.6.20)

Right now, I'm hanging in there and using option a), hoping for a solution to pop up.
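
For anyone who wants to script option a), here is a minimal sketch, not a tested fix. It assumes BOINC's command-line tool boinccmd is on the PATH and that its --quit option (which asks the running client to exit) is available in your client version. The helper names and the data_dir parameter are made up for illustration. boinccmd usually has to be run from the BOINC data directory (or be given the RPC password) so that it can authenticate to the client.

```python
import subprocess

# Assumption: "boinccmd" resolves to BOINC's command-line control tool.
BOINCCMD = "boinccmd"


def quit_command(boinccmd=BOINCCMD):
    """Return the argument list that asks the running BOINC client to exit."""
    return [boinccmd, "--quit"]


def stop_boinc(data_dir=None, run=subprocess.run):
    """Ask the client to exit; return True if boinccmd reported success.

    data_dir is a hypothetical parameter: the BOINC data directory, used as
    the working directory so boinccmd can find the GUI RPC password file.
    run is injectable so the function can be tried without BOINC installed.
    """
    result = run(quit_command(), cwd=data_dir, check=False)
    return result.returncode == 0

# Usage (only on a machine where boinccmd exists):
#   stop_boinc(data_dir=r"C:\path\to\BOINC\data")
```

Restarting BOINC afterwards is left out on purpose, since the install location differs from machine to machine.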
popandbob
Volunteer tester
Joined: 19 Mar 05
Posts: 551
Credit: 4,673,015
RAC: 0
Canada
Message 887669 - Posted: 23 Apr 2009, 18:10:23 UTC - in response to Message 885812.  

No the -5 and -6 errors were not cause I killed them, they happened by themselves, Killing VLARs is different as the -5 and -6 errors were not VLARs at all. And they've happened by the hundreds, So I don't consider them normal like a -9 error as they ain't normal to Me.


How do you know they were not VLARs?
The VLAR kill mod kills VLAR WUs automatically by making them exit with an error of -5 or -6.
It does the killing for you.
If it did not error out like that, it would be possible for two CUDA machines to validate killed VLAR WUs and put invalid data into the science database.

Bob



Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957
Who Know's
Joined: 12 Sep 99
Posts: 9
Credit: 10,923,514
RAC: 0
United States
Message 890175 - Posted: 1 May 2009, 14:31:22 UTC

Hi,
I am running 6.6.20 and widget 2.8.0, and I still have to stop the manager. That is, telling it to exit is not enough: I have to kill it and the setiathome CUDA process in Task Manager to get my GPU back.
After I kill it, I can start the manager up again, and it stays idle until the manager's scheduled start time. Once it starts back up, I have to do it all over again to get a working PC back.
Yes, I have it all set to stop when the computer is in use and not to keep apps in memory.
So, in other words, CUDA gets its start command from the manager but does not get a stop from it. If it did get the stop command, I would not have to turn it off. But the manager keeps it alive unless the manager is turned off too.
So is the manager keeping the start command active on the CUDA app?
I build my own rigs. This rig is:
Windows XP SP2 64-bit
AMD 9850
8 GB RAM
GTX 280
ASUS motherboard
Who Know's
Joined: 12 Sep 99
Posts: 9
Credit: 10,923,514
RAC: 0
United States
Message 890550 - Posted: 2 May 2009, 14:28:38 UTC

Hi,
Well, I switched to the beta 6.6.25 with widget 2.8.0.
The CUDA task kept running after I came back to the PC, but the PC was not as unresponsive as before. It kept running for 2 minutes until it finished, but I was able to use the PC without any major issues. It is going in the right direction.

The PC that I use for my desktop work has an ASUS motherboard with an onboard NVIDIA 8300 graphics chip. It is:
Windows XP SP3 32-bit
no external video card
4 GB of memory

Nothing fancy, just something for work documents and some publishing.
On that one I also upgraded to the 6.6.25 beta, and the lag is still very noticeable. It kept running for 5 minutes until it finished the work unit.
So the fix still has a long way to go before everyday users can run CUDA without killing the WU and sending an error back to the server.
I feel that this is important for two reasons. First, if users can run it without worrying about being unable to use their PC, they will be happy to do the research. Second, the fewer errors the server receives, the fewer WUs it has to reissue to others, which lightens the load on the servers and saves server resources and bandwidth.
Thank you
Jeremiah Huston
Joined: 26 Nov 00
Posts: 6
Credit: 785,318
RAC: 0
United States
Message 900958 - Posted: 29 May 2009, 15:01:07 UTC - in response to Message 890550.  

I have this exact same problem... on 3 computers. Is there any update as to when this is going to be fixed? I don't mind CUDA making my machine slow when I am not using it, but I do not understand why it cannot suspend correctly (or why the devs won't add some code to check this and force it to halt).
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 901095 - Posted: 29 May 2009, 19:08:19 UTC - in response to Message 900958.  

I have this exact same problem... on 3 computers. Is there any update as to when this is going to be fixed? I don't mind CUDA making my machine slow when I am not using it, but I do not understand why it cannot suspend correctly (or why the devs won't add some code to check this and force it to halt).


There is a new option in the preferences that disables GPU crunching while the computer is in use. Does that fix the problem?
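
If you would rather set this locally than through the website, here is a rough sketch of generating a local preferences override. It assumes the client reads a global_prefs_override.xml file from its data directory and understands the run_gpu_if_user_active, leave_apps_in_memory and idle_time_to_run tags, as described in the BOINC client documentation; older 6.6.x builds may not honor all of them. The function name and the default values are only for illustration.

```python
import xml.etree.ElementTree as ET


def build_override(run_gpu_if_user_active=False,
                   leave_apps_in_memory=False,
                   idle_minutes=3):
    """Return the text of a preferences override as an XML string.

    Tag names are assumptions taken from BOINC's documented override
    format; check them against your own client version before relying
    on them.
    """
    root = ET.Element("global_preferences")
    # 0 = do not use the GPU while the user is active.
    ET.SubElement(root, "run_gpu_if_user_active").text = (
        "1" if run_gpu_if_user_active else "0")
    # 0 = remove suspended apps from memory.
    ET.SubElement(root, "leave_apps_in_memory").text = (
        "1" if leave_apps_in_memory else "0")
    # Minutes without input before the computer counts as idle.
    ET.SubElement(root, "idle_time_to_run").text = str(idle_minutes)
    return ET.tostring(root, encoding="unicode")


print(build_override())
```

The output would be saved as global_prefs_override.xml in the BOINC data directory, after which the client has to re-read its preferences (or be restarted) for the change to take effect.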
Jeremiah Huston
Joined: 26 Nov 00
Posts: 6
Credit: 785,318
RAC: 0
United States
Message 901257 - Posted: 30 May 2009, 2:26:49 UTC - in response to Message 901095.  

Yes, it does... I just don't like the idea of my GPU going to waste. So for now I just stop and restart BOINC when I want to use one of the PCs. It's just really annoying.
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.