Aborted Tasks
Author | Message |
---|---|
Swordfish Joined: 5 Aug 06 Posts: 72 Credit: 3,014,493 RAC: 0 |
I've had to abort over 800 ATI tasks on one of my machines. Over the last day or two, 800 ATI tasks suddenly downloaded, which is very strange as I usually have at most 20 waiting and 3 running. I noticed later that these tasks began to hang at 100% in BOINC Manager. I'm not prepared to mince around with this, as 800 ATI tasks should not have downloaded in the first place. I normally only get anywhere from 1 to 30, with my settings of 5.00 and 0.01 days of work. CPU tasks are unaffected. My other PC, which processes CPU and ATI tasks, is also unaffected. I've therefore aborted these tasks, and will await further tasks and see how they progress. I've never had this occurrence before. Apologies to wingpeople, but I simply have not got time to faff around with this. |
Swordfish Joined: 5 Aug 06 Posts: 72 Credit: 3,014,493 RAC: 0 |
It happened again. What's going on here? Another 800 ATI GPU tasks downloaded today. I should only get around 30 maximum. Two of these 800 tasks have reached 100% and just stay there. They appear to hang, preventing further ATI tasks from starting. I'm in the dark here, so I've reset the project this time and will wait to see what happens tomorrow. SETI's servers won't allow any more, saying the computer "has finished a daily quota of 1 tasks", whatever that means. Can anyone throw any light on this annoying problem? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
You probably need to set the <sched_op_debug> logging flag in cc_config.xml, so you can see in a little more detail how much work (how many seconds) your computer is requesting, and how long the work you receive in response to those requests is estimated (again, by your computer) to take. |
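As a rough illustration of the file layout this refers to, here is a minimal Python sketch that builds the contents of a cc_config.xml with the `sched_op_debug` log flag switched on. The element names `cc_config`, `log_flags` and `sched_op_debug` follow the BOINC client configuration format; the helper name `build_cc_config` is purely illustrative, and saving the result into the BOINC data directory is left to the reader.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml content with the given log flags switched on."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each enabled flag is an element whose text is "1".
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config(["sched_op_debug"]))
```

The client normally has to re-read its config files (or be restarted) before the extra scheduler messages appear in the event log.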
skildude Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
800 isn't "a lot". I assume your GPU is getting these WUs because SETI thinks it can crunch them before the deadline. A newer GPU can run through 300+ a day. I assume yours isn't the latest and greatest. If you think you are getting too many WUs, then reduce your cache to 1-2 days. If that is too much, then change the "contact server every" setting to 1 day or less. Just a demonstration: I have one PC that currently has 4800 WUs in progress. My PCs with GPUs all have large caches. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
Swordfish Joined: 5 Aug 06 Posts: 72 Credit: 3,014,493 RAC: 0 |
Guys, I appreciate what has been said, but what has happened of late is abnormal for my setup, and these tasks hang at 100%, preventing other tasks from running. I never get this many ATI GPU tasks. The other computer is almost identical to this one, and it's running as this one was until a day or two ago. I had changed nothing on my computer before these tasks downloaded. Anyway, I've taken drastic action on this computer by detaching from the project and deleting all BOINC and SETI files, including those in the program files and data directories. I reinstalled BOINC, reattached SETI, and optimised it for ATI GPU tasks with my backed-up copy of L*****c. I can't get any ATI tasks at present, as the SETI server is saying "this computer has finished a daily quota of 10 tasks". |
Slavac Joined: 27 Apr 11 Posts: 1932 Credit: 17,952,639 RAC: 0 |
|
Swordfish Joined: 5 Aug 06 Posts: 72 Credit: 3,014,493 RAC: 0 |
The clean install of BOINC etc. made no difference. 600+ ATI GPU tasks downloaded this morning on this computer. I've set the "no more tasks" parameter until I get rid of this lot. I'll see what happens when I re-enable it in the future. :) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
What did the event log say about the amount of work requested and received? |
Swordfish Joined: 5 Aug 06 Posts: 72 Credit: 3,014,493 RAC: 0 |
Unfortunately, Richard, the event log has been reset. The log only goes back 2 hours, and there's no work request data logged because I'm not requesting any at the moment. When I start requesting work again, I'll see what's been logged. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
"Unfortunately Richard the event log has been reset." Older messages, predating the last BOINC restart, can usually be found in the file stdoutdae.txt in the BOINC data directory (for the location, see the startup messages in the current message log). |
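A small sketch of how one might pull the scheduler-related lines out of stdoutdae.txt once it has been located. It is only a plain substring filter: the default marker `[sched_op]` is an assumption about how the debug lines are tagged, and the function name `scheduler_lines` is made up for this example, so adjust the marker to whatever the log actually shows.

```python
import sys


def scheduler_lines(log_text, marker="[sched_op]"):
    """Return the log lines that contain the marker (assumed tag, see note)."""
    return [line for line in log_text.splitlines() if marker in line]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_sched.py <path to stdoutdae.txt>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for line in scheduler_lines(fh.read()):
            print(line)
```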
bill Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0 |
Just what manufacturer, make and model is the graphics card on the machine in question? Example: Gigabyte GTX560Ti GV-N560OC-1GI. Just what do you have set for "Minimum work buffer"? Just what do you have set for "Max additional work buffer"? |
Christoph Joined: 21 Apr 03 Posts: 76 Credit: 355,173 RAC: 0 |
If I have tasks which continue running after reaching 100% (no, I don't mean just 10 minutes), I have to reboot my machine. If I only close BOINC and restart it, it will resume from the last checkpoint but still not complete the task. I will write to the Alpha mailing list about it tomorrow or so to find out which log flags to set, and then wait for it to happen again in a couple of weeks... Addition: did the estimated time to completion go down significantly? Then you need to wait and complete at least 10 tasks (which need to validate) before the server will recalculate the time correction factor. Christoph |
Swordfish Joined: 5 Aug 06 Posts: 72 Credit: 3,014,493 RAC: 0 |
I'm using a Radeon HD 6670. Min work buffer is 5.00 days; max additional work buffer is 0.01. I've reset the app_info count setting from .33 back to the default of 1. It seems to be reporting OK now, but I did get another 200 ATI files today. Funny, my other computer has the same graphics card and virtually the same setup, and that is running 3 ATI tasks OK using a count setting of .33. This computer was also running OK until a few days ago using the count setting of .33. I'll leave both running with the default count setting of 1, see if things settle down, then adjust the count settings back to .33. Running 3 ATI tasks on the card didn't seem to affect my setup, other than causing video to stutter occasionally. I therefore included settings in BOINC's exclusive apps to suspend BOINC if I am using iexplore.exe, vlc.exe and googleearth.exe. Temperatures are about 48C for the other computer's GPU and about 56C for this one. CPU temps are about the same as the GPU. |
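To make the count setting concrete: a count of .33 lets three tasks share the card, while the default of 1 gives each task the whole GPU. The sketch below assumes the count is simply the fraction of one GPU reserved per task, so the number of simultaneous tasks is the whole number of times that fraction fits into 1. The function name `tasks_per_gpu` is illustrative and not part of BOINC.

```python
import math


def tasks_per_gpu(count):
    """How many tasks fit on one GPU if each reserves `count` of it."""
    if not 0 < count <= 1:
        raise ValueError("count is expected to be in (0, 1]")
    # Small epsilon guards against float rounding, e.g. 1 / 0.2.
    return math.floor(1 / count + 1e-9)


if __name__ == "__main__":
    for count in (0.33, 0.5, 1):
        print(count, tasks_per_gpu(count))
```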
bill Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0 |
"I'm using Radeon HD 6670" OK, there was a huge shorty storm a while back, which may have caused your experience. If your card was capable of crunching 7 work units an hour, set to crunch 3 at a time, 24 hours a day with a 5-day cache, 800+ work units in your cache is a reasonable number. My wild ass guess is that the servers saw you going like gangbusters through work units (shorties, remember) and at the same time something was done at the servers to increase the number of downloads per request. There were posts of people getting 100+ work units per download, so what you saw was your cache FINALLY being filled up to its maximum of 5+ days' worth of work. I can see where that might be unsettling to someone who has never experienced that sort of thing before. What you can do is change the setting for how many work units run at one time from .33 to 1 (which I see you have done). This should drop the number of downloads by two thirds, BUT NOT IMMEDIATELY. There is some time lag in the system that would make this take a few days. Quicker would be to set your cache down to 1 day max and .01 additional. This will probably cause SETI to go into high priority mode to clear out the excess. DON'T WORRY about it, just be patient and give BOINC a week to settle down. The real problem was not that you got a metric buttload of work units all at once, but that you weren't getting all the work units you were previously requesting in the first place. To figure out the cache you want, take the number of work units your GPU and CPU can do per hour (on average) and multiply that by how many hours a day you are going to crunch work units. Say 5 WU per hour for the GPU and 2 for the CPU: (5 + 2) x 24 hrs = 168 per day. If you want a 5-day cache, expect to see about 5 x 168 = 840 work units in your cache if everything works as it should. In the meantime, patience: there is quite a bit of lag built into BOINC. Don't worry about High Priority. Don't worry about missed deadlines. Don't worry about wingmen.
Whatever you don't get done will be done by somebody else eventually. This is not a race. Just give your system a few days to weeks to settle down. As long as you are not producing errors, everything will adjust itself eventually. Now here is where others tell you where I'm wrong. The good thing is that between me and them you'll get the answer you want. :) |
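The cache arithmetic in the post above can be written out as a short Python sketch. The figures (5 GPU and 2 CPU work units per hour, 24 hours a day, a 5-day cache) are the ones used in the post; the function name `expected_cache` is just a label for this example, and real throughput will vary with the mix of work units.

```python
def expected_cache(gpu_per_hour, cpu_per_hour, hours_per_day, cache_days):
    """Return (work units per day, work units needed to fill the cache)."""
    per_day = (gpu_per_hour + cpu_per_hour) * hours_per_day
    return per_day, per_day * cache_days


if __name__ == "__main__":
    per_day, total = expected_cache(5, 2, 24, 5)
    print(per_day, total)  # 168 840
```

On these numbers, dropping the cache from 5 days to 1 day would bring the expected total down from 840 to 168.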
Swordfish Joined: 5 Aug 06 Posts: 72 Credit: 3,014,493 RAC: 0 |
Thanks for those comments, Bill. I'll run with a GPU count of 1 for a week, and then see if I can move to .33 again. I was more worried about WUs that reached 100% and stayed there in BOINC Manager, preventing others from starting. Anyway, I'm currently running with app_info set to count 1 and everything is running and reporting OK. Again, thanks Bill. |
bill Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0 |
"Thanks for those comments Bill" You're welcome. I would wait at most an hour on the ones that seem to be hung at 100%, then abort them. Maybe it's the work unit or maybe it's the machine, but there's no sense letting it run forever. Like I said, it will be crunched by somebody eventually if it's crunchable. If all of the work units hang, then I'd look into a computer solution. If it's a bad work unit, a history of more than one person aborting it will show it. Aborting a work unit is not a crime. |
musicplayer Joined: 17 May 10 Posts: 2430 Credit: 926,046 RAC: 0 |
Apparently too late to validate. http://setiathome.berkeley.edu/workunit.php?wuid=1091613861 http://setiathome.berkeley.edu/result.php?resultid=2661068021 |