| Author |
Message |
|
|
|
I've had to abort over 800 ATI tasks on one of my machines.
Over the last day or 2, 800 ATI tasks suddenly downloaded, which is very strange, as I usually only have at most 20 waiting and 3 running.
I noticed later that these tasks began to hang at 100% in Boinc Manager.
I'm not prepared to mince around with this, as 800 ATI tasks should not have downloaded in the first place.
I normally only get anything from 1-30, with my settings of 5.00 and 0.01 days of work.
CPU tasks are unaffected.
My other pc processing CPU and ATI tasks are also unaffected.
I've therefore aborted these tasks, and will await further tasks and see how they progress.
Never had this occurrence before.
Apologies to wingpeople, but I simply have not got time to faff around with this. |
|
|
|
|
|
It happened again.
What's going on here?
Another 800 ATI GPU tasks downloaded today.
I should only get around 30 maximum.
2 of these 800 tasks have reached 100% and just stay there.
They appear to hang, preventing further ATI tasks from starting.
I'm in the dark here, so I've reset the project this time, and will wait to see what happens tomorrow.
Seti's servers won't allow any more, saying the computer "has finished a daily quota of 1 tasks".
Whatever that means?
Can anyone throw any light on this annoying problem? |
|
|
|
|
|
You probably need to set the <sched_op_debug> logging flag in cc_config.xml, so you can see in a little more detail how much work (how many seconds) your computer is requesting, and how long the work you receive in response to those requests is estimated (again, by your computer) to take. |
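For anyone unsure what that file should contain, here is a minimal sketch in Python that builds the text of a cc_config.xml with the sched_op_debug flag switched on. The helper name build_cc_config is made up for illustration, and the sketch assumes you have no other options in cc_config.xml that you need to keep; if you do, add the flag to your existing file by hand instead.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1.

    Hypothetical helper for illustration only; it does not touch any file.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each log flag is its own element whose text is 1 (on) or 0 (off).
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


config_text = build_cc_config(["sched_op_debug"])
print(config_text)
```

Save the printed text as cc_config.xml in the BOINC data directory, then restart BOINC (or have it re-read its config file) so the new flag takes effect.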
|
|
|
|
|
800 isn't "a lot". I assume your GPU is getting these WUs because Seti thinks it can crunch them before the deadline. A newer GPU can run through 300+ a day.
I assume yours isn't the latest and greatest.
If you think you are getting too many WUs, then reduce your cache to 1-2 days. If that is still too much, then change the "contact server every" setting to 1 day or less.
Just a demonstration: I have 1 PC that currently has 4800 WUs in progress. My PCs with GPUs all have large caches.
____________
Proud member of TSWB.
End terrorism by building a school
|
|
|
|
|
|
Guys
I appreciate what has been said, but what has happened of late is abnormal for my set-up, and these tasks hang at 100%, preventing other tasks from running.
I never get this many ATI GPU tasks.
The other computer is almost identical to this one, and it's running as this one was until a day or 2 ago.
I had changed nothing on my computer before these tasks downloaded.
Anyway, I've taken drastic action on this computer by detaching from the project and deleting all Boinc and Seti files, including those in the program files and data directories.
I reinstalled Boinc, reattached Seti, and optimised it for ATI GPU tasks with my backed-up copy of L*****c.
I can't get any ATI tasks at present, as the Seti server is saying "this computer has finished a daily quota of 10 tasks". |
|
|
|
|
|
Don't stress. I've aborted 5000 at least twice and 3000 another time due to sheer stupidity on my part.
____________
Executive Director GPU Users Group Inc. -
brad@gpuug.org |
|
|
|
|
|
The clean install of Boinc etc. made no difference.
600+ ATI GPU tasks downloaded this morning on this computer.
I've set the "no more tasks" parameter till I get rid of this lot.
I'll see what happens when I re-enable this parameter in the future.
:) |
|
|
|
|
|
What did the event log say about the amount of work requested and received? |
|
|
|
|
|
Unfortunately, Richard, the event log has been reset.
The log only goes back 2 hours, and there's no work request data logged, because I'm now not requesting any.
When I start requesting work again, I'll see what's been logged.
|
|
|
|
|
Unfortunately, Richard, the event log has been reset.
The log only goes back 2 hours, and there's no work request data logged, because I'm now not requesting any.
When I start requesting work again, I'll see what's been logged.
Older messages, predating the last BOINC restart, can usually be found in the file stdoutdae.txt in the BOINC data directory (for location, see the startup messages in the current message log). |
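If the file is large, a few lines of Python can pull out just the scheduler messages. This is only a sketch: find_log_lines is a made-up helper, and the default keyword "sched_op" is an assumption about how the lines written by the sched_op_debug flag are tagged, so change it if your log uses a different tag.

```python
from pathlib import Path


def find_log_lines(log_path, keyword="sched_op"):
    """Return every line of the log file that contains the keyword.

    The default keyword is an assumption about how scheduler debug
    lines are tagged; pass a different one if your log differs.
    """
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines() if keyword in line]
```

Call it with the full path to stdoutdae.txt on your own machine and print the returned lines; the path is not hard-coded here because the data directory location varies between installations.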
|
|
|
|
|
Just what manufacturer, make and model is the graphics card on the machine in question?
Example: Gigabyte GTX560Ti GV-N560OC-1GI
Just what do you have set for "Minimum work buffer"?
Just what do you have set for "Max additional work buffer"? |
|
|
|
|
|
If I have tasks which continue running after reaching 100% (No, I don't mean just 10 Minutes) I have to reboot my machine.
If I only close BOINC and restart it, it will resume from the last checkpoint but still not complete the task.
I will write to the Alpha mailing list about it tomorrow or so to find out which log flags to set.
And then wait for it to happen again in a couple of weeks...
Addition: did the estimated time to completion go down significantly? Then you need to wait and complete at least 10 tasks (which need to validate) before the server will recalculate the time correction factor.
____________
Christoph |
|
|
|
|
|
I'm using Radeon HD 6670
Min work buffer is 5.00 days, Max additional work buffer is 0.01
I've reset the app_info count setting from .33 back to the default of 1.
It seems to be reporting ok now, but I did get another 200 ATI files today.
Funny, my other computer has the same graphics card and virtually the same setup, and that is running 3 ATI tasks ok using a count setting of .33.
This computer was also running ok until a few days ago using the count setting of .33.
I'll leave both running with a default count setting of 1.
See if things settle down, then adjust the count settings back to .33.
Running 3 ATI tasks on the card didn't seem to affect my setup, other than causing video to stutter occasionally.
I therefore included settings in BOINC exclusive apps to suspend BOINC if I am using iexplore.exe, vlc.exe and googleearth.exe.
Temperatures are about 48C for the other computer's GPU and about 56C for this one.
CPU temps are about the same as the GPU
|
|
|
|
|
I'm using Radeon HD 6670
Min work buffer is 5.00 days, Max additional work buffer is 0.01
(sniporest)
Ok, there was a huge shorty storm a while back, which may have caused your experience. If your card was capable of crunching 7 work units an hour, set to crunch 3 at a time, 24 hours a day with a 5 day cache, then 800+ work units in your cache is a reasonable number.
My wild ass guess is that the servers saw you going like gangbusters through work units (shorties, remember) and at the same time something was done at the servers to increase the number of downloads per request. There were posts of people getting 100+ work units per download, so what you saw was your cache FINALLY being filled up to its maximum for 5+ days' worth of work.
I can see where that might be unsettling to someone who never has experienced that sort of thing before.
What you can do is change the setting from .33 to 1 for how many work units to run at one time (which I see you have done). This should drop the number of downloads by 2/3rds, BUT NOT IMMEDIATELY. There is some time lag in the system, so this would take a few days. Quicker would be to set your cache down to 1 day max and .01 additional. This will probably cause Seti to go into high priority mode to clear out the excess. DON'T WORRY about it, just be patient and give Boinc a week to settle down.
The real problem was not that you got a metric buttload of work units all at once, but that you weren't getting all the work units you were requesting in the first place.
To figure out the cache you want, take the number of work units your GPU and CPU can do per hour (on average) and multiply that by how many hours a day you are going to crunch work units.
Say 5 WU per hour for the GPU and 2 for the CPU: (5 + 2) x 24 hrs = 168 per day.
If you want a 5 day cache, expect to see about 5 x 168 = 840 work units in your cache if everything works as it should.
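The same sum as a rough Python sketch, in case you want to plug in your own numbers. The function name is made up, and the rates are just the example figures above, not measurements from any real card.

```python
def expected_cache_size(gpu_per_hour, cpu_per_hour, hours_per_day, cache_days):
    """Estimate work units done per day and held in a full cache.

    The rates are rough averages that you supply yourself.
    """
    per_day = (gpu_per_hour + cpu_per_hour) * hours_per_day
    return per_day, per_day * cache_days


# Example figures from the post: 5 WU/hour on the GPU, 2 on the CPU,
# crunching 24 hours a day with a 5 day cache.
per_day, cached = expected_cache_size(5, 2, 24, 5)
print(per_day, cached)  # 168 840
```

Dropping the cache to 1 day with the same rates gives about 168 work units, which is why a smaller cache setting is the quicker fix.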
In the meantime, patience. There is quite a bit of lag built into Boinc.
Don't worry about High Priority.
Don't worry about missed deadlines.
Don't worry about wingmen.
Whatever you don't get done will be done by somebody else eventually.
This is not a race. Just give your system a few days to weeks to settle down. As long as you are not producing errors, everything will adjust itself eventually.
Now here is where others tell you where I'm wrong. The good thing is that between me and them, you'll get the answer you want. :) |
|
|
|
|
|
Thanks for those comments Bill
I'll run with a GPU count of 1 for a week, and then see if I can move to .33 again.
I was more worried about WUs that reached 100% staying there and not moving from BOINC Manager preventing others from starting.
Anyway, I'm currently running with app_info set to count 1 and everything is running and reporting ok.
Again thanks Bill
|
|
|
|
|
Thanks for those comments Bill
I'll run with a GPU count of 1 for a week, and then see if I can move to .33 again.
I was more worried about WUs that reached 100% staying there and not moving from BOINC Manager preventing others from starting.
Anyway, I'm currently running with app_info set to count 1 and everything is running and reporting ok.
Again thanks Bill
You're welcome.
I would wait at most an hour on the ones that seem to be hung at 100%, then abort them. Maybe it's the work unit or maybe it's the machine, but there's no sense letting it run forever. Like I said, it will be crunched by somebody eventually if it's crunchable. If all of the work units hang, then I'd look into a computer solution. If it's a bad work unit, a history of more than one person aborting it will show it. Aborting a work unit is not a crime.
|
|
|