Of the Woods (Feb 19 2009)



Message boards : Technical News : Of the Woods (Feb 19 2009)

Swibby Bear
Joined: 1 Aug 01
Posts: 236
Credit: 7,276,504
RAC: 1
United States
Message 868279 - Posted: 22 Feb 2009, 22:58:12 UTC - in response to Message 868254.

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.


... or one or more WUs are suspended in the task list on your computer.

XWing69
Joined: 3 Jan 08
Posts: 43
Credit: 2,448,408
RAC: 504
United States
Message 868281 - Posted: 22 Feb 2009, 23:05:30 UTC - in response to Message 868254.

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.


...finally downloading, sort of. I can see the bandwidth problem. I have 12 WUs in the queue trying to download, but the two that are trying to download are not actually downloading and sometimes time out and have to retry. Oh, only on a single project (SETI). Will take the moderator's advice and put on a backup project (lower priority) so I don't run dry when SETI issues arise.


____________

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4207
Credit: 34,463,533
RAC: 20,368
United Kingdom
Message 868282 - Posted: 22 Feb 2009, 23:11:53 UTC - in response to Message 868279.

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.


... or one or more WUs are suspended in the task list on your computer.


... or there are more than (about) 4 WUs waiting to upload, especially if any uploads are counting down to their next try.

Claggy

Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 868301 - Posted: 22 Feb 2009, 23:52:27 UTC - in response to Message 868282.

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.


... or one or more WUs are suspended in the task list on your computer.


... or there are more than (about) 4 WUs waiting to upload, especially if any uploads are counting down to their next try.

Claggy

More accurately, that's twice the number of processors trying to upload.

F.
____________

Tim Lee
Joined: 15 Feb 00
Posts: 22
Credit: 9,657,943
RAC: 2,339
Australia
Message 868309 - Posted: 23 Feb 2009, 0:29:24 UTC - in response to Message 867158.

I wouldn't be surprised if there are network hiccups or if the assimilator queue swells during the weekend.
- Matt


As I write this the server status page shows 2,594,416 "Results returned and awaiting validation". This seems quite an achievement, as I am unable to return any results; my fastest machine has about 100 results which cannot upload. I guess this will get sorted out Monday morning (when California eventually gets around to Monday morning). Time to stop fretting over BOINC and go and do something useful.

It would be interesting to be able to view some trend data on the server stats - I'm assuming that 2.5e6 is an abnormally large number of results waiting, but I'm relying on my memory of something I've not taken a lot of notice of before - I'm usually just looking at the ready-to-send data.
____________

archae86
Joined: 31 Aug 99
Posts: 889
Credit: 1,572,794
RAC: 3
United States
Message 868362 - Posted: 23 Feb 2009, 2:45:04 UTC - in response to Message 868309.

It would be interesting to be able to view some trend data on the server stats - I'm assuming that 2.5e6 is an abnormally large number of results waiting...
To see trends, try Scarecrow trend graphs

I chose the 30-day period, as it speaks to your assumption--false as it turns out. That number was near 4 million a couple of weeks ago.

____________

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8737
Credit: 25,595,351
RAC: 12,970
United Kingdom
Message 868371 - Posted: 23 Feb 2009, 3:17:17 UTC - in response to Message 868309.

I wouldn't be surprised if there are network hiccups or if the assimilator queue swells during the weekend.
- Matt


As I write this the server status page shows 2,594,416 "Results returned and awaiting validation". This seems quite an achievement, as I am unable to return any results; my fastest machine has about 100 results which cannot upload. I guess this will get sorted out Monday morning (when California eventually gets around to Monday morning). Time to stop fretting over BOINC and go and do something useful.

It would be interesting to be able to view some trend data on the server stats - I'm assuming that 2.5e6 is an abnormally large number of results waiting, but I'm relying on my memory of something I've not taken a lot of notice of before - I'm usually just looking at the ready-to-send data.

The results awaiting Validation are the ones waiting for the wingman to report.

With about a million MB tasks generated/day and with an average of 3 days turn round time, 2.5 million waiting for a wingman is reasonable.
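
The estimate above is a steady-state product of arrival rate and waiting time (Little's law). A minimal sketch of that arithmetic follows, using the round figures from this post; the function name is made up for illustration, and the result is only an order-of-magnitude check, not a model of the validator.

```python
def expected_in_flight(tasks_per_day: float, turnaround_days: float) -> float:
    """Little's law: items in the system = arrival rate * average time in system.

    Both inputs are the rough figures quoted in the post, not measured values.
    """
    return tasks_per_day * turnaround_days


# About a million MB tasks per day, about 3 days average turnaround.
estimate = expected_in_flight(1_000_000, 3)
print(f"{estimate:,.0f} results in flight")
```

With those inputs the product is on the order of three million, so a figure of about 2.5 million results waiting on a wingman sits in the expected range.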

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 699
Credit: 5,992,460
RAC: 2,122
New Zealand
Message 868445 - Posted: 23 Feb 2009, 7:58:41 UTC
Last modified: 23 Feb 2009, 8:02:17 UTC

I'm amazed at the speed at which the AP results are coming in. As I write this, the average AP turnaround time is 13.74 hours and the MB result turnaround time is 96.34 hours. This morning the AP result turnaround time was the lowest I've ever seen it, 7 or so hours. Has anyone seen AP times this low before? Maybe they have hit a noisy section of sky for the AP data, or could this be thanks to the latest optimized application? I'm crunching 2 AP units at present; they have been running for 7 hours with just under 2.5 hours to go, and I'm using the latest optimized application. This could also help explain why the cricket graph is all but maxed out.
____________

Live in NZ y not join Smile City?

uBronan
Volunteer tester
Joined: 19 Sep 99
Posts: 21
Credit: 215,127
RAC: 0
Antarctica
Message 868446 - Posted: 23 Feb 2009, 8:12:53 UTC - in response to Message 867158.

Well, my data is not flowing at all; I am getting no new units, nor are finished units being uploaded.
After some time they get marked as client errors....
Or they stay in the upload queue until time runs out and then get deleted, which of course again means no credit.
____________

Andreas
Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868461 - Posted: 23 Feb 2009, 9:50:19 UTC - in response to Message 868254.
Last modified: 23 Feb 2009, 9:53:19 UTC

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.



As I am not attached to any other projects, BOINC must be "thinking" it has enough work, but all it has on my main cruncher is ~275 WU to upload. So BOINC seems quite "stupid" in this case, and my cache is empty :-(

And talking about cache, I do have a question for the more experienced users: My BOINC (6.4.5) does not factor in the number of cores (4 in my case), so a 10 day cache lasts only approx. 2.5 days. Is this by design or a known bug, or is it just my instance of BOINC behaving strangely?

Greetings to all Earthlings,
Andreas

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,695,391
RAC: 28,819
Australia
Message 868467 - Posted: 23 Feb 2009, 10:06:42 UTC - in response to Message 868461.

And talking about cache, I do have a question for the more experienced users: My BOINC (6.4.5) does not factor in the number of cores (4 in my case), so a 10 day cache lasts only approx. 2.5 days. Is this by design or a known bug, or is it just my instance of BOINC behaving strangely?

Setting a cache larger than 5 days can limit the amount of work that is downloaded in order to not miss a deadline.
I've got a 4 day cache & have only run out of work twice in the last 4-5 years.
____________
Grant
Darwin NT.

Zydor
Joined: 4 Oct 03
Posts: 172
Credit: 491,111
RAC: 0
United Kingdom
Message 868472 - Posted: 23 Feb 2009, 10:22:46 UTC - in response to Message 868461.
Last modified: 23 Feb 2009, 10:24:34 UTC

My BOINC (6.4.5) does not factor in the number of cores (4 in my case), so a 10 day cache lasts only approx. 2.5 days. Is this by design or a known bug, or is it just my instance of BOINC behaving strangely?


Sounds like you are running CUDA. CUDA WUs (albeit they are in reality the same as CPU WUs, just run on a GPU) are limited not by the number of days set, but by the hardware. A quad with one GPU running gets as a quota 100 for each core plus 100 for the GPU. For you that's a max download of 500, which would be in line with only lasting 2.5 days and the amount you have ready to upload.

They did it that way because GPUs eat CUDA WUs like there is no tomorrow, and can't be managed with the "normal" by-days protocol.

The cache is empty because it can't get past the AP download issue; when that clears, you'll refill.
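
The quota arithmetic in this post can be written out as a minimal sketch. It only restates the figures given here (100 per CPU core plus 100 per GPU); the function name and the `per_device` parameter are hypothetical, and this is not a confirmed description of how the scheduler computes quotas.

```python
def claimed_daily_quota(cpu_cores: int, gpus: int, per_device: int = 100) -> int:
    """Quota as described in the post: a fixed allowance per core and per GPU.

    per_device=100 is the figure quoted in the thread, taken here as an assumption.
    """
    return per_device * (cpu_cores + gpus)


# A quad-core host with one GPU, as in the example above.
print(claimed_daily_quota(cpu_cores=4, gpus=1))
```

For a quad with one GPU this gives the 500 mentioned above.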
____________

Andreas
Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868479 - Posted: 23 Feb 2009, 10:40:17 UTC - in response to Message 868467.
Last modified: 23 Feb 2009, 10:46:45 UTC


Setting a cache larger than 5 days can limit the amount of work that is downloaded in order to not miss a deadline.
I've got a 4 day cache & have only run out of work twice in the last 4-5 years.


Deadlines are no problem; average turnaround is below 3 days.


Sounds like you are running CUDA. CUDA WUs (albeit they are in reality the same as CPU WUs, just run on a GPU) are limited not by the number of days set, but by the hardware. A quad with one GPU running gets as a quota 100 for each core plus 100 for the GPU. For you that's a max download of 500, which would be in line with only lasting 2.5 days and the amount you have ready to upload.



IMHO quotas work differently: they are there to prevent machines that are producing errors from going "nuts" and downloading all available work. If you return a completed task, the quota is set back to 100. I have downloaded more than 100 WU/day/CPU in the past; no problem if you return completed results in between.

This strange cache behavior occurred with no CUDA enabled. The sum of the estimated work was always ~10 days, and was filled up to that when lower. But spread over 4 cores, 10 days of work only lasts 2.5 days.
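
The cache arithmetic described here can be sketched as follows, assuming (as the post reports observing) that the cache setting is filled as a total of estimated CPU time rather than per core. The function name is illustrative only.

```python
def wall_clock_days(cached_cpu_days: float, cores: int) -> float:
    """Days a cache lasts when all cores crunch in parallel.

    Assumes the cache holds `cached_cpu_days` of total estimated CPU time,
    which is the behaviour reported in the post, not a documented rule.
    """
    return cached_cpu_days / cores


# 10 days of estimated work shared across 4 cores.
print(wall_clock_days(10, 4))
```

Under that assumption a 10-day cache on a quad-core machine runs dry after 2.5 days, which matches the observation.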
____________

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8757
Credit: 52,707,000
RAC: 27,892
United Kingdom
Message 868485 - Posted: 23 Feb 2009, 11:14:16 UTC - in response to Message 868461.

"requesting 0 seconds of work".

If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.

As I am not attached to any other projects, BOINC must be "thinking" it has enough work, but all it has on my main cruncher is ~275 WU to upload. So BOINC seems quite "stupid" in this case, and my cache is empty :-(

BOINC also has another safety mechanism, designed to prevent it producing work faster than the results can be processed. If there are tasks waiting to be uploaded, BOINC won't ask for new work to add to the problem. Many, many users will have hit that restriction this weekend.
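
A minimal sketch of that restriction, combining it with the rule of thumb given earlier in the thread (work requests stop once more than about twice the processor count is waiting to upload). The function name, the `factor` parameter and the exact comparison are assumptions for illustration; this is not the BOINC client's actual code.

```python
def may_request_work(pending_uploads: int, processors: int, factor: int = 2) -> bool:
    """Return True while the upload backlog is small enough to ask for new work.

    factor=2 reflects the "twice the number of processors" figure quoted in
    the thread; whether the real limit is strict or inclusive is not stated.
    """
    return pending_uploads <= factor * processors


# A quad-core host with ~275 results stuck in upload gets no new work.
print(may_request_work(pending_uploads=275, processors=4))
# The same host with only a handful of pending uploads may ask again.
print(may_request_work(pending_uploads=3, processors=4))
```

On these assumptions, a host with hundreds of stalled uploads would keep "requesting 0 seconds of work" until the backlog drains.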

Andreas
Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868486 - Posted: 23 Feb 2009, 11:33:36 UTC - in response to Message 868485.

BOINC also has another safety mechanism, designed to prevent it producing work faster than the results can be processed. If there are tasks waiting to be uploaded, BOINC won't ask for new work to add to the problem. Many, many users will have hit that restriction this weekend.


Thanks for the answer Richard,

Andreas

David J. Moritz
Joined: 15 Aug 99
Posts: 21
Credit: 1,725,226
RAC: 675
United States
Message 868505 - Posted: 23 Feb 2009, 13:08:33 UTC - in response to Message 868486.

Once again it appears that AP has caused a weekend outage for most SETI contributors. If the purpose of SETI@Home is to process information, it would seem that AP needs to go back to beta testing until it is ready for prime time and does not stop overall system processing. Maybe adding a preference to exclude AP processing on client computers would be appropriate?

As an engineering manager (including classified computer systems needed to design and test product), my experience tells me that nearly weekly system crashes/outages result from improper management above the "worker" level. The performance of SETI@Home would never be tolerated in the commercial world. Is there anything we clients can do, other than donate more money that with bad management seems to be wasted, to help improve consistent system operation?

The bottom line is, if you want to find ET's message, the system must be up!
____________
David Moritz

Andreas
Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868507 - Posted: 23 Feb 2009, 13:13:20 UTC - in response to Message 868485.


BOINC also has another safety mechanism, designed to prevent it producing work faster than the results can be processed. If there are tasks waiting to be uploaded, BOINC won't ask for new work to add to the problem.



Any chance to "hide" these tasks from BOINC?

____________

Zebra3
Joined: 22 Oct 01
Posts: 186
Credit: 13,658,148
RAC: 0
Canada
Message 868514 - Posted: 23 Feb 2009, 13:27:26 UTC - in response to Message 868505.

Once again it appears that AP has caused a weekend outage for most SETI contributors. If the purpose of SETI@Home is to process information, it would seem that AP needs to go back to beta testing until it is ready for prime time and does not stop overall system processing. Maybe adding a preference to exclude AP processing on client computers would be appropriate?

As an engineering manager (including classified computer systems needed to design and test product), my experience tells me that nearly weekly system crashes/outages result from improper management above the "worker" level. The performance of SETI@Home would never be tolerated in the commercial world. Is there anything we clients can do, other than donate more money that with bad management seems to be wasted, to help improve consistent system operation?

The bottom line is, if you want to find ET's message, the system must be up!


But... most corporate computer systems are not old and bandaged up like S@H and are thus less susceptible to downtime. When there is a problem with these corporate systems, more often than not the hardware is replaced or there is a backup unit to put online. It all comes down to money, and SETI doesn't have it.
____________
http://www.novascotia.com

Lutz Michaelis
Joined: 15 Mar 02
Posts: 79
Credit: 310,668
RAC: 0
China
Message 868516 - Posted: 23 Feb 2009, 13:33:46 UTC

I was already afraid that I was the only one who cannot upload the finished WUs. I hope the problem will be fixed soon so that I get more tasks to calculate.
____________
»beep*rrrr*uuuuh*piep*uhhhhhhh***beeeeeep***uhh***uhh«

Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,873,152
RAC: 932
Malaysia
Message 868519 - Posted: 23 Feb 2009, 13:39:55 UTC - in response to Message 868505.

Once again it appears that AP has caused a weekend outage for most SETI contributors. If the purpose of SETI@Home is to process information, it would seem that AP needs to go back to beta testing until it is ready for prime time and does not stop overall system processing. Maybe adding a preference to exclude AP processing on client computers would be appropriate?

As an engineering manager (including classified computer systems needed to design and test product), my experience tells me that nearly weekly system crashes/outages result from improper management above the "worker" level. The performance of SETI@Home would never be tolerated in the commercial world. Is there anything we clients can do, other than donate more money that with bad management seems to be wasted, to help improve consistent system operation?

The bottom line is, if you want to find ET's message, the system must be up!


Yes, there is a setting on your account where you can pick which of your profiles (e.g. work/home/school) you want to turn AP off for. And remember to deselect the "If no work for selected applications is available, accept work from other applications?" option.

As for SETI, there is really limited scope for management improvement when there are only about 3 people in the team. The project is run on a tight shoestring budget. An IT department in the commercial world, given the magnitude of the task handled by the team so far, would likely have ended up as a brigade-sized organisation with an overall director, an OS director, a UHD director, multiple project managers, drones of coordinators and still one "Herbert" to do the work.
____________


Copyright © 2014 University of California