Of the Woods (Feb 19 2009)

Message boards : Technical News : Of the Woods (Feb 19 2009)
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 868282 - Posted: 22 Feb 2009, 23:11:53 UTC - in response to Message 868279.  

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.


... or one or more WUs are suspended in the task list on your computer.


... or there are more than (about) 4 WUs waiting to upload, especially if any uploads are counting down to their next try.

Claggy
ID: 868282 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 868301 - Posted: 22 Feb 2009, 23:52:27 UTC - in response to Message 868282.  

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.


... or one or more WUs are suspended in the task list on your computer.


... or there are more than (about) 4 WUs waiting to upload, especially if any uploads are counting down to their next try.

Claggy

More accurately, that's twice the number of processors trying to upload.

F.
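
The rule as stated in this post can be written down as a small check. This is only a sketch of what the thread describes, not BOINC's actual source: the function name is made up for illustration, and treating the limit as strictly "more than twice the processor count" is an assumption.

```python
def work_fetch_blocked(pending_uploads: int, n_processors: int) -> bool:
    """Return True when the upload backlog rule described above would stop work requests.

    Assumption: the threshold is 2 * n_processors, taken from the post above,
    and the comparison is strict ("more than").
    """
    return pending_uploads > 2 * n_processors


# Dual-core host: the threshold is 4, which matches Claggy's "about 4".
print(work_fetch_blocked(3, 2))    # False
print(work_fetch_blocked(5, 2))    # True
# Quad-core host with roughly 275 stuck uploads, as reported later in the thread.
print(work_fetch_blocked(275, 4))  # True
```

Under this reading, a host with a few hundred stalled uploads stays blocked until nearly all of them have gone through.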
ID: 868301 · Report as offensive
Tim Lee

Joined: 15 Feb 00
Posts: 22
Credit: 32,655,046
RAC: 32
Australia
Message 868309 - Posted: 23 Feb 2009, 0:29:24 UTC - in response to Message 867158.  

I wouldn't be surprised if there are network hiccups or if the assimilator queue swells during the weekend.
- Matt


As I write this, the server status page shows 2,594,416 "Results returned and awaiting validation". This seems quite an achievement, as I am unable to return any results; my fastest machine has about 100 results which cannot upload. I guess this will get sorted out Monday morning (when California eventually gets around to Monday morning). Time to stop fretting over BOINC and go and do something useful.

It would be interesting to be able to view some trend data on the server stats - I'm assuming that 2.5e6 is an abnormally large number of results waiting, but I'm relying on my memory of something I've not taken a lot of notice of before - I'm usually just looking at the ready-to-send data.
ID: 868309 · Report as offensive
archae86

Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 868362 - Posted: 23 Feb 2009, 2:45:04 UTC - in response to Message 868309.  

It would be interesting to be able to view some trend data on the server stats - I'm assuming that 2.5e6 is an abnormally large number of results waiting...
To see trends, try the Scarecrow trend graphs.

I chose the 30-day period, as it speaks to your assumption--false as it turns out. That number was near 4 million a couple of weeks ago.

ID: 868362 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 18996
Credit: 40,757,560
RAC: 67
United Kingdom
Message 868371 - Posted: 23 Feb 2009, 3:17:17 UTC - in response to Message 868309.  

I wouldn't be surprised if there are network hiccups or if the assimilator queue swells during the weekend.
- Matt


As I write this, the server status page shows 2,594,416 "Results returned and awaiting validation". This seems quite an achievement, as I am unable to return any results; my fastest machine has about 100 results which cannot upload. I guess this will get sorted out Monday morning (when California eventually gets around to Monday morning). Time to stop fretting over BOINC and go and do something useful.

It would be interesting to be able to view some trend data on the server stats - I'm assuming that 2.5e6 is an abnormally large number of results waiting, but I'm relying on my memory of something I've not taken a lot of notice of before - I'm usually just looking at the ready-to-send data.

The results awaiting Validation are the ones waiting for the wingman to report.

With about a million MB tasks generated per day and an average turnaround time of 3 days, 2.5 million waiting for a wingman is reasonable.
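
The estimate above is a rate-times-delay calculation: in steady state, the number of tasks outstanding is roughly the generation rate multiplied by the time each task spends out in the field. A rough sketch using only the figures quoted in this thread (the function name is illustrative, and the steady-state assumption is mine):

```python
def tasks_in_flight(tasks_per_day: float, turnaround_days: float) -> float:
    """Steady-state number of outstanding tasks: arrival rate times time in the system."""
    return tasks_per_day * turnaround_days


# Figures from the posts above: ~1 million MB tasks/day, ~3 days average turnaround.
estimate = tasks_in_flight(1_000_000, 3)
observed = 2_594_416  # "Results returned and awaiting validation" from the status page

print(estimate)             # 3000000
print(observed / estimate)  # ratio of observed backlog to the rough estimate
```

The observed figure sits a little below the rough estimate, which is consistent with it being an ordinary backlog rather than a sign of trouble.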
ID: 868371 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1639
Credit: 12,921,799
RAC: 89
New Zealand
Message 868445 - Posted: 23 Feb 2009, 7:58:41 UTC
Last modified: 23 Feb 2009, 8:02:17 UTC

I'm amazed at the speed at which the AP results are coming in. As I write this, the average AP turnaround time is 13.74 hours and the MB result turnaround time is 96.34 hours. This morning the AP result turnaround time was the lowest I've ever seen it, 7 or so hours. Has anyone seen AP times this low before? Maybe they have hit a noisy section of sky for the AP data, or could this be thanks to the latest optimized application? I'm crunching 2 AP units at present; they have been running for 7 hours with just under 2.5 hours to go, and I'm using the latest optimized application. This could also help explain why the cricket graph is all but maxed out.
ID: 868445 · Report as offensive
uBronan
Volunteer tester
Joined: 19 Sep 99
Posts: 21
Credit: 215,127
RAC: 0
Antarctica
Message 868446 - Posted: 23 Feb 2009, 8:12:53 UTC - in response to Message 867158.  

Well, my data is not flowing at all; I am getting no new units, nor are finished units being uploaded.
After some time they get marked as client errors....
Or they stay in the upload queue until time runs out, then they get deleted, which of course again ends in no credit.
ID: 868446 · Report as offensive
Andreas

Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868461 - Posted: 23 Feb 2009, 9:50:19 UTC - in response to Message 868254.  
Last modified: 23 Feb 2009, 9:53:19 UTC

"requesting 0 seconds of work".


If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.



As I am not attached to any other projects, BOINC must be "thinking" it has enough work, but all it has on my main cruncher is ~275 WU to upload. So BOINC seems quite "stupid" in this case, and my cache is empty :-(

And talking about cache, I have a question for the more experienced users: my BOINC (6.4.5) does not factor in the number of cores (4 in my case), so a 10-day cache lasts only approx. 2.5 days. Is this by design, a known bug, or is it just my instance of BOINC behaving strangely?

Greetings to all Earthlings,
Andreas
ID: 868461 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 868467 - Posted: 23 Feb 2009, 10:06:42 UTC - in response to Message 868461.  

And talking about cache, I have a question for the more experienced users: my BOINC (6.4.5) does not factor in the number of cores (4 in my case), so a 10-day cache lasts only approx. 2.5 days. Is this by design, a known bug, or is it just my instance of BOINC behaving strangely?

Setting a cache larger than 5 days can limit the amount of work that is downloaded in order to not miss a deadline.
I've got a 4 day cache & have only run out of work twice in the last 4-5 years.
Grant
Darwin NT
ID: 868467 · Report as offensive
Zydor

Joined: 4 Oct 03
Posts: 172
Credit: 491,111
RAC: 0
United Kingdom
Message 868472 - Posted: 23 Feb 2009, 10:22:46 UTC - in response to Message 868461.  
Last modified: 23 Feb 2009, 10:24:34 UTC

My BOINC (6.4.5) does not factor in the number of cores (4 in my case), so a 10-day cache lasts only approx. 2.5 days. Is this by design, a known bug, or is it just my instance of BOINC behaving strangely?


Sounds like you are running CUDA. CUDA WUs (which are really the same as CPU WUs, just run on a GPU) are limited not by the number of days set, but by the hardware. A quad with one GPU running gets a quota of 100 for each core plus 100 for the GPU. For you that's a max download of 500, which would be in line with only lasting 2.5 days and with the amount you have ready to upload.

They did it that way because GPUs eat CUDA WUs like there is no tomorrow, and can't be managed with the "normal" by-days protocol.

The cache is empty because it can't get past the AP download issue; when that clears, you'll refill.
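
For reference, the quota arithmetic described in this post works out as below. This is just the sum as the post presents it (100 per CPU core plus 100 per GPU); the function name is hypothetical, and a later reply in this thread disputes that the quota behaves as a fixed cap.

```python
def daily_quota(n_cpu_cores: int, n_gpus: int, per_device: int = 100) -> int:
    """Quota as described in the post: a fixed allowance per CPU core and per GPU.

    Assumption: per_device = 100 comes from the post, not from BOINC documentation.
    """
    return per_device * (n_cpu_cores + n_gpus)


# A quad-core host with one GPU, the case discussed above.
print(daily_quota(4, 1))  # 500
```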
ID: 868472 · Report as offensive
Andreas

Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868479 - Posted: 23 Feb 2009, 10:40:17 UTC - in response to Message 868467.  
Last modified: 23 Feb 2009, 10:46:45 UTC


Setting a cache larger than 5 days can limit the amount of work that is downloaded in order to not miss a deadline.
I've got a 4 day cache & have only run out of work twice in the last 4-5 years.


Deadlines are no problem, average turnaround is below 3 days.


Sounds like you are running CUDA. CUDA WUs (which are really the same as CPU WUs, just run on a GPU) are limited not by the number of days set, but by the hardware. A quad with one GPU running gets a quota of 100 for each core plus 100 for the GPU. For you that's a max download of 500, which would be in line with only lasting 2.5 days and with the amount you have ready to upload.



IMHO quotas work differently: they are there to prevent machines from going "nuts" when producing errors and downloading all available work. If you return a completed task, the quota is set back to 100. I have downloaded more than 100 WU/day/CPU in the past; no problem if you return completed results in between.

This strange cache behavior was with no CUDA enabled. The sum of the estimated work was always ~10 days and filled up to that when lower. But spread over 4 cores, 10 days of work only lasts 2.5 days.
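
The observation above comes down to the difference between CPU-days of queued work and wall-clock days. A minimal sketch, assuming all cores crunch in parallel at full speed and that the estimates are accurate (the function name is illustrative only):

```python
def wall_clock_days(estimated_cpu_days: float, n_cores: int) -> float:
    """How long a queue lasts when its total estimated CPU time is shared across cores."""
    return estimated_cpu_days / n_cores


# 10 days of summed estimated work on a quad-core host.
print(wall_clock_days(10, 4))  # 2.5

# Conversely, lasting 10 wall-clock days on 4 cores would need 40 CPU-days queued.
print(10 * 4)                  # 40
```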
ID: 868479 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14644
Credit: 200,643,578
RAC: 874
United Kingdom
Message 868485 - Posted: 23 Feb 2009, 11:14:16 UTC - in response to Message 868461.  

"requesting 0 seconds of work".

If BOINC is not requesting any work, then it either thinks it has enough or it owes CPU time to another project that you may be attached to.

As I am not attached to any other projects, BOINC must be "thinking" it has enough work, but all it has on my main cruncher is ~275 WU to upload. So BOINC seems quite "stupid" in this case, and my cache is empty :-(

BOINC also has another safety mechanism, designed to prevent it producing work faster than the results can be processed. If there are tasks waiting to be uploaded, BOINC won't ask for new work to add to the problem. Many, many users will have hit that restriction this weekend.
ID: 868485 · Report as offensive
Andreas

Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868486 - Posted: 23 Feb 2009, 11:33:36 UTC - in response to Message 868485.  

BOINC also has another safety mechanism, designed to prevent it producing work faster than the results can be processed. If there are tasks waiting to be uploaded, BOINC won't ask for new work to add to the problem. Many, many users will have hit that restriction this weekend.


Thanks for the answer Richard,

Andreas
ID: 868486 · Report as offensive
David J. Moritz

Joined: 15 Aug 99
Posts: 21
Credit: 2,542,037
RAC: 0
United States
Message 868505 - Posted: 23 Feb 2009, 13:08:33 UTC - in response to Message 868486.  

Once again it appears that AP has caused a weekend outage for most SETI contributors. If the purpose of SETI@Home is to process information, it would seem that AP needs to go back to beta testing until it is ready for prime time and does not stop overall system processing. Maybe adding a preference to exclude AP processing on client computers would be appropriate?

As an engineering manager (including classified computer systems needed to design and test product), my experience tells me that nearly weekly system crashes/outages result from improper management above the "worker" level. The performance of SETI@Home would never be tolerated in the commercial world. Is there anything we clients can do, other than donate more money that with bad management seems to be wasted, to help improve consistent system operation?

The bottom line is, if you want to find ET's message, the system must be up!
David Moritz
ID: 868505 · Report as offensive
Andreas

Joined: 21 Jan 02
Posts: 16
Credit: 9,911,789
RAC: 0
Germany
Message 868507 - Posted: 23 Feb 2009, 13:13:20 UTC - in response to Message 868485.  


BOINC also has another safety mechanism, designed to prevent it producing work faster than the results can be processed. If there are tasks waiting to be uploaded, BOINC won't ask for new work to add to the problem.



Any chance to "hide" these tasks from BOINC?

ID: 868507 · Report as offensive
Zebra3
Joined: 22 Oct 01
Posts: 186
Credit: 13,658,148
RAC: 0
Canada
Message 868514 - Posted: 23 Feb 2009, 13:27:26 UTC - in response to Message 868505.  

Once again it appears that AP has caused a weekend outage for most SETI contributors. If the purpose of SETI@Home is to process information, it would seem that AP needs to go back to beta testing until it is ready for prime time and does not stop overall system processing. Maybe adding a preference to exclude AP processing on client computers would be appropriate?

As an engineering manager (including classified computer systems needed to design and test product), my experience tells me that nearly weekly system crashes/outages result from improper management above the "worker" level. The performance of SETI@Home would never be tolerated in the commercial world. Is there anything we clients can do, other than donate more money that with bad management seems to be wasted, to help improve consistent system operation?

The bottom line is, if you want to find ET's message, the system must be up!


But... most corporate computer systems are not old and bandaged up like S@H and are thus less susceptible to downtime. When there is a problem with these corporate systems, the hardware is replaced more often than not, or there is a backup unit to put online. It all comes down to money, and SETI doesn't have it.
http://www.novascotia.com
ID: 868514 · Report as offensive
Lutz Michaelis
Joined: 15 Mar 02
Posts: 79
Credit: 310,668
RAC: 0
China
Message 868516 - Posted: 23 Feb 2009, 13:33:46 UTC

I was already afraid that I was the only one who cannot upload finished WUs. I hope the problem will be fixed soon so that I can get more tasks to calculate.
»beep*rrrr*uuuuh*piep*uhhhhhhh***beeeeeep***uhh***uhh«
ID: 868516 · Report as offensive
Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,979,050
RAC: 0
Malaysia
Message 868519 - Posted: 23 Feb 2009, 13:39:55 UTC - in response to Message 868505.  

Once again it appears that AP has caused a weekend outage for most SETI contributors. If the purpose of SETI@Home is to process information, it would seem that AP needs to go back to beta testing until it is ready for prime time and does not stop overall system processing. Maybe adding a preference to exclude AP processing on client computers would be appropriate?

As an engineering manager (including classified computer systems needed to design and test product), my experience tells me that nearly weekly system crashes/outages result from improper management above the "worker" level. The performance of SETI@Home would never be tolerated in the commercial world. Is there anything we clients can do, other than donate more money that with bad management seems to be wasted, to help improve consistent system operation?

The bottom line is, if you want to find ET's message, the system must be up!


Yes, there is a setting on your account where you can pick the profiles (e.g. work/home/school) for which you want to turn AP off. And remember to deselect the "If no work for selected applications is available, accept work from other applications?" option.

As for SETI, there is really limited scope for management improvement when there are only about 3 people in the team. The project is run on a tight shoestring budget, unlike IT departments in the commercial world, which, given the magnitude of the task handled by the team so far, would likely have ended up with a brigade-sized organisation with an overall director, an OS director, a UHD director, multiple project managers, drones of coordinators and still one "Herbert" to do the work.
ID: 868519 · Report as offensive
David J. Moritz

Joined: 15 Aug 99
Posts: 21
Credit: 2,542,037
RAC: 0
United States
Message 868528 - Posted: 23 Feb 2009, 14:31:44 UTC - in response to Message 868519.  

Thank you for the information that AP can be disabled.

With regard to the comment that commercial computer systems have more money to spend, I beg to differ. Old hardware and outdated software are a constant problem due to lack of funds. All management must work within its budget or ultimately the organization will go away. Setting priorities within budgetary constraints is where I believe SETI management must improve. Matt and the other "workers" do a reasonable job considering their limits of hardware.

Again, is there anything the clients can do to help improve system operation and reliability besides throwing money?
David Moritz
ID: 868528 · Report as offensive
Richard
Joined: 10 Jul 99
Posts: 19
Credit: 17,341,684
RAC: 0
Argentina
Message 868531 - Posted: 23 Feb 2009, 14:40:30 UTC - in response to Message 868519.  

As for SETI, there is really limited scope for management improvement when there are only about 3 people in the team. The project is run on a tight shoestring budget, unlike IT departments in the commercial world, which, given the magnitude of the task handled by the team so far, would likely have ended up with a brigade-sized organisation with an overall director, an OS director, a UHD director, multiple project managers, drones of coordinators and still one "Herbert" to do the work.


Brilliant! I agree 100%...

ID: 868531 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.