Panic Mode On (69) Server problems?


Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,332,781
RAC: 0
United States
Message 1199796 - Posted: 25 Feb 2012, 21:46:03 UTC - in response to Message 1199788.
Last modified: 25 Feb 2012, 21:46:36 UTC

Anyone notice results generation is near zero again!!! Grab them quick while ya can, project is going down.

Yep it's down but that's because there is over a quarter million ready to send (that will pick up if it drops below that number) and I'm only picking up a few at a time since yesterday as I'm still bouncing off the limits. :)

Cheers.


Best I can tell from casual observations without any real digging is like this:

If it drops below 200k results ready to send, a splitter or splitters fire up and chew through a channel. If that brings it back over 200k, then they go back to sleep and wait for it to drop below 200k again. If it gets significantly below 200k and the splitters still haven't woken up, then worry :)
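For anyone who wants to play with that idea, here is a rough sketch of the threshold behaviour as described. It is only a toy model: the 200k floor comes from casual observation above, and the function names, demand rate and channel size are made up for illustration, not taken from the actual server code.

```python
# Toy model of the observed behaviour. The 200k floor is a casual
# observation from this thread, not a documented server setting.
READY_TO_SEND_FLOOR = 200_000


def splitters_should_run(ready_to_send: int, floor: int = READY_TO_SEND_FLOOR) -> bool:
    """Return True when the ready-to-send queue has dropped below the floor."""
    return ready_to_send < floor


def simulate(queue: int, demand_per_tick: int, channel_size: int, ticks: int) -> list[int]:
    """Drain the queue each tick; split one channel whenever it is below the floor.

    demand_per_tick and channel_size are hypothetical numbers chosen by the caller.
    """
    history = []
    for _ in range(ticks):
        queue = max(0, queue - demand_per_tick)   # hosts download work
        if splitters_should_run(queue):
            queue += channel_size                 # a splitter chews through a channel
        history.append(queue)
    return history
```

Calling `simulate(205_000, 10_000, 30_000, 3)` shows the queue dipping under the floor once, getting topped up, and then draining again while the splitters sleep.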
____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4347
Credit: 1,124,780
RAC: 786
United States
Message 1199823 - Posted: 25 Feb 2012, 22:37:39 UTC - in response to Message 1199772.

I don't see that trimming the expiry date would disenfranchise users, even those with 'very' old rigs..

It 'might' mean that 'some' users would have to accept smaller amounts of WU at a time to process, just sufficient to process in the time available.

The last time the project made that kind of adjustment in February 2008, it increased the host requirement from 30 Whetstone MIPS to 40 (crunching 24/7). That disenfranchised some 700+ hosts according to Eric Korpela, though I don't know if he derived that from all hosts ever attached to SETI_BOINC or only those which were still active then.

Somebody with a broadband connection and willing to do some work could download the hosts stats export and figure out how many would be disenfranchised by any proposed change. Doubling to 80 Whetstone MIPS would halve the deadlines.
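A minimal sketch of that arithmetic, assuming deadlines scale inversely with the minimum Whetstone MIPS requirement (which is what "doubling halves the deadlines" implies). The function names, the example deadline and the sample host list are hypothetical; real figures would have to come from the hosts stats export.

```python
def scaled_deadline(current_deadline_days: float,
                    current_min_mips: float,
                    new_min_mips: float) -> float:
    """Scale a deadline inversely with the minimum host speed requirement."""
    return current_deadline_days * current_min_mips / new_min_mips


def count_excluded(host_mips, new_min_mips: float) -> int:
    """Count hosts whose Whetstone MIPS fall below the proposed minimum."""
    return sum(1 for mips in host_mips if mips < new_min_mips)


# Illustrative values only: a 40-day deadline and a made-up list of hosts.
sample_hosts = [25, 45, 79.9, 80, 120]
new_deadline = scaled_deadline(40.0, 40, 80)      # 40 -> 80 MIPS halves the deadline
excluded = count_excluded(sample_hosts, 80)       # hosts slower than 80 MIPS
```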
Joe

Donald L. Johnson
Project donor
Joined: 5 Aug 02
Posts: 6362
Credit: 795,574
RAC: 1,517
United States
Message 1199886 - Posted: 26 Feb 2012, 6:27:03 UTC - in response to Message 1199772.

I don't see that trimming the expiry date would disenfranchise users, even those with 'very' old rigs..

It 'might' mean that 'some' users would have to accept smaller amounts of WU at a time to process, just sufficient to process in the time available.

It might make everyone's d/loads easier in fact if smaller amounts were being taken at any one time by people.

Those old and slow machines would still be able to contribute to the job in hand, just that they wouldn't be able to hold tons of WU in reserve.


Due to the length of time it takes us old/slow machines to complete a WU (APR/DCF), and the maximum cache size BOINC allows, we CAN'T hold "tons" of tasks, only tens or at most hundreds. I am CPU-only (not GPU-capable), keep a 3-day cache, and as best I can recall, have never had more than 8 Tasks on either box, even during "shorty" storms.

We old/slow guys are not the cause of the bandwidth and download problems.

____________
Donald
Infernal Optimist / Submariner, retired

cliff
Joined: 16 Dec 07
Posts: 467
Credit: 2,799,652
RAC: 2,930
United Kingdom
Message 1199887 - Posted: 26 Feb 2012, 6:30:16 UTC - in response to Message 1199823.

Hi Joe,

While I do have a 20Mbps broadband connection, it doesn't seem to cross the pond too well :-/ I suffer from the usual problems of dropped connections, HTTP errors and frozen downloads. Nor do I have the expertise to analyse the data.

However would increasing the CPU share of the load from .04 to .20 in tasks be of use?

I notice that E@H is set to do that.. However I have no idea how that affects things in real terms.

Cheers,

____________
Cliff,
Been there, Done that, Still no damm T shirt!

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5943
Credit: 62,367,034
RAC: 38,165
Australia
Message 1199909 - Posted: 26 Feb 2012, 9:44:22 UTC - in response to Message 1199886.

We old/slow guys are not the cause of the bandwidth and download problems.

Nope.
But the long time it takes for such systems to complete work results in problems with the database size. The bigger it is, the more it tends to trip over itself.
____________
Grant
Darwin NT.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5943
Credit: 62,367,034
RAC: 38,165
Australia
Message 1199910 - Posted: 26 Feb 2012, 9:46:44 UTC - in response to Message 1199887.

However would increasing the CPU share of the load from .04 to .20 in tasks be of use?

I notice that E@H is set to do that.. However I have no idea how that affects things in real terms.

That relates to the processing of data, nothing to do with downloads or uploads.

____________
Grant
Darwin NT.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2355
Credit: 8,945,321
RAC: 4,040
United States
Message 1199928 - Posted: 26 Feb 2012, 11:37:00 UTC

My single-core machine is nowhere near fast, but it does well enough. A 2.5-day cache is ~8 VLARs, or ~20 shorties, or ~14 "average" MBs. A 10-day cache of AP is about 5.

Not that we're trying to discourage anyone from participating, but realistically.. if a computer can't do a single MB in less than 24 hours (actual crunching time, not turn-around time), buh-bye. If it means helping the database be as lean as it can be and not have to wait ~40 days for a task to complete, then so be it.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

musicplayer
Joined: 17 May 10
Posts: 1502
Credit: 748,098
RAC: 237
Message 1199940 - Posted: 26 Feb 2012, 13:18:34 UTC
Last modified: 26 Feb 2012, 13:20:22 UTC

If for some reason you had access to a database containing more than 2 billion result lines, wouldn't it be tempting just to sort on the different scores, with the highest (best) ones coming first?

If so, perhaps something of interest might be readily visible this way. I guess it is quite an experience doing such a thing at times. Probably the different candidates show up in this way as well.

Michel448a
Volunteer tester
Joined: 27 Oct 00
Posts: 1201
Credit: 2,891,635
RAC: 0
Canada
Message 1199943 - Posted: 26 Feb 2012, 13:28:01 UTC
Last modified: 26 Feb 2012, 13:29:08 UTC

talking about database....

how can we have access to all the data and results we worked ?

i can see a sub-forum with "candidates", but the only info we have is RA and DEC, and no link to be able to see the "results or data" we got from these pixels.

what were the data we've got that made that "pixel" become a "candidate" ?
i dunno, how can i get all the info for this or that pixel ?

how do we have access to that database ?
____________

Anthony Arbuzoff
Volunteer tester
Joined: 6 Apr 00
Posts: 204
Credit: 2,633,118
RAC: 1,334
Russia
Message 1199947 - Posted: 26 Feb 2012, 13:35:24 UTC - in response to Message 1199943.


how do we have access to that database ?


Not until the workday begins in Berkeley :)

____________

Michel448a
Volunteer tester
Joined: 27 Oct 00
Posts: 1201
Credit: 2,891,635
RAC: 0
Canada
Message 1199948 - Posted: 26 Feb 2012, 13:43:07 UTC - in response to Message 1199943.
Last modified: 26 Feb 2012, 13:52:25 UTC

talking about database....

how can we have access to all the data and results we worked ?

i can see a sub-forum with "candidates", but the only info we have is RA and DEC, and no link to be able to see the "results or data" we got from these pixels.

what were the data we've got that made that "pixel" become a "candidate" ?
i dunno, how can i get all the info for this or that pixel ?

how do we have access to that database ?



example :
Candidate: 20581835 (RA: 22.895508 Dec: 16.432024) Discuss.


euh ok... lets see...
the RA is between 22 and 23 :) and the DEC is between 16 and 17 :)
20581835 is a pretty nice number ... it has 8 digits...
what else i can discuss ? ROFL

where is that database ? where are the infos ? how many spikes, gaussians, pulses we got ? since they are candidates.... do we still have a sample ? lol

discussing for discussing .. LOL
____________

musicplayer
Joined: 17 May 10
Posts: 1502
Credit: 748,098
RAC: 237
Message 1199978 - Posted: 26 Feb 2012, 15:03:43 UTC - in response to Message 1199948.
Last modified: 26 Feb 2012, 15:12:22 UTC

> 20581835 is a pretty nice number ... it has 8 digits...

Do you perhaps mean that it has a score (Gaussian, perhaps) of 8 digits?

If so, it definitely must rank among the candidates.

Don't forget we are doing quite well for being a civilization in space ourselves. Currently we are the only one that is known to exist.

cliff
Joined: 16 Dec 07
Posts: 467
Credit: 2,799,652
RAC: 2,930
United Kingdom
Message 1199982 - Posted: 26 Feb 2012, 15:18:01 UTC - in response to Message 1199910.

Yup, I'm aware it has to do with data processing..does it make it faster or slower?
Faster WU = more WU in less time.. like shorties:-)
These DO affect uploads and downloads.

Cheers,
____________
Cliff,
Been there, Done that, Still no damm T shirt!

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4241
Credit: 34,944,040
RAC: 23,049
United Kingdom
Message 1199991 - Posted: 26 Feb 2012, 15:35:37 UTC - in response to Message 1199982.

Yup, I'm aware it has to do with data processing..does it make it faster or slower?
Faster WU = more WU in less time.. like shorties:-)
These DO affect uploads and downloads.

Cheers,

The 0.04 and 0.20 figures are for BOINC's internal use; increase the values too far and BOINC will run one less CPU task.
CUDA, CAL, Brook+ and OpenCL apps will use whatever CPU they require, so changing 0.04 to 0.20 won't have a direct impact on their speed.
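
As a very rough illustration of the "one less CPU task" effect, here is a toy model. It assumes BOINC simply adds up the CPU fractions reserved by running GPU tasks and gives up one CPU task for each whole CPU reserved. That is a simplification for illustration only; the real scheduler is more involved, and `cpu_tasks_running` is a made-up name, not a BOINC function.

```python
import math


def cpu_tasks_running(ncpus: int, gpu_tasks: int, cpu_fraction_per_gpu_task: float) -> int:
    """Toy estimate of how many CPU tasks run alongside the GPU tasks.

    Assumption (not BOINC's actual algorithm): each whole CPU reserved by
    the GPU tasks' combined CPU fractions displaces one CPU task.
    """
    reserved = gpu_tasks * cpu_fraction_per_gpu_task
    return max(0, ncpus - math.floor(reserved))


# On a hypothetical 4-core host running 2 GPU tasks:
low = cpu_tasks_running(4, 2, 0.04)    # 0.08 CPUs reserved, nothing displaced
mid = cpu_tasks_running(4, 2, 0.20)    # 0.40 CPUs reserved, still nothing displaced
high = cpu_tasks_running(4, 2, 0.50)   # 1.00 CPU reserved, one CPU task displaced
```

In this model the reserved fraction only changes how many CPU tasks get scheduled; it says nothing about how fast the GPU app itself runs, which matches the point above.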

Claggy

cliff
Joined: 16 Dec 07
Posts: 467
Credit: 2,799,652
RAC: 2,930
United Kingdom
Message 1200003 - Posted: 26 Feb 2012, 16:09:31 UTC - in response to Message 1199991.

Hi Claggy,
Thanks, don't know much about the in[f]ternal workings of BOINC or the drivers.

Cheers,
____________
Cliff,
Been there, Done that, Still no damm T shirt!

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4664
Credit: 123,532,032
RAC: 98,723
United States
Message 1200035 - Posted: 26 Feb 2012, 17:20:55 UTC - in response to Message 1200003.

Hi Claggy,
Thanks, don't know much about the in[f]ternal workings of BOINC or the drivers.

Cheers,

That throws most people for a loop the first time they see it. In reality, depending on the speed of your CPU & GPU, the processing of a task will spend much less than .04% of its time on the CPU. As there is a ~30 second "load time" to get GPU tasks started.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Donald L. Johnson
Project donor
Joined: 5 Aug 02
Posts: 6362
Credit: 795,574
RAC: 1,517
United States
Message 1200056 - Posted: 26 Feb 2012, 18:20:21 UTC - in response to Message 1199928.
Last modified: 26 Feb 2012, 18:21:47 UTC

My single-core machine is nowhere near fast, but it does well enough. A 2.5-day cache is ~8 VLARs, or ~20 shorties, or ~14 "average" MBs. A 10-day cache of AP is about 5.

Not that we're trying to discourage anyone from participating, but realistically.. if a computer can't do a single MB in less than 24 hours (actual crunching time, not turn-around time), buh-bye. If it means helping the database be as lean as it can be and not have to wait ~40 days for a task to complete, then so be it.

My G4 boxes take about 55 hours for a mid-range MB, 35-40 for a VLAR, and 10-15 for a "shorty", depending on how much I am using that computer - I have NO dedicated crunchers. It has been my experience (YMMV) that when "I" have to wait more than a week to get a Task validated and credited, my wingman is a much newer/faster box with one or more GPUs, a cache of over 1000 Tasks, and an average return time of 20-30 days.

This might indicate he does not have a 24/7 broadband connection, and reports and requests new work in large batches. I have also had wingmen who were new to the project, loaded up on Tasks, then never completed them and left them to time-out (45 days + resend time). I suspect these folks are a significant part of the problem with database bloat.
____________
Donald
Infernal Optimist / Submariner, retired

cliff
Joined: 16 Dec 07
Posts: 467
Credit: 2,799,652
RAC: 2,930
United Kingdom
Message 1200087 - Posted: 26 Feb 2012, 19:04:29 UTC - in response to Message 1200056.

I wonder if the folks that load up a lot of WU and then just disappear might not be doing so because they run into the sort of problems we have been seeing recently, which are to some extent still an issue.

To those more used to 'instant' gratification, the lack thereof makes for wanderlust, and a move to pastures new, where credits are gained more rapidly and stats [which are the only 'visible' reward for effort] are seen to increase at a rapid rate. When they find it a non stop hassle to d/l work, servers fall over dramatically now and then and there is a regular outage on Tuesdays which is posted as a 3 DAY outage, rather than the current 3 hour one..

Methinks they just say what the hell and either dump the whole idea of distributed computing or move to easier pastures..

Some people just can't take setbacks in their stride :-) Others thrive on them.
I have learned over time that what can't be cured must be endured :-/ And unless the whole proposition is on a losing wicket, I just meander on..

But if it's a case of diminishing returns and no sign of improvement even I will reconsider my options and set a limit on how long to persevere..

All things considered, each project is part of a greater whole, i.e. distributed computing, which has been seen as a way of utilising resources that were being wasted..

However, the original concept has been overshadowed by those who build powerful rigs and farms dedicated to crunching WU.. Not quite the original concept is it?
Still I must admit to being as guilty as anyone else in so doing..

Its the lure of those damm stats wot is to blame I says. Not to mention all those downloadable certificates of immense cobblestone gratification:-)

Cheers,
____________
Cliff,
Been there, Done that, Still no damm T shirt!


Copyright © 2014 University of California