Panic Mode On (46) Server problems



Message boards : Number crunching : Panic Mode On (46) Server problems

Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 3594
Credit: 47,338,244
RAC: 328
United States
Message 1091556 - Posted: 30 Mar 2011, 5:50:43 UTC

Seems an appropriate time to start the next version.
____________

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3334
Credit: 19,034,721
RAC: 20,128
Sweden
Message 1091723 - Posted: 30 Mar 2011, 20:41:45 UTC
Last modified: 30 Mar 2011, 20:42:43 UTC

Splitters disabled for hours, no new work produced, and now the scheduling server also became disabled.

Thanks ET for a big cache.

SNAFU, SNAFU...
____________

Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3232
Credit: 31,585,541
RAC: 0
Netherlands
Message 1091726 - Posted: 30 Mar 2011, 21:06:54 UTC - in response to Message 1091723.
Last modified: 30 Mar 2011, 21:10:37 UTC

And as of 30 Mar 2011 | 21:00:09 UTC they're still Not Running or Disabled, according to the server status page.
But I still have work: AP, MB, MB-CUDA, plus Einstein and some others.

No new work, until this is fixed, IMHO.
____________


Knight Who Says Ni N!, OUT numbered.................

B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1091745 - Posted: 30 Mar 2011, 22:11:20 UTC

Oh no, Panic!!!!!! I am down to 9 more hours with task switching. Whatever will I do????????

I guess I will just try to keep breathing. Other projects will take up the slack if they can't fix it in the next day or so.
____________

Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 14,913,520
RAC: 12,237
United States
Message 1091761 - Posted: 30 Mar 2011, 23:10:25 UTC - in response to Message 1091745.

AAKKKKKKKKKKKKKKKKK!!! I got a herd of work but they are all shorties!!!! Man, it won't take any time at all to run through them.
____________


PROUD MEMBER OF Team Starfire World BOINC

Profile SciManStev
Volunteer tester
Joined: 20 Jun 99
Posts: 4792
Credit: 79,719,019
RAC: 35,036
United States
Message 1091769 - Posted: 30 Mar 2011, 23:26:05 UTC

I can't believe it! I finally filled up. I am loaded with AP, MB, and CUDA WUs. It's been a long time since I had a full cache.

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

Profile Wiggo
Joined: 24 Jan 00
Posts: 6499
Credit: 90,494,766
RAC: 74,353
Australia
Message 1091771 - Posted: 30 Mar 2011, 23:36:03 UTC - in response to Message 1091769.

I must be one of the lucky ones, as my 3 PCs have been able to top up at the right times, so I don't have any complaints. :)

Cheers.
____________

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2236
Credit: 8,446,444
RAC: 4,085
United States
Message 1091779 - Posted: 31 Mar 2011, 0:20:08 UTC

Mine have stayed pretty much filled for the past 2-3 weeks.

I say pretty much because every now and then, one AP runs 50% longer for no apparent reason and throws the DCF waaaay out, and then BOINC thinks I have 12 days of cache when it is really only ~8. It takes about a dozen tasks completing with a normal duration for the DCF estimates to come back down to a reasonable number... and then it happens again.

So pretty much, every time my machine does actually ask for work, it usually gets a task or two.
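
A rough sketch of the arithmetic described above, not BOINC's actual code. It assumes a simplified, hypothetical DCF rule: the factor jumps straight up to the actual/estimated runtime ratio when a task overruns, and otherwise drifts back toward the observed ratio by 10% per completed task. The function names and the 10% figure are assumptions for illustration only.

```python
def update_dcf(dcf, ratio, decay=0.1):
    """Hypothetical simplified rule: jump up at once, drift down slowly.

    ratio is actual runtime divided by estimated runtime for one task.
    """
    if ratio > dcf:
        return ratio                      # an overrunning task raises DCF immediately
    return (1 - decay) * dcf + decay * ratio  # normal tasks pull it back gradually


def simulate(ratios, dcf=1.0):
    """Apply update_dcf to each finished task and record the DCF after each."""
    history = []
    for r in ratios:
        dcf = update_dcf(dcf, r)
        history.append(dcf)
    return history


true_cache_days = 8.0
# One AP task that runs 50% long, followed by a dozen normal ones.
ratios = [1.5] + [1.0] * 12
history = simulate(ratios)

for n, dcf in enumerate(history):
    print(f"after task {n}: DCF {dcf:.2f}, "
          f"estimated cache {true_cache_days * dcf:.1f} days")
```

Under those assumptions, a single task running 50% long is enough to make ~8 real days of work look like 12, and the estimate only creeps back down over the following dozen normal tasks.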
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Profile Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 23368
Credit: 31,801,324
RAC: 24,235
Germany
Message 1091859 - Posted: 31 Mar 2011, 6:45:03 UTC - in response to Message 1091779.
Last modified: 31 Mar 2011, 6:45:53 UTC

Mine have stayed pretty much filled for the past 2-3 weeks.

I say pretty much because every now and then, one AP runs 50% longer for no apparent reason and throws the DCF waaaay out, and then BOINC thinks I have 12 days of cache when it is really only ~8. It takes about a dozen tasks completing with a normal duration for the DCF estimates to come back down to a reasonable number... and then it happens again.

So pretty much, every time my machine does actually ask for work, it usually gets a task or two.


It has a reason.
AP run times depend largely on blanking.
One of your results is 93% blanked, so it took 30,000 seconds longer.
That's pretty normal.
____________

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2236
Credit: 8,446,444
RAC: 4,085
United States
Message 1091907 - Posted: 31 Mar 2011, 10:47:11 UTC

That used to be true (technically, it still is, but the difference is usually less than 5,000 seconds). Since the pre-blanked WUs started appearing, the observed effect has been completely random. I have a spreadsheet of every AP I've ever done while using the optimized apps, and there's no correlation between tasks running significantly longer and any aspect of the WU itself (amount of blanking or number of pulses found).

For example..

A while back, I had a task that had zero blanking, yet took 122,000 seconds. 14 WUs later, another zero blanked task took 88,000 seconds.

Conversely, I have one that was 91.06% blanked and took 96,000 seconds, whereas that recent one was 92.72% blanked and took 121,000. It's random.

I've talked to Josef about it, and there's a theory that it is one of the optimizations that started in r292, which can suffer tremendously from heavy L1/L2 cache usage by other processes. Just a theory. The other theory is that it's probably a case of what has been known for a while: MB running on multiple cores hits bottlenecks unless some other kind of task is running as well.
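
For anyone wanting to run the same check on their own task list, here is a minimal sketch of a Pearson correlation between percent blanked and run time. It uses only the four tasks quoted in this post; the full spreadsheet is not reproduced here, and the variable and function names are made up for illustration.

```python
from math import sqrt

# (percent blanked, run time in seconds) for the four tasks quoted above
tasks = [
    (0.0, 122_000),
    (0.0, 88_000),
    (91.06, 96_000),
    (92.72, 121_000),
]


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


r = pearson(tasks)
print(f"correlation between blanking and run time: {r:.2f}")
```

Four points prove nothing either way, of course; the same function applied to a full list of tasks would give a more meaningful number.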
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Profile SciManStev
Volunteer tester
Joined: 20 Jun 99
Posts: 4792
Credit: 79,719,019
RAC: 35,036
United States
Message 1092110 - Posted: 1 Apr 2011, 2:17:32 UTC

It seems the scheduler is disabled, and the cricket graph output has dropped.

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

bill
Joined: 16 Jun 99
Posts: 859
Credit: 22,349,828
RAC: 16,719
United States
Message 1092116 - Posted: 1 Apr 2011, 2:47:16 UTC - in response to Message 1092110.

It seems the scheduler is disabled, and the cricket graph output has dropped.

Steve


Is it Waily! Waily! time?

Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 6009
Credit: 633,597
RAC: 1,004
United States
Message 1092120 - Posted: 1 Apr 2011, 2:51:01 UTC - in response to Message 1092116.

It seems the scheduler is disabled, and the cricket graph output has dropped.

Steve


Is it Waily! Waily! time?

Nah. Time to relax, go have a beer (or your beverage of choice), and wait until morning in Berkeley.
If they don't get it fixed by knock-off time on Friday, then you may want to be concerned.
____________
Donald
Infernal Optimist / Submariner, retired

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2236
Credit: 8,446,444
RAC: 4,085
United States
Message 1092168 - Posted: 1 Apr 2011, 6:28:13 UTC

Methinks a disk probably punted itself from an array and caused everything to lock up, and the scheduler was remotely disabled. That's my guess.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Profile Wiggo
Joined: 24 Jan 00
Posts: 6499
Credit: 90,494,766
RAC: 74,353
Australia
Message 1092209 - Posted: 1 Apr 2011, 11:09:00 UTC - in response to Message 1092168.

Well I'll be ok for a while yet before I start panicking. ;)

Cheers.
____________

Profile Chris S
Volunteer tester
Joined: 19 Nov 00
Posts: 31127
Credit: 11,293,840
RAC: 20,015
United Kingdom
Message 1092215 - Posted: 1 Apr 2011, 11:33:13 UTC
Last modified: 1 Apr 2011, 11:45:03 UTC

Something's up - logging in here is like stirring treacle ...

Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 3859
Credit: 106,982,968
RAC: 98,166
United States
Message 1092216 - Posted: 1 Apr 2011, 13:43:58 UTC - in response to Message 1092215.

Something's up - logging in here is like stirring treacle ...

Frozen treacle with raisins I'd say.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Profile [B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1138
Credit: 3,514,750
RAC: 4,295
United Kingdom
Message 1092218 - Posted: 1 Apr 2011, 16:19:08 UTC

Any ideas as to what went wrong?
____________

jravin
Joined: 25 Mar 02
Posts: 927
Credit: 94,981,463
RAC: 88,468
United States
Message 1092219 - Posted: 1 Apr 2011, 16:20:41 UTC - in response to Message 1092218.

Any ideas as to what went wrong?


Something broke, I think.

____________

Profile SciManStev
Volunteer tester
Joined: 20 Jun 99
Posts: 4792
Credit: 79,719,019
RAC: 35,036
United States
Message 1092220 - Posted: 1 Apr 2011, 16:24:34 UTC

It seems the only strange thing is that RAC is like a basketball. I can live with that, as long as the search for ET continues.

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website



Copyright © 2014 University of California