Panic Mode On (46) Server problems


arkayn (Project Donor)
Volunteer tester
Joined: 14 May 99
Posts: 4034
Credit: 51,008,381
RAC: 45
United States
Message 1091556 - Posted: 30 Mar 2011, 5:50:43 UTC

Seems an appropriate time to start the next version.
____________

Tutankhamon "Communist"
Volunteer tester
Joined: 1 Nov 08
Posts: 5963
Credit: 37,224,457
RAC: 1,031
Sweden
Message 1091723 - Posted: 30 Mar 2011, 20:41:45 UTC
Last modified: 30 Mar 2011, 20:42:43 UTC

Splitters disabled for hours, no new work produced, and now the scheduling server also became disabled.

Thanks ET for a big cache.

SNAFU, SNAFU...
____________
Too much hormone treated meat.
Too much Monsanto veggies.
Too old, and outdated constitution.

Yeah, you do have a "crazy" problem, no doubt about that...


Why defend a culture of death?

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1091726 - Posted: 30 Mar 2011, 21:06:54 UTC - in response to Message 1091723.
Last modified: 30 Mar 2011, 21:10:37 UTC

And as of 30 Mar 2011, 21:00:09 UTC, they're still Not Running or Disabled according to the server status page.
But I still got work: AP, MB, MB-CUDA, plus Einstein and some others as well.

No new work, until this is fixed, IMHO.
____________

B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1091745 - Posted: 30 Mar 2011, 22:11:20 UTC

Oh no Panic!!!!!! I am down to 9 hours more with task switching. What ever will I do????????

I guess I will just try to keep breathing. Other projects will take up the slack if they can't fix it in the next day or so.
____________

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1091761 - Posted: 30 Mar 2011, 23:10:25 UTC - in response to Message 1091745.

AAKKKKKKKKKKKKKKKKK!!! I got a herd of work but they are all shorties!!!! Man, it won't take any time at all to run through them.
____________


PROUD MEMBER OF Team Starfire World BOINC

SciManStev (Project Donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 5601
Credit: 104,273,099
RAC: 30,848
United States
Message 1091769 - Posted: 30 Mar 2011, 23:26:05 UTC

I can't believe it! I finally filled up. I am loaded with AP, MB, and CUDA WUs. It's been a long time since I had a full cache.

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

Wiggo "Socialist"
Joined: 24 Jan 00
Posts: 10160
Credit: 129,719,100
RAC: 30,141
Australia
Message 1091771 - Posted: 30 Mar 2011, 23:36:03 UTC - in response to Message 1091769.

I must be one of the lucky ones, as my 3 PCs have been able to top up at the right times, so I don't have any complaints. :)

Cheers.
____________

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2815
Credit: 10,554,541
RAC: 795
United States
Message 1091779 - Posted: 31 Mar 2011, 0:20:08 UTC

Mine have stayed pretty much filled for the past 2-3 weeks.

I say pretty much because every now and then, one AP runs 50% longer for no apparent reason and throws the DCF waaaay out, and then BOINC thinks I have 12 days of cache when it is really only ~8. It takes about a dozen tasks completing with a normal duration for the DCF estimates to come back down to a reasonable number... and then it happens again.

So pretty much, every time my machine does actually ask for work, it usually gets a task or two.
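
As a rough illustration of the DCF behaviour described above, here is a minimal sketch. It assumes a simplified update rule: the factor jumps up at once when a task overruns its estimate, and drifts back down by about 10% of the gap for each task that finishes in normal time. The function names and the 10% figure are assumptions for illustration, not BOINC's actual code.

```python
def update_dcf(dcf, ratio, decay=0.1):
    """Return the new duration correction factor after one finished task.

    ratio is actual runtime / estimated runtime for that task.
    Assumed rule: raise immediately on an overrun, lower slowly otherwise.
    """
    if ratio > dcf:
        return ratio
    return dcf + decay * (ratio - dcf)


def simulate(ratios, dcf=1.0):
    """Apply update_dcf to a sequence of task ratios and record each step."""
    history = []
    for ratio in ratios:
        dcf = update_dcf(dcf, ratio)
        history.append(dcf)
    return history


# One AP task running 50% long, followed by a dozen normal tasks.
history = simulate([1.5] + [1.0] * 12)

real_cache_days = 8
estimated_after_overrun = real_cache_days * history[0]

print(f"DCF right after the long task: {history[0]:.2f}")
print(f"Cache estimate at that point: {estimated_after_overrun:.1f} days")
print(f"DCF after 12 normal tasks: {history[-1]:.2f}")
```

Under these assumptions a single 1.5x task turns 8 days of real work into an estimated 12, and a dozen normal tasks bring the factor most of the way back, which matches the pattern in the post.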
____________
Linux laptop:
Current uptime: 1491d 05h 06m (as of 20160713_0202 UTC)

Mike (Project Donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 29109
Credit: 46,744,634
RAC: 19,117
Germany
Message 1091859 - Posted: 31 Mar 2011, 6:45:03 UTC - in response to Message 1091779.
Last modified: 31 Mar 2011, 6:45:53 UTC

Mine have stayed pretty much filled for the past 2-3 weeks.

I say pretty much because every now and then, one AP runs 50% longer for no apparent reason and throws the DCF waaaay out, and then BOINC thinks I have 12 days of cache when it is really only ~8. It takes about a dozen tasks completing with a normal duration for the DCF estimates to come back down to a reasonable number... and then it happens again.

So pretty much, every time my machine does actually ask for work, it usually gets a task or two.


It has a reason.
AP run times depend largely on blanking.
One of your results is 93% blanked, so it took 30,000 seconds longer.
That's pretty normal.
____________
With each crime and every kindness we birth our future.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2815
Credit: 10,554,541
RAC: 795
United States
Message 1091907 - Posted: 31 Mar 2011, 10:47:11 UTC

That used to be true (technically, it still is, but the difference is usually less than 5,000 seconds). Since the pre-blanked WUs started appearing, the observed effect has been completely random. I have a spreadsheet of every AP I've ever done while using the optimized apps, and there's no correlation between the tasks running significantly longer and any aspect of the WU itself (amount of blanking or number of pulses found).

For example:

A while back, I had a task that had zero blanking, yet took 122,000 seconds. 14 WUs later, another zero blanked task took 88,000 seconds.

Conversely, I have one that had 91.06% blanked and took 96,000 seconds, whereas that recent one has 92.72% and took 121,000 seconds. It's random.
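
To put a number on "no correlation", here is a small sketch that computes the Pearson correlation coefficient for just the four runs quoted above. The figures are the rounded ones from this post, not the full spreadsheet, and four points are far too few to prove anything, so treat it as an illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Percent of the WU blanked and runtime in seconds, as quoted in the post.
blanked_percent = [0.0, 0.0, 91.06, 92.72]
runtime_seconds = [122_000, 88_000, 96_000, 121_000]

r = pearson(blanked_percent, runtime_seconds)
print(f"Pearson r between blanking and runtime: {r:.2f}")
```

For these four runs the coefficient comes out weak (roughly 0.12), which is consistent with blanking not explaining the long runtimes.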

I've talked to Josef about it and there's a theory that it is one of the optimizations that started in r292, which can suffer tremendously from heavy L1/L2 cache usage by other processes. Just a theory. The other theory is that it's probably a case of what has been known for a while about MB running on multiple cores and hitting bottlenecks unless some other kind of task is running as well.
____________
Linux laptop:
Current uptime: 1491d 05h 06m (as of 20160713_0202 UTC)

SciManStev (Project Donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 5601
Credit: 104,273,099
RAC: 30,848
United States
Message 1092110 - Posted: 1 Apr 2011, 2:17:32 UTC

It seems the scheduler is disabled, and the cricket graph output has dropped.

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

bill
Joined: 16 Jun 99
Posts: 861
Credit: 28,700,253
RAC: 721
United States
Message 1092116 - Posted: 1 Apr 2011, 2:47:16 UTC - in response to Message 1092110.

It seems the scheduler is disabled, and the cricket graph output has dropped.

Steve


Is it Waily! Waily! time?

Donald L. Johnson
Joined: 5 Aug 02
Posts: 8198
Credit: 3,606,810
RAC: 6,034
United States
Message 1092120 - Posted: 1 Apr 2011, 2:51:01 UTC - in response to Message 1092116.

It seems the scheduler is disabled, and the cricket graph output has dropped.

Steve


Is it Waily! Waily! time?

Nah. Time to relax, go have a beer (or your beverage of choice), and wait until morning in Berkeley.
If they don't get it fixed by knock-off time on Friday, then you may want to be concerned.
____________
Donald
Infernal Optimist / Submariner, retired

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2815
Credit: 10,554,541
RAC: 795
United States
Message 1092168 - Posted: 1 Apr 2011, 6:28:13 UTC

Methinks a disk probably punted itself from an array and caused everything to lock up, and the scheduler was remotely disabled. That's my guess.
____________
Linux laptop:
Current uptime: 1491d 05h 06m (as of 20160713_0202 UTC)

Wiggo "Socialist"
Joined: 24 Jan 00
Posts: 10160
Credit: 129,719,100
RAC: 30,141
Australia
Message 1092209 - Posted: 1 Apr 2011, 11:09:00 UTC - in response to Message 1092168.

Well I'll be ok for a while yet before I start panicking. ;)

Cheers.
____________

Chris S (Crowdfunding Project Donor)
Volunteer tester
Joined: 19 Nov 00
Posts: 38118
Credit: 19,595,128
RAC: 895
United Kingdom
Message 1092215 - Posted: 1 Apr 2011, 11:33:13 UTC
Last modified: 1 Apr 2011, 11:45:03 UTC

Something's up - logging in here is like stirring treacle ...

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 5963
Credit: 153,347,086
RAC: 1,955
United States
Message 1092216 - Posted: 1 Apr 2011, 13:43:58 UTC - in response to Message 1092215.

Something's up - logging in here is like stirring treacle ...

Frozen treacle with raisins I'd say.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1092218 - Posted: 1 Apr 2011, 16:19:08 UTC

Any ideas as to what went wrong?
____________

Cruncher-American
Joined: 25 Mar 02
Posts: 1242
Credit: 164,774,043
RAC: 76,037
United States
Message 1092219 - Posted: 1 Apr 2011, 16:20:41 UTC - in response to Message 1092218.

Any ideas as to what went wrong?


Something broke, I think.

____________

SciManStev (Project Donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 5601
Credit: 104,273,099
RAC: 30,848
United States
Message 1092220 - Posted: 1 Apr 2011, 16:24:34 UTC

It seems the only strange thing is that RAC is like a basketball. I can live with that, as long as the search for ET continues.

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website


Copyright © 2016 University of California