Panic Mode On (77) Server Problems?



Message boards : Number crunching : Panic Mode On (77) Server Problems?

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3770
Credit: 21,507,735
RAC: 15,399
Sweden
Message 1295646 - Posted: 15 Oct 2012, 16:51:52 UTC - in response to Message 1295644.
Last modified: 15 Oct 2012, 16:52:50 UTC

Batten the hatches folks. Smooth sailing is over. AP's are being split again. Expect lots of complaints about slow downloads, no new tasks, can't report, slow uploads, no more coffee, bad times in general. :-)

What's odd is that the Cricket graph has not maxed back out yet. We normally see that as soon as AP cranks back up. Maybe things were going well enough over the weekend that many MB caches are full. I know the kitties got about everything they needed.


I guess there are lots of AP-only machines that have been asking for APs for days, and BOINC has backed off into hours-long deferrals. They'll come back asking for APs, and when they do, all h*** will break loose.

Dunno...if few were asking for AP work, I would expect the server status page to show some AP ready to send being built up and I don't see that happening yet either.


Hmm, AP's disappeared from splitting again. Better just wait and see what happens.
____________

ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 645
Credit: 147,806,335
RAC: 49,443
United Kingdom
Message 1295675 - Posted: 15 Oct 2012, 17:34:09 UTC - in response to Message 1295556.

I reckon if anyone doesn't have enough work then the problem is on their end.

Cheers.

Wish I could work out what -- I'm only getting two tasks/request on my fastest machine and not much more on the others, in general.

Gah! Finally realised that my disk usage was 1.5 GB, hitting the limit; upped it to 10 GB and jobs started flowing... Turns out that when I was playing with BOINC versions in August I'd screwed up and "lost" 3000 WUs, which were reported as errors but not deleted from my project area!
____________

.clair.
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 23,726,523
RAC: 33,980
United Kingdom
Message 1296340 - Posted: 17 Oct 2012, 21:16:17 UTC

I just spotted that all the multibeam splitters are disabled
and RTS iz kinda low
PANIC

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7124
Credit: 61,642,149
RAC: 15,636
Germany
Message 1296351 - Posted: 17 Oct 2012, 21:29:41 UTC
Last modified: 17 Oct 2012, 21:30:19 UTC

Still ~ 19,000 MB WUs available for D/L .. ;-)


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

Phil
Joined: 12 Mar 04
Posts: 14
Credit: 4,841,296
RAC: 0
United Kingdom
Message 1296360 - Posted: 17 Oct 2012, 22:06:40 UTC

Zero WU RTS

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8830
Credit: 53,653,097
RAC: 49,202
United Kingdom
Message 1296371 - Posted: 17 Oct 2012, 22:39:03 UTC

Eric was dealing with the problem that led to new WU numbering scheme

It was a compiler bug, messing up the splitter code.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3770
Credit: 21,507,735
RAC: 15,399
Sweden
Message 1296381 - Posted: 17 Oct 2012, 23:00:05 UTC - in response to Message 1296371.

Eric was dealing with the problem that led to new WU numbering scheme

It was a compiler bug, messing up the splitter code.


Does that mean that we have to redo 2 weeks of WU's? If that's the case I'll consider the first run to be just a test run. Either that, or I'll have to start crying :-)
____________

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2357
Credit: 8,955,861
RAC: 3,977
United States
Message 1296423 - Posted: 18 Oct 2012, 0:48:54 UTC

Not complaining or panicking.. but my RAC has dropped by half due to no APs. Might be a good thing though. I knew when I stole my PSU from another machine that the squealing/chirping that was intermittently coming from the capacitors meant that it wouldn't have too terribly long to live. It is still alive, but my board has been beeping some error alert for 30 seconds every 6-8 hours and I'm not sure why, but it probably has something to do with some voltage being below a threshold.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Cherokee150
Joined: 11 Nov 99
Posts: 112
Credit: 25,680,227
RAC: 7,448
United States
Message 1296448 - Posted: 18 Oct 2012, 3:15:07 UTC - in response to Message 1296371.

Excerpts from another thread:

...I noticed a number of new units I am receiving have unusually long names, as in:
21ap11af.25749.2934.140733193388044.10.40_2
versus a more typical:
09au12ag.28990.14407.13.10.216_0
Alternatively, does anyone know what that new 15 digit number in the fourth position means?

...I hope we get an official explanation soon, and I hope these "longnames" aren't wreaking havoc with the SETI database.

...I was also thinking along the line of, if these names are in error, as was suggested in the thread referenced by Wiggo, then it might cause serious problems later when the SETI staff attempt to identify and work with the returned results. If so, then I hope it turns out to be a non-issue, or at least a minor "ho-hum...". :)


Now Eric is "dealing with the problem that led to new WU numbering scheme" that came from a "compiler bug, messing up the splitter code".

Based on all the above, I wonder if it would help Eric to send a BOINC Notice asking those of us who feel comfortable with manual manipulation of our caches to suspend until further notice all cached WUs with the apparently erroneous "longnames"?
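For anyone who wants to see which cached tasks carry a "longname", here is a rough sketch in Python. It only splits the task name on dots and checks whether the fourth field is too big to fit in 32 bits. The function name, the field position and the 32-bit cutoff are my own assumptions drawn from the two example names quoted above, not anything the project has documented.

```python
def looks_like_longname(name, field_index=3, max_value=2**32 - 1):
    """Return True if the dot-separated field at field_index is a number
    larger than max_value.

    field_index=3 (the fourth field) and the 32-bit cutoff are assumptions
    based on the two example names in this thread, not a documented format.
    """
    fields = name.split(".")
    if len(fields) <= field_index:
        return False
    field = fields[field_index]
    return field.isdigit() and int(field) > max_value


if __name__ == "__main__":
    examples = [
        "21ap11af.25749.2934.140733193388044.10.40_2",
        "09au12ag.28990.14407.13.10.216_0",
    ]
    for task_name in examples:
        print(task_name, looks_like_longname(task_name))
```

This only identifies the names; it does not touch the BOINC client or suspend anything.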

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8830
Credit: 53,653,097
RAC: 49,202
United Kingdom
Message 1296471 - Posted: 18 Oct 2012, 6:29:39 UTC - in response to Message 1296448.

Now Eric is "dealing with the problem that led to new WU numbering scheme" that came from a "compiler bug, messing up the splitter code".

Based on all the above, I wonder if it would help Eric to send a BOINC Notice asking those of us who feel comfortable with manual manipulation of our caches to suspend until further notice all cached WUs with the apparently erroneous "longnames"?

No, no need to do anything like that. Eric dealt with the long names in passing, while working on testing for the new v7 application at SETI Beta. Here's what he said:

That huge number is supposed to be the ID of the receiver we used, which is obtained from a library routine. I'm guessing that at some point the library was compiled with a version of g++ that doesn't have the same storage format for the data type used.

Fortunately the correct value is used in the workunit header, so we don't have to do all this work over.

(my emphasis)

The technically minded can read the gory details from beta message 44104 onwards.
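To make the "storage format" guess a bit more concrete, here is a small Python sketch of one way such a huge number can arise: a 4-byte integer read back as an 8-byte one, so that whatever happens to sit in the neighbouring 4 bytes ends up in the high half. This is purely an illustration. The thread does not say what data types the library actually uses, and the small value 12 below is simply what the low 32 bits of the example number work out to, not a confirmed receiver ID.

```python
import struct

# The 15-digit number from the long workunit name quoted in this thread.
observed = 140733193388044

# Split it into its low and high 32-bit halves.
low32 = observed & 0xFFFFFFFF
high32 = observed >> 32
print(hex(observed))  # 0x7fff0000000c
print(low32)          # 12
print(hex(high32))    # 0x7fff

# Hypothetical reconstruction: write a 32-bit int (12) followed by 4 bytes
# of unrelated data (0x7fff here), then read all 8 bytes back as a single
# little-endian 64-bit integer.
raw = struct.pack("<iI", 12, 0x7FFF)
(misread,) = struct.unpack("<q", raw)
print(misread)        # 140733193388044
```

The arithmetic is certain; the interpretation is not, so treat it as a plausible reading rather than a diagnosis.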

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5955
Credit: 62,510,453
RAC: 40,732
Australia
Message 1296480 - Posted: 18 Oct 2012, 7:13:12 UTC - in response to Message 1296471.


Once again the splitters are unable to keep up with work demand.
____________
Grant
Darwin NT.

N9JFE David S
Project donor
Volunteer tester
Joined: 4 Oct 99
Posts: 12682
Credit: 15,023,762
RAC: 9,616
United States
Message 1296526 - Posted: 18 Oct 2012, 14:28:18 UTC

Why is it there are 6 AP splitters and only 5 MB splitters? AP gets split faster than MB anyway, so it seems to me it would make more sense to have more MB splitters than APs. Could lando run a couple more MBs, especially since he's not running his APs right now?

Okay, now I'm wondering... isn't the SSP supposed to update every 10 minutes? It's now 1427 UTC and the page is still showing 1410.

____________
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.


Grant (SSSF)
Joined: 19 Aug 99
Posts: 5955
Credit: 62,510,453
RAC: 40,732
Australia
Message 1296997 - Posted: 19 Oct 2012, 21:25:53 UTC


Bit of a backlog of MB validations building up.
____________
Grant
Darwin NT.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5955
Credit: 62,510,453
RAC: 40,732
Australia
Message 1297343 - Posted: 20 Oct 2012, 21:56:39 UTC - in response to Message 1296997.


MB validators appear to be clearing their backlog, now the assimilators are falling behind.
____________
Grant
Darwin NT.

.clair.
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 23,726,523
RAC: 33,980
United Kingdom
Message 1298419 - Posted: 24 Oct 2012, 21:17:21 UTC
Last modified: 24 Oct 2012, 21:20:01 UTC

Hay man what happend.
Sumfing is broke
The cricket is dead, gon all white :(

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7124
Credit: 61,642,149
RAC: 15,636
Germany
Message 1298424 - Posted: 24 Oct 2012, 22:30:59 UTC

Project is temporarily shut down for maintenance


.. and then ..

Scheduler request failed: Couldn't connect to server
Scheduler request failed: HTTP internal server error
Scheduler request failed: Timeout was reached




Wiggo
Joined: 24 Jan 00
Posts: 8610
Credit: 99,436,276
RAC: 56,027
Australia
Message 1298435 - Posted: 24 Oct 2012, 22:56:24 UTC - in response to Message 1298424.

Things have been working well here since the servers came back online (approx. 90 mins ago now).

Cheers.
____________


Copyright © 2014 University of California